Functionlist - Different results with different line endings

Bernd Peters-Hagemann

Hello everybody.

I am trying to write a Function List parser for UniVerse BASIC files with UNIX line endings that should list the labels. Some labels are found, some are not. The only difference between them is that some labels are followed directly by the line ending and the others are followed by a comment. The labels followed by a comment are listed; the others are not. After changing the line endings from Unix (LF) to Windows (CR + LF), all labels are listed. Here is an example:

label1:\n

Label not listed.

label1:     ;* comment\n

Label listed.

Now change the line ending to Windows:

label1:\r\n

Label listed.

label1:     ;* comment\r\n

Label listed.

Can somebody confirm this behavior? Is there a way to list labels that are directly followed by a UNIX line ending?

Sincerely Bernd

MAPJe71

Yes, there is a way. It depends on how you implemented the parser, so it would help if you posted it.

Bernd Peters-Hagemann

I copied the parser for BAT files and changed it:

<parser
    id         ="universe-basic"
    displayName="UniVerse Basic"
    commentExpr="(^\*.*?$)"
>
    <function
        mainExpr="^[\w.]+\:"
    >
    <!--
        <functionName>
            <nameExpr expr=".*:" />
        </functionName>
    -->
    </function>
</parser>

I only use the mainExpr to list the labels. Maybe that is also a reason why some labels are not listed.

MAPJe71

Escaping the colon (\:) is not wrong, but it is not required either. I can’t find anything else.

My take:

			<parser
				displayName="UniVerse BASIC"
				id         ="universe_basic"
				commentExpr="(?x)                                               # Utilize inline comments (see `RegEx - Pattern Modifiers`)
						(?m-s:
							(?:^|;)                                             # at start-of-line or after end-of-statement
							\h*                                                 # optional leading whitespace
							(?-i:REM\b|\x24\x2A|[\x21\x2A])                     # Single Line Comment 1..4
							.*$                                                 # whatever, until end-of-line
						)
					|	(?:\x22[^\x22\r\n]*\x22)                                # String Literal - Double Quoted
					|	(?:\x27[^\x27\r\n]*\x27)                                # String Literal - Single Quoted
					|	(?:\x5C[^\x5C\r\n]*\x5C)                                # String Literal - Backslash Quoted
					"
			>
				<function
					mainExpr="(?x)                                              # Utilize inline comments (see `RegEx - Pattern Modifiers`)
							(?m-i)^                                             # case-sensitive, NO leading whitespace
							(?:
								\d+\b(?=:?)                                     # completely numeric label, colon optional + discarded
							|	[A-Za-z_][\w.$%]*(?=:)                          # alphanumeric label, colon required + discarded
							)
						"
				/>
			</parser>
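
As a rough way to check the label pattern outside Notepad++, the sketch below ports the mainExpr to Python's re module. Treat it as a stand-in under stated assumptions: Notepad++ uses the Boost regex engine, so the inline modifiers (?x) and (?m-i) are replaced here by re.VERBOSE and re.MULTILINE, and the names LABEL, list_labels and the sample text are made up for illustration. It does not reproduce the Function List behavior itself; it only suggests that the pattern picks up the same labels with LF and CRLF line endings.

```python
import re

# Port of the mainExpr above. Assumption: Python's re stands in for the
# Boost engine used by Notepad++, so results may differ in edge cases.
LABEL = re.compile(
    r"""
    ^                               # start of line, NO leading whitespace
    (?:
        \d+\b(?=:?)                 # completely numeric label, colon optional
      | [A-Za-z_][\w.$%]*(?=:)      # alphanumeric label, colon required
    )
    """,
    re.MULTILINE | re.VERBOSE,      # case-sensitive by default
)


def list_labels(source: str) -> list[str]:
    """Return the label names found in the given source text (hypothetical helper)."""
    return LABEL.findall(source)


# Hypothetical sample: a bare label, a label followed by a comment,
# a numeric label without colon, and an indented line that must not match.
unix_text = "label1:\nlabel2:     ;* comment\n100\n  indented:\n"
windows_text = unix_text.replace("\n", "\r\n")

print(list_labels(unix_text))     # ['label1', 'label2', '100']
print(list_labels(windows_text))  # ['label1', 'label2', '100']
```

The colon sits in a lookahead, so it is required for alphanumeric labels but is not part of the name that gets listed.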

Bernd Peters-Hagemann

Many thanks for your solution. Your parser works fine.

It also gives me a lot of material to study. I had never heard of pattern modifiers before. I looked them up and found a tutorial on regular-expressions.info. I’ll try to work through it and learn more about regular expressions.

Sincerely Bernd