Functionlist - Different results with different line endings
-
Hello everybody.
I am trying to write a Function List parser for UniVerse BASIC files with UNIX line endings that lists the labels. Some labels are found, some are not. The only difference between the labels is that some are directly followed by the line ending and the others are followed by a comment. The labels followed by a comment are listed, the others are not. After changing the line endings from Unix (LF) to Windows (CR + LF), all labels are listed. Here is an example:
label1:\n
Label not listed.
label1: ;* comment\n
Label listed.
Now change the line ending to Windows:
label1:\r\n
Label listed.
label1: ;* comment\r\n
Label listed.
Can somebody confirm this behavior? Is there a way to list labels that are directly followed by a UNIX line ending?
Sincerely Bernd
-
Yes, there is a way. It depends on how you implemented the parser, so it would help if you posted it.
-
I copied the parser for BAT files and changed it:
<parser
    id="universe-basic"
    displayName="UniVerse Basic"
    commentExpr="(^\*.*?$)"
>
    <function
        mainExpr="^[\w.]+\:"
    >
        <!--
        <functionName>
            <nameExpr expr=".*:" />
        </functionName>
        -->
    </function>
</parser>
I only use the mainExpr to list the labels. Maybe this is also a reason why some labels are not listed.
-
Escaping the colon (\:) is not wrong, but it is not required either. I can't find anything else. My take:
<parser
    displayName="UniVerse BASIC"
    id         ="universe_basic"
    commentExpr="(?x)                                       # Utilize inline comments (see `RegEx - Pattern Modifiers`)
                    (?m-s:
                        (?:^|;)                             # at start-of-line or after end-of-statement
                        \h*                                 # optional leading whitespace
                        (?-i:REM\b|\x24\x2A|[\x21\x2A])     # Single Line Comment 1..4
                        .*$                                 # whatever, until end-of-line
                    )
                |   (?:\x22[^\x22\r\n]*\x22)                # String Literal - Double Quoted
                |   (?:\x27[^\x27\r\n]*\x27)                # String Literal - Single Quoted
                |   (?:\x5C[^\x5C\r\n]*\x5C)                # String Literal - Backslash Quoted
                "
>
    <function
        mainExpr="(?x)                                      # Utilize inline comments (see `RegEx - Pattern Modifiers`)
                (?m-i)^                                     # case-sensitive, NO leading whitespace
                (?:
                    \d+\b(?=:?)                             # completely numeric label, colon optional + discarded
                |   [A-Za-z_][\w.$%]*(?=:)                  # alphanumeric label, colon required + discarded
                )
            "
    />
</parser>
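If you want to try the label expression outside Notepad++, here is a minimal sketch in Python. It is an assumption-laden stand-in, not the Function List itself: Python's re module is a different regex engine, so the inline (?m-i) modifier is replaced by flags passed to re.compile, and the sample text and the find_labels helper are made up for illustration. It only checks the logic of the mainExpr and cannot reproduce the line-ending behavior reported above.

```python
import re

# Adapted from the mainExpr above. MULTILINE makes ^ match at each line start,
# VERBOSE allows the inline comments. No IGNORECASE, so it stays case-sensitive.
LABEL_EXPR = re.compile(
    r"""
    ^                               # start of line, NO leading whitespace
    (?:
        \d+\b(?=:?)                 # completely numeric label, colon optional
      | [A-Za-z_][\w.$%]*(?=:)      # alphanumeric label, colon required
    )
    """,
    re.MULTILINE | re.VERBOSE,
)

# Hypothetical sample mixing LF and CRLF line endings.
SAMPLE = (
    "label1:\n"
    "label2: ;* comment\n"
    "label3:\r\n"
    "label4: ;* comment\r\n"
    "100\n"
    "200: GOSUB label1\n"
    "    PRINT 'indented, not a label'\n"
)


def find_labels(text):
    """Return every label name found at the start of a line (colon excluded)."""
    return LABEL_EXPR.findall(text)


found = find_labels(SAMPLE)
print(found)  # ['label1', 'label2', 'label3', 'label4', '100', '200']
```

The lookaheads keep the colon out of the match, so the listed name is the bare label.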
-
Many thanks for your solution. Your parser works fine.
It also gives me a lot to study. I had never heard of pattern modifiers before. I searched for them and found a tutorial on regular-expressions.info. I'll try to work through it and learn more about regular expressions.
Sincerely Bernd