Purpose of <nameExpr> in functionList.xml
-
Hi gang - I might be missing something simple so thought it worthy of a quick question.
I’ve successfully defined a custom language and added a parser for pulling out functions in functionList.xml. It is displaying the functions nicely - but I am trying to tweak sorting the functions.
We’re using the convention of starting the name with an underscore (_) if the function is a ‘private’ function that should only be called within that module (file), and without an underscore if the function is ‘public’. The functions are freely mixed in order in the module.
What I’d like to do is be able to sort the function list alphanumerically ignoring the leading underscore of the function name (if present).
I thought that’s what <nameExpr> would allow me to do - allow me to match just the function name ignoring the leading underscore (if present) - but it doesn’t seem to work like that. Am I missing something fundamental?
My parser in functionList.xml:
    <parser
        displayName="CiCode"
        id ="cicode_function"
        commentExpr="(?x)(?m-s:;(?!\$PATH).*?$)"
    >
        <function mainExpr="(?x) (?:FUNCTION\s+)\K\w+\(.*?\)" />
        <functionName>
            <nameExpr expr="\_*\K.*" />
        </functionName>
    </parser>
The “_*\K” is what I think should be discarding the leading underscore (if present) from the function name. (In fact it will discard any number of leading underscores - which is fine.)
Some sample code of a function with leading underscore:
    INT FUNCTION _KRN_Alarm_InitCountQue(STRING sAlarmGroup = "")
        INT iQueTmp;
        INT iType;
        INT iData;
And without a leading underscore:
    INT FUNCTION KRN_Alarm_AcknowledgeAllEnabled(INT iMonitor = -1)
        INT bEnabled;
When I show the function list and sort it alphanumerically all the functions with leading underscores appear before the functions without leading underscores. They should be mixed without regard for the leading underscore.
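To pin down the ordering being asked for, here is a small Python sketch (not Notepad++ code). It assumes the Function List sort behaves like a case-insensitive comparison, which would explain why `_` lands ahead of the letters; that is only a guess, and both helper names are made up for illustration.

```python
# The two function names from the sample code in this post.
NAMES = ["KRN_Alarm_AcknowledgeAllEnabled", "_KRN_Alarm_InitCountQue"]


def sort_like_panel(names):
    # ASSUMPTION: the panel compares names case-insensitively, so "_"
    # (0x5F) sorts before the lower-cased letters (0x61 and up).
    return sorted(names, key=str.lower)


def sort_ignoring_underscores(names):
    # Desired order: compare names with any leading underscores removed,
    # while still returning the names unchanged.
    return sorted(names, key=lambda name: name.lstrip("_").lower())


print(sort_like_panel(NAMES))
print(sort_ignoring_underscores(NAMES))
```

The first call puts the underscore name first; the second mixes the names by the text after the underscore, which is the order I am after.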
Have I misunderstood what <nameExpr> is used for? Or have I got the regex wrong?
Notepad++ v7.7.1
-
Hello, @rossjparker and All,
I’m not a specialist of XML/HTML, but I think that you could simplify your parser as below:
    <parser displayName="CiCode" id ="cicode_function" commentExpr="(?-si);(?!\$PATH).*" >
        <function mainExpr="(?-i)FUNCTION\s+\K\w+?\(.*?\)" >
            <functionName>
                <nameExpr expr="_?\K.+" />
            </functionName>
        </function>
    </parser>
Note that I changed all the regexes, as well as the end of the line, after mainExpr, from /> to >
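Why the tag ending matters can be sketched with Python's standard `xml.etree.ElementTree`. This is only a stand-in for illustration: Notepad++ reads functionList.xml with its own XML reader, and the helper name `name_exprs_under_function` is hypothetical. The point is simply that a self-closed `<function ... />` has no children, so `<functionName>` becomes a sibling instead.

```python
import xml.etree.ElementTree as ET

# <function> is self-closed, so <functionName> is a sibling, not a child.
SELF_CLOSED = r'''<parser displayName="CiCode" id="cicode_function" commentExpr="">
    <function mainExpr="(?-i)FUNCTION\s+\K\w+?\(.*?\)" />
    <functionName>
        <nameExpr expr="_?\K.+" />
    </functionName>
</parser>'''

# <function> stays open until </function>, so <functionName> is nested in it.
NESTED = r'''<parser displayName="CiCode" id="cicode_function" commentExpr="">
    <function mainExpr="(?-i)FUNCTION\s+\K\w+?\(.*?\)" >
        <functionName>
            <nameExpr expr="_?\K.+" />
        </functionName>
    </function>
</parser>'''


def name_exprs_under_function(xml_text):
    """Return the expr of every <nameExpr> located inside <function>."""
    root = ET.fromstring(xml_text)
    function = root.find("function")
    return [node.get("expr") for node in function.iter("nameExpr")]


print(name_exprs_under_function(SELF_CLOSED))  # nothing found under <function>
print(name_exprs_under_function(NESTED))       # the nameExpr is found
```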
Now, if you do not need any additional extraction from the result of the mainExpr regex, you could even use the minimal syntax below:
    <parser displayName="CiCode" id="cicode_function" commentExpr="(?-si);(?!\$PATH).*" >
        <function mainExpr="(?-i)FUNCTION\s+_?\K\w+?\(.*?\)" />
    </parser>
I tested that in a simple text file containing your examples of functions, with the line:
    <association id="cicode_function" langID="0" /> <!-- L_TEXT -->
added in the associationMap node and… everything went fine ;-))
I assumed that the keyword FUNCTION is always uppercase. If that is not the case, simply change the (?-i) modifier into (?i), before the word FUNCTION, in the regex!
Best Regards,
guy038
-
@guy038 - thanks so much for taking the time to reply. You sorted my problem out!
It was the line where you said:
Note that I changed all the regexes, as well as the end of the line, after mainExpr, from /> to >
That was the key for me. I completely failed to spot my error where I terminated the <function> element prematurely by closing the tag with />, meaning the subsequent <functionName> and <nameExpr> elements were not being evaluated at all! Hence my confusion about the purpose of <nameExpr>, as it didn’t seem to matter what I put in there; it had no effect on the function parser.
For the record, here is what my parser looks like now:
    <parser
        displayName="CiCode"
        id ="cicode_function"
        commentExpr=""
    >
        <function
            mainExpr="(?x)      # free-spacing for commenting
                (?-i)           # ignore case
                FUNCTION\s+     # CiCode functions marked by keyword 'function' (any case)
                \K              # discard everything matched so far - 'function' keyword not part of function name
                \w+?            # function name itself (non-greedy match)
                \(.*?\)         # function parameter list surrounded by brackets
                "
        />
    </parser>
Everything is working as I expect it to now - the function list is displaying nicely. I wasn’t actually able to achieve my original goal (display function names with the leading underscore (if present), but sort them alpha-numerically without regard for any leading underscores), but that’s fine - I can live with that. :)
Thank you also for the case-insensitive tip - yes it can be written as function or FUNCTION.
Interesting footnote: you will see the commentExpr string is empty. I had originally just copied it from another parser section as a starting template and not modified it. I don’t need it for my work (I have never seen CiCode files with comments between the FUNCTION keyword and the function name, although technically it is possible). But if you delete the commentExpr line entirely the whole parser stops working!
This is not what the documentation on https://notepad-plus-plus.org/features/function-list.html says:
comment (sic): Optional. you can make a RE in this attribute in order to identify comment zones. The identified zones will be ignored by search.
Who maintains the documentation pages? Is there any way for the community to contribute and fix errors on the documentation pages?
Thank you once again @guy038.
-
Hi, @rossjparker and All,
Pleased that everything went right ;-)) I just want to add two remarks !
First, I would like to point out that :
- The (?-i) modifier means that the subsequent part of the regex is treated in a NON-insensitive way (so, case-sensitive!)
- Conversely, the (?i) modifier means that the subsequent part of the regex is treated in a case-insensitive way
So your line, regarding case, should be, as below :
(?i) # ignore case : (i)nsensitive
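A quick way to convince yourself of the difference is a short Python check. Take it as an approximation only: Python's `re` module is not the regex engine used by Notepad++, it has no `\K`, and it only accepts `(?-i:...)` in the scoped form, so a capture group and a scoped group stand in for the originals.

```python
import re

# (?-i:...) switches case-insensitivity OFF for the enclosed part: only
# upper-case FUNCTION matches.
sensitive = re.compile(r"(?-i:FUNCTION)\s+(\w+)")

# (?i) switches case-insensitivity ON: function, Function, FUNCTION all match.
insensitive = re.compile(r"(?i)FUNCTION\s+(\w+)")

upper_line = "INT FUNCTION _KRN_Alarm_InitCountQue("
lower_line = "INT function KRN_Alarm_AcknowledgeAllEnabled("

print(sensitive.search(upper_line) is not None)   # upper-case keyword is found
print(sensitive.search(lower_line) is not None)   # lower-case keyword is missed
print(insensitive.search(lower_line).group(1))    # name found despite lower case
```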
Now, you said :
(I have never seen CiCode files with comments between the FUNCTION keyword and the function name, although technically it is possible)
But, on the contrary, the commentExpr attribute prevents any function embedded inside comment blocks from being detected!
For instance, let’s imagine this sample text, below:
    INT FUNCTION _KRN_Alarm_InitCountQue(STRING sAlarmGroup = "")
        INT iQueTmp;
        INT iType;
        INT iData;

    ; ----------------------------------------------------------
    INT ; FUNCTION _KRN_Test_Ross() is NOT used anymore !
    ; ----------------------------------------------------------

    INT function KRN_Alarm_AcknowledgeAllEnabled(INT iMonitor = -1)
        INT bEnabled;
- With your present line, below:
      commentExpr=""
  you should see 3 function names displayed in the Function List panel, although one of them is located inside a comment line
- Whereas, with the line below:
      commentExpr="(?-si);(?!\$PATH).*"
  only 2 functions are correctly displayed, as expected!
Of course, if, in the sample text, you delete only the comment symbol ; in the line between the two comment lines of dashes, you should see the function _KRN_Test_Ross() again!
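The 3-versus-2 result can be approximated outside Notepad++ with a short Python sketch. Several things here are assumptions: that comment zones are simply skipped before the function search (modelled by blanking them out), that a capture group is an acceptable substitute for `\K` (which Python's `re` lacks), and the helper name `list_functions`, which is made up for this example.

```python
import re

SAMPLE = '''INT FUNCTION _KRN_Alarm_InitCountQue(STRING sAlarmGroup = "")
    INT iQueTmp;
    INT iType;
    INT iData;

; ----------------------------------------------------------
INT ; FUNCTION _KRN_Test_Ross() is NOT used anymore !
; ----------------------------------------------------------

INT function KRN_Alarm_AcknowledgeAllEnabled(INT iMonitor = -1)
    INT bEnabled;
'''

# Python spelling of (?-si);(?!\$PATH).* : both options are already off by default.
COMMENT = re.compile(r";(?!\$PATH).*")

# Capture group 1 plays the role of the text kept after \K (name only).
FUNCTION = re.compile(r"(?i)FUNCTION\s+(\w+?)\(.*?\)")


def list_functions(text, comment_re=None):
    """Return function names, optionally ignoring comment zones."""
    if comment_re is not None:
        # ASSUMPTION: blanking each comment zone mimics "ignored by search".
        text = comment_re.sub(lambda m: " " * len(m.group()), text)
    return [m.group(1) for m in FUNCTION.finditer(text)]


print(len(list_functions(SAMPLE)))           # commented-out function is counted
print(len(list_functions(SAMPLE, COMMENT)))  # commented-out function is skipped
```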
Note that the present comment regex, (?-si);(?!\$PATH).* , means that a comment zone:
- Is a mono-line zone, due to the (?-s) modifier
- Begins with a semi-colon ; symbol, at any position of the current line
- Is followed by a range, possibly empty (.*), of standard characters, due to the (?-s) modifier
- But ONLY IF the string $PATH, with this exact case (?-i), does not appear right after the ; , due to the negative look-ahead (?!\$PATH)
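The four points above can be checked one by one with a Python approximation of the regex. Python's `re` is a different engine, so `(?-si)` is simply dropped (both options are off by default there); the sample lines containing `$PATH` are invented for the test and the helper `comment_zone` is hypothetical.

```python
import re

# Same pattern without the leading (?-si), which Python does not accept unscoped.
COMMENT = re.compile(r";(?!\$PATH).*")


def comment_zone(text):
    """Return the first comment zone found in text, or None."""
    match = COMMENT.search(text)
    return match.group() if match else None


# Begins at a semi-colon located anywhere in the line.
print(comment_zone("INT iData; trailing note"))
# Mono-line: the zone stops at the end of the line.
print(comment_zone("; first line\nsecond line"))
# $PATH right after the semi-colon: not a comment zone.
print(comment_zone(";$PATH=C:\\Lib"))
# Lower-case $path does not trigger the look-ahead, so this IS a comment zone.
print(comment_zone(";$path=C:\\Lib"))
```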
Cheers
guy038
-
Thanks once again @guy038. Your valuable suggestions/corrections have been incorporated once again:
    <parser
        displayName="CiCode"
        id ="cicode_function"
        commentExpr="(?x)                                   # free-spacing for commenting
                (?s:\x2F\x2A.*?\x2A\x2F)                    # Multi Line Comment: /* ... */
            |   (?m-s:\x2F{2}.*$)                           # Single Line Comment: // ...
            |   (?m-s:\!.*$)                                # Single Line Comment: ! ...
            |   (?s:\x22(?:[^\x22\x5C]|\x5C.)*\x22)         # String Literal - Double Quoted
            |   (?s:\x27(?:[^\x27\x5C]|\x5C.)*\x27)         # String Literal - Single Quoted
            "
    >
        <function
            mainExpr="(?x)      # free-spacing for commenting
                (?i)            # ignore case
                FUNCTION\s+     # CiCode functions marked by keyword 'function' (any case)
                \K              # discard everything matched so far - 'function' keyword not part of function name
                \w+?            # function name itself (non-greedy match)
                \(.*?\)         # function parameter list surrounded by brackets
                "
        />
    </parser>
This is my latest version of the parser with a proper commentExpr. (CiCode supports C-style comments with the addition of !<comment-to-end-of-line> being synonymous with //<comment-to-end-of-line>.)
One of my files does indeed have several commented functions - these are now correctly ignored. Thanks again.