Trouble making functionlist parser for user defined language MOX
-
I have a custom user defined language called MOX. I am trying to add the parser for it in functionList.xml. The problem is that it only shows the first function in my file. I’m sure I’m just missing something basic, but I’m not sure whether it’s in my regex or in my understanding of how the NP++ function list works.
My additions to functionList.xml:
    <association id= "mox_syntax" userDefinedLangName="MOX" />
    <association id= "mox_syntax" ext=".mox" />

    <!-- ========================================================= [ MOX ] -->
    <parser id ="mox_syntax" displayName="MOX" >
        <function mainExpr="(^|\s+)(Function|Method|Macro) (\w+)(\.\w+)*\((.*)\)" >
            <functionName>
                <nameExpr expr="([A-Za-z_]?\w*(\.\w+)*\s*\()" />
            </functionName>
        </function>
    </parser>
Sample MOX code:
    Method Blah1()
        [New] dynamicVar = "string"
        [New] dynamicVar2 = 4
        [New] result = Bar2(dynamicVar, dynamicVar2)
    End Method

    Function Bar2(pParam3, pParam2)
        Foo3
        Return pParam3 && pParam2
    End Function

    Macro Foo3()
    End Macro
Function list display:
    sample.mox
    |---> Blah1(
-
Based on your parser and source example, I presumed that the name of a function, method or macro can contain (base/parent-)class-name(s).
I adapted your parser and added some comments.
    <parser displayName="MOX" id ="mox_syntax" >
        <function
            mainExpr="(?x)                              # free-spacing (see `RegEx - Pattern Modifiers`)
                    (?m)                                # ^ and $ match at line breaks
                    ^\h*                                # optional leading whitespace at start-of-line
                    (?i:FUNCTION|METHOD|MACRO)          # start-of indicator, case-insensitive
                    \s+                                 # trailing whitespace required
                    \K                                  # discard text matched so far
                    (?:(?&amp;VALID_ID)\.)*             # optional class identifier prefix
                    (?'VALID_ID'                        # valid identifier, use as subroutine
                        [A-Za-z_]\w*                    # valid character combination for identifiers
                    )
                    \s*\(                               # start-of-parameter-list indicator
                    [^)]*                               # parameter list
                    \)                                  # end-of-parameter-list indicator
                "
        >
            <functionName>
                <nameExpr
                    expr="(?x)                          # free-spacing (see `RegEx - Pattern Modifiers`)
                        (?:(?&amp;VALID_ID)\.)*         # optional class identifier prefix
                        \K                              # discard text matched so far
                        (?'VALID_ID'                    # valid identifier, use as subroutine
                            [A-Za-z_]\w*                # valid character combination for identifiers
                        )
                    "
                />
            </functionName>
            <className>
                <nameExpr
                    expr="(?x)                          # free-spacing (see `RegEx - Pattern Modifiers`)
                        (?'VALID_ID'                    # valid identifier, use as subroutine
                            [A-Za-z_]\w*                # valid character combination for identifiers
                        )
                        (?:\.(?&amp;VALID_ID))*         # optional sub-class-name(s)
                        (?=\.)                          # exclude last (sub-)class-method-name separator
                    "
                />
            </className>
        </function>
    </parser>
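One plausible reason the original expression listed only the first function is the greedy `(.*)` parameter part. The sketch below assumes that `.` also matches line breaks in the function list engine (modelled here with Python's `re.DOTALL`); Python's `re` module is only a stand-in for the regex engine Notepad++ uses, and `found_names` is a hypothetical helper. It compares the original `(.*)` with the `[^)]*` used in the adapted parser.

```python
import re

SAMPLE = """\
Method Blah1()
    [New] dynamicVar = "string"
    [New] dynamicVar2 = 4
    [New] result = Bar2(dynamicVar, dynamicVar2)
End Method

Function Bar2(pParam3, pParam2)
    Foo3
    Return pParam3 && pParam2
End Function

Macro Foo3()
End Macro
"""

# Original main expression. re.DOTALL models the ASSUMPTION that "." also
# matches line breaks, so the greedy (.*) can run up to the last ")" in the file.
GREEDY = re.compile(
    r"(^|\s+)(Function|Method|Macro) (\w+)(\.\w+)*\((.*)\)", re.DOTALL
)

# Same expression, but the parameter list may not contain ")".
BOUNDED = re.compile(
    r"(^|\s+)(Function|Method|Macro) (\w+)(\.\w+)*\(([^)]*)\)", re.DOTALL
)


def found_names(pattern, text):
    """Hypothetical helper: first identifier of every match, in file order."""
    return [match.group(3) for match in pattern.finditer(text)]


greedy_names = found_names(GREEDY, SAMPLE)
bounded_names = found_names(BOUNDED, SAMPLE)

print(greedy_names)   # ['Blah1']
print(bounded_names)  # ['Blah1', 'Bar2', 'Foo3']
```

Under that assumption the first match swallows the rest of the file, which would leave nothing for later matches.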
-
Thank you so much, that’s brilliant! (And yes, correct assumption regarding base class names.)
Looking at other parsers I’ve noticed the commentExpr attribute, and I was wondering what it does. Does it allow functions that are commented out to still show up in the function list? Or does it do the opposite?
-
The commentExpr attribute is intended to have the parser discard the text it matches.
See also my comment in this topic.
-
So, for example, if this language only has single line comments denoted by a single quote at the beginning of the line, like so:
    Method Blah()
        'Single line comment
    End Method

    'Commented out method:
    'Method Blah2()
    'End Method
Then it’s unnecessary to add the commentExpr attribute, because the 'Method Blah2() line doesn’t match the parser for functions anyway (it matches on the beginning of a line or leading whitespace, so a line starting with any other character doesn’t match)… right?
-
“because it’s matching on the beginning of a line or leading white space”
That’s the explanation of the (^|\s+)(Function... in the main expression of your parser (presuming the ^ matches at line breaks):
    ^           : match at the beginning of a line
    |           : OR
    \s+         : leading white space (i.e. space, tab, newline, carriage return, vertical tab)
    Function... : followed by the keyword(s)
But without the commentExpr attribute it can still match commented-out functions, e.g.:
    ' Function ExampleFunction()
    End Function
The ExampleFunction() does not match ^Function... but it does match the alternative expression track \s+Function...
With a “(?m-s)^'.*$” set as the commentExpr attribute, it will discard all lines starting with a single quote and thus not match the above ExampleFunction().
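The effect described above can be sketched in Python, with `re` standing in for the regex engine Notepad++ uses. This is a simplified model, not the actual function list implementation: `list_functions` is a hypothetical helper that blanks out whatever the comment expression matches before searching. Python rejects a global `(?m-s)`, so multiline mode is passed as a flag and `.` keeps Python's default of not matching line breaks.

```python
import re

SOURCE = """\
' Function ExampleFunction()
End Function
Function RealFunction()
End Function
"""

# Main expression from the original parser, unchanged.
MAIN_EXPR = re.compile(r"(^|\s+)(Function|Method|Macro) (\w+)(\.\w+)*\((.*)\)")

# Stand-in for commentExpr="(?m-s)^'.*$": whole lines starting with a quote.
COMMENT_EXPR = re.compile(r"^'.*$", re.MULTILINE)


def list_functions(text, comment_expr=None):
    """Hypothetical model: drop comment text first, then collect names."""
    if comment_expr is not None:
        text = comment_expr.sub("", text)
    return [match.group(3) for match in MAIN_EXPR.finditer(text)]


without_comment_expr = list_functions(SOURCE)
with_comment_expr = list_functions(SOURCE, COMMENT_EXPR)

print(without_comment_expr)  # ['ExampleFunction', 'RealFunction']
print(with_comment_expr)     # ['RealFunction']
```

Without the comment expression, the space after the quote satisfies the `\s+` alternative, so the commented-out function is listed.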
The explanation of the (?m)^\h*(Function... in the main expression of my parser:
    (?m)        : have the ^ match at line breaks
    ^           : match at the beginning of a line AND (i.e. one expression track)
    \h*         : optional leading white space (i.e. space, tab)
    Function... : followed by the keyword(s)
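The single-track expression listed above can be tried out in Python as a rough sketch. Python's `re` does not support `\h`, `\K` or `(?&VALID_ID)`, so this version substitutes `[ \t]`, a capture group and a spelled-out identifier pattern; those substitutions and the sample text are assumptions made for illustration.

```python
import re

SOURCE = """\
' Function ExampleFunction()
End Function
    function Indented.Name(pParam)
End Function
"""

# ^[ \t]* : start of line plus optional horizontal white space (stand-in for \h*)
# (?i:..) : case-insensitive keyword
# group 1 : optional class prefix(es) and the identifier (stand-in for \K)
ANCHORED = re.compile(
    r"^[ \t]*(?i:FUNCTION|METHOD|MACRO)\s+"
    r"((?:[A-Za-z_]\w*\.)*[A-Za-z_]\w*)"
    r"\s*\([^)]*\)",
    re.MULTILINE,
)

anchored_names = [match.group(1) for match in ANCHORED.finditer(SOURCE)]

print(anchored_names)  # ['Indented.Name']
```

Because only white space may sit between the start of the line and the keyword, the leading quote rules out the first line.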
This will prevent the above ExampleFunction() from being a match even without the commentExpr attribute set.
-
First of all, thanks so much for all of your help!
I have a couple more questions:
- Does the Notepad++ function list only support one level of hierarchy? (I’m sort of assuming this is true, but if it does support multiple levels that would be awesome!)
  i.e. if I have functions in this sort of format:
    Method Blah()
        [New] dynamicVar = "string"
        [New] dynamicVar2 = 4
        [New] result = Blah.Bar(dynamicVar, dynamicVar2)
    End Method

    Function Blah.Bar(pParam3, pParam2)
        Blah.Bar.Foo1
        Return pParam3 && pParam2
    End Function

    Method Blah.Bar.Foo1()
    End Method

    Method Blah.Bar.Foo2()
    End Method

    Macro Blah.Bar.Foo3()
    End Macro

    Method Blah.Bar.Foo3.Test1()
    End Method

    Method Blah.Bar.Foo3.Test2()
    End Method
  …I get a functionlist that looks like this:
    sample.mox
    |--> Blah
    |    |--> Bar
    |--> Blah.Bar
    |    |--> Foo1
    |    |--> Foo2
    |    |--> Foo3
    |--> Blah.Bar.Foo3
         |--> Test1
         |--> Test2
- I’m looking into getting this language parser into the Notepad++ functionlist source code. How can I do this?
-
- Well actually two levels: class and method/function (using C++ terminology);
- With a pull-request for the Notepad++ GitHub repository or I could add it to my Function List Update 3.
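The two-level grouping can be illustrated with a small Python sketch. It mirrors the className and functionName expressions of the adapted parser (everything before the last dot is the class, the last identifier is the function); `split_qualified` and `build_tree` are hypothetical helpers, not part of Notepad++.

```python
def split_qualified(name):
    """Split 'A.B.C' into ('A.B', 'C'); a name without a dot has no class part."""
    class_part, _, function_part = name.rpartition(".")
    return (class_part or None), function_part


def build_tree(names):
    """Group function names under their class node (None = no class)."""
    tree = {}
    for name in names:
        class_part, function_part = split_qualified(name)
        tree.setdefault(class_part, []).append(function_part)
    return tree


# Qualified names taken from the sample in the question.
names = [
    "Blah",
    "Blah.Bar",
    "Blah.Bar.Foo1",
    "Blah.Bar.Foo2",
    "Blah.Bar.Foo3",
    "Blah.Bar.Foo3.Test1",
    "Blah.Bar.Foo3.Test2",
]

tree = build_tree(names)
for class_name, functions in tree.items():
    print(class_name, functions)
```

However deep the dotted prefix is, it ends up as one flat class node, which is why `Blah.Bar.Foo3` shows up next to `Blah.Bar` rather than nested inside it.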
I get: