Trouble making functionlist parser for user defined language MOX



  • I have a custom user-defined language called MOX, and I am trying to add a parser for it to functionList.xml. The problem is that the function list only shows the first function in my file.

    I’m sure I’m just missing something basic, but I’m not sure if it’s in my regex or in my understanding of how the NP++ function list works.

    My additions to functionList.xml:

    <association id=	"mox_syntax"				userDefinedLangName="MOX"		/>
    <association id=	"mox_syntax"				ext=".mox"						/>
    
    <!-- ========================================================= [ MOX ] -->
    			
    <parser
    	id         ="mox_syntax"
    	displayName="MOX"
    >
    	<function
    		mainExpr="(^|\s+)(Function|Method|Macro) (\w+)(\.\w+)*\((.*)\)"
    	>
    		<functionName>
    			<nameExpr expr="([A-Za-z_]?\w*(\.\w+)*\s*\()" />
    		</functionName>
    	</function>
    </parser>
    

    Sample MOX code:

    Method Blah1()
    	[New] dynamicVar = "string"
    	[New] dynamicVar2 = 4
    	[New] result = Bar2(dynamicVar, dynamicVar2)
    End Method
    
    Function Bar2(pParam3, pParam2)
    	Foo3
    	Return pParam3 && pParam2
    End Function
    
    Macro Foo3()
    End Macro
    

    Function list display:

    sample.mox
      |--->  Blah1(
    


  • @bookwyrm12

    Based on your parser and source example, I presumed that the name of a function, method, or macro can contain (base/parent) class names.
    I adapted your parser and added some comments.

    <parser
        displayName="MOX"
        id         ="mox_syntax"
    >
        <function
            mainExpr="(?x)                       # free-spacing (see `RegEx - Pattern Modifiers`)
                    (?m)                         # ^ and $ match at line breaks
                    ^\h*                         # optional leading whitespace at start-of-line
                    (?i:FUNCTION|METHOD|MACRO)   # start-of indicator, case-insensitive
                    \s+                          # trailing whitespace required
                    \K                           # discard text matched so far
                    (?:(?&amp;VALID_ID)\.)*      # optional class identifier prefix
                    (?'VALID_ID'                 # valid identifier, use as subroutine
                        [A-Za-z_]\w*             # valid character combination for identifiers
                    )
                    \s*\(                        # start-of-parameter-list indicator
                    [^)]*                        # parameter list
                    \)                           # end-of-parameter-list indicator
                "
        >
            <functionName>
                <nameExpr expr="(?x)             # free-spacing (see `RegEx - Pattern Modifiers`)
                        (?:(?&amp;VALID_ID)\.)*  # optional class identifier prefix
                        \K                       # discard text matched so far
                        (?'VALID_ID'             # valid identifier, use as subroutine
                            [A-Za-z_]\w*         # valid character combination for identifiers
                        )
                    "
                />
            </functionName>
            <className>
                <nameExpr expr="(?x)             # free-spacing (see `RegEx - Pattern Modifiers`)
                        (?'VALID_ID'             # valid identifier, use as subroutine
                            [A-Za-z_]\w*         # valid character combination for identifiers
                        )
                        (?:\.(?&amp;VALID_ID))*  # optional sub-class-name(s)
                        (?=\.)                   # exclude last (sub-)class-method-name separator
                    "
                />
            </className>
        </function>
    </parser>
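
    As a rough illustration of how the three expressions above divide up one declaration, the fragment below traces a single line from the later sample code through them. It is a hand-worked sketch written as an XML comment, not output from Notepad++, and it assumes each nameExpr is applied to the text matched by mainExpr.

    ```xml
    <!--
      Source line:    Method Blah.Bar.Foo3.Test1()
      mainExpr match: Blah.Bar.Foo3.Test1()   (keyword and whitespace dropped by \K)
      className:      Blah.Bar.Foo3           (everything before the last dot)
      functionName:   Test1                   (identifier after the last dot)
    -->
    ```

    For a name without a dot, such as Blah1, the className expression has nothing to match, because its look-ahead requires a following dot.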
    


  • @MAPJe71

    Thank you so much, that’s brilliant! (And yes, correct assumption regarding base class names.)

    Looking at other parsers I’ve noticed the commentExpr attribute, and I was wondering what that does? Does it allow functions that are commented out to still show up in the function list? Or does it do the opposite?



  • @bookwyrm12

    The commentExpr attribute makes the parser discard any text it matches, so functions inside that text are not listed.
    See also my comment in this topic.



  • @MAPJe71

    So, for example, if this language only has single-line comments, denoted by a single quote at the beginning of the line, like so:

    Method Blah()
        'Single line comment
    End Method
    
    'Commented out method:
    'Method Blah2()
    'End Method
    

    Then it’s unnecessary to add the commentExpr attribute, because the 'Method Blah2() line doesn’t match the function parser anyway (it only matches at the beginning of a line or after leading whitespace, so a line starting with any other character doesn’t match)… right?



  • @bookwyrm12

    because it’s matching on the beginning of a line or leading white space

    That’s the explanation of the (^|\s+)(Function... in the main expression of your parser (presuming the ^ matches at line breaks):
    ^ : match at the beginning of a line
    | : OR
    \s+ : leading white space (e.g. space, tab, newline, carriage return, form feed, vertical tab)
    Function... : followed by the keyword(s)

    But without the commentExpr attribute it can still match commented-out functions, e.g.:

    '   Function ExampleFunction()
        End Function
    

    The ExampleFunction() line does not match ^Function... but it does match the other branch of the alternation, \s+Function..., because white space sits between the single quote and the keyword.

    With commentExpr set to “(?m-s)^'.*$”, the parser discards every line that starts with a single quote and therefore no longer matches the above ExampleFunction().
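
    As a sketch of where that attribute goes: commentExpr sits on the <parser> element itself, next to id and displayName. The fragment below adds it to the parser from the first post and leaves the <function> element as a placeholder; it assumes MOX has no comment syntax other than a single quote at the start of a line.

    ```xml
    <parser
        id         ="mox_syntax"
        displayName="MOX"
        commentExpr="(?m-s)^'.*$"
    >
        <!-- (?m-s): ^ and $ match at line breaks, and . does not cross them, -->
        <!-- so each line starting with a single quote is discarded on its own. -->

        <!-- The existing <function> element goes here, unchanged. -->
    </parser>
    ```

    The single quote needs no escaping here because the attribute value is delimited by double quotes.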

    The explanation of the (?m)^\h*(Function... in the main expression of my parser:
    (?m) : have the ^ match at line breaks
    ^ : match at the beginning of a line, AND (i.e. a single branch, no alternation)
    \h* : optional leading horizontal white space (i.e. space, tab)
    Function... : followed by the keyword(s)
    This prevents the above ExampleFunction() from matching even without the commentExpr attribute set, because the single quote between the start of the line and the keyword is not white space.



  • @MAPJe71

    First of all, thanks so much for all of your help!
    I have a couple more questions:

    1. Does the Notepad++ function list only support one level of hierarchy? (I’m sort of assuming this is true, but if it does support multiple levels that would be awesome!)
      i.e. if I have functions in this sort of format:
      Method Blah()
          [New] dynamicVar = "string"
          [New] dynamicVar2 = 4
          [New] result = Blah.Bar(dynamicVar, dynamicVar2)
      End Method
      
      Function Blah.Bar(pParam3, pParam2)
          Blah.Bar.Foo1
          Return pParam3 && pParam2
      End Function
      
      Method Blah.Bar.Foo1()
      End Method
      
      Method Blah.Bar.Foo2()
      End Method
      
      Macro Blah.Bar.Foo3()
      End Macro
      
      Method Blah.Bar.Foo3.Test1()
      End Method
      
      Method Blah.Bar.Foo3.Test2()
      End Method
      
      …I get a functionlist that looks like this:
      sample.mox
        |--> Blah
              |--> Bar
        |--> Blah.Bar
              |--> Foo1
              |--> Foo2
              |--> Foo3
        |--> Blah.Bar.Foo3
              |--> Test1
              |--> Test2
      
    2. I’m looking into getting this language parser into the Notepad++ functionlist source code. How can I do this?


  • @bookwyrm12

    1. Actually, it supports two levels: class and method/function (using C++ terminology).
    2. Either submit a pull request to the Notepad++ GitHub repository, or I could add it to my Function List Update 3.

    I get: