Trouble making a functionList parser for MATLAB



  • I fail to spot the mistakes that break my <parser> for MATLAB.

    System: Windows7, Notepad++ v7.3.1(32-bit).

    After a great deal of experimenting and cutting down, these are my additions to functionList.xml. (To avoid word-wrap in the preview, I have truncated some comments.)

    Results so far

    m-file function list
    function_scalar_return.m displays “[ out, val ] = function_scalar_return( val )” as expected
    function_multi_return.m displays “[ out, val ] = function_multi_return( val )” as expected
    function_no_return_comments.m displays “function_no_return( val )” and “function_no_return_C2( val )”
    class_basic_no_return.m displays only the filename, “class_basic_no_return.m”; nothing of the content
    class_basic_handle.m displays only the filename, “class_basic_handle.m”; nothing of the content

    With only the <function> block in the <parser>, functionList honors multi-line-comments; “function_no_return_C2( val )” is not shown!

    I have “verified” the effect of all regular expressions with TextFX Quick Find/Replace. Is there a better way? It would be valuable to be able to dump the text-buffer to a log-file after each “operation”.

    I have an awkward feeling that I’m missing something basic and need help.

    Best regards
    Per

    <association langID="44" id="matlab" /> 
    
    <parser 
        id          = "matlab" 
        displayName = "Matlab Node" 
        commentExpr = "(?s:%\{.*?%\})|(?m-s:%.*$)"             
    >
        <classRange
            mainExpr = "(?x)(?s-m)          # '.' matches new line; '^','$' match beginning ...
                        classdef\h+\K       # match until 'classdef' followed by space and ...
                        [A-Za-z][\w_]+      # match valid matlab name
                        .*?$                # match whatever, up to the end of the text buffer
                        "
        >
            <className>
                <nameExpr   
                    expr = "(?x)(?m-s)          # '.' doesn't match new line; '^','$' match ...
                            ^[A-Za-z][\w_]+     # match valid matlab name
                            " 
                />
            </className>
            <function
                mainExpr = "(?x)(?s-m)      # '.' matches new line; '^','$' match beginning ...
                            methods\K       # match until first 'methods' and discard it 
                            .+$             # match whatever, until end of text buffer
                            " 
            >
                <functionName>
                    <funcNameExpr   
                        expr = "(?x)(?m-s)      # '.' doesn't match new line; '^','$' match ...
                                ^\h{2,}         # match at least two leading whitespaces 
                                function        # match 'function'
                                \h+\K           # match whitespaces and discard what's ...
                                [^%;]+          # match whatever, until '%',';' or the ...
                                "
                    /> 
                </functionName>
            </function>
        </classRange>
        <function
            mainExpr = "(?x)(?m-s)      # '.' doesn't match new line; '^','$' match ...
                        ^\h*function    # match 'function' preceded by optional whitespace
                        \h+\K           # match whitespaces and discard what's matched ...
                        .*$             # match the rest of the line
                        "
        >
            <functionName>
                <nameExpr 
                    expr = "(?x)(?m-s)  # '.' doesn't match new line; '^','$' match begin...
                            [A-Za-z[]  # match the first char of a valid function signature  
                            [^%;]*+     # match whatever, until '%',';' or the end of the line
                            "
                />
            </functionName>
        </function>
    </parser>
    

    Some MATLAB files, which I use for testing

    function_scalar_return.m

    function    [ out, val ] = function_scalar_return( val )
        out = 17;
    end        
    

    function_multi_return.m

    function    [ out, val ] = function_multi_return( val )
        out = 17;
    end        
    

    function_no_return_comments.m

    function    function_no_return_comments( val )
        [~]=17;
    end        
    % function    function_no_return_C1( val )
    %     [~]=17;
    % end        
    %{
    function    function_no_return_C2( val )
        [~]=17;
    end        
    %}
    

    class_basic_no_return.m

    classdef    class_basic_no_return
        methods
            function    no_return( this )
            end        
        end
    end
    

    class_basic_handle.m

    classdef    class_basic_handle <  handle
        properties
            prop
        end
        methods
            function this = basic_handle( val )
                this.prop = val;
            end
            function    no_return( this )
            end        
            function    out = scalar_return( this )
                out = 17;
            end
        end
    end


  • Now I have updated to Notepad++ v7.3.3 (32-bit) and pasted my Matlab stuff into the new functionList.xml. However, that didn’t change anything regarding the behavior of my <parser> :-(

    Best Regards
    Per



  • Nice job!
    I’ll have a look.



  • @Per-Isakson

    This should take care of displaying the classes as branches in the tree. I haven’t figured out the block comment problem yet, i.e. function_no_return_C2( val ) being displayed incorrectly.

    <parser
        displayName="Matlab Node"
        id         ="matlab"
        commentExpr="(?x)                        # Utilize inline comments (see `RegEx - Pattern Modifiers`)
                        (?s:%\{.*%\})            # Multi Line Comment
                    |	(?m-s:%.*$)              # Single Line Comment
                    "
    >
        <classRange
            mainExpr="(?x)                       # Utilize inline comments (see `RegEx - Pattern Modifiers`)
                    (?m)                         # ^ and $ match at line breaks
                    ^classdef\h+                 # start-of-class indicator at start-of-line
                    (?s:.*?)                     # whatever, until...
                    ^end                         # end-of-class indicator at start-of-line
                "
        >
            <className>
                <nameExpr expr="(?x)             # Utilize inline comments (see `RegEx - Pattern Modifiers`)
                        classdef\h+              # start-of-class indicator
                        \K                       # keep the text matched so far, out of the overall match
                        [A-Za-z]\w*              # valid identifier i.e. class name
                    "
                />
            </className>
            <function
                mainExpr="(?x)                   # Utilize inline comments (see `RegEx - Pattern Modifiers`)
                        (?m)                     # ^ and $ match at line breaks
                        ^(?'INDENT'\h{2,})       # at least two whitespaces at start-of-line; store for back-reference
                        function\h+              # start-of-function indicator
                        \K                       # keep the text matched so far, out of the overall match
                        (?s:.*?)                 # whatever, until...
                        ^\k'INDENT'end\b         # end-of-function indicator with equal indent
                    "
            >
                <functionName>
                    <funcNameExpr expr="(?x)     # Utilize inline comments (see `RegEx - Pattern Modifiers`)
                            (?m-s)               # ^ and $ match at line breaks; dot does not
                            .*                   # whatever, up till...
                            (?=;|$)              # semi-colon or end-of-line
                        "
                    />
                </functionName>
            </function>
        </classRange>
        <function
            mainExpr="(?x)                       # Utilize inline comments (see `RegEx - Pattern Modifiers`)
                    (?m-s)                       # ^ and $ match at line breaks; dot does not
                    ^\h*                         # optional leading whitespace at start-of-line
                    function\h+                  # start-of-function indicator
                    \K                           # keep the text matched so far, out of the overall match
                    .*$                          # match the rest of the line
                "
        />
    </parser>
    

    @Claudia-Frank created a Python script to be used with the PythonScript plugin. It enables you to test a regular expression in Notepad++ jumping through every match.
    I’ve used RegEx Tester until I bought a license for RegexBuddy.

    For “class” definitions the main expression has to match the complete class i.e. including body.

    The parser first determines all the comment regions. Skipping these comment regions it searches for regions matching the class expression. Any remaining regions (i.e. not matching comment or class) are searched for (global) functions (or methods defined outside a class).
    Each class region is searched for its class name and methods.
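
    The order of operations described above can be imitated outside Notepad++. The following is only a rough Python sketch, not the Function List engine itself: the four expressions and the helper names (`blank_out`, `function_list`) are simplified, hypothetical stand-ins, and since Python's `re` has neither `\h` nor `\K`, `[ \t]` and capture groups are used instead.

```python
import re

# Hypothetical, simplified stand-ins for the parser's expressions.
COMMENT = re.compile(r"%\{.*?%\}|%[^\n]*", re.S)
CLASS_RANGE = re.compile(r"^classdef[ \t]+.*?^end\b", re.M | re.S)
CLASS_NAME = re.compile(r"classdef[ \t]+([A-Za-z]\w*)")
FUNCTION = re.compile(r"^[ \t]*function[ \t]+([^%;\n]*)", re.M)


def blank_out(text, spans):
    """Replace each (start, end) span with spaces, keeping newlines and offsets."""
    chars = list(text)
    for start, end in spans:
        for i in range(start, end):
            if chars[i] != "\n":
                chars[i] = " "
    return "".join(chars)


def function_list(source):
    """Return (classes, functions); classes maps a class name to its methods."""
    # Step 1: determine the comment regions and exclude them from later searches.
    code = blank_out(source, [m.span() for m in COMMENT.finditer(source)])
    # Step 2: find the class regions; each is searched for its name and methods.
    classes, class_spans = {}, []
    for m in CLASS_RANGE.finditer(code):
        class_spans.append(m.span())
        body = m.group(0)
        name = CLASS_NAME.search(body).group(1)
        classes[name] = [f.group(1).strip() for f in FUNCTION.finditer(body)]
    # Step 3: what is neither comment nor class is searched for global functions.
    rest = blank_out(code, class_spans)
    functions = [f.group(1).strip() for f in FUNCTION.finditer(rest)]
    return classes, functions


SAMPLE = """classdef    class_basic_handle <  handle
    methods
        function    no_return( this )
        end
    end
end
%{
function    commented_out( val )
end
%}
function    helper( val )
end
"""

classes, functions = function_list(SAMPLE)
print(classes)
print(functions)
```

    In this sketch the class expression has to cover the whole class body, otherwise the method `no_return` would be picked up in step 3 as a global function instead of appearing under its class.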



  • @Per-Isakson

    just in case you are looking for the script.
    Here is the more user-friendly copy-and-paste version (not split)

    Cheers
    Claudia



  • @MAPJe71

    Thank you very much for your prompt and detailed answer. Your code works perfectly. Together with your explanation, it helped me better understand the working of functionList, which in turn effectively helped me to spot my mistakes.



  • @Claudia-Frank

    Thank you for the link.

    How can I earn two points to be allowed to send two posts directly after each other?

    Cheers
    Per



  • @MAPJe71
    Will you take care of creating a functionlist PR for matlab out of this for N++?



  • @Per-Isakson glad to hear it works!

    Your code works perfectly.

    Does this mean the “block comment problem” is no longer an issue for you?

    @chcg Yes, I intend to include it in PR “Function List Update 3”.



  • @MAPJe71

    The “block comment problem” is still there, but it is NOT an issue to me. This problem exists even in the MATLAB IDE. Commenting out code is an anti-pattern. Thus, put it near the bottom of your priority list.

    “Yes, I intend to include it in PR “Function List Update 3””

    I like that! However, this MATLAB <parser> is an exercise. I guess, it will not be very helpful in serious work. Let me make a more complete version.

    A “full” <parser> as the first step was obviously too big an undertaking for me. Now I intend to make a “full” <parser>. However, I see some problems.

    • MATLAB syntax allows continuation lines. Any line may be split into two or more lines. I have failed to handle this with MATLAB regular expressions. I’m not tempted to try again. (Minor issue.)
    • MATLAB syntax allows definition of class methods (/functions) in separate files. I don’t use that, but it is frequently used by The Mathworks and many users. I cannot see any possibility to handle this.
    • Indentation is not significant; end is used to close nearly all blocks; and ordinary functions may be defined at the end of class definition files. Together these three pose a major challenge. (My ^\h{2,} was a quick and dirty way to handle this.)

    I think, I will use the functionList in two different scenarios, which ask for different versions of the <parser>:

    • Navigation in code, which I know fairly well, e.g. my own old code. In this case I would prefer the functionList to show names only, which is cleaner and fits in a narrow pane.
    • Analysis of code, which I download from the Internet. In this case I think the full signature is more useful. (Like in my current exercise.)

    Conclusion: I’ll be back with a new version of the <parser>.



  • @Per-Isakson

    The “block comment problem” … put it near the bottom of your priority list.

    For me it’s not only a MATLAB parser problem; it’s more a FunctionList engine problem …

    1. why does it work for other parsers (or at least seems to)?
    2. why does the comment expression work correctly with MATLAB source in RegexBuddy and with ClaudiaFrank’s RegexTester script in Notepad++ but apparently not with Notepad++ Function List engine?
    3. why does the comment expression work correctly when the classRange part of the parser is omitted?

    this MATLAB <parser> is an exercise … Let me make a more complete version.

    I expect “Function List Update 3” PR not to be the last one, so it’s possible to include it in a later update/PR or you could do your own PR.
    Either way I’d appreciate it if you kept me in the loop :)

    MATLAB syntax allows continuation lines.

    Could you give an example?

    MATLAB syntax allows definition of class methods (/functions) in separate files.

    Not supported as FunctionList operates per file.

    Indentation is not significant …

    That will make it a challenge to determine the end of a class definition! Determining the end of a function/method definition may not be necessary.

    I’ll be back with a new version of the <parser>.

    Looking forward to it!



  • @MAPJe71

    The “block comment problem” disappeared as a result of changes of the <parser>, which have no obvious coupling to comments. Strange!

    MATLAB syntax allows continuation lines. Could you give an example?

    function ...
        WeirdUseOfContinuationLine ( ...
        str )
        disp( str )
        % This runs as expected.
    end
    

    I guess this could be handled with regex, but I will not try.
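
    One way to picture a regex treatment of continuation lines is to join them before looking for the signature. The Python sketch below is only an illustration under assumptions: the names `join_continuation_lines` and `signatures` are made up, and it ignores the case of `...` occurring inside a string literal or a comment. Joining lines also changes character offsets, so this is a pre-processing idea rather than something functionList.xml can express directly.

```python
import re

# '...' continues a statement on the next line; MATLAB ignores the rest of
# the line after the ellipsis, so that text is dropped as well.
CONTINUATION = re.compile(r"[ \t]*\.\.\.[^\n]*\n[ \t]*")
SIGNATURE = re.compile(r"^[ \t]*function[ \t]+([^%;\n]*)", re.M)


def join_continuation_lines(source):
    """Replace every '...' line break with a single space."""
    return CONTINUATION.sub(" ", source)


def signatures(source):
    """Return the function signatures found after joining continuation lines."""
    joined = join_continuation_lines(source)
    return [m.group(1).strip() for m in SIGNATURE.finditer(joined)]


SAMPLE = """function ...
    WeirdUseOfContinuationLine ( ...
    str )
    disp( str )
end
"""

print(signatures(SAMPLE))
```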



  • @Per-Isakson

    The “block comment problem” disappeared as a result of changes of the <parser>

    Strange indeed.
    I presume you applied the changes, could you post the updated parser (even if it’s not the final one)?

    I guess this could be handled with regex

    Possibly.

    but I will not try.

    Wise, keep it as a challenge for another revision :-)



  • Hello @guy038 and @MAPJe71

    Thank you for your answers, especially on Nested Pairs and recursion. Now I have a <parser>, which doesn’t rely on the indentation of the Matlab code and produces correct results for my first set of test files. However, keywords embedded in comments and in string values make the <parser> fail.

    A new <parser> and some test files

    <parser 
        id          = "matlab" 
        displayName = "Matlab Node" 
        commentExpr =  "(?x)                    # Utilize inline comments (see `RegEx - Pattern Modifiers`)
                        (?s:%\{.*%\})           # Multi Line Comment
                        |(?m-s:\h*%.*$)         # Single Line Comment
                        "
    >    
        <classRange
            mainExpr = "(?x)(?s)                                    # dot matches new line
                        (?-i)                                       # case sensitive
                        (                                           # --- open 1st group 
                            \b                                      # word boundary  
                            (                                       # --- open 2nd group
                                    classdef                        # keywords that open a 
                                |   e(?:numeration|vents)           # Balanced Construct
                                |   f(?:or|unction)                 # that is closed by 
                                |   if                              # 'end'.
                                |   methods                         #
                                |   p(?:arfor|roperties)            #
                                |   switch                          #
                                |   try                             #
                                |   while                           #
                            )                                       # --- close 2nd group
                            \b                                      # word boundary 
                        )                                           # --- close 1st group
                                            
                        (                                           # open 3rd capturing group    
                            (?:                                     # open non-capturing group
                                (?!                                 # negative look-ahead; if not a keyword
                                    \b                              # word boundary  
                                    (                               # --- open 4th group
                                            classdef                # keywords, of the 2nd group + 'end',
                                        |   e(?:numeration|vents)   # which must NOT occur,
                                        |   f(?:or|unction)         # at ANY position, of the
                                        |   if                      # present sub-block, till
                                        |   methods                 # its associated 'end' closing word
                                        |   p(?:arfor|roperties)    #
                                        |   switch                  #
                                        |   try                     #
                                        |   while                   #
                                        |   end                     #
                                    )                               # --- close 4th group
                                    \b                              # word boundary 
                                ).                                  # then one character
                            )+                                      # repeat 
                            |                                       # until keyword found
                            (?R)                                    # recurse RE from start
                        )+                                          # repeat 3rd group
                        \bend\b                                     #      
                        "
        >
            <className>
                <nameExpr 
                    expr = "(?x)                    # ok, 10:07 2017-03-25
                            ^\h*                    # optional whitespace
                            classdef                # start-of-class indicator
                            \h+                     # whitespace
                            (                       #
                                \([^\)]*?\)         # (ClassAttributes)
                                \h+                 # whitespace
                            )?                      #
                            \K                      # discard what's matched so far 
                            [A-Za-z]\w*             # valid identifier, i.e. class name
                            "
                />
            </className>
            <function
                mainExpr = "(?x)                    # ok, 10:22 2017-03-25, [2]
                            (?m)                    # ^ and $ match at line breaks
                            (   
                                (^\h*)              # optional whitespace starting line
                                |                   # or
                                ([,;]\h*)           # separator before 'function'
                            )       
                            function                # start-of-function indicator
                            \h+                     # whitespace
                            \K                      # discard what's matched so far 
                            (                       # open optional left hand side, LHS
                                (                   #
                                    (\w+)           # single variable name
                                    |               # or
                                    (\[[\w,\h]*])  # one or more variable names in brackets 
                                )                   #
                                \h*=\h*             # 
                            )?                      # close optional left hand side
                            [A-Za-z]\w*\b           # function name
                            \h*                     #      
                            (                       # open optional argument list
                                (\([\h,\w~]*\))     # pair of parentheses with optional arguments 
                                |                   # or
                                (                   #  
                                    \([\h,\w~]*?    # opening parenthesis and optional arguments
                                    (?=\h*\.{3})    # and closed by continuation line
                                )                   #
                            )?                      # close optional argument list
                            "
            >
                <functionName>
                    <funcNameExpr 
                        expr = "(?x)                # ok, 12:07 2017-03-25
                                (?m)                # ^ and $ match at line breaks
                                (?-s)               # single-line  
                                (                   # open optional left hand side, LHS
                                ((\w+)|             # single variable name
                                (\[[\w,\h]*]))     # one or more variable names in brackets 
                                \h*=\h*             # 
                                )?                  # close optional left hand side
                                [A-Za-z]\w*\b       # function name
                                \h*                 #      
                                (                   # open optional argument list
                                    (\([\h,\w~]*\)) # pair of parentheses with optional arguments
                                    |               # or
                                    (\([\h,\w~]*?   # lazy to avoid trailing whitespace 
                                    (?=\h*\.{3}))   # continuation line
                                )?                  # close optional argument list
                        "
                    />
                </functionName>
            </function>
        </classRange>
        <function
            mainExpr = "(?x)            # [3]
                        ^\h*            # optional whitespace
                        function        # start-of-function indicator 
                        \h+             # whitespaces
                        (?-s:.*)        # the rest of the line
                        "
        >
            <functionName>
                <nameExpr 
                    expr = "(?x)                    #
                            function                # start-of-function indicator 
                            \h+                     # whitespace        
                            \K                      # discard what's matched so far
                            (                       # open optional left hand side, LHS
                                (                   #
                                    (\w+)           # single variable name
                                    |               # or
                                    \[[\w,\h]*]    # one or more variable names in brackets 
                                )                   #
                                \h*=\h*             # 
                            )?                      # close optional left hand side
                            [A-Za-z]\w*             # function name
                            \b\h*                   #      
                            (                       # open optional input argument list
                                \([\h,\w~]*\)       # pair of parentheses with optional arguments
                                |                   # or     
                                (                   #
                                    \([\h,\w]*?     # lazy to avoid trailing whitespace 
                                    (?=\h*\.{3})    # 
                                )                   #
                            )?                      # close optional argument list
                            "
                />
            </functionName>
        </function>
    </parser>
    

    function_with_keyword_in_comment.m. The <parser> fails with this test file. Only the file name is displayed in Function List.

    function    [ out, val ] = function_with_keyword_in_comment( val )
        out = 17;
        % comment with keyword: while
    end
    

    class_with_keyword_in_block_comment.m. The <parser> fails with this test file. Only the file name is displayed in Function List.

    classdef    class_with_keyword_in_block_comment
        methods
            function    no_return( this )
                %{
                    comment with keyword: while
                %}
                a = 17;
            end        
        end
    end
    

    class_with_keyword_in_string.m. The <parser> fails with this test file. Only the file name is displayed in Function List.

    classdef    class_with_keyword_in_string
        methods
            function    no_return( this )
                str = 'string with keyword: while'; 
                a = 17;
            end        
        end
    end
    

    OneLineClasses.m. The <parser> displays the expected result in Function List

    classdef OneLineClasses,methods,function OneLineMethod(~,str),disp(str),end,function [q]=OneLineMethod(str),disp(str),end,end,end
    

    Update the editor window first and then update the Function List

    Problem: “Browsing” through open files, Next Tab and Previous Tab, is halted for seconds by files that the <parser> fails to process.
    Proposal: Let Next Tab and Previous Tab interrupt the <parser> (if working), update the editor window, and then update Function List. While the <parser> is working the update icon could rotate to indicate ongoing work.

    More than one Matlab <parser>

    In some use cases I would like to show the full function signature in the Function List and in others only the function name. It’s straightforward to modify the current <parser> to only display function names. However, I cannot think of a good way to switch between the two variants.

    Keywords embedded in comments and in string values

    The problem with keywords in comments and string values makes the current <parser> nearly worthless for practical work. I have many “for” and “end” in my comments and string values.

    There is a simple solution: replace parser.classRange.mainExpr with something simple that requires files to honor the recommended format, e.g. (?m)^classdef\h+(?s:.*?)^end\b. That’s good enough for me. And I have done a useful exercise with RE.
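
    To see what the simple expression buys and what it costs, here is a small Python sketch. It assumes `[ \t]` as a replacement for `\h` (which Python's `re` lacks); everything else is the expression quoted above. The test strings are shortened versions of the test files in this post.

```python
import re

# The simple class expression, with [ \t] standing in for \h.
CLASS_RANGE = re.compile(r"(?m)^classdef[ \t]+(?s:.*?)^end\b")

FORMATTED = """classdef    class_with_keyword_in_string
    methods
        function    no_return( this )
            str = 'string with keyword: while';
            a = 17;
        end
    end
end
function helper()
end
"""

ONE_LINE = (
    "classdef OneLineClasses,methods,"
    "function OneLineMethod(~,str),disp(str),end,end,end\n"
)

# Indented 'end' lines are skipped; the class stops at the first 'end' in column 0.
class_text = CLASS_RANGE.search(FORMATTED).group(0)
print(class_text)

# A class written on one line has no 'end' at the start of a line, so no match.
print(CLASS_RANGE.search(ONE_LINE))
```

    The keyword inside the string no longer matters, because nothing is counted; the price is that only files with the closing end of the class in column 0 (and all inner end keywords indented) are recognized.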



  • Update the editor window first and then update the Function List

    I suspect the recursive part of classRange.mainExpr is giving the RE engine a hard time :-)

    More than one Matlab <parser>

    Based on current Function List implementation I can think of three solutions: define a <parser> for each use-case and switch between them by …

    1. changing the association; this requires a restart of NP++ to take effect;
    2. swapping their parser id, this also requires a restart of NP++ to take effect;
    3. utilizing file extension association and changing the extension of the file.

    Keywords embedded in comments and in string values

    The trick here is to handle string literals as comment regions.
    e.g.

    commentExpr="(?x)                                    # free spacing
    				(?s:(?s:%\{.*%\})                    # Multi Line Comment
    			|	(?m-s:%.*$)                          # Single Line Comment
    			|	(?s:\x22(?:[^\x22\x5C]|\x5C.)*\x22)  # String Literal - Double Quoted
    			|	(?s:\x27(?:[^\x27\x5C]|\x5C.)*\x27)  # String Literal - Single Quoted
    			"
    


  • @MAPJe71, thanks for your answer.

    The trick here is to handle string literals as comment regions.

    Yes, but I fail to make it work. I copy&pasted your commentExpr value into the <parser> that I posted yesterday. None of the “keyword in comment and string” test files give the expected result in the Function List, i.e. the same result as yesterday.

    Speculation: commentExpr has no effect in my <parser>.

    1. The “block-comment-problem” is back, i.e. function_no_return_C2 is displayed
    2. With classRange.mainExpr using recursion, skipping string values and comments is needed.


  • @Per-Isakson

    There is definitely something weird going on with the comment expression. The single line comment and both double and single quoted string literals are detected just perfectly but the multi line comment keeps failing.

    Oh just noticed that the comment expression I posted has a superfluous (?s: at the start of the “Multi Line Comment”-part.
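
    For testing outside Notepad++, the idea of treating string literals as comment regions can be tried with a few lines of Python. This is a sketch under assumptions, not what the Function List engine does internally: the superfluous `(?s:` is dropped, the multi-line part uses a lazy `.*?`, the quotes are written literally instead of `\x22`/`\x27`, and `mask` is a made-up helper name. Note also that MATLAB escapes a quote by doubling it and uses `'` as the transpose operator, neither of which this expression accounts for.

```python
import re

COMMENT_OR_STRING = re.compile(
    r"""
      (?s:%\{.*?%\})            # multi-line comment (lazy, an assumption)
    | (?m:%.*$)                 # single-line comment
    | (?s:"(?:[^"\\]|\\.)*")    # string literal, double quoted
    | (?s:'(?:[^'\\]|\\.)*')    # string literal, single quoted
    """,
    re.VERBOSE,
)
KEYWORD = re.compile(r"\b(?:while|for|end)\b")


def mask(source):
    """Blank out comments and string literals, keeping line breaks and offsets."""
    def blank(match):
        return re.sub(r"[^\n]", " ", match.group(0))
    return COMMENT_OR_STRING.sub(blank, source)


SOURCE = """function    [ out, val ] = f( val )
    out = 17;
    str = 'string with keyword: while';
    % comment with keyword: end
    %{
      block comment with keyword: for
    %}
end
"""

masked = mask(SOURCE)
print(masked)
print(KEYWORD.findall(masked))
```

    After masking, the only keyword left in this sample is the closing end of the function, which is what a block-matching expression needs to see.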



  • @MAPJe71

    The single line comment and both double and single quoted string literals are detected just perfectly …

    That is not how I interpret what I see on my side. I have no debugging means to see how NPP uses the value of commentExpr. (Or I missed it.) Thus, I resort to indirect tests.

    With

    • your new value of commentExpr (corrected),
    • classRange.mainExpr using recursion,
    • the Matlab files attached and
    • Notepad++ v7.3.3 (32-bit), Build time : Mar 8 2017 - 03:37:37, Admin mode : OFF, Local Conf mode : OFF, OS : Windows 7 (64-bit), Plugins : …

    Function List displays a row of dots, a document icon and the file name. The evaluation, which obviously fails, takes a few seconds.

    With the keyword, while, replaced by _while the Function List displays the expected results without delay.
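
    A possible reading of this observation, offered as a guess rather than a confirmed diagnosis: if the comment is not skipped, the recursive expression sees one more block-opening keyword than there are end keywords, so no balanced match exists. The Python sketch below only counts keywords (the helper name `balance` is made up); it does not reproduce the recursion.

```python
import re

# The block-opening keywords listed in classRange.mainExpr, and the closing 'end'.
OPENERS = re.compile(
    r"\b(?:classdef|enumeration|events|for|function|if|methods"
    r"|parfor|properties|switch|try|while)\b"
)
CLOSER = re.compile(r"\bend\b")


def balance(source):
    """Number of block-opening keywords minus the number of 'end' keywords."""
    return len(OPENERS.findall(source)) - len(CLOSER.findall(source))


WITH_KEYWORD = """function    [ out, val ] = function_with_keyword_in_comment( val )
    out = 17;
    % comment with keyword: while
end
"""
WITHOUT_KEYWORD = WITH_KEYWORD.replace("while", "_while")

print(balance(WITH_KEYWORD))     # one opener more than 'end' keywords
print(balance(WITHOUT_KEYWORD))  # balanced
```

    With _while there is no word boundary before the keyword, so it is not counted and the openers and closers balance again.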

    Do you see a different behavior?

    I’m out of ideas; are there other tests I could do?

    function    [ out, val ] = function_with_keyword_in_string( val )
        out = 17;
        str = 'string with keyword: while'; 
    end
    
    function    [ out, val ] = function_with_keyword_in_comment( val )
        out = 17;
        % comment with keyword: while
    end
    
    classdef    class_with_keyword_in_comment
        methods
            function    no_return( this )
                % comment with keyword: while
                a = 17;
            end        
        end
    end
    
    classdef    class_with_keyword_in_string
        methods
            function    no_return( this )
                str = 'string with keyword: while'; 
                a = 17;
            end        
        end
    end
    


  • This is what I get …

    with the following <parser> …

    <parser
    	displayName="MATLAB - MATrix LABoratory"
    	id         ="matlab_syntax"
    	commentExpr="(?x)                                    # free-spacing (see `RegEx - Pattern Modifiers`)
    					(?'MLC':                             # Multi Line Comment
    						%\{                              # ...start-of-comment indicator
    						(?:                              # ...followed by zero or more characters
    							[^%]                         # ...not start of start-of-comment indicator
    						|	%(?![%{}])                   # ...not being an SLC or a start- or end-of-comment indicator
    						)*?
    						%\}                              # ...end-of-comment indicator
    					)
    				|	(?m-s:\x25.*$)                       # Single Line Comment (SLC)
    				|	(?s:\x22(?:[^\x22\x5C]|\x5C.)*\x22)  # String Literal - Double Quoted
    				|	(?s:\x27(?:[^\x27\x5C]|\x5C.)*\x27)  # String Literal - Single Quoted
    				"
    >
    	<classRange
    		mainExpr    ="(?x)                               # free-spacing (see `RegEx - Pattern Modifiers`)
    				(?s-i)                                   # dot matches at line breaks, case-sensitive
    				\bclassdef\b                             # start-of-class indicator
    				.*?                                      # whatever, until...
    				\bmethods\b                              # ...start-of-class-body indicator
    			"
    		openSymbole ="(?x)                               # free-spacing (see `RegEx - Pattern Modifiers`)
    				\b                                       # ensure leading word boundary for start-of-block indicator
    				(?-i:                                    # case-sensitive start-of-block indicators
    					e(?:numeration|vents)
    				|	f(?:or|unction)
    				|	if
    				|	methods
    				|	p(?:arfor|roperties)
    				|	switch
    				|	try
    				|	while
    				)
    				\b
    			"
    		closeSymbole="(?x)                               # free-spacing (see `RegEx - Pattern Modifiers`)
    				\b                                       # ensure leading word boundary for end-of-block indicator
    				(?-i:                                    # case-sensitive
    					end                                  # end-of-block indicator
    				)
    				\b                                       # ensure trailing word boundary for end-of-block indicator
    			"
    	>
    		<className>
    			<nameExpr expr="(?x)                         # free-spacing (see `RegEx - Pattern Modifiers`)
    					\b(?-i:classdef)                     # case-sensitive start-of-class indicator
    					\h+                                  # required whitespace
    					(?:\([^)]*?\)\h+)?                   # optional class-attributes
    					\K                                   # discard text matched so far
    					[A-Za-z]\w*                          # valid character combination for identifiers i.e. class name
    				"
    			/>
    		</className>
    		<function
    			mainExpr="(?x)                               # free-spacing (see `RegEx - Pattern Modifiers`)
    					(?ms-i)                              # ^, $ and dot match at line breaks, case-sensitive
    					(?:	^                                # a function can be found at start-of-line
    					|	[,;]                             # ...or after a separator
    					)\h*                                 # ...optionally followed by whitespace
    					\K                                   # discard text matched so far
    					\bfunction                           # ensure word boundary for start-of-function indicator
    					\s+                                  # required whitespace separator
    					.*?                                  # whatever, until...
    					\bend\b                              # ...the first end-of-block indicator
    				"
    		>
    			<functionName>
    				<funcNameExpr expr="(?x)                 # free-spacing (see `RegEx - Pattern Modifiers`)
    						function\s+                      # start-of-function indicator
    						(?:\.{3}\s+)?                    # optional continuation-line indicator
    						\K                               # discard text matched so far
    						(?:                              # optional return value(s)
    							(?:	\w+                      # ...single variable name
    							|	\[[\h,\w]*]             # ...or one or more variable names in brackets
    							)\h*=\h*                     # ...followed by a separator with optional whitespace
    						)?
    						[A-Za-z]\w*                      # valid character combination for identifiers i.e. function name
    						\b                               # ensure word boundary for name
    						(?:                              # optional parameter list
    							\s*                          # ...optional leading whitespace
    							\(                           # ...start-of-parameter-list indicator
    							[\h,\w~]*                    # ...with optional parameters
    							(?:	\)                       # ...until end-of-parameter-list indicator
    							|	\.{3}                    # ...or continuation-line indicator
    							)
    						)?
    					"
    				/>
    				<!-- uncomment the following node to display the method without its parameters -->
    <!--				<funcNameExpr expr="(?:(?:\w+|\[[\h,\w]*])\h*=\h*)?[A-Za-z]\w*" /> -->
    			</functionName>
    		</function>
    	</classRange>
    	<function
    		mainExpr="(?x)                                   # free-spacing (see `RegEx - Pattern Modifiers`)
    				(?ms-i)                                  # ^, $ and dot match at line breaks, case-sensitive
    				^\h*                                     # optional leading whitespace at start-of-line
    				function\b                               # start-of-function indicator
    				.*?                                      # whatever, until...
    				\bend\b                                  # ...the first end-of-block indicator
    			"
    	>
    		<functionName>
    			<nameExpr expr="(?x)                         # free-spacing (see `RegEx - Pattern Modifiers`)
    					function\s+                          # start-of-function indicator
    					(?:\.{3}\s+)?                        # optional continuation-line indicator
    					\K                                   # discard text matched so far
    					(?:                                  # optional return value(s)
    						(?:	\w+                          # ...single variable name
    						|	\[[\h,\w]*]                 # ...or one or more variable names in brackets
    						)\h*=\h*                         # ...followed by a separator with optional whitespace
    					)?
    					[A-Za-z]\w*                          # valid character combination for identifiers i.e. function name
    					\b                                   # ensure word boundary for name
    					(?:                                  # optional parameter list
    						\s*                              # ...optional leading whitespace
    						\(                               # ...start-of-parameter-list indicator
    						[\h,\w~]*                        # ...with optional parameters
    						(?:	\)                           # ...until end-of-parameter-list indicator
    						|	\.{3}                        # ...or continuation-line indicator
    						)
    					)?
    				"
    			/>
    			<!-- uncomment the following node to display the function without its parameters -->
    <!--			<nameExpr expr="(?:(?:\w+|\[[\h,\w]*])\h*=\h*)?[A-Za-z]\w*" /> -->
    		</functionName>
    	</function>
    </parser>
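
To check roughly what the top-level `nameExpr` above would display without restarting Notepad++, the pattern can be approximated in Python. This is only a sketch under stated assumptions: Python's `re` module has neither `\h` nor `\K`, so `\h` is written as `[ \t]` and the text that `\K` would keep is captured in group 1 instead. Boost regex may behave differently in edge cases, and the `display_name` helper is hypothetical.

```python
import re

# Approximation of the thread's nameExpr for Python's re module:
#   \h -> [ \t]  (horizontal whitespace)
#   \K -> the displayed text is captured in group 1 instead
NAME_EXPR = re.compile(
    r"""
    function\s+                     # start-of-function indicator
    (?:\.{3}\s+)?                   # optional continuation-line indicator
    (                               # group 1: what \K would leave as the match
        (?:                         # optional return value(s)
            (?: \w+                 # ...single variable name
            |   \[[ \t,\w]*\]       # ...or variable names in brackets
            )[ \t]*=[ \t]*          # ...followed by '=' with optional whitespace
        )?
        [A-Za-z]\w*                 # function name
        \b                          # word boundary after the name
        (?:                         # optional parameter list
            \s*\(                   # ...start-of-parameter-list indicator
            [ \t,\w~]*              # ...optional parameters
            (?: \) | \.{3} )        # ...until ')' or a continuation-line indicator
        )?
    )
    """,
    re.VERBOSE,
)


def display_name(source):
    """Return the approximate display text for the first function, or None."""
    match = NAME_EXPR.search(source)
    return match.group(1) if match else None


print(display_name("function    [ out, val ] = function_with_keyword_in_string( val )"))
print(display_name("function    no_return( this )"))
```

The first call prints the return values together with the name and parameter list, the second prints only `no_return( this )`, which matches the display format reported at the top of the thread.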
    


  • @MAPJe71 ,
    Your new parser works like a charm. It even handles WeirdUseOfContinuationLine.m. Thank you very much!

