    Perl keywords "class" and "method" not recognised by Function List

    • JohnL22

      PeterJones et al
      Thank you,
      That looks to be what I would expect the Function List to contain.
      I currently use only the simplest syntax of “class” and “method”.
      I agree that ADJUST is very much like BEGIN & END (which I have NOT yet used), so I am undecided about including them in the Function List.
      I will use your updated code, test it out, and confirm it does what I need soon.
      Regards

      • PeterJones @PeterJones

        @PeterJones said in Perl keywords "class" and "method" not recognised by Function List:

        Please note: the bad highlighting on the $) on lines 325 and 326 compared to 316-317 is caused by the underlying lexer from the Lexilla library project. I have already created an issue there to get that resolved; it cannot be fixed in Notepad++ until it’s fixed in Lexilla.

        (This is separate from the Function List behavior, but since I thought you might notice something like that if you used prototypes, I decided to point it out.)

        • guy038

          Hello, @peterjones and All,

          I finally managed to produce a new perl.xml parser file!


          In short:

          • I do NOT use any atomic constructs!

          • In mainExpr of the class range, I do NOT use a named group; I simply use the part ^ (?: package | class ) \b twice!

          • I changed your prototype / signature syntax (?:\([^()]*+\)\s*+)?+ to (?: \( [\x20-\x27\x2A-\x7E]* \) \s* )?

          • I changed your attributes syntax (?:\:[^{]+)?+ to (?: : [\x20-\x7A\x7C-\x7E]+ \s* )?

          So, for these two syntaxes, I simply assumed that only standard ASCII characters are used, from \x20 to \x7E, except for \x28 and \x29 in the first and \x7B in the second! Maybe \t should also be part of each character class!

          • I changed the class-name regex (?x)\s\K[^;{]+ to (?x)\s+\K.+?(?=[;{])

          BTW, my parser currently contains 13 occurrences of \s. Maybe \h, or even [\t\x20], would be more appropriate in some places?

          Also, how many optional parts (?: \w+ :: )* may exist before the mandatory \w+ of the function name?

          Anyway, this is a first draft. As I’m definitely not a Perl expert, I have probably missed a lot!


          So, here is the first version of my perl.xml parser:

          <?xml version="1.0" encoding="UTF-8" ?>
          <!-- ==========================================================================\
          |
          |   To learn how to make your own language parser, please check the following
          |   link:
          |       https://npp-user-manual.org/docs/function-list/
          |
          \=========================================================================== -->
          <NotepadPlus>
          	<functionList>
          		<!-- ======================================================== [ PERL ] -->
          		<!-- Perl - functions and packages, including fully-qualified subroutine names -->
          
          			<parser
          				displayName="Perl"  id="perl_syntax"
          
          				commentExpr="(?x)                                               # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
          							(?m-s:                                              # 'Multi-lines' mode ( ^ and $ match at line-breaks ) / 'Dot' char does NOT match line-breaks
          								\x23 .*                                         #   Single Line Comment ( #................ )
          							)                                                   #
          							|                                                   # OR
          							(?s:                                                # 'Single line' mode (letter s optional as mode set by DEFAULT)
          								__ (?: END | DATA ) __                          #   String '__END__' or '__DATA__' 
          								.*                                              #   ANY character(s), including line-breaks, till...
          								\Z                                              #   Last line-break, included
          							)
          						"
          			>
          				<classRange
          					mainExpr="(?x)                                              # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
          							(?m-i)                                              # 'Multi-lines' mode (^ and $ match at line-breaks) / 'Sensitive case' mode
          							^                                                   # NO leading white-space at start of line
          							(?: package | class ) \b                            # Header : word 'package' or 'class', in LOWER case
          							(?s:                                                # 'Single line' mode (letter s optional as mode set by DEFAULT)
          								.+?                                             #   ANY character(s), including line-breaks, till...
          							)                                                   # Section below, excluded
          							(?=                                                 # Start of look-ahead
          								\s*                                             #   Optional leading white-space of
          								^                                               #   NO leading white-space at start of line 
          								(?: package | class ) \b                        #   Next header : word 'package' or 'class', in LOWER case
          							|                                                   # OR
          								\Z                                              #   last line-break
          							)                                                   # End of look-ahead
          						"
          				>
          					<className>
          						<nameExpr expr="(?x)                                    # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
          										\s+                                     # Leading white-space(s)
          										\K                                      # Discard text matched so far
          										.+?                                     # ANY character(s) till...
          										(?= [;{] )                              # First semi-colon or left brace, excluded
          									"
          						/>
          					</className>
          					<function
          						mainExpr="(?x)                                          # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
          								(?m-i)                                          # 'Multi-lines' mode (^ and $ match at line-breaks) / 'Sensitive case' mode
          								^ \h*                                           # Optional leading spaces or tabulations
          								(?: sub | method ) \b                           # Word 'sub' or 'method', in LOWER case 
          								\s+                                             # White-space character(s)
          								(?: \w+ :: )*                                   # Optional list of words EACH followed with ::
          								\w+                                             # Word character(s)
          								\s*                                             # Optional white-space character(s)
          								(?: \( [\x20-\x27\x2A-\x7E]* \) \s* )?          # Optional Prototype or Signature section
          								(?: : [\x20-\x7A\x7C-\x7E]+ \s* )?              # Optional Attributes section
          								\{                                              # Start of function body
          							"
          					>
          						<functionName>
          							<funcNameExpr expr="(?x)                            # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
          												(?: sub | method )              # Word 'sub' or 'method', in LOWER case
          												\s+                             # White-space character(s)
          												\K                              # Discard text matched, so far (move this line right before \w+ if 'prefix::' part NOT desired)
          												(?: \w+ :: )*                   # Optional prefix:: part ( package:: / names:: )
          												\w+                             # Word character(s)
          											"
          							/>
          						</functionName>
          					</function>
          				</classRange>
          				<function
          					mainExpr="(?x)                                              # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
          							(?m-i)                                              # 'Multi-lines' mode (^ and $ match at line-breaks) / 'Sensitive case' mode
          							^ \h*                                               # Optional leading spaces or tabulations
          							(?: sub | method )                                  # Word 'sub' or 'method', in LOWER case 
          							\s+                                                 # White-space character(s)
          							(?: \w+:: )*                                        # Optional list of words, EACH followed with ::
          							\w+                                                 # Word character(s)
          							\s*                                                 # Optional white-space character(s)
          							(?: \( [\x20-\x27\x2A-\x7E]* \) \s* )?              # Optional Prototype or Signature section
          							(?: : [\x20-\x7A\x7C-\x7E]+ \s* )?                  # Optional Attributes section
          							\{                                                  # Start of function body
          						"
          				>
          					<functionName>
          						<nameExpr expr="(?x)                                    # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
          										(?: sub | method )                      # Word 'sub' or 'method', in LOWER case
          										\s+                                     # White-space character(s)
          										\K                                      # Discard text matched, so far (move this line right before \w+ if 'prefix::' part NOT desired)
          										(?: \w+ :: )*                           # Optional prefix:: part ( package:: / names:: )
          										\w+                                     # Word character(s)
          									"
          						/>
          					</functionName>
          					<className>
          						<nameExpr expr="(?x)                                    # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
          										(?: sub | method )                      # Word 'sub' or 'method', in LOWER case
          										\s+                                     # White-space character(s)
          										\K                                      # Discard text matched, so far
          										\w+                                     # Word character(s)
          										( :: \w+ )*                             # Optional list of words, EACH preceded with ::
          										(?= :: \w )                             # Till a last string ':: + word char' excluded
          									"
          						/>
          					</className>
          				</function>
          			</parser>
          	</functionList>
          </NotepadPlus>
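
The classRange mainExpr in the parser above can be tried outside Notepad++ with a minimal sketch like the following. It is only an approximation: it assumes Python's re module as a stand-in for the Boost regex engine that Notepad++ uses, Python's \Z (end of text only) in place of Boost's \Z, and the sample text and the helper name class_ranges are invented for illustration.

```python
import re

# Approximate transcription of the classRange mainExpr into Python's `re`.
# Assumption: Python's \Z (end of string only) stands in for Boost's \Z.
CLASS_RANGE = re.compile(
    r"""
    ^ (?: package | class ) \b           # header at the very start of a line
    (?s: .+? )                           # lazily consume text, line breaks included
    (?=                                  # stop right before...
        \s* ^ (?: package | class ) \b   #   the next header
      |
        \Z                               #   or the end of the text
    )
    """,
    re.VERBOSE | re.MULTILINE,
)

# Hypothetical sample, not taken from the unit test file.
SAMPLE = (
    "package Foo;\n"
    "sub a { 1 }\n"
    "\n"
    "class Bar {\n"
    "    method b { 2 }\n"
    "}\n"
)


def class_ranges(text):
    """Return the text of each package/class range found in `text`."""
    return [m.group(0) for m in CLASS_RANGE.finditer(text)]


if __name__ == "__main__":
    for chunk in class_ranges(SAMPLE):
        # Show only the header line of each range.
        print(repr(chunk.splitlines()[0]))
```

Each range starts at a header and ends just before the next one, which is why the same ^ (?: package | class ) \b part appears twice and no named group is needed.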
          

          It might be interesting to compare my version with yours in terms of speed. To my mind, they seem similar!?

          Best Regards,

          guy038

          • PeterJones @guy038

            @guy038 said in Perl keywords "class" and "method" not recognised by Function List:

            for these two syntaxes, I simply assumed that only standard ASCII characters are used, from \x20 to \x7E, except for \x28 and \x29 in the first and \x7B in the second! Maybe \t should also be part of each character class

            Perl allows Unicode alphanumeric (“word”) characters in any such identifier or token (it just cannot start with a digit), so restricting the classes to ASCII is not going to work.
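
To illustrate the point, here is a minimal sketch comparing the ASCII-only character classes with variants that also accept \w. It assumes Python's re module as a rough stand-in for the Boost regex engine used by Notepad++ (so [\x20\t]* replaces \h*), and the test lines, including the variable name $prénom, are invented for the example.

```python
import re

# Template approximating the function mainExpr; the two %s slots take the
# character classes for the signature and the attributes sections.
TEMPLATE = r"""
    ^ [\x20\t]*                    # optional leading blanks
    (?: sub | method ) \b          # keyword
    \s+
    (?: \w+ :: )*                  # optional package prefixes
    \w+                            # function name
    \s*
    (?: \( %s* \) \s* )?           # optional prototype or signature
    (?: : %s+ \s* )?               # optional attributes
    \{                             # start of the body
"""


def build(signature_class, attribute_class):
    """Compile the template with the given character classes."""
    return re.compile(TEMPLATE % (signature_class, attribute_class),
                      re.VERBOSE | re.MULTILINE)


# ASCII-only classes, as in the first draft.
ASCII_ONLY = build(r"[\x20-\x27\x2A-\x7E]", r"[\x20-\x7A\x7C-\x7E]")
# Same idea with \w added, so Unicode word characters are accepted too.
WITH_WORD = build(r"[\x20-\x7E\w]", r"[\x20-\x7A\x7C-\x7E\w]")

# Hypothetical test lines; $prénom is an invented variable name.
LINES = ["sub plain ($x) {", "sub greet ( $prénom ) {"]

if __name__ == "__main__":
    for line in LINES:
        print(line, bool(ASCII_ONLY.search(line)), bool(WITH_WORD.search(line)))
```

With the ASCII-only classes, the second line is not recognised as a function at all, because the signature section cannot get past the non-ASCII letter.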

            • JohnL22

              I have used the changed perl.xml for more than 2 weeks.
              It handles all formats of “class” definitions - test file attached…

              The only time it sometimes stumbles is when there is no class-ending statement after the “STATEMENT form of class” before the end-of-file. I cannot yet reliably reproduce the situation.

              Classtest.jpg

              I will not comment on recognising identifiers in perl syntax.

              • PeterJones @JohnL22

                @JohnL22 ,

                The only time it sometimes stumbles is when there is no class-ending statement after the “STATEMENT form of class” before the end-of-file.

                The image you showed has no stumble. Zooming in:

                [Screenshot: zoomed-in view of the Function List panel]

                The green arrow points to the fact that most of the classes are set to show their contents, so have the ﹀ down arrow next to them. The red arrow highlights the fact that the final class is set to be collapsed (hiding the functions/methods in that class), so has the > right arrow. If you want to see the methods in VERSION_ATTR_STMT 3.5, you need to click the > to make it a ﹀

                [Screenshot: final class collapsed, with the > arrow]
                becomes
                [Screenshot: final class expanded, with the ﹀ arrow]
                by toggling that arrow.

                • PeterJones @JohnL22

                  @JohnL22 said in Perl keywords "class" and "method" not recognised by Function List:

                  The only time it sometimes stumbles is when there is no class-ending statement after the “STATEMENT form of class” before the end-of-file. I cannot yet reliably reproduce the situation.

                  The other possibility is that you didn’t have final newline/whitespace after the last line of code. That’s a long-standing issue buried deep in the FunctionList implementation that no one has been able to fix:

                  No newline:
                  [Screenshot: Function List without the final newline]

                  vs with newline:
                  [Screenshot: Function List with the final newline]

                  If you cannot force yourself to remember to always have the final newline in your file, you can use the EditorConfig plugin and set insert_final_newline = true for [*.p{l,m}] or for [*].
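
Independently of the EditorConfig route, the missing final newline can also be checked or repaired with a small script. The following is only a sketch under stated assumptions: it is not part of Notepad++ or EditorConfig, the function names ensure_final_newline and fix_file are invented here, and it assumes an ASCII-compatible encoding such as UTF-8 (not UTF-16).

```python
from pathlib import Path


def ensure_final_newline(data: bytes) -> bytes:
    """Return `data` with a line ending appended if it does not already end with one."""
    if not data or data.endswith((b"\n", b"\r")):
        return data
    # Reuse CRLF when the file already uses it, otherwise fall back to LF.
    eol = b"\r\n" if b"\r\n" in data else b"\n"
    return data + eol


def fix_file(path) -> bool:
    """Rewrite the file at `path` only if a final newline was missing; return True if changed."""
    p = Path(path)
    original = p.read_bytes()
    fixed = ensure_final_newline(original)
    if fixed != original:
        p.write_bytes(fixed)
        return True
    return False
```

Working on bytes rather than decoded text keeps the existing line endings of the file untouched.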

                  • PeterJones @JohnL22

                    @JohnL22 ,

                    … but since it’s essentially working for you, I have taken that as confirmation that the new perl.xml is an improvement, so it’s in a Pull Request now… hopefully, it will get merged for the v8.9.1 release candidate coming next week.


                    Update: the PR has been merged, so it will be in v8.9.1.

                    • JohnL22 @PeterJones

                      @PeterJones I am happy :)

                      • guy038

                        Hello, @peterjones,

                        First, read this post to @coises, where I discuss the Unicode concept of identifiers, particularly in Perl!


                        Thus, as explained at the end of that post, I created a second version of my perl.xml parser file, which should work correctly without significant delay!

                        In short:

                        • I do NOT use any atomic constructs!

                        • In mainExpr of the class range, I do NOT use a named group; I simply use the part ^ (?: package | class ) \b twice!

                        • I changed your prototype / signature syntax (?:\([^()]*+\)\s*+)?+ to (?: \( [\x20-\x7E\w]* \) \s* )?

                        • I changed your attributes syntax (?:\:[^{]+)?+ to (?: : [\x20-\x7A\x7C-\x7E\w]+ \s* )?

                        In the two syntaxes above, I simply added \w within each character class.

                        Note that, according to this article, https://www.effectiveperlprogramming.com/2015/04/use-v5-20-subroutine-signatures/, the following syntax seems possible:

                        sub animals ( $cat, $auto_id = get_id() ) {
                            say "$auto_id: The cat is $cat";
                            }
                        

                        Thus, for the prototype / signature syntax, I’ve allowed parentheses within the outer parentheses. If this example does not seem pertinent, use the alternate syntax:

                        (?: \( [\x20-\x27\x2A-\x7E\w]* \) \s* )?

                        • Finally, I changed the class-name regex (?x)\s\K[^;{]+ to (?x) \s+ \K .+? (?= \x20* [;{] )
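
The effect of the class-name change can be sketched as follows. This is an approximation only: Python's re module has no \K, so a capturing group stands in for it, and the helper name class_name and the sample headers are assumptions made for the example.

```python
import re

# Python's `re` has no \K, so a capturing group plays its role here
# (an assumption for this sketch; the parser itself uses the \K forms).
ORIGINAL = re.compile(r"\s([^;{]+)")                 # (?x)\s\K[^;{]+
FIRST_DRAFT = re.compile(r"\s+(.+?)(?=[;{])")        # (?x)\s+\K.+?(?=[;{])
SECOND_DRAFT = re.compile(r"\s+(.+?)(?=\x20*[;{])")  # (?x) \s+ \K .+? (?= \x20* [;{] )


def class_name(regex, header):
    """Return the class or package name captured from a header line, or None."""
    m = regex.search(header)
    return m.group(1) if m else None


if __name__ == "__main__":
    for header in ("class Chaîne {", "package NameSpace::Block {", "package Foo;"):
        print([class_name(rx, header) for rx in (ORIGINAL, FIRST_DRAFT, SECOND_DRAFT)])
```

The \x20* inside the look-ahead keeps the spaces before the semi-colon or brace out of the captured name, which the two earlier expressions do not.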

                        BTW, my parser currently contains 13 occurrences of \s. Maybe \h, or even the [\t\x20] syntax, would be more appropriate in some places?


                        <?xml version="1.0" encoding="UTF-8" ?>
                        <!-- ==========================================================================\
                        |
                        |   To learn how to make your own language parser, please check the following
                        |   link:
                        |       https://npp-user-manual.org/docs/function-list/
                        |
                        \=========================================================================== -->
                        <NotepadPlus>
                        	<functionList>
                        		<!-- ======================================================== [ PERL ] -->
                        		<!-- Perl - functions and packages, including fully-qualified subroutine names -->
                        
                        			<parser
                        				displayName="Perl"  id="perl_syntax"
                        
                        				commentExpr="(?x)                                               # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
                        							(?m-s:                                              # 'Multi-lines' mode ( ^ and $ match at line-breaks ) / 'Dot' char does NOT match line-breaks
                        								\x23 .*                                         #   Single Line Comment ( #................ )
                        							)                                                   #
                        							|                                                   # OR
                        							(?s:                                                # 'Single line' mode (letter s optional as mode set by DEFAULT)
                        								__ (?: END | DATA ) __                          #   String '__END__' or '__DATA__' 
                        								.*                                              #   ANY character(s), including line-breaks, till...
                        								\Z                                              #   Last line-break, included
                        							)
                        						"
                        			>
                        				<classRange
                        					mainExpr="(?x)                                              # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
                        							(?m-i)                                              # 'Multi-lines' mode (^ and $ match at line-breaks) / 'Sensitive case' mode
                        							^                                                   # NO leading white-space at start of line
                        							(?: package | class ) \b                            # Header : word 'package' or 'class', in LOWER case
                        							(?s:                                                # 'Single line' mode (letter s optional as mode set by DEFAULT)
                        								.+?                                             #   ANY character(s), including line-breaks, till...
                        							)                                                   # Section below, excluded
                        							(?=                                                 # Start of look-ahead
                        								\s*                                             #   Optional leading white-space of
                        								^                                               #   NO leading white-space at start of line 
                        								(?: package | class ) \b                        #   Next header : word 'package' or 'class', in LOWER case
                        							|                                                   # OR
                        								\Z                                              #   last line-break
                        							)                                                   # End of look-ahead
                        						"
                        				>
                        					<className>
                        						<nameExpr expr="(?x)                                    # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
                        										\s+                                     # Leading white-space(s)
                        										\K                                      # Discard text matched so far
                        										.+?                                     # ANY character(s) till...
                        										(?= \x20* [;{] )                        # First semi-colon or left brace, excluded
                        									"
                        						/>
                        					</className>
                        					<function
                        						mainExpr="(?x)                                          # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
                        								(?m-i)                                          # 'Multi-lines' mode (^ and $ match at line-breaks) / 'Sensitive case' mode
                        								^ \h*                                           # Optional leading spaces or tabulations
                        								(?: sub | method ) \b                           # Word 'sub' or 'method', in LOWER case 
                        								\s+                                             # White-space character(s)
                        								(?: \w+ :: )*                                   # Optional list of words EACH followed with ::
                        								\w+                                             # Word character(s)
                        								\s*                                             # Optional white-space character(s)
                        								(?: \( [\x20-\x7E\w]* \) \s* )?                 # Optional Prototype or Signature section
                        								(?: : [\x20-\x7A\x7C-\x7E\w]+ \s* )?            # Optional Attributes section
                        								\{                                              # Start of function body
                        							"
                        					>
                        						<functionName>
                        							<funcNameExpr expr="(?x)                            # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
                        												(?: sub | method )              # Word 'sub' or 'method', in LOWER case
                        												\s+                             # White-space character(s)
                        												\K                              # Discard text matched, so far (move this line right before \w+ if 'prefix::' part NOT desired)
                        												(?: \w+ :: )*                   # Optional prefix:: part ( package:: / names:: )
                        												\w+                             # Word character(s)
                        											"
                        							/>
                        						</functionName>
                        					</function>
                        				</classRange>
                        				<function
                        					mainExpr="(?x)                                              # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
                        							(?m-i)                                              # 'Multi-lines' mode (^ and $ match at line-breaks) / 'Sensitive case' mode
                        							^ \h*                                               # Optional leading spaces or tabulations
                        							(?: sub | method )                                  # Word 'sub' or 'method', in LOWER case 
                        							\s+                                                 # White-space character(s)
                        							(?: \w+ :: )*                                       # Optional list of words, EACH followed with ::
                        							\w+                                                 # Word character(s)
                        							\s*                                                 # Optional white-space character(s)
                        							(?: \( [\x20-\x7E\w]* \) \s* )?                     # Optional Prototype or Signature section
                        							(?: : [\x20-\x7A\x7C-\x7E\w]+ \s* )?                # Optional Attributes section
                        							\{                                                  # Start of function body
                        						"
                        				>
                        					<functionName>
                        						<nameExpr expr="(?x)                                    # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
                        										(?: sub | method )                      # Word 'sub' or 'method', in LOWER case
                        										\s+                                     # White-space character(s)
                        										\K                                      # Discard text matched, so far (move this line right before \w+ if 'prefix::' part NOT desired)
                        										(?: \w+ :: )*                           # Optional prefix:: part ( package:: / names:: )
                        										\w+                                     # Word character(s)
                        									"
                        						/>
                        					</functionName>
                        					<className>
                        						<nameExpr expr="(?x)                                    # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
                        										(?: sub | method )                      # Word 'sub' or 'method', in LOWER case
                        										\s+                                     # White-space character(s)
                        										\K                                      # Discard text matched, so far
                        										\w+                                     # Word character(s)
                        										( :: \w+ )*                             # Optional list of words, EACH preceded with ::
                        										(?= :: \w )                             # Till a last string ':: + word char' excluded
                        									"
                        						/>
                        					</className>
                        				</function>
                        			</parser>
                        	</functionList>
                        </NotepadPlus>
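
For functions outside any class range, the parser above uses two expressions to split a qualified name into a function name and a class part. A minimal sketch of that split, assuming Python's re module as a stand-in for the Boost regex engine and a capturing group in place of \K (the constant names are invented here):

```python
import re

# Capturing groups replace \K, which Python's `re` does not support.
FUNC_NAME = re.compile(
    r"(?: sub | method ) \s+ ( (?: \w+ :: )* \w+ )",
    re.VERBOSE,
)
CLASS_NAME = re.compile(
    r"(?: sub | method ) \s+ ( \w+ (?: :: \w+ )* ) (?= :: \w )",
    re.VERBOSE,
)


def split_name(line):
    """Return (function name, class part or None) for a `sub`/`method` line."""
    func = FUNC_NAME.search(line)
    cls = CLASS_NAME.search(line)
    return (func.group(1) if func else None, cls.group(1) if cls else None)


if __name__ == "__main__":
    for line in ("sub grâce::Hôte { return 'running' }",
                 "sub A::B::c {",
                 "sub plain {"):
        print(split_name(line))
```

The look-ahead (?= :: \w ) makes the class part stop before the last ::name segment, so an unqualified sub yields no class part at all.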
                        

                        In the https://github.com/notepad-plus-plus/notepad-plus-plus/blob/a91b22bd8337465e04c1afa30cb71f7909340293/PowerEditor/Test/FunctionList/perl/unitTest file, I added text at various locations:

                        • Before the line ############### Start ###############
                        
                        ################ Added by guy038 to test Notepad++'s FunctionList
                        
                        sub animals ( $cat, $auto_id = get_id() ) {
                        	say "$auto_id: the cat is $cat";
                        }
                        
                        sub _function_été {
                            return 1
                        }
                        
                        
                        • Before the line package NameSpace::Block {
                        
                        ################ Added by guy038 to test Notepad++'s FunctionList
                        
                        sub grâce::Hôte          { return 'running' }
                        sub grâce::Son_ø         { return 'stopped' }
                        
                        #################################################################
                        
                        • At the very end of file :
                        
                        ################ Added by guy038 to test Notepad++'s FunctionList
                        
                        class NewClassSyntax {
                            method inBlock { return 1 }
                            method inBlockProto($) { return $_[0] }
                            method inBlockAttrib :prototype($) { return $_[0] }
                        }
                        
                        class Chaîne{
                            method inBlock { return 1 }
                            method Dûment($) { return $_[0] }
                            method ƒ_Hameçon :prototype($) { return $_[0] }
                        }
                        #################################################################
                        

                        In terms of speed, the Function List panel seems to be displayed quickly. I also ran a test, copying UniTest.txt twice and then appending, by regex, _1, _2 and _3 to the different names: the Function List panel still appeared without delay!

                        Best Regards,

                        guy038
