Function list with Java problems



  • I’ve found several problems using the function list for Java.

    1. The closing bracket “}” must not be the last character of the file
      There must be at least one character after it; a space or a newline is enough.

      public class MyClass {
      void method() {
      }
      }

    2. Comments don’t work at all
      If there is a comment anywhere inside my class, the function list produces no output at all.

      public class MyClass {
      // this comment must not be here
      /* this also breaks the functionlist */
      void method() {
      // even this comment is wrong
      }
      }

    3. Nested classes are not recognized
      There is no support for nested classes. I would really appreciate it if ClassRange were recursive.

      public class OuterClass {
      private class InnerClass { // this class is not in functionlist
      void innerMethod() { // this method is placed in OuterClass
      }
      }
      void outerMethod() {
      }
      }

    4. Template classes extending other templates
      I created my own custom Collection. The implemented generic interface broke the function list.

      public class MyCollection<T> implements Collection<T> {
      void overrideCollectionMethods() {
      }
      }

    5. Templates with a bounded template type
      My collection was meant for certain types only. This was not recognized either.

      public class MyCollection<T extends MyInterface> {
      void someMethod() {
      }
      }



    1. Known issue, being worked on.
    2. Known issue, being worked on.
    3. Correct, nested classes are not supported. Implementing them raises a whole new set of questions and challenges, so this won’t be implemented in the near future.
    4. I’ll look into this.
    5. I’ll look into this (it is related to no. 4).


    1. Interface (and abstract class) methods are not recognized
      The regex for method recognition requires a method body {…}, so abstract methods are not recognized.

      public interface MyInterface {
      public void method();
      }
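The effect of requiring a body can be reproduced outside Notepad++. The following is a minimal sketch, assuming heavily simplified method patterns; `BODY_ONLY`, `BODY_OR_SEMICOLON` and `matches` are hypothetical names for illustration and are not the expressions from functionList.xml.

```java
import java.util.regex.Pattern;

// Hypothetical demo: simplified patterns, not the real functionList.xml expressions.
class MethodTerminatorDemo {
    // Shared prefix: optional modifiers, return type, method name, argument list.
    static final String HEAD =
        "^\\s*(?:(?:public|protected|private|static|final|abstract)\\s+)*"
        + "\\w+\\s+\\w+\\s*\\([^)]*\\)\\s*";

    // Accepts only declarations followed by a body.
    static final Pattern BODY_ONLY = Pattern.compile(HEAD + "\\{", Pattern.MULTILINE);

    // Accepts a body or a terminating semicolon (abstract and interface methods).
    static final Pattern BODY_OR_SEMICOLON = Pattern.compile(HEAD + "(?:\\{|;)", Pattern.MULTILINE);

    static boolean matches(Pattern pattern, String source) {
        return pattern.matcher(source).find();
    }

    public static void main(String[] args) {
        String abstractMethod = "    public void method();";
        String concreteMethod = "    void method() {";
        System.out.println(matches(BODY_ONLY, abstractMethod));
        System.out.println(matches(BODY_OR_SEMICOLON, abstractMethod));
        System.out.println(matches(BODY_ONLY, concreteMethod));
    }
}
```

Note that this simplified semicolon variant would also accept a statement such as `return foo();`, so a real parser expression needs some form of keyword exclusion as well.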



  • I did some investigation into templates and ended up with this FunctionList parser. Please check it out.
    The comment expression is disabled (renamed to c_ommentExpr) because of issue 2.

    		<parser
    			id         ="java"
    			displayName="Java"
    			c_ommentExpr="(?s:/\*.*?\*/)|(?m-s://.*$)"
    		>
    			<classRange
    				mainExpr    ="^[\t\x20]*((public|protected|private|static|final|abstract|synchronized|@(\w)+)\s+)*(class|enum|interface|@interface)\s+(?'D'\s*\w+(\s*&lt;(?'G'\s*(((\w+\.)*(?&amp;D))|((\?|\w+)(\s+(extends|super)\s+(?&amp;D))?)))(\s*,(?&amp;G))*\s*&gt;)?)(\s+extends\s+(?&amp;D))?(\s+implements\s+(?&amp;D)(\s*,(?&amp;D))*)?\s*\{"
    				openSymbole ="\{"
    				closeSymbole="\}"
    			>
    				<className>
    					<nameExpr expr="(class|enum|interface|@interface)\s+\w+" />
    				</className>
    				<function
    					mainExpr="^[\t\x20]*((public|protected|private|static|final|abstract|synchronized|@(\w)+)\s+)*(\s*&lt;(?&amp;G)\s*&gt;)?(?'D'\s*(\w+\.)*\w+(\s*&lt;(?'G'\s*((?&amp;D)|((\?|\w+)(\s+(extends|super)\s+(?&amp;D))?)))(\s*,(?&amp;G))*\s*&gt;)?(\[\s*\])*)\s+(?!(if|while|for|switch|catch|synchronized|this|super)\b)\w+\(((?'A'(?&amp;D)\s+\w+)(\s*,(?&amp;A))*)?\)(\s+throws\s+\w+)?\s*(\{|;)"
    				>
    					<functionName>
    						<funcNameExpr expr="\w+(\s*&lt;.*&gt;)?\s*\(" />
    						<funcNameExpr expr="\w+" />
    					</functionName>
    				</function>
    			</classRange>
    		</parser>
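To see what the disabled comment expression would match, it can be tried on its own. This is a minimal sketch, assuming that `java.util.regex` treats this particular expression the same way as the engine used by Notepad++; `CommentExprDemo` and `stripComments` are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical demo: applies the comment expression from the parser above to a string.
class CommentExprDemo {
    // Same expression as in the c_ommentExpr attribute, with Java string escaping:
    // block comments may span lines (?s), line comments stop at end of line (?m-s).
    static final Pattern COMMENT = Pattern.compile("(?s:/\\*.*?\\*/)|(?m-s://.*$)");

    // Removes every match of the comment expression from the source text.
    static String stripComments(String source) {
        return COMMENT.matcher(source).replaceAll("");
    }

    public static void main(String[] args) {
        String source = "void method() { /* block\ncomment */\n// line comment\n}\n";
        System.out.println(stripComments(source));
    }
}
```

The expression is purely textual, so a `//` or `/*` inside a string literal would be treated as a comment too.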


  • A = Argument?
    D = Declarator?
    G = Generic?
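Judging by how the groups are used in the expression, `D`, `G` and `A` act as reusable building blocks that are called again with `(?&D)`, `(?&G)` and `(?&A)`. `java.util.regex` has no such subroutine calls, so the sketch below emulates them with string constants. All names are hypothetical, and because string concatenation cannot recurse, only one level of generics is supported here.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: the building blocks are emulated with string constants,
// because java.util.regex does not support (?&name) subroutine calls.
class SubroutineSketch {
    static final String ID = "[A-Za-z_]\\w*";

    // Plain, possibly qualified type name, e.g. java.util.List
    static final String TYPE = ID + "(?:\\." + ID + ")*";

    // G: one generic argument, a type or '?', optionally bounded by extends/super
    static final String G =
        "\\s*(?:" + TYPE + "|\\?)(?:\\s+(?:extends|super)\\s+" + TYPE + ")?";

    // D: declarator = type, optional <G, G, ...>, optional [] pairs
    static final String D =
        TYPE + "(?:\\s*<" + G + "(?:\\s*," + G + ")*\\s*>)?(?:\\s*\\[\\s*\\])*";

    // A: argument = declarator followed by an argument name
    static final String A = "\\s*" + D + "\\s+" + ID;

    static final Pattern ARGS =
        Pattern.compile("\\((?:" + A + "(?:\\s*," + A + ")*)?\\s*\\)");

    static boolean isArgumentList(String text) {
        return ARGS.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isArgumentList("(Map<String, Integer> counts, int[] values)"));
        System.out.println(isArgumentList("(List<? extends Number> xs)"));
        System.out.println(isArgumentList("(List<List<String>> nested)"));
    }
}
```

The last call shows the limit of this emulation: nested generics are rejected, which is exactly the case the recursive subroutine calls in the parser expression are meant to cover.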



  • @Petr-Skýpala Based on your post I adapted the parser to:

    			<parser
    				displayName="Java"
    				id         ="java_syntax"
    			>
    				<classRange
    					mainExpr    ="(?x)                                          # Utilize inline comments (see `RegEx - Pattern Modifiers`)
    							^[\t\x20]*                                          # leading whitespace
    							(?:
    								(?-i:abstract|final|private|protected|public|static|synchronized|@\w+)
    								\s+
    							)*
    							(?-i:class|enum|@?interface)
    							\s+
    							(?'DECLARATOR'
    								(?'VALID_ID'                                    # valid identifier, use as subroutine
    									\b(?!(?-i:
    										a(?:bstract|ssert)|b(?:oolean|reak|yte)|c(?:ase|atch|har|lass|on(?:st|tinue))|
    										d(?:efault|o(?:uble)?)|e(?:lse|num|xtends)|f(?:inal(?:ly)?|loat|or)|
    										goto|i(?:f|mp(?:lements|ort)|nstanceof|nt(?:erface)?)|
    										long|n(?:ative|ew)|p(?:ackage|rivate|rotected|ublic)|return|
    										s(?:hort|tatic|trictfp|uper|witch|ynchronized)|th(?:is|rows?)|tr(?:ansient|y)|
    										vo(?:id|latile)|while
    									)\b)                                        # keywords, not to be used as identifier
    									[A-Za-z_]\w*                                # valid character combination for identifiers
    								)
    								(?:
    									\s*\x3C                                     # start-of-template indicator
    									(?'GENERIC'                                 # ...match first generic, use as subroutine
    										\s*
    										(?:
    											(?:                                 # optional parent type name(s)
    												(?&amp;VALID_ID)                # ...parent type name
    												\.                              # ...parent-sibling separator
    											)*
    											(?&amp;DECLARATOR)
    										|
    											(?:
    												\?                              # ..
    											|	\w+                             # .. name
    											)
    											(?:                                 # optional type extension
    												\s+(?-i:extends|super)
    												\s+(?&amp;DECLARATOR)
    											)?
    										)
    									)
    									(?:                                         # ...match consecutive generic objects, they are
    										\s*,                                    #    separated by a comma
    										(?&amp;GENERIC)
    									)*
    									\s*\x3E                                     # ...end-of-template indicator
    								)?
    							)
    							(?:                                                 # optional object extension
    								\s+(?-i:extends)
    								\s+(?&amp;DECLARATOR)
    							)?
    							(?:                                                 # optional object implementation
    								\s+(?-i:implements)
    								\s+(?&amp;DECLARATOR)                           # ...match first object
    								(?:                                             # ...match consecutive objects, they are
    									\s*,                                        #    separated by a comma
    									\s*(?&amp;DECLARATOR)
    								)*
    							)?
    							\s*\{                                               # whatever, up till start-of-body indicator
    						"
    					openSymbole ="\{"
    					closeSymbole="\}"
    				>
    					<className>
    						<nameExpr expr="(?-i:class|enum|@?interface)\s+\K\w+(?:\s*\x3C.*?\x3E)?" />
    					</className>
    					<function
    						mainExpr="(?x)                                          # Utilize inline comments (see `RegEx - Pattern Modifiers`)
    								^[\t\x20]*                                      # leading whitespace
    								(?:
    									(?-i:abstract|final|native|private|protected|public|static|synchronized|@\w+)
    									\s+
    								)*
    								(?:
    									\s*\x3C                                     # start-of-template indicator
    									\s*(?&amp;GENERIC)
    									\s*\x3E                                     # end-of-template indicator
    								)?
    								\s*
    								(?'DECLARATOR'
    									(?:                                         # optional parent type name(s)
    										[A-Za-z_]\w*                            # ...parent type name
    										\.                                      # ...parent-sibling separator
    									)*
    									[A-Za-z_]\w*                                # type name
    									(?:                                         # optional template type
    										\s*\x3C                                 # ...start-of-template indicator
    										(?'GENERIC'                             # ...match first generic object, use as subroutine
    											\s*
    											(?:
    												(?&amp;DECLARATOR)
    											|
    												(?:
    													\?                          # ..
    												|	\w+                         # .. name
    												)
    												(?:                             # optional type extension
    													\s+(?-i:extends|super)
    													\s+(?&amp;DECLARATOR)
    												)?
    											)
    										)
    										(?:                                     # ...match consecutive generic objects, they are
    											\s*,                                #    separated by a comma
    											(?&amp;GENERIC)
    										)*
    										\s*\x3E                                 # ...end-of-template indicator
    									)?
    									(?:                                         # optional compound type
    										\s*\[                                   # ...start-of-compound indicator
    										\s*\]                                   # ...end-of-compound indicator
    									)*
    								)
    								\s+
    								(?'VALID_ID'                                    # valid identifier, use as subroutine
    									\b(?!(?-i:
    										a(?:bstract|ssert)|b(?:oolean|reak|yte)|c(?:ase|atch|har|lass|on(?:st|tinue))|
    										d(?:efault|o(?:uble)?)|e(?:lse|num|xtends)|f(?:inal(?:ly)?|loat|or)|
    										goto|i(?:f|mp(?:lements|ort)|nstanceof|nt(?:erface)?)|
    										long|n(?:ative|ew)|p(?:ackage|rivate|rotected|ublic)|return|
    										s(?:hort|tatic|trictfp|uper|witch|ynchronized)|th(?:is|rows?)|tr(?:ansient|y)|
    										vo(?:id|latile)|while
    									)\b)                                        # keywords, not to be used as identifier
    									[A-Za-z_]\w*                                # valid character combination for identifiers
    								)
    								\s*
    								\(                                              # start-of-arguments indicator
    								(?:                                             # optional arguments
    									(?'ARG'                                     # ...match first argument, use as subroutine
    										\s*
    										(?-i:final\s+)?
    										(?&amp;DECLARATOR)
    										\s+(?&amp;VALID_ID)                     #    argument name
    									)
    									(?:                                         # ...consecutive arguments are
    										\s*,                                    #    separated by commas
    										(?&amp;ARG)
    									)*
    								)?
    								\)                                              # end-of-arguments indicator
    								(?:                                             # optional exceptions
    									\s*(?-i:throws)
    									\s+(?&amp;VALID_ID)                         # ...first exception name
    									(?:                                         # ...consecutive exception names are
    										\s*,                                    #    separated by commas
    										\s*(?&amp;VALID_ID)
    									)*
    								)?
    								\s*
    								(?:                                             # function declaration ends with ...
    									\{                                          # ...a start-of-function-body indicator or
    								|	;                                           # ...an end-of-declaration indicator
    								)
    							"
    					>
    						<functionName>
    							<funcNameExpr expr="\w+(?=\s*\()" />
    						</functionName>
    					</function>
    				</classRange>
    			</parser>
    

    Could you please test/verify it?
    Thanks!



  • Hi MAPJe71,

    Ah, indeed! I had never thought about using the (?x) modifier in the regexes of the functionList.xml file, but you’re quite right: this improves the readability of the text and surely helps in getting the regexes exactly right :-))

    Cheers,

    guy038
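The same `(?x)` free-spacing idea is available in other regex engines. Below is a minimal sketch in Java, assuming a shortened modifier list; `FreeSpacingDemo`, `TYPE_HEADER` and `typeName` are hypothetical names and the pattern is far simpler than the parser expressions in this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo: a commented, multi-line pattern using the (?x) modifier.
class FreeSpacingDemo {
    static final Pattern TYPE_HEADER = Pattern.compile(
          "(?x)                                   # free-spacing mode\n"
        + "^[\\t\\x20]*                           # leading whitespace\n"
        + "(?: (?:public|abstract|final) \\s+ )*  # optional modifiers (shortened list)\n"
        + "(?:class|enum|@?interface)             # type keyword\n"
        + "\\s+ ([A-Za-z_]\\w*)                   # group 1: the type name\n",
        Pattern.MULTILINE);

    // Returns the first declared type name, or null when no type header is found.
    static String typeName(String source) {
        Matcher matcher = TYPE_HEADER.matcher(source);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(typeName("  public final class Foo<T> {"));
        System.out.println(typeName("@interface Marker {"));
    }
}
```

Because whitespace is ignored in this mode, literal blanks have to be written as `\x20` or `\s`, which is why the parser expressions use `[\t\x20]` for leading whitespace.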



  • Great job. You’re a genius. Making the regex multiline and commented is a real blessing.

    I’ve added whitespace \s in several places where I missed it.
    I’ve added bounded templates <T extends A & B>.
    I’ve added multiple inheritance: an interface can extend multiple interfaces.
    I’ve generalized the DECLARATOR and GENERIC groups a bit.

    The result is far from perfect, but it completely satisfies my needs. I don’t mind that it also accepts multiple inheritance for classes; we can leave some work for the compiler too. I hope this class-function template will help with other languages.

            <parser
                displayName="Java"
                id         ="java"
            >
                <classRange
                    mainExpr    ="(?x)                                          # Utilize inline comments (see `RegEx - Pattern Modifiers`)
                            ^[\t\x20]*                                          # leading whitespace
                            (?:
                                (?-i:
                                    abstract
                                    |final
                                    |native
                                    |p(?:rivate|rotected|ublic)
                                    |s(?:tatic|trictfp|ynchronized)
                                    |transient
                                    |volatile
                                    |@[A-Za-z_]\w*                              # qualified identifier
                                        (?:
                                            \.
                                            [A-Za-z_]\w*
                                        )*
                                )
                                \s+
                            )*
                            (?-i:class|enum|@?interface)
                            \s+
                            (?'DECLARATOR'
                                (?'VALID_ID'                                    # valid identifier, use as subroutine
                                    \b(?!(?-i:
                                        a(?:bstract|ssert)
                                        |b(?:oolean|reak|yte)
                                        |c(?:ase|atch|har|lass|on(?:st|tinue))
                                        |d(?:efault|o(?:uble)?)
                                        |e(?:lse|num|xtends)
                                        |f(?:inal(?:ly)?|loat|or)
                                        |goto
                                        |i(?:f|mp(?:lements|ort)|nstanceof|nt(?:erface)?)
                                        |long
                                        |n(?:ative|ew)
                                        |p(?:ackage|rivate|rotected|ublic)
                                        |return
                                        |s(?:hort|tatic|trictfp|uper|witch|ynchronized)
                                        |th(?:is|rows?)
                                        |tr(?:ansient|y)
                                        |vo(?:id|latile)
                                        |while
                                    )\b)                                        # keywords, not to be used as identifier
                                    [A-Za-z_]\w*                                # valid character combination for identifiers
                                )
                                (?:
                                    \s*\x3C                                     # start-of-template indicator
                                    (?'GENERIC'                                 # ...match first generic, use as subroutine
                                        \s*
                                        (?:
                                            (?&amp;DECLARATOR)                  # use named generic
                                        |   \?                                  # or unknown
                                        )
                                        (?:                                     # optional type extension
                                            \s+(?-i:extends|super)
                                            \s+(?&amp;DECLARATOR)
                                            (?:                                 # multiple bounds
                                                \s+\x26                         # ...are ampersand separated
                                                \s+(?&amp;DECLARATOR)
                                            )*
                                        )?
                                        (?:                                     # match consecutive generics objects
                                            \s*,                                # ...comma separated
                                            (?&amp;GENERIC)
                                        )?
                                    )
                                    \s*\x3E                                     # ...end-of-template indicator
                                )?
                                (?:                                             # package and|or nested classes
                                    \.                                          # ...are dot separated
                                    (?&amp;DECLARATOR)
                                )?
                            )
                            (?:                                                 # optional object extension
                                \s+(?-i:extends)
                                \s+(?&amp;DECLARATOR)
                                (?:                                             # ...match consecutive objects, they are
                                    \s*,                                        #    separated by a comma
                                    \s*(?&amp;DECLARATOR)
                                )*
                            )?
                            (?:                                                 # optional object implementation
                                \s+(?-i:implements)
                                \s+(?&amp;DECLARATOR)                           # ...match first object
                                (?:                                             # ...match consecutive objects, they are
                                    \s*,                                        #    separated by a comma
                                    \s*(?&amp;DECLARATOR)
                                )*
                            )?
                            \s*\{                                               # whatever, up till start-of-body indicator
                        "
                    openSymbole ="\{"
                    closeSymbole="\}"
                >
                    <className>
                        <nameExpr expr="(?-i:class|enum|@?interface)\s+\K\w+(?:\s*\x3C.*?\x3E)?" />
                    </className>
                    <function
                        mainExpr="(?x)                                          # Utilize inline comments (see `RegEx - Pattern Modifiers`)
                                ^[\t\x20]*                                      # leading whitespace
                                (?:
                                    (?-i:
                                        abstract
                                        |final
                                        |native
                                        |p(?:rivate|rotected|ublic)
                                        |s(?:tatic|trictfp|ynchronized)
                                        |transient
                                        |volatile
                                        |@[A-Za-z_]\w*                          # qualified identifier
                                            (?:
                                                \.                              # ... dot separated
                                                [A-Za-z_]\w*
                                            )*
                                    )
                                    \s+
                                )*
                                (?:
                                    \s*\x3C                                     # start-of-template indicator
                                    (?&amp;GENERIC)
                                    \s*\x3E                                     # end-of-template indicator
                                )?
                                \s*
                                (?'DECLARATOR'
                                    [A-Za-z_]\w*                                # type name
                                    (?:                                         # optional parent type name(s)
                                        \.                                      # ...parent-sibling separator
                                        [A-Za-z_]\w*                            # ...parent type name
                                    )*
                                    (?:
                                        \s*\x3C                                 # start-of-template indicator
                                        (?'GENERIC'                             # ...match first generic, use as subroutine
                                            \s*
                                            (?:
                                                (?&amp;DECLARATOR)              # use named generic
                                            |   \?                              # or unknown
                                            )
                                            (?:                                 # optional type extension
                                                \s+(?-i:extends|super)
                                                \s+(?&amp;DECLARATOR)
                                                (?:                             # multiple bounds
                                                    \s+\x26                     # ...are ampersand separated
                                                    \s+(?&amp;DECLARATOR)
                                                )*
                                            )?
                                            (?:                                 # match consecutive generics objects
                                                \s*,                            # ...comma separated
                                                (?&amp;GENERIC)
                                            )?
                                        )
                                        \s*\x3E                                 # ...end-of-template indicator
                                    )?
                                    (?:                                         # package and|or nested classes
                                        \.                                      # ... are dot separated
                                        (?&amp;DECLARATOR)
                                    )?
                                    (?:                                         # optional compound type
                                        \s*\[                                   # ...start-of-compound indicator
                                        \s*\]                                   # ...end-of-compound indicator
                                    )*
                                )
                                \s+
                                (?'VALID_ID'                                    # valid identifier, use as subroutine
                                    \b(?!(?-i:
                                        a(?:bstract|ssert)
                                        |b(?:oolean|reak|yte)
                                        |c(?:ase|atch|har|lass|on(?:st|tinue))
                                        |d(?:efault|o(?:uble)?)
                                        |e(?:lse|num|xtends)
                                        |f(?:inal(?:ly)?|loat|or)
                                        |goto
                                        |i(?:f|mp(?:lements|ort)|nstanceof|nt(?:erface)?)
                                        |long
                                        |n(?:ative|ew)
                                        |p(?:ackage|rivate|rotected|ublic)
                                        |return
                                        |s(?:hort|tatic|trictfp|uper|witch|ynchronized)
                                        |th(?:is|rows?)
                                        |tr(?:ansient|y)
                                        |vo(?:id|latile)
                                        |while
                                    )\b)                                        # keywords, not to be used as identifier
                                    [A-Za-z_]\w*                                # valid character combination for identifiers
                                )
                                \s*\(                                           # start-of-arguments indicator
                                (?'ARG'                                         # ...match first argument, use as subroutine
                                    \s*(?-i:final\s+)?
                                    (?&amp;DECLARATOR)
                                    \s+(?&amp;VALID_ID)                         #    argument name
                                    (?:                                         # ...consecutive arguments are
                                        \s*,                                    #    separated by commas
                                        (?&amp;ARG)
                                    )?
                                )?
                                \)                                              # end-of-arguments indicator
                                (?:                                             # optional exceptions
                                    \s*(?-i:throws)
                                    \s+(?&amp;VALID_ID)                         # ...first exception name
                                    (?:                                         # ...consecutive exception names are
                                        \s*,                                    #    separated by commas
                                        \s*(?&amp;VALID_ID)
                                    )*
                                )?
                                \s*(?:                                          # function declaration ends with ...
                                    \{                                          # ...a start-of-function-body indicator or
                                |   ;                                           # ...an end-of-declaration indicator
                                )
                            "
                    >
                        <functionName>
                            <funcNameExpr expr="\w+(?=\s*\()" />
                        </functionName>
                    </function>
                </classRange>
            </parser>
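For testing a parser like the ones in this thread, it helps to have one small file that touches every case discussed above. The following is a hypothetical fixture, with all type and method names made up for illustration. It compiles on its own, but it makes no claim about which entries a given parser version actually lists.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical test fixture; all names are made up for illustration.
class FunctionListSample {
    static String demo() {
        Registry<Item> registry = new Registry<Item>();
        registry.add(new Item("a"));
        return registry.describe();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}

// line comment between types (issue 2)
interface Named {
    String name();              // abstract method ending in ';'
}

/* block comment between types (issue 2) */
interface Sized {
    int size();
}

interface NamedAndSized extends Named, Sized {   // interface extending two interfaces
    String describe();
}

class Item implements Named, Sized {
    private final String label;

    Item(String label) {
        this.label = label;
    }

    public String name() {
        return label;
    }

    public int size() {
        return label.length();
    }
}

class Registry<T extends Named & Sized> implements NamedAndSized {   // bounded template
    private final List<T> items = new ArrayList<T>();

    private class Entry {       // nested class (issue 3)
        int weight() {
            return items.size();
        }
    }

    void add(final T item) {
        items.add(item);
    }

    public String name() {
        return "registry";
    }

    public int size() {
        return items.size();
    }

    public String describe() {
        return name() + ":" + size();
    }

    int total(int[] values) throws IllegalStateException {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }
}
```

Saving the file once with and once without a trailing newline after the last `}` also covers issue 1.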
