FunctionList Confused
-
Hello, @lycan-thrope and All,
First, I must say that, once again, I was completely mistaken in this post, too:
https://community.notepad-plus-plus.org/post/72550
So, just forget my two previous posts and follow this simple rule: do not put any code definition in comments, whatever the language used (built-in or user-defined)!
Now, regarding your IF/ELSE kind of regex construction, there is, indeed, a valid IF THEN / ELSE regex construction! Its general syntax is:
    (?ConditionTHEN_part)
OR
    (?ConditionTHEN_part|ELSE_part)
where Condition is either:
- A previously defined group, named or not
- A look-around feature
- A recursive pattern
Two simple examples:
- The regex (TEST)?123(?(1)===|---)456 matches the TEST123===456 string and any 123---456 string, whatever occurs before the 123 part:
    TEST123===456
    abc123---456
    xyz123---456
- The regex ___((?=TEST)TEST12345|67890)___ matches the two strings ___TEST12345___ and ___67890___:
    ___TEST12345___
    ___67890___
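The two examples above can be checked outside Notepad++ with a small sketch. It assumes Python's `re` module as a stand-in for the Boost engine used by Notepad++: Python supports the group-reference condition `(?(1)...|...)`, but not look-around or recursive conditions, so the second regex is copied exactly as written in the post, where it behaves as a look-ahead followed by an ordinary alternation. The helper name `all_matches` is hypothetical.

```python
import re

# First example: group 1 is the optional "TEST" prefix.
# If group 1 took part in the match, "===" is required, otherwise "---".
COND_ON_GROUP = re.compile(r"(TEST)?123(?(1)===|---)456")

# Second example, copied as written in the post.
LOOKAHEAD_ALT = re.compile(r"___((?=TEST)TEST12345|67890)___")


def all_matches(pattern, text):
    """Return the full text of every non-overlapping match of pattern in text."""
    return [m.group(0) for m in pattern.finditer(text)]


if __name__ == "__main__":
    print(all_matches(COND_ON_GROUP, "TEST123===456 abc123---456 xyz123---456"))
    # ['TEST123===456', '123---456', '123---456']
    print(all_matches(LOOKAHEAD_ALT, "___TEST12345___ ___67890___"))
    # ['___TEST12345___', '___67890___']
```

Note that a string such as abc123===456 is not matched at all: without the TEST prefix, group 1 is unset and only the --- branch is allowed.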
Back to your dbasePlus.xml parser, and assuming these conditions:
- I just associated Normal text with your dbasePlus.xml parser:
    <association id= "dbaseplus.xml" langID= "0"/> <!-- Normal Text ID -->
- I used this modified version:
<?xml version="1.0" encoding="UTF-8" ?>
<!-- ==========================================================================\
|
|   To learn how to make your own language parser, please check the following
|   link:
|       https://npp-user-manual.org/docs/function-list/
|
\=========================================================================== -->
<NotepadPlus>
    <functionList>
        <!-- ========================================================= [ dBASEPlus ] -->
        <parser
            displayName="dBASEPlus"
            id         ="dbaseplus"
            commentExpr="(?s:/\*.*?\*/)|(?-s)^(//|&&).*"
        >
            <classRange
                mainExpr="(?x-i)                          # Free-spacing mode and inline comments + search sensitive to case
                    ^\h*                                  # Optional leading whitespace chars
                    class                                 # 'class' keyword
                    \h?                                   # Optional whitepace char
                    \w+                                   # Class name

                    # Following the class name there is the option of parameters, and if so the first entry inside the parens is required, whether there is other
                    # parameters or not, once the parens go up, the first is required. ie: class FrameCtrl(frameObj)

                    (                                     # Beginning of the optional parameter(s) part ( Group 1 )
                        \h? \(                            # Opening parenthesis
                        \w+                               # First and required parameter
                        ( , \h? \w+)*                     # Following optional/additional parameters
                        \)                                # Closing parenthesis
                    )?                                    # End of the optional parameter(s) part

                    # For the rest of the class declaration, after the class name, all other options are part of one big optional set, that follows 'of'
                    # and can be populated by one of several options.

                    (?:                                   # Beginning of the main optional part, in a non-capturing group

                        # The first and most prevalent is the Superclass name that the class is being subclassed from, and it's options of parameters and again,
                        # if it has parameters, at least the first one is required ie.: class ToolButtonFx(oParent) of Toolbutton(oParent).

                        \h of \h                          # Optional 'of' keyword, surrounded by 1 horizontal whitespace char
                        \w+                               # Superclass name
                        (?1)?                             # Optional parameter(s) part ( Subroutine call to Group 1 )

                        # The next possible option is that it is a custom object and needs to be in this line so if the object or form is opened up in the dBASE IDE,
                        # the designers in it won't mess up the object by streaming out missing parts or overriding properties or objects and functions.

                        ( \h custom )?                    # Optional 'custom' keyword

                        # The next possible option is that the class is being subclassed from another object that is contained elsewhere and the compiler needs to know
                        # this reference. There are two options for pointing to the file. The first is an Alias path in the IDE that can be accessed by the compiler
                        # in the environment, or second, it is in the current directory and only the name is needed...or it has a path that can be listed here,
                        # but this is bad practice, and an Alias is recommended if the file is in a place other than the current directory. If it is, the name can be
                        # used in quotes as a string that gets passed to the compiler. Both follow the word 'From'. The Alias directory is a name that is enclosed
                        # in two colons, one immediately before the Alias name and one immediately after, no spaces.

                        (?:                               # Beginning of the optional part, in a non-capturing group
                            \h from \h                    # Optional 'from' keyword, surrounded by 1 horizontal whitespace char
                            (?:                           # Beginning of a non-capturing group
                                : \w+ : \w+ \. \w+        # First pointing file case
                                |                         # OR
                                \x22 \w+ \. \w+ \x22      # Second pointing file case
                            )                             # End of a non-capturing group
                        )?                                # End of the optional part
                    )?                                    # End of the main optional part
                    $                                     # End of current line and end of the class declaration
                    (?s:.*?^\h*endclass)                  # must match all the way to 'endclass'
                "
                closeSymbole="endclass"
            >
                <className>
                    <nameExpr expr="(?x-i)                # Free-spacing mode and inline comments and search sensible to case
                        \h*                               # Optional leading whitespace chars
                        class                             # 'class' keyword
                        \h?                               # Optional whitepace char
                        \K\w+                             # Pure class name
                    " />
                </className>
                <function
                    mainExpr="(?x-s)
                        \h*
                        (?:
                            function  \h+ \w+
                        |   procedure \h+ \w+
                        |   with      \h+ .+
                        )
                        \h*
                    "
                >
                    <functionName>
                        <funcNameExpr expr="(?x-s)        # multiline/comments
                                                          # (! // | && | * ) trying to keep following keywords from being included in comments
                            \h*                           # allow leading spaces
                            (?:
                                function                  # must have word 'function' as first word
                                \h+                       # must have at least one horizontal space after function
                                \K                        # don't keep 'function' in the name of the function in the panel
                                \w+                       # the name of the function is the first whole word after 'function'
                            |
                                procedure                 # must have word 'procedure' as first word
                                \h+                       # must have at least one horizontal space after procedure
                                \K                        # don't keep 'procedure' in the name of the function in the panel
                                (!to)\w+                  # the name of the function is the first whole word after 'procedure' - 'to'
                                                          # so as to exclude any 'set procedure to' statements, needs work though.
                            |
                                with                      # must have word 'with' as first word
                                \h+                       # must have at least one horizontal space after function
                                \K                        # don't keep 'with' in the name of the function in the panel
                                ((?=\(this\))\(this\)|.+)$    # If '(this)' exits at CURRENT position then select it ELSE select any NON NULL string till END of LINE
                            )
                        " />
                    </functionName>
                </function>
            </classRange>
            <function
                mainExpr="(?x-s)
                    \h*
                    (?:
                        function  \h+ \w+
                    |   procedure \h+ \w+
                    |   with      \h+ .+
                    )
                    \h*
                "
            >
                <functionName>
                    <nameExpr expr="(?x-s)                # multiline/comments
                        \h*                               # allow leading spaces
                        (?:
                            function                      # must have word 'function' as first word
                            \h+                           # must have at least one horizontal space after function
                            \K                            # don't keep 'function' in the name of the function in the panel
                            \w+                           # the name of the function is the first whole word after 'function'
                        |
                            procedure \h+ \K (!to)\w+
                        |
                            with                          # must have word 'with' as first word
                            \h+                           # must have at least one horizontal space after function
                            \K                            # don't keep 'with' in the name of the function in the panel
                            ((?=\(this\))\(this\)|.+)$    # If '(this)' exits at CURRENT position then select it ELSE select any NON NULL string till END of LINE
                        )
                    " />
                </functionName>
            </function>
        </parser>
    </functionList>
</NotepadPlus>
Important note: this is just a first try, to give you some ideas!
I supposed that, when using the with syntax, you would want either:
- To see the (this) part, if it exists
- To see anything till the end of line if the (this) part is absent, after the with and the blank char
And I made the modifications, both in the classRange part and in the function parts, using an IF...THEN...ELSE... construction:
    ((?=\(this\))\(this\)|.+)$    # If '(this)' exits at CURRENT position then select it ELSE select any NON NULL string till END of LINE
Of course, for the parser to work properly, I had to change this line in the two function mainExpr parts:
    with \h+ .+    # INSTEAD of : with \h+ \(.*?\)
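To see what the with branch selects, here is a rough Python approximation, not the parser's actual behaviour. It assumes Python's `re` module, where `\h` and `\K` do not exist: `[ \t]` stands in for `\h`, and capture group 1 stands in for the text that `\K` would keep. The names `WITH_NAME`, `with_entries` and `SAMPLE` are hypothetical; the sample lines are taken from the kind of lines used in the test file.

```python
import re

# Approximation of the 'with' branch of the name expression:
#   [ \t]+  replaces \h+  (horizontal whitespace)
#   group 1 replaces \K   (it holds the text shown in the Function List panel)
WITH_NAME = re.compile(r"[ \t]*with[ \t]+((?=\(this\))\(this\)|.+)$", re.MULTILINE)


def with_entries(source):
    """Return the text selected after each 'with' keyword, one entry per match."""
    return [m.group(1) for m in WITH_NAME.finditer(source)]


SAMPLE = "\n".join([
    "function foo",
    "with ($^|]!:Test__--~",
    "blah",
    "with (this)",
    "with (.This.TESTCONTAINER.)",
])

if __name__ == "__main__":
    for entry in with_entries(SAMPLE):
        print(entry)
```

When (this) is followed by more text on the same line, the first alternative fails at the `$` anchor and the `.+` alternative takes the whole remainder of the line.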
Now, I used the test file, below :
/*Test_1 OK */
/* Test_2 OK */
/* Test_3 OK */
//Test_4 OK
// Test_5 OK
// Test_6 OK
&&Test_7 OK
&& Test_8 OK
&& Test_9 OK
class ABC
bla blah
function foo
bla bla blah
with ($^|]!:Test__--~
blah bla
endclass
class XYZ
bla
function 123
bla blah
with (this)
blah
endclass
BLA
function bar
bla blah
with (this)
bla blaH
with (.This.TESTCONTAINER.)
bla blah
- It correctly avoids all the comment lines at the beginning of the file
- It correctly displays the classes ABC and XYZ
- It correctly displays the functions foo and 123
And regarding the modified part:
- If the (this) part exists after with, it correctly displays (this) ( method (this), in the XYZ class, and single function this )
- If the (this) part is absent after with, it correctly displays all the characters till the end of the line ( method ($^|]!:Test__--~, in the ABC class, and the single function (.This.TESTCONTAINER.) )
Here is a screenshot :
Please take the time to re-read this post! It is not easy to catch everything at first glance!
Best Regards,
guy038
-
-
@guy038 ,
Thank you. Now, I will have to look at your code, but while I was signed out, I had another one of those epiphany things. :-)
I figured it out, with reduced complexity, by allowing everything after the this. and stopping before the \) with the lookahead, using this regex:
    \Kthis\.\K(.+)(?=\))
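As a rough check of that regex, here is a sketch assuming Python's `re` module, which has no `\K`: capture group 1 is used in place of the text kept after the last `\K`. The helper `after_this` and the sample lines are hypothetical and not taken from real dBASE code.

```python
import re

# Python approximation of \Kthis\.\K(.+)(?=\)) :
# group 1 holds everything after "this." up to, but not including, the last ")".
AFTER_THIS = re.compile(r"this\.(.+)(?=\))")


def after_this(line):
    """Return the text between 'this.' and the last ')' on the line, or None."""
    m = AFTER_THIS.search(line)
    return m.group(1) if m else None


if __name__ == "__main__":
    print(after_this("with (this.form.grid1)"))  # hypothetical input line
    print(after_this("with (this)"))             # no '.' after 'this'
```

Because `.+` is greedy, the lookahead settles on the last closing parenthesis of the line, so any depth of dotted names is kept without listing each level.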
By doing this, I was able to remove the longer, more complex one that only went 3 levels deep, reducing my OR construct by one OR level. This is what I can get now:
I think I have a problem with overthinking things. :-) I was looking for the one you showed up there, and I might still be able to use it, somewhere else, so thanks.
Lee