FunctionList Confused
-
@lycan-thrope
Testing to see if my change fixed the highlight difference in here:

    <?xml version="1.0" encoding="UTF-8" ?>
    <!-- ==========================================================================\
    |
    |   To learn how to make your own language parser, please check the following
    |   link:
    |       https://npp-user-manual.org/docs/function-list/
    |
    \=========================================================================== -->
    <NotepadPlus>
        <functionList>
            <!-- ========================================================= [ dBASEPlus ] -->
            <parser
                displayName="dBASEPlus"
                id         ="dbaseplus"
                commentExpr="(?s:/\*.*?\*/)|(?m-s://.*?$)|(?m-s:\&\&.*?$)"
            >
                <classRange
                    mainExpr="(?x-i)                    # Free-spacing mode and inline comments + search sensitive to case
                        ^\h*                            # Optional leading whitespace chars
                        class                           # 'class' keyword
                        \h?                             # Optional whitespace char
                        \w+                             # Class name

                        # Following the class name there is the option of parameters, and if so the first entry inside the parens is required, whether there is other
                        # parameters or not, once the parens go up, the first is required. ie: class FrameCtrl(frameObj)
                        (                               # Beginning of the optional parameter(s) part ( Group 1 )
                            \h? \(                      # Opening parenthesis
                            \w+                         # First and required parameter
                            ( , \h? \w+)*               # Following optional/additional parameters
                            \)                          # Closing parenthesis
                        )?                              # End of the optional parameter(s) part

                        # For the rest of the class declaration, after the class name, all other options are part of one big optional set, that follows 'of'
                        # and can be populated by one of several options.
                        (?:                             # Beginning of the main optional part, in a non-capturing group

                            # The first and most prevalent is the Superclass name that the class is being subclassed from, and it's options of parameters and again,
                            # if it has parameters, at least the first one is required ie.: class ToolButtonFx(oParent) of Toolbutton(oParent).
                            \h of \h                    # Optional 'of' keyword, surrounded by 1 horizontal whitespace char
                            \w+                         # Superclass name
                            (?1)?                       # Optional parameter(s) part ( Subroutine call to Group 1 )

                            # The next possible option is that it is a custom object and needs to be in this line so if the object or form is opened up in the dBASE IDE,
                            # the designers in it won't mess up the object by streaming out missing parts or overriding properties or objects and functions.
                            ( \h custom )?              # Optional 'custom' keyword

                            # The next possible option is that the class is being subclassed from another object that is contained elsewhere and the compiler needs to know
                            # this reference. There are two options for pointing to the file. The first is an Alias path in the IDE that can be accessed by the compiler
                            # in the environment, or second, it is in the current directory and only the name is needed...or it has a path that can be listed here,
                            # but this is bad practice, and an Alias is recommended if the file is in a place other than the current directory. If it is, the name can be
                            # used in quotes as a string that gets passed to the compiler. Both follow the word 'From'. The Alias directory is a name that is enclosed
                            # in two colons, one immediately before the Alias name and one immediately after, no spaces.
                            (?:                         # Beginning of the optional part, in a non-capturing group
                                \h from \h              # Optional 'from' keyword, surrounded by 1 horizontal whitespace char
                                (?:                     # Beginning of a non-capturing group
                                    : \w+ : \w+ \. \w+  # First pointing file case
                                |                       # OR
                                    \x22 \w+ \. \w+ \x22    # Second pointing file case
                                )                       # End of a non-capturing group
                            )?                          # End of the optional part
                        )?                              # End of the main optional part
                        $                               # End of current line and end of the class declaration
                        (?s:.*?^\h*endclass)            # must match all the way to 'endclass'
                    "
                    closeSymbole="endclass"
                >
                    <className>
                        <nameExpr
                            expr="(?x-i)                # Free-spacing mode and inline comments and search sensitive to case
                                \h*                     # Optional leading whitespace chars
                                class                   # 'class' keyword
                                \h?                     # Optional whitespace char
                                \K\w+                   # Pure class name
                            "
                        />
                    </className>
                    <function
                        mainExpr="(?x-s)
                            \h*
                            (?:
                                function \h+ \w+
                            |   procedure \h+ \w+
                            |   with \h+ \(.*?\)
                            )
                            \h*
                        "
                    >
                        <functionName>
                            <funcNameExpr
                                expr="(?x-s)            # multiline/comments
                                    # (! // | && | * ) trying to keep following keywords from being included in comments
                                    \h*                 # allow leading spaces
                                    (?:
                                        function        # must have word 'function' as first word
                                        \h+             # must have at least one horizontal space after function
                                        \K              # don't keep 'function' in the name of the function in the panel
                                        \w+             # the name of the function is the first whole word after 'function'
                                    |
                                        procedure       # must have word 'procedure' as first word
                                        \h+             # must have at least one horizontal space after procedure
                                        \K              # don't keep 'procedure' in the name of the function in the panel
                                        (!to)\w+        # the name of the function is the first whole word after 'procedure' - 'to'
                                                        # so as to exclude any 'set procedure to' statements, needs work though.
                                    |
                                        with            # must have word 'with' as first word
                                        \h+             # must have at least one horizontal space after with
                                        \K              # don't keep 'with' in the name of the function in the panel
                                        \(              # start paren
                                        .*?             # 'this' or equivalent
                                        \)              # end paren
                                    )
                                "
                            />
                        </functionName>
                    </function>
                </classRange>
                <function
                    mainExpr="(?x-s)
                        \h*
                        (?:
                            function \h+ \w+
                        |   procedure \h+ \w+
                        |   with \h+ \(.*?\)
                        )
                        \h*
                    "
                >
                    <functionName>
                        <nameExpr
                            expr="(?x-s)                # multiline/comments
                                \h*                     # allow leading spaces
                                (?:
                                    function            # must have word 'function' as first word
                                    \h+                 # must have at least one horizontal space after function
                                    \K                  # don't keep 'function' in the name of the function in the panel
                                    \w+                 # the name of the function is the first whole word after 'function'
                                |
                                    procedure \h+ \K (!to)\w+
                                |
                                    with                # must have word 'with' as first word
                                    \h+                 # must have at least one horizontal space after with
                                    \K                  # don't keep 'with' in the name of the function in the panel
                                    \(                  # start paren
                                    .*?                 # 'this' or equivalent
                                    \)                  # end paren
                                )
                            "
                        />
                    </functionName>
                </function>
            </parser>
        </functionList>
    </NotepadPlus>
-
@lycan-thrope
There seems, according to this forum code highlighting, to be a difference in highlighting, that to me, says there’s a typo somewhere, that I’m not seeing in NPP. Screen 1 shows what looks to be normal, kind of going abnormal:
And then Screen 2 shows what looks like something unclosed or different and abnormal:
Will keep looking, and thanks for at least pointing out that it works for you, so I will go back to seeing what the heck happened: an errant keypress or whatever.
Lee
-
@lycan-thrope
Well, one of the things I may have just discovered is that after taking all those functions that were showing inside comments, and putting them in a class/endclass structure, they stopped showing in the FunctionList. If I’m correct, then that means the problem may be in the post-class function part, since after making them disappear, I copied them outside of the class structure, and they reappeared.

Lee
-
@lycan-thrope
Well, this is interesting. An overseas (to me) user found that by putting the open and close parens after the function, as in `function()`, he was able to make the FunctionList not see it. I just tried it out and indeed, it works. Why, I don’t get, but apparently it does.
Screenshot1 shows my code outside the class/endclass construct, representing a non-class function inside comments:
Screenshot 2 shows that after adding the parens, and reloading the FunctionList, the commented code is now invisible to the FunctionList:
So, I may have coded my regex improperly (other than the misnaming I did, that I fixed), or this is a weird bug maybe? The same code is used in the Class/Function class range, which is why I enclosed it in a class/endclass construct to see if it was persistent or separately problematic.
Lee
-
@lycan-thrope,
Equally problematic is that, looking at the screenshot now, I just noticed that the Class/Endclass construct is not showing as a class inside the FunctionList.

Hmmm…
Lee
-
@lycan-thrope
Never mind, just remembered it has to have an UNcommented function to show in the list. :-(. DOH!!
-
@lycan-thrope said in FunctionList Confused:
There seems, according to this forum code highlighting, to be a difference in highlighting, that to me, says there’s a typo somewhere
Just a quick FYI: you really shouldn’t rely on this forum’s syntax highlighter to accurately determine whether or not you have valid XML. It wasn’t built for that. (The Notepad++ plugin XML Tools would be a much better choice for such checking.)
-
The `(!to)` that you have in there a couple times does not mean what you think it means. I think you meant to say `(?!to)`, a negative lookahead to prevent `to` from being the next word after `procedure`.

But even that isn’t quite right, because if you had a procedure named `todosomething`, the `procedure \h+ \K (?!to)\w+` would eliminate that match. So we need to force a boundary after that as well: `procedure \h+ \K (?!to\b)\w+`. So I think what you want for that alternation in both is

    procedure       # must have word 'procedure' as first word
    \h+             # must have at least one horizontal space after procedure
    \K              # don't keep 'procedure' in the name of the function in the panel
    (?!to\b)\w+     # the name of the function is the first whole word after 'procedure' - 'to'
                    # so as to exclude any 'set procedure to' statements, needs work though.
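To compare the three variants side by side, here is a small Python sketch. It is only an approximation: Notepad++ uses the Boost regex engine, while Python's `re` module has neither `\K` nor `\h`, so a named group and `[ \t]` stand in for them. The sample lines and the helper `names` are made up for illustration.

```python
import re

# Hypothetical sample lines (not taken from real dBASE source).
SAMPLE = [
    "procedure todosomething",
    "set procedure to mylib.prg",
    "procedure Init",
]


def names(pattern):
    """Return the 'name' group for every SAMPLE line the pattern matches."""
    rx = re.compile(pattern)
    return [m.group("name") for line in SAMPLE if (m := rx.search(line))]


# (!to) is an ordinary capture group that must literally match "!to".
literal_group = names(r"procedure[ \t]+(!to)(?P<name>\w+)")

# (?!to) is a negative lookahead, but it also rejects names that start with "to".
lookahead = names(r"procedure[ \t]+(?!to)(?P<name>\w+)")

# (?!to\b) rejects only the whole word "to".
bounded = names(r"procedure[ \t]+(?!to\b)(?P<name>\w+)")

print(literal_group)  # []
print(lookahead)      # ['Init']
print(bounded)        # ['todosomething', 'Init']
```

Only the last form keeps `todosomething` while still skipping the `set procedure to` line.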
I don’t know why our comment expressions aren’t working right in your definition given in your “highlight difference in here” post, because in my simplified definition, that comment expression prevented it completely. Hmph.
Still, we should be able to make it so that `set procedure` or `// procedure` or `&& procedure` will never allow it, by making the function names require a start-of-line before the spaces, so procedure/function/with must be the first non-space on the given line to match. Yes, adding a `^` to the beginning of each of the `<function mainExpr="...">` attribute values seems to have worked.

    <?xml version="1.0" encoding="UTF-8" ?>
    <!-- ==========================================================================\
    |
    |   To learn how to make your own language parser, please check the following
    |   link:
    |       https://npp-user-manual.org/docs/function-list/
    |
    \=========================================================================== -->
    <NotepadPlus>
        <functionList>
            <!-- ========================================================= [ dBASEPlus ] -->
            <parser
                displayName="dBASEPlus"
                id         ="dbaseplus"
                commentExpr="(?s:/\*.*?\*/)|(?m-s:(//|&&).*?$)"
            >
                <classRange
                    mainExpr="(?x-i)                    # Free-spacing mode and inline comments + search sensitive to case
                        ^\h*                            # Optional leading whitespace chars
                        class                           # 'class' keyword
                        \h?                             # Optional whitespace char
                        \w+                             # Class name

                        # Following the class name there is the option of parameters, and if so the first entry inside the parens is required, whether there is other
                        # parameters or not, once the parens go up, the first is required. ie: class FrameCtrl(frameObj)
                        (                               # Beginning of the optional parameter(s) part ( Group 1 )
                            \h? \(                      # Opening parenthesis
                            \w+                         # First and required parameter
                            ( , \h? \w+)*               # Following optional/additional parameters
                            \)                          # Closing parenthesis
                        )?                              # End of the optional parameter(s) part

                        # For the rest of the class declaration, after the class name, all other options are part of one big optional set, that follows 'of'
                        # and can be populated by one of several options.
                        (?:                             # Beginning of the main optional part, in a non-capturing group

                            # The first and most prevalent is the Superclass name that the class is being subclassed from, and it's options of parameters and again,
                            # if it has parameters, at least the first one is required ie.: class ToolButtonFx(oParent) of Toolbutton(oParent).
                            \h of \h                    # Optional 'of' keyword, surrounded by 1 horizontal whitespace char
                            \w+                         # Superclass name
                            (?1)?                       # Optional parameter(s) part ( Subroutine call to Group 1 )

                            # The next possible option is that it is a custom object and needs to be in this line so if the object or form is opened up in the dBASE IDE,
                            # the designers in it won't mess up the object by streaming out missing parts or overriding properties or objects and functions.
                            ( \h custom )?              # Optional 'custom' keyword

                            # The next possible option is that the class is being subclassed from another object that is contained elsewhere and the compiler needs to know
                            # this reference. There are two options for pointing to the file. The first is an Alias path in the IDE that can be accessed by the compiler
                            # in the environment, or second, it is in the current directory and only the name is needed...or it has a path that can be listed here,
                            # but this is bad practice, and an Alias is recommended if the file is in a place other than the current directory. If it is, the name can be
                            # used in quotes as a string that gets passed to the compiler. Both follow the word 'From'. The Alias directory is a name that is enclosed
                            # in two colons, one immediately before the Alias name and one immediately after, no spaces.
                            (?:                         # Beginning of the optional part, in a non-capturing group
                                \h from \h              # Optional 'from' keyword, surrounded by 1 horizontal whitespace char
                                (?:                     # Beginning of a non-capturing group
                                    : \w+ : \w+ \. \w+  # First pointing file case
                                |                       # OR
                                    \x22 \w+ \. \w+ \x22    # Second pointing file case
                                )                       # End of a non-capturing group
                            )?                          # End of the optional part
                        )?                              # End of the main optional part
                        $                               # End of current line and end of the class declaration
                        (?s:.*?^\h*endclass)            # must match all the way to 'endclass'
                    "
                    closeSymbole="endclass"
                >
                    <className>
                        <nameExpr
                            expr="(?x-i)                # Free-spacing mode and inline comments and search sensitive to case
                                \h*                     # Optional leading whitespace chars
                                class                   # 'class' keyword
                                \h?                     # Optional whitespace char
                                \K\w+                   # Pure class name
                            "
                        />
                    </className>
                    <function
                        mainExpr="(?x-s)
                            ^                           # peter added ^ to make sure function/procedure/with is first non-whitespace on line
                            \h*
                            (?:
                                function \h+ \w+
                            |   procedure \h+ \w+
                            |   with \h+ \(.*?\)
                            )
                            \h*
                        "
                    >
                        <functionName>
                            <funcNameExpr
                                expr="(?x-s)            # multiline/comments
                                    # (! // | && | * ) trying to keep following keywords from being included in comments
                                    \h*                 # allow leading spaces
                                    (?:
                                        function        # must have word 'function' as first word
                                        \h+             # must have at least one horizontal space after function
                                        \K              # don't keep 'function' in the name of the function in the panel
                                        \w+             # the name of the function is the first whole word after 'function'
                                    |
                                        procedure       # must have word 'procedure' as first word
                                        \h+             # must have at least one horizontal space after procedure
                                        \K              # don't keep 'procedure' in the name of the function in the panel
                                        (?!to\b)\w+     # the name of the function is the first whole word after 'procedure' - 'to'
                                                        # so as to exclude any 'set procedure to' statements, needs work though.
                                    |
                                        with            # must have word 'with' as first word
                                        \h+             # must have at least one horizontal space after with
                                        \K              # don't keep 'with' in the name of the function in the panel
                                        \(              # start paren
                                        .*?             # 'this' or equivalent
                                        \)              # end paren
                                    )
                                "
                            />
                        </functionName>
                    </function>
                </classRange>
                <function
                    mainExpr="(?x-s)
                        ^                               # peter added ^ to make sure function/procedure/with is first non-whitespace on line
                        \h*
                        (?:
                            function \h+ \w+
                        |   procedure \h+ \w+
                        |   with \h+ \(.*?\)
                        )
                        \h*
                    "
                >
                    <functionName>
                        <nameExpr
                            expr="(?x-s)                # multiline/comments
                                \h*                     # allow leading spaces
                                (?:
                                    function            # must have word 'function' as first word
                                    \h+                 # must have at least one horizontal space after function
                                    \K                  # don't keep 'function' in the name of the function in the panel
                                    \w+                 # the name of the function is the first whole word after 'function'
                                |
                                    procedure \h+ \K (?!to\b)\w+    # the name of the function is the first whole word after 'procedure' - 'to'
                                |
                                    with                # must have word 'with' as first word
                                    \h+                 # must have at least one horizontal space after with
                                    \K                  # don't keep 'with' in the name of the function in the panel
                                    \(                  # start paren
                                    .*?                 # 'this' or equivalent
                                    \)                  # end paren
                                )
                            "
                        />
                    </functionName>
                </function>
            </parser>
        </functionList>
    </NotepadPlus>
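The effect of the leading `^` can be checked outside Notepad++ with a rough Python sketch. The assumptions: `[ \t]` replaces `\h`, a capture group replaces `\K`, `re.MULTILINE` makes `^` match at each line start (which the FunctionList engine appears to do by default), and the test text is invented.

```python
import re

# Approximation of the anchored expression: start of line, optional blanks,
# then the keyword. Anything before the keyword (//, &&, 'set') blocks a match.
ANCHORED = re.compile(
    r"^[ \t]*(?:function|procedure)[ \t]+(?!to\b)(\w+)", re.MULTILINE
)

# Hypothetical test text.
TEXT = """\
function Alpha
    procedure Beta
// function Gamma
&& procedure Delta
set procedure to mylib.prg
"""

found = ANCHORED.findall(TEXT)
print(found)  # ['Alpha', 'Beta']
```

Note that the indented `procedure Beta` is still found: `^` only pins the match to the start of the line, and the `[ \t]*` (or `\h*`) after it absorbs any indentation.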
Also, I see you again had (but fixed) the confusion between `<functionName><funcNameExpr>` in a class vs `<functionName><nameExpr>` not in a class. The FAQ definitely shows it correctly, but I apparently need to clarify that better in the user manual, because it’s not well-defined there. :-(
-
Thanks, that seems to have been the problem. I (mistakenly) thought that the caret `^` meant it had to be the very first character on the line, all the way to the left with no spaces, which is why I didn’t try that. :-(

Thanks for that `!to` fix. I have to say, both of these issues are the result of my probably not having the understanding of regex usage that I should, and that is definitely on me, so thanks for these fixes I should have caught. :(

Anything to help clarity, but really, the manual and the outline that MAPJe71 lays out are pretty clear; it’s just that if I had followed your process of doing the function first, I wouldn’t have gotten confused into thinking the Class function-finding structure was the same as the non-Class function-finding structure. This might be the difference I saw in the forum’s syntax highlighting, which made me start searching for bad tags until I found it. :) Fortunate coincidence, maybe, but effective just the same.

Thanks a bunch. As one of our programmers in the group always points out, code always works against the test usage; it’s when it gets used against other usage that it breaks. That was the problem here. While I was testing against the kind of code I write (without commenting out sections) it worked just fine…it wasn’t until someone who does comment out code sections used it that the error came to light. :-)

Again, many thanks for the help and hopefully I won’t bother you over the New Year, since other than the hints I’m meticulously going over to put together, this problem should be the last regex-related one. :-)

Happy New Year,

Lee
-
Hello, @lycan-thrope and All,

First, I wish an excellent year 2022 to every N++ user, with fewer `COVID-19` stories than before!

Well, Lycan, I thought that the kind of regex below would solve the problem of false-positive `functions`, `procedures` and `with (this)` things!

    commentExpr="(?s:/\*((?!function|procedure|with \(this\)|endwith).)*?\*/)|(?-s://|&&)((?!function|procedure|with \(this\)|endwith).)*$"
Unfortunately, this does not work either! Indeed, assume this INPUT text, in a new tab:

    /*Test_1 */
    //Test_2
    &&Test_3
    //with (that)
    /* Test_4 */
    // Test_5
    && Test_6
    // with (that)
    /* Test_7 */
    // Test_8
    && Test_9
    // with (that)
    //function
    //function ABC
    &&function DEF
    /*function GHI */
    //with (this)
    //endwith
    //procedure
    //procedure JKL
    // function
    // function ABC_1
    && function DEF_1
    /* function GHI_1 */
    // with (this)
    // endwith
    // procedure
    // procedure JKL_1
    // function
    // function ABC_2
    && function DEF_2
    /* function GHI_2 */
    // with (this)
    // endwith
    // procedure
    // procedure JKL_2
    //Test this function being caught
    // Test this function being caught
    // Test this function being caught
    class Test
    bla blah
    function toto
    bla bla blah blah bla
    endclass

Using your `dbasePlus.xml` file, with my new `comments` tag, the `function list` panel does display the `class` Test and the embedded `function` toto.

Now, if you try to see the occurrences of the associated regex:

    (?s:/\*((?!function|procedure|with \(this\)|endwith).)*?\*/)|(?-s://|&&)((?!function|procedure|with \(this\)|endwith).)*$

it just highlights, as expected, the following lines:

    /*Test_1 */
    //Test_2
    &&Test_3
    //with (that)
    /* Test_4 */
    // Test_5
    && Test_6
    // with (that)
    /* Test_7 */
    // Test_8
    && Test_9
    // with (that)

Which are really true comments! Note that `//..with (that)` is considered a true comment because it’s different from `with (this)`.

However, the `commentExpr` tag still displays the different `Functions`, `procedures` and so on as valid values, although they are defined in `comment` parts!

So the simple rule to remember is to forbid any piece of code in `comments`!

!Best Regards,
guy038
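The search result described in this post can be reproduced with a short Python sketch. It tests the regex on its own, exactly as a plain search would, and says nothing about how the FunctionList parser applies `commentExpr`. The `re.MULTILINE` flag is an assumption made so that `$` matches at line ends, and the line list is a shortened subset of the INPUT text.

```python
import re

# The proposed comment expression, copied unchanged from the post.
COMMENT = re.compile(
    r"(?s:/\*((?!function|procedure|with \(this\)|endwith).)*?\*/)"
    r"|(?-s://|&&)((?!function|procedure|with \(this\)|endwith).)*$",
    re.MULTILINE,
)

# A subset of the INPUT text, one entry per line.
LINES = [
    "/*Test_1 */",
    "//Test_2",
    "&&Test_3",
    "//with (that)",
    "//function ABC",
    "&&function DEF",
    "/*function GHI */",
    "//with (this)",
    "//Test this function being caught",
]

# Keep only the lines the expression recognises as comments.
matched = [line for line in LINES if COMMENT.search(line)]
print(matched)  # ['/*Test_1 */', '//Test_2', '&&Test_3', '//with (that)']
```

Every comment that contains one of the keywords is left unmatched, so the parser no longer treats it as a comment.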
-
Hi, @lycan-thrope and All,

Back to what I wrote yesterday, I realize that my reasoning is absolutely false. For instance, if I suppose, as my regex does, that the word `function` must not be present in comments, this implies that it should be displayed as a normal `function`, i.e. exactly the opposite of what I suggested before :-((

So, inverting the regex logic, I should have used the regex:

    commentExpr="(?s:/\*((?=function|procedure|with \(this\)|endwith).)*?\*/)|(?-s://|&&)((?=function|procedure|with \(this\)|endwith).)*$"

But, in this case, normal comments, not containing any code, would not be considered as comments either :-((

So, globally, we should say that the `commentExpr` tag is not acting the way it should! Indeed, whatever occurs right after the comment character(s), it should always be considered as a true comment!

Thus, again, just forbid any piece of code in `comments`!

BR

guy038
-
@guy038

Well, before Peter found the problem with my regex, that was my advice to the guy with the problem. :-)

Patient: Doctor, it hurts when I do this.
Doctor: Stop doing that, then. That’ll be $50.00, please.

Anyway, with Peter’s help it was fixed, and I’m on to the Auto Completion stuff to try and finish this project within the next few days.

Thanks for looking at this, though. After I get done with the Auto Complete, I have to go back and look at removing the `(this.object.object)` parens and let it show the final object as the object listed in the FunctionList, so that it looks a little neater, and more in keeping with the present usage of the in-IDE editor views…but that’s for playing later, I just want to get the full project done before I start fine tuning things. :-)

Happy New Year to you also, and all the forum dwellers here.

Lee
-
@lycan-thrope
Okay, it’s a New Year. Back to the regex stuff. :-)

I’m trying to remove the parens from around the `this` naming that shows the objects in our FunctionList, and am trying to pick off the end name or any combination to display without the `this`. I came up with this regex, which seems to do the job of isolating the last word in the parens and capturing it in its own capture group, but I seem to be having trouble making it work inside of the functionList regex.

    ([.](\w+)\))

As this screenshot of Regex101.com shows:

This is what comes up with your regex:

And what it shows when I try just assigning capture group 2 for the id with this `([.](\w+)\)) \2` in place of your `\( .*? \)` regex:

The problem seems to come when I try anything other than the original regex you developed, Peter. I tried a couple of things to try and give an option of either this last word or `this`…kind of like `(this | ([.](\w+)\))`, and most often it comes up with the first `(this)` object and nothing else, or nothing at all, only functions. Is there a way to isolate the names without the `this` and the parens, like a look-behind thing, or am I way off base here?

Lee
-
@lycan-thrope
By the way, the only thing that really changed was that I got rid of the `\K` and allowed the Function or Procedure keyword to be included with its name. One of the users, who has a lot of old dBASE code that he’s porting over to dBASEPlus, has both kinds, and this helps him quickly identify the differently named identifiers that need renaming or recoding. He had liked the quirk of the functions inside comments showing up, since it let him identify them that way, but he was going to wait until I was able to change it. Once you showed me how to fix it, and once I got the Autocompletion done, it was a matter of minutes before I figured out what to do to return that functionality for him, and he’s really grateful for the ability to discern them now. :-) You’ve been a big help already to our community, so thanks.

Lee
-
@lycan-thrope
Never mind, just saw this section of the FunctionList explanation.
The parser can only search for function names, it will not do regular expression replacement or modification (so you cannot add text to the matching names)
I guess the question becomes, is there a way to isolate those sections that I do want with regex, and then pose a selection choice `this| (foundtext)` depending on whether there are other names or not.

Lee
-
@lycan-thrope
Well, I have hope I’m on to something here. Tried a different site to explain some of the look-ahead/behind stuff and managed to come up with something that half works. :-)

The first half works in that it changes `(this)` to `this` in the functionList panel for the first object, which is the Form itself. I used this regex, probably improperly, or I need to phrase it differently:

    \(\Kthis(?=\))|[.]\K\w+(?=\))

but the OR condition didn’t take, or it did, but since the other objects have `this` with a dot in it, it disqualified the condition and threw out the final objects instead. So now this:

Looks like this:

I need to figure out how to make the regex choose between supplying the captured group of either `this` by itself, OR, if a match works that ends with a dot-word-closing-paren, displaying just the word for that Object.

In the above screenshot, the `(this.TESTCONTAINER.VSCROLLBAR1)` is a child object of the TESTCONTAINER, but…the VSCROLLBAR1 is an object in itself. If I can capture the whole thing after `this` without the parens, but with the dot between the two objects, that would be considered a success since it at least cleans it up. But I am trying to just pick off the last object name, so the list would just include the objects themselves for navigation purposes when using the FunctionList panel…so I’m still working on it, trying to see how to make it work.

Lee
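As a sanity check of the alternation itself, here is a hedged Python sketch. Python's `re` has no `\K`, so each `\K` is rewritten as a fixed-width lookbehind, which is an approximation rather than the same expression. The `with` lines are hypothetical, modelled on the object names mentioned in the post.

```python
import re

# Lee's alternation with \K replaced by lookbehinds:
#   \(\Kthis(?=\))   ->  (?<=\()this(?=\))
#   [.]\K\w+(?=\))   ->  (?<=\.)\w+(?=\))
LAST_NAME = re.compile(r"(?<=\()this(?=\))|(?<=\.)\w+(?=\))")

# Hypothetical 'with' lines.
LINES = [
    "with (this)",
    "with (this.TESTCONTAINER)",
    "with (this.TESTCONTAINER.VSCROLLBAR1)",
]

last_names = [LAST_NAME.search(line).group(0) for line in LINES]
print(last_names)  # ['this', 'TESTCONTAINER', 'VSCROLLBAR1']
```

In isolation the OR picks the right branch for each line. That suggests, though it does not prove, that the trouble inside FunctionList comes from how the alternation combines with the preceding `with \h+` part of the name expression rather than from the alternation itself.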
-
@lycan-thrope
Interesting, reversing a different, but similar regex gets me the VSCROLLBAR1 but nothing else. Hmm… I guess the or operator is not going to get this job done. Hmm.
-
Okay, now we’re cooking with gas. :-)

I figured out how to exclude the parens with `\K(.*?)(?=\))` and here’s what it looks like:

I fear, however, that keeping the `this` for the first object, and removing it from the rest, is going to be a bit harder…but maybe not, if I can figure out how to make the or work and how to keep the `this` keyword if it’s followed by the closing paren, but not if it’s followed by a dot operator, and use the rest of the object description instead. This is kind of fun…and kind of frustrating at the same time. :-)

Lee
-
@lycan-thrope
Any chance that there is an IF/ELSE kind of Regex construction that the FunctionList parser will accept, instead of selecting a capture group, or using OR (`|`)?

Lee
-
@lycan-thrope
Well, success for the most part for the original goals. :-)

Of course, as usual, goals shift. :-)

I used Peter’s OR breakdown and added additional alternatives, starting with the longest chain of names (in this case, and unfortunately only this case, 3), followed by two more ORs, each repeating the whole regex up to the opening paren where the reset starts, with only the capture portion changed.

    \Kthis\.\w+\.\K\w+(?=\))|\)this\.\K\w+(?=\.)|\(\Kthis(?=\))

picks off the longest extended name and does it first, followed by the next section with this regex:

    \Kthis\.\K\w+(?=\))

to pick off any object immediately after the dot operator following `this`, and then followed by:

    \K(.*?)(?=\))

which captures the `this` object before the closing parens.

Screenshot of new FunctionList panel at work:

So this was what I originally hoped to be able to do, but after looking at it, since I can’t do replacement text, it makes sense to keep the object list together past the first one following the `this`, as that denotes a parent object `TESTCONTAINER` and the lineage to the child object `VSCROLLBAR1`.

So now my next goal is to have the superparent `this` listed alone, and to let any other objects that have children continue being sucked into the capture and listed as is, in this case `TESTCONTAINER.VSCROLLBAR1`. So back to the drawing board to figure out how to test for a following dot operator without stopping the accumulation at that point. The look-ahead (positive?) worked to identify and not include the closing paren or the dot operator, so now to test a negative lookahead?

Thanks for the help so far, Peter, et al. It’s fun again for the moment. :-)

Lee
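One possible direction for that next goal, sketched in Python under clear assumptions: lookbehinds stand in for `\K`, the input lines are hypothetical, and nothing here has been tried inside the FunctionList parser. The idea is to keep `this` when it stands alone and otherwise keep everything after `this.`, dots included, so no negative lookahead is needed.

```python
import re

# Branch 1: a bare 'this' directly between the parens.
# Branch 2: everything after '(this.' up to the closing paren, dots included.
DISPLAY = re.compile(r"(?<=\()this(?=\))|(?<=\(this\.)[\w.]+(?=\))")

# Hypothetical 'with' lines.
LINES = [
    "with (this)",
    "with (this.TESTCONTAINER)",
    "with (this.TESTCONTAINER.VSCROLLBAR1)",
]

display_names = [DISPLAY.search(line).group(0) for line in LINES]
print(display_names)
# ['this', 'TESTCONTAINER', 'TESTCONTAINER.VSCROLLBAR1']
```

Because the character class `[\w.]` accepts the dot, the capture runs through any depth of child objects instead of being limited to three names. A Boost-flavoured version would presumably use `\K` in place of the lookbehinds, but that is untested here.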