FunctionList Confused
-
@michael-vincent said in FunctionList Confused:
Note I think there is an error in @PeterJones functionName block in the classRange tags.
And when I fix that, I now see the function in the class.
So @Lycan-Thrope , that’s what you need to fix, and my simple mixed parser now works with your example code.
<?xml version="1.0" encoding="UTF-8" ?>
<!-- ==========================================================================\
|
|   To learn how to make your own language parser, please check the following
|   link:
|       https://npp-user-manual.org/docs/function-list/
|
\=========================================================================== -->
<NotepadPlus>
    <functionList>
        <!-- ========================================================= [ dBASEPlus ] -->
        <parser
            displayName="dBASEPlus"
            id         ="dbaseplus"
            commentExpr="(?s:/\*.*?\*/)|(?m-s://.*?$)"
        >
            <classRange mainExpr="(?xms) ^ \h* class \h+.* ^ \h* endclass" >
                <className>
                    <nameExpr expr="(?x) [\t\x20]*? ^class [\t\x20]* \K(\w+)" />
                </className>
                <function mainExpr="(?x) \h* function \h+ (\w+)" >
                    <functionName>
                        <funcNameExpr expr="(?x) \h* function \h+ \K (\w+)" />
                    </functionName>
                </function>
            </classRange>
            <function mainExpr="(?x) \h* function \h+ (\w+)" >
                <functionName>
                    <nameExpr expr="(?x) \h* function \h+ \K (\w+)" />
                </functionName>
            </function>
        </parser>
    </functionList>
</NotepadPlus>
(Since I used the (?x), you can go back and add newlines and comments to document the regex again; I just collapsed them because it’s easier to copy/paste into the FIND dialog that way)
-
@michael-vincent
Thank you Michael for that catch. I wouldn’t have seen that while following instructions. :-)
Lee
-
@peterjones
Thank you so much also Peter. I can now feel somewhat elated at banging my head all day yesterday. :-)
Now to get back to work…had to take some time for house chores since I neglected them yesterday beating my head on the keyboard. :-)
Lee
-
@lycan-thrope
Argh…sigh…I must be set up wrong or something. I just tried it in both the AppData NPP functionList directory and the Program Files (x86) functionList directory and it’s not showing anything…so back to reading the docs, since I know yours works. :-(
-
Can you show the relevant portion of overrideMap.xml, and compare that to the name in the User Defined Language list / definition?
In my working version (which is a portable, so no %AppData%):
<!-- ==================== User Defined Languages ============================ -->
<association id= "krl.xml"            userDefinedLangName="KRL"/>
<association id= "sinumerik.xml"      userDefinedLangName="Sinumerik"/>
<association id= "universe_basic.xml" userDefinedLangName="UniVerse BASIC"/>
<association id= "dbaseplus.xml"      userDefinedLangName="dBASEPlus"/>
<!-- ======================================================================== -->
and from the UDL definition:
<UserLang name="dBASEPlus" ext="dbp" udlVersion="2.1">
Notice that the userDefinedLangName in overrideMap and the name in the UDL definition match exactly.
And the active file must have selected the dBASEPlus UDL as its syntax highlighter – either through automatic extension (.dbp in my example), or by manually selecting Language > dBASEPlus, where the dBASEPlus will be down below the User Defined Language… submenu.
Also remember that for any change in functionList settings to take effect, you have to completely exit Notepad++ and restart it.
-
@lycan-thrope
Well, kind of at a loss to explain this. I’ve gone over the installations again, made sure the AppData directories had the right files, even went into Admin mode to make sure it wasn’t rejected. The following screenshot shows the overrideMap.xml at the end where the UDL association is stored.
Mind you, this was while I only had the UDL in the dialog box, not stored in the UDL folder in the AppData area.
When I thought, hmm…maybe the file needs to be in the UDL AppData folder, I took an exported version I’ve made, renamed it appropriately “dBASEPlus.xml” and now I have this:
I wasn’t able to get it to work with or without the duplicate there. I’m going to delete the UDL file contents to see if that does the trick. Either way, all the file extensions needed were listed in that UDL and the copied and pasted version of Peter’s code still wasn’t working, so it must be me. :-(
File extensions in UDL file:
I guess I’ll need to retrace my steps.
Lee
-
@lycan-thrope
Smacking self vigorously about the face and head. :-(
I’m glad I was re-reading our posts, and noticed that I had NOT put exactly the right name in the associations. I had spelled it “dBasePlus” in the non-working file, and after changing it to the proper “dBASEPlus”, it worked on restart…so thank you very much Peter for saving me from myself. Maybe now I can get to some productive work of finishing this UDL for the community, and of course, if it’s of use to this NPP community, it’s usable here too…I just need to check how to add it to the UDLs. Of course, I need to make sure the rest of the keywords are put in and don’t collide with each other or with other delimiters etc…and flesh out as much as I can without having to actually write a lexer for the language, because…let’s face it, I don’t think I’m qualified for that extensive bit of work. :-)
Lee
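The mismatch described in this post comes down to a single attribute value. A minimal sketch of the two overrideMap.xml lines, assuming the parser file is named dbaseplus.xml as in Peter's working example (the file name in Lee's own setup is not shown in the thread, so it is an assumption here):

```xml
<!-- Not picked up: "dBasePlus" differs in case from the UDL name "dBASEPlus" -->
<association id= "dbaseplus.xml" userDefinedLangName="dBasePlus"/>

<!-- Picked up after a full restart: matches <UserLang name="dBASEPlus" ...> exactly -->
<association id= "dbaseplus.xml" userDefinedLangName="dBASEPlus"/>
```

Only one of the two lines would appear in a real file; they are shown together purely for comparison.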
-
@lycan-thrope
Next hurdle I need to address and ask about. Does the parser only handle one class and multiple functions per file? Or does it require, for a class to be listed, that it has a function in it?
I tried using my original code with some of Peter’s and found it messing up on the class not showing up, and the functions acting a little erratic. I’ve put it back to the way he wrote it now, with one or two changes that don’t seem to affect the end result, but I’m checking the different kinds of files we use, and one in particular, our .cc files, which are usually Custom Component files, include more than one class.
In the one I was testing of my own creation, of the multiple classes I have in it, only one of them has a function, and it was the only one that showed up in FunctionList panel, which is what prompted this question coming up. If I just need to work further to get it to function listing classes, that is one thing, but if it’s not built to that, I wanted to check, as I don’t seem to have any multiple class code files available right now for any of the languages available in NPP…I will but just don’t have them right now…and it’s getting late time to stop beating the keyboard. :) Thanks in advance.
Here’s two screenshots of the one class, multiple functions and one of the multiple classes but one function in the FunctionList panel.
Lee
-
@lycan-thrope said in FunctionList Confused:
Does the parser only handle one class and multiple functions per file? Or does it require for a class to be listed that it has a function in it?
It can handle multiple classes, but for a class to be listed, it must have at least 1 function in it. It can also handle “main” functions - that is, functions not in a class.
Cheers.
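As a rough illustration of that answer, the mixed parser from earlier in the thread has two separate places where functions are matched. The sketch below reuses the simple regexes from Peter's example, with one labeled change: the class range uses a lazy .*? (an assumption borrowed from his later endclass fix) so that each class stops at its own endclass. The comments are an editorial gloss, not part of anyone's posted file.

```xml
<parser displayName="dBASEPlus" id="dbaseplus" commentExpr="(?s:/\*.*?\*/)|(?m-s://.*?$)" >
    <!-- Each match of this mainExpr is one class. A class is only listed
         when the nested <function> finds at least one match inside it. -->
    <classRange mainExpr="(?xms) ^ \h* class \h+ .*? ^ \h* endclass" >
        <className>
            <nameExpr expr="(?x) [\t\x20]*? ^class [\t\x20]* \K(\w+)" />
        </className>
        <function mainExpr="(?x) \h* function \h+ (\w+)" >
            <functionName>
                <!-- Inside classRange the name element is funcNameExpr -->
                <funcNameExpr expr="(?x) \h* function \h+ \K (\w+)" />
            </functionName>
        </function>
    </classRange>
    <!-- Functions outside any class ("main" functions) are matched here,
         and here the name element is nameExpr -->
    <function mainExpr="(?x) \h* function \h+ (\w+)" >
        <functionName>
            <nameExpr expr="(?x) \h* function \h+ \K (\w+)" />
        </functionName>
    </function>
</parser>
```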
-
Hello, @lycan-thrope @peterjones, @michael-vincent and All,
Sorry to be late ! From your functionList file, provided in this specific post, I tried to re-build and simplify your regexes which detect all the syntaxes of dBASEPlus classes, as well as the class names !
- I added, at the beginning, the (?-i) in-line modifier for a case-sensitive search
- I mainly used the \h syntax, instead of [\t\x20], as it also covers the no-break space, i.e. roughly [\t\x20\xA0]
- I replaced any \w* syntax by \w+, as we don’t search for empty words, do we ?
- I suppressed some useless groups
- I changed some complex multi-line groups (•••••) into the equivalent non-capturing groups (?:•••••)
- I kept the first optional parameter(s) part as group 1, which is re-used, with the same syntax, in the optional ‘of’ part
Leading to the following class part of the dBASEPlus parser :
<NotepadPlus>
    <functionList>
        <!-- ========================================================= [ dBASEPlus ] -->
        <parser
            displayName="dBASEPlus"
            id         ="dbaseplus"
            commentExpr="/\*.*?\*/|(?-s://.*)"
        >
            <classRange
                mainExpr="(?x-i)          # Free-spacing mode and inline comments + search sensitive to case
                    ^\h*                  # Optional leading whitespace chars
                    class                 # 'class' keyword
                    \h?                   # Optional whitespace char
                    \w+                   # Class name
                    # Following the class name there is the option of parameters, and if so the first entry inside the parens is required, whether there is other
                    # parameters or not, once the parens go up, the first is required. ie: class FrameCtrl(frameObj)
                    (                     # Beginning of the optional parameter(s) part ( Group 1 )
                        \h? \(            # Opening parenthesis
                        \w+               # First and required parameter
                        ( , \h? \w+)*     # Following optional/additional parameters
                        \)                # Closing parenthesis
                    )?                    # End of the optional parameter(s) part
                    # For the rest of the class declaration, after the class name, all other options are part of one big optional set, that follows 'of'
                    # and can be populated by one of several options.
                    (?:                   # Beginning of the main optional part, in a non-capturing group
                        # The first and most prevalent is the Superclass name that the class is being subclassed from, and its options of parameters and again,
                        # if it has parameters, at least the first one is required ie.: class ToolButtonFx(oParent) of Toolbutton(oParent).
                        \h of \h          # Optional 'of' keyword, surrounded by 1 horizontal whitespace char
                        \w+               # Superclass name
                        (?1)?             # Optional parameter(s) part ( Subroutine call to Group 1 )
                        # The next possible option is that it is a custom object and needs to be in this line so if the object or form is opened up in the dBASE IDE,
                        # the designers in it won't mess up the object by streaming out missing parts or overriding properties or objects and functions.
                        ( \h custom )?    # Optional 'custom' keyword
                        # The next possible option is that the class is being subclassed from another object that is contained elsewhere and the compiler needs to know
                        # this reference. There are two options for pointing to the file. The first is an Alias path in the IDE that can be accessed by the compiler
                        # in the environment, or second, it is in the current directory and only the name is needed...or it has a path that can be listed here,
                        # but this is bad practice, and an Alias is recommended if the file is in a place other than the current directory. If it is, the name can be
                        # used in quotes as a string that gets passed to the compiler. Both follow the word 'From'. The Alias directory is a name that is enclosed
                        # in two colons, one immediately before the Alias name and one immediately after, no spaces.
                        (?:               # Beginning of the optional part, in a non-capturing group
                            \h from \h    # Optional 'from' keyword, surrounded by 1 horizontal whitespace char
                            (?:           # Beginning of a non-capturing group
                                : \w+ : \w+ \. \w+     # First pointing file case
                            |                          # OR
                                \x22 \w+ \. \w+ \x22   # Second pointing file case
                            )             # End of a non-capturing group
                        )?                # End of the optional part
                    )?                    # End of the main optional part
                    $                     # End of current line and end of the class declaration
                "
                closeSymbole="endclass"
            >
                <className>
                    <nameExpr expr="(?x-i)    # Free-spacing mode and inline comments and search sensitive to case
                        \h*                   # Optional leading whitespace chars
                        class                 # 'class' keyword
                        \h?                   # Optional whitespace char
                        \K\w+                 # Pure class name
                    " />
                </className>
                ...
                ...
                ...
                ...
        </parser>
    </functionList>
</NotepadPlus>
Best Regards,
guy038
-
-
@guy038 ,
I tried replacing my simple class detection with yours, and it doesn’t show the classes – even when I keep my function definitions. When I put your expression into FIND, it’s back to matching only a single line, not the whole class – so that means it can never find a function inside that class, and it thus won’t display anything in the function list (which is the second problem I pointed out here: “mainExpr needs to match all of the text in the class…”).
If we use the function blocks from my “my simple mixed parser now works” post, and incorporate your regex (unaltered), then I get a result which is not finding it as a class, and is instead finding it as a top-level function (which isn’t what it really is).
This does confirm that defining closeSymbole isn’t enough to get it to extend the class to endclass when endclass is not in the mainExpr.
Thus, if I add
(?s:.*?^\h*endclass) # must match all the way to 'endclass'
after your $ line in the mainExpr, the parser then finds the class.
@Lycan-Thrope , that last line I added defines a non-capturing group where the dot matches newlines, and finds as few characters as possible between the end of the class-starting line and the first instance of the word endclass at the beginning of a line (with potential whitespace before) – thus making the mainExpr match the whole class, not just the first line of the class.
<?xml version="1.0" encoding="UTF-8" ?>
<!-- ==========================================================================\
|
|   To learn how to make your own language parser, please check the following
|   link:
|       https://npp-user-manual.org/docs/function-list/
|
\=========================================================================== -->
<NotepadPlus>
    <functionList>
        <!-- ========================================================= [ dBASEPlus ] -->
        <parser
            displayName="dBASEPlus"
            id         ="dbaseplus"
            commentExpr="(?s:/\*.*?\*/)|(?m-s://.*?$)"
        >
            <classRange
                mainExpr="(?x-i)          # Free-spacing mode and inline comments + search sensitive to case
                    ^\h*                  # Optional leading whitespace chars
                    class                 # 'class' keyword
                    \h?                   # Optional whitespace char
                    \w+                   # Class name
                    # Following the class name there is the option of parameters, and if so the first entry inside the parens is required, whether there is other
                    # parameters or not, once the parens go up, the first is required. ie: class FrameCtrl(frameObj)
                    (                     # Beginning of the optional parameter(s) part ( Group 1 )
                        \h? \(            # Opening parenthesis
                        \w+               # First and required parameter
                        ( , \h? \w+)*     # Following optional/additional parameters
                        \)                # Closing parenthesis
                    )?                    # End of the optional parameter(s) part
                    # For the rest of the class declaration, after the class name, all other options are part of one big optional set, that follows 'of'
                    # and can be populated by one of several options.
                    (?:                   # Beginning of the main optional part, in a non-capturing group
                        # The first and most prevalent is the Superclass name that the class is being subclassed from, and its options of parameters and again,
                        # if it has parameters, at least the first one is required ie.: class ToolButtonFx(oParent) of Toolbutton(oParent).
                        \h of \h          # Optional 'of' keyword, surrounded by 1 horizontal whitespace char
                        \w+               # Superclass name
                        (?1)?             # Optional parameter(s) part ( Subroutine call to Group 1 )
                        # The next possible option is that it is a custom object and needs to be in this line so if the object or form is opened up in the dBASE IDE,
                        # the designers in it won't mess up the object by streaming out missing parts or overriding properties or objects and functions.
                        ( \h custom )?    # Optional 'custom' keyword
                        # The next possible option is that the class is being subclassed from another object that is contained elsewhere and the compiler needs to know
                        # this reference. There are two options for pointing to the file. The first is an Alias path in the IDE that can be accessed by the compiler
                        # in the environment, or second, it is in the current directory and only the name is needed...or it has a path that can be listed here,
                        # but this is bad practice, and an Alias is recommended if the file is in a place other than the current directory. If it is, the name can be
                        # used in quotes as a string that gets passed to the compiler. Both follow the word 'From'. The Alias directory is a name that is enclosed
                        # in two colons, one immediately before the Alias name and one immediately after, no spaces.
                        (?:               # Beginning of the optional part, in a non-capturing group
                            \h from \h    # Optional 'from' keyword, surrounded by 1 horizontal whitespace char
                            (?:           # Beginning of a non-capturing group
                                : \w+ : \w+ \. \w+     # First pointing file case
                            |                          # OR
                                \x22 \w+ \. \w+ \x22   # Second pointing file case
                            )             # End of a non-capturing group
                        )?                # End of the optional part
                    )?                    # End of the main optional part
                    $                     # End of current line and end of the class declaration
                    (?s:.*?^\h*endclass)  # must match all the way to 'endclass'
                "
                closeSymbole="endclass"
            >
                <className>
                    <nameExpr expr="(?x-i)    # Free-spacing mode and inline comments and search sensitive to case
                        \h*                   # Optional leading whitespace chars
                        class                 # 'class' keyword
                        \h?                   # Optional whitespace char
                        \K\w+                 # Pure class name
                    " />
                </className>
                <function mainExpr="(?x) \h* function \h+ (\w+)" >
                    <functionName>
                        <funcNameExpr expr="(?x)  # multiline/comments
                            \h*        # allow leading spaces
                            function   # must have word 'function' as first word
                            \h+        # must have at least one horizontal space after function
                            \K         # don't keep 'function' in the name of the function in the panel
                            (\w+)      # the name of the function is the first whole word after 'function'
                        " />
                    </functionName>
                </function>
            </classRange>
            <function mainExpr="(?x) \h* function \h+ (\w+)" >
                <functionName>
                    <nameExpr expr="(?x)  # multiline/comments
                        \h*        # allow leading spaces
                        function   # must have word 'function' as first word
                        \h+        # must have at least one horizontal space after function
                        \K         # don't keep 'function' in the name of the function in the panel
                        (\w+)      # the name of the function is the first whole word after 'function'
                    " />
                </functionName>
            </function>
        </parser>
    </functionList>
</NotepadPlus>
-
@guy038
Thanks guy038, for this lesson in regex construction. I was curious if I was being overly verbose in my regex for that line, but it does work correctly the way I wrote it in regex101.com, so I presumed it would work here. So far, although it works in the sandbox over there, and it does work in the find in NPP, I was still, as Peter points out next, not able to capture all the way to endclass, so this version still did what mine was doing. :-)…but I did pick up, because I understand my syntax, where I need to use capturing and non-capturing groups, as I was a bit lost as what meant what and where. So thanks. Here’s a screenshot, that shows the same thing I discovered last night…err early this morning before I called it quits that shows what I messed up, and Peter’s new version corrects. Mine, which you duplicate less verbosely, shows that the first class in the list gets listed with the function that actually belongs to a class further down in the code.
So thanks for the education, though. :)
Lee
-
@peterjones ,
Thanks, that works much better, and as I pointed out above, it fixes a problem I discovered late last night (early this morning), where the wrong class was shown with the function attached to it…which I’m going to guess is the problem you fixed by including that endclass in the expression. I was trying that also, but my verbose and maybe confusing capturing groups were probably preventing it from working to that point. ::shrug::
Here’s a screenshot of my multiclass file showing the proper class with the function that your parser code finds.
Thank you for getting this correct, if I’m not able to show classes without functions as Michael Vincent states(shout out to Michael), at least it now shows the right class with the found function.
This lack of class showing without functions, however, kind of hurts the purpose of my goal, which was to use the FunctionList panel as a navigational tool, much like the dBASEPlus editor, which shows the classes and the objects included in each class so you can jump to those classes and objects to edit them. Some of the .cc files can be quite extensive since they can hold all the UI modifications a developer may want to put in them, and many of the code files can get quite large and need some help navigating. I guess we could circumvent the problem by putting a null function in every class to compensate for the FunctionList not displaying classes without functions. But thank you for helping get this to work properly.
Lee
-
@lycan-thrope said in FunctionList Confused:
This lack of class showing without functions, however, kind of hurts the purpose of my goal which was to use the FunctionList panel as a navigational tool,
If a class will always have at least one of “with” and “function”, then change the function inside the classRange to:
<function mainExpr="(?x-s) \h* (?: function \h+ \w+ | with \h+ \(.*?\) ) \h* " >
    <functionName>
        <funcNameExpr expr="(?x-s)  # multiline/comments
            \h*             # allow leading spaces
            (?:
                function    # must have word 'function' as first word
                \h+         # must have at least one horizontal space after function
                \K          # don't keep 'function' in the name of the function in the panel
                \w+         # the name of the function is the first whole word after 'function'
            |
                with        # must have word 'with' as first word
                \h+         # must have at least one horizontal space after with
                \K          # don't keep 'with' in the name of the function in the panel
                \(          # start paren
                .*?         # 'this' or equivalent
                \)          # end paren
            )
        " />
    </functionName>
</function>
It will then give the “argument” to with as a function name as well…
It might not be exactly what you want, but it comes close.
-
@peterjones
You’re a genius. :-)
Every class does indeed, as far as I know, have to have a this and a with/endwith construct that sets the properties of the object it will create, unless it is only copying and giving a different name to a default superclass.
I was going to ask and point out that I thought the C++ mixed parser that MAPJe71 shows in the FAQ would work, because it shows a function being declared inside the class construct but actually being defined outside.
dBASE has a similar thing where it is assigned as a property
    onOpen = class::objectsname_onOpen
and then is later either actually defined in the class definition, or externally defined (if that’s possible). I know we define helper functions outside of the class constructor that are either defined in the same file, or an external file that will be named as needed or prior to the start of the form or function it’s being called in. Like I said, it’s kind of loosely typed and ambiguous, but it works :)
I tried my earlier suggestion of putting a null function in each class and took a screenshot of it.
Simple search and replace on endclass and adding the fake function made it work, but I kind of like yours better, maybe using the this or with words will work better. Thanks for this.
Lee
-
@peterjones
Incidentally, Peter.
While trying to copy this over, I hadn’t read that I should put this in the classRange/function body, and had put it in the function body outside. No biggie, right? I took the copy from the upper post, copied it into the right section, then set about removing my fake function from the .cc file using the search terms and switching them between the search and replace boxes. After going through the list, sure that I had skipped the only class with a real function, I saved and ran the program, only to find that I had replaced the real function.
Argh…panic set in. Oh wait…I have a .bak file of the original saved one. I recopied it and then tested the function list again and voilà. See how disparate threads lead to one point? :-) Saved from myself, again. :-)
Anyway, thanks for this. I think you have made it very possible for this package to be a nice Christmas present for the dBASEPlus community, which has been longing for an external editor with the basic features of the IDE editor plus some of the more advanced tools available in Notepad++. Many were already NPP users but didn’t have highlighting, and NPP now makes it possible for us to combine the two. Now our folks, as one user put it, will be able to edit the file externally and run it right away without having to close the editor just to compile the program. Of course, thanks to the default .bak choice, it will not function like our environment to save us from it and ourselves. :-)
Lee
-
Hi, @lycan-thrope, @peterjones and All,
As you can see, there are different ways to get an element, whatever it is, listed in the Function List panel !!
I’ll explain what my process was for writing my previous post and how I tested my regex solution !
First, from the initial list of the 9 class syntaxes, provided by @lycan-thrope, below :
class TruckNotebookForm of TBASE from :Truck:Truckbase.cfm
class PlainObjectListForm of FORM
class FrameCtrl(frameObj) of dBCWndCtrl custom
class dBCWndCtrl
class FrameAppCtrl of FrameCtrl custom
class ToolButtonFx(oParent) of Toolbutton(oParent) custom
class MenuFx(oParent,cName)
class dContainersForm of DFORM from "dForm.cfm"
class LGCENTRYFIELD(parentObj, name) of ENTRYFIELD(parentObj, name) custom
I decided to search, first, for a regex solution only, without any reference to the Function List mechanism ! In addition, I considered only the line of the class definition ! After modifications, I ended up with the following mainExpr regex :
(?x-i)                # Free-spacing mode and inline comments + search sensitive to case
^\h*                  # Optional leading whitespace chars
class                 # 'class' keyword
\h?                   # Optional whitespace char
\w+                   # Class name
(                     # Beginning of the optional parameter(s) part
    \h? \(            # Opening parenthesis
    \w+               # First and required parameter
    ( , \h? \w+)*     # Following optional/additional parameters
    \)                # Closing parenthesis
)?                    # End of the optional parameter(s) part
(?:                   # Beginning of the main optional part
    \h of \h          # Optional 'of' keyword, surrounded by 1 horizontal whitespace char
    \w+               # Superclass name
    (?1)?             # Optional parameter(s) part
    ( \h custom )?    # Optional 'custom' keyword
    (?:               # Beginning of the optional part
        \h from \h    # Optional 'from' keyword, surrounded by 1 horizontal whitespace char
        (
            : \w+ : \w+ \. \w+     # First pointing file case
        |                          # OR
            \x22 \w+ \. \w+ \x22   # Second pointing file case
        )
    )?                # End of the optional part
)?                    # End of the main optional part and end of the class declaration
If you select all the text between (?x-i) and the last comment ..... and end of the class declaration ( so a selection of 1,607 chars, which is under the maximum of 2,046 chars ) and open the Find dialog (Ctrl + F), you’ll verify that it correctly matches the 9 syntaxes, provided by @lycan-thrope !
Of course, I tested my regex solution and was upset to notice that it would also match wrong pieces of code :-((
For instance, after taking the last syntax and adding a # char in some places, it would wrongly match some part of the two lines, below :
class LGCENTRYFIELD(parentObj, name) of ENTRYFIELD(parentObj, #name) custom
class LGCENTRYFIELD(parentObj#, name) of ENTRYFIELD(parentObj, name) custom
As you can see, as soon as a part contains a non-allowed char (#), my regex skips it and just matches the minimum valid form :-( I finally solved this problem by adding, at the end of my regex, the $ symbol which forces the regex engine to accept only syntaxes that are valid over a complete line !
So, the right multi-line regex, for the mainExpr attribute, is rather :
(?x-i)                # Free-spacing mode and inline comments + search sensitive to case
^\h*                  # Optional leading whitespace chars
class                 # 'class' keyword
\h?                   # Optional whitespace char
\w+                   # Class name
(                     # Beginning of the optional parameter(s) part
    \h? \(            # Opening parenthesis
    \w+               # First and required parameter
    ( , \h? \w+)*     # Following optional/additional parameters
    \)                # Closing parenthesis
)?                    # End of the optional parameter(s) part
(?:                   # Beginning of the main optional part
    \h of \h          # Optional 'of' keyword, surrounded by 1 horizontal whitespace char
    \w+               # Superclass name
    (?1)?             # Optional parameter(s) part
    ( \h custom )?    # Optional 'custom' keyword
    (?:               # Beginning of the optional part
        \h from \h    # Optional 'from' keyword, surrounded by 1 horizontal whitespace char
        (
            : \w+ : \w+ \. \w+     # First pointing file case
        |                          # OR
            \x22 \w+ \. \w+ \x22   # Second pointing file case
        )
    )?                # End of the optional part
)?                    # End of the main optional part
$                     # End of current line and end of the class declaration
And you’ll notice, this time, that the last 2 syntaxes, below, are not matched, as expected :
class TruckNotebookForm of TBASE from :Truck:Truckbase.cfm
class PlainObjectListForm of FORM
class FrameCtrl(frameObj) of dBCWndCtrl custom
class dBCWndCtrl
class FrameAppCtrl of FrameCtrl custom
class ToolButtonFx(oParent) of Toolbutton(oParent) custom
class MenuFx(oParent,cName)
class dContainersForm of DFORM from "dForm.cfm"
class LGCENTRYFIELD(parentObj, name) of ENTRYFIELD(parentObj, name) custom
class LGCENTRYFIELD(parentObj, name) of ENTRYFIELD(parentObj, #name) custom
class LGCENTRYFIELD(parentObj#, name) of ENTRYFIELD(parentObj, name) custom
However, this second version still considers the current class definition line as the only domain to study, without any further stuff and/or any endclass statement !
As a second step, I tried to test this final regex version with the Function List feature. So, in the overrideMap.xml file, I added the line :
<association id= "dBasePlus.xml" langID= "0"/> <!-- Normal Text ID -->
Then, I used the general template, provided by @lycan-thrope, to build a correct dBasePlus.xml file. But, NO chance, I was unable to get the list of classes in the Function List panel :-(( I suppose that the relation between classes and functions, in a mixed or class parser, must be of importance ! But after numerous tries, I gave up, as I’m not really acquainted with modern structured languages :-(
But, as I’m rather stubborn, I decided to simplify the problem by using a simple function parser. Of course, in this case, the “elements” detected by the Function List mechanism are seen as functions but, actually, may represent any element that we need to be listed. For instance, presently, the dBASEPlus classes !
From this old page, caught by the Wayback Machine site :
https://web.archive.org/web/20190826024431/https://notepad-plus-plus.org/features/function-list.html
I used this minimal form of a function parser ( Can’t be simpler ! ) :
<NotepadPlus>
    <functionList>
        <parser id ="xxxxx" commentExpr="yyyyy" >
            <function mainExpr="zzzzz" >
                <functionName>
                    <nameExpr expr="wwwww" />
                </functionName>
            </function>
        </parser>
    </functionList>
</NotepadPlus>
Giving the functional dBasePlus.xml file, below :
<?xml version="1.0" encoding="UTF-8" ?>
<!-- ==========================================================================\
    To learn how to make your own language parser, please check the following
    link:
        https://npp-user-manual.org/docs/function-list/
\=========================================================================== -->
<NotepadPlus>
    <functionList>
        <!-- ========================================================= [ dBASEPlus ] -->
        <parser
            id         ="dBasePlus"
            commentExpr="(/\*.*?\*/)|(?-s://.*)"
        >
            <function
                mainExpr="(?x-i)          # Free-spacing mode and inline comments + search sensitive to case
                    ^\h*                  # Optional leading whitespace chars
                    class                 # 'class' keyword
                    \h?                   # Optional whitespace char
                    \w+                   # Class name
                    # Following the class name there is the option of parameters, and if so the first entry inside the parens is required, whether there is other
                    # parameters or not, once the parens go up, the first is required. ie: class FrameCtrl(frameObj)
                    (                     # Beginning of the optional parameter(s) part
                        \h? \(            # Opening parenthesis
                        \w+               # First and required parameter
                        ( , \h? \w+)*     # Following optional/additional parameters
                        \)                # Closing parenthesis
                    )?                    # End of the optional parameter(s) part
                    # For the rest of the class declaration, after the class name, all other options are part of one big optional set, that follows 'of'
                    # and can be populated by one of several options.
                    (?:                   # Beginning of the main optional part
                        # The first and most prevalent is the Superclass name that the class is being subclassed from, and its options of parameters and again,
                        # if it has parameters, at least the first one is required ie.: class ToolButtonFx(oParent) of Toolbutton(oParent)
                        \h of \h          # Optional 'of' keyword, surrounded by 1 horizontal whitespace char
                        \w+               # Superclass name
                        (?1)?             # Optional parameter(s) part
                        # The next possible option is that it is a custom object and needs to be in this line so if the object or form is opened up in the dBASE IDE,
                        # the designers in it won't mess up the object by streaming out missing parts or overriding properties or objects and functions.
                        ( \h custom )?    # Optional 'custom' keyword
                        # The next possible option is that the class is being subclassed from another object that is contained elsewhere and the compiler needs to know
                        # this reference. There are two options for pointing to the file. The first is an Alias path in the IDE that can be accessed by the compiler
                        # in the environment, or second, it is in the current directory and only the name is needed...or it has a path that can be listed here,
                        # but this is bad practice, and an Alias is recommended if the file is in a place other than the current directory. If it is, the name can be
                        # used in quotes as a string that gets passed to the compiler. Both follow the word 'From'. The Alias directory is a name that is enclosed
                        # in two colons, one immediately before the Alias name and one immediately after, no spaces.
                        (?:               # Beginning of the optional part
                            \h from \h    # Optional 'from' keyword, surrounded by 1 horizontal whitespace char
                            (
                                : \w+ : \w+ \. \w+     # First pointing file case
                            |                          # OR
                                \x22 \w+ \. \w+ \x22   # Second pointing file case
                            )
                        )?                # End of the optional part
                    )?                    # End of the main optional part
                    $                     # End of current line and end of the class declaration
                "
            >
                <functionName>
                    <nameExpr expr="(?x)      # Free-spacing mode and inline comments
                        \h*                   # Optional leading whitespace chars
                        class                 # 'class' keyword
                        \h?                   # Optional whitespace char
                        \K\w+                 # Class name
                    " />
                </functionName>
            </function>
        </parser>
    </functionList>
</NotepadPlus>
You may test it, by pasting, in a new tab, the text, below, with the normal text language :
class TruckNotebookForm of TBASE from :Truck:Truckbase.cfm
class PlainObjectListForm of FORM
class FrameCtrl(frameObj) of dBCWndCtrl custom
class dBCWndCtrl
class FrameAppCtrl of FrameCtrl custom
class ToolButtonFx(oParent) of Toolbutton(oParent) custom
class MenuFx(oParent,cName)
class dContainersForm of DFORM from "dForm.cfm"
class LGCENTRYFIELD(parentObj, name) of ENTRYFIELD(parentObj, name) custom
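For anyone who wants to check the declaration regex outside Notepad++, here is a rough Python sketch of the same idea. It is only an approximation, not the parser itself: Python's `re` module has no `\h` and no `(?1)` subroutine call, so `\h` is written as `[ \t]` and the parameter sub-pattern is repeated through a helper string. The names `PARAMS`, `CLASS_DECL` and `SAMPLE` are made up for this illustration.

```python
import re

# Assumption: Python's `re` has no \h and no (?1) subroutine call, so \h is
# written as [ \t] and the parameter sub-pattern is repeated via PARAMS.
PARAMS = r"(?: [ \t]? \( \w+ (?: , [ \t]? \w+ )* \) )"

CLASS_DECL = re.compile(
    rf"""
    ^[ \t]* class [ \t]? (?P<name>\w+)  # 'class' keyword and class name
    {PARAMS}?                           # optional parameter list
    (?:
        [ \t] of [ \t] \w+ {PARAMS}?    # optional superclass and its parameters
        (?: [ \t] custom )?             # optional 'custom' keyword
        (?: [ \t] from [ \t]
            (?: : \w+ : \w+ \. \w+      # alias form, two colons around the alias
              | " \w+ \. \w+ "          # quoted file name form
            )
        )?
    )?
    $
    """,
    re.VERBOSE | re.MULTILINE,
)

# The nine test declarations from the post above.
SAMPLE = "\n".join([
    'class TruckNotebookForm of TBASE from :Truck:Truckbase.cfm',
    'class PlainObjectListForm of FORM',
    'class FrameCtrl(frameObj) of dBCWndCtrl custom',
    'class dBCWndCtrl',
    'class FrameAppCtrl of FrameCtrl custom',
    'class ToolButtonFx(oParent) of Toolbutton(oParent) custom',
    'class MenuFx(oParent,cName)',
    'class dContainersForm of DFORM from "dForm.cfm"',
    'class LGCENTRYFIELD(parentObj, name) of ENTRYFIELD(parentObj, name) custom',
])

names = [m.group("name") for m in CLASS_DECL.finditer(SAMPLE)]
print(names)
```

Each of the nine lines should yield one class name. A line that stops halfway through an option, such as `class Foo of`, should not match, because the trailing `$` forces the whole declaration to be well formed.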
Best Regards
guy038
-
@guy038 , your efforts are appreciated, like I said, if only to show I could have been less verbose.
Originally, I was trying to decipher the C++ FunctionList structure, but was getting dizzy trying to figure it out since it’s not my native language, so regex aside, some of the structures it was trying to capture were dizzying. Hence I tried shifting to the Java FunctionList to study, and it helped show I didn’t need the organized chaos that C++ was. :-)
It still, however, seemed to me to be trying to parse a single line of possible options, so I began my journey trying to just regex the options of that one line. My hope was to be able to just get that recognized and just show the class name. I still don’t completely understand the workings of FunctionList, and even though Peter and you both seem to have solved the problem and I’m happy, I’m still playing with it just to advance my knowledge of this and see if I can’t enhance or improve things.
If nothing else, this little experiment has kind of turned me on to regex, which I’ve never needed or used until now to make this. I’ve done my own little parsers, but in C or dBASE, simple little things, but dealing with the regex that had to be put inside the xml and figure everything out was …let’s just say kind of overwhelming, but educational. So thanks for that.
I find trying to regex new lines and such in the Find search of NPP doesn’t work for me. Trying to get down to the “endclass” part was frustrating as hell: using the ‘class’ and ‘endclass’ keywords as the open and close symbols seemed like the proper way and a simple answer…but it wasn’t…so, here we are :-) I’ve shown screenshots to the community and they’re salivating for it. Anyway, this is a work of love so thanks again. I do understand the stubborn part. I’ve been banging the head and neglecting my house chores trying to crack this, which is why I finally relented and reached out…and grateful that I did.
Happy Holidays while I try and tie this package up for delivery. :-)
Lee
-
Hello @lycan-thrope, @peterjones and All,
Well, today is another day ! So, let’s go on studying some more structures !
Let’s suppose, as previously mentioned, that you need the general class syntax, below, that you’ll paste in a new tab :
class Test_1
bla blah
with (this)
bla bla blah
endwith
blah bla bla
endclass
classTest_2
bla blah
with (this)
bla bla blah
endwith
blah bla bla
endclass
So, after a simple class definition line, you would need :
- Further on, a line with the with keyword and its parameter (xxxxx), after possible whitespace chars
- Further on, a line with the endwith keyword, after possible whitespace chars
- At last, a line with the endclass keyword, after possible whitespace chars
Then, each entire class •••••••••• endclass section, which meets all the rules, could be matched with the multi-lines regex, below :
(?sx-i)                # We MUST add the '(?s)' in-line modifier as we search for a MULTI-LINE range of chars
^ \h*                  # Optional leading whitespace chars, beginning a line
class                  # Mandatory 'class' keyword
\h?                    # Optional whitespace char
\w+                    # Class name
$                      # End of current line
((?!endclass).)*?      # The SMALLEST range of characters, even NULL, NOT CONTAINING the 'endclass' keyword, till ...
^ \h*                  # Optional leading whitespace chars, beginning a line
with                   # Mandatory 'with' keyword
\h                     # Mandatory whitespace char
\( \w+ \)              # Mandatory parameter name, between parentheses
$                      # End of current line
((?!endclass).)*?      # The SMALLEST range of characters, even NULL, NOT CONTAINING the 'endclass' keyword, till ...
^ \h*                  # Optional leading whitespace chars, beginning a line
endwith                # Mandatory 'endwith' keyword
$                      # End of current line
((?!endclass).)*?      # The SMALLEST range of characters, even NULL, NOT CONTAINING the 'endclass' keyword, till ...
^ \h*                  # Optional leading whitespace chars, beginning a line
endclass               # Mandatory 'endclass' keyword
$                      # End of current line
Note that, in order to get the smallest range of chars between two lines of importance, such as with, endwith or endclass, we have to use the regex syntax ((?!endclass).)*? and not the simple regex .*? . Why ? Because the simple form could match a greater range class ••••• endclass ••••••••••••••• class ••••• endclass , in the case where an inner class section does not satisfy the regex rules ! Thus, the keyword endclass must not be present at any position of the range !
Now,
- Select from (?sx-i) to the last # End of current line
- Open the Find dialog ( Ctrl + F ) and select the Regular expression search mode
- Test it, against the above text : it should select any block of lines, between the class and endclass lines !
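The difference between .*? and ((?!endclass).)*? described above can also be seen in a small Python sketch. This is an illustration under assumptions, not the Notepad++ engine: `\h` is written as `[ \t]` because Python's `re` has no `\h`, and the names `build`, `LAZY`, `TEMPERED` and `TEXT` are hypothetical. The first class in the sample text deliberately has no with / endwith lines, so it breaks the rules.

```python
import re

# Assumption: \h is written as [ \t] because Python's `re` has no \h.
def build(gap):
    """Build the class/with/endwith/endclass regex around a given 'gap' sub-pattern."""
    return re.compile(
        rf"""
        ^[ \t]* class [ \t]? \w+ $      # class declaration line
        {gap}
        ^[ \t]* with [ \t] \( \w+ \) $  # 'with (xxx)' line
        {gap}
        ^[ \t]* endwith $               # 'endwith' line
        {gap}
        ^[ \t]* endclass $              # 'endclass' line
        """,
        re.VERBOSE | re.MULTILINE | re.DOTALL,
    )

LAZY = build(r".*?")                         # simple lazy dot
TEMPERED = build(r"(?: (?!endclass) . )*?")  # gap may never contain 'endclass'

# The first class has no with/endwith lines, so it does not satisfy the rules.
TEXT = """class Broken
bla
endclass
class Test_1
bla
with (this)
bla
endwith
bla
endclass
"""

lazy_match = LAZY.search(TEXT).group()
tempered_match = TEMPERED.search(TEXT).group()

print(lazy_match.splitlines()[0])      # first line of the lazy match
print(tempered_match.splitlines()[0])  # first line of the tempered match
```

With the lazy dot, the match starts at the broken class and runs through two endclass lines, which is exactly the greater range we want to avoid. With the tempered form, the broken class is skipped and only the well-formed class Test_1 block is matched.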
Finally, if we insert this additional regex part, in the dBasePlus.xml file, we obtain :
<?xml version="1.0" encoding="UTF-8" ?>
<!-- ==========================================================================\
    To learn how to make your own language parser, please check the following
    link:
        https://npp-user-manual.org/docs/function-list/
\=========================================================================== -->
<NotepadPlus>
    <functionList>
        <!-- ========================================================= [ dBASEPlus ] -->
        <parser id ="dBasePlus" commentExpr="(/\*.*?\*/)|(?-s://.*)" >
            <function mainExpr="(?x-i)                    # Free-spacing mode and inline comments + search sensitive to case
                ^\h*                                      # Optional leading whitespace chars
                class                                     # 'class' keyword
                \h?                                       # Optional whitespace char
                \w+                                       # Class name
                # Following the class name there is the option of parameters, and if so the first entry inside the parens is required, whether there are other
                # parameters or not, once the parens go up, the first is required. ie: class FrameCtrl(frameObj)
                (                                         # Beginning of the optional parameter(s) part
                    \h? \(                                # Opening parenthesis
                    \w+                                   # First and required parameter
                    ( , \h? \w+)*                         # Following optional/additional parameters
                    \)                                    # Closing parenthesis
                )?                                        # End of the optional parameter(s) part
                # For the rest of the class declaration, after the class name, all other options are part of one big optional set, that follows 'of'
                # and can be populated by one of several options.
                (?:                                       # Beginning of the main optional part
                    # The first and most prevalent is the Superclass name that the class is being subclassed from, and its options of parameters and again,
                    # if it has parameters, at least the first one is required ie.: class ToolButtonFx(oParent) of Toolbutton(oParent)
                    \h of \h                              # Optional 'of' keyword, surrounded by 1 horizontal whitespace char
                    \w+                                   # Superclass name
                    (?1)?                                 # Optional parameter(s) part
                    # The next possible option is that it is a custom object and needs to be in this line so if the object or form is opened up in the dBASE IDE,
                    # the designers in it won't mess up the object by streaming out missing parts or overriding properties or objects and functions.
                    ( \h custom )?                        # Optional 'custom' keyword
                    # The next possible option is that the class is being subclassed from another object that is contained elsewhere and the compiler needs to know
                    # this reference. There are two options for pointing to the file. The first is an Alias path in the IDE that can be accessed by the compiler
                    # in the environment, or second, it is in the current directory and only the name is needed...or it has a path that can be listed here,
                    # but this is bad practice, and an Alias is recommended if the file is in a place other than the current directory. If it is, the name can be
                    # used in quotes as a string that gets passed to the compiler. Both follow the word 'From'. The Alias directory is a name that is enclosed
                    # in two colons, one immediately before the Alias name and one immediately after, no spaces.
                    (?:                                   # Beginning of the optional part
                        \h from \h                        # Optional 'from' keyword, surrounded by 1 horizontal whitespace char
                        (
                            : \w+ : \w+ \. \w+            # First pointing file case
                            |                             # OR
                            \x22 \w+ \. \w+ \x22          # Second pointing file case
                        )
                    )?                                    # End of the optional part
                )?                                        # End of the main optional part
                $                                         # End of current line and end of the class declaration
                ((?!endclass).)*?                         # The SMALLEST range of characters, even NULL, NOT CONTAINING the 'endclass' keyword, till ...
                ^ \h*                                     # Optional leading whitespace chars, beginning a line
                with                                      # Mandatory 'with' keyword
                \h                                        # Mandatory whitespace char
                \( \w+ \)                                 # Mandatory parameter name, between parentheses
                $                                         # End of current line
                ((?!endclass).)*?                         # The SMALLEST range of characters, even NULL, NOT CONTAINING the 'endclass' keyword, till ...
                ^ \h*                                     # Optional leading whitespace chars, beginning a line
                endwith                                   # Mandatory 'endwith' keyword
                $                                         # End of current line
                ((?!endclass).)*?                         # The SMALLEST range of characters, even NULL, NOT CONTAINING the 'endclass' keyword, till ...
                ^ \h*                                     # Optional leading whitespace chars, beginning a line
                endclass                                  # Mandatory 'endclass' keyword
                $                                         # End of current line
                " >
                <functionName>
                    <nameExpr expr="(?x)                  # Free-spacing mode and inline comments
                        \h*                               # Optional leading whitespace chars
                        class                             # 'class' keyword
                        \h?                               # Optional whitespace char
                        \K\w+                             # Class name
                        " />
                </functionName>
            </function>
        </parser>
    </functionList>
</NotepadPlus>
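To make the roles of mainExpr and nameExpr in the file above a bit more concrete, here is a minimal Python sketch of the two-stage lookup as I understand it from the Function List documentation: mainExpr finds each whole range, then nameExpr is searched inside that range to get the name that is displayed. The function `list_class_names` is a hypothetical helper, not anything from Notepad++, the class declaration is simplified to a bare name, `\h` is written as `[ \t]`, and `\K` (which Python's `re` lacks) is emulated with a capture group.

```python
import re

# Hypothetical stand-ins for the XML attributes; simplified on purpose.
# Assumptions: \h is written as [ \t], and \K is emulated by a capture group.
MAIN_EXPR = re.compile(
    r"""
    ^[ \t]* class [ \t]? \w+ $   # simplified class declaration line
    (?: (?!endclass) . )*?       # smallest range not containing 'endclass'
    ^[ \t]* endclass $           # closing 'endclass' line
    """,
    re.VERBOSE | re.MULTILINE | re.DOTALL,
)
NAME_EXPR = re.compile(r"[ \t]* class [ \t]? (\w+)", re.VERBOSE)

def list_class_names(source):
    """Stage 1: find each class range. Stage 2: extract its name from the range."""
    names = []
    for block in MAIN_EXPR.finditer(source):
        name = NAME_EXPR.search(block.group())
        if name:
            names.append(name.group(1))
    return names

SOURCE = """class Test_1
bla blah
with (this)
bla bla blah
endwith
blah bla bla
endclass
classTest_2
bla blah
endclass
"""

class_names = list_class_names(SOURCE)
print(class_names)
```

The point of the split is that mainExpr can be as strict as needed about the whole block, while nameExpr stays short and only has to isolate the word that follows the class keyword.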
Note that, in the Function List parser, we do not have to add the (?s) modifier. This is enabled by default, meaning that the dot regex symbol . matches absolutely any character of the BMP Unicode plane !
Best Regards,
guy038
-
@guy038 ,
That made sense, and I can use that to play with, because it did what I was trying to, although, so far, Peter’s and your contribution to it works great.
I was curious, however, whether, if I play with it further (and maybe Peter can also confirm or deny), I can get NPP to make its display look more like the dBASEPlus editor’s by working more on the regex in the parser. Not so much the graphics, as just the way it breaks down the class, objects and functions etc. If not, believe me, what you guys did is fine…I’m just curious if it’s possible or not. Screenshots of both, on an identical file:
dBASEPlus:
Notepad++:
Yea or nay? :-)
Lee