Help Adding Pascal to Function List
-
@rdipardo
Can’t I just define that the word “Interface” starts a multi line comment and the word “Implementation” ends it? The declarations need to always be between those two words… so it should be easy to just ignore everything between them -
@James-Richters said in Help Adding Pascal to Function List:
Can’t I just define that the word “Interface” starts a multi line comment and the word “Implementation” ends it?
Why yes I can!
I am positive the Regex I used to make this work is completely wrong… but I just took a wild guess and stuck in:
commentExpr="(?x)                                          # Utilize inline comments (see `RegEx - Pattern Modifiers`)
                (?s:\x7B.*?\x7D)                           # Multi Line Comment 1st variant
              | (?s:Interface\x2A.*?\x2A\Implementation)   # Multi Line Comment 2nd variant
              | (?s:\x28\x2A.*?\x2A\x29)                   # Multi Line Comment 2nd variant
              | (?m-s:\x2F{2}.*$)                          # Single Line Comment
            "
the Regex could probably be improved to not be case sensitive (and be correct)
-
I was originally thinking “you cannot hold the Interface/Implementation state inside the regex while still being able to match all the functions/procedures”. But that was a clever workaround, making your “comment” definition just exclude everything between those two keywords. Great idea!
to not be case sensitive
That part’s easy:
... | (?is:Interface\x2A.*?\x2A\Implementation) ...
(and be correct)
Looking at your regex, I was confused as to why you have the \x2A in there, since I don’t see any asterisks in your example code. So I think maybe those are the things that are causing the problem for you.
... | (?is:Interface\x2A.*?\x2A\Implementation) ...
If that’s your problem, then I am assuming you started from
| (?s:\x28\x2A.*?\x2A\x29) # Multi Line Comment 2nd variant
and just changed the \x28 and \x29 portions. If so, you maybe didn’t understand that regex.
- \x28\x2A says “match literal characters (* …”
- .*? means “… followed by 0 or more other characters (as few as possible), until …”
- \x2A\x29 says “… match literal characters *)”
Where \x followed by two hexadecimal digits is a way to represent any character by the hex representation of its ASCII code. So \x2A in a regex means “match the ASCII character at hex position 2A, which is decimal 42, which is the asterisk character *”.
(The Search Engine That Knows All tells me that (* ... *) is a valid Pascal multiline comment sequence, so I think my interpretation is reasonable.)
If you don’t want to require an asterisk after Interface or before Implementation, and don’t want it case sensitive, then use
... | (?is:Interface.*?Implementation) ...
So that should probably fix things for you.
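The difference between the two alternatives can also be checked outside Notepad++. The following is only a minimal sketch in Python: it assumes that Python's `re` module treats `\x2A`, the lazy `.*?` and the `(?is:...)` modifiers the same way as the regex engine in Notepad++ does for this simple case. The sample text and the variable names are made up for illustration, and the stray backslash in front of `Implementation` is left out.

```python
import re

# Hypothetical unit, similar in shape to the code discussed in this thread.
sample = """unit FooBar;
Interface
procedure Foo(AParam: integer);
Implementation
procedure Foo(AParam: integer);
begin
end;
end.
"""

# \x2A is the asterisk, so this needs "Interface*" ... "*Implementation".
with_asterisks = re.compile(r"(?is:Interface\x2A.*?\x2AImplementation)")

# No asterisks required; i = ignore case, s = the dot also matches line breaks.
without_asterisks = re.compile(r"(?is:Interface.*?Implementation)")

print(with_asterisks.search(sample))                          # no match object
print(without_asterisks.search(sample).group(0))              # the Interface..Implementation range
print(without_asterisks.search(sample.upper()) is not None)   # case does not matter
```

The first pattern finds nothing because the sample contains no asterisks at all, which would fit the explanation given above.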
-
@PeterJones said in Help Adding Pascal to Function List:
Looking at your regex, I was confused as to why you have the \x2A in there,
Well, I really have no clue when it comes to Regex… so I thought maybe the asterisks were wild cards… like “include everything”
I literally copied and pasted the line below it and fiddled around with it until it worked.
@PeterJones said in Help Adding Pascal to Function List:
If you don’t want to require an asterisk after Interface or before Implementation, and don’t want it case sensitive, then use
… | (?is:Interface.*?Implementation) …
So that should probably fix things for you.
That works great! Case insensitive and everything, Thank you!
| (?is:Interface.*?Implementation) # Multi Line Comment 2nd variant to Ignore function definition for Units
What is the best way to submit this so other Pascal users can have the definition header filtered out of their units as well?
-
@James-Richters said in Help Adding Pascal to Function List:
What is the best way to submit this so other Pascal users can have the definition header filtered out of their units as well?
At the very least, create an official issue. If you have the git/github skills necessary, you could create a PR to go with that issue; if not, just attach the updated function list pascal.xml to your issue, and if one of the developers likes the idea, they may create the PR for you. (@dinkumoil may be interested, being the one who originally submitted the pascal function list last month… or @rdipardo has already shown an interest here. My mentioning of specific users makes no guarantee, nor in any way volunteers nor commits them to do something they don’t have the time or inclination for.)
-
@PeterJones Thank you, I’m on GitHub, I’m pretty familiar with it.
Submitted PR #12686
https://github.com/notepad-plus-plus/notepad-plus-plus/pull/12686
I appreciate everyone’s help with this, it’s going to be a tremendous help in my large Pascal projects.
-
@PeterJones
Of course, I am interested in potential bugs in my function list parser.
@James-Richters
However, I can not reproduce your issue.
My function list parser is a mixed parser because nowadays most Pascal variants are object oriented, so a class parser and a function parser are put together to form a mixed parser. Since your example code doesn’t contain objects, for you the relevant part of my parser is the function XML tag. Its mainExpr attribute contains the regular expression to detect function definitions and to avoid procedure and function declarations in the interface section (see this line and the following 8 lines) and forward declarations in the implementation section (see this line).
Avoiding procedure/function declarations in the interface section is done by a positive look ahead (?=(...)) that requires a procedure or function header to be followed by at least one of a list of certain keywords (CONST, TYPE, VAR, LABEL or BEGIN) or by the definition of an inline procedure or function ((?R) which triggers regex recursion).
Triggered by this forum thread, for testing my parser I used the following source code:
unit FooBar;

interface

uses
  System.SysUtils;

procedure Foo(AParam: integer);
function Bar(const AParam: string): integer;

implementation

procedure Foo(AParam: integer);
begin
  // Do something
end;

function Bar(const AParam: string): integer;
begin
  // Do something
  Result := 0;
end;

end.
This is minimalistic but similar to the code provided by you as screenshot. It doesn’t produce duplicates in function list.
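To make the look-ahead idea described above more tangible, here is a rough sketch in Python. It is a simplified stand-in and not the expression used in the actual parser: the names `DEFINITION` and `source` are invented for this example, the name is taken with a capture group, and the `(?R)` recursion for nested routines is left out completely.

```python
import re

# Simplified stand-in for the idea, NOT the expression from the real parser:
# a header only counts as a definition if a body keyword follows it.
DEFINITION = re.compile(
    r"""(?imx)
    ^[ \t]* (?:procedure|function) \s+           # routine keyword
    ([A-Za-z_]\w*)                               # routine name (captured)
    (?: \s* \( [^)]* \) )?                       # optional parameter list
    [^;]* ;                                      # optional result type and the semicolon
    (?= \s* (?:const|type|var|label|begin) \b )  # a body keyword has to follow
    """
)

source = """unit FooBar;
interface
uses System.SysUtils;
procedure Foo(AParam: integer);
function Bar(const AParam: string): integer;
implementation
procedure Foo(AParam: integer);
begin
  // Do something
end;
function Bar(const AParam: string): integer;
begin
  // Do something
  Result := 0;
end;
end.
"""

names = DEFINITION.findall(source)
print(names)  # ['Foo', 'Bar']
```

Each name shows up only once, because the two declarations in the interface section are followed by another header or by the implementation keyword rather than by one of the listed keywords.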
From your comments I’m not sure if you really tested my parser or if you tested yours. So, please clarify which parser you used. Please note: In an installed version of Notepad++ you have to put the parser’s file in C:\Users\<UserName>\AppData\Roaming\Notepad++\functionList.
Nevertheless, your solution to include everything between the interface and implementation statements in the commentExpr attribute is a clever way to keep procedure/function declarations in the interface section from being shown in the function list panel. If no other person decides to improve my parser, I will try to integrate your solution when I have some spare time. But most likely this will not happen during the X-Mas holidays; it may take some weeks if not months, because changing and testing regex code I developed and improved over three years is not a quick and easy task, at least for me.
BTW: Maybe @guy038 can have a look at my parser. As a regex specialist he may find some potential improvements.
-
@dinkumoil First of all, Thank You for developing a Pascal function list parser! I have some really large pascal projects with hundreds of functions and procedures and it’s going to save me a lot of time navigating them.
@dinkumoil said in Help Adding Pascal to Function List:
From your comments I’m not sure if you really tested my parser or if you tested yours. So, please clarify which parser you used.
As soon as I found out there was already a Pascal function list parser available, I abandoned my attempt at it because I really have absolutely no idea, especially when it comes to regex. So I installed Notepad++ v8.4.8 Release Candidate 3, and after that, my Pascal files had no function list anymore, because v8.4.8 did not include your Pascal parser. Then I found out it was included in the portable version, so I downloaded that and copied the missing file to the v8.4.8 installation. Then it worked, but I had duplicates of everything in the interface section, and putting the extra line in the file I just copied over seemed to fix it.
But now I’m questioning if that really fixed it or if I somehow got back on my old attempt. I guess it’s possible… So what I will do to get to the bottom of this is install Notepad++ on a different computer and test it again from scratch on that. That way we can be sure that I didn’t somehow get back on my old parser. I will let you know the results, and if I still have the issue, I will provide a sample program to demonstrate it.
-
@dinkumoil : I installed Notepad++ on another computer and at first I thought the Pascal function list was fine… it wasn’t showing any duplicates from the implementation section, but then I tried to load one of my huge units, and I had duplicates again, but I figured out what happened… I had some procedures commented out from the implementation section, I didn’t need them to be called from outside the unit, or I replaced them with something else… but commenting something out with { curly braces } makes it show everything before the curly braces.
Here I have a very small unit that demonstrates the issue…
Edit: I had an actual small unit here but it got flagged as spam, so I made another one based on your example. At first I didn’t have the problem with it; it took me a while, but I figured out that the comments after the End; of the function are needed to reproduce the issue.
Here is the sample code:
unit FooBar;

Interface

Uses CRT,Windows;

procedure Foo(AParam: integer);
function Bar(const AParam: string): integer;
{function Test(Tnum:Double): DWord;   //Removing the curly braces fixes it}
procedure Boo(AParam: integer);

Implementation

procedure Foo(AParam: integer);
begin
  // Do something
end; { func. Foo  If these comments are not here, it doesn't happen }

function Bar(const AParam: string): integer;
begin
  // Do something
  Result := 0;
end; { func. Bar }

function Test(Tnum:Double): DWord;
begin
  // Do something
  Result := 0;
end;{ func. Test }

procedure Boo(AParam: integer);
begin
  // Do something
  Result := 0;
end;

Begin
end.
You can see things above the commented out section are duplicated. In my original test, I had some units more than 100 functions down commented out like this, so I had a lot of duplicates.
If I would have commented them out with // instead, then it works fine, unfortunately I have a LOT of units and many times I would comment things out with { } just to show what was in the unit… it would be a monumental undertaking to go through and change them all.
Unfortunately, my idea to make a comment out of everything between Interface and Implementation also does not work in this case… but I figured out it’s because multiline comments are not working correctly. See the example below: the multiline comments only work if there is something on the line that begins them, not if it’s on the line above by itself:
Everything should be commented out except Dummy1, but 2 and 5 are not… even though in the code they clearly are.
I don’t have a clue how to fix this. I just don’t know enough about how it all works, but I think if the multiline comments worked the way they are supposed to, even if there is nothing on the line following the beginning of the comment, then this would work, but how to do that?
-
I don’t know how that could happen, but when I committed my Pascal/Delphi FunctionList parser I accidentally committed an outdated version. That’s the reason why I wasn’t able to reproduce your issue locally - with the Notepad++ installations on my machine I use an improved version of the parser. :-(
I created an issue and a related pull request on GitHub that includes the improvements of the parser I use locally.
Hopefully @donho will include this PR into the upcoming v8.4.8 release though it’s committed so late.
-
Bad news: the PR I mentioned above is faulty, it didn’t pass the unit test at GitHub. The version I committed in this PR seems also to be outdated.
Since I’m not at home, I’m not able to fix the issue at the moment, you have to wait at least until next week for a working parser.
-
@dinkumoil No Problem at all, the parser version I have is WAY better than the non-existent parser I had a few days ago :)
So have a nice holiday, and thank you for your efforts. -
Hello, @james-richters, @rdipardo, @peterjones, @michael-vincent, @dinkumoil and All,
Allow me to invite myself into your conversation. I did some tests and the following Pascal parser should meet your needs!
- Save the following text, in the functionList folder, with name Pascal.xml, as a UTF-8 encoded file :
<?xml version="1.0" encoding="UTF-8" ?>
<!-- =================================================================================
|
|   To learn how to make your own language parser, please check the following link:
|
|       https://npp-user-manual.org/docs/function-list/
|
================================================================================= -->
<NotepadPlus>
    <functionList>
        <!-- ===================================================== [ Pascal ] =============================================================== -->
        <parser
            displayName="Pascal"
            id         ="pascal_function"
            commentExpr="(?x)                                               # Utilize inline comments (see `RegEx - Pattern Modifiers`)
                            (?s:  \x7B .*? \x7D )                           # Multi Line Comment 1st variant
                          | (?is: ^ Interface .*? ^ Implementation )        # Multi Line Comment 2nd variant
                          | (?s:  \x28\x2A .*? \x2A\x29 )                   # Multi Line Comment 2nd variant
                          | (?-s: \x2F\x2F .* )                             # Single Line Comment
                        "
        >
            <function
                mainExpr="(?x)                                              # Utilize inline comments (see `RegEx - Pattern Modifiers`)
                            (?-s) ^ \h*                                     # optional leading whitespace
                            (?i: PROCEDURE | FUNCTION ) \s*                 # start-of-function indicator
                            \K                                              # keep the text matched so far, out of the overall match
                            [A-Za-z_] \w*                                   # valid character combination for identifiers
                            (?: \s* \( .*? \) (?: : .+ )? )? ;              # parentheses and parameters optional
                         "
            >
                <!-- COMMENT out the THREE following lines to display the function with its PARAMETERS -->
                <functionName>
                    <nameExpr expr="[A-Za-z_]\w*" />
                </functionName>
            </function>
        </parser>
        <!-- ================================================================================================================================ -->
    </functionList>
</NotepadPlus>
- Add, as usual, the line below, within the overrideMap.xml file, in the functionList folder :
<association id= "pascal.xml" langID= "11"/>
- Open a new tab and paste the following text in a file, named Test.pas :
Procedure Dummy ( const bla);

Interface

function Test(Tnum:Double): DWord;
function Bar(const AParam: string): integer;
procedure Foo(AParam: integer);
procedure Boo(AParam: integer);

Implementation

This is a // small test function Bar(const AParam: string): integer;

procedure Foo(AParam: integer);
function Bar(const AParam: string): integer;

This is a (* small test
procedure Foo(AParam: integer);
function Bar(const AParam: string): integer;
function Test(Tnum:Double): DWord;
procedure Boo(AParam: integer);
to see if *) all is OK

function Test(Tnum:Double): DWord;
procedure Boo(AParam: integer);

This is a { small test
procedure Foo(AParam: integer);
function Bar(const AParam: string): integer;
function Test(Tnum:Double): DWord;
procedure Boo(AParam: integer);
to verify if } all is OK

procedure Guy;
- Stop and re-start Notepad++
- Select your Test.pas tab
- Click on the View > Function List option
=> You should get this picture :
As you can verify :
- All the declarations, between the Interface and Implementation boundaries, are not listed, as expected
- All text between the multi-line comment markers (* and *) is not listed, as expected
- All text between the multi-line comment markers { and } is not listed, too, as expected
- All text following a single-line comment marker // is not listed in the Function List window, as expected
Of course, as I don’t know the Pascal language, I may have missed some obvious constructions, which must be seen in the Function List window. Just tell me about it!
See you later,
Best Regards,
guy038
-
Although I’m on vacation, I found some time to update my parser. Have a look at my PR.
-
I may have missed some obvious constructions, which must be seen in the Function List window. Just tell me about it ?
After transforming your sample into a valid source file [^1], it seems your parser has lost the ability to find nested procedures.
@dinkumoil’s (corrected) parser finds them, at the expense of also detecting the commented-out ones (i.e., the ones “nested” within block comments.)
As written, the rule is satisfied anytime the first alphabetic sequence in any line happens to be an implementation keyword. The only time the rule fails is when a comment marker is directly followed by an implementation keyword.
There needs to be an expression that can check the previous line for the start of a block comment. Unfortunately, look-behind expressions are explicitly not allowed by the function parser specification.
Even without testing, I’m confident this is reproducible in other languages with function parsers. It’s just a fundamental limitation of the specification.
[^1]:
Unit Guy;
{$IFDEF FPC} // Free Pascal
{$mode objfpc}
{$endif}

Interface

{$IFDEF DCC} // Delphi
uses Winapi.Windows; // DWORD
{$endif}

Procedure Dummy ( const bla);
function Test(Tnum:Double): DWord;
function Bar(const AParam: string): integer;
procedure Foo(AParam: integer);
procedure Boo(AParam: integer);

Implementation

Procedure Dummy ( const bla);
begin
end;

{ This is a small test }
function Bar(const AParam: string): integer;
begin
  Result := -1;
end;

procedure Foo(AParam: integer);
begin
end;

{ This is a (* small test
procedure Foo(AParam: integer);
function Bar(const AParam: string): integer;
function Test(Tnum:Double): DWord;
procedure Boo(AParam: integer);
to see if *) all is OK }

function Test(Tnum:Double): DWord;
begin
  Result := 0;
end;

procedure Boo(AParam: integer);
  procedure Guy; // locally defined procedure
  begin
  end;
begin
end;

(* This is a { small test
procedure Foo(AParam: integer);
function Bar(const AParam: string): integer;
function Test(Tnum:Double): DWord;
procedure Boo(AParam: integer);
to verify if} all is OK *)

end.
-
I can confirm that the new version of my parser causes commented-out procedures/functions to be part of the function list.
There needs to be an expression that can check the previous line for the start of a block comment.
The core problem seems to be how matching and processing block comments is realized in the related C++ code that fills the function list panel. There is a special section in the XML file of function list parsers to define line and block comments. I expect that code, that is recognized as commented-out by the regexes in this section, is not parsed anymore by the regex that identifies procedure/function implementations. But that’s obviously not the case and I’m wondering what’s the sense of the comment-definition section in the parser file, respectively how it is processed.
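The behaviour expected here can be written down as a small two-step model: first blank out everything the comment expression matches, then search for routines in what is left. The sketch below is in Python and only models that expectation; it says nothing about what the C++ code of Notepad++ really does. The names `COMMENT`, `ROUTINE`, `list_routines` and the sample unit are invented for this example, and the comment alternatives are only modelled on the ones posted in this thread.

```python
import re

# Comment zones, modelled on the commentExpr alternatives posted in this thread.
COMMENT = re.compile(
    r"""(?x)
      (?s: \{ .*? \} )                                  # block comment in curly braces
    | (?ims: ^ Interface \b .*? ^ Implementation \b )   # Interface..Implementation range
    | (?s: \( \* .*? \* \) )                            # block comment in (* *)
    | // [^\n]*                                         # single-line comment
    """
)

# Very small routine finder: keyword at the start of a line, then the name.
ROUTINE = re.compile(r"(?im)^[ \t]*(?:procedure|function)\s+([A-Za-z_]\w*)")

def list_routines(source):
    # Step 1: blank out every comment zone but keep the line breaks.
    visible = COMMENT.sub(lambda m: re.sub(r"[^\n]", " ", m.group(0)), source)
    # Step 2: look for routine headers only in the text that is still visible.
    return ROUTINE.findall(visible)

source = """unit FooBar;
Interface
procedure Foo(AParam: integer);
{function Test(Tnum: Double): DWord;}
Implementation
procedure Foo(AParam: integer);
begin
end; { func. Foo }
{
function Hidden: integer;
}
function Bar: integer;
begin
  Result := 0;
end;
end.
"""

print(ROUTINE.findall(source))  # ['Foo', 'Foo', 'Hidden', 'Bar']
print(list_routines(source))    # ['Foo', 'Bar']
```

If the function list worked like this model, neither the interface declarations nor the commented-out routines would appear in the panel.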
Maybe the same behaviour of the C++ code that fills the function list panel is the cause that I was not able to define everything between the interface and implementation keywords as a comment. I used a similar expression like @guy038 (i.e. (?is:^\h* Interface.*?^\h*Implementation\s*)) but it didn’t work, it was just ignored. Trying this regex in the normal search-and-replace dialog of Notepad++ gave the expected result but in function list parser I had duplicate entries for procedure/function declarations.
Maybe someone with C++ knowledge (maybe @rdipardo ?) should have a look at the function list code to check how it processes the regexes from the parser file.
-
Hi, @james-richters, @rdipardo, @peterjones, @michael-vincent, @dinkumoil and All,
Yes, of course: my test and parser examples were deliberately sketchy to only highlight that the declarations in the [Interface - Implementation] range were correctly ignored, as well as all text in comments
@dinkumoil, I also consulted your new Pascal parser and indeed, I am far from matching you on this point !
@dinkumoil, you said :
… I was not able to define everything between the interface and implementation keywords as a comment. I used a similar expression like @guy038 (i.e. (?is:^\h* Interface.*?^\h*Implementation\s*)) but it didn’t work, it was just ignored.
But, obviously, my simple example shows that it seems to work ?! So, @dinkumoil, @rdipardo, can you enlighten me on this apparent contradiction ?
Please, do not spend too much time on this : I’ll take your word for it !
Best Regards,
guy038
-
@guy038 said in Help Adding Pascal to Function List:
can you enlighten me on this apparent contradiction ?
When I remove the class parser part of my mixed parser (i.e. the whole classRange XML node) I’m able to define everything between the interface and implementation keywords as a comment.
So, again it seems necessary to analyse the C++ code that processes function list parser files and fills the tree in function list panel.
But maybe you @guy038 are lucky and can find content for the class parser part that doesn’t cause the comment expression to fail. I already tried to use empty regexes in the class parser, that failed too. Seems like the simple presence of a class parser part is the root cause.
-
-
Meanwhile a fix has been applied to the source code of Notepad++ that allows your fix (to define source code between the interface and implementation keywords as a multi-line comment) to work reliably if it is used in a mixed function list parser. You can download a preview version (including that fix) of Notepad++ 32 bit >>from here<< and the 64 bit version >>from here<<.
Additionally, some fixes have been applied to my Pascal/Delphi function list parser (including your fix to define source code between the interface and implementation keywords as a multi-line comment). You can download it >>from here<<.
So, we Delphi developers should have a working solution.
-
@James-Richters said in Help Adding Pascal to Function List:
I am trying to figure out how to add Pascal to the Function List feature.
There are two issues I’m having difficulty figuring out.
The first one is that Pascal has Functions and Procedures, Functions return something, Procedures do not… as far as the Function List is concerned, they are both the same. I can’t figure out how to get both “Function” and “Procedure” to show up in the list.
I can’t paste it as code here because it’s just a giant unreadable mess, but here is a screen shot of what I am trying to do:
I don’t know anything at all about Regex Expressions, but I tried an OR function in an on-line Regex helper and (?i:PROCEDURE\s+)|(?i:FUNCTION\s+) should get a hit on both Procedure and Function… but it doesn’t work, I only get whatever I put there first.
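One possible reason for seeing only one kind of routine, judging only from the parser text quoted further down in this post, is where the whitespace part sits: in (?i:PROCEDURE|FUNCTION\s+) the \s+ belongs to the FUNCTION branch alone. The following is a small sketch in Python to show the effect. It is an approximation and not the parser itself: a capture group stands in for \K, [ \t] stands in for \h, and the names `ungrouped`, `grouped` and `lines` are invented for this example.

```python
import re

lines = """procedure Foo(AParam: integer);
function Bar(const AParam: string): integer;
"""

# "\s+" belongs only to the FUNCTION branch here, so after PROCEDURE the
# identifier would have to start immediately, without a separating space.
ungrouped = re.compile(r"(?im)^[ \t]*(?:PROCEDURE|FUNCTION\s+)([A-Za-z_]\w*)")

# Grouping the two keywords first makes "\s+" apply to both of them.
grouped = re.compile(r"(?im)^[ \t]*(?:PROCEDURE|FUNCTION)\s+([A-Za-z_]\w*)")

print(ungrouped.findall(lines))  # ['Bar']
print(grouped.findall(lines))    # ['Foo', 'Bar']
```

Whether this is what happened in the original attempt is only a guess, since the exact expression that was tried in the function list is not fully shown here.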
The second thing I’m not sure how to do is eliminate the duplicates caused by the declaration in a Unit. Everything between Interface and Implementation should be ignored, as those are not the actual functions and procedures, they are just a declaration of them.
Here’s the code just in case it looks better after I post it:
<association id= "pascal_function" langID="11" />
<!-- ===================================================== [ Pascal ] -->
<parser
    displayName="Pascal"
    id         ="pascal_function"
    commentExpr="(?x)                               # Utilize inline comments (see `RegEx - Pattern Modifiers`)
                    (?is:\x23cs.*?\x23ce)           # Multi Line Comment
                  | (?m-s:^\h*;.*?$)                # Single Line Comment
                "
>
    <function
        mainExpr="(?x)                              # Utilize inline comments (see `RegEx - Pattern Modifiers`)
                    (?m)^\h*                        # optional leading whitespace
                    (?i:PROCEDURE|FUNCTION\s+)      # start-of-function indicator
                    \K                              # keep the text matched so far, out of the overall match
                    [A-Za-z_]\w*                    # valid character combination for identifiers
                    (?:\s*\([^)]*?\))?              # parentheses and parameters optional
                 "
    >
        <!-- comment out the following node to display the function with its parameters -->
        <functionName>
            <nameExpr expr="[A-Za-z_]\w*" />
        </functionName>
    </function>
</parser>
<!-- ================================================================= -->
I also face this problem.