Best way to find unmatched parentheses

Rowan Sylvester-Bradley

What is the best way to find unmatched parens in Notepad++? I have a PHP program which is now giving me “unexpected end of file”. Usually I have found that this is because some structure has been started but not finished, usually a block surrounded by {}. This is a longish program of 1300 lines. What I have tried so far is to put my cursor just after each { character that appears without any preceding tabs, thus highlighting it in red (this of course assumes that I have indented my code correctly, which since it is interspersed with HTML sections is not easy), and then look through the file to find the red } character, which (assuming that Notepad++ is doing its work properly) is the matching paren. If I find a { that doesn’t have a matching }, then that is the problem. This is quite difficult and time consuming, because searching 1300 lines of code for a red } takes time. It is made worse by the fact that N++ doesn’t seem to reliably carry the red line up the left hand margin through HTML sections.

Do people have a good way of doing this?

If not, maybe we could add a feature to easily find the matching } for any {, or vice versa, maybe a right click on a { brings up an option to “move to end of block” or something?

Thanks for your input - Rowan

gstavi

Search -> Go to Matching Brace

Now there are some rules for brace matching that depend on syntax highlighting. Braces must be in the same type of syntax.
If you get problems, try forcing syntax highlighting to Normal text and hope the HTML parts do not contain unbalanced braces.

I implemented a lexer for makefiles that also highlights the background within unmatched parentheses, to help debug things like:
$(abspath $(addprefix $(DIR), $(subst $(blah), blah, blah)))
Perhaps someone can implement a similar generic lexer for that purpose.

Claudia Frank

@Rowan-Sylvester-Bradley

There is a plugin called BracketsCheck which could be useful in such cases.
I haven't used PHP for quite a long time, but isn't there a command-line switch
to check the syntax of the file? Doesn't it report the line and/or position of the error?
If so, you could use the NppExec plugin to call it and configure output filters to jump
to the location of the problem.
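
For what it's worth, the PHP command line does have a syntax-check switch, `php -l`. Below is a minimal Python sketch of calling it and pulling out the reported line number. The function names are made up for illustration, and the message shape ("... in FILE on line N") is an assumption that may differ between PHP versions and settings. For an "unexpected end of file" error the reported line is likely to be the last line of the file, so this may only confirm the problem rather than locate the unclosed brace.

```python
import re
import subprocess

# Assumed message shape: "... in <file> on line <N>". Adjust if your PHP prints something else.
LINT_LOCATION = re.compile(r" in (?P<file>.+) on line (?P<line>\d+)")


def parse_lint_location(message):
    """Return (file, line) from a PHP parse-error message, or None if absent."""
    match = LINT_LOCATION.search(message)
    if match is None:
        return None
    return match.group("file"), int(match.group("line"))


def lint_php(path, php="php"):
    """Run `php -l` on path; return (ok, location_or_None).

    Assumes a `php` executable is on PATH and that a zero exit code means
    no syntax errors were found.
    """
    result = subprocess.run([php, "-l", path], capture_output=True, text=True)
    output = result.stdout + result.stderr
    return result.returncode == 0, parse_lint_location(output)
```

Calling `lint_php("page.php")` (a hypothetical file name) would then give the line number to jump to, if one was reported.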

Cheers
Claudia

linpengcheng

Is there a plugin like https://github.com/DogLooksGood/parinfer-mode?
With it, you can edit without paying attention to parentheses.

guy038

Hello, @rowan-sylvester-bradley, and All,

The problem of finding a range of characters, containing juxtaposed and/or nested blocks, all well-balanced, can be solved by using recursive regex patterns, exclusively !!

First, I'm going to give the general method. Next, I'll adapt these general regexes to your particular case !

Of course, I suppose that blocks of text are defined between two different boundaries, of one character each !

So, let’s give, first, four notations :

SB = starting boundary of blocks. Hence, the regex, escaped with \ where needed to match the symbol literally
- < , if the STARTING boundary is the symbol <
- \( , if the STARTING boundary is the symbol (
- \{ , if the STARTING boundary is the symbol {
- \[ , if the STARTING boundary is the symbol [
EB = ending boundary of blocks. Hence, the regex, escaped with \ where needed to match the symbol literally
- > , if the ENDING boundary is the symbol >
- \) , if the ENDING boundary is the symbol )
- \} , if the ENDING boundary is the symbol }
- \] , if the ENDING boundary is the symbol ]
AC = Any single Allowed Character, except for the SB and EB boundaries. Hence, the simple negated character class :
- [^<>] , if boundaries are the two symbols < and >
- [^()] , if boundaries are the two symbols ( and )
- [^{}] , if boundaries are the two symbols { and }
- [^][] , if boundaries are the two symbols [ and ]
R# = Recursive call to capturing group #. Hence, the regex syntax (?#), with # replaced by the group number, e.g. (?1)

Just note that the (?0) and (?R) syntaxes both make a recursive call to the overall regex

Then :

Regex A : SB(?:AC++|R0)*EB searches for the largest area, even across several lines, between an SB boundary and an EB boundary, which may contain other juxtaposed and/or nested blocks SB....EB, correctly balanced
Regex B : AC*(SB(?:AC++|R1)*EB)AC* searches for the largest area, even across several lines, between an SB boundary and an EB boundary, which may contain juxtaposed and/or nested blocks SB....EB, all correctly balanced, possibly preceded and/or followed by any range of AC
Regex C : (?:AC*(SB(?:AC++|R1)*EB)AC*)+ searches for consecutive largest areas, as defined for regex B. In other words, regex C finds the largest range of characters, even across several lines, which contains exactly the same number of SB and EB boundaries !
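
To make the recursion less mysterious, here is a small Python sketch that imitates what regex A is described as matching, using a plain depth counter instead of a recursive pattern. It is only an illustration, not what the regex engine does internally, and the function name `balanced_blocks` is invented for this example.

```python
def balanced_blocks(text, start_char="{", end_char="}"):
    """Yield (start, end) index pairs of balanced blocks, scanning left to right.

    At each starting boundary, take the text up to its matching ending
    boundary; if there is no match, move on to the next character.
    """
    i = 0
    while i < len(text):
        if text[i] != start_char:
            i += 1
            continue
        depth = 0
        for j in range(i, len(text)):
            if text[j] == start_char:
                depth += 1          # entering a nested block
            elif text[j] == end_char:
                depth -= 1          # leaving a block
                if depth == 0:      # back at the level where we started
                    yield i, j + 1
                    i = j + 1       # resume the scan after this block
                    break
        else:
            i += 1                  # no matching end boundary for this start


sample = "a{b{c}d}e{f}"
print([sample[s:e] for s, e in balanced_blocks(sample)])  # ['{b{c}d}', '{f}']
```

Each printed item corresponds to one hit of the Find button with regex A: the outer block is found whole, nested block included.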

Well, Rowan, now it's time for some practical work ! Assuming that your SB is the { symbol and that your EB is the } symbol, I put together the simple example text below :

Beginning of that
example text :

Mr{{123{456}78}904}0Rowan0{{54{6}4}}SYLVESTER{12345}789{Bradley}67890{}
---><---
Mr{
{123{45
6}78
}90
4}0Rowan0
{{54{6}4
}}SYLVESTER{123
45}789{Bradley
}678
90
{
}
End of that
example text

Paste it into a new tab, with a couple of blank lines at the beginning of the file

With your boundaries of blocks, this implies :

SB = \{ and EB = \}
AC = [^{}]
R0 = (?0) and R1 = (?1)

So, in your case, the three general regexes become :

Regex A : \{(?:[^{}]++|(?0))*\}

Regex B : [^{}]*(\{(?:[^{}]++|(?1))*\})[^{}]*

Regex C : (?:[^{}]*(\{(?:[^{}]++|(?1))*\})[^{}]*)+
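
The substitution of SB, EB and AC into the three general regexes can also be sketched in Python, in case you need the patterns for another pair of boundaries. The helper name `build_regexes` is made up for this example. It only builds the strings to paste into the Find what field; Python's built-in `re` module does not support recursive patterns, so these are not meant to be run with it.

```python
# Escaped boundaries and negated classes, copied from the notation list above.
BOUNDARIES = {
    "<": ("<", ">", "[^<>]"),
    "(": (r"\(", r"\)", "[^()]"),
    "{": (r"\{", r"\}", "[^{}]"),
    "[": (r"\[", r"\]", "[^][]"),
}


def build_regexes(opening):
    """Return the pattern strings for regexes A, B and C for one boundary pair."""
    sb, eb, ac = BOUNDARIES[opening]
    regex_a = f"{sb}(?:{ac}++|(?0))*{eb}"
    regex_b = f"{ac}*({sb}(?:{ac}++|(?1))*{eb}){ac}*"
    regex_c = f"(?:{regex_b})+"   # regex C is regex B, repeated one or more times
    return regex_a, regex_b, regex_c


for name, pattern in zip("ABC", build_regexes("{")):
    print("Regex", name, ":", pattern)
```

For `{` this prints the same three regexes as listed above; `build_regexes("(")` would give the equivalents for parentheses.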

First, move back to the very beginning of the text

Try, successively, these 3 regexes, containing recursive patterns, against the example text, using the Find >> button, to easily figure out each step !

Now, let's try to insert, in the line ---><---, a SB or EB boundary, between the two signs ><. Then try the 3 regexes A, B and C again, and note the differences from the previous case. Interesting, isn't it ?

I do hope that, with the help of these 3 regexes, you'll be able to easily locate the wrong { or } boundary, which breaks your well-balanced code and gives you the Unexpected End of File message ;-))

Best Regards,

guy038

P.S. : ( Added on 12-20-2017 ! )

BTW, the regex C is very useful while using the Mark feature, indeed !!

For instance, let’s use, again, the example text, in a new tab :

Beginning of that
example text :

Mr{{123{456}78}904}0Rowan0{{54{6}4}}SYLVESTER{12345}789{Bradley}67890{}
---><---
Mr{
{123{45
6}78
}90
4}0Rowan0
{{54{6}4
}}SYLVESTER{123
45}789{Bradley
}678
90
{
}
End of that
example text

Now, insert, in the line ---><---, a SB boundary, {, between the two signs ><
Then, open the Mark feature, with the menu option Search > Mark…

FIND WHAT (?:[^{}]*(\{(?:[^{}]++|(?1))*\})[^{}]*)+

OPTIONS Purge for each search , Wrap around and Regular expression checked

ACTION Click on the Mark All button

=> At once, we can easily see that the {, in the line --->{<--- is the ONLY character which is NOT marked red. Thus, it means that an EB boundary } should be added, somewhere between this NON-marked { and the END of the file, in order to get a complete text, well-balanced :-)

Similarly, insert, in the line ---><---, the EB boundary }, between the two signs ><
And open the Mark feature, with the menu option Search > Mark…

FIND WHAT (?:[^{}]*(\{(?:[^{}]++|(?1))*\})[^{}]*)+

OPTIONS Purge for each search , Wrap around and Regular expression checked

ACTION Click on the Mark All button

=> Again, it happens that the }, in the line --->}<--- is the ONLY character which is NOT marked red. Thus, it means that an SB boundary { should be added, somewhere between the BEGINNING of the file and this NON-marked }, in order to get a complete text, well-balanced :-)
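
As a cross-check outside Notepad++, the same two outcomes can be reproduced with a small stack-based scan. This is a different technique from the recursive regex, shown only as a sketch; the name `unmatched_braces` is invented here. Like the regex, it counts every brace it sees, including any inside strings, comments or HTML sections.

```python
def unmatched_braces(text, start_char="{", end_char="}"):
    """Return a sorted list of (line, column, char) for boundaries without a partner."""
    open_stack = []   # start boundaries still waiting for their end boundary
    unmatched = []    # end boundaries that had nothing to close
    line, col = 1, 0
    for ch in text:
        if ch == "\n":
            line, col = line + 1, 0
            continue
        col += 1
        if ch == start_char:
            open_stack.append((line, col, ch))
        elif ch == end_char:
            if open_stack:
                open_stack.pop()              # closes the most recent open block
            else:
                unmatched.append((line, col, ch))
    # Whatever is left on the stack was never closed.
    return sorted(unmatched + open_stack)


sample = "Mr{{1}2}\n--->{<---\nMr{\n}x\n"
print(unmatched_braces(sample))                       # [(2, 5, '{')]
print(unmatched_braces(sample.replace(">{<", ">}<")))  # [(2, 5, '}')]
```

In both runs the only position reported is the extra boundary in the ---><--- line, which is the character that Mark All leaves unmarked.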

Alan Kilborn

Cross-link to THIS POSTING where related info is discussed.