Hi, Vittorio,
Thinking about this topic again, I was able to improve my previous search regex a bit.
With the regexes below, it’s possible to detect any ODD number of double quotation marks, ", in a sentence or, when the line contains no dot, in a complete line of text :-). Admittedly, these new regexes look rather tricky, but they do work!
The first regex below selects the last, unbalanced, double quotation mark of a sentence or, by default, of a complete line:
SEARCH (^|\.)(?:([^".\r\n]*)"(?2)")*(?2)\K"(?=(?2)(\.|$))
NOTES :
Group 1, (^|\.), matches the beginning of a line or the dot which ends the previous sentence.
The part (?:([^".\r\n]*)"(?2)")* matches any number, possibly zero, of well-balanced sequences, of the form ....."..."..."......". Note that it’s a non-capturing group, because of the ?: syntax at the beginning of that group.
Therefore, group 2 is ([^".\r\n]*), inside the non-capturing group. It matches any range, even empty, of characters other than a double quotation mark, a dot or an EOL character.
The regex of this second group is re-used further on in the regex, as the subroutine call (?2) to group 2. So, writing the syntax (?2) is exactly like writing the regex [^".\r\n]* !
And, as in my previous post, the final match is the double quotation mark only. It is located after the \K syntax, which discards everything matched so far, and before the look-ahead (?=(?2)(\.|$)), which looks for a range of characters other than " and ., up to the end of the sentence or the line.
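For anyone who wants to try this idea outside Notepad++, here is a minimal Python sketch. It is an adaptation, not the regex above verbatim: Python's standard re module supports neither \K nor the subroutine call (?2), so the character class is written out in full and the wanted quotation mark is put in a capture group instead. The names OTHER_CHARS, LAST_UNBALANCED_QUOTE and unbalanced_quote_offsets are hypothetical, and the sketch assumes \n line endings.

```python
import re

# Regex of group 2, written out because Python's re has no (?2) call.
OTHER_CHARS = r'[^".\r\n]*'

# Adapted first regex: the capture group (") plays the role of \K.
LAST_UNBALANCED_QUOTE = re.compile(
    rf'(?:^|\.)'                          # line start or dot of previous sentence
    rf'(?:{OTHER_CHARS}"{OTHER_CHARS}")*'  # any number of well-balanced pairs
    rf'{OTHER_CHARS}(")'                   # the unbalanced quotation mark
    rf'(?={OTHER_CHARS}(?:\.|$))',         # no further " before the dot or EOL
    re.MULTILINE,
)


def unbalanced_quote_offsets(text):
    """Return the offset of the last unbalanced " of each sentence or line."""
    return [m.start(1) for m in LAST_UNBALANCED_QUOTE.finditer(text)]


example = 'He said "yes". She said "no'
for offset in unbalanced_quote_offsets(example):
    # Show what follows the unbalanced quotation mark.
    print(offset, repr(example[offset + 1:]))
```

The first sentence holds two quotation marks and is skipped. Only the mark just before no, in the second sentence, is reported.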
The second regex stops at the beginning of any line or sentence which contains an ODD number of double quotation marks:
SEARCH (^|\.)\K(?=(?:([^".\r\n]*)"(?2)")*(?2)"(?2)(\.|$))
NOTES :
This time, the second regex matches the empty string located between the beginning of a line (or the dot of a previous sentence) and a look-ahead which checks, FROM this current position, whether there is an odd number of double quotation marks up to the end of the sentence or the line!
So you’re immediately aware that there’s an unbalanced double quotation mark further on, in the current line or sentence :-)
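The same behaviour can be sketched in Python, again as an adaptation rather than the exact regex: since Python's re module has no \K, the leading (^|\.)\K is replaced by a zero-width test, ^ or a look-behind for a dot, and (?2) is written out in full. The names OTHER_CHARS, ODD_QUOTES_AHEAD and odd_sentence_starts are hypothetical, and \n line endings are assumed.

```python
import re

# Regex of group 2, written out because Python's re has no (?2) call.
OTHER_CHARS = r'[^".\r\n]*'

# Zero-width match at a line start, or right after a dot, when an odd
# number of " follows, up to the next dot or the end of the line.
ODD_QUOTES_AHEAD = re.compile(
    rf'(?:^|(?<=\.))'
    rf'(?=(?:{OTHER_CHARS}"{OTHER_CHARS}")*'   # well-balanced pairs
    rf'{OTHER_CHARS}"{OTHER_CHARS}(?:\.|$))',  # plus one extra quotation mark
    re.MULTILINE,
)


def odd_sentence_starts(text):
    """Return the offsets where a line or sentence with an odd number of " begins."""
    return [m.start() for m in ODD_QUOTES_AHEAD.finditer(text)]


example = 'He said "yes". She said "no'
for start in odd_sentence_starts(example):
    # Show the sentence that begins at the reported position.
    print(start, repr(example[start:]))
```

Here the only reported position is the one right after the dot, where the second sentence begins.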
To see the behaviour of these two regexes, just test them against the simple subject text below:
Line 1 "
Line 2 ""
Line 3 """
Line 4 """"
Line 5 """""
Line 6 """""". "Second" "sentence
The first regex should select the last " character of lines 1, 3 and 5 only, and the " just before the word sentence, on line 6.
With the second regex, the cursor should be located at the beginning of lines 1, 3 and 5 only, and just after the dot, on line 6.
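These expected results can be cross-checked without any regex, simply by splitting each line on dots and counting the quotation marks in each piece. A small Python sketch of that check, where SUBJECT is the subject text copied from above and odd_quote_sentences is a hypothetical helper name:

```python
SUBJECT = "\n".join([
    'Line 1 "',
    'Line 2 ""',
    'Line 3 """',
    'Line 4 """"',
    'Line 5 """""',
    'Line 6 """""". "Second" "sentence',
])


def odd_quote_sentences(text, quote='"'):
    """Return (line number, sentence number) pairs, both 1-based, for every
    dot-separated piece holding an odd number of quote characters."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for sentence_no, sentence in enumerate(line.split("."), start=1):
            if sentence.count(quote) % 2 == 1:
                hits.append((line_no, sentence_no))
    return hits


print(odd_quote_sentences(SUBJECT))
# -> [(1, 1), (3, 1), (5, 1), (6, 2)]
```

Lines 1, 3 and 5, plus the second sentence of line 6, come out as unbalanced, which agrees with what both regexes should find.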
To end with :
You may, of course, replace the double quotation mark in these regexes with a single quotation mark, for instance. However, note that the regexes above are NOT suitable when the start and stop characters are different, as for the pair ( and ) or even the French quotation marks “ and ”! That’s another story…
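As an illustration of swapping the delimiter, here is a hedged Python sketch which builds the first regex for any single delimiter character. It is an adaptation for Python's re module (capture group instead of \K, class written out instead of (?2)). The function name last_unbalanced_delimiter is hypothetical, and the delimiter is assumed to be one character other than a dot.

```python
import re


def last_unbalanced_delimiter(text, delimiter='"'):
    """Return the offsets of the last unbalanced delimiter of each sentence
    or line. The delimiter must be a single character other than a dot."""
    q = re.escape(delimiter)       # keep special characters literal
    other = rf"[^{q}.\r\n]*"       # anything but delimiter, dot or EOL
    pattern = re.compile(
        rf"(?:^|\.)(?:{other}{q}{other}{q})*{other}({q})(?={other}(?:\.|$))",
        re.MULTILINE,
    )
    return [m.start(1) for m in pattern.finditer(text)]


# Single quotation marks: one on the first line, four on the second.
print(last_unbalanced_delimiter("it's fine\n'a' 'b'", "'"))
```

Only the lone apostrophe of the first line is reported, since the second line holds an even number of single quotation marks.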
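If you don’t care about the notion of sentences, you can simplify these regexes by changing the anchor (^|\.) into ^ and the anchor (\.|$) into $. In that case, also remove the dot from the character class, so that [^".\r\n] becomes [^"\r\n]. Otherwise, a line which contains a dot could never match.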
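A rough Python sketch of the line-only variants follows, adapted for Python's re module (no \K, no (?2)). The names OTHER_CHARS, LAST_QUOTE_IN_LINE and ODD_LINE_START are hypothetical, \n line endings are assumed, and the dot is left out of the character class so that whole lines are examined.

```python
import re

# The dot is no longer excluded: sentences are ignored, only lines count.
OTHER_CHARS = r'[^"\r\n]*'

# Last unbalanced " of a line holding an odd number of quotation marks.
LAST_QUOTE_IN_LINE = re.compile(
    rf'^(?:{OTHER_CHARS}"{OTHER_CHARS}")*{OTHER_CHARS}(")(?={OTHER_CHARS}$)',
    re.MULTILINE,
)

# Zero-width match at the start of such a line.
ODD_LINE_START = re.compile(
    rf'^(?=(?:{OTHER_CHARS}"{OTHER_CHARS}")*{OTHER_CHARS}"{OTHER_CHARS}$)',
    re.MULTILINE,
)

# First line: two " in total. Second line: three " in total.
TEXT = 'a "b. c" d\nsay "hi". then "bye'

last_quotes = [m.start(1) for m in LAST_QUOTE_IN_LINE.finditer(TEXT)]
line_starts = [m.start() for m in ODD_LINE_START.finditer(TEXT)]
print(last_quotes, line_starts)
```

The first line is balanced as a whole line and is skipped, although each of its two sentences holds a single quotation mark. That is the practical difference with the sentence-aware regexes.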
Cheers,
guy038