Finding sentences with an open parenthesis that is not closed
-
Ach…that doesn’t seem to quite do it…maybe this one is better:
Find-what zone:
(?-s)\(.+?(?:(\))|[.?!])(?!\))
Replace-with zone:
$0?1:\)
…or maybe I should just give up… :-)
-
Hello Scott Sumner. Thanks for the answer. Your regex seems to be fine for my example, but I don’t need search and replace, only search.
Because in another case, such as this:
What is taxe d'habitation and (do I have to pay its products anytime?
your regex will add a closing parenthesis at the end of the sentence (but not where I want it). Not so good. So I only need to find the sentences with an open parenthesis that is not closed. Just to find them, not to close them.
-
Hello @Robin-cruise, @scott-sumner and All,
From your last post, and assuming that:
- A line may contain several correct (....) blocks
- Any (....) block is contained in a single line only
After some tests and building complicated regexes, I thought that, as you only need to perform a search, the best regex would simply be:
SEARCH
\([^(\r\n]*?\)|\(|\)
This regex matches any correct (....) block OR any non-balanced ( OR ) parenthesis!
Remark: As correct blocks are searched first, their own boundaries are automatically skipped. Thus, the regex engine correctly finds any remaining non-balanced parenthesis ;-))
Give it a try with the sample text below:
This is a sentence with TWO consecutive blocks between parentheses 00 00
This is a sentence with TWO consecutive blocks between ) parentheses 00 01
This is a sentence with TWO consecutive ( blocks between parentheses 00 10
This is a sentence with TWO consecutive ( blocks between ) parentheses 00 11
This is a ( sentence with TWO consecutive blocks between parentheses 10 00
This is a ( sentence with TWO consecutive blocks between ) parentheses 10 01
This is a ( sentence with TWO consecutive ( blocks between parentheses 10 10
This is a ( sentence with TWO consecutive ( blocks between ) parentheses 10 11
This is a sentence with TWO ) consecutive blocks between parentheses 01 00
This is a sentence with TWO ) consecutive blocks between ) parentheses 01 01
This is a sentence with TWO ) consecutive ( blocks between parentheses 01 10
This is a sentence with TWO ) consecutive ( blocks between ) parentheses 01 11
This is a ( sentence with TWO ) consecutive blocks between parentheses 11 00
This is a ( sentence with TWO ) consecutive blocks between ) parentheses 11 01
This is a ( sentence with TWO ) consecutive ( blocks between parentheses 11 10
This is a ( sentence with TWO ) consecutive ( blocks between ) parentheses 11 11
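For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch. It assumes that Python's standard re module treats this particular pattern the same way as Notepad++'s regex engine (nothing engine-specific such as \K is involved here). The helper name unbalanced is hypothetical and only serves the illustration.

```python
import re

# Balanced "(....)" blocks are tried first, so only the leftover
# parentheses end up as one-character "(" or ")" matches.
PAREN = re.compile(r"\([^(\r\n]*?\)|\(|\)")

def unbalanced(line):
    """Return (column, char) for each parenthesis that is not part of a block.

    Hypothetical helper: a balanced block is always at least two
    characters long, so a one-character match is a lone parenthesis.
    """
    return [(m.start(), m.group())
            for m in PAREN.finditer(line)
            if len(m.group()) == 1]

if __name__ == "__main__":
    for sample in ["a (b) c", "a ( b", "a ) b ( c ) d", "a ( b ( c ) d"]:
        print(repr(sample), unbalanced(sample))
```

With these short samples, "a (b) c" should report nothing, while "a ( b" should report the ( at column 2. In "a ( b ( c ) d" the first ( is reported and the second one is consumed as part of the balanced ( c ) block.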
Cheers,
guy038
-
-
@Robin-Cruise said:
but it’s not about search and replace, only search I need.
You are right; I misread your original post. But…if someone gives you a search+replace expression, can’t you just use the search part if that’s all you want to do? :-)
@guy038, likely overkill… :-D
-
Thank you, guy038.
But what if I want to find only the LINES that contain a single parenthesis, not both of them? Like in this case:
This is a sentence with open parentheses ( blocks between ...
This is a sentence with close parentheses ) blocks between ...
not this:
This is a ( sentence with TWO ) consecutive blocks between parentheses 11 00
-
Hi, @Robin-cruise and All,
In the case of a UNIQUE expected block of text between parentheses, strangely, the regex seems a bit more complicated!
SEARCH
^[^(\r\n]*\K\)|\((?!(?-s).*\))
Notes:
- This regex has two alternatives, separated by the | regex symbol:
  - ^[^(\r\n]*\K\) , which searches for a ) character if NO ( character, nor \r or \n, has been found before it, from the beginning of the current line
  - \((?!(?-s).*\)) , which searches for a ( character if NO ) character can be found further on in the current line
Just test it with the sample text below:
This is a sentence with ONLY ONE block between parentheses 00
This is a sentence with ONLY ONE block ) between parentheses 01
This is a sentence ( with ONLY ONE block between parentheses 10
This is a sentence ( with ONLY ONE block ) between parentheses 11
) 01
( 10
() 11
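To check this regex outside Notepad++, a rough Python translation is possible, with two caveats that are assumptions of this sketch and not part of the original regex. Python's standard re module has no \K, so each alternative captures its parenthesis in a group instead. The (?-s) modifier is dropped because . does not match line breaks in Python unless re.DOTALL is set. The helper name lone_parens is hypothetical.

```python
import re

# Translation of  ^[^(\r\n]*\K\)|\((?!(?-s).*\))  for Python's re module:
#   group 1 = a ")" with no "(" before it on the same line
#   group 2 = a "(" with no ")" after it on the same line
LONE = re.compile(r"^[^(\r\n]*(\))|(\()(?!.*\))", re.MULTILINE)

def lone_parens(text):
    """Return (offset, char) for every lone parenthesis found in text."""
    found = []
    for m in LONE.finditer(text):
        g = m.lastindex          # 1 for the ")" branch, 2 for the "(" branch
        found.append((m.start(g), m.group(g)))
    return found

if __name__ == "__main__":
    sample = "\n".join([
        "no parens",
        "close ) only",
        "open ( only",
        "both ( and )",
        ")",
        "(",
        "()",
    ])
    for offset, char in lone_parens(sample):
        print(offset, char)
```

Only the lines with a single parenthesis should be reported; the lines containing both ( and ) produce no match, as in the Notepad++ search.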
Cheers,
guy038
-
-
thank you
-
And one small thing: if I want to exclude all the lines that contain ); how can I do this?
This is a sentence with ( ONLY ONE block between parentheses
This is a sentence with ONLY ONE block ) between parentheses
NOT THIS:
This is a sentence with ONLY ONE block between parentheses );
I tried to add (?![\);]) to your regex, but it doesn’t work:
^[^(\r\n]*\K\)|\((?!(?-s).*\))(?![\);])
-
Hi, @Robin-cruise and All,
In that case, the search for the ending ) parenthesis must have the additional condition that it is NOT followed by a ; semicolon. Thus, the negative look-ahead (?!;) must be added right after the literal ending parenthesis \)
So, the regex becomes:
SEARCH
^[^(\r\n]*\K\)(?!;)|\((?!(?-s).*\))
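The position of the look-ahead matters because the | alternation splits the whole pattern: a look-ahead appended at the very end only constrains the ( alternative. The Python sketch below illustrates this under the same assumptions as any re translation of this regex (capture groups stand in for \K, and (?-s) is dropped since . does not cross line breaks by default). The names ATTEMPT and FIXED are hypothetical.

```python
import re

# Look-ahead appended to the end of the pattern: it belongs to the
# "(" alternative only, so a ")" followed by ";" is still matched.
ATTEMPT = re.compile(r"^[^(\r\n]*(\))|(\()(?!.*\))(?![);])", re.MULTILINE)

# Look-ahead placed right after the literal ")": a ")" followed by ";"
# is rejected.
FIXED = re.compile(r"^[^(\r\n]*(\))(?!;)|(\()(?!.*\))", re.MULTILINE)

if __name__ == "__main__":
    line = "This is a sentence with ONLY ONE block between parentheses );"
    print("attempt matches:", ATTEMPT.search(line) is not None)
    print("fixed matches:  ", FIXED.search(line) is not None)
```

On the ); sample line, the first pattern should still report a match while the second should not.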
BR
guy038
P.S.:
And with the text below:
This is a sentence ( with ONLY ONE block between parentheses );
Do you expect to match the opening ( parenthesis or to ignore it? Presently, it does not match the ( because a ) can be found further on in the line, so the look-ahead (?!(?-s).*\)) fails!
-
And, in the future, if I want to use a negative look-ahead for other signs like }, { or [, ], what negative look-ahead should I use?
Because, just for testing, I tried to change ); to )} so as to find those lines that contain only )}
This is a sentence with ONLY ONE block between parentheses )}
So, in this case, I should use something like (?!\}), correct?
^[^(\r\n]*\K\)(?!\})|\((?!(?-s).*\))
But it is not working.
-
Got it:
(?![\}]) or (?![\{])
So this will handle all the lines whose only parenthesis is ); or ){ or )}
^[^(\r\n]*\K\)(?!;)(?![\}])(?![\{])|\((?!(?-s).*\))
Thank you very much, guy038.
-
Hi, @Robin-cruise and All,
Of course, you may add these 3 look-aheads consecutively after the literal \) ending parenthesis, as you did: (?!;)(?![\}])(?![\{])
Indeed, while evaluating the condition in each look-ahead, the regex engine location does NOT change (it stays between the ) and its next character!)
However, you can also use the single look-ahead (?![;{}]) ! In addition, inside a character class [....], the { and } braces are just literal characters :-))
Recall:
Inside a character class [....], only 4 characters have a special meaning:
- The ^ character, which must be at any position but the first to be considered literal, or at any position if preceded by the \ escape symbol
- The ] character, which must be the very first character after the opening [ to be taken as literal, or at any position if preceded by the \ escape symbol
- The - character, which must be at the very beginning or at the very end of the character class to be considered literal, or at any position if preceded by the \ escape symbol
- The \ character, which can be at any position of the character class if it is itself preceded by another \ escape symbol, to be taken as a literal character
All the other characters inside a character class [....] are just literal chars!
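These rules are stated for the Notepad++ regex engine. As a quick sanity check, the sketch below tries the same cases with Python's standard re module, on the assumption that the two engines agree on these particular character-class rules. The checks dictionary is purely illustrative.

```python
import re

# pattern -> single character that the class is expected to match literally
checks = {
    r"[a^]": "^",    # "^" is literal when it is not the first character
    r"[]a]": "]",    # "]" is literal right after the opening "["
    r"[a-]": "-",    # "-" is literal at the very end of the class
    r"[-a]": "-",    # ... and at the very beginning
    r"[\\]": "\\",   # a backslash escaped by another backslash
    r"[;{}]": "{",   # braces need no escaping inside a class
}

results = {p: re.fullmatch(p, ch) is not None for p, ch in checks.items()}

if __name__ == "__main__":
    for pattern, ok in results.items():
        print(pattern, "->", ok)
    # A leading "^" negates the class instead of being literal:
    print(re.fullmatch(r"[^a]", "a") is None)
```

Every entry should come out as a match, and the last line shows that a leading ^ negates the class, so [^a] does not match the letter a.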
To sum up, assuming a unique (.....) block per line, the regex ^[^(\r\n]*\K\)(?![;{}])|\((?!(?-s).*\)) would find:
- The ending ) parenthesis, if not followed by a ; , a } or a { character AND if a ( parenthesis has not been found before it in the current line
- The starting ( parenthesis, if a ) parenthesis cannot be found further on in the current line
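As a cross-check of this summary, the following Python sketch compares the three consecutive look-aheads with the single (?![;{}]) look-ahead. It is a translation under stated assumptions, not the Notepad++ regex itself: capture groups replace \K, which Python's re module does not support, and (?-s) is omitted because . does not match line breaks by default. The names HEAD, TAIL and found are hypothetical.

```python
import re

HEAD = r"^[^(\r\n]*(\))"     # a ")" with no "(" before it on the line
TAIL = r"|(\()(?!.*\))"      # or a "(" with no ")" after it on the line

THREE = re.compile(HEAD + r"(?!;)(?![\}])(?![\{])" + TAIL, re.MULTILINE)
ONE = re.compile(HEAD + r"(?![;{}])" + TAIL, re.MULTILINE)

def found(rx, line):
    """Return the lone parenthesis matched in line, or None."""
    m = rx.search(line)
    return m.group(m.lastindex) if m else None

if __name__ == "__main__":
    samples = [
        "only close ) here",
        "only open ( here",
        "closed );",
        "closed ){",
        "closed )}",
        "both ( and )",
    ]
    for s in samples:
        print(repr(s), found(THREE, s), found(ONE, s))
```

Both patterns should agree on every sample: the first two lines report their lone parenthesis, and the remaining four report nothing.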
Cheers,
guy038
-