Finding sentences with an open parenthesis that is not closed



  • Good day. How do I find unclosed parentheses?

    ex. What is taxe d'habitation and (do I have to pay it?...

    After the ? there should be a )



  • @Robin-Cruise

    Perhaps something like this, if your ( and your sentence-ender (e.g. ? but also could include . or ! or…) is on the same line:

    Find-what zone: (?-s)\(.+?[?!.]
    Replace-with zone: $0\)
    Search mode: Regular expression
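
For readers who want to try this outside Notepad++, here is a minimal Python sketch of the same search and replace. It is an adaptation, not the Notepad++ syntax: (?-s) is dropped because . already excludes newlines in Python's re module, $0 becomes \g<0>, and the function name is made up for illustration. The second call shows one limitation: a block that is already closed gets an extra ).

```python
import re

# Python port of the Find-what zone. (?-s) is omitted: in Python's re,
# "." does not match a newline unless re.DOTALL is set.
FIND = re.compile(r'\(.+?[?!.]')

def close_after_sentence_end(text):
    """Append ")" after the first sentence-ender that follows each "("."""
    # \g<0> is Python's spelling of $0 (the whole match).
    return FIND.sub(r'\g<0>)', text)

print(close_after_sentence_end("What is taxe d'habitation and (do I have to pay it?..."))
# What is taxe d'habitation and (do I have to pay it?)...
print(close_after_sentence_end("Already (closed). Done."))
# Already (closed).) Done.
```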



  • Ach…that doesn’t seem to quite do it…maybe this one is better:

    Find-what zone: (?-s)\(.+?(?:(\))|[.?!])(?!\))
    Replace-with zone: $0?1:\)

    …or maybe I should just give up… :-)



  • Hello Scott Sumner. Thanks for the answer. Your regex seems to be fine for my example, but it’s not about search and replace, only search I need.

    because in another case, such as this:

    What is taxe d'habitation and (do I have to pay its products anytime?

    your regex will add a closing parenthesis at the end of the sentence (but not where I want it). Not so good. So I only need to find the sentences with an open parenthesis that is not closed. Just to find them, not to close them.



  • Hello @Robin-cruise, @scott-sumner and All,

    From your last post and assuming that :

    • A line may contain several (....) correct blocks

    • Any (....) block is contained in a single line, only

    After some tests and building complicated regexes, I thought, as you only need to perform a search, that the best regex would be, simply :

    SEARCH \([^(\r\n]*?\)|\(|\)

    This regex matches any correct (....) block OR any non-balanced ( OR ) parenthesis !

    Remark : As correct blocks are tried first, their own boundary parentheses are consumed as part of the block and are never reported on their own. Thus, the two single-parenthesis alternatives only match the remaining non-balanced parentheses ;-))

    Give it a try, with the sample text, below :

    This is a sentence with TWO consecutive blocks between parentheses            00 00
    This is a sentence with TWO consecutive blocks between ) parentheses          00 01
    This is a sentence with TWO consecutive ( blocks between parentheses          00 10
    This is a sentence with TWO consecutive ( blocks between ) parentheses        00 11
    
    This is a ( sentence with TWO consecutive blocks between parentheses          10 00
    This is a ( sentence with TWO consecutive blocks between ) parentheses        10 01
    This is a ( sentence with TWO consecutive ( blocks between parentheses        10 10
    This is a ( sentence with TWO consecutive ( blocks between ) parentheses      10 11
    
    This is a sentence with TWO ) consecutive blocks between parentheses          01 00
    This is a sentence with TWO ) consecutive blocks between ) parentheses        01 01
    This is a sentence with TWO ) consecutive ( blocks between parentheses        01 10
    This is a sentence with TWO ) consecutive ( blocks between ) parentheses      01 11
    
    This is a ( sentence with TWO ) consecutive blocks between parentheses        11 00
    This is a ( sentence with TWO ) consecutive blocks between ) parentheses      11 01
    This is a ( sentence with TWO ) consecutive ( blocks between parentheses      11 10
    This is a ( sentence with TWO ) consecutive ( blocks between ) parentheses    11 11
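
The same pattern also runs unchanged in Python's re module, which makes it easy to separate the two kinds of match. The sketch below is only an illustration (the function name is hypothetical): a match longer than one character is a correct (....) block, a one-character match is a stray parenthesis. It relies on the same assumption as above, that blocks are not nested.

```python
import re

# Same alternation as the SEARCH regex: a complete block first,
# then a lone "(" or ")".
PATTERN = re.compile(r'\([^(\r\n]*?\)|\(|\)')

def stray_parens(line):
    """Return (column, char) for each parenthesis outside a (....) block."""
    return [(m.start(), m.group())
            for m in PATTERN.finditer(line)
            if len(m.group()) == 1]  # a complete block is at least "()"

print(stray_parens("a ( b ) c ( d"))   # [(10, '(')]
print(stray_parens("a ) b ( c ) d"))   # [(2, ')')]
print(stray_parens("a ( b ) c ( d ) e"))  # []
```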
    

    Cheers,

    guy038



  • @Robin-Cruise said:

    but it’s not about search and replace, only search I need.

    You are right; I misread your original post. But…if someone gives you a search+replace expression, can’t you just use the search part if that’s all you want to do? :-)

    @guy038, likely overkill… :-D



  • Thank you, guy038.

    But what if I want to find only the LINES that contain one single parenthesis, not both of them? Like in this case:

    This is a sentence with open parentheses  ( blocks between  ... 
    This is a sentence with close parentheses  ) blocks between ...
    

    not this

    This is a ( sentence with TWO ) consecutive blocks between parentheses        11 00


  • Hi, @Robin-cruise and All,

    In the case of a UNIQUE expected block of text between parentheses, strangely, the regex seems a bit more complicated !

    SEARCH ^[^(\r\n]*\K\)|\((?!(?-s).*\))

    Notes :

    • This regex has two alternatives, separated with the | regex symbol :

      • ^[^(\r\n]*\K\), which searches for a ) character, provided that NO ( character (nor any \r or \n) occurs between the beginning of the current line and that ) ; the \K discards everything matched so far, so only the ) itself is reported

      • \((?!(?-s).*\)), which searches for a ( character, provided that NO ) character can be found, further on, on the current line

    Just test it with the sample text, below :

    This is a sentence with ONLY ONE block between parentheses          00
    This is a sentence with ONLY ONE block ) between parentheses        01
    This is a sentence ( with ONLY ONE block between parentheses        10
    This is a sentence ( with ONLY ONE block ) between parentheses      11
    
    )                                                                   01
    (                                                                   10
    ()                                                                  11
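
To check the behaviour outside Notepad++, here is a rough Python sketch. It is an adaptation and not the exact regex: Python's re module has no \K, so a capture group marks the ) instead, and (?-s) is left out because . does not match newlines by default. The function name and the short sample are invented for illustration.

```python
import re

# Group 1: a ")" with no "(" before it on the line (stands in for \K).
# Group 2: a "(" with no ")" after it on the line.
UNPAIRED = re.compile(r'^[^(\r\n]*(\))|(\()(?!.*\))', re.MULTILINE)

def unpaired_parens(text):
    """Return (line_number, column, char) for each unpaired parenthesis."""
    hits = []
    for m in UNPAIRED.finditer(text):
        group = 1 if m.group(1) is not None else 2
        pos = m.start(group)
        line_start = text.rfind('\n', 0, pos) + 1   # 0 on the first line
        line_no = text.count('\n', 0, pos) + 1
        hits.append((line_no, pos - line_start, m.group(group)))
    return hits

sample = "no parens\nblock ) here\nhere ( block\nhere ( block ) ok\n)\n(\n()\n"
print(unpaired_parens(sample))
# [(2, 6, ')'), (3, 5, '('), (5, 0, ')'), (6, 0, '(')]
```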
    

    Cheers,

    guy038



  • thank you



  • And one small thing: if I want to exclude all the lines that contain ); how can I do this?

     This is a sentence with ( ONLY ONE block  between parentheses       
     This is a sentence with ONLY ONE block ) between parentheses
    

    NOT THIS:

     This is a sentence with ONLY ONE block between parentheses  );
    

    I tried to add this to your regex, but it doesn’t work: (?![\);])

    ^[^(\r\n]*\K\)|\((?!(?-s).*\))(?![\);])



  • Hi, @Robin-cruise and All,

    In that case, the search for the ending parenthesis ( ) ) needs the additional condition that it is NOT followed by a semicolon ( ; ). Thus, the negative look-ahead (?!;) must be added after the literal ending parenthesis \)

    So, the regex becomes :

    SEARCH ^[^(\r\n]*\K\)(?!;)|\((?!(?-s).*\))

    BR

    guy038

    P.S. :

    And, with the text, below :

    This is a sentence ( with ONLY ONE block between parentheses );
    

    Do you expect to match the opening parenthesis ( ( ) or to ignore it ? Presently, it does not match the (, because a ) follows it on the same line, so the look-ahead (?!(?-s).*\)) rejects it ; the (?!;) part is not involved here !



  • And, in the future, if I want to use a negative look-ahead for other signs like },{ or [,] what negative look-ahead should I use?

    Because, just for testing, I tried to change ); to )} so as to find those lines that contain only )}

     This is a sentence with ONLY ONE block between parentheses  )}
    

    so, in this case I should use something like this (?!\}), correct?

    ^[^(\r\n]*\K\)(?!\})|\((?!(?-s).*\)) but it is not working



  • Got it: (?![\}]) or (?![\{])

    So this will skip any ) that is followed by ;, { or } (that is, the ); or ){ or )} cases):

    ^[^(\r\n]*\K\)(?!;)(?![\}])(?![\{])|\((?!(?-s).*\))

    Thank you very much Guy038



  • Hi, @Robin-cruise and All,

    Of course, you may add these 3 look-aheads, consecutively, after the literal \) ending parenthesis, as you did :

    (?!;)(?![\}])(?![\{])

    Indeed, while evaluating each condition, in each look-ahead, the regex engine location does NOT change ( It is just between the ) and its next character ! )

    However, you can, also, use the single look-ahead (?![;{}]) ! In addition, when inside a character class [....], the { and } braces are just literal characters :-))

    Recall :

    Inside a character class [....], 4 characters, only, have a special meaning :

    • The character ^, which must be at any position but the first to be considered as literal, or at any position if preceded with the \ escape symbol

    • The character ], which must be the very first character, after the opening [, to be taken as literal, or at any position if preceded with the \ escape symbol

    • The character -, which must be at the very beginning or at the very end of the character class to be considered as literal, or at any position if preceded with the \ escape symbol

    • The character \, which can be at any position of the character class, if preceded, itself, with another \ escape symbol, to be taken as a literal character

    • All the other characters, inside a character class [....], are just literal chars !
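
These rules can be tried quickly in Python, whose re module follows the same conventions for these four characters. That says nothing definitive about every corner of the Boost engine used by Notepad++, so take it as a sketch only. The helper name is hypothetical.

```python
import re

def chars_matched(class_body, text):
    """Return every character of text matched by the class [class_body]."""
    return re.findall('[' + class_body + ']', text)

print(chars_matched(r';{}', 'a;b{c}d'))  # braces are literal: [';', '{', '}']
print(chars_matched(r'x^', '^x'))        # ^ not in first place: ['^', 'x']
print(chars_matched(r']x', 'a]x'))       # ] right after the opening [: [']', 'x']
print(chars_matched(r'a-', 'a-b'))       # - at the very end: ['a', '-']
print(chars_matched(r'\\', 'a\\b'))      # escaped backslash: ['\\']
```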


    To sum up, assuming a unique block (.....) per line, the regex ^[^(\r\n]*\K\)(?![;{}])|\((?!(?-s).*\)) would find :

    • The ending ) parenthesis, if not followed with a ;, a } or a { character AND if a ( parenthesis has not been found, before, in current line

    • The starting ( parenthesis, if a ) parenthesis cannot be found, further on, in current line
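
As a cross-check, the Python sketch below compares the three consecutive look-aheads with the single (?![;{}]) on a small invented sample. It is an approximation of the Notepad++ regex: Python's re module lacks \K, so a capture group replaces it, and (?-s) is omitted because . excludes newlines by default. All names are made up for illustration.

```python
import re

# Second alternative, shared by both variants: "(" with no ")" after it.
TAIL = r'|(\()(?!.*\))'
THREE = re.compile(r'^[^(\r\n]*(\))(?!;)(?![\}])(?![\{])' + TAIL, re.MULTILINE)
ONE = re.compile(r'^[^(\r\n]*(\))(?![;{}])' + TAIL, re.MULTILINE)

def flagged_lines(pattern, text):
    """Return the 1-based numbers of lines holding an unpaired parenthesis."""
    return sorted({text.count('\n', 0, m.start()) + 1
                   for m in pattern.finditer(text)})

sample = "one ( open\none ) close\ncall );\nblock ){\nblock )}\n( paired )\n"
print(flagged_lines(THREE, sample))  # [1, 2]
print(flagged_lines(ONE, sample))    # [1, 2]
```

Both variants flag the same lines because a look-ahead consumes no text: each one is tested at the same position, right after the ).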

    Cheers,

    guy038

