Searching for case mismatches ONLY



  • I have a huge amount of text to process, and I need to find a certain string inside it. Or, more accurately, I need to find strings that are the same as my search string but without regard to case. HOWEVER, I don’t want to get hits on the correct case versions of my string.

    Example:

    I’m wanting to find all versions of “mismatch” but without finding exactly “mismatch”.

    So some of what I want to find:

    • Mismatch
    • misMatch
    • MISMATCH
    • mismatcH
    • etc.

    but again I don’t want hits on exactly “mismatch”

    Is there a good way?



  • @Alan-Kilborn

    This seems to work:

    Find what zone: (?-i)(?!mismatched)(?i)MiSmAtChEd
    Search mode radio-button: Regular expression

    The exact text you don’t want to see in a hit appears first in the expression. Any cased version of the text appears in the second part.

    Note that (?-i)(?!mismatched)(?i)mismatched also works but with that version I forget which part is which. :-)
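
    For anyone who wants to try the same idea outside Notepad++, here is a minimal sketch in Python. It is an adaptation, not the expression above verbatim: as far as I know, Python's `re` module does not accept a bare `(?-i)` or a mid-pattern `(?i)` toggle, so the case-insensitive part is written as a scoped group `(?i:...)`. The word "mismatch" from the original question, the sample text and the helper name `find_wrong_case` are only illustrative.

```python
import re

# Assumption: no global IGNORECASE flag is set, so the negative lookahead
# is case-sensitive by default. Only the scoped group (?i:...) ignores case.
WRONG_CASE = re.compile(r"(?!mismatch)(?i:mismatch)")

def find_wrong_case(text):
    """Return every cased variant of 'mismatch' except the exact lowercase form."""
    return WRONG_CASE.findall(text)

sample = "mismatch Mismatch misMatch MISMATCH mismatcH mismatch"
hits = find_wrong_case(sample)
print(hits)  # ['Mismatch', 'misMatch', 'MISMATCH', 'mismatcH']
```

    The lookahead rejects the position when the exact lowercase text follows, and the scoped group then accepts any other casing.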



  • Hello, @alan-kilborn, @Scott-Sumner and All,

    Bravo, Scott! Just for info: the case of the final word to look for must be different from any of the forms which have to be avoided.

    For instance, let's suppose we have all the cased versions of the five-letter word table, below:

    table    tablE    tabLe    tabLE
    taBle    taBlE    taBLe    taBLE
    tAble    tAblE    tAbLe    tAbLE
    tABle    tABlE    tABLe    tABLE
    Table    TablE    TabLe    TabLE
    TaBle    TaBlE    TaBLe    TaBLE
    TAble    TAblE    TAbLe    TAbLE
    TABle    TABlE    TABLe    TABLE
    

    Then the regex (?-i)(?!TABLE|Table|table)(?i)TablE would find all cased versions of table, except for TABLE, Table and table.

    • The word to search for in any case, (?i)TablE, is, of course, different from each of the 3 versions to avoid :-))

    • When the regex engine is positioned right before any cased version of the word table, the part (?-i)(?!TABLE|Table|table) just verifies that the word about to be read at this position is not TABLE, Table or table, in that exact case. That's all :-))
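
    The table example can be checked mechanically with a short Python sketch. The same caveat applies as for any port to Python: the scoped group `(?i:...)` stands in for the `(?-i)...(?i)` toggles, which I believe Python's `re` does not accept in that form. The variable names are hypothetical and only there for illustration.

```python
import itertools
import re

WORD = "table"

# Build every upper/lower combination of the word: 2**5 = 32 variants.
variants = [
    "".join(chars)
    for chars in itertools.product(*((c.lower(), c.upper()) for c in WORD))
]

# Case-sensitive lookahead lists the forms to avoid; the scoped group
# (?i:TablE) then matches the word in any case.
pattern = re.compile(r"(?!TABLE|Table|table)(?i:TablE)")

found = [v for v in variants if pattern.fullmatch(v)]
skipped = [v for v in variants if not pattern.fullmatch(v)]

print(len(variants), len(found))  # 32 29
print(skipped)                    # ['table', 'Table', 'TABLE']
```

    Of the 32 cased versions, 29 are matched and only the three listed in the lookahead are skipped.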

    Cheers,

    guy038

