What's so good about Extended search mode?






  • @Prahlad-Makwana4145

    I assume this has historical reasons. Once upon a time there was an npp release that had no regular expression support, but users wanted to be able to delete certain chars, and one good soul implemented it. And, as in fairy tales, if it hasn’t died, the functionality is still alive :-)



  • Hello @prahlad-makwana4145, @ekopalypse and All,

    Welcome to the N++ Community !

    Actually, in Extended mode, in addition to standard characters, you can search for the 5 special characters below :

    Character          Syntax
    ----------------   ------
    Tabulation         \t
    New Line           \n
    Carriage Return    \r
    Backslash          \\
    Null               \0
    ----------------   ------

    Within a Unicode-encoded file, a particular character of code-point U+00xx, with xx between 00 and FF, can be found with any one of the four syntaxes below :

    Type           From          To            Character Range
    ------------   -----------   -----------   ---------------
    Decimal        \d000         \d255         [0-9]
    Octal          \o000         \o377         [0-7]
    Binary         \b00000000    \b11111111    [0-1]
    Hexadecimal    \x00          \xFF          [0-9A-F]
    ------------   -----------   -----------   ---------------

    However, within an ANSI-encoded file, a character of code-point U+00xx can be found ONLY IF xx belongs to the range [00-7F] OR to the range [A0-FF]. When xx lies between 80 and 9F, the search generally looks for the question mark ( ? ) instead, because the syntax refers to a Unicode char whose code-point is not handled by the ANSI encoding ! Only the 5 characters U+0081, U+008D, U+008F, U+0090 and U+009D, which have no glyph, are searched correctly !


    Notes :

    • The Extended search mode, like the Regular expression one, cannot be used to search for the individual bytes of a UTF-8 or UCS-2 encoded character !

    • The replacement zone, in Extended mode, may contain any char, except for the NUL char ( \0 )

    • When using the Extended mode, especially when searching for letters, it is advisable to tick the Match case option

    • Reminder : In the Normal and Extended search mode, it’s best to NOT tick the Match whole word only option, especially when the searched string begins and/or ends with a NON-word character !


    So, for instance, with the Match case option ticked, the Match whole word only option UN-ticked and the Extended (\n, \r, \t, \0, \x...) search mode selected :

    • If you search for the uppercase letter A, you can use any of the syntaxes \d065, \o101, \b01000001 or \x41

    • And if you look for the character with decimal code 201 ( É, code-point U+00C9 ), type in any of the syntaxes \d201, \o311, \b11001001 or \xC9
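
    If you don't want to convert the codes by hand, a small script can print the four syntaxes for a given character. This is only a sketch, not part of Notepad++ : the helper name extended_escapes is made up for illustration, and it assumes the fixed widths shown in the table above ( 3 decimal digits, 3 octal digits, 8 binary digits, 2 hexadecimal digits ) :

```python
def extended_escapes(ch: str) -> dict[str, str]:
    """Build the four numeric Extended-mode syntaxes for one character.

    Hypothetical helper for illustration only. Assumes fixed-width,
    zero-padded codes, as listed in the syntax table of this post.
    """
    code = ord(ch)  # Unicode code point of the single character
    if code > 0xFF:
        # The numeric syntaxes only cover code points U+0000 to U+00FF
        raise ValueError("character is outside the range U+0000 - U+00FF")
    return {
        "decimal": f"\\d{code:03d}",      # 3 digits, 000 to 255
        "octal": f"\\o{code:03o}",        # 3 digits, 000 to 377
        "binary": f"\\b{code:08b}",       # 8 digits, 00000000 to 11111111
        "hexadecimal": f"\\x{code:02X}",  # 2 digits, 00 to FF
    }


if __name__ == "__main__":
    for char in ("A", "\u00c9"):  # "\u00c9" is the letter É
        print(char, extended_escapes(char))
```

    The strings it returns can be pasted as they are into the Find what zone, once the Extended search mode is selected.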

    Best Regards,

    guy038

    P.S. :

    Personally, I think that the only advantage of using the Extended mode is when you want to use the \dxxx syntax, where xxx represents the decimal code of the character :

    • Between 000 and 255 ( so in range U+0000 - U+00FF) within a UTF-8 or UCS-2 encoded file

    • Between 000 and 127 or between 160 and 255 ( so in ranges U+0000 - U+007F or U+00A0 - U+00FF ) within an ANSI file

    In all other cases, just prefer the Regular expression search mode ;-))

