Searching for the ligature "ff" also finds the non-ligature "ff"



  • I’m searching for the single character “ff” (U+FB00, LATIN SMALL LIGATURE FF) in multiple files, but the results also include files that contain two ordinary letters “ff”. The same happens with “ffl” and “ffi”.

    Is there any way to suppress this unwanted help from np++? (NPP 7.5.6, and the files are UTF-8.)

    Thanks!
    mario



  • Hello, @mario-valle,

    Indeed, these Latin ligatures begin the Unicode block named “Alphabetic Presentation Forms”. Refer to the link below:

    http://www.unicode.org/charts/PDF/UFB00.pdf

    In the list below, the character located between the hexadecimal code and the name is the true ligature character (and not the string ff, ffi, ffl…!)

    FB00    ff    LATIN SMALL LIGATURE FF
    FB01    fi    LATIN SMALL LIGATURE FI
    FB02    fl    LATIN SMALL LIGATURE FL
    FB03    ffi    LATIN SMALL LIGATURE FFI
    FB04    ffl    LATIN SMALL LIGATURE FFL
    FB05    ſt    LATIN SMALL LIGATURE LONG S T
    FB06    st    LATIN SMALL LIGATURE ST
    

    Depending on the font currently used in Notepad++, you may not see all these characters. For instance, with my current monospaced Courier New font, only two of them are correctly displayed!

    Luckily, whatever glyph is actually displayed, any of these characters can be searched for with the simple regex \x{FB0n}, with 0 <= n <= 6. If you want to match all the ligatures, simply use the following character class: [\x{FB00}-\x{FB06}]

    Of course, don’t forget to set the Regular expression search mode, in the Find/Replace dialog !
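
For anyone who wants to double-check a file outside Notepad++, here is a minimal Python sketch of the same character-class idea. It is only an illustration: the helper name `find_ligatures` and the sample string are made up for this example, and Python's `re` module uses `\uXXXX` escapes rather than the `\x{FB00}` syntax accepted by Notepad++.

```python
import re

# Character class covering U+FB00..U+FB06, written with Python's \uXXXX
# escapes (Python's re module does not accept the \x{FB00} syntax).
LIGATURES = re.compile("[\uFB00-\uFB06]")


def find_ligatures(text):
    """Return (offset, code point) pairs for every ligature found in text."""
    return [
        (m.start(), f"U+{ord(m.group()):04X}")
        for m in LIGATURES.finditer(text)
    ]


# Hypothetical sample: two words with real ligatures, two with plain letters.
sample = "o\uFB03ce and office, \uFB02ow and flow"
print(find_ligatures(sample))
```

Only the two words written with the single-character ligatures are reported; "office" and "flow" spelled with ordinary letters are not matched.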

    Best Regards,

    guy038



  • @Mario-Valle,

    For me, there’s actually a difference between what it finds and what it highlights. If you searched @guy038’s quoted text

    FB00    ff    LATIN SMALL LIGATURE FF
    FB01    fi    LATIN SMALL LIGATURE FI
    FB02    fl    LATIN SMALL LIGATURE FL
    FB03    ffi    LATIN SMALL LIGATURE FFI
    FB04    ffl    LATIN SMALL LIGATURE FFL
    FB05    ſt    LATIN SMALL LIGATURE LONG S T
    FB06    st    LATIN SMALL LIGATURE ST
    

    it would find one instance of the ligature (the ‘LATIN SMALL LIGATURE FI’, U+FB01). But when that gets highlighted, it also highlights the plain “FI” characters.

    So, are you complaining that when you highlight the ligature, it also highlights the non-ligature versions?

    If not, I thought maybe you were saying that the Find in Files dialog behaves differently from the normal Find dialog. But when I make three files in a “lig” subdirectory, “lig1.txt”, “lig2.txt”, and “lig3.txt”, where 1 and 3 are identical to the above and lig2 does not include the ligature (U+FB01) but does contain the text FI, it correctly reports that “lig2.txt” does not contain the ligature.

    Then I realized there might be a difference between the FF and the FI ligatures, and/or case sensitivity, so I switched to “lig1” and “lig3” containing:

    fb00    ff    latin small ligature ff
    fb01    fi    latin small ligature fi
    fb02    fl    latin small ligature fl
    fb03    ffi    latin small ligature ffi
    fb04    ffl    latin small ligature ffl
    fb05    ſt    latin small ligature long s t
    fb06    st    latin small ligature st
    

    and “lig2” containing:

    fb00    xxx  latin small ligature ff
    fb01    xxx  latin small ligature fi
    fb02    fl    latin small ligature fl
    fb03    ffi    latin small ligature ffi
    fb04    ffl    latin small ligature ffl
    fb05    ſt    latin small ligature long s t
    fb06    st    latin small ligature st
    

    But searching that for the ligature still only finds hits in “lig1” and “lig3”, and no hits in “lig2”.

    I cannot replicate anything where it does the match incorrectly “in multiple files”.

    Could you provide a couple example files that match for you, and the exact conditions under which you search for them and it finds them?
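
If it helps with putting example files together, a small script along these lines can confirm, independently of Notepad++, which files really contain the single ligature character and which only contain two ordinary letters. This is a rough sketch under assumptions: the function name `classify`, the `*.txt` pattern, and the demo file contents are all made up for illustration, and the files are assumed to be UTF-8 as stated in the question.

```python
import tempfile
from pathlib import Path

LIGATURE_FF = "\uFB00"  # LATIN SMALL LIGATURE FF, a single code point


def classify(folder, pattern="*.txt"):
    """Map each file name to (has_ligature, has_plain_ff)."""
    report = {}
    for path in sorted(Path(folder).glob(pattern)):
        text = path.read_text(encoding="utf-8")
        # Plain substring tests: no case folding or normalization involved.
        report[path.name] = (LIGATURE_FF in text, "ff" in text)
    return report


# Demo on two throwaway files (hypothetical contents).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "lig1.txt").write_text("o\uFB00er\n", encoding="utf-8")
    Path(tmp, "lig2.txt").write_text("offer\n", encoding="utf-8")
    demo = classify(tmp)

print(demo)
```

A file that shows up in Find in Files but reports `False` for the ligature here would be a good candidate to attach as a reproduction case.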

    I ran some more experiments: selecting the ligature highlights all the other instances of it, ligature or non-ligature, upper or lower case.

    If that’s annoying, you could turn off the Settings > Preferences > Highlighting > Smart Highlighting > ☐ Enable … but then highlighting something else (such as SMALL in the example text) would also not highlight the other matches.

    If you do Settings > Preferences > Highlighting > Smart Highlighting >, and select

    ☑ Enable
    ☑ Match case
    ☑ Match whole word only
    

    Then you can select the fl ligature (U+FB02), and it only highlights the ligatures, but not fl or FL; similarly, selecting fl will only find the two lower-case letters, and not the ligature or FL; and selecting FL matches only the two capitals, not the ligature or fl.

    If it’s the smart-highlighting, then requiring it to match case will prevent extra selection. If it’s either the Find or Find In Files dialogs, I cannot replicate your results, so if you need more help, you’ll have to be more explicit about what’s going on. If your “in multiple files” is by some other method, you’ll definitely need to be more explicit.
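
As a side note on why a case-insensitive match might treat the ligature like the two plain letters: in Unicode's case mappings, the ligature folds to "fi" and uppercases to "FI". The Python sketch below only demonstrates that Unicode behavior; whether Notepad++'s Smart Highlighting uses this mechanism internally is an assumption on my part and is not confirmed anywhere in this thread. The helper names `loose_equal` and `strict_equal` are made up for the example.

```python
LIG_FI = "\uFB01"  # LATIN SMALL LIGATURE FI, a single code point


def loose_equal(a, b):
    """Case-insensitive comparison using Unicode full case folding."""
    return a.casefold() == b.casefold()


def strict_equal(a, b):
    """Case-sensitive comparison: code points must match exactly."""
    return a == b


# Full case folding expands the ligature into two ordinary letters.
print(len(LIG_FI), len(LIG_FI.casefold()))
print(loose_equal(LIG_FI, "fi"), loose_equal(LIG_FI, "FI"))
print(strict_equal(LIG_FI, "fi"), strict_equal(LIG_FI, "FI"))
```

With the loose comparison the ligature equals both "fi" and "FI"; with the strict one it equals neither, which mirrors what ticking Match case does in the experiment above.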

