    Searching for ligature "ﬀ" also finds non-ligature "ff"

    • Mario Valle

      I’m searching for the single character “ﬀ” (LATIN SMALL LIGATURE FF) in multiple files, but the results also include files containing two ordinary f characters, “ff”. The same happens with “ﬄ” and “ﬃ”.

      Is there any method to suppress this unwanted help by np++? (NPP 7.5.6 and files are UTF-8)

      Thanks!
      mario

      • guy038

        Hello, @mario-valle,

        Indeed, these Latin ligatures begin the Unicode block named “Alphabetic Presentation Forms”. See the chart linked below:

        http://www.unicode.org/charts/PDF/UFB00.pdf

        In the text below, the character located between the hexadecimal code and the name is the true ligature character (and not the string ff, ffi, ffl…!)

        FB00    ﬀ    LATIN SMALL LIGATURE FF
        FB01    ﬁ    LATIN SMALL LIGATURE FI
        FB02    ﬂ    LATIN SMALL LIGATURE FL
        FB03    ﬃ    LATIN SMALL LIGATURE FFI
        FB04    ﬄ    LATIN SMALL LIGATURE FFL
        FB05    ﬅ    LATIN SMALL LIGATURE LONG S T
        FB06    ﬆ    LATIN SMALL LIGATURE ST
        

        Depending on the font you currently use in Notepad++, you may not see all these characters. For instance, with my current mono-spaced Courier New font, only the ﬁ and ﬂ ligatures are displayed correctly!

        Luckily, whatever glyph is actually displayed, any of these characters can be searched for with the simple regex \x{FB0n}, with 0 <= n <= 6. To match any of the ligatures, simply use the following character class: [\x{FB00}-\x{FB06}]

        Of course, don’t forget to select the Regular expression search mode in the Find/Replace dialog!
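As a cross-check outside Notepad++, the same character class can be tried in Python. This is only a sketch: Python's `re` module writes code points as `\uFB00` rather than `\x{FB00}`, and the helper name `find_ligatures` and the sample string are made up for illustration.

```python
import re

# Same range as [\x{FB00}-\x{FB06}], written in Python's \uXXXX syntax.
LIGATURES = re.compile(r"[\uFB00-\uFB06]")

def find_ligatures(text):
    """Return (offset, character, code point) for every ligature in text."""
    return [
        (m.start(), m.group(), f"U+{ord(m.group()):04X}")
        for m in LIGATURES.finditer(text)
    ]

# Hypothetical sample: the first word uses U+FB00, the second two plain f's.
sample = "o\uFB00ice versus office"
print(find_ligatures(sample))
```

Only the first word is reported, because the class matches single code points in the FB00 to FB06 range and never the two-letter sequence "ff".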

        Best Regards,

        guy038

        • PeterJones

          @Mario-Valle,

          For me, there’s actually a difference between what it finds and what it highlights. If you searched @guy038’s quoted text for ﬁ

          FB00    ﬀ    LATIN SMALL LIGATURE FF
          FB01    ﬁ    LATIN SMALL LIGATURE FI
          FB02    ﬂ    LATIN SMALL LIGATURE FL
          FB03    ﬃ    LATIN SMALL LIGATURE FFI
          FB04    ﬄ    LATIN SMALL LIGATURE FFL
          FB05    ﬅ    LATIN SMALL LIGATURE LONG S T
          FB06    ﬆ    LATIN SMALL LIGATURE ST
          

          it would find one instance of ﬁ (the ‘LATIN SMALL LIGATURE FI’). But when that ﬁ gets highlighted, the “FI” characters are highlighted as well.

          So, are you complaining that when you highlight the ligature, it also highlights the non-ligature versions?

          If not, I had thought maybe you were trying to say that the Find In Files dialog was behaving differently than the normal Find dialog… But when I make three files in a “lig” subdirectory, “lig1.txt”, “lig2.txt”, and “lig3.txt”, where 1 and 3 are identical to the above and lig2 does not include the ligature ﬁ but does contain the text FI, it correctly says that “lig2.txt” does not contain the ﬁ.

          Then I realized there might be a difference between the FF and the FI ligatures, and/or case sensitivity, so I switched to “lig1” and “lig3” containing:

          fb00    ﬀ    latin small ligature ff
          fb01    ﬁ    latin small ligature fi
          fb02    ﬂ    latin small ligature fl
          fb03    ﬃ    latin small ligature ffi
          fb04    ﬄ    latin small ligature ffl
          fb05    ﬅ    latin small ligature long s t
          fb06    ﬆ    latin small ligature st
          

          and “lig2” containing:

          fb00    xxx  latin small ligature ff
          fb01    xxx  latin small ligature fi
          fb02    ﬂ    latin small ligature fl
          fb03    ﬃ    latin small ligature ffi
          fb04    ﬄ    latin small ligature ffl
          fb05    ﬅ    latin small ligature long s t
          fb06    ﬆ    latin small ligature st
          

          But searching those files for ﬀ still only finds hits in “lig1” and “lig3”, and no hits in “lig2”.

          I cannot replicate anything where it does the match incorrectly “in multiple files”.
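The three-file experiment can also be reproduced independently of Notepad++ with a short script. This is a minimal sketch, not a description of how Find In Files works internally: the function names are hypothetical, each file is reduced to a single line, and the files are assumed to be UTF-8 as in the original question.

```python
import tempfile
from pathlib import Path

LIGATURE_FF = "\uFB00"  # LATIN SMALL LIGATURE FF, a single code point

def files_containing(directory, needle, pattern="*.txt"):
    """Return names of UTF-8 files in directory whose text contains needle exactly."""
    hits = []
    for path in sorted(Path(directory).glob(pattern)):
        if needle in path.read_text(encoding="utf-8"):
            hits.append(path.name)
    return hits

def demo():
    """Recreate simplified lig1/lig2/lig3 files in a temporary folder and search them."""
    with_ligature = "fb00    \uFB00    latin small ligature ff\n"
    without_ligature = "fb00    xxx  latin small ligature ff\n"
    files = (
        ("lig1.txt", with_ligature),
        ("lig2.txt", without_ligature),
        ("lig3.txt", with_ligature),
    )
    with tempfile.TemporaryDirectory() as folder:
        for name, text in files:
            Path(folder, name).write_text(text, encoding="utf-8")
        return files_containing(folder, LIGATURE_FF)

print(demo())
```

Because the comparison is an exact code-point match, “lig2.txt” is not reported even though it contains the plain letters "ff" in the character name.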

          Could you provide a couple example files that match for you, and the exact conditions under which you search for them and it finds them?

          I ran some more experiments and found that selecting the ligature highlights all the other instances, ligature or non-ligature, upper or lower case.

          If that’s annoying, you could turn off the Settings > Preferences > Highlighting > Smart Highlighting > ☐ Enable … but then highlighting something else (such as SMALL in the example text) would also not highlight the other matches.

          If you go to Settings > Preferences > Highlighting > Smart Highlighting and select

          ☑ Enable
          ☑ Match case
          ☑ Match whole word only
          

          Then you can select the ﬂ ligature, and it only highlights the ﬂ ligatures, but not fl or FL; similarly, selecting fl will only find the two lower-case letters, and not ﬂ or FL; and selecting FL matches only the two capitals, not ﬂ or fl.

          If it’s the smart-highlighting, then requiring it to match case will prevent extra selection. If it’s either the Find or Find In Files dialogs, I cannot replicate your results, so if you need more help, you’ll have to be more explicit about what’s going on. If your “in multiple files” is by some other method, you’ll definitely need to be more explicit.
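The effect of the Match case option can be illustrated with Unicode case folding, which maps the ﬂ ligature to the two letters "fl". This is only an analogy under a stated assumption: the thread does not say how Notepad++ compares text internally, and the two helper functions below are hypothetical.

```python
LIG_FL = "\uFB02"  # LATIN SMALL LIGATURE FL

def same_ignoring_case(a, b):
    """Compare using full Unicode case folding (the ligature folds to 'fl')."""
    return a.casefold() == b.casefold()

def same_matching_case(a, b):
    """Compare code point by code point, as a case-sensitive match would."""
    return a == b

for other in ("fl", "FL", LIG_FL):
    print(
        repr(other),
        "ignoring case:", same_ignoring_case(LIG_FL, other),
        "matching case:", same_matching_case(LIG_FL, other),
    )
```

With case ignored, the ligature compares equal to both "fl" and "FL"; with case matched, it equals only itself, which mirrors the behavior observed with Match case enabled.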
