    Notepad++: How to find in page with UTF-8 instead of ANSI?

    • Robin Cruise

      Hello, I have a lot of words like ştiinţific and stiintific (with and without diacritics/accent marks). How can I search so that both versions are found?

      I can do this in all PDF and MS Word files, but in Notepad++ I cannot. So, is there a way to do this kind of find, and also the replace, with UTF-8?

      • guy038

        Hello, @robin-cruise and All,

        You can achieve this with equivalence classes. Their general syntax is [[=<Single_Letter>=]]

        For instance, the regex [[=A=]] would match any of these 82 Unicode chars : AaªÀÁÂÃÄÅàáâãäåĀāĂ㥹ǍǎǞǟǠǡǺǻȀȁȂȃȦȧȺɐɑɒᴀᴬᵃᵄᶏᶐᶛḀḁẚẠạẢảẤấẦầẨẩẪẫẬậẮắẰằẲẳẴẵẶặₐÅ⒜ⒶⓐⱥⱭⱯⱰ, which have a relation, in some way, with the first letter of the Latin alphabet !

        More precisely, the equivalence class itself is the [=<Single_Letter>=] part, which is embedded in an ordinary character class [•••••]. For instance, the regex
        (?-i)[012[=A=]@b-y[=z=]|] matches all of the following characters, listed in ascending Unicode code-point order :

        • ASCII chars :

          • 012
          • @
          • A
          • Z
          • a
          • bcdefghijklmnopqrstuvwxy
          • z
          • |
        • ANSI chars :

          • ª
          • ÀÁÂÃÄÅ
          • àáâãäå
        • UNICODE chars, with code over \x{00ff}

          • ĀāĂ㥹
          • ŹźŻżŽž
          • Ǎǎ
          • Ǻǻ
          • ẠạẢảẤấẦầẨẩẪẫẬậẮắẰằẲẳẴẵẶặ

        So, in practice, to match either of your strings, ştiinţific or stiintific, use the regex :

        [[=s=]]tiin[[=t=]]ific
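
Outside Notepad++, the same kind of search can be approximated in Python. The standard `re` module has no `[[=s=]]` syntax, so this sketch swaps in a different technique: it removes combining marks after NFD normalization and compares the result with the unaccented word. The helper names `strip_diacritics` and `find_ignoring_diacritics` are hypothetical, and the result only approximates the equivalence classes above, since letters without a Unicode decomposition (such as Ⱥ) are not folded.

```python
import re
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks (accents, cedillas) after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def find_ignoring_diacritics(word: str, text: str) -> list:
    """Return the words of `text` that equal `word` once diacritics are removed."""
    target = strip_diacritics(word).casefold()
    # Compose first so that each accented letter is a single \w character.
    composed = unicodedata.normalize("NFC", text)
    return [w for w in re.findall(r"\w+", composed)
            if strip_diacritics(w).casefold() == target]
```

Calling `find_ignoring_diacritics("stiintific", text)` should return both spellings from the question wherever they occur in `text`, in their original form.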

        Best Regards,

        guy038

        • Robin Cruise

          Yes, nice answer. But it is very hard to apply, because I would need to change almost every word in every sentence :)
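
The regex does not necessarily have to be typed by hand. As a rough sketch, a small Python helper can wrap every letter of a plain word in the `[[=…=]]` form, and the result can then be pasted into the Notepad++ Find dialog in Regular expression mode. The function name `to_equivalence_regex` is hypothetical, and the sketch assumes the input word is written without diacritics.

```python
def to_equivalence_regex(word: str) -> str:
    """Wrap each letter of `word` in a [[=x=]] equivalence class.

    Regex metacharacters are backslash-escaped; any other
    non-letter character (digit, space, ...) is copied unchanged.
    """
    specials = set(r"\^$.|?*+()[]{}")
    parts = []
    for ch in word:
        if ch.isalpha():
            parts.append(f"[[={ch}=]]")
        elif ch in specials:
            parts.append("\\" + ch)
        else:
            parts.append(ch)
    return "".join(parts)


# Build the search expression for one word of the question.
print(to_equivalence_regex("stiintific"))
```

Wrapping every letter, and not only `s` and `t`, is a deliberate simplification: the helper does not need to know which letters of a given language can carry diacritics.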

          The Community of users of the Notepad++ text editor.