Community

    How to let NP++ auto-detect UTF-8 encoding correctly?

    Help wanted
    • Ben S.

      I exported a WhatsApp chat to a text file and transferred it from an Android smartphone to a Windows 7 PC.

      When I load it into NP++, it is detected as ANSI encoded.
      However, all German umlauts are displayed incorrectly.
      When I manually switch the encoding to UTF-8, everything is fine.

      How can I tell NP++ to AUTOMATICALLY detect all such text files as UTF-8 encoded in the future?

      When I apply “Convert to UTF-8”, the German umlauts are NOT converted. They remain as wrong characters.
      So this does not work.

      • SalviaSage

        Try turning on this option.
        https://i.imgur.com/w1996uF.png

        • Ben S.

          Thank you for the suggestion, but it does NOT help.

          Surprisingly, the file is still identified as ANSI (as can be seen in the lower right part of the status bar).

          BTW: The file's line endings are identified as Unix (LF), if that helps.

          So is there any other suggestion?

          • dinkumoil

            @Ben-S

            Automatic encoding detection is a difficult and unreliable thing. The algorithms work heuristically by inspecting the file’s content and can fail under some circumstances.
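
            To illustrate why such heuristics can fail, here is a minimal Python sketch of a "BOM first, then validity check" guess. It is not the algorithm Notepad++ uses; the function name guess_encoding and its return labels are made up for this example, and "ANSI" is assumed to mean Windows-1252.

```python
def guess_encoding(data: bytes) -> str:
    """Rough guess at a text file's encoding (illustrative labels only)."""
    # A UTF-8 byte order mark (EF BB BF) is the only unambiguous signal.
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8-bom"
    # Pure ASCII bytes read the same in ANSI and in UTF-8.
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    # Non-ASCII bytes that form valid UTF-8 sequences are probably UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: anything else is treated as ANSI (Windows-1252).
        return "ansi"


if __name__ == "__main__":
    utf8_bytes = "K\u00e4se".encode("utf-8")  # "Käse" stored as UTF-8
    print(guess_encoding(utf8_bytes))
    # The same bytes viewed as Windows-1252 show the typical wrong characters.
    print(utf8_bytes.decode("cp1252"))
    # Ambiguous case: Windows-1252 text whose bytes are also valid UTF-8.
    print(guess_encoding("\u00c3\u00a4".encode("cp1252")))
```

            The last call shows the weak spot: the two bytes C3 A4 are valid both as the UTF-8 character ä and as the Windows-1252 text Ã¤, so a content check alone cannot tell which one was meant.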

            If your files share a distinctive file extension, you could use my AutoCodepage plugin, available via the Notepad++ PluginManager.

            Otherwise, there is the following workaround:

            1. Open Windows Notepad.

            2. Press and hold the ALT key and type 0239 on the numeric keypad.

            3. Press and hold the ALT key and type 0187 on the numeric keypad.

            4. Press and hold the ALT key and type 0191 on the numeric keypad.

            5. Save the file as Header.txt in the folder where your file is stored. Do not press ENTER before saving, and keep the encoding set to ANSI so that the three characters are written as the bytes EF BB BF.

            6. Open a Windows console and navigate to the folder where your file and the newly created Header.txt are stored.

            7. Execute the following command:

              copy /b "Header.txt" + "<Name-of-your-file>" "Result.txt"

            This sequence adds a UTF-8 byte order mark (BOM) to the beginning of your file and stores the result under the name Result.txt. When you open this file in Notepad++, it should be recognized as UTF-8 encoded.
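
            If Python happens to be installed, the same result can be sketched without creating Header.txt by hand. This is only an illustration under stated assumptions: the function name add_utf8_bom and the input file name chat.txt are hypothetical, and the file is assumed to already contain valid UTF-8.

```python
import codecs
from pathlib import Path


def add_utf8_bom(source: Path, target: Path) -> bool:
    """Copy source to target with a UTF-8 BOM in front.

    Returns True if a BOM was added, False if one was already present.
    """
    data = source.read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        # Already marked as UTF-8; copy the bytes unchanged.
        target.write_bytes(data)
        return False
    # Raises UnicodeDecodeError if the content is not valid UTF-8,
    # in which case prepending a BOM would only mislabel the file.
    data.decode("utf-8")
    target.write_bytes(codecs.BOM_UTF8 + data)
    return True


if __name__ == "__main__":
    # "chat.txt" is a placeholder for the exported chat file.
    if Path("chat.txt").exists():
        add_utf8_bom(Path("chat.txt"), Path("Result.txt"))
```

            Like the copy /b command, this leaves the original file untouched and writes the marked copy to Result.txt.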
