How to let NP++ auto-detect UTF-8 encoding correctly?

  • Ben S.
    last edited by Sep 15, 2018, 12:46 PM

    I exported a WhatsApp chat to a text file and transferred it from an Android smartphone to a Windows 7 PC.

    When I load it into NP++, it is detected as ANSI-encoded.
    However, all German umlauts are displayed incorrectly.
    When I manually switch the encoding to UTF-8, everything is fine.

    How can I tell NP++ to AUTOMATICALLY detect all such text files in the future as UTF-8 encoded?

    When I apply “Convert to UTF-8”, the German umlauts are NOT converted. They remain as wrong characters.
    So this does not work.
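A minimal sketch of why converting does not help, assuming the ANSI code page is Windows-1252 and that “Convert to UTF-8” re-encodes the characters as they are currently displayed (that is my understanding of the menu item, not something confirmed in this thread). The sample text is made up for illustration.

```python
# Illustrative sample text; any string with umlauts behaves the same way.
original = "Grüße"

# The bytes as stored in the exported file (UTF-8 without a BOM).
utf8_bytes = original.encode("utf-8")

# What the editor displays when it reads those bytes as ANSI.
# Assumption: the ANSI code page is Windows-1252 (Western European).
misread = utf8_bytes.decode("cp1252")

# "Convert to UTF-8" analogue: re-encode the already misread characters.
# The wrong characters are preserved, just stored with different bytes.
converted = misread.encode("utf-8")

# Switching the encoding manually: keep the bytes, change the interpretation.
reinterpreted = utf8_bytes.decode("utf-8")

print(misread != original)                   # True: umlauts are shown wrong
print(converted.decode("utf-8") == misread)  # True: still wrong after converting
print(reinterpreted == original)             # True: correct once read as UTF-8
```

So the bytes in the file are fine; only the interpretation chosen when opening it is wrong.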

    • SalviaSage
      last edited by SalviaSage Sep 15, 2018, 3:47 PM

      Try turning on this option.
      https://i.imgur.com/w1996uF.png

      • Ben S.
        last edited by Sep 19, 2018, 11:07 AM

        Thank you for the suggestion, but it does NOT help.

        Surprisingly, the file is still identified as ANSI (as can be seen in the lower-right part of the status bar).

        BTW: The file's line endings are identified as Unix (LF), if that helps.

        So is there any other suggestion?

        • dinkumoil
          last edited by dinkumoil Sep 19, 2018, 11:49 AM Sep 19, 2018, 11:46 AM

          @Ben-S

          Automatic encoding detection is a difficult and unreliable thing. The algorithms work heuristically by inspecting the file’s content and can fail under some circumstances.
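To illustrate what such a heuristic can look like, here is a minimal sketch. It is not the algorithm Notepad++ uses; the function name `guess_encoding` and the fallback to Windows-1252 are assumptions made for this example.

```python
import codecs


def guess_encoding(data: bytes) -> str:
    """Guess a text encoding from raw bytes (hypothetical helper)."""
    # A BOM at the start is the only unambiguous hint.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # Without a BOM, check whether the bytes form valid UTF-8.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything else is treated as ANSI (Windows-1252).
        return "cp1252"
    return "utf-8"


print(guess_encoding("Grüße".encode("utf-8")))   # utf-8
print(guess_encoding("Grüße".encode("cp1252")))  # cp1252
```

Even this simple check can guess wrong: some ANSI byte sequences happen to be valid UTF-8 as well, and a file containing only ASCII characters looks identical in both encodings.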

          If your file names have a special file extension, you could use my AutoCodepage plugin, available via the Notepad++ PluginManager.

          Otherwise, there is the following workaround:

          1. Open Windows Notepad.

          2. Press and hold the ALT key and type the sequence 0239 on the numeric keypad.

          3. Press and hold the ALT key and type the sequence 0187 on the numeric keypad.

          4. Press and hold the ALT key and type the sequence 0191 on the numeric keypad.

          5. Save the file under the name Header.txt in the folder where your file is stored, but do not press ENTER before saving.

          6. Open a Windows console and navigate to the folder where your file and the newly created Header.txt are stored.

          7. Execute the following command:

            copy /b "Header.txt" + "<Name-of-your-file>" "Result.txt"

          With this sequence you will add a UTF-8 Byte Order Mark (BOM) to the beginning of your file and store it under the name Result.txt. When you open this file in Notepad++, it should be recognized as UTF-8 encoded.
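The same result can be sketched in Python, if it happens to be installed. The three ALT codes 0239, 0187 and 0191 correspond to the bytes EF BB BF, which Python exposes as `codecs.BOM_UTF8`. The helper names and the file names in the usage note are hypothetical and only serve this example.

```python
import codecs
from pathlib import Path


def with_utf8_bom(data: bytes) -> bytes:
    """Return data with a UTF-8 BOM prepended, unless one is already there."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


def add_utf8_bom(source: Path, target: Path) -> None:
    """Copy source to target byte for byte, prepending a UTF-8 BOM."""
    target.write_bytes(with_utf8_bom(source.read_bytes()))


# The BOM is exactly the three bytes typed via ALT+0239, ALT+0187, ALT+0191.
print(codecs.BOM_UTF8 == bytes([239, 187, 191]))  # True
```

For example, `add_utf8_bom(Path("chat.txt"), Path("Result.txt"))` would play the role of the `copy /b` command, with `chat.txt` standing in for the exported file. Unlike the manual approach, it does not add a second BOM if the file already has one.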
