
    2-byte characters recently broken? Or do I misremember?

    • Jay Libove

      I was fairly sure that Notepad++ supports 2-byte characters (e.g. an “a” with an umlaut over it, “ä”). Recently, however, I’ve noticed that whenever I type such a character, save the text file in Notepad++, and then re-open the file, the ä has been replaced by a question mark (?).

      • PeterJones @Jay Libove

        @jay-libove,

        Notepad++, even the newest v8.1.9.2, handles non-ASCII characters just fine.

        You will want to check your encoding: make sure that the encoding Notepad++ has chosen matches how the file is actually encoded. For example, suppose Notepad++ thinks the file is UTF-8, but it is actually in one of the ANSI encodings (like the Windows-1252 character set). The file will then contain the single byte 0xE4 for ä. To a UTF-8 interpreter, 0xE4 says “this is the first byte of a 3-byte sequence”, but the bytes that follow are not valid UTF-8 continuation bytes, so Notepad++ sees an incomplete UTF-8 sequence, doesn’t know what to do with it, and shows a ? to indicate its reaction of “huh, what?”.
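To make the byte-level picture concrete, here is a minimal Python sketch of the mismatch described above. It is not how Notepad++ is implemented; the helper `utf8_sequence_length` and the sample word "März" are made up for illustration, and `cp1252` is Python's name for the Windows-1252 code page.

```python
def utf8_sequence_length(lead_byte: int) -> int:
    """Return the sequence length a UTF-8 lead byte announces (0 if it cannot lead).

    Simplified bit-pattern check: it does not reject lead bytes that are
    always invalid in UTF-8, such as 0xC0 or 0xF5.
    """
    if (lead_byte & 0b1000_0000) == 0b0000_0000:
        return 1  # 0xxxxxxx: plain ASCII
    if (lead_byte & 0b1110_0000) == 0b1100_0000:
        return 2  # 110xxxxx
    if (lead_byte & 0b1111_0000) == 0b1110_0000:
        return 3  # 1110xxxx
    if (lead_byte & 0b1111_1000) == 0b1111_0000:
        return 4  # 11110xxx
    return 0      # 10xxxxxx is a continuation byte, not a lead byte


# A file saved as Windows-1252 stores ä as the single byte 0xE4.
ansi_bytes = "März".encode("cp1252")

# 0xE4 is 11100100 in binary, so a UTF-8 reader expects two more bytes after it.
announced = utf8_sequence_length(0xE4)

# Strict UTF-8 decoding rejects the data, because "r" is not a continuation byte.
try:
    ansi_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Lenient decoding substitutes a replacement character for the bad byte.
lenient = ansi_bytes.decode("utf-8", errors="replace")

# Decoding with the encoding the file really uses recovers the text.
correct = ansi_bytes.decode("cp1252")

print(announced, strict_ok, repr(lenient), correct)
```

Python substitutes U+FFFD where the thread describes a ?; the point is only that the bytes on disk are fine and the interpretation is what is wrong.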

        So if you have a file that is showing ? instead of ä, look down in the status bar to see if Notepad++ thinks the file is UTF-8; the encoding is shown near the lower-right corner. If it does, try going to Encoding > ANSI and see if that now displays the file as you expect.

        • Jay Libove @PeterJones

          @peterjones Apologies, I hadn’t seen that you’d replied.
          Weirdness. The encoding is showing as “TIS-620”. (Thai …)
          If I click on Encoding->ANSI or Encoding->UTF-8 the TIS-620 in the status bar does not change.
          At the bottom left it says “Normal text file”.
          Further thoughts appreciated, thanks. (n.b. this is now Notepad++ v8.1.9.3)
          -Jay

          • Alan Kilborn @Jay Libove

            @jay-libove said in 2-byte characters recently broken? Or do I misremember?:

            The encoding is showing as “TIS-620”. (Thai …)

            It is probably your intent that the file is UTF-8?
            And you have autodetection of encoding turned on in the Preferences?
            Hmmm, there’s a known bug where UTF-8 files are detected as TIS-620 … maybe this is happening to you?
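As a rough illustration of what such a misdetection does to the displayed text, the Python sketch below decodes the UTF-8 bytes of ä as if they were TIS-620. It only shows the effect of picking the wrong encoding; it does not reproduce Notepad++'s detection logic, and the variable names are made up for this example.

```python
# ä in UTF-8 is the two-byte sequence 0xC3 0xA4.
utf8_bytes = "ä".encode("utf-8")

# TIS-620 is a single-byte encoding, so each of those two bytes
# is read as its own Thai character instead of as one ä.
as_tis620 = utf8_bytes.decode("tis_620")

# Read with the encoding that was intended, the same bytes are fine.
as_utf8 = utf8_bytes.decode("utf-8")

print(utf8_bytes, len(as_tis620), as_utf8)
```

The bytes are identical in both cases; only the encoding used to read them differs.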

            Here are some references to this bug:

            • https://github.com/notepad-plus-plus/notepad-plus-plus/issues/10916
            • https://github.com/notepad-plus-plus/notepad-plus-plus/search?q=TIS-620&type=issues

            Autodetection is not an exact science (well, it hasn’t been proven to be, anyway). I came up with a method to mitigate this bug somewhat; you may want to have a look HERE.

            Another way to “solve” this problem is to turn autodetect of encoding off. Then, with N++ settings as default, your file probably will show UTF-8 on the status bar after loading.

            @jay-libove said in 2-byte characters recently broken? Or do I misremember?:

            If I click on Encoding->ANSI or Encoding->UTF-8 the TIS-620 in the status bar does not change.

            This is because Notepad++ thinks your file is encoded as TIS-620 and you are telling it to reinterpret it (without changing it) as UTF-8. Probably the reinterpretation fails because of the corruption the bug has caused?
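One plausible way the corruption could arise, sketched in Python: if a document is treated as TIS-620 at save time, a character like ä has no TIS-620 byte and gets written as a literal ?. This is an assumption about the mechanism, not confirmed Notepad++ behaviour, and the sample word "Käse" is made up. Once a ? is on disk, reinterpreting the bytes under any encoding cannot bring the ä back.

```python
# Assumption: the editor believed the document was TIS-620 when saving
# and substituted "?" for characters that encoding cannot represent.
typed = "Käse"
saved = typed.encode("tis_620", errors="replace")

# Reinterpreting means decoding the same bytes with a different encoding.
# The saved bytes are now pure ASCII, so every choice gives the same result.
candidates = ["utf-8", "cp1252", "tis_620"]
reinterpreted = {name: saved.decode(name) for name in candidates}

for name, text in reinterpreted.items():
    print(name, text)
```

If this is what happened, the fix has to come before saving (getting the encoding right when the file is opened), because the original character is no longer in the file afterwards.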

            • Jay Libove

              Thanks very much @Alan-Kilborn
              I’ll jump into the other thread (levicki).
              -Jay
