accented characters as hex

    Zsolt Tomo
    last edited by Nov 8, 2022, 11:20 PM

    When I open an XML file in Notepad++, the character É is displayed as xC9.
    When I open the same file in Notepad, the character É is displayed correctly.
    How can I view the correct character É in Notepad++?

      PeterJones @Zsolt Tomo
      posted Nov 8, 2022, 11:52 PM, last edited by PeterJones Nov 8, 2022, 11:53 PM

      @Zsolt-Tomo,

      Your encoding is wrong. You have the byte 0xC9, which is the encoding for É in Windows-1252 and similar encodings, but Notepad++ has interpreted the file as UTF-8. UTF-8 has no character that is encoded solely as 0xC9 (in UTF-8 that byte can only begin a two-byte sequence), which is why Notepad++ displays the byte as an error, xC9.

      Notice that the status bar at the lower right says UTF-8.
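
If it helps to see the mismatch outside of Notepad++, here is a minimal Python sketch of the same situation. It is only an illustration (the variable names are made up for this example): the same byte decodes cleanly as Windows-1252 but is rejected by a strict UTF-8 decoder.

```python
raw = b"\xc9"  # the single byte that is stored in the file

# Interpreted as Windows-1252, the byte is one complete character.
as_cp1252 = raw.decode("cp1252")

# Interpreted as UTF-8, 0xC9 only starts a two-byte sequence,
# so a lone 0xC9 cannot be decoded.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# The correct UTF-8 encoding of the same character uses two bytes.
as_utf8_bytes = "É".encode("utf-8")

print(repr(as_cp1252), utf8_ok, as_utf8_bytes.hex())
```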

      So the real question is, what is your real encoding right now, and what do you want the encoding to be?

      If you don’t have a mix of Windows-1252 and UTF-8 in the same file, the easiest fix is to go to the Encoding menu and click on ANSI, so that Notepad++ will re-interpret the file as Windows-1252 (ANSI), which is its correct encoding.

      Now the lower right says ANSI.

      If you want to convert the file to UTF-8 at this point, use the Encoding menu again, but this time go all the way down to “Convert to UTF-8” (not the “UTF-8” near the top).
      This will then change the underlying bytes of the file to correctly use UTF-8, but will still properly show the É. And next time you load it in Notepad++, it should be correct when you load, without this extra effort.
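
For anyone who needs to do the same conversion on many files, the "change the underlying bytes" step can be sketched in Python. This is not what Notepad++ does internally, just a rough equivalent; the function name `convert_cp1252_to_utf8` and the file names are hypothetical, and it assumes the whole file really is Windows-1252.

```python
from pathlib import Path


def convert_cp1252_to_utf8(src: str, dst: str) -> None:
    """Re-encode a Windows-1252 file as UTF-8 (hypothetical helper)."""
    # Decode the original bytes using the encoding they were written in.
    # A strict decode raises if the file contains bytes that are
    # undefined in Windows-1252.
    text = Path(src).read_bytes().decode("cp1252")
    # Write the same characters back out as UTF-8 bytes (no BOM).
    Path(dst).write_bytes(text.encode("utf-8"))


if __name__ == "__main__":
    # Example file names are assumptions for illustration only.
    Path("input.xml").write_bytes(b"<name>\xc9mile</name>")
    convert_cp1252_to_utf8("input.xml", "output.xml")
    print(Path("output.xml").read_bytes())
```

If the file has an XML prolog that names Windows-1252, that declaration would need to be updated as well after converting.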

      If you’ve got a mix of character encodings in the same file, it will be harder to help you. We’d have to show you a fancy regex, but to do that, we’d need a better idea of what the mix of properly-encoded and wrongly-encoded characters was.
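
As one possible approach outside Notepad++ (not the regex mentioned above), a mixed file can sometimes be repaired by decoding as UTF-8 and falling back to Windows-1252 only for the bytes that are not valid UTF-8. The sketch below assumes that is the kind of mix you have; the handler name `cp1252_fallback` and the function `decode_mixed` are made up for this example.

```python
import codecs


def cp1252_fallback(err):
    """Error handler: decode the offending bytes as Windows-1252."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    # errors="replace" covers the few bytes Windows-1252 leaves undefined.
    return bad.decode("cp1252", errors="replace"), err.end


codecs.register_error("cp1252_fallback", cp1252_fallback)


def decode_mixed(data: bytes) -> str:
    """Decode valid UTF-8 as UTF-8 and everything else as Windows-1252."""
    return data.decode("utf-8", errors="cp1252_fallback")


if __name__ == "__main__":
    # A correctly encoded É followed by a lone Windows-1252 É.
    mixed = "É".encode("utf-8") + b" and \xc9"
    print(decode_mixed(mixed))
```

This is only a heuristic: a run of Windows-1252 characters whose bytes happen to form a valid UTF-8 sequence would be read as UTF-8, so the result should be checked by eye.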

        gerdb42 @Zsolt Tomo
        last edited by Nov 9, 2022, 7:58 AM

        @Zsolt-Tomo,

        Your XML file should have an XML prolog on the first line that states the encoding to use:
        <?xml version="1.0" encoding="Windows-1252"?>
        If this line is missing, UTF-8 will be assumed, since that is the default for XML files. If the prolog is present, the XML lexer will set the document encoding accordingly.
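
To illustrate the rule described here (use the declared encoding, otherwise assume UTF-8), this is a rough Python sketch. It is not how Notepad++ implements it; `declared_encoding` and `decode_xml` are hypothetical names, and the sketch assumes an ASCII-compatible encoding with no byte order mark.

```python
import re

# Matches an XML prolog at the very start of the data and captures the
# value of its encoding attribute. Assumes no BOM and an ASCII-compatible
# encoding, so the prolog itself can be read byte by byte.
_PROLOG = re.compile(
    rb'^<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the prolog, or the default if absent."""
    match = _PROLOG.match(data)
    return match.group(1).decode("ascii") if match else default


def decode_xml(data: bytes) -> str:
    """Decode the bytes using the encoding the prolog declares."""
    return data.decode(declared_encoding(data))


if __name__ == "__main__":
    doc = b'<?xml version="1.0" encoding="Windows-1252"?><name>\xc9mile</name>'
    print(declared_encoding(doc))
    print(decode_xml(doc))
```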
