    accented characters as hex

    • Zsolt Tomo

      When I open an XML file in Notepad++, the character É is displayed as xC9.
      When I open the same file in Notepad, the É is displayed correctly.
      How can I view the correct character É in Notepad++?

      • PeterJones @Zsolt Tomo

        @Zsolt-Tomo,

        Your encoding is wrong. The file contains the byte 0xC9, which is the encoding of É in Windows-1252 and similar single-byte encodings, but Notepad++ has interpreted the file as UTF-8. In UTF-8, the byte 0xC9 is never a complete character on its own (it can only start a two-byte sequence), which is why Notepad++ displays it as an error (xC9).
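
The mismatch described above can be reproduced outside Notepad++. The following is a minimal Python sketch, not anything Notepad++ runs internally; the variable names are made up for illustration. It decodes the single byte 0xC9 both ways and shows what the UTF-8 encoding of É looks like.

```python
raw = b"\xc9"  # the single byte found in the file

# Windows-1252 maps 0xC9 directly to U+00C9 (É).
as_cp1252 = raw.decode("cp1252")

# In UTF-8, 0xC9 can only be the lead byte of a two-byte sequence,
# so decoding it on its own fails.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# The UTF-8 encoding of É (U+00C9) is two bytes: 0xC3 0x89.
as_utf8_bytes = "\u00c9".encode("utf-8")

print(ascii(as_cp1252), utf8_ok, as_utf8_bytes)
```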

        Notice that the status bar in the lower right says UTF-8.

        So the real question is, what is your real encoding right now, and what do you want the encoding to be?

        If you don’t have a mix of Windows-1252 and UTF-8 in the same file, the easiest fix is to go to the Encoding menu and click on ANSI, so that Notepad++ re-interprets the file’s bytes using the ANSI code page (Windows-1252 on a Western-European Windows system).

        Now the lower right says ANSI.

        If you want to convert the file to UTF-8 at this point, use the Encoding menu again, but this time go all the way down to “Convert to UTF-8” (not the “UTF-8” near the top). This changes the underlying bytes of the file to valid UTF-8 while still showing the É correctly. The next time you load the file in Notepad++, it should display correctly without this extra effort.
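
For anyone who needs to do the same conversion on many files, here is a rough Python equivalent of “interpret as ANSI, then Convert to UTF-8”. It is only a sketch under the assumption that the whole file really is Windows-1252; the function names `cp1252_bytes_to_utf8` and `convert_file` are hypothetical.

```python
from pathlib import Path


def cp1252_bytes_to_utf8(data: bytes) -> bytes:
    """Re-encode bytes that are assumed to be Windows-1252 as UTF-8."""
    # Decoding raises UnicodeDecodeError for the few byte values that
    # Windows-1252 leaves unassigned (0x81, 0x8D, 0x8F, 0x90, 0x9D).
    text = data.decode("cp1252")
    return text.encode("utf-8")


def convert_file(src: Path, dst: Path) -> None:
    """Read src as Windows-1252 and write the same text to dst as UTF-8."""
    dst.write_bytes(cp1252_bytes_to_utf8(src.read_bytes()))
```

If the XML file declares its encoding in the first line, that declaration has to be updated to match after converting.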

        If you’ve got a mix of character encodings in the same file, it will be harder to help you. We’d have to show you a fancy regex, but to do that, we’d need a better idea of what the mix of properly-encoded and wrongly-encoded characters was.
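
As an alternative to a regex, a mixed file can sometimes be repaired with a script. The sketch below is a heuristic, not a general solution: it decodes as UTF-8 and falls back to Windows-1252 only for bytes that are not valid UTF-8. The handler name `cp1252_fallback` and the function `decode_mixed` are made up for this example.

```python
import codecs


def _cp1252_fallback(err):
    """Error handler: decode the offending bytes as Windows-1252 instead."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    # Resume decoding right after the bytes that were not valid UTF-8.
    return bad.decode("cp1252", errors="replace"), err.end


codecs.register_error("cp1252_fallback", _cp1252_fallback)


def decode_mixed(data: bytes) -> str:
    """Decode as UTF-8, treating invalid bytes as Windows-1252."""
    return data.decode("utf-8", errors="cp1252_fallback")


# A lone 0xC9 (Windows-1252) next to a proper UTF-8 É (0xC3 0x89).
print(ascii(decode_mixed(b"\xc9 and \xc3\x89")))
```

The limitation is that a Windows-1252 byte pair which happens to form valid UTF-8 is read as UTF-8, so the result still needs to be checked by eye.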

        • gerdb42 @Zsolt Tomo

          @Zsolt-Tomo,

          Your XML file should have an XML declaration (prolog) in the first line that states the encoding to use:
          <?xml version="1.0" encoding="Windows-1252"?>
          If this line is missing, UTF-8 is assumed (absent a byte-order mark), since that is the default for XML files. If the declaration is present, the XML lexer will set the document encoding accordingly.
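
The effect of the declaration can be checked with any XML parser. The sketch below uses Python’s standard `xml.etree.ElementTree`; it illustrates how a parser treats the declaration and is not a description of Notepad++ internals. The element name `name` is arbitrary.

```python
import xml.etree.ElementTree as ET

# Same byte 0xC9 in both documents; only the first declares its encoding.
declared = b'<?xml version="1.0" encoding="Windows-1252"?><name>\xc9</name>'
undeclared = b"<name>\xc9</name>"

# With the declaration, the parser decodes 0xC9 as É (U+00C9).
text = ET.fromstring(declared).text

# Without it, the parser assumes UTF-8 and rejects the lone 0xC9.
try:
    ET.fromstring(undeclared)
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

print(ascii(text), undeclared_ok)
```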
