Simple test file is losing its encoding :(

  • S
    Scott Burkett
    last edited by Jan 10, 2018, 8:24 AM

    Egads! I have a problem that just cropped up - no idea why it is happening all of a sudden, as it has been working fine for years!

    Simple test case, I have a bunch of text surrounded by some color replacement tokens.


    §YThis is a ‘test’!§W
    §YThis is a ‘test’!§W
    §YThis is a ‘test’!§W
    §YThis is a ‘test’!§W
    §YThis is a ‘test’!§W
    §YThis is a ‘test’!§W

    File is initially created in ANSI encoding, and saved that way. Reopened, it seems to “lose” its encoding, and now looks like this!


    即This is a 'test’劬
    即This is a 'test’劬
    即This is a 'test’劬
    即This is a 'test’劬
    即This is a 'test’劬

    And the encoding is gone. :( Any ideas? I’ve tried all manner of things to try and stop it. Running the latest version of N++. Thanks in advance for any guidance!

    • D
      decoderman
      last edited by Jan 10, 2018, 8:39 AM

      If you have a backup of the file, make a copy of it and open the copy in NPP.
      You can then select Encoding > Convert to … to change the file’s encoding without changing its content.

      Also, check the default encoding under Settings > Preferences, in the “New Document” section on the right side.
      The format and encoding set there apply to new files only, though.

      • S
        Scott Burkett
        last edited by Jan 10, 2018, 11:08 AM

        I have a backup of the file, that isn’t a problem. The issue is that every time I make changes to it, save it, then reload it, it gets wonky!

        • P
          PeterJones
          last edited by Jan 10, 2018, 2:50 PM

          I could not replicate that. If I create a new file, choose Encoding > Encode in ANSI, paste in the text copied from the first half of your post, and save, then when I reload the file (even if I rename it), it loads the same as before and still claims to be encoded in ANSI, both in the Encoding menu and in the lower-right of the NPP status bar.

          Using the gnuwin32 copy of hexdump, I see the following:

          C:>hexdump 15054-renamed.txt
          00000000: A7 59 54 68 69 73 20 69 - 73 20 61 20 91 74 65 73 | YThis is a  tes|
          00000010: 74 92 21 A7 57 0D 0A A7 - 59 54 68 69 73 20 69 73 |t ! W   YThis is|
          00000020: 20 61 20 91 74 65 73 74 - 92 21 A7 57 0D 0A A7 59 | a  test ! W   Y|
          00000030: 54 68 69 73 20 69 73 20 - 61 20 91 74 65 73 74 92 |This is a  test |
          00000040: 21 A7 57 0D 0A A7 59 54 - 68 69 73 20 69 73 20 61 |! W   YThis is a|
          00000050: 20 91 74 65 73 74 92 21 - A7 57 0D 0A A7 59 54 68 |  test ! W   YTh|
          00000060: 69 73 20 69 73 20 61 20 - 91 74 65 73 74 92 21 A7 |is is a  test ! |
          00000070: 57 0D 0A A7 59 54 68 69 - 73 20 69 73 20 61 20 91 |W   YThis is a  |
          00000080: 74 65 73 74 92 21 A7 57 -                         |test ! W|
          00000088;
          

          (I also get similar using the xxd.exe that ships with VIM for Windows.)

          That’s exactly what I’d expect to see for ANSI encoding of that file.
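
As a cross-check, a short Python sketch reproduces the first 21 bytes of that dump. It assumes that “ANSI” on this machine means Windows-1252 (other systems may use a different code page), and the variable names are only illustrative.

```python
# One line of the sample text from the original post.
line = "§YThis is a ‘test’!§W"

# Assumption: "ANSI" here is the Windows-1252 code page (cp1252).
ansi_bytes = line.encode("cp1252")

# In a single-byte code page every character becomes exactly one byte,
# so the character count and the byte count are the same.
print(len(line), len(ansi_bytes))

# Print the bytes in the same style as the hexdump above.
print(ansi_bytes.hex(" ").upper())
```

The § comes out as the single byte A7, and the smart quotes as 91 and 92, which matches the start of the dump.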

          But note: if I instead create the file with File > New while my default is Encoding > Encode in UTF-8, paste the text into that new file, and then incorrectly do Encoding > Encode in ANSI after pasting, the high-bit characters (the § and the smart quotes) turn into multi-byte sequences that look like Â§Y and similar. If I save it, however, and reload (same name), it comes back in as UTF-8 again and looks right again. Here is the hexdump of that file:

          C:>hexdump 15054-wrong.txt
          00000000: A7 59 54 68 69 73 20 69 - 73 20 61 20 91 74 65 73 | YThis is a  tes|
          00000010: 74 92 21 A7 57 0D 0A A7 - 59 54 68 69 73 20 69 73 |t ! W   YThis is|
          00000020: 20 61 20 91 74 65 73 74 - 92 21 A7 57 0D 0A A7 59 | a  test ! W   Y|
          00000030: 54 68 69 73 20 69 73 20 - 61 20 91 74 65 73 74 92 |This is a  test |
          00000040: 21 A7 57 0D 0A A7 59 54 - 68 69 73 20 69 73 20 61 |! W   YThis is a|
          00000050: 20 91 74 65 73 74 92 21 - A7 57 0D 0A A7 59 54 68 |  test ! W   YTh|
          00000060: 69 73 20 69 73 20 61 20 - 91 74 65 73 74 92 21 A7 |is is a  test ! |
          00000070: 57 0D 0A A7 59 54 68 69 - 73 20 69 73 20 61 20 91 |W   YThis is a  |
          00000080: 74 65 73 74 92 21 A7 57 -                         |test ! W|
          00000088;
          

          That’s exactly what I’d expect for the UTF-8 encoding of those characters.
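
For comparison, here is a minimal Python sketch of what UTF-8 does to the same sample line, and of how UTF-8 bytes look when they are misread as ANSI. It again assumes that “ANSI” means Windows-1252; the variable names are only illustrative.

```python
# One line of the sample text from the original post.
line = "§YThis is a ‘test’!§W"

# In UTF-8 the § takes two bytes (C2 A7) and each smart quote takes three
# (E2 80 98 and E2 80 99), so the encoded line is longer than 21 bytes.
utf8_bytes = line.encode("utf-8")
print(len(utf8_bytes))
print(utf8_bytes.hex(" ").upper())

# Misreading UTF-8 bytes as Windows-1252 (assumed to be the "ANSI" code
# page) turns each byte into its own character.
marker = "§Y".encode("utf-8")
print(marker.decode("cp1252"))  # Â§Y
```

Seeing C2 A7 where the ANSI file has a lone A7 is a quick way to tell which encoding a file was actually saved in.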

          I can change my 15054-renamed.txt inside NPP to my heart’s content, save it, and reload it, and it still preserves the essential ANSI encoding, and it continues to behave properly on reload.

          For your copy of the file that has the encoding problem, what does hexdump (or similar tool) show?
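
If you don’t have hexdump or xxd handy, a rough Python stand-in is sketched below. The `hexdump` function here is a hypothetical helper written for this post, not the gnuwin32 tool, and its output format only approximates the dumps above.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Return a simple hex + ASCII dump of data, `width` bytes per row."""
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Two-digit uppercase hex for every byte in this row.
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Printable ASCII is shown as-is; everything else becomes a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Pad the hex column so the ASCII column lines up on short rows.
        rows.append(f"{offset:08X}: {hex_part:<{width * 3 - 1}} |{text_part}|")
    return "\n".join(rows)


# Demo on the first few bytes of the ANSI sample file.
print(hexdump(bytes.fromhex("A7 59 54 68 69 73")))
```

To dump a real file, read it in binary mode and pass the bytes in, for example `hexdump(open(path, "rb").read())`, where `path` is whatever your problem file is called.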

          • C
            chcg
            last edited by Jan 10, 2018, 10:23 PM

            Maybe related to this one https://github.com/notepad-plus-plus/notepad-plus-plus/issues/3188 .
