
    Copy buffer converts NUL to SPACE

    • PeterJones

      I was trying to take a binary file into a hex-edit-type mode, edit the hex strings, and convert it back to a binary.

      I started with the HEX-Editor plugin (0.9.5), but it wouldn’t let me paste into the hex side, and would start crashing NPP (6.8.6) when I’d do something silly like File | New.

      If I Select All, then use TextFX > TextFX Convert > Convert Text to Hex-16, I noticed that all NUL characters converted from 00 to 20.

      At first, I thought it was a bug with TextFX, so I tried with Plugins > Converter > ASCII to HEX, but it does the same thing.

      I made a test case: first, I started with an artificially-generated Hex-16:

      "000000000  00 01 02 03 04 05 06 07-08 09 0a 0b 0c 0d 0e 0f   |0123456789abcdef|"
      "000000010  10 11 12 13 14 15 16 17-18 19 1a 1b 1c 1d 1e 1f   |0123456789abcdef|"
      "000000020  20 21 22 23 24 25 26 27-28 29 2a 2b 2c 2d 2e 2f   |0123456789abcdef|"
      "000000030  30 31 32 33 34 35 36 37-38 39 3a 3b 3c 3d 3e 3f   |0123456789abcdef|"
      "000000040  40 41 42 43 44 45 46 47-48 49 4a 4b 4c 4d 4e 4f   |0123456789abcdef|"
      "000000050  50 51 52 53 54 55 56 57-58 59 5a 5b 5c 5d 5e 5f   |0123456789abcdef|"
      "000000060  60 61 62 63 64 65 66 67-68 69 6a 6b 6c 6d 6e 6f   |0123456789abcdef|"
      "000000070  70 71 72 73 74 75 76 77-78 79 7a 7b 7c 7d 7e 7f   |0123456789abcdef|"
      "000000080  54 68 65 20 71 75 69 63-6B 20 62 7b 6F 77 6E 20   |The quick brown |"
      "000000090  66 6F 78 00 41 6E 6F 74-68 65 72 20 73 69 6C 6C   |fox Another sill|"
      "0000000A0  79 20 73 74 72 69 6E 67-00 59 65 74 20 61 6E 6F   |y string Yet ano|"
      "0000000B0  74 68 65 72 20 73 74 72-69 6E 67 20 00 00 00 00   |ther string     |"
      "0000000C0  00 01 02 03 04 05 06 07-08 09 0a 0b 0c 0d 0e 0f   |0123456789abcdef|"
      "0000000D0  10 11 12 13 14 15 16 17-18 19 1a 1b 1c 1d 1e 1f   |0123456789abcdef|"
      "0000000E0  20 21 22 23 24 25 26 27-28 29 2a 2b 2c 2d 2e 2f   |0123456789abcdef|"
      "0000000F0  30 31 32 33 34 35 36 37-38 39 3a 3b 3c 3d 3e 3f   |0123456789abcdef|"
      "000000100  40 41 42 43 44 45 46 47-48 49 4a 4b 4c 4d 4e 4f   |0123456789abcdef|"
      "000000110  50 51 52 53 54 55 56 57-58 59 5a 5b 5c 5d 5e 5f   |0123456789abcdef|"
      "000000120  60 61 62 63 64 65 66 67-68 69 6a 6b 6c 6d 6e 6f   |0123456789abcdef|"
      "000000130  70 71 72 73 74 75 76 77-78 79 7a 7b 7c 7d 7e 7f   |0123456789abcdef|"
      

      Then I select all and run TextFX > TextFX Convert > Convert Hex to Text. This does what I would expect: when I show all characters, I can see that it properly creates the NUL character throughout.
      So then I try to reverse the process: Select All, TextFX > TextFX Convert > Convert Text to Hex-16, and I get

      "000000000  20 01 02 03 04 05 06 07-08 09 0A 0B 0C 0D 0E 0F   | xxxxxxxx..xx.xx|"
      "000000010  10 11 12 13 14 15 16 17-18 19 1A 1B 1C 1D 1E 1F   |xxxxxxxxxxxxxxxx|"
      ...
      "000000080  54 68 65 20 71 75 69 63-6B 20 62 7B 6F 77 6E 20   |The quick b{own |"
      "000000090  66 6F 78 20 41 6E 6F 74-68 65 72 20 73 69 6C 6C   |fox Another sill|"
      "0000000A0  79 20 73 74 72 69 6E 67-20 59 65 74 20 61 6E 6F   |y string Yet ano|"
      "0000000B0  74 68 65 72 20 73 74 72-69 6E 67 20 20 20 20 20   |ther string     |"
      

      … where all the NUL characters became 20 instead of 00 (I edited out the {SOH}{STX}{...} characters and converted them to x’s in the |...|-delimited section on the right for pasting into the forum, but those characters were all correct.)
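For comparison, here is a minimal Python sketch of the same round trip done outside the editor, where NUL stays 00. It is not what TextFX does internally; the function names `parse_hex16` and `format_hex16` are made up for illustration, and the layout (9-digit offset, dash after the eighth byte, `|...|` text column) is only assumed from the dump lines quoted above.

```python
def parse_hex16(dump: str) -> bytes:
    """Turn Hex-16 style dump lines back into raw bytes (hypothetical helper)."""
    out = bytearray()
    for line in dump.splitlines():
        line = line.strip().strip('"')
        if not line:
            continue
        # Keep only the hex pairs: drop the |...| text column, then the offset column.
        hex_part = line.split("|", 1)[0].split(None, 1)[1]
        out += bytes.fromhex(hex_part.replace("-", " "))
    return bytes(out)


def format_hex16(data: bytes) -> str:
    """Format raw bytes as Hex-16 style dump lines (hypothetical helper)."""
    lines = []
    for offset in range(0, len(data), 16):
        chunk = data[offset:offset + 16]
        pairs = [f"{b:02X}" for b in chunk]
        hex_part = " ".join(pairs[:8])
        if len(pairs) > 8:
            hex_part += "-" + " ".join(pairs[8:])
        # Non-printable bytes, NUL included, become '.' in the text column only;
        # the hex column always keeps the real byte value.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:09X}  {hex_part:<47}   |{text}|")
    return "\n".join(lines)
```

Calling `format_hex16(parse_hex16(dump))` on the lines above should keep `00` wherever the input had `00`, because the data never passes through a text clipboard.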

      If I follow the same procedure, but with Plugins > Converter > ASCII to HEX, I get

      200102030405060708090A0B0C0D0E0F
      101112131415161718191A1B1C1D1E1F
      202122232425262728292A2B2C2D2E2F
      303132333435363738393A3B3C3D3E3F
      404142434445464748494A4B4C4D4E4F
      505152535455565758595A5B5C5D5E5F
      606162636465666768696A6B6C6D6E6F
      707172737475767778797A7B7C7D7E7F
      54686520717569636B20627B6F776E20
      666F7820416E6F746865722073696C6C
      7920737472696E672059657420616E6F
      7468657220737472696E672020202020
      200102030405060708090A0B0C0D0E0F
      101112131415161718191A1B1C1D1E1F
      202122232425262728292A2B2C2D2E2F
      303132333435363738393A3B3C3D3E3F
      404142434445464748494A4B4C4D4E4F
      505152535455565758595A5B5C5D5E5F
      606162636465666768696A6B6C6D6E6F
      707172737475767778797A7B7C7D7E7F
      

      … which once again shows 20 instead of 00 for NUL characters.

      I then took the binary version and simply copied and pasted it into another file: most of the control characters stayed correct, but NUL became a space there, too.

      At this point, I assume that the act of copying text into the copy-buffer (which I assume both TextFX Convert and Converter use internally) converts NUL to SPACE. Is there a setting which changes that behavior for the copy buffer, or any way I can “protect” my NUL characters when doing the conversion? I need this because I cannot just assume that every 20 in the resulting hex dump is really a NUL rather than a SPACE.
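To make the ambiguity concrete, here is a small Python sketch. The function `nul_to_space` is only a stand-in for the behavior observed above, not Notepad++'s or Scintilla's actual code. It shows that two different inputs produce the same bytes once NUL has been turned into SPACE, so the original cannot be recovered from the hex dump alone.

```python
def nul_to_space(data: bytes) -> bytes:
    # Stand-in for the observed copy-buffer behavior (an assumption, not real editor code).
    return data.replace(b"\x00", b" ")


with_nul = b"fox\x00Another"
with_space = b"fox Another"

print(with_nul.hex(" "))                # hex of the original, NUL still 00
print(nul_to_space(with_nul).hex(" "))  # hex after the lossy step, NUL now 20
# Both inputs collapse to the same bytes, so the step cannot be undone.
print(nul_to_space(with_nul) == nul_to_space(with_space))
```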

      • PeterJones @PeterJones

        TL;DR?

        Short Version: When I copy a string that includes a NUL character, it appears to be translated into a SPACE before scripts or plugins get access to the buffer. Is there any way to turn off this behavior, so that the NUL remains present in the copy buffer?

        • Claudia Frank @PeterJones

          Hello PeterCJ-AtWork,

          This seems to be a Scintilla feature that was introduced in 2013, as discussed here.

          So, using npp for manipulating binary data doesn’t seem to work.

          Cheers
          Claudia

          • PeterJones

            Thanks for the info. I guess I won’t be using my otherwise favorite editor for binary edits.

            • guy038

              Hello, PeterCJ AtWork,

              I think that I found a workaround for your problem :-))

              • First, enable (check) the menu option TextFX - TextFX Viz Settings - Viz Paste/Append binary

              • Now, open the binary file and select any part of its contents

              • Select the menu option TextFX - TextFX Viz - Copy Visible Selection

              • Open a new tab (Ctrl+N)

              • If necessary, select the same encoding as the binary source text

              • Select the last menu option, TextFX - TextFX Viz - Paste

              At first sight, you might think that this is all you need!

              Unfortunately, ALL the LF (\x0A) and CR (\x0D) characters are wiped out of the pasted text :-((

              Quite weird! I did some tests, but I could NOT find a setting or method that allows copying both NUL and EOL characters.


              So, we have to cheat a bit! Before copying the selection with the TextFX command:

              • In the binary source text, replace every \n character with a temporary string (one or more characters long) that does NOT occur anywhere in the binary text

              • In the binary source text, replace every \r character with a different temporary string (one or more characters long) that also does NOT occur anywhere in the binary text

              • Once the paste is done, use simple regexes to change the temporary strings back into \n and \r, which restores the original text :-))
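The placeholder trick above can be sketched in Python as follows. This only illustrates the idea on raw bytes and is not a script for Notepad++ itself; the helper names and the `<<EOLn>>` marker format are invented for the example. It picks two marker strings that do not occur in the data, swaps them in for \n and \r, and swaps them back afterwards.

```python
def pick_placeholder(data: bytes, taken: set) -> bytes:
    """Return a marker that occurs nowhere in data and is not already in use."""
    n = 0
    while True:
        candidate = f"<<EOL{n}>>".encode("ascii")  # hypothetical marker format
        if candidate not in data and candidate not in taken:
            return candidate
        n += 1


def protect_eols(data: bytes):
    """Replace LF and CR with two distinct markers before the copy step."""
    lf_mark = pick_placeholder(data, set())
    cr_mark = pick_placeholder(data, {lf_mark})
    protected = data.replace(b"\n", lf_mark).replace(b"\r", cr_mark)
    return protected, lf_mark, cr_mark


def restore_eols(protected: bytes, lf_mark: bytes, cr_mark: bytes) -> bytes:
    """Undo protect_eols after the paste step."""
    return protected.replace(lf_mark, b"\n").replace(cr_mark, b"\r")


original = b"line1\r\nfox\x00Another\nend\r"
protected, lf_mark, cr_mark = protect_eols(original)
# 'protected' now contains no CR or LF, so a paste that drops them loses nothing.
restored = restore_eols(protected, lf_mark, cr_mark)
```

The two markers have to differ from each other, otherwise the restore step could not tell a former \n from a former \r.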

              Best Regards

              guy038

              • PeterJones

                Thanks for that workaround. It helps. (Fortunately, this is something I only do every few months, not on a daily basis. Unfortunately, it means I’ll have to bookmark the workaround, because I’ll forget it every time. :-) )

                • PeterJones

                  Revisiting this three years later: thanks to @Meta-Chuh in "Keep line endings on paste (no LF to CRLF or CRLF to LF)?", I learned of the Edit > Paste Special > Copy/Cut/Paste Binary Content feature. It has apparently existed since v5.9 (so it was available in v6.8.6 three years ago), and it does allow binary copy/paste. It doesn’t work in conjunction with the TextFX process originally described, but I wanted to point out for all future viewers of this Topic that the copy/paste buffer can work with binary data inside Notepad++.
