
    Copy buffer converts NUL to SPACE

    7 Posts 3 Posters 11.2k Views
    • PeterJones

      I was trying to take a binary file into a hex-edit-type mode, edit the hex strings, and convert it back to a binary.

      I started with the HEX-Editor plugin (0.9.5), but it wouldn’t let me paste into the hex side, and would start crashing NPP (6.8.6) when I’d do something silly like File | New.

      If I Select All, then use TextFX > TextFX Convert > Convert Text to Hex-16, I noticed that all NUL characters converted from 00 to 20.

      At first, I thought it was a bug with TextFX, so I tried with Plugins > Converter > ASCII to HEX, but it does the same thing.

      I made a test case: first, I started with an artificially-generated Hex-16:

      "000000000  00 01 02 03 04 05 06 07-08 09 0a 0b 0c 0d 0e 0f   |0123456789abcdef|"
      "000000010  10 11 12 13 14 15 16 17-18 19 1a 1b 1c 1d 1e 1f   |0123456789abcdef|"
      "000000020  20 21 22 23 24 25 26 27-28 29 2a 2b 2c 2d 2e 2f   |0123456789abcdef|"
      "000000030  30 31 32 33 34 35 36 37-38 39 3a 3b 3c 3d 3e 3f   |0123456789abcdef|"
      "000000040  40 41 42 43 44 45 46 47-48 49 4a 4b 4c 4d 4e 4f   |0123456789abcdef|"
      "000000050  50 51 52 53 54 55 56 57-58 59 5a 5b 5c 5d 5e 5f   |0123456789abcdef|"
      "000000060  60 61 62 63 64 65 66 67-68 69 6a 6b 6c 6d 6e 6f   |0123456789abcdef|"
      "000000070  70 71 72 73 74 75 76 77-78 79 7a 7b 7c 7d 7e 7f   |0123456789abcdef|"
      "000000080  54 68 65 20 71 75 69 63-6B 20 62 7b 6F 77 6E 20   |The quick brown |"
      "000000090  66 6F 78 00 41 6E 6F 74-68 65 72 20 73 69 6C 6C   |fox Another sill|"
      "0000000A0  79 20 73 74 72 69 6E 67-00 59 65 74 20 61 6E 6F   |y string Yet ano|"
      "0000000B0  74 68 65 72 20 73 74 72-69 6E 67 20 00 00 00 00   |ther string     |"
      "0000000C0  00 01 02 03 04 05 06 07-08 09 0a 0b 0c 0d 0e 0f   |0123456789abcdef|"
      "0000000D0  10 11 12 13 14 15 16 17-18 19 1a 1b 1c 1d 1e 1f   |0123456789abcdef|"
      "0000000E0  20 21 22 23 24 25 26 27-28 29 2a 2b 2c 2d 2e 2f   |0123456789abcdef|"
      "0000000F0  30 31 32 33 34 35 36 37-38 39 3a 3b 3c 3d 3e 3f   |0123456789abcdef|"
      "000000100  40 41 42 43 44 45 46 47-48 49 4a 4b 4c 4d 4e 4f   |0123456789abcdef|"
      "000000110  50 51 52 53 54 55 56 57-58 59 5a 5b 5c 5d 5e 5f   |0123456789abcdef|"
      "000000120  60 61 62 63 64 65 66 67-68 69 6a 6b 6c 6d 6e 6f   |0123456789abcdef|"
      "000000130  70 71 72 73 74 75 76 77-78 79 7a 7b 7c 7d 7e 7f   |0123456789abcdef|"
      

      Then I select all, and run TextFX > TextFX Convert > Convert Hex to Text: this does what I would expect, and when I show all characters, I can see it properly creates the NUL character throughout.
      So then I try to reverse the process: Select All, TextFX > TextFX Convert > Convert Text to Hex-16, and I get:

      "000000000  20 01 02 03 04 05 06 07-08 09 0A 0B 0C 0D 0E 0F   | xxxxxxxx..xx.xx|"
      "000000010  10 11 12 13 14 15 16 17-18 19 1A 1B 1C 1D 1E 1F   |xxxxxxxxxxxxxxxx|"
      ...
      "000000080  54 68 65 20 71 75 69 63-6B 20 62 7B 6F 77 6E 20   |The quick b{own |"
      "000000090  66 6F 78 20 41 6E 6F 74-68 65 72 20 73 69 6C 6C   |fox Another sill|"
      "0000000A0  79 20 73 74 72 69 6E 67-20 59 65 74 20 61 6E 6F   |y string Yet ano|"
      "0000000B0  74 68 65 72 20 73 74 72-69 6E 67 20 20 20 20 20   |ther string     |"
      

      … where all the NUL characters became 20 instead of 00 (I edited out the {SOH}{STX}{...} characters and converted them to x’s in the |...|-delimited section on the right for pasting into the forum, but those characters were all correct.)
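For anyone who wants to cross-check a dump outside the editor, here is a minimal Python sketch of a Hex-16 style formatter. It only approximates the layout shown above (9-digit offset, 16 bytes per row, a `-` between the two groups of eight, and a `|...|` text column). The name `hex16_dump` is made up for illustration and this is not TextFX's code. Non-printable bytes, NUL included, are drawn as `.` in the text column, while the hex column keeps 00 as 00.

```python
def hex16_dump(data: bytes) -> list:
    """Format bytes as 16-per-row hex lines, loosely modelled on the dumps above."""
    rows = []
    for offset in range(0, len(data), 16):
        chunk = data[offset:offset + 16]
        hex_cells = [f"{b:02X}" for b in chunk]
        # Join the two groups of eight with '-', as in the dumps above.
        left = " ".join(hex_cells[:8])
        right = " ".join(hex_cells[8:])
        hex_part = left + ("-" + right if right else "")
        # Printable ASCII is shown as-is; everything else (NUL included) as '.'.
        text = "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in chunk)
        # A full row of 16 bytes needs 47 characters in the hex column.
        rows.append(f"{offset:09X}  {hex_part:<47}   |{text}|")
    return rows


sample = b"fox\x00Another sill"  # same 16 bytes as row 090 of the test case
for row in hex16_dump(sample):
    print(row)
```

Run on those 16 bytes, the hex column shows `66 6F 78 00 41 ...`, so a real NUL and a real SPACE stay distinguishable.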

      If I follow the same procedure, but with Plugins > Converter > ASCII to HEX, I get

      200102030405060708090A0B0C0D0E0F
      101112131415161718191A1B1C1D1E1F
      202122232425262728292A2B2C2D2E2F
      303132333435363738393A3B3C3D3E3F
      404142434445464748494A4B4C4D4E4F
      505152535455565758595A5B5C5D5E5F
      606162636465666768696A6B6C6D6E6F
      707172737475767778797A7B7C7D7E7F
      54686520717569636B20627B6F776E20
      666F7820416E6F746865722073696C6C
      7920737472696E672059657420616E6F
      7468657220737472696E672020202020
      200102030405060708090A0B0C0D0E0F
      101112131415161718191A1B1C1D1E1F
      202122232425262728292A2B2C2D2E2F
      303132333435363738393A3B3C3D3E3F
      404142434445464748494A4B4C4D4E4F
      505152535455565758595A5B5C5D5E5F
      606162636465666768696A6B6C6D6E6F
      707172737475767778797A7B7C7D7E7F
      

      … which once again shows 20 instead of 00 for NUL characters.

      I then took the binary version, and just copy/pasted into another file, and saw that most of the control characters stayed correct, but NUL became a space, too.

      At this point, I assume that the act of copying text into the copy-buffer (which I assume both TextFX Convert and Converter use internally) converts NUL to SPACE. Is there a setting which changes that behavior for the copy buffer, or any way I can “protect” my NUL characters when doing the conversion? This matters because I cannot tell whether any given 20 in the resulting hex-dump was originally a NUL or a SPACE.
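To make the ambiguity concrete, here is a small Python sketch. The function `nul_to_space` is only a stand-in for the behaviour I observed (it is an assumption about the effect, not Notepad++ or Scintilla code). It shows that two different inputs end up as the same bytes, so the substitution cannot be undone from the hex dump alone.

```python
def nul_to_space(data: bytes) -> bytes:
    # Stand-in for the observed copy behaviour; NOT actual Notepad++/Scintilla code.
    return data.replace(b"\x00", b" ")


with_nul = b"fox\x00Another"    # NUL between the two words
with_space = b"fox Another"     # a real SPACE between the two words

# Both inputs produce identical hex after the substitution.
print(nul_to_space(with_nul).hex(" ").upper())
print(nul_to_space(with_space).hex(" ").upper())
print(nul_to_space(with_nul) == nul_to_space(with_space))  # True
```

Since both lines print the same hex string, there is no way to tell afterwards which 20 used to be a 00.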

      • PeterJones @PeterJones

        TL;DR?

        Short Version: When I copy a string that includes a NUL character, it appears to be translated into a SPACE before scripts or plugins get access to the buffer. Is there any way to turn off this behavior, so that the NUL remains present in the copy buffer?

        • Claudia Frank @PeterJones

          Hello PeterCJ-AtWork,

          This seems to be a Scintilla feature, introduced in 2013, as discussed here.

          So, using npp for manipulating binary data doesn’t seem to work.

          Cheers
          Claudia

          • PeterJones

            Thanks for the info. I guess I won’t be using my otherwise favorite editor for binary edits.

            • guy038

              Hello, PeterCJ AtWork,

              I think that I found a work-around to your problem :-))

              • First, check the menu option TextFX - TextFX Viz Settings - Viz Paste/Append binary

              • Now, open and select any piece of text, which is part of a binary text

              • Select the menu option TextFX - TextFX Viz - Copy Visible Selection

              • Open a new tab ( CTRL + N )

              • If necessary, select the same encoding as the binary source text

              • Select the last menu option TextFX - TextFX Viz - Paste

              Well, you could think, at first sight, that it’s just OK !

              Unfortunately, ALL the LF ( \x0A ) and CR ( \x0D ) characters are wiped out of the pasted text :-((

              Quite weird ! I did some tests but I could NOT find a setting/way which allows copying both NUL and EOL characters


              So, we have to cheat, a bit ! Before copying the selection, with the TextFX command :

              • In the binary source text, replace every \n character with a string ( one or several characters long ) which does NOT occur in the binary text

              • In the binary source text, replace every \r character with a second, different string ( one or several characters long ) which does NOT occur in the binary text either

              • Once the paste action is done, use simple regexes to change the temporary strings, created for \n and \r, back to the original characters, in order to get the original text -))
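The protect/restore idea in the three steps above can be sketched in Python as follows. This is only an illustration of the placeholder trick: the helper names (`pick_placeholder`, `protect_eols`, `restore_eols`) and the `<<LF0>>` style markers are hypothetical, and the TextFX copy/paste step itself is not modelled.

```python
def pick_placeholder(data: bytes, tag: bytes) -> bytes:
    """Return a marker built from `tag` that does not occur anywhere in `data`."""
    n = 0
    while True:
        candidate = b"<<" + tag + str(n).encode("ascii") + b">>"
        if candidate not in data:
            return candidate
        n += 1


def protect_eols(data: bytes):
    """Swap LF and CR for markers so that a paste which drops EOLs cannot hurt them."""
    lf = pick_placeholder(data, b"LF")
    cr = pick_placeholder(data, b"CR")
    protected = data.replace(b"\n", lf).replace(b"\r", cr)
    return protected, lf, cr


def restore_eols(protected: bytes, lf: bytes, cr: bytes) -> bytes:
    """Undo protect_eols once the text has been pasted."""
    return protected.replace(cr, b"\r").replace(lf, b"\n")


original = b"line1\r\nline2\x00\nend\r"
protected, lf, cr = protect_eols(original)
print(protected)
print(restore_eols(protected, lf, cr) == original)  # True
```

The only real requirement is that neither marker already occurs in the data, which is why the sketch checks for that before choosing one.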

              Best Regards

              guy038

              • PeterJones

                Thanks for that workaround. It helps. (Fortunately, this is something I only do every few months, not on a daily basis. Unfortunately, it means I’ll have to bookmark the workaround, because I’ll forget it every time. :-) )

                • PeterJones

                  Revisiting this three years later: thanks to @Meta-Chuh in “Keep line endings on paste (no LF to CRLF or CRLF to LF)?”, I learned of the Edit > Paste Special > Copy/Cut/Paste Binary Content feature. It has apparently existed since v5.9 (so it was available in v6.8.6 three years ago), and it does allow for binary copy/paste. It doesn’t work in conjunction with the TextFX process originally described, but I thought I’d point out for all future viewers of this Topic that the copy/paste buffer can work with binary data inside Notepad++.
