    Problem with Hexa editor add-on

    Notepad++ & Plugin Development
    • cédric freycenon

      [Attached image: Bug 16bits bin file.PNG]
      Hello,

      I opened a binary file with the Notepad++ Hex Editor plugin and with another hex editor, and I do not get the same result. I trust the other editor more, since the file was generated with the MATLAB code below:

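      Signal = zeros(4*64000,1);   % preallocate; entries never assigned stay 0
      t = 1;
      for i = 1:4000
          Signal(t) = mod(t,200);  % values 0..199, so each int16 has a zero high byte
          t = t + 1;
      end
      FileID = fopen(fullfile(pwd, 'output4.int16'), 'w');
      fwrite(FileID, Signal, 'int16');  % written as 16-bit integers
      fclose(FileID);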

      Where is the problem?

      Thanks for taking the time to consider my question.

      Cédric Freycenon

      • PeterJones

        Notepad++ is a text editor, so it reasonably assumes that anything you open with Notepad++ is a text file; that’s its job, after all. When it sees a consistent alternation of a non-zero byte followed by a 0x00 byte throughout the entire document, it reasonably guesses that you have a UCS-2 LE encoded file (because that’s what UCS-2 LE looks like for text), and treats each pair of bytes as a single character. Then when you run Plugins > Hex Editor > View in Hex, the plugin takes the characters rather than the bytes.
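To make that ambiguity concrete, here is a minimal Python sketch (not part of the original experiment). It assumes the int16 data was written in little-endian byte order, which is the usual case on a PC, and uses a handful of made-up sample values. It shows that small 16-bit integers and UCS-2 LE text are byte-for-byte indistinguishable:

```python
import struct

# Hypothetical sample values, all below 256 like the mod(t,200) data above.
samples = [1, 2, 65, 199]

# Pack as little-endian signed 16-bit integers (assumed byte order).
raw = struct.pack("<%dh" % len(samples), *samples)

# Each value's high byte is 0x00, so the bytes alternate: value, 0x00, value, 0x00...
print(raw.hex(" "))  # 01 00 02 00 41 00 c7 00

# The very same bytes decode cleanly as UCS-2 LE / UTF-16 LE text,
# one character per byte pair, which is why a text editor guesses "text".
as_text = raw.decode("utf-16-le")
print([hex(ord(ch)) for ch in as_text])  # ['0x1', '0x2', '0x41', '0xc7']
```

Nothing in the bytes themselves says "these are numbers", so any encoding detector that only looks at the byte pattern can end up with the same guess.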

        I just ran an experiment: I created a new true UCS-2 LE w/ BOM file (with text 12345) in Notepad++, and saved it; an external hex dumper xxd.exe shows:

        C:\Users\peter.jones\Downloads\TempData\nppCommunity>xxd ucs2le-1234.txt
        00000000: fffe 3100 3200 3300 3400 3500            ..1.2.3.4.5.
        

        it starts with the 0xFF 0xFE BOM, then the two-byte sequences for those 5 characters.
        When I run View in Hex on that file, the hex editor plugin shows:
        [screenshot]
        When I turn off the hex editor, Notepad++ still reports UCS-2 LE BOM:
        [screenshot]

        However, when I create a binary file with the bytes 01 00 02 00,

        C:\Users\peter.jones\Downloads\TempData\nppCommunity>perl -e "print qq(\x01\x00\x02\x00)" > ucs2le-bytes.txt
        
        C:\Users\peter.jones\Downloads\TempData\nppCommunity>xxd ucs2le-bytes.txt
        00000000: 0100 0200                                ....
        

        Notepad++ auto-interprets that as UCS-2 Little Endian (without BOM):
        [screenshot]
        And the hex editor plugin only displays the hex of the characters, rather than the hex of the individual bytes:
        [screenshot]
        When you stop viewing in hex, the plugin sends back the characters, not the original bytes, and now Notepad++ thinks it’s a normal ANSI file:
        [screenshot]
        and if I save that to disk, the 00 bytes are gone:

        C:\Users\peter.jones\Downloads\TempData\nppCommunity>xxd ucs2le-bytes.txt
        00000000: 0102                                     ..
        

        So, unfortunately, it looks like somewhere in the handoff between Notepad++ and the Hex Editor plugin, a UCS-2 LE file without BOM gets converted to ANSI encoding instead of UCS-2 LE encoding, so it drops the zero bytes.
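The effect can be reproduced outside Notepad++ with a short Python sketch. This only simulates the observed result and is not the plugin's actual code; using `latin-1` as a stand-in for the "ANSI" code page is an assumption made for illustration:

```python
# The four bytes from the experiment above.
original = b"\x01\x00\x02\x00"

# Step 1: interpret each byte pair as one UCS-2 LE character.
chars = original.decode("utf-16-le")   # two characters: U+0001 and U+0002

# Step 2 (lossy): write the characters back with a single-byte encoding.
# latin-1 is used here as an assumed stand-in for ANSI.
as_ansi = chars.encode("latin-1")
print(as_ansi.hex())                   # 0102

# For comparison: writing back as UCS-2 LE would preserve the original bytes.
as_ucs2 = chars.encode("utf-16-le")
print(as_ucs2 == original)             # True
```

The single-byte write-back yields the same `0102` that xxd reported after saving, which is consistent with the characters being re-encoded as ANSI rather than as UCS-2 LE.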

        If you are allowed to add a BOM, that is, to write the bytes 0xFF and 0xFE (255 and 254) at the start of the output file from MATLAB, then Notepad++ will truly believe it’s UCS-2 LE BOM, and the hex editor plugin will treat it that way (though you might then also have to have MATLAB strip those bytes if you later read the file back in). Or you could go through a temporary converter (not provided), which adds the BOM to the start of the file before you load it in Notepad++/HexEditor, then strips the BOM after you’re done using it in Notepad++.
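One possible shape for such a converter is sketched below in Python. The function names `add_bom` and `strip_bom` and the file names in the usage note are hypothetical, chosen only for illustration. The sketch always prepends the BOM and only strips it when it is actually present, so data that happens to begin with 0xFF 0xFE is not silently truncated on the way in:

```python
BOM_UCS2_LE = b"\xff\xfe"  # the UCS-2 LE byte order mark described above


def add_bom(src_path, dst_path):
    """Copy src_path to dst_path with the UCS-2 LE BOM prepended."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(BOM_UCS2_LE + data)


def strip_bom(src_path, dst_path):
    """Copy src_path to dst_path with the leading UCS-2 LE BOM removed."""
    with open(src_path, "rb") as src:
        data = src.read()
    if not data.startswith(BOM_UCS2_LE):
        # Refuse to guess: without the BOM there is nothing safe to remove.
        raise ValueError("file does not start with the UCS-2 LE BOM")
    with open(dst_path, "wb") as dst:
        dst.write(data[len(BOM_UCS2_LE):])
```

Usage would be along the lines of `add_bom("output4.int16", "output4_bom.int16")` before opening the copy in Notepad++, then `strip_bom("output4_bom.int16", "output4.int16")` afterwards. Both functions read the whole file into memory, which is fine for a file of this size but is a simplification for very large files.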

        However, as I implied earlier, the fundamental issue is that you are expecting Notepad++, which is a text editor, to read and not mangle a binary file, which is not its primary purpose. If you are careful, it might sometimes work. But Notepad++ was not written to be a binary editor, so you should not expect it to work perfectly for something it wasn’t intended to do. That said, the workaround shouldn’t be difficult for someone who can program in MATLAB.

        • cédric freycenon

          Thanks for your answer.
          Best regards,

          Cédric Freycenon
