    Plugin/Script to clean up text noise?

    7 Posts 4 Posters 133 Views
    • Neko_Kaioh

      Hi,

      I’m not sure how else to phrase it. I have some files that easily open with Notepad++ but have a bunch of… noise? I guess? I’m not sure what to call it.

      Is there a plugin or script that will clean out special character noise from a document and leave only standard characters? I’m writing from my phone so I don’t have a sample with me at the moment.

      Any help would be greatly appreciated!

      • PeterJones @Neko_Kaioh

        @Neko_Kaioh said in Plugin/Script to clean up text noise?:

        Hi,

        I’m not sure how else to phrase it. I have some files that easily open with Notepad++ but have a bunch of… noise? I guess? I’m not sure what to call it.

        A picture can be worth 1000 words. You just take a screenshot and paste it into your post.

        I’m guessing you’re seeing something like one of these:

        Those are both examples from our Why Does My … File Look Like Junk in Notepad++ – the image from that FAQ shows files that might contain text (like a Word document or PDF), which aren’t actually text files. Notepad++ requires a text file, like a .txt, or a piece of text-based source-code, like a .py or .cpp or .html. It is not for looking at word processing documents (like .docx or .odt or .xlsx) or PDFs or other such binary files.

        Is there a plugin or script that will clean out special character noise from a document and leave only standard characters?

        Not likely. Extracting just the text from a binary file is not something you should do from within Notepad++. There are some programs out there that look for the little strings of text in an executable or a Word document or similar, and extract them to a separate file. You could find and run one of those. Or, if you were feeling brave, you could make a copy of your file, and in that copy, run a regular expression search-and-replace like FIND=[^\x20-\x7E\h\r\n] REPLACE=<empty> SEARCH MODE=Regular Expression, which will delete everything that’s not a printable ASCII character, a space or tab, or a newline. But never do something like that with the original file – it would completely corrupt a .docx or .pdf file.
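For anyone who would rather try this outside Notepad++, here is a minimal Python sketch of both ideas from the paragraph above: stripping everything that is not printable ASCII, tab, or newline, and pulling out the short runs of readable text. The function names and the minimum run length of 4 are assumptions made for illustration; this is not an existing tool, and it should only ever be pointed at a copy of the file.

```python
import re

# Any byte that is NOT printable ASCII (0x20-0x7E), tab, CR, or LF.
# Roughly mirrors the FIND expression above; \h is narrowed to tab here
# because the space character is already covered by \x20-\x7E.
NOISE = re.compile(rb"[^\x20-\x7E\t\r\n]")

# Runs of at least 4 printable ASCII bytes (the length 4 is an assumption).
TEXT_RUN = re.compile(rb"[\x20-\x7E]{4,}")


def strip_noise(data: bytes) -> bytes:
    """Return data with every non-printable byte deleted."""
    return NOISE.sub(b"", data)


def extract_strings(data: bytes) -> list:
    """Return the readable text runs found in data, in order."""
    return TEXT_RUN.findall(data)


def clean_copy(src_path: str, dst_path: str) -> None:
    """Write a cleaned copy of src_path to dst_path; the original is only read."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(strip_noise(data))
```

Calling `clean_copy("file", "file_cleaned.txt")` (both names are placeholders) leaves the original alone and produces a text-only copy that can be opened in Notepad++. The result is usually fragments rather than a readable document, because compressed or encoded content does not turn back into text just by deleting bytes.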

        I’m writing from my phone so I don’t have a sample with me at the moment.

        Questions like this are best accompanied by examples. If you aren’t able to share the example at the moment, it would behoove you to wait until you can.

        • Neko_Kaioh @PeterJones

          @PeterJones said in Plugin/Script to clean up text noise?:

          Questions like this are best accompanied by examples. If you aren’t able to share the example at the moment, it would behoove you to wait until you can.

          Fair. But yes, what I’m talking about is mainly that picture you dropped, on the left side.

          I’m not even sure what kind of file it is; it’s literally just labeled “file” when I look at it. Word opens it the same way as NPP.

          @PeterJones said in Plugin/Script to clean up text noise?:

          There are some programs out there that look for the little strings of text in an executable

          Would you by chance have any suggestions on programs I can look up? I’d really appreciate it.

          • Terry R @Neko_Kaioh

            @Neko_Kaioh said in Plugin/Script to clean up text noise?:

            Would you by chance have any suggestions on programs I can look up?

            It is nearly impossible to determine for sure what a file is when it looks like those images. Granted, sometimes you get lucky and can see some header information, which might tell you what type of file it is.

            However, if you decide to search the Internet, you may spot some sites which purport to specialise in figuring out the type of a file based on its content.

            https://www.checkfiletype.com/ is one such site. Use of it (or any other such site) is at your discretion. It is probably not a good idea to upload confidential information to these sites unless you are sure of what they may do with the data, and agree with it.
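If uploading the file is a concern, the "header information" idea can be tried locally. The sketch below compares the first bytes of a file against a handful of well-known signatures. The table is deliberately tiny, the function names are made up for illustration, and a real identification tool knows thousands of formats, so treat "unknown" as "not in this short list".

```python
# A few well-known file signatures ("magic numbers"); far from exhaustive.
SIGNATURES = [
    (b"PK\x03\x04", "ZIP archive (also used by .docx, .xlsx, .odt)"),
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"GIF8", "GIF image"),
    (b"\x7fELF", "ELF executable"),
    (b"MZ", "Windows executable (EXE/DLL)"),
]


def guess_type(header: bytes) -> str:
    """Return a description for the first matching signature, else 'unknown'."""
    for magic, description in SIGNATURES:
        if header.startswith(magic):
            return description
    return "unknown"


def guess_file_type(path: str, count: int = 16) -> str:
    """Read only the first `count` bytes of the file and guess its type."""
    with open(path, "rb") as handle:
        return guess_type(handle.read(count))
```

Nothing leaves the machine, and only the first 16 bytes are read, so it is safe to run against the original file.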

            Terry

            • Neko_Kaioh @Terry R

              @Terry-R

              @Terry-R said in Plugin/Script to clean up text noise?:

              https://www.checkfiletype.com/ is one such site. Use of it (or any other such site) is at your discretion. It is probably not a good idea to upload confidential information to these sites unless you are sure of what they may do with the data, and agree with it.

              Yeah, I used that one. There’s nothing in the file that I need to worry about, but it doesn’t tell me anything I personally can use.

              "File Type: MSX Graph Saurus compressed image

              MIME Type: application/octet-stream;
              Suggested file extension(s): bin lha lzh exe class so dll img iso"

              So, if anyone reading this happens to know a program or two that I could try to view the file properly, I’d be very grateful.

              • Terry R @Neko_Kaioh

                @Neko_Kaioh

                I think at this point you will need to go elsewhere for good advice. Here we give good advice (and support) on Notepad++. In terms of other applications, they probably have their own forums and you would find the questions are best asked (and answered) there.

                We also try to prevent posts from talking too much about non-Notepad++ stuff. It’s just “noise” to us!

                Good luck
                Terry

                • Lycan Thrope @Neko_Kaioh
                  last edited by Lycan Thrope

                  @Neko_Kaioh said in Plugin/Script to clean up text noise?:

                  Yeah, I used that one. There’s nothing in the file that I need to worry about, but it doesn’t tell me anything I personally can use.

                  "File Type: MSX Graph Saurus compressed image

                  MIME Type: application/octet-stream;
                  Suggested file extension(s): bin lha lzh exe class so dll img iso"

                  So, if anyone reading this happens to know a program or two that I could try to view the file properly, I’d be very grateful.

                  At best, you could use the Hex Editor plugin. However, from the information and this discussion so far, it is apparent that your abilities will be taxed, since you can’t tell whether a file is binary just by looking at it. If it’s like the file on the left that @PeterJones showed you, I noticed right away the first two letters in the file, PK, which to me looks like a PKWare (ZIP) file, meaning it’s compressed at best, which matches the file type that web site reported. That’s a compressed graphic file, meaning it’s encoded, and at this point I don’t see you having the skill set necessary to use a Hex Editor and debug/decode such a file.
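To make the "first two letters" observation concrete, here is a small hex-dump sketch that prints bytes the way a hex editor typically lays them out: offset, hex values, then the printable characters. It is an illustration written for this thread, not the Hex Editor plugin's code, and the column layout is an assumption.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format data as lines of: offset, hex bytes, printable ASCII (dots otherwise)."""
    hex_width = width * 3 - 1  # two hex digits per byte plus separating spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        text_part = "".join(
            chr(byte) if 0x20 <= byte <= 0x7E else "." for byte in chunk
        )
        lines.append(f"{offset:08x}  {hex_part:<{hex_width}}  {text_part}")
    return "\n".join(lines)


def dump_file_start(path: str, count: int = 64) -> str:
    """Hex-dump only the first `count` bytes of the file at path."""
    with open(path, "rb") as handle:
        return hexdump(handle.read(count))
```

If the text column of the first line starts with `PK..`, the file begins with the bytes 50 4b 03 04, which is the signature of a ZIP-style archive.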

                  Your best bet is to use the file in the application that generated it. You can find that on your own by using that ubiquitous tool, Google: put the file type in the search terms and follow all the links you can until you find a tool that can open the file so that it can be played, viewed, or whatever it is designed to do. As @Terry-R has suggested, we can’t help any more than this.

                  The Community of users of the Notepad++ text editor.