Community


    remove all text except hex characters

    Help wanted · · · – – – · · ·
    • Lonnie Hailey

      Hi,
      How do I remove all text except hex characters and then list all hex characters in one row?

      For example, below is my original file and what I would like to convert the file into:

      <Start original file>
      ;-----------------------------------------------------------------------------
      ; IP in IP Packets for OMX3200
      ;-----------------------------------------------------------------------------
      ; Rev 1.00: May 10, 2018
      : - Initial creation
      ;
      ; MAC Addresses are NetQuest OUI plus 16-bit PDU number plus 0x44 “D” for
      ; destination or 0x52 “S” for source
      ;-----------------------------------------------------------------------------
      ; PDU:01
      ; Eth.ipv4
      ;-----------------------------------------------------------------------------
      00 20 1E 00 01 44 – MAC destination address
      00 20 1E 00 01 52 – MAC source address
      08 00 – Ethertype:IPv4
      – IPv4 Header [20 bytes]
      45 – Version/Internet Header Length (IHL)
      00 – Type Of Service (TOS)
      00 42 – Total length (bytes)
      55 44 – Identifier
      00 – Flags
      00 – Fragment Offset
      80 – Time To Live (TTL)
      11 – IP Protocol: (UDP)
      10 85 – Header Checksum
      c8 c8 c8 eb – Source IP Address
      41 6a 01 c4 – Destination IP Address
      – UDP Header [8 bytes]
      04 9f – Source port
      00 35 – Destination port
      00 2e – Length
      05 d5 – Checksum
      – Payload [38 bytes]
      37 01 01 00 00 01 00 00
      00 00 00 00 03 77 77 77
      0c 6e 65 74 71 75 65 73
      74 63 6f 72 70 03 63 6f
      6d 00 00 01 00 01
      <End original file>

      Here are the hex characters I want to extract into a single line:

      <Start converted file>
      45
      00
      00
      42
      55
      44
      00
      00
      80
      11
      10
      85
      c8
      c8
      c8
      eb
      41
      6a
      01
      c4
      04
      9f
      00
      35
      00
      2e
      37
      01
      01
      00
      00
      01
      00
      00
      00
      00
      00
      00
      03
      77
      77
      77
      0c
      6e
      65
      74
      71
      75
      65
      73
      74
      63
      6f
      72
      70
      03
      63
      6f
      6d
      00
      00
      01
      00
      01
      <End converted file>

      • PeterJones

        So, your description doesn’t match your data: your converted file didn’t include the hex from the MAC addresses, the Ethertype (08 00), or the UDP checksum (05 d5).
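One way to see exactly which bytes the converted file leaves out is to compare the two listings token by token. The sketch below is only an illustration: the hex tokens were transcribed by hand from the post (comment text omitted), the names `ORIGINAL`, `EXPECTED`, `tokens`, and `skipped_tokens` are made up for this example, and the greedy comparison is adequate only because the expected output is an in-order subset of the original.

```python
import re

# Hex tokens from the posted original file, comment text left out (hand-transcribed).
ORIGINAL = """
00 20 1E 00 01 44
00 20 1E 00 01 52
08 00
45 00 00 42 55 44 00 00 80 11 10 85
c8 c8 c8 eb
41 6a 01 c4
04 9f 00 35 00 2e 05 d5
37 01 01 00 00 01 00 00
00 00 00 00 03 77 77 77
0c 6e 65 74 71 75 65 73
74 63 6f 72 70 03 63 6f
6d 00 00 01 00 01
"""

# Hex tokens from the posted converted file, joined onto a few lines.
EXPECTED = """
45 00 00 42 55 44 00 00 80 11 10 85 c8 c8 c8 eb 41 6a 01 c4
04 9f 00 35 00 2e
37 01 01 00 00 01 00 00 00 00 00 00 03 77 77 77
0c 6e 65 74 71 75 65 73 74 63 6f 72 70 03 63 6f
6d 00 00 01 00 01
"""


def tokens(text):
    """Return every standalone two-digit hex token in the text."""
    return re.findall(r"\b[0-9A-Fa-f]{2}\b", text)


def skipped_tokens(original, expected):
    """Greedy walk: collect original tokens that do not match the next expected token.

    Returns the skipped tokens and whether every expected token was consumed.
    """
    skipped = []
    pos = 0
    for tok in original:
        if pos < len(expected) and tok.lower() == expected[pos].lower():
            pos += 1
        else:
            skipped.append(tok)
    return skipped, pos == len(expected)


missing, complete = skipped_tokens(tokens(ORIGINAL), tokens(EXPECTED))
print("left out:", " ".join(missing))
print("expected output fully accounted for:", complete)
```

With the data as posted, the tokens left out are the two MAC addresses, `08 00`, and `05 d5`.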

        Also, the “20 bytes” and “38 bytes” will be indistinguishable from hex values, unless the – is indicative of something we can ignore. I am guessing that the semicolon ; indicates a full-line comment (oh, there’s one with a colon : at the start of the line). Does the dash – indicate an end-of-line comment, or is it something meaningful that you just want to strip out?

        Also, because you didn’t quote your text in a way that will prevent Markdown from formatting it (see here for some markdown examples), I cannot know whether the – is really that Unicode character, or whether the forum just auto-converted your double-hyphen -- into a single dash –.

        So, we cannot know which you really want. Though I can make some guesses.

        There’s probably magic that would do it all in one, but I’d personally prefer a multi-step process, so you (and I) can see what’s going on.

        All of these require selecting Regular Expression in the Find/Replace dialog:

        1. Eliminate ; and : lines:
          • Find What: (?-s)^\s*[;:].*$ – any line starting with any amount of space (including no space), then a semicolon or colon
          • Replace With: (empty)
        2. Eliminate from the dash – or double-hyphen -- to the end of the line:
          • Find What: (?-s)(–|--).*$ – anything from the dash or double-hyphen to the end of the line
          • Replace With: (empty)
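For anyone who would rather script this than use the Find/Replace dialog, here is a rough Python sketch of the same two steps. It is not Notepad++ syntax: Python's `re` module is a different engine from the Boost regex library Notepad++ uses, so the patterns are adapted. In Python `.` already stops at line ends, so `(?-s)` is not needed, and I use `[ \t]*` instead of `\s*` so the first pattern cannot run across a blank line. `SAMPLE` and `strip_comments` are names invented for this example.

```python
import re

# A few representative lines, including one with "--" in case the forum
# converted a double-hyphen into an en dash.
SAMPLE = """\
; PDU:01
: - Initial creation
00 20 1E 00 01 44 – MAC destination address
08 00 -- Ethertype:IPv4
– IPv4 Header [20 bytes]
45 – Version/Internet Header Length (IHL)
37 01 01 00
"""


def strip_comments(text):
    # Step 1: blank out lines that start (after optional spaces/tabs) with ; or :
    text = re.sub(r"^[ \t]*[;:].*$", "", text, flags=re.MULTILINE)
    # Step 2: drop everything from an en dash or double-hyphen to the end of the line
    text = re.sub(r"(?:–|--).*$", "", text, flags=re.MULTILINE)
    return text


cleaned = strip_comments(SAMPLE)
print(cleaned.split())
```

Note that the "20" in "[20 bytes]" is gone after step 2, because the whole line starts with the dash.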

        At this point, all the ignore-these matches (either in apparent comments or apparent decimal after dashes) should be gone. It should just be pairs of hex nibbles at this point:

        3. Move each pair to its own line:
          • Find What: (?is)([0-9a-f]{2})\s+ – any group of 2 case-insensitive hexadecimal digits, followed by one or more space or newline characters
          • Replace With: $1\r\n – replace with the group, plus a standard Windows EOL (CRLF) sequence
        4. If you also want to get rid of extra newlines:
          • Find What: \R+ – find any sequence of one or more newlines (whether CRLF, LF, or CR)
          • Replace With: \r\n – and replace it with a single standard Windows EOL sequence
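The last two steps can be sketched in Python as well, again only as an approximation of the Notepad++ replacements. Python's `re` has no `\R`, so the newline pattern is spelled out. `CLEANED` stands in for text that has already had its comments removed, and `one_row` is an extra helper I added because the original question asked for everything in one row. All names here are hypothetical.

```python
import re

# Stand-in for text that already had comment lines and dash comments removed.
CLEANED = "00 20 1E 00 01 44 \n08 00 \n\n45 \n37 01 01 00\n"


def one_pair_per_line(text):
    # Put each hex pair that is followed by whitespace on its own CRLF-terminated line.
    text = re.sub(r"(?i)([0-9a-f]{2})\s+", r"\1" + "\r\n", text)
    # Collapse any run of CRLF, LF, or CR into a single CRLF.
    text = re.sub(r"(?:\r\n|\n|\r)+", "\r\n", text)
    return text


def one_row(text):
    # Alternative: put every hex pair on a single space-separated row.
    return " ".join(re.findall(r"\b[0-9A-Fa-f]{2}\b", text))


per_line = one_pair_per_line(CLEANED)
row = one_row(CLEANED)
print(per_line)
print(row)
```

If the goal really is a single row, joining the pairs with a space (or with nothing at all) is simpler than splitting them onto separate lines first.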

        This is based on my best guess of your intention.

        PS: this forum is made of volunteers helping fellow NPP users of their own volition, and usually not as part of their paid job. This is not a code writing service that does your job or your homework for you for free, while you get the paycheck or course credit. It would have been better if you’d shown what you tried, why you thought it would do it, and where it fell short, and then asked us for help in fixing it. I answered out of the “if reasonably asked, I’ll give a freebie” mindset. But if this doesn’t work for you, you’ll have to show some effort. For example, you can follow the links here to find more regex documentation, which will come in handy for understanding or customizing what I’ve written.

        • Scott Sumner

          @PeterJones

          I looked at it…but with the problems with the definition (that you correctly pointed out), I decided to pass on it. :-)

          I’ve stopped doing too much “guessing”.

          • PeterJones

            Yes, but I’ve got to keep answering these guesswork ones, otherwise @Terry-R will keep answering and getting the reputation instead of me… and after only 2 months on the forum, Terry is already in the top-10 reputation users. I’ve got to keep my #4 ranking, after all. :-)

            Okay, actually, I’m happy when more users give many high-quality answers.

            I try to answer guesswork questions when I think what I share will be a useful starting point, even if it doesn’t solve the question the OP intended to ask (and I hope that the OP is able to clarify with just one additional post which shows effort and clarity, which will bring about an improved solution in one iteration; hope springs eternal, I know).

            • Scott Sumner

              @PeterJones @Terry-R … two guys with more reputation points than postings…keep up the high-quality work!

              • Terry R

                Sorry @PeterJones and others, I didn’t mean to steal your thunder. I hadn’t been looking at the stats, but I’m a bit worried about Scott Sumner, as he went click-happy on my recent postings. My little computer had difficulty notifying me with dings of all the upvotes. Together with his recent post, I’m wondering if he needs a holiday?

                Seriously though, I do get ‘warm fuzzies’ helping out, and of course also pushing my own boundaries. Part of that is thanks to you lot, guiding me where I’ve been a bit too literal, pulling me back in line when I’ve done something stupid. I take nothing personally, as I hope you don’t if I happen to say something slightly askew.

                Cheers everybody
                Terry

                PS: What does OP mean?! You lot are using it, and I’ve now started using it, but I don’t know what the chars stand for. It bugs me!

                • Scott Sumner

                  @Terry-R

                  OP = Original Poster

                  • Terry R

                    AAARRRRGGGHHHH! I thought that might have been it but it seemed a bit too obvious.

                    Terry

                    • PeterJones

                      It can alternatively mean Original Post, as well. You’ve got to work out from context whether it’s talking about the content or the content creator. :-)
