Community
    • Login

    help please, extract columns

    Scheduled Pinned Locked Moved Help wanted · · · – – – · · ·
    11 Posts 4 Posters 2.2k Views
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • Scott SumnerS
      Scott Sumner @neagal
      last edited by

      @neagal

      If the columns are space-separated, I see EIGHT columns, not SEVEN.

      neagalN 1 Reply Last reply Reply Quote 0
      • neagalN
        neagal @Scott Sumner
        last edited by

        @Scott-Sumner

        “name”:"1 80.97.120.40 180ms [n/a] 21 “type”:“A” @post

        hello friend, are 7 columns

        1 Reply Last reply Reply Quote 0
        • Terry RT
          Terry R
          last edited by Terry R

          @neagal
          @Scott-Sumner said it’s a bit confusing, even more so now as it would appear you have provided different data to the original. So is the 180ms field definitely “180ms” or is it as you originally typed “180 ms”, where there is a space in the middle.

          I’m also a bit unsure of what you intend to happen as the 1 2 to 1 7 differs from the 2 1 to 2 7.
          If we consider each line separately I think what you want is a line with the 1st and 2nd field separated by a space, followed by a line with 1st and 3rd columns, through to the 1st and 7th column. This would be repeated for every original line of data, is this correct?

          So using your original data I produced this result:

          “name”:"1  80.97.120.40 
          “name”:"1  180 ms 
          “name”:"1  [n/a] 
          “name”:"1  21 
          “name”:"1  “type”:“A” 
          “name”:"1  @post
          “asn”:6876  90.97.117.177 
          “asn”:6876  134 ms 
          “asn”:6876  [n/a] 
          “asn”:6876  20 
          “asn”:6876  “type”:“B” 
          “asn”:6876  @post
          “asn”:23650  70.97.120.10 
          “asn”:23650  75 ms 
          “asn”:23650  [n/a] 
          “asn”:23650  21 
          “asn”:23650  “type”:“A” 
          “asn”:23650  @contact
          “ttl”:60  70.97.122.105 
          “ttl”:60  87 ms 
          “ttl”:60  geodnsd.global.sonicwall.com 
          “ttl”:60  20 
          “ttl”:60  “type”:“C” 
          “ttl”:60  @contact
          “asn”:15399  70.97.122.106 
          “asn”:15399  89 ms 
          “asn”:15399  geodnsd.global.sonicwall.com 
          “asn”:15399  19 
          “asn”:15399  “type”:“A” 
          “asn”:15399  @contact
          “ttl”:7896  70.97.122.107 
          “ttl”:7896  95 ms 
          “ttl”:7896  www.psmythe.net 
          “ttl”:7896  21 
          “ttl”:7896  “type”:“B” 
          “ttl”:7896  @post
          “asn”:456  70.97.122.104 
          “asn”:456  98 ms 
          “asn”:456  wasabi.prostructure.com 
          “asn”:456  22 
          “asn”:456  “type”:“C” 
          “asn”:456  @contact
          “name”:"376  70.97.122.108 
          “name”:"376  89 ms 
          “name”:"376  server_ip 
          “name”:"376  21 
          “name”:"376  “type”:“B” 
          “name”:"376  @contact
          “asn”:15343  70.97.122.109 
          “asn”:15343  87 ms 
          “asn”:15343  xcder.com 
          “asn”:15343  21 
          “asn”:15343  “type”:“A” 
          “asn”:15343  @post
          

          Terry

          neagalN 1 Reply Last reply Reply Quote 2
          • neagalN
            neagal @Terry R
            last edited by

            @Terry-R said:

            If we consider each line separately I think what you want is a line with the 1st and 2nd field separated by a space, followed by a line with 1st and 3rd columns, through to the 1st and 7th column. This would be repeated for every original line of data, is this correct?

            it is right

            1 Reply Last reply Reply Quote 0
            • Terry RT
              Terry R
              last edited by

              @neagal
              There’s still 1 unanswered question, is the 180ms field as you originally typed it or as you typed it in response to @Scott-Sumner question?
              So is it:

              180 ms
              or
              180ms
              

              Terry

              neagalN 1 Reply Last reply Reply Quote 0
              • neagalN
                neagal @Terry R
                last edited by

                @Terry-R said:

                180ms

                180ms

                1 Reply Last reply Reply Quote 0
                • Terry RT
                  Terry R
                  last edited by Terry R

                  @neagal
                  Before we start on this we need to add a final blank line to the original data, as my regex (regular expression) expects one at the end of every line.

                  So first we use the Replace function with
                  Find What:\z
                  Replace With:\r\n

                  For all the regex’s provided we need the search mode set to “regular expression” and “wrap around” ticked. You can hit the replace button, or replace all button, it will only find 1 occurrence and add a blank line at the end of the data. You may want to confirm that by checking the bottom of Notepad++ where it shows a count of lines, see if it increases by 1 after running the first regex.

                  So the Replace function to alter your data is.
                  Find What:^([^ ]+?) ([^ ]+?) ([^ ]+ms) ([^ ]+?) ([^ ]+?) ([^ ]+?) ([^ ]+?)(\R)

                  For the Replace field I also have a question, do you want a blank line between each set of fields. So all 6 lines of the original line 1 followed by a blank line, then 6 lines of original line 2, repeated?
                  If blank line between sets then
                  Replace With:\1 \2\8\1 \3\8\1 \4\8\1 \5\8\1 \6\8\1 \7\8\8
                  If no blank line between then
                  Replace With:\1 \2\8\1 \3\8\1 \4\8\1 \5\8\1 \6\8\1 \7\8
                  As before, set search mode to “regular expression” and “wrap around” ticked. Click once on “replace all”, then check the resulting data. You can see the important stuff is outlined in red text with a shadow background. This text can be copied directly using your keyboard or mouse, so no need to type each character individually.

                  Let us know how you got on. If any further issues then you may need to elaborate on the data used, or at least provide some of the examples where the regex did NOT work, so we can amend it to suit.

                  Finally remove that last blank line.

                  Terry

                  1 Reply Last reply Reply Quote 0
                  • Terry RT
                    Terry R
                    last edited by

                    @neagal

                    The following regex, whilst not wrong has “ms” included which isn’t necessary. I had that in originally as I believed your data was of the type “180 ms”. When you confirmed the space in the middle was not actually there I did not remove it efficiently.

                    ^([^ ]+?) ([^ ]+?) ([^ ]+ms) ([^ ]+?) ([^ ]+?) ([^ ]+?) ([^ ]+?)(\R)

                    You will notice all other instances are ([^ ]+?), the one with “ms” could actually be the same. I just did not notice it when posting. I thought it would be best to advertise that fact now so you don’t get the wrong impression of how the regex works and be confused over why I had the “ms” in that instance, yet no other characters in other instances.

                    Terry

                    1 Reply Last reply Reply Quote 1
                    • Scott SumnerS
                      Scott Sumner
                      last edited by

                      Too many questions…too much obscurity…not only do you have to solve the OP’s problem…you have to straighten the OP out to get a coherent problem presented…good that you have a lot of endurance, @Terry-R !

                      1 Reply Last reply Reply Quote 0
                      • guy038G
                        guy038
                        last edited by

                        Hi, @neagal, @Terry-R, @scott-sumner and All,

                        • About a possible space char, before the ms string, you can delete it, to get a coherent field, with the regex S/R :

                          • SEARCH (?-i)(?<=\d)\x20(?=ms)

                          • REPLACE Leave EMPTY

                        • Now, if we use lazy quantifiers, the Terry’s regex S/R, with a line-break between each block, turns to :

                          • SEARCH (?-s)^(.+?\x20)(.+?)\x20(.+?)\x20(.+?)\x20(.+?)\x20(.+?)\x20(.+)(\R)

                          • REPLACE \1\2\8\1\3\8\1\4\8\1\5\8\1\6\8\1\7\8\8

                        • Eventually, if necessary, you can move back to the #### ms syntax, with a space between, with the regex S/R :

                          • SEARCH ms$

                          • REPLACE \x20$0

                        Best Regards,

                        guy038

                        1 Reply Last reply Reply Quote 2
                        • First post
                          Last post
                        The Community of users of the Notepad++ text editor.
                        Powered by NodeBB | Contributors