• Login
Community
  • Login

help please, extract columns

Scheduled Pinned Locked Moved Help wanted · · · – – – · · ·
11 Posts 4 Posters 2.2k Views
Loading More Posts
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • N
    neagal
    last edited by neagal Nov 14, 2018, 1:15 AM Nov 14, 2018, 1:14 AM

    Hi all, I have a big text file and the data in it are in 7 columns, but how can I do to have two in two columns.
    ( 1 2 3 4 5 6 7 columns)
    “name”:"1 80.97.120.40 180 ms [n/a] 21 “type”:“A” @post
    “asn”:6876 90.97.117.177 134 ms [n/a] 20 “type”:“B” @post
    “asn”:23650 70.97.120.10 75 ms [n/a] 21 “type”:“A” @contact
    “ttl”:60 70.97.122.105 87 ms geodnsd.global.sonicwall.com 20 “type”:“C” @contact
    “asn”:15399 70.97.122.106 89 ms geodnsd.global.sonicwall.com 19 “type”:“A” @contact
    “ttl”:7896 70.97.122.107 95 ms www.psmythe.net 21 “type”:“B” @post
    “asn”:456 70.97.122.104 98 ms wasabi.prostructure.com 22 “type”:“C” @contact
    “name”:"376 70.97.122.108 89 ms server_ip 21 “type”:“B” @contact
    “asn”:15343 70.97.122.109 87 ms xcder.com 21 “type”:“A” @post

    I need results like this example
    1 2
    1 3
    1 4
    1 5
    1 6
    1 7

    2 1
    2 3
    2 4
    2 5
    2 6
    2 7

    and so on with the others

    the columns are separated, How could I do with regular expression mode?

    thanks

    S 1 Reply Last reply Nov 14, 2018, 1:27 AM Reply Quote 0
    • S
      Scott Sumner @neagal
      last edited by Nov 14, 2018, 1:27 AM

      @neagal

      If the columns are space-separated, I see EIGHT columns, not SEVEN.

      N 1 Reply Last reply Nov 14, 2018, 1:34 AM Reply Quote 0
      • N
        neagal @Scott Sumner
        last edited by Nov 14, 2018, 1:34 AM

        @Scott-Sumner

        “name”:"1 80.97.120.40 180ms [n/a] 21 “type”:“A” @post

        hello friend, are 7 columns

        1 Reply Last reply Reply Quote 0
        • T
          Terry R
          last edited by Terry R Nov 14, 2018, 1:44 AM Nov 14, 2018, 1:41 AM

          @neagal
          @Scott-Sumner said it’s a bit confusing, even more so now as it would appear you have provided different data to the original. So is the 180ms field definitely “180ms” or is it as you originally typed “180 ms”, where there is a space in the middle.

          I’m also a bit unsure of what you intend to happen as the 1 2 to 1 7 differs from the 2 1 to 2 7.
          If we consider each line separately I think what you want is a line with the 1st and 2nd field separated by a space, followed by a line with 1st and 3rd columns, through to the 1st and 7th column. This would be repeated for every original line of data, is this correct?

          So using your original data I produced this result:

          “name”:"1  80.97.120.40 
          “name”:"1  180 ms 
          “name”:"1  [n/a] 
          “name”:"1  21 
          “name”:"1  “type”:“A” 
          “name”:"1  @post
          “asn”:6876  90.97.117.177 
          “asn”:6876  134 ms 
          “asn”:6876  [n/a] 
          “asn”:6876  20 
          “asn”:6876  “type”:“B” 
          “asn”:6876  @post
          “asn”:23650  70.97.120.10 
          “asn”:23650  75 ms 
          “asn”:23650  [n/a] 
          “asn”:23650  21 
          “asn”:23650  “type”:“A” 
          “asn”:23650  @contact
          “ttl”:60  70.97.122.105 
          “ttl”:60  87 ms 
          “ttl”:60  geodnsd.global.sonicwall.com 
          “ttl”:60  20 
          “ttl”:60  “type”:“C” 
          “ttl”:60  @contact
          “asn”:15399  70.97.122.106 
          “asn”:15399  89 ms 
          “asn”:15399  geodnsd.global.sonicwall.com 
          “asn”:15399  19 
          “asn”:15399  “type”:“A” 
          “asn”:15399  @contact
          “ttl”:7896  70.97.122.107 
          “ttl”:7896  95 ms 
          “ttl”:7896  www.psmythe.net 
          “ttl”:7896  21 
          “ttl”:7896  “type”:“B” 
          “ttl”:7896  @post
          “asn”:456  70.97.122.104 
          “asn”:456  98 ms 
          “asn”:456  wasabi.prostructure.com 
          “asn”:456  22 
          “asn”:456  “type”:“C” 
          “asn”:456  @contact
          “name”:"376  70.97.122.108 
          “name”:"376  89 ms 
          “name”:"376  server_ip 
          “name”:"376  21 
          “name”:"376  “type”:“B” 
          “name”:"376  @contact
          “asn”:15343  70.97.122.109 
          “asn”:15343  87 ms 
          “asn”:15343  xcder.com 
          “asn”:15343  21 
          “asn”:15343  “type”:“A” 
          “asn”:15343  @post
          

          Terry

          N 1 Reply Last reply Nov 14, 2018, 2:06 AM Reply Quote 2
          • N
            neagal @Terry R
            last edited by Nov 14, 2018, 2:06 AM

            @Terry-R said:

            If we consider each line separately I think what you want is a line with the 1st and 2nd field separated by a space, followed by a line with 1st and 3rd columns, through to the 1st and 7th column. This would be repeated for every original line of data, is this correct?

            it is right

            1 Reply Last reply Reply Quote 0
            • T
              Terry R
              last edited by Nov 14, 2018, 2:09 AM

              @neagal
              There’s still 1 unanswered question, is the 180ms field as you originally typed it or as you typed it in response to @Scott-Sumner question?
              So is it:

              180 ms
              or
              180ms
              

              Terry

              N 1 Reply Last reply Nov 14, 2018, 2:29 AM Reply Quote 0
              • N
                neagal @Terry R
                last edited by Nov 14, 2018, 2:29 AM

                @Terry-R said:

                180ms

                180ms

                1 Reply Last reply Reply Quote 0
                • T
                  Terry R
                  last edited by Terry R Nov 14, 2018, 2:42 AM Nov 14, 2018, 2:41 AM

                  @neagal
                  Before we start on this we need to add a final blank line to the original data, as my regex (regular expression) expects one at the end of every line.

                  So first we use the Replace function with
                  Find What:\z
                  Replace With:\r\n

                  For all the regex’s provided we need the search mode set to “regular expression” and “wrap around” ticked. You can hit the replace button, or replace all button, it will only find 1 occurrence and add a blank line at the end of the data. You may want to confirm that by checking the bottom of Notepad++ where it shows a count of lines, see if it increases by 1 after running the first regex.

                  So the Replace function to alter your data is.
                  Find What:^([^ ]+?) ([^ ]+?) ([^ ]+ms) ([^ ]+?) ([^ ]+?) ([^ ]+?) ([^ ]+?)(\R)

                  For the Replace field I also have a question, do you want a blank line between each set of fields. So all 6 lines of the original line 1 followed by a blank line, then 6 lines of original line 2, repeated?
                  If blank line between sets then
                  Replace With:\1 \2\8\1 \3\8\1 \4\8\1 \5\8\1 \6\8\1 \7\8\8
                  If no blank line between then
                  Replace With:\1 \2\8\1 \3\8\1 \4\8\1 \5\8\1 \6\8\1 \7\8
                  As before, set search mode to “regular expression” and “wrap around” ticked. Click once on “replace all”, then check the resulting data. You can see the important stuff is outlined in red text with a shadow background. This text can be copied directly using your keyboard or mouse, so no need to type each character individually.

                  Let us know how you got on. If any further issues then you may need to elaborate on the data used, or at least provide some of the examples where the regex did NOT work, so we can amend it to suit.

                  Finally remove that last blank line.

                  Terry

                  1 Reply Last reply Reply Quote 0
                  • T
                    Terry R
                    last edited by Nov 14, 2018, 5:46 AM

                    @neagal

                    The following regex, whilst not wrong has “ms” included which isn’t necessary. I had that in originally as I believed your data was of the type “180 ms”. When you confirmed the space in the middle was not actually there I did not remove it efficiently.

                    ^([^ ]+?) ([^ ]+?) ([^ ]+ms) ([^ ]+?) ([^ ]+?) ([^ ]+?) ([^ ]+?)(\R)

                    You will notice all other instances are ([^ ]+?), the one with “ms” could actually be the same. I just did not notice it when posting. I thought it would be best to advertise that fact now so you don’t get the wrong impression of how the regex works and be confused over why I had the “ms” in that instance, yet no other characters in other instances.

                    Terry

                    1 Reply Last reply Reply Quote 1
                    • S
                      Scott Sumner
                      last edited by Nov 14, 2018, 1:08 PM

                      Too many questions…too much obscurity…not only do you have to solve the OP’s problem…you have to straighten the OP out to get a coherent problem presented…good that you have a lot of endurance, @Terry-R !

                      1 Reply Last reply Reply Quote 0
                      • G
                        guy038
                        last edited by Nov 14, 2018, 2:14 PM

                        Hi, @neagal, @Terry-R, @scott-sumner and All,

                        • About a possible space char, before the ms string, you can delete it, to get a coherent field, with the regex S/R :

                          • SEARCH (?-i)(?<=\d)\x20(?=ms)

                          • REPLACE Leave EMPTY

                        • Now, if we use lazy quantifiers, the Terry’s regex S/R, with a line-break between each block, turns to :

                          • SEARCH (?-s)^(.+?\x20)(.+?)\x20(.+?)\x20(.+?)\x20(.+?)\x20(.+?)\x20(.+)(\R)

                          • REPLACE \1\2\8\1\3\8\1\4\8\1\5\8\1\6\8\1\7\8\8

                        • Eventually, if necessary, you can move back to the #### ms syntax, with a space between, with the regex S/R :

                          • SEARCH ms$

                          • REPLACE \x20$0

                        Best Regards,

                        guy038

                        1 Reply Last reply Reply Quote 2
                        8 out of 11
                        • First post
                          8/11
                          Last post
                        The Community of users of the Notepad++ text editor.
                        Powered by NodeBB | Contributors