• Login
Community
  • Login

Duplicate rows according to whats in a column?

Scheduled Pinned Locked Moved Help wanted · · · – – – · · ·
4 Posts 3 Posters 2.6k Views
Loading More Posts
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • G
    greg marshall
    last edited by Sep 14, 2016, 1:40 AM

    Hi all,

    I have a list of CIDR networks. On each line, it includes a list of the subnets that are included. This is in a CSV. for example:

    network, datacenter,environment,1,2,3,4,5,6,7,8
    10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,

    There are about 800 lines like this.

    What i need to end up with, is one line for each of the subnets. Those are in the last 8 columns. Like this
    10.16.64.0/24,ABC,Prod,10.16.64.0
    10.16.64.0/24,ABC,Prod,10.16.65.0
    10.16.64.0/24,ABC,Prod,10.16.66.0
    10.16.64.0/24,ABC,Prod,10.16.67.0

    Or if easier, just
    ABC,Prod,10.16.64.0
    ABC,Prod,10.16.65.0
    ABC,Prod,10.16.66.0
    ABC,Prod,10.16.67.0

    This is going to be used as a lookup table to match any IP in the network with the first 3 octets in this list and return the data center and environment.

    Since this is going into a vendor tool, I don’t have the luxury of using a function /script/program to return the values programmatically. I need a lookup list.

    Any tips on how I might accomplish this with NPP? It’s odd I know :)

    Thanks!

    1 Reply Last reply Reply Quote 0
    • G
      guy038
      last edited by guy038 Sep 14, 2016, 3:36 AM Sep 14, 2016, 3:29 AM

      Hello Greg,

      Of course, a Python, Lua , NppExec or Gawk script would be the best to achieve that task. However, with two S/R, we could get it right !

      From your example, with the first header line, below :

      network, datacenter,environment,1,2,3,4,5,6,7,8

      it seems that each line, of your file, may have, after the Environment field, between 1 and 8 addresses IPV4 !

      So, I consider the example text below :

      10.16.64.0/24,ABC,Prod,10.16.64.0,
      10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,
      10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,
      10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,
      10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,
      10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,
      10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,10.16.70.0,
      10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,10.16.70.0,10.16.71.0,
      

      If I suppose that the fields 2 and 3 ( datacenter and environment ) begin, always, with a letter, then the following S/R, performed ONCE, only, in Regular expression mode, by clicking on the Replace All button

      SEARCH = (?<=\d),(?=\d)|(,\R)

      REPLACE = \r\n(?1\r\n)

      would change the text, as below :

      10.16.64.0/24,ABC,Prod,10.16.64.0
      
      10.16.64.0/24,ABC,Prod,10.16.64.0
      10.16.65.0
      
      10.16.64.0/24,ABC,Prod,10.16.64.0
      10.16.65.0
      10.16.66.0
      
      10.16.64.0/24,ABC,Prod,10.16.64.0
      10.16.65.0
      10.16.66.0
      10.16.67.0
      
      10.16.64.0/24,ABC,Prod,10.16.64.0
      10.16.65.0
      10.16.66.0
      10.16.67.0
      10.16.68.0
      
      10.16.64.0/24,ABC,Prod,10.16.64.0
      10.16.65.0
      10.16.66.0
      10.16.67.0
      10.16.68.0
      10.16.69.0
      
      10.16.64.0/24,ABC,Prod,10.16.64.0
      10.16.65.0
      10.16.66.0
      10.16.67.0
      10.16.68.0
      10.16.69.0
      10.16.70.0
      
      10.16.64.0/24,ABC,Prod,10.16.64.0
      10.16.65.0
      10.16.66.0
      10.16.67.0
      10.16.68.0
      10.16.69.0
      10.16.70.0
      10.16.71.0
      

      Then, with this second S/R, below, you should get your text as you like to !

      SEARCH = (?-s)(.+?,.+?,.+?,).+\R(?=[^,\r\n]+\R)

      REPLACE = $0\1

      IMPORTANT :

      You’ll have to perform this second S/R, MANY times, till no more occurrence can be found ! That is to say that you’ll just have to click on the Replace All button, repeatedly, at least eight times, as you may have up to 8 addresses, in your file !

      So, after eight consecutive replacements, we obtain the final text below :

      10.16.64.0/24,ABC,Prod,10.16.64.0
      
      10.16.64.0/24,ABC,Prod,10.16.64.0
      10.16.64.0/24,ABC,Prod,10.16.65.0
      
      10.16.64.0/24,ABC,Prod,10.16.64.0
      10.16.64.0/24,ABC,Prod,10.16.65.0
      10.16.64.0/24,ABC,Prod,10.16.66.0
      
      10.16.64.0/24,ABC,Prod,10.16.64.0
      10.16.64.0/24,ABC,Prod,10.16.65.0
      10.16.64.0/24,ABC,Prod,10.16.66.0
      10.16.64.0/24,ABC,Prod,10.16.67.0
      
      10.16.64.0/24,ABC,Prod,10.16.64.0
      10.16.64.0/24,ABC,Prod,10.16.65.0
      10.16.64.0/24,ABC,Prod,10.16.66.0
      10.16.64.0/24,ABC,Prod,10.16.67.0
      10.16.64.0/24,ABC,Prod,10.16.68.0
      
      10.16.64.0/24,ABC,Prod,10.16.64.0
      10.16.64.0/24,ABC,Prod,10.16.65.0
      10.16.64.0/24,ABC,Prod,10.16.66.0
      10.16.64.0/24,ABC,Prod,10.16.67.0
      10.16.64.0/24,ABC,Prod,10.16.68.0
      10.16.64.0/24,ABC,Prod,10.16.69.0
      
      10.16.64.0/24,ABC,Prod,10.16.64.0
      10.16.64.0/24,ABC,Prod,10.16.65.0
      10.16.64.0/24,ABC,Prod,10.16.66.0
      10.16.64.0/24,ABC,Prod,10.16.67.0
      10.16.64.0/24,ABC,Prod,10.16.68.0
      10.16.64.0/24,ABC,Prod,10.16.69.0
      10.16.64.0/24,ABC,Prod,10.16.70.0
      
      10.16.64.0/24,ABC,Prod,10.16.64.0
      10.16.64.0/24,ABC,Prod,10.16.65.0
      10.16.64.0/24,ABC,Prod,10.16.66.0
      10.16.64.0/24,ABC,Prod,10.16.67.0
      10.16.64.0/24,ABC,Prod,10.16.68.0
      10.16.64.0/24,ABC,Prod,10.16.69.0
      10.16.64.0/24,ABC,Prod,10.16.70.0
      10.16.64.0/24,ABC,Prod,10.16.71.0
      

      Best Regards,

      But, presently, I must go to bed ! So, just be patient, till tomorrow, for some explanations, about these regexes !

      Best Regards,

      guy038

      1 Reply Last reply Reply Quote 0
      • J
        Joe Furlong
        last edited by Sep 14, 2016, 3:51 AM

        What about using a find and replace with a regex? (I’m not a regex expert, but this appeared to work for me. )
        Simple example. (In the replace box the “Search Mode” is “Regular expression”)
        Set the “Find what:” to:
        ^(.?,.?,.?),(.?),(.?),(.?),(.?),(.?),(.?),(.?),(.*?)$
        Set the “Replace with :” to:
        \1,\2\r\n\1,\3\r\n\1,\4\r\n\1,\5\r\n\1,\6\r\n\1,\7\r\n\1,\8\r\n\1,\9

        Then press the “Replace” or “Replace All” button.

        Hope this helps.

        1 Reply Last reply Reply Quote 0
        • G
          guy038
          last edited by guy038 Sep 14, 2016, 8:57 PM Sep 14, 2016, 7:02 PM

          Hi Greg and Joe,

          Joe, concerning the SEARCH regex, I suppose that you meant, ( with some more star meta-characters ! ) :

          SEARCH = ^(.*?,.*?,.*?),(.*?),(.*?),(.*?),(.*?),(.*?),(.*?),(.*?),(.*?),$

          Then, with :

          REPLACE = \1,\2\r\n\1,\3\r\n\1,\4\r\n\1,\5\r\n\1,\6\r\n\1,\7\r\n\1,\8\r\n\1,\9

          it would, indeed, match the line, with 8 addresses, below

          10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,10.16.70.0,10.16.71.0,
          

          and changed it into the block of 8 lines, below :

          10.16.64.0/24,ABC,Prod,10.16.64.0
          10.16.64.0/24,ABC,Prod,10.16.65.0
          10.16.64.0/24,ABC,Prod,10.16.66.0
          10.16.64.0/24,ABC,Prod,10.16.67.0
          10.16.64.0/24,ABC,Prod,10.16.68.0
          10.16.64.0/24,ABC,Prod,10.16.69.0
          10.16.64.0/24,ABC,Prod,10.16.70.0
          10.16.64.0/24,ABC,Prod,10.16.71.0
          

          Unfortunately, if a line, of Greg’s text, has less than eight addresses, at the end, your regex would not work :-((

          But, thanks to you, you put me in the good direction and gave me an idea for getting all the work done, with ONE regex ONLY !!

          So, I start with the example text, below, which contains all possible cases, from no address to 8 addresses, after the environment field, of each line :

          10.16.64.0/24,ABC,Prod,
          10.16.64.0/24,ABC,Prod,10.16.64.0,
          10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,
          10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,
          10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,
          10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,
          10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,
          10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,10.16.70.0,
          10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,10.16.70.0,10.16.71.0,
          

          The following S/R, with search mode = Regular expression, will realize everything, after clicking, ONCE only, on the Replace All button !!

          SEARCH = (.*?,.*?,.*?,)(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?

          REPLACE = \1(?2\2)\r\n(?3\1\3\r\n)(?4\1\4\r\n)(?5\1\5\r\n)(?6\1\6\r\n)(?7\1\7\r\n)(?8\1\8\r\n)(?9\1\9\r\n)

          So, in one go, the resulting text would be :

          10.16.64.0/24,ABC,Prod,
          
          10.16.64.0/24,ABC,Prod,10.16.64.0
          
          10.16.64.0/24,ABC,Prod,10.16.64.0
          10.16.64.0/24,ABC,Prod,10.16.65.0
          
          10.16.64.0/24,ABC,Prod,10.16.64.0
          10.16.64.0/24,ABC,Prod,10.16.65.0
          10.16.64.0/24,ABC,Prod,10.16.66.0
          
          10.16.64.0/24,ABC,Prod,10.16.64.0
          10.16.64.0/24,ABC,Prod,10.16.65.0
          10.16.64.0/24,ABC,Prod,10.16.66.0
          10.16.64.0/24,ABC,Prod,10.16.67.0
          
          10.16.64.0/24,ABC,Prod,10.16.64.0
          10.16.64.0/24,ABC,Prod,10.16.65.0
          10.16.64.0/24,ABC,Prod,10.16.66.0
          10.16.64.0/24,ABC,Prod,10.16.67.0
          10.16.64.0/24,ABC,Prod,10.16.68.0
          
          10.16.64.0/24,ABC,Prod,10.16.64.0
          10.16.64.0/24,ABC,Prod,10.16.65.0
          10.16.64.0/24,ABC,Prod,10.16.66.0
          10.16.64.0/24,ABC,Prod,10.16.67.0
          10.16.64.0/24,ABC,Prod,10.16.68.0
          10.16.64.0/24,ABC,Prod,10.16.69.0
          
          10.16.64.0/24,ABC,Prod,10.16.64.0
          10.16.64.0/24,ABC,Prod,10.16.65.0
          10.16.64.0/24,ABC,Prod,10.16.66.0
          10.16.64.0/24,ABC,Prod,10.16.67.0
          10.16.64.0/24,ABC,Prod,10.16.68.0
          10.16.64.0/24,ABC,Prod,10.16.69.0
          10.16.64.0/24,ABC,Prod,10.16.70.0
          
          10.16.64.0/24,ABC,Prod,10.16.64.0
          10.16.64.0/24,ABC,Prod,10.16.65.0
          10.16.64.0/24,ABC,Prod,10.16.66.0
          10.16.64.0/24,ABC,Prod,10.16.67.0
          10.16.64.0/24,ABC,Prod,10.16.68.0
          10.16.64.0/24,ABC,Prod,10.16.69.0
          10.16.64.0/24,ABC,Prod,10.16.70.0
          10.16.64.0/24,ABC,Prod,10.16.71.0
          

          Et voilà ! Well, although I practised regexes, since many years, I, always, surprised by their incredible power !!


          Notes :

          • If the original line is only 10.16.64.0/24,ABC,Prod,, without any address, after, the S/R just rewrites that line

          • In the search regex, the 8 parts (.*?) are groups from 2 to 9, each containing a possible address, which must be rewritten, in replacement

          • These 8 groups, followed by a comma, are embedded in an optional non-capturing group (?:(.*?),)? ( the group may be present [1] or absent [0] )

          • At the beginning, the part (.*?,.*?,.*?,) looks for the first three fields : network, datacenter and environment, followed by a comma

          • In replacement, the syntax (?#\1\#\r\n) is called a conditional replacement. It means that IF the group # exists, then it rewrites the group 1, followed by the group #, followed by the Windows EOL characters \r\n

          • At the beginning, the part \1(?2\2)\r\n writes, first, the group 1. Then it writes the group 2, ONLY IF it exists ( the first IPV4 address ), and, finally rewrites the Windows EOL characters \r\n

          Remarks :

          • Contrary to my previous post, with this new S/R, there is no condition, any more, about the contents of fields 2 and 3 ! They could be, for instance : 10.16.64.0/24,99999,00000000,… :-)

          • Any possible leading indentation, of a line, is just kept, by this S/R !

          • You can click, either, on the Replace All button AND/OR on the Replace button, for a step by step replacement :-)

          Cheers,

          guy038

          1 Reply Last reply Reply Quote 0
          4 out of 4
          • First post
            4/4
            Last post
          The Community of users of the Notepad++ text editor.
          Powered by NodeBB | Contributors