Duplicate rows according to whats in a column?

greg marshall

Hi all,

I have a list of CIDR networks. On each line, it includes a list of the subnets that are included. This is in a CSV. for example:

network, datacenter,environment,1,2,3,4,5,6,7,8
10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,

There are about 800 lines like this.

What i need to end up with, is one line for each of the subnets. Those are in the last 8 columns. Like this
10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.64.0/24,ABC,Prod,10.16.65.0
10.16.64.0/24,ABC,Prod,10.16.66.0
10.16.64.0/24,ABC,Prod,10.16.67.0

Or if easier, just
ABC,Prod,10.16.64.0
ABC,Prod,10.16.65.0
ABC,Prod,10.16.66.0
ABC,Prod,10.16.67.0

This is going to be used as a lookup table to match any IP in the network with the first 3 octets in this list and return the data center and environment.

Since this is going into a vendor tool, I don’t have the luxury of using a function /script/program to return the values programmatically. I need a lookup list.

Any tips on how I might accomplish this with NPP? It’s odd I know :)

Thanks!

guy038

Hello Greg,

Of course, a Python, Lua , NppExec or Gawk script would be the best to achieve that task. However, with two S/R, we could get it right !

From your example, with the first header line, below :

network, datacenter,environment,1,2,3,4,5,6,7,8

it seems that each line, of your file, may have, after the Environment field, between 1 and 8 addresses IPV4 !

So, I consider the example text below :

10.16.64.0/24,ABC,Prod,10.16.64.0,
10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,
10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,
10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,
10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,
10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,
10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,10.16.70.0,
10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,10.16.70.0,10.16.71.0,

If I suppose that the fields 2 and 3 ( datacenter and environment ) begin, always, with a letter, then the following S/R, performed ONCE, only, in Regular expression mode, by clicking on the Replace All button

SEARCH = (?<=\d),(?=\d)|(,\R)

REPLACE = \r\n(?1\r\n)

would change the text, as below :

10.16.64.0/24,ABC,Prod,10.16.64.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.65.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.65.0
10.16.66.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.65.0
10.16.66.0
10.16.67.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.65.0
10.16.66.0
10.16.67.0
10.16.68.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.65.0
10.16.66.0
10.16.67.0
10.16.68.0
10.16.69.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.65.0
10.16.66.0
10.16.67.0
10.16.68.0
10.16.69.0
10.16.70.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.65.0
10.16.66.0
10.16.67.0
10.16.68.0
10.16.69.0
10.16.70.0
10.16.71.0

Then, with this second S/R, below, you should get your text as you like to !

SEARCH = (?-s)(.+?,.+?,.+?,).+\R(?=[^,\r\n]+\R)

REPLACE = $0\1

IMPORTANT :

You’ll have to perform this second S/R, MANY times, till no more occurrence can be found ! That is to say that you’ll just have to click on the Replace All button, repeatedly, at least eight times, as you may have up to 8 addresses, in your file !

So, after eight consecutive replacements, we obtain the final text below :

10.16.64.0/24,ABC,Prod,10.16.64.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.64.0/24,ABC,Prod,10.16.65.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.64.0/24,ABC,Prod,10.16.65.0
10.16.64.0/24,ABC,Prod,10.16.66.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.64.0/24,ABC,Prod,10.16.65.0
10.16.64.0/24,ABC,Prod,10.16.66.0
10.16.64.0/24,ABC,Prod,10.16.67.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.64.0/24,ABC,Prod,10.16.65.0
10.16.64.0/24,ABC,Prod,10.16.66.0
10.16.64.0/24,ABC,Prod,10.16.67.0
10.16.64.0/24,ABC,Prod,10.16.68.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.64.0/24,ABC,Prod,10.16.65.0
10.16.64.0/24,ABC,Prod,10.16.66.0
10.16.64.0/24,ABC,Prod,10.16.67.0
10.16.64.0/24,ABC,Prod,10.16.68.0
10.16.64.0/24,ABC,Prod,10.16.69.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.64.0/24,ABC,Prod,10.16.65.0
10.16.64.0/24,ABC,Prod,10.16.66.0
10.16.64.0/24,ABC,Prod,10.16.67.0
10.16.64.0/24,ABC,Prod,10.16.68.0
10.16.64.0/24,ABC,Prod,10.16.69.0
10.16.64.0/24,ABC,Prod,10.16.70.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.64.0/24,ABC,Prod,10.16.65.0
10.16.64.0/24,ABC,Prod,10.16.66.0
10.16.64.0/24,ABC,Prod,10.16.67.0
10.16.64.0/24,ABC,Prod,10.16.68.0
10.16.64.0/24,ABC,Prod,10.16.69.0
10.16.64.0/24,ABC,Prod,10.16.70.0
10.16.64.0/24,ABC,Prod,10.16.71.0

Best Regards,

But, presently, I must go to bed ! So, just be patient, till tomorrow, for some explanations, about these regexes !

Best Regards,

guy038

Joe Furlong

What about using a find and replace with a regex? (I’m not a regex expert, but this appeared to work for me. )
Simple example. (In the replace box the “Search Mode” is “Regular expression”)
Set the “Find what:” to:
^(.?,.?,.?),(.?),(.?),(.?),(.?),(.?),(.?),(.?),(.*?)$
Set the “Replace with :” to:
\1,\2\r\n\1,\3\r\n\1,\4\r\n\1,\5\r\n\1,\6\r\n\1,\7\r\n\1,\8\r\n\1,\9

Then press the “Replace” or “Replace All” button.

Hope this helps.

guy038

Hi Greg and Joe,

Joe, concerning the SEARCH regex, I suppose that you meant, ( with some more star meta-characters ! ) :

SEARCH = ^(.*?,.*?,.*?),(.*?),(.*?),(.*?),(.*?),(.*?),(.*?),(.*?),(.*?),$

Then, with :

REPLACE = \1,\2\r\n\1,\3\r\n\1,\4\r\n\1,\5\r\n\1,\6\r\n\1,\7\r\n\1,\8\r\n\1,\9

it would, indeed, match the line, with 8 addresses, below

10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,10.16.70.0,10.16.71.0,

and changed it into the block of 8 lines, below :

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.64.0/24,ABC,Prod,10.16.65.0
10.16.64.0/24,ABC,Prod,10.16.66.0
10.16.64.0/24,ABC,Prod,10.16.67.0
10.16.64.0/24,ABC,Prod,10.16.68.0
10.16.64.0/24,ABC,Prod,10.16.69.0
10.16.64.0/24,ABC,Prod,10.16.70.0
10.16.64.0/24,ABC,Prod,10.16.71.0

Unfortunately, if a line, of Greg’s text, has less than eight addresses, at the end, your regex would not work :-((

But, thanks to you, you put me in the good direction and gave me an idea for getting all the work done, with ONE regex ONLY !!

So, I start with the example text, below, which contains all possible cases, from no address to 8 addresses, after the environment field, of each line :

10.16.64.0/24,ABC,Prod,
10.16.64.0/24,ABC,Prod,10.16.64.0,
10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,
10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,
10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,
10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,
10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,
10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,10.16.70.0,
10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,10.16.70.0,10.16.71.0,

The following S/R, with search mode = Regular expression, will realize everything, after clicking, ONCE only, on the Replace All button !!

SEARCH = (.*?,.*?,.*?,)(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?

REPLACE = \1(?2\2)\r\n(?3\1\3\r\n)(?4\1\4\r\n)(?5\1\5\r\n)(?6\1\6\r\n)(?7\1\7\r\n)(?8\1\8\r\n)(?9\1\9\r\n)

So, in one go, the resulting text would be :

10.16.64.0/24,ABC,Prod,

10.16.64.0/24,ABC,Prod,10.16.64.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.64.0/24,ABC,Prod,10.16.65.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.64.0/24,ABC,Prod,10.16.65.0
10.16.64.0/24,ABC,Prod,10.16.66.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.64.0/24,ABC,Prod,10.16.65.0
10.16.64.0/24,ABC,Prod,10.16.66.0
10.16.64.0/24,ABC,Prod,10.16.67.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.64.0/24,ABC,Prod,10.16.65.0
10.16.64.0/24,ABC,Prod,10.16.66.0
10.16.64.0/24,ABC,Prod,10.16.67.0
10.16.64.0/24,ABC,Prod,10.16.68.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.64.0/24,ABC,Prod,10.16.65.0
10.16.64.0/24,ABC,Prod,10.16.66.0
10.16.64.0/24,ABC,Prod,10.16.67.0
10.16.64.0/24,ABC,Prod,10.16.68.0
10.16.64.0/24,ABC,Prod,10.16.69.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.64.0/24,ABC,Prod,10.16.65.0
10.16.64.0/24,ABC,Prod,10.16.66.0
10.16.64.0/24,ABC,Prod,10.16.67.0
10.16.64.0/24,ABC,Prod,10.16.68.0
10.16.64.0/24,ABC,Prod,10.16.69.0
10.16.64.0/24,ABC,Prod,10.16.70.0

10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.64.0/24,ABC,Prod,10.16.65.0
10.16.64.0/24,ABC,Prod,10.16.66.0
10.16.64.0/24,ABC,Prod,10.16.67.0
10.16.64.0/24,ABC,Prod,10.16.68.0
10.16.64.0/24,ABC,Prod,10.16.69.0
10.16.64.0/24,ABC,Prod,10.16.70.0
10.16.64.0/24,ABC,Prod,10.16.71.0

Et voilà ! Well, although I practised regexes, since many years, I, always, surprised by their incredible power !!

Notes :

If the original line is only 10.16.64.0/24,ABC,Prod,, without any address, after, the S/R just rewrites that line
In the search regex, the 8 parts (.*?) are groups from 2 to 9, each containing a possible address, which must be rewritten, in replacement
These 8 groups, followed by a comma, are embedded in an optional non-capturing group (?:(.*?),)? ( the group may be present [1] or absent [0] )
At the beginning, the part (.*?,.*?,.*?,) looks for the first three fields : network, datacenter and environment, followed by a comma
In replacement, the syntax (?#\1\#\r\n) is called a conditional replacement. It means that IF the group # exists, then it rewrites the group 1, followed by the group #, followed by the Windows EOL characters \r\n
At the beginning, the part \1(?2\2)\r\n writes, first, the group 1. Then it writes the group 2, ONLY IF it exists ( the first IPV4 address ), and, finally rewrites the Windows EOL characters \r\n

Remarks :

Contrary to my previous post, with this new S/R, there is no condition, any more, about the contents of fields 2 and 3 ! They could be, for instance : 10.16.64.0/24,99999,00000000,… :-)
Any possible leading indentation, of a line, is just kept, by this S/R !
You can click, either, on the Replace All button AND/OR on the Replace button, for a step by step replacement :-)

Cheers,

guy038