Duplicate rows according to whats in a column?
-
Hi all,
I have a list of CIDR networks. On each line, it includes a list of the subnets that are included. This is in a CSV. for example:
network, datacenter,environment,1,2,3,4,5,6,7,8
10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,There are about 800 lines like this.
What i need to end up with, is one line for each of the subnets. Those are in the last 8 columns. Like this
10.16.64.0/24,ABC,Prod,10.16.64.0
10.16.64.0/24,ABC,Prod,10.16.65.0
10.16.64.0/24,ABC,Prod,10.16.66.0
10.16.64.0/24,ABC,Prod,10.16.67.0Or if easier, just
ABC,Prod,10.16.64.0
ABC,Prod,10.16.65.0
ABC,Prod,10.16.66.0
ABC,Prod,10.16.67.0This is going to be used as a lookup table to match any IP in the network with the first 3 octets in this list and return the data center and environment.
Since this is going into a vendor tool, I don’t have the luxury of using a function /script/program to return the values programmatically. I need a lookup list.
Any tips on how I might accomplish this with NPP? It’s odd I know :)
Thanks!
-
Hello Greg,
Of course, a Python, Lua , NppExec or Gawk script would be the best to achieve that task. However, with two S/R, we could get it right !
From your example, with the first header line, below :
network, datacenter,environment,1,2,3,4,5,6,7,8
it seems that each line, of your file, may have, after the
Environment
field, between 1 and 8 addresses IPV4 !So, I consider the example text below :
10.16.64.0/24,ABC,Prod,10.16.64.0, 10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0, 10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0, 10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0, 10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0, 10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0, 10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,10.16.70.0, 10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,10.16.70.0,10.16.71.0,
If I suppose that the fields 2 and 3 (
datacenter
andenvironment
) begin, always, with a letter, then the following S/R, performedONCE
, only, in Regular expression mode, by clicking on the Replace All buttonSEARCH =
(?<=\d),(?=\d)|(,\R)
REPLACE =
\r\n(?1\r\n)
would change the text, as below :
10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.65.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.65.0 10.16.66.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.65.0 10.16.66.0 10.16.67.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.65.0 10.16.66.0 10.16.67.0 10.16.68.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.65.0 10.16.66.0 10.16.67.0 10.16.68.0 10.16.69.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.65.0 10.16.66.0 10.16.67.0 10.16.68.0 10.16.69.0 10.16.70.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.65.0 10.16.66.0 10.16.67.0 10.16.68.0 10.16.69.0 10.16.70.0 10.16.71.0
Then, with this second S/R, below, you should get your text as you like to !
SEARCH =
(?-s)(.+?,.+?,.+?,).+\R(?=[^,\r\n]+\R)
REPLACE =
$0\1
IMPORTANT :
You’ll have to perform this second S/R,
MANY times
, till no more occurrence can be found ! That is to say that you’ll just have to click on the Replace All button, repeatedly, at least eight times, as you may have up to 8 addresses, in your file !So, after eight consecutive replacements, we obtain the final text below :
10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.65.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.65.0 10.16.64.0/24,ABC,Prod,10.16.66.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.65.0 10.16.64.0/24,ABC,Prod,10.16.66.0 10.16.64.0/24,ABC,Prod,10.16.67.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.65.0 10.16.64.0/24,ABC,Prod,10.16.66.0 10.16.64.0/24,ABC,Prod,10.16.67.0 10.16.64.0/24,ABC,Prod,10.16.68.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.65.0 10.16.64.0/24,ABC,Prod,10.16.66.0 10.16.64.0/24,ABC,Prod,10.16.67.0 10.16.64.0/24,ABC,Prod,10.16.68.0 10.16.64.0/24,ABC,Prod,10.16.69.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.65.0 10.16.64.0/24,ABC,Prod,10.16.66.0 10.16.64.0/24,ABC,Prod,10.16.67.0 10.16.64.0/24,ABC,Prod,10.16.68.0 10.16.64.0/24,ABC,Prod,10.16.69.0 10.16.64.0/24,ABC,Prod,10.16.70.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.65.0 10.16.64.0/24,ABC,Prod,10.16.66.0 10.16.64.0/24,ABC,Prod,10.16.67.0 10.16.64.0/24,ABC,Prod,10.16.68.0 10.16.64.0/24,ABC,Prod,10.16.69.0 10.16.64.0/24,ABC,Prod,10.16.70.0 10.16.64.0/24,ABC,Prod,10.16.71.0
Best Regards,
But, presently, I must go to bed ! So, just be patient, till tomorrow, for some explanations, about these regexes !
Best Regards,
guy038
-
What about using a find and replace with a regex? (I’m not a regex expert, but this appeared to work for me. )
Simple example. (In the replace box the “Search Mode” is “Regular expression”)
Set the “Find what:” to:
^(.?,.?,.?),(.?),(.?),(.?),(.?),(.?),(.?),(.?),(.*?)$
Set the “Replace with :” to:
\1,\2\r\n\1,\3\r\n\1,\4\r\n\1,\5\r\n\1,\6\r\n\1,\7\r\n\1,\8\r\n\1,\9Then press the “Replace” or “Replace All” button.
Hope this helps.
-
Hi Greg and Joe,
Joe, concerning the SEARCH regex, I suppose that you meant, ( with some more star meta-characters ! ) :
SEARCH =
^(.*?,.*?,.*?),(.*?),(.*?),(.*?),(.*?),(.*?),(.*?),(.*?),(.*?),$
Then, with :
REPLACE =
\1,\2\r\n\1,\3\r\n\1,\4\r\n\1,\5\r\n\1,\6\r\n\1,\7\r\n\1,\8\r\n\1,\9
it would, indeed, match the line, with 8 addresses, below
10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,10.16.70.0,10.16.71.0,
and changed it into the block of 8 lines, below :
10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.65.0 10.16.64.0/24,ABC,Prod,10.16.66.0 10.16.64.0/24,ABC,Prod,10.16.67.0 10.16.64.0/24,ABC,Prod,10.16.68.0 10.16.64.0/24,ABC,Prod,10.16.69.0 10.16.64.0/24,ABC,Prod,10.16.70.0 10.16.64.0/24,ABC,Prod,10.16.71.0
Unfortunately, if a line, of Greg’s text, has less than eight addresses, at the end, your regex would not work :-((
But, thanks to you, you put me in the good direction and gave me an idea for getting all the work done, with ONE regex ONLY !!
So, I start with the example text, below, which contains all possible cases, from no address to 8 addresses, after the environment field, of each line :
10.16.64.0/24,ABC,Prod, 10.16.64.0/24,ABC,Prod,10.16.64.0, 10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0, 10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0, 10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0, 10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0, 10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0, 10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,10.16.70.0, 10.16.64.0/24,ABC,Prod,10.16.64.0,10.16.65.0,10.16.66.0,10.16.67.0,10.16.68.0,10.16.69.0,10.16.70.0,10.16.71.0,
The following S/R, with search mode = Regular expression, will realize everything, after clicking, ONCE only, on the Replace All button !!
SEARCH =
(.*?,.*?,.*?,)(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?(?:(.*?),)?
REPLACE =
\1(?2\2)\r\n(?3\1\3\r\n)(?4\1\4\r\n)(?5\1\5\r\n)(?6\1\6\r\n)(?7\1\7\r\n)(?8\1\8\r\n)(?9\1\9\r\n)
So, in one go, the resulting text would be :
10.16.64.0/24,ABC,Prod, 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.65.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.65.0 10.16.64.0/24,ABC,Prod,10.16.66.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.65.0 10.16.64.0/24,ABC,Prod,10.16.66.0 10.16.64.0/24,ABC,Prod,10.16.67.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.65.0 10.16.64.0/24,ABC,Prod,10.16.66.0 10.16.64.0/24,ABC,Prod,10.16.67.0 10.16.64.0/24,ABC,Prod,10.16.68.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.65.0 10.16.64.0/24,ABC,Prod,10.16.66.0 10.16.64.0/24,ABC,Prod,10.16.67.0 10.16.64.0/24,ABC,Prod,10.16.68.0 10.16.64.0/24,ABC,Prod,10.16.69.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.65.0 10.16.64.0/24,ABC,Prod,10.16.66.0 10.16.64.0/24,ABC,Prod,10.16.67.0 10.16.64.0/24,ABC,Prod,10.16.68.0 10.16.64.0/24,ABC,Prod,10.16.69.0 10.16.64.0/24,ABC,Prod,10.16.70.0 10.16.64.0/24,ABC,Prod,10.16.64.0 10.16.64.0/24,ABC,Prod,10.16.65.0 10.16.64.0/24,ABC,Prod,10.16.66.0 10.16.64.0/24,ABC,Prod,10.16.67.0 10.16.64.0/24,ABC,Prod,10.16.68.0 10.16.64.0/24,ABC,Prod,10.16.69.0 10.16.64.0/24,ABC,Prod,10.16.70.0 10.16.64.0/24,ABC,Prod,10.16.71.0
Et voilà ! Well, although I practised regexes, since many years, I, always, surprised by their incredible power !!
Notes :
-
If the original line is only 10.16.64.0/24,ABC,Prod,, without any address, after, the S/R just rewrites that line
-
In the search regex, the 8 parts
(.*?)
are groups from 2 to 9, each containing a possible address, which must be rewritten, in replacement -
These 8 groups, followed by a comma, are embedded in an optional non-capturing group
(?:(.*?),)?
( the group may be present [1] or absent [0] ) -
At the beginning, the part
(.*?,.*?,.*?,)
looks for the first three fields : network, datacenter and environment, followed by a comma -
In replacement, the syntax
(?#\1\#\r\n)
is called a conditional replacement. It means that IF the group#
exists, then it rewrites the group1
, followed by the group#
, followed by the Windows EOL characters\r\n
-
At the beginning, the part
\1(?2\2)\r\n
writes, first, the group 1. Then it writes the group 2, ONLY IF it exists ( the first IPV4 address ), and, finally rewrites the Windows EOL characters\r\n
Remarks :
-
Contrary to my previous post, with this new S/R, there is no condition, any more, about the contents of fields 2 and 3 ! They could be, for instance : 10.16.64.0/24,99999,00000000,… :-)
-
Any possible leading indentation, of a line, is just kept, by this S/R !
-
You can click, either, on the Replace All button AND/OR on the Replace button, for a step by step replacement :-)
Cheers,
guy038
-