Feature Request: Sort by IP Address (CIDR Notation)
-
Hello Dustin-Cook,
so duplicates, meaning entries with the same IP and mask, can be deleted or ignored and should not
appear in the new list, correct?
I can think of a way to make this work with a Python script, but this means
you have to install the Python Script plugin. (Here is what needs to be done.) Is this the way to go?
Cheers
Claudia -
I have no problem installing the Python script. This would work for me.
-
Hello Dustin-Cook,
ok, the code would look like this
ipList = []  # used to save the ips and do sorting

def create_ip_list(line_content, line_number, total_lines):  # function gets called for each line
    if line_content.find('/') > -1:  # simple check
        ip, mask = line_content.split('/')  # first split mask bits from ip
        o1, o2, o3, o4 = [int(x) for x in ip.split('.')]  # split ip to its octets
        if not (o1, o2, o3, o4, int(mask)) in ipList:  # looking for duplicates, check if ip is already in list
            ipList.append((o1, o2, o3, o4, int(mask)))  # not found in list, so append to list

editor.forEachLine(create_ip_list)  # main function starts here
ipList.sort()  # we have all ips, let's sort it
editor.beginUndoAction()  # set an undo point, in case of there is a need to undo all
editor.clearAll()  # clear editor content
for ip in ipList:  # iterating over ip list previously saved and
    editor.appendText('{0}.{1}.{2}.{3}/{4}\n'.format(*ip))  # write sorted ips to the editor
editor.endUndoAction()  # inform editor about action end
The comments should be descriptive enough, shouldn’t they?
Take care with the tabs used for indentation; Python is strict about them. If anything is unclear, let me know.
Cheers
Claudia -
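For anyone who wants to try the sorting logic outside Notepad++, here is a minimal standalone sketch of the same idea: turn each line into a tuple of integers so that Python compares the octets numerically, drop duplicates, sort, and format the entries back. The function name sort_unique_cidrs is made up for this illustration, a set replaces the list-membership check of the script above, and the PythonScript editor object is not used at all.

```python
def sort_unique_cidrs(lines):
    """Return the unique 'a.b.c.d/mask' entries of `lines`, sorted numerically."""
    seen = set()
    for line in lines:
        if '/' not in line:              # same simple check: skip blank or non-CIDR lines
            continue
        ip, mask = line.split('/')
        # int() tolerates surrounding whitespace, so a trailing EOL on the mask is harmless
        octets = tuple(int(x) for x in ip.split('.'))
        seen.add(octets + (int(mask),))  # a set silently ignores duplicates
    return ['{0}.{1}.{2}.{3}/{4}'.format(*entry) for entry in sorted(seen)]


sample = ["82.163.81.11/32\n", "82.163.81.5/32\n", "\n",
          "12.39.106.161/32\n", "82.163.81.5/32\n"]
print(sort_unique_cidrs(sample))
```

Sorting tuples of integers is what keeps 82.163.81.5 ahead of 82.163.81.11; sorting the raw strings would not.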
Hello Dustin Cook,
Sure, Claudia’s Python script does the job just fine. And its numerous comments will certainly help me if I ever decide to code with that powerful plugin. However, I’m thinking of another simple way to get this job done !
Given the list of IPv4 addresses in your previous post, do you expect the sorted list below, without any duplicates ?
12.39.106.161/32
12.154.41.101/32
12.154.41.102/32
62.140.221.0/24
68.232.192.104/32
68.232.192.105/32
68.232.192.106/32
68.232.192.196/32
68.232.192.198/32
68.232.193.146/32
68.232.193.193/32
68.232.199.1/32
68.232.207.63/32
72.26.195.64/27
74.63.47.96/27
82.163.81.5/32
82.163.81.7/32
82.163.81.11/32
82.163.81.12/32
82.163.81.13/32
82.163.81.14/32
96.43.144.64/31
96.43.147.64/28
96.43.148.64/31
96.43.151.64/28
119.9.27.88/32
119.9.52.35/32
136.146.128.64/28
173.231.138.192/27
173.231.139.0/24
173.231.176.0/21
173.231.184.0/21
182.50.78.64/28
185.41.44.40/32
185.41.46.0/24
185.41.46.10/32
185.41.46.17/32
185.41.46.72/32
198.2.128.0/18
198.245.88.98/32
199.122.120.170/32
204.14.232.64/28
204.14.234.64/28
205.201.128.0/20
If so, I can reach this result with, successively :
- A first regex S/R, performed once only
- A simple lexicographic sort
- A second regex S/R, performed TWICE
Later, I’ll try to build a macro that could do the whole job in one go !
See you later
Best regards,
guy038
-
-
Hi, guy038
since I have written and posted the code, I was wondering: could it be done using regex?
Well, I assume you would go the same way:
- split the ip octets and mask
- sort the text
- delete duplicates (this was the step which I couldn’t solve theoretically)
- join octets and mask,
wouldn’t you?
Eager to see the result.
Cheers
Claudia -
Claudia, you are my hero for the day. The Python script worked flawlessly for my needs. Thank you very much!
Plus, I know a bit of Python, so I can expand upon this as necessary. I didn’t even realize there was a Python plug-in for Notepad++!
-
Hi, Dustin Cook and Claudia,
Seemingly, as you have some knowledge of Python, it’s obvious that Claudia’s script should be the definitive answer to your problem !
But, to give a glimpse of the power of regular expressions and to satisfy Claudia’s curiosity, here is my regex method :-))
So, we start from your original list below :
199.122.120.170/32
185.41.46.10/32
72.26.195.64/27
74.63.47.96/27
173.231.138.192/27
173.231.139.0/24
173.231.176.0/21
173.231.184.0/21
205.201.128.0/20
198.2.128.0/18
62.140.221.0/24
68.232.199.1/32
68.232.192.104/32
68.232.192.105/32
68.232.192.106/32
68.232.193.146/32
68.232.193.193/32
68.232.192.196/32
68.232.192.198/32
198.245.88.98/32
185.41.46.72/32
185.41.46.17/32
12.39.106.161/32
185.41.46.17/32
12.39.106.161/32
185.41.46.0/24
82.163.81.11/32
82.163.81.5/32
12.154.41.101/32
12.154.41.102/32
96.43.144.64/31
96.43.147.64/28
96.43.148.64/31
96.43.151.64/28
136.146.128.64/28
182.50.78.64/28
204.14.232.64/28
204.14.234.64/28
119.9.52.35/32
119.9.27.88/32
82.163.81.14/32
82.163.81.13/32
82.163.81.12/32
185.41.44.40/32
185.41.44.40/32
68.232.207.63/32
82.163.81.7/32
68.232.207.63/32
My first idea was to add some 0 digits in front of numbers with fewer than three digits; I finally realized that it was simpler to add a plain space character, which is never part of an IPv4 address !
Then, the first S/R to perform is :
Find what : (?:^\h*|\.)\K((\d)?\d)(?=\.|/)
Replace with : (?2: ) \1
- Go back to the very beginning of your IPv4 addresses list
- Select the Regular expression search mode
- Click on the Replace All button
You should obtain the well-formatted list below :
199.122.120.170/32
185. 41. 46. 10/32
 72. 26.195. 64/27
 74. 63. 47. 96/27
173.231.138.192/27
173.231.139.  0/24
173.231.176.  0/21
173.231.184.  0/21
205.201.128.  0/20
198.  2.128.  0/18
 62.140.221.  0/24
 68.232.199.  1/32
 68.232.192.104/32
 68.232.192.105/32
 68.232.192.106/32
 68.232.193.146/32
 68.232.193.193/32
 68.232.192.196/32
 68.232.192.198/32
198.245. 88. 98/32
185. 41. 46. 72/32
185. 41. 46. 17/32
 12. 39.106.161/32
185. 41. 46. 17/32
 12. 39.106.161/32
185. 41. 46.  0/24
 82.163. 81. 11/32
 82.163. 81.  5/32
 12.154. 41.101/32
 12.154. 41.102/32
 96. 43.144. 64/31
 96. 43.147. 64/28
 96. 43.148. 64/31
 96. 43.151. 64/28
136.146.128. 64/28
182. 50. 78. 64/28
204. 14.232. 64/28
204. 14.234. 64/28
119.  9. 52. 35/32
119.  9. 27. 88/32
 82.163. 81. 14/32
 82.163. 81. 13/32
 82.163. 81. 12/32
185. 41. 44. 40/32
185. 41. 44. 40/32
 68.232.207. 63/32
 82.163. 81.  7/32
 68.232.207. 63/32
NOTES :
- The first part of the regex, (?:^\h*|\.)\K , is a non-capturing group that tries to match the beginning of each line, followed by possible horizontal blank characters, OR one dot. This match is immediately forgotten by the regex engine, due to the \K syntax. That is to say, it just matches a zero-length location, just before the first digit of each subgroup of an IPv4 address
- The final part, (?=\.|/) , is a look-ahead, which must be satisfied although it’s NOT part of the final match. It simply looks for a dot or a slash after each subgroup of the IPv4 address
- So, the middle part, ((\d)?\d) , matches a subgroup of one or two digit(s) only, which is our final match. Note that, when the inner group 2 exists, this means that we have matched a two-digit number
- In replacement, the part (?2: ) is a conditional replacement, which means :
  - If group 2 exists, we do nothing ( part THEN, between the number of the group and the colon )
  - If group 2 doesn’t exist ( case of a one-digit number ), we add one space character ( part ELSE, between the colon and the ending round bracket )
- Finally, the syntax \1 , with a space before \1, re-writes the one or two digits subgroup, preceded by a space character. So a two-digit number ends up preceded by one space and a one-digit number by two spaces
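The padding idea in these notes can be tried outside Notepad++ with a short Python sketch. Python's re module has no \K, no \h and no conditional replacement, so this is an emulation under those assumptions rather than the same regex: the prefix is captured and written back, and a callback right-aligns the one- or two-digit number in a three-character field. The name pad_octets is hypothetical.

```python
import re

# Line start (plus optional blanks) or a dot, then a 1- or 2-digit number that is
# followed by a dot or a slash. Three-digit numbers and the mask never match.
OCTET = re.compile(r'(^[ \t]*|\.)(\d?\d)(?=[./])', re.MULTILINE)


def pad_octets(text):
    """Left-pad short octets with spaces so that every octet is three characters wide."""
    return OCTET.sub(lambda m: m.group(1) + m.group(2).rjust(3), text)


lines = ["82.163.81.11/32", "82.163.81.5/32", "199.122.120.170/32", "12.39.106.161/32"]
padded = [pad_octets(line) for line in lines]
for line in sorted(padded):   # a plain lexicographic sort now follows numeric order
    print(line)
```

Because a space sorts before any digit, the padded strings compare column by column, which is why the plain lexicographic sort in the next step is enough.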
Now, run a simple sort operation : menu option Edit - Line Operations - Sort Lines Lexicographically Ascending. You should obtain the sorted list, as below :
 12. 39.106.161/32
 12. 39.106.161/32
 12.154. 41.101/32
 12.154. 41.102/32
 62.140.221.  0/24
 68.232.192.104/32
 68.232.192.105/32
 68.232.192.106/32
 68.232.192.196/32
 68.232.192.198/32
 68.232.193.146/32
 68.232.193.193/32
 68.232.199.  1/32
 68.232.207. 63/32
 68.232.207. 63/32
 72. 26.195. 64/27
 74. 63. 47. 96/27
 82.163. 81.  5/32
 82.163. 81.  7/32
 82.163. 81. 11/32
 82.163. 81. 12/32
 82.163. 81. 13/32
 82.163. 81. 14/32
 96. 43.144. 64/31
 96. 43.147. 64/28
 96. 43.148. 64/31
 96. 43.151. 64/28
119.  9. 27. 88/32
119.  9. 52. 35/32
136.146.128. 64/28
173.231.138.192/27
173.231.139.  0/24
173.231.176.  0/21
173.231.184.  0/21
182. 50. 78. 64/28
185. 41. 44. 40/32
185. 41. 44. 40/32
185. 41. 46.  0/24
185. 41. 46. 10/32
185. 41. 46. 17/32
185. 41. 46. 17/32
185. 41. 46. 72/32
198.  2.128.  0/18
198.245. 88. 98/32
199.122.120.170/32
204. 14.232. 64/28
204. 14.234. 64/28
205.201.128.  0/20
Good ! Now, you just have :
- To get rid of all the space characters, needed for our previous sort
- To suppress any extra identical IPv4 addresses
We can perform these two operations in one go, with the S/R below :
Find what : \x20+|(?-s)(^.+\R)\1+
Replace with : ?1\1
As above :
- Go back to the very beginning of your IPv4 addresses list
- Select the Regular expression search mode
- Click on the Replace All button, TWICE ( very IMPORTANT )
You should get the final list of IPv4 addresses, below :
12.39.106.161/32
12.154.41.101/32
12.154.41.102/32
62.140.221.0/24
68.232.192.104/32
68.232.192.105/32
68.232.192.106/32
68.232.192.196/32
68.232.192.198/32
68.232.193.146/32
68.232.193.193/32
68.232.199.1/32
68.232.207.63/32
72.26.195.64/27
74.63.47.96/27
82.163.81.5/32
82.163.81.7/32
82.163.81.11/32
82.163.81.12/32
82.163.81.13/32
82.163.81.14/32
96.43.144.64/31
96.43.147.64/28
96.43.148.64/31
96.43.151.64/28
119.9.27.88/32
119.9.52.35/32
136.146.128.64/28
173.231.138.192/27
173.231.139.0/24
173.231.176.0/21
173.231.184.0/21
182.50.78.64/28
185.41.44.40/32
185.41.46.0/24
185.41.46.10/32
185.41.46.17/32
185.41.46.72/32
198.2.128.0/18
198.245.88.98/32
199.122.120.170/32
204.14.232.64/28
204.14.234.64/28
205.201.128.0/20
NOTES :
- The search regex \x20+|(?-s)(^.+\R)\1+ looks, simultaneously, from the cursor position, for either :
  - A run of consecutive spaces, which has to be suppressed
  - A run of consecutive identical IPv4 addresses, which should be deleted, except for the first element of that run
- In the second alternative, the form (^.+\R)\1+ tries to match a complete line, with its EOL characters, stored as group 1, followed by one or more copies of that same line
- And the modifier (?-s) forces the dot meta-character to match standard characters only, even if you have checked the . matches newline option by mistake !
- As the two alternatives are mutually exclusive and as we repeat this search TWICE, we are sure, at the end, that the regex engine has examined both alternatives on every line of the list, whichever alternative was chosen first !
- The simple replacement part ?1\1 is, again, a conditional replacement :
  - If group 1 doesn’t exist, then we have matched spaces. So, we do nothing, as all these space characters have to be deleted
  - If group 1 exists, then we just have to keep the first IPv4 address, represented by the \1 syntax, of each block of identical addresses
Best Regards
guy038
P.S. :
- If some blank characters are written before each IPv4 address of that list, just take care not to mix lines with space characters with lines with tabulation characters. Indeed, in that case, the list would NOT be sorted correctly !
- When I said that the two alternatives of the regex \x20+|(?-s)(^.+\R)\1+ are mutually exclusive, I meant :
  - If the regex engine began to match space(s), which are then deleted, the current line can NOT be, now, identical to the next line, EVEN IF it was the case just before the space(s) were suppressed. So, it will continue to look for possible space(s) to delete, till the end of the current line
  - If the regex engine began to match a block of identical lines, it just rewrites the first line of that block. This line may contain some space characters, which will only be deleted on the second turn !
- Of course, any extra click on the Replace All button, after the second one, does NOT find any occurrence
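The reason for clicking Replace All twice can be reproduced with a small Python sketch. It is an emulation under stated assumptions: Python's re module has neither \R nor conditional replacement, so \n stands in for the line break and a callback returns group 1 whenever it took part in the match. The helper name one_pass is made up for this illustration.

```python
import re

# Either a run of spaces, or a whole line followed by one or more copies of itself.
PATTERN = re.compile(r'\x20+|(^.+\n)\1+', re.MULTILINE)


def one_pass(text):
    """One 'Replace All': delete space runs, keep the first line of each duplicate block."""
    return PATTERN.sub(lambda m: m.group(1) or '', text)


padded_sorted = (
    " 12. 39.106.161/32\n"
    " 12. 39.106.161/32\n"
    "185. 41. 44. 40/32\n"
    "185. 41. 44. 40/32\n"
    "185. 41. 46.  0/24\n"
)

first = one_pass(padded_sorted)   # spaces removed OR duplicates collapsed, per line
second = one_pass(first)          # whatever the first pass left over
print(first)
print(second)
```

After the first pass, the lines that started with a space have lost their spaces but are still duplicated, while the collapsed duplicate block still contains its spaces; the second pass cleans up both leftovers, and a third pass changes nothing.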
-
That is very impressive, guy038! I knew regular expressions were fairly powerful, but I didn’t realize they were that powerful. You’ve inspired me to learn more.
Thank you for the write-up and alternative solution.
Claudia, I’m starting work on modifying your Python script to add CIDR merging to it. So, for instance, if I have these addresses:
66.137.24.194/32
66.137.24.195/32
The script would know to merge them into a single CIDR range: 66.137.24.194/31. If I succeed, I’ll post the results here in case anyone else could make use of it.
Thanks again, everyone!
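As a rough illustration of the merge described above, Python 3's standard ipaddress module offers collapse_addresses, which combines adjacent networks. This is only a sketch with the standard library, not PythonScript code for Notepad++, and merge_cidrs is a hypothetical helper name.

```python
import ipaddress


def merge_cidrs(cidrs):
    """Collapse a list of CIDR strings into the smallest equivalent list of ranges."""
    nets = [ipaddress.ip_network(c.strip()) for c in cidrs if c.strip()]
    return [str(net) for net in ipaddress.collapse_addresses(nets)]


print(merge_cidrs(["66.137.24.194/32", "66.137.24.195/32"]))
```

The two /32 entries collapse because 194 is even, so 194 and 195 together form exactly one /31 block.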
-
Hi, Dustin Cook and Claudia,
Dustin, from the example in your previous post and with the help of the Wikipedia article below :
https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation
we can deduce, in the same way, that the four CIDR addresses below :
xxx.xxx.xxx.192/32
xxx.xxx.xxx.193/32
xxx.xxx.xxx.194/32
xxx.xxx.xxx.195/32
can be merged into the single CIDR range :
xxx.xxx.xxx.192/30
And that the two CIDR addresses :
xxx.xxx.xxx.192/31
xxx.xxx.xxx.194/31
can also be merged into the same CIDR range ( a /30 block always starts on a multiple of 4 in the last octet ) :
xxx.xxx.xxx.192/30
Here, we have a good example of the limits of regular expressions :-(( Indeed, these merge actions need some arithmetic that cannot be performed by any regex, so you need to code with a [ script ] language anyway !
So, Claudia, just be at ease : there are still numerous cases where your beloved Python scripts will be needed :-)))
Cheers,
guy038
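The arithmetic behind these merges can be checked with Python 3's standard ipaddress module. A minimal sketch follows; the 66.137.24. prefix is borrowed from Dustin's example purely as a placeholder for xxx.xxx.xxx, and collapse is a hypothetical helper. It also makes one constraint visible: a /30 block has to start on a multiple of 4 in the last octet, so only aligned neighbours collapse into a single range.

```python
import ipaddress


def collapse(cidrs):
    """Collapse CIDR strings into the smallest equivalent list of ranges."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return [str(net) for net in ipaddress.collapse_addresses(nets)]


prefix = "66.137.24."   # placeholder for xxx.xxx.xxx (assumption for this sketch)

aligned = collapse([prefix + "%d/32" % n for n in (192, 193, 194, 195)])
unaligned = collapse([prefix + "%d/32" % n for n in (194, 195, 196, 197)])
print(aligned)     # four hosts starting on a multiple of 4
print(unaligned)   # four hosts straddling a /30 boundary

try:
    ipaddress.ip_network(prefix + "194/30")   # 194 is not a multiple of 4
except ValueError as err:
    print(err)
```

The aligned group becomes one /30, whereas the group starting at 194 can only be reduced to two /31 ranges.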
-
@Dustin-Cook,
good to see that it is helpful, and even better to hear that you will extend its functionality.
As guy038 already said, I’m curious to see the changes ;-))
@guy038,
another great example of the power of regular expressions and of your devotion to explanation.
Chapeau. I see the time coming when an OS boots from a regex ;-))))))
Cheers
Claudia -
Claudia, are you familiar with Python’s netaddr module? I seem to be having all sorts of trouble getting this to work. Here is what I have so far (including your code at the top).
ipList = []  # used to save the ips and do sorting

def create_ip_list(line_content, line_number, total_lines):  # function gets called for each line
    if line_content.find('/') > -1:  # simple check
        ip, mask = line_content.split('/')  # first split mask bits from ip
        o1, o2, o3, o4 = [int(x) for x in ip.split('.')]  # split ip to its octets
        if not (o1, o2, o3, o4, int(mask)) in ipList:  # looking for duplicates, check if ip is already in list
            ipList.append((o1, o2, o3, o4, int(mask)))  # not found in list, so append to list

editor.forEachLine(create_ip_list)  # main function starts here
ipList.sort()  # we have all ips, let's sort it
editor.beginUndoAction()  # set an undo point, in case of there is a need to undo all
editor.clearAll()  # clear editor content
for ip in ipList:  # iterating over ip list previously saved and
    editor.appendText('{0}.{1}.{2}.{3}/{4}\n'.format(*ip))  # write sorted ips to the editor
editor.endUndoAction()  # inform editor about action end

ipRange = []

def create_range(line_content, line_number, total_lines):  # function gets called for each line
    ipRange.append(IPNetwork(line_content))  # append to list

editor.forEachLine(create_range)  # main function starts here
cidr_merge(ipRange)  # use netaddr cidr_merge to merge CIDR ranges
editor.beginUndoAction()  # set an undo point, in case there is a need to undo all
editor.clearAll()  # clear editor content
for ip in ipRange:  # iterating over ip list previously saved and
    editor.appentText(ip)  # write merged CIDR ranges to the editor
editor.endUndoAction()  # inform editor about action end
I keep getting:
raise AddrFormatError('invalid IPNetwork %s' % addr)
netaddr.core.AddrFormatError: Invalid IPNetwork
It’s like IPNetwork doesn’t realize ‘ip’ is a string in the format it wants for some reason.
-
Hello Dustin,
no, I haven’t used it yet, but could it be that it is strict about trailing EOLs?
As I converted ip and mask to int, any EOL char was stripped silently.
Maybe give it a try with
ipRange.append(IPNetwork(line_content.strip()))
Which netaddr version do you use?
Cheers
Claudia -
Hello Dustin,
I’ve downloaded netaddr (0.7.18) and it isn’t strict about the eol.
But it is strict about getting nothing ;-)
I assume you have an empty line, which is one reason why I used the simple check in my create_ip_list function ;-)
Cheers
Claudia -
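The diagnosis above (an empty line reaching the parser) suggests stripping each line and skipping blank ones before parsing. Here is a minimal sketch of that guard, using Python 3's standard ipaddress module as a stand-in because netaddr is a third-party package; parse_networks is a hypothetical helper, and ipaddress raises a plain ValueError rather than the netaddr AddrFormatError quoted in the thread.

```python
import ipaddress


def parse_networks(lines):
    """Parse CIDR strings, ignoring EOL characters and skipping blank lines."""
    networks = []
    for line in lines:
        text = line.strip()          # drop \r\n and surrounding blanks
        if not text:                 # an empty string would make the parser raise
            continue
        networks.append(ipaddress.ip_network(text))
    return networks


lines = ["12.39.106.161/32\r\n", "\r\n", "62.140.221.0/24\n", "   "]
print([str(net) for net in parse_networks(lines)])
```

The same two-step guard (strip, then test for emptiness) should carry over to any parser that rejects empty input.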
Alright, here is the final script that does everything I need it to.
from netaddr import *  # import everything from the netaddr module

ipList = []  # initialize the array to store the IPNetwork objects

def createIpList(lineContents, lineNumber, totalLines):  # function used to fill ipList with each line
    if lineContents.find('/') > -1:  # verify that it is not a blank line by checking for the presence of the "/" in a CIDR range
        ipList.append(IPNetwork(lineContents))  # append to ipList

editor.forEachLine(createIpList)  # main function starts here
result = cidr_merge(ipList)  # use netaddr cidr_merge to merge CIDR ranges. It auto sorts and de-duplicates.
editor.beginUndoAction()  # set an undo point, in case there is a need to undo all
editor.clearAll()  # clear editor content
for ip in result:  # iterating over ipList previously saved and
    editor.appendText(ip)  # write merged CIDR ranges to the editor
    editor.appendText("\n")  # add a newline since the IPNetwork object doesn't include one
editor.endUndoAction()  # inform editor about action end
Thanks to everyone, especially Claudia, I now have something that goes beyond my original intentions and fully automates my task. All I have to do is copy/paste/run. So happy!
Also, it turns out cidr_merge sorts and de-duplicates, as well, so I only needed that one function.
-
Hello Dustin,
nice to see that you did it and thank you for pointing me to the netaddr module.
I have played a little with it and I can already see two DNS tasks which can make use
of it.
Cheers
Claudia -
I would greatly appreciate this feature too, preferably out of the box, i.e. without any plugins. Or it would be good to build this functionality into a plugin.
I noticed that the TextFX plugin is dying, so is there any other promising plugin that could implement this?
guy038, the method you proposed is cool, and it works out of the box, which is an advantage over Pythonscript! Thanks for your effort! Brilliant! -
BTW, I noticed that the last version of Pythonscript was published in 2014, and it worries me a lot. I wasn’t able to install it from the plugin manager (it threw an unknown exception) and had to do it manually.
So with every new version of NPP, compatibility issues will get bigger and bigger, and we should search for a replacement anyway. -
I don’t know that you should be “worried, a lot” about the Pythonscript plugin being last published in 2014. Perhaps that just means it is very stable and has few bugs in need of fixing. :-)
What does “search for a replacement” mean??
@Dave-Brotherstone is the author of the Pythonscript plugin, as well as the PluginManager plugin. He has recently been working a lot on updating the PluginManager and finding a new good site for hosting the plugins it manages. I think this work also includes heading toward a build of a 64-bit version…so he is busy, but I’m guessing that when that work is complete he will also strive to achieve a 64-bit build of Pythonscript, as well. So I think it is far from dead, even though it hasn’t been updated since 2014. Just guesses, but maybe educated guesses?
I can think of a few pieces of software that I use that haven’t had new releases since 2007-2009, so to me 2014 is relatively recent!
-
@Scott-Sumner said:
Perhaps that just means it is very stable and has few bugs in need of fixing. :-)
Bold claim! :)
For me it often means that the project is simply abandoned. Abandoned and not maintained for a long time, which means that:
- the number of bugs has increased exponentially since then
- new features haven’t been implemented for a long time either, whereas technologies are moving forward.
- not to mention that in most cases those projects have a poor community where it’s impossible to resolve any issue.
But this is the common case, and maybe Pythonscript is different.
P.S. Considering that the author of Plugin Manager and the author of Pythonscript are the same person, and he’s been working a lot on updating the PluginManager, it is especially strange to face issues while installing PS from the Manager. I was unable to do it, and I’m not the only one.
-
It IS rather strange that the Pythonscript plugin has trouble installing via PluginManager, given a common author, I’ll grant you that.
I don’t know the full explanation of the Pythonscript + PluginManager problems, but I haven’t let that discourage me. As you have discovered, installing Pythonscript another way works. I have been using Pythonscript deeply since my very first day of using Notepad++, and I can say I’ve found bugs, but not many. Workarounds are the mainstay of Notepad++ (power) users. Bugs exist in software. Features are lacking. These are just facts.