Feature Request: Sort by IP Address (CIDR Notation)
-
Hi, Dustin Cook and Claudia,
Since you apparently have some knowledge of Python, Claudia’s script should obviously be the definitive answer to your problem!
But, to give you a glimpse of the power of regular expressions, and to satisfy Claudia’s curiosity, here is my regex method :-))
So, we start from your original list, below:
199.122.120.170/32
185.41.46.10/32
72.26.195.64/27
74.63.47.96/27
173.231.138.192/27
173.231.139.0/24
173.231.176.0/21
173.231.184.0/21
205.201.128.0/20
198.2.128.0/18
62.140.221.0/24
68.232.199.1/32
68.232.192.104/32
68.232.192.105/32
68.232.192.106/32
68.232.193.146/32
68.232.193.193/32
68.232.192.196/32
68.232.192.198/32
198.245.88.98/32
185.41.46.72/32
185.41.46.17/32
12.39.106.161/32
185.41.46.17/32
12.39.106.161/32
185.41.46.0/24
82.163.81.11/32
82.163.81.5/32
12.154.41.101/32
12.154.41.102/32
96.43.144.64/31
96.43.147.64/28
96.43.148.64/31
96.43.151.64/28
136.146.128.64/28
182.50.78.64/28
204.14.232.64/28
204.14.234.64/28
119.9.52.35/32
119.9.27.88/32
82.163.81.14/32
82.163.81.13/32
82.163.81.12/32
185.41.44.40/32
185.41.44.40/32
68.232.207.63/32
82.163.81.7/32
68.232.207.63/32
My first idea was to add leading 0 digits in front of the numbers with fewer than three digits. I finally realized that it was simpler to pad with ordinary space characters, which are never part of an IPv4 address!
Then, the first S/R to perform is:
Find what : (?:^\h*|\.)\K((\d)?\d)(?=\.|/)
Replace with : (?2: ) \1
- Go back to the very beginning of your IPv4 addresses list
- Select the Regular expression search mode
- Click on the Replace All button
You should obtain the well-formatted list below:
199.122.120.170/32
185. 41. 46. 10/32
 72. 26.195. 64/27
 74. 63. 47. 96/27
173.231.138.192/27
173.231.139.  0/24
173.231.176.  0/21
173.231.184.  0/21
205.201.128.  0/20
198.  2.128.  0/18
 62.140.221.  0/24
 68.232.199.  1/32
 68.232.192.104/32
 68.232.192.105/32
 68.232.192.106/32
 68.232.193.146/32
 68.232.193.193/32
 68.232.192.196/32
 68.232.192.198/32
198.245. 88. 98/32
185. 41. 46. 72/32
185. 41. 46. 17/32
 12. 39.106.161/32
185. 41. 46. 17/32
 12. 39.106.161/32
185. 41. 46.  0/24
 82.163. 81. 11/32
 82.163. 81.  5/32
 12.154. 41.101/32
 12.154. 41.102/32
 96. 43.144. 64/31
 96. 43.147. 64/28
 96. 43.148. 64/31
 96. 43.151. 64/28
136.146.128. 64/28
182. 50. 78. 64/28
204. 14.232. 64/28
204. 14.234. 64/28
119.  9. 52. 35/32
119.  9. 27. 88/32
 82.163. 81. 14/32
 82.163. 81. 13/32
 82.163. 81. 12/32
185. 41. 44. 40/32
185. 41. 44. 40/32
 68.232.207. 63/32
 82.163. 81.  7/32
 68.232.207. 63/32
NOTES:
- The first part of the regex, (?:^\h*|\.)\K , is a non-capturing group that matches either the beginning of a line, followed by any horizontal blank characters, OR a dot. The \K syntax then makes the regex engine immediately forget that match. In other words, the overall match starts at the zero-length location just before the first digit of each octet of an IPv4 address
- The final part, (?=\.|/) , is a look-ahead, which must be satisfied although it is NOT part of the final match. It simply requires a dot or a slash right after each octet of the IPv4 address
- So, the middle part, ((\d)?\d) , matches an octet of one or two digits only, which is our final match. Note that the inner group 2 exists only when we have matched a two-digit number
- In the replacement, the part (?2: ) is a conditional replacement, which means:
  - If group 2 exists, we write nothing ( THEN part, between the group number and the colon )
  - If group 2 doesn’t exist ( case of a one-digit number ), we write one space character ( ELSE part, between the colon and the closing round bracket )
- Finally, the syntax \1 , preceded by a literal space, rewrites the one- or two-digit octet with a space character in front of it. So a one-digit octet gets two leading spaces and a two-digit octet gets one
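For readers more at home in Python, the padding step can be approximated as below. This is only a sketch, not a port of the regex: Python's re module has no \K, no \h and no conditional replacement, so the hypothetical helper pad_octets simply matches every number that is followed by a dot or a slash and right-justifies it to three characters.

```python
import re

def pad_octets(line):
    """Right-justify each octet to width 3 with spaces; the mask is untouched."""
    # \d+(?=[./]) matches a run of digits followed by a dot or a slash,
    # i.e. the four octets, but not the prefix length after the slash.
    return re.sub(r"\d+(?=[./])", lambda m: m.group().rjust(3), line)

print(repr(pad_octets("72.26.195.64/27")))     # ' 72. 26.195. 64/27'
print(repr(pad_octets("173.231.139.0/24")))    # '173.231.139.  0/24'
print(repr(pad_octets("199.122.120.170/32")))  # '199.122.120.170/32'
```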
Now, run a simple sort operation, with the menu option Edit > Line Operations > Sort Lines Lexicographically Ascending. You should obtain the sorted list below:
 12. 39.106.161/32
 12. 39.106.161/32
 12.154. 41.101/32
 12.154. 41.102/32
 62.140.221.  0/24
 68.232.192.104/32
 68.232.192.105/32
 68.232.192.106/32
 68.232.192.196/32
 68.232.192.198/32
 68.232.193.146/32
 68.232.193.193/32
 68.232.199.  1/32
 68.232.207. 63/32
 68.232.207. 63/32
 72. 26.195. 64/27
 74. 63. 47. 96/27
 82.163. 81.  5/32
 82.163. 81.  7/32
 82.163. 81. 11/32
 82.163. 81. 12/32
 82.163. 81. 13/32
 82.163. 81. 14/32
 96. 43.144. 64/31
 96. 43.147. 64/28
 96. 43.148. 64/31
 96. 43.151. 64/28
119.  9. 27. 88/32
119.  9. 52. 35/32
136.146.128. 64/28
173.231.138.192/27
173.231.139.  0/24
173.231.176.  0/21
173.231.184.  0/21
182. 50. 78. 64/28
185. 41. 44. 40/32
185. 41. 44. 40/32
185. 41. 46.  0/24
185. 41. 46. 10/32
185. 41. 46. 17/32
185. 41. 46. 17/32
185. 41. 46. 72/32
198.  2.128.  0/18
198.245. 88. 98/32
199.122.120.170/32
204. 14.232. 64/28
204. 14.234. 64/28
205.201.128.  0/20
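Why does a purely lexicographic sort give the numeric order here? Because the space character sorts before every digit, space-padded octets compare like numbers. The small Python sketch below (the helper name pad and the five sample addresses, taken from the list, are only for illustration) contrasts a plain text sort with a sort of the padded strings.

```python
def pad(cidr):
    """Pad each octet of 'a.b.c.d/mask' to width 3 with leading spaces."""
    ip, mask = cidr.split("/")
    return ".".join(octet.rjust(3) for octet in ip.split(".")) + "/" + mask

raw = [
    "119.9.27.88/32",
    "62.140.221.0/24",
    "12.39.106.161/32",
    "82.163.81.11/32",
    "82.163.81.5/32",
]

# Plain text sort: "119..." lands before "12...", and ".11" before ".5".
plain_sort = sorted(raw)

# Sort the padded strings, then drop the padding again.
padded_sort = [p.replace(" ", "") for p in sorted(pad(c) for c in raw)]

print(plain_sort)
print(padded_sort)
```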
Good! Now, you just have:
- To get rid of all the space characters needed for our previous sort
- To suppress any duplicate IPv4 addresses
We can perform these two operations in one go, with the S/R below:
Find what : \x20+|(?-s)(^.+\R)\1+
Replace with : ?1\1
As above:
- Go back to the very beginning of your IPv4 addresses list
- Select the Regular expression search mode
- Click on the Replace All button, TWICE ( very IMPORTANT )
You should get the final list of IPv4 addresses below:
12.39.106.161/32
12.154.41.101/32
12.154.41.102/32
62.140.221.0/24
68.232.192.104/32
68.232.192.105/32
68.232.192.106/32
68.232.192.196/32
68.232.192.198/32
68.232.193.146/32
68.232.193.193/32
68.232.199.1/32
68.232.207.63/32
72.26.195.64/27
74.63.47.96/27
82.163.81.5/32
82.163.81.7/32
82.163.81.11/32
82.163.81.12/32
82.163.81.13/32
82.163.81.14/32
96.43.144.64/31
96.43.147.64/28
96.43.148.64/31
96.43.151.64/28
119.9.27.88/32
119.9.52.35/32
136.146.128.64/28
173.231.138.192/27
173.231.139.0/24
173.231.176.0/21
173.231.184.0/21
182.50.78.64/28
185.41.44.40/32
185.41.46.0/24
185.41.46.10/32
185.41.46.17/32
185.41.46.72/32
198.2.128.0/18
198.245.88.98/32
199.122.120.170/32
204.14.232.64/28
204.14.234.64/28
205.201.128.0/20
NOTES:
- The search regex \x20+|(?-s)(^.+\R)\1+ looks, from the cursor position, for either:
  - A run of consecutive spaces, which has to be deleted
  - A block of consecutive identical IPv4 addresses, which should be deleted except for the first line of that block
- In the second alternative, the form (^.+\R)\1+ tries to match a complete line with its EOL characters, stored as group 1, followed by one or more copies of that same line
- And the modifier (?-s) forces the dot meta-character to match standard characters only, even if you have checked the . matches newline option by mistake!
- As the two alternatives are mutually exclusive, and as we run this search TWICE, we are sure, at the end, that the regex engine has examined both alternatives on every line of the list, whichever alternative was chosen first!
- The simple replacement part ?1\1 is, again, a conditional replacement:
  - If group 1 doesn’t exist, then we matched spaces. So, we write nothing, as all these space characters have to be deleted
  - If group 1 exists, then we just keep the first IPv4 address, represented by the \1 syntax, of each block of identical addresses
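Outside Notepad++, a script is free to do these two jobs in a fixed order, so a single pass is enough: remove the padding first, then collapse runs of identical neighbours. A minimal Python sketch, assuming the lines are already sorted (the function name strip_and_dedupe is only illustrative):

```python
from itertools import groupby

def strip_and_dedupe(sorted_lines):
    """Remove padding spaces, then keep one line per run of identical lines."""
    cleaned = (line.replace(" ", "") for line in sorted_lines)
    # groupby() groups *adjacent* equal items, which is all we need
    # because the input is already sorted.
    return [line for line, _run in groupby(cleaned)]

sample = [
    " 12. 39.106.161/32",
    " 12. 39.106.161/32",
    "185. 41. 44. 40/32",
    "185. 41. 44. 40/32",
    "185. 41. 46.  0/24",
]
print(strip_and_dedupe(sample))
```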
Best Regards
guy038
P.S.:
- If some blank characters are written before each IPv4 address of that list, just take care not to mix lines beginning with space characters and lines beginning with tabulation characters. Indeed, in that case, the list would NOT be sorted correctly!
- When I said that the two alternatives of the regex \x20+|(?-s)(^.+\R)\1+ are mutually exclusive, I meant:
  - If the regex engine begins by matching space(s), which are then deleted, the current line can NOT be matched, during that pass, as a duplicate of the next line, because the engine is already past the beginning of the line, EVEN IF the two lines were identical before the space(s) were deleted. So, it goes on looking for space(s) to delete, till the end of the current line
  - If the regex engine begins by matching a block of identical lines, it just rewrites the first line of that block. This line may contain some space characters, which will only be deleted on the second pass!
- Of course, any extra click on the Replace All button, after the second one, does NOT find any occurrence
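The need for two passes can be reproduced with Python's re module. This is a simulation under stated assumptions, not the Notepad++ engine: \R is replaced by \n, (?-s) is dropped because the dot does not match newlines by default in Python, and the conditional replacement ?1\1 is emulated with a function. The name replace_all is hypothetical.

```python
import re

# Same two alternatives as the Notepad++ search, adapted to Python syntax.
PATTERN = re.compile(r"\x20+|(^.+\n)\1+", re.MULTILINE)

def replace_all(text):
    """One 'Replace All': keep group 1 if it took part in the match, else delete."""
    return PATTERN.sub(lambda m: m.group(1) or "", text)

text = (
    " 12. 39.106.161/32\n"
    " 12. 39.106.161/32\n"
    "185. 41. 44. 40/32\n"
    "185. 41. 44. 40/32\n"
    "185. 41. 46.  0/24\n"
)

first = replace_all(text)    # lines starting with a space lose their spaces,
                             # the 185.41.44.40 block loses its duplicate
second = replace_all(first)  # the remaining duplicate and spaces go away
third = replace_all(second)  # nothing left to match

print(first)
print(second)
```

After the first pass, the two 12.39.106.161/32 lines are still duplicated (only their spaces were removed), while the single remaining 185. 41. 44. 40/32 line still carries its spaces; the second pass fixes both.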
-
That is very impressive, guy038! I knew regular expressions were fairly powerful, but I didn’t realize they were that powerful. You’ve inspired me to learn more.
Thank you for the write-up and alternative solution.
Claudia, I’m starting work on modifying your Python script to add CIDR merging to it. So, for instance, if I have these addresses:
66.137.24.194/32
66.137.24.195/32
The script would know to merge them into a single CIDR range: 66.137.24.194/31. If I succeed, I’ll post the results here in case anyone else could make use of it.
Thanks again, everyone!
-
Hi, Dustin Cook and Claudia,
Dustin, from the example in your previous post, and with the help of the Wikipedia article below:
https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation
we can deduce, in the same way, that the four CIDR addresses below:
xxx.xxx.xxx.196/32
xxx.xxx.xxx.197/32
xxx.xxx.xxx.198/32
xxx.xxx.xxx.199/32
can be merged into the single CIDR range:
xxx.xxx.xxx.196/30
And that the two CIDR ranges:
xxx.xxx.xxx.196/31
xxx.xxx.xxx.198/31
can also be merged into the same CIDR range:
xxx.xxx.xxx.196/30
Note that a /30 block must start on a multiple of 4, so a run such as .194 to .197 can only be reduced to .194/31 plus .196/31.
Here, we have a good example of the limits of regular expressions :-(( Indeed, these merge actions need some arithmetic that a regex cannot perform, so you need to code in a script language anyway!
So, Claudia, rest assured: there are still numerous cases where your beloved Python scripts will be needed :-)))
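The arithmetic in question is the alignment rule: two blocks of the same size merge only when they are the two halves of the same parent block. Here is a small sketch of that check, using Python 3's standard ipaddress module instead of netaddr; merge_pair is a hypothetical helper, and the 66.137.24.x addresses extend Dustin's example.

```python
import ipaddress

def merge_pair(a, b):
    """Return the merged CIDR if a and b are the two halves of one block, else None."""
    net_a = ipaddress.ip_network(a)
    net_b = ipaddress.ip_network(b)
    if net_a.prefixlen != net_b.prefixlen or net_a.prefixlen == 0:
        return None
    parent = net_a.supernet()  # the block one bit larger that contains net_a
    if net_a != net_b and parent == net_b.supernet():
        return str(parent)
    return None

print(merge_pair("66.137.24.194/32", "66.137.24.195/32"))  # 66.137.24.194/31
print(merge_pair("66.137.24.195/32", "66.137.24.196/32"))  # None
print(merge_pair("66.137.24.196/31", "66.137.24.198/31"))  # 66.137.24.196/30
print(merge_pair("66.137.24.194/31", "66.137.24.196/31"))  # None
```

The last call returns None because .194/31 belongs to the parent block .192/30, whereas .196/31 belongs to .196/30.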
Cheers,
guy038
-
@Dustin-Cook,
good to see that it is helpful, and even better to hear that you will extend its functionality.
As guy038 already said, I’m curious to see the changes ;-))
@guy038,
another great example of the power of regular expressions and of your devotion to explanation.
Chapeau. I see the time coming when an OS boots from a regex ;-))))))
Cheers
Claudia
-
Claudia, are you familiar with Python’s netaddr module? I seem to be having all sorts of trouble getting this to work. Here is what I have so far (including your code at the top).
ipList = []  # used to save the ips and do sorting
def create_ip_list(line_content, line_number, total_lines):  # function gets called for each line
    if line_content.find('/') > -1:  # simple check
        ip, mask = line_content.split('/')  # first split mask bits from ip
        o1, o2, o3, o4 = [int(x) for x in ip.split('.')]  # split ip to its octets
        if not (o1, o2, o3, o4, int(mask)) in ipList:  # looking for duplicates, check if ip is already in list
            ipList.append((o1, o2, o3, o4, int(mask)))  # not found in list, so append to list
editor.forEachLine(create_ip_list)  # main function starts here
ipList.sort()  # we have all ips, let's sort it
editor.beginUndoAction()  # set an undo point, in case of there is a need to undo all
editor.clearAll()  # clear editor content
for ip in ipList:  # iterating over ip list previously saved and
    editor.appendText('{0}.{1}.{2}.{3}/{4}\n'.format(*ip))  # write sorted ips to the editor
editor.endUndoAction()  # inform editor about action end
ipRange = []
def create_range(line_content, line_number, total_lines):  # function gets called for each line
    ipRange.append(IPNetwork(line_content))  # append to list
editor.forEachLine(create_range)  # main function starts here
cidr_merge(ipRange)  # use netaddr cidr_merge to merge CIDR ranges
editor.beginUndoAction()  # set an undo point, in case there is a need to undo all
editor.clearAll()  # clear editor content
for ip in ipRange:  # iterating over ip list previously saved and
    editor.appentText(ip)  # write merged CIDR ranges to the editor
editor.endUndoAction()  # inform editor about action end
I keep getting:
raise AddrFormatError('invalid IPNetwork %s' % addr)
netaddr.core.AddrFormatError: Invalid IPNetwork
It’s like IPNetwork doesn’t realize ‘ip’ is a string in the format it wants for some reason.
-
Hello Dustin,
no, I haven’t used it yet, but could it be that it is strict about trailing EOL characters?
As I converted ip and mask to int, any EOL char was stripped silently.
Maybe give it a try with
ipRange.append(IPNetwork(line_content.strip()))
Which netaddr version do you use?
Cheers
Claudia
-
Hello Dustin,
I’ve downloaded netaddr (0.7.18) and it isn’t strict about the EOL.
But it is strict about getting nothing ;-)
I assume you have an empty line, which is one reason why I used the simple check in my create_ip_list function ;-)
Cheers
Claudia
-
Alright, here is the final script that does everything I need it to.
from netaddr import *  # import everything from the netaddr module
ipList = []  # initialize the list that stores the IPNetwork objects
def createIpList(lineContents, lineNumber, totalLines):  # callback used to fill ipList, called for each line
    if lineContents.find('/') > -1:  # skip blank lines by checking for the "/" of a CIDR range
        ipList.append(IPNetwork(lineContents))  # append to ipList
editor.forEachLine(createIpList)  # main part starts here
result = cidr_merge(ipList)  # use netaddr cidr_merge to merge CIDR ranges. It auto sorts and de-duplicates.
editor.beginUndoAction()  # set an undo point, in case there is a need to undo all
editor.clearAll()  # clear editor content
for ip in result:  # iterate over the merged ranges and
    editor.appendText(str(ip))  # write each merged CIDR range to the editor, converted to text
    editor.appendText("\n")  # add a newline since the IPNetwork object doesn't include one
editor.endUndoAction()  # inform editor about action end
Thanks to everyone, especially Claudia, I now have something that goes beyond my original intentions and fully automates my task. All I have to do is copy/paste/run. So happy!
Also, it turns out cidr_merge sorts and de-duplicates, as well, so I only needed that one function.
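For anyone who wants to try the same idea without Notepad++ or netaddr, here is a rough stand-in using only Python 3's standard ipaddress module. It is a sketch under assumptions, not Dustin's script: the function name merge_cidr_text is made up, strict=False is a choice made here so that entries with host bits set are accepted, and blank lines are skipped after stripping EOL characters, following Claudia's diagnosis. Like cidr_merge, ipaddress.collapse_addresses sorts, de-duplicates and merges; it also absorbs ranges that are fully contained in a larger one.

```python
import ipaddress

def merge_cidr_text(text):
    """Parse one CIDR per line, then sort, de-duplicate and merge the ranges."""
    networks = []
    for line in text.splitlines():
        line = line.strip()          # drop EOL leftovers and padding
        if not line:                 # an empty line would make the parser fail
            continue
        networks.append(ipaddress.ip_network(line, strict=False))
    merged = ipaddress.collapse_addresses(networks)
    return "\n".join(str(net) for net in merged)

sample = (
    "66.137.24.195/32\n"
    "\n"
    "66.137.24.194/32\n"
    "185.41.46.10/32\n"
    "185.41.46.0/24\n"
    "66.137.24.194/32\n"
)
print(merge_cidr_text(sample))
# 66.137.24.194/31
# 185.41.46.0/24
```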
-
Hello Dustin,
nice to see that you did it, and thank you for pointing me to the netaddr module.
I have played a little with it and I can already see two DNS tasks which can make use of it.
Cheers
Claudia
-
I would greatly appreciate this feature too, preferably out of the box, i.e. without any plugins. Otherwise, it would be good to build this functionality into a plugin.
I noticed that the TextFX plugin is dying, so is there any other promising plugin that could implement this?
guy038, the method you proposed is cool, and it works out of the box, which is an advantage over Pythonscript! Thanks for your effort! Brilliant!
-
BTW, I noticed that the last version of Pythonscript was published in 2014, and it worries me a lot. I wasn’t able to install it from the plugin manager (it threw an unknown exception) and had to do it manually.
So, with every new version of NPP, compatibility issues will get bigger and bigger, and we should search for a replacement anyway.
-
I don’t know that you should be “worried, a lot” about the Pythonscript plugin being last published in 2014. Perhaps that just means it is very stable and has few bugs in need of fixing. :-)
What does “search for a replacement” mean??
@Dave-Brotherstone is the author of the Pythonscript plugin, as well as the PluginManager plugin. He has recently been working a lot on updating the PluginManager and finding a new good site for hosting the plugins it manages. I think this work also includes heading toward a build of a 64-bit version…so he is busy, but I’m guessing that when that work is complete he will also strive to achieve a 64-bit build of Pythonscript, as well. So I think it is far from dead, even though it hasn’t been updated since 2014. Just guesses, but maybe educated guesses?
I can think of a few pieces of software that I use that haven’t had new releases since 2007-2009, so to me 2014 is relatively recent!
-
@Scott-Sumner said:
Perhaps that just means it is very stable and has few bugs in need of fixing. :-)
Bold claim!)
For me, it often means that the project is simply abandoned. Abandoned and not maintained for a long time, which means that:
- the number of bugs has increased exponentially since then
- new features haven’t been implemented for a long time either, whereas technologies are moving forward.
- not to mention that, in most cases, those projects have a poor community where it’s impossible to get any issue resolved.
But this is the common case, and maybe Pythonscript is a different matter.
P.S. Considering that the author of Plugin Manager and the author of Pythonscript are the same person, and that he’s been working a lot on updating the PluginManager, it is especially strange to face issues while installing PS from the Manager. I was unable to do it, and I’m not the only one.
-
It IS rather strange that the Pythonscript plugin has trouble installing via PluginManager, given a common author, I’ll grant you that.
I don’t know the full explanation of the Pythonscript + PluginManager problems, but I haven’t let that discourage me. As you have discovered, installing Pythonscript another way works. I have been using Pythonscript deeply since my very first day of using Notepad++, and I can say I’ve found bugs, but not many. Workarounds are the mainstay of Notepad++ (power) users. Bugs exist in software. Features are lacking. These are just facts.
-
@Dustin-Cook BTW, the script from Dustin doesn’t work for me (nor do the first ones): it just clears the editor containing the test data. Maybe I’m doing something wrong?
-
@Scott-Sumner said:
@Dave-Brotherstone is the author of the Pythonscript plugin, as well as the PluginManager plugin
It appears Dave may be @bruderstein on this forum…maybe if he sees this he could comment on the status of Pythonscript…I’m curious about the state of even the 32-bit version as I just got bit again by the notepad.getFiles() bug. :(
That bug is detailed here: https://github.com/bruderstein/PythonScript/issues/22
I’d fix the bug in a local build as a temporary measure, but thus far I have been unable to build Pythonscript from source (okay, I haven’t worked extensively on getting it to build, but Notepad++ is an easy one to build, and I was hoping for the same from Pythonscript).
-
Scott, what is the error you get? In the past, I compiled it a couple of times
and I can’t remember having any trouble doing it.
Cheers
Claudia
-
Since we’re already way off topic, can we take a discussion of building Pythonscript offline to email? If an offline discussion turns up something really interesting, I can post back here (under a new topic) to share with the Community.
-
Sure. @guy038, as far as I remember, you are already in contact with Scott,
would you be so kind as to forward my email to him?
Thanks
Claudia
-
Hello Claudia,
No problem : I’ve just sent your e-mail address to Scott !
Good Python discussions, with Scott ;-)
Cheers,
guy038