Check which numbers are not included
-
Hello everyone,
I need to go through a list and check which numbers are not included.
For example:
Input:
1
3
4
5
The number 2 is missing so the output should be 2
As the list I have has around 9k missing numbers this would take too much time to do it manually.
Is there a plugin I can use to make this easier?
Thanks and greetings from Germany!
-
Notepad++ isn’t going to be able to directly help you with that. You need a programming language. Of course you can use one of Notepad++'s scripting languages to do it on the contents of the current editor buffer, or a standalone language using your disk file as input works well, too. Good luck.
-
Hello @miko-merz and All,
I think I have found a way to do it from within Notepad++ but, first, I would like some details on your real text :
- From the list, below :
1 2 4 5 6 7 9 10 12
Do you expect the following results ?
3 8 11
- Or, maybe, you just need a complete list of consecutive numbers, between two boundaries n1 and n2 ?
So, a short extract of, both, your original text and the final text expected, would be welcome !
See you later,
Best Regards,
guy038
-
I’m not exactly sure where @guy038 is heading, but his reply made me rethink my original nay-saying answer.
Here’s something you might try:
-
Create a new editor tab window in Notepad++
-
Use the Edit (menu) -> Column Editor… -> Number to Insert feature to create a column of numbers that completely covers your range of interest into the new editor tab
-
Add a line of 8 dashes at the bottom of this complete number list:
--------
-
Copy your real dataset (the one with the missing numbers) into the new tab BELOW the line of dashes
-
Run the following redmarking/bookmarking operation:
Invoke Mark dialog (Search (menu) -> Mark…) <-- Suggestion: Tie ctrl+m to this action
Find what zone: (?s)^(\d+)$(?=.*?-{8}.*?^\1$)
Mark line checkbox: ticked
Purge for each search checkbox: ticked (not strictly necessary but good if you are experimenting…)
Wrap around checkbox: ticked
Search mode selection: Regular expression
Action: Press Find All button
At this point you should see, in the complete list (from the top of the file to the line of dashes), that the numbers missing from your real dataset remain un-redmarked and on lines that are not bookmarked, and you can do with them what you will from that point…
So, taking @guy038’s list and running the above on it, I obtained the following:
[Ignore the orange and green blobs in the margin, and your bookmarking symbol will be a blue circle instead of my cool-looking real bookmark symbol.]
If this is just a one-time or an occasional need, this technique could satisfy it.
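For anyone who wants to see what the marking expression picks out without opening Notepad++, here is a minimal Python sketch. It is not part of the technique above, only an illustration of it: the function name `unmarked_numbers` is made up for this example, the MULTILINE flag is added because Python (unlike Notepad++) does not treat `^` and `$` as line anchors by default, and the text is assumed to use plain `\n` line endings.

```python
import re

# The "Find what" expression from the steps above. The extra "m" flag makes
# ^ and $ match at line boundaries, which Notepad++ does by default.
MARK_RE = re.compile(r"(?ms)^(\d+)$(?=.*?-{8}.*?^\1$)")


def unmarked_numbers(buffer_text):
    """Return the numbers above the dash line that the regex would NOT mark.

    Hypothetical helper: buffer_text is the complete list, a line of eight
    dashes, then the real dataset, one number per line, "\n" line endings.
    """
    full_part = buffer_text.split("-" * 8, 1)[0]
    # Every match is a line of the complete list that also occurs below the dashes.
    marked = {m.group(1) for m in MARK_RE.finditer(buffer_text)}
    return [line for line in full_part.split() if line not in marked]


if __name__ == "__main__":
    full_list = "\n".join(str(n) for n in range(1, 13))
    real_data = "\n".join("1 2 4 5 6 7 9 10 12".split())
    print(unmarked_numbers(full_list + "\n--------\n" + real_data + "\n"))
    # prints ['3', '8', '11']
```

The lookahead rescans the rest of the text for every line, so this is fine for a few thousand numbers but is likely to slow down on very large lists.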
-
-
@Scott-Sumner, @guy038 etc.
When I saw the OP I thought it should be a fairly simple regex, once the file was configured with a FULL number list above, a delimiter and the area below to compare to, but then this:
@Miko-Merz said:
As the list I have has around 9k missing numbers this would take too much time to do it manually.
got me concerned. We are potentially talking about creating a number sequence list running possibly into the millions. I looked online and one website will make lists for free, however their list can only number 10000 consecutive numbers. With 9000 numbers possibly missing I couldn’t see an easy way to generate the full list to start with. It seemed like cracking a walnut with a 10lb sledgehammer.
I think my regex was almost verbatim to Scott’s one. I guess we’ll have to see if Guy can pull another white rabbit out of the hat.
Would NOT a python script or similar do it very easily? It would only need start and end numbers, add 1 each time and test, print the answer if not found! I guess I need to broaden my horizons.
Terry
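A rough sketch of the script idea described in this post (start and end numbers, add 1 each time, test, print if not found) might look like the following in Python. Nobody in the thread posted this; the function name `find_missing` and its parameters are invented for illustration, and the input is assumed to be already parsed into integers.

```python
def find_missing(present, start, end):
    """Return every integer in [start, end] that is not in `present`.

    Hypothetical helper: `present` is any iterable of ints.
    """
    present_set = set(present)  # set membership keeps each test cheap
    missing = []
    n = start
    while n <= end:             # "add 1 each time and test"
        if n not in present_set:
            missing.append(n)   # "print the answer if not found"
        n += 1
    return missing


if __name__ == "__main__":
    # The example from the original question: 2 is the missing number.
    print(find_missing([1, 3, 4, 5], 1, 5))  # prints [2]
```

Because the lookup goes through a set, a range running into the millions should still finish quickly, which is the case the regex approaches may struggle with.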
-
I guess I missed the “9k” thing in the OP, but in the best case this could mean only 18000 numbers. :-)
It is always kind of iffy to offer up a regex solution when you don’t know if the user is talking about a “big data” situation. We’ve seen problems with this before here on the Community and I believe this kind of thing can happen when the data gets “too big”.
A script could do it, as I originally said, but I’m not writing one. I typically only do that when it can be of some use to me as well…sorry for my selfish attitude. :-) I guess I played around with the regex not because it would be useful to me, but just to see if it could be done.
-
Off Topic Answer:
-
install Strawberry Perl
-
Save the file below as
npp16392.pl
-
perl npp16392.pl FILENAME
- or give it no filename, and paste the data to STDIN:
perl npp16392.pl
- or also give it output redirection:
perl npp16392.pl INFILE > OUTFILE
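use warnings;
use strict;

my $max = 0;
my %n;
# Read one number per line from the named file(s), or from STDIN
while (<>) {
    chomp;
    $max = $_ if $_ > $max;    # remember the largest number seen
    $n{$_}++;                  # record that this number is present
}
# Every integer from 1 to $max that was never seen is missing
my @missing = grep { !exists $n{$_} } 1 .. $max;
print join "\n", @missing;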
-
-
Curious where the 16392 came from? :-)
-
Hi, @miko-merz @scott-sumner, @peterjones and All,
Bingo, Scott ;-)) Indeed, I was thinking of something very similar !
So, Nico, let’s suppose that your numbers are all in the range [300....700]
-
Open a new tab in N++ ( Ctrl + N )
-
Hit the Enter key to create a first line-break
-
Open the Replace dialog ( Ctrl + H )
- SEARCH \R
- REPLACE \r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n
- Tick the Wrap around option
- Select the Regular expression search mode
- Click 3 times on the Replace All button => You should get 1000 ( = 10^3 ) pure blank lines
- Hit the Esc key to close the Replace dialog
-
Now, open the Column editor ( Alt + C )
- Select the option Number to Insert
- Type in 300 as the Initial number
- Type in 1 in the Increase by field
- Leave the options Repeat and Leading zeros empty
- The format is decimal, by default
- Click on the OK button
-
Then, remove all blank characters at end of lines ( Edit > Blank Operations > Trim Trailing Space )
-
Open the Go To dialog
- Type in 400 ( 700-300 )
- Hit the Enter key => The cursor is on the line containing 699 !
-
Place your cursor at the beginning of the line containing 701
-
Select all lines till the end of file ( Ctrl + Shift + End ) and delete them
At this point, we have a 401-line file, numbered from 300 to 700
-
Now, add a separation line of, at least, 3 dashes
---
-
Finally, and it’s the main point, UNDER the dashes line, add your OWN list of numbers ( where some numbers are absent ! )
-
Open, again, the Replace dialog ( Ctrl + H )
- SEARCH (?s)(^\d+\R)(?=.*---.*^\1)|---.*
- REPLACE Leave EMPTY
- Tick the Wrap around option
- Select the Regular expression search mode
- Click, ONCE, on the Replace All button
-
=> You should obtain only the short list of integers, between 300 and 700, which are absent from your original list ;-))
Remark : In order to test my regex, I simply copied the consecutive list of numbers under the --- line and deleted some numbers ( for example : 302 , 361 , 426 , 491 , 653 )
Cheers,
guy038
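The test described in the remark above can be replayed outside Notepad++ with a short Python sketch. This is an approximation, not the exact Notepad++ behaviour: Python's `re` module has no `\R`, so `\n` is used instead and `\n` line endings are assumed, and the MULTILINE flag is set explicitly. The function name `absent_numbers` is made up for this example.

```python
import re

# The SEARCH expression from the steps above, adapted for Python's re module:
# \R is replaced by \n, and the "m" flag makes ^ match at every line start.
REPLACE_RE = re.compile(r"(?ms)(^\d+\n)(?=.*---.*^\1)|---.*")


def absent_numbers(low, high, own_numbers):
    """Return the integers in [low, high] that are absent from own_numbers.

    Hypothetical helper that mimics the editor buffer: the consecutive list,
    a --- line, then the user's own list, one number per line.
    """
    full_list = "".join(f"{n}\n" for n in range(low, high + 1))
    own_list = "".join(f"{n}\n" for n in own_numbers)
    buffer_text = full_list + "---\n" + own_list
    # Same effect as Replace All with an empty replacement: numbers found
    # below the --- line are deleted, then the --- line and all that follows.
    remaining = REPLACE_RE.sub("", buffer_text)
    return [int(line) for line in remaining.split()]


if __name__ == "__main__":
    deleted = {302, 361, 426, 491, 653}
    own = [n for n in range(300, 701) if n not in deleted]
    print(len(range(300, 701)))            # prints 401
    print(absent_numbers(300, 700, own))   # prints [302, 361, 426, 491, 653]
```

Here `range(300, 701)` stands in for the blank-line and Column editor steps, which only exist to produce the consecutive list.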
-
-
That’s easy: from the URL.
https://notepad-plus-plus.org/community/topic/16392/check-which-numbers-are-not-included/9
                                              ^^^^^