
Check which numbers are not included

  • M
    Miko Merz
    last edited by Sep 27, 2018, 3:50 AM

    Hello everyone,

    I need to go through a list and check which numbers are not included.

    For example:
    Input:
    1
    3
    4
    5

    The number 2 is missing so the output should be 2

    As my list has around 9k missing numbers, it would take too much time to do this manually.

    Is there a plugin I can use to make this easier?

    Thanks and greetings from Germany!

    • S
      Scott Sumner @Miko Merz
      last edited by Sep 27, 2018, 11:56 AM

      @Miko-Merz

      Notepad++ isn’t going to be able to directly help you with that. You need a programming language. Of course you can use one of Notepad++'s scripting languages to do it on the contents of the current editor buffer, or a standalone language using your disk file as input works well, too. Good luck.

      • G
        guy038
        last edited by Sep 28, 2018, 3:12 AM

        Hello @miko-merz and All,

        I think I have found a way to do it from within Notepad++ but, first, I would like some details about your real text:

        • From the list, below :
        1
        2
        4
        5
        6
        7
        9
        10
        12
        

        Do you expect the following results ?

        3
        8
        11
        
        • Or maybe you just need a complete list of consecutive numbers between two boundaries n1 and n2?

        So, a short extract of both your original text and the expected final text would be welcome!

        See you later,

        Best Regards,

        guy038

        • S
          Scott Sumner
          last edited by Sep 28, 2018, 12:56 PM

          I’m not exactly sure where @guy038 is heading, but his reply made me rethink my original nay-saying answer.

          Here’s something you might try:

          • Create a new editor tab window in Notepad++

          • Use the Edit (menu) -> Column Editor… -> Number to Insert feature to create a column of numbers that completely covers your range of interest into the new editor tab

          • Add a line of 8 dashes at the bottom of this complete number list: --------

          • Copy your real dataset (the one with the missing numbers) into the new tab BELOW the line of dashes

          • Run the following redmarking/bookmarking operation:

          Invoke Mark dialog (Search (menu) -> Mark…) <- Suggestion: Tie Ctrl+M to this action
          Find what zone: (?s)^(\d+)$(?=.*?-{8}.*?^\1$)
          Mark line checkbox: ticked
          Purge for each search checkbox: ticked (not strictly necessary but good if you are experimenting…)
          Wrap around checkbox: ticked
          Search mode selection: Regular expression
          Action: Press Find All button

          At this point, look at the complete list (from the top of the file to the line of dashes): the numbers that are missing from your real dataset remain un-redmarked, on lines that are not bookmarked, and you can do with them what you will from there.

          So, taking @guy038’s list and running the above on it, I obtained the following:

          [Screenshot of the result, originally hosted on Imgur]

          [Ignore the orange and green blobs in the margin, and your bookmarking symbol will be a blue circle instead of my cool-looking real bookmark symbol.]

          If this is just a one-time or an occasional need, this technique could satisfy it.
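
For anyone who wants to check the Mark expression outside Notepad++, here is a minimal Python sketch of the same idea. It is only an approximation: Python's `re` module stands in for Notepad++'s regex engine, `(?s)` is written as `re.S`, and `re.M` is added so that `^` and `$` match at line boundaries. The function name `unmarked_numbers` is made up for this illustration.

```python
import re

# Scott's Mark expression; re.S replaces (?s), and re.M makes ^ and $
# match at every line boundary (an assumption to mimic Notepad++).
MARK_RE = re.compile(r"^(\d+)$(?=.*?-{8}.*?^\1$)", re.S | re.M)

def unmarked_numbers(buffer_text):
    """Return the numbers above the dashes that the regex would NOT mark."""
    full_part = buffer_text.split("--------", 1)[0]
    marked = {m.group(1) for m in MARK_RE.finditer(buffer_text)}
    return [line for line in full_part.split() if line not in marked]

# Complete list 1..12, then the dashes, then guy038's sample list.
full_list = "\n".join(str(n) for n in range(1, 13))
dataset = "\n".join(["1", "2", "4", "5", "6", "7", "9", "10", "12"])
buffer_text = full_list + "\n--------\n" + dataset + "\n"

print(unmarked_numbers(buffer_text))
```

The lines the regex leaves unmarked in the upper part are exactly the numbers missing from the lower part, which is why the un-bookmarked lines are the answer.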

          • T
            Terry R
            last edited by Terry R Sep 28, 2018, 8:25 PM Sep 28, 2018, 8:24 PM

            @Scott-Sumner, @guy038 etc.

            When I saw the OP I thought it should be a fairly simple regex, once the file was configured with a FULL number list above, a delimiter, and the area below to compare to, but then this:

            @Miko-Merz said:

            As my list has around 9k missing numbers, it would take too much time to do this manually.

            got me concerned. We are potentially talking about creating a number sequence list running possibly into the millions. I looked online and one website will make lists for free, however their list can only number 10000 consecutive numbers. With 9000 numbers possibly missing I couldn’t see an easy way to generate the full list to start with. It seemed like cracking a walnut with a 10lb sledgehammer.

            I think my regex was almost verbatim to Scott’s one. I guess we’ll have to see if Guy can pull another white rabbit out of the hat.

            Would NOT a python script or similar do it very easily? It would only need start and end numbers, add 1 each time and test, print the answer if not found! I guess I need to broaden my horizons.
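
A rough sketch of that idea in Python, assuming the numbers are already loaded into a list and the start and end values are known. The function name `missing_numbers` is hypothetical, not something from the thread.

```python
def missing_numbers(present, start, end):
    """Return every integer in [start, end] that is not in `present`."""
    seen = set(present)  # a set keeps each membership test cheap
    return [n for n in range(start, end + 1) if n not in seen]

# The example from the original post: 1, 3, 4, 5 -> 2 is missing.
sample = [1, 3, 4, 5]
print(missing_numbers(sample, 1, 5))
```

Using a set rather than scanning the list for each candidate matters once the range runs into the hundreds of thousands.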

            Terry

            • S
              Scott Sumner @Terry R
              last edited by Sep 28, 2018, 8:36 PM

              @Terry-R

              I guess I missed the “9k” thing in the OP, but in the best case this could mean only 18000 numbers. :-)

              It is always kind of iffy to offer up a regex solution when you don’t know if the user is talking about a “big data” situation. We’ve seen problems with this before here on the Community and I believe this kind of thing can happen when the data gets “too big”.

              A script could do it, as I originally said, but I’m not writing one. I typically only do that when it can be of some use to me as well…sorry for my selfish attitude. :-) I guess I played around with the regex not because it would be useful to me, but just to see if it could be done.

              • P
                PeterJones
                last edited by PeterJones Sep 28, 2018, 8:55 PM Sep 28, 2018, 8:54 PM

                Off Topic Answer:

                1. install Strawberry Perl

                2. Save the file below as npp16392.pl

                3. Run perl npp16392.pl FILENAME

                  • or give it no filename, and paste the data to STDIN: perl npp16392.pl
                  • or also give it output redirection: perl npp16392.pl INFILE > OUTFILE

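                  use warnings;
                  use strict;

                  my $max = 0;
                  my %n;                        # numbers seen in the input
                  while (<>) {
                      chomp;
                      $max = $_ if $_ > $max;   # track the largest number
                      $n{$_}++;
                  }
                  # every number from 1 to $max that never appeared in the input
                  my @missing = grep { !exists $n{$_} } 1 .. $max;
                  print join "\n", @missing;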

                • S
                  Scott Sumner @PeterJones
                  last edited by Sep 28, 2018, 9:26 PM

                  @PeterJones

                  Curious where the 16392 came from? :-)

                  • G
                    guy038
                    last edited by guy038 Sep 28, 2018, 10:47 PM Sep 28, 2018, 10:37 PM

                    Hi, @miko-merz @scott-sumner, @peterjones and All,

                    Bingo, Scott ;-)) Indeed, I was thinking of something very similar !


                    So, Miko, let’s suppose that your numbers are all in the range [300..700]

                    • Open a new tab, in N++ ( Ctrl + N )

                    • Hit the Enter key to create a first line-break

                    • Open the Replace dialog ( Ctrl + H )

                      • SEARCH \R

                      • REPLACE \r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n

                      • Tick the Wrap around option

                      • Select the Regular expression search mode

                      • Click 3 times on the Replace All button => You should get 1000 ( = 10^3 ) pure blank lines

                      • Hit the Esc key to close the Replace dialog

                    • Now, open the Column editor ( Alt + C )

                      • Select the option Number to Insert

                      • Type in 300 as the Initial number

                      • Type in 1 , in the Increase by field

                      • Leave the options Repeat and Leading zeros empty

                      • The format is decimal, by default

                      • Click on the OK button

                    • Then, remove all blank characters at end of lines ( Edit > Blank Operations > Trim Trailing Space )

                    • Open the Go To dialog

                      • Type in 400 ( 700-300 )

                      • Hit the Enter key => The cursor is on line 400, which holds the number 699 !

                    • Place your cursor at the beginning of the line that holds the number 701 ( line 402 )

                    • Select all lines till the end of file ( Ctrl + Shift + End ) and delete them

                    At this point, we have a 401-line file, numbered from 300 to 700
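
If the manual steps feel long, the same reference list can be produced by a small script and pasted into the new tab. This is only a sketch: the function name `consecutive_lines` is invented here, and the `\r\n` line ending is an assumption matching the Windows line breaks used in the Replace step.

```python
def consecutive_lines(first, last, eol="\r\n"):
    """Return the integers first..last, one per line, as a single string."""
    return eol.join(str(n) for n in range(first, last + 1)) + eol

reference_text = consecutive_lines(300, 700)
lines = reference_text.splitlines()
print(len(lines), lines[0], lines[-1])
```

The result can be written to a file or copied to the clipboard, and the remaining steps stay the same.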

                    • Now, add a separation line of, at least, 3 dashes ---

                    • Finally, and it’s the main point, UNDER the dashes line, add your OWN list of numbers ( where some numbers are absent ! )

                    • Open, again, the Replace dialog ( Ctrl + H )

                      • SEARCH (?s)(^\d+\R)(?=.*---.*^\1)|---.*

                      • REPLACE Leave EMPTY

                      • Tick the Wrap around option

                      • Select the Regular expression search mode

                      • Click, ONCE, on the Replace All button

                    => You should obtain, only, the short list of integers, between 300 and 700, which are absent from your original list ;-))

                    Remark : In order to test my regex, I simply copied the consecutive list of numbers under the --- line and deleted some numbers ( for example : 302 , 361 , 426 , 491 , 653 )
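
The same test can be reproduced outside Notepad++ with a short Python sketch. Two things are assumptions here: Python's `re` module has no `\R` escape, so `\n` is used instead, and `re.M` is added so that `^` matches at every line start. The function name `absent_numbers` is made up for the example.

```python
import re

# guy038's search expression with \R written as \n; re.S replaces (?s).
ABSENT_RE = re.compile(r"(^\d+\n)(?=.*---.*^\1)|---.*", re.S | re.M)

def absent_numbers(reference, own_list):
    """Build 'reference, ---, own list' and apply the Replace All step."""
    text = (
        "\n".join(map(str, reference))
        + "\n---\n"
        + "\n".join(map(str, own_list))
        + "\n"
    )
    # Replacing every match with nothing leaves only the absent numbers.
    return ABSENT_RE.sub("", text).split()

removed = {302, 361, 426, 491, 653}
reference = range(300, 701)
own = [n for n in reference if n not in removed]
print(absent_numbers(reference, own))
```

The first alternative deletes each reference line that reappears below the dashes, and the second alternative deletes the dashes together with everything after them.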

                    Cheers,

                    guy038

                    • P
                      PeterJones
                      last edited by Oct 1, 2018, 1:02 AM

                      @Scott-Sumner,

                      That’s easy: from the URL.

                      https://notepad-plus-plus.org/community/topic/16392/check-which-numbers-are-not-included/9
                                                                    ^^^^^
                      