Can I find missing lines with a regular expression?



  • Hello, and happy new year everyone. I have a question. Is it possible to find missing lines with a regular expression?

    I have a lot of lines with numbers from 1 to 2000. They are IDs extracted from a ranking PHP script, and I want to find out if I missed a number.

    In this example I wrote only 8 lines, with different numbers. Two numbers are missing: 5 and 9.

    1
    2
    3
    4
    6
    7
    8
    10

    My desired output is [5,9]. To do this, I believe I should use a regex that counts the lines in Notepad and compares each line count with the number on that line. If a number is missing, the cursor should stop at the line before the missing one. Something like this.

    Can anyone help me?



  • Hello, @robin-cruise, and All,

    Happy New Year to you and to any N++ user, too !

    Regarding your problem, if all lines of your file begin with a line number, here is my solution :

    • Copy/paste your file contents in a new tab

    • At the end of this new tab, after your contents, add 2,000 blank lines, using repeated line breaks and/or several Ctrl-C/Ctrl-V actions

    • Now, move again to the first blank line, after your contents

    • With the column editor ( ALT + C ), insert increasing numbers on all the remaining blank lines ( check the leading zeros option if necessary, and verify that the final number is equal to the total of your contents, 2000 )

    • On the entire file, perform a classical sort ( option Edit > Line Operations > Sort lines Lexicographically Ascending )

    • Now, either :

      • Move to any missing line on every occurrence of the search regex ^(\d+)\R(?!\1)

      • Execute the regex S/R, below, to see the complete list of missing lines :

    SEARCH : (?-s)^(?:(\d+)\R\1.*\R)+

    REPLACE Leave EMPTY
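
    For readers who would rather script the check, here is a minimal Python sketch of the same idea. It is not part of the original procedure: the function names and the `first` / `last` parameters are hypothetical, and it assumes the file holds one number per line, all inside the expected range. Python's `re` module has no `\R`, so `\n` stands in for it, and the numbers are zero-padded so that a lexicographic sort keeps them in numeric order.

```python
import re


def missing_by_set(text, first, last):
    """Direct check: list every expected number that never appears."""
    present = {int(tok) for tok in text.split()}
    return [n for n in range(first, last + 1) if n not in present]


def missing_by_regex(text, first, last):
    """Mimic the merge / sort / remove-pairs steps with a regex."""
    width = len(str(last))
    # Zero-pad so lexicographic order equals numeric order.
    # A set drops duplicates, which would otherwise survive as false "missing" lines.
    have = {tok.zfill(width) for tok in text.split()}
    expected = [str(n).zfill(width) for n in range(first, last + 1)]
    merged = "\n".join(sorted(list(have) + expected)) + "\n"
    # Delete every number that occurs on two consecutive lines.
    # What remains occurs once, so it came only from the generated list.
    leftover = re.sub(r"^(\d+)\n\1\n", "", merged, flags=re.MULTILINE)
    return [int(tok) for tok in leftover.split()]


sample = "1\n2\n3\n4\n6\n7\n8\n10\n"
print(missing_by_set(sample, 1, 10))    # [5, 9]
print(missing_by_regex(sample, 1, 10))  # [5, 9]
```

    The set-based function is the simpler choice when a script is available; the regex version is only there to show what the sort and replace steps accomplish.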

    I hope that these two workarounds will help you ;-))

    Best Regards,

    guy038



  • Hello, @guy038. I followed all your steps. I have numbers from 5,000 to 6,090. Edit > Line Operations > Sort lines Lexicographically Ascending: OK. Now, if I search for ^(\d+)\R(?!\1), it's OK: it jumps over the duplicate numbers, or over those lines whose numbers are not in order. Works great. Beautiful.

    But the second regex is not working. Please check this out: https://snag.gy/VC7LKk.jpg



  • Sorry, both regexes are working fine. Thank you very much.

