Regex - Select the first 10 lines (including blank ones) in 200 files



  • I am trying to write a regex that selects the first 10 lines (including blank ones) in more than 200 .txt files. I will use Search and Replace All. I tried something like this, but it doesn’t work:

    ^\A(.*\n){10}



  • Hi, vasile,

    Are your 200 files Windows files with CR LF line breaks, or do they use LF only, like Unix files ?

    If you’re using the default Windows End of Line, I think that a correct regex would be \A(?-s)(?:^.*\R){10}

    I just used a non-capturing group, as I supposed that the contents of the tenth line are not needed ! If you need to re-use the whole ten-line block in the replacement, just surround it with parentheses : \A(?-s)((?:^.*\R){10})

    BEWARE : Due to the very bad handling of backward assertions by our present Boost regex engine, this regex WRONGLY matches every block of 10 lines in the current file, instead of matching ONLY the first 10 lines of the file :-((

    I tried to find an equivalent regex which would not use the \A syntax, but this is just impossible !

    So, unfortunately, the remaining solutions are :

    • Use the François-R Boyer regex engine with a N++ version <= 6.9, which correctly handles backward assertions

    • Use a Lua or Python script ( Moreover, I know that the Python regex engine correctly supports the \A assertion )

    • Create a N++ macro, to select the first ten lines of each file

    Best Regards

    guy038
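
A minimal Python sketch of the script route mentioned above. It assumes the files use CR LF or LF line breaks, and it spells the line break out explicitly because Python's `re` module has no `\R` escape. The names `EOL`, `head_pattern`, `first_lines_span` and `count_matches` are hypothetical, chosen for illustration only. In `re`, `\A` matches only at the real start of the string, so the pattern cannot match a later block of ten lines.

```python
import re

# Assumption: CR LF or LF line breaks only (Python's re has no \R escape).
EOL = r"(?:\r\n|\n)"


def head_pattern(n=10):
    """Build a pattern for the first n lines, anchored at the start of the text."""
    return re.compile(r"\A(?:[^\r\n]*" + EOL + r"){" + str(n) + r"}")


def first_lines_span(text, n=10):
    """Return (start, end) of the first n lines, or None if the text is shorter."""
    match = head_pattern(n).search(text)
    return match.span() if match else None


def count_matches(text, n=10):
    """Count how many times the anchored pattern matches in the whole text."""
    return sum(1 for _ in head_pattern(n).finditer(text))
```

Because the pattern is anchored with `\A`, `count_matches` should never report more than one match, however many blocks of ten lines the text contains.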



  • Hello guy038. Happy New Year… almost !!

    So, your regexes are good… but they don’t work too well with Replace All. In practice, it replaces the lines ten by ten until the end of the file, and this deletes the whole file.

    But I found a very good answer on the internet, the best answer:

    To remove the first 10 lines, you need to match the whole document text but only capture lines from 11 to the end.

    Find: \A.*(?:\R.*){9}\R?([\s\S]*)
    Replace: $1

    To remove the last 10 lines:

    Find: ^.*(?:\R.*){9}\z
    Replace: empty



  • Hi, Vasile,

    I already thought about this nice solution : catch all the contents of the file and, with parentheses grouping, split these contents in two parts : the first ten lines as one part, and the remainder of the file as the other part

    I didn’t speak about this solution for two reasons :

    • Firstly, you didn’t say, in your previous post, what exactly you wanted to do with these 10 lines, throughout your 200 files

    • Secondly, I didn’t know the average size of your files and, as group 1 must catch practically all the contents of each file, I supposed that it could be a problem to apply such a regex to a very big file !?

    My regex version, which gives the same result, would be :

    SEARCH \A(?-s)(?:^.*\R){10}(?s)(.*)

    REPLACE \1

    I just added the part (?s)(.*), which stores any text located after the first ten lines in group 1

    These new regexes certainly work fine when using the Find in Files feature. However, for a single file, be aware of the behaviour of my regex and your regex :

    • If you redo the S/R, it just deletes the next block of ten lines ! ( So the initial lines 11 to 20 ) Quite normal, as the very beginning of the file has changed !

    • If you move the caret to the beginning of a line, let’s say, in the middle of the file, it will simply delete the next ten lines, too !


    For deleting the last ten lines of a file, I propose :

    SEARCH (?:^.*\R){10}\z

    REPLACE Empty

    Of course, as the forward assertions, like \z, are not bugged, the regex is simpler :-))

    BTW, your regex, for deleting the last lines of a file, seems to delete 9 lines, only !

    Cheers,

    guy038
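
The "match everything, keep only the remainder" idea discussed above can also be sketched as a small Python batch script. This is only an illustration under stated assumptions: the files use CR LF or LF line breaks and an ASCII-compatible encoding such as UTF-8 or ANSI (the script works on raw bytes, so UTF-16 files would not work), and the names `HEAD_AND_REST`, `drop_first_ten_lines`, `process_folder`, `folder` and `glob` are hypothetical.

```python
import re
from pathlib import Path

# Byte pattern: ten complete lines at the very start, then the remainder in group 1.
# Assumption: CR LF or LF line breaks (Python's re has no \R escape).
HEAD_AND_REST = re.compile(rb"\A(?:[^\r\n]*(?:\r\n|\n)){10}(.*)", re.DOTALL)


def drop_first_ten_lines(data):
    """Return data without its first ten lines; shorter inputs are left unchanged."""
    return HEAD_AND_REST.sub(lambda m: m.group(1), data, count=1)


def process_folder(folder, glob="*.txt"):
    """Rewrite each matching file in place and return the names of changed files."""
    changed = []
    for path in sorted(Path(folder).glob(glob)):
        original = path.read_bytes()  # bytes keep the line endings untouched
        updated = drop_first_ten_lines(original)
        if updated != original:
            path.write_bytes(updated)
            changed.append(path.name)
    return changed
```

As with the S/R itself, running `process_folder` a second time on the same folder removes the next ten lines of each file, so it is worth trying it on a copy of the files first.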



  • Your last regex, to delete the last 10 lines, (?:^.*\R){10}\z, doesn’t work on my side.



  • Vasile, and All,

    Ah yes ! I suppose that you tested my regex against a file whose last line does not end with a line break ! We have to slightly change the regex by adding a ? after \R

    So, the final general regex to delete the last N lines of a file should be (?:^.*\R?){N}\z :-)) Just replace the upper-case letter N with the suitable integer !

    It gives me the opportunity to wish you, and all people of the N++ Community, a very Happy New Year 2017 !

    Although the very recent events in Turkey mean that 2017 is not going to be better than 2016 :-(( What an insult to Intelligence and Wisdom !

    Cheers

    guy038
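
For anyone taking the script route, here is a rough Python transcription of the "last N lines" regex above. It is a sketch under assumptions: CR LF or LF line breaks only, `\Z` used as Python's end-of-string anchor in place of `\z`, and a hypothetical function name `drop_last_lines`. The optional line break plays the same role as the `?` after `\R`: it lets the last line match whether or not the file ends with a line break.

```python
import re


def drop_last_lines(text, n=10):
    """Return text without its last n lines, with or without a final line break."""
    # ^ needs re.MULTILINE to match at each line start; \Z is the end of the string.
    # Assumption: CR LF or LF line breaks (Python's re has no \R escape).
    pattern = re.compile(r"(?:^[^\r\n]*(?:\r\n|\n)?){%d}\Z" % n, re.MULTILINE)
    match = pattern.search(text)
    # search() returns the leftmost match, which is the one covering n full lines.
    return text[:match.start()] if match else text
```

The function cuts the text at the start of the match instead of calling `re.sub`, so only the single leftmost match is ever removed.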



  • (?:^.*\R?){10}\z

    WORKS ! Thanks

