Keep 1st,4th,8th,12th.....so on lines, multiple of 4 and delete rest.



  • I have around 500 text files, each with about 150 lines of URLs on average.
    I want to keep only the 1st, 4th, 8th … lines and delete the rest.
    How can I do this? I can record macros to run it on multiple text files once I know how to do it on one file.
    Thanks.



  • @R-k said in Keep 1st,4th,8th,12th.....so on lines, multiple of 4 and delete rest.:

    I want to keep only 1st,4th, 8th …so on lines and delete rest line

    I have a solution. It involves adding a number to the start of each line, with each number repeated 4 times before it increments. A regex (regular expression) then marks the first 3 lines of each set of 4. Once you confirm that the marked lines are correct (mainly the first and last few lines, as the line count is unlikely to be an exact multiple of 4), you can remove them.

    So the steps are:

    1. Place the cursor at the first position of the first line. Select “Column editor” under the Edit menu and select “Number to Insert”, start at 1, increase by 1 and repeat 4. Also select leading zeroes, as this keeps the length of the number constant (useful later).
      Select OK to create the numbers.
    2. Now, using the “Mark” function (usually Ctrl-M), enter the regex
      Find What:^(\d+).+\R(?=\1)
      tick the bookmark line option and click “Mark All”.
    3. Verify that the marked lines are correct. You WILL need to unmark line 1, and also check the last few lines in each file, as it’s unlikely the file will end on an exact multiple of 4.
    4. Once the marked lines are correct, use “Remove Bookmarked Lines”, which is under the Search menu, Bookmark.
    5. Finally, remove the numbers added to the start of each line. This can be done in a number of ways: a regex is one (Find What ^\d+ with an empty “Replace with” field), column deletion is another (hold the Ctrl and Alt keys, drag the mouse over the first column down the length of the file, then delete).
      Please come back afterwards and let us know how you got on.
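
    For anyone who would rather script this than click through it, the end result of the steps above (keep line 1 plus every 4th line) can be sketched in a few lines of Python. This is only an illustration and not part of the Notepad++ procedure; the function name and the sample URLs are made up for the example.

```python
def keep_first_and_every_nth(lines, n=4):
    """Return the 1st line plus every n-th line (4th, 8th, ...) of `lines`."""
    # Count from 1 so the positions match the line numbers shown in the editor.
    return [line for pos, line in enumerate(lines, start=1)
            if pos == 1 or pos % n == 0]


# Hypothetical sample data: 10 image links.
sample = [f"https://example.com/img{i}.jpg" for i in range(1, 11)]
kept = keep_first_and_every_nth(sample)
print(kept)
```

    With 10 input lines this keeps lines 1, 4 and 8; the trailing lines 9 and 10 are dropped, which is exactly the "last few lines" case that needs checking by hand in step 3.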

    Terry



  • @Terry-R said in Keep 1st,4th,8th,12th.....so on lines, multiple of 4 and delete rest.:

    I have a solution.

    Bear in mind that this works well on 1 file. As for setting it up for 500 files, I’ll leave that up to you. I think it’s possible, but it will likely involve some manual steps unless you know exactly how to deal with the first and last few lines programmatically.

    Terry



  • @Terry-R It worked, thanks! Actually, there was a lot of additional work before this. The 500 text files I am talking about were .html page sources with thousands of lines. I kept the lines with the direct links I wanted and deleted the rest of the source code, using a lot of search, regex, mark and bookmark steps. I ended up with plain .jpg URLs, on average 150 image links in each file.

    After following your steps, every file now has 30-40 links, exactly what I wanted. It was not necessary to keep exactly lines 1, 4, 8 and the other multiples of 4; I just wanted to get from 150 lines down to 30 or 40.

    I also found this before posting this question.
    Find what: .+\R(.+)
    Replace with: $1
    It can be run twice to get the desired result.
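
    To show why two passes of that replacement give roughly a quarter of the lines, here is a small Python sketch of the same idea. It is an approximation: Python's re module has no \R, so \r?\n stands in for the line break, and the helper name and sample text are invented for the example.

```python
import re


def drop_odd_lines(text):
    """One pass: each pair of lines is replaced by its second line."""
    # ".+" cannot cross a newline, so every match covers exactly two lines.
    return re.sub(r".+\r?\n(.+)", r"\1", text)


text = "".join(f"line{i}\n" for i in range(1, 9))  # line1 .. line8
once = drop_odd_lines(text)    # keeps the even-numbered lines
twice = drop_odd_lines(once)   # keeps every 4th line of the original
print(once.splitlines())
print(twice.splitlines())
```

    Note that this keeps the 4th and 8th lines but not the 1st, which was fine here since the exact lines did not matter.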

    I saved all the steps in a macro. It works on a single text file, but how can I run it on multiple files at once?
    The last step in my macro was Ctrl+Tab.

    Edit: When I run the macro on multiple files, I get empty files. It deletes everything. But it works fine on a single text file.



  • @R-k said in Keep 1st,4th,8th,12th.....so on lines, multiple of 4 and delete rest.:

    When i run macros multiple files, i get empty files. It delets every thing. But works file on single text file.

    I actually came up with another idea. When used with the “Find In Files” function it should work.
    You would record the following in the macro.

    1. Ctrl + Home keys (takes you to the first line)
    2. Line down by using down arrow, cursor should be in the first position.
    3. Add a space and a carriage return/line feed
    4. Use the “Find In Files” function with
      Find What:(?-s)^(.+\R){3}(.+\R)
      Replace With:\2
      This selects groups of 4 lines and leaves just the last one. As the first line is now followed by a blank line, it’s the blank line that is deleted, not the first line.
    5. Nothing has been said about any remaining lines (fewer than 4), so this step is yet to be determined. It could be another regex for when 3 lines remain, or 2 lines, or even just 1 line, keeping however many are required and deleting the others.
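
    The leftover-lines question in step 5 is easy to see in a quick Python sketch of the group-of-4 replacement. This is an approximation under a few assumptions: \r?\n stands in for \R, the first group is non-capturing so the kept line is \1 rather than \2, and the blank-line trick from step 3 is left out. The names are made up for the example.

```python
import re

# Three lines that are dropped, followed by one line that is kept.
GROUP_OF_FOUR = re.compile(r"^(?:.+\r?\n){3}(.+\r?\n)", re.MULTILINE)


def keep_every_fourth(text):
    """Replace each complete group of 4 lines with its last line."""
    return GROUP_OF_FOUR.sub(r"\1", text)


text = "".join(f"line{i}\n" for i in range(1, 11))  # line1 .. line10
result = keep_every_fourth(text)
print(result.splitlines())
```

    With 10 lines, the two complete groups collapse to line4 and line8, while line9 and line10 never form a group of 4 and are left in place untouched.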

    As for your next question, you likely get empty files because Ctrl-Tab cycles through the open files within Notepad++. If you don’t close (and save) a tab, I think it will cycle back to that tab and perform the same operation again. Although I’d suggest it would leave just 1 line in each tab, yes?

    When recording a macro you select the function, then supply the regex. I think a better option would be to use the Find in Files function. It works through the files in 1 or more folders, and the files don’t need to be open in Notepad++. So you will need to create a new macro using that function, unless you are good at editing macros, which is fiendish to do.
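
    If a script is an option, the same "work through a folder without opening the files" idea might look like the Python sketch below. It is a rough outline, not a Notepad++ feature: the function names, the *.txt pattern and the UTF-8 encoding are assumptions, and the demo runs on throwaway files in a temporary folder. Try it on a copy of the real files first, since it overwrites them.

```python
from pathlib import Path
import tempfile


def thin_file(path, n=4):
    """Rewrite one file, keeping line 1 plus every n-th line. Returns lines kept."""
    lines = path.read_text(encoding="utf-8").splitlines()
    kept = [line for pos, line in enumerate(lines, start=1)
            if pos == 1 or pos % n == 0]
    path.write_text("".join(line + "\n" for line in kept), encoding="utf-8")
    return len(kept)


def thin_folder(folder, pattern="*.txt", n=4):
    """Process every matching file exactly once; return {file name: lines kept}."""
    return {p.name: thin_file(p, n) for p in sorted(Path(folder).glob(pattern))}


# Demo on two throwaway files with 12 lines each.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt"):
        Path(tmp, name).write_text(
            "".join(f"url{i}\n" for i in range(1, 13)), encoding="utf-8")
    counts = thin_folder(tmp)
    first = Path(tmp, "a.txt").read_text(encoding="utf-8").splitlines()

print(counts)
print(first)
```

    Because each file is visited once, there is no risk of the operation being applied to the same file again, which is what seems to happen when a macro cycles through tabs.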

    Terry

