How to remove X numbers of lines and put them one line above?
-
Hello… First of all, thank you very much for your time… What I’m trying to do is… I have for example a .txt with 100 lines in this format…
Test1
Test2
Test3
Test4
Test5
Test6
Test7
Test8
Test9
Test10
Test11
Test12
…What I want to achieve is this…
Test1Test2Test3Test4
Test5Test6Test7Test8
Test9Test10Test11Test12
In this example the cutoff point was every 4 lines… But I want to do this for example every 3000 lines, in a file with 100000 lines.
Thank you very much for your time.
-
(All regexes below require that Search Mode = Regular expression is selected.)
Solving the simplified version is easy, because you can keep four lines in regex buffer memory with no problem.
- FIND WHAT =
  (?-s)^(.*)\R(.*)\R(.*)\R(.*)\R
  This will find 4 lines, putting the contents into groups #1-4 (the four sets of parentheses)
- REPLACE WITH =
  $1$2$3$4\r\n
  This will place those four lines all on one line.
- REPLACE ALL
- You might have to manually join the last 3-4 lines, because I didn’t add in the logic to handle it, especially since I present a multi-step process below that will always work.
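For anyone who wants to see what that search/replace does outside Notepad++, here is a rough JavaScript sketch of the same capture-group idea. The function name joinEveryFourLines is made up for illustration, and because JavaScript regex has no \R, \r?\n stands in for it (so bare CR line endings are not handled).

```javascript
// Sketch only: JavaScript has no \R, so \r?\n stands in for it here
// (that covers CRLF and LF, but not bare CR line endings).
function joinEveryFourLines(text) {
  // Four capture groups, one per line, each followed by its line ending.
  const fourLines = /^(.*)\r?\n(.*)\r?\n(.*)\r?\n(.*)\r?\n/gm;
  // Glue the four captured lines together and end the group with CRLF.
  return text.replace(fourLines, "$1$2$3$4\r\n");
}

// Build Test1..Test12, one per line, and join them four at a time.
const sample =
  Array.from({ length: 12 }, (_, i) => `Test${i + 1}`).join("\n") + "\n";
console.log(joinEveryFourLines(sample));
```

As with the regex above, a leftover group of fewer than four lines at the end of the text is not matched and stays unjoined.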
But if you’re really talking about merging 3000 lines each, with 100000 lines in the file, it’s likely to overrun memory.
Using this three-step process, I was able to create a file of about 160k lines (16384 groups of 10 lines each), and join every 3000 lines into a single line.
- Use Ctrl+Home to go to the first line in the file.
- Add a special character every 3000 lines (I chose the group separator ASCII character GS, Unicode codepoint U+001D):
    - FIND WHAT =
      (?-s)^(.*\R){3000}\K
      This will find 3000 lines, then reset the search so the regex cursor is after those 3000 lines.
      If you want a different number than 3000, put it inside the curly braces in that expression, where the 3000 is now.
    - REPLACE WITH =
      \x{001D}
      This is the group separator character, chosen because of its ASCII meaning, and because it’s not likely to be in your text file already.
    - REPLACE ALL
      Because the regex uses \K, it has to use REPLACE ALL.
      With thousands of lines, this will take a long time.
- Remove all newlines
    - FIND WHAT =
      \R
      This expression is shorthand for \r\n (Windows CRLF line ending), \n (Linux LF line ending) or \r (old Mac CR line ending)
    - REPLACE WITH = (leave empty)
    - REPLACE ALL
- Convert the GS characters into newlines
    - FIND WHAT =
      \x{001D}
    - REPLACE WITH =
      \r\n
    - REPLACE ALL
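If it helps to see the marker trick spelled out as code, the steps above can be sketched in JavaScript roughly like this. This is only an illustration under assumptions: joinEveryNLines is a hypothetical name, the input is assumed not to contain the GS character already, and plain string operations stand in for the Notepad++ regexes.

```javascript
// Sketch only: mirrors the marker-based steps with plain string operations.
const GS = "\u001D"; // group separator, assumed absent from the input

function joinEveryNLines(text, n) {
  const anyEol = /\r\n|\r|\n/g; // stand-in for \R (CRLF, CR or LF)

  // Marker step: put a GS after the line ending of every n-th line.
  let count = 0;
  const marked = text.replace(anyEol, (eol) =>
    ++count % n === 0 ? eol + GS : eol
  );

  // Newline step: remove every line ending, leaving one long line with markers.
  const flattened = marked.replace(anyEol, "");

  // Final step: turn each marker into a CRLF line ending.
  return flattened.split(GS).join("\r\n");
}

console.log(joinEveryNLines("1\n2\n3\n4\n5\n6\n", 3));
```

A trailing group with fewer than n lines gets no marker, so it ends up joined into one final line with no line ending after it.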
-
I forgot to add the links to the official NPP Searching / Regex docs and the forum’s Regular Expression FAQ.
-
@PeterJones ,
Hello Peter… Thank you for your response and the info… I am grateful for it… I believe it works flawlessly! I tried the simplified version first…
What was important was to merge about 3000 lines each, that was the point… The 100000 lines can be adapted on the fly.
But… In any case your solution seems to perform very well.
Tried it with 60000 lines… 3000 lines each…
Point 2: Was instant.
Point 3: I think it took about 10 seconds… not much more.
Point 4: I think it took about 5 seconds… not much more.
Have not tried yet with 100000 lines in one run. Only with 60000 lines.
Thank you very much!
-
A small script for the jN plugin that will do this.
It will also add a menu “scripts jN” -> “convertText”
https://github.com/sieukrem/jn-npp-plugin

var scriptsMenu_jN = Editor.addMenu("scripts jN");

function convertText() {
    //debugger;
    var vStep = 0;                  // position of the current line within its group
    var vText = '', vTextLine = '';
    var vLines = Editor.currentView.lines;
    for (var i = 0; i < vLines.count; i++, vStep++) {
        vTextLine = vLines.get(i).text;
        // strip any trailing line ending (\n, or \r\n in Windows files)
        vTextLine = vTextLine.replace(/[\r\n]+$/, '');
        if (vStep < 4) {
            // still inside the current group of 4 lines: append to the same line
            vText = vText + vTextLine;
        } else {
            // this line starts a new group; the loop's vStep++ then sets it to 1
            vText = vText + "\n" + vTextLine;
            vStep = 0;
        }
    }
    Editor.currentView.text = vText;
}

scriptsMenu_jN.addItem({
    text: "convertText",
    cmd: convertText
});