regex help with reverse line

namx3249

Hi everybody,
I need to reverse the lines in each block of a document with over 5000 lines. Reverse Line Order (from Line Operations) works fine, but I have to select each block manually, which is not easy for me!
I have blocks like this one:

(blank empty line)
aaaaaaa
bbbbbb
ccccccc
(blank empty line)
eeeeeeeee
wwwwww
ppppppp
(blank empty line)

I need to swap line 2 with line 4, like this:

(blank empty line)
ccccccc
bbbbbb
aaaaaaa
(blank empty line)
ppppppp
wwwwww
eeeeeeeee
(blank empty line)

My current regex is:
Find what: (?-s)^(^.*\R)(.+\R)((?:.+\R)*?)(.+\R)(?=^)
Replace: \1\4\3\2
But this regex swaps line 2 with line 3, not line 2 with line 4.
What is wrong with my regex? Thanks for your attention.

Alan Kilborn

@namx3249 said in regex help with reverse line:

(blank empty line)
aaaaaaa
bbbbbb
ccccccc
(blank empty line)
eeeeeeeee
wwwwww
ppppppp
(blank empty line)

I see no reason that this won’t work on your data:

Find: ^\(blank empty line\)\R\K(.+\R)(.+\R)(.+\R)
Replace: ${3}${2}${1}
Search mode: Regular expression

namx3249

Thanks for your reply, but maybe it was my mistake and I expressed myself badly: by (blank empty line) I mean a truly blank line, like this:


aaaaaaa
bbbbbb
ccccccc

eeeeeeeee
wwwwww
ppppppp

sorry for that

Mark Olson

@namx3249
find/replace (.+\R)(.+\R)(.+)(\R)? with \3(?4\4:\r\n)\2\1 ought to work fine.

Note that the much simpler find/replace (.+\R)(.+\R)(.+\R) with \3\2\1 will also work fine, but it will miss the edge case where the EOF comes after the third line in the group.
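
For readers who want to try the same idea outside Notepad++, here is a minimal Python sketch of the three-line reversal. It is an illustration under assumptions, not part of the answer above: the function name reverse_triples is hypothetical, line endings are assumed to be plain \n (Python's re module has no \R), and the conditional replacement (?4\4:\r\n) is emulated with a callback because Python's re has no conditional replacement syntax.

```python
import re

# Three non-empty lines; the line break after the third one is optional,
# which covers the case where the file ends right after the group.
TRIPLE = re.compile(r"(.+\n)(.+\n)(.+)(\n)?")


def reverse_triples(text: str) -> str:
    """Reverse every run of three consecutive non-empty lines (hypothetical helper)."""

    def swap(match: re.Match) -> str:
        # Reuse the captured line break if there is one, otherwise supply one,
        # mirroring the (?4\4:\r\n) conditional in the Notepad++ replacement.
        eol = match.group(4) or "\n"
        return match.group(3) + eol + match.group(2) + match.group(1)

    return TRIPLE.sub(swap, text)


if __name__ == "__main__":
    sample = "\naaa\nbbb\nccc\n\neee\nwww\nppp"  # no line break after the last line
    print(repr(reverse_triples(sample)))
    # prints '\nccc\nbbb\naaa\n\nppp\nwww\neee\n'
```

In this sketch the second block has no line break after its third line, so the simpler three-group pattern would skip it, while the optional fourth group still matches.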

namx3249

Oh great, both regexes work fine for my document.
The only thing I don’t understand is this remark about the second, simpler regex:

… it will miss the edge case where the EOF comes after the third line in the group

What is EOF? Sorry for my ignorance!

Mark Olson

@namx3249
EOF = end of file

namx3249

OK, understood.

Another small question: if I want to put each block on a single line, like this:

aaaaaaa bbbbbb ccccccc

eeeeeeeee wwwwww ppppppp

Which regex does this? Line Operations - Join Lines works well, but it requires selecting each individual block, and that isn’t easy to do if the document has many blocks.

PeterJones

@namx3249 ,

Assuming that you want to collapse any newline that isn’t preceded or followed by a newline into a space, I would say,

FIND = (?<!\r\n)(?!\A)(\r\n)(?!\r\n)
REPLACE = \x20
SEARCH MODE = regular expression

The search says “replace any Windows newline that is not preceded by a newline, is not at the beginning of the file, and is not followed by a newline”, and the replacement is a single space character (I showed an escape because it can be copy/pasted easily from the forum; really, you could just type the space character in your replacement field).

before


aaaaaaa
bbbbbb
ccccccc

eeeeeeeee
wwwwww
ppppppp

after


aaaaaaa bbbbbb ccccccc

eeeeeeeee wwwwww ppppppp
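
The same lookaround idea can be sketched in Python for anyone who wants to experiment outside Notepad++. This is only an illustration under assumptions: the function name join_blocks is hypothetical, and the text is assumed to use plain \n line endings rather than the Windows \r\n targeted by the regex above.

```python
import re

# A newline that is not preceded by a newline, is not at the very start of
# the text, and is not followed by a newline: a line break inside a block.
INNER_NEWLINE = re.compile(r"(?<!\n)(?!\A)\n(?!\n)")


def join_blocks(text: str) -> str:
    """Collapse each block of consecutive lines into one line (hypothetical helper)."""
    return INNER_NEWLINE.sub(" ", text)


if __name__ == "__main__":
    sample = "\naaa\nbbb\nccc\n\neee\nwww\nppp"
    print(repr(join_blocks(sample)))
    # prints '\naaa bbb ccc\n\neee www ppp'
```

One thing to watch in this sketch: if the text ends with a single newline, that final newline also satisfies all three conditions and becomes a trailing space.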

namx3249

wow, awesome. thank you for this regex

I thought it more easy … understood that the block is selected with (.+\R)(.+\R)(.+\R) i thought it was easy to put right value in replace field

Luckily for me there is this wonderful forum!
Thank you all for your attention and greetings to everybody

PeterJones

@namx3249 said in regex help with reverse line:

wow, awesome. thank you for this regex

I thought it more easy … understood that the block is selected with (.+\R)(.+\R)(.+\R) i thought it was easy to put right value in replace field

There is more than one way to do it.

FIND = (.+)\R(.+)\R(.+)$
REPLACE = $1 $2 $3
un-checkmark . matches newline

would do it, too. My previous solution made no assumption about number of lines in each block. This variant assumes exactly three in a block.

namx3249

oh nice. thanks for this clarification !
i hope, step by step, to understand the amazing world of regex, very hard for me !
Regards

sky 247

The regular expression you provided is reversing the second and third lines because of the way it captures the text blocks. To reverse the second and fourth lines, you need to modify the regular expression to capture the second and fourth lines as separate groups. Here’s a modified version of the regular expression that should work:

Find what: (?-s)^(\h*\R|\R?\n)(.+)(\R)(.+)(\h*\R|\R?\n)

Replace with: \1\4\3\2\5

Explanation of the regular expression:

(?-s) - Disables dot-matches-all mode.

^ - Matches the start of a line.

(\h*\R|\R?\n) - Matches a blank line. The group has two alternatives: \h*\R matches zero or more horizontal whitespace characters followed by a line break sequence (CR, LF, CRLF, or a Unicode line separator), and \R?\n matches an optional line break sequence followed by an LF.

(.+) - Matches the first non-blank line and captures it in group 2.

(\R) - Matches the line break sequence after the first non-blank line and captures it in group 3.

(.+) - Matches the second non-blank line and captures it in group 4.

(\h*\R|\R?\n) - Matches the line break that ends the second non-blank line and captures it in group 5.

The replace pattern:

\1\4\3\2\5 - Replaces the match with the captured groups in the desired order. Groups 1 and 5 (the leading blank line and the trailing line break) remain in their original positions, and groups 2 and 4 (the two non-blank lines) are swapped.

I hope this helps! Let me know if you have any further questions.

guy038

Hello, @namx3249, @alan-kilborn, @mark-olson, @peterjones, @sky-247 and All,

I found a general method to reverse the lines of sections that are separated by a truly empty line :

Whatever the number of lines of each section
Whatever the number of sections

Let’s go :

We start with the following INPUT text :


01
02
03
04

aaaaa
bbbbb
ccccc
ddddd
eeeee
fffff
ggggg
hhhhh
iiiii

05
06

07
08
09
10
11

01
02
03

FIRST Line
Second line
Third line
Fourth line
Fifth line
LAST line

Note the empty line at the very beginning of the data ! ( Important )

With this first regex S/R, we replace any EOL chars that are not followed by another EOL with a colon character

SEARCH (?x) \R (?! ^\R )
REPLACE :

We get this temporary text :

:01:02:03:04
:aaaaa:bbbbb:ccccc:ddddd:eeeee:fffff:ggggg:hhhhh:iiiii
:05:06
:07:08:09:10:11
:01:02:03
:FIRST Line:Second line:Third line:Fourth line:Fifth line:LAST line

As you can see :

Any section is rewritten in a single line
Any previous line is simply preceded with a colon character
Any line must end with text without a colon character

Now, with this second regex S/R, we separate each line in two parts :

A first part between the first colon of the line and right before the last colon
A second part from after the last colon till the end of current line
In the replacement phase, we rewrite these two parts, in reverse order, with a leading slash

SEARCH (?x) ( : .+ ) : ( .+ )

REPLACE /\2\1

Click on the Replace All button as many times as the maximum number of lines in any section

Regarding our example, you should click nine times on the Replace All button !

You may also hit the Alt + A shortcut, repeatedly, till the message Replace All: 0 occurrence were replaced... occurs

And we get this temporary text below :

/04/03/02:01
/iiiii/hhhhh/ggggg/fffff/eeeee/ddddd/ccccc/bbbbb:aaaaa
/06:05
/11/10/09/08:07
/03/02:01
/LAST line/Fifth line/Fourth line/Third line/Second line:FIRST Line

Finally, let’s come back to the normal display of your data, with this third regex S/R, which simply replaces the colon and slash characters with a line-break

SEARCH [:/]

REPLACE \r\n

And here is your expected OUTPUT text :


04
03
02
01

iiiii
hhhhh
ggggg
fffff
eeeee
ddddd
ccccc
bbbbb
aaaaa

06
05

11
10
09
08
07

03
02
01

LAST line
Fifth line
Fourth line
Third line
Second line
FIRST Line

Notes :

The trivial cases, of a single data section only or sections of one line only, are correctly handled, too !
Any additional line-breaks, between sections, are preserved in your OUTPUT text
Of course, you can use any char, instead of the colon and the slash characters :
- Provided that they cannot be found in your present INPUT data
- Provided that you modify the regexes, accordingly
As said above, the second regex S/R needs N successive searches/replacements, where N is the number of lines of the longest section, in your data
BTW, if you redo all the same process, you get the original order of each section !!
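
The three steps above can be sketched in Python, with a loop standing in for the repeated clicks on Replace All. This is a rough illustration under assumptions rather than a definitive implementation: the function name reverse_sections is hypothetical, line endings are assumed to be plain \n (Python's re module has no \R), the text is assumed to start with an empty line and not to end with a line break, and the data must not contain the colon or slash characters.

```python
import re


def reverse_sections(text: str) -> str:
    """Reverse the lines of each section separated by an empty line (hypothetical helper)."""
    # Step 1: replace every line break that is not followed by another
    # line break with a colon, so each section becomes one line.
    text = re.sub(r"\n(?!\n)", ":", text)

    # Step 2: move the last colon-delimited item to the front, marked with
    # a slash, and repeat until nothing is replaced any more.
    pair = re.compile(r"(:.+):(.+)")
    while True:
        text, count = pair.subn(r"/\2\1", text)
        if count == 0:
            break

    # Step 3: turn both marker characters back into line breaks.
    return re.sub(r"[:/]", "\n", text)


if __name__ == "__main__":
    sample = "\n01\n02\n03\n\na\nb"
    print(repr(reverse_sections(sample)))
    # prints '\n03\n02\n01\n\nb\na'
```

As with the manual procedure, running the function a second time on its own output gives back the original order.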

Best Regards,

guy038