
    regex help with reverse line

    • namx3249N
      namx3249
      last edited by

      Thanks for your reply. Maybe it is my mistake and I expressed myself badly: by (blank empty line) I mean a blank line like this one:

      
      aaaaaaa
      bbbbbb
      ccccccc
      
      eeeeeeeee
      wwwwww
      ppppppp
      
      

      sorry for that

      • Mark OlsonM
        Mark Olson @namx3249
        last edited by

        @namx3249
        find/replace (.+\R)(.+\R)(.+)(\R)? with \3(?4\4:\r\n)\2\1 ought to work fine.

        Note that the much simpler find/replace (.+\R)(.+\R)(.+\R) with \3\2\1 will also work fine, but it will miss the edge case where the EOF comes after the third line in the group.
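For anyone who wants to try these two replacements outside Notepad++, here is a minimal Python sketch. It assumes LF-only line endings, because Python's re module has no \R, and it emulates the conditional replacement (?4\4:\r\n) with a replacement function, since re has no conditional substitution syntax. The function names are made up for illustration.

```python
import re

# Simple form: three non-empty lines, each ending in a line break.
SIMPLE = re.compile(r"(.+\n)(.+\n)(.+\n)")
# Edge-case form: the line break after the third line is optional (group 4).
WITH_EOF = re.compile(r"(.+\n)(.+\n)(.+)(\n)?")


def reverse_simple(text):
    return SIMPLE.sub(r"\3\2\1", text)


def reverse_with_eof(text):
    def swap(m):
        # Emulates \3(?4\4:\r\n)\2\1: reuse the captured break if there is
        # one, otherwise supply one so line 3 does not run into line 2.
        line_break = m.group(4) or "\n"
        return m.group(3) + line_break + m.group(2) + m.group(1)
    return WITH_EOF.sub(swap, text)


# The last block ends at EOF, with no line break after its third line.
text = "\naaaaaaa\nbbbbbb\nccccccc\n\neeeeeeeee\nwwwwww\nppppppp"
print(repr(reverse_simple(text)))    # last block is left untouched
print(repr(reverse_with_eof(text)))  # last block is reversed as well
```

With this input the simple form skips the last block, because its third line has no trailing line break, while the second form reverses it.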

        • namx3249N
          namx3249
          last edited by

          Oh great, both regexes work fine for my document.
          The only thing I don’t understand about the second, simpler regex is:

          … it will miss the edge case where the EOF comes after the third line in the group

          What is EOF? Sorry for my ignorance!

          • Mark OlsonM
            Mark Olson @namx3249
            last edited by

            @namx3249
            EOF = end of file

            • namx3249N
              namx3249
              last edited by namx3249

              OK, understood.

              Another small question: what if I want to put each block of lines onto a single line, like this:

              aaaaaaa bbbbbb ccccccc
              
              eeeeeeeee wwwwww ppppppp
              

              Which regex does this? Line Operations - Join Lines works well, but it requires selecting each block individually, which isn’t easy when the document has many blocks.

              • PeterJonesP
                PeterJones @namx3249
                last edited by PeterJones

                @namx3249 ,

                Assuming that you want to collapse any newline that isn’t preceded or followed by a newline into a space, I would say,

                • FIND = (?<!\r\n)(?!\A)(\r\n)(?!\r\n)
                • REPLACE = \x20
                • SEARCH MODE = regular expression

                The search says “replace any Windows newline not preceded by newline; not preceded by beginning of the file, and not followed by a newline” and the replacement is a single space character (I showed an escape, because it can be copy/pasted easily from the forum; really, you could just type the space character in your replacement field.)
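As a rough cross-check outside Notepad++, the same lookaround idea can be sketched in Python. This assumes LF-only line endings instead of \r\n, and join_block_lines is just an illustrative name.

```python
import re

# A line break that is not preceded by a line break, is not at the very
# start of the text, and is not followed by another line break.
SINGLE_BREAK = re.compile(r"(?<!\n)(?!\A)\n(?!\n)")


def join_block_lines(text):
    # Replace each such break with one space; blank lines stay as they are.
    return SINGLE_BREAK.sub(" ", text)


text = "\naaaaaaa\nbbbbbb\nccccccc\n\neeeeeeeee\nwwwwww\nppppppp\n\n"
print(repr(join_block_lines(text)))
```

Because nothing in the pattern counts lines, blocks of any length are joined.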

                before

                
                aaaaaaa
                bbbbbb
                ccccccc
                
                eeeeeeeee
                wwwwww
                ppppppp
                
                

                after

                
                aaaaaaa bbbbbb ccccccc
                
                eeeeeeeee wwwwww ppppppp 
                
                
                • namx3249N
                  namx3249
                  last edited by

                  wow, awesome. thank you for this regex

                  I thought it would be easier: having understood that the block is matched with (.+\R)(.+\R)(.+\R), I thought it would be easy to put the right value in the replace field.

                  Luckily for me there is this wonderful forum!
                  Thank you all for your attention and greetings to everybody

                  • PeterJonesP
                    PeterJones @namx3249
                    last edited by

                    @namx3249 said in regex help with reverse line:

                    wow, awesome. thank you for this regex

                    I thought it would be easier: having understood that the block is matched with (.+\R)(.+\R)(.+\R), I thought it would be easy to put the right value in the replace field.

                    There is more than one way to do it.

                    • FIND = (.+)\R(.+)\R(.+)$
                    • REPLACE = $1 $2 $3
                    • un-checkmark . matches newline

                    would do it, too. My previous solution made no assumption about number of lines in each block. This variant assumes exactly three in a block.
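To make the "exactly three" assumption concrete, here is a small Python sketch of this variant. It assumes LF-only line endings, uses re.MULTILINE so that $ matches at the end of each line, and join_three is an illustrative name.

```python
import re

# Three consecutive non-empty lines; "." does not match a line break by
# default, which corresponds to leaving ". matches newline" unchecked.
THREE_LINES = re.compile(r"(.+)\n(.+)\n(.+)$", re.MULTILINE)


def join_three(text):
    return THREE_LINES.sub(r"\1 \2 \3", text)


text = "\naaaaaaa\nbbbbbb\nccccccc\n\neeeeeeeee\nwwwwww\nppppppp\n"
print(repr(join_three(text)))

# A four-line block shows the limitation: only the first three lines are
# joined and the fourth is left on its own line.
print(repr(join_three("a\nb\nc\nd")))
```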

                    • namx3249N
                      namx3249
                      last edited by

                      Oh nice, thanks for this clarification!
                      I hope to understand, step by step, the amazing world of regex, which is very hard for me!
                      Regards

                      • sky 247S
                        sky 247
                        last edited by

                        The regular expression you provided is reversing the second and third lines because of the way it captures the text blocks. To reverse the second and fourth lines, you need to modify the regular expression to capture the second and fourth lines as separate groups. Here’s a modified version of the regular expression that should work:

                        Find what: (?-s)^(\h*\R|\R?\n)(.+)(\R)(.+)(\h*\R|\R?\n)

                        Replace with: \1\4\3\2\5

                        Explanation of the regular expression:

                        (?-s) - Disables dot-matches-all mode.

                        ^ - Matches the start of a line.

                        (\h*\R|\R?\n) - Matches a blank line and captures it in group 1. The first alternative, \h*\R, matches zero or more horizontal whitespace characters followed by a line break sequence (CR, LF, CRLF, or a Unicode line separator); the second alternative, \R?\n, matches an optional line break sequence followed by an LF.

                        (.+) - Matches the first non-blank line and captures it in group 2.

                        (\R) - Matches the line break sequence after the first non-blank line and captures it in group 3.

                        (.+) - Matches the second non-blank line and captures it in group 4.

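                        (\h*\R|\R?\n) - Matches the line break that ends the second non-blank line and captures it in group 5. Because the preceding (.+) already runs to the end of that line, this group does not consume a further blank line.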

                        The replace pattern:

                        \1\4\3\2\5 - Replaces the match with the captured groups in the desired order. The first and fifth groups (the leading blank line and the trailing line break) remain in their original positions, and the second and fourth groups (the non-blank lines) are swapped.
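As far as I can tell, the net effect of this pattern is to swap the first two lines that follow a blank line. A minimal Python sketch of that behaviour, assuming LF-only line endings and using [ \t]* in place of \h* (Python's re module has neither \R nor \h); swap_after_blank is an illustrative name:

```python
import re

# Group 1: a blank line, groups 2 and 4: the two lines to swap,
# group 3: the break between them, group 5: the break after the second.
PATTERN = re.compile(r"^([ \t]*\n)(.+)(\n)(.+)([ \t]*\n)", re.MULTILINE)


def swap_after_blank(text):
    return PATTERN.sub(r"\1\4\3\2\5", text)


text = "\naaa\nbbb\nccc\n\neee\nwww\nppp\n"
print(repr(swap_after_blank(text)))
```

Only the first two lines of each block move; a third line such as ccc stays where it was.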

                        I hope this helps! Let me know if you have any further questions.

                        • guy038G
                          guy038
                          last edited by guy038

                          Hello, @namx3249, @alan-kilborn, @mark-olson, @peterjones, @sky-247 and All,

                          I found a general method to reverse the lines of sections that are separated by a pure empty line:

                          • Whatever the number of lines of each section

                          • Whatever the number of sections


                          Let’s go :

                          We start with the following INPUT text :

                          
                          01
                          02
                          03
                          04
                          
                          aaaaa
                          bbbbb
                          ccccc
                          ddddd
                          eeeee
                          fffff
                          ggggg
                          hhhhh
                          iiiii
                          
                          05
                          06
                          
                          07
                          08
                          09
                          10
                          11
                          
                          01
                          02
                          03
                          
                          FIRST Line
                          Second line
                          Third line
                          Fourth line
                          Fifth line
                          LAST line
                          

                          Note the empty line at the very beginning of the data ! ( Important )


                          With this first regex S/R, we replace any EOL chars that are not followed by other EOL chars with a colon character

                          • SEARCH (?x) \R (?! ^\R )

                          • REPLACE :

                          We get this temporary text :

                          :01:02:03:04
                          :aaaaa:bbbbb:ccccc:ddddd:eeeee:fffff:ggggg:hhhhh:iiiii
                          :05:06
                          :07:08:09:10:11
                          :01:02:03
                          :FIRST Line:Second line:Third line:Fourth line:Fifth line:LAST line
                          

                          As you can see :

                          • Any section is rewritten in a single line

                          • Any previous line is simply preceded with a colon character

                          • Any line must end with text without a colon character
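The first step can be tried in Python with a small sketch like the one below. It assumes LF-only line endings (Python's re module has no \R) and data that does not end with a line break, since a final line break would be turned into a trailing colon. fold_sections is an illustrative name.

```python
import re


def fold_sections(text):
    # Replace every line break that is not followed by another line break
    # with a colon, so each section collapses onto one line.
    return re.sub(r"\n(?!\n)", ":", text)


# Note the empty first line, as in the INPUT text above.
text = "\n01\n02\n03\n04\n\n05\n06"
print(repr(fold_sections(text)))
```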


                          Now, with this second regex S/R, we separate each line in two parts :

                          • A first part between the first colon of the line and right before the last colon

                          • A second part from after the last colon till the end of current line

                          • In the replacement phase, we rewrite these two parts, in reverse order, with a leading slash

                          SEARCH (?x) ( : .+ ) : ( .+ )

                          REPLACE /\2\1

                          Click on the Replace All button as many times as the maximum number of lines in a section

                          Regarding our example, you should click nine times on the Replace All button !

                          You may also hit the Alt + A shortcut, repeatedly, till the message Replace All: 0 occurrence were replaced... occurs
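The repeated Replace All can be imitated in Python with a loop that stops once a pass changes nothing. This is only a sketch of the second step, assuming the colon-joined text produced by the first step; rotate_until_stable is an illustrative name.

```python
import re

# Group 1: from the first colon up to just before the last colon.
# Group 2: the text after the last colon.
ROTATE = re.compile(r"(:.+):(.+)")


def rotate_until_stable(text):
    passes = 0
    while True:
        # Each pass moves the last colon-separated field of every line to
        # the front of the remaining colon part, marked with a slash.
        text, count = ROTATE.subn(r"/\2\1", text)
        if count == 0:
            return text, passes
        passes += 1


folded = ":01:02:03:04\n:05:06"
result, passes = rotate_until_stable(folded)
print(repr(result), passes)
```

In this small example the four-line section needs three effective passes; the following pass reports zero replacements, which is the stopping signal.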

                          And we get this temporary text below :

                          /04/03/02:01
                          /iiiii/hhhhh/ggggg/fffff/eeeee/ddddd/ccccc/bbbbb:aaaaa
                          /06:05
                          /11/10/09/08:07
                          /03/02:01
                          /LAST line/Fifth line/Fourth line/Third line/Second line:FIRST Line
                          

                          Finally, let’s return to the normal display of your data, with this third regex S/R, which simply replaces the colon and slash characters with a line-break

                          SEARCH [:/]

                          REPLACE \r\n

                          And here is your expected OUTPUT text :

                          
                          04
                          03
                          02
                          01
                          
                          iiiii
                          hhhhh
                          ggggg
                          fffff
                          eeeee
                          ddddd
                          ccccc
                          bbbbb
                          aaaaa
                          
                          06
                          05
                          
                          11
                          10
                          09
                          08
                          07
                          
                          03
                          02
                          01
                          
                          LAST line
                          Fifth line
                          Fourth line
                          Third line
                          Second line
                          FIRST Line
                          

                          Notes :

                          • The trivial cases, of a single data section only or sections of one line only, are correctly handled, too !

                          • Any additional line-breaks, between sections, are preserved in your OUTPUT text

                          • Of course, you can use any char, instead of the colon and the slash characters :

                            • Provided that they cannot be found in your present INPUT data

                            • Provided that you modify the regexes, accordingly

                          • As said above, the second regex S/R needs N successive searches/replacements, where N is the number of lines of the longest section, in your data

                          • BTW, if you redo all the same process, you get the original order of each section !!
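For readers who prefer a script, the three S/R steps can be chained in Python roughly as follows. This is a sketch under a few assumptions: LF-only line endings, data that starts with an empty line and does not end with a line break, and the colon and slash as the two temporary characters. The function name and its sep and mark parameters are made up for illustration.

```python
import re


def reverse_sections(text, sep=":", mark="/"):
    # The temporary characters must not already occur in the data.
    if sep in text or mark in text:
        raise ValueError("temporary characters already occur in the data")
    # Step 1: join each section into one line of sep-prefixed fields.
    text = re.sub(r"\n(?!\n)", sep, text)
    # Step 2: move the last field to the front until nothing changes.
    rotate = re.compile("(" + re.escape(sep) + ".+)" + re.escape(sep) + "(.+)")
    count = 1
    while count:
        text, count = rotate.subn(
            lambda m: mark + m.group(2) + m.group(1), text)
    # Step 3: turn both temporary characters back into line breaks.
    return re.sub("[" + re.escape(sep) + re.escape(mark) + "]", "\n", text)


data = "\n01\n02\n03\n04\n\n05\n06\n\nFIRST Line\nSecond line\nLAST line"
reversed_once = reverse_sections(data)
print(reversed_once)
# Running the same process a second time restores the original order.
print(reverse_sections(reversed_once) == data)
```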

                          Best Regards,

                          guy038

                          The Community of users of the Notepad++ text editor.