    regex help with reverse line

    • Mark OlsonM
      Mark Olson @namx3249
      last edited by

      @namx3249
      find/replace (.+\R)(.+\R)(.+)(\R)? with \3(?4\4:\r\n)\2\1 ought to work fine.

      Note that the much simpler find/replace (.+\R)(.+\R)(.+\R) with \3\2\1 will also work fine, but it will miss the edge case where the EOF comes after the third line in the group.
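
For readers who want to try the same idea outside Notepad++, here is a minimal Python sketch of both replacements. It is an approximation, not the Notepad++ behaviour itself: Python's re module has no \R and no conditional replacement, so the hypothetical helpers EOL, LINE, reverse_three and reverse_three_simple spell those parts out by hand.

```python
import re

# Python's re has no \R, so spell out the common line endings (assumption:
# only CRLF, LF and CR occur in the text).
EOL = r"(?:\r\n|\n|\r)"
# Stand-in for ".+" that never swallows a CR or LF.
LINE = r"[^\r\n]+"

# Counterpart of (.+\R)(.+\R)(.+)(\R)?
FULL = re.compile(rf"({LINE}{EOL})({LINE}{EOL})({LINE})({EOL})?")
# Counterpart of (.+\R)(.+\R)(.+\R)
SIMPLE = re.compile(rf"({LINE}{EOL})({LINE}{EOL})({LINE}{EOL})")


def reverse_three(text):
    """Reverse each run of three lines, even when EOF follows the third."""
    def swap(m):
        # Mirrors \3(?4\4:\r\n)\2\1: reuse the captured EOL, else add CRLF.
        eol = m.group(4) if m.group(4) is not None else "\r\n"
        return m.group(3) + eol + m.group(2) + m.group(1)
    return FULL.sub(swap, text)


def reverse_three_simple(text):
    """Reverse each run of three lines that all end with a line break."""
    return SIMPLE.sub(r"\3\2\1", text)
```

With a last block that has no trailing line break, reverse_three still reverses it (and appends a CRLF), while reverse_three_simple leaves that block untouched, which is the edge case described above.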

      • namx3249N
        namx3249
        last edited by

        Oh great, both regexes work fine for my document.
        The only thing I don’t understand about the second, simpler regex is:

        … it will miss the edge case where the EOF comes after the third line in the group

        What is EOF? Sorry for my ignorance!

        • Mark OlsonM
          Mark Olson @namx3249
          last edited by

          @namx3249
          EOF = end of file

          • namx3249N
            namx3249
            last edited by namx3249

            OK, understood.

            Another small question: what if I want to put each block of lines on a single line, like this:

            aaaaaaa bbbbbb ccccccc
            
            eeeeeeeee wwwwww ppppppp
            

            Which regex does this? Line Operations - Join Lines works well, but it requires selecting each block individually, which isn’t easy if the document has many blocks.

            • PeterJonesP
              PeterJones @namx3249
              last edited by PeterJones

              @namx3249 ,

              Assuming that you want to collapse any newline that isn’t preceded or followed by a newline into a space, I would say,

              • FIND = (?<!\r\n)(?!\A)(\r\n)(?!\r\n)
              • REPLACE = \x20
              • SEARCH MODE = regular expression

              The search says “match any Windows newline that is not preceded by a newline, not at the beginning of the file, and not followed by a newline”, and the replacement is a single space character. (I showed an escape because it can be copy/pasted easily from the forum; really, you could just type the space character in your replacement field.)

              before

              
              aaaaaaa
              bbbbbb
              ccccccc
              
              eeeeeeeee
              wwwwww
              ppppppp
              
              

              after

              
              aaaaaaa bbbbbb ccccccc
              
              eeeeeeeee wwwwww ppppppp 
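
As a rough cross-check outside Notepad++, the same FIND expression can be run through Python's re module, which supports \A and these lookarounds. This is only a sketch: join_blocks is a hypothetical helper name, and the input is assumed to use Windows (CRLF) line endings, as the regex does.

```python
import re

# Same expression as the FIND field above: a CRLF that is not preceded by a
# CRLF, is not at the very start of the text, and is not followed by a CRLF.
SINGLE_NEWLINE = re.compile(r"(?<!\r\n)(?!\A)(\r\n)(?!\r\n)")


def join_blocks(text):
    """Collapse every isolated CRLF into a single space."""
    return SINGLE_NEWLINE.sub(" ", text)


if __name__ == "__main__":
    before = "aaaaaaa\r\nbbbbbb\r\nccccccc\r\n\r\neeeeeeeee\r\nwwwwww\r\nppppppp\r\n"
    print(repr(join_blocks(before)))
```

Note that a final CRLF directly before the end of the text also counts as "not followed by a newline", so it becomes a trailing space.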
              
              
              • namx3249N
                namx3249
                last edited by

                wow, awesome. thank you for this regex

                I thought it would be easier: since the block is matched with (.+\R)(.+\R)(.+\R), I assumed it would be easy to put the right value in the replace field.

                Luckily for me there is this wonderful forum!
                Thank you all for your attention and greetings to everybody

                • PeterJonesP
                  PeterJones @namx3249
                  last edited by

                  @namx3249 said in regex help with reverse line:

                  wow, awesome. thank you for this regex

                  I thought it would be easier: since the block is matched with (.+\R)(.+\R)(.+\R), I assumed it would be easy to put the right value in the replace field.

                  There is more than one way to do it.

                  • FIND = (.+)\R(.+)\R(.+)$
                  • REPLACE = $1 $2 $3
                  • un-checkmark . matches newline

                  would do it, too. My previous solution made no assumption about number of lines in each block. This variant assumes exactly three in a block.
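
To illustrate the "exactly three lines" assumption, here is a small Python sketch of this variant. It is an approximation: Python has no \R, so the hypothetical EOL and LINE fragments stand in for \R and .+, and the trailing $ is dropped because LINE already runs to the end of the line.

```python
import re

# Assumption: only CRLF, LF and CR occur as line endings.
EOL = r"(?:\r\n|\n|\r)"
# Stand-in for ".+" with ". matches newline" unchecked.
LINE = r"[^\r\n]+"

# Counterpart of (.+)\R(.+)\R(.+)$
THREE_LINES = re.compile(rf"({LINE}){EOL}({LINE}){EOL}({LINE})")


def join_three(text):
    """Join each run of exactly three consecutive lines with spaces."""
    return THREE_LINES.sub(r"\1 \2 \3", text)
```

On a block of four lines, join_three joins the first three and leaves the fourth on its own line, whereas the lookaround-based search joins the whole block.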

                  • namx3249N
                    namx3249
                    last edited by

                    Oh nice, thanks for this clarification!
                    I hope to understand the amazing world of regex step by step; it is very hard for me!
                    Regards

                    • sky 247S
                      sky 247
                      last edited by

                      The regular expression you provided is reversing the second and third lines because of the way it captures the text blocks. To reverse the second and fourth lines, you need to modify the regular expression to capture the second and fourth lines as separate groups. Here’s a modified version of the regular expression that should work:

                      Find what: (?-s)^(\h*\R|\R?\n)(.+)(\R)(.+)(\h*\R|\R?\n)

                      Replace with: \1\4\3\2\5

                      Explanation of the regular expression:

                      (?-s) - Disables dot-matches-all mode.

                      ^ - Matches the start of a line.

                      (\h*\R|\R?\n) - Intended to match a blank line; captured in group 1. The alternation splits into two branches: \h*\R matches zero or more horizontal whitespace characters followed by a line break (CR, LF, CRLF, and so on), and \R?\n matches an optional line break followed by an LF.

                      (.+) - Matches the first non-blank line and captures it in group 2.

                      (\R) - Matches the line break sequence after the first non-blank line and captures it in group 3.

                      (.+) - Matches the second non-blank line and captures it in group 4.

                      (\h*\R|\R?\n) - Matches another blank line.

                      The replace pattern:

                      \1\4\3\2\5 - Replaces the match with the captured groups in the desired order. The first and fifth groups (the blank lines) remain in their original positions, and the second and fourth groups (the non-blank lines) are swapped by reversing their order.

                      I hope this helps! Let me know if you have any further questions.

                      • guy038G
                        guy038
                        last edited by guy038

                        Hello, @namx3249, @alan-kilborn, @mark-olson, @peterjones, @sky-247 and All,

                        I found a general method to reverse the lines of sections that are separated by a truly empty line :

                        • Whatever the number of lines of each section

                        • Whatever the number of sections


                        Let’s go :

                        We start with the following INPUT text :

                        
                        01
                        02
                        03
                        04
                        
                        aaaaa
                        bbbbb
                        ccccc
                        ddddd
                        eeeee
                        fffff
                        ggggg
                        hhhhh
                        iiiii
                        
                        05
                        06
                        
                        07
                        08
                        09
                        10
                        11
                        
                        01
                        02
                        03
                        
                        FIRST Line
                        Second line
                        Third line
                        Fourth line
                        Fifth line
                        LAST line
                        

                        Note the empty line at the very beginning of the data ! ( Important )


                        With this first regex S/R, we replace any EOL chars that are not followed by other EOL chars with a colon character

                        • SEARCH (?x) \R (?! ^\R )

                        • REPLACE :

                        We get this temporary text :

                        :01:02:03:04
                        :aaaaa:bbbbb:ccccc:ddddd:eeeee:fffff:ggggg:hhhhh:iiiii
                        :05:06
                        :07:08:09:10:11
                        :01:02:03
                        :FIRST Line:Second line:Third line:Fourth line:Fifth line:LAST line
                        

                        As you can see :

                        • Any section is rewritten in a single line

                        • Any previous line is simply preceded with a colon character

                        • Any line must end with text without a colon character


                        Now, with this second regex S/R, we separate each line in two parts :

                        • A first part between the first colon of the line and right before the last colon

                        • A second part from after the last colon till the end of current line

                        • In the replacement phase, we rewrite these two parts, in reverse order, with a leading slash

                        SEARCH (?x) ( : .+ ) : ( .+ )

                        REPLACE /\2\1

                        Click on the Replace All button as many times as the maximum number of lines in sections

                        Regarding our example, you should click nine times on the Replace All button !

                        You may also hit the Alt + A shortcut, repeatedly, till the message Replace All: 0 occurrence were replaced... occurs

                        And we get this temporary text below :

                        /04/03/02:01
                        /iiiii/hhhhh/ggggg/fffff/eeeee/ddddd/ccccc/bbbbb:aaaaa
                        /06:05
                        /11/10/09/08:07
                        /03/02:01
                        /LAST line/Fifth line/Fourth line/Third line/Second line:FIRST Line
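
The repeated Replace All can be mimicked in Python with a loop that stops once a pass changes nothing. This is a sketch under assumptions: STEP2 and rotate_until_done are hypothetical names, and .+ is written as [^\r\n]+ so that a match never crosses a line break.

```python
import re

# Counterpart of (?x) ( : .+ ) : ( .+ ), kept within a single line.
STEP2 = re.compile(r"(:[^\r\n]+):([^\r\n]+)")


def rotate_until_done(text):
    """Apply the second S/R repeatedly, like clicking Replace All.

    Returns the final text and the number of passes that changed something.
    """
    passes = 0
    while True:
        # Each pass moves the last colon-separated item of every line to the
        # front of the still-unprocessed part, prefixed with a slash.
        text, count = STEP2.subn(r"/\2\1", text)
        if count == 0:
            return text, passes
        passes += 1
```

For a line built from four original lines, three passes change the text and a fourth finds nothing, which corresponds to the "0 occurrence" message.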
                        

                        Finally, let’s come back to the normal display of your data, with this third regex S/R, which simply replaces the colon and slash characters with a line-break

                        SEARCH [:/]

                        REPLACE \r\n

                        And here is your expected OUTPUT text :

                        
                        04
                        03
                        02
                        01
                        
                        iiiii
                        hhhhh
                        ggggg
                        fffff
                        eeeee
                        ddddd
                        ccccc
                        bbbbb
                        aaaaa
                        
                        06
                        05
                        
                        11
                        10
                        09
                        08
                        07
                        
                        03
                        02
                        01
                        
                        LAST line
                        Fifth line
                        Fourth line
                        Third line
                        Second line
                        FIRST Line
                        

                        Notes :

                        • The trivial cases, of a single data section only or sections of one line only, are correctly handled, too !

                        • Any additional line-breaks, between sections, are preserved in your OUTPUT text

                        • Of course, you can use any char, instead of the colon and the slash characters :

                          • Provided that they cannot be found in your present INPUT data

                          • Provided that you modify the regexes, accordingly

                        • As said above, the second regex S/R needs N successive searches/replacements, where N is the number of lines of the longest section, in your data

                        • BTW, if you redo all the same process, you get the original order of each section !!
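
For anyone who wants to test the three steps outside Notepad++, they can be chained in a short Python sketch. Several things here are assumptions rather than part of the method above: reverse_sections is a hypothetical name, the text is assumed to use plain \n line endings (so step 3 writes \n instead of \r\n), and step 1 additionally skips a line break at the very end of the text so that no trailing colon is produced.

```python
import re


def reverse_sections(text):
    """Reverse the lines of every section; sections are separated by empty lines.

    Assumes "\n" line endings, a leading empty line, and no ':' or '/'
    characters in the data, as required by the method described above.
    """
    # Step 1: line break not followed by another line break -> colon.
    # The extra \Z (an assumption) leaves a final line break untouched.
    text = re.sub(r"\n(?!\n|\Z)", ":", text)

    # Step 2: repeat until nothing changes, like repeated Replace All.
    while True:
        text, count = re.subn(r"(:.+):(.+)", r"/\2\1", text)
        if count == 0:
            break

    # Step 3: turn the colon and slash markers back into line breaks.
    return re.sub(r"[:/]", "\n", text)


if __name__ == "__main__":
    sample = "\n01\n02\n03\n04\n\n05\n06\n"
    print(reverse_sections(sample))
```

Running reverse_sections twice on the same text gives back the original order, matching the last note above.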

                        Best Regards,

                        guy038

                        The Community of users of the Notepad++ text editor.