    regex help with reverse line

    • namx3249N
      namx3249
      last edited by

      hi everybody,
      I need to reverse lines in a document with over 5000 lines. Reverse Line Order (from Line Operations) works fine, but I have to select each block manually, which is not easy for me!
      I have blocks like this one:

      (blank empty line)
      aaaaaaa
      bbbbbb
      ccccccc
      (blank empty line)
      eeeeeeeee
      wwwwww
      ppppppp
      (blank empty line)
      

      I need to swap line 2 with line 4 in each block, like this:

      (blank empty line)
      ccccccc
      bbbbbb
      aaaaaaa
      (blank empty line)
      ppppppp
      wwwwww
      eeeeeeeee
      (blank empty line)
      

      now my current regex is:
      find what: (?-s)^(^.*\R)(.+\R)((?:.+\R)*?)(.+\R)(?=^)
      replace: \1\4\3\2
      but this regex swaps line 2 with line 3, not line 2 with line 4.
      What is wrong with my regex? Thanks for your attention.
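One way to see what happens is to replay the pattern outside Notepad++. The following is a minimal sketch in Python, under two assumptions: the sample uses LF-only line endings, and `\n` stands in for `\R`, which Python's `re` module does not support. The names `lazy` and `greedy` are only illustrative. The lazy middle group `((?:.+\n)*?)` is satisfied with zero lines because `(?=^)` is true after every line break, so the match stops after two text lines. A greedy middle group is shown for comparison; it has not been tested in Notepad++ itself.

```python
import re

# LF-only sample (an assumption; files in the thread use Windows CRLF endings).
text = "\naaaaaaa\nbbbbbb\nccccccc\n\neeeeeeeee\nwwwwww\nppppppp\n\n"

# Original pattern with \R written as \n; dot does not match newline by default.
lazy = r"^(^.*\n)(.+\n)((?:.+\n)*?)(.+\n)(?=^)"
# Same pattern with a greedy middle group (illustrative variant).
greedy = r"^(^.*\n)(.+\n)((?:.+\n)*)(.+\n)(?=^)"

# Lazy: the middle group matches nothing, so only the first two text lines swap.
swapped_2_3 = re.sub(lazy, r"\1\4\3\2", text, flags=re.M)
# Greedy: the middle group takes every line except the last one of the block.
swapped_2_4 = re.sub(greedy, r"\1\4\3\2", text, flags=re.M)

print(swapped_2_3)
print(swapped_2_4)
```

With blocks longer than three lines, the greedy variant would swap only the first and last text lines of each block rather than reverse the whole block.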

      • Alan KilbornA
        Alan Kilborn @namx3249
        last edited by

        @namx3249 said in regex help with reverse line:

        (blank empty line)
        aaaaaaa
        bbbbbb
        ccccccc
        (blank empty line)
        eeeeeeeee
        wwwwww
        ppppppp
        (blank empty line)

        I see no reason that this won’t work on your data:

        Find: ^\(blank empty line\)\R\K(.+\R)(.+\R)(.+\R)
        Replace: ${3}${2}${1}
        Search mode: Regular expression

        • namx3249N
          namx3249
          last edited by

          Thanks for your reply. My mistake, I expressed myself badly: by (blank empty line) I meant an actual empty line, like this:

          
          aaaaaaa
          bbbbbb
          ccccccc
          
          eeeeeeeee
          wwwwww
          ppppppp
          
          

          sorry for that

          • Mark OlsonM
            Mark Olson @namx3249
            last edited by

            @namx3249
            find/replace (.+\R)(.+\R)(.+)(\R)? with \3(?4\4:\r\n)\2\1 ought to work fine.

            Note that the much simpler find/replace (.+\R)(.+\R)(.+\R) with \3\2\1 will also work fine, but it will miss the edge case where the EOF comes after the third line in the group.
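The difference between the two variants can be sketched in Python. This is only an approximation of the Notepad++ behaviour, under stated assumptions: LF-only line endings, `\n` in place of `\R`, and a small callback in place of the conditional replacement `(?4\4:\r\n)`, which Python's `re` does not have. The names `robust`, `simple` and `reverse_three` are made up for this illustration.

```python
import re

# LF-only sample where the file ends right after the third line of the last block.
text = "\naaaaaaa\nbbbbbb\nccccccc\n\neeeeeeeee\nwwwwww\nppppppp"

robust = r"(.+\n)(.+\n)(.+)(\n)?"
simple = r"(.+\n)(.+\n)(.+\n)"

def reverse_three(m: re.Match) -> str:
    # Reuse the captured line break if there was one, otherwise supply one
    # (the forum regex supplies \r\n; \n is used here for the LF-only sample).
    eol = m.group(4) if m.group(4) is not None else "\n"
    return m.group(3) + eol + m.group(2) + m.group(1)

with_eof_case = re.sub(robust, reverse_three, text)
without_eof_case = re.sub(simple, r"\3\2\1", text)

print(repr(with_eof_case))
print(repr(without_eof_case))
```

In this sketch the simpler pattern leaves the last block untouched, because its third line has no trailing line break to match.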

            • namx3249N
              namx3249
              last edited by

              Oh great, both regexes work fine on my document.
              The only thing I don’t understand about the second, simpler regex is:

              … it will miss the edge case where the EOF comes after the third line in the group

              What is EOF? Sorry for my ignorance!

              • Mark OlsonM
                Mark Olson @namx3249
                last edited by

                @namx3249
                EOF = end of file

                • namx3249N
                  namx3249
                  last edited by namx3249

                  OK, understood.

                  Another small question: what if I want to join the lines of each block into a single line, like this:

                  aaaaaaa bbbbbb ccccccc
                  
                  eeeeeeeee wwwwww ppppppp
                  

                  Which regex does this? Line Operations - Join Lines works well, but it requires selecting each block individually, which isn’t easy when the document has many blocks.

                  • PeterJonesP
                    PeterJones @namx3249
                    last edited by PeterJones

                    @namx3249 ,

                    Assuming that you want to collapse any newline that isn’t preceded or followed by a newline into a space, I would say,

                    • FIND = (?<!\r\n)(?!\A)(\r\n)(?!\r\n)
                    • REPLACE = \x20
                    • SEARCH MODE = regular expression

                    The search says “match any Windows newline that is not preceded by a newline, is not at the very beginning of the file, and is not followed by a newline”, and the replacement is a single space character. (I showed it as an escape because that can be copy/pasted easily from the forum; really, you could just type the space character in your replacement field.)

                    before

                    
                    aaaaaaa
                    bbbbbb
                    ccccccc
                    
                    eeeeeeeee
                    wwwwww
                    ppppppp
                    
                    

                    after

                    
                    aaaaaaa bbbbbb ccccccc
                    
                    eeeeeeeee wwwwww ppppppp 
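The same search can be tried in Python, whose `re` module supports the lookarounds used here. This is a sketch on the sample data only; the variable names `before` and `joined` are just for illustration, and the data is assumed to use CRLF line endings throughout.

```python
import re

# The "before" sample with Windows (CRLF) line endings.
before = (
    "\r\n"
    "aaaaaaa\r\nbbbbbb\r\nccccccc\r\n"
    "\r\n"
    "eeeeeeeee\r\nwwwwww\r\nppppppp\r\n"
    "\r\n"
)

# A CRLF that is not preceded by a CRLF, is not at the start of the file,
# and is not followed by a CRLF becomes a single space.
joined = re.sub(r"(?<!\r\n)(?!\A)(\r\n)(?!\r\n)", " ", before)

print(repr(joined))
```

Because the pattern never counts lines, it joins blocks of any length, as long as blocks are separated by an empty line.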
                    
                    
                    • namx3249N
                      namx3249
                      last edited by

                      wow, awesome. thank you for this regex

                      I thought it more easy … understood that the block is selected with (.+\R)(.+\R)(.+\R) i thought it was easy to put right value in replace field

                      Luckily for me there is this wonderful forum!
                      Thank you all for your attention and greetings to everybody

                      • PeterJonesP
                        PeterJones @namx3249
                        last edited by

                        @namx3249 said in regex help with reverse line:

                        wow, awesome. thank you for this regex

                        I thought it more easy … understood that the block is selected with (.+\R)(.+\R)(.+\R) i thought it was easy to put right value in replace field

                        There is more than one way to do it.

                        • FIND = (.+)\R(.+)\R(.+)$
                        • REPLACE = $1 $2 $3
                        • un-checkmark . matches newline

                        would do it, too. My previous solution made no assumption about number of lines in each block. This variant assumes exactly three in a block.
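For comparison, here is a rough Python equivalent of this fixed-size variant. It assumes LF-only line endings (with CRLF data, Python's `.` would also swallow the `\r`), writes `\R` as `\n`, and uses `re.M` so that `$` matches at the end of each line. The second sample, `four_lines`, is invented to show the three-line assumption.

```python
import re

text = "\naaaaaaa\nbbbbbb\nccccccc\n\neeeeeeeee\nwwwwww\nppppppp\n\n"

# Three captured lines per match, rejoined with single spaces.
pattern = r"(.+)\n(.+)\n(.+)$"
joined = re.sub(pattern, r"\1 \2 \3", text, flags=re.M)

# A four-line block is only partly joined, since each match covers three lines.
four_lines = "a\nb\nc\nd\n"
partly_joined = re.sub(pattern, r"\1 \2 \3", four_lines, flags=re.M)

print(repr(joined))
print(repr(partly_joined))
```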

                        • namx3249N
                          namx3249
                          last edited by

                          oh nice. thanks for this clarification !
                          i hope, step by step, to understand the amazing world of regex, very hard for me !
                          Regards

                          • sky 247S
                            sky 247
                            last edited by

                            The regular expression you provided is reversing the second and third lines because of the way it captures the text blocks. To reverse the second and fourth lines, you need to modify the regular expression to capture the second and fourth lines as separate groups. Here’s a modified version of the regular expression that should work:

                            Find what: (?-s)^(\h*\R|\R?\n)(.+)(\R)(.+)(\h*\R|\R?\n)

                            Replace with: \1\4\3\2\5

                            Explanation of the regular expression:

                            (?-s) - Disables dot-matches-all mode.

                            ^ - Matches the start of a line.

                            (\h*\R|\R?\n) - Matches a blank line and captures it in group 1. The first alternative, \h*\R, matches zero or more horizontal whitespace characters followed by a line break sequence (CR, LF, CRLF, or a Unicode line separator); the second alternative, \R?\n, matches an optional line break sequence followed by an LF.

                            (.+) - Matches the first non-blank line and captures it in group 2.

                            (\R) - Matches the line break sequence after the first non-blank line and captures it in group 3.

                            (.+) - Matches the second non-blank line and captures it in group 4.

                            (\h*\R|\R?\n) - Matches another blank line.

                            The replace pattern:

                            \1\4\3\2\5 - Replaces the match with the captured groups in the desired order. The first and fifth groups (the blank lines) remain in their original positions, and the second and fourth groups (the non-blank lines) are swapped by reversing their order.

                            I hope this helps! Let me know if you have any further questions.

                            • guy038G
                              guy038
                              last edited by guy038

                              Hello, @namx3249, @alan-kilborn, @mark-olson, @peterjones, @sky-247 and All,

                              I found out a general method to reverse the lines of sections, separated with a pure empty line :

                              • Whatever the number of lines of each section

                              • Whatever the number of sections


                              Let’s go :

                              We start with the following INPUT text :

                              
                              01
                              02
                              03
                              04
                              
                              aaaaa
                              bbbbb
                              ccccc
                              ddddd
                              eeeee
                              fffff
                              ggggg
                              hhhhh
                              iiiii
                              
                              05
                              06
                              
                              07
                              08
                              09
                              10
                              11
                              
                              01
                              02
                              03
                              
                              FIRST Line
                              Second line
                              Third line
                              Fourth line
                              Fifth line
                              LAST line
                              

                              Note the empty line at the very beginning of the data ! ( Important )


                              With this first regex S/R, we replace any EOL chars, not followed with other EOL chars, with a colon character

                              • SEARCH (?x) \R (?! ^\R )

                              • REPLACE :

                              We get this temporary text :

                              :01:02:03:04
                              :aaaaa:bbbbb:ccccc:ddddd:eeeee:fffff:ggggg:hhhhh:iiiii
                              :05:06
                              :07:08:09:10:11
                              :01:02:03
                              :FIRST Line:Second line:Third line:Fourth line:Fifth line:LAST line
                              

                              As you can see :

                              • Any section is rewritten in a single line

                              • Any previous line is simply preceded with a colon character

                              • Any line must end with text without a colon character


                              Now, with this second regex S/R, we separate each line in two parts :

                              • A first part between the first colon of the line and right before the last colon

                              • A second part from after the last colon till the end of current line

                              • In the replacement phase, we rewrite these two parts, in reverse order, with a leading slash

                              SEARCH (?x) ( : .+ ) : ( .+ )

                              REPLACE /\2\1

                              Click on the Replace All button as many times as the maximum number of lines in sections

                              Regarding our example, you should click nine times on the Replace All button !

                              You may also hit the Alt + A shortcut, repeatedly, till the message Replace All: 0 occurrence were replaced... occurs

                              And we get this temporary text below :

                              /04/03/02:01
                              /iiiii/hhhhh/ggggg/fffff/eeeee/ddddd/ccccc/bbbbb:aaaaa
                              /06:05
                              /11/10/09/08:07
                              /03/02:01
                              /LAST line/Fifth line/Fourth line/Third line/Second line:FIRST Line
                              

                              Finally, let’s come back to the normal display of your data, with this third regex S/R, which simply replaces the colon and slash characters with a line-break

                              SEARCH [:/]

                              REPLACE \r\n

                              And here is your expected OUTPUT text :

                              
                              04
                              03
                              02
                              01
                              
                              iiiii
                              hhhhh
                              ggggg
                              fffff
                              eeeee
                              ddddd
                              ccccc
                              bbbbb
                              aaaaa
                              
                              06
                              05
                              
                              11
                              10
                              09
                              08
                              07
                              
                              03
                              02
                              01
                              
                              LAST line
                              Fifth line
                              Fourth line
                              Third line
                              Second line
                              FIRST Line
                              

                              Notes :

                              • The trivial cases, of a single data section only or sections of one line only, are correctly handled, too !

                              • Any additional line-breaks, between sections, are preserved in your OUTPUT text

                              • Of course, you can use any char, instead of the colon and the slash characters :

                                • Provided that they cannot be found in your present INPUT data

                                • Provided that you modify the regexes, accordingly

                              • As said above, the second regex S/R needs N successive searches/replacements, where N is the number of lines of the longest section, in your data

                              • BTW, if you redo all the same process, you get the original order of each section !!
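The three search/replace steps above can be strung together in a short Python sketch. It is an illustration under stated assumptions rather than a replacement for the Notepad++ procedure: line endings are LF-only, `\R` is written as `\n`, the data contains neither `:` nor `/`, the text starts with an empty line and ends with one, and the function name `reverse_sections` is made up here. The repeated clicks on Replace All become a loop that stops when a pass replaces nothing.

```python
import re

def reverse_sections(text: str) -> str:
    """Reverse the lines of each section; sections are separated by empty lines."""
    # Step 1: every line break not followed by another line break becomes ':'.
    text = re.sub(r"\n(?!\n)", ":", text)

    # Step 2: move the last ':'-prefixed item to the front with a '/' marker,
    # repeating until a pass makes no replacement.
    count = 1
    while count:
        text, count = re.subn(r"(:.+):(.+)", r"/\2\1", text)

    # Step 3: turn both marker characters back into line breaks.
    return re.sub(r"[:/]", "\n", text)

sample = "\n01\n02\n03\n04\n\n05\n06\n\nFIRST\nSecond\nLAST\n\n"
result = reverse_sections(sample)
print(result)
```

Running the function twice on the same text gives back the original order, which matches the last note above.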

                              Best Regards,

                              guy038

                              The Community of users of the Notepad++ text editor.