    Insert a string after each number of words with conditions

    • abuali humaA
      abuali huma
      last edited by

      What I want to accomplish is a little bit complicated.
      I want to insert the string [NEWLINE] after every 44 characters, counting spaces. However, the string must not land in the middle of a word; in that case, it should move back to the space before that word.

      Example #one
      Straight forward

      Original text
      As your advisor I am qualified to assist you in all matters related to rule civilization. If you don’t require my services dismiss me to attend to other matters.

      Result
      As your advisor I am qualified to assist you[NEWLINE]in all matters related to rule civilization.[NEWLINE]If you don’t require my services dismiss me[NEWLINE]to attend to other matters.

      As your advisor I am qualified to assist you= 44 Characters
      in all matters related to rule civilization.= 44 Characters
      If you don’t require my services dismiss me= 44 Characters
      to attend to other matters.= 28 Characters

      Example #two

      Original text
      Greetings My Liege! As your personal advisor, I am qualified to assist you in all matters related to ruling our civilization. I am at your service.

      Wrong, Unwanted Result
      Greetings My Liege! As your personal advisor[NEWLINE], I am qualified to assist you in all matter[NEWLINE]s related to ruling our civilization. I am a[NEWLINE]t your service.

      Greetings My Liege! As your personal advisor= 44 Characters
      , I am qualified to assist you in all matter= 44 Characters
      s related to ruling our civilization. I am a= 44 Characters
      t your service.= 15 Characters

      Desired Result
      Greetings My Liege! As your personal[NEWLINE]advisor, I am qualified to assist you in all[NEWLINE]matters related to ruling our civilization. [NEWLINE] I am at your service.

      Greetings My Liege! As your personal= 36 Characters
      advisor, I am qualified to assist you in all= 44 Characters
      matters related to ruling our civilization. = 44 Characters
      I am at your service.= 22 Characters
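The desired result amounts to greedy word wrapping at 44 columns. As a minimal sketch outside Notepad++, assuming Python is available, the standard `textwrap` module gives the same break points for Example #two. Two caveats: `textwrap.wrap` drops the space at each break, so the pieces carry no leading or trailing spaces, and `insert_marker` is a hypothetical helper name used only for illustration.

```python
import textwrap

def insert_marker(text, width=44, marker="[NEWLINE]"):
    """Break text at word boundaries so that no piece exceeds `width`
    characters, then join the pieces with the literal marker."""
    # textwrap.wrap never splits a word that fits within `width`,
    # and it drops the whitespace found at each break point.
    return marker.join(textwrap.wrap(text, width=width))

example_two = (
    "Greetings My Liege! As your personal advisor, I am qualified to "
    "assist you in all matters related to ruling our civilization. "
    "I am at your service."
)
result = insert_marker(example_two)
print(result)
```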

      • guy038G
        guy038
        last edited by guy038

        Hello @abuali-huma,

        post UPDATED, on 04/17/2017, with the help of @Alan-Kilborn. See :

        https://notepad-plus-plus.org/community/topic/13686/insert-a-string-after-each-number-of-words-with-conditions/8

        I think that you could obtain what you want by using a search/replace ( S/R ) with regular expressions, but not exactly with the same template that you proposed !

        Before giving the search and replacement regexes, and explaining the slight differences between what this S/R produces and your desired results, I would like to recapitulate your two examples, as I did not find exactly the same results as you ;-))

        IMPORTANT :

        Throughout all this post, I simply replaced the leading and trailing spaces, of the final lines of text, by the Double Low Line Unicode character ‗ ( \x{2017} ). It’s more readable and rigorous, isn’t it ?

        # Original text 1
        
        As your advisor I am qualified to assist you in all matters related to rule civilization. If you don’t require my services dismiss me to attend to other matters.
        
        # Correct Result 1 :
        
        As your advisor I am qualified to assist you‗       45
        in all matters related to rule civilization.        44
        ‗If you don’t require my services dismiss me        44
        ‗to attend to other matters.                        28
        
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        
        # Original text 2
        
        Greetings My Liege! As your personal advisor, I am qualified to assist you in all matters related to ruling our civilization. I am at your service.
        
        # Unwanted Result 2 :
        
        Greetings My Liege! As your personal advisor        44
        , I am qualified to assist you in all matter        44
        s related to ruling our civilization. I am a        44
        t your service.                                     15
        
        # Wanted Result 2 :
        
        Greetings My Liege! As your personal‗               37
        advisor, I am qualified to assist you in all        44
        ‗matters related to ruling our civilization.        44
        ‗I am at your service.                              22
        

        To my mind, compared to your post, I think that :

        • The first block of the Correct Result 1 is 45 characters long ( not 44 ), due to the final space character

        • The first block of the Wanted Result 2 is 37 characters long ( not 36 ), for the same reason !


        Now, I think that the slicing you are looking for cannot be easily achieved with regular expressions, as the insertion of the End of Line character(s) inside the text would depend on the character right after the end of the next block of 44 characters ! I did try to find a suitable regex, without success, unfortunately :-((

        But I’ve got another solution, which works out the current block to match so that it is always followed by a NON-word character, within the 44-character limit !

        The searched regex is, simply : (?-s).{1,44}(?=\W).

        Applying this S/R, to your two examples, we obtain :

        # Original text 1
        
        As your advisor I am qualified to assist you in all matters related to rule civilization. If you don’t require my services dismiss me to attend to other matters.
        
        # Final text 1
        
        As your advisor I am qualified to assist you        44
        ‗in all matters related to rule civilization        44
        . If you don’t require my services dismiss          42
        ‗me to attend to other matters.                     31
        
        # Original text 2
        
        Greetings My Liege! As your personal advisor, I am qualified to assist you in all matters related to ruling our civilization. I am at your service.
        
        # Final text 2
        
        Greetings My Liege! As your personal advisor        44
        , I am qualified to assist you in all               37
        ‗matters related to ruling our civilization.        44
        ‗I am at your service.                              22
        

        What do you think about this kind of slicing ? I do hope it could be OK for you :-))


        So, to sum up :

        • First, go to Settings > Preferences… > Editing > Vertical Edge Settings

        • Check the Show vertical edge and the Line mode options

        • Type 44 in the Number of columns dialog and validate with Enter

        • Hit the close button, to close the Preferences dialog

        => A blue line, located after the 44th character, should appear ! ( BTW, I also assumed that your current font is a monospaced font such as, for instance, Courier New or Consolas ! )

        • Select, also, the menu option View > Show Symbol > Show White Space and TAB

        • Now, go back to the very beginning of your document ( Ctrl + Home )

        • Open the replace dialog ( Ctrl + H )

        SEARCH (?-s).{1,44}(?=\W)

        REPLACE $0\r\n

        • Select the Regular expression search mode

        • Click on the Replace All button… Et voilà !

        Notes :

        • The (?-s) part is a modifier that forces the regex engine to treat the dot meta-character as matching a single standard character only ( NOT the \r and \n characters )

        • Then the .{1,44} part tries to match the longest string containing between 1 and 44 characters, inclusive

        • With the additional condition that the character following that string must be a NON-word character, due to the positive look-ahead syntax (?=\W)

        • In replacement, we first rewrite the entire matched string $0, followed by the string \r\n, which inserts the two Windows End of Line characters

        • If you’re using Unix files, note that the replacement zone should be $0\n only
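The behaviour described in these notes can be checked outside Notepad++. The following is a minimal sketch that assumes Python's `re` module rather than Notepad++'s Boost engine, so two details differ: `(?-s)` is omitted because in Python the dot already excludes `\n` by default (it does match `\r`, though), and `|$` is added to the look-ahead so that a final block with nothing after it still matches. In Notepad++ the End of Line characters play that role. `slice_text` is a hypothetical helper name.

```python
import re

def slice_text(text, width=44, eol="\r\n"):
    """Append `eol` after each block of at most `width` characters that is
    followed by a non-word character or by the end of the text."""
    # The greedy quantifier takes as many characters as allowed, then
    # backtracks until the next character is a non-word character.
    pattern = re.compile(r".{1,%d}(?=\W|$)" % width)
    # A function is used as replacement so that `eol` is inserted verbatim.
    return pattern.sub(lambda m: m.group(0) + eol, text)

text_2 = (
    "Greetings My Liege! As your personal advisor, I am qualified to "
    "assist you in all matters related to ruling our civilization. "
    "I am at your service."
)
lines = slice_text(text_2).splitlines()
for line in lines:
    print(repr(line), len(line))
```

On the second example this gives four blocks of 44, 37, 44 and 22 characters, as in Final text 2.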


        Finally, from the original paragraph 5 of the license.txt file of N++ v7.3.3, below :

        5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it.
        

        We would obtain, after replacement :

        5. You are not required to accept this
        ‗License, since you have not signed it.
        ‗However, nothing else grants you permission
        ‗to modify or distribute the Program or its
        ‗derivative works. These actions are
        ‗prohibited by law if you do not accept this
        ‗License. Therefore, by modifying or
        ‗distributing the Program (or any work based
        ‗on the Program), you indicate your
        ‗acceptance of this License to do so, and
        ‗all its terms and conditions for copying,
        ‗distributing or modifying the Program or
        ‗works based on it.
        

        Just notice that, with that S/R, the space character ( usual word separator ) is always moved to the beginning of the next line. You may prefer the other text arrangement, below, where the non-word character is simply the last character of each final line, using the S/R

        SEARCH (?-s).{1,43}\W

        REPLACE $0\r\n

        5. You are not required to accept this‗
        License, since you have not signed it.‗
        However, nothing else grants you permission‗
        to modify or distribute the Program or its‗
        derivative works. These actions are‗
        prohibited by law if you do not accept this‗
        License. Therefore, by modifying or‗
        distributing the Program (or any work based‗
        on the Program), you indicate your‗
        acceptance of this License to do so, and‗
        all its terms and conditions for copying,‗
        distributing or modifying the Program or‗
        works based on it.
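As a rough check of this second arrangement outside Notepad++, here is a sketch that again assumes Python's `re` module. `(?-s)` is dropped (Python's dot excludes `\n` by default) and `$` is added as an alternative to `\W`, an assumption made so that the last block matches when no End of Line follows it. The sample is only the beginning of the license paragraph, and `slice_keep_separator` is a hypothetical name.

```python
import re

def slice_keep_separator(text, width=44, eol="\n"):
    """Cut `text` into blocks of at most `width` characters, each one
    ending with a non-word character (or at the end of the text)."""
    # width - 1 characters at most, plus the closing non-word character.
    pattern = re.compile(r".{1,%d}(?:\W|$)" % (width - 1))
    return pattern.sub(lambda m: m.group(0) + eol, text)

sample = (
    "5. You are not required to accept this License, since you have not "
    "signed it. However, nothing else grants you permission to modify or "
    "distribute the Program or its derivative works."
)
blocks = slice_keep_separator(sample).splitlines()
for block in blocks:
    print(repr(block))
```

Every block except the last keeps its separating space as its final character, which is what the ‗ marks show in the listing above.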
        

        See you later,

        Best Regards,

        guy038

        • hu maH
          hu ma @guy038
          last edited by

          @guy038 said:

          Hello @abuali-huma,

          I think that you could obtain what you want, by using a S/R, with regular expressions, but not exactly with the same template that you proposed !

          Before giving you the simple search and replacement regexes, needed, and the explanation of the slight differences with your seeked results, that produces this S/R, I would like to recapitulate your two examples, as I did not find, exactly, the same results as you ;-))

          IMPORTANT :

          Throughout all this post, I simply replaced the leading and trailing spaces, of the final lines of text, by the Double Low Line Unicode character ‗ ( \x{2017} ). It’s more readable and rigorous, isn’t it ?

          # Original text 1
          
          As your advisor I am qualified to assist you in all matters related to rule civilization. If you don’t require my services dismiss me to attend to other matters.
          
          # Correct Result 1 :
          
          As your advisor I am qualified to assist you‗       45
          in all matters related to rule civilization.        44
          ‗If you don’t require my services dismiss me        44
          ‗to attend to other matters.                        28
          
          ------------------------------------------------------------------------
          
          # Original text 2
          
          Greetings My Liege! As your personal advisor, I am qualified to assist you in all matters related to ruling our civilization. I am at your service.
          
          # Unwanted Result 2 :
          
          Greetings My Liege! As your personal advisor        44
          , I am qualified to assist you in all matter        44
          s related to ruling our civilization. I am a        44
          t your service.                                     15
          
          # Wanted Result 2 :
          
          Greetings My Liege! As your personal‗               37
          advisor, I am qualified to assist you in all        44
          ‗matters related to ruling our civilization.        44
          ‗I am at your service.                              22
          

          To my mind, compared to your post, I think that :

          • The first block of the Correct Result 1 is 45 characters long ( not 44 ), due to the final space character

          • The first block of the Wanted Result 2 is 37 characters long ( not 36 ), for the same reason !


          Now, I think that the behaviour, about the slicing that you seek for, cannot be easily achieved with regular expressions, as the insertion of the End of Line character(s), inside text, would depend on the character right after the end of the next block of 44 characters ! I did try to find a suitable regex, without success, unfortunately :-((

          But I’ve got another solution, which works out the current block to match so that it is always followed by a NON-word character, within the 44-character limit !

          The searched regex is, simply : (?-s).{16,44}(?=\W). You probably think : what the 16 number is for ?

          Well, from the link below, it happens that the longest non-coined and non-technical English word is the word antidisestablishmentarianism, which is 28 characters long !! ( and 16 is just equal to 44 - 28 )

          https://en.wikipedia.org/wiki/Longest_word_in_English

          So, if we consider the simple text, below, with that long word :

          1234567890123456 antidisestablishmentarianism 12345678901234
          
          • As the string 1234567890123456 antidisestablishmentarianism is 45 characters long ( so more than the limit of 44 chars), the regex engine backtracks, searching, successively, from beginning of that text, for a string of 43 characters long, followed by a non-word character, then for a string of 42 characters long, followed by a non-word character … and so on, till it detects a first match : the string 1234567890123456, of 16 characters long, which is followed with a space character

          • A second search gets all the remaining characters, as the string ‗antidisestablishmentarianism 12345678901234 is just 44 characters long

          • Now, let’s suppose you add the digit 5 at the end of the example text, as below :

          1234567890123456 antidisestablishmentarianism 123456789012345
          
          • This time, again, as the second resulting string would be 45 characters long, the regex engine backtracks and selects, only, the string ‗antidisestablishmentarianism ( 29 characters long ) as a second match. Finally, the third match is the 16 characters long string ‗123456789012345 !
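The backtracking trace above can be reproduced with a short sketch. It assumes Python's `re` module instead of Notepad++'s engine, and it adds `|$` to the look-ahead (an assumption, since in Notepad++ the End of Line characters follow the last block).

```python
import re

# .{16,44} is greedy: it grabs up to 44 characters, then gives characters
# back one at a time until a non-word character (or the end) follows.
pattern = re.compile(r".{16,44}(?=\W|$)")
word = "antidisestablishmentarianism"  # 28 letters

first = pattern.findall("1234567890123456 " + word + " 12345678901234")
second = pattern.findall("1234567890123456 " + word + " 123456789012345")

for match in first + second:
    print(repr(match), len(match))
```

The first text yields two matches ( 16 and 44 characters ) and the second yields three ( 16, 29 and 16 characters ).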

          Applying my S/R, to your two examples, we obtain :

          # Original text 1
          
          As your advisor I am qualified to assist you in all matters related to rule civilization. If you don’t require my services dismiss me to attend to other matters.
          
          # Final text 1
          
          As your advisor I am qualified to assist you        44
          ‗in all matters related to rule civilization        44
          . If you don’t require my services dismiss          42
          ‗me to attend to other matters.                     31
          
          # Original text 2
          
          Greetings My Liege! As your personal advisor, I am qualified to assist you in all matters related to ruling our civilization. I am at your service.
          
          # Final text 2
          
          Greetings My Liege! As your personal advisor        44
          , I am qualified to assist you in all               37
          ‗matters related to ruling our civilization.        44
          ‗I am at your service.                              22
          

          What do you think about this kind of slicing ? I do hope it could be OK for you :-))


          So, to sum up :

          • First, go to Settings > Preferences… > Editing > Vertical Edge Settings

          • Check the Show vertical edge and the Line mode options

          • Type 44 in the Number of columns dialog and validate with Enter

          • Hit the close button, to close the Preferences dialog

          => A blue line, located after the 44th character, should appear ! ( BTW, I also assumed that your current font is a monospaced font such as, for instance, Courier New or Consolas ! )

          • Select, also, the menu option View > Show Symbol > Show White Space and TAB

          • Now, go back to the very beginning of your document ( Ctrl + Home )

          • Open the replace dialog ( Ctrl + H )

          SEARCH (?-s).{16,44}(?=\W)

          REPLACE $0\r\n

          • Click on the Replace All button… Et voilà !

          Notes :

          • The (?-s) part is a modifier that forces the regex engine to treat the dot meta-character as matching a single standard character only ( NOT the \r and \n characters )

          • Then the .{16,44} part tries to match the longest string containing between 16 and 44 characters, inclusive

          • With the additional condition that the character following that string must be a NON-word character, due to the positive look-ahead syntax (?=\W)

          • In replacement, we first rewrite the entire matched string $0, followed by the string \r\n, which inserts the two Windows End of Line characters

          • If you’re using Unix files, note that the replacement zone should be $0\n only


          Finally, from the original paragraph 5 of the license.txt file of N++ v7.3.3, below :

          5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it.
          

          We would obtain, after replacement :

          5. You are not required to accept this
          ‗License, since you have not signed it.
          ‗However, nothing else grants you permission
          ‗to modify or distribute the Program or its
          ‗derivative works. These actions are
          ‗prohibited by law if you do not accept this
          ‗License. Therefore, by modifying or
          ‗distributing the Program (or any work based
          ‗on the Program), you indicate your
          ‗acceptance of this License to do so, and
          ‗all its terms and conditions for copying,
          ‗distributing or modifying the Program or
          ‗works based on it.
          

          Just notice that, with that S/R, the space character ( usual word separator ) is always moved to the beginning of the next line. You may prefer the other text arrangement, below, where the non-word character is simply the last character of each final line, using the regex .{16,43}\W :

          5. You are not required to accept this‗
          License, since you have not signed it.‗
          However, nothing else grants you permission‗
          to modify or distribute the Program or its‗
          derivative works. These actions are‗
          prohibited by law if you do not accept this‗
          License. Therefore, by modifying or‗
          distributing the Program (or any work based‗
          on the Program), you indicate your‗
          acceptance of this License to do so, and‗
          all its terms and conditions for copying,‗
          distributing or modifying the Program or‗
          works based on it.
          

          See you later,

          Best Regards,

          guy038

          That is how a genius answers! Wow, I’m impressed!! I didn’t realize that my goal could be achieved!

          I still haven’t tested it, but it is enough to know what is possible with your help.

          Thanks very much, I will be back with results.

          • abuali humaA
            abuali huma
            last edited by abuali huma

            It works like a charm.
            Just had to edit the regex to

            SEARCH (?-s).{16,44}(?=\W) – unchanged

            REPLACE $0[NEWLINE]

            So the example#2 output is
            Greetings My Liege! As your personal advisor[NEWLINE], I am qualified to assist you in all[NEWLINE] matters related to ruling our civilization.[NEWLINE] I am at your service.

            in order 44[NEWLINE]37[NEWLINE]44[NEWLINE]22

            • guy038G
              guy038
              last edited by

              Hi, @abuali-huma,

              Oh ! Sorry. I initially thought that you wrote the string [NEWLINE] as a notation for a line break ! But you did mean the literal string [NEWLINE] ;-))

              Cheers,

              guy038

              • hu maH
                hu ma
                last edited by hu ma

                Ummm…
                Back again, there is one more thing I need to do.
                Continuing with example#2 result which was

                Greetings My Liege! As your personal advisor[NEWLINE], I am qualified to assist you in all[NEWLINE] matters related to ruling our civilization.[NEWLINE] I am at your service.

                Seeked arrangement
                I am at your service. [NEWLINE] matters related to ruling our civilization.[NEWLINE], I am qualified to assist you in all[NEWLINE]Greetings My Liege! As your personal advisor

                To make this clearer, I want to capture the text before, between, and after each [NEWLINE] string and change the order from \1\2\3\4 to \4\3\2\1.
                I can achieve this by first replacing the string [NEWLINE] with, say, ✓✓✓, then capturing the pieces with this regex
                search: (.+[\x{0000}-\x{9faf}])✓✓✓(.+[\x{0000}-\x{9faf}])✓✓✓(.+[\x{0000}-\x{9faf}])✓✓✓(.+[\x{0000}-\x{9faf}])
                Replace : \4[NEWLINE]\3[NEWLINE]\2[NEWLINE]\1

                That works, but I know this method only works for up to 9 captured groups, and some of my lines exceed 9 groups.

                • Scott SumnerS
                  Scott Sumner @hu ma
                  last edited by

                  @hu-ma said:

                  but I know this method can only work up to 9 captured groups

                  If you use the \number syntax, your statement is true.
                  If you switch to the $number syntax, you can go higher than 9.

                  For example, if you search for this regex: (a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)(l)(m)(n)
                  And replace it with $10
                  On this text: abcdefghijklmnopqrstuvwxyz
                  You will obtain: jopqrstuvwxyz

                  Contrast with this example:
                  search for this regex: (a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)(l)(m)(n)
                  And replace it with \10
                  On this text: abcdefghijklmnopqrstuvwxyz
                  You will obtain: a0opqrstuvwxyz
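For readers who script this outside Notepad++, here is a small sketch of the same search in Python's `re` module. This is an assumption about a different engine, not a description of Notepad++: Python has no `$10` syntax and uses `\g<10>` instead, and, unlike the behaviour shown above, Python also reads `\10` in a replacement as group 10.

```python
import re

pattern = r"(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)(l)(m)(n)"
text = "abcdefghijklmnopqrstuvwxyz"

# \g<10> is the explicit way to name group 10 in a Python replacement.
explicit = re.sub(pattern, r"\g<10>", text)

# In Python, \10 is also read as group 10 (not group 1 followed by "0").
backslash = re.sub(pattern, r"\10", text)

print(explicit)
print(backslash)
```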

                  • Alan KilbornA
                    Alan Kilborn @guy038
                    last edited by

                    @guy038

                    I was playing around with this idea and I’m not sure I see the importance of introducing the complication of the “longest word in English” stuff. For example, if I experiment with a variant of the regex that ignores this, I still get nice results:

                    Find: (?-s)(.{1,43})\W
                    Replace: $1\r\n

                    …gives me nice text wrapping after the desired amount of columns.

                    • guy038G
                      guy038
                      last edited by guy038

                      Hi, @abuali-huma, @scott-sumner and @alan-kilborn,

                      Alan, looking again at my previous post, you’re absolutely right about it. I can’t understand why I thought that the length of words was so important ! I must have been excessively tired, two days ago ;-))

                      So, I’ve just completely updated my previous post, mentioning your contribution to that nicer regex. Thanks for that !


                      As for your own S/R, below :

                      SEARCH (?-s)(.{1,43})\W

                      REPLACE $1\r\n

                      It just differs from the last S/R of my previous post in that it does not take into account the final NON-word character, at position 44, in the replacement part !

                      Therefore, starting, again, from this part of the license.txt file, below :

                      5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it.
                      

                      It would give the same text, without the space character, at the end of all the lines generated :

                      5. You are not required to accept this
                      License, since you have not signed it.
                      However, nothing else grants you permission
                      to modify or distribute the Program or its
                      derivative works. These actions are
                      prohibited by law if you do not accept this
                      License. Therefore, by modifying or
                      distributing the Program (or any work based
                      on the Program), you indicate your
                      acceptance of this License to do so, and
                      all its terms and conditions for copying,
                      distributing or modifying the Program or
                      works based on it.
                      

                      Cheers,

                      guy038

                      • Alan KilbornA
                        Alan Kilborn @guy038
                        last edited by

                        @guy038

                        Yes, in playing around with your original regex, I didn’t worry about the resulting space at the end of the line, as I have my “save” shortcut mapped to “trim trailing spaces” + “save”. The ONLY way files should be saved (for me!).

                        • Alan KilbornA
                          Alan Kilborn @guy038
                          last edited by

                          @guy038

                          And it is great that you have Admin rights here and can edit old posts, but I’m neutral on this. I think that old posts should not be edited and clarifying posts should just be added on. It is difficult to follow sometimes when history is CHANGED rather than simply CORRECTED/CLARIFIED later. :-D

                          • guy038G
                            guy038
                            last edited by

                            Hi, Alan,

                            Yes, you’re right about it : I should have created a new post with the corrections, for a better history ! It’s just that my updated post was, still, quite long and I thought it would be more clear to, simply, change my initial post. But, I do understand your point of view !

                            Cheers,

                            guy038

                            • hu maH
                              hu ma @Scott Sumner
                              last edited by

                              @Scott-Sumner
                              Thanks for the info!

                              • guy038G
                                guy038
                                last edited by guy038

                                Hello, @hu-ma and All

                                To complete the @scott-sumner post, about the two syntaxes of the searched groups, in replacement :

                                • \N, with 0 < N < 10

                                • $N, with 0 <= N < 2,147,483,648

                                There is the other practical syntax, below :

                                • ${N}, with 0 <= N < 2,147,483,648

                                Indeed, let’s imagine the original text:

                                abcd
                                1234
                                WXYZ
                                

                                and the first S/R :

                                SEARCH ^.(..).

                                REPLACE $100|

                                You obtain the simple text :

                                |
                                |
                                |
                                

                                Why ?! Just because, in replacement, the regex engine is looking for the group $100, which, obviously, does not exist ! So, the regex engine rewrites a zero-length string, for the non-existent group 100, followed by the literal character | !

                                Now, compare, with the second S/R, below :

                                SEARCH ^.(..).

                                REPLACE ${1}00|

                                This time, you, correctly, get the text, below :

                                bc00|
                                2300|
                                XY00|
                                

                                => All the changed lines begin by the second and third characters of the original lines of text ( $1 ), and are, simply, followed by the string 00|
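The same ambiguity exists in other engines. As a sketch, assuming Python's `re` module ( where the delimited form is written `\g<1>` rather than `${1}`, and `re.MULTILINE` is needed for `^` to match at each line start ), the second S/R could look like this:

```python
import re

text = "abcd\n1234\nWXYZ"

# \g<1> delimits the group number, so the following "00|" stays literal.
result = re.sub(r"^.(..).", r"\g<1>00|", text, flags=re.MULTILINE)
print(result)
```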

                                Best Regards,

                                guy038

                                • abuali humaA
                                  abuali huma @guy038
                                  last edited by

                                  • Sorry for bumping an old thread, but my issue is related to this one.

                                  Cutting to the chase…
                                  Look at Result#2 with the desired arrangement

                                  -Example# 2
                                  Greetings My Liege! As your personal advisor [NEWLINE] , I am qualified to assist you in all[NEWLINE] matters related to ruling our civilization.[NEWLINE] I am at your service.
                                  
                                  --------
                                  +Sought arrangement
                                  I am at your service. [NEWLINE] matters related to ruling our civilization.[NEWLINE], I am qualified to assist you in all[NEWLINE]Greetings My Liege! As your personal advisor
                                  

                                  I asked before for a way to reverse the order of the groups between the [NEWLINE] strings… Now I’m asking for the same thing, but in a more automated way…

                                  Because not all lines have the same number of groups, I want every line that contains groups separated by [NEWLINE] to be rewritten in reverse order.

                                  -Example#3 Contains SIX groups
                                  One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five [NEWLINE] six
                                  
                                  -------
                                  +Sought arrangement
                                   six[NEWLINE] five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE]One
                                  

                                  This should work with the same regex or Python script, whatever the number of groups:

                                  -Example#4 Contains 4 groups 
                                  I want [NEWLINE] this [NEWLINE] to be  [NEWLINE] last
                                  
                                  ------
                                  +Sought arrangement
                                   last[NEWLINE] to be [NEWLINE] this [NEWLINE]I want 
                                  
                                  • guy038

                                    Hi, @abuali-huma,

                                    I found a general method, which uses three consecutive S/R. We’ll need two dummy characters, NOT used in the current file. I, personally, chose the # and @ characters, but any other may be used !

                                    • The first S/R :

                                      • Replaces any [NEWLINE] string, possibly preceded and/or followed by a space character, with the dummy character #

                                      • Adds, also, a # character at the end of any non-blank line

                                    • The second S/R is the main S/R, which rewrites the different parts, between the # character, in reverse order.

                                      • Note that this S/R has to be run repeatedly, until the message Replace All: 0 occurrences were replaced appears in the Replace dialog

                                      • The general idea of this S/R is to swap the first and last parts of the found text, adding a @ character at the end of each swapped part, so that the next run of this S/R skips the parts already moved ! Hence the decreasing number of occurrences found, down to zero :-))

                                    • The Third S/R :

                                      • Replaces the # character, possibly preceded by a @ character, inside text, with the string [NEWLINE], preceded and followed by a space character

                                      • Deletes the # character, possibly preceded by a @ character, when located at the end of the lines

                                    All these S/R will use the Regular expression search mode, the Wrap around option and the Replace All button, of the Replace dialog
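Since a Python script was mentioned as an acceptable alternative, here is a minimal sketch of the same result obtained differently: it does not use the three S/R at all, it just splits each line on [NEWLINE] and joins the parts in reverse order. The function names `reverse_segments` and `reverse_all` are made up for this illustration, and the code is plain Python, not tied to any Notepad++ scripting plugin.

```python
import re

MARK = "[NEWLINE]"  # separator string used throughout the thread


def reverse_segments(line: str) -> str:
    """Reverse the parts of one line that are separated by [NEWLINE]."""
    if MARK not in line:
        return line  # lines without the separator are left untouched
    # Drop at most one space on each side of the separator, as the first S/R does
    parts = re.split(r" ?\[NEWLINE\] ?", line)
    return f" {MARK} ".join(reversed(parts))


def reverse_all(text: str) -> str:
    """Apply reverse_segments to every line of a multi-line text."""
    return "\n".join(reverse_segments(line) for line in text.split("\n"))


sample = "One [NEWLINE] two [NEWLINE] three\nDummy text\nOne"
print(reverse_all(sample))
```

Splitting on an optional single space around the separator keeps any extra spaces inside the parts, which is also what the regex method does.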

                                    So, let’s start with the original text, below :

                                    One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five [NEWLINE] six [NEWLINE] seven [NEWLINE] eight [NEWLINE] nine [NEWLINE] ten [NEWLINE] eleven [NEWLINE] twelve
                                    One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five [NEWLINE] six [NEWLINE] seven [NEWLINE] eight [NEWLINE] nine [NEWLINE] ten [NEWLINE] eleven
                                    One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five [NEWLINE] six [NEWLINE] seven [NEWLINE] eight [NEWLINE] nine [NEWLINE] ten
                                    Other text NOT concerned
                                    by this Search Replacement
                                    One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five [NEWLINE] six [NEWLINE] seven [NEWLINE] eight [NEWLINE] nine
                                    One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five [NEWLINE] six [NEWLINE] seven [NEWLINE] eight
                                    One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five [NEWLINE] six [NEWLINE] seven
                                    Bla bla blah
                                    Bla bla blah
                                    Bla bla blah
                                    One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five [NEWLINE] six
                                    One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five
                                    One [NEWLINE] two [NEWLINE] three [NEWLINE] four
                                    Dummy text
                                    inserted, in between !
                                    One [NEWLINE] two [NEWLINE] three
                                    One [NEWLINE] two
                                    One
                                    I want [NEWLINE] this [NEWLINE] to be  [NEWLINE] last
                                    

                                    After running the following S/R , once :

                                    SEARCH (?-s)\x20?\[NEWLINE\]\x20?|(?<=.)$

                                    REPLACE #

                                    You should get the text, below :

                                    One#two#three#four#five#six#seven#eight#nine#ten#eleven#twelve#
                                    One#two#three#four#five#six#seven#eight#nine#ten#eleven#
                                    One#two#three#four#five#six#seven#eight#nine#ten#
                                    Other text NOT concerned#
                                    by this Search Replacement#
                                    One#two#three#four#five#six#seven#eight#nine#
                                    One#two#three#four#five#six#seven#eight#
                                    One#two#three#four#five#six#seven#
                                    Bla bla blah#
                                    Bla bla blah#
                                    Bla bla blah#
                                    One#two#three#four#five#six#
                                    One#two#three#four#five#
                                    One#two#three#four#
                                    Dummy text#
                                    inserted, in between !#
                                    One#two#three#
                                    One#two#
                                    One#
                                    I want#this#to be #last#
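To try this first step outside Notepad++, the same search can be sketched with Python's `re` module. This is an approximation, not the Boost engine that Notepad++ uses: the `(?-s)` prefix is left out and `re.MULTILINE` is added so that `$` matches at the end of every line. The variable names are mine.

```python
import re

text = "One [NEWLINE] two [NEWLINE] three\nDummy text\n\nOne"

# Left branch: the separator with at most one space on each side.
# Right branch: the empty position at the end of a non-empty line.
# (?-s) is dropped: in Python the dot does not match \n unless re.DOTALL is set.
step1 = re.compile(r"\x20?\[NEWLINE\]\x20?|(?<=.)$", re.MULTILINE)

marked = step1.sub("#", text)
print(marked)
```

The empty line in the middle stays empty, because the look-behind `(?<=.)` needs a real character before the end of the line.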
                                    

                                    After running the following S/R, SEVEN times, one after another :

                                    SEARCH (?-s)([^@#\r\n]+?)#(.+#)?([^@#\r\n]+)#

                                    REPLACE \3@#\2\1@#

                                    The modified text is, now :

                                    twelve@#eleven@#ten@#nine@#eight@#seven@#six@#five@#four@#three@#two@#One@#
                                    eleven@#ten@#nine@#eight@#seven@#six#five@#four@#three@#two@#One@#
                                    ten@#nine@#eight@#seven@#six@#five@#four@#three@#two@#One@#
                                    Other text NOT concerned#
                                    by this Search Replacement#
                                    nine@#eight@#seven@#six@#five#four@#three@#two@#One@#
                                    eight@#seven@#six@#five@#four@#three@#two@#One@#
                                    seven@#six@#five@#four#three@#two@#One@#
                                    Bla bla blah#
                                    Bla bla blah#
                                    Bla bla blah#
                                    six@#five@#four@#three@#two@#One@#
                                    five@#four@#three#two@#One@#
                                    four@#three@#two@#One@#
                                    Dummy text#
                                    inserted, in between !#
                                    three@#two#One@#
                                    two@#One@#
                                    One#
                                    last@#to be @#this@#I want@#
                                    

                                    Seven consecutive runs of that regex S/R are needed to get the sought text : six runs that change the text, plus a seventh one which reports 0 occurrences and confirms that the job is done :

                                    • Run 1 : 12 occurrences replaced
                                    • Run 2 : 10 occurrences replaced
                                    • Run 3 : 7 occurrences replaced
                                    • Run 4 : 5 occurrences replaced
                                    • Run 5 : 3 occurrences replaced
                                    • Run 6 : 1 occurrence replaced
                                    • Run 7 : 0 occurrences replaced

                                    Note : After each run, you may hit the Find Next button, before hitting the Replace All button, to see how the process works !

                                    The part [^@#\r\n], in the searched regex, represents any single character, different from @, #, \n and \r
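The "repeat until 0 occurrences" loop can be sketched in Python as follows, assuming Python 3.5 or later, where an unmatched optional group is substituted as an empty string. The helper name `reverse_marked` is hypothetical and the regex is the one above without `(?-s)`; this is only an emulation of the Replace All button, not how Notepad++ does it internally.

```python
import re

# First part, optional middle, last part, each terminated by #
swap = re.compile(r"([^@#\r\n]+?)#(.+#)?([^@#\r\n]+)#")


def reverse_marked(text: str):
    """Repeat the swap until a run replaces nothing.

    Returns the final text and the number of replacements made by each run.
    """
    counts = []
    while True:
        # \g<3>@#\g<2>\g<1>@# is the Python spelling of \3@#\2\1@#
        text, n = swap.subn(r"\g<3>@#\g<2>\g<1>@#", text)
        counts.append(n)
        if n == 0:
            return text, counts


marked = "One#two#three#four#five#\nDummy text#\nOne#two#"
result, counts = reverse_marked(marked)
print(result)
print(counts)
```

On this small sample, the five-part line needs two effective runs and the two-part line only one, so the counts decrease in the same way as in the list of runs above.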


                                    Then, after running the last S/R, once :

                                    SEARCH (?-s)(@?#)(?=.)|@?#

                                    REPLACE ?1\x20[NEWLINE]\x20

                                    We obtain our final text :

                                    twelve [NEWLINE] eleven [NEWLINE] ten [NEWLINE] nine [NEWLINE] eight [NEWLINE] seven [NEWLINE] six [NEWLINE] five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                                    eleven [NEWLINE] ten [NEWLINE] nine [NEWLINE] eight [NEWLINE] seven [NEWLINE] six [NEWLINE] five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                                    ten [NEWLINE] nine [NEWLINE] eight [NEWLINE] seven [NEWLINE] six [NEWLINE] five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                                    Other text NOT concerned
                                    by this Search Replacement
                                    nine [NEWLINE] eight [NEWLINE] seven [NEWLINE] six [NEWLINE] five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                                    eight [NEWLINE] seven [NEWLINE] six [NEWLINE] five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                                    seven [NEWLINE] six [NEWLINE] five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                                    Bla bla blah
                                    Bla bla blah
                                    Bla bla blah
                                    six [NEWLINE] five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                                    five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                                    four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                                    Dummy text
                                    inserted, in between !
                                    three [NEWLINE] two [NEWLINE] One
                                    two [NEWLINE] One
                                    One
                                    last [NEWLINE] to be  [NEWLINE] this [NEWLINE] I want
                                    

                                    The search part looks for the regex @?#, either inside a line ( case group 1 defined ) or at the end of a line ( case NO group 1 )

                                    The replacement part means that, IF group 1 exists, the matched text is replaced by the string [NEWLINE], surrounded by space characters, ELSE it is replaced by nothing, i.e. simply deleted
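Python's `re` has no `?1…` conditional in its replacement syntax, so a rough equivalent of this last step uses a replacement function that tests whether group 1 took part in the match. Again, this is only a sketch: the names `step3` and `restore` are mine, and `(?-s)` is dropped for the same reason as before.

```python
import re

# Branch 1: (@?#) followed by at least one more character on the line
# Branch 2: @?# at the very end of a line (group 1 stays unset)
step3 = re.compile(r"(@?#)(?=.)|@?#")


def restore(match):
    """Return the separator when group 1 matched, otherwise delete the match."""
    return " [NEWLINE] " if match.group(1) else ""


reversed_marked = "five@#four@#three#two@#One@#\nDummy text#\nOne#"
final = step3.sub(restore, reversed_marked)
print(final)
```

`match.group(1)` is `None` when only the second branch matched, which is how the function tells the two cases apart.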

                                    Et voilà !

                                    Best Regards,

                                    guy038

                                    • abuali huma

                                      Thanks very much!
                                      But just to be clear, in the first regex

                                      SEARCH (?-s)\x20?\[NEWLINE\]\x20?|(?<=.)$

                                      REPLACE #

                                      removing the \x20? parts gives this:
                                      SEARCH (?-s)\[NEWLINE\]|(?<=.)$

                                      Will that keep the space, if there is one, before and after the [NEWLINE] string as part of the first and last groups?

                                      • abuali huma

                                        I found out that removing the \x20 does what I described… Thanks again

                                        • abuali huma

                                          I modified the original search regex, as it catches some Unicode characters, which breaks the line in the middle of a word. So, in the modified regex, I replaced \W with \x20 (the space character)… so far, no word-breaking issues.
                                          Here is the modified one:
                                          (?-s).{1,44}(?=\x20)
