    Insert a string after each number of words with conditions

    • guy038

      Hi, @abuali-huma,

      Oh ! Sorry. I, initially, thought that you wrote the string [NEW LINE], as a notation for a Line Break ! But, you did mean the literal string [NEW LINE] ;-))

      Cheers,

      guy038

      • hu ma

        Ummm…
        Back again, there is one more thing I need to do.
        Continuing with the result of example #2, which was:

        Greetings My Liege! As your personal advisor[NEWLINE], I am qualified to assist you in all[NEWLINE] matters related to ruling our civilization.[NEWLINE] I am at your service.

        Desired arrangement:
        I am at your service. [NEWLINE] matters related to ruling our civilization.[NEWLINE], I am qualified to assist you in all[NEWLINE]Greetings My Liege! As your personal advisor

        To explain: I want to capture the text before, between, and after each [NEWLINE] string and change the order from \1\2\3\4 to \4\3\2\1.
        I can achieve this by first replacing the string [NEWLINE] with, say, ✓✓✓, then capturing the parts with this regex:
        Search: (.+[\x{0000}-\x{9faf}])✓✓✓(.+[\x{0000}-\x{9faf}])✓✓✓(.+[\x{0000}-\x{9faf}])✓✓✓(.+[\x{0000}-\x{9faf}])
        Replace: \4[NEWLINE]\3[NEWLINE]\2[NEWLINE]\1

        That works, but I know this method only handles up to 9 captured groups, and some of my lines exceed 9 groups.

        • Scott Sumner @hu ma

          @hu-ma said:

          but I know this method can only work up to 9 captured groups

          If you use the \number syntax, your statement is true.
          If you switch to the $number syntax, you can go higher than 9.

          For example, if you search for this regex: (a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)(l)(m)(n)
          And replace it with $10
          On this text: abcdefghijklmnopqrstuvwxyz
          You will obtain: jopqrstuvwxyz

          Contrast with this example:
          search for this regex: (a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)(l)(m)(n)
          And replace it with \10
          On this text: abcdefghijklmnopqrstuvwxyz
          You will obtain: a0opqrstuvwxyz
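
For comparison outside Notepad++, here is a minimal sketch of the same experiment with Python's `re` module. This is an assumption on my part: the thread itself only uses the Notepad++ Replace dialog, and Python spells an explicit group reference as `\g<N>` rather than `$N`. The variable names are made up for the illustration.

```python
import re

text = "abcdefghijklmnopqrstuvwxyz"
pattern = r"(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)(l)(m)(n)"

# \g<10> unambiguously means group 10, which captured "j".
group_ten = re.sub(pattern, r"\g<10>", text)

# \g<1>0 means group 1 ("a") followed by a literal "0".
group_one_then_zero = re.sub(pattern, r"\g<1>0", text)

print(group_ten)            # jopqrstuvwxyz
print(group_one_then_zero)  # a0opqrstuvwxyz
```

The two results mirror the `$10` and `\10` outcomes shown above for Notepad++.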

          • Alan Kilborn @guy038

            @guy038

            I was playing around with this idea and I’m not sure I see the importance of introducing the complication of the “longest word in English” stuff. For example, if I experiment with a variant of the regex that ignores this, I still get nice results:

            Find: (?-s)(.{1,43})\W
            Replace: $1\r\n

            …gives me nice text wrapping after the desired amount of columns.
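
If you want to try this wrapping idea outside Notepad++, the following is a rough Python sketch of the same Find/Replace. It assumes `\n` line endings instead of `\r\n`, and the function name `wrap` and the `width` parameter are hypothetical. Python's `.` does not match a line break by default, which plays the role of `(?-s)`.

```python
import re

def wrap(text, width=43):
    # Up to `width` characters (no line breaks), followed by one non-word
    # character. That non-word character is consumed and replaced by "\n".
    pattern = r"(.{1,%d})\W" % width
    return re.sub(pattern, r"\1\n", text)

# The sample ends with a line break, so the final "." is kept and the
# trailing "\n" is the non-word character that gets consumed.
sample = ("5. You are not required to accept this License, "
          "since you have not signed it.\n")
wrapped = wrap(sample)
print(wrapped)
# 5. You are not required to accept this
# License, since you have not signed it.
```

Note that the non-word character matched by `\W` is dropped from the output, so a text that ends on punctuation without a final line break would lose that last character.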

            • guy038

              Hi, @abuali-huma, @scott-sumner and @alan-kilborn,

              Alan, looking again at my previous post, you’re absolutely right about it. Can’t understand why I thought that the length of words was so important ! I must have been excessively tired, two days ago ;-))

              So, I’ve just completely updated my previous post, mentioning your contribution to that nicer regex. Thanks for that !


              As for your own S/R, below :

              SEARCH (?-s)(.{1,43})\W

              REPLACE $1\r\n

              It just differs from the last S/R of my previous post, in that it does not keep, in the replacement part, the final NON-word character found at position 44, at most !

              Therefore, starting, again, from this part of the license.txt file, below :

              5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it.
              

              It would give the same text, without the space character, at the end of all the lines generated :

              5. You are not required to accept this
              License, since you have not signed it.
              However, nothing else grants you permission
              to modify or distribute the Program or its
              derivative works. These actions are
              prohibited by law if you do not accept this
              License. Therefore, by modifying or
              distributing the Program (or any work based
              on the Program), you indicate your
              acceptance of this License to do so, and
              all its terms and conditions for copying,
              distributing or modifying the Program or
              works based on it.
              

              Cheers,

              guy038

              • Alan Kilborn @guy038

                @guy038

                Yes, in playing around with your original regex, I didn’t worry about the resulting space at the end of the line, as I have my “save” shortcut mapped to “trim trailing spaces” + “save”. The ONLY way files should be saved (for me!).

                • Alan Kilborn @guy038

                  @guy038

                  And it is great that you have Admin rights here and can edit old posts, but I’m neutral on this. I think that old posts should not be edited and clarifying posts should just be added on. It is difficult to follow sometimes when history is CHANGED rather than simply CORRECTED/CLARIFIED later. :-D

                  • guy038

                    Hi, Alan,

                    Yes, you’re right about it : I should have created a new post with the corrections, for a better history ! It’s just that my updated post was, still, quite long and I thought it would be more clear to, simply, change my initial post. But, I do understand your point of view !

                    Cheers,

                    guy038

                    • hu ma @Scott Sumner

                      @Scott-Sumner
                      Thanks for the info!

                      • guy038

                        Hello, @hu-ma and All

                        To complete the @scott-sumner post, about the two syntaxes of the searched groups, in replacement :

                        • \N, with 0 < N < 10

                        • $N, with 0 <= N < 2,147,483,648

                        There is the other practical syntax, below :

                        • ${N}, with 0 <= N < 2,147,483,648

                        Indeed, let’s imagine the original text:

                        abcd
                        1234
                        WXYZ
                        

                        and the first S/R :

                        SEARCH ^.(..).

                        REPLACE $100|

                        You obtain the simple text :

                        |
                        |
                        |
                        

                        Why ?! Just because, in replacement, the regex engine is looking for the group $100, which, obviously, does not exist ! So, the regex engine rewrites a zero-length string, for the non-existent group 100, followed by the literal character | !

                        Now, compare, with the second S/R, below :

                        SEARCH ^.(..).

                        REPLACE ${1}00|

                        This time, you, correctly, get the text, below :

                        bc00|
                        2300|
                        XY00|
                        

                        => All the changed lines begin by the second and third characters of the original lines of text ( $1 ), and are, simply, followed by the string 00|
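
As a side note, Python's `re` module offers a similar way to delimit the group number, written `\g<N>`. The sketch below is only an analogy, not Notepad++ behavior: it reproduces the second S/R, and it also shows one difference I am sure of, namely that Python raises an error for a group that does not exist instead of writing an empty string.

```python
import re

text = "abcd\n1234\nWXYZ"
pattern = r"^.(..)."

# \g<1>00| : group 1, then the literal characters "00|"
result = re.sub(pattern, r"\g<1>00|", text, flags=re.MULTILINE)
print(result)
# bc00|
# 2300|
# XY00|

# Referencing the non-existent group 100 is an error in Python.
try:
    re.sub(pattern, r"\g<100>|", text, flags=re.MULTILINE)
    missing_group_error = None
except re.error as exc:
    missing_group_error = exc

print(missing_group_error)
```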

                        Best Regards,

                        guy038

                        • abuali huma @guy038

                          Sorry for bumping an old thread, but my issue is related to this one.

                          Cutting to the chase…
                          Look at the result of Example #2, with the desired arrangement:

                          -Example# 2
                          Greetings My Liege! As your personal advisor [NEWLINE] , I am qualified to assist you in all[NEWLINE] matters related to ruling our civilization.[NEWLINE] I am at your service.
                          
                          --------
                          +Desired arrangement
                          I am at your service. [NEWLINE] matters related to ruling our civilization.[NEWLINE], I am qualified to assist you in all[NEWLINE]Greetings My Liege! As your personal advisor
                          

                          I asked before for a way to put the groups between the [NEWLINE] strings in reverse order… Now I’m asking for the same thing, but in a more automated way…

                          Because not all lines have the same number of groups, I want every line that contains groups separated by [NEWLINE] to be rearranged in reverse order.

                          -Example#3 Contains SIX groups
                          One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five [NEWLINE] six
                          
                          -------
                          +Desired arrangement
                           six[NEWLINE] five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE]One
                          

                          Using the same regex or Python script:

                          -Example#4 Contains 4 groups 
                          I want [NEWLINE] this [NEWLINE] to be  [NEWLINE] last
                          
                          ------
                          +Desired arrangement
                           last[NEWLINE] to be [NEWLINE] this [NEWLINE]I want 
                          
                          • guy038

                            Hi, @abuali-huma,

                            I found a general method, which uses three consecutive S/R. We’ll need two dummy characters, NOT used in the current file. I, personally, chose the # and @ characters, but any other may be used !

                            • The first S/R :

                              • Changes any string [NEWLINE], possibly preceded and/or followed with a space character, by the dummy character #

                              • Adds, also, a # character at the end of any non-blank line

                            • The second S/R is the main S/R, which rewrites the different parts, between the # character, in reverse order.

                              • Note that this S/R will have to be run repeatedly, until the message Replace All: 0 occurrences were replaced appears, in the Replace dialog

                              • The general idea, about this S/R, is to switch the beginning and ending parts of the found text, adding a @ character, at the end of the exchanged parts, in order that the next run of this S/R, will avoid these moved parts of text ! Hence, the decreasing number of occurrences found, till zero :-))

                            • The Third S/R :

                              • Changes the # character, possibly preceded by a @ character, inside text, by the string [NEWLINE], preceded and followed with a space character

                              • Deletes the # character, possibly preceded by a @ character, when located at the end of the lines

                            All these S/R will use the Regular expression search mode, the Wrap around option and the Replace All button, of the Replace dialog

                            So, let’s start with the original text, below :

                            One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five [NEWLINE] six [NEWLINE] seven [NEWLINE] eight [NEWLINE] nine [NEWLINE] ten [NEWLINE] eleven [NEWLINE] twelve
                            One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five [NEWLINE] six [NEWLINE] seven [NEWLINE] eight [NEWLINE] nine [NEWLINE] ten [NEWLINE] eleven
                            One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five [NEWLINE] six [NEWLINE] seven [NEWLINE] eight [NEWLINE] nine [NEWLINE] ten
                            Other text NOT concerned
                            by this Search Replacement
                            One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five [NEWLINE] six [NEWLINE] seven [NEWLINE] eight [NEWLINE] nine
                            One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five [NEWLINE] six [NEWLINE] seven [NEWLINE] eight
                            One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five [NEWLINE] six [NEWLINE] seven
                            Bla bla blah
                            Bla bla blah
                            Bla bla blah
                            One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five [NEWLINE] six
                            One [NEWLINE] two [NEWLINE] three [NEWLINE] four [NEWLINE] five
                            One [NEWLINE] two [NEWLINE] three [NEWLINE] four
                            Dummy text
                            inserted, in between !
                            One [NEWLINE] two [NEWLINE] three
                            One [NEWLINE] two
                            One
                            I want [NEWLINE] this [NEWLINE] to be  [NEWLINE] last
                            

                            After running the following S/R , once :

                            SEARCH (?-s)\x20?\[NEWLINE\]\x20?|(?<=.)$

                            REPLACE #

                            You should get the text, below :

                            One#two#three#four#five#six#seven#eight#nine#ten#eleven#twelve#
                            One#two#three#four#five#six#seven#eight#nine#ten#eleven#
                            One#two#three#four#five#six#seven#eight#nine#ten#
                            Other text NOT concerned#
                            by this Search Replacement#
                            One#two#three#four#five#six#seven#eight#nine#
                            One#two#three#four#five#six#seven#eight#
                            One#two#three#four#five#six#seven#
                            Bla bla blah#
                            Bla bla blah#
                            Bla bla blah#
                            One#two#three#four#five#six#
                            One#two#three#four#five#
                            One#two#three#four#
                            Dummy text#
                            inserted, in between !#
                            One#two#three#
                            One#two#
                            One#
                            I want#this#to be #last#
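
To experiment with this first S/R outside the Replace dialog, here is a hedged Python sketch. The function name `mark_separators` is hypothetical, and the `re.MULTILINE` flag is my assumption to make `$` match at the end of every line, as it does in Notepad++.

```python
import re

def mark_separators(text):
    # Either a [NEWLINE] string with one optional space on each side,
    # or the empty position at the end of a non-blank line.
    pattern = r"\x20?\[NEWLINE\]\x20?|(?<=.)$"
    return re.sub(pattern, "#", text, flags=re.MULTILINE)

sample = (
    "One [NEWLINE] two [NEWLINE] three\n"
    "One\n"
    "\n"
    "I want [NEWLINE] this [NEWLINE] to be  [NEWLINE] last\n"
)
marked = mark_separators(sample)
print(marked)
```

The blank line stays untouched because the look-behind `(?<=.)` needs a real character before the end of line.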
                            

                            After running the following S/R, SEVEN times, one after another :

                            SEARCH (?-s)([^@#\r\n]+?)#(.+#)?([^@#\r\n]+)#

                            REPLACE \3@#\2\1@#

                            The modified text is, now :

                            twelve@#eleven@#ten@#nine@#eight@#seven@#six@#five@#four@#three@#two@#One@#
                            eleven@#ten@#nine@#eight@#seven@#six#five@#four@#three@#two@#One@#
                            ten@#nine@#eight@#seven@#six@#five@#four@#three@#two@#One@#
                            Other text NOT concerned#
                            by this Search Replacement#
                            nine@#eight@#seven@#six@#five#four@#three@#two@#One@#
                            eight@#seven@#six@#five@#four@#three@#two@#One@#
                            seven@#six@#five@#four#three@#two@#One@#
                            Bla bla blah#
                            Bla bla blah#
                            Bla bla blah#
                            six@#five@#four@#three@#two@#One@#
                            five@#four@#three#two@#One@#
                            four@#three@#two@#One@#
                            Dummy text#
                            inserted, in between !#
                            three@#two#One@#
                            two@#One@#
                            One#
                            last@#to be @#this@#I want@#
                            

                            Seven consecutive runs of that regex S/R are required, to get the sought text :

                            • Run 1 : 12 occurrences replaced
                            • Run 2 : 10 occurrences replaced
                            • Run 3 : 7 occurrences replaced
                            • Run 4 : 5 occurrences replaced
                            • Run 5 : 3 occurrences replaced
                            • Run 6 : 1 occurrences replaced
                            • Run 7 : 0 occurrences replaced

                            Note : After each run, you may hit the Find Next button, before hitting the Replace All button, to guess the general process !

                            The part [^@#\r\n], in the searched regex, represents any single character, different from @, #, \n and \r
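
The "run it until 0 occurrences" loop can be sketched in Python as follows. This is an illustration under assumptions, not part of the original method: the name `reverse_parts` is made up, and it relies on Python (3.5 and later) writing an empty string for the unmatched optional group 2.

```python
import re

# Same search regex as above: first free part, optional middle, last free part.
SWAP = re.compile(r"([^@#\r\n]+?)#(.+#)?([^@#\r\n]+)#")

def reverse_parts(text):
    """Repeat the swap until nothing is replaced; return (text, runs)."""
    runs = 0
    while True:
        # Swap the outer parts and tag them with @ so later runs skip them.
        text, count = SWAP.subn(r"\3@#\2\1@#", text)
        runs += 1
        if count == 0:
            return text, runs

four_parts, four_runs = reverse_parts("One#two#three#four#")
three_parts, three_runs = reverse_parts("One#two#three#")
single, single_runs = reverse_parts("One#")

print(four_parts, four_runs)    # four@#three@#two@#One@# 3
print(three_parts, three_runs)  # three@#two#One@# 2
print(single, single_runs)      # One# 1
```

With an odd number of parts, the middle part never moves, so it keeps its plain # without a @, exactly as in the listing above.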


                            Then, after running the last S/R, once :

                            SEARCH (?-s)(@?#)(?=.)|@?#

                            REPLACE ?1\x20[NEWLINE]\x20

                            We obtain our final text :

                            twelve [NEWLINE] eleven [NEWLINE] ten [NEWLINE] nine [NEWLINE] eight [NEWLINE] seven [NEWLINE] six [NEWLINE] five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                            eleven [NEWLINE] ten [NEWLINE] nine [NEWLINE] eight [NEWLINE] seven [NEWLINE] six [NEWLINE] five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                            ten [NEWLINE] nine [NEWLINE] eight [NEWLINE] seven [NEWLINE] six [NEWLINE] five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                            Other text NOT concerned
                            by this Search Replacement
                            nine [NEWLINE] eight [NEWLINE] seven [NEWLINE] six [NEWLINE] five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                            eight [NEWLINE] seven [NEWLINE] six [NEWLINE] five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                            seven [NEWLINE] six [NEWLINE] five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                            Bla bla blah
                            Bla bla blah
                            Bla bla blah
                            six [NEWLINE] five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                            five [NEWLINE] four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                            four [NEWLINE] three [NEWLINE] two [NEWLINE] One
                            Dummy text
                            inserted, in between !
                            three [NEWLINE] two [NEWLINE] One
                            two [NEWLINE] One
                            One
                            last [NEWLINE] to be  [NEWLINE] this [NEWLINE] I want
                            

                            The search part looks for the regex @?#, either, inside the lines ( case group 1 defined ) or at end of lines ( case NO group 1 )

                            The replacement part means that, IF group 1 exists, the searched text is replaced by the string [NEWLINE], surrounded by space characters, ELSE NO replacement occurs
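
Python's `re` module has no conditional replacement syntax like `?1…`, so a sketch of this last S/R needs a small callback instead. The name `restore_separators` is hypothetical and this is only one possible way to write it.

```python
import re

def restore_separators(text):
    def replace(match):
        # Group 1 is set only when a character follows on the same line.
        return " [NEWLINE] " if match.group(1) else ""
    return re.sub(r"(@?#)(?=.)|@?#", replace, text)

restored = restore_separators("three@#two#One@#\nDummy text#")
print(restored)
# three [NEWLINE] two [NEWLINE] One
# Dummy text
```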

                            Et voilà !
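
Since a Python script was also mentioned as an option, here is a short alternative sketch that skips the dummy characters entirely: split each line on [NEWLINE], reverse the parts, and join them again. It is not the S/R method described above, just a different route to the same final text, and the names `reverse_line` and `reverse_text` are made up for the illustration.

```python
import re

# [NEWLINE] with one optional space on each side, as in the first S/R.
SEPARATOR = re.compile(r"\x20?\[NEWLINE\]\x20?")

def reverse_line(line):
    parts = SEPARATOR.split(line)
    return " [NEWLINE] ".join(reversed(parts))

def reverse_text(text):
    # Lines without any [NEWLINE] come back unchanged.
    return "\n".join(reverse_line(line) for line in text.split("\n"))

print(reverse_line("One [NEWLINE] two [NEWLINE] three"))
# three [NEWLINE] two [NEWLINE] One
print(reverse_line("I want [NEWLINE] this [NEWLINE] to be  [NEWLINE] last"))
# last [NEWLINE] to be  [NEWLINE] this [NEWLINE] I want
```

Whether this could be run from inside Notepad++ depends on having a scripting plugin installed, which is not covered here.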

                            Best Regards,

                            guy038

                            • abuali huma

                              Thanks very much!
                              But just to be clear, in the first regex

                              SEARCH (?-s)\x20?\[NEWLINE\]\x20?|(?<=.)$

                              REPLACE #

                              Removing the \x20? parts gives this:
                              SEARCH (?-s)\[NEWLINE\]|(?<=.)$

                              Will that keep the space, if there is one, before and after the [NEWLINE] string, in the first and last group?

                              • abuali huma

                                I found out that removing the \x20 does what I described… Thanks again

                                • abuali huma

                                  I modified the original Search regex, as it catches some Unicode characters, which breaks the line in the middle of a word. So in the modified regex I replaced \W with \x20 (the space character)… so far, no word-breaking issues.
                                  Here is the modified one:
                                  (?-s).{1,44}(?=\x20)

                                  The Community of users of the Notepad++ text editor.