    Regex: Put a dot after every 10 words

    10 Posts 3 Posters 966 Views
    • Neculai I. Fantanaru

      Hi, I need to put a dot after every 10 words. My regex is not very good:

      SEARCH: (\w+){0,5}
      REPLACE BY: \1.

      • Alan Kilborn @Neculai I. Fantanaru

        @Neculai-I-Fantanaru

        Now THAT looks like a regex just thrown out there, in order to get some help.
        Can you please explain how that regex is supposed to work?
        If you do that, perhaps you’ll see what’s wrong with it.

        • Neculai I. Fantanaru

          @Alan-Kilborn Good day, sir. My regex was not just thrown out there in order to get some help. It is an attempt at a solution, a step forward. I know it is not good, but I don’t know any other solution…

          • Neculai I. Fantanaru

            OK, I have another solution that should put a dot after every 5 words. But it seems to work only for the first line, not for all lines:

            SEARCH: (\w+){10}\K
            REPLACE BY: \1.

            • Alan Kilborn @Neculai I. Fantanaru

              @Neculai-I-Fantanaru

              Sorry, it’s just that your initial regex looked very suspicious, because you mentioned 10 but then nothing even close to 10 appeared in your regex.

              So \w does NOT mean to match a “word”, it means match a “word character”.

              For example, there are three word characters in the first word of this:

              abc defg hijkl
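
A minimal Python sketch of this distinction (not from the thread; Python's built-in `re` module is used here only as a convenient way to try it out, and Notepad++ uses a different regex engine). It also shows why `(\w+){10}` does not mean "ten words": with no separator between the repetitions, it can only match inside a single run of at least ten word characters.

```python
import re

sample = "abc defg hijkl"

# \w matches ONE word character, so findall returns one item per character.
chars = re.findall(r"\w", sample)

# \w+ matches a whole run of word characters, i.e. what we usually call a word.
words = re.findall(r"\w+", sample)

# (\w+){10} asks for ten runs back to back, with nothing between them,
# so it needs at least ten consecutive word characters to match.
no_match = re.search(r"(\w+){10}", sample)
long_run = re.search(r"(\w+){10}", "abcdefghij")

print(len(chars), words)
print(no_match, long_run.group(0))
```

To count words rather than characters, the pattern has to describe the separators between them as well.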

              • Neculai I. Fantanaru

                Oh, yes. I mentioned 5 words, sorry; it should be 10 words. But the regex should be the same (I will just change the number).

                • Neculai I. Fantanaru

                  can anyone help me? @guy038

                  • guy038

                    Hello, @neculai-i-fantanaru, @alan-kilborn and All,

                    I agree with @alan-kilborn’s comments and wanted to let you look for a solution by yourself! But, apparently, you’ve reached a dead end!

                    You said :

                    hi, I need to put a dot after every 10 words

                    But you haven’t shown us what kind of text is involved :-(

                    I suppose that your initial text does not contain any punctuation and is, mainly, a list of words separated by space characters?!


                    As an example, let’s take the first paragraph of the preamble of the license.txt file:

                    The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too.
                    

                    After removing all punctuation marks (except the apostrophe), we get:

                    The licenses for most software are designed to take away your freedom to share and change it By contrast the GNU General Public License is intended to guarantee your freedom to share and change free software to make sure the software is free for all its users This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it Some other Free Software Foundation software is covered by the GNU Library General Public License instead You can apply it to your programs too
                    

                    Note that the possessive form Free Software Foundation's software remains in this text. To my mind, Foundation's should be considered a single word, like other contracted English forms such as I'm, don't,… That is why the character class in the regex below includes both the straight and the curly apostrophe.

                    So, an appropriate regex S/R could be :

                    SEARCH (?:([\w'’]+)\W+){9}(?1)\K    or    (?:[\w'’]+\W+){9}[\w'’]+\K

                    REPLACE .

                    And, after a single click on the Replace All button, the text becomes:

                    The licenses for most software are designed to take away. your freedom to share and change it By contrast the. GNU General Public License is intended to guarantee your freedom. to share and change free software to make sure the. software is free for all its users This General Public. License applies to most of the Free Software Foundation's software. and to any other program whose authors commit to using. it Some other Free Software Foundation software is covered by. the GNU Library General Public License instead You can apply. it to your programs too
                    

                    However, note that the result is not meaningful English: a full stop is inserted mechanically after every 10 words, regardless of where the sentences actually end!
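
For anyone who wants to try the same idea outside Notepad++, here is a rough Python sketch. It assumes Python's built-in `re` module, which supports neither `\K` nor the `(?1)` subroutine call, so it uses the second form of the SEARCH pattern without `\K` and re-inserts the whole match in the replacement instead. The helper name `dot_every_ten_words` and the sample text are made up for illustration.

```python
import re

# Second SEARCH form from the post, without the trailing \K
# (Python's re module has no \K, so the match is re-inserted instead).
TEN_WORDS = re.compile(r"(?:[\w'’]+\W+){9}[\w'’]+")


def dot_every_ten_words(text: str) -> str:
    """Append a dot after each consecutive group of ten words (hypothetical helper)."""
    # \g<0> stands for the whole match, so the dot lands right after the tenth word.
    return TEN_WORDS.sub(r"\g<0>.", text)


# Made-up sample: 25 numbered "words", w1 to w25.
sample = " ".join(f"w{i}" for i in range(1, 26))
result = dot_every_ten_words(sample)
print(result)
```

With this sample, a dot is added after w10 and after w20, and the five trailing words are left untouched because they do not form a full group of ten.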

                    Best Regards,

                    guy038

                    • Neculai I. Fantanaru @guy038

                      @guy038 said in Regex: Put a dot after every 10 words:

                      (?:([\w'’]+)\W+){9}(?1)\K

                      thank you @guy038

                      • guy038

                        Hi, @neculai-i-fantanaru and All,

                        I forgot to explain why the first provided regex was (?:([\w'’]+)\W+){9}(?1)\K and not the regex (?:([\w'’]+)\W+){9}\1\K


                        Well, the \1 right before \K is a back-reference to group 1. Because the group is repeated, \1 holds its last capture, that is, the ninth word matched by [\w'’]+, so the tenth word would have to be identical to the ninth!

                        So, the regex (?:([\w'’]+)\W+){9}\1 would match strings like :

                        111 222 333 444 555 666 777 888 999 999
                        

                        or

                        000 111 222 333 444 555 666 777 888 888
                        

                        but NOT the string :

                        000 111 222 333 444 555 666 777 888 999
                        

                        By contrast, the (?1) syntax is a subroutine call to group 1: it re-runs the group's pattern rather than its captured text, so it is equivalent to writing [\w'’]+ again. So, the regex (?:([\w'’]+)\W+){9}(?1) also matches my third example, whatever the tenth word is ;-))
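
The difference between the two forms can be checked with a small Python sketch. One assumption to keep in mind: Python's built-in `re` module has back-references but no `(?1)` subroutine call, so the second pattern simply writes the word pattern out again, which is what the subroutine call amounts to. The variable names are illustrative only.

```python
import re

WORD = r"[\w'’]+"

# \1 must repeat the exact TEXT last captured by group 1 (the ninth word).
with_backref = re.compile(rf"(?:({WORD})\W+){{9}}\1")

# Stand-in for (?1): the word PATTERN is written out a second time.
with_subpattern = re.compile(rf"(?:({WORD})\W+){{9}}{WORD}")

repeated = "111 222 333 444 555 666 777 888 999 999"
distinct = "000 111 222 333 444 555 666 777 888 999"

for text in (repeated, distinct):
    # Prints whether each pattern finds a match in the text.
    print(bool(with_backref.search(text)), bool(with_subpattern.search(text)))
```

The back-reference version only matches when the tenth word repeats the ninth, while the version that repeats the pattern matches both strings.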

                        Best Regards

                        guy038
