Regex: Put a dot after every 10 words

10 Posts 3 Posters 970 Views
  • N
    Neculai I. Fantanaru
    last edited by Sep 25, 2021, 9:24 PM

    hi, I need to put a dot after every 10 words. My regex is not so good:

    SEARCH: (\w+){0,5}
    REPLACE BY: \1.

    • A
      Alan Kilborn @Neculai I. Fantanaru
      last edited by Sep 25, 2021, 11:05 PM

      @Neculai-I-Fantanaru

      Now THAT looks like a regex just thrown out there, in order to get some help.
      Can you please explain how that regex is supposed to work?
      If you do that, perhaps you’ll see what’s wrong with it.

      • N
        Neculai I. Fantanaru
        last edited by Sep 26, 2021, 6:40 AM

        @Alan-Kilborn good day, sir. My regex was not just thrown out there in order to get some help. It is an attempt at a solution, a step forward. I know it is not good, but I don’t know any other solution…

        • N
          Neculai I. Fantanaru
          last edited by Sep 26, 2021, 6:48 AM

          OK, I have another solution, which should put a dot after every 5 words. But it seems to work only for the first line, not for all lines:

          SEARCH: (\w+){10}\K
          REPLACE BY: \1.

          • A
            Alan Kilborn @Neculai I. Fantanaru
            last edited by Sep 26, 2021, 11:14 AM

            @Neculai-I-Fantanaru

            Sorry, it’s just that your initial regex looked very suspicious, because you mentioned 10 but then nothing even close to 10 appeared in your regex.

            So \w does NOT mean to match a “word”, it means match a “word character”.

            For example, there are three word characters in the first word of this:

            abc defg hijkl
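
A minimal sketch of this distinction, assuming Python's `re` module rather than the Boost engine used by Notepad++ (for plain ASCII text like this, `\w` behaves the same way in both). The variable names are illustrative only. It shows that `\w` matches one character, `\w+` matches a whole word, and `(\w+){3}` cannot cross the spaces between words:

```python
import re

sample = "abc defg hijkl"

# \w matches ONE word character, so findall returns single characters.
single_chars = re.findall(r"\w", sample)

# \w+ matches a run of word characters, i.e. what we would call a "word".
whole_words = re.findall(r"\w+", sample)

# (\w+){3} repeats the group back to back with no separator allowed,
# so it can only consume characters inside the first word.
repeated = re.match(r"(\w+){3}", sample)

print(single_chars[:3])   # ['a', 'b', 'c']
print(whole_words)        # ['abc', 'defg', 'hijkl']
print(repeated.group(0))  # abc
```

This is why a quantifier applied directly to `(\w+)` never counts words: something in the pattern has to consume the separators between them.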

            • N
              Neculai I. Fantanaru
              last edited by Sep 26, 2021, 11:23 AM

              Oh, yes. I mentioned 5 words; sorry, it should be 10 words. But the regex should be the same (I will change the number).

              • N
                Neculai I. Fantanaru
                last edited by Sep 27, 2021, 5:01 AM

                can anyone help me? @guy038

                • G
                  guy038
                  last edited by guy038 Nov 24, 2022, 3:27 PM Sep 27, 2021, 10:00 AM

                  Hello, @neculai-i-fantanaru, @alan-kilborn and All,

                  I agree with @alan-kilborn’s comments and would have let you look for a solution by yourself! But, apparently, you’ve reached a dead end!

                  You said :

                  hi, I need to put a dot after every 10 words

                  But you haven’t shown us what kind of text is involved :-(

                  I suppose that your initial text does not contain any punctuation and is mainly a list of words separated by space characters?


                  As an example, let’s take the first paragraph of the preamble of the license.txt file:

                  The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too.
                  

                  AFTER removing all punctuation marks, we get:

                  The licenses for most software are designed to take away your freedom to share and change it By contrast the GNU General Public License is intended to guarantee your freedom to share and change free software to make sure the software is free for all its users This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it Some other Free Software Foundation software is covered by the GNU Library General Public License instead You can apply it to your programs too
                  

                  Note that the possessive form Free Software Foundation's software remains in this text. To my mind, Foundation's should be considered a single word, as should contracted English forms such as I'm, don't,…

                  So, an appropriate regex S/R could be :

                  SEARCH (?:([\w'’]+)\W+){9}(?1)\K    or    (?:[\w'’]+\W+){9}[\w'’]+\K

                  REPLACE .

                  And, after a single click on the Replace All button, the text becomes:

                  The licenses for most software are designed to take away. your freedom to share and change it By contrast the. GNU General Public License is intended to guarantee your freedom. to share and change free software to make sure the. software is free for all its users This General Public. License applies to most of the Free Software Foundation's software. and to any other program whose authors commit to using. it Some other Free Software Foundation software is covered by. the GNU Library General Public License instead You can apply. it to your programs too
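
For readers who want to try the same idea outside Notepad++, here is a rough Python equivalent. It is a sketch under stated assumptions: Python's `re` module has neither `\K` nor the `(?1)` subroutine call, so the run of words is captured and written back with `\1` followed by a dot. The helper name `dot_every_n_words` and the `WORD` constant are hypothetical, not from the thread.

```python
import re

# Same word definition as in the regex above: word characters
# plus both apostrophe styles, so "don't" counts as one word.
WORD = r"[\w'’]+"

def dot_every_n_words(text: str, n: int = 10) -> str:
    """Insert a '.' after every n-th word (hypothetical helper)."""
    # (n - 1) words each followed by separators, then the n-th word.
    # No \K in Python's re, so capture the run and re-emit it.
    pattern = re.compile(rf"((?:{WORD}\W+){{{n - 1}}}{WORD})")
    return pattern.sub(r"\1.", text)

sample = "one two three four five six seven eight nine ten eleven twelve"
print(dot_every_n_words(sample))
# one two three four five six seven eight nine ten. eleven twelve
```

Because `\W+` also matches line breaks, words are counted across lines, just as in the search/replace above; a trailing group of fewer than `n` words is left without a dot.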
                  

                  However, note that the resulting text is not coherent: a full stop is inserted after every 10 words, which obviously ignores English sentence structure!

                  Best Regards,

                  guy038

                  • N
                    Neculai I. Fantanaru @guy038
                    last edited by Sep 27, 2021, 4:55 PM

                    @guy038 said in Regex: Put a dot after every 10 words:

                    (?:([\w'’]+)\W+){9}(?1)\K

                    thank you @guy038

                    • G
                      guy038
                      last edited by guy038 Nov 24, 2022, 3:29 PM Sep 27, 2021, 7:35 PM

                      Hi, @neculai-i-fantanaru and All,

                      I forgot to explain why the first provided regex was (?:([\w'’]+)\W+){9}(?1)\K and not the regex (?:([\w'’]+)\W+){9}\1\K


                      Well, the \1 right before \K would match the last capture of group 1, that is, the text of the ninth word matched by [\w'’]+ !

                      So, the regex (?:([\w'’]+)\W+){9}\1 would match strings like :

                      111 222 333 444 555 666 777 888 999 999
                      

                      or

                      000 111 222 333 444 555 666 777 888 888
                      

                      but NOT the string :

                      000 111 222 333 444 555 666 777 888 999
                      

                      In contrast, the (?1) syntax is a subroutine call to group 1 and is fundamentally identical to the regex [\w'’]+ itself. So, the regex (?:([\w'’]+)\W+){9}(?1) would also match my third example, whatever the tenth word is ;-))
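
The difference can be checked with a small Python sketch. One assumption to flag: Python's `re` module does not support `(?1)`, so the subroutine call is imitated here by repeating the sub-pattern itself, which is what the explanation above says it amounts to. The names `backref` and `subroutine_like` are illustrative only.

```python
import re

WORD = r"[\w'’]+"

# Backreference: \1 must repeat the exact text last captured by group 1,
# i.e. the ninth word.
backref = re.compile(rf"(?:({WORD})\W+){{9}}\1")

# Stand-in for (?1): re-use the sub-pattern, so any tenth word is accepted.
subroutine_like = re.compile(rf"(?:({WORD})\W+){{9}}{WORD}")

samples = [
    "111 222 333 444 555 666 777 888 999 999",
    "000 111 222 333 444 555 666 777 888 888",
    "000 111 222 333 444 555 666 777 888 999",
]

results = [
    (bool(backref.search(s)), bool(subroutine_like.search(s)))
    for s in samples
]
print(results)
# [(True, True), (True, True), (False, True)]
```

Only the third string separates the two patterns: its tenth word differs from the ninth, so the backreference fails while the repeated sub-pattern still matches.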

                      Best Regards

                      guy038

                      The Community of users of the Notepad++ text editor.