Regex: Put a dot after every 10 words
-
hi, I need to put a dot after every 10 words. My regex is not so good:
SEARCH:
(\w+){0,5}
REPLACE BY:\1.
-
Now THAT looks like a regex just thrown out there, in order to get some help.
Can you please explain how that regex is supposed to work?
If you do that, perhaps you’ll see what’s wrong with it.
-
@Alan-Kilborn good day, sir. My regex was not just thrown out there in order to get some help. It is an attempt at a solution, a step forward. I know it is not good, but I don’t know any other solution…
-
OK, I have another solution that should put a dot after every 5 words. But it seems to work only for the first line, not for all lines:
SEARCH:
(\w+){10}\K
REPLACE BY:\1.
-
Sorry, it’s just that your initial regex looked very suspicious, because you mentioned 10 but then nothing even close to 10 appeared in your regex.
So \w does NOT mean to match a “word”, it means match a “word character”. For example, there are three word characters in the first word of this:
abc defg hijkl
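For anyone who wants to see the difference outside Notepad++, here is a minimal Python sketch. It assumes Python's built-in re module, whose \w behaves the same way for this plain ASCII sample; the variable names are made up for the illustration.

```python
import re

sample = "abc defg hijkl"

# \w matches ONE word character, so the first word alone gives three matches.
first_word_chars = re.findall(r"\w", sample.split()[0])

# \w+ matches a run of word characters, which is what we usually call a word.
words = re.findall(r"\w+", sample)

# (\w+){10} never crosses a space: it needs at least ten consecutive
# word characters, so it finds nothing in the sample.
no_match = re.search(r"(\w+){10}", sample)

# A single long "word" of twelve characters does match.
long_word = re.search(r"(\w+){10}", "abcdefghijkl")

print(first_word_chars)    # ['a', 'b', 'c']
print(words)               # ['abc', 'defg', 'hijkl']
print(no_match)            # None
print(long_word.group(0))  # abcdefghijkl
```

So a quantifier placed on (\w+) repeats runs of characters inside one word; it does not count words unless the separators between them are matched too.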
-
Oh, yes. I mentioned 5 words; sorry, it should be 10 words. But the regex should be the same (I will change the number).
-
can anyone help me? @guy038
-
Hello, @neculai-i-fantanaru, @alan-kilborn and All,
I agree with @alan-kilborn’s comments, and I was going to let you look for a solution by yourself ! But, apparently, you’ve reached a dead end !
You said :
hi, I need to put a dot after every 10 words
But you haven’t shown us what kind of text is involved :-(
I suppose that your initial text does not contain any punctuation and is, mainly, a list of words separated by space characters ?!
As an example, let’s take the first paragraph of the preamble of the license.txt file :
The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too.
AFTER removing all the punctuation signs, we get :
The licenses for most software are designed to take away your freedom to share and change it By contrast the GNU General Public License is intended to guarantee your freedom to share and change free software to make sure the software is free for all its users This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it Some other Free Software Foundation software is covered by the GNU Library General Public License instead You can apply it to your programs too
Note that the possessive structure Free Software Foundation's software remains in this text. To my mind, the expression Foundation's should be considered as a single word, as well as other contracted English forms such as I'm, don't, …
So, an appropriate regex S/R could be :
SEARCH
(?:([\w'’]+)\W+){9}(?1)\K
or
(?:[\w'’]+\W+){9}[\w'’]+\K
REPLACE
.
And, after a single click on the Replace All button, we get this OUTPUT text :
The licenses for most software are designed to take away. your freedom to share and change it By contrast the. GNU General Public License is intended to guarantee your freedom. to share and change free software to make sure the. software is free for all its users This General Public. License applies to most of the Free Software Foundation's software. and to any other program whose authors commit to using. it Some other Free Software Foundation software is covered by. the GNU Library General Public License instead You can apply. it to your programs too
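If you want to try the same idea in a script, here is a rough Python sketch. It is only an approximation of the S/R above: Python's built-in re module supports neither \K nor the (?1) subroutine call, so the ten words are captured and written back with a dot appended. The function name and the numbered sample words are made up for the illustration.

```python
import re

# Ten "words" (apostrophes allowed inside a word), separated by non-word chars.
# No \K in Python's re, so the whole run is captured as group 1 instead.
TEN_WORDS = re.compile(r"((?:[\w'’]+\W+){9}[\w'’]+)")


def dot_every_ten_words(text: str) -> str:
    """Append a full stop after each complete group of ten words."""
    return TEN_WORDS.sub(r"\1.", text)


# Hypothetical sample: 25 numbered words, w1 ... w25
sample = " ".join(f"w{i}" for i in range(1, 26))
result = dot_every_ten_words(sample)
print(result)
# A dot lands after w10 and after w20; the last five words are left alone.

# Contractions count as one word, as discussed above.
contraction = dot_every_ten_words("a b c d e f g h don't j k")
print(contraction)  # a b c d e f g h don't j. k
```

Like the Notepad++ version, a trailing group of fewer than ten words gets no dot.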
However, note that this text is not consistent, as a full stop is inserted every 10 words and does not respect, obviously, the English language !!
Best Regards,
guy038
-
Hi, @neculai-i-fantanaru and All,
I forgot to explain why the first regex provided was
(?:([\w'’]+)\W+){9}(?1)\K
and not the regex
(?:([\w'’]+)\W+){9}\1\K
Well, the \1 right before \K is a back-reference : it would have to match, again, the text last captured by the group 1, that is, the ninth word found with the regex [\w'’]+ !
So, the regex
(?:([\w'’]+)\W+){9}\1
would match strings like :
111 222 333 444 555 666 777 888 999 999
or
000 111 222 333 444 555 666 777 888 888
but NOT the string :
000 111 222 333 444 555 666 777 888 999
On the contrary, the (?1) syntax is a subroutine call to the group 1 and is, fundamentally, identical to the regex [\w'’]+ itself. So, the regex
(?:([\w'’]+)\W+){9}(?1)
would also match my third example, as well as any other string with any word in tenth position ;-))
Best Regards
guy038
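As a side note, the back-reference behaviour can be checked with a small Python sketch. This assumes Python's built-in re module, which has back-references but no (?1) subroutine call, so the second pattern simply spells the [\w'’]+ sub-pattern out again (which is what the subroutine call amounts to here). The variable names are made up for the illustration.

```python
import re

WORD = r"[\w'’]+"

# Back-reference: \1 must repeat the text captured LAST by group 1,
# that is, the ninth word.
backref = re.compile(rf"(?:({WORD})\W+){{9}}\1")

# Stand-in for the (?1) version: the sub-pattern itself is repeated,
# so the tenth word can be any word.
subroutine_like = re.compile(rf"(?:{WORD}\W+){{9}}{WORD}")

repeated = "111 222 333 444 555 666 777 888 999 999"
distinct = "000 111 222 333 444 555 666 777 888 999"

print(backref.search(repeated) is not None)          # True
print(backref.search(distinct) is not None)          # False
print(subroutine_like.search(repeated) is not None)  # True
print(subroutine_like.search(distinct) is not None)  # True
```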