Vasile and All,
Ah, of course, Vasile, your regex I love \w{1,6}+ car contains the part \w{1,6}+, that is, the quantifier {1,6} followed by a + sign. This makes the quantifier possessive, so that part behaves like an atomic group. That is to say that, ONCE the regex engine has matched the greatest possible number of word characters, up to 6, it will NEVER backtrack into them in order to satisfy the remainder of the overall regex!
So your regex matches just 6 of the 8 lines below (lines 2 to 7), those with only ONE word, of 1 to 6 word characters, between the words love and car:
I love car
I love 1 car
I love 12 car
I love 123 car
I love 1234 car
I love 12345 car
I love 123456 car
I love 1234567 car
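The behaviour above can be checked with a short script. This is only a sketch, in Python rather than in Notepad++: as Python's re module accepts the possessive syntax \w{1,6}+ only from version 3.11, the sketch assumes the classic emulation (?=(\w{1,6}))\1, a capturing lookahead followed by a backreference, which also never gives characters back. The names pattern, lines and matched are illustrative only.

```python
import re

# Assumption: (?=(\w{1,6}))\1 stands in for the possessive \w{1,6}+ .
# The lookahead captures up to 6 word characters and is never re-entered,
# then \1 consumes exactly the captured text.
pattern = re.compile(r"I love (?=(\w{1,6}))\1 car")

lines = [
    "I love car",
    "I love 1 car",
    "I love 12 car",
    "I love 123 car",
    "I love 1234 car",
    "I love 12345 car",
    "I love 123456 car",
    "I love 1234567 car",
]

# Keep the lines where the regex finds a match
matched = [line for line in lines if pattern.search(line)]
for line in matched:
    print(line)
```

The first line fails because car is swallowed as the middle word and nothing is left for the final car. The last line fails because, after 6 word characters, the engine finds a 7 where a space is required.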
Here are, below, TWO regexes that look for a range of words between TWO boundary words, let’s say WORD_1 and WORD_2, which are separated by at least M words and NO more than N words:
WORD_1(\W+\w+){M,N}\W+WORD_2
WORD_1([^\w\r\n]+\w+){M,N}[^\w\r\n]+WORD_2
The first syntax may match across several consecutive lines
The second syntax forces the regex engine to match the two boundary words WORD_1 and WORD_2 on the SAME line
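As a rough sketch, the two templates can be generated from their four parameters. The helper words_between and its same_line flag are hypothetical names, made up for this illustration, and Python's re module is assumed here in place of the Notepad++ engine:

```python
import re

def words_between(word_1, word_2, m, n, same_line=False):
    """Hypothetical helper: WORD_1, then M to N words, then WORD_2."""
    # Separator class: any non-word character, or any non-word
    # character that is not an EOL character
    sep = r"[^\w\r\n]" if same_line else r"\W"
    gap = "(" + sep + r"+\w+)" + "{%d,%d}" % (m, n)
    # re.escape protects boundary words containing regex metacharacters
    return re.compile(re.escape(word_1) + gap + sep + "+" + re.escape(word_2))

# Values taken from the example of this post
print(words_between("I", "car", 2, 5).pattern)
print(words_between("I", "car", 2, 5, same_line=True).pattern)
```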
One example :
Let WORD_1 be the pronoun I
Let WORD_2 be the noun car
Let M and N be the values 2 and 5
So, the two resulting regexes are :
I(\W+\w+){2,5}\W+car
I([^\w\r\n]+\w+){2,5}[^\w\r\n]+car
=> The first syntax matches, first, lines 1 and 2 together, as a single match spanning both lines, then each of the lines from 3 to 6
=> The second syntax matches ONLY the lines from 3 to 6, each one separately, in the text below:
I car !
I love car !
I love this car !
I love this blue car !
I love this nice blue car !
I love this very nice blue car !
I love this gleaming and very nice blue car !
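The difference between the two syntaxes can be reproduced on these 7 lines with the sketch below. It assumes Python's re module as a stand-in for the Notepad++ regex engine, which agrees with it on this sample. The variable names are illustrative.

```python
import re

text = "\n".join([
    "I car !",
    "I love car !",
    "I love this car !",
    "I love this blue car !",
    "I love this nice blue car !",
    "I love this very nice blue car !",
    "I love this gleaming and very nice blue car !",
])

multi_line = re.compile(r"I(\W+\w+){2,5}\W+car")
same_line = re.compile(r"I([^\w\r\n]+\w+){2,5}[^\w\r\n]+car")

# finditer + group(0) gives the whole matches
# (findall would return only the text of the capturing group)
multi_matches = [m.group(0) for m in multi_line.finditer(text)]
same_matches = [m.group(0) for m in same_line.finditer(text)]

for label, matches in (("first", multi_matches), ("second", same_matches)):
    # Show the line break of a multi-line match as a visible \n
    print(label, [m.replace("\n", "\\n") for m in matches])
```

With the first regex, the first match runs from the I of line 1 to the car of line 2, because \W+ is allowed to cross the line break. The last line is never matched, as it has 8 words between I and car.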
I would like to point out a special regex construction which can, sometimes, help to get more precise matches. It’s the part [^\w\r\n]!
Indeed, I originally considered the classical syntax \W+ to match any range of NON-word characters which occurs before a word. However, as the class \W is the opposite of the \w class, \W may also match any EOL character, like \n or \r, which sometimes leads to matches spanning two consecutive lines!
So, to handle the second case (both boundary words on the SAME line), I built this NEGATIVE class [^\w\r\n], which matches a character that is NOT a word character and is NEITHER the \n NOR the \r EOL character!
Another example: the regex [^\W_a-z] is a kind of double-negation construction. In the end, it matches any word character, except for the underscore ( _ ) and all the usual lower-case letters ( [a-z] ). In other words, this regex matches:
Any digit or number-like symbol
Any upper-case letter, accented or NOT
Among the lower-case letters, ONLY the accented ones
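A quick way to see this double negation at work is the sketch below. It assumes Python's re module, where \w is Unicode-aware for text strings, and a made-up sample string:

```python
import re

# Sample: a, Z, underscore, 9, e-acute, E-acute, hyphen, space
# (the accented letters are written as escapes to keep them precomposed)
sample = "aZ_9\u00e9\u00c9- "

# Word characters, minus the underscore, minus the letters a to z
kept = re.findall(r"[^\W_a-z]", sample)
print(kept)
```

Here, the a, the underscore, the hyphen and the space are skipped, while Z, 9 and the two accented letters are kept.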
Cheers,
guy038