regex: Search operators (like google)
-
Hi. I want to run a search like the one I use on Google. A simple example:
"I love * car"
In practice, * is replaced by a missing word, like “this” (I love this car). How can I search this way in Notepad++?
-
Hi, Vasile
Easy ! To simulate this Google search in N++, use the Regular expression search mode and simply look for the regex below :
SEARCH
I love \w+ car
Remember that \w stands for a single word character, in the range [0-9A-Za-z_], as well as any accented letter or any word character from the Greek, Cyrillic, Hebrew or Arabic scripts !
Cheers,
guy038
-
Hello guy038, it works just fine. But when I want to find more words, I tried this, and it doesn’t work:
I love \w{1,6}+ car
-
Anyway, I found another solution to find all the words between “I love” and “car”:
I love \w.+? car
-
Vasile and All,
Ah, of course, Vasile, your regex I love \w{1,6}+ car contains the part \w{1,6}+, with the quantifier {1,6} followed by a + sign. This is a possessive quantifier, which behaves like an atomic group. That is to say that, ONCE the regex engine has matched the greatest possible number of word characters, up to 6, it will NEVER backtrack in order to satisfy the remainder of the overall regex !
Moreover, as \w never matches a space, this part can only cover ONE word. So your regex just matches 6 of the lines below, from 2 to 7, with ONE word only ( of 1 to 6 word characters ) between the words love and car :
I love car
I love 1 car
I love 12 car
I love 123 car
I love 1234 car
I love 12345 car
I love 123456 car
I love 1234567 car
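To check the list above outside Notepad++, here is a minimal Python sketch. It assumes that Python's re module is a fair stand-in for the N++ regex engine in this simple case, and it uses the plain greedy \w{1,6} rather than the possessive \w{1,6}+ ( only accepted by Python 3.11 and later ). For this regex, both forms should give the same result, because giving back word characters can never make the following space match. The last test line, with two words, is an addition of mine :

```python
import re

# Assumption: Python's re module stands in for the Notepad++ regex engine.
# The greedy \w{1,6} replaces the possessive \w{1,6}+ of the thread.
pattern = re.compile(r"I love \w{1,6} car")

lines = [
    "I love car",            # no word between "love" and "car"
    "I love 1 car",
    "I love 12 car",
    "I love 123 car",
    "I love 1234 car",
    "I love 12345 car",
    "I love 123456 car",
    "I love 1234567 car",    # 7 word characters: too long
    "I love this blue car",  # two words: \w cannot match the space
]

# Keep only the lines where the regex finds a match
matched = [line for line in lines if pattern.search(line)]

for line in matched:
    print(line)
```

Only the six lines with a single word of 1 to 6 word characters are printed, which is why this regex cannot find several words.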
Here are, below, TWO regexes which look for a range of words between TWO boundary words, let’s say WORD_1 and WORD_2, separated by at least M words and NO more than N words :
-
WORD_1(\W+\w+){M,N}\W+WORD_2
-
WORD_1([^\w\r\n]+\w+){M,N}[^\w\r\n]+WORD_2
The first syntax may match over several consecutive lines
The second syntax forces the regex engine to match the two boundary words WORD_1 and WORD_2 on the SAME line
One example :
-
Let WORD_1 be the pronoun I
-
Let WORD_2 be the noun car
-
Let M and N be the values 2 and 5
So, the two resulting regexes are :
-
I(\W+\w+){2,5}\W+car
-
I([^\w\r\n]+\w+){2,5}[^\w\r\n]+car
=> The first syntax matches lines 1 and 2 together, first, as a single multi-line match, then each of the lines from 3 to 6
=> The second syntax ONLY matches single lines, from 3 to 6, below :
I car !
I love car !
I love this car !
I love this blue car !
I love this nice blue car !
I love this very nice blue car !
I love this gleaming and very nice blue car !
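The behaviour of the two syntaxes on these seven lines can be reproduced with a short Python sketch. The helper name between_words and its same_line parameter are hypothetical, made up for this illustration, and Python's re module is only assumed to behave like the N++ engine here. Matching is case-sensitive, as with the Match case option ticked :

```python
import re

def between_words(word_1, word_2, m, n, same_line=False):
    """Build WORD_1(gap \\w+){M,N} gap WORD_2 (hypothetical helper)."""
    # \W+ may cross line breaks, [^\w\r\n]+ may not
    gap = r"[^\w\r\n]+" if same_line else r"\W+"
    quantifier = "{%d,%d}" % (m, n)
    regex = (re.escape(word_1) + "(" + gap + r"\w+)" + quantifier
             + gap + re.escape(word_2))
    return re.compile(regex)

text = "\n".join([
    "I car !",
    "I love car !",
    "I love this car !",
    "I love this blue car !",
    "I love this nice blue car !",
    "I love this very nice blue car !",
    "I love this gleaming and very nice blue car !",
])

multi_line = between_words("I", "car", 2, 5)
single_line = between_words("I", "car", 2, 5, same_line=True)

multi_matches = [m.group(0) for m in multi_line.finditer(text)]
single_matches = [m.group(0) for m in single_line.finditer(text)]

# repr() makes the line break inside the first multi-line match visible
for match in multi_matches:
    print("multi :", repr(match))
for match in single_matches:
    print("single:", repr(match))
```

The first multi-line match runs from the I of line 1 to the car of line 2, whereas every single-line match stays within one of the lines 3 to 6.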
I would like to point out a special regex construction, which may, sometimes, help to get powerful matches. It’s the part [^\w\r\n] !
Indeed, I originally considered the classical syntax \W+ to match any range of NON-word characters which occurs before a word. However, as the class \W is the opposite of the \w class, the NON-word \W may also match any EOL character, like \n or \r, leading, sometimes, to matches on two consecutive lines !
So, to handle the second case, I built this NEGATIVE class [^\w\r\n], which matches a character that is NOT a word character and is neither the \n nor the \r EOL character !
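The difference between the two classes can be seen on a tiny text containing a Windows line break. This is only a sketch, assuming that Python's re module treats \W and [^\w\r\n] the way the N++ engine does. The sample text is made up :

```python
import re

# Two lines separated by a Windows EOL (\r\n)
text = "blue car !\r\nI love"

# \W+ also swallows the EOL characters
loose = re.findall(r"\W+", text)

# [^\w\r\n]+ stops right before the EOL characters
strict = re.findall(r"[^\w\r\n]+", text)

print(loose)   # [' ', ' !\r\n', ' ']
print(strict)  # [' ', ' !', ' ']
```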
Another example : the regex [^\W_a-z] is a kind of double-negation construction : in the end, it matches any word character, except for the underscore (_) and all the usual lower-case letters ([a-z]). In other words, this regex would match :
-
Any digit or number-like symbol
-
Any upper-case letter, accented or NOT
-
Any accented lower-case letter, ONLY
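A quick way to verify these three cases is the Python sketch below. It assumes case-sensitive matching ( the Match case option, in N++ ) and takes Python's Unicode-aware \w as an approximation of the N++ word class. The sample string is invented for the test :

```python
import re

# Double negation: NOT (non-word, underscore or a-z)
pattern = re.compile(r"[^\W_a-z]")

# a, _, 5, space, A, e-acute, space, E-acute, !
sample = "a_5 A\u00e9 \u00c9!"

kept = pattern.findall(sample)

print(kept)  # ['5', 'A', 'é', 'É']
```

The plain a, the underscore, the spaces and the exclamation mark are all rejected, while the digit, the upper-case letters and the accented lower-case letter are kept.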
Cheers,
guy038