Regex: Find these 3 words within a range of 8 words



  • Take, for example, this Google search: "mother * beautiful * car". With this search, Google finds this line:

    Text messages haunt mother after ‘beautiful’ daughter’s car crash death.

    I want to use a regex to find the same line, so I have to find these 3 words within a range of 8 words.

    I made this regex, but it is not very good:

    ( mother)\h+(\w+\h+){1,8}?(beautiful ) # this finds mother and beautiful with up to 8 words between them. This one works.

    But searching for 3 words within a range of 8 words is difficult:

    ( MOTHER)\h+(\w+\h+){1,8}?(BEAUTIFUL )|\h+(\w+\h+){1,8}?(CAR)

    Can anyone help me?



  • @Vasile-Caraus ,

    Can anyone help me?

    Possibly. But not without assumptions.

    It would help if you could clarify things like:

    • Can it be mother one two three four five six seven eight beautiful one two three four five six seven eight car? Or does the distance from mother to car have to be the 8?
    • Do the keywords have to have words in between, or would mother beautiful car match?
    • Does order matter?
    • What happens when it’s one mother three beautiful five six seven eight car: would it fail because car is the ninth word, or would it match because the span from mother to car is only eight words?
    • Should it span multiple lines or not?

    Showing us examples of data that should match and data that shouldn’t is always better than making us guess. No matter how well written, almost no one will come up with a description of their data that is sufficient to get a question like this answered without also providing example data.

    mother two three beautiful five six car eight
    mother two three ugly five six car eight
    mother two three beautify five six vehicle eight
    one mother three beautiful five six seven car
    mother one two three four five beautiful one two three four five car
    

    Here’s one solution that matches three of the lines with a certain set of assumptions.
    (mother)\h+((?:\w+\h+){0,8})(beautiful)\h+((?:\w+\h+){0,8})(car)

    And a slight modification that matches only two, with a more restrictive assumption:
    (mother)\h+(?=(?:\w+\h+){0,8}car)((?:\w+\h+){0,8})(beautiful)\h+((?:\w+\h+){0,8})(car)

    (The first only requires that there be no more than 8 words between each pair of adjacent keywords. The second adds a lookahead to make sure that there are also no more than 8 words between mother and car, while still allowing some number of words between each keyword.)
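For readers who want to check the two expressions outside Notepad++, here is a minimal Python sketch run against the five sample lines above. It rests on one assumption: Python's built-in `re` module has no `\h`, so `[ \t]` is used as a stand-in (it covers spaces and tabs only, not every horizontal blank that `\h` can match). The names `H`, `per_gap`, `overall`, `SAMPLES` and `matching_lines` are made up for this illustration.

```python
import re

H = r"[ \t]"  # assumed stand-in for \h, which Python's re does not support

# First regex: at most 8 words between each pair of adjacent keywords.
per_gap = re.compile(
    rf"(mother){H}+((?:\w+{H}+){{0,8}})(beautiful){H}+((?:\w+{H}+){{0,8}})(car)"
)

# Second regex: same, plus a lookahead limiting mother..car to 8 words in between.
overall = re.compile(
    rf"(mother){H}+(?=(?:\w+{H}+){{0,8}}car)"
    rf"((?:\w+{H}+){{0,8}})(beautiful){H}+((?:\w+{H}+){{0,8}})(car)"
)

SAMPLES = [
    "mother two three beautiful five six car eight",
    "mother two three ugly five six car eight",
    "mother two three beautify five six vehicle eight",
    "one mother three beautiful five six seven car",
    "mother one two three four five beautiful one two three four five car",
]

def matching_lines(pattern):
    """Return the sample lines in which the pattern finds a match."""
    return [line for line in SAMPLES if pattern.search(line)]

if __name__ == "__main__":
    print(len(matching_lines(per_gap)), "lines match the first regex")
    print(len(matching_lines(overall)), "lines match the second regex")
```

The last sample line is the one that separates the two: each gap holds only five words, but the span from mother to car holds eleven, so the lookahead rejects it.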

    You should know by now that you will get better answers if you make a better problem statement. Until you provide more, you probably won’t get anything better than that.

    ----

    Do you want regex search/replace help? Then please be patient and polite, show some effort, and be willing to learn; answer questions and requests for clarification that are made of you. All example text should be marked as literal text using the </> toolbar button or manual Markdown syntax. To make regex in red (and so they keep their special characters like *), use backticks, like `^.*?blah.*?\z`. Screenshots can be pasted from the clipboard to your post using Ctrl+V to show graphical items, but any text should be included as literal text in your post so we can easily copy/paste your data. Show the data you have and the text you want to get from that data; include examples of things that should match and be transformed, and things that don’t match and should be left alone; show edge cases and make sure your examples are as varied as your real data. Show the regex you already tried, and why you thought it should work; tell us what’s wrong with what you do get. Read the official NPP Searching / Regex docs and the forum’s Regular Expression FAQ. If you follow these guidelines, you’re much more likely to get helpful replies that solve your problem in the shortest number of tries.



  • Hello, vasile, @peterjones and All,

    EDIT : This whole post is obsolete. So refer, instead, to this other post, below : https://community.notepad-plus-plus.org/post/64093

    OMG, this problem really gave me a hard time ! Indeed, we have many conditions to respect :

    • Each KEY word ( mother , beautiful and car ) must all be present, at least once, in current line

    • The range between the first and the last KEY word, included, must not contain more than 8 words

    • The range to match must contain the three KEY words in any order
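As a cross-check on these three conditions, independent of any regex, here is a plain-Python sketch. It assumes that words are separated by whitespace only (so punctuation glued to a word is not handled) and that the comparison ignores case. The function name `has_keys_within` is made up for this illustration and is not part of the regex solution.

```python
def has_keys_within(line, keys, max_words=8):
    """Return True if some run of at most max_words consecutive words
    starts on a KEY word and contains every KEY word at least once."""
    words = line.lower().split()
    wanted = {key.lower() for key in keys}
    for start, word in enumerate(words):
        if word not in wanted:
            continue  # a candidate range must begin on a KEY word
        window = words[start:start + max_words]
        if wanted <= set(window):  # all KEY words present, in any order
            return True
    return False

if __name__ == "__main__":
    keys = ["aaaa", "hhhh", "zzzz"]
    print(has_keys_within("aaaa 111 222 333 444 555 hhhh zzzz", keys))
    print(has_keys_within("aaaa 111 222 hhhh 333 444 555 666 zzzz", keys))
```

Because the window starts on a KEY word and holds at most `max_words` words, the range from that word to the last KEY word needed can never exceed the limit.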


    • The first condition is rather easy with the three look-aheads (?=.*mother)(?=.*beautiful)(?=.*car) which are tested as soon as the regex engine is at beginning of a line

    • The second condition means that :

      • The shortest range is KEY_word_#1 KEY_word_#3 KEY_word_#2

      • The greatest range is KEY_word_#1 word_1 word_2 word_3 word_4 word_5 word_6 KEY_word_#2, with KEY_word_#3 being any word_#

    So this fourth look-ahead (?=(mother|beautiful|car)\h(?:\w+\h){1,6}(?!\1)(mother|beautiful|car)) is used to make sure that the range to match contains between 3 and 8 words, in total !

    Note that the first KEY word, KEY_word_#1, is the group 1 and the last KEY word, KEY_word_#2, is the group 2, which must be different from KEY_word_#1, because of the negative look-ahead (?!\1)

    • The third condition represents the main part and searches :

      • The group 1, standing for KEY_word_#1, followed with its blank char

      • An optional range of chars, containing none of the KEY words, followed by a blank char

      • The third KEY word KEY_word_#3, which must be different from KEY_word_#1 and KEY_word_#2, followed by a blank char

      • Again, an optional range of chars, containing none of the KEY words, followed by a blank char

      • The group 2, standing for KEY_word_#2


    Using the free-spacing mode, here is this tricky search regex :

    (?x)                                                                     # FREE-SPACING Mode 
    (?i-s)                                                                   # Search INSENSITIVE to CASE and LIMITED to a SINGLE line, at a time
    (?=.*mother)(?=.*beautiful)(?=.*car)                                     # IF EACH KEY word exists, further on, at least ONCE, in CURRENT line
    .*?\K                                                                    #   LEADING chars are NOT considered ( due to \K )
    (?=(mother|beautiful|car)\h(?:\w+\h){1,6}(?!\1)(mother|beautiful|car))   #   IF [KEY_word_#1 + (any WORD + BLANK char) from 1 to 6 TIMES + KEY_word_#2]
                                                                             #      Start of OVERALL match                     
    \1\h                                                                     #      KEY_word_#1 with its BLANK char
    ((?:(?!(?1)).)+?\h)?                                                     #      The SHORTEST POSSIBLE range of chars DIFFERENT from ALL KEY words, with its BLANK char, till ...
    (?!\1|\2)(?1)\h                                                          #      KEY_WORD_#3, with its BLANK char, DIFFERENT from KEY_word_#1 and KEY_word_#2
    ((?:(?!(?1)).)+?\h)?                                                     #      The SHORTEST POSSIBLE range of chars DIFFERENT from ALL KEY words, with its BLANK char, till ...
    \2                                                                       #      Key word #2
    

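    Note that the (?1) syntax, in some parts of the regex, is a sub-routine call to the pattern defined by the group 1, so it is strictly identical to writing the regex (mother|beautiful|car) again. Unlike the back-reference \1, it does not have to match the same text as the group 1 did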
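The difference between a sub-routine call and a back-reference can be seen in a small Python sketch. One assumption to flag: Python's built-in `re` module rejects `(?1)`, so the sketch imitates the call by repeating the alternation text in a non-capturing group, and `[ \t]` stands in for `\h`. The names `KEYS`, `subroutine_like` and `backreference` are made up for this illustration.

```python
import re

KEYS = "mother|beautiful|car"

# Boost/PCRE form:  (mother|beautiful|car)\h(?1)
# Stand-in for re:  repeat the same alternation in a non-capturing group.
subroutine_like = re.compile(rf"({KEYS})[ \t](?:{KEYS})", re.I)

# A back-reference must match the very text that group 1 captured.
backreference = re.compile(rf"({KEYS})[ \t]\1", re.I)

if __name__ == "__main__":
    text = "the Mother car"
    print(subroutine_like.search(text))  # any KEY word may follow
    print(backreference.search(text))    # the SAME KEY word must follow
```

So `(?1)` re-runs the pattern of the group 1, while `\1` re-matches the text that the group 1 captured.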

    In order to do some tests, use the regex, below, and the text that follows !

    =================================================== Regex in FREE-SPACING mode ========================================
    (?x)
    (?i-s)
    (?=.*aaaa)(?=.*hhhh)(?=.*zzzz)
    .*?\K
    (?=(aaaa|hhhh|zzzz)\h(?:\w+\h){1,6}(?!\1)(aaaa|hhhh|zzzz))
    \1\h
    ((?:(?!(?1)).)+?\h)?
    (?!\1|\2)(?1)\h
    ((?:(?!(?1)).)+?\h)?
    \2
    =======================================================================================================================
    
    The KEY words are the THREE strings "aaaa", "hhhh" and "zzzz"
    
    ================================================== Cases NOT MATCHED ==================================================
    
    aaaa                                               ONE word ONLY
    hhhh                                               ONE word ONLY
    zzzz                                               ONE word ONLY
    
    aaaa 111 222 hhhh                                  TWO words ONLY
    aaaa 111 222 zzzz                                  TWO words ONLY
    hhhh 111 222 zzzz                                  TWO words ONLY
    
    
    aaaa 111 aaaa 222 aaaa                             SAME word, THREE times
    hhhh 111 hhhh 222 hhhh                             SAME word, THREE times
    zzzz 111 zzzz 222 zzzz                             SAME word, THREE times
    
    
    aaaa aaaa 333 hhhh                                 THREE words, but 2 SAME words
    aaaa aaaa 333 zzzz                                 THREE words, but 2 SAME words
    hhhh hhhh 333 aaaa                                 THREE words, but 2 SAME words
    hhhh hhhh 333 zzzz                                 THREE words, but 2 SAME words
    zzzz zzzz 333 aaaa                                 THREE words, but 2 SAME words
    zzzz zzzz 333 hhhh                                 THREE words, but 2 SAME words
    
    aaaa 111 222 aaaa 333 444 555 hhhh                 THREE words, but 2 SAME words
    aaaa 111 222 aaaa 333 444 555 zzzz                 THREE words, but 2 SAME words
    hhhh 111 222 hhhh 333 444 555 aaaa                 THREE words, but 2 SAME words
    hhhh 111 222 hhhh 333 444 555 zzzz                 THREE words, but 2 SAME words
    zzzz 111 222 zzzz 333 444 555 aaaa                 THREE words, but 2 SAME words
    zzzz 111 222 zzzz 333 444 555 hhhh                 THREE words, but 2 SAME words
    
    
    aaaa 111 222 hhhh 333 444 555 666 zzzz             The 3 DIFFERENT words, but in a range of  9 Words  ( so > 8 )
    hhhh 111 222 aaaa 333 444 555 666 zzzz             The 3 DIFFERENT words, but in a range of  9 Words  ( so > 8 )
    aaaa 111 222 zzzz 333 444 555 666 hhhh             The 3 DIFFERENT words, but in a range of  9 Words  ( so > 8 )
    zzzz 111 222 hhhh 333 444 555 666 aaaa             The 3 DIFFERENT words, but in a range of  9 Words  ( so > 8 )
    zzzz 111 222 aaaa 333 444 555 666 hhhh             The 3 DIFFERENT words, but in a range of  9 Words  ( so > 8 )
    hhhh 111 222 zzzz 333 444 555 666 aaaa             The 3 DIFFERENT words, but in a range of  9 Words  ( so > 8 )
    
    aaaa 111 222 3333 hhhh 444 555 666 777 zzzz        The 3 DIFFERENT words, but in a range of 10 Words  ( so > 8 )
    hhhh 111 222 3333 aaaa 444 555 666 777 zzzz        The 3 DIFFERENT words, but in a range of 10 Words  ( so > 8 )
    aaaa 111 222 3333 zzzz 444 555 666 777 hhhh        The 3 DIFFERENT words, but in a range of 10 Words  ( so > 8 )
    zzzz 111 222 3333 hhhh 444 555 666 777 aaaa        The 3 DIFFERENT words, but in a range of 10 Words  ( so > 8 )
    zzzz 111 222 3333 aaaa 444 555 666 777 hhhh        The 3 DIFFERENT words, but in a range of 10 Words  ( so > 8 )
    hhhh 111 222 3333 zzzz 444 555 666 777 aaaa        The 3 DIFFERENT words, but in a range of 10 Words  ( so > 8 )
    
    aaaa 111 222 3333 444 hhhh 555 666 777 888 zzzz    The 3 DIFFERENT words, but in a range of 11 Words  ( so > 8 )
    hhhh 111 222 3333 444 aaaa 555 666 777 888 zzzz    The 3 DIFFERENT words, but in a range of 11 Words  ( so > 8 )
    aaaa 111 222 3333 444 zzzz 555 666 777 888 hhhh    The 3 DIFFERENT words, but in a range of 11 Words  ( so > 8 )
    zzzz 111 222 3333 444 hhhh 555 666 777 888 aaaa    The 3 DIFFERENT words, but in a range of 11 Words  ( so > 8 )
    zzzz 111 222 3333 444 aaaa 555 666 777 888 hhhh    The 3 DIFFERENT words, but in a range of 11 Words  ( so > 8 )
    hhhh 111 222 3333 444 zzzz 555 666 777 888 aaaa    The 3 DIFFERENT words, but in a range of 11 Words  ( so > 8 )
    
    ================================================== MATCHED cases ======================================================
    
    aaaa hhhh zzzz                          aaaa hhhh zzzz
    aaaa zzzz hhhh                          aaaa zzzz hhhh
    hhhh aaaa zzzz                          hhhh aaaa zzzz
    hhhh zzzz aaaa                          hhhh zzzz aaaa
    zzzz aaaa hhhh                          zzzz aaaa hhhh
    zzzz hhhh aaaa                          zzzz hhhh aaaa
    
    Range of 4 WORDS  aaaa 111 hhhh zzzz    aaaa hhhh 111 zzzz
    Range of 4 WORDS  aaaa 111 zzzz hhhh    aaaa zzzz 111 hhhh
    Range of 4 WORDS  hhhh 111 aaaa zzzz    hhhh aaaa 111 zzzz
    Range of 4 WORDS  hhhh 111 zzzz aaaa    hhhh zzzz 111 aaaa
    Range of 4 WORDS  zzzz 111 aaaa hhhh    zzzz aaaa 111 hhhh
    Range of 4 WORDS  zzzz 111 hhhh aaaa    zzzz hhhh 111 aaaa
    
    aaaa 111 222 hhhh zzzz    aaaa 111 hhhh 222 zzzz    aaaa hhhh 111 222 zzzz
    aaaa 111 222 zzzz hhhh    aaaa 111 zzzz 222 hhhh    aaaa zzzz 111 222 hhhh
    hhhh 111 222 aaaa zzzz    hhhh 111 aaaa 222 zzzz    hhhh aaaa 111 222 zzzz
    hhhh 111 222 zzzz aaaa    hhhh 111 zzzz 222 aaaa    hhhh zzzz 111 222 aaaa
    zzzz 111 222 aaaa hhhh    zzzz 111 aaaa 222 hhhh    zzzz aaaa 111 222 hhhh
    zzzz 111 222 hhhh aaaa    zzzz 111 hhhh 222 aaaa    zzzz hhhh 111 222 aaaa
    
    Range of 6 WORDS  aaaa 111 222 333 hhhh zzzz    aaaa 111 222 hhhh 333 zzzz    aaaa 111 hhhh 222 333 zzzz    aaaa hhhh 111 222 333 zzzz
    Range of 6 WORDS  aaaa 111 222 333 zzzz hhhh    aaaa 111 222 zzzz 333 hhhh    aaaa 111 zzzz 222 333 hhhh    aaaa zzzz 111 222 333 hhhh
    Range of 6 WORDS  hhhh 111 222 333 aaaa zzzz    hhhh 111 222 aaaa 333 zzzz    hhhh 111 aaaa 222 333 zzzz    hhhh aaaa 111 222 333 zzzz
    Range of 6 WORDS  hhhh 111 222 333 zzzz aaaa    hhhh 111 222 zzzz 333 aaaa    hhhh 111 zzzz 222 333 aaaa    hhhh zzzz 111 222 333 aaaa
    Range of 6 WORDS  zzzz 111 222 333 aaaa hhhh    zzzz 111 222 aaaa 333 hhhh    zzzz 111 aaaa 222 333 hhhh    zzzz aaaa 111 222 333 hhhh
    Range of 6 WORDS  zzzz 111 222 333 hhhh aaaa    zzzz 111 222 hhhh 333 aaaa    zzzz 111 hhhh 222 333 aaaa    zzzz hhhh 111 222 333 aaaa
    
    aaaa 111 222 333 4444 hhhh zzzz    aaaa 111 222 333 hhhh 4444 zzzz    aaaa 111 222 hhhh 333 4444 zzzz    aaaa 111 hhhh 222 333 4444 zzzz    aaaa hhhh 111 222 333 4444 zzzz
    aaaa 111 222 333 4444 zzzz hhhh    aaaa 111 222 333 zzzz 4444 hhhh    aaaa 111 222 zzzz 333 4444 hhhh    aaaa 111 zzzz 222 333 4444 hhhh    aaaa zzzz 111 222 333 4444 hhhh
    hhhh 111 222 333 4444 aaaa zzzz    hhhh 111 222 333 aaaa 4444 zzzz    hhhh 111 222 aaaa 333 4444 zzzz    hhhh 111 aaaa 222 333 4444 zzzz    hhhh aaaa 111 222 333 4444 zzzz
    hhhh 111 222 333 4444 zzzz aaaa    hhhh 111 222 333 zzzz 4444 aaaa    hhhh 111 222 zzzz 333 4444 aaaa    hhhh 111 zzzz 222 333 4444 aaaa    hhhh zzzz 111 222 333 4444 aaaa
    zzzz 111 222 333 4444 aaaa hhhh    zzzz 111 222 333 aaaa 4444 hhhh    zzzz 111 222 aaaa 333 4444 hhhh    zzzz 111 aaaa 222 333 4444 hhhh    zzzz aaaa 111 222 333 4444 hhhh
    zzzz 111 222 333 4444 hhhh aaaa    zzzz 111 222 333 hhhh 4444 aaaa    zzzz 111 222 hhhh 333 4444 aaaa    zzzz 111 hhhh 222 333 4444 aaaa    zzzz hhhh 111 222 333 4444 aaaa
    
    
    Range of 8 WORDS  aaaa 111 222 333 444 555 hhhh zzzz    aaaa 111 222 333 444 hhhh 555 zzzz    aaaa 111 222 333 hhhh 444 555 zzzz    aaaa 111 222 hhhh 333 444 555 zzzz    aaaa 111 hhhh 222 333 444 555 zzzz    aaaa hhhh 111 222 333 444 555 zzzz
    Range of 8 WORDS  aaaa 111 222 333 444 555 zzzz hhhh    aaaa 111 222 333 444 zzzz 555 hhhh    aaaa 111 222 333 zzzz 444 555 hhhh    aaaa 111 222 zzzz 333 444 555 hhhh    aaaa 111 zzzz 222 333 444 555 hhhh    aaaa zzzz 111 222 333 444 555 hhhh
    Range of 8 WORDS  hhhh 111 222 333 444 555 aaaa zzzz    hhhh 111 222 333 444 aaaa 555 zzzz    hhhh 111 222 333 aaaa 444 555 zzzz    hhhh 111 222 aaaa 333 444 555 zzzz    hhhh 111 aaaa 222 333 444 555 zzzz    hhhh aaaa 111 222 333 444 555 zzzz
    Range of 8 WORDS  hhhh 111 222 333 444 555 zzzz aaaa    hhhh 111 222 333 444 zzzz 555 aaaa    hhhh 111 222 333 zzzz 444 555 aaaa    hhhh 111 222 zzzz 333 444 555 aaaa    hhhh 111 zzzz 222 333 444 555 aaaa    hhhh zzzz 111 222 333 444 555 aaaa
    Range of 8 WORDS  zzzz 111 222 333 444 555 aaaa hhhh    zzzz 111 222 333 444 aaaa 555 hhhh    zzzz 111 222 333 aaaa 444 555 hhhh    zzzz 111 222 aaaa 333 444 555 hhhh    zzzz 111 aaaa 222 333 444 555 hhhh    zzzz aaaa 111 222 333 444 555 hhhh
    Range of 8 WORDS  zzzz 111 222 333 444 555 hhhh aaaa    zzzz 111 222 333 444 hhhh 555 aaaa    zzzz 111 222 333 hhhh 444 555 aaaa    zzzz 111 222 hhhh 333 444 555 aaaa    zzzz 111 hhhh 222 333 444 555 aaaa    zzzz hhhh 111 222 333 444 555 aaaa
    
    
    111 aaaa 222 333 hhhh 444 555 zzzz
    111 aaaa 222 333 zzzz 444 555 hhhh
    111 hhhh 222 333 aaaa 444 555 zzzz
    111 hhhh 222 333 zzzz 444 555 aaaa
    111 zzzz 222 333 aaaa 444 555 hhhh
    111 zzzz 222 333 hhhh 444 555 aaaa
    
    
           aaaa 111 222 333 hhhh 444 zzzz 555
           aaaa 111 222 333 zzzz 444 hhhh 555
           hhhh 111 222 333 aaaa 444 zzzz 555
           hhhh 111 222 333 zzzz 444 aaaa 555
           zzzz 111 222 333 aaaa 444 hhhh 555
           zzzz 111 222 333 hhhh 444 aaaa 555
    
    111 aaaa 222 333 hhhh 444 zzzz 555
    111 aaaa 222 333 zzzz 444 hhhh 555
    111 hhhh 222 333 aaaa 444 zzzz 555
    111 hhhh 222 333 zzzz 444 aaaa 555
    111 zzzz 222 333 aaaa 444 hhhh 555
    111 zzzz 222 333 hhhh 444 aaaa 555
    
    =======================================================================================================================
    

    Best Regards,

    guy038



  • hello @guy038 @PeterJones and everyone.

    @guy038 Your regex (?=(mother|beautiful|car)\h(?:\w+\h){1,6}(?!\1)(mother|beautiful|car)) only places the cursor; it doesn’t select the text from WORD 1 through WORD 2 to WORD 3 (mother ... beautiful ... car)



  • Hi @vasile-caraus,

    You said :

    Your regex (?=(mother|beautiful|car)\h(?:\w+\h){1,6}(?!\1)(mother|beautiful|car)) only places the cursor; it doesn’t select the text from WORD 1 through WORD 2 to WORD 3 (mother … beautiful … car)

    But this is NOT my regex !!! It is just ONE of the conditions of the overall regex, used to validate a match attempt which respects all the hypotheses !

    The EXACT regex is, indeed :

    (?x)                                                                     # FREE-SPACING Mode 
    (?i-s)                                                                   # Search INSENSITIVE to CASE and LIMITED to a SINGLE line, at a time
    (?=.*mother)(?=.*beautiful)(?=.*car)                                     # IF EACH KEY word exists, further on, at least ONCE, in CURRENT line
    .*?\K                                                                    #   LEADING chars are NOT considered ( due to \K )
    (?=(mother|beautiful|car)\h(?:\w+\h){1,6}(?!\1)(mother|beautiful|car))   #   IF [KEY_word_#1 + (any WORD + BLANK char) from 1 to 6 TIMES + KEY_word_#2]
                                                                             #      Start of OVERALL match                     
    \1\h                                                                     #      KEY_word_#1 with its BLANK char
    ((?:(?!(?1)).)+?\h)?                                                     #      The SHORTEST POSSIBLE range of chars DIFFERENT from ALL KEY words, with its BLANK char, till ...
    (?!\1|\2)(?1)\h                                                          #      KEY_WORD_#3, with its BLANK char, DIFFERENT from KEY_word_#1 and KEY_word_#2
    ((?:(?!(?1)).)+?\h)?                                                     #      The SHORTEST POSSIBLE range of chars DIFFERENT from ALL KEY words, with its BLANK char, till ...
    \2                                                                       #      Key word #2
    

    So :

    • Open the file to test

    • Make a normal selection of all the text between the strings (?x) and \2, right above

    • Open the Find or Mark dialog ( with Ctrl + F or Ctrl + M )

    => This selected text ( the overall regex ) should be inserted in the Find what: field

    • If the In selection option is ticked, just untick it

    • Click on the Find Next or Mark button

    Et voilà !

    BR

    guy038



  • thanks @guy038



  • Hi, @vasile-caraus, @peterjones and All,

    Vasile, forget everything about my previous attempt ! I was mistaken in many ways :-((

    So, I assume :

    • Three KEY-words ( you , to and and ) which must all be present, at least once, in current line

    • The range to match must contain these three KEY-words in any order

    • The range between the first and the last KEY-word, included, must not contain more than 8 words

    • No KEY-word exists between the first and second KEY-words found NOR between the second and third KEY-words


    Here is the correct and definitive search regex, again, with the free-spacing mode !

    (?xi-s)                                                                 # FREE-SPACING mode and search INSENSITIVE to CASE and LIMITED to a SINGLE line, at a time
    (?=\b(you|to|and)[^\w\r\n]+(?:\w+[^\w\r\n]+){0,6}?(?!\1)(you|to|and)\b) # IF KEY_word_#1 + NON-WORD char(s) + [any WORD + NON-WORD char(s)] from 0 to 6 TIMES + KEY_word_#2, DIFFERENT from KEY_word_#1
    (?=\b\1[^\w\r\n]+(?:\w+[^\w\r\n]+){0,6}?(?!\1|\2)(?1)\b)                # IF KEY_word_#1 + NON-WORD char(s) + [any WORD + NON-WORD char(s)] from 0 to 6 TIMES + KEY_word_#3, DIFFERENT from KEY_word_#1 and KEY_word_#2
    \b\1[^\w\r\n]+                                                          # Whole KEY_word_#1 ( Stored as GROUP 1 ) + NON-WORD CHARS
    (?:(?:(?!(?1)).)+?[^\w\r\n]+)?                                          # The SHORTEST POSSIBLE range of chars, NOT containing any KEY-words (?1) + NON-WORD char(s), till ...
    (?!\1)(you|to|and)[^\w\r\n]+                                            # A whole KEY-word, DIFFERENT from KEY_word_#1 ( Stored as GROUP 3 ) + NON-WORD char(s)
    (?:(?:(?!(?1)).)+?[^\w\r\n]+)?                                          # The SHORTEST POSSIBLE range of chars, NOT containing any KEY-words (?1) + NON-WORD char(s), till ...
    (?!\1|\3)(?1)\b                                                         # A whole KEY-word, DIFFERENT from GROUP 1 ( KEY_word_#1 ) and GROUP 3 ( KEY_word_#2 )
    

    Notes :

    • You may change the maximum of the lazy quantifiers {0,6}?, in the two locations, to {0,12}?, for instance, in order to increase the length of the range, in words

    • Note that, if you want a maximum of N words between the first and the last KEY-word, included, you need to use the {0,N-2}? quantifier !

    • If you change some of the words, while still keeping 3 distinct words, replace the part you|to|and in the three locations of that regex ( lines 2 and 6 )
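The notes above can be sketched as a small Python helper that builds the regex for any three KEY-words and any maximum N. This is an adaptation rather than the exact Notepad++ regex, under these assumptions: Python's built-in `re` module has no `(?1)` call, so the alternation text is interpolated instead, and the modes are passed as `re.X | re.I` flags. The name `three_keyword_pattern` is made up for this illustration.

```python
import re

def three_keyword_pattern(keys, max_words=8):
    """Build the three-KEY-word regex for a range of at most max_words words."""
    alt = "|".join(re.escape(key) for key in keys)  # e.g. you|to|and
    gap = r"[^\w\r\n]+"                             # NON-WORD char(s), same line
    n = max_words - 2                               # the {0,N-2}? rule
    return re.compile(rf"""
        (?=\b({alt}){gap}(?:\w+{gap}){{0,{n}}}?(?!\1)({alt})\b)
        (?=\b\1{gap}(?:\w+{gap}){{0,{n}}}?(?!\1|\2)(?:{alt})\b)
        \b\1{gap}
        (?:(?:(?!{alt}).)+?{gap})?
        (?!\1)({alt}){gap}
        (?:(?:(?!{alt}).)+?{gap})?
        (?!\1|\3)(?:{alt})\b
    """, re.X | re.I)

if __name__ == "__main__":
    text = "you want to go and see"
    for limit in (8, 5, 4):
        print(limit, three_keyword_pattern(["you", "to", "and"], limit).search(text))
```

In the sample text, the range from you to and holds 5 words, so a limit of 5 or more finds it and a limit of 4 does not.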


    Now, in order to test it, as I did :

    • Open the license.txt file

    • Open the Find dialog ( Ctrl + F )

      • Select the Normal search mode

      • Tick the Match whole word only option

      • Untick the Match case option

      • Close the Find dialog ( ESC )

    • Select any word you

    • Run the Search > Mark All > Using 1st Style command

    • Select any word to

    • Run the Search > Mark All > Using 2nd Style command

    • Select any word and

    • Run the Search > Mark All > Using 3rd Style command

    Now, you can test the regex above against the license.txt file and try also with, for instance, the lazy quantifiers {0,9}? or {0,12}?, which, remember, must be changed in two locations of the regex !

    REMINDER :

    • Select all the regex from (?xi-s) to (?!\1|\3)(?1)\b

    • Open the Find dialog ( Ctrl + F )

    => All this multi-lines text should be inserted in the Find what: field


    Now, here is another regex, much simpler, which can be used if your goal is to search for two words only, within a maximum range of words !

    So, I assume :

    • Two KEY-words ( you and and ) which must both be present, at least once, in current line

    • The range to match must contain these two KEY-words in any order

    • The range between the first and the last KEY-word, included, must not contain more than 8 words

    • No KEY-word exists between the first and the last KEY-words found


    Here is this second search regex, using the free-spacing mode :

    (?xi-s)                          # FREE-SPACING mode and search INSENSITIVE to CASE and LIMITED to a SINGLE line, at a time
    \b(you|and)[^\w\r\n]+            # Whole KEY_word_#1 ( Stored as GROUP 1 ) + NON-WORD CHARS
    (?:(?!(?1))\w+[^\w\r\n]+){0,6}?  # Any word, NOT beginning with any KEY-word (?1) + NON-WORD char(s), REPEATED between 0 and SIX times
    (?!\1)(?1)\b                     # A whole KEY-word, DIFFERENT from GROUP 1 ( KEY_word_#1 )
    

    You may test this regex against the license.txt file and change the words in you|and and/or the maximum, testing, for instance, the {0,9}? or {0,12}? values !
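A Python sketch of this two-word regex, for trying other words or limits quickly. The same assumptions apply as before: the `(?1)` call is replaced by the interpolated alternation text, because Python's built-in `re` module does not support it, and the modes are passed as flags. The name `two_keyword_pattern` is made up for this illustration.

```python
import re

def two_keyword_pattern(keys, max_words=8):
    """Build the two-KEY-word regex for a range of at most max_words words."""
    alt = "|".join(re.escape(key) for key in keys)  # e.g. you|and
    gap = r"[^\w\r\n]+"                             # NON-WORD char(s), same line
    n = max_words - 2                               # words allowed in between
    return re.compile(rf"""
        \b({alt}){gap}
        (?:(?!{alt})\w+{gap}){{0,{n}}}?
        (?!\1)(?:{alt})\b
    """, re.X | re.I)

if __name__ == "__main__":
    pattern = two_keyword_pattern(["you", "and"])
    for line in ("and then you", "you you and", "you a b c d e f g and"):
        print(repr(line), "->", pattern.search(line))
```

In the second sample, the match starts at the second you, because the range may not contain a KEY-word between its two ends.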

    Best Regards,

    guy038



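  • Very nice, @guy038. These two pieces of regex, extracted from one of the regexes in your explanation, are useful on their own:

    ([^\w\r\n]+) - Selects each run of spaces and signs ( anything that is neither a word character nor a line break ) on each line

    (.*?[^\w\r\n]) - Selects everything up to and including the next such character; repeated along a line, it covers everything except the last word of the line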
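A short Python check of what these two pieces match. Both use only syntax that Python's built-in `re` module shares with Notepad++; the sample strings are made up for this illustration, and the capturing parentheses are dropped so that `findall` returns the whole matches.

```python
import re

# Each run of characters that are neither word characters nor line breaks.
separators = re.findall(r"[^\w\r\n]+", "a, b - c")

# Lazily, everything up to and including the next non-word character.
chunks = re.findall(r".*?[^\w\r\n]", "one two three")

if __name__ == "__main__":
    print(separators)
    print(chunks)
```

Joined together, the chunks of the second search give the line without its last word, as long as the line ends with a word.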

