    Regex: Find words between words

    • Vasile Caraus

      I have a gift for my mother. I want a nice book for myself. I love a beautiful woman for her eyes.

      So, as you can see, I want to find every expression that starts with a and ends with for, with no more than 6 words between them.

      I made a regex, but it is not so good.

      \bA[\s\S]*?\w+{1,6}\bFOR[\s\S]*?

      • PeterJones

        @Vasile-Caraus ,

        First, may I say, thank you for showing example data and what you had previously tried. That makes it so much easier to help you.

        I’ve got a solution that I think will work: (?i)\bA\b(?:\s*\b\w+\b\s*){1,6}\bFOR\b See it described at https://regexr.com/3uuht.
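As a quick check of that pattern against the sample sentences from the question, here is a minimal sketch. It assumes Python's `re` module as a stand-in for Notepad++'s regex engine, so treat it as an approximation rather than a statement about Notepad++ itself; the names `PATTERN`, `SAMPLE` and `matches` are only illustrative.

```python
import re

# Pattern copied from the post. Python's `re` is used here only as a
# stand-in for Notepad++'s engine (an assumption of this sketch).
PATTERN = re.compile(r"(?i)\bA\b(?:\s*\b\w+\b\s*){1,6}\bFOR\b")

# Sample text from the original question.
SAMPLE = (
    "I have a gift for my mother. I want a nice book for myself. "
    "I love a beautiful woman for her eyes."
)

# Collect the full text of every match, in order of appearance.
matches = [m.group(0) for m in PATTERN.finditer(SAMPLE)]
print(matches)
```

Under that assumption, each sentence should contribute one match that starts at the standalone "a" and ends at "for".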

        In your sequence, the problems I see:

        • [\s\S]*? means a non-greedy selection of 0 or more spaces or non-spaces – that means it non-greedy matches anything. I am not sure that’s what you really intended.
        • \w+{1,6}: one or more word characters (from \w+), followed by a {1,6} that has nothing to repeat, because a quantifier cannot directly follow another quantifier. This is actually an error in the regex, and probably will cause the regex to do nothing (NPP Find says “invalid regular expression”). If you want the {1,6} to apply to the groupings of one-or-more word characters, you have to parenthesize around the \w+, as shown in my version.

        I fixed those in my version.
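The stacked-quantifier problem is easy to reproduce outside Notepad++. The sketch below assumes Python's `re` module, which is a different engine but also refuses a quantifier placed directly on another quantifier; the helper name `compiles` is hypothetical.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's `re` accepts the pattern, False otherwise."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# {1,6} placed directly after + has nothing valid to repeat.
print(compiles(r"\w+{1,6}"))

# Wrapping \w+ in a group gives {1,6} something to repeat.
print(compiles(r"(?:\w+){1,6}"))

# The full pattern from the question contains the same stacked quantifier.
print(compiles(r"\bA[\s\S]*?\w+{1,6}\bFOR[\s\S]*?"))
```

Other regex engines may word the error differently, so the exact message seen in Notepad++ is not reproduced here.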

        I also added a couple more word boundaries, just to be explicit, and to prevent some false matches. Depending on what you want, you may or may not need them.

        Caveats in my interpretation:

        • (?i): I was explicit about case insensitive (otherwise A and FOR would not match a and for)
        • I wanted to make it match a day for, but not an evening for, because you said “a”. If you want it to be able to start with a or an, but not and, use \bAN?\b
        • Since you said “for”, I assumed you wanted to match a day for, but not a day to go forth nor a day to go forward. If you want the latter as well, then \bFOR\w*\b
        • You weren’t explicit as to whether you wanted the space after “for” to be included in the match or not (your final [\s\S]*? kind of hinted you do, but I wasn’t sure). If you do, then \bFOR\b\s+
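The caveats above amount to swapping the start or end anchor of the pattern. A minimal sketch of those variants follows, again assuming Python's `re` as a stand-in for Notepad++. The helper `build` and the constant `MIDDLE` are hypothetical names, and `AN?` is used for the a/an variant as a spelling that accepts exactly "a" or "an".

```python
import re

# The "1 to 6 words" part shared by every variant.
MIDDLE = r"(?:\s*\b\w+\b\s*){1,6}"

def build(start: str = r"\bA\b", end: str = r"\bFOR\b") -> "re.Pattern[str]":
    """Assemble a case-insensitive pattern from a start and an end anchor."""
    return re.compile("(?i)" + start + MIDDLE + end)

base = build()
print(bool(base.search("a day for")))          # standalone "a" ... "for"
print(bool(base.search("an evening for")))     # "an" is rejected by \bA\b
print(bool(base.search("a day to go forth")))  # "forth" is rejected by \bFOR\b

# Variant: allow "a" or "an" at the start (but still not "and").
a_or_an = build(start=r"\bAN?\b")
print(bool(a_or_an.search("an evening for")))

# Variant: allow words that merely begin with "for".
for_prefix = build(end=r"\bFOR\w*\b")
print(for_prefix.search("a day to go forth").group(0))

# Variant: include the whitespace after "for" in the match.
with_space = build(end=r"\bFOR\b\s+")
print(repr(with_space.search("a day for you").group(0)))
```

Keeping the middle part in one place makes it easier to see that only the anchors change between variants.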
        • guy038

          Hello @vasile-caraus, @peterjones and All,

          Why not this regex :

          a\s+(\w+\s+){1,6}?for

          OR

          a\h+(\w+\h+){1,6}?for

          With the sample text below, it matches only in sentences 2 to 7!

          1 : If was a for sale ( Incorrect sentence ! )
          2 : It was a house for sale.
          3 : It was a small house for sale.
          4 : It was a small old house for sale.
          5 : It was a very small old house for sale.
          6 : It was a very small old green house for sale.
          7 : It was a very small old green wooden house for sale.
          8 : It was a very small old green wooden house designed for sale.
          9 : It was a very small old green wooden house not designed for sale.
          

          With a single long line of text, joining lines 1 to 9, it works nicely too!

          If was a for sale ( Incorrect sentence ! ) It was a house for sale. It was a small house for sale. It was a small old house for sale. It was a very small old house for sale. It was a very small old green house for sale. It was a very small old green wooden house for sale. It was a very small old green wooden house designed for sale. It was a very small old green wooden house not designed for sale.
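The claim about sentences 2 to 7 can be checked with a short script. This is a sketch that assumes Python's `re` in place of Notepad++; since Python's `re` has no `\h`, the second pattern uses `[ \t]` as a rough stand-in for horizontal whitespace, which is an assumption and not an exact equivalent. The sentences are tested without their "1 :" style prefixes.

```python
import re

# First pattern from the post, unchanged.
PATTERN = re.compile(r"a\s+(\w+\s+){1,6}?for")

# \h is not supported by Python's `re`; [ \t] approximates it here.
PATTERN_H = re.compile(r"a[ \t]+(\w+[ \t]+){1,6}?for")

SENTENCES = [
    "If was a for sale ( Incorrect sentence ! )",
    "It was a house for sale.",
    "It was a small house for sale.",
    "It was a small old house for sale.",
    "It was a very small old house for sale.",
    "It was a very small old green house for sale.",
    "It was a very small old green wooden house for sale.",
    "It was a very small old green wooden house designed for sale.",
    "It was a very small old green wooden house not designed for sale.",
]

# 1-based numbers of the sentences in which each pattern finds a match.
matching = [i for i, s in enumerate(SENTENCES, start=1) if PATTERN.search(s)]
matching_h = [i for i, s in enumerate(SENTENCES, start=1) if PATTERN_H.search(s)]
print(matching, matching_h)

# The same sentences joined into one long line.
joined = " ".join(SENTENCES)
joined_count = sum(1 for _ in PATTERN.finditer(joined))
print(joined_count)
```

Note that these patterns have no word boundaries, so in other text the leading `a` could also be the last letter of a longer word, and `for` could be the start of one.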
          

          Best Regards,

          guy038

          • Vasile Caraus

            thank you

            also,

            a(\W+\w+){1,6}\W+for
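One difference worth noting in this last variant is that `\W+` accepts any run of non-word characters between words, so punctuation such as a comma does not break the match the way `\s+` does. The sketch below compares the two, assuming Python's `re` as a stand-in for Notepad++; the names `VASILE` and `GUY` are only labels for the two patterns from the thread.

```python
import re

# Pattern from this post: words separated by any non-word characters.
VASILE = re.compile(r"a(\W+\w+){1,6}\W+for")

# Pattern from the previous post: words separated by whitespace only.
GUY = re.compile(r"a\s+(\w+\s+){1,6}?for")

text = "It was a small, old house for sale."

# The comma after "small" is consumed by \W+ but not by \s+.
print(VASILE.search(text).group(0))
print(GUY.search(text))

# At least one word is still required between "a" and "for".
print(VASILE.search("a for sale"))

# Without \b, the leading "a" can be the last letter of another word.
print(VASILE.search("banana split for dessert").group(0))
```

If the match should only start at the standalone word "a", adding `\b` on both sides of `a` and `for`, as in the earlier reply, would avoid the last case.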
