    Regex: Find words that contain html but have no dot before html

    • Hellena Crainicu

      Hello, I need to find words that contain html but don't have a dot (.) before html. See the lines below:

      1. I go home now.html" and then
      2. Take me with you nowhtml" and then
      3. asfa asdfdsa
      4. <!DOCTYPE html>
      5. <!--<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" >
      6. <html lang="en-US" xmlns:og="schema/">

      The output should be line 2, because it contains WORD+html without a dot before html:

      Take me with you nowhtml" and then

      I made a regex, but it is not very good, because it finds every line that contains html without .html:

      SEARCH: ^(?=.*html)(?:(?!\w+\.html).)+$
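
To see why this pattern over-matches, here is a minimal sketch that runs it against the six sample lines with Python's `re` module. It assumes the leading `1.` to `6.` are list numbering rather than part of the text, and that Python's engine behaves like Notepad++'s for this particular pattern; `matching_line_numbers` is a hypothetical helper name.

```python
import re

# Sample lines from the post, without the list numbers (assumption).
LINES = [
    'I go home now.html" and then',
    'Take me with you nowhtml" and then',
    'asfa asdfdsa',
    '<!DOCTYPE html>',
    '<!--<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" >',
    '<html lang="en-US" xmlns:og="schema/">',
]

# The pattern from the post, unchanged.
PATTERN = re.compile(r'^(?=.*html)(?:(?!\w+\.html).)+$')


def matching_line_numbers(pattern, lines):
    """Return the 1-based numbers of the lines the pattern matches."""
    return [n for n, line in enumerate(lines, start=1) if pattern.search(line)]


print(matching_line_numbers(PATTERN, LINES))  # [2, 4, 5, 6]
```

The lookahead only requires `html` somewhere in the line, and the tempered dot only rejects `word.html`, so any line with a bare `html` passes, not just lines where a word character sits directly before `html`.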

      • Alan Kilborn

        @hellena-crainicu

        You seem to ask a lot of regex questions.
        You’d be better served by really studying regex and solving your own problems.
        Likely soon this forum will quit just handing you the answers.

        But…

        You might try (?-s)^.*?\whtml.*, but it will match line 5 in addition to line 2. If this is not desired, you have to be more specific about what you need.
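
The behaviour described above can be checked with a short sketch in Python. Two adaptations are assumptions on my part: Python's `re` does not accept a global `(?-s)`, so it is dropped (the dot excludes newlines by default there), and `re.MULTILINE` stands in for Notepad++ matching `^` at every line start. The sample lines are taken without their list numbers.

```python
import re

# Sample lines from the question, without the list numbers (assumption).
LINES = [
    'I go home now.html" and then',
    'Take me with you nowhtml" and then',
    'asfa asdfdsa',
    '<!DOCTYPE html>',
    '<!--<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" >',
    '<html lang="en-US" xmlns:og="schema/">',
]
TEXT = "\n".join(LINES)

# (?-s) removed: in Python '.' does not match a newline unless re.DOTALL is set.
PATTERN = re.compile(r'^.*?\whtml.*', re.MULTILINE)

# Each match spans one whole line that has a word character right before "html".
matches = [m.group(0) for m in PATTERN.finditer(TEXT)]
for line in matches:
    print(line)
```

Line 5 is caught because of the `x` in `xhtml`, which is a word character immediately before `html`.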

        • Hellena Crainicu

          @alan-kilborn

          (?-s)^.*?\whtml.*

          your regex is almost good, but it finds 2 lines, instead of one:

          Take me with you nowhtml" and then

          and

          <!--<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" >

          The output must be only:

          Take me with you nowhtml" and then

          P.S. I ask because my solution is not too good, and I am not an expert on regex. But as you can see, it is not easy to find a solution to this problem.

          • Alan Kilborn

            @hellena-crainicu said:

            your regex is almost good, but it finds 2 lines, instead of one:

            Yes. Did you read what I wrote?

            it will match line 5 in addition to line 2. If this is not desired, you have to be more specific about what you need

            • Robin Cruise

              Try this regex:

              SEARCH: (?:^|\h)\w+html"

              This should match what you asked for.

              But there can also be a sixth case, which you missed. Suppose you have a link such as: https://mywebsite.com/prince-is-my-fatherhtml

              So, in this case, the [dot] before html is missing. But I don’t know how to handle this situation…
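
Both points can be illustrated with a rough Python sketch. Python's `re` has no `\h`, so `[ \t]` is used as an approximate stand-in (an assumption, since `\h` covers more horizontal whitespace than that); `hits` is a hypothetical helper, and the sample lines are taken without their list numbers.

```python
import re

# Sample lines from the question, without the list numbers (assumption).
SAMPLE_LINES = [
    'I go home now.html" and then',
    'Take me with you nowhtml" and then',
    'asfa asdfdsa',
    '<!DOCTYPE html>',
    '<!--<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" >',
    '<html lang="en-US" xmlns:og="schema/">',
]
URL_LINE = 'https://mywebsite.com/prince-is-my-fatherhtml'

# [ \t] replaces \h, which Python's re does not support.
PATTERN = re.compile(r'(?:^|[ \t])\w+html"')


def hits(line):
    """Return every substring of the line that the pattern matches."""
    return [m.group(0) for m in PATTERN.finditer(line)]


per_line = {n: hits(line) for n, line in enumerate(SAMPLE_LINES, start=1)}
print(per_line)        # only line 2 has a hit: ' nowhtml"'
print(hits(URL_LINE))  # [] because the URL has no closing quote after html
```

Note that the match is the word (with its leading space), not the whole line, and that the pattern depends on a `"` directly after `html`, which is why the URL without a dot slips through.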

              • Robin Cruise

                OK, try this solution for the sixth case (a line that contains https:// but no .html):

                SEARCH: ^(?=.*https://)(?:(?!\.html).)+$
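
A quick check in Python, assuming its `re` engine treats this pattern the same way as Notepad++ and that the sample lines carry no list numbers, suggests what the pattern actually selects: lines that contain `https://` and never contain `.html`.

```python
import re

# The six sample lines from the question plus the URL example (no list numbers).
LINES = [
    'I go home now.html" and then',
    'Take me with you nowhtml" and then',
    'asfa asdfdsa',
    '<!DOCTYPE html>',
    '<!--<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" >',
    '<html lang="en-US" xmlns:og="schema/">',
    'https://mywebsite.com/prince-is-my-fatherhtml',
]

# The pattern from the post, unchanged.
PATTERN = re.compile(r'^(?=.*https://)(?:(?!\.html).)+$')

# Keep only the lines the pattern matches in full.
selected = [line for line in LINES if PATTERN.search(line)]
print(selected)  # ['https://mywebsite.com/prince-is-my-fatherhtml']
```

Under these assumptions the URL is found, but line 2 is not, because it has no `https://`. Line 5 is skipped as well, since it contains `http://` rather than `https://`. Covering both situations would likely need this search in addition to one of the earlier patterns.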
