    Regex: Find words that contain html but have no dot (.) before html

    • Hellena Crainicu

      Hello, I need to find the words that contain html but do not have a dot (.) before html. See the lines below:

      1. I go home now.html" and then
      2. Take me with you nowhtml" and then
      3. asfa asdfdsa
      4. <!DOCTYPE html>
      5. <!--<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" >
      6. <html lang="en-US" xmlns:og="schema/">

      The output should be line 2, because it contains WORD+HTML without the dot (.HTML):

      Take me with you nowhtml" and then

      I made a regex, but it is not very good, because it finds every line that contains html without .HTML (lines 2, 4, 5 and 6), not just line 2:

      SEARCH: ^(?=.*html)(?:(?!\w+\.html).)+$
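To see which lines this pattern accepts, here is a minimal sketch that runs it over the six sample lines. It assumes Python's `re` module as a stand-in for the Notepad++ regex engine (Boost), which can differ in details; the names `SAMPLE_LINES`, `PATTERN` and `matching_line_numbers` are made up for illustration.

```python
import re

# The six sample lines from the post (the list numbering itself is not part of the text).
SAMPLE_LINES = [
    'I go home now.html" and then',
    'Take me with you nowhtml" and then',
    'asfa asdfdsa',
    '<!DOCTYPE html>',
    '<!--<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" >',
    '<html lang="en-US" xmlns:og="schema/">',
]

# The pattern from the post: the line must contain "html", and at no position
# may a run of word characters followed by ".html" begin.
PATTERN = re.compile(r'^(?=.*html)(?:(?!\w+\.html).)+$')


def matching_line_numbers(lines):
    """Return the 1-based numbers of the lines the pattern matches."""
    return [n for n, line in enumerate(lines, start=1) if PATTERN.search(line)]


matched = matching_line_numbers(SAMPLE_LINES)
print(matched)  # [2, 4, 5, 6]
```

The pattern only rules out `.html`; it never requires a word character directly before `html`, which is why lines 4, 5 and 6 get through as well.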

      • Alan Kilborn @Hellena Crainicu

        @hellena-crainicu

        You seem to ask a lot of regex questions.
        You’d be better served by really studying regex and solving your own problems.
        Likely soon this forum will quit just handing you the answers.

        But…

        You might try (?-s)^.*?\whtml.*, but it will match line 5 in addition to line 2. If this is not desired, you have to be more specific about what you need.
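A rough way to check this claim outside the editor is sketched below. It assumes Python's `re` module rather than the Notepad++ engine: Python does not accept a bare `(?-s)`, so the sketch drops it and relies on Python's default, where `.` does not match a newline, and uses `re.MULTILINE` so that `^` anchors at every line start. The variable names are illustrative only.

```python
import re

# The six sample lines from the first post, joined into one document.
SAMPLE_TEXT = "\n".join([
    'I go home now.html" and then',
    'Take me with you nowhtml" and then',
    'asfa asdfdsa',
    '<!DOCTYPE html>',
    '<!--<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" >',
    '<html lang="en-US" xmlns:og="schema/">',
])

# Python equivalent (assumed) of (?-s)^.*?\whtml.* :
# a whole line that contains "html" immediately preceded by a word character.
PATTERN = re.compile(r'^.*?\whtml.*', re.MULTILINE)

matched_lines = [m.group(0) for m in PATTERN.finditer(SAMPLE_TEXT)]
for line in matched_lines:
    print(line)
```

Line 5 is picked up because of `xhtml` inside the namespace URL: the `x` is a word character sitting directly before `html`.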

        • Hellena Crainicu @Alan Kilborn

          @alan-kilborn

          (?-s)^.*?\whtml.*

          your regex is almost good, but it finds 2 lines, instead of one:

          Take me with you nowhtml" and then

          and

          <!--<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" >

          THE OUTPUT must be only:

          Take me with you nowhtml" and then

          P.S. I ask because my solution is not too good, and I am not an expert on regex. But as you can see, it is not easy to find the solution to this problem.

          • Alan Kilborn @Hellena Crainicu

            @hellena-crainicu said:

            your regex is almost good, but it finds 2 lines, instead of one:

            Yes. Did you read where I wrote?:

            it will match line 5 in addition to line 2. If this is not desired, you have to be more specific about what you need

            • Robin Cruise

              try this regex:

              SEARCH: (?:^|\h)\w+html"

              This will match your request.

              But there can also be a sixth case that you missed. Suppose you have a link such as: https://mywebsite.com/prince-is-my-fatherhtml

              So, in this case, the [dot] before html is missing. But I don’t know how to handle this situation…
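The behaviour described here can be sketched as follows, assuming Python's `re` module instead of the Notepad++ engine. Python has no `\h`, so the sketch substitutes `[ \t]` (space or tab), which is an approximation of "horizontal whitespace"; `LINES`, `PATTERN` and `hits` are illustrative names.

```python
import re

# The six sample lines plus the link from this post as a seventh line.
LINES = [
    'I go home now.html" and then',
    'Take me with you nowhtml" and then',
    'asfa asdfdsa',
    '<!DOCTYPE html>',
    '<!--<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" >',
    '<html lang="en-US" xmlns:og="schema/">',
    'https://mywebsite.com/prince-is-my-fatherhtml',
]

# Assumed Python equivalent of (?:^|\h)\w+html" :
# a word at line start or after a space/tab that ends in html followed by a quote.
PATTERN = re.compile(r'(?:^|[ \t])\w+html"')

hits = {}
for number, line in enumerate(LINES, start=1):
    match = PATTERN.search(line)
    if match:
        # Strip the leading space that the (?:^|[ \t]) group may have consumed.
        hits[number] = match.group(0).strip()

print(hits)  # {2: 'nowhtml"'}
```

Only line 2 is found. The link on line 7 is skipped because the word at the start of the line is `https`, which is followed by `:` rather than `html"`, and the link has no closing quote.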

              • Robin Cruise @Robin Cruise

                OK, try this solution for the sixth case. It matches lines that contain https:// and no .html, so it does not cover line 2 of your examples:

                SEARCH: ^(?=.*https://)(?:(?!\.html).)+$
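A quick check of this pattern is sketched below, again assuming Python's `re` module as a stand-in for the Notepad++ engine. The last entry in `LINES` is a hypothetical well-formed link added only for comparison; it does not appear in the thread.

```python
import re

LINES = [
    'I go home now.html" and then',
    'Take me with you nowhtml" and then',
    'asfa asdfdsa',
    '<!DOCTYPE html>',
    '<!--<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" >',
    '<html lang="en-US" xmlns:og="schema/">',
    'https://mywebsite.com/prince-is-my-fatherhtml',
    'https://mywebsite.com/some-page.html',  # hypothetical, for comparison only
]

# The line must contain "https://" and must not contain ".html" anywhere.
PATTERN = re.compile(r'^(?=.*https://)(?:(?!\.html).)+$')

matched = [n for n, line in enumerate(LINES, start=1) if PATTERN.search(line)]
print(matched)  # [7]
```

In this sketch the pattern finds the link with the missing dot and skips the well-formed one, but it also skips line 2, because none of the six original sample lines contains `https://` (line 5 uses `http://`).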

                The Community of users of the Notepad++ text editor.