    Regex: Find only one line, from 2 similar lines (html tags)

    15 Posts 4 Posters 608 Views
    • astrosofistaA
      astrosofista @Robin Cruise
      last edited by

      @Robin-Cruise said in Regex: Find only one line, from 2 similar lines (html tags):

      <meta name="description" content=.*(( the | that | of ).*){3,}.*>

      I think it’s easy to understand why the first line isn’t matched.

      <meta name="description" content="the mystery of the art that seeks its meaning.">
      

      The first “the” is not matched because it lacks a space before it. Then " of " is matched (notice the spaces surrounding it), but the second “the” isn’t matched, again because it lacks a space before it: the previous match, " of ", already consumed the required space. Finally, " that " is matched, so you get only two matches, not the required three.

      One way to solve the issue is to remove the spaces and surround the group with \b, the word-boundary assertion, which matches a position without consuming a character. See the details in the documentation.

      Just to be clearer:

      <meta name="description" content=.*(\b(the|that|of)\b.*){3,}.*>
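
To check this outside Notepad++, here is a minimal sketch in Python. It assumes Python's re module handles these particular constructs (literal spaces, \b, {3,}) the same way as the regex engine in Notepad++; the variable names are made up for illustration.

```python
import re

# The sample line from the post above.
line = '<meta name="description" content="the mystery of the art that seeks its meaning.">'

# Space-delimited version: every term must be surrounded by literal spaces.
spaced = re.compile(r'<meta name="description" content=.*(( the | that | of ).*){3,}.*>')

# Word-boundary version: \b matches a position and consumes nothing.
bounded = re.compile(r'<meta name="description" content=.*(\b(the|that|of)\b.*){3,}.*>')

spaced_hit = spaced.search(line) is not None
bounded_hit = bounded.search(line) is not None

# List the terms each style can find without overlapping.
spaced_terms = re.findall(r' (?:the|that|of) ', line)
bounded_terms = re.findall(r'\b(?:the|that|of)\b', line)

print(spaced_hit, spaced_terms)
print(bounded_hit, bounded_terms)
```

The space-delimited version finds only two usable terms in this line, so {3,} cannot be satisfied; the word-boundary version finds four.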
      

      HTH

      • Alan KilbornA
        Alan Kilborn @astrosofista
        last edited by Alan Kilborn

        @astrofist

        I was hoping not to give too much of a “stop-all-thinking-here’s-your-solution” to the OP, a known and repetitive data manipulation “taker”. Thus my pointing to the “formula” for how to do what OP needs, with an implied “go off and try it”.

        I believe we have to continue to promote learning.
        And perhaps some day the takers actually will learn and we’ll have such noise here less and less (because they actually WILL start solving their own problems and not need to post).
        Hmmm, maybe this is wishful thinking.

        However, your info about the spaces was good.
        Regex is sensitive to such extra spaces unless the (?x) directive is used.
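
A small sketch of what (?x) changes, written in Python on the assumption that its free-spacing mode behaves like the one in Notepad++ for this simple case. The names below are only for illustration.

```python
import re

# Without (?x), the spaces inside the group are literal characters.
literal = re.compile(r'( the | that | of )')

# With (?x) (free-spacing mode), unescaped whitespace in the pattern is
# ignored, so this behaves like (the|that|of).
free_spacing = re.compile(r'(?x)( the | that | of )')

literal_bare = literal.fullmatch('of') is not None          # subject has no spaces
literal_padded = literal.fullmatch(' of ') is not None      # subject has spaces
free_bare = free_spacing.fullmatch('of') is not None
free_padded = free_spacing.fullmatch(' of ') is not None

print(literal_bare, literal_padded, free_bare, free_padded)
```

Note that (?x) only makes the pattern ignore its own whitespace; it does not add word boundaries, so \b is still needed to avoid matching inside longer words.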

          • Alan KilbornA
            Alan Kilborn @astrosofista
            last edited by

            @astrosofista

            What happened to your final a ? :-)

            • Robin CruiseR
              Robin Cruise
              last edited by

              thanks @astrosofista

              • Alan KilbornA
                Alan Kilborn @Alan Kilborn
                last edited by Alan Kilborn

                @Alan-Kilborn said in Regex: Find only one line, from 2 similar lines (html tags):

                What happened to your final a ? :-)

                I guess it is more than the final a that changed. :-)
                Or…is maybe still changing.
                Personally, I don’t like when people change their user name here, even slightly.
                It just confuses what I’m used to.
                I thought about removing the space between Alan and Kilborn and couldn’t decide conclusively if that was a good or bad idea.
                I notice that when searching for a user whose name contains a space, the user doesn’t appear in the popup suggestion list (that’s why I was considering a change to drop the space).

                • Terry RT
                  Terry R
                  last edited by

                  @Alan-Kilborn said in Regex: Find only one line, from 2 similar lines (html tags):

                  Personally, I don’t like when people change their user name here, even slightly.
                  It just confuses what I’m used to.
                  I thought about removing the space between Alan and Kilborn and couldn’t decide conclusively if that was a good or bad idea.

                  Sounds like you are in 2 minds on the matter. ;-))
                  I admit it was an issue when I first started posting trying to get the right name when typing the @. I noted just now that with your “handle” I can type @k and you come right to the top, even though the k is further down the string. So there does seem to be some intelligence with the lookup table.

                  It’s also not consistent when it allows the names with spaces against our icons, yet when referencing users the system insists on replacing spaces with -.

                  Terry

                  PS keep it as it is!

                  • astrosofistaA
                    astrosofista @Alan Kilborn
                    last edited by

                    @Alan-Kilborn

                    Yes, I am aware of OP’s behavior and in fact I believe this is the first time I have responded to one of his posts. However, I think my response was also educational, as I explained to him why his regular expression was failing. It was failing because of something simple to understand, but which for some reason eluded OP.

                    Since each term required a space, if there were two terms in a row, such as “of the”, there would have to have been two spaces between them for there to be a match. Since there were not, the regex failed.

                    The lesson here, and I hope OP will learn it and apply it from here on out, is to always be aware of the position of the reading head as it moves through the string. This would prevent a lot of trouble and frustration.
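
To make the "reading head" idea concrete, here is a rough sketch in Python (assuming its re module scans the same way as Notepad++ does for these patterns). The helper spans() and the sample strings are hypothetical; they just print where each match starts and ends.

```python
import re

def spans(pattern, text):
    """Return (start, end, matched_text) for each non-overlapping match."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, text)]

one_space = 'mystery of the art'     # "of the" separated by one space
two_spaces = 'mystery of  the art'   # "of  the" separated by two spaces

consuming = r' (?:the|that|of) '     # each term consumes both surrounding spaces
boundary = r'\b(?:the|that|of)\b'    # \b consumes nothing

# After " of " is matched, the head sits just past the space it consumed,
# so with a single space there is none left in front of "the".
consuming_one = spans(consuming, one_space)
consuming_two = spans(consuming, two_spaces)
boundary_one = spans(boundary, one_space)

print(consuming_one)
print(consuming_two)
print(boundary_one)
```

Comparing the end offset of one match with the start offset of the next shows whether a character was shared, which is exactly what goes wrong with the single space in “of the”.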

                    As for why I posted a solution, well, the explanation is also simple: I couldn’t resist :)

                    • astrosofistaA
                      astrosofista @Alan Kilborn
                      last edited by

                      @Alan-Kilborn

                      Nope, I didn’t change my nickname. I’m still astrosofista. I don’t know what could have happened.

                      • Alan KilbornA
                        Alan Kilborn
                        last edited by Alan Kilborn

                        I actually thought the OP was putting the extra spaces in for some sort of emphasis, even though they used this type of markup on it. I don’t know, posters do weird things sometimes. That’s why I didn’t even consider the spacing originally.

                        Something strange is going on.
                        While I was posting earlier, I saw your username being shown as “astrofist” and even “astrophista”! It was weird!
                        Now you are back where you belong as “astrofista”. BTW, is there any meaning to that name? Maybe you are an astrophysicist?

                        • astrosofistaA
                          astrosofista @Alan Kilborn
                          last edited by

                          @Alan-Kilborn

                          My guess is that OP used the spaces as a sort of word delimiter, but who knows.

                          astrosofista is the nick I used on Twitter for an account that was indeed about space related topics. Since I used that account to register for this forum, I left the same nick.

                          And although I like astronomy very much, I am not an astrophysicist. My academic studies are in philosophy. I have been teaching an introductory course in propositional logic and philosophy of science for twenty years. And now I am close to retirement - I will have more time to play with regex, scripting and the like.

                          The Community of users of the Notepad++ text editor.