    Regex: Find tags that contain one string but do not contain another

    • rodica F

      1. <p class="mb-40px">My nick name is Prince and <a href="https://mywebsite.com/bla.html" class="color-gege" target="_new">my real name</a> is beyond magic.</p>
          
      2. <p class="mb-40px">I love my home s< because I stay with my lovely cat.</p>
      
      3. <p class="mb-40px">Because of this book t< I cannot sleep well.</p>
      

      I want to find only the lines where the < operator appears inside the <p class="mb-40px"> </p> tag, except those lines that also contain an </a> tag.

      In my example above, the output should be line 2 (which has s<) and line 3 (which has t<).

      So, I use @guy038's generic formula: (REGION-START)+(.)+\K(FIND REGEX)(?s:(?=.*(REGION-FINAL)))

      In my case FIND: (<p class="mb-40px">)+(.)+\K(\w<)(?s:(?=.*(</p>)))

      The problem is that my regex also finds the e</a> from the first line, and I don't want to match the tags with </a>.

      Maybe @guy038 has a better GENERIC formula for this kind of problem.

      • guy038

        Hello, @rodica-f and All,

        Just consider this example :

        
        1. <p class="mb-40px">My nick name is Prince and <a href="https://mywebsite.com/bla.html" class="color-gege" target="_new">my real name </a> is z<beyond f< magic.</p>
            
        2. <p class="Test">I love my home s< because I stay with my b<lovely cat.</p>
        
        3. <p class="mb-40px">Because of this book t<I cannot a< sleep well.</p>
        

        Within this text :

        • Two tags begin with <p class="mb-40px"> and one begins with <p class="Test">

        • Each <p tag contains two < operators ( one followed by a space char, the other followed by a letter )


        So :

        • To find, in any <p... tag, any string \w< preceded by a space char, use the regex :

        SEARCH / MARK (?-si:<p class=".+?">|(?!\A)\G)(?s-i:(?!</p>).)*?\x20\K\w<

        • To find, in any <p... tag, any string \w< both preceded and followed by a space char, use the regex :

        SEARCH / MARK (?-si:<p class=".+?">|(?!\A)\G)(?s-i:(?!</p>).)*?\x20\K\w<(?=\x20)

        • To find, in the specific tag <p class="mb-40px">, any string \w< preceded by a space char, use the regex :

        SEARCH / MARK (?-si:<p class="mb-40px">|(?!\A)\G)(?s-i:(?!</p>).)*?\x20\K\w<

        • To find, in the specific tag <p class="mb-40px">, any string \w< both preceded and followed by a space char, use the regex :

        SEARCH / MARK (?-si:<p class="mb-40px">|(?!\A)\G)(?s-i:(?!</p>).)*?\x20\K\w<(?=\x20)
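For anyone who wants to check these four searches outside Notepad++, here is a minimal Python sketch. Python's built-in re module has neither \G nor \K, so this is not the same regex: it swaps in a two-step approach (first isolate each <p ...> region up to its first </p>, then look for the space + \w< inside that region). The names SAMPLE, ANY_P, MB_40 and hits are made up for this illustration.

```python
import re

# The three sample paragraphs from the post above, one per line.
SAMPLE = (
    '<p class="mb-40px">My nick name is Prince and <a href="https://mywebsite.com/bla.html" '
    'class="color-gege" target="_new">my real name </a> is z<beyond f< magic.</p>\n'
    '<p class="Test">I love my home s< because I stay with my b<lovely cat.</p>\n'
    '<p class="mb-40px">Because of this book t<I cannot a< sleep well.</p>\n'
)

ANY_P = r'<p class=".+?">'        # any <p class="..."> opening tag
MB_40 = r'<p class="mb-40px">'    # only the specific tag


def hits(text, open_tag, trailing_space=False):
    # Step 1: the opening tag, then every character up to the first </p>.
    # The dot matches newlines only inside the (?s:...) group.
    region = re.compile(rf'(?:{open_tag})((?s:(?:(?!</p>).)*))')
    # Step 2: inside a region, a space followed by "\w<",
    # optionally required to be followed by a space as well.
    inner = re.compile(r'\x20(\w<)' + (r'(?=\x20)' if trailing_space else ''))
    return [m.group(1)
            for r in region.finditer(text)
            for m in inner.finditer(r.group(1))]


print(hits(SAMPLE, ANY_P))          # ['z<', 'f<', 's<', 'b<', 't<', 'a<']
print(hits(SAMPLE, ANY_P, True))    # ['f<', 's<', 'a<']
print(hits(SAMPLE, MB_40))          # ['z<', 'f<', 't<', 'a<']
print(hits(SAMPLE, MB_40, True))    # ['f<', 'a<']
```

Note that the < of </a> in the first paragraph is never reported, because it is not preceded by a space and a word character.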

        Best Regards,

        guy038

        P.S. :

        BTW, no need to use a new profile. You’re certainly @robin-cruise !

        • rodica F @guy038

          @guy038 thanks for the solution.

          1 mobile account and 1 desktop account. No difference. It’s all about where you are at that time…

          • rodica F @rodica F

            @guy038 So, the generic formulas for this kind of problem (contains one string, but does not contain another) should be:

            (?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\x20\K(FR)

            (?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\x20\KFR(?=\x20)

            BSR (begin part) = <p class="mb-40px">
            ESR (end part) = </p>
            FR - (FIND Regex) = \w<
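As a rough illustration of how the BSR / ESR / FR roles fit together, here is a small Python sketch. It is only an emulation under stated assumptions: Python's re module does not support \G or \K, so the "continue from the end of the previous match" behaviour is imitated with Pattern.match(text, pos), which succeeds only at the given position. The function name find_in_regions and the TEXT sample are hypothetical.

```python
import re


def find_in_regions(text, bsr, esr, fr, trailing_space=False):
    # bsr, esr and fr are regex fragments playing the BSR, ESR and FR roles.
    begin = re.compile(bsr)
    tail = r'(?=\x20)' if trailing_space else ''
    # One continuation step: lazily skip characters that do not start ESR,
    # then require a space followed by FR (captured as group 1).
    step = re.compile(rf'(?s:(?:(?!(?:{esr})).)*?)\x20((?:{fr})){tail}')
    found, pos = [], 0
    while True:
        m = begin.search(text, pos)          # look for the next BSR
        if m is None:
            return found
        # Guard against an empty BSR match, which would loop forever.
        pos = m.end() if m.end() > m.start() else m.end() + 1
        while True:
            s = step.match(text, pos)        # anchored at pos, like \G
            if s is None:
                break                        # region exhausted, find next BSR
            found.append((s.start(1), s.group(1)))
            pos = s.end()


TEXT = ('<p class="mb-40px">book t<I cannot a< sleep.</p> x y< '
        '<p class="Test">home s< cat.</p>')
result = find_in_regions(TEXT, r'<p class="mb-40px">', r'</p>', r'\w<')
print([found_text for _, found_text in result])   # ['t<', 'a<']
```

The y< outside any region and the s< inside the <p class="Test"> tag are both skipped, which is the "inside BSR, before ESR" behaviour the formulas describe.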

            • guy038

              Hi, @rodica-f and All,

              Just a reminder :

              • Don’t forget to move the caret to the very beginning of the file, with the Ctrl + Home shortcut, before running the regex

              BR

              guy038

              • Robin Cruise @guy038

                @guy038 But what if I have the following case?

                I need a regex to find all lines which contain <p class="sd-23"> but do not contain the closing tag </p>

                <p class="sd-23">Somebody to love</p>
                
                <p class="sd-23">In 1495, the Grand Prince gave this icon as a blessing to his daughter Helen.
                
                <p class="sd-23">Holy Birth of God, have mercy on us!</p>
                

                My regex doesn’t work at all. It should have found the second line.

                FIND: (?<p class="sd-23">).*(?!</p>)
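A possible explanation, sketched in Python rather than in Notepad++'s own engine (so take it as an analogy, not a statement about Boost): the (?<...) group is not valid syntax for Python's re module, and even once it is reduced to a plain literal, the greedy .* runs to the end of the line, where the lookahead (?!</p>) succeeds trivially. The names LINES, NAIVE and naive_hits are made up for this example.

```python
import re

LINES = [
    '<p class="sd-23">Somebody to love</p>',
    '<p class="sd-23">In 1495, the Grand Prince gave this icon as a blessing to his daughter Helen.',
    '<p class="sd-23">Holy Birth of God, have mercy on us!</p>',
]

# The attempt as posted: Python's re rejects the "(?<p ...)" group.
try:
    re.compile(r'(?<p class="sd-23">).*(?!</p>)')
    original_compiles = True
except re.error:
    original_compiles = False

# Same attempt with the malformed group replaced by the literal tag.
NAIVE = re.compile(r'<p class="sd-23">.*(?!</p>)')

# .* swallows the rest of the line, including any </p>; at the end of the
# line nothing follows, so the negative lookahead always succeeds.
naive_hits = [line for line in LINES if NAIVE.search(line)]

print(original_compiles)   # False
print(len(naive_hits))     # 3, every line matches, not just the second
```

So the lookahead has to be tested at every character after the opening tag, not once after .* has already consumed the line.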

                • guy038

                  Hello, @Robin-cruise and All,

                  Well, not so difficult ! I assume that each line must end with the </p> tag and that you’re not speaking about any multi-line block !


                  Then, use the following regex in order to find out all the lines beginning with <p class="sd-23"> and not ending with the </p> tag :

                  (?-i)\h*<p class="sd-23">((?!</p>).)*$

                  For instance, using this four-line text :

                  <p class="sd-23">Somebody to love</p>
                  
                  <p class="sd-23">In 1495, the Grand Prince gave this icon as a blessing to his daughter Helen.
                  
                      <p class="sd-23">
                  
                      <p class="sd-23">Holy Birth of God, have mercy on us!</p>
                  

                  The regex would select the entire lines 2 and 3 !


                  Notes :

                  • The regex finds, first, the string <p class="sd-23">, with this exact case, after possible leading blank characters

                  • Then, it grabs all the remaining text, one character at a time ( ((?!</p>).)* ), till the end of the current line ( $ )…

                  • …But ONLY IF it does not meet the </p> tag at any position after <p class="sd-23">, till the end of current line
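The three notes above can be tried out with a short Python sketch. Two assumptions to keep in mind: Python's re module has no \h, so the class [ \t\xa0] (tab, space, no-break space) is used as a stand-in, and re.MULTILINE is needed so that $ matches at the end of each line. The names TEXT, UNCLOSED and unclosed are hypothetical.

```python
import re

# The four-line sample text, with lines 3 and 4 indented by four spaces.
TEXT = (
    '<p class="sd-23">Somebody to love</p>\n'
    '<p class="sd-23">In 1495, the Grand Prince gave this icon as a blessing to his daughter Helen.\n'
    '    <p class="sd-23">\n'
    '    <p class="sd-23">Holy Birth of God, have mercy on us!</p>\n'
)

# Optional leading blanks, the exact opening tag, then any characters
# that do not start a </p> tag, up to the end of the current line.
UNCLOSED = re.compile(r'[ \t\xa0]*<p class="sd-23">(?:(?!</p>).)*$', re.MULTILINE)

unclosed = [m.group(0) for m in UNCLOSED.finditer(TEXT)]
for line in unclosed:
    print(repr(line))
```

Only lines 2 and 3 of the sample are reported; lines 1 and 4 are rejected because the scan runs into </p> before reaching the end of the line.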

                  Best Regards,

                  guy038

                  • Robin Cruise @guy038

                    @guy038 Thanks ! But I don’t understand what this part does:

                    (?-i)\h*

                    • guy038

                      Hi, @robin-cruise,

                      • The (?-i) part means that, from that point, the search will be sensitive to case. So, it will match the string <p class="sd-23">, but not, for instance, the string <P class="sd-23"> nor the string <p CLASS="sd-23"> !

                      • Then, the \h character class represents any horizontal blank character ( so, either the \t [ Tabulation ] char, the \x20 [ Space ] char or the \xa0 [ No-Break Space ] char )

                      • Thus, the \h* syntax represents any sequence of horizontal blank chars, from 0 to n
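To see both points in action outside Notepad++, here is a tiny Python sketch. It assumes the class [ \t\xa0] as a stand-in for \h (Python's re module does not know \h) and relies on Python regexes being case-sensitive by default, which plays the role of (?-i). The names TAG, candidates and matched are made up for the example.

```python
import re

# [ \t\xa0] lists the three characters named above; it stands in for \h.
TAG = re.compile(r'[ \t\xa0]*<p class="sd-23">')

candidates = [
    '<p class="sd-23">',        # no leading blank at all (zero repetitions)
    '\t  <p class="sd-23">',    # one tab and two spaces
    '\xa0<p class="sd-23">',    # one no-break space
    '<P class="sd-23">',        # upper-case P
    '<p CLASS="sd-23">',        # upper-case CLASS
]

matched = [bool(TAG.fullmatch(c)) for c in candidates]
print(matched)   # [True, True, True, False, False]
```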

                      BR

                      guy038

                      The Community of users of the Notepad++ text editor.