    regex: Match string not containing string

    • Neculai I. FantanaruN Offline
      Neculai I. Fantanaru
      last edited by Neculai I. Fantanaru

      Good day, and Merry Christmas!

      I have these lines, all starting with the tag <p class="TESTA"> and ending with the tag <br>, except the last two.

      <p class="TESTA">I love you.<br>
      <p class="TESTA">You love her.<br>
      <p class="TEXTA">She loves me.<LLbr>
      <p class="TEXTA">It is not about me.<AAbr>
      

      My output result should be

      <p class="TEXTA">She loves me.<LLbr>
      <p class="TEXTA">It is not about me.<AAbr>
      

      I made a regex, but it is not very good. I used ?! and \b, but that does not seem to be a good idea.

      (?-s)(.*<p class="TEXTA">.*)(?:(?!\b<br>\b))(.*)$

      • rinku singhR Offline
        rinku singh
        last edited by

        Try comparing with these:
        https://regex101.com/r/We7Afi/1

        https://notepad-plus-plus.org/community/topic/16817/regex-find-all-lines-starting-with-a-specific-tag-and-ending-with-a-different-tag/3

        • Meta ChuhM Offline
          Meta Chuh moderator @Neculai I. Fantanaru
          last edited by Meta Chuh

          hi @Neculai-I.-Fantanaru
          merry christmas to you too !!

          this regex will work on your given example:

          find what: ^(.*?)TESTA(.*?)(\r\n|\r|\n)
          replace with: (leave empty)
          search mode: regular expression
          click on replace all

          so from your example:

          <p class="TESTA">I love you.<br>
          <p class="TESTA">You love her.<br>
          <p class="TEXTA">She loves me.<LLbr>
          <p class="TEXTA">It is not about me.<AAbr>
          

          every line will be deleted except:

          <p class="TEXTA">She loves me.<LLbr>
          <p class="TEXTA">It is not about me.<AAbr>
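For anyone who wants to try the same deletion outside Notepad++, here is a minimal Python sketch. It assumes Python's standard re module, where the "find what" pattern works unchanged once re.MULTILINE is set; the names DELETE_TESTA, sample and remaining are only illustrative.

```python
import re

# Same pattern as the "find what" field above.
# re.MULTILINE lets ^ anchor at the start of every line, not only at the start of the text.
DELETE_TESTA = re.compile(r"^(.*?)TESTA(.*?)(\r\n|\r|\n)", re.MULTILINE)

sample = (
    '<p class="TESTA">I love you.<br>\n'
    '<p class="TESTA">You love her.<br>\n'
    '<p class="TEXTA">She loves me.<LLbr>\n'
    '<p class="TEXTA">It is not about me.<AAbr>\n'
)

# An empty replacement deletes each matched line together with its line break.
remaining = DELETE_TESTA.sub("", sample)
print(remaining, end="")
```

Because the line break is part of the match, the deleted lines leave no empty lines behind.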
          
          • Neculai I. FantanaruN Offline
            Neculai I. Fantanaru
            last edited by

            Yes, @gurikbal-singh, I took inspiration from a previous topic, but it’s not the same thing. This is why I opened another topic, even if it’s close.

            • Neculai I. FantanaruN Offline
              Neculai I. Fantanaru
              last edited by Neculai I. Fantanaru

              Yes, I found the solution. I believe my mistake was that I also used (?-s) in a regex with a negative lookahead. It doesn’t work that way.

              <p class="TEXTA">(?:(?!<br>).)*$

              or

              (.*<p class="TEXTA">.*)(?:(?!\b<br>\b))(.*)$
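The first of these two patterns repeats a single character only while <br> does not start at that position. A small Python sketch of how it behaves, assuming the standard re module with re.MULTILINE so that $ matches at every line end; the last sample line is a hypothetical addition, not from the original example.

```python
import re

# (?:(?!<br>).)* consumes one character at a time,
# but only at positions where the literal "<br>" does not begin.
NO_BR_AFTER_TEXTA = re.compile(r'<p class="TEXTA">(?:(?!<br>).)*$', re.MULTILINE)

sample = (
    '<p class="TESTA">I love you.<br>\n'
    '<p class="TESTA">You love her.<br>\n'
    '<p class="TEXTA">She loves me.<LLbr>\n'
    '<p class="TEXTA">It is not about me.<AAbr>\n'
    '<p class="TEXTA">We love them.<br>\n'  # hypothetical extra line
)

hits = [m.group(0) for m in NO_BR_AFTER_TEXTA.finditer(sample)]
for hit in hits:
    print(hit)
```

The hypothetical last line is skipped because the repetition stops in front of <br> and $ cannot match there.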

              • guy038G Online
                guy038
                last edited by guy038

                Hello, @neculai-i-fantanaru, and All

                In order to match complete non-empty lines which do NOT contain a specific string, let’s say the word TEXT, with that exact case, here are 5 regexes :

                • Regex A : (?-is)^(?!.*TEXT).+\R matches any line which does NOT contain the string TEXT anywhere

                • Regex B : (?-is)^(?!^TEXT).+\R matches any line which does NOT begin with the string TEXT

                • Regex C : (?-is)^(?!.*TEXT$).+\R matches any line which does NOT end with the string TEXT

                • Regex D : (?-is)^(?!^TEXT|.*TEXT$).+\R matches any line which does NOT begin with the string TEXT and does NOT end with it

                • Regex E : (?-is)^(?!^.+TEXT.+$).+\R matches any line which does NOT contain the string TEXT strictly inside the line, that is, with at least one character before it and one after it


                In the table below, the matched lines are marked with an X and will therefore be deleted if the Replace zone is empty

                •-----------------------------------------•-----------•-----------•-----------•-----------•-----------•
                |              Lines Scanned              |  Regex A  |  Regex B  |  Regex C  |  Regex D  |  Regex E  |
                •-----------------------------------------•-----------•-----------•-----------•-----------•-----------•
                |  TEST : I love you.                     |     X     |     X     |     X     |     X     |     X     |
                |  TEXT : She loves me.                   |           |           |     X     |           |     X     |
                |  ABCD : It is not about me.             |     X     |     X     |     X     |     X     |     X     |
                |  TEXT : You love her.                   |           |           |     X     |           |     X     |
                •-----------------------------------------•-----------•-----------•-----------•-----------•-----------•
                |  Statement "TEST" : I love you.         |     X     |     X     |     X     |     X     |     X     |
                |  Statement "TEXT" : She loves me.       |           |     X     |     X     |     X     |           |
                |  Statement "ABCD" : It is not about me. |     X     |     X     |     X     |     X     |     X     |
                |  Statement "TEXT" : You love her.       |           |     X     |     X     |     X     |           |
                •-----------------------------------------•-----------•-----------•-----------•-----------•-----------•
                |  I love you.          = TEST            |     X     |     X     |     X     |     X     |     X     |
                |  She loves me.        = TEXT            |           |     X     |           |           |     X     |
                |  It is not about me.  = ABCD            |     X     |     X     |     X     |     X     |     X     |
                |  You love her.        = TEXT            |           |     X     |           |           |     X     |
                •-----------------------------------------•-----------•-----------•-----------•-----------•-----------•
                

                Remark : Of course, to test these regexes correctly, just copy the text below :

                TEST : I love you.
                TEXT : She loves me.
                ABCD : It is not about me.
                TEXT : You love her.
                
                Statement "TEST" : I love you.
                Statement "TEXT" : She loves me.
                Statement "ABCD" : It is not about me.
                Statement "TEXT" : You love her.
                
                I love you.          = TEST
                She loves me.        = TEXT
                It is not about me.  = ABCD
                You love her.        = TEXT
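The table above can be re-checked outside Notepad++ with a small Python sketch. This is only an approximation: Python's re module does not know \R, so each line is tested on its own and the final \R is replaced by $. The (?-is) prefix is dropped because case-sensitive matching and a dot that stops at line breaks are already Python's defaults. The names PATTERNS, SAMPLE and matched_by are made up for the illustration.

```python
import re

# Regexes A to E, with the trailing \R replaced by $ (each line is tested separately).
PATTERNS = {
    "A": r"^(?!.*TEXT).+$",
    "B": r"^(?!^TEXT).+$",
    "C": r"^(?!.*TEXT$).+$",
    "D": r"^(?!^TEXT|.*TEXT$).+$",
    "E": r"^(?!^.+TEXT.+$).+$",
}

SAMPLE = [
    "TEXT : She loves me.",
    'Statement "TEXT" : She loves me.',
    "She loves me.        = TEXT",
    "ABCD : It is not about me.",
]


def matched_by(line):
    """Return the letters of the regexes that match the line (an X in the table)."""
    return "".join(name for name, pattern in PATTERNS.items() if re.match(pattern, line))


for line in SAMPLE:
    print(f"{line!r:40} -> {matched_by(line)}")
```

A line that comes back with an empty string is kept by every regex; a letter in the result corresponds to an X in that regex's column.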
                

                Now, to match all complete non-empty lines which do NOT begin with the expression <p class="TEXT">, possibly preceded by some blank characters, with that exact case, use the regex :

                Regex F : (?-is)^(?!\h*<p class="TEXT">).+\R

                •-------------------------------------------•-----------•
                |               Lines Scanned               |  Regex F  |
                •-------------------------------------------•-----------•
                |  <p class="TEST">I love you.<br>          |     X     |
                |      <p class="TEXT">She loves me.<br>    |           |
                |  <p class="ABCD">It is not about me.<br>  |     X     |
                |  <p class="TEXT">You love her.<br>        |           |
                •-------------------------------------------•-----------•
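A rough Python equivalent of Regex F, as a sketch only. Python's re module has neither \h nor \R, so [ \t]* is used as a narrower stand-in for \h* (spaces and tabs only) and each line is tested separately; REGEX_F, lines and kept are illustrative names.

```python
import re

# [ \t]* stands in for \h*; $ replaces \R because each line is tested on its own.
REGEX_F = re.compile(r'^(?![ \t]*<p class="TEXT">).+$')

lines = [
    '<p class="TEST">I love you.<br>',
    '    <p class="TEXT">She loves me.<br>',
    '<p class="ABCD">It is not about me.<br>',
    '<p class="TEXT">You love her.<br>',
]

# Lines matched by Regex F are the ones to delete, so keep the others.
kept = [line for line in lines if not REGEX_F.match(line)]
for line in kept:
    print(line)
```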
                

                Best Regards,

                guy038

                • Neculai I. FantanaruN Offline
                  Neculai I. Fantanaru
                  last edited by Neculai I. Fantanaru

                  Yes, @guy038, but what if I have the opposite scenario:

                  1. <p class="TEXTA">She loves me.</p>
                  2. <p class="TEXTA">She loves me.<LLbr>
                  3. <p class="TEXTA">It is not about me.<AAbr>
                  4. <p class="TEXTA">She loves me.
                     </p>
                  

                  I want to select all tags which contain <p class="TEXTA"> but do not contain </p>. So I want to select only lines 2 and 3. How can I do this?

                  • Meta ChuhM Offline
                    Meta Chuh moderator @Neculai I. Fantanaru
                    last edited by

                    @Neculai-I.-Fantanaru

                    this regex will work on your new example:

                    find what: ^(.*?)TEXTA(.*?)(.|\R)(.*?)</p>\R
                    replace with: (leave empty)
                    search mode: regular expression
                    click on replace all

                    so from a copy of your example:
                    (the text has to end with an empty line so that the final \R can match after multi-line <p>…</p> tags, like number 4 in your example)

                    1. <p class="TEXTA">She loves me.</p>
                    2. <p class="TEXTA">She loves me.<LLbr>
                    3. <p class="TEXTA">It is not about me.<AAbr>
                    4. <p class="TEXTA">She loves me.
                       </p>
                    
                    

                    it will delete everything and leave you with:

                    2. <p class="TEXTA">She loves me.<LLbr>
                    3. <p class="TEXTA">It is not about me.<AAbr>
                    
                    

                    if this does not work with your real data, please provide us with a real data example and show what your result should look like

                    • guy038G Online
                      guy038
                      last edited by guy038

                      Hi, @neculai-i-fantanaru, and All

                      In that case, you could use the regex (?-i)<p class="TEXTA">[^<>]+<(?!/p).+?>

                      Notes :

                      • First, this regex looks for the literal expression <p class="TEXTA">, with that exact case

                      • Followed by a non-empty range of characters, each different from < and >, up to a < symbol

                      • Followed by a non-empty range of standard characters up to the nearest > symbol, but ONLY IF the string /p cannot be found right after the < symbol !

                      Just test it, with that text below :

                      <p class="TEXTA">She loves me.</p>
                      <p class="TEXTA">an other
                      test </abc>
                      <p class="TEXTA">She loves me.</p>      <p class="TEXTA">She loves me.</123>
                         <p class="TEXTA">She loves me.<LLbr>    <p class="TEXTA">She loves me.</p>
                         <p class="TEXTA">She loves me.
                         </p>
                      <p class="TEXTA">She loves me.<LLbr>
                      <p class="TEXTA">It is not about me.<AAbr>     <p class="TEXTA">It is
                       not about
                       me.<p>
                      
                      <p class="TEXTA">She loves me.
                         </p>
                      <p class="TEXTA">It is not about me.<AAbr>
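This regex uses nothing specific to Notepad++, so it can also be tried as-is in Python. A minimal sketch on the four-item example from the question, assuming the standard re module (case-sensitive by default, so (?-i) is left out); BAD_END, sample and hits are illustrative names.

```python
import re

# [^<>]+ may run across line breaks; (?!/p) rejects a tag that starts with /p.
BAD_END = re.compile(r'<p class="TEXTA">[^<>]+<(?!/p).+?>')

sample = (
    '1. <p class="TEXTA">She loves me.</p>\n'
    '2. <p class="TEXTA">She loves me.<LLbr>\n'
    '3. <p class="TEXTA">It is not about me.<AAbr>\n'
    '4. <p class="TEXTA">She loves me.\n'
    '   </p>\n'
)

hits = BAD_END.findall(sample)
for hit in hits:
    print(hit)
```

Item 4 is not reported even though its </p> sits on the next line, because [^<>]+ also consumes the line break and the indentation before the < symbol.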
                      

                      As I suppose that you would like to replace any bad ending tag, like </abc>, </123>, <LLbr>, <AAbr>, or even <p> with the right ending tag </p>, use the following regex S/R :

                      SEARCH <p class="TEXTA">[^<>]+<\K(?!/p).+?(?=>)

                      REPLACE /p

                      Notes :

                      • If you just perform the search part, it just matches any bad ending tag, without the < and > boundaries, which is different from /p

                      • Remember that the \K syntax forces the regex engine to forget everything already matched and reset the working position to the location, right after the < symbol !

                      • If you click on the Replace All button ( not the Replace one ), any bad ending tag is then changed into </p>
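\K is not available in every regex engine. A roughly equivalent sketch in Python, whose re module lacks \K, keeps the leading part in a capturing group and writes it back in the replacement. This is a swapped-in technique, not the S/R above, and the names FIX_END_TAG and fix_end_tags are hypothetical.

```python
import re

# Group 1 holds everything up to and including the "<" of the ending tag,
# which is the part that \K would discard from the match in Notepad++.
FIX_END_TAG = re.compile(r'(<p class="TEXTA">[^<>]+<)(?!/p).+?(?=>)')


def fix_end_tags(text):
    """Replace the name of any wrong ending tag by /p, keeping the < and > around it."""
    return FIX_END_TAG.sub(r"\g<1>/p", text)


print(fix_end_tags('<p class="TEXTA">She loves me.<LLbr>'))
print(fix_end_tags('<p class="TEXTA">She loves me.</123>'))
print(fix_end_tags('<p class="TEXTA">She loves me.</p>'))
```

A block that already ends with </p> is returned unchanged, since the negative lookahead prevents any match.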

                      Cheers,

                      guy038

                      • Neculai I. FantanaruN Offline
                        Neculai I. Fantanaru
                        last edited by

                        @guy038 said:

                        (?-i)<p class="TEXTA">[^<>]+<(?!/p).+?>

                        Your regex is great. But I just found another case, for which you may want to update the regex. Strangely, I did not take this into account: there can be other tags inside the same tag. For example:

                        <p class="TEXTA">I believe in love<em>but only if</em>you can make me smile</p>
                        

                        So, to update my last scenario:

                        1. <p class="TEXTA">She loves me.</p>
                        2. <p class="TEXTA">She loves me.<LLbr>
                        3. <p class="TEXTA">It is not about me.<AAbr>
                        4. <p class="TEXTA">She loves me.
                           </p>
                        5.<p class="TEXTA">I believe in love<em>but only if</em>you can make me smile</p>
                        6.<p class="TEXTA">I believe in love<em>but only if</em>you can make me smile</title>
                        

                        So the regex should select lines 2, 3 and 6. Right now, your regex also selects line 5 (because of the <em></em> pair, which is not good).

                        • guy038G Online
                          guy038
                          last edited by guy038

                          @neculai-i-fantanaru, and All

                          Ah, OK ! So, I’ve created a regex, using a recursive pattern ( due to the (?1) subroutine call to group 1, located inside the very group it refers to ), which allows the search of any block :

                          • Beginning with the tag <p class="TEXTA">

                          • Ending with a tag, different from </p>, which ends the line

                          • Containing any correctly matched areas <tag>.....</tag>, possibly juxtaposed and/or nested, as for instance :

                          <p class="TEXTA">.......<abc>.....<def>...
                          ....</def>...........</abc>.........<123>........</123>......<456>....
                          ...</456>............
                          ........<Niv1>.......<Niv2>.........<Niv3>......
                          ...<Niv4>.......<XXX>...........</XXX>............</Niv4>......
                          ..............</Niv3>..........
                          .......</Niv2>..........
                          .........</Niv1>...........<bla bla bla>
                          

                          Highly unlikely case, isn’t it !

                          So, here is the regex :

                          (?-i)<p class="TEXTA">(?:([^<>]+<(\w{1,10})>([^<>]+|(?1))</\2>[^<>]+)+|[^<>]*)<(?!/p)[^<>]+?>(?=\R)

                          And, again, if you just want to catch the wrong ending tag use the regex :

                          (?-i)<p class="TEXTA">(?:([^<>]+<(\w{1,10})>([^<>]+|(?1))</\2>[^<>]+)+|[^<>]*)<\K(?!/p)[^<>]+?(?=>\R)


                          Test these regexes, against text below. Note that they match only the blocks with even numbers ( 2, 4, 6, … )

                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                          1. <p class="TEXTA">She loves me.</p>
                          
                          2. <p class="TEXTA">She loves me.<LLbr>
                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                          3. <p class="TEXTA">It is not
                                   about me
                              </p>
                          
                          4. <p class="TEXTA">It is not
                                   about me.
                              <AAbr>
                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                          5. <p class="TEXTA">I believe in love<em>but only if</em>you can make me smile</p>
                          
                          6. <p class="TEXTA">I believe in love<em>but only if</em>you can make me smile</title>
                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                          7. <p class="TEXTA">I believe in love<em>but
                                 only 
                               if</em>you can make me smile</p>
                          
                          8. <p class="TEXTA">I believe in love<em>but
                                 only 
                               if</em>you can make me smile</html>
                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                          9.  <p class="TEXTA">I believe in love<12345>but
                              only 
                              if</12345>you can make me smile</p>
                          
                          10. <p class="TEXTA">I believe in love<12345>but
                              only 
                              if</12345>you can make me smile</div>
                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                          11. <p class="TEXTA">I believe<em> in love<em>but
                            only 
                          if</em>you can ma</em>ke me smile</p>
                          
                          12. <p class="TEXTA">I believe<em> in love<em>but
                            only 
                          if</em>you can ma</em>ke me smile<h3>
                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                          13.  <p class="TEXTA">I be<em>lieve<def> in love<em>but
                               only 
                               if</em>you can ma</def>ke me smi</em>le</p>
                          
                          14.  <p class="TEXTA">I be<em>lieve<def> in love<em>but
                               only 
                               if</em>you can ma</def>ke me smi</em>le</body>
                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                          15.
                          <p class="TEXTA">I be<abc>lieve<def> in love<em>but
                          only 
                          if</em>you can ma</def>ke me smi</abc>le</p>
                          
                          16.
                          <p class="TEXTA">I be<abc>lieve<def> in love<em>but
                          only 
                          if</em>you can ma</def>ke me smi</abc>le<abcde>
                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                          17. <p class="TEXTA">I <ab>believe </ab>in love<em>but only if</em>you <123>can make </123>me smile</p>
                          
                          18. <p class="TEXTA">I <ab>believe </ab>in love<em>but only if</em>you <123>can make </123>me smile</a>
                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                          19. <p class="TEXTA">I <code>believe </code>in love<em>but
                                only if<123>you </123>can 
                                make </em>me
                             smile</p>
                          
                          20. <p class="TEXTA">I <code>believe </code>in love<em>but
                                only if<123>you </123>can 
                                make </em>me
                             smile</tr>
                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                          21.<p class="TEXTA"></p>
                          
                          22.<p class="TEXTA"><script>
                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                          23.   <p class="TEXTA">.......<abc>.....<def>...
                          ....</def>...........</abc>.........<123>........</123>......<456>....
                          ...</456>............
                          ........<Niv1>.......<Niv2>.........<Niv3>......
                          ..............</Niv3>..........
                          ...<Niv4>.......<XXX>...........</XXX>............</Niv4>......
                          .......</Niv2>..........
                          .........</Niv1>...........</p>
                          
                          24..   <p class="TEXTA">.......<abc>.....<def>...
                          ....</def>...........</abc>.........<123>........</123>......<456>....
                          ...</456>............
                          ........<Niv1>.......<Niv2>.........<Niv3>......
                          ...<Niv4>.......<XXX>...........</XXX>............</Niv4>......
                          ..............</Niv3>..........
                          ...<Niv3>.......<XXX>...........</XXX>............</Niv3>......
                          .......</Niv2>..........
                          .........</Niv1>...........<bla bla bla>
                          

                          Best Regards,

                          guy038

                          • Neculai I. FantanaruN Offline
                            Neculai I. Fantanaru
                            last edited by Neculai I. Fantanaru

                            @guy038 said:

                            (?-i)<p class="TEXTA">(?:([^<>]+<(\w{1,10})>([^<>]+|(?1))</\2>[^<>]+)+|[^<>]*)<(?!/p)[^<>]+?>(?=\R)

                            Good morning. I tried both of your regexes. I don’t know why, but they don’t select line number 6, only lines 2 and 3.

                            • Meta ChuhM Offline
                              Meta Chuh moderator @Neculai I. Fantanaru
                              last edited by Meta Chuh

                              @Neculai-I.-Fantanaru

                              again, you will have to add an empty line below 6. if it is the last line of your test document.
                              then @guy038 's regex will find line 6. correctly with your given example.
                              so your document must end with an empty line in order for the regex to work.
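The reason is the (?=\R) lookahead: it demands a line break after the final tag, and the very last line of a file may not have one. A minimal Python sketch of that effect, under several assumptions: \r?\n stands in for \R (which Python's re module lacks), \Z is Python's end-of-string anchor, and BODY is a simplified, non-recursive pattern used only for the demonstration.

```python
import re

# Simplified body: opening tag, text, then an ending tag that is not </p>.
BODY = r'<p class="TEXTA">[^<>]+<(?!/p)[^<>]+?>'

NEEDS_NEWLINE = re.compile(BODY + r"(?=\r?\n)")        # like (?=\R)
NEWLINE_OR_END = re.compile(BODY + r"(?=\r?\n|\Z)")    # also accepts the end of the text

# The last line has no line break after it.
sample = (
    '<p class="TEXTA">She loves me.<LLbr>\n'
    '<p class="TEXTA">It is not about me.<AAbr>'
)

print(len(NEEDS_NEWLINE.findall(sample)))   # the last line is missed
print(len(NEWLINE_OR_END.findall(sample)))  # both lines are found
```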

                              • Neculai I. FantanaruN Offline
                                Neculai I. Fantanaru
                                last edited by

                                Yes, OK, but in an .html file I will never finish with this line. :) For sure I will have a lot of lines and other tags after line six. :)

                                Anyway, I get it. I removed the last part, (?=\R), and it works.

                                (?-i)<p class="TEXTA">(?:([^<>]+<(\w{1,10})>([^<>]+|(?1))</\2>[^<>]+)+|[^<>]*)<(?!/p)[^<>]+?>

                                thank you @guy038

                                And Happy New Year everyone !!
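One thing that may be worth re-checking before dropping (?=\R) completely: without any end-of-line condition, the second alternative [^<>]* appears free to stop at the first inner tag, so item 5 would be matched up to <em>. The Python sketch below illustrates this under stated assumptions: Python's re module has no (?1) recursion, so CORE is a flat stand-in that accepts consecutive but not nested <tag>...</tag> pairs, and (?=\r?\n|\Z) stands in for an end-of-line check that also accepts the end of the text. None of these names come from the thread.

```python
import re

# Flat, non-recursive stand-in for the pattern above (no nested inner tags).
CORE = (
    r'<p class="TEXTA">'
    r'(?:(?:[^<>]+<(\w{1,10})>[^<>]+</\1>[^<>]+)+|[^<>]*)'
    r'<(?!/p)[^<>]+?>'
)
WITHOUT_EOL = re.compile(CORE)
WITH_EOL = re.compile(CORE + r"(?=\r?\n|\Z)")

sample = (
    '1. <p class="TEXTA">She loves me.</p>\n'
    '2. <p class="TEXTA">She loves me.<LLbr>\n'
    '3. <p class="TEXTA">It is not about me.<AAbr>\n'
    '4. <p class="TEXTA">She loves me.\n'
    '   </p>\n'
    '5.<p class="TEXTA">I believe in love<em>but only if</em>you can make me smile</p>\n'
    '6.<p class="TEXTA">I believe in love<em>but only if</em>you can make me smile</title>'
)


def hits(pattern):
    """Return the full text of every match of the pattern in the sample."""
    return [m.group(0) for m in pattern.finditer(sample)]


for label, pattern in (("without", WITHOUT_EOL), ("with", WITH_EOL)):
    print(label, "end-of-line check:")
    for hit in hits(pattern):
        print("   ", hit)
```

In this sketch the version without the check reports four matches, including the partial one from item 5, while the version with the check reports items 2, 3 and 6 only, even though item 6 is the last line and has no line break after it.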


                                The Community of users of the Notepad++ text editor.