Negative lookbehind regular expression not working on Notepad++

    PeterJones @dr ramaanand
    last edited by Apr 22, 2025, 2:39 PM

    I said,

    I will come back soon with a 3 step process to do what I think you want.

    1. FIND = (<span\b[^>]*?color\s*:\s*black[^>]*>\s*|<p\b[^>]*?color\s*:\s*black[^>]*>\s*<span\b[^>]*>\s*)\K
      REPLACE = ☹
      • this puts a FROWN just after each span/paragraph prefix that you don’t want to come before the <code> tag.
      • now you have a single character to mark which things you don’t want.
    2. Now that you have a single character which marks all the ones you don’t want, you can use a single-character negative lookbehind, which doesn’t have the problem of being variable width, so negative lookbehind will work:
      • FIND = (?<!☹)<code\s*style="background-color:\s*transparent;">
      • this will find just the ones you want, I believe
      • so now you could use whatever REPLACE you didn’t tell us about.
    3. After you’ve finished, you can search for the ☹ and replace with nothing, to get rid of those temporary markers
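For anyone who wants to script the same three steps outside Notepad++, here is a minimal Python sketch. It is only an illustration under some assumptions: Python's `re` module has no `\K`, so step 1 uses a capture group instead, and the function name `replace_wanted` and the sample text are hypothetical, not something from this thread.

```python
import re

MARK = "\u2639"  # the frown character used as the temporary marker

# Step 1 pattern: the prefixes that must NOT come right before the <code> tag.
UNWANTED_PREFIX = re.compile(
    r'(<span\b[^>]*?color\s*:\s*black[^>]*>\s*'
    r'|<p\b[^>]*?color\s*:\s*black[^>]*>\s*<span\b[^>]*>\s*)'
)
# Step 2 pattern: a one-character lookbehind is fixed-width, so it is accepted.
WANTED_CODE = re.compile(
    rf'(?<!{MARK})<code\s*style="background-color:\s*transparent;">'
)

def replace_wanted(text, replacement):
    """Replace only the <code> tags that do not follow an unwanted prefix."""
    marked = UNWANTED_PREFIX.sub(lambda m: m.group(1) + MARK, text)  # step 1
    changed = WANTED_CODE.sub(lambda m: replacement, marked)         # step 2
    return changed.replace(MARK, "")                                 # step 3

sample = (
    '<span style="color: black;"><code style="background-color: transparent;">x</code></span>\n'
    '<code style="background-color: transparent;">y</code>\n'
)
result = replace_wanted(sample, "<code>")
print(result)
```

Only the second `<code>` tag is rewritten; the first one sits right after a marked prefix and is left alone.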

    Again, I will reiterate: when I come across a search-and-replace that’s too complicated, one of my primary strategies is to break it down into a multi-step

      guy038
      last edited by Apr 22, 2025, 3:27 PM

      Hello, @dr-ramaanand, @peterjones, @mpheath, @mathlete2 and All,

      First, in order to simplify the problem, let’s use this theoretical notation to express your regex search, below :

      (?<!<span\b[^>]*?color\s*:\s*black[^>]*>\s*)(?<!<p\b[^>]*?color\s*:\s*black[^>]*>\s*<span\b[^>]*>\s*)<code\s*style="background-color:\s*transparent;">

      ( if  NOT part A BEFORE part C ) ( if NOT part B BEFORE Part C ) ( FIND part C )
      

      But, as @peterjones said, previously :

      The look-behinds cannot contain variable quantifiers like {n,}, {n,m}, ? ( same as {0,1} ), + ( same as {1,} ) or * ( same as {0,} ). So the parts A and B can only contain {n} quantifiers !

      Moreover, if a look-behind contains an alternative, each part of the alternative must contain the same number of characters, too !
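The fixed-width rule can be observed outside Notepad++ too. The sketch below uses Python's `re` module, which is a different engine from the Boost one in Notepad++ but enforces a similar restriction on look-behinds; the helper name `compiles` and the short test patterns are made up for the illustration.

```python
import re

def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Look-behind containing * (variable width): rejected.
variable_width = compiles(r'(?<!<span\b[^>]*>\s*)<code>')
# Look-behind containing only an {n} quantifier (fixed width): accepted.
fixed_width = compiles(r'(?<!\d{3})<code>')
# Alternatives of different lengths: rejected. Same length: accepted.
uneven_alternatives = compiles(r'(?<!ab|c)<code>')
even_alternatives = compiles(r'(?<!ab|cd)<code>')

print(variable_width, fixed_width, uneven_alternatives, even_alternatives)
```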


      Now, you could say : OK, so I’ll replace all the look-behinds with normal regex parts, joined as alternatives and followed by the \K syntax, which gives the theoretical notation below :

      ( ( part A ) | ( Part B ) ) \K ( FIND part C )
      

      Which gives the functional regex :

      (<span\b[^>]*?color\s*:\s*black[^>]*>\s*|<p\b[^>]*?color\s*:\s*black[^>]*>\s*<span\b[^>]*>\s*)\K<code\s*style="background-color:\s*transparent;">

      But indeed, it just finds the opposite matches because :

      • If the part A matched, then it will match the part C only

      • If the part B matched, then it will match the part C only

      • In all other cases, as the parts A or B never occur, it will not match the part C, either, as @peterjones rightly explained !


      At this point, we can imagine this other regex, which should look, with our notation :

        ( if ( part A | part B ) followed with part C ) then I do NOT want these TWO cases | In ALL other cases, I want to match the part C
      

      This kind of logic can be reached with the help of the two backtracking control verbs (*SKIP)(*F) ( which you know very well !! ), giving the functional regex :

      (?:<span\b[^>]*?color\s*:\s*black[^>]*>\s*|<p\b[^>]*?color\s*:\s*black[^>]*>\s*<span\b[^>]*>\s*)<code\s*style="background-color:\s*transparent;">(*SKIP)(*F)|<code\s*style="background-color:\s*transparent;">


      If we use the free-spacing mode, we can add extra information on the method :

      (?sx-i)                                                           #  Free-Spacing mode - DOT matches NEW-LINE, Search SENSITIVE to Case
        (?:                                                             #    If it gets :
          <span\b[^>]*?color\s*:\s*black[^>]*>\s*                       #      A match of this FIRST regex part ( Ex-FIRST negative look-behind )
        |                                                               #    OR
          <p\b[^>]*?color\s*:\s*black[^>]*>\s*<span\b[^>]*>\s*          #      A match of this SECOND regex part ( Ex-SECOND negative look-behind )
        )                                                               #    End of the NON-CAPTURING group 
        <code\s*style="background-color:\s*transparent;">               #    , FOLLOWED with this MAIN regex part,
        (*SKIP)  (*F)                                                   #    CANCELS the WHOLE search and CONTINUE for a further possible MATCH
      |                                                                 #  In ALL other cases ( OR )
        <code\s*style="background-color:\s*transparent;">               #    Matches the MAIN regex part
      

      Against your example text, this regex does find the 4 wanted occurrences ( out of the 6 occurrences of the string <code style="background-color: transparent;"> )


      Below, I indicated, in your example text, the end of the main searched regex, in the two cases which are UNWANTED, by the ••••• mark !

      <html>
      <p style="font-family: &quot;verdana&quot;; font-size: 18px; color: black; line-height: 18px; text-align: justify; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: cyan;"><span style="font-size: 13.5pt; font-family: &quot;Verdana&quot;,&quot;sans-serif&quot;;"><code style="background-color: transparent;">•••••<b>some text here</b></code></span></p>
      <span><span style="font-size: 13.5pt; font-family: &quot;Verdana&quot;,&quot;sans-serif&quot;; background-color: cyan;"><code style="background-color: transparent;"><b>some text here</b></code></span>
      
      
      <code style="background-color: transparent;">
      
      
      <p style="font-family: &quot;verdana&quot;; font-size: 18px; color: cyan; line-height: 18px; text-align: justify; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: cyan;"><span style="color: black; font-size: 13.5pt; font-family: &quot;Verdana&quot;,&quot;sans-serif&quot;;"><code style="background-color: transparent;">•••••<b>some text here</b></code></span></p>
      
      
      <span><span style="font-size: 13.5pt; font-family: &quot;Verdana&quot;,&quot;sans-serif&quot;; background-color: cyan;"><code style="background-color: transparent;"><b>some text here</b></code></span>
      
      
      <p style="font-family: &quot;verdana&quot;; font-size: 18px; color: cyan; line-height: 18px; text-align: justify; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: navy;"><span style="font-size: 13.5pt; font-family: &quot;Verdana&quot;,&quot;sans-serif&quot;;"><code style="background-color: transparent;"><b>some text here</b></code></span></p>
      </html>
      

      Note that the main regex <code\s*style="background-color:\s*transparent;"> MUST be placed in two places :

      • Right BEFORE the part (*SKIP)(*F) and the last | symbol

      • Right AFTER the last | symbol
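Python's built-in `re` module does not support the (*SKIP)(*F) verbs, but the same "unwanted alternative first, wanted alternative last" logic can be approximated there with a capture group: the unwanted alternative consumes the text without capturing, and only matches where group 1 took part are kept. The sketch below is an emulation under that assumption, not the regex to paste into Notepad++; the sample is a shortened, hypothetical version of the test block, and `wanted_spans` is an invented name.

```python
import re

PART_A = r'<span\b[^>]*?color\s*:\s*black[^>]*>\s*'
PART_B = r'<p\b[^>]*?color\s*:\s*black[^>]*>\s*<span\b[^>]*>\s*'
PART_C = r'<code\s*style="background-color:\s*transparent;">'

# Unwanted cases first (nothing captured), then the wanted case in group 1.
PATTERN = re.compile(rf'(?:{PART_A}|{PART_B}){PART_C}|({PART_C})')

def wanted_spans(text):
    """Return (start, end) of each part C that is NOT preceded by part A or B."""
    return [m.span(1) for m in PATTERN.finditer(text) if m.group(1) is not None]

sample = (
    '<p style="color: black;"><span style="font-size: 13.5pt;">'
    '<code style="background-color: transparent;">one</code></span></p>\n'   # unwanted (part B)
    '<span style="background-color: cyan;">'
    '<code style="background-color: transparent;">two</code></span>\n'       # wanted
    '<code style="background-color: transparent;">\n'                         # wanted
    '<p style="color: cyan;"><span style="color: black;">'
    '<code style="background-color: transparent;">four</code></span></p>\n'  # unwanted (part A)
)
spans = wanted_spans(sample)
print(len(spans))  # 2 of the 4 <code> tags are kept
```

As in the (*SKIP)(*F) version, the main regex appears twice: once after the unwanted prefixes and once on its own.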

      Best Regards,

      guy038

      P.S. :

      @dr-ramaanand, for such searches, you could mainly use the general template, below :

          ( Condition 1 | Condition 2 | ..... |  Condition N ) (  MAIN Regex Search ) (*SKIP)(*F)  |  ( MAIN Regex Search )
                                                                                                   |
          <---------------------------------- That I do NOT want ---------------------------------> <-- That I DO want --->
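To make the template concrete, here is a small Python sketch that only assembles the regex text from its pieces. The function name `skip_fail_template` is hypothetical, and the resulting string is meant for an engine that supports the verbs (such as the one in Notepad++); Python's own `re` module cannot run it.

```python
def skip_fail_template(conditions, main):
    """Build '(?:cond1|cond2|...)main(*SKIP)(*F)|main' as plain text."""
    unwanted = "|".join(conditions)
    return f"(?:{unwanted}){main}(*SKIP)(*F)|{main}"

# Tiny example: find every z except the one that ends the string xyz.
simple = skip_fail_template(["xy"], "z")
print(simple)  # (?:xy)z(*SKIP)(*F)|z

# The two conditions and the main regex used in this thread.
thread_regex = skip_fail_template(
    [
        r'<span\b[^>]*?color\s*:\s*black[^>]*>\s*',
        r'<p\b[^>]*?color\s*:\s*black[^>]*>\s*<span\b[^>]*>\s*',
    ],
    r'<code\s*style="background-color:\s*transparent;">',
)
print(thread_regex)
```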
      
        dr ramaanand @guy038
        last edited by Apr 22, 2025, 3:32 PM

        @guy038 Yes, your method is perfect. Thanks a lot. Merci beaucoup!

          dr ramaanand @dr ramaanand
          last edited by dr ramaanand Apr 22, 2025, 4:31 PM Apr 22, 2025, 4:25 PM

          @PeterJones You may want to study what is mentioned at https://www.rexegg.com/regex-lookarounds.php to understand how to use your method of regular expression for multiple negative look behinds. This is the specific regular expression I believe can help: (?<=(?<!(?<!X)_)_)\d+

            PeterJones @dr ramaanand
            last edited by PeterJones Apr 22, 2025, 4:38 PM Apr 22, 2025, 4:35 PM

            @dr-ramaanand said in Negative lookbehind regular expression not working on Notepad++:

            @PeterJones You may want to study what is mentioned at https://www.rexegg.com/regex-lookarounds.php to understand how to use your method of regular expression for multiple negative look behinds

            Or, I may not.

            As is obvious from this post, I know how to use multiple lookaheads to do the extra logic, and have for years.

            But, as I said, “I really don’t like doing super-complicated single-run regular expressions when a multi-step that’s easier to understand would work.” And trying to rework the multiple conditions into multiple negative lookaheads buried before a \K to mimic a variable-width lookbehind moves it from the “this is reasonable and practical” world to the world of “why don’t you just do it with a simple three-step process, instead of confusing yourself and making other people write one complicated regex to do a job that’s easy if you break it into pieces”.

            @guy038 is able to do those super-fancy regex, and appears to enjoy it, so I let him. But I personally see no need for making a single regex that complex, and will not be using it for myself, nor do I think it’s necessarily the right solution for someone who comes here asking for regex help, since it’s not likely to continue to work when they change their parameters slightly. If @guy038 wants to share such solutions, in the hopes that eventually that person being helped will be able to do more for themselves, great; but I just want a practical solution that’s “good enough”.

              mpheath @dr ramaanand
              last edited by mpheath Apr 22, 2025, 6:58 PM Apr 22, 2025, 6:16 PM

              @dr-ramaanand said in Negative lookbehind regular expression not working on Notepad++:

              @PeterJones You may want to study what is mentioned at https://www.rexegg.com/regex-lookarounds.php to understand how to use your method of regular expression for multiple negative look behinds. This is the specific regular expression I believe can help: (?<=(?<!(?<!X)_)_)\d+

              Please explain why as I am not a believer.

              To be more explicit: you have an issue, and now you consider a nested-within-nested-within-nested regular expression to be a solution to your problem?

                PeterJones @guy038
                last edited by Apr 22, 2025, 7:31 PM

                @guy038 said in Negative lookbehind regular expression not working on Notepad++:

                you could mainly use the general template, below :

                @guy038 , that’s an awesome template.

                I highly encourage you to write up a short blog post about it, and then link to that new post from the Generic Regex Formula FAQ, because I think that’s a formula that could end up being useful.

                (I would just link to your post in here, but the focus is this particular example, which I think would be too complicated for most readers to understand. Doing a simpler example in the blog would be useful, I think, to help people translate your “template” into a real regex.)

                  dr ramaanand @PeterJones
                  last edited by dr ramaanand Apr 22, 2025, 11:36 PM Apr 22, 2025, 9:21 PM

                  @PeterJones I have understood what @guy038 is trying to convey (and I have been using it). A template would be useful and this is an example: (xyz)(*SKIP)(*F)|(z) is like a negative look behind which skips any z that is immediately preceded by xy (that is, the z inside the exact string xyz) but finds all other occurrences of z - post no.16 shows how he used it for the block I typed for testing at the top of this thread

                  I would prefer a template like this:-

                  (String1|String2)(MAIN Regex Search)(*SKIP)(*F)|(MAIN Regex Search)
                                                                 |
                  <------------- This I do NOT want ------------><- This I DO want ->
                  
                    dr ramaanand @dr ramaanand
                    last edited by dr ramaanand Apr 23, 2025, 10:00 AM Apr 23, 2025, 6:13 AM

                    @PeterJones We can add another line below the above RegEx explanation like this (to explain it better):-

                    <------------- What I want to SKIP ------------><- What I want to MATCH ->
                    
                      dr ramaanand @dr ramaanand
                      last edited by dr ramaanand Apr 23, 2025, 10:47 AM Apr 23, 2025, 10:41 AM

                      @PeterJones The wonderful thing about the (*SKIP)(*F) method is that it can be used for negative look aheads also like this:-

                      (MAIN Regex Search)(String1|String2)(*SKIP)(*F)|(MAIN Regex Search)
                                                                     |
                      <------------- What I want to SKIP ------------><-What I want to MATCH->
                      
                        PeterJones @dr ramaanand
                        last edited by Apr 23, 2025, 1:01 PM

                        @dr-ramaanand said in Negative lookbehind regular expression not working on Notepad++:

                        The wonderful thing about the (*SKIP)(*F) method is that it can be used for negative look aheads also like this

                        But pointless, because lookaheads (negative or positive) can have variable width, so if you want a lookahead, just use a lookahead.
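As a quick illustration of that point, the sketch below puts a variable-width pattern inside a negative lookahead. It uses Python's `re` module rather than Notepad++, and the `<b>` condition is a made-up example, not something asked for in this thread.

```python
import re

# The lookahead body contains \s* and [^>]*, so its width varies; that is allowed.
NOT_FOLLOWED_BY_BOLD = re.compile(
    r'<code\s*style="background-color:\s*transparent;">(?!\s*<b\b[^>]*>)'
)

sample = (
    '<code style="background-color: transparent;"> <b>bold</b></code>\n'
    '<code style="background-color: transparent;">plain</code>\n'
)
matches = [m.start() for m in NOT_FOLLOWED_BY_BOLD.finditer(sample)]
print(matches)
```

Only the second tag is reported, because the first one is followed by optional whitespace and a `<b>` tag.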

                          dr ramaanand @PeterJones
                          last edited by Apr 23, 2025, 1:14 PM

                          @PeterJones The (*SKIP)(*F) method can be of variable width but it can be used only for negative look aheads and negative look behinds

                            Alan Kilborn @dr ramaanand
                            last edited by Apr 23, 2025, 1:56 PM

                            @dr-ramaanand

                            Peter’s last point (which you missed) was that lookaheads are best done with native regex syntax, because it is more obvious that way.

                            And he probably would have confused you less if he had left out (negative or positive) from his sentence; doing that doesn’t change the meaning.

                              guy038
                              last edited by guy038 Apr 24, 2025, 10:38 AM Apr 24, 2025, 10:37 AM

                              Hello, @peterjones and All,

                              OK. I’m going to prepare a blog post regarding the (*SKIP)(*F) feature !

                              However, be patient because I’ll try, first :

                              • To find out some other pertinent examples from various regex sites

                              • To propose alternatives to the (*SKIP)(*F) syntax when it’s possible !

                              BR

                              guy038

                                dr ramaanand @guy038
                                last edited by dr ramaanand Apr 24, 2025, 1:20 PM Apr 24, 2025, 12:22 PM

                                @guy038 please create the blog to show how to use the (*SKIP)(*FAIL) regular expression, not an alternative to it. @PeterJones may be able to create an alternative to it. If @PeterJones wants to still use his method for what I have typed as my block for testing, he can do it in 2 parts; first using the regular expression, (<span\b[^>]*?color\s*:\s*black[^>]*>\s*|<p\b[^>]*?color\s*:\s*black[^>]*>\s*<span\b[^>]*>\s*)\K(<code\s*style="background-color:\s*transparent;">) in the find field and a unique string (say for example, a unique name like, “Czeslawski”) in the replace field, he can replace the <code\s*style="background-color:\s*transparent;"> with that unique string. Then he can do what is needed to the other strings of <code\s*style="background-color:\s*transparent;"> and then again replace the unique string (“Czeslawski” in this case) with <code\s*style="background-color:\s*transparent;">. If it is something simple, this example should be sufficient: https://stackoverflow.com/questions/17286667/regular-expression-using-negative-lookbehind-not-working-in-notepad
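The placeholder idea described above can be sketched in Python as follows. This is an illustration with assumptions: Python's `re` has no `\K`, so capture groups stand in for it; the hidden tags are restored verbatim rather than retyped; and the function name `edit_unprotected` and the sample text are hypothetical.

```python
import re

PREFIX = (r'(<span\b[^>]*?color\s*:\s*black[^>]*>\s*'
          r'|<p\b[^>]*?color\s*:\s*black[^>]*>\s*<span\b[^>]*>\s*)')
CODE = r'<code\s*style="background-color:\s*transparent;">'
PLACEHOLDER = "Czeslawski"  # must not occur anywhere else in the text

def edit_unprotected(text, replacement):
    """Hide the unwanted <code> tags, edit the rest, then restore the hidden ones."""
    if PLACEHOLDER in text or PLACEHOLDER in replacement:
        raise ValueError("placeholder is not unique")
    parked = []  # the hidden tags, in order of appearance

    def park(match):
        parked.append(match.group(2))
        return match.group(1) + PLACEHOLDER

    hidden = re.sub(PREFIX + '(' + CODE + ')', park, text)   # part 1: hide
    edited = re.sub(CODE, lambda m: replacement, hidden)     # part 2: edit the rest
    originals = iter(parked)
    return re.sub(re.escape(PLACEHOLDER), lambda m: next(originals), edited)  # restore

sample = (
    '<span style="color: black;"><code style="background-color: transparent;">x</code></span>\n'
    '<code style="background-color: transparent;">y</code>\n'
)
result = edit_unprotected(sample, "<code>")
print(result)
```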

                                  guy038
                                  last edited by Apr 24, 2025, 1:51 PM

                                  Hello, @dr-ramaanand,

                                  When I said :

                                  To propose alternatives to the (*SKIP)(*F) syntax when it’s possible !

                                  I’m not talking about a work-around using a multi-step regex, but about other direct regexes, without the (*SKIP)(*F) syntax, which are sometimes even shorter !

                                  You’ll understand what I mean, soon !

                                  Best Regards,

                                  guy038

                                    dr ramaanand @guy038
                                    last edited by Apr 24, 2025, 4:50 PM

                                    @guy038 I will understand it only after you post that regular expression (RegEx) here

                                        guy038
                                        last edited by Apr 26, 2025, 1:20 PM

                                        Hello, @peterjones and All,

                                        Peter, Done ! Refer to :

                                        https://community.notepad-plus-plus.org/topic/26812/generic-regex-how-to-use-the-couple-of-backtracking-control-verbs-skip-fail-or-skip-f-in-regexes

                                        I also added a link to this post in your FAQ: Generic Regular Expression (regex) Formulas post.

                                        Best Regards,

                                        guy038

                                          dr ramaanand @guy038
                                          last edited by dr ramaanand Apr 26, 2025, 5:23 PM Apr 26, 2025, 1:57 PM

                                          @guy038 So, if you have an alternative method to the (*SKIP)(*FAIL) method for the block posted right at the top of this thread for testing to match the same string you posted in post#16 above, please post it here
