    Why won't this PCRE regular expression work in Notepad++ when it works on regex101?

      Terry R @BlohoJo

      @BlohoJo
      So you still haven’t answered my questions, so I’ll just provide this answer and hope either I have it right or it will make you realise you may need to expand on what you really need.

      First run the “Reverse Line Order” under the Edit, Line Operations menu
      Then using the Replace function:
      Find What: (?-s)^(Phrase A: )(?=.*\R\1)
      Replace With: (leave this field empty)
      Lastly run the “Reverse Line Order” under the Edit, Line Operations menu
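For anyone who wants to check the three steps outside Notepad++, here is a minimal Python sketch of the same idea. It is an approximation, not what Notepad++ runs: Python's `re` has no `\R`, so `\n` is used and LF line endings are assumed, and the function name and sample text are made up for illustration.

```python
import re

def strip_repeated_phrase(text: str) -> str:
    """Reverse the lines, strip the phrase where the next line also starts with it, reverse back."""
    # Step 1: "Reverse Line Order"
    reversed_text = "\n".join(reversed(text.split("\n")))
    # Step 2: rough Python counterpart of (?-s)^(Phrase A: )(?=.*\R\1)
    # (re.MULTILINE makes ^ match at each line start; "." does not cross lines by default)
    pattern = re.compile(r"^(Phrase A: )(?=.*\n\1)", re.MULTILINE)
    cleaned = pattern.sub("", reversed_text)
    # Step 3: "Reverse Line Order" again
    return "\n".join(reversed(cleaned.split("\n")))

sample = "\n".join([
    "Phrase A: Words", "Phrase A: Words", "Phrase A: Words",
    "Phrase B: Words", "Phrase A: Words",
])
print(strip_repeated_phrase(sample))
```

Only the first line of each run of "Phrase A: " lines should keep its phrase.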

      Bear in mind there are often a few solutions available so others may also provide their input. It will be up to you to decide which fits your needs the best.

      Terry

      PS you can record all this as a macro and assign a shortcut so it is easy to re-run, especially if you are handing the job onto someone with less knowledge than yourself.

        Coises @BlohoJo

        @BlohoJo said in Why won't this PCRE regular expression work in Notepad++ when it works on regex101?:

        However, I was quite shocked to find that I now have no way to do what I want which is to DELETE the found phrases.

        I can’t select or delete marked text. I need to delete just the marked text, not the entire line, so Bookmarks don’t help here.

        One option is to use the Search… dialog in the Columns++ plugin. (You can install it from Plugins Admin.)

        That search acts in an indicated region; if you begin with nothing selected, it will set the search region to the entire document.

        On the dropdown menu for the Count button there is an option Select All. If you start with nothing selected, enter your search expression in the Find what box, set the Search Mode to Regular expression and then choose Select All, I believe that will do exactly what you want. You can then close the dialog and press the Delete key to delete the selected text.

          Lycan Thrope @BlohoJo

          @BlohoJo ,
          Not that I’m anywhere near an expert with regex, but first off, Notepad++ does not use PCRE; it uses Boost, which is patterned somewhat after PCRE. Unless I misunderstand the conditions you want to test for, I did a very simple Search, Mark All, which came up with the right matches. I then switched to Replace mode and ran my Boost/NPP regex, and it seems to have done exactly what your first example shows you desired. Not to take anything away from @Terry-R’s or @Coises’s suggestions, here is the regex, with pics to show the results.

          Find What = Phrase A\:\h?
          Replace With = 
          Search Mode = REGULAR EXPRESSION
          Dot Matches Newline = NOT CHECKED
          

          Pics show the results:
          Mark Tab:
          PCRENOTBOOST.PNG

          Replace Tab: Replace All (selected)
          PCRENOTBOOST2.PNG

            Terry R @Lycan Thrope

            @Lycan-Thrope said in Why won't this PCRE regular expression work in Notepad++ when it works on regex101?:

            and it seems to have done exactly what your first example shows you desired.

            Actually I think you missed the OP’s need. They said:
            and finds and selects a specified phrase at the beginning of sentences, only if it is preceded by a sentence that also contains the phrase. The first sentence’s phrase is not selected.

            When I showed what my “Mark” result was using his regex he confirmed that the highlighted text was exactly what he wanted removed. Unfortunately as he found out, and I had alluded to in an earlier post, the removal was an entirely different matter unless a reversal of lines was used.

            So if a line has the string Phrase A:, then remove that string if the preceding line also contained that string. So when replacing, say line 9 contains that string. If line 8 also had it, then remove the string from line 9. Now check line 10, if that had the same as line 9… Oh wait, I’ve just removed that from line 9, but as I’m a regex, I don’t remember that I did that so I don’t remove it from line 10. But I should have!

            This is all due to the lookbehind function which doesn’t help as we’ve possibly already changed the line before and now don’t know what it was. So what I’ve done is reversed the line order, so now it’s a lookahead. As such we’ve yet to process the next line so we CAN use the lookahead function with certainty.
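The line 9 / line 10 story above can be played out with a small line-based model in Python. This is only a sketch of the reasoning, not how Notepad++ is implemented: `strip_in_place` and the sample list are hypothetical, and "the line above contains the phrase" is checked against whatever the line above looks like at that moment.

```python
PHRASE = "Phrase A: "

def strip_in_place(lines, order):
    """Visit line indexes in `order`; strip PHRASE when the line above currently contains it."""
    lines = list(lines)  # work on a copy
    for i in order:
        if i > 0 and lines[i].startswith(PHRASE) and PHRASE in lines[i - 1]:
            lines[i] = lines[i][len(PHRASE):]
    return lines

sample = ["Phrase A: Words", "Phrase A: Words", "Phrase A: Words", "Phrase B: Words"]

# Top to bottom: line 2 is stripped first, so line 3 no longer sees the phrase above it.
top_down = strip_in_place(sample, range(len(sample)))

# Bottom to top (the effect of reversing the lines): the line above is always still untouched.
bottom_up = strip_in_place(sample, reversed(range(len(sample))))

print(top_down)
print(bottom_up)
```

In the top-down run the third line keeps its phrase, which is the failure described above; the bottom-up run strips both the second and third lines.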

            Terry

              Alan Kilborn

              @Coises

              It seems like the suggestion of using a plugin to solve the OP’s problem is a bit “much” when the problem can be solved with Notepad++ itself/alone.

              However, as OP did ask for a means to “select” search matches, and Notepad++ can’t do that, suggesting the plugin makes some sense. But, for any future readers of this with a similar need, replacing search matches with nothing effectively deletes them – you don’t need a means to “select” them so that you can then hit the Delete key to remove them.

                guy038

                Hello, @blohojo, @terry-r, @coises, @lycan-thrope, @alan-kilborn and All,

                Indeed, this tricky replacement cannot be correctly done because, as soon as the string Phrase A: has been deleted, the regex conditions no longer exist for a possible match of the next line !

                Thus, instead of trying to DELETE anything, we’ll try, first, to ADD something to the matched lines !


                So, from your INPUT text, below :

                Phrase A: Words
                Phrase A: Words
                Phrase A: Words
                Phrase B: Words
                Phrase B: Words
                Phrase A: Words
                Phrase A: Words
                Phrase B: Words
                Phrase A: Words
                Phrase A: Words
                

                the following regex S/R will add, for example, the ¶ symbol at the beginning of any line that needs to be matched :

                FIND (?-s)(?<=Phrase A: ).*\R\K(?=Phrase A: )

                REPLACE ¶

                And, after a click on the Replace All button, would produce this temporary OUTPUT :

                Phrase A: Words
                ¶Phrase A: Words
                ¶Phrase A: Words
                Phrase B: Words
                Phrase B: Words
                Phrase A: Words
                ¶Phrase A: Words
                Phrase B: Words
                Phrase A: Words
                ¶Phrase A: Words
                

                Then, the trivial S/R, below, would delete the string ¶Phrase A:\x20 at the beginning of every marked line (only that string, not the whole line) :

                FIND (?-i)^¶Phrase A:\x20

                REPLACE Leave EMPTY

                And you get your expected OUTPUT text :

                Phrase A: Words
                Words
                Words
                Phrase B: Words
                Phrase B: Words
                Phrase A: Words
                Words
                Phrase B: Words
                Phrase A: Words
                Words
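The two S/R steps above can be approximated in Python as follows. This is a sketch under assumptions: Python's `re` has neither `\K` nor `\R`, so the first step captures the preceding line and puts it back, `\n` stands in for `\R`, and the function names are invented here.

```python
import re

MARK = "¶"  # must be a character that does not occur in the text

def add_marks(text: str) -> str:
    """Step 1: put MARK in front of each 'Phrase A: ' line that follows a line containing 'Phrase A: '."""
    if MARK in text:
        raise ValueError("marker already occurs in the text; choose another one")
    # The captured line is re-inserted unchanged, so only MARK is added.
    return re.sub(r"(Phrase A: .*\n)(?=Phrase A: )", r"\1" + MARK, text)

def remove_marked(text: str) -> str:
    """Step 2: delete MARK plus the phrase at the start of every marked line."""
    return re.sub(r"^" + MARK + r"Phrase A: ", "", text, flags=re.MULTILINE)

sample = "\n".join([
    "Phrase A: Words", "Phrase A: Words", "Phrase A: Words",
    "Phrase B: Words", "Phrase A: Words", "Phrase A: Words",
])
marked = add_marks(sample)
print(marked)
print(remove_marked(marked))
```

Because the first step only adds a character and never removes the phrase, every later line still sees the phrase on the line before it.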
                

                However, I admit that @coises’s solution, with the Plugins > Columns++ > Search... option and using the (?-s)(?<=Phrase A: ).*\R\KPhrase A:\x20 regex, is simpler and more obvious !


                Note that, if in the first S/R, I had written (?-s)(?<=^Phrase A: ).*\R\K(?=Phrase A: ), the second match in line 3 would not have been found because the regex would have expected that the word Phrase BEGINS the line !


                IMPORTANT note :

                If a MARK operation only concerns zero-length strings, like my first S/R, it just returns the message Mark: 0 matches in entire file or the message Mark: 0 matches from caret to end of file, instead of the message Mark: # matches..., with # > 0. However, as shown above, the replacement did perform the needed insertions ! So, I think it’s a Notepad++ bug and I should create an issue about it !

                Of course, the fact that no highlighting occurs is quite logical because one cannot highlight zero-length strings !! And I suppose this leads to the 0 matches result :-((

                Best Regards,

                guy038

                P.S. :

                If I use my first search regex (?-s)(?<=Phrase A: ).*\R\K(?=Phrase A: ), with the Columns++ plugin, and click on the Select All button, it does find 4 zero-length matches !
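To see why these matches are zero-length and still countable, here is a rough Python check on the INPUT text from above. It is only an emulation: Python's `re` lacks `\K`, so `m.end()` is taken as the position where the zero-length match would sit, and `\n` replaces `\R`. The function name is hypothetical.

```python
import re

def zero_length_positions(text: str) -> list[int]:
    """Offsets where (?<=Phrase A: ).*\\R\\K(?=Phrase A: ) would match, emulated via m.end()."""
    return [m.end() for m in re.finditer(r"Phrase A: .*\n(?=Phrase A: )", text)]

text = "\n".join([
    "Phrase A: Words", "Phrase A: Words", "Phrase A: Words",
    "Phrase B: Words", "Phrase B: Words",
    "Phrase A: Words", "Phrase A: Words",
    "Phrase B: Words",
    "Phrase A: Words", "Phrase A: Words",
])
positions = zero_length_positions(text)
print(len(positions), positions)
```

Each offset is a single position between two characters, so there is nothing to highlight, yet the matches can still be counted.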

                  guy038

                  Hi, All,

                  As intended, here is my issue , on GitHub :

                  https://github.com/notepad-plus-plus/notepad-plus-plus/issues/16279

                  BR

                  guy038

                    Lycan Thrope @Terry R

                    @Terry-R said in Why won't this PCRE regular expression work in Notepad++ when it works on regex101?:

                    Actually I think you missed the OP’s need. They said:

                    I did say, I may have misunderstood what he was testing for. I thought the first line being highlighted by a background color indicated it was chosen, so yeah that’s my bad…and I did say I’m no regex expert. :-)

                    I kind of agree with @Alan-Kilborn and @guy038 how he maybe should have proceeded, but obviously per @guy038 , @Coises solution was a better option for him. I don’t know, I’m not currently working in his plugin, so wouldn’t have known if that would work. I’m still trying to work out some kind of solution, but seem not to be able to treat the regex like a programming logic, or don’t know the proper syntax of the Boost regex to try and make something work the way we want.
                    Since most of you accomplished guys aren’t seemingly able to solve it, I don’t hold out hope for myself, either, but at least it’s making me read more of the documentation and found some new things to look at and learn, like \G (little L). Just learning and will be wrong. :-)

                      Coises @Lycan Thrope

                      @Lycan-Thrope said in Why won't this PCRE regular expression work in Notepad++ when it works on regex101?:

                      I kind of agree with @Alan-Kilborn and @guy038 how he maybe should have proceeded, but obviously per @guy038 , @Coises solution was a better option for him.

                      It’s kind of a weird regular expression. Had I been trying to solve this problem from scratch, I probably would have done something like @guy038 suggested.

                      The only reason I brought up Columns++ was because the original poster succeeded in getting the expression to match exactly what he wanted, but replacement with an empty string didn’t work. He wanted a way to select all the matches so he could delete them, and that is something Columns++ search offers that Notepad++ search does not.

                      The thing that makes this tricky is that regular expression replacements occur after each corresponding match; it doesn’t first match everything, then do the replacements. And for this problem, some matches (third and subsequent lines beginning with “Phrase A: ”) fail once the previously matched text has been replaced.

                      @guy038’s suggestion is the normal way of solving this sort of problem. You “mark” each match by adding some character that you know doesn’t occur in the file, then you use those marks to make the changes you need in a second step. I don’t think there is, in general, a way to do it in a single step (though of course in any specific case there might be a “trick”).

                      Columns++ offers an alternative approach by allowing multi-selection of all the matches without changing the file. In this case, just pressing delete was enough for the second step; in a more complex case, Columns++ allows you to reset the search region to match the multi-selection, so you could then search, or search and replace, within the previous matches. I think it’s powerful, though completely non-standard. You could always accomplish the same thing with the “add a character that isn’t used in the file” technique, though.
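The "match everything first, then change the text" idea can be sketched in Python like this. It is not how Columns++ is implemented; the function names are invented, `\n` stands in for `\R`, and a capture group inside a lookahead is used to obtain the span of the phrase on the following line.

```python
import re

def select_all(text: str) -> list[tuple[int, int]]:
    """Collect the spans of every phrase to delete, all found on the unmodified text."""
    # Group 1 sits inside a lookahead, so matching can resume at the next line's start.
    return [m.span(1) for m in re.finditer(r"Phrase A: .*\n(?=(Phrase A: ))", text)]

def delete_spans(text: str, spans: list[tuple[int, int]]) -> str:
    """Delete the collected spans, last one first, so earlier offsets stay valid."""
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + text[end:]
    return text

sample = "\n".join([
    "Phrase A: Words", "Phrase A: Words", "Phrase A: Words",
    "Phrase B: Words", "Phrase A: Words", "Phrase A: Words",
])
spans = select_all(sample)
print(spans)
print(delete_spans(sample, spans))
```

Since no text changes until all spans are known, the third line of a run is found just like the second, and no marker character is needed.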

                        guy038

                        Hi All,

                        See my important and final comment about my own issue :

                        https://github.com/notepad-plus-plus/notepad-plus-plus/issues/16279#issuecomment-2726313083

                        BR

                        guy038

                        The Community of users of the Notepad++ text editor.