    How to stop searching or replacing after a string?

      dr ramaanand
      last edited by dr ramaanand

      I am unable to edit the above post because Akismet.com flags the edit as spam, and some stray spaces have appeared in it, so I am correcting it here. If I put \?</p>\R<p[^<>]*>(.*)(?=<\/h2) in the “Find what” field and ?</h2>\r\n<h2>$1 in the “Replace with” field, select the Regular expression search mode, and hit “Replace All” repeatedly, I get the desired result. Please read the rest above.

        PeterJones @dr ramaanand
        last edited by

        @dr-ramaanand said in How to stop searching or replacing after a string?:

        What Regular expression can I use to do that

        Go to our FAQ section, find the generic regex for replacing only within a range. The start of your range is <h2> and the end is </h2>. And you want to replace any </p>\r\n<p> with </h2>\r\n<h2> when within that range.

        ----

        Please note: This Community Forum is not a data transformation service; you should not expect to be able to always say “I have data like X and want it to look like Y” and have us do all the work for you. If you are new to the Forum, and new to regular expressions, we will often give help on the first one or two data-transformation questions, especially if they are well-asked and you show a willingness to learn; and we will point you to the documentation where you can learn how to do the data transformations for yourself in the future. But if you repeatedly ask us to do your work for you, you will find that the patience of usually-helpful Community members wears thin. The best way to learn regular expressions is by experimenting with them yourself, and getting a feel for how they work; having us spoon-feed you the answers without you putting in the effort doesn’t help you in the long term and is uninteresting and annoying for us.

          dr ramaanand @PeterJones
          last edited by

          @PeterJones I went to the FAQ section and searched for, “within a range” but none of those answers seem to be useful for me. Can you give me a specific link to what exactly you want me to read and try?

            PeterJones @dr ramaanand
            last edited by

            @dr-ramaanand said in How to stop searching or replacing after a string?:

            @PeterJones I went to the FAQ section and searched for, “within a range”

            Sorry, I was on my phone browser at the time, and thought I had given enough information for someone to find it:

            1. FAQ section
            2. I said “find the generic regex”. So search for “generic”. => Generic Regex Formulas
            3. I said “range” but the entry’s name is actually “zone”; the concept is the same, and reading the descriptions from the entry you went to in step 2 should have gotten you to the right place, even if I didn’t use the exact words from the FAQ. So go to Replacing in a specific zone of text

            That page gives the regex formula you will have to use.

              dr ramaanand @PeterJones
              last edited by dr ramaanand

              @PeterJones https://community.notepad-plus-plus.org/topic/22690/generic-regex-replacing-in-a-specific-zone-of-text says to use the RegEx (?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\K(?-si:FR), so perhaps (?-si:<h2|(?!\A)\G)(?s-i:(?!</h2).)*?\K(?-si:</p>\R<p[^<>]*>) will find what I want to find. Then I probably have to use </h2>\r\n<h2> in the “Replace with” field. Please comment!
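For readers who want to try the zone idea outside Notepad++, here is a minimal Python sketch. It is not the FAQ formula itself: Python's standard `re` module has no `\G`, `\K` or `\R`, so the sketch swaps in a two-step approach (locate each zone from BSR up to the next ESR, then replace FR only inside it). The function name `replace_in_zone`, the `\r?\n` line-break pattern and the sample text are assumptions made for illustration.

```python
import re

def replace_in_zone(text, bsr, esr, fr, repl):
    """Replace `fr` with `repl`, but only between `bsr` and the next `esr`."""
    # A zone starts at BSR and runs lazily up to (not including) the next ESR.
    # Unlike the FAQ formula, a BSR with no ESR after it is left untouched.
    zone = re.compile(rf"(?s)(?:{bsr}).*?(?=(?:{esr}))")
    inner = re.compile(fr)
    # Apply the FR replacement to each zone separately.
    return zone.sub(lambda m: inner.sub(repl, m.group(0)), text)

# Hypothetical sample data using \n line endings.
before = (
    "<h2>Line 1 ?</p>\n"
    "<p>Line 2 ?</p>\n"
    "<p>Line 3 ?</h2>\n"
    "<p>Line 4 ?</p>\n"
    "<p>Line 5 ?</p>\n"
)

after = replace_in_zone(
    before,
    bsr=r"<h2",
    esr=r"</h2",
    fr=r"</p>\r?\n<p[^<>]*>",
    repl="</h2>\n<h2>",
)
print(after)
```

Only the first three lines end up wrapped in `<h2>` tags; the paragraphs after the closing `</h2>` are outside the zone and stay as they were.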

                mkupper @dr ramaanand
                last edited by

                @dr-ramaanand, It’s not clear what you are trying to do. For example I saw that:

                • Line 1 starts with an <h2> and ends with an </p>.
                • Line 5 starts with a <p> and ends with an </h2>.

                Is the intent to just find and correct those two lines or are you trying to do more?

                If the goal is to find and correct those two lines then one option is:

                Search: (?i)<([a-z][a-z0-9]*)>([^<]*)</(?!\1>)[a-z][a-z0-9]*>
                Replace: <$1>$2</$1>

                I’ll unpack that, reading from left to right:

                • (?i) - case insensitive matches - this is optional
                • <([a-z][a-z0-9]*)> - Match an HTML tag whose name starts with a letter and is followed by any number of letters or digits, and capture the name as group 1. If you only care about <p> and <h2> then this can be <(p|h2)>.
                • ([^<]*)</ - Capture everything that is not a < as group 2, and then expect a < followed by a /.
                • (?!\1>) - Require that what comes next is not the tag name captured by <([a-z][a-z0-9]*)> (or <(p|h2)> if you use that version) followed by >. Inside the search expression the back-reference is written \1; $1 only refers to group 1 in the replacement. The (?!...) style matches are known as negative lookaheads: they are tested while searching but consume nothing and are not part of the “match” when replacing.
                • [a-z][a-z0-9]*> - Because the (?!...) lookahead consumes nothing, this part does the actual matching of the closing tag name and its >. You could use (p|h2)> here instead if you want.

                The replacement part has:

                • <$1> - Generate a starting tag which is < followed by the first HTML tag, followed by the trailing >.
                • $2 - Generate whatever was between the starting and ending tags.
                • </$1> - Generate a closing tag using the same tag name as the starting tag.
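The same idea can be tried in Python as a rough sketch. Python's `re` writes the back-reference inside the pattern as `\1` and also uses `\1` and `\2` in the replacement; the `>` after `\1` in the lookahead, the function name `balance_tags` and the sample text are assumptions for illustration. Note that this makes every line self-consistent, which is not the same as turning the whole zone into `<h2>` lines.

```python
import re

# Opening tag, text without '<', then a closing tag whose name differs
# from the captured opening name (negative lookahead on group 1).
MISMATCH = re.compile(r"(?i)<([a-z][a-z0-9]*)>([^<]*)</(?!\1>)[a-z][a-z0-9]*>")

def balance_tags(text):
    """Rewrite each mismatched closing tag so it matches its opening tag."""
    return MISMATCH.sub(r"<\1>\2</\1>", text)

# Hypothetical sample: first and last lines are unbalanced, the middle one is fine.
sample = "<h2>first line</p>\n<p>middle line</p>\n<p>last line</h2>\n"
fixed = balance_tags(sample)
print(fixed)
```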
                  PeterJones @dr ramaanand
                  last edited by PeterJones

                  @dr-ramaanand said in How to stop searching or replacing after a string?:

                  Please comment!

                  What is there to comment on? If you plug in those FIND and REPLACE expressions, and then hit REPLACE ALL, it does the replacement you want, to the best of my ability to understand.

                  For example, if it started with

                  <h2>What is the best cure for warts in Boston without cutting ?</p>
                  <p>What is the best cure for warts without cutting in Boston ?</p>
                  <p>What is the best cure for warts in Boston without burning ?</p>
                  <p>What is the best cure for warts without burning in Boston ?</p>
                  <p>What is the best warts treatment in Boston without cutting ?</h2>
                  <p>What is the best cure for warts in Boston without cutting ?</p>
                  <p>What is the best cure for warts without cutting in Boston ?</p>
                  <p>What is the best cure for warts in Boston without burning ?</p>
                  <p>What is the best cure for warts without burning in Boston ?</p>
                  <p>What is the best warts treatment in Boston without cutting ?</p>
                  

                  clicking REPLACE ALL with the expressions you figured out from the formula gives me

                  <h2>What is the best cure for warts in Boston without cutting ?</h2>
                  <h2>What is the best cure for warts without cutting in Boston ?</h2>
                  <h2>What is the best cure for warts in Boston without burning ?</h2>
                  <h2>What is the best cure for warts without burning in Boston ?</h2>
                  <h2>What is the best warts treatment in Boston without cutting ?</h2>
                  <p>What is the best cure for warts in Boston without cutting ?</p>
                  <p>What is the best cure for warts without cutting in Boston ?</p>
                  <p>What is the best cure for warts in Boston without burning ?</p>
                  <p>What is the best cure for warts without burning in Boston ?</p>
                  <p>What is the best warts treatment in Boston without cutting ?</p>
                  

                  That is the only logical behavior I can come up with from your initial description: since your initial data showed 10 lines of input and 13 lines of output, I had to assume you were just giving the general idea, and not an exact “before” and “after”, because that “after” would be impossible. And when I take the expression from your original follow-on post and run it on the original “before” multiple times (until it stops finding matches), I am left with ten lines identical to the “after” that I showed.

                  So that one regex with hitting REPLACE ALL once does the same thing as your original regex with four REPLACE ALLs.

                  So again, why ask for comment. It works.

                    mkupper @PeterJones
                    last edited by

                    @PeterJones said in How to stop searching or replacing after a string?:

                    So again, why ask for comment. It works.

                    I think I now understand what @dr-ramaanand was struggling with which was
                    Search: \?</p>\R<p[^<>]*>(.*)(?=<\/h2)
                    Replace: ?</h2>\r\n<h2>$1

                    The OP was looking for a line that ends with ?</p>, followed by a line that starts with <p> (done in a convoluted way) and in turn ends with </h2>. The ending match was a lookahead.

                    In the replacement part the end of that first line, which had ended in </p> is replaced with </h2> and that second line which started with <p> now starts with <h2>.

                    If the OP does the search/replace over and over, it ends up walking the unbalanced p / h2 lines up one line with each round of search/replace without ever fixing the underlying issue.

                    The OP’s expression also did not detect nor fix line 1 with its unbalanced <h2> ...</p> until the search/replace had been repeated enough times that it accidentally made line 1 become a balanced <h2> ...</h2> as the lower down unbalanced line was walked up to lines 1 and 2.

                    The puzzle is that in the OP’s “after” text block three additional lines were added between the expected lines 5 and 6. I suspect the OP, in the fog of war, was copy/pasting the blocks of lines and then trying various expressions.
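The "walking up one line per round" behaviour can be reproduced with a short Python sketch. It uses the OP's expression with `\R` spelled as `\r?\n`, because Python's `re` has no `\R`; the helper name `replace_all_until_stable` and the six-line sample are assumptions for illustration, and Python's engine is not the one Notepad++ uses.

```python
import re

# The OP's search expression, with \R written as \r?\n for Python's `re`.
OP_FIND = re.compile(r"\?</p>\r?\n<p[^<>]*>(.*)(?=</h2)")
# The OP's replacement, using \n line endings and \1 for the captured text.
OP_REPL = "?</h2>\n<h2>\\1"

def replace_all_until_stable(text):
    """Repeat 'Replace All' until nothing matches; return (text, passes)."""
    passes = 0
    while True:
        text, count = OP_FIND.subn(OP_REPL, text)
        if count == 0:
            return text, passes
        passes += 1

# Hypothetical sample: a five-line zone followed by one ordinary paragraph.
before = "".join([
    "<h2>Q1 ?</p>\n",
    "<p>Q2 ?</p>\n",
    "<p>Q3 ?</p>\n",
    "<p>Q4 ?</p>\n",
    "<p>Q5 ?</h2>\n",
    "<p>Q6 ?</p>\n",
])

after, passes = replace_all_until_stable(before)
print(passes)
print(after)
```

Each pass finds only the one pair of lines where the second line contains `</h2`, so a five-line zone needs four passes before every line is balanced.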

                      dr ramaanand @mkupper
                      last edited by dr ramaanand

                      @mkupper @PeterJones Akismet.com wasn’t letting me post all the 13 lines, so I deleted some lines. I however, did not delete the 3 lines from the output results. I am sorry about that. I will check out the RegEx you helped me figure out as soon as I get back home. Thanks a lot!

                        dr ramaanand @dr ramaanand
                        last edited by

                        @PeterJones it is working perfectly. Thanks a lot! For your information, www.regex101.com suggests to use the RegEx, “(?-si:<h2|(?!\A)\G)(?s-i:(?!<\/h2).)*?\K(?-si:<\/p>\R<p[^<>]*>)”

                          Terry R @dr ramaanand
                          last edited by Terry R

                          @dr-ramaanand
                          You need to understand that there are many flavours of regular expression engines used in the world. Regex101.com can test using some of them, however I don’t think it has access to the exact regex engine used with Notepad++.

                          The site itself is very useful though in describing a particular regular expression, but don’t rely on the expression defined as correct on that site working in Notepad++.

                          Read the FAQ post on it here.

                          Terry

                          PS your regex changes equated to “escaping” what regex101.com described as meta-characters (special characters). To see the actual special characters as used within Notepad++ see the online manual reference here. Note the / is not one of those special characters and therefore does not need escaping with Notepad++. The regex will however still work as the “escape” is basically ignored for a normal character. However it’s not a good idea to just use a regex as confirmed by regex101.com without fully understanding it as it could be possible for the regex to make unwanted changes if used in Notepad++
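As a small illustration of the point about escaping, here is a Python sketch comparing `</h2` with `<\/h2`. Python's `re` is yet another regex flavour, so this only shows that this particular engine also treats `\/` as a plain `/`; it says nothing definitive about Notepad++ itself. The sample text is made up.

```python
import re

# Hypothetical sample text with one </h2> closing tag.
text = "<p>one</h2>\n<p>two</p>\n"

plain = re.findall(r"</h2", text)      # '/' written as-is
escaped = re.findall(r"<\/h2", text)   # '/' escaped, as regex101 suggested

# Both patterns find exactly the same matches.
print(plain == escaped)
```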

                            guy038
                            last edited by guy038

                            Hi, @dr-ramaanand, @peterjones, @mkupper, @terry-R and All,

                            @dr-ramaanand, in a nutshell, I would say :

                            From this INPUT text, relative to the beginning of our GNU license, pasted in a new tab :

                            <h2>The licenses for most software are designed to take away your freedom to share and change it.</p>
                            <p>By contrast, the GNU General Public License is intended to guarantee your freedom to share</p>
                            <p>and change free software--to make sure the software is free for all its users.</p>
                            <p>This General Public License applies to most of the Free Software Foundation's software</p>
                            <p>and to any other program whose authors commit to using it.</h2>
                            <p>(Some other Free Software Foundation software is covered by the GNU Library General</p>
                            <p>Public License instead) You can apply it to your programs, too.</p>
                            <p>When we speak of free software, we are referring to freedom, not price.</p>
                            <h2>Our General Public Licenses are designed to make sure that you have the freedom to distribute</p>
                            <p>copies of free software (and charge for this service if you wish), that you receive source code</p>
                            <p>or can get it if you want it, that you can change the software or use pieces of it in new</p>
                            <p>free programs; and that you know you can do these things.</h2>
                            <p>To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights.</p>
                            <p>These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it.</p>
                            
                            <p>For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have.</p>
                            <p>You must make sure that they, too, receive or can get the source code</p>
                            <h2>and you must show them these terms so they know their rights.</p>
                            <p>We protect your rights with two steps: (1) copyright the software, and (2) offer you this</p>
                            <p>license which gives you legal permission to copy, distribute and/or modify the software.</h2>
                            
                            <h2>Also, for each author's protection and ours, we want to make certain that</p>
                            <p>everyone understands that there is no warranty for this free software.</h2>
                            <p>If the software is modified by someone else and passed on, we want its recipients to know that what they have</p>
                            <p>is not the original, so that any problems introduced by others will not reflect on the original authors' reputations.</p>
                            

                            Here is the minimal syntax of the generic S/R to get what you expect :

                            • Move to the very beginning of the file ( Ctrl + Home ) (IMPORTANT )

                            • Open the Replace dialog

                            • Untick all box options

                            • SEARCH (?-i:<h2|(?!\A)\G)(?s-i:(?!</h2).)*?\K(?-i:/p>\R<p)

                            • REPLACE /h2>\r\n<h2

                            • Select the Regular expression search mode

                            • Click on the Replace All button

                            => Voila ! You should get your expected OUTPUT text :

                            <h2>The licenses for most software are designed to take away your freedom to share and change it.</h2>
                            <h2>By contrast, the GNU General Public License is intended to guarantee your freedom to share</h2>
                            <h2>and change free software--to make sure the software is free for all its users.</h2>
                            <h2>This General Public License applies to most of the Free Software Foundation's software</h2>
                            <h2>and to any other program whose authors commit to using it.</h2>
                            <p>(Some other Free Software Foundation software is covered by the GNU Library General</p>
                            <p>Public License instead) You can apply it to your programs, too.</p>
                            <p>When we speak of free software, we are referring to freedom, not price.</p>
                            <h2>Our General Public Licenses are designed to make sure that you have the freedom to distribute</h2>
                            <h2>copies of free software (and charge for this service if you wish), that you receive source code</h2>
                            <h2>or can get it if you want it, that you can change the software or use pieces of it in new</h2>
                            <h2>free programs; and that you know you can do these things.</h2>
                            <p>To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights.</p>
                            <p>These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it.</p>
                            
                            <p>For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have.</p>
                            <p>You must make sure that they, too, receive or can get the source code</p>
                            <h2>and you must show them these terms so they know their rights.</h2>
                            <h2>We protect your rights with two steps: (1) copyright the software, and (2) offer you this</h2>
                            <h2>license which gives you legal permission to copy, distribute and/or modify the software.</h2>
                            
                            <h2>Also, for each author's protection and ours, we want to make certain that</h2>
                            <h2>everyone understands that there is no warranty for this free software.</h2>
                            <p>If the software is modified by someone else and passed on, we want its recipients to know that what they have</p>
                            <p>is not the original, so that any problems introduced by others will not reflect on the original authors' reputations.</p>
                            

                            Best Regards,

                            guy038

                              guy038
                              last edited by guy038

                              Hello, @dr-ramaanand and All,

                              I said :

                              Here is the minimal syntax of the generic S/R …

                              However, I received a message from @terry-R, by chat, who seemingly has found a shorter syntax than mine, which looks like the true minimal possible syntax ;-))


                              @terry-r, from my previous solution :

                              • SEARCH (?-i:<h2|(?!\A)\G)(?s-i:(?!</h2).)*?\K(?-i:/p>\R<p)

                              • REPLACE /h2>\r\n<h2

                              I could have slightly changed this S/R to :

                              • SEARCH (?-i:<h2>|(?!\A)\G)(?s:(?!<).)*?\K(?-i:</p>(\R)<p>)

                              • REPLACE </h2>$1<h2>

                              Which, in turn, can be expressed as :

                              • SEARCH (?-i:<h2>|(?!\A)\G)[^<]+\K(?-i:</p>(\R)<p>)

                              • REPLACE </h2>$1<h2>

                              If we assume that there is a single < char near the very end of each line ( remember that, AFTER the replacement, the regex position is RIGHT AFTER the string <p> )


                              Note that this new solution is quite close to your solution, shown in your chat, if we use neither non-capturing groups nor the -i modifier to ensure the search of valid HTML tags :

                              • SEARCH (<h2>|\G)[^<]+\K</p>(\R)<p>

                              • REPLACE </h2>${2}<h2>

                              How does this new formulation work with my example ?

                              SEARCH   (?x) (?-i: <h2> | (?!\A) \G )    [^<]+    \K  (?-i: </p> ( \R ) <p> )
                                                  ----                  -----              ---------------
                                                  BSR                 ESR = '<'                  FR
                              
                              
                              First, as we start at VERY BEGINNING of file, the regex looks for a FIRST '<h2>' string
                              
                              
                              <h2>The licenses for most software are designed to take away your freedom to share and change it.  </p>
                              <h2>-----------------------------------------[^<]+-----------------------------------------------\K</p>CRLF
                              
                              <p>  By contrast, the GNU General Public License is intended to guarantee your freedom to share  </p>
                              <p>\G-----------------------------------------[^<]+--------------------------------------------\K</p>CRLF
                              
                              <p>  and change free software--to make sure the software is free for all its users.  </p>
                              <p>\G-----------------------------------------[^<]+--------------------------------\K</p>CRLF
                              
                              <p>  This General Public License applies to most of the Free Software Foundation's software  </p>
                              <p>\G--------------------------------------------------------------------------------------\K</p>CRLF
                              
                              <p> and to any other program whose authors commit to using it. </h2>
                              <p>
                              
                                  ( On THIS line, impossible to find the string '</p>CRLF<p>', near the END of line, so the NEXT \G syntax is NOT true anymore
                                    Thus, NO MORE replacement occurs till a NEW '<h2>' string happens and the NEXT THREE lines remained UNTOUCHED ! )
                              
                              <p>(Some other Free Software Foundation software is covered by the GNU Library General</p>
                              <p>Public License instead) You can apply it to your programs, too.</p>
                              <p>When we speak of free software, we are referring to freedom, not price.</p>
                              
                              <h2>Our General Public Licenses are designed to make sure that you have the freedom to distribute  </p>
                              <h2>-----------------------------------------[^<]+-----------------------------------------------\K</p>CRLF...
                              
                              On this LAST line, the RE-SYNCHRONIZATION occurs because a '<h2>' string is found and the cycle RESTARTS !
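The walk traced above can also be sketched as a small line-by-line state machine in Python. This is an emulation, not the regex itself, since Python's `re` has no `\G` or `\K`. It assumes every affected line is exactly one opening tag, text without `<`, and one closing tag; the function name `promote_chain` and the sample lines are hypothetical.

```python
import re

# One line = opening tag, text without '<', closing tag (assumed layout).
LINE = re.compile(r"<(h2|p)>([^<]+)</(h2|p)>")

def promote_chain(lines):
    """Turn <p> lines into <h2> lines while the chain started by <h2> holds."""
    out = []
    chained = False  # True when the previous line handed over via </p> + <p>
    for idx, line in enumerate(lines):
        m = LINE.fullmatch(line)
        if m is None:
            # Blank or differently shaped line: the chain is broken.
            out.append(line)
            chained = False
            continue
        opener, body, closer = m.groups()
        # A literal <h2> (re)starts the chain; `chained` plays the role of \G.
        active = opener == "h2" or chained
        nxt = lines[idx + 1] if idx + 1 < len(lines) else ""
        # The chain continues only if this line ends in </p> and the next starts with <p>.
        hands_over = active and closer == "p" and nxt.startswith("<p>")
        if chained:
            opener = "h2"   # the previous replacement rewrote this <p> to <h2>
        if hands_over:
            closer = "h2"   # this replacement rewrites </p> to </h2>
        out.append(f"<{opener}>{body}</{closer}>")
        chained = hands_over
    return out

# Hypothetical sample with two zones separated by an ordinary paragraph.
sample = [
    "<h2>A</p>", "<p>B</p>", "<p>C</h2>",
    "<p>D</p>",
    "<h2>E</p>", "<p>F</h2>",
    "<p>G</p>",
]
result = promote_chain(sample)
print("\n".join(result))
```

Lines D and G stay untouched because no chain is active when they are reached, which mirrors the re-synchronization on the next `<h2>` described above.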
                              

                              Best Regards,

                              guy038

                                dr ramaanand @guy038
                                last edited by dr ramaanand

                                @guy038 there is only one, “<h2>” and one, “</h2>” in the file, so what @PeterJones helped me, “guess”, based on what you typed in an earlier thread seems to be good enough

                                  dr ramaanand @PeterJones
                                  last edited by dr ramaanand

                                  @PeterJones using “(?-si:<h2|(?!\A)\G)(?s-i:(?!<\/h2).)*?\K(?-si:<\/p>\R<p[^<>]*>)” in the “Find what” field and “</h2>\r\n<h2>” in the “Replace with” field is working perfectly. Thank you very much. Happy New Year!

                                    dr ramaanand @dr ramaanand
                                    last edited by

                                    @PeterJones @guy038 I am sorry to say that I tested this on my laptop just now as I was busy with the New Year celebrations and observed that the RegEx which Peter Jones told me to use is also replacing only one </p> and one <p.............. > on the next line instead of replacing all at once.

                                      PeterJones @dr ramaanand
                                      last edited by PeterJones

                                      @dr-ramaanand said in How to stop searching or replacing after a string?:

                                      @PeterJones @guy038 I am sorry to say that I tested this on my laptop just now as I was busy with the New Year celebrations and observed that the RegEx which Peter Jones told me to use is also replacing only one </p> and one <p.............. > on the next line instead of replacing all at once.

                                      I’m sorry to say that my experience is different than yours. Replace All with the regex I suggested works all at once.

                                        dr ramaanand @PeterJones
                                        last edited by dr ramaanand

                                        @PeterJones it looks like you took a long time to reply, so I asked at www.regex101.com and was told to use, “(?-si:<h2[^<>]*+>|\G)[^<>]*+\K<\/p>\s*+<p[^<>]*+>” which is replacing all that I want replaced, all at once!

                                            PeterJones @dr ramaanand
                                            last edited by PeterJones

                                            @dr-ramaanand ,

                                            It was less than an hour. If that’s a “long time” in your mind, then using a free, Community-based service is probably not the right question/answer format for you. I would suggest finding someone to pay to give you instant 24/7/365.2425 support. Because you aren’t going to find anything guaranteed faster in any free online Q&A site.

                                            And as we’ve told you before, regex101 uses a different flavor of regex engine than Notepad++ does, so there is no guarantee that a regex suggested by that site will be compatible with Notepad++. Use at your own risk.

                                            Further, as you have reported multiple times, and as my video showed, the regex shown does work (as you once said, “perfectly”), and you didn’t need my video to know that. If you want to use a different regex, that’s fine, go ahead and use whatever “works” for you. But don’t pretend (and publicly state) that it was a deficiency in the regex already given to you or a “slow response” from me or any of the other regulars here who answer questions out of the kindness of our hearts.
