    Regex: Select only the first instance of search results / first match

        • Vasile Caraus

          So, in conclusion, here are all the regexes I collected from the last conversation:

          Select and replace the first instance:

          SEARCH: (?s)\A.*?<tr>\s*\K.*?(\s*</tr>)(?=$)
          REPLACE BY: NEW CONTENT $1

          or

          SEARCH: (?s)\A.*?<tr>\s*\K.*?(\s*</tr>)
          REPLACE BY: NEW CONTENT $1

          Select and replace the last instance:

          SEARCH: (?s)<tr>.*</tr>.*?<tr>\K.+?(?=</tr>.*?\z)
          REPLACE BY: \r NEW CONTENT \r

          or

          SEARCH: (?s)\A.*<tr>\K.+?(?=</tr>)
          REPLACE BY: \r NEW CONTENT \r

          (Neither of these two searches has a capture group, so $1 would be empty and is left out of the replacement.)

          It works. Thanks a lot, friends.

          • Alan Kilborn

            This all seems rather “special case”.
            This <tr> and </tr> junk…

            To be generic, that is, a roadmap for other interested parties to use, why not specify it like this:


            Match only the first occurrence in a file of a regular expression RE:

            (?s)\A.*?\KRE


            Match the last occurrence of a regular expression RE:

            (?s)\A.*(RE).*?\K\1


            Of course, clearly the RE has to be something a bit more specific than (example) .., but these seem to mostly work to achieve the goal.
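
For readers who want to try the two generic recipes outside Notepad++, here is a minimal Python sketch. It is an emulation, not the Notepad++ behaviour itself: Python's re module has no \K, so a capture group around RE stands in for it. The helper names first_span and last_span and the sample text are made up for illustration.

```python
import re

def first_span(pattern, text):
    # Emulates (?s)\A.*?\KRE : the lazy .*? stops at the first place RE can match.
    m = re.search(r"(?s)\A.*?(" + pattern + ")", text)
    return m.span(1) if m else None

def last_span(pattern, text):
    # Emulates (?s)\A.*\KRE : the greedy .* runs to the end of the text,
    # then backtracks until RE matches, which yields the last occurrence.
    m = re.search(r"(?s)\A.*(" + pattern + ")", text)
    return m.span(1) if m else None

text = "id=17; id=204; id=9"
print(first_span(r"id=\d+", text))  # (0, 5)
print(last_span(r"id=\d+", text))   # (15, 19)
```

As noted above, RE needs to be reasonably specific: because the greedy form backtracks from the end, a loose pattern such as \d+ would report only the final digit of the last number.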

            • guy038

              Hello, @vasile-caraus, @Terry-R, @alan-kilborn, @peterjones and All,

              IMPORTANT : I wrote this post after reading only the posts from the "4 YEARS LATER" banner up to @peterjones’s post, below :

              https://community.notepad-plus-plus.org/post/62964

              But I am going to add a second post, after reading the most recent solutions ! Sorry for my incomplete work !


              First, @vasile-caraus, I totally agree with @alan-kilborn’s comment on your attitude ! It was not very fair or nice to @Terry-R, who was trying to help you :-((

              Seemingly, you know quite well, by now, the power of regexes for text manipulation. And if you had seriously studied some regex tutorials, you would not have written the regex (?s)\z.*?<tr>\s*\K.*?(\s*</tr>), which is complete nonsense : \z matches only at the very end of the file, so the <tr> that follows it can never match !

              For instance, from the two pages of the Regular-expressions.info site, below, you would have understood, at once, that the \z syntax always comes at the very end of a regex or, possibly, just before an alternation symbol | !!

              https://www.regular-expressions.info/anchors.html

              https://www.regular-expressions.info/refanchors.html


              Now, I slightly simplified @peterjones’s search regex, which searches for the first element <tr> ••••• </tr> of an HTML page :

              SEARCH (?s-i)\A.*?<tr>\K.*?(?=</tr>)

              Then, depending on your replacement expression :

              • If it is Here is the NEW text, you’ll get the simple text
               <tr>Here is the NEW text</tr>
              
              • If it is \r\nHere is the NEW text\r\n, the output text will be :
              <tr>
              Here is the NEW text
              </tr>
              
              In both cases :

              • Tick the Wrap around option

              • Click on the Replace All button, exclusively !
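
A rough way to check the two replacement variants outside Notepad++ is the Python sketch below. It rests on assumptions: Python's re module has no \K, so the text up to and including the first <tr> is captured and written back, and the two-row html string and the helper name replace_first_row are invented for the example.

```python
import re

html = "<table>\n<tr>\nold A\n</tr>\n<tr>\nold B\n</tr>\n</table>\n"

# Python's re has no \K, so the prefix up to the first <tr> is captured
# in group 1 and written back in front of the new text.
first_tr = re.compile(r"(?s)\A(.*?<tr>).*?(?=</tr>)")

def replace_first_row(text, new_content):
    # count=1 is a safeguard; \A already limits the pattern to one match.
    return first_tr.sub(lambda m: m.group(1) + new_content, text, count=1)

plain = replace_first_row(html, "Here is the NEW text")
wrapped = replace_first_row(html, "\r\nHere is the NEW text\r\n")

print(plain.splitlines()[1])    # <tr>Here is the NEW text</tr>
print("old B" in plain)         # True: the second row is untouched
```

A function is used as the replacement so that the new text is inserted literally, without Python interpreting backslashes in it.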


              Now, to search for the last element <tr> ••••• </tr>, of an HTML page, use the following regex :

              SEARCH (?s-i)<tr>\K((?!<tr>).)*?(?=</tr>((?!<tr>).)*?\z)

              Note that I use exactly the scheme proposed by @Peterjones :

              
              - find from <tr> to </tr> ( NOT included )     =>    (?s-i)<tr>\K ••[1]•• (?=</tr> ••[2]•• [3])
              - [1] WITHOUT any contained <tr>               =>    ((?!<tr>).)*?
              - [2] FOLLOWED by anything that’s NOT a <tr>   =>    ((?!<tr>).)*?
              - [3] until the VERY END of the file           =>    \z
              

              To All :

              You could ask me : why the regex to search for the last <tr> ••••• </tr> block is more complicated than the one to search for the first one ?

              This is because of the general direction used by the regex engine : from LEFT to RIGHT !

              • Indeed, when we search for (?s-i)\A.*?<tr>, part of the first regex, the range of any char (?s).* with the lazy quantifier ? is then extended to the first occurrence of the string <tr> and means that, necessarily, this range cannot contain any <tr> inside !

              • Similarly, the regex (?s).*?(?=</tr>) would search for any range of any char, possibly empty, till the nearest string </tr>, meaning, implicitly, that this range of chars cannot contain a </tr> string

              • Whereas, when searching the last <tr> ••••• </tr> block, as our reference is the anchor \z ( very end of current file ), we must build up the regex, using a kind of back-propagation method :

                • Starting from the very end of file

                • Moving back, through characters without any <tr> string

                • Till a </tr> string

                • Moving back, again, through characters without any <tr> string

                • Till a <tr> string

              Of course, I assume that any <tr> correctly ends with </tr> !

              Test these two regexes against this sample, derived from Peter’s, which contains 4 blocks <tr> •••• </tr> :

              <html><body>
              <table>
              <tr>
              get rid of stuff, in case of \A anchor, including <embedded/> <tags/>
              </tr>
              <tr>
              keep stuff including <embedded/> <tags/>
              </tr>
              <tr>
              keep stuff including <embedded/> <tags/>
              </tr>
              <tr>
              get rid of stuff, in case of \z anchor, including <embedded/> <tags/>
              </tr>
              </table>
              </body>
              </html>
              

              The first regex, with the \A syntax, should replace only the first block, and the last regex, with the \z syntax, should replace the fourth and last <tr> block.
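
The last-block regex can be tried outside Notepad++ with the Python sketch below. It is only an approximation under stated assumptions: the sample is a shortened version of the one above, a capture group replaces \K (which Python's re lacks), and Python's \Z is used where the Notepad++ regex has \z, since in Python \Z matches only at the very end of the string.

```python
import re

sample = """<html><body>
<table>
<tr>
row 1
</tr>
<tr>
row 2
</tr>
<tr>
row 3
</tr>
<tr>
row 4
</tr>
</table>
</body>
</html>
"""

# Group 2 is the content of a <tr> that holds no nested <tr> and is
# followed, after its </tr>, by no further <tr> up to the end of the text.
last_tr = re.compile(r"(?s)(<tr>)((?:(?!<tr>).)*?)(?=</tr>(?:(?!<tr>).)*?\Z)")

m = last_tr.search(sample)
print(repr(m.group(2)))   # '\nrow 4\n'

# Swap in new content for the last row only.
updated = sample[:m.start(2)] + "\nNEW\n" + sample[m.end(2):]
```

The search fails at the first three <tr> tags because the tempered part (?:(?!<tr>).)*? cannot step over the next <tr>, so the end of the text is never reached from there.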

              Best Regards,

              guy038

              P.S. :

              @vasile-caraus, note that I, and probably everyone involved in this discussion, am willing to help you if you have difficulty understanding a specific part of a regex tutorial that you have decided to study. A different perspective will certainly be very useful to you … and others ;-))

              • guy038

                Hi, @vasile-caraus, @Terry-R, @alan-kilborn, @peterjones and All,

                My God !! Of course, the @terry-r’s regex is just magic and so simple ! Congratulations, Terry ;-)) How could we not think of it ??

                If I adapt Terry’s concept to the regexes of my previous post, everything becomes crystal clear :

                SEARCH (?s-i)\A.*?<tr>\K.*?(?=</tr>) to search ( and replace ) the first <tr> ••••• </tr> block

                SEARCH (?s-i)\A.*<tr>\K.*?(?=</tr>) to search ( and replace ) the last <tr> ••••• </tr> block

                As usual, tick the Regular expression and Wrap around options and click on the Replace All button, exclusively
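
The effect of that single ? can be seen in the small Python sketch below. It is an illustration under assumptions: a capture group stands in for \K, which Python's re does not support, and the one-line rows string is made up.

```python
import re

rows = "<tr>A</tr><tr>B</tr><tr>C</tr>"

# Lazy .*? : stops at the first <tr> it meets.
lazy = re.search(r"(?s)\A.*?<tr>(.*?)(?=</tr>)", rows)
# Greedy .* : runs to the end, then backtracks to the last <tr>.
greedy = re.search(r"(?s)\A.*<tr>(.*?)(?=</tr>)", rows)

print(lazy.group(1), lazy.start(1))      # A 4
print(greedy.group(1), greedy.start(1))  # C 24

# The two anchors agree with a plain first/last substring search.
print(rows.find("<tr>"), rows.rfind("<tr>"))  # 0 20
```

In other words, \A.*?<tr> behaves like a "find first" and \A.*<tr> like a "find last" for the literal <tr>.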


                @vasile-caraus, this demonstrates, in a masterful way, that things can be skillfully solved by other people than me and moreover… by @terry-r !!


                Now, @alan-kilborn you said :

                Match the last occurrence of a regular expression RE:

                (?s)\A.*(RE).*?\K\1

                But, unless I’m mistaken, doesn’t this regex, below, do the same search ?

                (?s)\A.*\KRE

                Best regards,

                guy038

                • Terry R

                  @guy038 said in Regex: Select only the first instance of search results / first match:

                  Hi, @vasile-caraus, @Terry-R, @alan-kilborn, @peterjones and All,
                  My God !! Of course, the @terry-r’s regex is just magic and so simple !

                  I feel like I’m being rewarded for something I stole... er, borrowed now. ;-)) All I did was point out the marvellous creation of @PeterJones and how, by the absence of a single character, it turns one thing into another.

                  But hey, I’m happy that collectively we can show there are many answers, all work in various ways.

                  Terry

                  • Alan Kilborn @guy038

                    @guy038 said in Regex: Select only the first instance of search results / first match:

                    But, unless I’m mistaken, doesn’t this regex, below, do the same search ?
                    (?s)\A.*\KRE

                    Yes, indeed.
                    That’s what I get for dabbling in the area of another master! :-)
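
The two forms can be compared with the Python sketch below. It is an emulation under assumptions: capture groups stand in for \K, which Python's re lacks, and the function names and test strings are invented. In this emulation the two forms agree when the matched text occurs at least twice, but the backreference form finds nothing when RE occurs only once, because \1 needs a second copy of the captured text.

```python
import re

def backref_last(pattern, text):
    # Emulates (?s)\A.*(RE).*?\K\1 : group 2 is the text matched after \K.
    m = re.search(r"(?s)\A.*(" + pattern + r").*?(\1)", text)
    return m.span(2) if m else None

def simple_last(pattern, text):
    # Emulates (?s)\A.*\KRE
    m = re.search(r"(?s)\A.*(" + pattern + ")", text)
    return m.span(1) if m else None

many = "x foo y foo z foo"
once = "x foo y"

print(backref_last("foo", many), simple_last("foo", many))  # (14, 17) (14, 17)
print(backref_last("foo", once), simple_last("foo", once))  # None (2, 5)
```

This suggests the shorter form is also the more robust of the two.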

                    • Vasile Caraus @guy038

                      @guy038 thanks a lot !

                      • dr ramaanand @Vasile Caraus

                        @Vasile-Caraus The regular expression (?s)\A.*?\Kstring(?:.*?)?> helps find the very first occurrence of a string and if you want to find the first occurrence of a tag, say TAG_2, AFTER the first occurrence of another tag, say TAG_1, my generic regex becomes :

                        (?s-i)\A.*?<TAG_1(?: .*?)?>.*?\K<TAG_2(?: .*?)?> as per @guy038
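
A minimal Python sketch of the "first TAG_2 after the first TAG_1" idea follows. The assumptions are: div and p are arbitrary stand-ins for TAG_1 and TAG_2, the page string is invented, a capture group replaces \K, and only (?s) is used because, as far as I know, Python's re does not accept the global (?s-i) form (matching is case-sensitive by default anyway).

```python
import re

page = '<div class="a"><p>one</p></div><p id="x">two</p>'

# First <p ...> tag that comes after the first <div ...> tag.
# (?: .*?)? allows optional attributes; the group captures what \K would keep.
pat = re.compile(r'(?s)\A.*?<div(?: .*?)?>.*?(<p(?: .*?)?>)')

m = pat.search(page)
print(m.group(1), m.start(1))   # <p> 15
```

Requiring either a space or the closing > right after the tag name also keeps <p from matching a longer tag name such as <pre>.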

                        • dr ramaanand @dr ramaanand

                          On testing the above, I observed that both the above regular expressions work only for tags or strings that begin with a < and end with a > - so if you are searching for a string between inverted commas, to find the first string, you should use the regular expression (?s)\A.*?\K"string(?:.*?)?"
