    REGEX - Select everything before a particular word, including the line with the word?

    • Neculai I. Fantanaru

      Hello. I want to use a regex to select everything before a particular word (or before an expression), including the line that contains that word, across the entire text document.

      Suppose I have:


      text
      text
      text
      WORD
      text
      text


      How can I delete everything before WORD, including the line with WORD?

      • guy038

        Hello Neculai I. Fantanaru,

        Not very difficult !

        So let’s suppose you have the following text :

        Line 1
        Line 2
        Line 3
        Line 4
        This expression WORD is contained in this line 5
        Line 6
        Line 7
        Line 8
        WORD begins that line 9
        Line 10
        Line 11
        Line 12
        

        I think that there are four possible versions of your initial question :

        • How to delete all text from the beginning of the current file up to and including the FIRST line containing the string “WORD” ( A )
        • How to delete all text from the beginning of the current file up to and including the LAST line containing the string “WORD” ( B )
        • How to delete all text from the cursor location in the current file up to and including the FIRST line containing the string “WORD” ( C )
        • How to delete all text from the cursor location in the current file up to and including the LAST line containing the string “WORD” ( D )

        Remark : For cases C / D, in order to delete the whole line where the cursor is located, just move the cursor to the very beginning of that line before running the S/R operation

        No worry :-) For each case, of course, there’s a suitable regex to perform ! So :

        • Open the Replace dialog ( CTRL + H )

        • Check the Match case option, if necessary

        • Check, preferably, the Wrap around option

        • Select, of course, the Regular expression search mode

        • In the Find what zone, type in, according to the wanted case :

          • \A(?s).*?WORD(?-s).*\R for case ( A )

          • \A(?s).*WORD(?-s).*\R for case ( B )

          • (?s).*?WORD(?-s).*\R for case ( C )

          • (?s).*WORD(?-s).*\R for case ( D )

        • Leave the Replace with zone EMPTY

        • Click, once, on the Find Next button, then click on the Replace button

        Et voilà !
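For anyone who wants to check cases ( A ) and ( B ) outside Notepad++, here is a minimal Python sketch. It only approximates the Notepad++ engine: Python's `re` module has no `\R`, so the `EOL` constant below is a stand-in I made up for it, and the scoped group `(?s:...)` replaces the `(?s)` / `(?-s)` toggles.

```python
import re

# Sample text from the post above.
text = (
    "Line 1\nLine 2\nLine 3\nLine 4\n"
    "This expression WORD is contained in this line 5\n"
    "Line 6\nLine 7\nLine 8\n"
    "WORD begins that line 9\n"
    "Line 10\nLine 11\nLine 12\n"
)

# Hypothetical stand-in for \R (Python's re has no \R).
EOL = r"(?:\r\n|\n|\r)"

# Case A: lazy, so the match stops at the FIRST line containing WORD.
case_a = re.compile(r"\A(?s:.*?)WORD.*" + EOL)
# Case B: greedy, so the match extends to the LAST line containing WORD.
case_b = re.compile(r"\A(?s:.*)WORD.*" + EOL)

after_a = case_a.sub("", text, count=1)
after_b = case_b.sub("", text, count=1)

print(after_a)  # starts at "Line 6"
print(after_b)  # starts at "Line 10"
```

Cases ( C ) and ( D ) depend on the cursor position; in Python that roughly corresponds to dropping the `\A` and passing a start offset to `Pattern.search(text, pos)`.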

        Notes :

        • The \A matches the zero-length position at the very beginning of the file

        • The (?s) modifier means that further dot symbol(s) will match, absolutely, any character ( Standard or EOL characters )

        • The (?-s) modifier means that further dot symbol(s) will match, ONLY the standard characters, ( usual form )

        • Note that when the . matches newline option is checked, an invisible (?s) is assumed in front of the whole regex. However, using the two modifiers, (?s) and (?-s), inside the regex gives finer control, because each modifier only affects the dots that follow it

        • The \R syntax represents any kind of EOL characters ( \r\n in Windows files, \n in Unix/OSX files or \r in old MAC files )

        • Remember, for instance, that :

          • The regex 0.*9 matches the longest string, beginning with a 0 digit and ending with a 9 digit

          • The regex 0.*?9 matches the shortest string, beginning with a 0 digit and ending with a 9 digit. That’s why there is NO other digit 9 inside that string
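The greedy versus lazy difference is easy to see on a tiny string. A small Python sketch (the sample string is made up for illustration):

```python
import re

sample = "a0b9c9d"

# Greedy: runs from the first 0 to the LAST 9 on the line.
greedy = re.search(r"0.*9", sample).group()
# Lazy: runs from the first 0 to the NEAREST following 9.
lazy = re.search(r"0.*?9", sample).group()

print(greedy)  # 0b9c9
print(lazy)    # 0b9
```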

        Best Regards,

        guy038

        • Neculai I. Fantanaru

          hello. Thanks, works fine.

          And I did find another 2 solutions, just as good as yours:

          [[\s\S]*.*(WORD).*[\s\S]]

          or

          ((?s)((^.*)WORD))(.*$)

          • Neculai I. Fantanaru

            Hello guy038. Can you help me? What would the regex be if I want to delete everything after a particular word?

            suppose ex:

            word
            word
            bla bla word_to_delete
            word
            word

            I want to delete the line with “word_to_delete” and everything after that line. How can I do that?

            • guy038

              Hi, Neculai,

              Again, two interpretations, of your question, are possible, because your file may contain several lines, with that specific “word_to_delete”

              So, let’s use the example text, below, where the simple string “ABC” stands for your expression “word_to_delete”

              Line 1
              Line 2
              Line 3
              Line 4 with the "ABC" string
              Line 5
              Line 6
              Line 7 containing ABC, too
              Line 8
              Line 9
              Line 10 is the last line, with the string ABC
              Line 11
              Line 12
              Line 13
              

              Then :

              • Go back to the beginning of your file ( CTRL + Home )

              • Open the Replace dialog ( CTRL + H )

              • Select the Regular expression search mode

              • If you want to delete from the beginning of Line 4 ( FIRST line containing the string ABC, after the cursor location ), till the end of the file, use the following S/R :

              SEARCH : (?-s)^.*ABC(?s).*

              REPLACE : Leave EMPTY

              • If you want to delete from the beginning of Line 10 ( LAST line containing the string ABC ), till the end of the file, use the following S/R :

              SEARCH : (?s).*\R\K.*ABC.*

              REPLACE : Leave EMPTY

              Notes :

              • The in-line modifiers, (?s) and (?-s) as well as the \R syntax, were explained in my previous post

              • The \K syntax forces the regex engine to forget everything matched before the \K form, so that the final match is ONLY the part of the regex located after the \K form

              IMPORTANT :

              Due to the \K feature, included in the second S/R, you must use the Replace All button, exclusively. In that case, the step by step replacement, with the Replace button, does NOT work !!
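A rough Python equivalent of these two S/R, for testing outside Notepad++. Python's `re` has neither `\K` nor `\R`, so this sketch swaps in a capture group that is written back with `\1`, and a made-up `EOL` constant; treat it as an approximation, not the exact Notepad++ behaviour.

```python
import re

EOL = r"(?:\r\n|\n|\r)"  # hypothetical stand-in for \R

# Rebuild the 13-line sample text from the post.
lines = [f"Line {n}" for n in range(1, 14)]
lines[3] = 'Line 4 with the "ABC" string'
lines[6] = "Line 7 containing ABC, too"
lines[9] = "Line 10 is the last line, with the string ABC"
text = "\n".join(lines) + "\n"

# From the FIRST line containing ABC to the end of the file.
from_first = re.compile(r"^.*ABC(?s:.*)", re.MULTILINE)
# From the LAST line containing ABC to the end of the file.
# Group 1 plays the role of the part "forgotten" by \K.
from_last = re.compile(r"(?s)(.*" + EOL + r").*ABC.*")

kept_first = from_first.sub("", text, count=1)
kept_last = from_last.sub(r"\1", text, count=1)

print(kept_first)  # Line 1 to Line 3
print(kept_last)   # Line 1 to Line 9
```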

              Cheers

              guy038

              • Neculai I. Fantanaru

                hi, works great.
                And please, guy038, one more question. If I want to delete a particular fragment from a text file:

                I want to delete everything before line 4, which contains the words “UNTIL THIS” (line 4 included), and at the same time delete everything after line 10, which contains the words “AFTER THIS” (line 10 included)

                Line 1
                Line 2
                Line 3
                Line 4 UNTIL THIS
                Line 5 -----
                Line 6 -----
                Line 7 -----
                Line 8 -----
                Line 9 -----
                Line 10 AFTER THIS
                Line 11
                Line 12
                Line 13

                • guy038

                  Hello Neculai,

                  As we need to grab several lines at the same time, we’ll use, again, the (?s) modifier, in order to allow the dot symbol to match absolutely any character. In addition, I just add the (?-i) modifier, which ensures that the search is case-sensitive !

                  The search regex is, simply, an alternation of two regexes: the first one matches lines 1 to 4, included, and the second one matches lines 10 to 13, included. So :

                  We start with the example text below :

                  Line 1
                  Line 2
                  Line 3
                  Line 4 UNTIL THIS end
                  Line 5 -----
                  Line 6 -----
                  Line 7 -----
                  Line 8 -----
                  Line 9 -----
                  Line 10 AFTER THIS end
                  Line 11
                  Line 12
                  Line 13
                  
                  • Go back to the beginning of your file ( CTRL + Home )

                  • Open the Replace dialog ( CTRL + H )

                  • Select the Regular expression search mode

                  • Preferably, uncheck the Wrap around option

                  SEARCH : (?s-i).*UNTIL THIS.*?\R|.*\R\K.*AFTER THIS.*

                  REPLACE : Leave EMPTY

                  • Click, one time, on the Replace All button

                  You should obtain the wanted result, below :

                  Line 5 -----
                  Line 6 -----
                  Line 7 -----
                  Line 8 -----
                  Line 9 -----
                  

                  These two regexes are rather similar to those, described in my previous posts and don’t need any further explanation !
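For testing the same idea outside Notepad++, here is a minimal Python sketch. It runs the two halves of the alternation as two successive substitutions, because Python's `re` has no `\K`; the function name `keep_between` and the `EOL` constant are made up for this example, and it assumes each marker is meant literally.

```python
import re

EOL = r"(?:\r\n|\n|\r)"  # hypothetical stand-in for \R


def keep_between(text, first_marker, last_marker):
    """Drop everything up to and including the line holding first_marker,
    and everything from the line holding last_marker to the end."""
    head = re.compile(r"\A(?s:.*)" + re.escape(first_marker) + r".*" + EOL)
    # Group 1 is kept, which emulates the effect of \K.
    tail = re.compile(r"(?s)(.*" + EOL + r").*" + re.escape(last_marker) + r".*")
    text = head.sub("", text, count=1)
    return tail.sub(r"\1", text, count=1)


middle = "".join(f"Line {n} -----\n" for n in range(5, 10))
sample = (
    "Line 1\nLine 2\nLine 3\nLine 4 UNTIL THIS end\n"
    + middle
    + "Line 10 AFTER THIS end\nLine 11\nLine 12\nLine 13\n"
)

result = keep_between(sample, "UNTIL THIS", "AFTER THIS")
print(result)  # Line 5 to Line 9 only
```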

                  Best Regards,

                  guy038

                  BTW, concerning my previous post, I noticed a funny behaviour :

                  • Copy the text, below, in a new file :

                    Line 1
                    Line 2
                    Line 3
                    Line 4 with the “ABC” string
                    Line 5
                    Line 6
                    Line 7 containing ABC, too
                    Line 8
                    Line 9
                    Line 10 is the last line, with the string ABC
                    Line 11
                    Line 12
                    Line 13

                  • Go back to the beginning of your file ( CTRL + Home )

                  • Open the Replace dialog ( CTRL + H )

                  • Select the Regular expression search mode

                  • Check the Wrap around option

                  • Copy, in the Find what zone, the regex (?s-i).*\R\K.*ABC.*

                  • Click, a first time, on the Replace button ( NOT the Replace All button ! ) => The lines 10 to 13 included are selected

                  • Click, a second time, on the Replace button => The lines 10 to 13 included are deleted and, simultaneously, the lines 7 to 9 included are selected

                  • Click a third time, on the Replace button => The lines 7 to 9 included are deleted and, simultaneously, the lines 4 to 6 included are selected

                  • Finally, a fourth click, on the Replace button, deletes the lines 4 to 6 included

                  Thus, contrary to what I had thought, up to now, although a \K form is used in the search regex, a mouse click on the Replace button ( step by step replacement ) still produces, in some cases, an action on the selected text !!

                  • Scott Sumner

                    @guy038 , I trust you will not be satisfied until you dig deeper to characterize more fully this contrary \K behavior you have discovered.

                    I think you might find that having the “Wrap around” option enabled is key to the Replace button doing what it does here.

                    • Neculai I. Fantanaru

                      hi, guy038. Works just fine !!

                      thank you very much !

                      • Ashton Watts

                        Hi @guy038 ,

                        I’m hoping you can help as the bits above were really helpful but I still have a bit to do.

                        I want to delete everything between two points in 47000 html files.

                        I can insert the points using a simple find and replace, so I would be left with:

                        Want to keep
                        Want to keep
                        Want to keep
                        START-DELETING
                        delete
                        delete
                        delete
                        delete
                        delete
                        STOP-DELETING
                        Want to keep
                        Want to keep
                        Want to keep
                        Want to keep
                        Want to keep

                        Hoping you have the answer.

                        regards,

                        • guy038

                          Hello, ashton-watts, and All,

                          I suppose that all your .html files are in a specific directory. So :

                          • First, I strongly advise you to back up the directory containing all your .html files !

                          • Start Notepad++

                          • Now, open the Replace in Files dialog ( Ctrl + Shift + F )

                          • Type, in the Find what: zone, the regex (?s-i).*\R\KSTART-DELETING.*STOP-DELETING\R

                          • Leave the Replace with: zone EMPTY

                          • Insert *.html in the Filters: zone

                          • Fill the Directory : zone with the absolute path of your specific folder

                          • Finally, click on the Replace in Files button

                          • Click on the Yes button, to confirm replacement

                          Et voilà :-))
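If you would rather script this than use the Replace in Files dialog, the same replacement can be sketched in Python. Everything here is an assumption for illustration: the function names, the UTF-8 encoding, and the capture group that emulates `\K` (which Python's `re` lacks). Unlike the dialog, this sketch does not descend into sub-folders. Back up the directory first, as advised above.

```python
import re
from pathlib import Path

EOL = r"(?:\r\n|\n|\r)"  # hypothetical stand-in for \R

# Group 1 (everything before the START-DELETING line) is kept,
# which mirrors what \K does in the Notepad++ regex.
BLOCK = re.compile(r"(?s)(.*" + EOL + r")START-DELETING.*STOP-DELETING" + EOL)


def strip_block(text):
    """Remove the START-DELETING ... STOP-DELETING lines from one file's text."""
    return BLOCK.sub(r"\1", text, count=1)


def replace_in_files(directory, pattern="*.html", encoding="utf-8"):
    """Apply strip_block to every matching file; return the changed paths."""
    changed = []
    for path in sorted(Path(directory).glob(pattern)):
        # newline="" keeps the original line endings untouched.
        with open(path, encoding=encoding, newline="") as handle:
            original = handle.read()
        updated = strip_block(original)
        if updated != original:
            with open(path, "w", encoding=encoding, newline="") as handle:
                handle.write(updated)
            changed.append(path)
    return changed
```

For example, `replace_in_files(r"C:\path\to\html")` (a placeholder path) would rewrite only the files that actually contain the marked block.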

                          So from the initial text, below :

                          Want to keep
                          Want to keep
                          Want to keep
                          START-DELETING
                          delete
                          delete
                          delete
                          delete
                          delete
                          STOP-DELETING
                          Want to keep
                          Want to keep
                          Want to keep
                          Want to keep
                          Want to keep
                          

                          you’ll get :

                          Want to keep
                          Want to keep
                          Want to keep
                          Want to keep
                          Want to keep
                          Want to keep
                          Want to keep
                          Want to keep
                          

                          Important :

                          It may be unnecessary to insert marks in order to determine the starting and ending boundaries of the range of lines to be deleted. Two possibilities :

                          • The boundaries are unique and easy to isolate from the surrounding text. In that case, they could replace the generic START-DELETING and STOP-DELETING lines

                          • The boundaries differ literally but follow the same template. In that case, they can be found with a regex, which would be combined with my regex above !

                          So, if it’s not confidential information and if you don’t mind, give us an example of the START-DELETING and STOP-DELETING lines of your .html files ! You could also join one of your files, or part of it, as an attached file, with your mail at my e-mail address :

                          Thanks, for this additional information !

                          See you later,

                          Best Regards

                          guy038

                          • Ashton Watts @guy038

                            @guy038 Hi guy038,

                            You are a legend. the regex search string above worked perfectly. I had already inserted the start and stop points so it wasn’t an issue.

                            Thanks very much for your help.

                            regards,

                            • blackburn1489 @guy038

                              @guy038 Hello! Can you help me, please?

                              I need to capture the WORD that sits between “title = ” and “_name”, for example:

                              title = WORD_name

                              Once I have WORD, I need to find every occurrence of WORD in the document and rename it to WORD_lttz.

                              After that, I need to repeat the whole operation with WORD1, WORD2, WORD3 and so on, each placed between “title =” and “_name”:

                              title = WORD1_name

                              That is, find them in the entire document and rename them to WORD1_lttz, WORD2_lttz, WORD3_lttz and so on.

                              • guy038

                                Hello, @blackburn1489, and All,

                                I took some time to figure out what you exactly wanted to do and I hope that my solution will be close enough to what you need !

                                OK, let’s suppose that we start with the sample text below :

                                title = ABC_name
                                title = DEF_name
                                title = YZ_name
                                title = GHI_name
                                title = JKL_name
                                title = MNO_name
                                title = YZ_name
                                title = ABC_name
                                title = MNO_name
                                title = MNO_name
                                title = PQR_name
                                title = MNO_name
                                title = STU_name
                                title = VWX_name
                                title = ABC_name
                                title = YZ_name
                                title = GHI_name
                                

                                Note that it contains 3 lines with the string ABC, 2 lines with the string GHI, 4 lines with the string MNO and 3 lines with the string YZ !

                                Now, let’s imagine that you would change each string ABC, DEF… into new strings, according to the table below :

                                ABC    ->    ABC111
                                DEF    ->    DEF-22222
                                GHI    ->    GHI_GHI
                                JKL    ->    J
                                MNO    ->    mno
                                PQR    ->    000PQR
                                STU    ->    Test
                                VWX    ->    99
                                YZ     ->    Y-Z
                                

                                Then, using the following regex S/R :

                                SEARCH (?-i)title\x20=\x20(?:(ABC)|(DEF)|(GHI)|(JKL)|(MNO)|(PQR)|(STU)|(VWX)|(YZ))(?=_name)

                                REPLACE title = (?1\1111)(?2\2-22222)(?3\3_\3)(?4J)(?5\L\5)(?{6}000\6)(?7Test)(?{8}99)(?9Y-Z)_lttz

                                would, simultaneously, change any occurrence of these 9 strings, into the new ones, defined in the table above ;-))

                                So, after clicking on the Replace All button, you would get, at once, the following text :

                                title = ABC111_lttz_name
                                title = DEF-22222_lttz_name
                                title = Y-Z_lttz_name
                                title = GHI_GHI_lttz_name
                                title = J_lttz_name
                                title = mno_lttz_name
                                title = Y-Z_lttz_name
                                title = ABC111_lttz_name
                                title = mno_lttz_name
                                title = mno_lttz_name
                                title = 000PQR_lttz_name
                                title = mno_lttz_name
                                title = Test_lttz_name
                                title = 99_lttz_name
                                title = ABC111_lttz_name
                                title = Y-Z_lttz_name
                                title = GHI_GHI_lttz_name
                                

                                Et voilà !

                                Notes :

                                • Regarding the search regex :

                                  • First, the (?-i) syntax forces the search to be case-sensitive

                                  • Now, the part title\x20=\x20 tries to match the string title =, with a space character, before and after the equal sign

                                  • Then, the (?: syntax starts a non-capturing group

                                  • The part (ABC)|(DEF)|(GHI)|(JKL)|(MNO)|(PQR)|(STU)|(VWX)|(YZ) consists, simply, of 9 alternatives, corresponding to our 9 strings to be changed. As each of them is between parentheses, it is stored as group 1, 2, 3…

                                  • The final part )(?=_name) corresponds to the closing parenthesis of the non-capturing group, followed with a look-ahead structure or condition ( Is there the string _name after ABC, DEF… ? ) which must be true for an overall match. The string _name itself is not part of the match, so it is left in place

                                • Regarding the replacement regex :

                                  • First, it rewrites the string title = , followed with a space character

                                  • Then any (?#....) syntax, where # represents a digit, is a conditional replacement and all the regex after the #, till the closing parenthesis, is evaluated, if the matched string is stored in group #

                                  • Note that the 9 conditional replacement structures (?1\1111)(?2\2-22222)(?3\3_\3)(?4J)(?5\L\5)(?{6}000\6)(?7Test)(?{8}99)(?9Y-Z) could be placed in any order

                                  • In some of them, we rewrite the searched string, stored in group # , due to the \# escape sequence

                                  • In the conditional replacement (?5\L\5) we, simply, rewrite the upper-case string MNO, in lower-case, because of the \L replacement escape sequence

                                  • Be aware, too, that for groups 6 and 8, the conditional replacements are built with the alternate form (?{#}....). Indeed, we must distinguish between the group number # and the digits which follow it ! If the braces were absent, the regex engine would think that groups 6000 and 899 were concerned :-((

                                  • And finally, of course, it rewrites, in all cases, your ending part, the string _lttz !
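Python's `re` module has no conditional replacement syntax like `(?1...)`, so a Python version of this S/R has to use a different technique: a lookup table plus a replacement function. A minimal sketch, with the names `RENAMES`, `PATTERN` and `rename_titles` made up for illustration:

```python
import re

# Same mapping as the table in the post.
RENAMES = {
    "ABC": "ABC111", "DEF": "DEF-22222", "GHI": "GHI_GHI",
    "JKL": "J", "MNO": "mno", "PQR": "000PQR",
    "STU": "Test", "VWX": "99", "YZ": "Y-Z",
}

# One alternation built from the table keys; the look-ahead leaves
# "_name" in place, exactly as in the search regex above.
PATTERN = re.compile(
    r"title = (" + "|".join(map(re.escape, RENAMES)) + r")(?=_name)"
)


def rename_titles(text):
    """Replace each known string after 'title = ' and append '_lttz'."""
    return PATTERN.sub(
        lambda match: "title = " + RENAMES[match.group(1)] + "_lttz", text
    )


print(rename_titles("title = ABC_name\ntitle = MNO_name\ntitle = YZ_name\n"))
```

Adding a tenth string then only means adding one entry to the table, with no group numbers to keep in sync.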

                                Best Regards,

                                guy038

                                • And Boja

                                  Hi,

                                  I have some E-mails

                                  100km@laufwunder.at
                                  100km@tus-ahrweiler.de
                                  100kmbelves@free.fr
                                  12ahewitt@royalschoolcavan.ie
                                  12lfuller@royalschoolcavan.ie
                                  12oakinlabi@royalschoolcavan.ie
                                  12vkells@royalschoolcavan.ie
                                  13@123.com
                                  13362880852@zj165.com
                                  1573364@mail.ru
                                  1matoo@zoznam.sk
                                  2008.lizhigang@163.com

                                  I want to delete everything before the @ (sorry for my English). I have 1 million e-mails, so I want to remove everything up to the start of the domain. For example, I want to turn them into this:

                                  @laufwunder.at
                                  @tus-ahrweiler.de
                                  @free.fr
                                  @royalschoolcavan.ie
                                  @royalschoolcavan.ie
                                  @royalschoolcavan.ie
                                  @royalschoolcavan.ie
                                  @123.com
                                  @zj165.com
                                  @mail.ru
                                  @zoznam.sk
                                  @163.com

                                  Hope someone understand me what i am trying to say :S

                                  • Claudia Frank @And Boja

                                    @And-Boja

                                    if your real data looks like your posted data then something like

                                    Find what: ^.*?(?=@)

                                    will do the job. Replace with is empty.
                                    See FAQ for more info on regex.
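The same pattern also works in Python if you want to process the list with a script. A small sketch, assuming one address per line; the `(?m)` flag is added so that `^` matches at the start of every line, and the function name is made up:

```python
import re

# Lazy match from the start of each line up to (not including) the first @.
LOCAL_PART = re.compile(r"(?m)^.*?(?=@)")


def keep_domains(text):
    """Strip the part before @ on every line that contains an @."""
    return LOCAL_PART.sub("", text)


emails = "100km@laufwunder.at\n13@123.com\n2008.lizhigang@163.com\n"
print(keep_domains(emails))
```

Lines without an @ are left untouched, because the dot does not cross a line break.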

                                    Cheers
                                    Claudia

                                    • Md Abdullah Al Noman

                                      I want to delete everything between two points in XML files of about 36000 lines.
                                      The portion to delete is repeated throughout the files.
                                      I can insert the points using a simple find and replace, so I would be left with…
                                      <Middle></Middle>
                                      <WebsiteList></WebsiteList>
                                      <EventList></EventList>
                                      <Note></Note>
                                      <LastName></LastName>
                                      START-DELETING… <Photo>
                                      nO3df2vjyB3H8akt18KyV8LCVpA5p/USwy3koIX77+7plT6XPpiWDeyCjw2Nl5izg4K1toJ8kUL/
                                      UEjS/HAcr0bzne98Xv8cy4E9Jm+P9XP0p3/8818CgIya6gEA/B8UCbSgSKAFRQItKBJoQZFAC4oE
                                      WlAk0IIigRYUCbSgSKAFRQItKBJoQZFAC4oEWlAk0IIigRYUCbSgSKAFRQItKBJoQZFAC4oEWlAk
                                      0IIigRYUCbSgSKAFRQItKBJoQZFAC4oEWlAk0IIigRYUCbSgSKAFRQItKBJosVQPQEuu2xJCuK5T
                                      /DOOEyFEsk6z/EblsFhAkbtyHDvou67rtNv2k//ZK/6TZfk6SeP4Ko6TOL6qeIQ8oMhXNJsN3+8E
                                      fe+5EB+zrLrnOp7rCNHLsjyKVlG0iuMEc+fuUOStoO/1A7ft2JZVL6Y6IUTxz/1e0LLqQeAFgSeE
                                      KNKMom9I81UoUgghxkdhkU6hmOpKfH3f7/h+R4gwilbLOImi1WZzXeLrc4IiRdD3HuYoVZHm+9HB
                                      ep3OF0uk+RSKFMNhr/o3bbftdvsAaT5lepF+t2PbDYUDeJjmbHaJbU3TiyyOLFLQbtvjcZhlwXy+
                                      nE4vjO3S6HM2rtuqbAtyR5ZVHwz8n38+CsOu6rGoYdwc2Ww2Doc93+/sfVinApZVfz86aDv25LeZ
                                      6rFUzawig743GgWUW3yomL9Ni9KgX+1ms6FRjoUg8PxuR/UoKmVQkeNxqFeOhfE4tOoG/ZlM+ah+
                                      t1PuaZjKWFZ9PB6oHkV1TClyNDpQPYT9+X6HzlEq2YwoMuh7ag+Df7/xkSnTpBFFKjlPWC7bbjD4
                                      FLvgXySDCbIwCLsm7OLw/4RsphbLqocDX/UopGNepPILKcplwjTJ/OP5PqvDyyZMkyhSM0Gf1qUh
                                      peNcpOu2dDxJs51tNxzn9XvQ9MW7SC1P0rwq6LuqhyAR7yJ5nufg+k0rcC7Sbv5Z9RCk2OXOcX2x
                                      LpLRcZ9HGG9Kci6SMcti+4dj+8FAUygSaEGRQAuKBFo4F7lep6qHAG/Gucgsz1UPAd6Mc5GMpSnb
                                      ZatQpJYYL6SGIoEWFAm0oEigBUUCLShSP8s4UT0EiTgXuWF6iCTPOC+/y7nIlOkhkuJZO1xxLpLr
                                      HBnjV1tTXOeShPX5es5FJkmaZdxObafpNe/HOHAuUgjB7wGuCdOJ/w7zIqNopXoIJeO6KXKHeZH8
                                      Dt3x3q0R7IvcbK6ZXbfLe7dGsC9SCDGbXaoeQpl479YI9kW6bovT7jaz+f5ZnJ/5NfprMOC12mK7
                                      bf/6ywchxGQymy+WqocjBec5klmOD7FZyvopzkUyxnhJIxSpJU4bx4+gSC3xOxd1h3ORjCcSfkf+
                                      73AukvEJN8yRWuL6Z8uynPH1FpyL5HeZRYHr5ypwLpLl9ZGC79xf4FykYDqdRNE31UOQCEVqZhkn
                                      vC+2YF4kv91tft+xR5gXudlcM1vYjvdGpGBfpOB1Y0qaXnP6OM/iXySn0xvsb2kQJhTJaVJhvxEp
                                      TCgyjq94HJXMsjy6RJEs8JhaeHyKVxlRJI8bAM6mF6qHUAUjiozjK033b+4OXZ2fR4xXw3+I851f
                                      D33+9NVp21l28/e/jVSP5Q2m04t1klpWjf1hyDumFJnlN8Ufdb1ONXpiehR9433O8CkjfrUf0uiQ
                                      3nqdmpajMLJIbX7+NPrylMjAIrX5M2v05SmRcUVm+Y0uB/Y0+vKUyLgihSZnuqNoZeBGpDCzSC3m
                                      SC0GKYOJRW4218T/3lmW8751YQsTixTkZyBjf7KFsUXOF0vKFwTxOBG/H0OLFEKcU117dxknZh73
                                      KZhb5Ow8ojlNLuax6iGoZG6RWX5DcJpM02uTf7KFyUUKktPk5Ldz1UNQzOgis/yG1GWwUbQyeQuy
                                      YHSRQojZ7JLIKZwsyycT0ydIgSKFEJ8/faXw2z2ZzIw9BvkQihRZfvPx5ExtlJPJzIT7DHeBIoUQ
                                      IklShVEyfjjNHlDkrSRJT0/n1b/vfL5Ejg+hyHvp5g8Vb2rEHYa7Q5FAC4pUrO1oc2NkNVDkPafy
                                      OLIsJ3WIngIUea/i6SrL8o8nZ5yWbisFirxX5RyJHF+CIm9Z9Vpla10gxy1Q5C3Xdap5I+S4HYq8
                                      5bqtCt4FOb4KRd7y/Xey3wI57gJFCiGE3+3YdkPqWyDHHaFIIYQIB12pr48cd4ciheu2PJm7Ncjx
                                      TVCkGA578l4cOb6V6UWGYVfeBIkc92B0kY5jvx8dSHpx5Lgfc4tsNhs/HR9KenHkuDdTVsZ/xKrX
                                      Pvz4g2XVZbz4fjlaVn18FEoaUumWcbKYL2VcbmxikY5j/3R8SC3H4+NDja6VdN3W4bB3Nr2Yln01
                                      nXFFBn1vNAqQYykOh708y8tdrMagIpvNxvvRge93JL2+aTkWhsP+fBGXeBunEUVa9Vo48AdhV95W
                                      mpk5CiEsq+Y4domL+DMv0nVbQd/z/Y7UPQZjcyy4bgtFbmPVa77/zvc7rtuqYNfV8BxLx6fIZrPh
                                      +52g71X52EPkWDrti1QSYgE5yqBxkcUGorx95+2QoyT6Fek49iDsyt5Z2Y5sjlmWn88uH+1ntB17
                                      JO30fel0KjLoe2HYVf50bLI5RtHqy+nvG80XEtKgyAqOJu6OZo5Jkn45/Z3HitGki2w2G4fDXhB4
                                      qgdyj1qOWZZPpxcEHzqxN6JFWvXaaHRAqsUCqRzPZ5fT6SLLWK0VTa5IUr/R30lejnGcfDmds7z+
                                      klaRfrczHmtzjeB2knJM0+vp9ILxsrxUirTqtfF4oOrgYunkzY7//s8XZj/Tj5C4q6HZbBwf/wU5
                                      7sLvvrj2huu2NDru+BL1c6TUK7qrJ/tAz3DYe/qTfXfpZ4nX4KiiuEip97soMT4KpR4Gt+1G0Pfu
                                      orSsWhj6hzJvOa+Y4iJ//PCD7AV3KlbBt+tumhyE3eGwx+n7LNQWGfQ9qcubcGXbjdHooILVs5RQ
                                      WaTU5U14G4Ryl85SSNm+NtevOHwndUVyOdYD5VJTpFWvoUh4lpoiff8dsz1EKIuqIjFBwvNQJNCi
                                      oEi/ixzhRSqKxAQJL0ORQEvVRTqOjb1s2KLqIjFBwnZVF1nN4wdBX1UXiQVGYLtKi8RGJLyq0iLt
                                      Ji72gVdUO0eqXrIH6Ku0SGxEwqsqLbJukbgZFyir9K6Gk5OzKt8OdIRJC2hRv4IAlMh1nV9/+aB6
                                      FN8FcyTQgiKBFhQJtKBIoAVFlqzEp6jqotylflFkyaJopXoIVSv3GREosmTzxZLHUzx2dDa9KPdn
                                      AUWW79Pnr4ZEeT67nE4vyn1NHCEvX5blH0/+G/S9IPAcx7bYnc2P46s0/WO+iGUs6YsiZZkvloyf
                                      qCAPt68v6A5FAi0oEmhBkUALigRaUCTQgiKBFhQJtKBIoAVFAi0oEmhBkUALigRaUCTQ8j+9xvaf
                                      +IjmkgAAAABJRU5ErkJggg==
                                      </Photo>
                                      END-DELETING
                                      <GroupList></GroupList>
                                      <Job></Job>

                                      Hoping you have the answer.

                                      • Terry R

                                        Given the example you provided, the following would remove all text between, and including, the START-DELETING and END-DELETING lines.
                                        Find: (START-DELETING.+\R)(.+\R)+(END-DELETING\R)
                                        Replace: (leave empty)

                                        So the assumption is that there must be at least one line between the two identifying lines (START and END); that’s the (.+\R)+ portion of the regex. Because .+ needs at least one character, those lines must also be non-empty (assuming “. matches newline” is unchecked). Also note that the first group (START-DELETING.+\R) includes the .+ because your example has three period characters after START-DELETING. I’ve put parentheses around each sub-portion only to make it easier to see what each group is doing. Only the middle group’s parentheses are strictly necessary, i.e. (.+\R)+.
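
                                        For anyone who wants to try the same expression outside Notepad++, here is a minimal Python sketch. It is only an approximation: Python's re module has no \R, so the EOL constant below is a stand-in I introduced, and delete_block and the sample text are hypothetical names and data, not part of the original question.

```python
import re

# Python's re has no \R; EOL is a hypothetical stand-in for "any line break".
EOL = r"(?:\r\n|\n|\r)"

# Same three groups as the Find expression:
#   1. the START-DELETING line (with at least one character after the marker)
#   2. one or more non-empty lines
#   3. the END-DELETING line
BLOCK = re.compile(rf"(START-DELETING.+{EOL})(.+{EOL})+(END-DELETING{EOL})")


def delete_block(text: str) -> str:
    """Remove the marker lines and every line between them."""
    return BLOCK.sub("", text)


sample = (
    "<Name>A</Name>\n"
    "START-DELETING...\n"
    "<Photo>\n"
    "abc123\n"
    "</Photo>\n"
    "END-DELETING\n"
    "<Job></Job>\n"
)
print(delete_block(sample))
```

                                        If the two marker lines are directly adjacent, the middle group cannot match and the text is left unchanged, which is the "at least one line in between" assumption described above.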

                                        You say you have already used a simple find and replace to insert the START and END lines. With my regex you could instead replace START-DELETING and END-DELETING with the original strings you searched for. That would save you one or two steps.

                                        Hope this helps.

                                        Terry

                                        • guy038

                                          Hello, @md-abdullah-al-noman, @terry-r and All,

                                          I think that the following regex S/R could be used, too :

                                          SEARCH (?s-i)^\h*START-DELETING.+?END-DELETING\R

                                          REPLACE Leave EMPTY

                                          Notes :

                                          • First, the (?s-i) modifiers mean that, from now on :

                                            • Any regex dot symbol ( . ) will match absolutely any single character ( standard ones and EOL ones )

                                            • The search will be processed in a case-sensitive way ( Non-insensitive ! )

                                          • Then, the part ^\h*START-DELETING looks, from the beginning of a line ( ^ ), for the upper-case string START-DELETING, possibly preceded by some horizontal space characters ( Usual space or tabulation )

                                          • At the end, the part END-DELETING\R searches for the upper-case string END-DELETING, followed by its line-break character(s)

                                          • And the middle part .+? represents the shortest non-empty range of any characters between the two strings START-DELETING and END-DELETING

                                          • Finally, as the replacement regex is empty, the overall match is simply deleted
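
                                          The notes above can be sketched in Python, with some caveats. Python's re module supports neither \h nor \R, so [ \t] and the EOL constant are approximations I chose. Python also needs re.MULTILINE for ^ to match at every line start, and it is case-sensitive by default, so -i needs no equivalent. The function name and sample text are hypothetical.

```python
import re

# Hypothetical stand-in for \R (any line break), which Python's re lacks.
EOL = r"(?:\r\n|\n|\r)"

PATTERN = re.compile(
    # [ \t]* approximates \h* ; .+? is the shortest non-empty range
    rf"^[ \t]*START-DELETING.+?END-DELETING{EOL}",
    re.DOTALL | re.MULTILINE,  # DOTALL plays the role of (?s)
)


def delete_marked(text: str) -> str:
    """Delete each START-DELETING ... END-DELETING range, marker lines included."""
    return PATTERN.sub("", text)


sample = (
    "keep 1\n"
    "  START-DELETING...\n"
    "\n"
    "binary data\n"
    "END-DELETING\n"
    "keep 2\n"
    "START-DELETING\n"
    "x\n"
    "END-DELETING\n"
    "keep 3\n"
)
print(delete_marked(sample))
```

                                          Because .+? is lazy, each range stops at the nearest END-DELETING, so the two blocks in the sample are deleted separately and the line between them is kept. Blank lines inside a range are not a problem here, since the dot also matches EOL characters.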

                                          Best regards,

                                          guy038

                                          • David Bennett

                                            @guy038 you truly are a legend, I agree with the other poster. You are so deep into Notepad++ regex, impressive!
                                            I believe you may also know this - IMHO quite common - case, although I can’t find it described anywhere:

                                            Suppose you have just one large file (a WordPress SQL database in fact, opened in my favorite editor, Notepad++) and STRING A and STRING B should always belong together:
                                            FIND ALL INSTANCES OF ANY TEXT, across lines,
                                            WHERE STRING A, sometime later,
                                            IS FOLLOWED BY ANOTHER STRING A
                                            INSTEAD OF THE “CLOSING” STRING B

                                            Example: Find all instances where, across lines, there’s the literal string [/social]
                                            and, after any kind and number of characters, there’s another literal string [/social],
                                            BUT nowhere in between the two is there a literal string [social], although there should be, because [social] and [/social] belong together.

                                            So basically, in the example case, string A and string B always belong together: two A’s or two B’s must never follow each other. Always the A string, then the B string. Then again the A string, then the B string. Etc. And so you need to find any “fault”: a place where A is followed sometime later by another A, instead of by a B string first.

                                            Did I explain this well enough?

                                            I am sure none of the above, nor anything else I have found, works because I’ve tried them all. Would you have an idea how to go about this?
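
                                            One possible way to express this check, shown as a hedged Python sketch rather than a tested Notepad++ recipe: match a [/social] and use a lookahead to require that another [/social] follows before any [social] appears. The "any character that does not start [social]" construct, (?:(?!\[social\]).)*?, is my own choice of technique, and find_unbalanced and the sample text are hypothetical.

```python
import re

# A [/social] that is followed, possibly many lines later, by another [/social]
# with no [social] in between. The lookahead keeps matches from overlapping,
# so every faulty [/social] is reported, not only every other one.
ORPHAN_CLOSE = re.compile(
    r"\[/social\](?=(?:(?!\[social\]).)*?\[/social\])",
    re.DOTALL,  # let the dot cross line breaks, like (?s)
)


def find_unbalanced(text: str) -> list[int]:
    """Return the offsets of each [/social] followed by another [/social] first."""
    return [m.start() for m in ORPHAN_CLOSE.finditer(text)]


text = (
    "[social]a[/social]\n"
    "b[/social]\n"          # fault: a second close with no [social] before it
    "[social]c[/social]"
)
print(find_unbalanced(text))
```

                                            Swapping the two literals gives the mirror check, a [social] followed by another [social] with no [/social] in between. The same expression without the Python wrapper, prefixed with (?s), uses only constructs that Notepad++'s regex engine also supports, but I have not verified it on a large database dump.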
