    How to delete a complete tag with specific content

      Scott Sumner

      Unfortunately, Claudia’s regex can grab too much sometimes. For example, if run on the following data, it will select the first 3 lines with the first Find Next press (good!), but the second Find Next press will match the 6 lines after that (BAD! as that contains a non-zero “Active” field and a replace with “nothing” operation would delete that!):

      <Report>
          <Active>0000000000</Active>
      </Report>
      <Report>
          <Active>0000000001</Active>
      </Report>
      <Report>
          <Active>0000000000</Active>
      </Report>
      

      Try this instead:

      Pre-step: Back up your data!
      Find what zone: (?s-i)<Report>.*?<Active>(?:(0000000000)|(\d+))</Active>.*?</Report>\R
      Replace with zone: (?1:$0)
      Search mode: Regular expression
      Wrap around checkbox: Ticked
      Action: Press the Replace All button

        Christoph Kahle

        Still the same.
        Now a larger sample:

        <Report>
        	<UserName>Angebot </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000338</IndexPerm><TempIndex>0000000338</TempIndex></Report>
        <Report>
        	<UserName>Angebot.</UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000339</IndexPerm><TempIndex>0000000339</TempIndex></Report>
        <Report>
        	<UserName>Angebot  </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000340</IndexPerm><TempIndex>0000000340</TempIndex></Report>
        <Report>
        	<UserName>Angebot  </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000341</IndexPerm><TempIndex>0000000341</TempIndex></Report>
        <Report>
        	<UserName>Angebot  </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000342</IndexPerm><TempIndex>0000000342</TempIndex></Report>
        <Report>
        	<UserName>Angebot  </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000343</IndexPerm><TempIndex>0000000343</TempIndex></Report>
        <Report>
        	<UserName>Angebot  </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000344</IndexPerm><TempIndex>0000000344</TempIndex></Report>
        <Report>
        	<UserName>Angebot  </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000345</IndexPerm><TempIndex>0000000345</TempIndex></Report>
        <Report>
        	<UserName>Artikel mit hinterlegten Freifeldern </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>Artikelliste</Filter>
        	<VMB>artlist12.vmb</VMB>
        	<Active>0000000001</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<UserName filter="handwerk hw_plus">Material mit hinterlegten Freifeldern 1-6</UserName><IndexPerm>0000000346</IndexPerm><TempIndex>0000000346</TempIndex></Report>
        <Report>
        	<UserName>Kunden mit hinterlegten Freifeldern </UserName>
        	<UserName filter="Lbb Stb">UStIdNr-Anfragen Mandanten</UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>Kundenliste</Filter>
        	<VMB>KundFrei1.vmb</VMB>
        	<ChangeTarget>Nein</ChangeTarget>
        	<Remark/>
        	<Argument1/>
        	<Argument2/>
        	<UseProgram>Ja</UseProgram>
        	<Active>0000000001</Active>
        	<Group>Kundenliste</Group>
        	<CommandID>0x800B</CommandID>
        	<PreviewBmp>KunUstID.png</PreviewBmp>
        	<IndexPerm>0000000347</IndexPerm><TempIndex>0000000347</TempIndex></Report>
        <Report>
        	<UserName>Lieferanten mit hinterlegten Freifeldern 1-6</UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>Lieferantenliste</Filter>
        	<VMB>liefFrei1.vmb</VMB>
        	<Remark></Remark>
        	<Argument1/>
        	<Argument2/>
        	<UseProgram>Ja</UseProgram>
        	<Active>0000000001</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<Group>Lieferantenliste</Group>
        	<CommandID>0x8038</CommandID>
        	<PreviewBmp>KunUstID.png</PreviewBmp>
        	<IndexPerm>0000000348</IndexPerm><TempIndex>0000000348</TempIndex></Report>
        <Report>
        	<UserName>Angebot </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000349</IndexPerm><TempIndex>0000000349</TempIndex></Report>
        <Report>
        	<UserName>Rechnung </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro.vmb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000001</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000350</IndexPerm><TempIndex>0000000350</TempIndex></Report>
        <Report>
        	<UserName>Angebot  </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000351</IndexPerm><TempIndex>0000000351</TempIndex></Report>
        <Report>
        	<UserName>Angebot  </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000352</IndexPerm><TempIndex>0000000352</TempIndex></Report>
        <Report>
        	<UserName>Angebot  </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000353</IndexPerm><TempIndex>0000000353</TempIndex></Report>
        <Report>
        	<UserName>Angebot  </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000354</IndexPerm><TempIndex>0000000354</TempIndex></Report>
        <Report>
        	<UserName>Angebot  </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000355</IndexPerm><TempIndex>0000000355</TempIndex></Report>
        <Report>
        	<UserName>Auftragsbestätigunglogo </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro.vmb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000001</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000388</IndexPerm><TempIndex>0000000388</TempIndex><Group></Group><Remark></Remark><Argument2></Argument2><UseProgram>Ja</UseProgram></Report>
        <Report>
        	<UserName>Gutschrift </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro.vmb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000358</IndexPerm><TempIndex>0000000358</TempIndex><Group></Group><Remark></Remark><Argument2></Argument2><UseProgram>Ja</UseProgram></Report>
        <Report>
        	<UserName>Gutschrift </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro.vmb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000001</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000357</IndexPerm><TempIndex>0000000357</TempIndex><Group></Group><Remark></Remark><Argument2></Argument2><UseProgram>Ja</UseProgram></Report>
        <Report>
        	<UserName>Gutschrift old</UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro.vmb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000356</IndexPerm><TempIndex>0000000356</TempIndex><Group></Group><Remark></Remark><Argument2></Argument2><UseProgram>Ja</UseProgram></Report>
        <Report>
        	<UserName>Gutschrift </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro.vmb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000360</IndexPerm><TempIndex>0000000360</TempIndex><Group></Group><Remark></Remark><Argument2></Argument2><UseProgram>Ja</UseProgram></Report>
        <Report>
        	<UserName>eigene Bestellungen</UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000001</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000359</IndexPerm><TempIndex>0000000359</TempIndex></Report>
        <Report>
        	<UserName>Angebot  </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000361</IndexPerm><TempIndex>0000000361</TempIndex></Report>
        <Report>
        	<UserName>Angebot </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000362</IndexPerm><TempIndex>0000000362</TempIndex></Report>
        <Report>
        	<UserName>Angebot </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000363</IndexPerm><TempIndex>0000000363</TempIndex></Report>
        <Report>
        	<UserName>Angebot </UserName>
        	<RefReport>Drucker</RefReport>
        	<Filter>netto Auftrag brutto layout</Filter>
        	<VMB>lay_pro_gue.umb</VMB>
        	<Argument1>layout.ini</Argument1>
        	<Active>0000000000</Active>
        	<ChangeTarget>Nein</ChangeTarget>
        	<IndexPerm>0000000364</IndexPerm><TempIndex>0000000364</TempIndex></Report>
        
          Scott Sumner @Christoph Kahle

          @Christoph-Kahle

          Well, trying my solution on that new data works for me!

            Claudia Frank

            I see, my match gets expanded until a Report block with an Active value of 0000000000 occurs.
            Scott’s version is working for me as well.

            Cheers
            Claudia

              Claudia Frank @Scott Sumner

              @Christoph-Kahle , @Scott-Sumner

              find what:

              (?s)<Report>(?:(?!</Report>).)*<Active>0000000000</Active>.*?</Report>
              

              replace with empty

              This seems to work as well :-D

              Cheers
              Claudia

                Scott Sumner

                So some explanation is probably in order.

                The Find what will match EVERY <Report> thru </Report> (with the line-ending after the closing tag), one by one, as it moves thru the file. The difference is that when an all-zeroes Active field is encountered, it gets saved into “group #1”.

                The Replace with is where the difference comes in: if “group #1” was matched during the Find, the replacement text is “nothing” (thus a delete operation); otherwise it is the entire match (represented by $0) from the find stage.

                The ?1 is a “test” at replacement time. Its general form is ?1a:b and means “if group #1 was matched at find time, replace with ‘a’ at replace time, otherwise replace with ‘b’”. In our specific case above, “a” is omitted, which means “insert nothing here”.
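For anyone who wants to try the same idea outside Notepad++, here is a rough Python sketch. Python's `re` module supports neither the `?1a:b` conditional replacement nor `\R`, so this is only an emulation: a replacement function plays the role of the conditional, and `(?:\r\n|\n|\r)` stands in for `\R`. The helper name `delete_inactive_reports` and the short sample are assumptions made up for illustration.

```python
import re

# Same shape as the Find what above; (?:\r\n|\n|\r) stands in for \R,
# which Python's re module does not support.
REPORT_RE = re.compile(
    r"<Report>.*?<Active>(?:(0000000000)|(\d+))</Active>.*?</Report>(?:\r\n|\n|\r)",
    re.DOTALL,
)


def delete_inactive_reports(text: str) -> str:
    """Hypothetical helper: drop every block whose Active field is all zeroes."""

    def keep_or_drop(match: re.Match) -> str:
        # Emulates the conditional: group 1 matched -> nothing, else whole match.
        return "" if match.group(1) is not None else match.group(0)

    return REPORT_RE.sub(keep_or_drop, text)


sample = (
    "<Report>\n    <Active>0000000000</Active>\n</Report>\n"
    "<Report>\n    <Active>0000000001</Active>\n</Report>\n"
    "<Report>\n    <Active>0000000000</Active>\n</Report>\n"
)
result = delete_inactive_reports(sample)
print(result)  # only the block with Active 0000000001 is left
```

Because every block is matched in turn, the search never has a reason to run past a closing tag while hunting for an all-zeroes field, which is what keeps the non-zero blocks safe.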

                  guy038

                  Hello, @christoph-kahle, @claudia-frank, @scott-sumner and All,

                  Now, most members of the N++ community who use regexes in search/replacement operations know the main difference between a lazy and a greedy quantifier! For instance, given the * quantifier, which is shorthand for the syntax {0,}:

                  • The (?-s)0.*9 regex matches the greatest range of characters between digits 0 and 9 of the same line

                  • The (?-s)0.*?9 regex matches the smallest range of characters between digits 0 and 9 of the same line

                  • The (?s)0.*9 regex matches the greatest range of characters between digits 0 and 9, even across several lines

                  • The (?s)0.*?9 regex matches the smallest range of characters between digits 0 and 9, even across several lines
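The four cases in the list can be checked with a small Python sketch. It assumes Python's `re` module rather than Notepad++'s Boost engine: the absence of `re.DOTALL` plays the role of `(?-s)` and its presence plays the role of `(?s)`. The two-line test string is made up for illustration.

```python
import re

text = "a0b9c9\nd9e"

# No re.DOTALL: the dot stops at the line break, like (?-s).
greedy_one_line = re.search(r"0.*9", text).group(0)
lazy_one_line = re.search(r"0.*?9", text).group(0)

# With re.DOTALL: the dot also matches the line break, like (?s).
greedy_multi_line = re.search(r"0.*9", text, re.DOTALL).group(0)
lazy_multi_line = re.search(r"0.*?9", text, re.DOTALL).group(0)

print(repr(greedy_one_line))    # '0b9c9'
print(repr(lazy_one_line))      # '0b9'
print(repr(greedy_multi_line))  # '0b9c9\nd9'
print(repr(lazy_multi_line))    # '0b9'
```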


                  So, assuming the one-line text, below :

                  1234567890<Report>First Block</Report>123457890<Report>Second Block</Report>1234567890<Report>Third Block</Report>1234567890
                  

                  You should, easily, see the difference between the regex (?-s)<Report>.*</Report> and the regex (?-s)<Report>.*?</Report>

                  Just note that the latter regex could be rewritten, as well, (?-s)<Report>((?!</Report>).)*</Report> . Indeed, this regex means :

                  • Find a zone, beginning with <Report>, ending with </Report> and which does not contain, at any position, after <Report>, till </Report>, the string </Report> !
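As a quick check of the three regexes on the one-line text above, here is a Python sketch. It uses Python's `re` module instead of Notepad++, and it makes the inner group non-capturing (an adjustment for this sketch only) so that `re.findall` returns whole matches.

```python
import re

line = (
    "1234567890<Report>First Block</Report>123457890"
    "<Report>Second Block</Report>1234567890"
    "<Report>Third Block</Report>1234567890"
)

greedy = re.findall(r"<Report>.*</Report>", line)
lazy = re.findall(r"<Report>.*?</Report>", line)
# Non-capturing group so that findall returns the whole match.
tempered = re.findall(r"<Report>(?:(?!</Report>).)*</Report>", line)

print(len(greedy))       # 1
print(lazy)              # three separate <Report>...</Report> zones
print(lazy == tempered)  # True
```

The greedy form swallows everything from the first <Report> to the last </Report>, while the lazy form and the look-ahead form both stop at each closing tag.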

                  Now, why is the first form of Claudia’s regex, below, not totally correct ?

                  (?s)<Report>.*?<Active>0000000000</Active>.*?</Report>

                  I, probably, would have built this one, at first sight, too :-) Indeed, this regex looks for the nearest string <Active>0000000000</Active>, after the start tag <Report>, itself followed by the nearest end tag </Report>. Everything seems OK…

                  However, we make a mistake, because reaching the nearest <Active>0000000000</Active> and reaching the nearest </Report> are independent events ! And we may find, first, an <Active>0000000000</Active> block, after crossing many end tags </Report> :-((

                  So, in order to tell the regex engine to find, first, the nearest string <Active>0000000000</Active>, of the SAME block <Report>.....</Report>, we must use Claudia’s second regex, below :

                  (?s)<Report>(?:(?!</Report>).)*<Active>0000000000</Active>.*?</Report>
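The difference between the two forms can be reproduced with a short Python sketch. It assumes Python's `re` module (with `re.DOTALL` in place of `(?s)`) and reuses the three-block sample from earlier in the thread, where only the middle block has a non-zero Active field.

```python
import re

sample = (
    "<Report>\n    <Active>0000000000</Active>\n</Report>\n"
    "<Report>\n    <Active>0000000001</Active>\n</Report>\n"
    "<Report>\n    <Active>0000000000</Active>\n</Report>\n"
)

first_form = re.compile(
    r"<Report>.*?<Active>0000000000</Active>.*?</Report>", re.DOTALL
)
second_form = re.compile(
    r"<Report>(?:(?!</Report>).)*<Active>0000000000</Active>.*?</Report>",
    re.DOTALL,
)

after_first = first_form.sub("", sample)
after_second = second_form.sub("", sample)

# The first form crosses a </Report> and deletes the non-zero block too.
print("0000000001" in after_first)   # False
# The second form stays inside one block, so the non-zero block survives.
print("0000000001" in after_second)  # True
```

Both forms find two matches here, but the second match of the first form spans the second and third blocks together.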

                  Notes :

                  • No need to change the first * greedy quantifier, before <Active>, to its lazy form ! Actually, due to the negative look-ahead, you already know that the range between <Report> and <Active>... is part of the current <Report>...</Report> zone, which always contains, in Christoph’s file, a unique <Active>.....</Active> block !

                  • On the contrary, the last *? lazy quantifier, before </Report>, is, of course, mandatory, to stop at the end of the current <Report> block

                  Cheers,

                  guy038

                    Scott Sumner @guy038

                    @guy038 said:

                    Find a zone, beginning with <Report>, ending with </Report> and which does not contain, at any position, after <Report>, till </Report>, the string </Report> !

                    Hi Guy! I’m sure you can appreciate the humor in your statement. Of course the string </Report> isn’t going to occur before the string </Report> ! Because if it did, then it would have! ;)

                      guy038

                      Hi, @scott-sumner,

                      Yes, my statement looks a bit weird ! But English is not my mother tongue, and a better formulation probably exists in fluent American English :-)

                      Oh ! Perhaps I should have written :

                      • Find a zone, beginning with <Report>, ending with </Report> and which does not contain, at any position, after <Report>, till </Report>, ANOTHER string </Report> ?

                      Actually, I just wanted to point out, that the simple case (?-s)a.*?b, with a lazy quantifier, may, as well, be written, with a greedy quantifier :

                      • (?-s)a((?!b).)*b

                      • a[^b\r\n]*b
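A small Python sketch can confirm that the lazy form and the two greedy forms agree on a sample string. It assumes Python's `re` module, a made-up test string that uses only `\n` line breaks, and a non-capturing group in the look-ahead form so that `re.findall` returns whole matches. Note that the negated-class form only works because `b` is a single character.

```python
import re

text = "xxaYYbZZb a-b\nab a\nb"

lazy = re.findall(r"a.*?b", text)
tempered = re.findall(r"a(?:(?!b).)*b", text)
negated_class = re.findall(r"a[^b\r\n]*b", text)

print(lazy)                   # ['aYYb', 'a-b', 'ab']
print(lazy == tempered)       # True
print(lazy == negated_class)  # True
```

The trailing "a", followed by a line break and then "b", is not matched by any of the three, since none of them is allowed to cross a line break.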

                      Cheers,

                      guy038

                        PeterJones

                        @Scott-Sumner,

                        You’ve got to admit, even for native speakers, it’s sometimes hard to translate a regular expression into clear English. I would probably phrase the non-greedy version as

                        Find a zone, beginning with <Report> and ending with the first instance of </Report>

                        and the greedy version as

                        Find a zone, beginning with <Report> and ending with the last instance of </Report>

                          Luiz Antonio Souza Ribeiro @guy038

                          @guy038 said in How to delete a complete tag with specific content:

                          (?s)<Report>(?:(?!</Report>).)*<Active>0000000000</Active>.*?</Report>

                          Thank you. It works for me in a similar case.
