How to delete a complete tag with specific content
-
@Christoph-Kahle said:
With the info provided it seems that it is working.

Not sure what I’m missing.
Cheers
Claudia -
Unfortunately, Claudia’s regex can grab too much sometimes. For example, if run on the following data, it will select the first 3 lines with the first Find Next press (good!), but the second Find Next press will match the 6 lines after that (BAD! as that contains a non-zero “Active” field and a replace with “nothing” operation would delete that!):
<Report>
<Active>0000000000</Active>
</Report>
<Report>
<Active>0000000001</Active>
</Report>
<Report>
<Active>0000000000</Active>
</Report>
Try this instead:
Pre-step: Back up your data!
Find what zone: (?s-i)<Report>.*?<Active>(?:(0000000000)|(\d+))</Active>.*?</Report>\R
Replace with zone: (?1:$0)
Search mode: Regular expression
Wrap around checkbox: Ticked
Action: Press the Replace All button -
Still the same.
Now a larger sample:
<Report> <UserName>Angebot </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000338</IndexPerm><TempIndex>0000000338</TempIndex></Report> <Report> <UserName>Angebot.</UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000339</IndexPerm><TempIndex>0000000339</TempIndex></Report> <Report> <UserName>Angebot </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000340</IndexPerm><TempIndex>0000000340</TempIndex></Report> <Report> <UserName>Angebot </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000341</IndexPerm><TempIndex>0000000341</TempIndex></Report> <Report> <UserName>Angebot </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000342</IndexPerm><TempIndex>0000000342</TempIndex></Report> <Report> <UserName>Angebot </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000343</IndexPerm><TempIndex>0000000343</TempIndex></Report> <Report> <UserName>Angebot </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag 
brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000344</IndexPerm><TempIndex>0000000344</TempIndex></Report> <Report> <UserName>Angebot </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000345</IndexPerm><TempIndex>0000000345</TempIndex></Report> <Report> <UserName>Artikel mit hinterlegten Freifeldern </UserName> <RefReport>Drucker</RefReport> <Filter>Artikelliste</Filter> <VMB>artlist12.vmb</VMB> <Active>0000000001</Active> <ChangeTarget>Nein</ChangeTarget> <UserName filter="handwerk hw_plus">Material mit hinterlegten Freifeldern 1-6</UserName><IndexPerm>0000000346</IndexPerm><TempIndex>0000000346</TempIndex></Report> <Report> <UserName>Kunden mit hinterlegten Freifeldern </UserName> <UserName filter="Lbb Stb">UStIdNr-Anfragen Mandanten</UserName> <RefReport>Drucker</RefReport> <Filter>Kundenliste</Filter> <VMB>KundFrei1.vmb</VMB> <ChangeTarget>Nein</ChangeTarget> <Remark/> <Argument1/> <Argument2/> <UseProgram>Ja</UseProgram> <Active>0000000001</Active> <Group>Kundenliste</Group> <CommandID>0x800B</CommandID> <PreviewBmp>KunUstID.png</PreviewBmp> <IndexPerm>0000000347</IndexPerm><TempIndex>0000000347</TempIndex></Report> <Report> <UserName>Lieferanten mit hinterlegten Freifeldern 1-6</UserName> <RefReport>Drucker</RefReport> <Filter>Lieferantenliste</Filter> <VMB>liefFrei1.vmb</VMB> <Remark></Remark> <Argument1/> <Argument2/> <UseProgram>Ja</UseProgram> <Active>0000000001</Active> <ChangeTarget>Nein</ChangeTarget> <Group>Lieferantenliste</Group> <CommandID>0x8038</CommandID> <PreviewBmp>KunUstID.png</PreviewBmp> <IndexPerm>0000000348</IndexPerm><TempIndex>0000000348</TempIndex></Report> <Report> <UserName>Angebot </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag 
brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000349</IndexPerm><TempIndex>0000000349</TempIndex></Report> <Report> <UserName>Rechnung </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro.vmb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000001</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000350</IndexPerm><TempIndex>0000000350</TempIndex></Report> <Report> <UserName>Angebot </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000351</IndexPerm><TempIndex>0000000351</TempIndex></Report> <Report> <UserName>Angebot </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000352</IndexPerm><TempIndex>0000000352</TempIndex></Report> <Report> <UserName>Angebot </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000353</IndexPerm><TempIndex>0000000353</TempIndex></Report> <Report> <UserName>Angebot </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000354</IndexPerm><TempIndex>0000000354</TempIndex></Report> <Report> <UserName>Angebot </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> 
<ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000355</IndexPerm><TempIndex>0000000355</TempIndex></Report> <Report> <UserName>Auftragsbestätigunglogo </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro.vmb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000001</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000388</IndexPerm><TempIndex>0000000388</TempIndex><Group></Group><Remark></Remark><Argument2></Argument2><UseProgram>Ja</UseProgram></Report> <Report> <UserName>Gutschrift </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro.vmb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000358</IndexPerm><TempIndex>0000000358</TempIndex><Group></Group><Remark></Remark><Argument2></Argument2><UseProgram>Ja</UseProgram></Report> <Report> <UserName>Gutschrift </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro.vmb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000001</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000357</IndexPerm><TempIndex>0000000357</TempIndex><Group></Group><Remark></Remark><Argument2></Argument2><UseProgram>Ja</UseProgram></Report> <Report> <UserName>Gutschrift old</UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro.vmb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000356</IndexPerm><TempIndex>0000000356</TempIndex><Group></Group><Remark></Remark><Argument2></Argument2><UseProgram>Ja</UseProgram></Report> <Report> <UserName>Gutschrift </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro.vmb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> 
<IndexPerm>0000000360</IndexPerm><TempIndex>0000000360</TempIndex><Group></Group><Remark></Remark><Argument2></Argument2><UseProgram>Ja</UseProgram></Report> <Report> <UserName>eigene Bestellungen</UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000001</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000359</IndexPerm><TempIndex>0000000359</TempIndex></Report> <Report> <UserName>Angebot </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000361</IndexPerm><TempIndex>0000000361</TempIndex></Report> <Report> <UserName>Angebot </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000362</IndexPerm><TempIndex>0000000362</TempIndex></Report> <Report> <UserName>Angebot </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000363</IndexPerm><TempIndex>0000000363</TempIndex></Report> <Report> <UserName>Angebot </UserName> <RefReport>Drucker</RefReport> <Filter>netto Auftrag brutto layout</Filter> <VMB>lay_pro_gue.umb</VMB> <Argument1>layout.ini</Argument1> <Active>0000000000</Active> <ChangeTarget>Nein</ChangeTarget> <IndexPerm>0000000364</IndexPerm><TempIndex>0000000364</TempIndex></Report> -
Well, trying my solution on that new data works for me!
-
I see: my match gets expanded until a Report tag with an all-zero Active field is found.
Scott’s version is working for me as well.
Cheers
Claudia -
@Christoph-Kahle , @Scott-Sumner
find what: (?s)<Report>(?:(?!</Report>).)*<Active>0000000000</Active>.*?</Report>
replace with: empty
This seems to work as well :-D
Cheers
Claudia -
So some explanation is probably in order.
The Find what will match EVERY <Report> thru </Report> (with the line-ending after the closing tag), one by one, as it moves thru the file. The difference is that when an all-zeroes Active field is encountered, it gets saved into “group #1”.
The Replace with is where the difference comes in: if “group #1” was matched during the Find, the replacement text is “nothing” (thus a delete operation); otherwise it is the entire match (represented by $0) from the find stage.
The ?1 is a “test” at replacement time. Its general form is ?1a:b and means “if group #1 was matched at find time, replace with ‘a’ at replace time, otherwise replace with ‘b’”. In our specific case above, “a” is omitted, which means “insert nothing here”. -
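Scott's conditional replacement, explained above, relies on the replacement syntax that Notepad++ accepts. As a rough illustration outside Notepad++, the same idea can be sketched in Python, whose `re` module has no conditional replacement string, so a replacement function stands in for `(?1:$0)`. The names `PATTERN`, `delete_inactive` and `SAMPLE` are made up for this sketch, `\R` is spelled out as an explicit alternation, and `(?s-i)` is reduced to `(?s)` because Python's `re` does not take those forms.

```python
import re

# Python adaptation of the "Find what" expression: group 1 is set only when
# the Active field is all zeroes, group 2 when it holds any other number.
PATTERN = re.compile(
    r"(?s)<Report>.*?<Active>(?:(0000000000)|(\d+))</Active>"
    r".*?</Report>(?:\r\n|\n|\r)"
)


def delete_inactive(text: str) -> str:
    """Drop every matched block whose group 1 took part in the match."""

    def repl(match: re.Match) -> str:
        # Emulates (?1:$0): nothing if group 1 matched, else the whole match.
        return "" if match.group(1) is not None else match.group(0)

    return PATTERN.sub(repl, text)


# Hypothetical sample, modelled on the three-block data earlier in the thread.
SAMPLE = (
    "<Report>\n<Active>0000000000</Active>\n</Report>\n"
    "<Report>\n<Active>0000000001</Active>\n</Report>\n"
    "<Report>\n<Active>0000000000</Active>\n</Report>\n"
)

print(delete_inactive(SAMPLE))
```

Only the middle block, with the non-zero Active field, should survive.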
Hello, @christoph-kahle, @claudia-frank, @scott-sumner and All,
Now, most members of the N++ community who use regexes in search/replacement operations know the main difference between a lazy and a greedy quantifier! For instance, given the * quantifier, which is a shortcut for the syntax {0,}:
- The (?-s)0.*9 regex matches the greatest range of characters between the digits 0 and 9 of a same line
- The (?-s)0.*?9 regex matches the smallest range of characters between the digits 0 and 9 of a same line
- The (?s)0.*9 regex matches the greatest range of characters between the digits 0 and 9, even over several lines
- The (?s)0.*?9 regex matches the smallest range of characters between the digits 0 and 9, even over several lines
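The four cases in the list above can be tried out with a small Python sketch. The sample `TEXT` is invented for the purpose, and the `(?-s)` prefix is simply left out because `.` does not match line breaks by default in Python's `re`.

```python
import re

# Hypothetical three-line sample: line 1 has two 9s after its 0,
# line 2 has a 0 but no 9, line 3 has a 9 only.
TEXT = "0ab9cd9\n0ef\ngh9"

greedy_line = re.findall(r"0.*9", TEXT)       # greedy, dot stays on one line
lazy_line = re.findall(r"0.*?9", TEXT)        # lazy, dot stays on one line
greedy_multi = re.findall(r"(?s)0.*9", TEXT)  # greedy, dot crosses line breaks
lazy_multi = re.findall(r"(?s)0.*?9", TEXT)   # lazy, dot crosses line breaks

for name, found in [
    ("greedy, single line", greedy_line),
    ("lazy, single line", lazy_line),
    ("greedy, multi-line", greedy_multi),
    ("lazy, multi-line", lazy_multi),
]:
    print(name, found)
```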
So, assuming the one-line text, below:
1234567890<Report>First Block</Report>123457890<Report>Second Block</Report>1234567890<Report>Third Block</Report>1234567890
You should easily see the difference between the regex (?-s)<Report>.*</Report> and the regex (?-s)<Report>.*?</Report>
Just note that the latter regex could be rewritten, as well, as (?-s)<Report>((?!</Report>).)*</Report>. Indeed, this regex means:
- Find a zone, beginning with <Report>, ending with </Report> and which does not contain, at any position, after <Report>, till </Report>, the string </Report> !
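For readers who want to check this on the one-line text, here is a minimal Python sketch. The helper `spans` is a made-up name, and `(?-s)` is dropped because Python's `.` already stops at line breaks by default.

```python
import re

# The one-line sample text from the post.
TEXT = (
    "1234567890<Report>First Block</Report>123457890"
    "<Report>Second Block</Report>1234567890"
    "<Report>Third Block</Report>1234567890"
)


def spans(pattern: str) -> list:
    """Return the full text of every match of pattern in TEXT."""
    return [m.group(0) for m in re.finditer(pattern, TEXT)]


greedy = spans(r"<Report>.*</Report>")                  # first tag to LAST end tag
lazy = spans(r"<Report>.*?</Report>")                   # each block on its own
tempered = spans(r"<Report>((?!</Report>).)*</Report>")  # greedy, guarded by look-ahead

print(len(greedy), len(lazy), len(tempered))
```

The greedy form should give one long match, while the lazy form and the look-ahead form should give the same three blocks.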
Now, why is the first form of Claudia’s regex, below, not totally correct?
(?s)<Report>.*?<Active>0000000000</Active>.*?</Report>
I probably would have built this one, at first sight, too :-) Indeed, this regex looks for the nearest string <Active>0000000000</Active> after the start tag <Report>, itself followed by the nearest end tag </Report>. Everything seems OK…
However, we make a mistake, because reaching the nearest <Active>0000000000</Active> and reaching the nearest </Report> are independent events! And we may find, first, an <Active>0000000000</Active> block, after crossing many end tags </Report> :-((
So, in order to tell the regex engine to find, first, the nearest string <Active>0000000000</Active> of the SAME block <Report>.....</Report>, we must use Claudia’s second regex, below:
(?s)<Report>(?:(?!</Report>).)*<Active>0000000000</Active>.*?</Report>
Notes:
- No need to change the first * greedy quantifier, before <Active>, into its lazy form! Actually, due to the negative look-ahead, you already know that the range between <Report> and <Active>... is part of the current <Report>...</Report> zone, which always contains, in Christoph’s file, a unique <Active>.....</Active> block!
- On the contrary, the last *? lazy quantifier, before </Report>, is, of course, mandatory, to reach the end of the current <Report> block
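The difference between the two forms of Claudia's regex can be reproduced with a short Python sketch, run on the three-block data from earlier in the thread. The names `DATA`, `FIRST`, `SECOND` and `matches` are only there for illustration.

```python
import re

# Three-block sample from earlier in the thread, one tag per line.
DATA = (
    "<Report>\n<Active>0000000000</Active>\n</Report>\n"
    "<Report>\n<Active>0000000001</Active>\n</Report>\n"
    "<Report>\n<Active>0000000000</Active>\n</Report>\n"
)

# First form: the lazy .*? may walk across an end tag </Report>.
FIRST = r"(?s)<Report>.*?<Active>0000000000</Active>.*?</Report>"
# Second form: the look-ahead keeps the search inside the current block.
SECOND = r"(?s)<Report>(?:(?!</Report>).)*<Active>0000000000</Active>.*?</Report>"


def matches(pattern: str, text: str) -> list:
    """Return the full text of every match of pattern in text."""
    return [m.group(0) for m in re.finditer(pattern, text)]


first = matches(FIRST, DATA)
second = matches(SECOND, DATA)

print("first form :", [m.count("<Report>") for m in first])
print("second form:", [m.count("<Report>") for m in second])
```

With the first form, the second match should swallow two blocks, including the one with the non-zero Active field. With the second form, every match should stay within a single block.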
Cheers,
guy038
-
-
@guy038 said:
Find a zone, beginning with <Report>, ending with </Report> and which does not contain, at any position, after <Report>, till </Report>, the string </Report> !
Hi Guy! I’m sure you can appreciate the humor in your statement. Of course the string </Report> isn’t going to occur before the string </Report>! Because if it did, then it would have! ;) -
Hi, @scott-sumner,
Yes, my statement looks a bit weird! But English is not my mother tongue and, probably, a better formulation exists in fluent American English :-)
Oh! Perhaps I should have written:
- Find a zone, beginning with <Report>, ending with </Report> and which does not contain, at any position, after <Report>, till </Report>, ANOTHER string </Report>?
Actually, I just wanted to point out that the simple case (?-s)a.*?b, with a lazy quantifier, may, as well, be written with a greedy quantifier:
- (?-s)a((?!b).)*b
- a[^b\r\n]*b
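A quick Python check of this equivalence, on an invented sample `LINE`, might look as follows. Again `(?-s)` is left out, since Python's `.` does not match line breaks by default. Note that the negated-class form only works because the terminator `b` is a single character.

```python
import re

# Hypothetical sample: two a...b runs on line 1, then an "a" whose "b"
# only comes on the following line.
LINE = "xxaIIIbyyaJJbzz\nkka\nb"

lazy = re.findall(r"a.*?b", LINE)
tempered = [m.group(0) for m in re.finditer(r"a((?!b).)*b", LINE)]
negated = re.findall(r"a[^b\r\n]*b", LINE)

print(lazy, tempered, negated)
```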
Cheers,
guy038
-
You’ve got to admit, even for native speakers, it’s sometimes hard to translate a regular expression into clear English. I would probably phrase the non-greedy version as
Find a zone, beginning with <Report> and ending with the first instance of </Report>
and the greedy version as
Find a zone, beginning with <Report> and ending with the last instance of </Report> -
@guy038 said in How to delete a complete tag with specific content:
(?s)<Report>(?:(?!</Report>).)*<Active>0000000000</Active>.*?</Report>
Thank you. It works for me in a similar case.