Notepad++ How to delete sections of text starting with a line containing a certain phrase
-
I’m trying to edit a few calibre files that have tags attached, but the tag line is not always formatted the same.
Eg.
div class=“pcalibre1 pcalibre2 pcalibre tags-list”
div class=“pcalibre1 pcalibre2 tags-list pcalibre”.
I want to delete everything including and between the lines containing tags-list and entry-speaker.
Is there an easy way to do this with regex?
-
@Banjo-G said in Notepad++ How to delete sections of text starting with a line containing a certain phrase:
delete everything including and between the lines containing tags-list and entry-speaker
Your example doesn’t show any entry-speaker, so I suggest you have a look HERE.
-
@Alan-Kilborn Sorry, those were examples of the lines not being formatted the same.
A better example would be:
<div class=“pcalibre2 pcalibre1 pcalibre tags-list”>
…
…
…
…
…
…
…
<h4 class=“pcalibre2 pcalibre1 pcalibre entry-speaker”>
I want all of these lines deleted. The problem is that the class names on the tags-list and entry-speaker lines are often in a different order.
Eg.
<div class=“pcalibre2 pcalibre tags-list pcalibre1”>
<div class=“pcalibre1 pcalibre2 tags-list pcalibre”>
<div class=“pcalibre1 pcalibre2 pcalibre tags-list”>
<div class=“pcalibre2 pcalibre1 pcalibre tags-list”>
-
Hello, @banjo-g, @alan-kilborn and All,
I think you could test this regex S/R, below :
SEARCH
(?-s)^.+?tags-list(?s).+?entry-speaker.+?$\R
REPLACE
Leave EMPTY
against this sample text :
blabla bhahblah blabla
<div class=“pcalibre2 pcalibre1 pcalibre tags-list”>
… … … …
<h4 class=“pcalibre2 entry-speaker pcalibre1 pcalibre”>
blabla bhahblah blabla
<div class=“pcalibre1 tags-list pcalibre2 pcalibre”>
… … … …
<h4 class=“pcalibre2 pcalibre1 pcalibre entry-speaker”>
blabla bhahblah blabla
<div class=“pcalibre2 pcalibre tags-list pcalibre1”>
… … … …
<h4 class=“entry-speaker pcalibre2 pcalibre1 pcalibre”>
blabla bhahblah blabla rt
<div class=“tags-list pcalibre1 pcalibre2 pcalibre”>
… … … …
<h4 class=“pcalibre2 pcalibre1 entry-speaker pcalibre”>
blabla bhahblah blabla
It should be OK ;-))
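For anyone who wants to run the same deletion outside Notepad++, here is a minimal Python sketch. It is a translation under assumptions, not an exact equivalent: Python's re module has no \R and does not accept a (?s) switch in the middle of a pattern, so the sketch uses scoped (?s:...) groups and an explicit line-break alternation instead. The name strip_tag_blocks and the shortened sample text are hypothetical.

```python
import re

# Rough Python counterpart of (?-s)^.+?tags-list(?s).+?entry-speaker.+?$\R
# Assumption: re.MULTILINE stands in for Notepad++'s line-based ^ and $.
BLOCK = re.compile(
    r"^.+?tags-list"          # from the start of the line holding tags-list
    r"(?s:.+?)entry-speaker"  # shortest run, across lines, up to entry-speaker
    r"(?s:.+?)$"              # rest of the entry-speaker line
    r"(?:\r\n|\n|\r)",        # its line break (stand-in for \R)
    re.MULTILINE,
)


def strip_tag_blocks(text: str) -> str:
    """Delete every block running from a tags-list line to an entry-speaker line."""
    return BLOCK.sub("", text)


# Hypothetical, shortened sample with the class names in varying order.
SAMPLE = (
    "keep 1\n"
    '<div class="pcalibre2 pcalibre1 pcalibre tags-list">\n'
    "tag a\n"
    "tag b\n"
    '<h4 class="pcalibre2 entry-speaker pcalibre1 pcalibre">\n'
    "keep 2\n"
    '<div class="pcalibre1 tags-list pcalibre2 pcalibre">\n'
    "tag c\n"
    '<h4 class="pcalibre2 pcalibre1 pcalibre entry-speaker">\n'
    "keep 3\n"
)

if __name__ == "__main__":
    print(strip_tag_blocks(SAMPLE), end="")
```

With this sample, only the three "keep" lines remain. As in the Notepad++ version, a block whose entry-speaker line is the very last line of the file, with no line break after it, is not matched.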
Best Regards,
guy038
-
@guy038 That works wonders, thanks!
-
Hi, @banjo-g, @alan-kilborn and All,
A generic and general form of the regex, described in my previous post, could be :
SEARCH
(?-is)^.*?Expression A(?s).*?Expression B.*?$\R
Basically, this regex :
- Searches for two lines :
  - A line A, containing Expression A, at any location of line A
  - A line B, containing Expression B, at any location of line B
- Selects the whole range of characters, generally spanning several lines, from the beginning of line A to the end of line B, including its EOL characters
- The lines A and B may be identical. However, in that case, Expression B must be located after Expression A in that line !
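The behaviour described above can be sketched in Python as a small helper. This is only an illustration under assumptions: the names make_block_pattern and delete_blocks are hypothetical, both expressions are treated as literal text via re.escape (the Notepad++ form accepts any regex there), and (?:\r\n|\n|\r) stands in for \R, which Python's re module does not support.

```python
import re


def make_block_pattern(expr_a: str, expr_b: str) -> re.Pattern:
    """Build a pattern spanning from the line with expr_a to the line with expr_b."""
    a = re.escape(expr_a)  # assumption: expressions are literal strings
    b = re.escape(expr_b)
    return re.compile(
        rf"^.*?{a}(?s:.*?){b}(?s:.*?)$(?:\r\n|\n|\r)",
        re.MULTILINE,
    )


def delete_blocks(text: str, expr_a: str, expr_b: str) -> str:
    """Remove every range from the start of line A to the end of line B."""
    return make_block_pattern(expr_a, expr_b).sub("", text)


if __name__ == "__main__":
    # A and B on different lines: lines 2 to 4 are removed.
    print(repr(delete_blocks("x\nfoo START here\nmid\nthen STOP end\ny\n", "START", "STOP")))
    # A and B on the same line, in that order: the line is removed.
    print(repr(delete_blocks("x\nSTART and STOP\ny\n", "START", "STOP")))
    # B before A on the same line, with no later B: nothing is removed.
    print(repr(delete_blocks("x\nSTOP then START\ny\n", "START", "STOP")))
```

The first two calls give 'x\ny\n', and the third returns its input unchanged, which matches the remark that Expression B must come after Expression A when both sit on the same line.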
Notes :
- First, the in-line modifier (?-is) :
  - Makes the search case-sensitive
  - Forces the regex engine to interpret the regex dot symbol . as matching a single standard character ( not EOL ones )
- Then, the part ^.*?Expression A matches, from the beginning of a line, the shortest range, possibly empty, of standard characters, followed by Expression A, with that exact case
- Now, the part (?s).*?Expression B looks for the shortest range, possibly empty, of characters, including EOL ones, followed by Expression B, with that exact case
- Finally, the part .*?$\R searches for the shortest range, possibly empty, of characters up to an end of line, followed by its line-break
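To see what the two modifiers in these notes change, here is a tiny Python sketch. It is an approximation: Python's re module is not the Boost engine used by Notepad++, and its plain dot only excludes \n, so the sample text uses \n line endings. The variable names are hypothetical.

```python
import re

text = "first line\nsecond line"

# Like (?-s): the dot does not cross the line break, so there is no match.
dot_stops_at_eol = re.search(r"first.*second", text)

# Like (?s): the dot also matches the line break, so the match spans both lines.
dot_crosses_eol = re.search(r"first(?s:.*)second", text)

# Like (?-i): the search is case-sensitive, so FIRST is not found.
case_sensitive = re.search(r"FIRST", text)

# Like (?i): case is ignored, so FIRST matches "first".
case_insensitive = re.search(r"(?i)FIRST", text)

if __name__ == "__main__":
    print(dot_stops_at_eol, dot_crosses_eol, case_sensitive, case_insensitive)
```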
Cheers,
guy038
-