Notepad++ How to delete sections of text starting with a line containing a certain phrase
-
I’m trying to edit a few calibre files that have tags attached, but the tag line is not always formatted the same.
E.g.
<div class="pcalibre1 pcalibre2 pcalibre tags-list">
<div class="pcalibre1 pcalibre2 tags-list pcalibre">
I want to delete everything including and between the lines containing tags-list and entry-speaker.
Is there an easy way to do this with regex?
-
@Banjo-G said in Notepad++ How to delete sections of text starting with a line containing a certain phrase:
delete everything including and between the lines containing tags-list and entry-speaker
Your example doesn’t show any entry-speaker line, so I suggest you have a look HERE.
-
@Alan-Kilborn Sorry, those were examples of the lines not being formatted the same.
A better example would be:
<div class="pcalibre2 pcalibre1 pcalibre tags-list">
…
…
…
…
…
…
…
<h4 class="pcalibre2 pcalibre1 pcalibre entry-speaker">
with all of these lines to be deleted. The problem is that the class names in the tags-list and entry-speaker lines often appear in a different order.
E.g.
<div class="pcalibre2 pcalibre tags-list pcalibre1">
<div class="pcalibre1 pcalibre2 tags-list pcalibre">
<div class="pcalibre1 pcalibre2 pcalibre tags-list">
<div class="pcalibre2 pcalibre1 pcalibre tags-list">
-
Hello, @banjo-g, @alan-kilborn and All,
I think you could test this regex S/R, below :
SEARCH
(?-s)^.+?tags-list(?s).+?entry-speaker.+?$\R
REPLACE
Leave EMPTY
against this sample text :
blabla bhahblah blabla
<div class="pcalibre2 pcalibre1 pcalibre tags-list">
…
…
…
…
<h4 class="pcalibre2 entry-speaker pcalibre1 pcalibre">
blabla bhahblah blabla
<div class="pcalibre1 tags-list pcalibre2 pcalibre">
…
…
…
…
<h4 class="pcalibre2 pcalibre1 pcalibre entry-speaker">
blabla bhahblah blabla
<div class="pcalibre2 pcalibre tags-list pcalibre1">
…
…
…
…
<h4 class="entry-speaker pcalibre2 pcalibre1 pcalibre">
blabla bhahblah blabla rt
<div class="tags-list pcalibre1 pcalibre2 pcalibre">
…
…
…
…
<h4 class="pcalibre2 pcalibre1 entry-speaker pcalibre">
blabla bhahblah blabla
should be OK ;-))
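For anyone who would rather script this outside Notepad++, here is a minimal Python sketch of the same idea. It is an adaptation, not a one-to-one copy of the regex above: as far as I know, Python's re module does not support \R or switching (?s) on part-way through a pattern, so the line-bound parts are written with [^\n]* instead. The names TAG_BLOCK and strip_tag_blocks, and the sample text, are made up for illustration.

```python
import re

# Adapted pattern: only the middle part is allowed to cross line breaks.
TAG_BLOCK = re.compile(
    r"^[^\n]*tags-list"      # the whole start of the line containing tags-list
    r".*?"                   # shortest run of anything, line breaks included
    r"entry-speaker[^\n]*"   # through the end of the entry-speaker line
    r"(?:\r?\n|\Z)",         # plus its line break, or the end of the text
    re.MULTILINE | re.DOTALL,
)


def strip_tag_blocks(text: str) -> str:
    """Remove every tags-list ... entry-speaker block, whole lines included."""
    return TAG_BLOCK.sub("", text)


sample = (
    "keep 1\n"
    '<div class="pcalibre1 tags-list pcalibre2 pcalibre">\n'
    "tag a\n"
    "tag b\n"
    '<h4 class="pcalibre2 entry-speaker pcalibre1 pcalibre">\n'
    "keep 2\n"
)
print(strip_tag_blocks(sample))
```

Because the pattern only looks for the words tags-list and entry-speaker somewhere on a line, the order of the other class names does not matter.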
Best Regards,
guy038
-
@guy038 That works wonders, thanks!
-
Hi, @banjo-g, @alan-kilborn and All,
A more general form of the regex described in my previous post could be:
SEARCH
(?-is)^.*?Expression A(?s).*?Expression B.*?$\R
Basically, this regex:
- Searches for two lines:
  - A line A, containing Expression A at any location in line A
  - A line B, containing Expression B at any location in line B
- Selects the whole range of characters, generally spanning several lines, from the beginning of line A to the end of line B, including its EOL characters
- Lines A and B may be the same line. However, in that case, Expression B must be located after Expression A in that line!
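The behaviour described above can be sketched as a small Python helper. This is only an illustration under a few assumptions: delete_line_range is a hypothetical name, both expressions are treated as literal text (via re.escape) rather than as regexes, and the pattern is adapted to Python's re module, which to my knowledge does not support \R.

```python
import re


def delete_line_range(text: str, expr_a: str, expr_b: str) -> str:
    """Delete from the start of the line holding expr_a to the end of the
    line holding expr_b, line break included. Both are literal strings."""
    pattern = re.compile(
        r"^[^\n]*?" + re.escape(expr_a)   # line A, up to Expression A
        + r".*?" + re.escape(expr_b)      # shortest span up to Expression B
        + r"[^\n]*(?:\r?\n|\Z)",          # rest of line B and its line break
        re.MULTILINE | re.DOTALL,
    )
    return pattern.sub("", text)


text = (
    "one\n"
    "xx START yy\n"
    "middle\n"
    "zz END ww\n"
    "two\n"
    "START and END together\n"
    "three\n"
)
print(delete_line_range(text, "START", "END"))
```

Note that when END comes before START on the same line, the match does not stop on that line: it keeps going until the next END further down, which is the "must be located after" rule in action.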
Notes:
- First, the in-line modifier (?-is):
  - Makes the search case-sensitive
  - Forces the regex engine to interpret the dot symbol . as matching a single standard character (not an EOL character)
- Then, the part ^.*?Expression A matches, from the beginning of a line, the shortest range, possibly empty, of standard characters followed by Expression A, with that exact case
- Now, the part (?s).*?Expression B looks for the shortest range, possibly empty, of characters, including EOL characters, followed by Expression B, with that exact case
- Finally, the part .*?$\R matches the shortest range, possibly empty, of characters up to an end of line, followed by its line-break
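To see what the modifiers in these notes change, here is a rough Python illustration. It uses Python's re flags as stand-ins (re.DOTALL for (?s), re.IGNORECASE for (?i)); Notepad++ uses a different regex engine, so take this as an analogy rather than a description of Notepad++ itself. The sample strings are invented.

```python
import re

text = "alpha Start\nmiddle\nomega End\n"

# Without DOTALL (comparable to (?-s)), the dot stops at a line break.
same_line_only = re.search(r"Start.*?End", text)

# With DOTALL (comparable to (?s)), the dot also matches line breaks.
across_lines = re.search(r"Start.*?End", text, re.DOTALL)

# Matching is case-sensitive unless IGNORECASE (comparable to (?i)) is set.
wrong_case = re.search(r"start.*?end", text, re.DOTALL)
any_case = re.search(r"start.*?end", text, re.DOTALL | re.IGNORECASE)

# A lazy quantifier stops at the first possible End, a greedy one at the last.
two_ends = "Start a End b End"
lazy = re.search(r"Start.*?End", two_ends).group()
greedy = re.search(r"Start.*End", two_ends).group()

print(same_line_only, across_lines is not None, wrong_case, any_case is not None)
print(lazy, "|", greedy)
```

The lazy form is what keeps each match from swallowing everything up to the last entry-speaker line in the file.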
Cheers,
guy038
-