Match everything except the text and <br> tags
-
Hi, @dr-ramaanand and All,
Ah, of course, if you add a <div class="left"> line right after the first <div style="..... line, it will not work!
So, given this INPUT text, pasted in a new tab:
<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">
<div class="left">
<p class=MsoNormal><b><span style='font-size:13.5pt;line-height:115%; font-family:"Verdana","sans-serif";color:red'>SYNONYMS </span></b> </p>
<div class="left">
Simply replace the previous search regex with this new version:
(?s)\A.+?\R\s*\K<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">\s*.*?\s*<div class="left">
Note the difference: between #EBF4FB;">\s* and \s*<div class="left">, I changed the part .+? to .*?
I also slightly changed the position of the \K feature.
As expected, this new regex will match the two consecutive lines:
<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">
<div class="left">
BR
guy038
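For readers who want to see why the switch from .+? to .*? matters, here is a minimal Python sketch. It is only an approximation of the Notepad++ search: it tests just the middle part of the regex (the \A prefix and \K are left out), and the sample is a shortened stand-in for the text above. The names STYLE_DIV, LEFT_DIV, with_plus and with_star are made up for this illustration.

```python
import re

STYLE_DIV = '<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">'
LEFT_DIV = '<div class="left">'

# Shortened stand-in for the sample above, one tag per line (an assumption).
sample = "\n".join([STYLE_DIV, LEFT_DIV, "<p class=MsoNormal>SYNONYMS</p>", LEFT_DIV])

# re.escape() keeps the literal tags from being read as regex syntax.
head, tail = re.escape(STYLE_DIV), re.escape(LEFT_DIV)

with_plus = re.compile(head + r"\s*.+?\s*" + tail, re.DOTALL)  # needs at least one char in between
with_star = re.compile(head + r"\s*.*?\s*" + tail, re.DOTALL)  # allows nothing in between

plus_match = with_plus.search(sample).group(0)
star_match = with_star.search(sample).group(0)

print(plus_match.count(LEFT_DIV))  # 2: runs on to the second <div class="left">
print(star_match.count(LEFT_DIV))  # 1: stops at the first <div class="left">
```

With .+? the engine has to consume at least one character before it may look for the closing tag, so it steps over the <div class="left"> that sits directly on the next line.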
-
@guy038 This RegEx:
(?s)\A.+?\R\K\s*<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">.+(?=\s*+<div class="left">)
would have stopped searching just before the second occurrence of <div class="left"> if the sample to be searched was like this:
<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">
<div class="left">
<div class="left">
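A rough Python sketch of this behaviour, under a few assumptions: Python's re module has no \K or \R, so a capture group stands in for \K and (?:\r\n|\n|\r) stands in for \R; the possessive \s*+ is written as plain \s*, which should not change the result here; and a leading line is added to the sample because the pattern requires some text and a line break before the block. The names below are hypothetical.

```python
import re

STYLE_DIV = '<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">'
LEFT_DIV = '<div class="left">'

# The first line is an assumption: the pattern needs text plus a line break before the block.
sample = "\n".join(["Some earlier line", STYLE_DIV, LEFT_DIV, LEFT_DIV])

NEWLINE = r"(?:\r\n|\n|\r)"  # rough stand-in for \R

pattern = re.compile(
    r"\A.+?" + NEWLINE
    + r"(\s*" + re.escape(STYLE_DIV) + r".+)"    # group 1 plays the role of the text after \K
    + r"(?=\s*" + re.escape(LEFT_DIV) + r")",    # lookahead: the next tag is checked, not consumed
    re.DOTALL,
)

found = pattern.search(sample).group(1)
print(repr(found))
print(found.count(LEFT_DIV))  # 1: the greedy .+ backs off to just before the last <div class="left">
```

Because .+ is greedy, the match ends just before the last occurrence of the tag, which in this three-line block is the second one.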
-
Yes, your regex does match the same amount of text as my version, but my regex seems simpler and more logical!
BR
guy038
-
@guy038 OK, thank you very much!
-
@guy038 your last RegEx finds the first occurrence of <div class="left"> even if there is some other text above it. Lovely!
-
Hi, @dr-ramaanand and All,
Again, I did not check all the possibilities before posting. Sorry for the NOISE !
So, the right regex to use should be :
(?s)\A.*?\s*\K<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">\s*.*?\s*<div class="left">
This time, it will work if you paste this text in a new tab:
<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">
<div class="left">
<div class="left">
But it will also work if you paste the following text in a new tab:
First non-blank line
second line
Third line before the block to match
<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">
<div class="left">
<div class="left">
Best Regards,
guy038
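To check both cases outside Notepad++, the following Python sketch may help. It is an approximation, not the exact Notepad++ search: a capture group replaces \K and (?:\r\n|\n|\r) replaces \R, since Python's re module supports neither. The variable names (bare_block, with_preamble, earlier, revised) are invented for the example. As far as I can tell, the earlier \A.+?\R version cannot match when the block starts on the very first line, which is what the revised regex fixes.

```python
import re

STYLE_DIV = '<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">'
LEFT_DIV = '<div class="left">'

# The two test texts from the post, one item per line.
bare_block = "\n".join([STYLE_DIV, LEFT_DIV, LEFT_DIV])
with_preamble = "\n".join([
    "First non-blank line",
    "second line",
    "Third line before the block to match",
    bare_block,
])

# Group 1 plays the role of everything after \K.
body = "(" + re.escape(STYLE_DIV) + r"\s*.*?\s*" + re.escape(LEFT_DIV) + ")"

earlier = re.compile(r"\A.+?(?:\r\n|\n|\r)\s*" + body, re.DOTALL)  # requires a line break first
revised = re.compile(r"\A.*?\s*" + body, re.DOTALL)                # works from the first line too

expected = STYLE_DIV + "\n" + LEFT_DIV

print(revised.search(bare_block).group(1) == expected)     # True
print(revised.search(with_preamble).group(1) == expected)  # True
print(earlier.search(bare_block))                          # None
```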
-
@guy038 I am not sure if I am allowed to do it (as the solution was provided by you), so I am requesting you to post the last Regular Expression you provided with the sample to be edited with a new heading, “How to find the first occurrence of a tag ?” so that people can search and find it online. Thank you!
-
Hello, @dr-ramaanand and All,
You said in your previous post :
… so I am requesting you to post the last Regular Expression you provided with the sample to be edited with a new heading, “How to find the first occurrence of a tag ?” so that people can search and find it online. Thank you!
But, actually, my regex finds the first occurrence of the <div class="left"> tag, AFTER a first occurrence of the <div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;"> tag!
So, to my mind, the correct way to match the first occurrence of a specific tag in the current file is to use the generic regex:
(?s-i)\A.*?\K<TAG Name(?: .*?)?>
Just replace the generic TAG Name value with a valid HTML tag.
Note that, in the case of the comment tag, you replace the generic TAG Name in the above regex with the literal string
!--.*?--
Similarly, the correct way to match the last occurrence of a specific tag in the current file is to use the generic regex:
(?s-i)\A.*\K<TAG Name(?: .*?)?>
BR
guy038
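The two generic regexes can be tried in Python with a small helper. This is only a sketch under stated assumptions: find_tag is a hypothetical name, a capture group stands in for \K (which Python's re module does not support), and the tag name is inserted unescaped so that the comment string !--.*?-- keeps working as a regex.

```python
import re

def find_tag(text, tag_name, last=False):
    """Return the first (or last) opening tag named tag_name, or None."""
    skip = ".*" if last else ".*?"  # greedy skips to the last tag, lazy to the first
    # No re.IGNORECASE flag: the original (?s-i) asks for a case-sensitive search.
    pattern = re.compile(r"\A" + skip + "(<" + tag_name + r"(?: .*?)?>)", re.DOTALL)
    match = pattern.search(text)
    return match.group(1) if match else None

# Tiny made-up input, just for the demonstration.
html = '<p class="a">one</p>\n<!-- note -->\n<p>two</p>\n<p id="z">three</p>'

print(find_tag(html, "p"))             # <p class="a">
print(find_tag(html, "p", last=True))  # <p id="z">
print(find_tag(html, "!--.*?--"))      # <!-- note -->
```

The only difference between the two searches is the quantifier after \A: the lazy .*? stops at the earliest tag, while the greedy .* runs to the end of the text and backs up to the latest one.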
-
@guy038 said in Match everything except the text and <br> tags:
(?s-i)\A.*\K<TAG Name(?: .*?)?>
I think that should be
(?s-i)\A.*\K<TAG Name(?:.*?)?>
with no spaces anywhere in the middle
-
Hi, @dr-ramaanand and All,
In order to use a valid INPUT text to do some tests, just open the main page of our forum. Then hit the Ctrl + U shortcut to open the HTML source page of our forum and paste its contents in a new tab.
My generic regex tries to match the syntax <TAG...... up to the nearest > character, and must be valid for any kind of tag. Thus, I prefer to insert a space char to verify that the tag is a valid one. Indeed, this regex will match either tags like <head> or, for example, the opening tag of <span style="color:blue">blue</span>.
If you replace the TAG Name in the generic regex
(?s-i)\A.*?\K<TAG Name(?: .*?)?>
which matches the first tag named TAG in the current file, you get, from the examples, the regexes:
- (?s-i)\A.*?\K<head(?: .*?)?>
- (?s-i)\A.*?\K<span(?: .*?)?>
Just test them against the HTML source code of our forum.
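To illustrate what the space char buys, here is a small Python comparison of the head regex with and without the space. It is a sketch on contrived input (a <header> placed before <head>, which real pages would not have), with a capture group standing in for \K; the names with_space and without_space are made up.

```python
import re

# Contrived ordering, chosen so that a look-alike tag comes first.
html = "<header>Site banner</header>\n<head><title>Demo</title></head>"

with_space = re.compile(r"\A.*?(<head(?: .*?)?>)", re.DOTALL)
without_space = re.compile(r"\A.*?(<head(?:.*?)?>)", re.DOTALL)

print(with_space.search(html).group(1))     # <head>
print(without_space.search(html).group(1))  # <header>
```

Without the space, anything may follow the tag name, so a search for head is also satisfied by header. With the space, the name must be followed either directly by > or by a space and some attributes.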
Now, let’s suppose, for example, that you want to find the first <input ...> tag AFTER the first <img ......> tag in the HTML source code of our forum. Then, from my previous post, you would have to use the following regex:
(?s-i)\A.*?<img(?: .*?)?>.*?\K<input(?: .*?)?>
which matches, as expected, the following line :
<input autocomplete="off" type="text" class="form-control hidden" name="term" placeholder="Search"/>
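The same chained search can be sketched in Python on a small made-up page instead of the forum source. The assumptions: a capture group replaces \K, and the html sample (including the early and late input tags and the img tag) is invented for the test; only the middle input line is taken from the match shown above.

```python
import re

# Invented sample: one <input> before the <img>, two after it.
html = "\n".join([
    '<input name="early"/>',
    '<img src="a.png" alt="a">',
    '<p>text</p>',
    '<input autocomplete="off" type="text" class="form-control hidden" name="term" placeholder="Search"/>',
    '<input name="late"/>',
])

# Skip lazily to the first <img ...>, then lazily to the next <input ...>.
pattern = re.compile(r"\A.*?<img(?: .*?)?>.*?(<input(?: .*?)?>)", re.DOTALL)

result = pattern.search(html).group(1)
print(result)
```

The input tag that appears before the img tag is ignored, because the regex must first get past an img tag before it starts looking for input.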
BR
guy038
P.S.: You also replied in an old post regarding this extra space char. However, I’ll not reply because this topic is old and not exactly related to the present discussion!
-
@guy038 Okay, thank you!