Match everything except the text and <br> tags
-
@PeterJones Thank you very much. This Regular expression worked:
(?s)<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">\s*<p[^<>]*+>\s*<span[^<>]*+>\s*<span[^<>]*+>\s*<span[^<>]*+>.+?</span>\s*</span>\s*</span>\s*</p>\s*</div>\s*</div>\s*<div class="container">\s*<div class="left">
-
Hi, @dr-ramaanand, @peterjones and All,
Ah, I was a bit too slow and Peter just beat me to it! Note that I used the same process as Peter to determine where the error occurs.
@dr-ramaanand, you just made a small typo in the regex that you provided!
The correct regex, to match your text, is not that one, with a /s* syntax:
(?s)<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">\s*<p[^<>]*+>\s*<span[^<>]*+>\s*<span[^<>]*+>\s*<span[^<>]*+>.+?</span>\s*</span>\s*</span>\s*</p>\s*</div>/s*</div>\s*<div class="container">\s*<div class="left">
but this one, with a correct \s* syntax:
(?s)<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">\s*<p[^<>]*+>\s*<span[^<>]*+>\s*<span[^<>]*+>\s*<span[^<>]*+>.+?</span>\s*</span>\s*</span>\s*</p>\s*</div>\s*</div>\s*<div class="container">\s*<div class="left">
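For readers who want to see the effect of that typo outside Notepad++, here is a minimal Python sketch. It uses Python's re module rather than the Boost engine inside Notepad++, and the shortened patterns and the sample string are made up for illustration. In both engines, /s* means a literal slash followed by zero or more "s" characters, whereas \s* means zero or more whitespace characters.

```python
import re

# Hypothetical, shortened sample: two closing tags separated by a line break.
sample = "</div>\n</div>"

typo_pattern = r"</div>/s*</div>"   # "/s*": a literal "/" then zero or more "s"
fixed_pattern = r"</div>\s*</div>"  # "\s*": zero or more whitespace characters

typo_match = re.search(typo_pattern, sample)
fixed_match = re.search(fixed_pattern, sample)

print(typo_match)                   # None: no "/" follows the first </div>
print(repr(fixed_match.group(0)))   # the whole sample, line break included
```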
Remarks:
- It might be preferable to add a \s* at the very end of your regex.
- You could also simplify this regex significantly by using the version below:
SEARCH
(?s)<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">\s*.+?\s*<div class="left">\s*
BR
guy038
-
-
@guy038 I have more than one <div class="left">, so how do I make it stop searching after finding the first <div class="left">?
-
Hello, @dr-ramaanand and All,
To solve this case, I would use the following regex S/R :
SEARCH
(?s)\A.+?\R\K\s*<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">\s*.+?\s*<div class="left">
REPLACE
Whatever you want to !
Note that, this time, I did not add the \s* part at the end of the search regex. Also notice the two lazy quantifiers (.+?), right after \A and right before \s*<div class="left">, which ensure that only the first \s*<div style=.....\s*<div class="left"> section is selected.
BR
guy038
-
@guy038 I used this as a sample:
<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">
<div class="left">
<p class=MsoNormal><b><span style='font-size:13.5pt;line-height:115%; font-family:"Verdana","sans-serif";color:red'>SYNONYMS </span></b> </p>
<div class="left">
Your regular expression does not stop searching at the first occurrence of <div class="left">
-
@guy038 This RegEx helped stop searching as soon as it found a <p........>:
(?s)\A.+?\R\K\s*<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">\s*.+?\s*<div class="left">(?=\s*+<p[^<>]*+>)
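The effect of the look-ahead can be sketched in Python. This is only an approximation of the regex above: Python's re module does not support \K or \R, so the \A.+?\R\K prefix is left out, the possessive quantifiers are written as plain greedy ones, and the style attribute is shortened to a hypothetical "x". Both sample strings are made up for the example.

```python
import re

# <div class="left"> only counts when a <p ...> tag follows it.
guarded = re.compile(
    r'(?s)<div style="x">\s*.+?\s*<div class="left">(?=\s*<p[^<>]*>)'
)

with_paragraph = (
    '<div style="x">\n'
    '<div class="left">\n'
    '<p class=MsoNormal>SYNONYMS</p>\n'
    '<div class="left">\n'
)
without_paragraph = (
    '<div style="x">\n'
    '<div class="left">\n'
    '<div class="left">\n'
)

hit = guarded.search(with_paragraph)
miss = guarded.search(without_paragraph)

print(repr(hit.group(0)))  # '<div style="x">\n<div class="left">'
print(miss)                # None: no <p ...> tag follows either <div class="left">
```

The look-ahead is a zero-width check, so the <p ...> tag itself is never part of the match.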
-
Hi, @dr-ramaanand and All,
Ah, of course, if you add a <div class="left"> line right after the first <div style="..... line, it will not work!
So, given this INPUT text, pasted in a new tab:
<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">
<div class="left">
<p class=MsoNormal><b><span style='font-size:13.5pt;line-height:115%; font-family:"Verdana","sans-serif";color:red'>SYNONYMS </span></b> </p>
<div class="left">
Simply replace the previous search regex with this new version:
(?s)\A.+?\R\s*\K<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">\s*.*?\s*<div class="left">
Note the difference: between #EBF4FB;">\s* and \s*<div class="left">, I changed the .+? part to .*?
I also slightly changed the position of the \K feature.
As expected, this new regex will match the two consecutive lines:
<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">
<div class="left">
BR
guy038
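Why the change from .+? to .*? matters can be checked with a small Python sketch. It is an approximation under stated assumptions: Python's re module is used instead of Notepad++'s engine, the \A...\K prefix is dropped because re has no \K or \R, and the style attribute is shortened to a hypothetical "x". Since the first \s* already swallows the line break, .+? must still consume at least one character, which pushes the match past the first <div class="left">, while .*? is allowed to match nothing.

```python
import re

# Hypothetical sample modelled on the one in this thread.
sample = (
    '<div style="x">\n'
    '<div class="left">\n'
    '<p class=MsoNormal>SYNONYMS</p>\n'
    '<div class="left">\n'
)

one_or_more = re.compile(r'(?s)<div style="x">\s*.+?\s*<div class="left">')
zero_or_more = re.compile(r'(?s)<div style="x">\s*.*?\s*<div class="left">')

long_match = one_or_more.search(sample).group(0)
short_match = zero_or_more.search(sample).group(0)

print(repr(short_match))                  # '<div style="x">\n<div class="left">'
print(long_match == sample.rstrip("\n"))  # True: runs to the second <div class="left">
```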
-
@guy038 This RegEx:
(?s)\A.+?\R\K\s*<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">.+(?=\s*+<div class="left">)
would have stopped searching just before the second occurrence of <div class="left"> if the sample to be searched was like this:
<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">
<div class="left">
<div class="left">
-
Yes, your regex does match the same amount of text as my version, but my regex seems simpler and more logical!
BR
guy038
-
@guy038 Okay, thank you very much!
-
@guy038 your last RegEx finds the first occurrence of <div class="left"> even if there is some other text above it. Lovely!
-
Hi, @dr-ramaanand and All,
Again, I did not check all the possibilities before posting. Sorry for the NOISE !
So, the right regex to use should be :
(?s)\A.*?\s*\K<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">\s*.*?\s*<div class="left">
This time, it will work if you paste this text in a new tab:
<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">
<div class="left">
<div class="left">
But it will also work if you paste the following text in a new tab:
First non-blank line
second line
Third line before the block to match
<div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;">
<div class="left">
<div class="left">
Best Regards,
guy038
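As I understand it, the \A.*?\K prefix makes every match attempt start at the very beginning of the file, so only the first block can ever be matched, even with Replace All. Python's re module has no \K, so the sketch below gets a comparable "first block only" effect with count=1 instead. The shortened style attribute "x", the sample text and the REPLACED string are all made up for illustration.

```python
import re

# Hypothetical sample: some leading lines, then two blocks.
sample = (
    "First non-blank line\n"
    "second line\n"
    '<div style="x">\n'
    '<div class="left">\n'
    '<div class="left">\n'
    '<div style="x">\n'
    '<div class="left">\n'
)

block = re.compile(r'(?s)<div style="x">\s*.*?\s*<div class="left">')

# count=1 replaces the leftmost match only and leaves later blocks alone.
result = block.sub("REPLACED", sample, count=1)

print(result)
```

The leading lines and the second block are left untouched; only the first <div style="x"> line and the <div class="left"> line right after it are replaced.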
-
@guy038 I am not sure if I am allowed to do it (as the solution was provided by you), so I am requesting you to post the last Regular Expression you provided with the sample to be edited with a new heading, “How to find the first occurrence of a tag ?” so that people can search and find it online. Thank you!
-
Hello, @dr-ramaanand and All,
You said in your previous post :
… so I am requesting you to post the last Regular Expression you provided with the sample to be edited with a new heading, “How to find the first occurrence of a tag ?” so that people can search and find it online. Thank you!
But, actually, my regex finds the first occurrence of the <div class="left"> tag AFTER a first occurrence of the <div style="margin-bottom:-15px;width: 100%;background-color:#EBF4FB;"> tag!
So, to my mind, the correct way to match the first occurrence of a specific tag, in the current file, is to use the generic regex:
(?s-i)\A.*?\K<TAG Name(?: .*?)?>
Just replace the generic TAG Name value with a valid HTML tag.
Note that, in the case of the comment tag, you replace the generic TAG Name in the above regex with the literal string !--.*?--
Similarly, the correct way to match the last occurrence of a specific tag, in the current file, is to use the generic regex:
(?s-i)\A.*\K<TAG Name(?: .*?)?>
BR
guy038
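The first/last distinction comes down to a lazy .*? versus a greedy .* after \A. Here is a minimal Python sketch of both generic regexes. It assumes Python's re module, which has no \K, so a capturing group stands in for it; matching is case-sensitive by default there, so only (?s) is kept. The helper names first_tag and last_tag and the sample HTML are hypothetical.

```python
import re

# Hypothetical sample; <spanner> is there to show what the space char guards against.
html = (
    '<p><span style="color:blue">blue</span> and '
    '<span>plain</span> <spanner>x</spanner></p>'
)

def first_tag(name, text):
    # Lazy .*? stops at the first place where the tag can match.
    m = re.search(r'(?s)\A.*?(<' + re.escape(name) + r'(?: .*?)?>)', text)
    return m.group(1) if m else None

def last_tag(name, text):
    # Greedy .* runs to the end, then backtracks to the last possible tag.
    m = re.search(r'(?s)\A.*(<' + re.escape(name) + r'(?: .*?)?>)', text)
    return m.group(1) if m else None

print(first_tag("span", html))  # <span style="color:blue">
print(last_tag("span", html))   # <span>
```

Because the tag name must be followed by either a space or the closing >, <spanner> is not mistaken for a <span> tag.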
-
@guy038 said in Match everything except the text and <br> tags:
(?s-i)\A.*\K<TAG Name(?: .*?)?>
I think that that should be
(?s-i)\A.*\K<TAG Name(?:.*?)?>
with no spaces anywhere in the middle
-
Hi, @dr-ramaanand and All,
In order to have a valid INPUT text for some tests, just open the main page of our forum. Then hit the Ctrl + U shortcut to open the HTML source page of our forum, and paste its contents in a new tab.
My generic regex tries to match the syntax <TAG...... up to the nearest > character, and must be valid for any kind of tag. Thus, I prefer to insert a space char to verify that the tag is a valid one. Indeed, this regex will match either a bare tag like <head> or a tag with attributes, for example the opening tag of <span style="color:blue">blue</span>
If you replace the TAG Name in the generic regex (?s-i)\A.*?\K<TAG Name(?: .*?)?>, which matches the first tag named TAG in the current file, you get, from the examples, the regexes:
- (?s-i)\A.*?\K<head(?: .*?)?>
- (?s-i)\A.*?\K<span(?: .*?)?>
Just test them against the HTML source code of our forum.
Now, let’s suppose, for example, that you want to find the first <input ...> tag AFTER the first <img ......> tag in the HTML source code of our forum.
Then, from my previous post, you would have to use the following regex:
(?s-i)\A.*?<img(?: .*?)?>.*?\K<input(?: .*?)?>
which matches, as expected, the following line :
<input autocomplete="off" type="text" class="form-control hidden" name="term" placeholder="Search"/>
BR
guy038
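The "first tag B after the first tag A" idea can be tried on a small made-up page instead of the forum source. The sketch below assumes Python's re module, where a capturing group replaces \K and the -i flag is unnecessary because matching is case-sensitive by default. The HTML snippet is hypothetical and much shorter than the real forum page.

```python
import re

# Hypothetical page: one <input> before the <img>, two after it.
html = (
    '<input type="hidden" name="early"/>\n'
    '<img src="logo.png" alt="logo"/>\n'
    '<input type="text" name="term" placeholder="Search"/>\n'
    '<input type="submit"/>\n'
)

# Skip lazily to the first <img ...>, then lazily to the next <input ...>.
pattern = re.compile(r'(?s)\A.*?<img(?: .*?)?>.*?(<input(?: .*?)?>)')

match = pattern.search(html)
print(match.group(1))  # <input type="text" name="term" placeholder="Search"/>
```

The <input> that comes before the <img> is skipped, because the regex has to pass an <img ...> tag before it may start looking for <input ...>.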
P.S.: You also replied in an old post regarding this extra space char. However, I’ll not reply there because that topic is old and not exactly related to the present discussion!
-
-
@guy038 Okay, thank you!