Regex: Find tags that contain one string, but do not contain another string
-
1. <p class="mb-40px">My nick name is Prince and <a href="https://mywebsite.com/bla.html" class="color-gege" target="_new">my real name</a> is beyond magic.</p>
2. <p class="mb-40px">I love my home s< because I stay with my lovely cat.</p>
3. <p class="mb-40px">Because of this book t< I cannot sleep well.</p>
I want to find only the lines that have the operator < inside the html tag <p class="mb-40px"> </p>, except those lines that have </a>
In my example above, the output should be line 2 (which has s< ) and line 3 (which has t< )
So, I use @guy038's generic formula: (REGION-START)+(.)+\K(FIND REGEX)(?s:(?=.*(REGION-FINAL)))
In my case FIND: (<p class="mb-40px">)+(.)+\K(\w<)(?s:(?=.*(</p>)))
The problem is that my regex also finds the e< of name</a> in the first line, and I don't want to find anything in the tags that contain </a>
Maybe @guy038 has a better GENERIC formula for this kind of problem
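To make the false positive visible outside Notepad++, here is a minimal Python sketch of the attempt above. It is only an approximation: Python's `re` module has no `\K`, so a capture group (group 3) stands in for "the text after `\K`", and each sample line is searched separately. The names `LINES`, `ATTEMPT` and `attempted_hits` are made up for this illustration.

```python
import re

# The three sample paragraphs from the post, without their line numbers.
LINES = [
    '<p class="mb-40px">My nick name is Prince and <a href="https://mywebsite.com/bla.html" class="color-gege" target="_new">my real name</a> is beyond magic.</p>',
    '<p class="mb-40px">I love my home s< because I stay with my lovely cat.</p>',
    '<p class="mb-40px">Because of this book t< I cannot sleep well.</p>',
]

# Same shape as the attempt in the post; group 3 replaces the \K part.
ATTEMPT = re.compile(r'(<p class="mb-40px">)+(.)+(\w<)(?s:(?=.*(</p>)))')


def attempted_hits(lines):
    """Return what the attempt reports on each line (None when no match)."""
    hits = []
    for line in lines:
        m = ATTEMPT.search(line)
        hits.append(m.group(3) if m else None)
    return hits


print(attempted_hits(LINES))  # ['e<', 's<', 't<']
```

The first entry, `e<`, comes from `name</a>`: nothing in the pattern says that the `<` must not belong to a closing tag, which is exactly the unwanted hit.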
-
Hello, @rodica-f and All,
Just consider this example :
1. <p class="mb-40px">My nick name is Prince and <a href="https://mywebsite.com/bla.html" class="color-gege" target="_new">my real name </a> is z<beyond f< magic.</p>
2. <p class="Test">I love my home s< because I stay with my b<lovely cat.</p>
3. <p class="mb-40px">Because of this book t<I cannot a< sleep well.</p>
Within this text :
- Two tags begin with <p class="mb-40px"> and one begins with <p class="Test">
- Each <p tag contains two < operators ( one followed with a space char, the other followed with a letter )
So :
- To find any <p... tag containing any string \w<, preceded with a space char, use the regex :
SEARCH / MARK (?-si:<p class=".+?">|(?!\A)\G)(?s-i:(?!</p>).)*?\x20\K\w<
- To find any <p... tag containing any string \w<, preceded and followed with a space char, use the regex :
SEARCH / MARK (?-si:<p class=".+?">|(?!\A)\G)(?s-i:(?!</p>).)*?\x20\K\w<(?=\x20)
- To find the specific tag <p class="mb-40px"> containing any string \w<, preceded with a space char, use the regex :
SEARCH / MARK (?-si:<p class="mb-40px">|(?!\A)\G)(?s-i:(?!</p>).)*?\x20\K\w<
- To find the specific tag <p class="mb-40px"> containing any string \w<, preceded and followed with a space char, use the regex :
SEARCH / MARK (?-si:<p class="mb-40px">|(?!\A)\G)(?s-i:(?!</p>).)*?\x20\K\w<(?=\x20)
Best Regards,
guy038
P.S. :
BTW, no need to use a new profile. You’re certainly @robin-cruise !
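For readers who want to check the expected hits of the regexes above outside Notepad++, here is a rough Python sketch. It is not the same regex: Python's `re` module supports neither `\G` nor `\K`, so the sketch swaps in a two-stage search (first capture each begin...end region, then search inside it) and a lookbehind `(?<=\x20)` in place of `\x20\K`. The helper name `find_in_tag` and its parameters are hypothetical.

```python
import re

# The sample text from this post, without the line numbers.
TEXT = "\n".join([
    '<p class="mb-40px">My nick name is Prince and <a href="https://mywebsite.com/bla.html" class="color-gege" target="_new">my real name </a> is z<beyond f< magic.</p>',
    '<p class="Test">I love my home s< because I stay with my b<lovely cat.</p>',
    '<p class="mb-40px">Because of this book t<I cannot a< sleep well.</p>',
])


def find_in_tag(text, begin, end, find):
    """List (line number, matched text) for `find` between `begin` and `end`.

    `begin`, `end` and `find` are regex fragments; `begin` is assumed to
    contain no capturing group of its own.
    """
    # Stage 1: the opening tag, then every char up to (not including) `end`.
    region = re.compile(f"(?:{begin})(?s:((?:(?!{end}).)*))")
    hits = []
    for reg in region.finditer(text):
        # Stage 2: search only inside the captured region.
        for m in re.finditer(find, reg.group(1)):
            offset = reg.start(1) + m.start()
            hits.append((text.count("\n", 0, offset) + 1, m.group()))
    return hits


P_TAG = '<p class="mb-40px">'
# \w< preceded with a space char, in the specific tag only
print(find_in_tag(TEXT, P_TAG, "</p>", r"(?<=\x20)\w<"))
# [(1, 'z<'), (1, 'f<'), (3, 't<'), (3, 'a<')]
# \w< preceded and followed with a space char
print(find_in_tag(TEXT, P_TAG, "</p>", r"(?<=\x20)\w<(?=\x20)"))
# [(1, 'f<'), (3, 'a<')]
```

Passing `'<p class=".+?">'` as `begin` corresponds to the "any <p... tag" variants and also reports the two hits of line 2.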
-
-
@guy038 thanks for the solution.
1 mobile account and 1 desktop account. No difference. It’s all about where you are at that time…
-
@guy038 So, the generic formulas for this kind of problem (contains a string, but doesn't contain another string) should be these:
(?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\x20\K(FR)
(?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\x20\KFR(?=\x20)
(?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\x20\KFR
(?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\x20\KFR(?=\x20)
BSR (begin part) = <p class="mb-40px">
ESR (end part) = </p>
FR (FIND Regex) = \w<
-
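As a side note on the BSR / ESR / FR formula from the previous post: a small Python helper can assemble the search string from its three parts, which makes it easy to check that the generic form gives back the specific regexes posted earlier. This is only a sketch; `build_generic` and `space_after` are made-up names. The helper builds text to paste into the Notepad++ search field and does not run the regex, because Python's `re` module supports neither `\G` nor `\K`.

```python
def build_generic(bsr, esr, fr, space_after=False):
    """Assemble the generic search string from BSR, ESR and FR.

    bsr: begin-of-region regex, esr: end-of-region regex, fr: find regex.
    With space_after=True, the match must also be followed by a space.
    """
    pattern = rf"(?-si:{bsr}|(?!\A)\G)(?s-i:(?!{esr}).)*?\x20\K{fr}"
    if space_after:
        pattern += r"(?=\x20)"
    return pattern


print(build_generic('<p class="mb-40px">', "</p>", r"\w<"))
# (?-si:<p class="mb-40px">|(?!\A)\G)(?s-i:(?!</p>).)*?\x20\K\w<
```

If FR contains a top-level alternation such as `a|b`, it would need its own group, for instance `(?:a|b)`, before being passed in.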
Hi, @rodica-f and All,
Just a reminder :
- Don’t forget to move the caret to the very beginning of the file, before running the regex, with the Ctrl + Home shortcut
BR
guy038
-
@guy038 but what if I have the following case?
I need a regex to find all lines which contain <p class="sd-23"> but do not contain the closing tag </p>
<p class="sd-23">Somebody to love</p>
<p class="sd-23">In 1495, the Grand Prince gave this icon as a blessing to his daughter Helen.
<p class="sd-23">Holy Birth of God, have mercy on us!</p>
My regex doesn’t work at all. It should have found the second line.
FIND: (?<p class="sd-23">).*(?!</p>)
-
Hello, @Robin-cruise and All,
Well, not so difficult ! I assume that each line must end with the </p> tag and that you’re not speaking about any multi-line block !
Then, use the following regex in order to find all the lines beginning with <p class="sd-23"> and not ending with the </p> tag :
(?-i)\h*<p class="sd-23">((?!</p>).)*$
For instance, using this four-line text :
<p class="sd-23">Somebody to love</p>
<p class="sd-23">In 1495, the Grand Prince gave this icon as a blessing to his daughter Helen.
<p class="sd-23">
<p class="sd-23">Holy Birth of God, have mercy on us!</p>
The regex would select the entire lines 2 and 3 !
Notes :
- The regex finds, first, the string <p class="sd-23">, with this exact case, after possible leading blank characters
- Then, it grasps all remaining text ( .* ) till the end of the current line ( $ )…
- …But ONLY IF it does not meet the </p> tag at any position after <p class="sd-23">, till the end of the current line
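The three notes above can be tried out with a short Python sketch. Two substitutions are assumptions of this sketch, not part of the Notepad++ regex: Python's `re` module has no `\h`, so the class `[ \t\xa0]` stands in for it, and the leading `(?-i)` is dropped because `re` is already case-sensitive by default. `re.MULTILINE` makes `$` match at the end of each line. The names `UNCLOSED` and `unclosed_lines` are hypothetical.

```python
import re

# The four-line sample text from this post.
TEXT = "\n".join([
    '<p class="sd-23">Somebody to love</p>',
    '<p class="sd-23">In 1495, the Grand Prince gave this icon as a blessing to his daughter Helen.',
    '<p class="sd-23">',
    '<p class="sd-23">Holy Birth of God, have mercy on us!</p>',
])

# Opening tag, then any chars that do not start a </p>, up to end of line.
UNCLOSED = re.compile(r'[ \t\xa0]*<p class="sd-23">((?!</p>).)*$', re.MULTILINE)


def unclosed_lines(text):
    """Return the 1-based numbers of the lines matched by UNCLOSED."""
    return [text.count("\n", 0, m.start()) + 1 for m in UNCLOSED.finditer(text)]


print(unclosed_lines(TEXT))  # [2, 3]
```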
Best Regards,
guy038
-
-
@guy038 thanks ! But I don’t understand what this part is doing: (?-i)\h*
-
Hi, @robin-cruise,
- The (?-i) part means that, from this point, the search will be sensitive to case. So, it will match the string <p class="sd-23">, but not, for instance, the string <P class="sd-23"> nor the string <p CLASS="sd-23"> !
- Then, the \h character class represents any horizontal blank character ( so, either the \t [ Tabulation ] char, the \x20 [ Space ] char or the \xa0 [ No-Break Space ] char )
- Thus, the \h* syntax represents any range of horizontal blank chars, from 0 to n
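Both points can be checked with a few lines of Python. This is only an approximation: Python's `re` module has no `\h`, so the sketch assumes the class `[ \t\xa0]` (the three characters listed above) as a stand-in, and it relies on `re` being case-sensitive by default instead of writing `(?-i)`. The names `TAG`, `PATTERN` and `leading_blanks` are made up for the example.

```python
import re

TAG = '<p class="sd-23">'
# [ \t\xa0]* plays the role of \h* : zero or more horizontal blank chars.
PATTERN = re.compile(r"[ \t\xa0]*" + re.escape(TAG))


def leading_blanks(line):
    """Count the blank chars matched before the tag, or None if no match."""
    m = PATTERN.match(line)
    return None if m is None else m.end() - len(TAG)


print(leading_blanks('<p class="sd-23">text'))          # 0
print(leading_blanks(' \t\xa0<p class="sd-23">text'))   # 3
print(leading_blanks('<P class="sd-23">text'))          # None
print(leading_blanks('<p CLASS="sd-23">text'))          # None
```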
BR
guy038
-