Regex: Find tags that contain one string but do not contain another string
-
1. <p class="mb-40px">My nick name is Prince and <a href="https://mywebsite.com/bla.html" class="color-gege" target="_new">my real name</a> is beyond magic.</p>
2. <p class="mb-40px">I love my home s< because I stay with my lovely cat.</p>
3. <p class="mb-40px">Because of this book t< I cannot sleep well.</p>
I want to find only the lines that have the < operator inside the <p class="mb-40px"> </p> tag, except those lines that have </a>
In my example above, the output should be line 2 (which has s< ) and line 3 (which has t< )
So, I used @guy038’s generic formula:
(REGION-START)+(.)+\K(FIND REGEX)(?s:(?=.*(REGION-FINAL)))
In my case, FIND:
(<p class="mb-40px">)+(.)+\K(\w<)(?s:(?=.*(</p>)))
The problem is that my regex also finds the e</a> from the first line. And I don’t want to find the tags with </a>
Maybe @guy038 has a better GENERIC regex for this kind of problem
-
Hello, @rodica-f and All,
Just consider this example :
1. <p class="mb-40px">My nick name is Prince and <a href="https://mywebsite.com/bla.html" class="color-gege" target="_new">my real name </a> is z<beyond f< magic.</p>
2. <p class="Test">I love my home s< because I stay with my b<lovely cat.</p>
3. <p class="mb-40px">Because of this book t<I cannot a< sleep well.</p>
Within this text :
- Two tags begin with <p class="mb-40px"> and one begins with <p class="Test">
- Each <p tag contains two < operators ( one followed with a space char, the other followed with a letter )
So :
- To find, in any <p... tag, any string \w< preceded by a space char, use the regex :
SEARCH / MARK
(?-si:<p class=".+?">|(?!\A)\G)(?s-i:(?!</p>).)*?\x20\K\w<
- To find, in any <p... tag, any string \w< preceded and followed by a space char, use the regex :
SEARCH / MARK
(?-si:<p class=".+?">|(?!\A)\G)(?s-i:(?!</p>).)*?\x20\K\w<(?=\x20)
- To find, in the specific tag <p class="mb-40px">, any string \w< preceded by a space char, use the regex :
SEARCH / MARK
(?-si:<p class="mb-40px">|(?!\A)\G)(?s-i:(?!</p>).)*?\x20\K\w<
- To find, in the specific tag <p class="mb-40px">, any string \w< preceded and followed by a space char, use the regex :
SEARCH / MARK
(?-si:<p class="mb-40px">|(?!\A)\G)(?s-i:(?!</p>).)*?\x20\K\w<(?=\x20)
Best Regards,
guy038
P.S. :
BTW, no need to use a new profile. You’re certainly @robin-cruise !
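For readers who want to try the same idea outside Notepad++, here is a minimal Python sketch. It is not a port of the regexes above: Python's `re` module has no `\K` or `\G`, so the sketch swaps in a two-step approach (first isolate each begin/end region, then search inside it). The helper name `find_in_region` and its parameters are hypothetical, chosen only for illustration.

```python
import re

TEXT = '''\
1. <p class="mb-40px">My nick name is Prince and <a href="https://mywebsite.com/bla.html" class="color-gege" target="_new">my real name </a> is z<beyond f< magic.</p>
2. <p class="Test">I love my home s< because I stay with my b<lovely cat.</p>
3. <p class="mb-40px">Because of this book t<I cannot a< sleep well.</p>
'''


def find_in_region(text, begin, end, find):
    """Return every match of regex `find` located between `begin` and the next `end`.

    `begin` and `end` are literal strings, `find` is a regex.
    Hypothetical helper: not part of Notepad++ nor of the regexes above.
    """
    # Non-greedy body, DOTALL so that a region may span several lines.
    region = re.compile(re.escape(begin) + r"(.*?)" + re.escape(end), re.DOTALL)
    hits = []
    for block in region.finditer(text):
        hits.extend(m.group(0) for m in re.finditer(find, block.group(1)))
    return hits


# \w< preceded by a space char (the lookbehind replaces \x20\K)
preceded = find_in_region(TEXT, '<p class="mb-40px">', "</p>", r"(?<= )\w<")

# \w< preceded AND followed by a space char
surrounded = find_in_region(TEXT, '<p class="mb-40px">', "</p>", r"(?<= )\w<(?= )")

print(preceded)    # ['z<', 'f<', 't<', 'a<']
print(surrounded)  # ['f<', 'a<']
```

Line 2 is skipped because its tag is `<p class="Test">`, and the `</a>` tag of line 1 is never reported because `<` is not directly preceded by a word character there.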
-
-
@guy038 thanks for the solution.
1 mobile account and 1 desktop account. No difference. It’s all about where you are at that time…
-
@guy038 So, the generic formulas for this kind of problem (contains a string, but doesn’t contain another string) should be these:
(?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\x20\K(FR)
(?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\x20\KFR(?=\x20)
(?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\x20\KFR
(?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\x20\KFR(?=\x20)
BSR (begin part) = <p class="mb-40px">
ESR (end part) = </p>
FR (FIND regex) = \w<
-
Hi, @rodica-f and All,
Just a reminder :
- Don’t forget to move the caret to the very beginning of the file, before running the regex, with the Ctrl + Home shortcut
BR
guy038
-
@guy038 but what if I have the following case?
I need a regex to find all lines which contain <p class="sd-23"> but do not contain the closing tag </p>
<p class="sd-23">Somebody to love</p>
<p class="sd-23">In 1495, the Grand Prince gave this icon as a blessing to his daughter Helen.
<p class="sd-23">Holy Birth of God, have mercy on us!</p>
My regex doesn’t work at all. It should have found the second line.
FIND:
(?<p class="sd-23">).*(?!</p>)
-
Hello, @Robin-cruise and All,
Well, not so difficult ! I assume that each line must end with the </p> tag and that you’re not speaking about any multi-line block !
Then, use the following regex in order to find all the lines beginning with <p class="sd-23"> and not ending with the </p> tag :
(?-i)\h*<p class="sd-23">((?!</p>).)*$
For instance, using this four-line text :
<p class="sd-23">Somebody to love</p>
<p class="sd-23">In 1495, the Grand Prince gave this icon as a blessing to his daughter Helen.
<p class="sd-23">
<p class="sd-23">Holy Birth of God, have mercy on us!</p>
The regex would select the entire lines 2 and 3 !
Notes :
- The regex finds, first, the string <p class="sd-23">, with this exact case, after possible leading blank characters
- Then, it grabs all the remaining text, one character at a time ( ((?!</p>).)* ), till the end of the current line ( $ )…
- …But ONLY IF it does not meet the </p> tag at any position after <p class="sd-23">, till the end of the current line
Best Regards,
guy038
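As a rough cross-check outside Notepad++, the same test can be sketched in Python. This is an adaptation under assumptions, not the regex above verbatim: Python's `re` module does not support `\h`, so `[ \t\xa0]` stands in for it, and `fullmatch` on each line plays the role of the `$` anchor. The four-line sample is the one from this post.

```python
import re

TEXT = '''\
<p class="sd-23">Somebody to love</p>
<p class="sd-23">In 1495, the Grand Prince gave this icon as a blessing to his daughter Helen.
<p class="sd-23">
<p class="sd-23">Holy Birth of God, have mercy on us!</p>
'''

# Optional leading blanks, the opening tag, then any run of characters
# in which no position starts a closing </p> tag.
# Python's re is case-sensitive by default, so no (?-i) is needed.
UNCLOSED = re.compile(r'[ \t\xa0]*<p class="sd-23">(?:(?!</p>).)*')

# Keep the 1-based numbers of the lines that match from start to end.
unclosed = [
    number
    for number, line in enumerate(TEXT.splitlines(), start=1)
    if UNCLOSED.fullmatch(line)
]

print(unclosed)  # [2, 3]
```

On lines 1 and 4 the tempered dot stops right before `</p>`, so the match cannot reach the end of the line and the line is rejected.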
-
-
@guy038 thanks ! But I don’t understand what this part does:
(?-i)\h*
-
Hi, @robin-cruise,
- The (?-i) part means that, from this point, the search will be sensitive to case. So, it will match the string <p class="sd-23">, but not, for instance, the string <P class="sd-23"> nor the string <p CLASS="sd-23"> !
- Then, the \h character class represents any horizontal blank character ( so, either the \t [ Tabulation ] char, the \x20 [ Space ] char or the \xa0 [ No-Break Space ] char )
- Thus, the \h* syntax represents any range of horizontal blank chars, from 0 to n
BR
guy038
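To see both effects side by side, here is a small Python sketch. It rests on two assumptions: `[ \t\xa0]` is used as an approximation of `\h` (Python's `re` module has no `\h` escape), and the default case-sensitive mode of `re` stands in for `(?-i)`, with `re.IGNORECASE` showing what an insensitive search would accept instead.

```python
import re

# Assumed stand-in for \h: tab, space or no-break space.
H = r"[ \t\xa0]"
TAG = H + r'*<p class="sd-23">'

sensitive = re.compile(TAG)                   # exact case only, like (?-i)
insensitive = re.compile(TAG, re.IGNORECASE)  # for comparison only

samples = [
    '<p class="sd-23">',           # zero leading blanks
    '\t \xa0<p class="sd-23">',    # tab + space + no-break space
    '<P class="sd-23">',           # upper-case tag name
    '<p CLASS="sd-23">',           # upper-case attribute name
]

# One (case-sensitive, case-insensitive) pair of booleans per sample.
results = [
    (bool(sensitive.fullmatch(s)), bool(insensitive.fullmatch(s)))
    for s in samples
]

for sample, result in zip(samples, results):
    print(repr(sample), result)
```

The first two samples match in both modes, while the last two match only when case is ignored.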
-