Regex: Find the tags that contain one string but do not contain another string

rodica F

1. <p class="mb-40px">My nick name is Prince and <a href="https://mywebsite.com/bla.html" class="color-gege" target="_new">my real name</a> is beyond magic.</p>
    
2. <p class="mb-40px">I love my home s< because I stay with my lovely cat.</p>

3. <p class="mb-40px">Because of this book t< I cannot sleep well.</p>

I want to find only the lines that have the < operator inside the HTML tag, except those lines that have </a>

In my example above, the output should be line 2 (which has s< ) and line 3 (which has t< )

So, I use @guy038's generic formula: (REGION-START)+(.)+\K(FIND REGEX)(?s:(?=.*(REGION-FINAL)))

In my case FIND: (<p class="mb-40px">)+(.)+\K(\w<)(?s:(?=.*(</p>)))

The problem is that my regex also finds the e</a> on the first line, and I don’t want to match the tags with </a>

Maybe @guy038 has a better GENERIC formula for this kind of problem

guy038

Hello, @rodica-f and All,

Just consider this example :


1. <p class="mb-40px">My nick name is Prince and <a href="https://mywebsite.com/bla.html" class="color-gege" target="_new">my real name </a> is z<beyond f< magic.</p>
    
2. <p class="Test">I love my home s< because I stay with my b<lovely cat.</p>

3. <p class="mb-40px">Because of this book t<I cannot a< sleep well.</p>

Within this text :

Two tags begin with <p class="mb-40px"> and one begins with <p class="Test">
Each <p... tag contains two < operators ( one followed with a space char, the other followed with a letter )

So :

To find any <p... tag containing any string \w<, preceded with a space char, use the regex :

SEARCH / MARK (?-si:<p.*?>|(?!\A)\G)(?s-i:(?!</p>).)*?\x20\K\w<

To find any <p... tag containing any string \w<, preceded and followed with a space char, use the regex :

SEARCH / MARK (?-si:<p.*?>|(?!\A)\G)(?s-i:(?!</p>).)*?\x20\K\w<(?=\x20)

To find the specific tag <p class="mb-40px"> containing any string \w<, preceded with a space char, use the regex :

SEARCH / MARK (?-si:<p class="mb-40px">|(?!\A)\G)(?s-i:(?!</p>).)*?\x20\K\w<

To find the specific tag <p class="mb-40px"> containing any string \w<, preceded and followed with a space char, use the regex :

SEARCH / MARK (?-si:<p class="mb-40px">|(?!\A)\G)(?s-i:(?!</p>).)*?\x20\K\w<(?=\x20)
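
For readers who want to check the expected hits outside Notepad++, here is a minimal Python sketch of the same four searches. It does not reuse the \G and \K constructs, which Python's standard re module lacks; it swaps in a two-step approach instead: first isolate each region, then look for the target inside it. The helper name find_in_regions and the patterns ANY_P, MB40_P and END_P are assumptions made for this illustration only.

```python
import re

# Sample text from the post above (list numbers dropped).
TEXT = (
    '<p class="mb-40px">My nick name is Prince and <a href="https://mywebsite.com/bla.html" '
    'class="color-gege" target="_new">my real name </a> is z<beyond f< magic.</p>\n'
    '<p class="Test">I love my home s< because I stay with my b<lovely cat.</p>\n'
    '<p class="mb-40px">Because of this book t<I cannot a< sleep well.</p>\n'
)

ANY_P = r'<p\b[^>]*>'                       # assumed pattern for "any <p... tag"
MB40_P = re.escape('<p class="mb-40px">')   # one specific opening tag
END_P = r'</p>'                             # assumed end of the region

def find_in_regions(text, begin, end, find):
    """Hypothetical helper: list (offset, match) for `find` inside begin...end regions."""
    region = re.compile(f'(?:{begin})(?P<body>.*?)(?={end})', re.DOTALL)
    target = re.compile(find)
    hits = []
    for reg in region.finditer(text):
        # Search only the text between the opening tag and the closing tag.
        for m in target.finditer(reg.group('body')):
            hits.append((reg.start('body') + m.start(), m.group()))
    return hits

AFTER_SPACE = r'(?<=\x20)\w<'               # \w< preceded by a space
BETWEEN_SPACES = r'(?<=\x20)\w<(?=\x20)'    # ... and also followed by a space

for begin in (ANY_P, MB40_P):
    for find in (AFTER_SPACE, BETWEEN_SPACES):
        print([m for _, m in find_in_regions(TEXT, begin, END_P, find)])
```

The lookbehind (?<=\x20) plays the role of \x20\K here: the space must be present but is not part of the reported match. Note that the e before </a> is never reported, because it is not directly preceded by a space.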

Best Regards,

guy038

P.S. :

BTW, no need to use a new profile. You’re certainly @robin-cruise !

rodica F

@guy038 thanks for the solution.

1 mobile account and 1 desktop account. No difference. It’s all about where you are at that time…

rodica F

@guy038 So, the generic formulas for this kind of problem (a region that contains one string, but doesn’t contain another string) should be these:

(?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\x20\K(FR)

(?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\x20\KFR(?=\x20)

(?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\x20\KFR

(?-si:BSR|(?!\A)\G)(?s-i:(?!ESR).)*?\x20\KFR(?=\x20)

BSR (begin part) = <p class="mb-40px">
ESR (end part) = </p>
FR - (FIND Regex) = \w<

guy038

Hi, @rodica-f and All,

Just a reminder :

Don’t forget to move the caret to the very beginning of the file, with the Ctrl + Home shortcut, before running the regex

BR

guy038

Robin Cruise

@guy038 but what if I have the following case?

I need a regex to find all lines which contain <p class="sd-23"> but do not contain the closing tag </p>

<p class="sd-23">Somebody to love</p>

<p class="sd-23">In 1495, the Grand Prince gave this icon as a blessing to his daughter Helen.

<p class="sd-23">Holy Birth of God, have mercy on us!</p>

My regex doesn’t work at all. It should have found the second line.

FIND: (?<p class="sd-23">).*(?!</p>)

guy038

Hello, @Robin-cruise and All,

Well, not so difficult ! I assume that each line must end with the </p> tag and that you’re not speaking about any multi-line block !

Then, use the following regex in order to find all the lines beginning with <p class="sd-23"> and not ending with the </p> tag :

(?-i)\h*<p class="sd-23">((?!</p>).)*$

For instance, using this four-line text :

<p class="sd-23">Somebody to love</p>

<p class="sd-23">In 1495, the Grand Prince gave this icon as a blessing to his daughter Helen.

    <p class="sd-23">

    <p class="sd-23">Holy Birth of God, have mercy on us!</p>

The regex would select the entire lines 2 and 3 !

Notes :

The regex finds, first, the string <p class="sd-23">, with this exact case, after possible leading blank characters
Then, it grasps all remaining text ( .* ) till the end of the current line ( $ )…
…But ONLY IF it does not meet the </p> tag at any position after <p class="sd-23">, till the end of current line
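
The same line-by-line check can be sketched in Python, as a rough equivalent rather than the Notepad++ search itself. Two assumptions are worth stating: Python's re module has no \h, so the class [ \t\xa0] stands in for it, and re is case-sensitive by default, so no (?-i) is needed. The names LINES and UNCLOSED are made up for this illustration.

```python
import re

# The four-line sample text from the post above.
LINES = [
    '<p class="sd-23">Somebody to love</p>',
    '<p class="sd-23">In 1495, the Grand Prince gave this icon as a blessing to his daughter Helen.',
    '    <p class="sd-23">',
    '    <p class="sd-23">Holy Birth of God, have mercy on us!</p>',
]

# Leading blanks, the opening tag, then any characters up to the end of
# the line, where no character may be the start of a closing </p> tag.
UNCLOSED = re.compile(r'[ \t\xa0]*<p class="sd-23">((?!</p>).)*$')

# Collect the 1-based numbers of the lines that the pattern accepts.
unclosed = [n for n, line in enumerate(LINES, start=1) if UNCLOSED.match(line)]
print(unclosed)
```

The ((?!</p>).)* part advances one character at a time and checks the negative look-ahead before each step, which is why a </p> anywhere after the opening tag makes the whole line fail.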

Best Regards,

guy038

Robin Cruise

@guy038 thanks ! But I don’t understand what this part does:

(?-i)\h*

guy038

Hi, @robin-cruise,

The (?-i) part means that, from that point, the search will be sensitive to case. So, it will match the string <p class="sd-23">, but not, for instance, the string <P class="sd-23"> nor the string <p CLASS="sd-23"> !
Then, the \h character class represents any horizontal blank character ( so, either the \t [ Tabulation ] char, the \x20 [ Space ] char or the \xa0 [ No-Break Space ] char )
Thus, the \h* syntax represents any run of horizontal blank chars, from 0 to n
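
A small Python sketch may help to see both points at once. It is only an approximation: Python's re module has no \h, so the class [ \t\xa0] (the three characters listed above) is used as a stand-in, and case sensitivity is switched with the re.IGNORECASE flag rather than with (?-i). The names H, TAG, sensitive and insensitive are chosen for this example.

```python
import re

H = r'[ \t\xa0]'           # stand-in for \h: tab, space, no-break space
TAG = r'<p class="sd-23">'

sensitive = re.compile(H + '*' + TAG)                   # case-sensitive, like (?-i)
insensitive = re.compile(H + '*' + TAG, re.IGNORECASE)  # case-insensitive, like (?i)

blanks_then_tag = bool(sensitive.fullmatch('\t \xa0<p class="sd-23">'))  # mixed leading blanks
tag_alone = bool(sensitive.fullmatch('<p class="sd-23">'))               # zero blanks is allowed
wrong_case = bool(sensitive.fullmatch('<P CLASS="sd-23">'))              # case differs
wrong_case_ignored = bool(insensitive.fullmatch('<P CLASS="sd-23">'))

print(blanks_then_tag, tag_alone, wrong_case, wrong_case_ignored)
```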

BR

guy038