Find <strong> and </strong> tags but skip it between <p class="translate and </p> and skip words like "stronger"/"strongest"

dr ramaanand

How to find <strong> and </strong> tags but skip them between <p class="translate and </p>, and skip words like “stronger”/“strongest”?
Block for testing:

<p class="translate" style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong>
</p></div>
<p style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong>
</p>
stronger
strongest
<span style="color:blue;font-weight:bold;">

This regular expression is invalid: (<p\s*class="translate"[^>]*>[^<>]*<\/strong)\K|strong, and so is (?<!<p\s*class="translate"[^>]*>[^<>]*<\/)strong. Meanwhile, (<p\s*class="translate"[^>]*>[^<>]*<\/)strong>\K|strong> only helps skip the words stronger and strongest.

dr ramaanand

https://regex101.com/r/NDip3K/5 helped find the first <strong> tag only, but https://regex101.com/r/NDip3K/3 helped find even the </strong> tag if it was not between <p class="translate and </p>

PeterJones

@dr-ramaanand ,

\bstrong\b will match “strong” but not “stronger” or “Armstrong”. Hopefully that gives you what you need for the “only strong, no words with strong inside” requirement.
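As a rough illustration of the word-boundary behaviour described above, here is a minimal sketch in Python's built-in `re` module rather than Notepad++ (the sample string is made up for the demonstration). Note that `\b` also matches the word inside the tags themselves, since `<`, `/` and `>` are non-word characters.

```python
import re

# Made-up sample: only the standalone word "strong" should match,
# including where it sits inside <strong> and </strong>.
text = "strong stronger strongest Armstrong <strong>bold</strong>"

# \b asserts a position between a word character and a non-word character
# (or the start/end of the text).
pattern = re.compile(r"\bstrong\b")

# Collect the start offset of every match.
starts = [m.start() for m in pattern.finditer(text)]
print(starts)
```

The offsets printed correspond to the standalone word and the two tag names; "stronger", "strongest" and "Armstrong" are skipped because a word character sits directly next to "strong" in each of them.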

Please note that the word boundary is documented in the User Manual. I would consider it fundamental enough that, at the level of regex you are aiming for, you should be able to come up with that idea yourself without asking.

Hopefully, someone with more ability to explain skip-between syntax will be able to chime in on the more difficult half of your question.

dr ramaanand

@PeterJones Thanks for the information but the regular expression (RegEx) at https://regex101.com/r/NDip3K/3 helped solve my problem. If, however, @guy038 has a better solution/answer, I will use it (if he has the time to work out a better solution/answer). Note to future readers: The SKIP/FAIL method of finding something after skipping some other strings is explained at https://community.notepad-plus-plus.org/topic/26812/generic-regex-how-to-use-the-couple-of-backtracking-control-verbs-skip-fail-or-skip-f-in-regexes (if one clicks this link, one can see that it is easy because it shows the part skipped and the part matched)

PeterJones

@dr-ramaanand ,

In a small file, you might not notice a difference, but \b is computationally significantly more efficient than the SKIP/FAIL, so for big files, SKIP/FAIL could cease to work, depending on how far back it has to backtrack; \b has no such “breaking point”, as it never needs to look farther than one character in either direction from the current match point.

Please stop trying to convince me to read @guy038’s post; all it serves to do is annoy me. I have read his post, and understand how to use the formula.

But searching for strong-but-not-stronger-or-Armstrong does not need something as powerful and computationally expensive as SKIP/FAIL, and if someone was asking for help (as you did), I would give the advice that I did, whether or not I know how to use @guy038’s formula, or whether or not I fully understand the implications of SKIP/FAIL, because I know enough about it to make an informed decision for myself, and for what advice I am willing and able to give.

If you want to use SKIP/FAIL, and if you figured out how to make that work for your question, great for you. But please stop asking for help or advice if you don’t want help or advice.

I really didn’t reply to this to fight with you. In fact, I desperately wanted to avoid it. But since you claimed you wanted help, and I knew a way to accomplish part of your goal, I thought I would try to put the past behind us and offer you what I consider to be good advice. Since you obviously don’t want or value my advice, I am going to go back to just ignoring all your posts, because even when I give advice that answers some or all of your question, you still fight back, and my trying to help you is just a waste of everyone’s time.

guy038

Hello, @dr-ramaanand, @peterjones and all,

Not very difficult to achieve! You can use either:

The regex with the (*SKIP)(*F) feature:
- FIND (?s-i)(*SKIP)(*F)|</?strong>
The regex with the \K feature:
- FIND (?s-i)|</?strong>

Against this INPUT text, which corresponds to your example duplicated three times:

<p class="translate" style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong>
</p></div>
<p style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong>
</p>
stronger
strongest
<span style="color:blue;font-weight:bold;">
<p class="translate" style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong>
</p></div>
<p style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong>
</p>
stronger
strongest
<span style="color:blue;font-weight:bold;">
<p class="translate" style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong>
</p></div>
<p style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong>
</p>
stronger
strongest
<span style="color:blue;font-weight:bold;">

It would mark 6 occurrences of <strong> or </strong> only, when found outside the ranges of text <p class="translate ... </p>!


Note that the second regex, with \K, works nicely too, because the regex engine tries:

The first branch of the alternative, which must match in full before \K discards the match so far, so that only the </?strong> part is reported
If the first branch cannot match, the second branch of the alternative, </?strong>, which returns just one of the two tags
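
For readers who want to try the same "skip one range, match elsewhere" idea outside Notepad++, here is a hedged sketch in Python. To my knowledge Python's built-in `re` module supports neither `(*SKIP)(*F)` nor `\K`, so this is not the regex from the post: it swaps in a different technique, where the first branch consumes the unwanted block without capturing and the second branch captures the tag. The shortened sample text and the helper name `find_strong_outside_translate` are assumptions made for the illustration.

```python
import re

# Shortened stand-in for the test block (long style attributes omitted),
# duplicated three times as in the post.
sample = (
    '<p class="translate" style="color: black;"><strong>Lorem</strong>\n'
    '</p></div>\n'
    '<p style="color: black;"><strong>ipsum</strong>\n'
    '</p>\n'
    'stronger\n'
    'strongest\n'
    '<span style="color:blue;font-weight:bold;">\n'
) * 3

# Branch 1 consumes a whole <p class="translate ... </p> block, capturing nothing.
# Branch 2 captures a <strong> or </strong> tag found anywhere else.
pattern = re.compile(r'<p\s+class="translate.*?</p>|(</?strong>)', re.DOTALL)


def find_strong_outside_translate(text):
    """Return (offset, tag) for each strong tag outside the skipped blocks."""
    return [
        (m.start(1), m.group(1))
        for m in pattern.finditer(text)
        if m.group(1) is not None  # group 1 is None when branch 1 matched
    ]


hits = find_strong_outside_translate(sample)
print(len(hits))
```

Because the pattern requires the angle brackets, the plain words "stronger" and "strongest" are never candidates, and the tags inside the skipped blocks are swallowed by the first branch before the second branch can see them.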

Best Regards,

guy038

dr ramaanand

@guy038 Thank you very much. @PeterJones I don’t come here to “fight” with anyone - I linked to the explanation of the SKIP/FAIL method by @guy038 just to help future readers to understand it. I am sorry if I have hurt you but that was not my intention. I request others to comment if any of my posts seem offensive and why they seem so (if they deem it so) - I will correct myself.

Find <strong> and </strong> tags but skip it between <p class="translate and </p> and skip words like "stronger"/"strongest"