Find <strong> and </strong> tags but skip them between <p class="translate and </p>, and skip words like "stronger"/"strongest"
-
How can I find <strong> and </strong> tags but skip them between <p class="translate and the first </p>, and also skip words like “stronger”/“strongest”?
Block for testing:-<p class="translate" style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong> </p></div> <p style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong> </p> stronger strongest <span style="color:blue;font-weight:bold;">
This regular expression is invalid:
(<p\s*class="translate"[^>]*>[^<>]*<\/strong)\K|strong
and so is
(?<!<p\s*class="translate"[^>]*>[^<>]*<\/)strong
while
(<p\s*class="translate"[^>]*>[^<>]*<\/)strong>\K|strong>
only helps skip the words stronger and strongest.
https://regex101.com/r/NDip3K/5 helped find only the first <strong> tag, but https://regex101.com/r/NDip3K/3 helped find even the </strong> tag if it was not between <p class="translate and </p>
-
\bstrong\b
will match “strong” but not “stronger” or “Armstrong”. Hopefully that gives you what you need for the “only strong, no words with strong inside” requirement.
Please note that the word boundary is documented in the User Manual. I would recommend that as something fundamental enough that you should be able to come up with that idea yourself without asking, at the level of regex that you are aiming for.
Hopefully, someone with more ability to explain skip-between syntax will be able to chime in on the more difficult half of your question.
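As a quick check of the word-boundary behaviour described above, here is a minimal sketch in Python. It uses Python's re module rather than the regex engine in Notepad++, so treat it as an illustration only; the sample list is made up for the example. It also shows that \bstrong\b still matches the word inside the tags themselves, because <, / and > are non-word characters.

```python
import re

# Word-boundary pattern from the post above.
WORD = re.compile(r"\bstrong\b")

# Hypothetical sample strings, chosen only to illustrate the boundaries.
samples = ["strong", "stronger", "strongest", "Armstrong", "<strong>", "</strong>"]

# Keep the samples in which the pattern finds a match.
matched = [s for s in samples if WORD.search(s)]
print(matched)  # ['strong', '<strong>', '</strong>']
```

So the word boundary handles the "stronger"/"strongest" half of the question, but on its own it does not know whether a tag sits inside a translate paragraph.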
-
@PeterJones Thanks for the information but the regular expression (RegEx) at https://regex101.com/r/NDip3K/3 helped solve my problem. If, however, @guy038 has a better solution/answer, I will use it (if he has the time to work out a better solution/answer). Note to future readers: The SKIP/FAIL method of finding something after skipping some other strings is explained at https://community.notepad-plus-plus.org/topic/26812/generic-regex-how-to-use-the-couple-of-backtracking-control-verbs-skip-fail-or-skip-f-in-regexes (if one clicks this link, one can see that it is easy because it shows the part skipped and the part matched)
-
In a small file, you might not notice a difference, but \b is computationally significantly more efficient than SKIP/FAIL, so for big files, SKIP/FAIL could cease to work, depending on how far back it has to backtrack; \b has no such “breaking point”, as it never needs to look farther than one character in either direction from the current match point.
Please stop trying to convince me to read @guy038’s post; all it serves to do is annoy me. I have read his post, and understand how to use the formula.
But searching for strong-but-not-stronger-or-Armstrong does not need something as powerful and computationally expensive as SKIP/FAIL, and if someone was asking for help (as you did), I would give the advice that I did, whether or not I know how to use @guy038’s formula, or whether or not I fully understand the implications of SKIP/FAIL, because I know enough about it to make an informed decision for myself, and for what advice I am willing and able to give.
If you want to use SKIP/FAIL, and if you figured out how to make that work for your question, great for you. But please stop asking for help or advice if you don’t want help or advice.
I really didn’t reply to this to fight with you. In fact, I desperately wanted to avoid it. But since you claimed you wanted help, and I knew a way to accomplish part of your goal, I thought I would try to put the past behind us and offer you what I consider to be good advice. Since you obviously don’t want or value my advice, I am going to go back to just ignoring all your posts, because even when I give advice that answers some or all of your question, you still fight back, and my trying to help you is just a waste of everyone’s time.
-
Hello, @dr-ramaanand, @peterjones and all,
Not very difficult to achieve! You can use either:
-
The regex with the (*SKIP)(*F) feature:
- FIND
(?s-i)<p class\s*=\s*"translate.+?</p>(*SKIP)(*F)|</?strong>
-
The regex with the \K feature:
- FIND
(?s-i)<p class\s*=\s*"translate.+?</p.+?\K</?strong>|</?strong>
Against this INPUT text which corresponds to your example, duplicated three times :
<p class="translate" style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong> </p></div> <p style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong> </p> stronger strongest <span style="color:blue;font-weight:bold;"> <p class="translate" style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong> </p></div> <p style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong> </p> stronger strongest <span style="color:blue;font-weight:bold;"> <p class="translate" style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; 
widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong> </p></div> <p style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong> </p> stronger strongest <span style="color:blue;font-weight:bold;">
It would mark 6 occurrences of <strong> or </strong> only, namely those found outside the ranges of text <p class="translate"...............</p>!
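For readers who want to check the count outside Notepad++, here is a rough Python sketch. Python's built-in re module does not support (*SKIP)(*F), so this swaps in a different but common technique: the first branch consumes the unwanted range and is discarded, and the second branch captures the wanted tag in a group. The test block below is a shortened, hypothetical version of the INPUT text (style attributes abbreviated), and the function name is my own.

```python
import re

# Branch 1 consumes a whole translate paragraph (its match is discarded);
# branch 2 captures a <strong> or </strong> tag found anywhere else.
# re.DOTALL plays the role of (?s); Python is case-sensitive by default, like (?-i).
PATTERN = re.compile(r'<p class\s*=\s*"translate.+?</p>|(</?strong>)', re.DOTALL)

def strong_tags_outside_translate(html):
    """Return (offset, tag) pairs for strong tags outside translate paragraphs."""
    return [
        (m.start(1), m.group(1))
        for m in PATTERN.finditer(html)
        if m.group(1) is not None  # skip matches that came from branch 1
    ]

# Shortened stand-in for the test block (assumption: abbreviated styles).
block = (
    '<p class="translate" style="color: black;"><strong>Lorem</strong> </p></div> '
    '<p style="color: black;"><strong>ipsum</strong> </p> stronger strongest '
)

hits = strong_tags_outside_translate(block * 3)
print(len(hits))                          # 6
print(sorted({tag for _, tag in hits}))   # ['</strong>', '<strong>']
```

The count agrees with the 6 occurrences reported above: two tags per copy, from the paragraph without the translate class.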
Note that the second regex, with \K, works nicely too because the regex engine tries:
-
The first branch of the alternative, which must match in full; \K then discards everything matched so far, so only the final </?strong> part is reported
-
If the first branch cannot be matched, the regex engine tries the second branch of the alternative, </?strong>, which matches just one of the two tags
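The two-branch logic can be traced with a small Python sketch. Python's re module has no \K, so as a stand-in (my assumption, not the Notepad++ syntax) each branch captures its tag in its own group, which lets us see which branch produced each hit. The test block is again a shortened, hypothetical version of the INPUT text.

```python
import re

# Group 1: tag reached through the first branch (translate paragraph, then
# the next strong tag after it). Group 2: tag matched by the plain second branch.
PATTERN = re.compile(
    r'<p class\s*=\s*"translate.+?</p.+?(</?strong>)|(</?strong>)', re.DOTALL
)

def trace_branches(html):
    """Return (branch_number, tag) for every reported strong tag."""
    result = []
    for m in PATTERN.finditer(html):
        branch = 1 if m.group(1) is not None else 2
        result.append((branch, m.group(branch)))
    return result

# Shortened stand-in for the test block (assumption: abbreviated styles).
block = (
    '<p class="translate" style="color: black;"><strong>Lorem</strong> </p></div> '
    '<p style="color: black;"><strong>ipsum</strong> </p> stronger strongest '
)

trace = trace_branches(block * 3)
print([branch for branch, _ in trace])  # [1, 2, 1, 2, 1, 2]
print([tag for _, tag in trace])
```

In each copy, the opening tag after the translate paragraph comes from the first branch and the closing tag from the second, which is the behaviour described in the two points above.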
Best Regards,
guy038
-
@guy038 Thank you very much. @PeterJones I don’t come here to “fight” with anyone - I linked to the explanation of the SKIP/FAIL method by @guy038 just to help future readers to understand it. I am sorry if I have hurt you but that was not my intention. I request others to comment if any of my posts seem offensive and why they seem so (if they deem it so) - I will correct myself.