
    Find <strong> and </strong> tags but skip it between <p class="translate and </p> and skip words like "stronger"/"strongest"

    7 Posts 3 Posters 202 Views
    • dr ramaanand
      last edited by dr ramaanand

      How can I find <strong> and </strong> tags, but skip them between <p class="translate and the first </p>, and also skip words like “stronger”/“strongest”?
      Block for testing:

      <p class="translate" style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong>
      </p></div>
      <p style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong>
      </p>
      stronger
      strongest
      <span style="color:blue;font-weight:bold;">
      

      This regular expression is invalid:
      (<p\s*class="translate"[^>]*>[^<>]*<\/strong)\K|strong
      So is this one:
      (?<!<p\s*class="translate"[^>]*>[^<>]*<\/)strong
      And this one only helps skip the words stronger and strongest:
      (<p\s*class="translate"[^>]*>[^<>]*<\/)strong>\K|strong>
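One likely reason the lookbehind attempt is rejected: a lookbehind containing \s*, [^>]* or [^<>]* has no fixed length, and Perl-style engines generally refuse that. A minimal sketch using Python's standard re module as a stand-in (an assumption: Notepad++ uses the Boost engine, whose error message will differ); the helper name compiles is hypothetical:

```python
import re

# The lookbehind attempt from the post above; the quantifiers inside
# (?<! ... ) make its length variable.
VARIABLE_LOOKBEHIND = r'(?<!<p\s*class="translate"[^>]*>[^<>]*<\/)strong'

# A lookbehind with a fixed length (two characters), for comparison.
FIXED_LOOKBEHIND = r"(?<!</)strong"


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print("variable-length lookbehind accepted:", compiles(VARIABLE_LOOKBEHIND))
print("fixed-length lookbehind accepted:", compiles(FIXED_LOOKBEHIND))
```

This only illustrates the fixed-length restriction; it does not solve the skip-between part of the question.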

      • dr ramaanand @dr ramaanand

        https://regex101.com/r/NDip3K/5 found only the first <strong> tag, but https://regex101.com/r/NDip3K/3 also found the </strong> when it was not between <p class="translate and </p>

        • PeterJones @dr ramaanand

          @dr-ramaanand ,

          \bstrong\b will match “strong” but not “stronger” or “Armstrong”. Hopefully that gives you what you need for the “only strong, no words with strong inside” requirement.
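The word-boundary behaviour can be checked quickly outside Notepad++. A small sketch using Python's standard re module (an assumption: for plain ASCII text its \b behaves like the Boost engine's); the sample string is made up for illustration:

```python
import re

sample = "strong stronger strongest Armstrong <strong>bold</strong>"

# \b asserts a boundary between a word character and a non-word character,
# so "strong" only matches when it is not glued to other letters or digits.
whole_word = re.compile(r"\bstrong\b")

# Collect the offset and text of every whole-word match.
matches = [(m.start(), m.group()) for m in whole_word.finditer(sample)]
print(matches)
```

The standalone word and the tag names inside <strong> and </strong> are matched, because <, / and > are non-word characters; "stronger", "strongest" and "Armstrong" are not.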

          Please note that the word boundary (\b) is documented in the User Manual. At the level of regex you are aiming for, it is fundamental enough that you should be able to come up with it yourself without asking.

          Hopefully, someone with more ability to explain skip-between syntax will be able to chime in on the more difficult half of your question.

          • dr ramaanand @PeterJones
            last edited by dr ramaanand

            @PeterJones Thanks for the information, but the regular expression (RegEx) at https://regex101.com/r/NDip3K/3 solved my problem. If, however, @guy038 has the time to work out a better solution, I will use it. Note to future readers: the SKIP/FAIL method of finding something after skipping some other strings is explained at https://community.notepad-plus-plus.org/topic/26812/generic-regex-how-to-use-the-couple-of-backtracking-control-verbs-skip-fail-or-skip-f-in-regexes (that post is easy to follow because it shows both the part skipped and the part matched)

            • PeterJones @dr ramaanand
              last edited by PeterJones

              @dr-ramaanand ,

              In a small file, you might not notice a difference, but \b is computationally much more efficient than SKIP/FAIL. For big files, SKIP/FAIL could cease to work, depending on how far it has to backtrack; \b has no such “breaking point”, as it never needs to look farther than one character in either direction from the current match point.

              Please stop trying to convince me to read @guy038’s post; all it serves to do is annoy me. I have read his post, and understand how to use the formula.

              But searching for strong-but-not-stronger-or-Armstrong does not need something as powerful and computationally expensive as SKIP/FAIL, and if someone was asking for help (as you did), I would give the advice that I did, whether or not I know how to use @guy038’s formula, or whether or not I fully understand the implications of SKIP/FAIL, because I know enough about it to make an informed decision for myself, and for what advice I am willing and able to give.

              If you want to use SKIP/FAIL, and if you figured out how to make that work for your question, great for you. But please stop asking for help or advice if you don’t want help or advice.

              I really didn’t reply to this to fight with you. In fact, I desperately wanted to avoid it. But since you claimed you wanted help, and I knew a way to accomplish part of your goal, I thought I would try to put the past behind us and offer you what I consider to be good advice. Since you obviously don’t want or value my advice, I am going to go back to just ignoring all your posts, because even when I give advice that answers some or all of your question, you still fight back, and my trying to help you is just a waste of everyone’s time.

              • guy038
                last edited by guy038

                Hello, @dr-ramaanand, @peterjones and all,

                Not very difficult to achieve! You can use either:

                • The regex with the (*SKIP)(*F) feature:

                  • FIND (?s-i)<p class\s*=\s*"translate.+?</p>(*SKIP)(*F)|</?strong>
                • The regex with the \K feature:

                  • FIND (?s-i)<p class\s*=\s*"translate.+?</p.+?\K</?strong>|</?strong>

                Test them against this INPUT text, which corresponds to your example, duplicated three times:

                <p class="translate" style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong>
                </p></div>
                <p style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong>
                </p>
                stronger
                strongest
                <span style="color:blue;font-weight:bold;">
                <p class="translate" style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong>
                </p></div>
                <p style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong>
                </p>
                stronger
                strongest
                <span style="color:blue;font-weight:bold;">
                <p class="translate" style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong>
                </p></div>
                <p style="font-size: 15px; color: black; line-height: 18px; font-style: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; background-color: transparent; font-family: Verdana; text-align: right;"><strong><span style="color: blue;">Lorem</span><br><span style="color: navy;">ipsum</span></strong>
                </p>
                stronger
                strongest
                <span style="color:blue;font-weight:bold;">
                

                Either regex marks only the 6 occurrences of <strong> or </strong> found outside the ranges of text <p class="translate"...............</p>.

                The snapshot below shows the result:

                [image: 9ca22059-bede-4c25-ad9f-f385ae3431fe-image.png]
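For readers who want to reproduce the count outside Notepad++, here is a hedged sketch in Python. The standard re module has no (*SKIP)(*F), so this swaps in a different technique: the first branch consumes each translate paragraph, the second branch captures the tag, and matches without a capture are discarded. The test block is a shortened stand-in (style attributes trimmed), and the function name is hypothetical:

```python
import re

# Shortened stand-in for the thread's test block (style attributes trimmed).
BLOCK = (
    '<p class="translate" style="color: black;"><strong><span>Lorem</span>'
    '<br><span>ipsum</span></strong>\n</p></div>\n'
    '<p style="color: black;"><strong><span>Lorem</span>'
    '<br><span>ipsum</span></strong>\n</p>\n'
    'stronger\nstrongest\n'
    '<span style="color:blue;font-weight:bold;">\n'
)
TEXT = BLOCK * 3  # duplicated three times, as in the post

# Branch 1 consumes a whole translate paragraph (the part to skip);
# branch 2 captures a real <strong> or </strong> tag in group 1.
PATTERN = re.compile(r'(?s)<p class\s*=\s*"translate.+?</p>|(</?strong>)')


def strong_tags_outside_translate(text: str) -> list[tuple[int, str]]:
    """Return (offset, tag) for each strong tag outside translate paragraphs."""
    return [
        (m.start(1), m.group(1))
        for m in PATTERN.finditer(text)
        if m.group(1) is not None  # branch 1 matches are thrown away
    ]


hits = strong_tags_outside_translate(TEXT)
print(len(hits), [tag for _, tag in hits])
```

On this stand-in input the sketch reports 6 tags, the same count as stated above. Python's re is case-sensitive by default, so the -i part of (?s-i) is not needed here.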


                Note that the second regex, with \K, works nicely too, because the regex engine tries:

                • The first branch of the alternation, which must match in full before \K discards the match so far, so that only the part </?strong> is kept

                • The second branch of the alternation, </?strong>, if the first branch cannot be matched; it matches either of the two tags
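The two branches can be observed separately with a small Python sketch. The standard re module has no \K, so this is an approximation under that assumption: a capture group stands in for "what remains after \K" in the first branch. The test block is a shortened stand-in (style attributes trimmed), and the function name is hypothetical:

```python
import re

# Shortened stand-in for the thread's test block (style attributes trimmed).
BLOCK = (
    '<p class="translate" style="color: black;"><strong><span>Lorem</span>'
    '<br><span>ipsum</span></strong>\n</p></div>\n'
    '<p style="color: black;"><strong><span>Lorem</span>'
    '<br><span>ipsum</span></strong>\n</p>\n'
    'stronger\nstrongest\n'
    '<span style="color:blue;font-weight:bold;">\n'
)
TEXT = BLOCK * 3  # duplicated three times, as in the post

# Group 1 stands in for the text kept after \K in the first branch;
# group 2 is the plain second branch.
PATTERN = re.compile(
    r'(?s)<p class\s*=\s*"translate.+?</p.+?(</?strong>)|(</?strong>)'
)


def tags_with_branch(text: str) -> list[tuple[int, str]]:
    """Return (branch number, tag) for every match, in document order."""
    results = []
    for m in PATTERN.finditer(text):
        branch = m.lastindex  # 1 or 2: the group that took part in the match
        results.append((branch, m.group(branch)))
    return results


for branch, tag in tags_with_branch(TEXT):
    print(f"branch {branch}: {tag}")
```

On this input each opening tag is reached through the first branch, which runs across the preceding translate paragraph, and each closing tag through the second. Note that the first branch depends on a strong tag following every translate paragraph, which holds for this test text.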

                Best Regards,

                guy038

                • dr ramaanand @guy038
                  last edited by dr ramaanand

                  @guy038 Thank you very much. @PeterJones I don’t come here to “fight” with anyone - I linked to the explanation of the SKIP/FAIL method by @guy038 just to help future readers to understand it. I am sorry if I have hurt you but that was not my intention. I request others to comment if any of my posts seem offensive and why they seem so (if they deem it so) - I will correct myself.

                  The Community of users of the Notepad++ text editor.