    RegEx to find a "p" tag followed by a space, followed by an angled bracket but skip any capital letters and span tags

    • Scott NielsonS
      Scott Nielson
      last edited by

      <p[^>]*>\s*(<)(?![A-Z])(?!span) is skipping the span tags but not the words before the first opening angled bracket <

      • Scott NielsonS
        Scott Nielson @Scott Nielson
        last edited by Scott Nielson

        @Scott-Nielson For example, <p style=color: black> <head should be found but not <p style=color: black> Head<

        • PeterJonesP
          PeterJones @Scott Nielson
          last edited by

          @Scott-Nielson

          I cannot replicate the match with your given regex on either of those two examples.

          ec6dba86-9f8e-4f82-be7f-5eef59f2f3ce-image.png

          If I remove the lookaheads, it will match the first, not the second

          ae0224bf-3b5c-4aa8-b12d-4f1c801cf51a-image.png

          So I’m not seeing why you think it matches the wrong one. Please make sure the regex was properly shown in your message, and make sure the data displayed in your post matches what it is in your actual file.
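One possible explanation for the mismatch, sketched in Python: the `re` module is used here only as a stand-in for the Notepad++ regex engine, and the idea that the search was run case-insensitively (for example with "Match case" unchecked) is an assumption, not something the thread confirms. Under case-insensitive matching, `[A-Z]` also covers lowercase letters, so `(?![A-Z])` rejects `<head` as well.

```python
import re

# The pattern from the first post, unchanged.
PATTERN = r"<p[^>]*>\s*(<)(?![A-Z])(?!span)"

# The two examples from the second post.
samples = [
    "<p style=color: black> <head",
    "<p style=color: black> Head<",
]


def matches(pattern, text, flags=0):
    """Return True if the pattern matches anywhere in the text."""
    return re.search(pattern, text, flags) is not None


# Case-sensitive: "h" is not in [A-Z], so the first sample matches.
case_sensitive = [matches(PATTERN, s) for s in samples]

# Case-insensitive (assumed setting): [A-Z] also matches "h",
# so the negative lookahead rejects the first sample too.
case_insensitive = [matches(PATTERN, s, re.IGNORECASE) for s in samples]

# Lookaheads removed, as in the second screenshot.
no_lookaheads = [matches(r"<p[^>]*>\s*(<)", s) for s in samples]

print(case_sensitive, case_insensitive, no_lookaheads)
```

If the case-insensitive run reflects the actual search settings, dropping `(?![A-Z])` costs nothing: text such as `Head` before the `<` is already excluded, because `\s*` cannot consume letters.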

          • Scott NielsonS
            Scott Nielson @PeterJones
            last edited by Scott Nielson

            @PeterJones OK, forget my RegEx. Please tell me how to find, <p style=color: black> < but not <p style=color: black> Head<, <p style=color: black> Sing< or <p style=color: blue> <span

            • PeterJonesP
              PeterJones @Scott Nielson
              last edited by

              @Scott-Nielson ,

              Well, the reduced one I showed in the screenshot matches only 2 of the 4 examples, so it’s almost there: <p[^>]*>\s*(<)

              10b8d9e9-fd0b-4f66-852d-0b0b35dab7eb-image.png

              Then you just have to put the (?!span) back in so that the fourth line, the final instance, no longer matches: <p[^>]*>\s*(<)(?!span)
              ba458005-5d04-4d0c-8fe7-cd48876cef50-image.png
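The two steps can be reproduced outside the editor. This is a minimal sketch in Python, assuming its `re` module treats these constructs the same way the Notepad++ engine does; the helper name `matching_lines` is made up for illustration.

```python
import re

# The four examples from the question, one per line.
lines = [
    "<p style=color: black> <",
    "<p style=color: black> Head<",
    "<p style=color: black> Sing<",
    "<p style=color: blue> <span",
]

REDUCED = r"<p[^>]*>\s*(<)"
WITH_SPAN_GUARD = r"<p[^>]*>\s*(<)(?!span)"


def matching_lines(pattern):
    """Return the 1-based numbers of the lines the pattern matches."""
    return [i for i, line in enumerate(lines, start=1) if re.search(pattern, line)]


reduced_hits = matching_lines(REDUCED)          # lines 1 and 4
guarded_hits = matching_lines(WITH_SPAN_GUARD)  # line 1 only

print(reduced_hits, guarded_hits)
```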

              • Scott NielsonS
                Scott Nielson @PeterJones
                last edited by

                @PeterJones your RegEx helps skip only the span tag after the p tag but please tell me how to find, <p style=color: black> < but not <p style=color: black> Head<, <p style=color: black> Sing<, <p style=color: black> France< or <p style=color: blue> <span

                • Scott NielsonS
                  Scott Nielson @PeterJones
                  last edited by Scott Nielson

                  @PeterJones, I want one RegEx to find everything at once (and skip what I mentioned above), not 2 RegExes!

                    • PeterJonesP
                      PeterJones @PeterJones
                      last edited by PeterJones

                      (deleted first attempt and rephrased)

                      In the document below, which has all of your test cases,

                      <p style=color: black> <
                      <p style=color: black> Head<
                      <p style=color: black> Sing<
                      <p style=color: blue> <span
                      <p style=color: black> France<
                      

                      … the one final regex I already gave you, <p[^>]*>\s*(<)(?!span), already matches only the <p style=color: black> < and none of the other examples you have shown, as my earlier screenshot showed for all but the new “France” example. And I’m not sure why you thought the “France” example was any different from your other two text-before-the-< examples. From the regex perspective, whether it says Head or Sing or France is completely irrelevant. If there is ANY non-space between the > after p and the next <, it won’t match. The same has been true for all the regexes I have presented in this topic.
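To see why the word before the `<` never matters, it can help to read the regex piece by piece. The sketch below spells it out with Python's `re.VERBOSE` mode, which allows comments inside the pattern; Python is only a stand-in for the Notepad++ engine here, and the comments are an interpretation of the pattern, not text from the thread.

```python
import re

ANNOTATED = re.compile(r"""
    <p          # literal "<p", the start of the opening tag
    [^>]*       # any attributes: everything up to the tag's closing bracket
    >           # the ">" that closes the opening p tag
    \s*         # whitespace only; letters cannot be consumed here
    (<)         # the very next character must be "<" (captured as group 1)
    (?!span)    # and that "<" must not be the start of "<span"
""", re.VERBOSE)

# The five test lines from the post above.
document = """\
<p style=color: black> <
<p style=color: black> Head<
<p style=color: black> Sing<
<p style=color: blue> <span
<p style=color: black> France<
"""

hits = [line for line in document.splitlines() if ANNOTATED.search(line)]
print(hits)
```

Because only `\s*` sits between the `>` and the `(<)`, any word there, whether Head, Sing, or France, makes the match fail at the same point.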

                      73a63392-b87e-409a-8847-c83b935bd36b-image.png

                      ----

                      Please note: This Community Forum is not a data transformation service; you should not expect to be able to always say “I have data like X and want it to look like Y” and have us do all the work for you. If you are new to the Forum, and new to regular expressions, we will often give help on the first one or two data-transformation questions, especially if they are well-asked and you show a willingness to learn; and we will point you to the documentation where you can learn how to do the data transformations for yourself in the future. But if you repeatedly ask us to do your work for you, you will find that the patience of usually-helpful Community members wears thin. The best way to learn regular expressions is by experimenting with them yourself, and getting a feel for how they work; having us spoon-feed you the answers without you putting in the effort doesn’t help you in the long term and is uninteresting and annoying for us.

                      • Scott NielsonS
                        Scott Nielson @PeterJones
                        last edited by

                        OK, thanks a lot @PeterJones

                        • Scott NielsonS
                          Scott Nielson @PeterJones
                          last edited by

                          @PeterJones, If I want to find a p tag followed by a span tag but skip an a name tag after those tags, will this RegEx work: <p[^>]*>\s*<span[^>]*>\s*(<)(?!a\s*name) or does it need to be tweaked (I feel the \s* just before the name may not be right)?

                          • Scott NielsonS
                            Scott Nielson @PeterJones
                            last edited by

                            @PeterJones I also want it to skip an a href tag, for example, <p class="MsoNormal" style="font-family: &quot;verdana&quot;; font-size: 18px; color: rgb(102, 102, 102); line-height: 18px; text-align: right;" align="right"><span style="font-family: Verdana,sans-serif;"> <a href="#" style="text-decoration: none; color: rgb(46, 150, 226);"> but it should find <p class="MsoNormal" style="font-family: &quot;verdana&quot;; font-size: 18px; color: rgb(102, 102, 102); line-height: 18px; text-align: right;" align="right"><span style="font-family: Verdana,sans-serif;"> <

                            • Scott NielsonS
                              Scott Nielson @PeterJones
                              last edited by

                              @PeterJones This RegEx might work: <p[^>]*>\s*<span[^>]*>\s*(<)(?!a\s*name|href) but I’m waiting for your expert guidance!

                              • Scott NielsonS
                                Scott Nielson @PeterJones
                                last edited by Scott Nielson

                                @PeterJones The RegEx I typed just above is not working, so I will need your help. Please help! This seems better, <p[^>]*>\s*<span[^>]*>\s*(<)(?!a\s*name)(?!a\s*href), but is it right?
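A likely reason the `(?!a\s*name|href)` attempt fails is alternation scope: inside the lookahead, `|` separates `a\s*name` from a bare `href`, and a bare `href` never appears directly after the `<`. The sketch below compares the three variants in Python, again as a stand-in for the Notepad++ engine. The sample strings are shortened versions of the ones in the posts, the `a name` sample and the "grouped" variant are hypothetical additions for illustration.

```python
import re

# Shared prefix: a p tag, then a span tag, then the "<" being tested.
PREFIX = r"<p[^>]*>\s*<span[^>]*>\s*(<)"

candidates = {
    "alternation": PREFIX + r"(?!a\s*name|href)",         # the attempt that did not work
    "two lookaheads": PREFIX + r"(?!a\s*name)(?!a\s*href)",  # the version that worked
    "grouped": PREFIX + r"(?!a\s+(?:name|href))",          # hypothetical equivalent
}

# Shortened, illustrative samples (attributes trimmed for readability).
OPENING = '<p class="MsoNormal" align="right"><span style="font-family: Verdana,sans-serif;"> '
samples = {
    "bare <": OPENING + "<",
    "a href": OPENING + '<a href="#" style="text-decoration: none;">',
    "a name": OPENING + '<a name="top">',
}

results = {
    label: {name: re.search(pattern, text) is not None for name, text in samples.items()}
    for label, pattern in candidates.items()
}

for label, outcome in results.items():
    print(label, outcome)
```

Note that all three variants only look at the first attribute after `<a`; a tag whose `href` or `name` comes after another attribute would still be matched, so real data may need a broader check.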

                                • PeterJonesP
                                  PeterJones @Scott Nielson
                                  last edited by

                                  @Scott-Nielson ,

                                  > This RegEx might work

                                  Try it and see. We cannot tell you if it will actually meet your needs.

                                  > I’m waiting for your expert guidance!

                                  Seems an inefficient way to solve your problems.

                                  > The RegEx I typed just above is not working

                                  Good for you for trying. Sorry it didn’t work. Maybe try reading some more of the docs, or breaking the problem down into smaller pieces and getting those smaller pieces working right before putting them together.

                                  > This seems better, but is it right?

                                  How can we know? You give minor examples, but whether it matches all the appropriate data for your data set or not, including exceptions that you don’t show us, is not something that any of us here can tell you.

                                  With regex, you have to spend the effort and try to figure it out. In every one of the half-dozen or more regex questions you’ve asked in the last few months, when I or one of the other regulars here who answer regex questions have answered, we have had to experiment to get it; it’s not like we can magically say “abracadabra” and the right regex suddenly appears. Each time I’ve helped you, I have taken the bits of regex syntax knowledge I have, tried combinations of those terms that seemed like they’d do the job, and tested them; if each piece does what I think, then I expand it to the next requirement, until it seems to match (in which case, job’s done, yay!) or until it stops matching parts that I think it should, at which point I back up and try a different tactic. The only way you’re going to learn regex is by figuring out what the individual pieces do, and trying to use those pieces to solve the problems you have. You are not going to learn if I keep on handing you the answers.
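The build-a-piece, test, then extend loop described above can be mimicked with a tiny checker. This is only a sketch under assumptions: the `check` helper is invented for illustration, Python's `re` stands in for the Notepad++ engine, and the sample lines are taken from earlier in the thread.

```python
import re


def check(pattern, should_match, should_not_match):
    """Return a list of readable failures; an empty list means every case passes."""
    failures = []
    for text in should_match:
        if not re.search(pattern, text):
            failures.append(f"expected a match: {text!r}")
    for text in should_not_match:
        if re.search(pattern, text):
            failures.append(f"unexpected match: {text!r}")
    return failures


wanted = ["<p style=color: black> <"]
unwanted = [
    "<p style=color: black> Head<",
    "<p style=color: blue> <span",
]

# Step 1: the p tag alone is far too broad; both unwanted lines match.
step1 = check(r"<p[^>]*>", wanted, unwanted)

# Step 2: require only whitespace before the next "<"; "Head<" drops out.
step2 = check(r"<p[^>]*>\s*<", wanted, unwanted)

# Step 3: add the span guard; nothing unwanted is left.
step3 = check(r"<p[^>]*>\s*<(?!span)", wanted, unwanted)

for number, failures in enumerate([step1, step2, step3], start=1):
    print(f"step {number}: {len(failures)} failure(s)", failures)
```

Adding every new exception to `unwanted` before changing the pattern makes it obvious when a tweak breaks something that used to work.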

                                  The reasons I am in this forum are to 1) learn more about Notepad++, 2) help others learn more about Notepad++, and 3) enjoy myself while doing so. Unfortunately, repeated questions from the same person just asking us to solve their data transformation needs doesn’t tick boxes #1 or #3 for me; and after a certain number of answers, it becomes obvious that the individual isn’t wanting to learn or isn’t able to learn by the way I answer, so box #2 is failing. If I fail at #2 enough, then eventually, I have to leave room for someone else to succeed on #2 with that person.

                                  So I wish you all the best, but my teaching style doesn’t seem to be helping you, so it’s not doing either of us any good for me to continue. Maybe someone else here will be able to help you learn where I could not. Good luck.

                                  • Scott NielsonS
                                    Scott Nielson @PeterJones
                                    last edited by

                                    @PeterJones the last RegEx I typed above worked for me. Thank you for your time, patience, help and support. I was not sure about it and thought I should ask, but since you dislike that, next time I will ask for a solution here only if I can’t figure out what to do on my own. Thanks again.
