
    How to delete a duplicate paragraph at a particular place in multiple files

      dr ramaanand

      <H1…>Heading1</H1>
      <H2…>Some text</H2>
      <H2…>Different text</H2>
      <H2…>Altogether different text</H2>
      Some paragraphs here
      </P> (or </ul>)
      <P…><span…><b>Please E-mail us</b></span></P>
      <H2…>Heading that should not be reproduced</H2>
      Some paragraphs here
      </ul> (or </P>)
      <P…>We have</P>
      <P …>Some text</P>
      <P …>Different text</P>
      <P …>Same text</P>
      <P …>Same text</P>
      <P…><b><span…>Please E-mail us</span></b></P>

        dr ramaanand @dr ramaanand

        For the above test string, putting (?s)\A.+?\K((<h2.+?</h2>\R)+).*\K(?=<p.*?</p>\R<p.*?Please\s*E-mail\s*us) in the Find field with the Regular expression mode selected lets me find (and remove) the paragraph just before the “Please E-mail us” paragraph. In most files of the folder, that paragraph has the same text as the paragraph above it. In some other files, however, it does not, so how do I avoid finding/removing it when the text differs? I believe the paragraph with the same text was added by Notepad++ during my previous find-and-replace exercise due to a bug.

          dr ramaanand @dr ramaanand

          @dr-ramaanand Please don’t tell me to do it on my own. I have tried and failed already.

            Alan Kilborn @dr ramaanand

            @dr-ramaanand said in How to delete a duplicate paragraph at a particular place in multiple files:

            Please don’t tell me to do it on my own. I have tried and failed already.

            Probably the best thing to do is to seek help on a site that specializes in regular-expression help.

              dr ramaanand @Alan Kilborn

              @Alan-Kilborn I asked at www.regex101.com and was told to put (?s)^(<p.*?<\/p>\R)(\1<p.*?Please\s*E-mail\s*us) in the Find field, select the Regular expression mode, put $2 in the Replace field and hit “Replace All”. All the duplicate paragraphs disappeared.
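
For anyone who wants to try the same replacement outside Notepad++, here is a minimal Python sketch. It only approximates the Notepad++ behaviour, under these assumptions: Python's re module has no \R, so a line break is spelled out as (?:\r\n|\n|\r); re.MULTILINE is needed for ^ to match at line starts; and re.IGNORECASE assumes the search was run case-insensitively, since the sample uses <P while the expression uses <p. The names EOL, DUPLICATE and drop_duplicate_paragraph, and the sample strings, are made up for illustration.

```python
import re

# Python's re has no \R, so a line break is written out explicitly
# (assumption: line endings are \r\n, \n or \r).
EOL = r"(?:\r\n|\n|\r)"

# Group 1: one paragraph plus the line break after it.
# Group 2: an exact repeat of group 1, then the start of the
#          paragraph that contains "Please E-mail us".
DUPLICATE = re.compile(
    r"^(<p.*?</p>" + EOL + r")(\1<p.*?Please\s*E-mail\s*us)",
    re.DOTALL | re.MULTILINE | re.IGNORECASE,
)


def drop_duplicate_paragraph(html: str) -> str:
    """Keep only group 2, which drops the first copy of the repeated paragraph."""
    return DUPLICATE.sub(r"\2", html)


# Hypothetical sample: the paragraph before "Please E-mail us" is repeated.
with_duplicate = (
    '<P class="a">Different text</P>\n'
    '<P class="a">Same text</P>\n'
    '<P class="a">Same text</P>\n'
    '<P class="a"><b><span>Please E-mail us</span></b></P>\n'
)

# Hypothetical sample: no repeated paragraph, so nothing should change.
without_duplicate = (
    '<P class="a">Some text</P>\n'
    '<P class="a">Different text</P>\n'
    '<P class="a"><b><span>Please E-mail us</span></b></P>\n'
)

print(drop_duplicate_paragraph(with_duplicate))
print(drop_duplicate_paragraph(without_duplicate) == without_duplicate)
```

Because \1 has to repeat group 1 exactly, a file where the two paragraphs differ is left alone, which is the behaviour asked for in the original question.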

                Alan Kilborn @dr ramaanand

                @dr-ramaanand said in How to delete a duplicate paragraph at a particular place in multiple files:

                and all the duplicate paragraphs disappeared.

                So that’s good, right?
                What you wanted?

                  dr ramaanand @Alan Kilborn

                  @Alan-Kilborn Yes, and thanks for your time also. Please keep this community going, as there are lots of people who will ask for solutions here (the Notepad++ community)!

                    Alan Kilborn @dr ramaanand

                    @dr-ramaanand

                    Our goal is to get you the best help available.
                    We can answer regex questions here, but the same/similar questions from the same poster get tiring as we are interested in much more diverse Notepad++ topics than just data conversion with regex.
                    So, if we can redirect you to a site where they are excited about regex, and only regex, well, we’ll do that.
                    I think maybe you’ve found a site for that now.
                    But I encourage you to learn to do it yourself – if someone else can write something that works, then so can you!

                      dr ramaanand @Alan Kilborn

                      @Alan-Kilborn I have learnt quite a bit, but not everything, which is why I seek solutions here. Notepad++ has a feature to “delete duplicate lines” in an open file, which is why I asked for a solution here first.

                        dr ramaanand @Alan Kilborn

                        @Alan-Kilborn I can even explain the above. In that RegEx, (?s)^(<p.*?<\/p>\R)(\1<p.*?Please\s*E-mail\s*us): (?s) makes the dot match line breaks as well, so .*? can run across several lines; ^ means at the beginning of a line; (<p.*?<\/p>\R) is the first captured group, from <p...................</p> including the line break that follows it (which is matched by the \R); and the rest is the second captured group, in which \1 matches an exact duplicate of the first captured group, followed by the start of another <p... paragraph up to and including “Please E-mail us”. The \s* before and after “E-mail” lets the words “Please E-mail us” match even if they are on different lines (as well as if they are all on the same line).
                        The $2 in the Replace field (“Replace in Files” in this case) puts back only the second captured group, so the first copy of the duplicated paragraph is dropped.
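
The individual pieces of that explanation can be checked one at a time. The following is a small Python sketch, not Notepad++ itself; it assumes Python's re module treats (?s), backreferences and \s the same way for these simple cases, and all variable names and test strings are invented for illustration.

```python
import re

paragraph = "<p>one\ntwo</p>"

# Without (?s) the dot stops at a line break, so a paragraph
# spread over two lines is not matched.
without_s = re.search(r"<p>.*?</p>", paragraph)

# With (?s) the dot also matches line breaks.
with_s = re.search(r"(?s)<p>.*?</p>", paragraph)

# \1 only matches an exact repeat of what group 1 captured.
repeat = re.search(r"(ab)\1", "abab")
no_repeat = re.search(r"(ab)\1", "abac")

# \s* matches any run of whitespace, including line breaks,
# so the three words may sit on separate lines.
split_words = re.search(r"Please\s*E-mail\s*us", "Please\nE-mail\nus")

print(without_s is None)
print(with_s.group(0) == paragraph)
print(repeat is not None, no_repeat is None)
print(split_words is not None)
```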

                        The Community of users of the Notepad++ text editor.