    How to delete a duplicate paragraph at a particular place in multiple files

    10 Posts 2 Posters 382 Views
      dr ramaanand

      <H1…>Heading1</H1>
      <H2…>Some text</H2>
      <H2…>Different text</H2>
      <H2…>Altogether different text</H2>
      Some paragraphs here
      </P> (or </ul>)
      <P…><span…><b>Please E-mail us</b></span></P>
      <H2…>Heading that should not be reproduced</H2>
      Some paragraphs here
      </ul> (or </P>)
      <P…>We have</P>
      <P …>Some text</P>
      <P …>Different text</P>
      <P …>Same text</P>
      <P …>Same text</P>
      <P…><b><span…>Please E-mail us</span></b></P>

        dr ramaanand @dr ramaanand

        For the above test string, if I put (?s)\A.+?\K((<h2.+?</h2>\R)+).*\K(?=<p.*?</p>\R<p.*?Please\s*E-mail\s*us) in the Find field and select the Regular expression mode, I can find (and remove) the paragraph just before the paragraph with the “Please E-mail us” text. In most files of a folder, that paragraph has the same text as the paragraph above it. In some other files, however, the text is different, so how do I avoid finding/removing the paragraph when it is not a duplicate? I believe the duplicate paragraph was added by Notepad++ during my previous find and replace exercise due to a bug.

          dr ramaanand @dr ramaanand

          @dr-ramaanand Please don’t tell me to do it on my own. I have tried and failed already.

            Alan Kilborn @dr ramaanand

            @dr-ramaanand said in How to delete a duplicate paragraph at a particular place in multiple files:

            Please don’t tell me to do it on my own. I have tried and failed already.

            Probably the best thing to do is to seek help on a site that specializes in regular-expression help.

              dr ramaanand @Alan Kilborn

              @Alan-Kilborn I asked at www.regex101.com and they told me to put (?s)^(<p.*?<\/p>\R)(\1<p.*?Please\s*E-mail\s*us) in the Find field, select the Regular Expression mode, put $2 in the Replace field and hit “Replace All”, and all the duplicate paragraphs disappeared.
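For readers who want to try the replacement outside Notepad++, here is a minimal Python sketch of the same idea. It is not the Notepad++ engine: Python's `re` has no `\R`, so `\r?\n` stands in for it, and the `MULTILINE` and `IGNORECASE` flags are assumptions meant to mimic `^` matching at each line start and a search with "Match case" unticked. The sample text uses bare `<P>` tags because the attributes are elided in the thread, and the names `DUPLICATE_BEFORE_EMAIL` and `remove_duplicate` are hypothetical.

```python
import re

# Python's `re` has no \R, so \r?\n stands in for it (assumption of this sketch).
# MULTILINE lets ^ match at every line start; IGNORECASE mirrors a Notepad++
# search with "Match case" unticked (assumed, since the sample tags are upper-case).
DUPLICATE_BEFORE_EMAIL = re.compile(
    r"^(<p.*?</p>\r?\n)(\1<p.*?Please\s*E-mail\s*us)",
    re.DOTALL | re.MULTILINE | re.IGNORECASE,
)


def remove_duplicate(text: str) -> str:
    """Keep only group 2, which drops the first copy of the repeated paragraph."""
    return DUPLICATE_BEFORE_EMAIL.sub(r"\2", text)


# Tail of the test string from the first post, with the elided attributes left out.
sample = (
    "<P>We have</P>\n"
    "<P>Some text</P>\n"
    "<P>Different text</P>\n"
    "<P>Same text</P>\n"
    "<P>Same text</P>\n"
    "<P><b><span>Please E-mail us</span></b></P>\n"
)

# A file where the paragraph before "Please E-mail us" is NOT a duplicate.
no_duplicate = (
    "<P>Different text</P>\n"
    "<P>Same text</P>\n"
    "<P>Other text</P>\n"
    "<P><b><span>Please E-mail us</span></b></P>\n"
)

cleaned = remove_duplicate(sample)
untouched = remove_duplicate(no_duplicate)

print(cleaned)
print(untouched == no_duplicate)
```

Because `\1` only matches an exact repeat of group 1, a file whose last paragraph differs from the one above it produces no match and is left as it was, which is the behaviour asked for in the question.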

                Alan Kilborn @dr ramaanand

                @dr-ramaanand said in How to delete a duplicate paragraph at a particular place in multiple files:

                and all the duplicate paragraphs disappeared.

                So that’s good, right?
                What you wanted?

                  dr ramaanand @Alan Kilborn

                  @Alan-Kilborn Yes, and thanks for your time as well. Please keep this community going, as there are lots of people who will ask for solutions here (the Notepad++ community)!

                    Alan Kilborn @dr ramaanand

                    @dr-ramaanand

                    Our goal is to get you the best help available.
                    We can answer regex questions here, but the same/similar questions from the same poster get tiring as we are interested in much more diverse Notepad++ topics than just data conversion with regex.
                    So, if we can redirect you to a site where they are excited about regex, and only regex, well, we’ll do that.
                    I think maybe you’ve found a site for that now.
                    But I encourage you to learn to do it yourself – if someone else can write something that works, then so can you!

                      dr ramaanand @Alan Kilborn

                      @Alan-Kilborn I have learnt quite a bit but not everything, which is why I seek solutions here. Notepad++ has a feature to delete duplicate lines in an open file, which is why I asked for a solution here first.

                        dr ramaanand @Alan Kilborn

                        @Alan-Kilborn I can even explain the above. In that RegEx, (?s)^(<p.*?<\/p>\R)(\1<p.*?Please\s*E-mail\s*us), (?s) turns on “single line” mode, in which the dot also matches line-break characters; ^ matches at the beginning of a line; (<p.*?<\/p>\R) is the first captured group, running from <p...................</p> and including the line break that follows (matched by the \R). The rest is the second captured group, in which \1 matches a duplicate of the first captured group and <p.*?Please\s*E-mail\s*us then matches the start of the next paragraph up to and including the words “Please E-mail us”. The \s* before and after “E-mail” lets the words “Please E-mail us” match even if they are all on different lines (as well as if they are all on the same line).
                        The $2 in the Replace field (“Replace in files” in this case) is to reproduce the second captured group in the final result.
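The individual pieces described above can be checked one at a time. The following is a small Python sketch rather than a Notepad++ session: it assumes Python's `re` behaves like the Notepad++ engine for `(?s)`, `\s*` and `\1`, which holds for these simple cases, and it uses `\n` in place of `\R` because `re` does not support `\R`. All variable names are illustrative.

```python
import re

# (?s) is the "dot matches newline" flag. Without it, `.` stops at a line break.
dot_without_s = re.search(r"<p>.*</p>", "<p>one\ntwo</p>")
dot_with_s = re.search(r"(?s)<p>.*</p>", "<p>one\ntwo</p>")

# \s* matches any run of whitespace, including line breaks, so the phrase
# is found whether it sits on one line or is split across several.
phrase = re.compile(r"Please\s*E-mail\s*us")
same_line = phrase.search("Please E-mail us")
split_lines = phrase.search("Please\nE-mail\nus")

# \1 must match exactly the text captured by group 1, so only an identical
# neighbouring paragraph satisfies the pattern.
pair = re.compile(r"^(<p>.*?</p>\n)\1", re.MULTILINE | re.DOTALL)
identical = pair.search("<p>a</p>\n<p>a</p>\n")
different = pair.search("<p>a</p>\n<p>b</p>\n")

print(dot_without_s is None, dot_with_s is not None)
print(same_line is not None, split_lines is not None)
print(identical is not None, different is None)
```

One consequence of `(?s)` worth keeping in mind: the `.*?` in the second group may also cross line breaks, so the text between the duplicate pair and “Please E-mail us” is not limited to a single line.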

                        The Community of users of the Notepad++ text editor.