How to delete a duplicate paragraph at a particular place in multiple files

    dr ramaanand
    last edited by Dec 5, 2022, 9:25 AM

    <H1…>Heading1</H1>
    <H2…>Some text</H2>
    <H2…>Different text</H2>
    <H2…>Altogether different text</H2>
    Some paragraphs here
    </P> (or </ul>)
    <P…><span…><b>Please E-mail us</b></span></P>
    <H2…>Heading that should not be reproduced</H2>
    Some paragraphs here
    </ul> (or </P>)
    <P…>We have</P>
    <P …>Some text</P>
    <P …>Different text</P>
    <P …>Same text</P>
    <P …>Same text</P>
    <P…><b><span…>Please E-mail us</span></b></P>

      dr ramaanand @dr ramaanand
      posted Dec 5, 2022, 9:55 AM, last edited by dr ramaanand Dec 5, 2022, 10:05 AM

      For the above test string, putting (?s)\A.+?\K((<h2.+?</h2>\R)+).*\K(?=<p.*?</p>\R<p.*?Please\s*E-mail\s*us) in the Find field with the Regular expression search mode selected lets me find (and remove) the paragraph just before the paragraph with the “Please E-mail us” text. In most files of the folder, that paragraph has the same text as the paragraph above it. In some other files, however, it does not, so how do I avoid finding/removing it when the text differs? I believe the paragraph with the same text was added by Notepad++ during my previous find-and-replace exercise because of a bug.

        dr ramaanand @dr ramaanand
        last edited by Dec 5, 2022, 10:19 AM

        @dr-ramaanand Please don’t tell me to do it on my own. I have tried and failed already.

          Alan Kilborn @dr ramaanand
          last edited by Dec 5, 2022, 12:19 PM

          @dr-ramaanand said in How to delete a duplicate paragraph at a particular place in multiple files:

          Please don’t tell me to do it on my own. I have tried and failed already.

          Probably the best thing to do is to seek help on a site that specializes in regular-expression help.

            dr ramaanand @Alan Kilborn
            last edited by Dec 5, 2022, 3:52 PM

            @Alan-Kilborn I asked at www.regex101.com and was told to put (?s)^(<p.*?<\/p>\R)(\1<p.*?Please\s*E-mail\s*us) in the Find field, select the Regular expression search mode, put $2 in the Replace field and hit “Replace All”. All the duplicate paragraphs disappeared.
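For readers who want to try the same find/replace outside Notepad++, here is a minimal Python sketch of it. It is an approximation rather than the exact Notepad++ behaviour: Python's `re` module has no `\R`, so the line break is spelled out, and `re.IGNORECASE` is an assumption made so that the upper-case `<P …>` tags in the test string match the lower-case `<p` in the expression. The names `LINE_BREAK`, `DUPLICATE_BEFORE_EMAIL`, `remove_duplicate_paragraph` and `SAMPLE` are made up for this illustration.

```python
import re

# Python's re module does not support \R, so "any line break" is written out.
LINE_BREAK = r"(?:\r\n|\n|\r)"

# Group 1: one <p>...</p> paragraph plus its line break.
# Group 2: an exact repeat of group 1, then the text up to "Please E-mail us".
DUPLICATE_BEFORE_EMAIL = re.compile(
    r"^(<p.*?</p>" + LINE_BREAK + r")(\1<p.*?Please\s*E-mail\s*us)",
    re.DOTALL | re.MULTILINE | re.IGNORECASE,
)


def remove_duplicate_paragraph(text: str) -> str:
    """Keep only group 2, which drops the first copy of the repeated paragraph."""
    return DUPLICATE_BEFORE_EMAIL.sub(r"\2", text)


# Hypothetical sample modelled on the test string in the first post.
SAMPLE = (
    '<P class="x">Different text</P>\n'
    '<P class="x">Same text</P>\n'
    '<P class="x">Same text</P>\n'
    '<P class="x"><b><span>Please E-mail us</span></b></P>\n'
)

cleaned = remove_duplicate_paragraph(SAMPLE)
print(cleaned)
```

A file whose last two paragraphs differ is left untouched, because `\1` only matches an exact repeat. One thing to watch: since `.` also matches line breaks here, the part after `\1` can run across many lines, so two identical adjacent paragraphs anywhere above a later “Please E-mail us” would be collapsed too.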

              Alan Kilborn @dr ramaanand
              last edited by Dec 5, 2022, 3:53 PM

              @dr-ramaanand said in How to delete a duplicate paragraph at a particular place in multiple files:

              and all the duplicate paragraphs disappeared.

              So that’s good, right?
              What you wanted?

                dr ramaanand @Alan Kilborn
                last edited by Dec 5, 2022, 3:55 PM

                @Alan-Kilborn yes and thanks for your time also. Please keep this community going as there are lots of people who will ask for solutions here (notepad++ community)!

                  Alan Kilborn @dr ramaanand
                  last edited by Dec 5, 2022, 4:00 PM

                  @dr-ramaanand

                  Our goals are to get you the best help available.
                  We can answer regex questions here, but the same/similar questions from the same poster get tiring as we are interested in much more diverse Notepad++ topics than just data conversion with regex.
                  So, if we can redirect you to a site where they are excited about regex, and only regex, well, we’ll do that.
                  I think maybe you’ve found a site for that now.
                  But I encourage you to learn to do it yourself – if someone else can write something that works, then so can you!

                    dr ramaanand @Alan Kilborn
                    posted Dec 5, 2022, 5:53 PM, last edited by dr ramaanand Dec 5, 2022, 5:56 PM

                    @Alan-Kilborn I have learnt quite a bit, but not everything, which is why I seek solutions here. Notepad++ has a “delete duplicate lines” feature for an open file, which is why I asked for a solution here first.

                      dr ramaanand @Alan Kilborn
                      posted Dec 6, 2022, 2:22 AM, last edited by dr ramaanand Dec 6, 2022, 2:48 AM

                      @Alan-Kilborn I can even explain the above. In that RegEx, (?s)^(<p.*?<\/p>\R)(\1<p.*?Please\s*E-mail\s*us), (?s) is the “single line” modifier, which lets . match line breaks as well, and ^ anchors the match at the beginning of a line. (<p.*?<\/p>\R) is the first captured group: a <p...................</p> paragraph together with the line break that follows it (matched by the \R). The rest is the second captured group, in which \1 matches an exact repeat of the first captured group, followed by <p and the shortest run of characters leading up to “Please E-mail us”. The \s* before and after “E-mail” matches any amount of whitespace, including line breaks, so the words “Please E-mail us” are found even if they are all on different lines (as well as if they are all on the same line).
                      The $2 in the Replace field (“Replace in Files” in this case) writes back only the second captured group, so the first copy of the duplicated paragraph is dropped from the final result.
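The breakdown above can be checked piece by piece with an annotated version of the expression. This is only a sketch in Python, not Notepad++ itself: `\R` is replaced by an explicit line-break alternative because Python's `re` module lacks it, `re.IGNORECASE` is an assumption, and the names `ANNOTATED` and `text` as well as the sample string are invented for the illustration.

```python
import re

ANNOTATED = re.compile(
    r"""
    ^                            # beginning of a line (MULTILINE)
    (                            # group 1: one paragraph plus its line break
      <p.*?</p>
      (?:\r\n|\n|\r)             # stand-in for \R
    )
    (                            # group 2: the part that $2 writes back
      \1                         # an exact repeat of group 1
      <p.*?Please\s*E-mail\s*us  # shortest run from <p up to the e-mail text
    )
    """,
    re.VERBOSE | re.DOTALL | re.MULTILINE | re.IGNORECASE,
)

# Hypothetical input with "Please", "E-mail" and "us" on separate lines.
text = "<p>Same text</p>\n<p>Same text</p>\n<p><b>Please\nE-mail\nus</b></p>\n"

match = ANNOTATED.search(text)
group1 = match.group(1)  # first copy of the repeated paragraph
group2 = match.group(2)  # second copy plus the start of the e-mail paragraph
replaced = ANNOTATED.sub(r"\2", text)

print(repr(group1))
print(repr(group2))
print(replaced)
```

Printing the two groups shows why replacing the whole match with group 2 removes exactly one copy of the repeated paragraph, and that the `\s*` parts let the phrase match across line breaks.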
