How to delete a duplicate paragraph at a particular place in multiple files
-
<H1…>Heading1</H1>
<H2…>Some text</H2>
<H2…>Different text</H2>
<H2…>Altogether different text</H2>
Some paragraphs here
</P> (or </ul>)
<P…><span…><b>Please E-mail us</b></span></P>
<H2…>Heading that should not be reproduced</H2>
Some paragraphs here
</ul> (or </P>)
<P…>We have</P>
<P …>Some text</P>
<P …>Different text</P>
<P …>Same text</P>
<P …>Same text</P>
<P…><b><span…>Please E-mail us</span></b></P>
-
For the above test string, if I put
(?s)\A.+?\K((<h2.+?</h2>\R)+).*\K(?=<p.*?</p>\R<p.*?Please\s*E-mail\s*us)
in the Find field and select the Regular expression mode, I can find (and remove) the paragraph just before the “Please E-mail us” paragraph, because in most files of the folder it has the same text as the paragraph above it. However, in some other files it does not have the same text, so how do I avoid finding/removing it in those cases? I believe this duplicate paragraph was added by Notepad++ during my previous find-and-replace exercise due to a bug.
-
@dr-ramaanand Please don’t tell me to do it on my own. I have tried and failed already.
-
@dr-ramaanand said in How to delete a duplicate paragraph at a particular place in multiple files:
Please don’t tell me to do it on my own. I have tried and failed already.
Probably the best thing to do is to seek help on a site that specializes in regular-expression help.
-
@Alan-Kilborn I asked at www.regex101.com and they told me to put
(?s)^(<p.*?<\/p>\R)(\1<p.*?Please\s*E-mail\s*us)
in the Find field, select the Regular Expression mode, put $2 in the Replace field and hit “Replace All”, and all the duplicate paragraphs disappeared.
-
@dr-ramaanand said in How to delete a duplicate paragraph at a particular place in multiple files:
and all the duplicate paragraphs disappeared.
So that’s good, right?
What you wanted?
-
@Alan-Kilborn Yes, and thanks for your time. Please keep this community going, as there are lots of people who will ask for solutions here (Notepad++ community)!
-
Our goal is to get you the best help available.
We can answer regex questions here, but the same/similar questions from the same poster get tiring as we are interested in much more diverse Notepad++ topics than just data conversion with regex.
So, if we can redirect you to a site where they are excited about regex, and only regex, well, we’ll do that.
I think maybe you’ve found a site for that now.
But I encourage you to learn to do it yourself – if someone else can write something that works, then so can you!
-
@Alan-Kilborn I have learnt quite a bit, but not everything, which is why I seek solutions here. Notepad++ has a “delete duplicate lines” feature for an open file, which is why I asked for a solution here first.
-
@Alan-Kilborn I can even explain the above. In that RegEx,
(?s)^(<p.*?<\/p>\R)(\1<p.*?Please\s*E-mail\s*us)
(?s) makes the dot match line breaks as well, so .*? can span several lines. ^ means the beginning of a line. (<p.*?<\/p>\R) is the first captured group: everything from <p to </p>, plus the line break that follows it (matched by \R). The rest is the second captured group, in which \1 looks for an exact duplicate of the first captured group, followed by the opening <p of the next paragraph and everything up to “Please E-mail us”. The \s* before and after “E-mail” matches any amount of whitespace, including line breaks, so the words “Please E-mail us” are found whether they are all on one line or spread over different lines.
The $2 in the Replace field (“Replace in files” in this case) reproduces the second captured group in the final result, so the first copy of the duplicated paragraph is dropped.
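For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch of the replacement described above. It is an adaptation, not the exact regex101 answer: Python's re module has no \R, so (?:\r\n|\n|\r) stands in for it, the m flag is added so that ^ matches at every line start, and the i flag assumes "Match case" is unticked in Notepad++. The function name and the sample text are made up for illustration.

```python
import re

# Adapted pattern (assumptions: \R replaced by an explicit line-break group,
# (?m) so ^ matches at each line start, (?i) to mimic "Match case" being off).
PATTERN = re.compile(
    r"(?ims)^(<p.*?</p>(?:\r\n|\n|\r))(\1<p.*?Please\s*E-mail\s*us)"
)


def drop_duplicate_paragraph(text: str) -> str:
    """Keep group 2 only, which drops the first copy of a repeated paragraph
    that sits directly before the "Please E-mail us" paragraph."""
    return PATTERN.sub(r"\2", text)


# Hypothetical sample modelled on the test string in the first post.
sample = (
    "<p>Different text</p>\n"
    "<p>Same text</p>\n"
    "<p>Same text</p>\n"
    "<p><b><span>Please E-mail us</span></b></p>\n"
)

result = drop_duplicate_paragraph(sample)
print(result)
```

With this sample only one "Same text" paragraph remains. If the two paragraphs before "Please E-mail us" differ, \1 cannot match and the text is left untouched, which is the behaviour asked for in the question.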