How to remove duplicate entries in each line
-
I’ve an HTML file with duplicate href attributes like this:
<a href="https://example.com/" href="https://example.com/" target="_blank">Example</a>
<a href="https://website.com/" href="https://website.com/" target="_blank">Website</a>
<a href="https://sample.com/" href="https://sample.com/" target="_blank">Sample</a>
How can I remove the duplicate href attribute in each line using Notepad++?
-
Maybe try:
find:
(?-s)(href="https://.+?\.com/" )(?=\1)
repl: nothing
mode: Regular expression
-
Hi @alan-kilborn, thanks a lot for the reply. Appreciate your time.
I forgot to mention that there are various domains like .org, .co, .gov etc. in the file. Is there a regex that handles all of these?
-
@zcraber said in How to remove duplicate entries in each line:
I forgot to mention that
Changes spec after solution is provided. :-(
-
@zcraber said in How to remove duplicate entries in each line:
there are various domains like .org, .co, .gov etc in the file
(?-s)(href="https://.+?\.(?:com|org|gov)/" )(?=\1)
-
@alan-kilborn Thank you.
Changes spec after solution is provided. :-(
Sorry about that. Next time I’ll be more specific. :)
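-
For anyone who wants to run the same cleanup outside Notepad++, here is a minimal Python sketch of the capture-plus-lookahead idea from the regexes above. It is an illustration under stated assumptions, not the method posted in the thread: the function name `drop_duplicate_hrefs` is made up, and the domain alternation is swapped for `[^"]*` so that any href value qualifies (including `.co`, which the alternation above does not list). The `(?-s)` prefix is left out because Python's `re` does not accept it as a global flag, and this pattern does not use `.` anyway.

```python
import re

# Capture one href="..." attribute together with its trailing space, then
# require (via lookahead) that an identical copy follows immediately.
# Assumption: [^"]* replaces the per-domain alternation from the thread,
# so the match does not depend on the TLD.
DUP_HREF = re.compile(r'(href="[^"]*" )(?=\1)')


def drop_duplicate_hrefs(text: str) -> str:
    """Delete an href attribute whenever the same attribute directly follows it."""
    return DUP_HREF.sub("", text)


sample = (
    '<a href="https://example.org/" href="https://example.org/" '
    'target="_blank">Example</a>'
)
print(drop_duplicate_hrefs(sample))
# prints: <a href="https://example.org/" target="_blank">Example</a>
```

Only exact, adjacent duplicates are removed: two different href values next to each other are left alone, since the lookahead `(?=\1)` has to match the captured text character for character.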