How to remove duplicate entries in each line
-
I’ve an HTML file with duplicate href attributes like this:
<a href="https://example.com/" href="https://example.com/" target="_blank">Example</a>
<a href="https://website.com/" href="https://website.com/" target="_blank">Website</a>
<a href="https://sample.com/" href="https://sample.com/" target="_blank">Sample</a>
How can I remove the duplicate href attribute in each line using Notepad++?
-
Maybe try:
find:
(?-s)(href="https://.+?\.com/" )(?=\1)
repl: nothing
mode: Regular expression
-
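To see what the suggested search does outside Notepad++, here is a minimal Python sketch of the same pattern. It is only an illustration: the function name remove_duplicate_href is made up for this example, and the leading (?-s) is dropped because Python's re module does not accept it as a global flag and its "." already stops at line breaks by default.

```python
import re

# Python port of the pattern from the reply above.
# Group 1 captures one href attribute plus its trailing space; the lookahead
# (?=\1) only lets it match when an identical copy follows immediately.
DUP_HREF = re.compile(r'(href="https://.+?\.com/" )(?=\1)')


def remove_duplicate_href(line: str) -> str:
    """Delete the first of two identical, adjacent .com href attributes."""
    return DUP_HREF.sub("", line)


sample = (
    '<a href="https://example.com/" href="https://example.com/" '
    'target="_blank">Example</a>'
)
print(remove_duplicate_href(sample))
```

Because the replacement is empty, the first copy of the attribute is removed and the second one stays in place.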
Hi @alan-kilborn, thanks a lot for the reply. Appreciate your time.
I forgot to mention that there are various domains like .org, .co, .gov etc. in the file. Is there a regex that handles all of these?
-
@zcraber said in How to remove duplicate entries in each line:
I forgot to mention that
Changes spec after solution is provided. :-(
-
@zcraber said in How to remove duplicate entries in each line:
there are various domains like .org, .co, .gov etc in the file
(?-s)(href="https://.+?\.(?:com|org|gov)/" )(?=\1)
-
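Note that the alternation above lists com, org and gov, so a .co link would not be matched by it. One possible way to avoid listing domains at all is to match any quoted href value, sketched below in Python. This variant is an assumption, not something posted in the thread, and the names ANY_DUP_HREF and dedupe_href are hypothetical. The same expression without the Python wrapper, (href="[^"]*" )(?=\1), should in principle also work in Notepad++'s Regular expression mode, but that has not been tested here.

```python
import re

# Domain-agnostic variant (an assumption, not from the thread):
# [^"]* matches whatever is between the quotes, so no TLD list is needed.
# The lookahead still requires an identical attribute to follow directly.
ANY_DUP_HREF = re.compile(r'(href="[^"]*" )(?=\1)')


def dedupe_href(text: str) -> str:
    """Drop the first of two identical, adjacent href attributes."""
    return ANY_DUP_HREF.sub("", text)


lines = [
    '<a href="https://site.co/" href="https://site.co/" target="_blank">Site</a>',
    '<a href="https://agency.gov/" href="https://agency.gov/" target="_blank">Agency</a>',
]
for line in lines:
    print(dedupe_href(line))
```

Two href attributes with different values are left untouched, since the backreference only matches an exact repeat.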
@alan-kilborn Thank you.
Changes spec after solution is provided. :-(
Sorry about that. Next time I’ll be more specific. :)