How to remove duplicate entries in each line

    zcraber
    posted Feb 3, 2022, 1:06 PM, last edited by zcraber Feb 3, 2022, 1:07 PM

    I have an HTML file in which each link carries a duplicated href attribute, like this:

    <a href="https://example.com/" href="https://example.com/" target="_blank">Example</a>
    <a href="https://website.com/" href="https://website.com/" target="_blank">Website</a>
    <a href="https://sample.com/" href="https://sample.com/" target="_blank">Sample</a>
    

    How can I remove the duplicate href attribute in each line using Notepad++?

      Alan Kilborn @zcraber
      posted Feb 3, 2022, 1:19 PM, last edited by Alan Kilborn Feb 3, 2022, 1:20 PM

      @zcraber

      Maybe try:

      find: (?-s)(href="https://.+?\.com/" )(?=\1)
      repl: (leave empty)
      mode: Regular expression
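
To try the same idea outside the editor, here is a minimal Python sketch. It assumes Python's `re` module treats this pattern the same way Notepad++ does. The `(?-s)` prefix is omitted because Python's `.` does not match line breaks unless `re.DOTALL` is set. The name `dedupe_href` is made up for illustration.

```python
import re

# Capture one href attribute (with its trailing space) and match it only
# when an identical copy follows immediately. The lookahead does not
# consume the copy, so replacing the match with "" keeps one attribute.
PATTERN = re.compile(r'(href="https://.+?\.com/" )(?=\1)')


def dedupe_href(line: str) -> str:
    """Drop the first of two identical, adjacent href attributes."""
    return PATTERN.sub("", line)


sample = (
    '<a href="https://example.com/" href="https://example.com/" '
    'target="_blank">Example</a>'
)
print(dedupe_href(sample))
```

The backreference `\1` inside the lookahead is what limits the match to true duplicates: a second href with a different URL is left alone.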

        zcraber @Alan Kilborn
        posted Feb 3, 2022, 1:27 PM, last edited by zcraber Feb 3, 2022, 1:27 PM

        Hi @alan-kilborn, thanks a lot for the reply. Appreciate your time.
        I forgot to mention that there are various domains like .org, .co, .gov etc in the file.

        Is there a regex that handles all of these?

          Alan Kilborn @zcraber
          posted Feb 3, 2022, 1:30 PM

          @zcraber said in How to remove duplicate entries in each line:

          I forgot to mention that

          Changes spec after solution is provided. :-(

            Alan Kilborn @zcraber
            posted Feb 3, 2022, 1:32 PM, last edited by Alan Kilborn Feb 3, 2022, 1:32 PM

            @zcraber said in How to remove duplicate entries in each line:

            there are various domains like .org, .co, .gov etc in the file

            (?-s)(href="https://.+?\.(?:com|org|gov|co)/" )(?=\1)
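
An alternation only covers the suffixes it lists, so each new domain type means another edit to the pattern. As a hedged alternative (not something proposed in the thread), the sketch below matches any quoted href value and relies on the backreference alone to find the duplicate. It assumes the two copies are byte-for-byte identical and separated by exactly one space, as in the sample lines. The function name and the example URLs are made up for illustration.

```python
import re

# Any double-quoted href value, whatever its scheme or domain, matched only
# when an identical copy (including the trailing space) follows immediately.
ANY_HREF = re.compile(r'(href="[^"]*" )(?=\1)')


def dedupe_any_href(text: str) -> str:
    """Remove every href attribute that is directly followed by its twin."""
    return ANY_HREF.sub("", text)


lines = [
    '<a href="https://example.org/" href="https://example.org/" target="_blank">Org</a>',
    '<a href="https://example.co/" href="https://example.co/" target="_blank">Co</a>',
    '<a href="https://example.gov/" href="https://example.gov/" target="_blank">Gov</a>',
]
for line in lines:
    print(dedupe_any_href(line))
```

In Notepad++ the equivalent find expression would presumably be `(?-s)(href="[^"]*" )(?=\1)` with an empty replacement, but it is worth testing on a copy of the file first.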

              zcraber @Alan Kilborn
              posted Feb 3, 2022, 1:37 PM

              @alan-kilborn Thank you.

              "Changes spec after solution is provided. :-("

              Sorry about that. Next time I’ll be more specific. :)

              The Community of users of the Notepad++ text editor.