    How to remove duplicate entries in each line

    • zcraber
      zcraber last edited by zcraber

      I have an HTML file with duplicate href attributes like this:

      <a href="https://example.com/" href="https://example.com/" target="_blank">Example</a>
      <a href="https://website.com/" href="https://website.com/" target="_blank">Website</a>
      <a href="https://sample.com/" href="https://sample.com/" target="_blank">Sample</a>
      

      How can I remove the duplicate href attribute in each line using Notepad++?

      • Alan Kilborn
        Alan Kilborn @zcraber last edited by Alan Kilborn

        @zcraber

        Maybe try:

        find: (?-s)(href="https://.+?\.com/" )(?=\1)
        repl: nothing
        mode: Regular expression
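The find/replace above can be tried outside Notepad++ as well. Below is a minimal Python sketch, assuming the three sample lines from the question as input. The `(?-s)` prefix is dropped because Python's `.` already excludes newlines by default; the names `PATTERN`, `lines` and `cleaned` are illustrative only.

```python
import re

# Same pattern as in the post, without the (?-s) prefix.
# Group 1 captures one href attribute plus its trailing space; the lookahead
# (?=\1) only lets it match when an identical copy follows immediately.
PATTERN = re.compile(r'(href="https://.+?\.com/" )(?=\1)')

lines = [
    '<a href="https://example.com/" href="https://example.com/" target="_blank">Example</a>',
    '<a href="https://website.com/" href="https://website.com/" target="_blank">Website</a>',
    '<a href="https://sample.com/" href="https://sample.com/" target="_blank">Sample</a>',
]

# Replacing the match with an empty string removes the first copy and leaves
# the second one in place.
cleaned = [PATTERN.sub("", line) for line in lines]

for line in cleaned:
    print(line)
```

Because the lookahead does not consume the second copy, exactly one href attribute remains per line.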

        • zcraber
          zcraber @Alan Kilborn last edited by zcraber

          Hi @alan-kilborn, thanks a lot for the reply. Appreciate your time.
          I forgot to mention that the file also contains other top-level domains, such as .org, .co, .gov, etc.

          Is there a regex that handles all of these?

          • Alan Kilborn
            Alan Kilborn @zcraber last edited by

            @zcraber said in How to remove duplicate entries in each line:

            I forgot to mention that

            Changes spec after solution is provided. :-(

            • Alan Kilborn
              Alan Kilborn @zcraber last edited by Alan Kilborn

              @zcraber said in How to remove duplicate entries in each line:

              there are various domains like .org, .co, .gov etc in the file

              (?-s)(href="https://.+?\.(?:com|org|gov)/" )(?=\1)
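The alternation above covers .com, .org and .gov; any other domain ending, such as .co, would have to be added to the list. As a hedged alternative that is not from the thread, the URL can be matched as "everything up to the closing quote", which avoids listing endings at all. The pattern and the helper name `dedupe_href` below are assumptions for illustration, sketched in Python; the same pattern would probably work in Notepad++'s Regular expression mode, but that has not been verified here.

```python
import re

# Assumed generalisation: [^"]* matches any attribute value, so no list of
# domain endings is needed. The lookahead (?=\1) still requires an identical
# href attribute to follow immediately.
ANY_HREF = re.compile(r'(href="[^"]*" )(?=\1)')


def dedupe_href(line: str) -> str:
    """Remove an href attribute that is immediately repeated verbatim."""
    return ANY_HREF.sub("", line)


print(dedupe_href('<a href="https://site.co/" href="https://site.co/" target="_blank">Site</a>'))
print(dedupe_href('<a href="https://agency.gov/" href="https://agency.gov/" target="_blank">Agency</a>'))
```

Lines whose two href values differ are left untouched, since the backreference only matches an exact repeat.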

              • zcraber
                zcraber @Alan Kilborn last edited by

                @alan-kilborn Thank you.

                Changes spec after solution is provided. :-(

                Sorry about that. Next time I’ll be more specific. :)
