    How to remove duplicate entries in each line

    • zcraber

      I have an HTML file in which the href attribute is duplicated on each line, like this:

      <a href="https://example.com/" href="https://example.com/" target="_blank">Example</a>
      <a href="https://website.com/" href="https://website.com/" target="_blank">Website</a>
      <a href="https://sample.com/" href="https://sample.com/" target="_blank">Sample</a>
      

      How can I remove the duplicate href attribute in each line using Notepad++?

      • Alan Kilborn @zcraber

        @zcraber

        Maybe try:

        find: (?-s)(href="https://.+?\.com/" )(?=\1)
        repl: (leave empty)
        mode: Regular expression
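For anyone who wants to check the pattern outside Notepad++, here is a minimal sketch in Python. It assumes the three sample lines from the question as input; the function name `remove_duplicate_href` is made up for illustration. The leading `(?-s)` is dropped because Python's `.` already stops at newlines unless `re.DOTALL` is passed. In Notepad++ that prefix keeps the ". matches newline" checkbox from changing the result.

```python
import re

# Same pattern as the Notepad++ search, minus the leading (?-s).
# Group 1 captures one href attribute plus its trailing space; the
# lookahead (?=\1) only succeeds when an identical copy follows directly.
PATTERN = re.compile(r'(href="https://.+?\.com/" )(?=\1)')

sample = (
    '<a href="https://example.com/" href="https://example.com/" target="_blank">Example</a>\n'
    '<a href="https://website.com/" href="https://website.com/" target="_blank">Website</a>\n'
    '<a href="https://sample.com/" href="https://sample.com/" target="_blank">Sample</a>'
)


def remove_duplicate_href(text: str) -> str:
    """Delete the first of two identical, adjacent href attributes."""
    return PATTERN.sub("", text)


cleaned = remove_duplicate_href(sample)
print(cleaned)
```

Because the lookahead consumes nothing, only the first copy is removed and the second one stays in place, so each line ends up with a single href.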

        • zcraber @Alan Kilborn

          Hi @alan-kilborn, thanks a lot for the reply. Appreciate your time.
          I forgot to mention that there are various domains like .org, .co, .gov etc in the file.

          Is there a regex that handles all of these?

          • Alan Kilborn @zcraber

            @zcraber said in How to remove duplicate entries in each line:

            I forgot to mention that

            Changes spec after solution is provided. :-(

            • Alan Kilborn @zcraber

              @zcraber said in How to remove duplicate entries in each line:

              there are various domains like .org, .co, .gov etc in the file

              (?-s)(href="https://.+?\.(?:com|org|co|gov)/" )(?=\1)
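The alternation above lists each top-level domain explicitly, so any domain not in the list is left untouched. One possible generalization, which is an assumption of this note and not something proposed in the thread, is to accept any quoted href value and rely on the `\1` lookahead alone to decide whether the next attribute is a true duplicate. A rough Python sketch, with made-up sample lines and the hypothetical name `ANY_HREF`:

```python
import re

# Hypothetical generalization (not from the thread): match any quoted
# href value instead of enumerating top-level domains. [^"]* cannot run
# past the closing quote, so the capture stays within one attribute.
ANY_HREF = re.compile(r'(href="[^"]*" )(?=\1)')

# Illustrative input only; these lines are not from the original post.
lines = [
    '<a href="https://example.org/" href="https://example.org/" target="_blank">Org</a>',
    '<a href="https://example.co/" href="https://example.co/" target="_blank">Co</a>',
    '<a href="https://example.gov/" href="https://example.gov/" target="_blank">Gov</a>',
    '<a href="https://a.com/" href="https://b.com/" target="_blank">Different</a>',
]

cleaned = [ANY_HREF.sub("", line) for line in lines]

for line in cleaned:
    print(line)
```

The last sample line has two different href values, so the lookahead fails and the line is left unchanged. The trade-off is that this version no longer checks that the value looks like an https URL at all.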

              • zcraber @Alan Kilborn

                @alan-kilborn Thank you.

                "Changes spec after solution is provided. :-("

                Sorry about that. Next time I’ll be more specific. :)

