
    Remove duplicate links from the end - Notepad++

    • R
      robaned

      Below are links with duplicate file names.
      Only the ID in the path differs; the file names are identical.
      For example:

      https://mysite.to/73cn05wqida5/Fabulous.WEB.H264-BRC.mp4.html
      https://mysite.to/lg0g5t7bc6d8/Fabulous.WEB.H264-BRC.mp4.html
      https://mysite.to/1eyjovgzk1f5/Fabulous.720p.WEB.H264-OND.mkv.html
      https://mysite.to/gjm1xuyxmgy5/Fabulous.720p.WEB.H264-OND.mkv.html
      https://mysite.to/lwjny2xiuatk/Fabulous.1080p.AMZN.WEBRip.DD5.1.x264-TTH.mkv.html
      https://mysite.to/6aivx4f1xe86/Fabulous.1080p.AMZN.WEBRip.DD5.1.x264-TTH.mkv.html
      

      I want the duplicate links removed, like this:

      https://mysite.to/73cn05wqida5/Fabulous.WEB.H264-BRC.mp4.html
      https://mysite.to/gjm1xuyxmgy5/Fabulous.720p.WEB.H264-OND.mkv.html
      https://mysite.to/lwjny2xiuatk/Fabulous.1080p.AMZN.WEBRip.DD5.1.x264-TTH.mkv.html
      
      • Terry RT
        Terry R @robaned

        @robaned
        I suspect you just made these up and weren’t consistent.
        For the first two lines you kept the first one and treated the second as the duplicate, and you did the same for the 5th and 6th lines. For the 3rd and 4th lines, however, you kept the 4th.

        Unless you can identify why you need to do that instead of just selecting the first of any duplicates to remain, no one is going to be able to help you.

        Terry

        • R
          robaned

          It doesn’t matter which links are removed; I just want the duplicates removed.

          • Terry RT
            Terry R @robaned

            @robaned

            Any solution will always keep the same position (first or last) in each duplicate set. What you also haven’t told us is whether a set can contain more than 2 duplicate lines.

            Terry

            • R
              robaned

              Yes, there are more than two duplicate lines.

              • Terry RT
                Terry R @robaned

                @robaned
                This solution will keep the last of the duplicate lines for each set.

                This is a regular expression (regex), so the search mode in the Replace dialog must be set to “Regular expression”. Place the cursor at the start of the first line and click Replace All.
                Find What: (?-s)^.+?/([^/]+)\R(?=.+?\1)
                Replace With: (leave this field empty)

                Note this will only remove duplicates that sit on adjacent lines, leaving the last line of each set.
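
For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch of the keep-last approach. It is an adaptation, not the exact Notepad++ regex: Python's `re` module has no `\R`, so `\r?\n` stands in for it, `(?-s)` is dropped because dot-matches-newline is off by default, and the character class is tightened to `[^/\r\n]` so the captured name cannot include a line break. The names `KEEP_LAST`, `keep_last_of_adjacent` and `SAMPLE` are made up for this example.

```python
import re

# Adapted from (?-s)^.+?/([^/]+)\R(?=.+?\1); see the assumptions above.
# re.MULTILINE makes ^ match at the start of every line.
KEEP_LAST = re.compile(r"^.+?/([^/\r\n]+)\r?\n(?=.+?\1)", re.MULTILINE)


def keep_last_of_adjacent(text: str) -> str:
    """Delete each line whose file name reappears on the very next line."""
    return KEEP_LAST.sub("", text)


# The six links from the first post, one per line.
SAMPLE = (
    "https://mysite.to/73cn05wqida5/Fabulous.WEB.H264-BRC.mp4.html\n"
    "https://mysite.to/lg0g5t7bc6d8/Fabulous.WEB.H264-BRC.mp4.html\n"
    "https://mysite.to/1eyjovgzk1f5/Fabulous.720p.WEB.H264-OND.mkv.html\n"
    "https://mysite.to/gjm1xuyxmgy5/Fabulous.720p.WEB.H264-OND.mkv.html\n"
    "https://mysite.to/lwjny2xiuatk/Fabulous.1080p.AMZN.WEBRip.DD5.1.x264-TTH.mkv.html\n"
    "https://mysite.to/6aivx4f1xe86/Fabulous.1080p.AMZN.WEBRip.DD5.1.x264-TTH.mkv.html\n"
)

if __name__ == "__main__":
    # Prints the 2nd, 4th and 6th links (the last line of each set).
    print(keep_last_of_adjacent(SAMPLE), end="")
```

Because the lookahead only inspects the next line, duplicates separated by other lines are left alone, which matches the note above.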

                Terry

                • guy038G
                  guy038

                  Hello @robaned, @terry-r and All,

                  Terry, your regex (?-s)^.+?/([^/]+)\R(?=.+?\1) works as expected. However, you could still shorten it!

                  Indeed, if you begin your regex with (?-s)^.+/, the greedy .+ runs up to the last / of the line, so the remainder of the current line cannot contain any / character!

                  Thus, your search regex can be simplified to :

                  (?-s)^.+/(.+)\R(?=.+\1)
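
The greedy-match argument can be checked quickly in Python. This is only a sketch under the same assumptions as any port out of Notepad++: `\R` is written as `\r?\n`, `(?-s)` is omitted because it is Python's default, and the original class is tightened to `[^/\r\n]`. The names `ORIGINAL`, `SHORTENED`, `LINE`, `PAIR` and `greedy_name` are invented for the example.

```python
import re

# Both patterns adapted for Python's re module (see assumptions above).
ORIGINAL = re.compile(r"^.+?/([^/\r\n]+)\r?\n(?=.+?\1)", re.MULTILINE)
SHORTENED = re.compile(r"^.+/(.+)\r?\n(?=.+\1)", re.MULTILINE)

LINE = "https://mysite.to/73cn05wqida5/Fabulous.WEB.H264-BRC.mp4.html\n"

# Greedy .+/ consumes everything through the LAST slash on the line,
# so group 1 is the bare file name with no slash in it.
greedy_name = re.match(r"^.+/(.+)", LINE).group(1)

# Two adjacent duplicates taken from the thread.
PAIR = LINE + "https://mysite.to/lg0g5t7bc6d8/Fabulous.WEB.H264-BRC.mp4.html\n"

if __name__ == "__main__":
    print(greedy_name)
    # Both patterns should leave the same text behind.
    print(ORIGINAL.sub("", PAIR) == SHORTENED.sub("", PAIR))
```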


                  I also found another solution, which could be faster when there are numerous duplicates:

                  FIND (?-s)^(.+/(.+)\R)(.+/\2\R)+

                  REPLACE $1

                  My solution does the opposite of yours: it keeps the first duplicate line of each set!
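
A rough Python equivalent of this keep-first variant, for readers who want to test it on a file outside Notepad++. It is a sketch with the usual adaptations: `\R` becomes `\r?\n`, `(?-s)` is dropped as it is the default in Python, the repeated group is made non-capturing, and `$1` is written `\1` in the replacement. `KEEP_FIRST` and `keep_first_of_adjacent` are hypothetical names.

```python
import re

# Group 1 is the whole first line (with its line break), group 2 its file name.
# The trailing (?:...)+ swallows every following line that ends in the same name.
KEEP_FIRST = re.compile(r"^(.+/(.+)\r?\n)(?:.+/\2\r?\n)+", re.MULTILINE)


def keep_first_of_adjacent(text: str) -> str:
    """Collapse each run of adjacent same-name lines to its first line."""
    return KEEP_FIRST.sub(r"\1", text)


if __name__ == "__main__":
    sample = (
        "https://mysite.to/1eyjovgzk1f5/Fabulous.720p.WEB.H264-OND.mkv.html\n"
        "https://mysite.to/gjm1xuyxmgy5/Fabulous.720p.WEB.H264-OND.mkv.html\n"
    )
    # Only the first of the two links is printed.
    print(keep_first_of_adjacent(sample), end="")
```

One thing to keep in mind with this form: every line in a run, including the last one, must end with a line break, so the final line of the file needs a trailing newline to be matched.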

                  Best Regards,

                  guy038
