
    remove duplicated line

    12 Posts 6 Posters 905 Views
    • EkopalypseE
      Ekopalypse @pinuzzu99
      last edited by

      @pinuzzu99

      If you don't need to keep the original order, you can use:
      Edit->Line Operations->Sort Lines …
      Edit->Line Operations->Remove Consecutive Duplicate Lines
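For anyone who wants to see what these two menu steps amount to, here is a minimal Python sketch. It assumes the text is already in memory as a list of lines and that Python's default string ordering is an acceptable stand-in for whichever sort command is chosen; the function name `sort_and_dedupe` is made up for this illustration.

```python
from itertools import groupby


def sort_and_dedupe(lines):
    """Sort the lines, then collapse each run of identical neighbours.

    Assumption: Python's default string ordering stands in for the
    editor's sort. The original line order is lost.
    """
    ordered = sorted(lines)
    # After sorting, duplicates are adjacent, so keeping one line per
    # group of equal neighbours removes every duplicate.
    return [line for line, _run in groupby(ordered)]


if __name__ == "__main__":
    sample = ["b@x.com:2", "a@x.com:1", "b@x.com:2", "c@x.com:3", "a@x.com:1"]
    print(sort_and_dedupe(sample))
```

Sorting first is what makes the "consecutive" duplicate removal sufficient, which is also why this route cannot preserve the original order.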

      • pinuzzu99P
        pinuzzu99
        last edited by

        Oh great, thanks.
        Anyway, I need a regex string that deletes my duplicate lines without changing the order…

        • rinku singhR
          rinku singh @pinuzzu99
          last edited by

          @pinuzzu99
          Use the Remove Duplicate Lines plugin.

          • PeterJonesP
            PeterJones
            last edited by

            @pinuzzu99 ,

            If you are willing to hit Replace All multiple times, until all duplicates are removed, this worked for me with your example:

            • FIND = (?s)((^.*?$)\R.*)\R*\2(\R|\Z)
            • REPLACE = $1
            • MODE = regular expression

            After three runs, it had become:

            dangsjceamkales@gsnail.com:c6718e7c
            Tom34f@sogbug.com:y7vk5z9292
            zesorex@gmail.com:ploksfasd
            j096875244@gmail.com:st608g410000
            doniel.ctz@homail.com:Cotvxbza22523286
            levjaamel@hetmail.com:camxmel2004
            Andrewhsjfmesjones00@yahoo.com:Winpfgston99001
            szaborefeupert666@gail.com:Rupejffgano666
            jodgsjny0531@cofx.net:Draskakgon357
            wse_adgel_one@hogmail.com:6947903024
            jringahdhsque@hotmail.com:nadfjddkalgo
            
            

            … which I think is what you wanted.

            But yes, @gurikbal-singh’s Remove Duplicate Lines plugin should do what you want, too. Just go to Plugins > Plugins Admin to install it.

            • pinuzzu99P
              pinuzzu99
              last edited by

              Oh yes, PeterJones, it works well! Thanks.
              But I have to click each time to delete one row at a time… and what if I had 5000 duplicate rows?
              Isn’t there a single command to bulk-remove everything in one go?

              And thanks for the tip about the “Remove Duplicate Lines” plugin. I didn’t know it existed; I’ll try it now. Thank you.

              • PeterJonesP
                PeterJones
                last edited by

                @pinuzzu99 said in remove duplicated line:

                But I have to click each time to delete one row at a time… and what if I had 5000 duplicate rows?
                Isn’t there a single command to bulk-remove everything in one go?

                Regexes aren’t infinitely powerful. You can do a lot with them, but for really complicated tasks it is sometimes better to use a full-blown programming language (which is what the plugin does, obviously).

                For example, in Perl, run from the command line, it could be done with a readable 3-line script, or with the condensed one-liner perl -pi.bak -e "chomp($k=$_);$_=''if$h{$k};++$h{$k}" filename, which saves the original to filename.bak and drops the duplicate lines when regenerating filename, assuming there is enough memory to hold the hash (map) that tracks the lines already seen.

                If memory became a concern, you could trade speed for memory and store a shorter key per line (maybe computed with crc32 or a similar algorithm), so that the keys are short enough not to exhaust your memory. Note that such a mapping is only approximately 1:1, since two different lines can occasionally produce the same key. But this isn’t a general programming-help forum, so I won’t go any further than that.
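The same idea as the Perl one-liner can be sketched in Python, for readers more at home there. This is an illustration under stated assumptions, not what the plugin does internally: the name `dedupe_keep_first` and the `compact_keys` switch are hypothetical, and the short-key variant uses a 128-bit BLAKE2 digest rather than the crc32 mentioned above, because a 32-bit checksum collides far more readily on large inputs.

```python
import hashlib


def dedupe_keep_first(lines, compact_keys=False):
    """Yield each line the first time it is seen, preserving order.

    compact_keys=True stores a 16-byte digest instead of the full line,
    which bounds the memory used per entry. Assumption: a rare digest
    collision, which would drop a non-duplicate line, is acceptable.
    """
    seen = set()
    for line in lines:
        # Compare without the line break, similar to Perl's chomp.
        key = line.rstrip("\r\n")
        if compact_keys:
            key = hashlib.blake2b(key.encode("utf-8"), digest_size=16).digest()
        if key not in seen:
            seen.add(key)
            yield line


if __name__ == "__main__":
    sample = ["a\n", "b\n", "a\n", "c\n", "b\n"]
    print(list(dedupe_keep_first(sample)))
```

Because it is a generator, it can be fed a file object line by line, so only the set of keys has to fit in memory, not the whole file.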

                • pinuzzu99P
                  pinuzzu99
                  last edited by

                  OK, understood. You have been very clear.
                  At this point I will use the regex for simple things, and the plugin for the more complicated text. Thank you for your support.

                  • pinuzzu99P
                    pinuzzu99
                    last edited by

                    Hey guy038, don’t you have a recipe to do it all in one shot?
                    I don’t mean something like the string (?s)((^.*?$)\R.*)\R*\2(\R|\Z)
                    REPLACE = $1
                    which works on only one value at a time…
                    The duplicate-line plugin works fine, but isn’t it possible to refine the regex?

                    • Alan KilbornA
                      Alan Kilborn @pinuzzu99
                      last edited by

                      @pinuzzu99

                      A regex could work, but a search like this can overwhelm the regex engine. You will know this has happened because the entire document becomes selected. It is better to do it in a non-regex way.

                      • guy038G
                        guy038
                        last edited by guy038

                        Hello @pinuzzu99, @ekopalypse, @gurikbal-singh, @peterjones, @alan-kilborn and All,

                        Sorry for my late answer: I was on a 3-day ski trip to the French resort of Les Arcs 1800. We were a group of 14 people. Unfortunately, the sun was not out for the first two days and, on the last day, there was no skiing because of snow showers!


                        Luckily, a one-go regex S/R is possible ;-))

                        So, assuming the input text below:

                        Andrewhsjfmesjones00@yahoo.com:Winpfgston99001
                        dangsjceamkales@gsnail.com:c6718e7c
                        Tom34f@sogbug.com:y7vk5z9292
                        zesorex@gmail.com:ploksfasd
                        j096875244@gmail.com:st608g410000
                        doniel.ctz@homail.com:Cotvxbza22523286
                        zesorex@gmail.com:ploksfasd
                        levjaamel@hetmail.com:camxmel2004
                        Andrewhsjfmesjones00@yahoo.com:Winpfgston99001
                        szaborefeupert666@gail.com:Rupejffgano666
                        jodgsjny0531@cofx.net:Draskakgon357
                        zesorex@gmail.com:ploksfasd
                        wse_adgel_one@hogmail.com:6947903024
                        j096875244@gmail.com:st608g410000
                        j096875244@gmail.com:st608g410000
                        jringahdhsque@hotmail.com:nadfjddkalgo
                        Andrewhsjfmesjones00@yahoo.com:Winpfgston99001
                        

                        Use the following regex S/R :

                        SEARCH (?-is)^(.+)\R(?=(?s).*^\1)

                        REPLACE Leave EMPTY

                        And you’ll get the output text

                        dangsjceamkales@gsnail.com:c6718e7c
                        Tom34f@sogbug.com:y7vk5z9292
                        doniel.ctz@homail.com:Cotvxbza22523286
                        levjaamel@hetmail.com:camxmel2004
                        szaborefeupert666@gail.com:Rupejffgano666
                        jodgsjny0531@cofx.net:Draskakgon357
                        zesorex@gmail.com:ploksfasd
                        wse_adgel_one@hogmail.com:6947903024
                        j096875244@gmail.com:st608g410000
                        jringahdhsque@hotmail.com:nadfjddkalgo
                        Andrewhsjfmesjones00@yahoo.com:Winpfgston99001
                        

                        Notes :

                        • This regex searches for any non-empty line that is followed, after any range of characters (possibly empty and/or spanning several lines), by an identical line, case included. Thus, it deletes every duplicate of a line that is located before the last occurrence of that line

                        • The first part, (?-is), sets the usual in-line modifiers (so the dot matches a single standard character and the search is case-sensitive)

                        • Then, the part ^(.+)\R matches the contents of any non-empty line, from its beginning, stored as group 1 and followed by its line-break \R

                        • The last part, (?=(?s).*^\1), is a positive look-ahead structure, (?=........), that is to say a condition which must be true in order to validate the overall match, but which is never part of the overall match!

                          • The part (?s).* represents any range, even empty, of any kind of characters (standard or EOL chars), due to the (?s) modifier

                          • The part ^\1 matches the same range of characters as group 1, \1, at the beginning of a line

                        • As the replacement zone is empty, any line, with its line-break, which is repeated further down is deleted
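To make the notes above easier to experiment with outside the editor, here is a rough Python adaptation of the search. It is an approximation, not the editor's own engine: `\n` replaces `\R` (so the text is assumed to use `\n` line endings and to end with a line break), the scoped `(?s:.*)` stands in for the in-line `(?s)`, and the names `DUPLICATE_BEFORE_LAST` and `remove_earlier_duplicates` are made up for this sketch.

```python
import re

# Python adaptation of the search pattern (an approximation only):
# "\n" replaces \R, and (?s:.*) limits the dot-matches-newline
# behaviour to the look-ahead, as (?s) does in the original.
DUPLICATE_BEFORE_LAST = re.compile(r"^(.+)\n(?=(?s:.*)^\1)", re.MULTILINE)


def remove_earlier_duplicates(text):
    """Delete every line that reappears further down, keeping the last copy."""
    return DUPLICATE_BEFORE_LAST.sub("", text)


if __name__ == "__main__":
    print(remove_earlier_duplicates("a\nb\na\nc\nb\n"))
```

One thing this adaptation makes visible: `^\1` only checks that a later line starts with the captured text, so in this Python version a line `ab` is also removed when a later line is `abc`. Appending `$` after `\1` avoids that here; whether the same tweak behaves identically in the editor is not covered in this thread.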

                        Remark :

                        In a huge file, if two identical lines are separated by a lot of text/lines, this regex S/R may fail and wrongly match the entire file contents. For instance:

                        • Two identical lines separated by 1600 all-different lines of 32 characters each give a correct result of 1 occurrence (the line with a duplicate)

                        • Two identical lines separated by 1700 all-different lines of 32 characters each give an incorrect result of 2 occurrences (the line with a duplicate and the entire file contents)
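For files large enough to hit this limit, a non-regex pass gives the same keep-the-last-copy result without depending on how far apart the duplicates are. The following is a minimal Python sketch, assuming the lines fit in memory as a list; `dedupe_keep_last` is a hypothetical name, and unlike the regex it would also collapse repeated empty lines.

```python
def dedupe_keep_last(lines):
    """Keep only the final occurrence of each line, preserving order."""
    seen = set()
    kept_reversed = []
    # Walk from the bottom up: the first time a line is met in this
    # direction is its last occurrence in the original order.
    for line in reversed(lines):
        if line not in seen:
            seen.add(line)
            kept_reversed.append(line)
    kept_reversed.reverse()
    return kept_reversed


if __name__ == "__main__":
    # Two identical lines separated by 1700 distinct 32-character lines.
    filler = [f"line{i:028d}" for i in range(1700)]
    result = dedupe_keep_last(["dup"] + filler + ["dup"])
    print(len(result))
```

Each line is looked up in a set once, so the gap between two duplicates has no effect on correctness.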

                        Best Regards,

                        guy038

                        • pinuzzu99P
                          pinuzzu99
                          last edited by

                          Thanks, guy038.
                          I’m glad you went skiing, even if the weather was not perfect… every now and then it is good to get away from the PC!
                          Thanks for your reply, not just for the answer itself but also for the spirit you put into it…
                          Thank you so much for your much-appreciated answers.

                          The Community of users of the Notepad++ text editor.