remove duplicated line
-
Oh great, thanks.
Anyway, I need a regex string to delete my duplicate lines without changing their order… -
@pinuzzu99
Use the Remove Duplicate Lines plugin.
-
If you are willing to hit Replace All multiple times, until all duplicates are removed, this worked for me with your example:
- FIND =
(?s)((^.*?$)\R.*)\R*\2(\R|\Z)
- REPLACE =
$1
- MODE = regular expression
After three runs, it had become:
dangsjceamkales@gsnail.com:c6718e7c
Tom34f@sogbug.com:y7vk5z9292
zesorex@gmail.com:ploksfasd
j096875244@gmail.com:st608g410000
doniel.ctz@homail.com:Cotvxbza22523286
levjaamel@hetmail.com:camxmel2004
Andrewhsjfmesjones00@yahoo.com:Winpfgston99001
szaborefeupert666@gail.com:Rupejffgano666
jodgsjny0531@cofx.net:Draskakgon357
wse_adgel_one@hogmail.com:6947903024
jringahdhsque@hotmail.com:nadfjddkalgo
… which I think is what you wanted.
But yes, @gurikbal-singh’s Remove Duplicate Lines plugin should do what you want, too. Just go to Plugins > Plugins Admin to install it.
-
Oh yes PeterJones, it works well! Thanks.
But I have to click each time to delete one row at a time… and what if I had 5000 duplicate rows?
Isn’t there a single command to bulk-remove everything in one go?
And thanks for the advice about the “Remove Duplicate Lines” plugin. I didn’t know it existed; now I’ll try it. Thank you.
-
@pinuzzu99 said in remove duplicated line:
but I have to click each time to delete 1 row at a time … and if I had 5000 double rows ???
isn’t there a single command to bulk remove everything in one go?
Regexes aren’t infinitely powerful. You can do a lot with them, but if you want to do super-complicated things, sometimes it’s better to use a full-blown programming language (which is what the plugin does, obviously).
For example, in Perl, running from the command line, it could be done with a readable 3-line script, or the condensed one-liner:
perl -pi.bak -e "chomp($k=$_);$_=''if$h{$k};++$h{$k}" filename
which would save the original to filename.bak and delete the duplicate lines when regenerating filename, assuming there’s enough memory to create the hash (map) that checks for duplicates. If memory became a concern, you could sacrifice speed for memory and generate a shorter key (maybe using crc32 or a similar algorithm), so that each line of text maps to a key short enough not to overflow your memory. Bear in mind that a short checksum can collide, so the mapping is only approximately 1:1 and two different lines could occasionally be treated as duplicates. But this isn’t a general programming-help forum, so I won’t go any further than that. -
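For readers who prefer to see the idea spelled out, here is a minimal Python sketch of the same keep-the-first-copy approach. It is not the plugin's code and not a translation checked against the Perl one-liner; the names `dedupe_keep_first` and `use_digest` are made up for this illustration. With `use_digest=True` it stores a fixed-size SHA-1 digest instead of the full line, which only saves memory when lines are longer than the 20-byte digest.

```python
import hashlib


def dedupe_keep_first(lines, use_digest=False):
    """Return the lines with later duplicates dropped, original order kept.

    Hypothetical helper for illustration only.
    """
    seen = set()
    result = []
    for line in lines:
        # Compare lines without their line ending, similar to Perl's chomp.
        text = line.rstrip("\r\n")
        # Optionally remember a fixed-size digest instead of the whole line.
        key = hashlib.sha1(text.encode("utf-8")).digest() if use_digest else text
        if key in seen:
            continue  # already seen earlier in the file: drop this copy
        seen.add(key)
        result.append(line)
    return result


sample = ["b\n", "a\n", "b\n", "c\n", "a"]
print(dedupe_keep_first(sample))
```

To use it on a real file, you would read the lines with `open(path).readlines()` and write the returned list back with `writelines()`, after making your own backup.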
OK, understood. You have been very clear.
At this point I will use the regex for simple things and the plugin for more complicated text. Thank you for your support. -
Hey guy038, don’t you have a valid recipe to do it all in one shot?
I don’t mean a string like (?s)((^.*?$)\R.*)\R*\2(\R|\Z)
REPLACE = $1
which works on only one value at a time…
The Remove Duplicate Lines plugin works fine, but isn’t it possible to refine the regex? -
It is possible that regex could work, but it is possible to overwhelm the regex engine with such an execution. You will know you have done this because the entire document will become selected. Better to do it in a non-regex way.
-
Hello @pinuzzu99, @ekopalypse, @gurikbal-singh, @peterjones, @alan-kilborn and All,
Sorry for my late answer : I was on a 3-day ski trip to the French resort of Les Arcs 1800. We were a group of 14 people. Unfortunately, the sun was not there the first two days and, on the last day, there was no skiing due to snow showers !
Luckily, a one-go regex S/R is possible ;-))
So, assuming the input text, below :
Andrewhsjfmesjones00@yahoo.com:Winpfgston99001
dangsjceamkales@gsnail.com:c6718e7c
Tom34f@sogbug.com:y7vk5z9292
zesorex@gmail.com:ploksfasd
j096875244@gmail.com:st608g410000
doniel.ctz@homail.com:Cotvxbza22523286
zesorex@gmail.com:ploksfasd
levjaamel@hetmail.com:camxmel2004
Andrewhsjfmesjones00@yahoo.com:Winpfgston99001
szaborefeupert666@gail.com:Rupejffgano666
jodgsjny0531@cofx.net:Draskakgon357
zesorex@gmail.com:ploksfasd
wse_adgel_one@hogmail.com:6947903024
j096875244@gmail.com:st608g410000
j096875244@gmail.com:st608g410000
jringahdhsque@hotmail.com:nadfjddkalgo
Andrewhsjfmesjones00@yahoo.com:Winpfgston99001
Use the following regex S/R :
SEARCH
(?-is)^(.+)\R(?=(?s).*^\1)
REPLACE
Leave EMPTY
And you’ll get the output text
dangsjceamkales@gsnail.com:c6718e7c
Tom34f@sogbug.com:y7vk5z9292
doniel.ctz@homail.com:Cotvxbza22523286
levjaamel@hetmail.com:camxmel2004
szaborefeupert666@gail.com:Rupejffgano666
jodgsjny0531@cofx.net:Draskakgon357
zesorex@gmail.com:ploksfasd
wse_adgel_one@hogmail.com:6947903024
j096875244@gmail.com:st608g410000
jringahdhsque@hotmail.com:nadfjddkalgo
Andrewhsjfmesjones00@yahoo.com:Winpfgston99001
Notes :
- This regex searches for any non-empty line that is followed, later in the file, by an identical line ( case included ), the two being separated by any range of characters, possibly empty and/or spanning several lines. Thus, it deletes every earlier duplicate of a line and keeps only its last occurrence
- The first part (?-is) holds the usual in-line modifiers ( so the dot matches one standard character only and case is taken into account )
- Then, the part ^(.+)\R searches for the contents of any non-empty line, from its beginning, stored as group 1 and followed by its line-break \R
- The last part (?=(?s).*^\1) is a positive look-ahead structure, (?=........), that is to say a condition which must be true in order to validate the overall match, but which is never part of the overall match !
  - The part (?s).* represents any range, even empty, of any kind of characters ( standard or EOL chars ), due to the (?s) modifier
  - The part ^\1 matches the same range of characters as group 1, at the beginning of a line
- As the replacement zone is empty, any line, with its line-break, which is repeated further down, is then deleted
Remark :
In a huge file, if two identical lines are separated by a lot of text/lines, this regex S/R may fail and wrongly match the whole file contents. For instance :
- Two lines, separated by 1600 all-different lines of 32 characters each, give a correct result of 1 occurrence ( the line with a duplicate )
- Two lines, separated by 1700 all-different lines of 32 characters each, give an incorrect result of 2 occurrences ( the line with a duplicate and all the file contents )
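When a file is too large for the look-ahead to cope, the same keep-the-last-copy result can be obtained without any regex. The Python sketch below is only an illustration under stated assumptions: the function name `dedupe_keep_last` is hypothetical, and it compares whole lines exactly, whereas the regex above only checks that a later line begins with the same text. Empty lines are left alone, as the (.+) part of the regex never matches them.

```python
def dedupe_keep_last(text):
    """Drop every non-empty line that occurs again later; keep the last copy.

    Hypothetical helper for illustration only.
    """
    lines = text.splitlines(keepends=True)
    seen = set()
    kept = []
    # Walk from the bottom up, so the first time a line is met
    # it is the last occurrence in the file.
    for line in reversed(lines):
        key = line.rstrip("\r\n")
        if key and key in seen:
            continue  # an identical line exists further down: drop this one
        seen.add(key)
        kept.append(line)
    kept.reverse()  # restore top-to-bottom order
    return "".join(kept)


sample = "a\nb\na\nc\nb\n"
print(dedupe_keep_last(sample))
```

Because it makes a single pass with a set, the distance between two identical lines does not matter here, which is the case that trips up the regex.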
Best Regards,
guy038
-
-
Thanks guy038.
I’m glad you went skiing, even if the weather was not perfect… every now and then it is good to get away from the PC!
Thanks for your reply, not just for the answer itself but for the spirit you put into it…
Thank you so much for your very appreciated answers.