Eliminating duplicate (identical) lines
-
Hello!
Is there any way to find and eliminate duplicate lines? Example:
Machine 1
Machine 2
Machine 2
Machine 2
Machine 3
Machine 4
Machine 4
Machine 5
Once cleaned up, it should give:
Machine 1
Machine 2
Machine 3
Machine 4
Machine 5
Thank you for any suggestions!
Ed
-
Hello, Ed,
Removing all the duplicate lines from a pre-sorted file is easy with a search/replace in Regular expression mode!
- Open your file, containing the sorted list of items
- Open the Replace dialog (CTRL + H)
- In the Find what: field, type (?-s)(^.+\R)\1+
- In the Replace with: field, type \1
- Check the Regular expression radio button
- Click on the Replace All button
Et voilà !!
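For anyone who prefers to script this, here is a minimal Python sketch of the same idea. It is an adaptation, not the exact regex above: Python's `re` module has no `\R`, so the line ending is spelled out as `\r?\n`, and the dot is replaced by `[^\r\n]` so no `(?-s)` is needed. The function name `collapse_adjacent_duplicates` is made up for this illustration.

```python
import re

# Adapted from (?-s)(^.+\R)\1+ : Python's re has no \R, so the line
# ending is written explicitly as \r?\n (an assumption of this sketch).
DUPLICATE_RUN = re.compile(r"^([^\r\n]+\r?\n)\1+", re.MULTILINE)


def collapse_adjacent_duplicates(text: str) -> str:
    """Replace each run of identical consecutive lines with one copy."""
    return DUPLICATE_RUN.sub(r"\1", text)


sample = (
    "Machine 1\n"
    "Machine 2\n"
    "Machine 2\n"
    "Machine 2\n"
    "Machine 3\n"
    "Machine 4\n"
    "Machine 4\n"
    "Machine 5\n"
)

result = collapse_adjacent_duplicates(sample)
print(result, end="")
```

Since `\1+` only looks at the lines immediately following the captured one, duplicates that are not adjacent are left alone, which is why the list has to be sorted first.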
Notes :
- The (?-s) in-line modifier ensures that the regex dot character matches standard characters only (not line-break characters), even if you previously checked the . matches newline option!
- Then, the part ^.+\R matches all the characters (.+) of any non-empty line, from the beginning of the line (^) up to and including its End of Line character(s) (\R)
- So, the part (^.+\R), enclosed in round brackets, simply stores the complete line, with its End of Line character(s), as group 1
- Finally, the part \1+ tries to match one or more subsequent identical lines, immediately following that line
- And if an overall match is found, the whole block of identical lines is replaced by group 1 (\1), that is to say, ONE copy of that line
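To see the difference between group 1 and the overall match, the small Python sketch below inspects the first match on a sample list. As an assumption of this sketch, the pattern is adapted for Python's `re` module, which lacks `\R`: the line ending is written as `\r?\n` and the dot as `[^\r\n]`.

```python
import re

# Python adaptation of (?-s)(^.+\R)\1+ (no \R in Python's re module).
pattern = re.compile(r"^([^\r\n]+\r?\n)\1+", re.MULTILINE)

text = "Machine 1\nMachine 2\nMachine 2\nMachine 2\nMachine 3\n"

match = pattern.search(text)
stored_line = match.group(1)   # what \1 refers to: ONE complete line
whole_block = match.group(0)   # the overall match: the full run of duplicates

print(repr(stored_line))  # 'Machine 2\n'
print(repr(whole_block))  # 'Machine 2\nMachine 2\nMachine 2\n'
```

The replacement swaps the whole block for the stored line, so three copies become one.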
Best Regards,
guy038