possible to delete almost duplicate lines?
-
I use the line operations to delete duplicate lines in a comma-delimited text file. But I am left with a lot of these almost-duplicate lines, where I want to keep only the longest line.
Is this possible easily enough?
The shorter lines have a double comma at the end, in case that is not immediately visible. The longer ones (usually) have 2 characters between those commas.
example:
I just want to keep the 2nd line
G7ODA,IO93WS,
G7ODA,IO93WS,PE,
-
Does order of the lines matter in the final results?
Can there ever be three or more lines that you want to compress into one (i.e., could there ever be three or more of the G7ODA lines, or will it always be a single short line and a single long one)?
Assuming order doesn't matter, and assuming there is never more than a pair of almost-duplicate lines:
P01AZ,IO55WS,XY,
P01AZ,IO55WS,,
G7ODA,IO93WS,
G7ODA,IO93WS,PE,
- Edit > Line Operations > Sort Lines Lexicographically Ascending
- Search > Replace
FIND WHAT = ^(.*?,.*?,),*\R\1
REPLACE WITH = $1
SEARCH MODE = regular expression
REPLACE ALL
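To check what the sort-then-replace steps do outside the editor, here is a minimal Python sketch. It is not Notepad++'s own regex engine: Python's `re` module has no `\R`, so `\r?\n` stands in for it here (my assumption), and `re.MULTILINE` is needed so that `^` matches at the start of every line. The function name `collapse_pairs` is made up for illustration.

```python
import re

def collapse_pairs(text: str) -> str:
    """Sort the lines, then merge each short line into the longer line after it."""
    # Step 1: rough equivalent of Sort Lines Lexicographically Ascending.
    sorted_text = "\n".join(sorted(text.splitlines()))
    # Step 2: same idea as the FIND WHAT / REPLACE WITH pair.
    # Group 1 captures the first two comma-terminated fields, ",*" eats any
    # extra trailing commas on the short line, and \1 requires the next line
    # to start with the same two fields.
    pattern = re.compile(r"^(.*?,.*?,),*\r?\n\1", re.MULTILINE)
    return pattern.sub(r"\1", sorted_text)

sample = "P01AZ,IO55WS,XY,\nP01AZ,IO55WS,,\nG7ODA,IO93WS,\nG7ODA,IO93WS,PE,"
print(collapse_pairs(sample))
```

The sort matters: the regex only merges a short line with the line directly below it, so each pair has to be adjacent with the short form first.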
End Result:
G7ODA,IO93WS,PE,
P01AZ,IO55WS,XY,
If one or both of my assumptions are wrong, provide enough example data to counter my assumptions (use the </> button on the toolbar and put the text between the ``` lines it creates), showing both the original data and how you want it to look at the end…
(It's possible to restore the order by adding/removing numbers in extra steps… but that gets complicated, and I didn't want to overwhelm you if the final order of the data doesn't matter. Similarly, the FIND WHAT regex can be made more complex to handle removing one-or-more short lines, but if your data is as simple as my example, then this should be sufficient.)
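If the original order does matter, or a key can have more than one short line, a small script outside the editor may be simpler than a longer regex. The following is only a sketch under my own assumptions: the first two comma-separated fields identify a record, "longest" means most characters, and on a tie the first line seen is kept. The name `keep_longest` is hypothetical.

```python
def keep_longest(lines):
    """For each (field 1, field 2) key, keep only the longest line,
    at the position where that key first appeared."""
    best = {}  # key -> longest line so far; dicts preserve insertion order
    for line in lines:
        key = tuple(line.split(",")[:2])  # assumption: first two fields are the key
        if key not in best or len(line) > len(best[key]):
            best[key] = line  # updating a value keeps the key's original position
    return list(best.values())

data = [
    "P01AZ,IO55WS,XY,",
    "P01AZ,IO55WS,,",
    "G7ODA,IO93WS,",
    "G7ODA,IO93WS,PE,",
]
print("\n".join(keep_longest(data)))
```

No sorting is needed here, because the dictionary groups the lines by key wherever they appear in the file.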
-
Hi Peter,
No, there are only ever the two forms of each line. I usually apply a lexicographic sort and then remove duplicate lines.
So I would end up with:
G7ODA,IO93WS,
G7ODA,IO93WS,PE,
P01AZ,IO55WS,
P01AZ,IO55WS,XY,
I can sort again afterwards, as that takes a split second.
Thanks for the suggestion, I shall try that.