How to compare 2 text files and delete duplicates
-
How can I compare 2 text files and delete duplicates?
Would ComparePlus perform this?
Thanks
-
@JAK ,
ComparePlus is great for doing a “diff”, where it shows the differences. But it’s not really meant for quickly/easily deleting extras.
It depends on exactly what you have, but there are some simple ways to delete things from one file that are found in another:
If you have file1:

    apple
    banana
    carrot
    daikon
    eggplant
    fig
    grape

and file2:

    apple
    carrot
    jalapeno

If you copy all the contents of file2 and paste them before a --- line at the beginning of file1, like:

    apple
    carrot
    jalapeno
    ---
    apple
    banana
    carrot
    daikon
    eggplant
    fig
    grape

Then Edit > Line Operations > Remove Duplicate Lines will remove the second (or later) occurrence of any line. So it removes the second apple and carrot lines, leaving

    apple
    carrot
    jalapeno
    ---
    banana
    daikon
    eggplant
    fig
    grape

Then delete everything before and including the --- line, and your file will now have every line that was in file2 removed from file1. (It will also delete duplicates inside file1, so if file1 had originally contained an extra fig after the grape, only the first fig would remain.)

If that doesn’t do what you want, you will have to give more details about your rules.
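For anyone who would rather script it, the paste, remove-duplicates, and trim steps can be mimicked in a few lines of Python. This is only a rough sketch, not what Notepad++ does internally: the function name `remove_lines_found_in` is made up for illustration, the files are represented as plain lists of lines, and it assumes the `---` separator does not already occur as a line in either file.

```python
def remove_lines_found_in(file1_lines, file2_lines, separator="---"):
    """Mimic the manual steps: paste file2 + separator above file1,
    keep only the first occurrence of each line, then drop everything
    up to and including the separator.

    Assumption: `separator` is not a line in either input.
    """
    combined = list(file2_lines) + [separator] + list(file1_lines)

    # Keep the first occurrence of every line, preserving order.
    seen = set()
    deduped = []
    for line in combined:
        if line not in seen:
            seen.add(line)
            deduped.append(line)

    # Discard the pasted file2 block and the separator itself.
    cut = deduped.index(separator)
    return deduped[cut + 1:]


# Example data from the post, plus the extra "fig" mentioned in the caveat.
file1 = ["apple", "banana", "carrot", "daikon", "eggplant", "fig", "grape", "fig"]
file2 = ["apple", "carrot", "jalapeno"]

result = remove_lines_found_in(file1, file2)
print(result)  # ['banana', 'daikon', 'eggplant', 'fig', 'grape']
```

Note that the second fig disappears, matching the caveat about duplicates inside file1 also being removed.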
Also, if you want an easy way to delete everything before and including the --- line, use FIND WHAT = (?s)\A.*^---\R, REPLACE = empty, Wrap Around = checked, Search Mode = Regular Expression, then Replace All.
-
Hi @JAK ,
You can use ComparePlus, but from the menu choose the Find Unique Lines command. It will mark all unique lines in a file (those that are not found in the other file).
Then use Diff Visual Filters... to hide all diffs (this will hide all unique lines). Don’t worry if there is still one visible diff line on top - the first document line cannot be hidden, but that is not a problem for the next operation.
Then select the file from which you’d like to remove the duplicate lines (set the focus/caret in that file) and execute Delete all/selected visible lines from the ComparePlus plugin menu.
This will do what you are trying to accomplish.
BR
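As a rough illustration of the net effect of the ComparePlus steps (hide the lines unique to each file, then delete what is still visible in the chosen file), here is a small Python sketch. It says nothing about how the plugin works internally; `drop_common_lines` is a hypothetical helper name, lines are compared by exact match, and the files are plain lists of lines.

```python
def drop_common_lines(target_lines, other_lines):
    """Return target_lines without any line that also occurs in other_lines.

    Lines unique to the target are kept as-is, including repeats,
    because only lines shared with the other file are deleted.
    """
    other = set(other_lines)
    return [line for line in target_lines if line not in other]


file1 = ["apple", "banana", "carrot", "daikon", "eggplant", "fig", "grape", "fig"]
file2 = ["apple", "carrot", "jalapeno"]

remaining = drop_common_lines(file1, file2)
print(remaining)  # ['banana', 'daikon', 'eggplant', 'fig', 'grape', 'fig']
```

Unlike the Remove Duplicate Lines route, this leaves both fig lines in place, since neither of them appears in file2.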