@Rockberto-Manenti said in Compare two txt files:
Each of the two files has each line with the name of the artist and the title of the album.
Now I would like to compare the two txt files in order to highlight on the second file all the lines that are NOT present in the first file in order to understand if the albums are already cataloged or not.
For this problem, I would use the find and replace facility with regular expressions. This is going to look like a lot of steps, but it’s not as complicated as it sounds when you see it in action.
First, open the file with the catalog. To ensure you don’t accidentally mess it up, I recommend immediately using File | Save As… to save it under a new name.
Select Search | Replace… from the menu and fill in the following:
Find what: $
Replace with: \tCatalog
Wrap around: checked
Search Mode: Regular expression
and click Replace All.
You’ll see that this adds a tab and the word Catalog to the end of each line. You can close the dialog for now.
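If it helps to see what that replacement does outside the editor, here is a minimal Python sketch of the same tagging step. The function name `tag_catalog_lines` and the sample titles are made up for illustration. Skipping empty lines is my own choice, not something the steps above specify.

```python
def tag_catalog_lines(text: str, tag: str = "Catalog") -> str:
    """Append a tab and `tag` to every non-empty line of `text`."""
    tagged = []
    for line in text.splitlines():
        # Assumption: empty lines are left alone rather than tagged.
        tagged.append(f"{line}\t{tag}" if line else line)
    return "\n".join(tagged)


# Hypothetical sample data, one "artist - album" entry per line.
sample = "Miles Davis - Kind of Blue\nNina Simone - Pastel Blues\n"
print(tag_catalog_lines(sample))
```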
Now, without closing this file, open the file that has your current list. (It will open in another tab.) From the main menu, select Edit | Select All and after that, select Edit | Copy. Now select File | Close and you’ll be back to the file with the “Catalog” tags.
Scroll to the end of the file by pressing Ctrl+End on your keyboard. You will either be at the end of the last line of text or at the beginning of an empty line. If you are at the end of a line of text, press the Enter key so you are at the beginning of an empty line.
Now, select Edit | Paste. That appends the contents of the other file to the end of this one.
Now select Edit | Line Operations | Sort Lines Lexicographically Ascending.
That will put pairs of lines from the two files together. Now all that remains is to remove matched pairs, which you can do with another regular expression. Open Search | Replace… again, but this time, enter:
Find what: (?i-s)^(.*)\R\1\tCatalog(\R|\z)
Replace with: (empty — not even a blank)
Leave the other settings as they were the last time and click Replace All.
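For anyone who wants to experiment with the pair-removal expression before running it on real data, the sort and replace steps can be approximated in Python. This is only a sketch: Python's `re` module has no `\R`, so it substitutes `\n` and uses `\Z` for the end of the input, and the function name `remove_matched_pairs` is hypothetical.

```python
import re

# Rough Python equivalent of (?i-s)^(.*)\R\1\tCatalog(\R|\z).
# "." already stops at newlines by default, which covers the "-s" part.
PAIR = re.compile(r"(?i)^(.*)\n\1\tCatalog(?:\n|\Z)", re.MULTILINE)


def remove_matched_pairs(tagged_catalog: list[str], current: list[str]) -> list[str]:
    """Sort both lists together, then drop each line followed by its tagged twin."""
    merged = sorted(tagged_catalog + current)  # plain code-point sort
    text = "\n".join(merged)
    return PAIR.sub("", text).splitlines()


# Hypothetical sample data.
catalog = ["A - One\tCatalog", "B - Two\tCatalog"]
current = ["A - One", "C - Three"]
print(remove_matched_pairs(catalog, current))
```

An untagged line sorts directly ahead of its tagged copy because the tab character orders before spaces, letters and digits, which is what lets the expression find each pair on adjacent lines.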
You’ll be left with a list of lines that were in the current list but not in the catalog list. If any lines were in the catalog list but not the current list, they’ll have a tab and the word “Catalog” at the end.
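If you ever want to double-check the result outside Notepad++, the same two lists can be produced with a set lookup instead of sorting and regular expressions. This is a different technique from the one described above, shown only as a cross-check. The function name `split_by_catalog` is hypothetical, and the case-insensitive comparison is an assumption meant to mirror the `(?i)` flag.

```python
def split_by_catalog(catalog_lines, current_lines):
    """Return (lines only in current, lines only in catalog), ignoring case."""
    catalog_keys = {line.casefold() for line in catalog_lines}
    current_keys = {line.casefold() for line in current_lines}
    # Keep the original spelling and order of each list in the output.
    missing_from_catalog = [l for l in current_lines if l.casefold() not in catalog_keys]
    only_in_catalog = [l for l in catalog_lines if l.casefold() not in current_keys]
    return missing_from_catalog, only_in_catalog


# Hypothetical sample data; in practice, read each file with
# open(path, encoding="utf-8").read().splitlines()
catalog = ["A - One", "B - Two"]
current = ["a - one", "C - Three"]
new_albums, catalog_only = split_by_catalog(catalog, current)
print(new_albums)
print(catalog_only)
```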