Need Help Comparing Two Text Files
-
Hello, I need help comparing two large .txt files that each contain over 20k lines.
They’re basically a game’s string files and an old volunteer translation project of ours.
They contain texts with unique IDs at the beginning of each line - you can see them at the end of the post.
One is an old version, the other is the new version. I’ll call the old version “file1” and the new version “file2”.
There are removed, changed and new lines in file2.
I want to update our file1 to the latest version, but to do that I need to work on file2: locate and translate the new lines, and replace the lines that are unchanged with the ones from file1.
But I cannot just detect the new lines in file2 and add them to file1, nor can I copy the whole of file1, paste it into file2 and translate the new lines. That wouldn’t work because the new lines are scattered throughout the file rather than added at the end, and there are removed and changed lines too. So I have to make the changes in file2.
First I have to copy our file1 into file2, but I cannot do that as a whole since file2 has removed, unchanged, edited and new lines, so the IDs no longer line up. I have to do it in a way that doesn’t break the alignment, because the tool I use to import the changes only accepts a file with exactly the same layout, down to the number of lines.
These are the things I need to find out how to do effectively:
- Removed lines: the two files should be compared, and IDs that don’t exist in file2 should be deleted from file1 so that they don’t make a mess when copy-pasting the unchanged lines later.
- Unchanged lines: lines that already exist in file1 need to be transferred to their corresponding spot in file2. Maybe matching the IDs in the two files can help locate their new place/line number?
- Edited lines: lines/IDs that have updated content in file2 should be detected, extracted, translated and imported back to where they were.
- New lines: should be detected, extracted, translated and imported back to where they were in file2.
After this I should have an updated file2 that contains our old translation plus the new translation and is compatible with the latest version.
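The four cases listed above could be sketched in Python by keying every line on its ID. This is only a rough sketch under assumptions that are not stated in the post: each line is taken to look like `ID|text` (the `|` separator is a guess), IDs are assumed to be unique, and, because file1 holds translated text, spotting edited lines is assumed to need a third file, the untranslated old version, called `old_source` here. All function and variable names are hypothetical.

```python
def parse(lines, sep="|"):
    """Map each line's ID to its text; dicts keep the original file order."""
    entries = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines for classification purposes
        key, _, text = line.partition(sep)  # assumed layout: ID|text
        entries[key] = text
    return entries


def classify(old_source, new_source):
    """Sort IDs into removed, unchanged, edited and new.

    Both arguments are ID -> text dicts of the UNTRANSLATED old and new
    versions (comparing against the translation would flag every line).
    """
    removed = [i for i in old_source if i not in new_source]
    new = [i for i in new_source if i not in old_source]
    unchanged = [i for i in new_source
                 if i in old_source and old_source[i] == new_source[i]]
    edited = [i for i in new_source
              if i in old_source and old_source[i] != new_source[i]]
    return {"removed": removed, "unchanged": unchanged,
            "edited": edited, "new": new}


# Tiny made-up example data, not taken from the real files
old_source = parse(["100|Start", "101|Options", "102|Quit"])
new_source = parse(["100|Start", "103|Credits", "101|Settings"])
result = classify(old_source, new_source)
print(result)
```

The lists under `edited` and `new` are the IDs that would need translating, and `removed` holds the IDs that can be dropped from file1.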
————————————
I have already tried the following but could not figure out how to do what I wanted:
- First I used Notepad++’s Compare plugin. It showed me the differences, but I could only view them; I could not export or import them the way I wanted. (https://imgur.com/a/vAOzeiw)
- Then I deleted the contents of every line using the |.* command so that only the IDs were shown, letting me spot the differing lines more easily without the text getting in the way. I then used an online tool to detect and extract the deleted and added lines, but it still didn’t let me do what I wanted: I could only view them, not export, edit or import them as a whole. (https://imgur.com/a/ds8yxnS)
-
@usamarslan I’m afraid Notepad++ isn’t the right tool for your task. It sounds like you need a tool for joining CSV files, like the freeware command-line tools xsv (https://github.com/BurntSushi/xsv) or listcompare (https://mg-tools.weebly.com/). However, probably no tool can do exactly what you want. In that case, you need to write a script (e.g. in Python) to fulfil your requirements.
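As a possible starting point for such a script, here is a minimal Python sketch. It walks through the new file line by line so the output keeps that file's exact order and line count, reuses the old translation wherever the source text of an ID has not changed, and otherwise leaves the new source text in place and queues it for translation. It rests on assumptions that the thread does not confirm: lines look like `ID|text`, IDs are unique, and three inputs are available (the untranslated old version, the old translation "file1", and the new version "file2"). All names are hypothetical.

```python
def split_line(line, sep="|"):
    """Split 'ID|text' into (ID, text); the separator is an assumption."""
    key, _, text = line.partition(sep)
    return key, text


def merge(old_source_lines, old_translation_lines, new_source_lines, sep="|"):
    """Return (merged_lines, todo_lines) in the layout of the new source."""
    old_source = dict(split_line(l, sep) for l in old_source_lines)
    old_translation = dict(split_line(l, sep) for l in old_translation_lines)

    merged, todo = [], []
    for line in new_source_lines:
        if sep not in line:
            merged.append(line)  # blank or odd line: copy as-is to keep layout
            continue
        key, text = split_line(line, sep)
        if key in old_translation and old_source.get(key) == text:
            # Source text unchanged: carry the existing translation over.
            merged.append(f"{key}{sep}{old_translation[key]}")
        else:
            # New or edited line: keep the new text and queue it for translation.
            merged.append(line)
            todo.append(line)
    # IDs removed in the new version are never visited, so they drop out.
    return merged, todo


# Tiny made-up example data, not taken from the real files
old_source = ["100|Start", "101|Options", "102|Quit"]
file1 = ["100|Translated start", "101|Translated options", "102|Translated quit"]
file2 = ["100|Start", "", "103|Credits", "101|Settings"]

merged, todo = merge(old_source, file1, file2)
assert len(merged) == len(file2)  # same number of lines as file2
print("\n".join(merged))
```

With real files, the three lists could be read with something like `open(path, encoding="utf-8").read().splitlines()` (the encoding is a guess), and `todo` could be written to its own file, translated, and then merged back by ID in the same way.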