Replace Lines from different files
- 
 Hi, @guy038 
 Original file has 550k lines :) (14.6MB)
 and it has 30 different languages
 my translated file has 17k lines only with translation
 of the specific language
 can I hear about your solution also?
- 
 Hi, @alexei-kurakin, @peterjones and All, Thanks for your additional information. This confirms that a regex solution should be OK! Just follow all the steps below, carefully!
 - 
Copy-paste your original file into a new file named, let's say, output.txt
- 
At the end of the output.txt file, add a new line with, for instance, several equal signs
- 
Then, right after, add the contents of your translated file
 Thus, the new output.txt file should contain the temporary text:

    GER;10 ;Ja
    ENG;10 ;Yes
    ITA;10 ;Sì
    BGL;10 ;
    FRC;10 ;Oui
    NLD;10 ;Ja
    ESP;10 ;Sí
    GER;11 ;\nHöhe des Bodens
    ENG;11 ;\nHeight of bottom panel
    ITA;11 ;\nAltezza ripiano (distanza da terra)
    BGL;11 ;
    FRC;11 ;\nHauteur du fond
    NLD;11 ;\nHoogte van de bodem
    ESP;11 ;\nEspesor de la base
    ===============================================
    BGL;10 ;TEXT1
    BGL;11 ;TEXT2
    BGL;12 ;TEXT3
    BGL;14 ;TEST

- 
Move to the very beginning of the output.txt file
- 
Now, open the Replace dialog (Ctrl + H)
- 
Uncheck all box options and make sure that the Regular expression search mode is selected
- 
SEARCH (?x-is) ^ ( .+ ) \h+ ; $ (?= (?s: .+ =+ .+? ) ^ \1 \h+ ; ( .+ ) $ ) | (?s) ^ =+ .+
- 
REPLACE ?1$0\2
- 
Click once on the Replace All button (or several times on the Replace button)
 Here you are! You get your expected text:

    GER;10 ;Ja
    ENG;10 ;Yes
    ITA;10 ;Sì
    BGL;10 ;TEXT1
    FRC;10 ;Oui
    NLD;10 ;Ja
    ESP;10 ;Sí
    GER;11 ;\nHöhe des Bodens
    ENG;11 ;\nHeight of bottom panel
    ITA;11 ;\nAltezza ripiano (distanza da terra)
    BGL;11 ;TEXT2
    FRC;11 ;\nHauteur du fond
    NLD;11 ;\nHoogte van de bodem
    ESP;11 ;\nEspesor de la base

 =>
- 
All the lines which ended with a semicolon are now completed with their translated counterparts
- 
The line of equal signs and the contents of the translated file below it have been removed as well
 
 I also assume that your translated file does not contain the "same" line with different translations! For instance:

    BGL;10 ;TEXT1
    BGL;11 ;TEXT2
    BGL;10 ;A SECOND translation
    BGL;12 ;TEXT3
    BGL;11 ;an other text
    BGL;13 ;TEST
    BGL;10 ;A THIRD translation

 However, don't worry, as this regex always chooses the first translated text found in the list! So, regarding this part of the translated file, the original file would contain:

    BGL;10 ;TEXT1
    BGL;11 ;TEXT2
    BGL;12 ;TEXT3
    BGL;13 ;TEST

 Best Regards, guy038
- 
- 
 Hello, @alexei-kurakin, I think that I spoke too quickly! I initially thought that your original file had a size of 550 KB (not 550k lines!). So, I'm afraid that my regex method is useless and that you need a scripting solution!

 BR guy038

 You may test my regex solution with a smaller original file and, maybe, a smaller translated file! But it should not work for your exact file sizes :-((

 I did a quick test: with an original file of about 100,000 lines, it is still OK. But for a file containing 500,000 lines, the results are erroneous!
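
For files of that size, the same merge can be sketched as a short standalone Python script. This is only a minimal sketch and not guy038's regex method: it assumes that every line has the form `KEY`, some blanks, `;`, then the text, and the names `LINE_RE`, `load_translations` and `fill_missing` are made up for the illustration. Like the regex, it keeps the first translation found for each key.

```python
import re

# Assumed line layout: KEY, one or more blanks, ';', then the (possibly empty) text
LINE_RE = re.compile(r'^(.+?)[ \t]+;(.*)$')

def load_translations(lines):
    """Map each key to its FIRST non-empty translation."""
    table = {}
    for line in lines:
        m = LINE_RE.match(line.rstrip('\r\n'))
        if m and m.group(2):
            table.setdefault(m.group(1), m.group(2))
    return table

def fill_missing(original_lines, table):
    """Yield the original lines, completing those with nothing after the ';'."""
    for line in original_lines:
        body = line.rstrip('\r\n')
        m = LINE_RE.match(body)
        if m and not m.group(2) and m.group(1) in table:
            body += table[m.group(1)]
        yield body

original = ['GER;10 ;Ja', 'BGL;10 ;', 'FRC;10 ;Oui', 'BGL;11 ;']
translated = ['BGL;10 ;TEXT1', 'BGL;11 ;TEXT2', 'BGL;10 ;A SECOND translation']
merged = list(fill_missing(original, load_translations(translated)))
print('\n'.join(merged))
```

With real files, both functions accept open file objects instead of lists, so the 550k-line original is read one line at a time and only the translation table is held in memory.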
- 
 @guy038 Hi, this is very close to what I'm doing. Could you help me as well? I have two files. One has text like this (a total of 27k lines):

    {
        "id": 2257,
        "Key": "gfr",
        "enUS": "ÿc8•ÿc:Flawed R",
        "zhTW": "瑕疵紅寶石",
        "deDE": "fehlerhafter Rubin",
        "esES": "Rubí estropeado",
        "frFR": "Rubis imparfait",
        "itIT": "Rubino Incrinato",
        "koKR": "하급 루비",
        "plPL": "Rubin ze skazą",
        "esMX": "Rubí imperfecto",
        "jaJP": "傷のあるルビー",
        "ptBR": "Rubi Imperfeito",
        "ruRU": "Мутный рубин",
        "zhCN": "有瑕疵的红宝石"
    },
    {
        "id": 2258,
        "Key": "gsr",
        "enUS": "ÿc8•ÿc:Ruby",
        "zhTW": "紅寶石",
        "deDE": "Rubin",
        "esES": "Rubí",
        "frFR": "Rubis",
        "itIT": "Rubino",
        "koKR": "루비",
        "plPL": "Rubin",
        "esMX": "[ms]Rubí",
        "jaJP": "ルビー",
        "ptBR": "Rubi",
        "ruRU": "[ms]Рубин",
        "zhCN": "红宝石"
    },

 and another file that has different values for the enUS strings. I want to replace every instance of text inside quotation marks in every enUS line with text from a similar file. For example, replace the string from file 1:

    "enUS": "ÿc8•ÿc:Flawed R",

 with the string from file 2:

    "enUS": "Flawed ruby",

 What do I modify in this search line?

    (?x-is) ^ ( .+ ) \h+ ; $ (?= (?s: .+ =+ .+? ) ^ \1 \h+ ; ( .+ ) $ ) | (?s) ^ =+ .+

 Thanks!
- 
 @Ted-Plum 
 This file is JSON.
 I would strongly recommend using a scripting language like Python to work with it. There are various nice JSON plugins, but none that work well for this specific use case.
 You can use regex to work with JSON, but regexes will hit lots of evil edge cases in JSON that will make even the most hardened regexers question their sanity.

 I don't feel like writing an actual Python script for this right now, but it would probably look something like

    # this is pseudo-code, don't try to execute it!
    import json
    import Npp
    text_from = Npp.getTextOfFile(filename_from)
    text_to = Npp.getTextOfFile(filename_to)
    json_from = json.loads(text_from)
    json_to = json.loads(text_to)
    # at this point I'm assuming that both json files are objects mapping strings like "id" to other things
    for key, val in json_from.items():
        json_to[key] = val
    mutated_text_to = json.dumps(json_to, indent=4)
    Npp.setTextOfFile(filename_to, mutated_text_to)
- 
 @Mark-Olson I’m not a programmer, I’m in the process of learning HTML and CSS currently, so my knowledge is limited. I’d like to learn this sort of skill, could you point me to where to begin, to make it faster. I hope it doesn’t involve learning the whole of Python. I want to learn it eventually, but that would probably take too long to be relevant to this particular problem. 
- 
 Hello, @ted-plum, Could you, first, show us some records of your File_2? To my mind, it should be listed, like below, in the SAME order as the enUS values found in File_1?

    "enUS": "Flawed Ruby",
    "enUS": "Text_2"
    ...
    ...

 Best Regards, guy038
- 
 @guy038 here: https://drive.google.com/file/d/1D6eQbp0ZZWdsTO-ARlFCNgd4zJ2zvhvI/view?usp=sharing https://drive.google.com/file/d/1zKY6pC3KK0egW1IyklMopP-6bJjrAAjw/view?usp=sharing these are the two versions of the same list. Not all enUS values need replacing, some are the same, some are absent in the file, from which I want to take the replacements, and they need to remain. 
- 
 @Ted-Plum said in Replace Lines from different files:
 I'm not a programmer, I'm in the process of learning HTML and CSS currently, so my knowledge is limited. I'd like to learn this sort of skill, could you point me to where to begin, to make it faster. I hope it doesn't involve learning the whole of Python.

 Nobody learns the whole of Python. It's a general-purpose language, and learning just a little bit can be really useful for just making your everyday life easier. For better or for worse, you are currently working with raw JSON, and Python is one of the best ways for working with JSON. JavaScript is also great for this, but Notepad++ doesn't have JS scripting support.

 I've looked at your data (I'm a huge Diablo 2 fan, by the way) and here's my solution:

    from Npp import *
    import json

    # both files must be initially open in Notepad++
    FROM_PATH = r'c:\full\path\to\from_items.json'
    notepad.activateFile(FROM_PATH)  # the file that you will be taking enUS values FROM
    text_from = editor.getText()  # read the entire file as a Python string
    json_from = json.loads(text_from)
    # this translates the file into Python objects that can be manipulated more easily than text

    # now we do the same thing for the file that you will be moving enUS values TO
    TO_PATH = r'c:\full\path\to\to_items.json'
    notepad.activateFile(TO_PATH)
    json_to = json.loads(editor.getText())

    # we don't know for sure what items are in from_items,
    # so we will only get the ones that exist
    from_keys_to_enUS = {}
    for item in json_from:
        # if this item has an enUS translation in this file, we map the key
        # (which should be the same across both lists)
        # to the enUS translation.
        enUS = item.get('enUS')
        if enUS:
            from_keys_to_enUS[item['Key']] = enUS

    for item in json_to:
        key = item['Key']
        # now we check if the old file has an "enUS" entry for this item
        from_enUS = from_keys_to_enUS.get(key)
        if from_enUS:
            # this transfers the enUS entry to the target file
            item['enUS'] = from_enUS

    # now we've transferred enUS values FROM the source file to the target file
    # open up the target file and dump our edited values into it
    notepad.activateFile(TO_PATH)
    editor.setText(json.dumps(json_to, indent=4))

 There are so many different ways to approach this problem in Python, and I chose the one that's most Notepad++-friendly.

 As for resources that one could use to get better at Python, three of my best friends when I was a noob were:
- StackOverflow (obviously)
- the Python standard library documentation (links to json and tutorial)
- Python for Everybody.
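
The transfer logic of the script above can also be tried outside Notepad++, on plain Python lists. The following is a minimal sketch under stated assumptions: it assumes Python 3, the function name `transfer_enus` is hypothetical, and the two records are shortened from the sample posted earlier in the thread. It passes `ensure_ascii=False` to `json.dumps`, which is not in the script above and is used here so that non-ASCII text is written as-is rather than as `\uXXXX` escapes.

```python
import json

def transfer_enus(donor_items, target_items):
    """Copy non-empty enUS values from donor records to target records sharing a 'Key'."""
    donor_enus = {item['Key']: item['enUS']
                  for item in donor_items
                  if 'Key' in item and item.get('enUS')}
    for item in target_items:
        new_value = donor_enus.get(item.get('Key'))
        if new_value:
            item['enUS'] = new_value
    return target_items

donor = json.loads('[{"id": 2257, "Key": "gfr", "enUS": "Flawed ruby"}]')
target = json.loads(
    '[{"id": 2257, "Key": "gfr", "enUS": "ÿc8•ÿc:Flawed R", "deDE": "fehlerhafter Rubin"},'
    ' {"id": 2258, "Key": "gsr", "enUS": "ÿc8•ÿc:Ruby"}]'
)
transfer_enus(donor, target)

# ensure_ascii=False keeps characters such as "ÿ" readable in the output
print(json.dumps(target, indent=4, ensure_ascii=False))
```

Records whose `Key` is absent from the donor list, such as `gsr` here, keep their existing `enUS` value.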
 
- 
 @Mark-Olson, thanks, I’ll have a go with the script. Edit: All right, you’ve answered my question in your edit. 
- 
 @Ted-Plum 
 See the resources that I linked in my most recent edit of the post.

 I can't promise that learning Python will make your life better. Initially I expect you will find it frustrating, but I get the impression that you will find plenty of opportunities to use your learnings before too long.
- 
 Hello, @ted-plum, @mark-olson and All, Ah… OK, @ted-plum. The two files are quite similar in size and contents ! So, I downloaded your two files from Google Drive 
 - 
The first one is called !item-names.json. It's a UTF-8-BOM encoded file with Windows line-breaks (\r\n)
- 
The second one is called item-names.json. It's a UTF-8 encoded file with Unix line-breaks (\n)
 In order to compare these two files easily:
- 
I first normalized these two files to the usual UTF-8 encoding and to Windows line-breaks. So:
- 
I used the View > Encoding option for the !item-names.json file
- 
I used the Edit > EOL Conversion > Windows (CR LF) option for the item-names.json file
 
- 
- 
Secondly, I deleted the outer square brackets in the two files
- 
Thirdly, I ran the following regex S/R on these two files:
- 
SEARCH (?<!\},)(?<!\n)\r\n
- 
REPLACE \t
 
- 
 This gives a single JSON record per line (24,576 occ. for !item-names.json and 25,424 occ. for item-names.json)
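
The same search and replace can be reproduced outside Notepad++ as a rough sketch. Python's `re` module accepts this pattern unchanged because both lookbehinds have a fixed width. The function name `one_record_per_line` and the two-record sample are assumptions made for the illustration.

```python
import re

# guy038's search pattern: a CRLF that is preceded neither by "}," nor by a line feed
JOIN_RE = re.compile(r'(?<!\},)(?<!\n)\r\n')

def one_record_per_line(text):
    """Replace every line-break inside a record with a tabulation."""
    return JOIN_RE.sub('\t', text)

sample = (
    '{\r\n  "id": 2257,\r\n  "Key": "gfr"\r\n},\r\n'
    '{\r\n  "id": 2258,\r\n  "Key": "gsr"\r\n},\r\n'
)
flattened = one_record_per_line(sample)
print(flattened)
```

When reading a real file for this purpose, opening it with `open(path, newline='')` keeps the `\r\n` sequences intact; the default text mode would translate them to `\n` and the pattern would no longer match.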
 Then, comparing these two files, I noticed that:
- 
Some records are different because field(s), other than enUS, are modified
- 
Some records are different because the enUS field is modified
- 
Some records are different because both the enUS field and field(s) other than enUS are modified
- 
Near the end of the !item-names.json file, 6 records are deleted (from id = 27345 to id = 27350)
- 
Near the end of the item-names.json file, 54 records are added (from id = 61000 to id = 61051, plus 61072 and 61073)
 
 Thus, @ted-plum : - 
Generally speaking, do we have to modify the first !item-names.json file, using data from the item-names.json file, or the opposite?
- 
Do we have to take care of the enUS changes ONLY or do we have to take all the changes into account?
- 
Do we have to add/delete the new lines, as well ? 
 Best Regards guy038 
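
The kinds of differences listed in the post above can also be collected with a script once both files are parsed. The following is a minimal sketch, assuming each file is a JSON array of records with a unique `id` field; the function name `compare_records` and the report keys are made up for the illustration, and the sample records are invented, not taken from the real files.

```python
def compare_records(first_items, second_items):
    """Classify the differences between two lists of records, matched on 'id'."""
    first_by_id = {item['id']: item for item in first_items}
    second_by_id = {item['id']: item for item in second_items}
    report = {
        'only_in_first': sorted(first_by_id.keys() - second_by_id.keys()),
        'only_in_second': sorted(second_by_id.keys() - first_by_id.keys()),
        'enUS_changed': [],
        'other_changed': [],
    }
    for rec_id in sorted(first_by_id.keys() & second_by_id.keys()):
        first, second = first_by_id[rec_id], second_by_id[rec_id]
        if first.get('enUS') != second.get('enUS'):
            report['enUS_changed'].append(rec_id)
        # compare every field except enUS
        first_rest = {k: v for k, v in first.items() if k != 'enUS'}
        second_rest = {k: v for k, v in second.items() if k != 'enUS'}
        if first_rest != second_rest:
            report['other_changed'].append(rec_id)
    return report

first = [{'id': 1, 'enUS': 'A', 'deDE': 'x'},
         {'id': 2, 'enUS': 'B', 'deDE': 'y'},
         {'id': 3, 'enUS': 'C'}]
second = [{'id': 1, 'enUS': 'A2', 'deDE': 'x'},
          {'id': 2, 'enUS': 'B', 'deDE': 'y2'},
          {'id': 4, 'enUS': 'D'}]
report = compare_records(first, second)
print(report)
```

An id that appears in both `enUS_changed` and `other_changed` corresponds to the third case, where the enUS field and other fields differ at the same time.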
- 
- 
 @Mark-Olson, I’ve poked around with the script you’d written, but no luck so far. When trying to run it I keep getting errors such as: No module named ‘Npp’ or, if I try running it without the “from Npp import *” line, I get: name ‘notepad’ is not defined Am I missing something? 
- 
 @Ted-Plum said in Replace Lines from different files:
 No module named ‘Npp’
 or, if I try running it without the “from Npp import *” line, I get:
 name ‘notepad’ is not defined

 You are running this from the PythonScript plugin of Notepad++, yes? Cheers.
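
Both messages are what a standalone Python interpreter typically reports, since the `Npp` module is supplied by the PythonScript plugin itself. A small check along these lines can tell the two situations apart; it is only a sketch, and the flag name `RUNNING_IN_NOTEPAD` is made up for the illustration.

```python
try:
    # 'Npp' can only be imported when the script is run by the PythonScript plugin
    from Npp import notepad, editor
    RUNNING_IN_NOTEPAD = True
except ImportError:
    RUNNING_IN_NOTEPAD = False

if RUNNING_IN_NOTEPAD:
    print('PythonScript detected, active file: ' + notepad.getCurrentFilename())
else:
    print('Not running inside PythonScript: start the script from the plugin menu instead')
```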
- 
 @guy038, the file item-names.json has to remain intact, encoded in UTF-8, except for the corresponding enUS values, which should be taken from the !item-names.json, no new lines should be added, and no lines that don’t exist in !item-names.json should be removed from item-names.json. The file with the exclamation mark is just a donor file, that has the right enUS values. 
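
These requirements can be checked after the fact by comparing the parsed file before and after the change. Below is a minimal sketch, assuming both versions are JSON arrays of records in the same order; the function name `only_enus_changed` and the sample records are hypothetical.

```python
def only_enus_changed(before_items, after_items):
    """True if both lists hold the same records, in the same order, apart from enUS values."""
    if len(before_items) != len(after_items):
        return False  # a record was added or removed
    for before, after in zip(before_items, after_items):
        if ('enUS' in before) != ('enUS' in after):
            return False  # an enUS field appeared or disappeared
        before_rest = {k: v for k, v in before.items() if k != 'enUS'}
        after_rest = {k: v for k, v in after.items() if k != 'enUS'}
        if before_rest != after_rest:
            return False  # some other field was touched
    return True

before = [{'id': 2257, 'Key': 'gfr', 'enUS': 'old text', 'deDE': 'fehlerhafter Rubin'}]
good = [{'id': 2257, 'Key': 'gfr', 'enUS': 'Flawed ruby', 'deDE': 'fehlerhafter Rubin'}]
bad = [{'id': 2257, 'Key': 'gfr', 'enUS': 'Flawed ruby', 'deDE': 'changed'}]
print(only_enus_changed(before, good), only_enus_changed(before, bad))
```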
- 
 @Michael-Vincent, yes, I’ve installed Python, installed the PythonScript v2 plugin into NPP, and for good measure tried the NppExec plugin. I open both of my JSON files in NPP, open a file with your script pasted into it, and run it through NPP. Yes, I’ve also renamed the JSON files to match your script. 
- 
 @Ted-Plum said in Replace Lines from different files:
 No module named ‘Npp’
 yes, I've installed Python, installed PythonScript v2, and for good measure tried NppExec. I open both of my JSON files in NPP, open a file with your script pasted into it, and run it through NPP.

 There's no need to install Python as a standalone thing. If there is, then pursuing problems with that is off-topic to this forum. Please make sure you've done due diligence at reading, understanding and following the basic steps to Pythonscripting, found in the FAQ for this forum HERE.

 and for good measure tried NppExec

 Doing random things isn't going to help.
- 
 @Alan-Kilborn, thanks. 
- 
 @Michael-Vincent @Mark-Olson, never mind my two previous comments. I’ve managed to run Mark’s script, and here’s the result :) https://drive.google.com/file/d/1b1yjTI027hss6ue-wJw6FnqyuRsecUiQ/view?usp=sharing 
- 
 @Ted-Plum 
 Glad you got it working!

 One final meta-suggestion: if you have a JSON plugin and the ComparePlus plugin installed, you can easily compare two JSON files.
 First, you want to pretty-print both JSON files (so that the same formatting rules are applied to both) and then you want to use ComparePlus to compare them.

 I do this all the time, and it works quite well. Alternatively

    x = json.loads(json_string_1)
    y = json.loads(json_string_2)
    assert x == y

 will check if json_string_1 and json_string_2 are equivalent JSON.



