Need help mass replacing text.

Reece Asquith-Jepson

I have a few files which look like this.

<entry name="text0003" offset="00" relocatable="true" max_length="0">
<original>[雷門/らいもん]の[裏/うら]のキャプテンと[呼/よ]ばれている。いつでも[後輩想/こうはいおも]いで　[面倒見/めんどうみ]の[良/よ]い[性格/せいかく]。</original>
<edited>Some call him the heart of the Raimon team. A kind, big brother type. </edited>
<subEntries />
</entry>

I want to make it so that I can get any text that is in the </edited> and merge it into a similar file with the exact same layout.

cipher-1024

You’re probably going to have to give some more details. You say you want to “merge” it into a similar file with the same layout. Does the target file have all the same tags and attributes except it’s lacking the <edited> content? Is that the only XML tree in the file or are there several <edited> tags in each file? You’re going to get more accurate help if you can post something like “I have this for a source… I have this for a target… and I want the target to look like this.” I’m sure some regex genius will come along and be able to help you out. Personally I think Guy has a PCRE engine compiled in his frontal lobe. In the meantime you could try out some regular expression search/replace scenarios yourself.

Reece Asquith-Jepson

So I have multiple entries that have that exact layout. I want everything in the <edited> tags to appear in another document that is laid out exactly the same, except that the text in its <edited> tags is different.

Herb Martin

First, while it’s possible you (or I) could do this in NP++, it looks difficult and my first choice would be PowerShell (or Perl, or Python).

I do some fairly aggressive replacements in NP++ with Regular Expression but for such a task you quite likely will need a “programming” language to assist the regexes. Or at least to make it practical.

(There are Python and Lua scripting add-ons for NP++ so if you told me I was REQUIRED to do it there then I would want to investigate those, but currently don’t have any experience with them.)

You’ll first need to be a LOT more EXPLICIT about what you wish to do. Whether we help you or you do it yourself, you’ll need to know precisely what you mean by “</edited>”, ‘merge’, and ‘exact same layout’.

Sorry, but being explicit is your first step.

Presumably the following:

get any text that is in the </edited> and merge it into a similar file with the exact same layout.

… means the text between the ‘<edited>’ and the ‘</edited>’ tags, so in this case:

<edited>Some call him the heart of the Raimon team. A kind, big brother type. </edited>
would capture:
Some call him the heart of the Raimon team. A kind, big brother type.

But where would you put it? Would it make a new entry? Go with another? This is XML which has a structure and defines semantic relationships so the <edited></edited> tags likely have no meaning (value) unless they appear in a <subentry>, of an <entry>, along with (some of) an <entry name> and <original>.

It’s fairly easy to CAPTURE either just the <edited> entry or an entire <subentry> (or even all the <entry> items, which likely would make the most sense).
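
As a rough illustration of that kind of capture, here is a minimal Python sketch. It assumes the real files use straight quotes around attribute values (the curly quotes in the post are probably the forum's doing) and that each <entry> holds at most one <edited>…</edited> pair. The names SAMPLE, ENTRY_RE and capture_edited are made up for this example, and the <original> text is shortened.

```python
import re

# Sample entry from the first post (original text shortened; straight quotes assumed).
SAMPLE = """<entry name="text0003" offset="00" relocatable="true" max_length="0">
<original>[雷門/らいもん]の[裏/うら]のキャプテンと[呼/よ]ばれている。</original>
<edited>Some call him the heart of the Raimon team. A kind, big brother type. </edited>
<subEntries />
</entry>"""

# One match per <entry>: "name" is the name attribute, "edited" is the text
# between <edited> and </edited>.
ENTRY_RE = re.compile(
    r'<entry\s+name="(?P<name>[^"]+)"[^>]*>'   # opening tag, capture the name
    r'(?:(?!</entry>).)*?'                     # skip ahead without leaving this entry
    r'<edited>(?P<edited>.*?)</edited>',       # capture the edited text
    re.DOTALL,
)

def capture_edited(text):
    """Return a dict mapping each entry name to its <edited> text."""
    return {m.group("name"): m.group("edited") for m in ENTRY_RE.finditer(text)}

print(capture_edited(SAMPLE))
```

Keying the result by the name attribute is a guess at what identifies "the same" entry in the other file; if something else does, the key would have to change.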

Once you had them you can certainly put them in another file with PowerShell etc. – maybe with NP++.

What does “merge” mean? Does it mean you can’t just append, that you must overwrite if the item exists or perhaps that you must NOT clobber a similar item, and how do you decide, by <edited>, <original> or by <entry name>?
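
To make those questions concrete, here is a small Python sketch in which each answer becomes an explicit switch. It assumes entries are matched by <entry name> and works on plain dictionaries of name to edited text. The function merge_edited, its parameters, and the names text0004 and text0005 are hypothetical, not something taken from the actual files.

```python
def merge_edited(target, source, overwrite=False, append_missing=False):
    """Merge edited text from source into a copy of target, keyed by entry name.

    overwrite:      replace text the target already has (otherwise keep it)
    append_missing: add entries that exist only in the source (otherwise skip them)
    """
    merged = dict(target)
    for name, text in source.items():
        if name not in merged:
            if append_missing:
                merged[name] = text
            continue
        if merged[name] and not overwrite:
            continue  # do NOT clobber existing text
        merged[name] = text
    return merged

# Hypothetical data: text0004 and text0005 are invented for the example.
target = {"text0003": "", "text0004": "already translated"}
source = {"text0003": "A kind, big brother type.", "text0004": "other", "text0005": "new"}

print(merge_edited(target, source))
print(merge_edited(target, source, overwrite=True, append_missing=True))
```

Whichever defaults are right depends entirely on what "merge" is supposed to mean here, which is the part only the original poster can answer.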

I know all this seems incredibly tedious but that is the nature of doing such replacements.

A language like PowerShell that can just go straight to the XML document model and do the replacement may make it a LOT easier, and even if you use Regexes a full programming language will make that a lot easier too.
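
For comparison, here is a sketch of the document-model approach in Python rather than PowerShell, using the standard xml.etree.ElementTree module. It rests on several assumptions: the files are well-formed XML with straight quotes, the entries sit under a single root element (called <entries> here purely for illustration), and entries are matched by their name attribute. The function copy_edited is a made-up name.

```python
import xml.etree.ElementTree as ET

def copy_edited(source_root, target_root):
    """Copy <edited> text from source entries into target entries with the same name.

    Returns the number of target entries that were updated.
    """
    source_text = {e.get("name"): e.findtext("edited") for e in source_root.iter("entry")}
    changed = 0
    for entry in target_root.iter("entry"):
        new_text = source_text.get(entry.get("name"))
        edited = entry.find("edited")
        if not new_text or edited is None:
            continue  # nothing to copy, or the target has no <edited> element
        edited.text = new_text
        changed += 1
    return changed

# Tiny in-memory stand-ins for the two files; the <entries> wrapper is an assumption.
SOURCE_XML = """<entries>
<entry name="text0003"><original>...</original><edited>Some call him the heart of the Raimon team.</edited><subEntries /></entry>
</entries>"""
TARGET_XML = """<entries>
<entry name="text0003"><original>...</original><edited /><subEntries /></entry>
</entries>"""

source_root = ET.fromstring(SOURCE_XML)
target_root = ET.fromstring(TARGET_XML)
changed = copy_edited(source_root, target_root)
print(changed, ET.tostring(target_root, encoding="unicode"))
```

With real files you would load each one with ET.parse(path), pass tree.getroot() for both, and save with tree.write(path, encoding="utf-8", xml_declaration=True). The rewritten file may not match the original byte for byte in whitespace or declaration, so keep a backup.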

One trick: Can you do 2 or 3 of these by hand, while pretending you know NOTHING about the structure that you could not get a regex or XML parser to “see”?

That’s the 2nd step along with 1st being VERY EXPLICIT.

Computers are stupid, but very fast if you tell them exactly what to do.

Herb Martin

There is another way to get at such complicated replacement and editing that requires NO programming and NO regular expressions – though they may help if you can add that skill.

It will perhaps take some practice and always takes me a few trial runs to perfect if the problem is tough.

Also there does need to be some form of regularity to the text.

Record a macro that captures you editing the text and moving to the same (relative) start position for the next item.

That’s it.

But it is a lot more difficult in practice than it sounds so you’ll need to do it 2 or 3 times or maybe 30 the first time you try it.

You’ll need to find a good “starting place” or some way to make your first “move” or search arrive at a good starting place.

Then do all the steps manually without making mistakes – at least none that will interfere with the macro working.

Remember, once the job is complete on ONE item you must STOP, except you also need to move to the same relative position to begin the “next” item.

Once you have a working macro recorded, save it, assign it to a key, and try it.

(It will probably fail in some way – mine frequently do. It’s harder than it sounds but requires only patience, perseverance and a willingness to learn. No special skills.)

Once it’s working, just repeat it manually or if you have many to do, RECORD repeating it 10 times, then if that works record repeating that 10 times. Pretty soon you can do thousands of such replacements but the key is in getting JUST ONE PERFECT.

Reece Asquith-Jepson

I have 2 versions of the same file: one has all the text in Japanese and the other has part Japanese and part English. I can upload both versions of my files if that would help with the explanation.