another replacement request for help
-
Thanks @Ekopalypse I still owe you a beer
I’m sure I’ll come and ask questions again, so maybe more than a beer haha
-
Hi, @ekopalypse and All,
Just back from some holidays in Brittany ! Good day to everybody ;;-))
Looking back to your first script, of this discussion, it solves an interesting and general task :
-
Given a first file, where some words / expressions must be replaced
-
Given a second file containing all these words / expressions and their corresponding replacements, with a tabulation as a separator
This script executes, in the first file, all at once, the different replacements from the second file’s table ;-))
Now, considering the slightly modified script, where the range of chars searched is contained in tags
<TEST>......</TEST>
from Npp import editor1, editor2

replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)

def replace_with(m):
    return replacements[m.group()]

editor1.rereplace('(?<=<TEST>).+?(?=</TEST>)', replace_with)
May I ask you two questions :
- When the file to be modified ( new 1 ) contains, in a <TEST>......</TEST> tag, a value not listed in the new 2 file, the script stops, no replacement occurs and the console message KeyError: 'ABCDE' is displayed
Example, with new 1 :
<Wellcollection>
<Well replace="false">
<TEST>BEFORE</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection>
<TEST>And BEFORE</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection>
<TEST>ABCDE</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection>
and new 2 :
BEFORE	After
And BEFORE	After
Could it be possible to avoid errors for all key values not listed in new 2 ?
- Secondly, how to ignore the case of the key values ? I mean, instead of the obvious method, below, in new 2 :
BEFORE	After
And BEFORE	After
and before	After
I tried this version, without success :
from Npp import editor1, editor2
import re

replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)

def replace_with(m):
    return replacements[m.group()]

editor1.rereplace('(?<=<TEST>).+?(?=</TEST>)', replace_with, re.IGNORECASE)
with new 1 :
<Wellcollection>
<Well replace="false">
<TEST>BEFORE</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection>
<TEST>And BEFORE</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection>
<TEST>and before</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection>
and new 2 :
BEFORE	After
And BEFORE	After
I still had the error :
KeyError: 'and before'
TIA,
Best Regards,
guy038
Afterwards, I’ll do additional tests with your script and the regex S/R solution, below :
SEARCH
(?is)(?<=<TEST>)(.+?)(?=</TEST>.+^---.+^\1\t(.+?)$)|^---.+
REPLACE
?1\2
This S/R executed against this text :
<Wellcollection>
<Well replace="false">
<TEST>BEFORE</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection>
<TEST>And BEFORE</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection>
<TEST>and before</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection>
<TEST>And BEFORE</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection>
<TEST>BEFORE</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection>
---
BEFORE	After
And BEFORE	After 2
gives this changed text :
<Wellcollection>
<Well replace="false">
<TEST>After</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection>
<TEST>After 2</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection>
<TEST>After 2</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection>
<TEST>After 2</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection>
<TEST>After</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection>
I bet that your script can manage larger files than my regex solution !
-
-
Thank you very much for checking the script.
A solution concerning missing replacements is to replace with what was
found or what I later suggested, creating unique search strings from the
list of replacements.
The first approach might work better for large files, the second seemed
more reasonable at this point.
To ignore case, let’s change the replacement keys to lower case and check against the lower-cased matches, but replace with the defined replacement values. So a replacement dictionary would look like this :
replacements = {'täst': 'TEST'}
and would always replace either Täst or täst with TEST.
Let me think about it again.
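The lower-casing idea could be sketched roughly like this. It is only a sketch, not tested inside Notepad++ : the two strings table_text and text are hypothetical stand-ins for editor2.getText() and for the new 1 contents, and Python's re.sub stands in for editor1.rereplace. The .get() fallback, which leaves unlisted values untouched, is an extra assumption on top of the idea described above.

```python
import re

# Hypothetical stand-in for editor2.getText(): "search<TAB>replacement" lines.
table_text = "BEFORE\tAfter\nAnd BEFORE\tAfter 2\n"

# Store every key in lower case so the lookup no longer depends on casing.
replacements = {
    key.lower(): value
    for key, value in (line.split('\t') for line in table_text.splitlines() if line)
}

def replace_with(m):
    # Look up the lower-cased match; keep the matched text if it is not listed.
    return replacements.get(m.group().lower(), m.group())

# Hypothetical stand-in for the "new 1" contents; re.sub replaces editor1.rereplace here.
text = "<TEST>BEFORE</TEST>\n<TEST>and before</TEST>\n<TEST>ABCDE</TEST>\n"
result = re.sub(r'(?<=<TEST>).+?(?=</TEST>)', replace_with, text)
print(result)
```

Note that the regex itself needs no ignore-case flag here : it matches any tag contents, and only the dictionary lookup is case-sensitive, which is where the lower-casing helps.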
-
@Ekopalypse said in another replacement request for help:
A solution concerning missing replacements is to replace with what was
found or what I later suggested, creating unique search strings from the
list of replacements.
Not sure about the “later suggest” part, but this seems like a quick solution to the former:
def replace_with(m):
    try:
        r = replacements[m.group()]
    except KeyError:
        r = m.group()
    return r
Disclaimer: I didn’t actually try it. :-)
-
Yep, that is another solution as well.
-
Hello @alan-kilborn, @ekopalypse and All,
I’ve just tried your solution, Alan, and it worked like a charm ;-)) Clever : when an entry is missing in the dictionary, the script simply rewrites the key value
So, if you realize that you forgot some entries in the new 2 file ( so some replacements ! ), the easy workaround is :
- Add all these new entries as well as their corresponding replacement parts
- Re-run the script
Now regarding the case’s problem, we may run a regex S/R to get a single case for each expression, before running the script
For instance, instead of defining, in new 2, these 5 entries :
BEFORE	After
Before	After
before	After
BEFore	After
befORE	After
- Firstly, we would use the regex S/R, against our text in new 1 :
SEARCH
(?i)Before
REPLACE
\U$0
- Secondly, we would re-run the Python script, with only 1 entry, in new 2 :
BEFORE	After
Best Regards
guy038
-
-
@guy038 said in another replacement request for help:
Now regarding the case’s problem,
Yes, I didn’t see a quick mod to the original script, and I didn’t want to alter its “simple elegance” to hack in casing support. Of course, it’s possible to do, but the original was such a work of art… :-)
-
Hi, @alan-kilborn, @ekopalypse and All,
Like @ekopalypse, I must have been tired, yesterday, while posting !
Regarding the case’s problem, no need to re-run the script : Just run the following S/R, of course :
-
SEARCH
(?i)Before
-
REPLACE
After
So, for all the remaining expressions which must be replaced, whatever their case, better to use the general regex :
-
SEARCH
(?i)(Expression_1)|(Expression_2)|(Expression_3)....
-
REPLACE
(?1Replacement_1)(?2Replacement_2)(?3Replacement_3)...
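For comparison, the same "one group per expression" idea can be sketched in plain Python. This is only an illustration under assumptions : the pairs list and the sample text are made up, and re.sub from the standard library is used rather than the Notepad++ search engine, so the (?1...) conditional syntax is replaced by a lookup on the group that matched.

```python
import re

# Hypothetical expression / replacement pairs, in capture-group order.
pairs = [("Before", "After"), ("Guy", "Guy038")]

# Builds (Before)|(Guy), compiled case-insensitively; each expression is escaped.
pattern = re.compile(
    "|".join("(%s)" % re.escape(expr) for expr, _ in pairs),
    re.IGNORECASE,
)

def replace_with(m):
    # m.lastindex is the 1-based number of the group that matched.
    return pairs[m.lastindex - 1][1]

text = "BEFORE before guy Other"
result = pattern.sub(replace_with, text)
print(result)
```

With many expressions, building the pattern from a list like this avoids typing the long alternation and the matching replacement string by hand.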
BR
guy038
-
-
Hello, @ekopalypse and All,
After tests, I confirmed that any kind of regex, like below :
SEARCH
(?is)(?<=<TEST>)(.+?)(?=</TEST>.+^---.+^\1\t(.+?)$)|^---.+
REPLACE
?1\2
does not work properly ( one single match covering all the contents ! ) as soon as the scanned file contains more than about 200 lines :-(
Whereas your Python script, with @alan-kilborn’s modification, works nicely !
Given the initial XML file, below, in the new 1 tab :
<Wellcollection>
<Well replace="false">
<TEST>BEFORE</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection/>
<TEST>Guy</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection/>
<TEST>00000</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection/>
<TEST>00001</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection/>
<TEST>00002</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection/>
....
....
<TEST>99998</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection/>
<TEST>99999</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection/>
<TEST>BEFORE</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection/>
<TEST>Guy</TEST>
<WellType>Oil</WellType>
<ReserveStatusCollection/>
</Well>
</Wellcollection>
and the new 2 contents, in the secondary view :
BEFORE	After
Guy	Guy038
and the Python script :
from Npp import editor1, editor2

replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)

def replace_with(m):
    try:
        r = replacements[m.group()]
    except KeyError:
        r = m.group()
    return r

editor1.rereplace('(?<=<TEST>).+?(?=</TEST>)', replace_with)
=> On my old Win XP laptop, it took 26s to execute :
- 100,000 identical replacements ( from 00000 to 99999, as not in the dictionary )
- 2 replacements from BEFORE to After
- 2 replacements from Guy to Guy038
So, a script solution is definitely the right solution to adopt to solve this general task !
Best regards
guy038
However, in my example, it would be valuable to find out a method to avoid all these identical replacements ;-))
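One possible way to avoid the identical replacements, along the lines of the earlier idea of building the search string from the list of replacements, is to put only the listed keys in the regex, so that unlisted values never match at all. The sketch below is an assumption-laden illustration : it uses re.sub on a made-up text instead of editor1.rereplace, and whether a pattern escaped with re.escape is accepted unchanged by the Notepad++ regex engine has not been checked.

```python
import re

# Same dictionary as built from the "new 2" table in the script above.
replacements = {"BEFORE": "After", "Guy": "Guy038"}

# Alternation of the escaped keys: only listed values can match between the tags.
keys = "|".join(re.escape(k) for k in replacements)
pattern = r'(?<=<TEST>)(?:%s)(?=</TEST>)' % keys

calls = []  # records which matches actually reached the callback

def replace_with(m):
    calls.append(m.group())
    return replacements[m.group()]  # every match is a key, so no KeyError

# Hypothetical stand-in for the "new 1" contents.
text = "<TEST>BEFORE</TEST><TEST>00000</TEST><TEST>Guy</TEST>"
result = re.sub(pattern, replace_with, text)
print(result)
print(calls)
```

Here the callback runs only for BEFORE and Guy; the 00000 value is skipped by the regex itself instead of being replaced with itself.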
-
-
Hello friends
So one year later I had to do the same replacements, and found that the original script did not work anymore (same error as reported by @guy038). Something must have changed in Notepad++, Python or somewhere else, because the first time it worked fine.
Anyway, I updated the code as suggested by @Alan-Kilborn and it worked correctly.
Cheers everyone