another replacement request for help

Ekopalypse

Is the oil column really needed?
I mean, does it need to be checked, or can the script always look for the company and then replace whatever is in WellType with the value from the Gas column?

Carlos J. Segnini R.

Edit:
Thinking about it, no, there is no need for that column.

Ekopalypse

@Carlos-J-Segnini-R

so you can get rid of the second column and the only thing we need to change is the find expression, I guess.

Ekopalypse

@Carlos-J-Segnini-R

from Npp import editor1, editor2

replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)

def replace_with(m):
    return replacements[m.group(1)]

# search_for = '(?<=<UWI>).+?(?=</UWI>)'
search_for = '<UWI>(.+)</UWI>\R\h*<WellType>\K.+?(?=</WellType>)'
editor1.rereplace(search_for, replace_with)

Ekopalypse

@Carlos-J-Segnini-R

By the way, if you are interested in how these regex searches work, see here for a pretty good description.

Carlos J. Segnini R.

Thanks again for your help.
For some reason this time it isn’t working.
I will try and fix it.

Ekopalypse

@Carlos-J-Segnini-R

Can you open the PythonScript console (plugin->PythonScript->Show Console) to see if there is an error?
The replacement file has company and gas/oil tab separated, correct?

Carlos J. Segnini R.

@Ekopalypse here is the log.
I think it is because not all the wells need to be changed, so the first one in the file (Aery #B1H) does not appear in the second file, and the script stops there. I will try leaving all wells in, even those that don't need to be changed.

Traceback (most recent call last):
  File "C:\Users\AppData\Roaming\Notepad++\plugins\Config\PythonScript\scripts\find_replace.py", line 10, in <module>
    editor1.rereplace(search_for, replace_with)
  File "C:\Users\AppData\Roaming\Notepad++\plugins\Config\PythonScript\scripts\find_replace.py", line 6, in replace_with
    return replacements[m.group(1)]
KeyError: 'Aery #B1H'

And yes, the other file is separated by tabs.

Ekopalypse

@Carlos-J-Segnini-R

Yes, that makes sense. Then we need to take another approach where we
create the searches based on the second list. Gimme a minute.

Ekopalypse

@Carlos-J-Segnini-R

from Npp import editor1, editor2

replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)

def replace_with(m):
    return replacements[m.group(1)]

# search_for = '(?<=<UWI>).+?(?=</UWI>)'
for company in replacements.keys():
    search_for = '<UWI>({})</UWI>\R\h*<WellType>\K.+?(?=</WellType>)'.format(company)
    editor1.rereplace(search_for, replace_with)

Ekopalypse

@Carlos-J-Segnini-R

Depending on the file size, a faster solution might be this:

from Npp import editor1, editor2

replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)

def replace_with(m):
    # Use the table value when the UWI is listed; otherwise keep the matched text.
    return replacements.get(m.group(1)) or m.group()

# search_for = '(?<=<UWI>).+?(?=</UWI>)'
search_for = '<UWI>(.+)</UWI>\R\h*<WellType>\K.+?(?=</WellType>)'
editor1.rereplace(search_for, replace_with)

This scans the text only once; if the company found is not in the
replacements dictionary, the match is replaced with itself, so it stays unchanged.
Btw. it's midnight here - good night.
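For anyone who wants to try the single-pass idea outside Notepad++, here is a minimal sketch using Python's own `re` module. It is not the PythonScript API: `re` has no `\K`, `\R` or `\h`, so the sketch captures the prefix and writes it back instead. The sample data and the names `xml_text` and `table_text` are made up for illustration.

```python
import re

# Hypothetical sample data standing in for the two editor views.
xml_text = (
    "<Well>\n"
    "\t<UWI>Aery #B1H</UWI>\n"
    "\t<WellType>Oil</WellType>\n"
    "</Well>\n"
    "<Well>\n"
    "\t<UWI>Baker 2</UWI>\n"
    "\t<WellType>Oil</WellType>\n"
    "</Well>\n"
)
table_text = "Baker 2\tGas\n"

# Same table-building expression as in the script above.
replacements = dict(line.split("\t") for line in table_text.splitlines() if line)

# Group 1 is the prefix to keep, group 2 the UWI, group 3 the current WellType.
pattern = re.compile(r"(<UWI>(.+)</UWI>\r?\n[ \t]*<WellType>)(.+?)(?=</WellType>)")

def replace_with(m):
    # Keep the current WellType when the UWI has no entry in the table.
    new_type = replacements.get(m.group(2)) or m.group(3)
    return m.group(1) + new_type

result = pattern.sub(replace_with, xml_text)
print(result)
```

Only the well listed in the table changes; the unlisted one passes through untouched instead of raising a KeyError.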

Carlos J. Segnini R.

The first one worked!
Thanks so much for your help, I owe you a beer!
Have a good night

Carlos J. Segnini R.

By the way, I went back and edited the script to run a second time looking for another instance below.

I’m sure there are better ways to jump three lines, but it’s late and I needed to finish. It worked! haha

from Npp import editor1, editor2

replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)

def replace_with(m):
    return replacements[m.group(1)]

# search_for = '(?<=<UWI>).+?(?=</UWI>)'
for company in replacements.keys():
    search_for = '<UWI>({})</UWI>\R\h*.*\R\h*.*\R\h*.*\R\h*<PrimaryProduct>\K.+?(?=</PrimaryProduct>)'.format(company)
    editor1.rereplace(search_for, replace_with)

Ekopalypse

@Carlos-J-Segnini-R

I’m sure there are better ways to jump three lines,

In the end, what really counts is whether it does what it is supposed to do, right?
One of several alternatives would be, for example:

<UWI>(.*)</UWI>(?s)(\R.+?)<PrimaryProduct>\K.+?(?=</PrimaryProduct>)

but whether this is better or worse depends on the real data.
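To illustrate why the data matters, here is a small sketch in plain Python `re` (not Boost, so `\R` becomes `\n` and `\K` becomes a capture group). The sample wells are invented. When a well has no PrimaryProduct tag, the lazy multi-line skip keeps going and pairs that well's UWI with the next well's tag.

```python
import re

# Hypothetical data: "Well A" has no PrimaryProduct element, "Well B" does.
data = (
    "<Well>\n"
    "\t<UWI>Well A</UWI>\n"
    "\t<WellType>Oil</WellType>\n"
    "</Well>\n"
    "<Well>\n"
    "\t<UWI>Well B</UWI>\n"
    "\t<WellType>Gas</WellType>\n"
    "\t<PrimaryProduct>Gas</PrimaryProduct>\n"
    "</Well>\n"
)

# (?s:...) lets the lazy skip cross line breaks, like (?s) in the expression above.
pattern = re.compile(
    r"<UWI>(.*)</UWI>(?s:\n.+?)<PrimaryProduct>(.+?)(?=</PrimaryProduct>)"
)

m = pattern.search(data)
uwi, product = m.group(1), m.group(2)
print(uwi, product)
```

Here the first match reports "Well A" together with the PrimaryProduct that belongs to "Well B". The fixed three-line version cannot drift like this, but it breaks as soon as the number of lines in between changes.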

Ekopalypse

@Ekopalypse said in another replacement request for help:

<UWI>(.*)</UWI>(?s)(\R.+?)<PrimaryProduct>\K.+?(?=</PrimaryProduct>)

to be used in the script it would be

'<UWI>({})</UWI>(?s)(\R.+?)<PrimaryProduct>\K.+?(?=</PrimaryProduct>)'
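One caveat when formatting names into the pattern: the text is inserted verbatim, so regex metacharacters in a well name (parentheses, `+`, `.` and so on) would be interpreted as regex syntax. A minimal sketch of the effect, using Python's `re` and an invented well name; whether Boost accepts everything `re.escape()` produces is an assumption that should be checked in Notepad++ itself.

```python
import re

# Hypothetical well name containing regex metacharacters.
company = "Smith (A+B) #1"
line = "<UWI>Smith (A+B) #1</UWI>"

template = "<UWI>({})</UWI>"

# Inserted verbatim, "(A+B)" is read as a group with a quantifier, not literal text.
unescaped = re.search(template.format(company), line)

# re.escape() backslash-escapes the metacharacters so the name matches literally.
escaped = re.search(template.format(re.escape(company)), line)

print(unescaped)         # no match
print(escaped.group(1))  # the literal name, usable as the dictionary key
```

Because `m.group(1)` still returns the literal text, the lookup in `replacements` works unchanged.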

Carlos J. Segnini R.

Thanks @Ekopalypse I still owe you a beer
I’m sure I’ll come and ask questions again, so maybe more than a beer haha

guy038

Hi, @ekopalypse and All,

Just back from some holidays in Brittany ! Good day to everybody ;;-))

Looking back at the first script of this discussion, it solves an interesting and general task :

Given a first file, where some words / expressions must be replaced
Given a second file containing all these words / expressions and their corresponding replacements, with a tabulation as a separator

This script executes, in the first file, all at once, the different replacements from the second file’s table ;-))
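The table-reading step can be sketched on its own. The one-liner `dict(line.split('\t') ...)` raises a rather cryptic error when a line has no tab or more than one, so this hypothetical helper `parse_table` (not part of the original script) splits on the first tab only and reports the offending line.

```python
def parse_table(text):
    """Build {search: replacement} from tab-separated lines."""
    table = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        # partition() splits on the first tab only; later tabs stay in the value.
        key, sep, value = line.partition("\t")
        if not sep:
            raise ValueError("line {}: no tab separator: {!r}".format(number, line))
        table[key] = value
    return table


table = parse_table("BEFORE\tAfter\n\nAnd BEFORE\tAfter 2\n")
print(table)
```

In the script, the text would come from `editor2.getText()` instead of a literal string.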

Now, considering the slightly modified script, where the range of chars searched is contained in tags <TEST>......</TEST>

from Npp import editor1, editor2

replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)

def replace_with(m):
    return replacements[m.group()]

editor1.rereplace('(?<=<TEST>).+?(?=</TEST>)', replace_with)

May I ask you two questions :

When the file to be modified ( new 1 ) contains, in a tag <TEST>......</TEST>, a value not listed in the new 2 file, the script stops, no replacement occurs and the console message KeyError: 'ABCDE' is displayed

Example, with new 1 :

<Wellcollection>
	<Well replace="false">
		<TEST>BEFORE</TEST>
		<WellType>Oil</WellType>
		<ReserveStatusCollection>
		<TEST>And BEFORE</TEST>
		<WellType>Oil</WellType>
		<ReserveStatusCollection>
		<TEST>ABCDE</TEST>
		<WellType>Oil</WellType>
		<ReserveStatusCollection>

and new 2

BEFORE	After
And BEFORE	After

Could it be possible to avoid errors for all key values not listed in new 2 ?

Secondly, how to ignore the case of the key values ? I mean, instead of the obvious method, below, in new 2

BEFORE	After
And BEFORE	After
and before	After

I tried this version, without success :

from Npp import editor1, editor2
import re

replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)

def replace_with(m):
    return replacements[m.group()]

editor1.rereplace('(?<=<TEST>).+?(?=</TEST>)', replace_with, re.IGNORECASE)

with new 1 :

<Wellcollection>
	<Well replace="false">
		<TEST>BEFORE</TEST>
		<WellType>Oil</WellType>
		<ReserveStatusCollection>
		<TEST>And BEFORE</TEST>
		<WellType>Oil</WellType>
		<ReserveStatusCollection>
		<TEST>and before</TEST>
		<WellType>Oil</WellType>
		<ReserveStatusCollection>

and new 2 :

BEFORE	After
And BEFORE	After

I still had the error :

KeyError: 'and before'

TIA,

Best Regards,

guy038

Afterwards, I’ll do additional tests with your script and the regex S/R solution, below :

SEARCH (?is)(?<=<TEST>)(.+?)(?=</TEST>.+^---.+^\1\t(.+?)$)|^---.+

REPLACE ?1\2

This S/R executed against this text :

<Wellcollection>
	<Well replace="false">
		<TEST>BEFORE</TEST>
		<WellType>Oil</WellType>
		<ReserveStatusCollection>
		<TEST>And BEFORE</TEST>
		<WellType>Oil</WellType>
		<ReserveStatusCollection>
		<TEST>and before</TEST>
		<WellType>Oil</WellType>
		<ReserveStatusCollection>
		<TEST>And BEFORE</TEST>
		<WellType>Oil</WellType>
		<ReserveStatusCollection>
		<TEST>BEFORE</TEST>
		<WellType>Oil</WellType>
		<ReserveStatusCollection>
---
BEFORE	After
And BEFORE	After 2

gives this changed text :

<Wellcollection>
	<Well replace="false">
		<TEST>After</TEST>
		<WellType>Oil</WellType>
		<ReserveStatusCollection>
		<TEST>After 2</TEST>
		<WellType>Oil</WellType>
		<ReserveStatusCollection>
		<TEST>After 2</TEST>
		<WellType>Oil</WellType>
		<ReserveStatusCollection>
		<TEST>After 2</TEST>
		<WellType>Oil</WellType>
		<ReserveStatusCollection>
		<TEST>After</TEST>
		<WellType>Oil</WellType>
		<ReserveStatusCollection>

I bet that your script can manage larger files than my regex solution !
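For readers without Notepad++ at hand, the same idea can be approximated with Python's `re`. This is only a sketch: Python has no conditional replacement like `?1\2`, so a function plays that role, and `^` / `$` need `re.MULTILINE` to match at line boundaries. The sample text is a shortened, invented variant of the one above, with an unlisted value added.

```python
import re

text = (
    "<TEST>BEFORE</TEST>\n"
    "<TEST>and before</TEST>\n"
    "<TEST>ABCDE</TEST>\n"
    "---\n"
    "BEFORE\tAfter\n"
    "And BEFORE\tAfter 2"
)

# Same expression as the SEARCH field, with the (?is) flags passed separately.
pattern = re.compile(
    r"(?<=<TEST>)(.+?)(?=</TEST>.+^---.+^\1\t(.+?)$)|^---.+",
    re.IGNORECASE | re.DOTALL | re.MULTILINE,
)

def conditional(m):
    # Stand-in for ?1\2 : table value if group 1 matched, else delete the table.
    return m.group(2) if m.group(1) is not None else ""

result = pattern.sub(conditional, text)
print(result)
```

The unlisted value ABCDE is simply left alone, because the look-ahead finds no table line for it.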

Ekopalypse

@guy038

thank you very much for checking the script.
A solution concerning missing replacements is to replace with what was
found or what I later suggested, creating unique search strings from the
list of replacements.
The first approach might work better for large files, the second seemed
more reasonable at this point.

To ignore case, let’s change the replacement keys to lowercase
and look up the lowercased match, but still replace with the defined replacement values. So a replacement dictionary would look like this

    replacements = {'täst': 'TEST'}

and would replace either Täst or täst with TEST always.

Let me think about it again.
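A minimal sketch of that lowercase-key idea, written against Python's `re` rather than the editor object so it can be run anywhere. The sample strings are invented, and non-ASCII keys such as täst may need extra Unicode handling under Python 2, which this sketch does not address.

```python
import re

table_text = "BEFORE\tAfter\nAnd BEFORE\tAfter 2\n"

# Keys are lowercased once; values keep their original spelling.
replacements = {
    key.lower(): value
    for key, value in (line.split("\t") for line in table_text.splitlines() if line)
}

def replace_with(m):
    # Look up the lowercased match; keep the original text if it is not listed.
    return replacements.get(m.group().lower(), m.group())

text = "<TEST>BEFORE</TEST>\n<TEST>and before</TEST>\n<TEST>ABCDE</TEST>\n"
result = re.sub(r"(?<=<TEST>).+?(?=</TEST>)", replace_with, text, flags=re.IGNORECASE)
print(result)
```

In the script, the same `replace_with` would be handed to `editor1.rereplace`, as in the versions above.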

Alan Kilborn

@Ekopalypse said in another replacement request for help:

A solution concerning missing replacements is to replace with what was
found or what I later suggested, creating unique search strings from the
list of replacements.

Not sure about the “later suggested” part, but this seems like a quick solution to the former:

def replace_with(m):
    try:
        r = replacements[m.group()]
    except KeyError:
        r = m.group()
    return r

Disclaimer: I didn’t actually try it. :-)