Find and replace in XML file

Guillaume M. CHAZERANS

Dear community,

I don’t know regexp syntax, but I am sure it is possible to find and replace each occurrence of “text0” with:

“test1” for first occurrence
“test2” for 2nd occurrence
“test3” for 3rd occurrence
“test1” for the next one
“test2” …

Do you have any clever idea that would spare me from manually changing every occurrence, one by one, in my 15k-line file?

thanks a lot for your help

chagui

PeterJones

It’s a bit confusing: you want to find “text0”, but replace with “test1” then “test2” then “test3” then “test1” again. So we need to know a couple more things: was the “x” in “text” vs “s” in “test” a typo, or an intentional change? Did you intend to start over at “1”? If so, how many matches do we need to find before starting over at 1? We need an actual set of rules that is clear and accurate.

Regular expressions do not have an auto-increment feature. That’s the purview of a full programming language, and the regular expressions implemented in Notepad++ fall slightly shy of that mark. Really, something like Python or Perl would be better suited to doing this. Fortunately, Notepad++ has a python-based automation plugin called PythonScript, which could be used to implement such an algorithm in Python, and run it “live” in your Notepad++ window.

If you have, or are willing to install, the PythonScript plugin, and if you clarified the above confusion, we could make a stab at doing those replacements in PythonScript… but before putting any effort into it, we need to be sure of your requirements.
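
As a rough illustration of why a programming language helps here, the sketch below uses plain Python (outside Notepad++) and the standard `re.sub`, which accepts a function as the replacement. The names `replacements`, `state`, `next_replacement` and the sample text are made up for this illustration, and the cycle of three values is only one reading of the question.

```python
import re

# Assumed cycle, taken from the first reading of the question.
replacements = ('test1', 'test2', 'test3')
state = {'count': 0}  # mutable counter shared with the callback

def next_replacement(match):
    """Return the next replacement, wrapping around after the last one."""
    chosen = replacements[state['count'] % len(replacements)]
    state['count'] += 1
    return chosen

sample = 'text0 a text0 b text0 c text0'
# re.escape() makes the search string literal rather than a regex pattern.
result = re.sub(re.escape('text0'), next_replacement, sample)
print(result)  # test1 a test2 b test3 c test1
```

The function is called once per match, so the counter it keeps is what supplies the "auto-increment" that a regular expression alone lacks.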

Alan Kilborn

Indeed, the example in the Pythonscript documentation for the editor.rereplace() function shows how to do something very similar. As Peter said, this isn’t going to be something you can do without a programming language, and Pythonscript is a darned appropriate one for this task.

PeterJones

You’re right; I’d forgotten it was so close. And scrolling up to the editor.replace() example, I think @Guillaume-M-CHAZERANS will find an example that does almost exactly what was requested.

Guillaume M. CHAZERANS

hey Peter

Thanks for your reply. Sure, I can install the Python Plugin and learn to use it.

To clarify my requirement: the change from “text0” to “test1”, “test2” or “test3” is intentional. As an example, I can use another sequence:

Original text: “text0”

First match, change “text0” to “iteration1”
Second match, change “text0” to “other text”
Third match, change “text0” to “3rd text chain”
Fourth match, change “text0” to “iteration1”
Fifth match, change “text0” to “other text”
etc…

I would like to make these text modifications throughout my whole XML file.

I hope this helps you to understand it better.

thanks a lot
Chagui

PeterJones

So, you want to always search for the same SEARCH string, and replace with the next in a loop of replacements?

# encoding=utf-8
"""in response to https://notepad-plus-plus.org/community/topic/17024/

This will take the next item from a list (er, immutable tuple, really)

This is based on the `editor.replace()` example in pythonscript docs.
"""
from Npp import *
import re

counter = 0
search_for_string = 'text0'
loopy_replacements = ('iteration1', 'other text', '3rd text chain')

def forum_post17024_select_replacement(m):
    """this will select the next item from the loopy_replacements"""
    global counter  # reassigned below, so it must be declared global
    length = len(loopy_replacements)

    # the modulo wraps the index back to 0 after the last replacement
    chosen = loopy_replacements[counter % length]
    counter = counter + 1

    return chosen

# the callback is called once per match; re.IGNORECASE makes the search case-insensitive
editor.replace( search_for_string , forum_post17024_select_replacement , re.IGNORECASE )

With the source file

This example text0 will be modified, so that text0 will
be replaced every iteration, so that it will no
longer be text0.  Instead, text0 will become the next
of the three loopy_text values.

It will result in

This example iteration1 will be modified, so that other text will
be replaced every iteration, so that it will no
longer be 3rd text chain.  Instead, iteration1 will become the next
of the three loopy_text values.

PeterJones

In case it wasn’t clear, you will have to edit search_for_string and loopy_replacements in the script.
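
If editing globals feels awkward, one possible variation is to pass the search string and the replacements as parameters. The sketch below is plain Python using `re.sub`, so it can be tried outside Notepad++; `make_cycler` and `cycle_replace` are hypothetical helper names, not part of PythonScript. The callback returned by `make_cycler` has the same one-argument shape as the function passed to `editor.replace()` above, so it could presumably be used there too.

```python
import re

def make_cycler(replacements):
    """Build a callback that hands out the replacements in order, wrapping around."""
    position = [0]  # one-element list so the inner function can update it

    def pick(match):
        chosen = replacements[position[0] % len(replacements)]
        position[0] += 1
        return chosen

    return pick

def cycle_replace(text, search, replacements):
    """Replace each literal occurrence of search with the next replacement."""
    return re.sub(re.escape(search), make_cycler(replacements), text)

line = 'text0, text0, text0, text0, text0'
print(cycle_replace(line, 'text0', ('iteration1', 'other text', '3rd text chain')))
# iteration1, other text, 3rd text chain, iteration1, other text
```

Each call to `make_cycler` starts its own counter, so several search strings can be handled one after another without resetting anything by hand.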

Alan Kilborn

@PeterJones

Yes, sorry, I meant the example from replace(), not rereplace() although the regex version of the replace has a nice example too.

@Guillaume-M-CHAZERANS

So given your example, as best I can tell, the following script could do the replacement you described:

counter = 0

the_list = [
    'iteration1',
    'other text',
    '3rd text chain',
]

def get_counter(m):
    global counter
    ret = the_list[counter]
    counter += 1
    if counter >= len(the_list): counter = 0
    return ret

editor.replace('text0', get_counter)

Alan Kilborn

Wow, I submitted my script, and it was right below Chagui’s last post, and I was looking it over to see if I needed to edit it (within 3 mins!) when, BLAMMO!, in pops Peter’s posting including a script, right in between! I thought I was going crazy. How does this happen?

PeterJones

Apparently, your browser didn’t do the behind-the-scenes auto-refresh. When I’m writing a response and someone else posts a reply before I’m done, it will usually load/update behind the editor; sometimes, I see the flash, or the bell highlight or the unread-highlight before I post. But I don’t know if it always happens.

With a 20min difference in post-time (14:37Z vs 14:57Z), I am surprised mine hadn’t shown up before you posted.

Meta Chuh

@Alan-Kilborn

in nodebb, the reply of the user who started writing first is placed above the reply of a user who started writing later, even if the second user submitted his post first.

you might have noticed it already, if you look at the time stamp (about x minutes ago) of threads with many replies at the same time, you’ll sometimes see, that a newer post is ordered above an older one, instead of below.

note: what i usually do, if it’s not a chat but a thread to be answered, is to scroll up and have a look if someone is already writing an answer.
if someone is writing, i don’t start typing at all, and wait until he/she finished, to avoid having two similar answers.

Alan Kilborn

@Meta-Chuh

Two similar answers are not a bad thing. I enjoyed reading Peter’s script solution to see how it was similar to and different from mine. :)

guy038

Hi, @alan-kilborn, @peterjones and All,

After trying both of your scripts, here is my own version, using Peter’s modulo method in Alan’s script:

the_list = [
    'iteration1',
    'other text',
    '3rd text chain',
]

l = len(the_list)

counter = l - 1

def get_counter(m):
    global counter
    counter = ( counter + 1 ) % l
    return the_list[counter]

editor.replace('text0', get_counter)

I don’t know if storing the length of the list in the variable l is faster than calculating it each time the ‘text0’ string is matched?
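
One way to sidestep the length question altogether, offered only as a sketch, is the standard library's `itertools.cycle`, which repeats a sequence endlessly without any index arithmetic. The example is plain Python with `re.sub` so it runs outside Notepad++; `endless` and `get_next` are names invented here, and no claim is made about which variant is faster.

```python
import itertools
import re

the_list = [
    'iteration1',
    'other text',
    '3rd text chain',
]

# cycle() yields the items in order and starts over after the last one.
endless = itertools.cycle(the_list)

def get_next(m):
    """Return the next item of the endless cycle for each match."""
    return next(endless)

result = re.sub('text0', get_next, 'text0 text0 text0 text0')
print(result)  # iteration1 other text 3rd text chain iteration1
```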

Now, Alan, I remember your nice script from some days ago, which I slightly modified:

https://notepad-plus-plus.org/community/topic/16942/pythonscript-any-ready-pyscript-to-replace-one-huge-set-of-regex-phrases-with-others/23

And I was wondering if we could merge these two scripts into a single script, using an SR_List.txt file, like below:

# ----- File SR_LIST.txt -----

# Change 1st occurrence of 'text0' with 'ABC'
# Change 2nd occurrence of 'text0' with ' DEF '
# Change 3rd occurrence of 'text0' with '<GHI>'
# Change 4th occurrence of 'text0' with 'ABC', and so on... :

!text0!ABC! DEF !<GHI>!

# STANDARD case  : change ANY occurrence of 'text1' with the sentence 'This is a test' :

%text1%This is a test%

# Change 1st occurrence of 'text2' with '(012)'
# Change 2nd occurrence of 'text2' with '[345]'
# Change 3rd occurrence of 'text2' with '{678}'
# Change 4th occurrence of 'text2' with ' 901 '
# Change 5th occurrence of 'text2' with '(012)', and so on... :

=text2=\(012\)=[345]={678}= 901 =

# Change 1st occurrence of 'text3' with ' Bravo!!'
# Change 2nd occurrence of 'text3' with 'Yeah!!'
# Change 3rd occurrence of 'text3' with ' Bravo!!', and so on... :

@text3@ Bravo!!@Yeah!!@

Just an idea, of course ! Above all, do not feel obliged to create such a script ;-))
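
For what it's worth, here is a minimal sketch of how such rule lines might be parsed and applied. It rests on assumptions that are not confirmed anywhere in this thread: lines starting with `#` are comments, the first character of a rule line is its delimiter, every rule line ends with that delimiter, and both search and replacement strings are taken literally (so regex escapes like `\(` are not interpreted). `parse_rule` and `apply_rules` are hypothetical names, and the sketch works on plain strings rather than on the Notepad++ editor.

```python
import itertools
import re

def parse_rule(line):
    """Return (search, [replacements]) for a rule line, or None for blanks/comments."""
    line = line.rstrip('\r\n')
    if not line.strip() or line.startswith('#'):
        return None
    delimiter = line[0]
    fields = line.split(delimiter)
    # fields[0] is the empty text before the first delimiter and
    # fields[-1] is whatever follows the closing delimiter.
    search, replacements = fields[1], fields[2:-1]
    if not search or not replacements:
        raise ValueError('malformed rule line: %r' % line)
    return search, replacements

def apply_rules(text, rule_lines):
    """Apply every rule in order, cycling through each rule's replacements."""
    for line in rule_lines:
        rule = parse_rule(line)
        if rule is None:
            continue
        search, replacements = rule
        pool = itertools.cycle(replacements)
        text = re.sub(re.escape(search), lambda m, pool=pool: next(pool), text)
    return text

rules = [
    '# cycle through three replacements',
    '!text0!ABC! DEF !<GHI>!',
    '',
    '# standard case: a single replacement',
    '%text1%This is a test%',
]
print(apply_rules('text0|text1|text0|text0|text1|text0', rules))
# ABC|This is a test| DEF |<GHI>|This is a test|ABC
```

Because the rules run one after another, text inserted by an earlier rule could be matched by a later one; whether that is wanted would have to be decided for a real script.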

Cheers,

guy038

guy038

Hi all,

Ha, ha! So, friends, you didn’t notice my mistake: my script, as first posted, worked only with a list of 1 or 3 items :-((

Of course, the initialization of the counter variable must be: counter = l - 1 (and not counter = 2)
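
To see why, a small check in plain Python computes which index the first match would pick for several list lengths, using the same `(counter + 1) % l` step as the script; `first_index` is a helper name made up for this check.

```python
def first_index(initial_counter, length):
    """Index chosen for the first match, given the counter's starting value."""
    return (initial_counter + 1) % length

lengths = [1, 2, 3, 4, 5]
fixed_start = [first_index(2, n) for n in lengths]        # counter = 2
general_start = [first_index(n - 1, n) for n in lengths]  # counter = l - 1

print(fixed_start)    # [0, 1, 0, 3, 3]
print(general_start)  # [0, 0, 0, 0, 0]
```

With counter = 2, only the lists of 1 and 3 items begin at index 0; with counter = l - 1, every length does.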

I’ve updated my previous post, as well !

BR

guy038

Alan Kilborn

@guy038

“the modulo method of Peter”

I often do the modulo method, for myself, but I thought the >= len() compare clearer for the noobs.

“if getting the length of the list, in the variable l, is faster than calculating it, each time”

I’m sure it is, marginally…or not so marginally if we are talking “big data”.

BTW, I was trying to keep my script maximally “in-flavor” with the Pythonscript docs editor.replace() example, since we cited that earlier.

“merge these two scripts in a single script”

Surely. Go for it! :)

Meta Chuh

@guy038

“Ha, ha ! So, friends, you didn’t notice my mistake …”

until now, there was no need to test everything you write, as you have a guru status, and you’re known to be very, very thorough at testing everything before posting … but from now on … 😂😂😂

just kidding, your mistakes are unnoticeably few, and rest assured, i’m producing far more mistakkes and thypos every month 😉👍

Guillaume M. CHAZERANS

Hey guys

I didn’t know only Python gurus were working on this issue. ;) Thanks a lot for all these examples!
I’ll check them (beginning with the last one?) to resolve my issue.

chagui