Find and replace in XML file



  • dear community

    I don’t know regexp syntax, but I am sure it is possible to find and replace each occurrence of “text0” with:

    • “test1” for first occurrence
    • “test2” for 2nd occurrence
    • “test3” for 3rd occurrence
    • “test1” for the next one
    • “test2” …

    Do you have any clever idea to avoid manually changing, one by one, all the occurrences in my 15k-line file?

    thanks a lot for your help

    chagui



  • It’s a bit confusing: you want to find “text0”, but replace with “test1” then “test2” then “test3” then “test1” again. So we need to know a couple more things. Was the “x” in “text” vs the “s” in “test” a typo, or an intentional change? Did you intend to start over at “1”? If so, how many matches do we need to find before starting over? We need an actual set of rules that is clear and accurate.

    Regular expressions do not have an auto-increment feature. That’s the purview of a full programming language, and the regular expressions implemented in Notepad++ fall slightly shy of that mark. Really, something like Python or Perl would be better suited to doing this. Fortunately, Notepad++ has a python-based automation plugin called PythonScript, which could be used to implement such an algorithm in Python, and run it “live” in your Notepad++ window.

    If you have, or are willing to install, the PythonScript plugin, and if you clarified the above confusion, we could make a stab at doing those replacements in PythonScript… but before putting any effort into it, we need to be sure of your requirements.
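
    For illustration only, here is a minimal sketch of the general idea in plain Python (outside Notepad++). It assumes the replacements really are “test1”, “test2”, “test3” and then start over; the function name `cycle_numbered` and its parameters are made up for this example and are not part of any plugin.

```python
import itertools
import re


def cycle_numbered(text, search="text0", prefix="test", cycle_len=3):
    """Replace each literal `search` with prefix1, prefix2, ..., restarting after cycle_len."""
    counter = itertools.count()  # yields 0, 1, 2, ... one value per match

    def pick(match):
        # Called once per match; the modulo wraps the number back to 1.
        return "{}{}".format(prefix, next(counter) % cycle_len + 1)

    # re.escape() makes sure `search` is treated as literal text, not as a regex.
    return re.sub(re.escape(search), pick, text)


sample = "text0 a text0 b text0 c text0 d text0"
print(cycle_numbered(sample))
```

    The point is only that the replacement is a function called once per match, which is where the counting happens; a regex on its own has no such state.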



  • Indeed, the example in the PythonScript documentation for the editor.rereplace() function shows how to do something very similar. As Peter said, this isn’t going to be something you can do without a programming language, and PythonScript is a darned appropriate one for this task.



  • You’re right; I’d forgotten it was so close. And scrolling up to the editor.replace() example, I think @Guillaume-M-CHAZERANS will find an example that does almost exactly what was requested.



  • hey Peter

    Thanks for your reply. Sure, I can install the Python Plugin and learn to use it.

    To clarify my requirement: the change from “text0” to “test1”, “test2” or “test3” is intentional. As an example, I can use another sequence:

    Original text: “text0”

    First match, change “text0” to “iteration1”
    Second match, change “text0” to “other text”
    Third match, change “text0” to “3rd text chain”
    Fourth match, change “text0” to “iteration1”
    Fifth match, change “text0” to “other text”
    etc…

    I would like to make these text modifications throughout my whole XML file.

    I hope this helps you understand it better.

    thanks a lot
    Chagui



  • So, you want to always search for the same SEARCH string, and replace with the next in a loop of replacements?

    # encoding=utf-8
    """in response to https://notepad-plus-plus.org/community/topic/17024/
    
    This will take the next item from a list (er, immutable tuple, really)
    
    This is based on the `editor.replace()` example in pythonscript docs.
    """
    from Npp import *
    import re
    
    counter = 0
    search_for_string = 'text0'
    loopy_replacements = ('iteration1', 'other text', '3rd text chain')
    
    def forum_post17024_select_replacement(m):
        """this will select the next item from the loopy_replacements"""
        global counter
        global loopy_replacements
        l = len(loopy_replacements)
    
        chosen = loopy_replacements[counter % l]
        counter = counter + 1
    
        return chosen
    
    editor.replace( search_for_string , forum_post17024_select_replacement , re.IGNORECASE )
    

    With the source file

    This example text0 will be modified, so that text0 will
    be replaced every iteration, so that it will no
    longer be text0.  Instead, text0 will become the next
    of the three loopy_text values.
    

    It will result in

    This example iteration1 will be modified, so that other text will
    be replaced every iteration, so that it will no
    longer be 3rd text chain.  Instead, iteration1 will become the next
    of the three loopy_text values.
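
    If you want to try the selection logic before running it on a real document, a rough stand-in can be run in plain Python. This is only a sketch: `apply_loop` is a hypothetical helper that uses `re.sub()` in place of `editor.replace()`, and it assumes the search string is literal text.

```python
import re


def apply_loop(text, search, replacements, flags=0):
    """Stand-in for the editor.replace() call: pick the next replacement per match."""
    state = {"counter": 0}  # mutable holder instead of a module-level global

    def select(match):
        chosen = replacements[state["counter"] % len(replacements)]
        state["counter"] += 1
        return chosen

    # re.escape() treats `search` as literal text rather than as a regex.
    return re.sub(re.escape(search), select, text, flags=flags)


source = (
    "This example text0 will be modified, so that text0 will\n"
    "be replaced every iteration, so that it will no\n"
    "longer be text0.  Instead, text0 will become the next\n"
    "of the three loopy_text values.\n"
)
result = apply_loop(
    source, "text0", ("iteration1", "other text", "3rd text chain"), re.IGNORECASE
)
print(result)
```

    Printing `result` should give the same text as the “It will result in” block above.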
    


  • In case it wasn’t clear, you will have to edit search_for_string and loopy_replacements in the script.



  • @PeterJones

    Yes, sorry, I meant the example from replace(), not rereplace(), although the regex version of replace has a nice example too.

    @Guillaume-M-CHAZERANS

    So given your example, as best I can tell, the following script could do the replacement you described:

    counter = 0
    
    the_list = [
        'iteration1',
        'other text',
        '3rd text chain',
    ]
    
    def get_counter(m):
        global counter
        ret = the_list[counter]
        counter += 1
        if counter >= len(the_list): counter = 0
        return ret
    
    editor.replace('text0', get_counter)


  • Wow, I submitted my script, and it was right below Chagui’s last post, and I was looking it over to see if I needed to edit it (within 3 mins!) when, BLAMMO!, in pops Peter’s posting including a script, right in between! I thought I was going crazy. How does this happen?



  • Apparently, your browser didn’t do the behind-the-scenes auto-refresh. When I’m writing a response and someone else posts a reply before I’m done, it will usually load/update behind the editor; sometimes, I see the flash, or the bell highlight or the unread-highlight before I post. But I don’t know if it always happens.

    With a 20min difference in post-time (14:37Z vs 14:57Z), I am surprised mine hadn’t shown up before you posted.



  • @Alan-Kilborn

    in NodeBB, the reply of whoever started writing first is placed above the reply of someone who started writing later, even if that second user submitted their post first.

    you might have noticed it already, if you look at the time stamp (about x minutes ago) of threads with many replies at the same time, you’ll sometimes see, that a newer post is ordered above an older one, instead of below.

    note: what i usually do, if it’s not a chat but a thread to be answered, is to scroll up and have a look if someone is already writing an answer.
    if someone is writing, i don’t start typing at all, and wait until he/she finished, to avoid having two similar answers.



  • @Meta-Chuh

    Two similar answers are not a bad thing. I enjoyed reading Peter’s script solution to see how it was similar to and different from mine. :)



  • Hi, @alan-kilborn, @peterjones and All,

    After trying both of your scripts, here is my own version, using Peter’s modulo method in Alan’s script:

    the_list = [
        'iteration1',
        'other text',
        '3rd text chain',
    ]
    
    l = len(the_list)
    
    counter = l - 1
    
    def get_counter(m):
        global counter
        counter = ( counter + 1 ) % l
        return the_list[counter]
    
    editor.replace('text0', get_counter)
    

    I don’t know if getting the length of the list, in the variable l, is faster than calculating it each time there is a match of the ‘text0’ string.


    Now, Alan, I remember your nice script from some days ago, which I slightly modified:

    https://notepad-plus-plus.org/community/topic/16942/pythonscript-any-ready-pyscript-to-replace-one-huge-set-of-regex-phrases-with-others/23

    And I was wondering if we could merge these two scripts into a single script, using an SR_List.txt file like the one below:

    # ----- File SR_LIST.txt -----
    
    # Change 1st occurrence of 'text0' with 'ABC'
    # Change 2nd occurrence of 'text0' with ' DEF '
    # Change 3rd occurrence of 'text0' with '<GHI>'
    # Change 4th occurrence of 'text0' with 'ABC', and so on... :
    
    !text0!ABC! DEF !<GHI>!
    
    # STANDARD case  : change ANY occurrence of 'text1' with the sentence 'This is a test' :
    
    %text1%This is a test%
    
    # Change 1st occurrence of 'text2' with '(012)'
    # Change 2nd occurrence of 'text2' with '[345]'
    # Change 3rd occurrence of 'text2' with '{678}'
    # Change 4th occurrence of 'text2' with ' 901 '
    # Change 5th occurrence of 'text2' with '(012)', and so on... :
    
    =text2=\(012\)=[345]={678}= 901 =
    
    # Change 1st occurrence of 'text3' with ' Bravo!!'
    # Change 2nd occurrence of 'text3' with 'Yeah!!'
    # Change 3rd occurrence of 'text3' with ' Bravo!!', and so on... :
    
    @text3@ Bravo!!@Yeah!!@
    

    Just an idea, of course ! Above all, do not feel obliged to create such a script ;-))
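
    To make the idea a bit more concrete, here is a rough sketch of how such a file might be read and applied, in plain Python rather than PythonScript. Everything in it is an assumption: the first character of a rule line is taken as the delimiter, every rule line is assumed to end with that delimiter, lines starting with `#` are treated as comments, and both the search text and the replacements are taken literally (so an escape like `\(012\)` would not be interpreted). The names `parse_sr_line` and `apply_rules` are hypothetical.

```python
import itertools
import re


def parse_sr_line(line):
    """Return (search, [replacements]), or None for a blank or comment line."""
    line = line.rstrip("\r\n")
    if not line.strip() or line.startswith("#"):
        return None
    delim = line[0]                    # assumed: first character is the delimiter
    fields = line.split(delim)[1:-1]   # drop the empty pieces before/after the outer delimiters
    if len(fields) < 2:
        raise ValueError("need a search text and at least one replacement: %r" % line)
    return fields[0], fields[1:]


def apply_rules(text, lines):
    """Apply every rule in turn, cycling through its replacements match by match."""
    for line in lines:
        rule = parse_sr_line(line)
        if rule is None:
            continue
        search, replacements = rule
        loop = itertools.cycle(replacements)
        # Bind `loop` as a default argument so each rule keeps its own cycle.
        text = re.sub(re.escape(search), lambda m, loop=loop: next(loop), text)
    return text


rules = [
    "# ----- File SR_LIST.txt -----",
    "",
    "!text0!ABC! DEF !<GHI>!",
    "%text1%This is a test%",
]
print(apply_rules("text0,text0,text0,text0;text1;text1", rules))
```

    A rule with a single replacement, like the `text1` line, then simply behaves as the standard “replace every occurrence” case.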

    Cheers,

    guy038



  • Hi all,

    Ha, ha! So, friends, you didn’t notice my mistake: my script, as first posted, worked only with a 1-item or 3-item list :-((

    Of course, the initialization of the counter variable must be: counter = l - 1 (and not counter = 2)
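
    The arithmetic can be checked quickly in plain Python. This is just a sketch; `first_picks` is a made-up helper that replays the `counter = ( counter + 1 ) % l` step for the first few matches.

```python
def first_picks(list_len, initial_counter, n=4):
    """Indexes chosen for the first n matches with counter = (counter + 1) % list_len."""
    counter = initial_counter
    picks = []
    for _ in range(n):
        counter = (counter + 1) % list_len
        picks.append(counter)
    return picks


for list_len in (1, 2, 3, 4):
    # Compare the hard-coded start value 2 with the general start value list_len - 1.
    print(list_len, first_picks(list_len, 2), first_picks(list_len, list_len - 1))
```

    With a start value of 2, the first match gets index (2 + 1) % l, which is 0 only when l divides 3, that is, for l = 1 or l = 3. Starting at l - 1 gives (l - 1 + 1) % l = 0 for any list length.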

    I’ve updated my previous post, as well !

    BR

    guy038



  • @guy038

    the modulo method of Peter

    I often do the modulo method, for myself, but I thought the >= len() compare clearer for the noobs.

    if getting the length of the list, in the variable l, is faster than calculating it, each time

    I’m sure it is, marginally…or not so marginally if we are talking “big data”.
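
    If anyone wants to measure it rather than guess, something like the following could be run in a regular Python interpreter (not inside Notepad++). It is only a sketch with made-up function names, and the numbers will vary by machine and Python version. As far as I know, `len()` on a list is a constant-time lookup, so what is being compared is mostly the cost of one extra function call per match.

```python
import timeit

the_list = ["iteration1", "other text", "3rd text chain"]
cached_len = len(the_list)  # computed once, like the variable l in the script


def pick_cached(counter):
    """Use the length stored ahead of time."""
    return the_list[counter % cached_len]


def pick_recomputed(counter):
    """Call len() again for every match."""
    return the_list[counter % len(the_list)]


n = 200000  # number of simulated matches
t_cached = timeit.timeit(lambda: pick_cached(7), number=n)
t_recomputed = timeit.timeit(lambda: pick_recomputed(7), number=n)
print("cached: {:.4f}s  recomputed: {:.4f}s".format(t_cached, t_recomputed))
```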

    BTW, I was trying to keep my script maximally “in-flavor” with the PythonScript docs editor.replace() example, since we cited that earlier.

    merge these two scripts in a single script

    Surely. Go for it! :)



  • @guy038

    Ha, ha ! So, friends, you didn’t notice my mistake …

    until now, there was no need to test everything you write, as you have a guru status, and you’re known to be very, very thorough at testing everything before posting … but from now on … 😂😂😂

    just kidding, your mistakes are unnoticeably few, and rest assured, i’m producing far more mistakkes and thypos every month 😉👍



  • Hey guys

    I didn’t know only Python gurus were working on this issue. ;) Thanks a lot for all these examples!
    I will check them (beginning with the last one?) to resolve my issue.

    chagui

