Replacing number with another Incrementing #



  • Hello!
    I’m having difficulty finding the right regex again… Sorry, I was a bit confused…
    I would like to search for all the numbers below that come before the # (‘hash’) sign…

    100#
    A sweet tasting tropical fruit made famous by its use in slapstick comedy and practical jokes.

    101#
    Clustered berries with smooth skin that can be fermented to make wine.

    102#
    An orange root that is supposedly good for your vision. Despite the Beta Carotene, kids don’t care much for it.

    103#
    A tuber that can be fried, baked, boiled mashed, even eaten.

    I was told that the regex for searching numbers before the # is (?=.+#), but instead of adding numbers at the beginning of each one, I would like to replace it with a new incrementing number…

    So instead of:
    100#
    101#
    102#
    103#

    It would be like:
    40001#
    40002#
    40003#



  • @raizkie19 said in Replacing number with another Incrementing #:

    I was told that regexe for searching numbers before the # is (?=.+#)

    Hi @raizkie19

    It’s close but no joy. The quoted S/R regex wouldn’t just insert 40 before the given number, as needed. Instead you will get 401400401#, 401400402#, and so on.

    But by inserting an anchor ^ before the quoted expression, the correct replacement is done:

    Find: ^(?=.+#)
    Replace: 40
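
    A minimal sketch of why the anchor matters, using Python's `re` module as a stand-in for Notepad++'s regex engine (an assumption; the two engines differ in places, but zero-width lookaheads behave the same way here). The sample text is shortened from the original post.

```python
import re

sample = "100#\nA sweet tasting tropical fruit.\n101#\nClustered berries.\n"

# Unanchored: the zero-width lookahead succeeds at every position that still
# has at least one character before the '#', so "40" is inserted three times.
unanchored = re.sub(r"(?=.+#)", "40", sample)

# Anchored: '^' (with re.MULTILINE) only allows a match at the start of a
# line, so "40" is inserted once per numbered line.
anchored = re.sub(r"^(?=.+#)", "40", sample, flags=re.MULTILINE)

print(unanchored.splitlines()[0])  # 401400400#
print(anchored.splitlines()[0])    # 40100#
```

    The description lines contain no #, so neither expression touches them.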

    However, you want to renumber those hashes. Before I can provide you with a solution, I would need to know how many replacements will be made.

    Best Regards



  • @astrosofista
    Hi!.. Sorry for the late reply…
    Oh… thank you for explaining…
    But the number line will be long… From 40001 to 70000… :(



  • @raizkie19

    What is Regex: a language for describing a formula to find patterns in a text
    What is Regex not: A calculator, a painting application, … or a replacement for a programming language.
    Whenever it is necessary to calculate something based on its results,
    then a programming language should be used.
    However, this does not mean that it is not possible with a regex,
    but creating a regex in such scenarios is usually complex and
    only works for this particular case.
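
    To illustrate the point, here is a small sketch in plain Python (outside Notepad++) where the regex only finds the numbers and the counting is left to the programming language. The function name `renumber` and the shortened sample text are made up for this example.

```python
import re
from itertools import count

def renumber(text, start=40001):
    """Replace each line-leading number before '#' with start, start+1, ..."""
    counter = count(start)
    # The regex only locates the numbers; the callback supplies the new value.
    return re.sub(r"^\d+(?=#)", lambda m: str(next(counter)), text,
                  flags=re.MULTILINE)

sample = "100#\nbanana\n\n101#\ngrapes\n\n102#\ncarrot\n"
print(renumber(sample))
```

    Because the counter ignores the old values, this would also work if the original numbers had gaps or were out of order.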



  • Inspired by Eko’s posted script HERE I have created the following Pythonscript (see below) to help accomplish the goal of this current thread.

    Here’s how it would be applied in this case (2-phase solution):

    Part 1 of 2:

    • Run the script (with your data file as the active tab in Notepad++).

    • Specify ^\d{3}(?=#) in the input box that appears. Press OK.

    • At this point all of the items to be incrementally replaced should be selected.

    Now for part 2 of 2:

    • With the data still multiselected, press Alt+c to invoke the Column editor.

    • Tick Number to Insert.

    • Specify your Initial number (e.g. 40001) and Increase by number (e.g. 1).

    • Press OK and your text should be transformed as desired.
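
    The two phases above can be sketched in plain Python, as a rough model only: phase 1 collects the match positions (what the script turns into multiple selections), and phase 2 overwrites each of them with a running number (what the Column editor is described as doing). The helper names `select_matches` and `fill_selections` are hypothetical and not part of Notepad++ or the Pythonscript plugin.

```python
import re

def select_matches(text, pattern):
    """Phase 1: collect (start, end) spans, like the script's multi-selection."""
    return [m.span(0) for m in re.finditer(pattern, text, flags=re.MULTILINE)]

def fill_selections(text, spans, initial=40001, increase_by=1):
    """Phase 2: overwrite each span with a running number."""
    pieces, last = [], 0
    for i, (start, end) in enumerate(spans):
        pieces.append(text[last:start])                # text before the match
        pieces.append(str(initial + i * increase_by))  # the new number
        last = end
    pieces.append(text[last:])                         # text after the last match
    return "".join(pieces)

sample = "100#\nbanana\n101#\ngrapes\n"
spans = select_matches(sample, r"^\d{3}(?=#)")
result = fill_selections(sample, spans)
print(result)
```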

    The script:

    from Npp import editor, notepad
    
    class T19343(object):
    
        def __init__(self):
    
            if editor.selectionIsRectangle(): return
    
            if editor.getSelections() == 1:
    
                if editor.getSelectionEmpty():
                    scope_starting_pos = 0
                    scope_ending_pos = editor.getTextLength()
                else:
                    scope_starting_pos = editor.getSelectionStart()
                    scope_ending_pos = editor.getSelectionEnd()
    
                while True:
                    title = 'SELECT ALL matches in '
                    title += 'ENTIRE FILE' if editor.getSelectionEmpty() else 'SELECTED TEXT'
                    user_regex_input = notepad.prompt('REGEX to search for:                    (will result in multiple selections)\r\n(hint: if need to, start regex with \\Q to do a "normal" search)', title, '')
                    if user_regex_input is None: return  # user Cancel
                    try:
                        # test user_regex_input for validity
                        editor.research(user_regex_input, lambda _: None)
                    except RuntimeError as r:
                        notepad.messageBox('Error in search regex:\r\n{0}\r\n{1}\r\n\r\nYou will be prompted to try again.'.format(user_regex_input, str(r)), '')
                    else:
                        break
    
                match_list = []
    
                editor.research(user_regex_input, lambda m: match_list.append(m.span(0)), 0, scope_starting_pos, scope_ending_pos)
    
                #print(match_list)
    
                if len(match_list) >= 1:
    
                    (first_match_anchor_pos, first_match_caret_pos) = match_list[0]
    
                    # set the FIRST selection and bring it into user's view:
                    editor.setSelection(first_match_caret_pos, first_match_anchor_pos)
                    editor.scrollRange(first_match_anchor_pos, first_match_caret_pos)
    
                    # remember top line of user's view, for later restore
                    first_line = editor.getFirstVisibleLine()
    
                    if len(match_list) >= 2:
    
                        editor.setMultipleSelection(True)  # in case not enabled in the Preferences
    
                        # add in all the remaining selections:
                        for (match_anchor_pos, match_caret_pos) in match_list[1 : ]:
                            editor.addSelection(match_caret_pos, match_anchor_pos)
    
                    editor.setFirstVisibleLine(first_line)
    
            elif editor.getSelections() > 1:
    
                delimiter = notepad.prompt('Delimit individual copied selections with:\r\n(leave empty to delimit with line-endings)', 'Copy selections to clipboard', '')
    
                if delimiter is not None:
    
                    if len(delimiter) == 0: delimiter = '\r\n'
    
                    accum_list = []
    
                    for sel_nbr in range(editor.getSelections()):
                        accum_list.append(editor.getTextRange(editor.getSelectionNStart(sel_nbr), editor.getSelectionNEnd(sel_nbr)))
    
                    editor.copyText(delimiter.join(accum_list))
    
                    #editor.setEmptySelection(editor.getCurrentPos())
    
                    notepad.messageBox('Results are now in clipboard', '')
    
    if __name__ == '__main__': T19343()
    


  • I should mention two more things about the script:

    • The scope of the search can be limited by making (one) selection of a stream of text before running the script

    • If the script is run a second time, i.e., while the multiselections from the first run are still active, it provides a mechanism to copy all of the matches (selections) to the clipboard.



  • @Alan-Kilborn

    Thank you for providing the detailed explanation…



  • @raizkie19 said in Replacing number with another Incrementing #:

    But the number line will be long… From 40001 to 70000… :(

    Hi @raizkie19, @Ekopalypse, @Alan-Kilborn and All

    As @Ekopalypse said, regex can’t do math, but there are workarounds to simulate some common operations. So, as an alternative to @Alan-Kilborn’s script, and in case you are not allowed to install the Python plugin, let me suggest the following method, which is based on previous posts.

    Please try the following:

    Open a new tab in Notepad++ (Ctrl + N)
    Type in a space and then press the Enter key
    Open the Replace dialog (Ctrl + H)
    Set the following fields as follows (Copy/Paste):

    Find what: \R
    Replace with: \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n

    Only select the Wrap Around option and the Regular expression search mode
    Click the Replace All button 5 times in order to get 32768 (= 2^15) lines, each containing a blank space
    Close the Replace dialog
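
    The line count can be checked with a quick simulation in plain Python (a sketch, assuming Windows line endings as in the Replace with field): every Replace All turns each line break into 8, so five passes give 8^5 = 2^15 line breaks.

```python
text = " \r\n"                       # one space, then Enter
replacement = "\r\n" + " \r\n" * 7   # the "Replace with" field: 8 line breaks

for _ in range(5):                   # five clicks on Replace All
    text = text.replace("\r\n", replacement)

line_breaks = text.count("\r\n")
space_lines = text.split("\r\n").count(" ")
print(line_breaks, space_lines)      # 32768 32768
```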

    Press Ctrl + Home to go to the very beginning of the file.

    Now, open the Column editor (Edit -> Column Editor…, or Alt + C)

    Check the Number to Insert option
    Type 100 in the Initial number field
    Type 1 in the Increase by field
    Leave the options Repeat and Leading zeros empty
    The format is decimal, by default
    Click on the OK button

    Place the caret at the beginning of the file and then press End to put it after the blank space that follows 100.

    Again, open the Column editor (Edit -> Column Editor…, or Alt + C)

    Check the Number to Insert option
    Type 40001 in the Initial number field
    Type 1 in the Increase by field
    Leave the options Repeat and Leading zeros empty
    The format is decimal, by default
    Click on the OK button

    The replacement list is done. You should have a list like this one:

    100   40001
    101   40002
    102   40003
    103   40004
    104   40005
    [...]
    32864 72765
    32865 72766
    32866 72767
    32867 72768
    32868 72769
    

    Copy the whole list (Ctrl + A)

    Open a copy of the file to be changed.
    Go to the last line.
    Press Enter, then type === (three equal signs) and press Enter again.
    Paste the list to get the following:

    30000#
    A tuber that can be fried, baked, boiled mashed, even eaten.
    ===
    100   40001
    101   40002
    102   40003
    [...]
    

    Return to the first line of the file (Ctrl + Home)
    Open the Replace dialog (Ctrl + H)
    Deselect all options except the Regular expression search mode
    Set the following fields as follows (Copy/Paste):

    Search: (?s)^(\d+)(?=#\R.*?===.*?\1 +(\d+))
    Replace: ?1$2

    Now, the last action:
    Click on the Replace All button
    Close the Replace dialog.

    That’s all. All the numbers should have been changed.
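
    For readers who want to see how the lookup works, here is a sketch of the same search in Python's `re` module on a tiny sample. Two adjustments are assumptions needed for Python and are not part of the Notepad++ recipe: `\R` is written as `\r?\n` (Python's `re` has no `\R`), and the `m` flag is added so that `^` matches at each line start, which Notepad++ does by default.

```python
import re

text = (
    "100#\nbanana\n\n"
    "101#\ngrapes\n"
    "===\n"
    "100   40001\n"
    "101   40002\n"
)

# Group 1 is the old number at the start of a line. The lookahead then skips
# ahead past '===', finds the same number (\1) in the list, and captures the
# new number next to it as group 2, all without consuming any text.
pattern = r"(?sm)^(\d+)(?=#\r?\n.*?===.*?\1 +(\d+))"

result = re.sub(pattern, r"\2", text)
print(result)
```

    The list entries after === are left alone, because there the number is followed by a space rather than by #.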

    Best Regards.



  • @astrosofista

    Now THAT is a serious workaround, for the truly desperate. :-)



  • @Alan-Kilborn said in Replacing number with another Incrementing #:

    Now THAT is a serious workaround, for the truly desperate. :-)

    Yep, it looks like quite a hard task, but it actually isn’t. It would be easier to deliver if the macro feature could record Column Editor outputs.

    I will try your nice Python script later.



  • @Alan-Kilborn said in Replacing number with another Incrementing #:

    Specify ^\d{3}(?=#) in the input box that appears. Press OK.

    Tested and worked fine on sample text :)

    However, I found a potential failure. OP told us that he wanted to replace 30,000 numbers, so if these grow by one, as the sample text suggests, then the regex you provided will fail to match four or five digit numbers.

    If this is the case, then as you know, expressions like ^\d{3,5}(?=#) or ^\d+(?=#) will match all the numbers.
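
    A quick check of the three expressions, sketched with Python's `re` module on made-up numbers of different lengths (the `re.MULTILINE` flag is added so that `^` matches at each line start):

```python
import re

numbers = "100#\n1000#\n30000#\n"

three_only = re.findall(r"^\d{3}(?=#)", numbers, flags=re.MULTILINE)
ranged = re.findall(r"^\d{3,5}(?=#)", numbers, flags=re.MULTILINE)
any_length = re.findall(r"^\d+(?=#)", numbers, flags=re.MULTILINE)

print(three_only)   # ['100']
print(ranged)       # ['100', '1000', '30000']
print(any_length)   # ['100', '1000', '30000']
```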



  • @astrosofista

    I found a potential failure.

    True, but also kind of obvious. :-)
    Plus, I should have used “e.g.” on that part of it, like I did in 2 other places.

    But yes, the OP may not be versed in regex, and may not know how to specify a solution that covers all his cases, so thanks for the pick-me-up on that.

    For me, it was more about publishing the script, which I had sitting unfinished in my N++ tabs ever since @Ekopalypse published his original script (which only selected multiple occurrences of static text).



  • @Alan-Kilborn said in Replacing number with another Incrementing #:

    For me, it was more about publishing the script

    Rest assured your script is a nice and useful improvement, since it allows users to make more complex selections with greater flexibility.

    Thank you for sharing it with us :)



  • Hi @raizkie19, @Ekopalypse, @Alan-Kilborn, All

    I have just realised that, given the regularity of the problem in question, it can also be stated as a mathematical operation, a simple addition. This approach, which perhaps was the one that @ekopalypse had in mind in his post above, gives rise to a third alternative, simpler and more direct than the two posted so far, since although it still requires the Python plugin and a regex, it doesn’t need any list created with the Column Editor.

    The following script is a very minor adaptation of code posted by @ekopalypse to solve a similar problem, so all credit belongs to him:

    from Npp import editor
    
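    def add_raizkie_number(m):
        # 100 + 39901 = 40001, so every number is shifted by the same offset
        return int(m.group(0)) + 39901
    
    # raw string, so that \d reaches the regex engine unchanged
    editor.rereplace(r'\d+(?=#)', add_raizkie_number)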
    

    So, @raizkie19, I hope you can test it and see if it delivers the expected outcome. It worked fine here.
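
    One caveat, sketched in plain Python with `re.sub` standing in for `editor.rereplace`: adding a fixed offset only gives consecutive new numbers if the old numbers are consecutive and start at 100. The sample below has a deliberate gap (105), which is an invented case, to show that the gap carries over.

```python
import re

OFFSET = 39901  # 100 + 39901 == 40001, the same offset as in the script above

def add_offset(m):
    # Shift the matched number by the fixed offset.
    return str(int(m.group(0)) + OFFSET)

sample = "100#\nbanana\n101#\ngrapes\n105#\npotato\n"
shifted = re.sub(r"\d+(?=#)", add_offset, sample)
print(shifted)
```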

    Have fun!



  • @astrosofista said in Replacing number with another Incrementing #:

    very minor adaptation of a code posted by @ekopalypse to solve a similar problem
    so all credits belongs to him

    Actually, I think all of the credit goes back to the Pythonscript documentation!

    [Image: screenshot of the Pythonscript documentation]

    Note, though: number should be int in the documentation!



  • @Alan-Kilborn @astrosofista

    More specifically, I have modified the result of calling help(editor.rereplace) in the PythonScript console.

    >>> help(editor.rereplace)
    Help on method rereplace:
    
    rereplace(...) method of Npp.Editor instance
        rereplace( (Editor)arg1, (object)searchRegex, (object)replace) -> None :
            Regular expression search and replace. Replaces 'searchRegex' with 'replace'.  ^ and $ by default match the starts and end of the document.  Use additional flags (re.MULTILINE) to treat ^ and $ per line.
            The 'replace' parameter can be a python function, that recieves an object similar to a re.Match object.
            So you can have a function like
               def myIncrement(m):
                   return int(m.group(1)) + 1
            
            And call rereplace('([0-9]+)', myIncrement) and it will increment all the integers.


  • @Alan-Kilborn said in Replacing number with another Incrementing #:

    Note, though: number should be int in the documentation!

    That documentation bug was fixed in v1.5.3 in February.
    v1.5.4 has been released since, with the fix to the getLanguageDesc() historical bug which we finally reported in #146.



  • @Alan-Kilborn

    That may be; mine was just a disclaimer, as I know almost nothing about Python :)

