Replacing number with another Incrementing #

raizkie19

Hello!
I’m having difficulty working out a regex again… Sorry, I’m a bit confused…
I would like to find all the numbers below that come right before the # (hash) sign…

100#
A sweet tasting tropical fruit made famous by its use in slapstick comedy and practical jokes.

101#
Clustered berries with smooth skin that can be fermented to make wine.

102#
An orange root that is supposedly good for your vision. Despite the Beta Carotene, kids don’t care much for it.

103#
A tuber that can be fried, baked, boiled mashed, even eaten.

I was told that the regex for searching numbers before the # is (?=.+#), but instead of adding digits in front of each number I would like to replace it with new incrementing numbers…

So instead of:
100#
101#
102#
103#

It would be like:
40001#
40002#
40003#

astrosofista

@raizkie19 said in Replacing number with another Incrementing #:

I was told that the regex for searching numbers before the # is (?=.+#)

Hi @raizkie19

It’s close but no joy. The quoted S/R regex wouldn’t just insert 40 before the given number, as needed. The lookahead matches before every digit of the number, not only at the start of the line, so instead you will get 401400401#, 401400402#, and so on.

But by inserting an anchor ^ before the quoted expression, the correct replacement is done:

Find: ^(?=.+#)
Replace: 40
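To see the difference outside Notepad++, here is a minimal sketch using Python's standard re module. That is an assumption on my part: it is not the Boost engine Notepad++ uses, although it should treat this particular pattern the same way. The (?m) flag is added because Python's ^ only matches at the start of each line when multiline mode is on.

```python
import re

sample = "101#\nClustered berries with smooth skin.\n"

# Unanchored: the zero-width lookahead succeeds at every position that is
# followed by at least one character and then '#', so "40" gets inserted
# before each of the three digits.
unanchored = re.sub(r"(?=.+#)", "40", sample)

# Anchored: '^' restricts the match to the start of a line.
anchored = re.sub(r"(?m)^(?=.+#)", "40", sample)

print(unanchored.splitlines()[0])
print(anchored.splitlines()[0])
```

The description line is left alone in both cases because it contains no # character.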

However, what you actually want is to renumber those entries. Before I can provide you with a solution, I would need to know how many replacements will be made.

Best Regards

raizkie19

@astrosofista
Hi!.. Sorry for the late reply…
Oh… thank you for explaining…
But the number line will be long… From 40001 to 70000… :(

Ekopalypse

@raizkie19

What is Regex: a language for describing a formula to find patterns in a text
What is Regex not: A calculator, a painting application, … or a replacement for a programming language.
Whenever it is necessary to calculate something based on its results,
then a programming language should be used.
However, this does not mean that it is not possible with a regex,
but creating a regex in such scenarios is usually complex and
only works for this particular case.
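
As a rough illustration of that point, here is a minimal sketch in plain Python (standard re module, run outside Notepad++). The names renumber and start are made up for this example, and it assumes every number to be replaced sits at the start of a line directly before the #.

```python
import re
from itertools import count

def renumber(text, start=40001):
    """Replace each line-leading number before '#' with start, start+1, ..."""
    counter = count(start)
    # The callback ignores the matched text and hands out the next number.
    return re.sub(r"(?m)^\d+(?=#)", lambda m: str(next(counter)), text)

sample = "100#\nA sweet tasting tropical fruit.\n\n101#\nClustered berries.\n"
print(renumber(sample))
```

The regex only finds the places to change; the counting is done by the surrounding program.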

Alan Kilborn

Inspired by Eko’s posted script HERE I have created the following Pythonscript (see below) to help accomplish the goal of this current thread.

Here’s how it would be applied in this case (2-phase solution):

Part 1 of 2:

Run the script (with your data file as the active tab in Notepad++).
Specify ^\d{3}(?=#) in the input box that appears. Press OK.
At this point all of the items to be incrementally replaced should be selected.

Now for part 2 of 2:

With the data still multiselected, press Alt+c to invoke the Column editor.
Tick Number to Insert.
Specify your Initial number (e.g. 40001) and Increase by number (e.g. 1).
Press OK and your text should be transformed as desired.

The script:

from Npp import editor, notepad

class T19343(object):

    def __init__(self):

        if editor.selectionIsRectangle(): return

        if editor.getSelections() == 1:

            if editor.getSelectionEmpty():
                scope_starting_pos = 0
                scope_ending_pos = editor.getTextLength()
            else:
                scope_starting_pos = editor.getSelectionStart()
                scope_ending_pos = editor.getSelectionEnd()

            while True:
                title = 'SELECT ALL matches in '
                title += 'ENTIRE FILE' if editor.getSelectionEmpty() else 'SELECTED TEXT'
                user_regex_input = notepad.prompt('REGEX to search for:                    (will result in multiple selections)\r\n(hint: if need to, start regex with \\Q to do a "normal" search)', title, '')
                if user_regex_input is None: return  # user Cancel
                try:
                    # test user_regex_input for validity
                    editor.research(user_regex_input, lambda _: None)
                except RuntimeError as r:
                    notepad.messageBox('Error in search regex:\r\n{0}\r\n{1}\r\n\r\nYou will be prompted to try again.'.format(user_regex_input, str(r)), '')
                else:
                    break

            match_list = []

            editor.research(user_regex_input, lambda m: match_list.append(m.span(0)), 0, scope_starting_pos, scope_ending_pos)

            #print(match_list)

            if len(match_list) >= 1:

                (first_match_anchor_pos, first_match_caret_pos) = match_list[0]

                # set the FIRST selection and bring it into user's view:
                editor.setSelection(first_match_caret_pos, first_match_anchor_pos)
                editor.scrollRange(first_match_anchor_pos, first_match_caret_pos)

                # remember top line of user's view, for later restore
                first_line = editor.getFirstVisibleLine()

                if len(match_list) >= 2:

                    editor.setMultipleSelection(True)  # in case not enabled in the Preferences

                    # add in all the remaining selections:
                    for (match_anchor_pos, match_caret_pos) in match_list[1 : ]:
                        editor.addSelection(match_caret_pos, match_anchor_pos)

                editor.setFirstVisibleLine(first_line)

        elif editor.getSelections() > 1:

            delimiter = notepad.prompt('Delimit individual copied selections with:\r\n(leave empty to delimit with line-endings)', 'Copy selections to clipboard', '')

            if delimiter is not None:

                if len(delimiter) == 0: delimiter = '\r\n'

                accum_list = []

                for sel_nbr in range(editor.getSelections()):
                    accum_list.append(editor.getTextRange(editor.getSelectionNStart(sel_nbr), editor.getSelectionNEnd(sel_nbr)))

                editor.copyText(delimiter.join(accum_list))

                #editor.setEmptySelection(editor.getCurrentPos())

                notepad.messageBox('Results are now in clipboard', '')

if __name__ == '__main__': T19343()

Alan Kilborn

I should mention two more things about the script:

The scope of the search can be limited by making (one) selection of a stream of text before running the script.
If the script is run a second time, i.e., while the multiselections from the first run are still active, it provides a mechanism to copy all of the matches (selections) to the clipboard.
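
For readers who want to try the same two ideas outside Notepad++, here is a small sketch using only Python's standard re module. It is not the script above and does not use the Npp API; find_spans and join_matches are hypothetical helper names chosen for this example.

```python
import re

def find_spans(text, pattern, scope_start=0, scope_end=None):
    """Return (start, end) offsets of every match inside the given scope."""
    if scope_end is None:
        scope_end = len(text)
    compiled = re.compile(pattern, re.MULTILINE)
    return [m.span(0) for m in compiled.finditer(text, scope_start, scope_end)]

def join_matches(text, spans, delimiter="\r\n"):
    """Join the matched substrings with a delimiter, ready to be copied."""
    return delimiter.join(text[start:end] for start, end in spans)

sample = "100#\nbanana\n101#\ngrapes\n102#\ncarrot\n"

# Limit the scope to everything after the first line (offset 5).
spans = find_spans(sample, r"^\d+(?=#)", scope_start=5)
print(spans)
print(join_matches(sample, spans, ", "))
```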

raizkie19

@Alan-Kilborn

Thank you for providing the detailed explanation…

astrosofista

@raizkie19 said in Replacing number with another Incrementing #:

But the number line will be long… From 40001 to 70000… :(

Hi @raizkie19, @Ekopalypse, @Alan-Kilborn and All

As @Ekopalypse said, regex can’t do math, but there are workarounds to simulate some common operations. So, as an alternative to @Alan-Kilborn’s script, and in case you are not allowed to install the Python plugin, let me suggest the following method, which is based on previous posts.

Please try the following:

Open a new tab in Notepad++ (Ctrl + N)
Type in a space and then press the Enter key
Open the Replace dialog (Ctrl + H)
Set the following fields as follows (Copy/Paste):

Find what: \R
Replace with: \r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\n

Only select the Wrap Around option and the Regular expression search mode
Click the Replace All button 5 times in order to get 32768 (= 8^5 = 2^15) lines, each holding a blank space
Close the Replace dialog

Press Ctrl + Home to go to the very beginning of the file.

Now, open the Column editor (Edit -> Column Editor…, or Alt + C)

Check the Number to Insert option
Type 100 in the Initial number field
Type 1 in the Increase by field
Leave the options Repeat and Leading zeros empty
The format is decimal, by default
Click on the OK button

Place the caret at the beginning of the file and then press End to put it after the blank space that follows 100.

Again, open the Column editor (Edit -> Column Editor…, or Alt + C)

Check the Number to Insert option
Type 40001 in the Initial number field
Type 1 in the Increase by field
Leave the options Repeat and Leading zeros empty
The format is decimal, by default
Click on the OK button

The replacement list is done. You should now have a list like this one:

100   40001
101   40002
102   40003
103   40004
104   40005
[...]
32864 72765
32865 72766
32866 72767
32867 72768
32868 72769
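
If you can run Python somewhere outside Notepad++, the same kind of two-column list could also be generated directly. This is only a sketch: build_lookup_lines and its parameters are made-up names, and the line count is an assumption you would adjust to your data.

```python
def build_lookup_lines(first_old=100, first_new=40001, line_count=32768):
    """Return 'old new' pairs, one per line, both numbers increasing by 1."""
    return [
        "{0} {1}".format(first_old + i, first_new + i)
        for i in range(line_count)
    ]

lines = build_lookup_lines()
print("\n".join(lines[:3]))
```

A single space separates the two columns here, which is enough for the search expression used further down.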

Copy the whole list (Ctrl + A)

Open a copy of the file to be changed.
Go to the last line.
Press Enter, then type === (three equal signs) and press Enter again.
Paste the list to get the following:

30000#
A tuber that can be fried, baked, boiled mashed, even eaten.
===
100   40001
101   40002
102   40003
[...]

Return to the first line of the file (Ctrl + Home)
Open the Replace dialog (Ctrl + H)
Deselect all options except the Regular expression search mode
Set the following fields as follows (Copy/Paste):

Search: (?s)^(\d+)(?=#\R.*?===.*?\1 +(\d+))
Replace: ?1$2

Now, the last action:
Click on the Replace All button
Close the Replace dialog.

That’s all. All the numbers should have been changed.
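
The lookup idea can be checked on a tiny sample with Python's standard re module. This is a sketch under two stated assumptions: Python's re has no \R, so \n stands in for it, and (?m) is added because Python's ^ does not match at line starts by default. The replacement is written as \2 instead of the Notepad++ syntax.

```python
import re

sample = (
    "100#\nbanana\n\n"
    "101#\ngrapes\n"
    "===\n"
    "100   40001\n"
    "101   40002\n"
)

# Group 1 is the old number. The lookahead scans past the === marker,
# finds the same number in the list, and captures its partner as group 2.
pattern = r"(?ms)^(\d+)(?=#\n.*?===.*?\1 +(\d+))"
result = re.sub(pattern, r"\2", sample)
print(result)
```

The list lines themselves are not touched, because their first number is followed by a space and not by #.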

Best Regards.

Alan Kilborn

@astrosofista

Now THAT is a serious workaround, for the truly desperate. :-)

astrosofista

@Alan-Kilborn said in Replacing number with another Incrementing #:

Now THAT is a serious workaround, for the truly desperate. :-)

Yep, it looks like quite a hard task, but it actually isn’t. It would be easier to deliver if the macro feature could record Column Editor outputs.

I will try your nice Python script later.

astrosofista

@Alan-Kilborn said in Replacing number with another Incrementing #:

Specify ^\d{3}(?=#) in the input box that appears. Press OK.

Tested and worked fine on sample text :)

However, I found a potential failure. OP told us that he wanted to replace 30,000 numbers, so if these grow by one, as the sample text suggests, then the regex you provided will fail to match four or five digit numbers.

If this is the case, then as you know, expressions like ^\d{3,5}(?=#) or ^\d+(?=#) will match all the numbers.
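
A quick way to compare the three expressions is the sketch below, which uses Python's standard re module as a stand-in for the Notepad++ engine (an assumption, though these quantifiers should behave the same way in both). The helper name leading_numbers is made up for this example.

```python
import re

def leading_numbers(pattern, text):
    """Return all matches, with ^ matching at the start of every line."""
    return re.findall(pattern, text, re.MULTILINE)

sample = "100#\n1000#\n30099#\n"

for pattern in (r"^\d{3}(?=#)", r"^\d{3,5}(?=#)", r"^\d+(?=#)"):
    print(pattern, leading_numbers(pattern, sample))
```

Only the first pattern misses the four and five digit numbers, because exactly three digits must be followed immediately by #.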

Alan Kilborn

@astrosofista

I found a potential failure.

True, but also kind of obvious. :-)
Plus, I should have used “e.g.” on that part of it, like I did in 2 other places.

But yes, the OP may not be versed in regex, and may not know how to specify a solution that covers all his cases, so thanks for the pick-me-up on that.

For me, it was more about publishing the script, which I had sitting unfinished in my N++ tabs ever since @Ekopalypse published his original script (which only selected multiple occurrences of static text).

astrosofista

@Alan-Kilborn said in Replacing number with another Incrementing #:

For me, it was more about publishing the script

Rest assured your script is a nice and useful improvement, since it allows users to make more complex selections with greater flexibility.

Thank you for sharing it with us :)

astrosofista

Hi @raizkie19, @Ekopalypse, @Alan-Kilborn, All

I have just realised that, given the regularity of the problem in question, it can also be stated as a mathematical operation, a simple addition. This approach, which perhaps was the one that @ekopalypse had in mind in his post above, gives rise to a third alternative, simpler and more direct than the two posted so far, since although it still requires the Python plugin and a regex, it doesn’t need any list created with the Column Editor.

The following script is a very minor adaptation of code posted by @ekopalypse to solve a similar problem, so all credit belongs to him:

from Npp import editor

def add_raizkie_number(m):
    # 39901 = 40001 - 100, so 100 becomes 40001, 101 becomes 40002, and so on
    return int(m.group(0)) + 39901

editor.rereplace('\d+(?=#)', add_raizkie_number)
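
For anyone who wants to try the idea without Notepad++, here is a rough equivalent using Python's standard re module. It is a sketch, not the PythonScript API: OFFSET and add_offset are names made up for this example, and unlike the callback above, the callback for re.sub has to return a string.

```python
import re

OFFSET = 40001 - 100  # 39901, maps the first old number to the first new one

def add_offset(m):
    # re.sub expects the replacement callback to return a string.
    return str(int(m.group(0)) + OFFSET)

sample = "100#\nbanana\n\n101#\ngrapes\n\n105#\ncarrot\n"
result = re.sub(r"\d+(?=#)", add_offset, sample)
print(result)
```

Note that adding a fixed offset keeps any gaps in the original numbering (105 becomes 40006 here), so it only gives a strictly consecutive series if the original numbers were consecutive.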

So, @raizkie19, I hope you can test it and see if it delivers the expected outcome. It worked fine here.

Have fun!

Alan Kilborn

@astrosofista said in Replacing number with another Incrementing #:

very minor adaptation of code posted by @ekopalypse to solve a similar problem
so all credit belongs to him

Actually, I think all of the credit goes back to the Pythonscript documentation!

Note, though: number should be int in the documentation!

Ekopalypse

@Alan-Kilborn @astrosofista

More specifically, I have modified the result of calling help(editor.rereplace) in the PythonScript console.

>>> help(editor.rereplace)
Help on method rereplace:

rereplace(...) method of Npp.Editor instance
    rereplace( (Editor)arg1, (object)searchRegex, (object)replace) -> None :
        Regular expression search and replace. Replaces 'searchRegex' with 'replace'.  ^ and $ by default match the starts and end of the document.  Use additional flags (re.MULTILINE) to treat ^ and $ per line.
        The 'replace' parameter can be a python function, that recieves an object similar to a re.Match object.
        So you can have a function like
           def myIncrement(m):
               return int(m.group(1)) + 1
        
        And call rereplace('([0-9]+)', myIncrement) and it will increment all the integers.

PeterJones

@Alan-Kilborn said in Replacing number with another Incrementing #:

Note, though: number should be int in the documentation!

That documentation bug was fixed in v1.5.3 in February.
v1.5.4 has been released since, with the fix to the getLanguageDesc() historical bug which we finally reported in #146.

astrosofista

@Alan-Kilborn

That may be so; mine was just a disclaimer, as I know almost nothing about Python :)