Python Script editor.replace Bug? concerning the characters '(' and ')'

Robert Jablko

Hi, I like padding around my brackets: (a)->( a )

So I am using a Python script that works in plain Python, but in PythonScript for Notepad++ it does something weird with ‘(’ and ‘)’ when I call editor.replace( ): they both disappear, while it works for ‘[’, ‘{’, ‘}’, ‘]’.

I am very confused. Please help me.

THIS IS THE SCRIPT I AM USING:
text=editor.getText()

brackets_open='([{'
brackets_close=')]}'

''' adds padding to brackets, when there is none '''
for index, character in enumerate( text ): #catch beginning and end issues
    print( character )
    if index==0: #first character in text
        trailing_character=" "
        if len( text )>1:
            following_character=text[ index+1 ]
        else:
            following_character=" "
    elif index==len( text )-1: #last character text
        trailing_character=text[ index-1 ]
        following_character=" "
    else:
        trailing_character=text[ index-1 ]
        following_character=text[ index+1 ]

    if character in brackets_open and following_character!=' ':
        editor.replace( character, character+' ' )
    elif character in brackets_close and trailing_character!=' ':
        editor.replace( character, ' '+character )

Alan Kilborn

@Robert-Jablko

When wanting a literal ( or ) in replacement text, it must be escaped, i.e., \( or \).

Robert Jablko

@Alan-Kilborn

Thank you so much. It works.

If anybody is interested, here is the full python script. It’s NOT python standard, but it really helps me read code.

text=editor.getText()
################################################################################
''' compress operators, removes trailing and following space from operators '''
operators=[ '+', '-', '*', ':', '/', '//', '%', '**', ]
assigners=[ '=', '+=', '-=', '*=', ':=', ]               
comparators=[ '==', '!=', '<', '>', '<=', '>=', ]                 
for character in operators+assigners+comparators:
    if ' '+character+' ' in text:
        editor.replace( ' '+character+' ', character )
################################################################################
''' padd brackets, adds space to brackets right of open, left of close '''
brackets_open='([{'
brackets_close=')]}'
for index, character in enumerate( text ): 
    
    ''' searching edge cases '''
    if index==0: # first character
        trailing_character=" "
        if len( text )>1:
            following_character=text[ index+1 ]
        else:
            following_character=" "
    elif index==len( text )-1:  # last character
        trailing_character=text[ index-1 ] 
        following_character=" "
    else:
        trailing_character=text[ index-1 ]
        following_character=text[ index+1 ]
    
    ''' replacing '''
    if character in brackets_open and following_character!=' ':
        if character=='(':
            editor.replace( character, '\('+' ' ) # '(' needs escape '\('
        else:
            editor.replace( character, character+' ' )
    elif character in brackets_close and trailing_character!=' ':
        if character==')':  
            editor.replace( character, ' '+'\)' ) # ')' needs escape '\)'
        else:
            editor.replace(  character, ' '+character )
################################################################################

Robert Jablko

@Robert-Jablko

I was a little too fast. The operators work fine, but the replacement of ‘(’, while working now, pads every ‘(’ each time. My original script had functions that took the text, worked through it character by character and returned the final string, but that doesn’t work in PythonScript. I have to find another way.

Mark Olson

You could replace your entire script with a sequence of regex-replaces (documentation here), which could then be saved as a macro and bound to a keyboard shortcut.

Regex-replace 1 (remove space after and before operators):

1. Open the find/replace form (Edit->Replace... from the Notepad++ main menu, or Ctrl+H with default keybindings).
2. Set the following options:
   - Search mode set to Regular expressions
   - Wrap around checked
   - Find what: \h*([-+*<>:=]=?|!=|//?|%|\*\*)\h*
   - Replace with: ${1}
3. Hit Replace all.

Regex-replace 2 (add exactly one space before and after any of {}[]())
The same as the previous steps, except in step 2,
Find what: (?<!\s)[(){}\[\]](?!\s)
Replace with: \x20${0}\x20 (note that \x20 is a way of rendering the literal space character so that it’s visible)

I haven’t tested either of these regexes, but I’m pretty sure they should do what you want.

By the way, your script’s runtime will be proportional to the square of the length of the text (i.e., will be extremely slow if you run it on a long file), because it does a global operation (editor.replace) for every character in the text. Whenever you’re writing a script in any language, always ask yourself if there’s a way to do a series of local modifications rather than a global modification. And remember: strings are immutable, so you can’t perform local modifications to a string in Python. That’s a lot of the reason why regular expressions are so handy: they avoid having to do potentially very slow and clunky changes to strings.
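To make the point about global versus local modifications concrete, here is a minimal sketch in plain Python. It assumes that str.replace behaves like a document-wide editor.replace for this purpose; the function names pad_globally and pad_locally and the sample text are made up for illustration.

```python
def pad_globally(text):
    """Mimics the original loop: one document-wide replace per unpadded '('."""
    snapshot = text  # like editor.getText(): never updated during the loop
    for index, character in enumerate(snapshot):
        following = snapshot[index + 1] if index + 1 < len(snapshot) else " "
        if character == "(" and following != " ":
            # Touches EVERY '(' in the text, not just the one at this index.
            text = text.replace("(", "( ")
    return text


def pad_locally(text):
    """Single pass: each character is examined and emitted exactly once."""
    pieces = []
    for index, character in enumerate(text):
        following = text[index + 1] if index + 1 < len(text) else " "
        pieces.append(character)
        if character == "(" and following != " ":
            pieces.append(" ")
    return "".join(pieces)


sample = "f(a)(b)"
print(repr(pad_globally(sample)))  # 'f(  a)(  b)'  (padded once per bracket found)
print(repr(pad_locally(sample)))   # 'f( a)( b)'
```

The global version does one full-text replace for each bracket it finds, which is both where the extra padding comes from and why the work grows so quickly with file size.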

Robert Jablko

@Mark-Olson

Thank you :) I will try that.

But before that, I solved my problem by building a string, clearing the editor and then adding the string. It works. I don’t know if it’s fast. I am new to programming.

text=editor.getText()
################################################################################
''' compress operators, removes trailing and following space from operators '''
operators=[ '+', '-', '*', ':', '/', '//', '%', '**', ]
assigners=[ '=', '+=', '-=', '*=', ':=', ]               
comparators=[ '==', '!=', '<', '>', '<=', '>=', ]                 
for character in operators+assigners+comparators:
    if ' '+character+' ' in text:
        editor.replace( ' '+character+' ', character )
################################################################################
''' padd brackets, adds space to brackets right of open, left of close '''
brackets_open='([{'
brackets_close=')]}'
result=''
for index, character in enumerate( text ): 
    
    ''' searching edge cases '''
    if index==0: # first character
        trailing_character=" "
        if len( text )>1:
            following_character=text[ index+1 ]
        else:
            following_character=" "
    elif index==len( text )-1:  # last character
        trailing_character=text[ index-1 ] 
        following_character=" "
    else:
        trailing_character=text[ index-1 ]
        following_character=text[ index+1 ]
    
    ''' replacing '''
    if character in brackets_open and following_character!=' ':
        if character=='(':
            result+='\('+' ' # needs escape '\('
        else:
            result+=character+' '
    elif character in brackets_close and trailing_character!=' ':
        if character==')':  
            result+=' '+'\)' # needs escape '\)'
        else:
            result+=' '+character
    else:
        result+=character
################################################################################
editor.clearAll( )
editor.addText( result )
################################################################################

Ekopalypse

@Robert-Jablko

Are you sure your script is doing what you expect it to do?

Let me recap what you do:

- You retrieve the current text content.
- You replace the operators, assigners and comparators in the current text.
- You iterate over the retrieved text (retrieved text != current text) to create new content, and replace the current text with it.

Furthermore, if you use the replace method, you do not need to escape the (; this is only necessary if you use rereplace (note the "re": rereplace is the regular-expression form of replace).

Instead of deleting and adding text, you can use setText in one go.

General advice:

There is no need to build strings like result+='\('+' '
use result+='\( '

use a script layout like

def main():
    here your code

main()

This gives you the option to return from a function. The standard exit functions do what they are supposed to do: they end the process, which in this case would be Notepad++.
E.g.

def main():
    if x == y:
        # oops something went wrong exit the script
        return

main()

vs.

def main():
    if x == y:
        # oops NOW NPP CRASHES
        exit()

main()

Mark Olson

@Robert-Jablko
Your script’s runtime will still be proportional to the square of the length of the file, because the x += y operator (when applied to strings) creates a copy of x with y at the end.

Correction: the statement above is actually not true. I tested the runtime of creating a string by +='ing characters one at a time, and it appears to be linear.

Since this is a Notepad++ forum, not a Python forum, I would encourage those interested to look at this StackOverflow post. I probably shouldn’t have even posted this in the first place, but I felt duty-bound because I originally believed that the performance implications of not saying something were too great.

Robert Jablko

@Ekopalypse

Thank you for the advice. You explained my code perfectly.

I realize that I replaced the brackets with the editor method, and then created a new string. I have changed that to replace within the string and skip the editor method. That way I don’t need any escape characters.
Talking about escape characters: I really do not understand the difference between replace and rereplace, but what I do know is that when I used the replace method in the first iteration, ‘(’ was gone, and that was the whole reason to seek help. Alan Kilborn told me that I need the escape ‘\(’ and it worked with replace. Now that I use addText, or rather setText (thank you), it’s actually not an issue anymore.
result+='\('+' ' is actually on purpose. I find it clearer and reusable, and I have made it a point to add the variable pad, although in this case it doesn’t make a difference.
Layout issues: in my first try I used my original script with definitions, and for some reason it would not run. I don’t know why, and I thought it was an issue with PythonScript, so I wrote the code without functions. Thanks to you, I have changed that back to my original definitions. Here is the new code. What do you think?

def compress_operators( text, pad_operator=' ' ):
    print( "text in compress:", text , )
    
    ''' removes trailing and following pad from operators '''
    operators=[ '+', '-', '*', ':', '/', '//', '%', '**', ]
    assigners=[ '=', '+=', '-=', '*=', ':=', ]               
    comparators=[ '==', '!=', '<', '>', '<=', '>=', ]                 

    for character in operators+assigners+comparators:
        if pad_operator+character+pad_operator in text: # padded both sides '4 = 5' -> '4=5'
            text=text.replace( pad_operator+character+pad_operator, character )
        elif pad_operator+character in text: # padded left '4 =5' -> '4=5'
            text=text.replace( pad_operator+character, character ) 
        elif character+pad_operator in text: # padded right '4= 5' -> '4=5'
            text=text.replace( character+pad_operator, character )
            
    return text
################################################################################
def padd_brackets( text, pad_bracket=' ' ):
    ''' padds brackets, adds pad to brackets right of open, left of close 
        pad needs to be one character in this code, length_pad=1 '''
    print( "text in brackets:", text , )
    brackets_open='([{'
    brackets_close=')]}'
    
    result=''
    for index, character in enumerate( text ): 
        
        ''' searching edge cases '''
        if index==0: # first character
            trailing_character=pad_bracket
            if len( text )>1:
                following_character=text[ index+1 ]
            else:
                following_character=pad_bracket
        elif index==len( text )-1:  # last character
            trailing_character=text[ index-1 ] 
            following_character=pad_bracket
        else: # all characters in between
            trailing_character=text[ index-1 ]
            following_character=text[ index+1 ]
        
        ''' padding the brackets '''
        if character in brackets_open and following_character!=pad_bracket:
            result+=character+pad_bracket  # pad open bracket '(x' -> '( x'
        elif character in brackets_close and trailing_character!=pad_bracket:
            result+=pad_bracket+character # pad closing bracket 'x)' -> 'x )'
        else:
            result+=character
    
    return result
################################################################################    
def MyPyFormat( text, pad_operator=' ', pad_bracket=' ' ):
    ''' converts standard python code into my individual format '''
    
    ''' 1st remove trailing and following pad from operators '''
    text_co=compress_operators( text, pad_operator=pad_operator )
    
    ''' 2nd padd brackets, add pad to brackets right of open, left of close 
        pad needs to be one character in this code, length_pad=1 '''
    text_pb=padd_brackets( text_co, pad_bracket=pad_bracket )
    
    return text_pb
################################################################################
editor.setText( MyPyFormat( editor.getText( ), pad_operator=' ', pad_bracket=' ' ) )

Robert Jablko

@Mark-Olson

Thank you for testing. How do you test?

Why not post it here? The only problem I see is that the link doesn’t work. :)

Mark Olson

@Robert-Jablko

Discussing how to performance test Python applications is outside the scope of this discussion, so I’ll keep it brief: I use time.perf_counter to measure time elapsed and matplotlib.pyplot to plot lines. Please don’t ask for further details, so as to avoid derailing the discussion.
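For readers who want to repeat that kind of measurement, a minimal sketch using only time.perf_counter (the plotting step is left out) might look like the following. The helper names and the input sizes are arbitrary choices for illustration, and absolute timings will differ from machine to machine.

```python
import time


def build_by_concatenation(n):
    """Build an n-character string by appending one character at a time."""
    result = ""
    for _ in range(n):
        result += "x"
    return result


def build_by_join(n):
    """Build the same string with a single join call."""
    return "".join("x" for _ in range(n))


def time_call(func, n):
    """Return the elapsed wall-clock seconds for one call of func(n)."""
    start = time.perf_counter()
    func(n)
    return time.perf_counter() - start


timings = []
for n in (10_000, 100_000, 1_000_000):
    timings.append((n, time_call(build_by_concatenation, n), time_call(build_by_join, n)))

for n, t_concat, t_join in timings:
    print(f"n={n:>9}: += took {t_concat:.4f}s, join took {t_join:.4f}s")
```

If the time for += grows roughly tenfold when n grows tenfold, the behaviour is linear in practice; a hundredfold increase would point to quadratic growth.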

On a more NPP-related note, your script is looking better now. I still think you would benefit from learning regular expressions, because the regex-based approach is much simpler and doesn’t require plugins.

Ekopalypse

@Robert-Jablko said in Python Script editor.replace Bug? concerning the characters '(' and ')':

I really do not understand the difference between replace and rereplace

With regular expressions, there are certain characters such as ( that have a meaning for the regex syntax. So to find them literally, you have to escape them.
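The same distinction can be tried out in plain Python, which is only an analogy here: str.replace stands in for the literal editor.replace, and re.sub (Python's re module, not the Boost engine behind editor.rereplace) stands in for the regex form. The sample text is made up.

```python
import re

text = "print(a)"

# Literal replacement: '(' has no special meaning, so no escaping is needed.
literal_result = text.replace("(", "( ")

# Regex replacement: an unescaped '(' opens a group, so the pattern is invalid.
try:
    re.sub("(", "( ", text)
    unescaped_error = None
except re.error as exc:
    unescaped_error = exc

# Escaping by hand, or with re.escape, makes the parenthesis literal again.
escaped_result = re.sub(r"\(", "( ", text)
auto_escaped_result = re.sub(re.escape("("), "( ", text)

print(literal_result)       # print( a)
print(escaped_result)       # print( a)
print(auto_escaped_result)  # print( a)
print("unescaped pattern rejected:", unescaped_error is not None)
```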

I do know, is that when I used the replace method in the first iteration, ‘(’ was gone

I would be very surprised if it was a bug in PS, but if the problem occurs again, you should report it here so it can be investigated.

result+='\('+' ' is actually on purpose

Yes, what matters here is what is important to you.

What do you think?

It looks much nicer to me ;-)

Robert Jablko

@Mark-Olson

As I said earlier concerning regex, I will try it :) and I did. I found a horribly unintuitive syntax that does miraculous things. I changed my whole code to use regex and it works like a charm in Python.

Unfortunately it throws a SyntaxError here in PythonScript. Can anybody help?

SyntaxError: invalid syntax
  File "C:\npp\plugins\PythonScript\scripts\PimpMyPy\MyPyFormat npp Re 01.py", line 20
    text=re.sub( rf'({characters})(\s)', r'\1', text ) # ( operator, space )

Here is the full code I used:

import re 
###############################################################################
def compress_operators( text ):
    ''' removes spaces before and after operators and returns text '''
    print( text )

    # lists of operators, assigners and comparators
    operators=[ r'\+', '-', ':', r'\*', r'\*\*', '/', '//', '%', ] # mathematical operators
    assigners=[ '=' ]+[ operator+'=' for operator in operators ]
    comparators=[ '==', '!=', '<', '>', '<=', '>=' ]
    bitwise_comparators=[ '&', r'\|', r'\^', '~', '<<', '>>' ]
    
    # combine all operators in one search-string for RegEx
    characters='' 
    for character in operators+assigners+comparators+bitwise_comparators:
        characters+=character+'|'
    characters=characters[ :-1 ] # remove last '|'

    # substitute with RegEx
    text=re.sub( rf'({characters})(\s)', r'\1', text ) # ( operator, space )
    text=re.sub( rf'(\s)({characters})', r'\2', text ) # ( space, operator )
    
    return text
###############################################################################
def padd_brackets( text ):
    ''' add space after opening and before closing bracket, if not present '''
    
    # list of brackets
    opening_brackets=[ '(', '[', '{', ] 
    closing_brackets=[ ')', ']', '}', ]
    
    # create search_string depending on type of bracket and substitute with RegEx
    for bracket in opening_brackets+closing_brackets: 
        if bracket in opening_brackets:
            search=rf'([\{bracket}])(\S)' # ( bracket, any character )
            text=re.sub( search, r'\1 \2', text )  
        elif bracket in closing_brackets:
            search=rf'(\S)([\{bracket}])' # ( any character, bracket )
            text=re.sub( search, r'\1 \2', text ) 
   
    return text 
################################################################################
def my_py_format( text ):
    ''' creates individual format of python code
        compressed operators and padded brackets  '''
    
    text=compress_operators( text )
    text=padd_brackets( text )
    
    return text 
###############################################################################
if __name__=='__main__':
    editor.setText( my_py_format( editor.getText( ) ) )

Mark Olson

@Robert-Jablko

First of all, if you’re using regular expressions, you don’t need to use PythonScript; you can just use the Notepad++ find/replace form. That was the whole point of my suggestion.

As far as writing a regular expression to do what you want, I already wrote one above (EDIT: except for Regex-replace 1, I would change the Find what: to \h*([-+*<>:=]=?|!=|//?|[&|%^]|<<|>>|\*\*)\h*). I don’t have the time right now to explain syntax, but you clearly understand how to read documentation, so I’ll just let you figure out what my regexes do by consulting the handy-dandy user manual.

EDIT2: the more Pythonic (and much faster) way to render

blah = ''
for x in some_list:
     blah += x + sep # sep is a separator string (in your case '|')
blah = blah[:-len(sep)]

would be

blah = sep.join(some_list)
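Applied to the operator list from the script, the join idiom could look like the sketch below. It uses Python's re module; the operator list is a shortened, hypothetical version of the one in the thread, re.escape is used instead of escaping by hand, and sorting longest-first is an added precaution so that '**' is tried before '*'.

```python
import re

# Shortened, illustrative operator list (not the full list from the thread).
operators = ["+", "-", "*", "**", "/", "//", "%", "=", "==", "!=", "<", ">", "<=", ">="]

# Longest operators first, each one escaped, joined with '|' in a single call.
alternation = "|".join(
    re.escape(op) for op in sorted(operators, key=len, reverse=True)
)

# Optional spaces or tabs, one operator, optional spaces or tabs.
pattern = re.compile(r"[ \t]*(" + alternation + r")[ \t]*")


def compress_operators(text):
    """Remove spaces and tabs directly around any listed operator."""
    return pattern.sub(r"\1", text)


print(compress_operators("a = b ** 2 + c // d"))  # a=b**2+c//d
print(compress_operators("x == y"))               # x==y
```

Note that a pattern like this knows nothing about Python syntax, so it would also change operators inside string literals and comments.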

Ekopalypse

@Robert-Jablko said in Python Script editor.replace Bug? concerning the characters '(' and ')':

Unfortunately it throws a SyntaxError here in the Python script. Can anybody help?

It sounds like you are using a different Python version.
Could it be that your PythonScript (check the About menu item) is using Python 2.7, whereas your local Python installation is Python 3.x?

If so, install the PS 3 version from the Github repository.
Note: Only recommended if you only work with utf-8 encoded files.
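As background to the version question: the rf'...' prefix is an f-string, which only exists from Python 3.6 on, so a Python 2.7 interpreter rejects that line with a SyntaxError before running anything. A minimal sketch of building the same kind of pattern without f-strings, using a shortened and purely illustrative operator string, could be:

```python
import re

# Illustrative alternation of escaped operators (shortened for the example).
characters = r"\+|-|\*"

# Two f-string-free ways to build the pattern; both avoid the rf'' prefix.
pattern_concat = r"(" + characters + r")(\s)"
pattern_format = r"({0})(\s)".format(characters)

# Remove one whitespace character that follows an operator.
result = re.sub(pattern_format, r"\1", "a + b")

print(pattern_concat == pattern_format)  # True
print(result)                            # a +b
```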


One additional thing about using regular expressions.
If you are using a Python script, you can either use the Python re module or the boost::regex library, which is also used by notepad++. You are currently using the Python module in your code. So please note that the syntax you find here in this community does not always work with the Python re module.
If you use the editor functions editor.rereplace and editor.research, use the Boost version.
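One small, concrete difference between the two flavours is the replacement syntax. The sketch below uses only Python's re module; the ${1} form is the one shown in the Notepad++ replacements earlier in this thread, and [ \t] stands in for the \h used in those regexes.

```python
import re

text = "a + b"
pattern = r"[ \t]*(\+)[ \t]*"

# Python's re refers to group 1 as \1 or \g<1> in the replacement.
python_result = re.sub(pattern, r"\1", text)
explicit_result = re.sub(pattern, r"\g<1>", text)

# The ${1} form is not a group reference for Python's re; it is inserted verbatim.
boost_style_result = re.sub(pattern, "${1}", text)

print(python_result)       # a+b
print(explicit_result)     # a+b
print(boost_style_result)  # a${1}b
```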

Robert Jablko

@Mark-Olson

Thank you. I applied the Pythonic way. It’s beautiful. Great hint.

About your suggestion, I heard you the first time around. It’s smarter to use the internal replace of NPP instead of a plugin. I believe you 100%, and at the same time I want to figure out PythonScript at the moment. I don’t care if it works, and I don’t really need the solution, because I have created a GUI that transforms the code. This is about the journey. With one little question, I have learned so much in the last few days. It’s amazing.

About RegEx, I was and still am VERY confused, but thanks to Eko I understand now that there are possible issues both with PythonScript versions and with the differences between re for Python and boost::regex for NPP. As I said, I had never used re until you mentioned it. I will figure it out: PythonScript, re for Python and boost::regex. One after the other. THANK YOU!

Robert Jablko

@Ekopalypse

Thanks for the great advice :) I feel seen and helped.

Indeed, it was PythonScript 2.0.0.0. To update, I uninstalled the old version and in the process erased all my scripts and my day’s work. I learned something.
Good news: 3.0.21.0 installed successfully. That did not solve the problem.
Indeed, I had no idea there was a difference between Python re and boost::regex in NPP. No wonder I am so confused. I can hardly get Python re to work and there is another world out there. I will figure out both. Might take a while.

The rest is just history chit chat and affection … no need to read.

UTF-8?

I am very new to NPP, just a little over a week. The reason I started in the first place was actually not Python, but pink eye. My eyes were so light-sensitive that I switched everything to dark mode. Until this point I may have been the last remaining fan of the Windows Editor. Honestly, I love it, because there is no formatting, no bs. Open it, write, and it will look the same on every system. Copy and paste everything in there and bye-bye formatting issues. The moment I start an office document, I spend too much time on formatting. Since there is no dark mode in the Editor, I looked for an alternative for as long as I have pink eye. I stumbled over Notepad++, which I had tried out years ago and didn’t like, gave it another shot and was very happy. A dark editor with no bs, portable!, lightweight, where I can have a vertical line at 80 characters to remind me of my printer’s boundaries. I changed everything to make the printout look like the editor. Perfect. First plus for NPP.
Tabs, drag and drop. Next plus for NPP.
Then I realized that I can maybe also use it as an IDE for Python. So I invested some time in NppExec and was thrilled that it actually starts IDLE with my version of Python. That’s the problem I have with Spyder. The autoformatting is some of the best I have ever seen, especially showing you the parameters for your own functions, but it’s slow and has its own Python version. I couldn’t get mine to run. In NPP I love that the formatting for text and Python files is different. Next plus for NPP.
Next thing I tried was QuickText and I didn’t like it, because it would only work with ANSI files when you use vowels in my native tongue. So I tried to install Finger Text for a long time, and although it is not compatible, which is a pity, I still learned a lot about how plugins work in NPP. After that I switched my text to ANSI and it worked, but then Python wouldn’t run anymore. Unfortunately, you cannot set text files to ANSI and Python files to UTF-8; it has to be the same standard in the configuration, which is a minus for NPP, but I can live with that. So, “Lange Rede, kurzer Sinn” (long story short): yes, that was the whole point. Everything is UTF-8 now and I use QuickText for Python excessively. Love it.
Next thing I found was the Python Script. I tried and failed, but decided to seek help in an online forum for the first time in my life, and I must say, I am impressed and thankful. Seems like I am becoming a fan of both NPP and the community.

Alan Kilborn

@Robert-Jablko said:

It’s smarter to use the internal replace of npp instead of a plugin.

It depends upon what you are doing.
Example: If you have a string variable in Python holding some data, it’s easier to use Python regular expression replacement on it rather than N++'s regular expression replacement.

Alan Kilborn

@Robert-Jablko said in Python Script editor.replace Bug? concerning the characters '(' and ')':

After that I switched my text ANSI and it worked, but then Python wouldn’t run anymore.

Use PythonScript 2.x if you need to work with non-Unicode data.
You can use it for Unicode data as well, though there are sometimes more hoops to jump through.
In general, go strictly with PythonScript 3.x when you have no need to work with ANSI files, and your life will be smoother.

Robert Jablko

@Alan-Kilborn

Interesting. So it is worth spending time with both Python re and the Boost regex in NPP. Maybe I am able to read documentation, but not the Boost regex documentation for NPP; it’s too confusing. But I started with your suggestion and did some trial and error in NPP’s Ctrl+H dialog. I tried “Find Next” until it did exactly what I needed. I came up with these solutions.

BRACKETS

opening brackets with no space after

(?!\s)[(\[{](?!\s)
${0}\x20

closing brackets with no space before

(?<!\s)[)\]}](?<!\s) 
\x20${0}

Brackets work nicely!
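For comparison, a rough Python re counterpart of these two bracket replacements might look like the sketch below. It is not a translation of the Boost syntax: it uses capture groups with \1 instead of ${0}, a plain space instead of \x20, and drops the lookarounds that do not affect the match. The function name pad_brackets is made up.

```python
import re


def pad_brackets(text):
    """Add a space after opening and before closing brackets where missing."""
    # Opening bracket that is not followed by whitespace.
    text = re.sub(r"([(\[{])(?!\s)", r"\1 ", text)
    # Closing bracket that is not preceded by whitespace.
    text = re.sub(r"(?<!\s)([)\]}])", r" \1", text)
    return text


print(pad_brackets("f(a[1])"))  # f( a[ 1 ] )
print(pad_brackets("f( a )"))   # f( a )
```

One edge case to be aware of: a closing bracket at the very start of the text, or an opening bracket at the very end, also counts as unpadded in this sketch and gets a space.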

OPERATORS

finds operators that have a space on the left, on the right, or on both sides

(\h([+-]=?)\h*)|(\h*([+-]=?)\h)
${1}

With operators there is a problem: it only works for + - += -=. As soon as I add * or any other character, it reports invalid syntax.
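One possible explanation, which nobody in the thread confirmed: inside a character class a hyphen between two characters denotes a range, so adding * directly after the hyphen turns [+-] into [+-*], a range running backwards from + to *. The sketch below demonstrates the rule with Python's re module as a stand-in for the Boost engine; the helper name compiles is made up, and [ \t] replaces \h.

```python
import re


def compiles(pattern):
    """Return True if the pattern is accepted by Python's re module."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print(compiles(r"[+-]"))    # True: a trailing hyphen is a literal hyphen
print(compiles(r"[+-*]"))   # False: '+' to '*' is a backwards range
print(compiles(r"[-+*]"))   # True: a leading hyphen is a literal hyphen
print(compiles(r"[+\-*]"))  # True: an escaped hyphen is a literal hyphen

# With the hyphen moved to the front, '*' can join the class.
compressed = re.sub(r"[ \t]*([-+*]=?)[ \t]*", r"\1", "a * b += c")
print(compressed)           # a*b+=c
```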

Another problem: I cannot get any of my solutions to run in PythonScript. I tried a lot and have no idea. The error is “Does not match C++ signature”.

editor.getText( )
editor.replace( "\h([+-]=?)\h*)|(\h*([+-]=?)\h", "${1}" )
editor.rereplace( '\h([+]=?)\h*', "${1}" )
editor.research( '\h([+]=?)\h*' )