Python Script editor.replace Bug? concerning the characters '(' and ')'
-
@Robert-Jablko said in Python Script editor.replace Bug? concerning the characters '(' and ')':
I really do not understand the difference between replace and rereplace
With regular expressions, certain characters such as ( have a special meaning in the regex syntax. So to find them literally, you have to escape them.
I do know, is that when I used the replace method in the first iteration, ‘(’ was gone
I would be very surprised if it was a bug in PS, but if the problem occurs again, you should report it here so it can be investigated.
result+=‘(’+’ ’ is actually on purpose
Yes, what matters here is what is important to you.
What do you think?
It looks much nicer to me ;-)
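To make the escaping point from earlier in this post concrete, here is a minimal sketch in plain Python. It is an illustration, not PythonScript code: `str.replace` stands in for a literal search (like `editor.replace`) and `re.sub` stands in for a regex search (like `editor.rereplace`). The sample text and the function names are made up for the example.

```python
import re

SAMPLE = "print(x)"  # hypothetical sample text


def literal_replace(text):
    # Literal search: '(' is just a character, nothing to escape.
    return text.replace("(", "( ")


def regex_replace(text):
    # Regex search: '(' normally opens a group, so it is escaped as \(.
    return re.sub(r"\(", "( ", text)


def regex_replace_escaped(text, needle):
    # re.escape adds the backslashes, so the caller can pass plain text.
    return re.sub(re.escape(needle), lambda m: needle + " ", text)


def is_valid_pattern(pattern):
    # An unescaped '(' on its own is not a valid regular expression.
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False
```

All three functions give the same result for an opening parenthesis; the difference only shows up when the search text contains characters the regex engine treats as syntax.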
-
As I said earlier concerning regex, I will try it :) and I did. I found a horribly unintuitive syntax that does miraculous things. I changed my whole code to use regex and it works like a charm in Python.
Unfortunately it throws a SyntaxError here in the Python script. Can anybody help?
SyntaxError: invalid syntax
File "C:\npp\plugins\PythonScript\scripts\PimpMyPy\MyPyFormat npp Re 01.py", line 20
text=re.sub( rf'({characters})(\s)', r'\1', text ) # ( operator, space )
Here is the full code I used:
import re

###############################################################################
def compress_operators( text ):
    ''' removes spaces before and after operators and returns text '''
    print( text )
    # lists of operators, assigners and comparators
    operators=[ '\+', '-', ':', '\*', '\*\*', '/', '//', '%', ] # mathematical operators
    assigners=[ '=' ]+[ operator+'=' for operator in operators ]
    comparators=[ '==', '!=', '<', '>', '<=', '>=' ]
    bitwise_comparators=[ '&', '\|', '^', '~', '<<', '>>' ]
    # combine all operators in one search-string for RegEx
    characters=''
    for character in operators+assigners+comparators+bitwise_comparators:
        characters+=character+'|'
    characters=characters[ :-1 ] # remove last '|'
    # substitute with RegEx
    text=re.sub( rf'({characters})(\s)', r'\1', text ) # ( operator, space )
    text=re.sub( rf'(\s)({characters})', r'\2', text ) # ( space, operator )
    return text

###############################################################################
def padd_brackets( text ):
    ''' add space after opening and before closing bracket, if not present '''
    # list of brackets
    opening_brackets=[ '(', '[', '{', ]
    closing_brackets=[ ')', ']', '}', ]
    # create search_string depending on type of bracket and substitute with RegEx
    for bracket in opening_brackets+closing_brackets:
        if bracket in opening_brackets:
            search=rf'([\{bracket}])(\S)' # ( bracket, any character )
            text=re.sub( search, r'\1 \2', text )
        elif bracket in closing_brackets:
            search=rf'(\S)([\{bracket}])' # ( any character, bracket )
            text=re.sub( search, r'\1 \2', text )
    return text

################################################################################
def my_py_format( text ):
    ''' creates individual format of python code compressed operators and padded brackets '''
    text=compress_operators( text )
    text=padd_brackets( text )
    return text

###############################################################################
if __name__=='__main__':
    editor.setText( my_py_format( editor.getText( ) )
-
First of all, if you’re using regular expressions, you don’t need to use PythonScript; you can just use the Notepad++ find/replace form. That was the whole point of my suggestion.
As far as writing a regular expression to do what you want, I already wrote one above (EDIT: except for Regex-replace 1, I would change the Find what: to
\h*([+-*<>:=]=?|!=|//?|[&|%^]|<<|>>|\*\*)\h*
). I don’t have the time right now to explain syntax, but you clearly understand how to read documentation, so I’ll just let you figure out what my regexes do by consulting the handy-dandy user manual.
EDIT2: the more Pythonic (and much faster) way to render
blah = ''
for x in some_list:
    blah += x + sep # sep is a separator string (in your case '|')
blah = blah[:-len(sep)]
would be
blah = sep.join(some_list)
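As a quick check that the two forms really are interchangeable, here is a small sketch. The function names and the sample list are hypothetical, and the loop version assumes a non-empty separator, as in the snippet above.

```python
def join_with_loop(items, sep):
    # The original approach: append the separator every time, then trim the last one.
    result = ''
    for item in items:
        result += item + sep
    # Guard against an empty list, where there is no trailing separator to remove.
    return result[:-len(sep)] if items else result


def join_with_str_join(items, sep):
    # The suggested approach: str.join places the separator only between items.
    return sep.join(items)


OPERATORS = ['==', '!=', '<', '>']  # hypothetical sample data
```

Besides being shorter, `sep.join` avoids building a new intermediate string on every loop iteration.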
-
@Robert-Jablko said in Python Script editor.replace Bug? concerning the characters '(' and ')':
Unfortunately it throws a SyntaxError here in the Python script. Can anybody help?
It sounds like you are using a different Python version.
Could it be that your PythonScript (check the About menu item) is using Python 2.7, whereas your local Python installation is Python 3.x?
If so, install the PS 3 version from the GitHub repository.
Note: Only recommended if you only work with UTF-8 encoded files.
One additional thing about using regular expressions.
If you are using a Python script, you can either use the Python re module or the boost::regex library, which is also used by notepad++. You are currently using the Python module in your code. So please note that the syntax you find here in this community does not always work with the Python re module.
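To illustrate the kind of difference meant here, the following sketch takes one Boost-style expression of the sort used in this thread and shows a rough equivalent for Python's re module. It assumes Python 3.7 or later, where an unknown escape such as \h in a pattern is an error; treating [ \t] as a stand-in for \h is an approximation, and the names are hypothetical.

```python
import re

# As it would be typed into Notepad++ (Boost): \h means horizontal whitespace.
BOOST_STYLE = r'\h*([+*-]=?)\h*'

# Rough Python equivalent: re has no \h, so spell the whitespace out.
PYTHON_STYLE = r'[ \t]*([+*-]=?)[ \t]*'


def compiles_in_python(pattern):
    # True if Python's re module accepts the pattern.
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


def compress(text):
    # Python's replacement syntax is \1 (or \g<1>); a Boost-style ${1}
    # would be inserted literally by re.sub.
    return re.sub(PYTHON_STYLE, r'\1', text)
```

So an expression copied from a forum answer may need both its pattern and its replacement string adapted before it works with `re.sub`.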
If you use the editor functions editor.rereplace and editor.research, use the Boost version.
-
Thank you. I applied the Pythonic way. It’s beautiful. Great hint.
About your suggestion, I heard you the first time around. It’s smarter to use the internal replace of npp instead of a plugin. I believe you 100%, and at the same time I want to figure out Python Script at the moment. I don’t care if it works, I don’t really need the solution, because I have created a GUI that transforms the code. This is about the journey. With one little question, I learned so much in the last days. It’s amazing.
About RegEx, I was and still am VERY confused, but thanks to Eko I now understand that there are possible issues both with the Python Script version and with the differences between re for Python and boost::regex for npp. As I said, I had never used re until you mentioned it. I will figure it out: Python Script, re for Python and boost::regex. One after the other. THANK YOU!
-
Thanks for the great advice :) I feel seen and helped.
-
Indeed, it was Python Script 2.0.0.0. To update, I uninstalled the old version and, in the process, erased all my scripts and my day’s work. I learned something.
-
Good news. 3.0.21.0 installed successfully. That did not solve the problem.
-
Indeed, I had no idea there was a difference between python re and boost:regex in NPP. No wonder I am so confused. I can hardly get python re to work and there is another world out there. I will figure out both. Might take a while.
The rest is just history chit chat and affection … no need to read.
UTF-8?
- I am very new to NPP, just a little over a week. The reason I started in the first place was actually not Python, but pink eye. My eyes were so light-sensitive that I switched everything to dark mode. Until this point I may have been the last remaining fan of the Windows Editor. Honestly, I love it, because there is no formatting, no bs. Open it, write and it will look the same on every system. Copy and paste everything in there and bye bye formatting issues. The moment I start an office document, I spend too much time on formatting. Since there is no dark mode in the Editor, I looked for an alternative for as long as I have pink eye. I stumbled over Notepad++, which I had tried out years ago and didn’t like, gave it another shot and was very happy. A dark editor with no bs, portable!, lightweight, where I can have a vertical line at 80 characters to remind me of my printer’s boundaries. I changed everything to make the printout look like the Editor. Perfect. First plus for NPP.
- Tabs, Drag and Drop. Next plus for NPP
- Then I realized that I maybe can also use it as an IDE for Python. So I invested some time in NppExec and was thrilled that it actually starts IDLE with my version of Python. That’s the problem I have with Spyder. The autoformatting is some of the best I have ever seen, especially showing you the parameters for your own functions, but it’s slow and has its own Python version. I couldn’t get mine to run. In NPP I love that the formatting for text and Python files is different. Next plus for NPP.
- Next thing I tried was Quick Text and didn’t like it, because it would only work with ANSI files when you use vowels in my native tongue. So I tried to install Finger Text for a long time and although it is not compatible, which is a pity, I still learned a lot about how plugins work in NPP. After that I switched my text to ANSI and it worked, but then Python wouldn’t run anymore. Unfortunately, you cannot set text files to ANSI and Python files to UTF-8; it has to be the same standard in the configuration, which is a minus for NPP, but I can live with that. So “Langer Rede - kurzer Sinn” (long story short): yes, that was the whole point. Everything is UTF-8 now and I use QuickText for Python excessively. Love it.
- Next thing I found was the Python Script. I tried and failed, but decided to seek help in an online forum for the first time in my life, and I must say, I am impressed and thankful. Seems like I am becoming a fan of both NPP and the community.
-
@Robert-Jablko said:
It’s smarter to use the internal replace of npp instead of a plugin.
It depends upon what you are doing.
Example: If you have a string variable in Python holding some data, it’s easier to use Python regular expression replacement on it rather than N++'s regular expression replacement.
-
@Robert-Jablko said in Python Script editor.replace Bug? concerning the characters '(' and ')':
After that I switched my text to ANSI and it worked, but then Python wouldn’t run anymore.
Use PythonScript 2.x if you need to work with non-Unicode data.
You can use it for Unicode data as well, but there are sometimes more hoops to jump through.
In general, go strictly with PythonScript 3.x when you have no need to work with ANSI files, and your life is smoother.
-
Interesting. So it is worth spending time with both Python re and Boost regex in npp. So, maybe I am able to read documentation, but not the Boost regex documentation in npp. It’s too confusing, but I started with your suggestion and did some trial and error in the Ctrl+H dialog of npp. I tried “Find Next” until he did exactly what I needed. I came up with these solutions.
BRACKETS
- opening brackets with no space after
Find what: (?!\s)[(\[{](?!\s)
Replace with: ${0}\x20
- closing brackets with no space before
Find what: (?<!\s)[)\]}](?<!\s)
Replace with: \x20${0}
Brackets work nicely!
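For comparison, the same two bracket rules could be sketched with Python's re module roughly as follows. This is an adaptation, not the Boost expressions themselves: the replacement uses \1 instead of ${0}, and the leading lookahead and trailing lookbehind from the expressions above are dropped because, as far as I can tell, they only test the bracket itself and so never change the result. The function name is hypothetical.

```python
import re


def pad_brackets(text):
    # Opening bracket not followed by whitespace: add a space after it.
    text = re.sub(r'([(\[{])(?!\s)', r'\1 ', text)
    # Closing bracket not preceded by whitespace: add a space before it.
    text = re.sub(r'(?<!\s)([)\]}])', r' \1', text)
    return text
```

Because both rules only fire when the space is missing, running the function a second time leaves already padded text unchanged.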
OPERATORS
- finds operators with surrounding spaces, whether the space is on the left or on the right
Find what: (\h([+-]=?)\h*)|(\h*([+-]=?)\h)
Replace with: ${1}
- With operators there is a problem: It’s only working for + - += -=, as soon as I add * or any other character, he says it’s invalid syntax
Another problem: I cannot get any of my solutions to run in Python script. I tried a lot. No idea. “Does not match C++ signature”
editor.getText( )
editor.replace( "\h([+-]=?)\h*)|(\h*([+-]=?)\h", "${1}" )
editor.rereplace( '\h([+]=?)\h*', "${1}" )
editor.research( '\h([+]=?)\h*' )
-
@Robert-Jablko said in Python Script editor.replace Bug? concerning the characters '(' and ')':
With operators there is a problem: It’s only working for + - += -=, as soon as I add * or any other character, he says it’s invalid syntax
You did not show which expression you were trying to update but did include:
\h([+-]=?)\h*)|(\h*([+-]=?)\h
             ^ ^
I think you had swapped the left and right hand parts and as part of that accidentally swapped the outermost parentheses pair so that they are now inside. Your intent was probably:
(\h([+*-]=?)\h*|\h*([+*-]=?)\h)
I also added the * for you. I suspect you had tried [+-*], which is invalid syntax because it defines a range from ‘+’ to ‘*’, which is backwards in ASCII. You can include a - within [] brackets by making it the first character, the last character, or by prefixing it with \. All of these are valid ways to include a - within the brackets: [-...], [...-], and [...\-...].
-
Thank you.
I must admit that dealing with RegEx has been very frustrating. You think you understand something and then this ‘-’ comes along :) I used your hint and put the ‘-’ at the end for now.
As for the search expression, it is just like I said before that I have no clue how and why it works, but my version works. It removes spaces around the operators ‘+’, ‘*’, ‘=’, ‘-’ and ‘+=’, ‘*=’, ‘==’, ‘-=’, both if they have a space on the left or on the right.
The whole idea is to format python code the way I like it, with padding around parentheses and compressed operators. It’s a personal choice.
Find what: (\h([+*=-]=?)\h*)|(\h*([+*=-]=?)\h)
Replace with: ${2}
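The hyphen rule from the previous answer can also be checked quickly with Python's re module, which rejects a backwards range much like the Notepad++ search did. A minimal sketch; the helper name and the list of candidates are made up for the example.

```python
import re


def is_valid_class(char_class):
    # True if the character class compiles as a regular expression.
    try:
        re.compile(char_class)
        return True
    except re.error:
        return False


CANDIDATES = [
    r'[+-*]',    # '-' between '+' and '*': read as a backwards range
    r'[-+*]',    # '-' first: literal hyphen
    r'[+*-]',    # '-' last: literal hyphen
    r'[+\-*]',   # '-' escaped: literal hyphen
]

RESULTS = [is_valid_class(c) for c in CANDIDATES]
```

Only the first candidate fails; the other three all match a literal hyphen.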
-
Hello Ekopalypse, I still want to succeed with the Python script. So I went deep diving into python re with the idea of creating a code that formats the text both the way I like it and back to standard python.
What I found is that RegEx is useful for short code and I managed to design one function for both compressing and adding, as well as for any operator, but it’s also global and is not able to format the way I want it in one step, so I have to reduce spaces with another function. I think I have included all operators now; ‘:’ was wrong. Not included are words like ‘for’, ‘while’, ‘and’, ‘or’, ‘not’, ‘in’, etc.
The biggest challenge was to understand that I have to sort the operators by length, to avoid ‘==’ being padded to ‘ = = ’. This one line does the magic and also takes care of any escape characters.
re_pattern=( '|' ).join( re.escape( operator ) for operator in sorted( operators, key=len, reverse=True ) )
This is the code I came up with. It works.
import re

################################################################################
def reduce_spaces( text ):
    ''' reduces more than one spaces to one space,
        ignores spaces at start of line to ensure code '''
    lines=text.splitlines( )
    result=[ ]
    for line in lines:
        match=re.match( r'^\s+', line ) # spaces at start of line
        if match:
            start=match.group( ) # returns as string
            line=start+re.sub( r'\s{2,}', ' ', line[ len( start ): ] )
        else:
            line=re.sub( r'\s{2,}', ' ', line )
        result.append( line )
    return '\n'.join( result )

################################################################################
def manipulate_operators( text, operators, compress=True, left=True, right=True ):
    ''' adds or removes space on left, right or both sides of operators '''
    # create search pattern for operators, joins with |=or, takes care of escape characters
    # and sorts them by length-reversed, first: +=, == ... then + =
    re_pattern=( '|' ).join( re.escape( operator ) for operator in sorted( operators, key=len, reverse=True ) )
    if left&right&compress:
        text=re.sub( rf'(\s*)({re_pattern})(\s*)', r'\2', text )
    elif left&~int( right )&compress:
        text=re.sub( rf'(\s+)({re_pattern})', r'\2', text )
    elif ~int( left )&right&compress:
        text=re.sub( rf'({re_pattern})(\s+)', r'\1', text )
    elif left&right&~int( compress ):
        text=re.sub( rf'({re_pattern})', r' \1 ', text )
    elif left&~int( right )&~int( compress ):
        text=re.sub( rf'({re_pattern})', r' \1', text )
    elif ~int( left )&right&~int( compress):
        text=re.sub( rf'({re_pattern})', r'\1 ', text )
    else: #~int( left&right )&( compress|~int( compress ) )
        pass
    return text

################################################################################
def my_py_format( text, form='my' ): # form is 'my' or 'standard'
    ''' creates individual format of python code
        compressed operators and padded brackets or standard '''
    operators=['+', '-', '*', '**', '/', '//', '%', '&', '|', '^', '~', '<<', '>>', '=', '+=', '-=', '*=', '**=', '/=', '//=', '%=', '&=', '|=', '^=',
               '<<=', '>>=', '==', '!=', '>', '<', '>=', '<=']
    operators_parentheses_open=[ '(', '[', '{']
    operators_parentheses_close=[ ')', ']', '}']
    if form=='my':
        text=manipulate_operators( text, operators, compress=1, left=1, right=1 )
        text=manipulate_operators( text, operators_parentheses_open, compress=0, left=0, right=1 )
        text=manipulate_operators( text, operators_parentheses_close, compress=0, left=1, right=0 )
    elif form=='standard':
        text=manipulate_operators( text, operators, compress=0, left=1, right=1 )
        text=manipulate_operators( text, operators_parentheses_open, compress=1, left=0, right=1 )
        text=manipulate_operators( text, operators_parentheses_close, compress=1, left=1, right=0 )
    # RegEx global, reduce spaces after padding, but not at start of line
    text=reduce_spaces( text )
    return text

################################################################################
if __name__=='__main__':
    editor.setText( my_py_format( editor.getText( ), form='my' ) )
    #editor.setText( my_py_format( editor.getText( ), form='standard' ) )
Next iteration will be to ignore strings and comments, otherwise all my RegEx-Search-Strings will not work anymore. But that is another chapter.
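The sort-by-length point from earlier in this post can be checked in isolation. The sketch below is a cut-down illustration with hypothetical helper names, not the script itself: it builds the alternation with and without sorting and pads the same input with each.

```python
import re


def build_pattern(operators, sort_by_length=True):
    # Longest operators first, so '==' is tried before '='.
    if sort_by_length:
        operators = sorted(operators, key=len, reverse=True)
    return '|'.join(re.escape(op) for op in operators)


def pad(text, pattern):
    # Put one space on each side of every operator matched by the pattern.
    return re.sub(rf'({pattern})', r' \1 ', text)


OPS = ['=', '==']
UNSORTED = pad('a==b', build_pattern(OPS, sort_by_length=False))
SORTED = pad('a==b', build_pattern(OPS))
```

With the unsorted alternation the engine takes the first alternative that matches, so each `=` is padded on its own; with the sorted one the two characters are matched as a single `==`.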
-
I just took a very quick glance at your latest script above. Nice job, it seems like you are learning and advancing as you go; always a good thing.
One thing that struck me in my quick look was that you pass “left” and “right” as integers, but the receiving function’s default arguments are “True”, and then later in the function you go back to integers again. You really should pick either integer or boolean and stay with it; I’d suggest the booleans in this case.
You rapidly get into “trouble” with this approach, e.g. a line like elif left&~int( right )&compress: is much better written as elif left and not right and compress: with no need to resort to the bit-level manipulation operator &.
But these are Python things, not directly related to Notepad++. Hopefully it is OK to be “off topic” if one is attempting to help one become a better programmer so their N++ scripts will be improved.
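A small sketch of where the “trouble” can show up. The two functions below are hypothetical reductions of one branch of the script: for arguments that are strictly 0/1 (or False/True) they agree, but as soon as some other truthy integer is passed, the bitwise version gives a different answer.

```python
from itertools import product


def branch_bitwise(left, right, compress):
    # Style used in the script: bitwise AND, with ~int(...) standing in for "not".
    return bool(left & ~int(right) & compress)


def branch_logical(left, right, compress):
    # Suggested style: plain boolean logic.
    return bool(left and not right and compress)


def agree_on(values):
    # True if both styles pick the same branch for every combination of values.
    return all(
        branch_bitwise(l, r, c) == branch_logical(l, r, c)
        for l, r, c in product(values, repeat=3)
    )
```

For example, `branch_bitwise(2, 0, 1)` is False because 2 and 1 share no bits, while `branch_logical(2, 0, 1)` is True.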
-
Agree with @Alan-Kilborn that you are getting a lot better at writing good Pythonic code.
I won’t get into details, but you are going to have a very hard time correctly ignoring strings and comments without using the (*SKIP)(*FAIL) backtracking control verbs in your regexes. This is a problem for you in particular because NPP’s Boost supports those operators but Python’s re does not.
EDIT: For example, an efficient regex to match the character c only in lines that don’t start with leading whitespace might look like (?-si)^\h+\S.*$(*SKIP)(*FAIL)|c, but that regex is not valid for Python’s re.
-
@Mark-Olson said:
…using the (*SKIP)(*FAIL) backtracking control verbs in your regexes. This is a problem for you in particular because NPP’s Boost supports those operators but Python’s re does not.
If it comes to that, there are some techniques where you can avoid usage of the Python re module entirely. I call it the hidden-third-editor technique, look at it HERE.
-
Thank you :)
Integers vs. Booleans
- They are booleans, 100%. I got tired of writing True and False and the line got so long that I resorted to 1 and 0. I also thought it was clever, but you are right. I should stick to clean coding. I will switch back to True and False.
- Same thing with ‘&’ and ‘~’. The code was clean with ‘and’, ‘not’, ‘or’, but while searching for all operators, I found these and just had to apply them. I actually learned something: ~ will be deprecated and while running he told me to use ~int( ) instead. I will go back to clean code.
Ignoring string and comments
- At the moment I have had enough of this project and need a pause. But once I feel like it, I will first solve it with Python methods, either own loop and/or maybe with a library like tokenize. To me that is npp-related, because it has to run in the python-script.
- After that I will go back to Boost regex, find the right combinations and then maybe a macro.
- I will keep the hidden-third-editor in mind.
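As a rough idea of what the tokenize route mentioned above could look like, here is a minimal sketch. It only lists the operator tokens and their positions; rewriting the spacing is left out. It assumes the text is complete, tokenizable Python source, and the function name is hypothetical.

```python
import io
import tokenize


def operator_tokens(source):
    # Return (operator, (row, column)) for every operator token in the source.
    # Operators inside plain string literals and comments are not reported,
    # because the tokenizer emits those as STRING and COMMENT tokens.
    ops = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.OP:
            ops.append((tok.string, tok.start))
    return ops
```

For a line such as `x = 'a + b'  # c - d` only the assignment `=` is reported, which is exactly the filtering that is hard to get from a single regular expression. Note that brackets and commas also count as OP tokens, and that f-strings are tokenized differently from Python 3.12 on.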
-
@Robert-Jablko said:
I got tired of writing True and False and the line got so long that I resorted to 1 and 0
If you are writing code like if foo == True or bar == False, then, well, stop that.
Just do if foo or not bar and your lines are shorter.
~ will be deprecated and while running he told me to use ~int( ) instead
The ~ can hardly become deprecated as it has legitimate use. The way your earlier code used it was not really “legitimate”. Who is “he”?
(sorry, more off-topic stuff)
-
Not if foo == True; I did it just like you suggested. Check out the code below, with no integers and no bitwise operators:
if left and right and compress:
    text=re.sub( rf'(\s*)({re_pattern})(\s*)', r'\2', text )
elif left and not right and compress:
text=manipulate_operators( text, operators, compress=True, left=True, right=True )
The ~ will be deprecated. ‘HE’ is the machine, the code, the all-knowing algorithm that always tells you what goes wrong :)
elif left and ~ right and compress:
Warning (from warnings module):
File "MyPyFormat npp 08 ~ deprecated.py", line 27
text=re.sub( rf'(\s+)({re_pattern})', r'\2', text )
DeprecationWarning: Bitwise inversion '~' on bool is deprecated and will be removed in Python 3.16. This returns the bitwise inversion of the underlying int object and is usually not what you expect from negating a bool. Use the 'not' operator for boolean negation or ~int(x) if you really want the bitwise inversion of the underlying int.
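What the warning text describes can be seen in a few lines. The sketch uses ~int(x), as the message itself recommends, so it does not trigger the warning; the function names are made up for the example.

```python
def bitwise_invert(flag):
    # Bitwise inversion of the underlying int: ~1 is -2 and ~0 is -1.
    return ~int(flag)


def logical_negate(flag):
    # Boolean negation: True becomes False and False becomes True.
    return not flag
```

Both -2 and -1 are truthy, so `if ~int(flag):` is taken for True and for False alike, whereas `if not flag:` behaves as a negation.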
-
@Robert-Jablko said:
DeprecationWarning: Bitwise inversion ‘~’ on bool is deprecated
This is quite different from your original (blanket) statement of “~ is deprecated”.