    Python Script editor.replace Bug? concerning the characters '(' and ')'

    33 Posts 5 Posters 4.5k Views
    • Alan KilbornA
      Alan Kilborn @Robert Jablko
      last edited by Alan Kilborn

      @Robert-Jablko said in Python Script editor.replace Bug? concerning the characters '(' and ')':

      After that I switched my text ANSI and it worked, but then Python wouldn’t run anymore.

      Use PythonScript 2.x if you need to work with non-Unicode data.
      You can use it for Unicode data as well, but there are sometimes more hoops to jump through.
      In general, go strictly with PythonScript 3.x when you have no need to work with ANSI files, and your life will be smoother.

      • Robert JablkoR
        Robert Jablko @Alan Kilborn
        last edited by

        @Alan-Kilborn

        Interesting. So it is worth spending time with both Python re and the Boost regex in npp. I may be able to read documentation, but not that of the Boost regex in npp; it’s too confusing. Still, I started with your suggestion and did some trial and error in the Ctrl+H dialog of npp. I tried “Find Next” until he did exactly what I needed. I came up with these solutions.

        BRACKETS

        • opening brackets with no space after
        (?!\s)[(\[{](?!\s)
        ${0}\x20
        
        • closing brackets with no space before
        (?<!\s)[)\]}](?<!\s) 
        \x20${0}
        

        Brackets work nicely!
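
For comparison, the same bracket padding can be sketched with Python's re module. This is only an illustration, not the search expressions above run through PythonScript: the function name `pad_brackets` is made up here, and the leading `(?!\s)` and trailing `(?<!\s)` of the originals are left out because they only test the bracket itself.

```python
import re

def pad_brackets(text):
    """Add a space after opening and before closing brackets where none exists."""
    # Opening bracket not followed by whitespace: append a space.
    text = re.sub(r'[(\[{](?!\s)', r'\g<0> ', text)
    # Closing bracket not preceded by whitespace: prepend a space.
    text = re.sub(r'(?<!\s)[)\]}]', r' \g<0>', text)
    return text

print(pad_brackets('f(x[1])'))
```

Running it a second time on its own output changes nothing, because every bracket then already has its space.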

        OPERATORS

        • finds operators with surrounding spaces, whether the space is on the left or on the right
        (\h([+-]=?)\h*)|(\h*([+-]=?)\h)
        ${1}
        
        • With operators there is a problem: it only works for + - += -=; as soon as I add * or any other character, he says it’s invalid syntax

        Another problem: I cannot get any of my solutions to run in PythonScript. I tried a lot and have no idea why. “Does not match C++ signature”

        editor.getText( )
        editor.replace( "\h([+-]=?)\h*)|(\h*([+-]=?)\h", "${1}" )
        editor.rereplace( '\h([+]=?)\h*', "${1}" )
        editor.research( '\h([+]=?)\h*' )
        
        • mkupperM
          mkupper @Robert Jablko
          last edited by

          @Robert-Jablko said in Python Script editor.replace Bug? concerning the characters '(' and ')':

          With operators there is a problem: It’s only working for + - += -=, as soon as I add * or any other character, he says it’s invalid syntax

          You did not show which expression you were trying to update but did include:

          \h([+-]=?)\h*)|(\h*([+-]=?)\h
                       ^ ^ 
          

          I think you had swapped the left and right hand parts and as part of that accidentally swapped the outermost parentheses pair so that they are now inside. Your intent was probably:

          (\h([+*-]=?)\h*|\h*([+*-]=?)\h)
          

          I also added the * for you. I suspect you had tried [+-*] which is invalid syntax as you tried to define a range from ‘+’ to ‘*’ which is backwards in ASCII. You can include a - within [] brackets either making it the first character, the last character, or prefixing it with \. All of these are valid ways to include a - within the brackets: [-...], [...-], and [...\-...].
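
Python's re module rejects the reversed range in the same way, so the rule is easy to try outside Notepad++. A minimal sketch (the helper `compiles` is a name invented for this example; Boost's exact error text is not shown here):

```python
import re

def compiles(pattern):
    """Return True if Python's re accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# '+' is 0x2B and '*' is 0x2A, so '+-*' is a backwards range and is rejected.
print(compiles(r'[+-*]'))

# The three placements of '-' described above all compile and match '-'.
for char_class in (r'[-+*]', r'[+*-]', r'[+\-*]'):
    print(char_class, re.findall(char_class, 'a+b-c*d'))
```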

          • Robert JablkoR
            Robert Jablko @mkupper
            last edited by Robert Jablko

            @mkupper

            Thank you.

            I must admit that dealing with RegEx has been very frustrating. You think you understand something and then this ‘-’ comes along :) I used your hint and put the ‘-’ at the end for now.

            As for the search expression, it is just like I said before that I have no clue how and why it works, but my version works. It removes spaces around the operators ‘+’, ‘*’, ‘=’, ‘-’ and ‘+=’, ‘*=’, ‘==’, ‘-=’, both if they have a space on the left or on the right.

            The whole idea is to format python code the way I like it, with padding around parentheses and compressed operators. It’s a personal choice.

            (\h([+*=-]=?)\h*)|(\h*([+*=-]=?)\h)
            ${2}
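
One thing that may be worth checking with this expression: the operator is captured by group 2 in the left-hand branch but by group 4 in the right-hand branch. The sketch below reproduces the pattern in Python's re, with `[ \t]` standing in for `\h`, and compares a replacement that uses only group 2 with one that uses both groups. The function names are invented for the example, and whether the Replace dialog behaves identically is an assumption that should be tested there (the equivalent replacement would presumably be `${2}${4}`).

```python
import re

# Python spelling of the search expression; [ \t] approximates Boost's \h.
PATTERN = re.compile(r'([ \t]([+*=-]=?)[ \t]*)|([ \t]*([+*=-]=?)[ \t])')

def compress_group2_only(text):
    """Replace each match with group 2, as in the replacement above."""
    return PATTERN.sub(r'\2', text)

def compress_either_branch(text):
    """Replace each match with whichever of group 2 or group 4 took part."""
    # The group that did not participate expands to an empty string (Python 3.5+).
    return PATTERN.sub(r'\2\4', text)

for sample in ('a + b', 'a +b', 'a+ b'):
    print(repr(sample), repr(compress_group2_only(sample)), repr(compress_either_branch(sample)))
```

In this Python version, the input `a+ b` has no space on the left, so only the right-hand branch can match and group 2 stays empty.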
            
            • Robert JablkoR
              Robert Jablko @Ekopalypse
              last edited by

              @Ekopalypse

              Hello Ekopalypse, I still want to succeed with the Python script. So I went deep diving into python re with the idea of creating a code that formats the text both the way I like it and back to standard python.

              What I found is that RegEx is useful for short code, and I managed to design one function for both compressing and adding, and for any operator. But it is also global and cannot format the way I want in one step, so I have to reduce spaces with another function. I think I have included all operators now; ‘:’ was wrong. Not included are words like ‘for’, ‘while’, ‘and’, ‘or’, ‘not’, ‘in’, etc.

              The biggest challenge was to understand that I have to sort the operators by length, to avoid ‘==’ being padded to ‘ = = ’. This one line does the magic and also takes care of any escape characters.

              re_pattern=( '|' ).join( re.escape( operator ) for operator in sorted( operators, key=len, reverse=True ) )
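
To see why the ordering matters, here is a small sketch that builds the alternation with and without the sort and pads the same input. The helper names `build_alternation` and `pad` and the shortened operator list are made up for the demonstration.

```python
import re

def build_alternation(operators, longest_first=True):
    """Join escaped operators with '|', optionally longest first."""
    ordered = sorted(operators, key=len, reverse=True) if longest_first else list(operators)
    return '|'.join(re.escape(op) for op in ordered)

def pad(text, alternation):
    """Put one space on each side of every matched operator."""
    return re.sub(rf'({alternation})', r' \1 ', text)

operators = ['=', '+', '==', '+=']
print(repr(pad('a==b', build_alternation(operators, longest_first=False))))
print(repr(pad('a==b', build_alternation(operators))))
```

A regex alternation takes the first branch that matches at a position, so with `=` listed before `==` each `=` is padded on its own.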
              

              This is the code I came up with. It works.

              import re
              from Npp import editor # PythonScript's object for the active document
              ################################################################################
              def reduce_spaces( text ):
                  ''' reduces more than one spaces to one space, 
                      ignores spaces at start of line to ensure code '''
                  lines=text.splitlines( )
                  result=[ ]
                  for line in lines:
                      match=re.match( r'^\s+', line ) # spaces at start of line  
                      if match:
                          start=match.group( ) # returns as string
                          line=start+re.sub( r'\s{2,}', ' ', line[ len( start ): ] )
                      else:
                          line=re.sub( r'\s{2,}', ' ', line )
                      result.append( line )
                  return '\n'.join( result )
              ################################################################################
              def manipulate_operators( text, operators, compress=True, left=True, right=True ):
                  ''' adds or removes space on left, right or both sides of operators ''' 
                  
                  # create search pattern for operators, joins with |=or, takes care of escape characters and sorts them by length-reversed, first: +=, == ... then + =
                  re_pattern=( '|' ).join( re.escape( operator ) for operator in sorted( operators, key=len, reverse=True ) ) 
                  
                  if left&right&compress:
                      text=re.sub( rf'(\s*)({re_pattern})(\s*)', r'\2', text )
                  elif left&~int( right )&compress:
                      text=re.sub( rf'(\s+)({re_pattern})', r'\2', text )
                  elif ~int( left )&right&compress:
                      text=re.sub( rf'({re_pattern})(\s+)', r'\1', text )
                  elif left&right&~int( compress ):
                      text=re.sub( rf'({re_pattern})', r' \1 ', text )
                  elif left&~int( right )&~int( compress ):
                      text=re.sub( rf'({re_pattern})', r' \1', text )
                  elif ~int( left )&right&~int( compress):
                      text=re.sub( rf'({re_pattern})', r'\1 ', text )
                  else: #~int( left&right )&( compress|~int( compress ) )
                      pass
              
                  return text
              ################################################################################
              def my_py_format( text, form='my' ): # form is 'my' or 'standard'
                  ''' creates individual format of python code
                      compressed operators and padded brackets or standard '''
                  
                  operators=['+', '-', '*', '**', '/', '//', '%', '&', '|', '^', '~', '<<', '>>', '=', '+=', '-=', '*=', '**=', '/=', '//=', '%=', '&=', '|=', '^=', '<<=', '>>=', '==', '!=', '>', '<', '>=', '<=']
                  operators_parentheses_open=[ '(', '[', '{']
                  operators_parentheses_close=[ ')', ']', '}']
                  
                  if form=='my':
                      text=manipulate_operators( text, operators, compress=1, left=1, right=1 )
                      text=manipulate_operators( text, operators_parentheses_open, compress=0, left=0, right=1 )  
                      text=manipulate_operators( text, operators_parentheses_close, compress=0, left=1, right=0 )  
                  elif form=='standard':
                      text=manipulate_operators( text, operators, compress=0, left=1, right=1 )
                      text=manipulate_operators( text, operators_parentheses_open, compress=1, left=0, right=1 )  
                      text=manipulate_operators( text, operators_parentheses_close, compress=1, left=1, right=0 )  
                  
                  # RegEx global, reduce spaces after padding, but not at start of line
                  text=reduce_spaces( text ) 
                  
                  return text 
              ################################################################################
              if __name__=='__main__':
                  editor.setText( my_py_format( editor.getText( ), form='my' ) )
                  #editor.setText( my_py_format( editor.getText( ), form='standard' ) )
              

              Next iteration will be to ignore strings and comments, otherwise all my RegEx-Search-Strings will not work anymore. But that is another chapter.

              • Alan KilbornA
                Alan Kilborn @Robert Jablko
                last edited by Alan Kilborn

                @Robert-Jablko

                I just took a very quick glance at your latest script above. Nice job, it seems like you are learning and advancing as you go; always a good thing.

                One thing that struck me in my quick look was that you pass “left” and “right” as integers, but the receiving function’s arguments default to “True”, and then later in the function you go back to integers again. You really should pick either integer or boolean and stay with it; I’d suggest the booleans in this case.

                You rapidly get into “trouble” with this approach, e.g. a line like this: elif left&~int( right )&compress:. I’d suggest that is much better as: elif left and not right and compress:; no need to resort to bit level manipulation operator &.
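
As a rough check of that suggestion, the sketch below writes the branch selection both ways and compares them over all eight flag combinations. The functions and the returned labels are invented for illustration; only the conditions are taken from the script.

```python
from itertools import product

def branch_bitwise(left, right, compress):
    """Branch selection with & and ~int(), as in the script."""
    if left & right & compress:
        return 'compress both'
    elif left & ~int(right) & compress:
        return 'compress left'
    elif ~int(left) & right & compress:
        return 'compress right'
    elif left & right & ~int(compress):
        return 'pad both'
    elif left & ~int(right) & ~int(compress):
        return 'pad left'
    elif ~int(left) & right & ~int(compress):
        return 'pad right'
    return 'unchanged'

def branch_boolean(left, right, compress):
    """The same selection written with and / not."""
    if left and right and compress:
        return 'compress both'
    elif left and not right and compress:
        return 'compress left'
    elif not left and right and compress:
        return 'compress right'
    elif left and right and not compress:
        return 'pad both'
    elif left and not right and not compress:
        return 'pad left'
    elif not left and right and not compress:
        return 'pad right'
    return 'unchanged'

for flags in product((False, True), repeat=3):
    print(flags, branch_bitwise(*flags), branch_boolean(*flags))
```

Both versions pick the same branch for every combination, so the boolean form costs nothing in behaviour and reads more directly.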

                But these are Python things, not directly related to Notepad++. Hopefully it is OK to be “off topic” if one is attempting to help one become a better programmer so their N++ scripts will be improved.

                • Mark OlsonM
                  Mark Olson
                  last edited by Mark Olson

                  @Robert-Jablko

                  Agree with @Alan-Kilborn that you are getting a lot better at writing good Pythonic code.

                  I won’t get into details, but you are going to have a very hard time correctly ignoring strings and comments without using the (*SKIP)(*FAIL) backtracking control verbs in your regexes. This is a problem for you in particular because NPP’s Boost supports those operators but Python’s re does not.

                  EDIT: For example, an efficient regex to match the character c only in lines that don’t start with leading whitespace might look like (?-si)^\h+\S.*$(*SKIP)(*FAIL)|c, but that regex is not valid for Python’s re.
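
For Python's re, a commonly used workaround (not equivalent to the backtracking verbs, just a substitute for this kind of task) is to let the first branch of an alternation consume what should be skipped and to capture the wanted text in a group, then decide in a replacement function. A minimal sketch for the same example, with `[ \t]` in place of `\h` and a function name chosen here:

```python
import re

# Branch 1 swallows whole lines that start with whitespace;
# branch 2 captures a 'c' anywhere else.
PATTERN = re.compile(r'^[ \t]+\S.*$|(c)', re.MULTILINE)

def replace_c_outside_indented_lines(text, replacement):
    def substitute(match):
        if match.group(1) is None:
            return match.group(0)  # skipped branch: put the line back unchanged
        return replacement
    return PATTERN.sub(substitute, text)

print(replace_c_outside_indented_lines('abc\n  indented c\ncc', 'X'))
```

The skipped text is still matched and rewritten with itself, so this is likely less efficient than the verb-based regex.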

                  • Alan KilbornA
                    Alan Kilborn @Mark Olson
                    last edited by Alan Kilborn

                    @Mark-Olson said:

                    …using the (*SKIP)(*FAIL) backtracking control verbs in your regexes. This is a problem for you in particular because NPP’s Boost supports those operators but Python’s re does not.

                    If it comes to that, there are some techniques where you can avoid usage of Python re module entirely. I call it the hidden-third-editor technique, look at it HERE.

                    • Robert JablkoR
                      Robert Jablko @Alan Kilborn
                      last edited by Robert Jablko

                      @Alan-Kilborn
                      @Mark-Olson

                      Thank you :)

                      Integers vs. Booleans

                      • They are booleans, 100%. I got tired of writing True and False and the line got so long that I resorted to 1 and 0. I also thought it was clever, but you are right. I should stick to clean coding. I will switch back to True and False.
                      • Same thing with ‘&’ and ‘~’. The code was clean with ‘and’, ‘not’, ‘or’, but while searching for all operators, I found these and just had to apply them. I actually learned something: ~ will be deprecated and while running he told me to use ~int( ) instead. I will go back to clean code.

                      Ignoring string and comments

                      • At the moment I have had enough of this project and need a pause. But once I feel like it, I will first solve it with Python methods, either with my own loop and/or maybe with a library like tokenize. To me that is npp-related, because it has to run in PythonScript.
                      • After that I will go back to the Boost regex, find the right combinations and then maybe make a macro.
                      • I will keep the hidden-third-editor in mind.
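
As a starting point for the tokenize idea, the standard library can report where string literals and comments sit, so a formatter could leave those spans alone. This is a sketch under assumptions: the function name `protected_spans` is invented, the input must be valid Python source, and on Python 3.12 and later f-strings arrive as separate FSTRING tokens that this simple filter does not collect.

```python
import io
import tokenize

def protected_spans(source):
    """Return (kind, start, end) for every string literal and comment.

    start and end are (row, column) pairs as reported by tokenize:
    rows are 1-based, columns are 0-based.
    """
    spans = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.STRING, tokenize.COMMENT):
            spans.append((tokenize.tok_name[tok.type], tok.start, tok.end))
    return spans

sample = "x = 'a+b'  # c+d\ny = 1+2\n"
for span in protected_spans(sample):
    print(span)
```

Only the `1+2` on the second line lies outside the reported spans, which is the part the operator formatting should touch.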
                      • Alan KilbornA
                        Alan Kilborn @Robert Jablko
                        last edited by

                        @Robert-Jablko said:

                        I got tired of writing True and False and the line got so long that I resorted to 1 and 0

                        If you are writing code like if foo == True or bar == False, then, well, stop that.
                        Just do if foo or not bar and your lines are shorter.

                        ~ will be deprecated and while running he told me to use ~int( ) instead

                        The ~ can hardly become deprecated as it has legitimate use. The way your earlier code used it was not really “legitimate”. Who is “he”?


                        (sorry, more off-topic stuff)

                        • Robert JablkoR
                          Robert Jablko @Alan Kilborn
                          last edited by

                          @Alan-Kilborn

                          Not if foo==True; I do it just like you suggested. Check out the version below, with no integers and no bitwise operators:

                          if left and right and compress:
                              text=re.sub( rf'(\s*)({re_pattern})(\s*)', r'\2', text )
                          elif left and not right and compress:
                              text=re.sub( rf'(\s+)({re_pattern})', r'\2', text )
                          
                          text=manipulate_operators( text, operators, compress=True, left=True, right=True )
                          

                          The ~ will be deprecated. ‘HE’ is the machine, the code, the all-knowing algorithm that always tells me what goes wrong :)

                          elif left and ~ right and compress:
                          
                          Warning (from warnings module):
                            File "MyPyFormat npp 08 ~ deprecated.py", line 27
                              text=re.sub( rf'(\s+)({re_pattern})', r'\2', text )
                          DeprecationWarning: Bitwise inversion '~' on bool is deprecated and will be removed in Python 3.16. This returns the bitwise inversion of the underlying int object and is usually not what you expect from negating a bool. Use the 'not' operator for boolean negation or ~int(x) if you really want the bitwise inversion of the underlying int.
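
What the warning describes can be seen in a few lines. The sketch uses `~int(flag)` so that it does not trigger the warning itself; `bitwise_not` is just a name picked for the example.

```python
def bitwise_not(flag):
    """Bitwise inversion of the underlying int, as the warning describes."""
    return ~int(flag)

# Both results are non-zero and therefore truthy, unlike 'not flag'.
for flag in (True, False):
    print(flag, bitwise_not(flag), bool(bitwise_not(flag)), not flag)

# On plain ints, ~ is still the normal way to build a mask, e.g. clearing bit 2.
print(bin(0b1111 & ~0b0100))
```

So a bare `if ~flag:` is true for both True and False, while `~` on integers is unaffected by the deprecation.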
                          
                          • Alan KilbornA
                            Alan Kilborn @Robert Jablko
                            last edited by

                            @Robert-Jablko said:

                            DeprecationWarning: Bitwise inversion ‘~’ on bool is deprecated

                            This is quite different from your original (blanket) statement of ~ is deprecated.

                            • Robert JablkoR
                              Robert Jablko @Alan Kilborn
                              last edited by

                              @Alan-Kilborn

                              don’t want to split hairs, but I said ‘will be deprecated’

                              • Alan KilbornA
                                Alan Kilborn @Robert Jablko
                                last edited by Alan Kilborn

                                @Robert-Jablko said:

                                but I said ‘will be deprecated’

                                Well, you were wrong in saying that, as well.
                                Look at the error message you received:
                                DeprecationWarning: Bitwise inversion '~' on bool is deprecated
                                Clearly that says “is deprecated”.
                                Another possibility is that you don’t truly know what deprecated means.

                                This offshoot discussion could have been avoided if you’d just pasted the text of the entire warning here. Posting something like ~ will be deprecated without any qualifiers is what “got me going”, as that is patently ridiculous. With the qualifiers from the warning message, it makes perfect sense.

                                • Robert JablkoR
                                  Robert Jablko @Alan Kilborn
                                  last edited by

                                  @Alan-Kilborn

                                  ah, is deprecated and will be removed. Point to you ;)
