    Python Script Regex replace with uppercase

      M Andre Z Eckenrode @Alan Kilborn

      @Alan-Kilborn said in Python Script Regex replace with uppercase:

      editor.rereplace(r'(?-i)([A-Z])', ur'\L\1')

      Interesting… I used a nearly identical line of code here, and it works for me:

      editor.rereplace(r'(?-i)^([A-Z])', ur'\L\1')

      Note that my script file is itself ANSI/Windows-1252, and therefore begins with the following line:

      # encoding: Windows-1252

      I’m guessing yours is Unicode, and maybe that’s the interfering factor?

        PeterJones @M Andre Z Eckenrode

        @M-Andre-Z-Eckenrode ,

        Weird.

        I just tried a comparison:

        Zz
        Yy
        XxXx
        
        C_
        À
        Á
        Â
        Ã
        Ä
        Å
        
        1. File that is Encoding > ANSI: [screenshot: file before] => editor.rereplace(r'(?-i)^([A-Z])', ur'\L\1') => [screenshot: file after]

        2. File that is Encoding > Charset > Western > Windows-1252: same file, same rereplace line, no characters go lowercase

        3. File that is Encoding > UTF-8: same file, same rereplace line, no characters go lowercase

        So ANSI works differently than a forced charset or forced UTF-8.

        —
        addendum:

        Notepad++ v8.1.4   (64-bit)
        Build time : Aug 21 2021 - 13:04:59
        Path : C:\usr\local\apps\notepad++\notepad++.exe
        Command Line : 
        Admin mode : OFF
        Local Conf mode : ON
        Cloud Config : OFF
        OS Name : Windows 10 Enterprise (64-bit) 
        OS Version : 2009
        OS Build : 19042.1165
        Current ANSI codepage : 1252
        Plugins : AutoSave.dll ComparePlugin.dll ExtSettings.dll MarkdownViewerPlusPlus.dll mimeTools.dll NppConsole.dll NppConverter.dll NppEditorConfig.dll NppExec.dll NppExport.dll NppFTP.dll NppUISpy.dll PreviewHTML.dll PythonScript.dll QuickText.dll TagLEET.dll XMLTools.dll 
        

        Python 2.7.18 (v2.7.18:8d21aa21f2, Apr 20 2020, 13:25:05) [MSC v.1500 64 bit (AMD64)]

          PeterJones @M Andre Z Eckenrode

          @M-Andre-Z-Eckenrode ,

          Instead of doing the lowercase through a regex replacement in the rereplace, what about a lambda function? editor.rereplace(r'(?-i)^([A-Z])', lambda m: m.group(1).lower()) worked in all three of the test file conditions I listed in my previous post.

          And the opposite, which your original question asked for, would be editor.rereplace(r'(?-i)^([a-z])', lambda m: m.group(1).upper())
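For anyone who wants to try the function-as-replacement idea outside Notepad++, here is a minimal sketch using the standard library's re.sub, which also accepts a callable as the replacement. This is a stand-in and not the PythonScript API: editor.rereplace only exists inside the plugin, and the names sample, lowered and raised are made up for illustration. The (?m) flag is added so that ^ matches at the start of every line, and (?-i) is left out because Python's re is case-sensitive by default.

```python
import re

# First three lines of the test file from the comparison above.
sample = "Zz\nYy\nXxXx"

# Lowercase an uppercase letter at the start of each line.
lowered = re.sub(r'(?m)^([A-Z])', lambda m: m.group(1).lower(), sample)

# And the opposite: uppercase a lowercase letter at the start of each line.
raised = re.sub(r'(?m)^([a-z])', lambda m: m.group(1).upper(), lowered)

print(lowered)
print(raised)
```

Only the first letter of each line changes, so the second X in XxXx is left alone.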

            M Andre Z Eckenrode @PeterJones

            @PeterJones said in Python Script Regex replace with uppercase:

            Instead of doing the lowercase through a regex replacement in the rereplace, what about a lambda function?

            I’ve never even heard of that before, but it sounds like a promising work-around unless and until an actual fix for rereplace is in place, if possible. Where can I read more about lambda? I see only a passing mention in the Python Script doc page for ‘Editor Object’, and though typing ‘lambda’ in the search box for the online NPP user manual makes it appear that it can be found in numerous sections including ‘Searching’, I’m unable to locate any specific instance of it there using my browser’s ‘Find’ facility.

              Alan Kilborn @M Andre Z Eckenrode

              @M-Andre-Z-Eckenrode said in Python Script Regex replace with uppercase:

              Where can I read more about lambda?

              lambda functions are part of Python, not specific to Notepad++'s PythonScript plugin.

              Read more about them by “googling” for “lambda functions in Python”.

              lambda functions are available in other programming languages as well, so it is not even something specific to Python (but that’s the context here, so…).

                M Andre Z Eckenrode @Alan Kilborn

                @Alan-Kilborn said in Python Script Regex replace with uppercase:

                Read more about them by “googling” for “lambda functions in Python”.

                Ok, thanks much.

                  PeterJones @M Andre Z Eckenrode

                  @M-Andre-Z-Eckenrode said in Python Script Regex replace with uppercase:

                  I see only a passing mention in the Python Script doc page for ‘Editor Object’,

                  There’s only a passing mention because, as @Alan-Kilborn explained, it’s a standard feature in Python (and elsewhere).

                  And you don’t even need to know about lambdas for the problem at hand: Really, you just need to learn, as the PythonScript documentation showed, that rereplace allows either a replacement expression or a function that it will call on the matching text. A lambda function or a normally-defined function will both work equally well (like the infamous add_1 in the PS docs). The function accesses the text of the match through m.group(#), where # is the number of a capture group in your search expression. The function should return the text that you want to replace the entire match with (in your case, the function-based equivalent of \U\1). So when rereplace finds a match, it will send that match as m to the function, and then the function returns the replacement value; then rereplace moves on to the next match and calls the function again, until no more matches are found. (To make it abundantly clear, your function does not need to loop through the matches; that is handled by rereplace; your function just needs to transform one match m into the text to be used as the replacement.)

                  The call editor.rereplace(r'(?-i)^([a-z])', lambda m: m.group(1).upper()) is exactly equivalent to the longer script

                  def do_capitalize(m):
                      return m.group(1).upper()
                  
                  editor.rereplace(r'(?-i)^([a-z])', do_capitalize)
                  

                  … but it fits nicely in a one-liner. If your replacement function required more than one line (if you wanted to build a more complicated string through various calculations), then you’d have to use the defined-function variant instead of a lambda function.
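As a rough illustration of the multi-line case, the sketch below uses a normally-defined function that builds its result in several steps. It runs against the standard library's re.sub as a stand-in for editor.rereplace, so it can be tried outside Notepad++. The names capitalize_line, calls and text are hypothetical, and the calls list is there only to show that the function is called once per match and never loops itself.

```python
import re

calls = []  # records every match the function is handed

def capitalize_line(m):
    # m is a single match; the looping over matches is done by re.sub.
    calls.append(m.group(0))
    first = m.group(1).upper()   # first capture group: the leading letter
    rest = m.group(2).lower()    # second capture group: the rest of the line
    return first + rest          # this text replaces the entire match

text = "hELLO\nwORLD"
result = re.sub(r'(?m)^([a-z])(.*)$', capitalize_line, text)

print(result)
print(calls)
```

Inside PythonScript, the same function could presumably be passed to editor.rereplace in place of the lambda, as in the do_capitalize example above.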

                    M Andre Z Eckenrode @PeterJones

                    @PeterJones said in Python Script Regex replace with uppercase:

                    you don’t even need to know about lambdas for the problem at hand: Really, you just need to learn, as the PythonScript documentation showed, that rereplace allows either a replacement expression or a function that it will call on the matching text.

                    Noted, and thank you for the more detailed explanation. Although I can’t think of any immediate use for a lambda function other than your helpful suggestion for rereplace just now, it’s certainly possible one will come to me in the future, so I’m happy to learn more about it than I absolutely have to for my present needs, even though a refresher and more thorough study of it will surely be necessary when the time comes.

                    Thanks again to you and @Alan-Kilborn for your help.

                      Alan Kilborn

                      It might have been instructive for the PythonScript docs to have shown the add_1 example as a lambda, e.g.:

                      editor.rereplace('X([0-9]+)', lambda m: 'Y' + str(int(m.group(1)) + 1))


                      For those without easy access to the PythonScript docs, here’s what they DO show:

                      def add_1(m):
                          return 'Y' + str(int(m.group(1)) + 1)
                      
                      # replace X followed by numbers by an incremented number
                      # e.g.   X56 X39 X999
                      #          becomes
                      #        Y57 Y40 Y1000
                      
                      editor.rereplace('X([0-9]+)', add_1);
                      

                      And, No, I’ve no idea what’s up with the trailing semicolon on the last line of that example.
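To check that the lambda and the defined function really do the same thing, here is a small sketch that runs both against the sample text from the docs comment. It uses the standard library's re.sub instead of editor.rereplace (an assumption made so the snippet runs outside Notepad++); the variable names text, with_def and with_lambda are invented for the example.

```python
import re

def add_1(m):
    # Replace "X<number>" with "Y<number + 1>".
    return 'Y' + str(int(m.group(1)) + 1)

text = 'X56 X39 X999'

# Defined-function form, as in the PythonScript docs.
with_def = re.sub('X([0-9]+)', add_1, text)

# Lambda form of the same replacement.
with_lambda = re.sub('X([0-9]+)', lambda m: 'Y' + str(int(m.group(1)) + 1), text)

print(with_def)
print(with_lambda)
```

Both calls give Y57 Y40 Y1000, matching the comment in the docs example.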

                        Ekopalypse

                        Sorry for the late and already too late reply, but I usually stay away from the computer on weekends.

                        I assume that Python does its string processing before the boost::regex function gets a chance to interpret the string, but I’ve never really looked into it. The lambda or explicit function solution seems to be the way to solve this problem.
