    another replacement request for help

    • Ekopalypse

      @Ekopalypse said in another replacement request for help:

      <UWI>(.*)</UWI>(?s)(\R.+?)<PrimaryProduct>\K.+?(?=</PrimaryProduct>)

      to be used in the script it would be

      '<UWI>({})</UWI>(?s)(\R.+?)<PrimaryProduct>\K.+?(?=</PrimaryProduct>)'
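
A minimal sketch of how the `{}` placeholder could be filled in before the pattern is handed to the script's replace call. The UWI value `12345` and the helper name `build_pattern` are made up for illustration. The pattern relies on `\K` and `\R`, which Notepad++'s regex engine understands but Python's own `re` module does not, so the string is only built here, not executed.

```python
# Pattern template from the post; {} marks where the UWI value goes.
TEMPLATE = r'<UWI>({})</UWI>(?s)(\R.+?)<PrimaryProduct>\K.+?(?=</PrimaryProduct>)'

def build_pattern(uwi):
    """Return the search pattern for one (hypothetical) UWI value."""
    return TEMPLATE.format(uwi)

pattern = build_pattern('12345')
print(pattern)
```

If a real UWI value contains regex metacharacters, it would presumably need escaping first; that step is left out of this sketch.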
      
      • Carlos J. Segnini R.

        Thanks @Ekopalypse, I still owe you a beer.
        I’m sure I’ll come back with more questions, so maybe more than one beer, haha.

        • guy038

          Hi, @ekopalypse and All,

          Just back from some holidays in Brittany ! Good day to everybody ;;-))

          Looking back to your first script, of this discussion, it solves an interesting and general task :

          • Given a first file, where some words / expressions must be replaced

          • Given a second file containing all these words / expressions and their corresponding replacements, with a tabulation as a separator

          This script executes, in the first file, all at once, the different replacements from the second file’s table ;-))


          Now, considering the slightly modified script, where the range of chars searched is contained in tags <TEST>......</TEST>

          from Npp import editor1, editor2
          
          replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)
          
          def replace_with(m):
              return replacements[m.group()]
          
          editor1.rereplace('(?<=<TEST>).+?(?=</TEST>)', replace_with)
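
To see what the script does without opening Notepad++, here is a rough stand-alone sketch using Python's `re` module. The two strings `text` and `table` are hypothetical stand-ins for the contents of new 1 and new 2; everything else follows the script above.

```python
import re

# Stand-ins for editor1.getText() and editor2.getText() (made-up sample data).
text = "<TEST>BEFORE</TEST>\n<WellType>Oil</WellType>\n<TEST>And BEFORE</TEST>\n"
table = "BEFORE\tAfter\nAnd BEFORE\tAfter 2\n"

# Same dictionary construction as in the script: one "key<TAB>value" pair per line.
replacements = dict(line.split('\t') for line in table.splitlines() if line)

def replace_with(m):
    # m.group() is the text found between <TEST> and </TEST>.
    return replacements[m.group()]

result = re.sub(r'(?<=<TEST>).+?(?=</TEST>)', replace_with, text)
print(result)
```

Only the text between the tags is matched, because the tags themselves sit in a look-behind and a look-ahead.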
          

          May I ask you two questions :

          • When the file to be modified ( new 1 ) contains, in a tag <TEST>......</TEST>, a value not listed in the new 2 file, the script stops, no replacement occurs and the console message KeyError: 'ABCDE' is displayed

          Example, with new 1 :

          <Wellcollection>
          	<Well replace="false">
          		<TEST>BEFORE</TEST>
          		<WellType>Oil</WellType>
          		<ReserveStatusCollection>
          		<TEST>And BEFORE</TEST>
          		<WellType>Oil</WellType>
          		<ReserveStatusCollection>
          		<TEST>ABCDE</TEST>
          		<WellType>Oil</WellType>
          		<ReserveStatusCollection>
          

          and new 2

          BEFORE	After
          And BEFORE	After
          

          Could it be possible to avoid errors for all key values not listed in new 2 ?


          • Secondly, how to ignore the case of the key values ? I mean, instead of the obvious method, below, in new 2 :
          BEFORE	After
          And BEFORE	After
          and before	After
          

          I tried this version, without success :

          from Npp import editor1, editor2
          import re
          
          replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)
          
          def replace_with(m):
              return replacements[m.group()]
          
          editor1.rereplace('(?<=<TEST>).+?(?=</TEST>)', replace_with, re.IGNORECASE)
          

          with new 1 :

          <Wellcollection>
          	<Well replace="false">
          		<TEST>BEFORE</TEST>
          		<WellType>Oil</WellType>
          		<ReserveStatusCollection>
          		<TEST>And BEFORE</TEST>
          		<WellType>Oil</WellType>
          		<ReserveStatusCollection>
          		<TEST>and before</TEST>
          		<WellType>Oil</WellType>
          		<ReserveStatusCollection>
          
          

          and new 2 :

          BEFORE	After
          And BEFORE	After
          

          I still had the error :

          KeyError: 'and before'

          TIA,

          Best Regards,

          guy038

          Afterwards, I’ll do additional tests with your script and the regex S/R solution, below :

          SEARCH (?is)(?<=<TEST>)(.+?)(?=</TEST>.+^---.+^\1\t(.+?)$)|^---.+

          REPLACE ?1\2

          This S/R executed against this text :

          <Wellcollection>
          	<Well replace="false">
          		<TEST>BEFORE</TEST>
          		<WellType>Oil</WellType>
          		<ReserveStatusCollection>
          		<TEST>And BEFORE</TEST>
          		<WellType>Oil</WellType>
          		<ReserveStatusCollection>
          		<TEST>and before</TEST>
          		<WellType>Oil</WellType>
          		<ReserveStatusCollection>
          		<TEST>And BEFORE</TEST>
          		<WellType>Oil</WellType>
          		<ReserveStatusCollection>
          		<TEST>BEFORE</TEST>
          		<WellType>Oil</WellType>
          		<ReserveStatusCollection>
          ---
          BEFORE	After
          And BEFORE	After 2
          

          gives this changed text :

          <Wellcollection>
          	<Well replace="false">
          		<TEST>After</TEST>
          		<WellType>Oil</WellType>
          		<ReserveStatusCollection>
          		<TEST>After 2</TEST>
          		<WellType>Oil</WellType>
          		<ReserveStatusCollection>
          		<TEST>After 2</TEST>
          		<WellType>Oil</WellType>
          		<ReserveStatusCollection>
          		<TEST>After 2</TEST>
          		<WellType>Oil</WellType>
          		<ReserveStatusCollection>
          		<TEST>After</TEST>
          		<WellType>Oil</WellType>
          		<ReserveStatusCollection>
          

          I bet that your script can manage larger files than my regex solution !

          • Ekopalypse @guy038

            @guy038

            thank you very much for checking the script.
            A solution concerning missing replacements is to replace with what was
            found or what I later suggested, creating unique search strings from the
            list of replacements.
            The first approach might work better for large files, the second seemed
            more reasonable at this point.

            To ignore case - let’s change the replacement keys to lower case
            and check against the lower-cased matches, but replace with the defined replacement values. So a replacement dictionary would look like this

                replacements = {'täst': 'TEST'}
            

            and would replace either Täst or täst with TEST always.

            Let me think about it again.
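
The lower-cased-keys idea described in this post could be sketched roughly as follows, again with plain `re` instead of the editor object. The sample `text` is made up, Python 3 strings are assumed so that `lower()` handles `ä` correctly, and leaving unknown values unchanged is an added assumption.

```python
import re

# Keys are stored in lower case; values keep the exact replacement text.
replacements = {'täst': 'TEST'}

def replace_with(m):
    found = m.group()
    # Look up the lower-cased match; fall back to the found text if unknown.
    return replacements.get(found.lower(), found)

text = "<TEST>Täst</TEST> <TEST>täst</TEST> <TEST>other</TEST>"
result = re.sub(r'(?<=<TEST>).+?(?=</TEST>)', replace_with, text)
print(result)
```

The regex flags play no part here: the dictionary lookup is what has to be made case-insensitive, which is why passing `re.IGNORECASE` alone still ends in a `KeyError`.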

            • Alan Kilborn @Ekopalypse

              @Ekopalypse said in another replacement request for help:

              A solution concerning missing replacements is to replace with what was
              found or what I later suggested, creating unique search strings from the
              list of replacements.

              Not sure about the “later suggested” part, but this seems like a quick solution to the former:

              def replace_with(m):
                  try:
                      r = replacements[m.group()]
                  except KeyError:
                      r = m.group()
                  return r
              

              Disclaimer: I didn’t actually try it. :-)
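
A shorter variant of the same idea, as a sketch only: `dict.get` takes a default value, so the `try` / `except` can be folded into one line. The sample `text` and dictionary are made up, and plain `re.sub` stands in for the editor call.

```python
import re

replacements = {'BEFORE': 'After', 'And BEFORE': 'After'}

def replace_with(m):
    # dict.get returns its second argument when the key is missing,
    # so unknown values are written back unchanged.
    return replacements.get(m.group(), m.group())

text = "<TEST>BEFORE</TEST>\n<TEST>ABCDE</TEST>\n"
result = re.sub(r'(?<=<TEST>).+?(?=</TEST>)', replace_with, text)
print(result)
```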

              • Ekopalypse @Alan Kilborn

                @Alan-Kilborn

                Yep, that is another solution as well.

                • guy038

                  Hello @alan-kilborn, @ekopalypse and All,

                  I’ve just tried your solution, Alan, and it worked like a charm ;-)) Clever : when an entry is missing in the dictionary, the script simply rewrites the key value unchanged

                  So, if you realize that you forgot some entries in the new 2 file, ( so some replacements ! ), the easy workaround is :

                  • Add all these new entries as well as their corresponding replacement parts

                  • Re-run the script


                  Now, regarding the case problem, we may run a regex S/R to get a unique case for an expression, before running the script

                  For instance, instead of defining, in new 2, these 5 entries :

                  BEFORE	After
                  Before	After
                  before	After
                  BEFore	After
                  befORE	After
                  
                  • Firstly, we would use the regex S/R, against our text in new 1 :

                    • SEARCH (?i)Before

                    • REPLACE \U$0

                  • Secondly, we would re-run the Python script, with only 1 entry, in new 2 :

                  BEFORE	After
                  

                  Best Regards

                  guy038

                  • Alan Kilborn @guy038

                    @guy038 said in another replacement request for help:

                    Now regarding the case’s problem,

                    Yes, I didn’t see a quick mod to the original script, and I didn’t want to alter its “simple elegance” to hack in casing support. Of course, it’s possible to do, but the original was such a work of art… :-)

                    • guy038

                      Hi, @alan-kilborn, @ekopalypse and All,

                      Like @ekopalypse, I must have been tired, yesterday, while posting !

                      Regarding the case’s problem, no need to re-run the script : Just run the following S/R, of course :

                      • SEARCH (?i)Before

                      • REPLACE After


                      So, for all the remaining expressions which must be replaced, whatever their case, better to use the general regex :

                      • SEARCH (?i)(Expression_1)|(Expression_2)|(Expression_3)....

                      • REPLACE (?1Replacement_1)(?2Replacement_2)(?3Replacement_3)...
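
For comparison, a rough Python counterpart of this alternation-plus-conditional-replacement S/R. Python's `re` has no `(?1...)` replacement syntax, so this sketch swaps in a callback that uses `m.lastindex` to see which alternative matched. The expression / replacement pairs and the sample text are made up.

```python
import re

# One capturing group per expression, in the same order as the replacements.
pairs = [('before', 'After'), ('guy', 'Guy038')]
pattern = re.compile(
    '|'.join('({})'.format(re.escape(expr)) for expr, _ in pairs),
    re.IGNORECASE,
)

def pick(m):
    # m.lastindex is the number of the group that matched (1-based).
    return pairs[m.lastindex - 1][1]

result = pattern.sub(pick, "BEFORE and Guy and before")
print(result)
```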

                      BR

                      guy038

                      • guy038

                        Hello, @ekopalypse and All,

                        After tests, I can confirm that any regex of the kind below :

                        SEARCH (?is)(?<=<TEST>)(.+?)(?=</TEST>.+^---.+^\1\t(.+?)$)|^---.+

                        REPLACE ?1\2

                        does not work properly ( a single match of all the file contents ! ) as soon as the scanned file contains more than about 200 lines :-(


                        Whereas your Python script, with @alan-kilborn’s modification, works nicely !

                        Given the initial XML file, below, in new 1 tab :

                        <Wellcollection>
                        	<Well replace="false">
                        		<TEST>BEFORE</TEST>
                        		<WellType>Oil</WellType>
                        		<ReserveStatusCollection/>
                        		<TEST>Guy</TEST>
                        		<WellType>Oil</WellType>
                        		<ReserveStatusCollection/>
                        		<TEST>00000</TEST>
                        		<WellType>Oil</WellType>
                        		<ReserveStatusCollection/>
                        		<TEST>00001</TEST>
                        		<WellType>Oil</WellType>
                        		<ReserveStatusCollection/>
                        		<TEST>00002</TEST>
                        		<WellType>Oil</WellType>
                        		<ReserveStatusCollection/>
                        ....
                        ....
                        		<TEST>99998</TEST>
                        		<WellType>Oil</WellType>
                        		<ReserveStatusCollection/>
                        		<TEST>99999</TEST>
                        		<WellType>Oil</WellType>
                        		<ReserveStatusCollection/>
                        		<TEST>BEFORE</TEST>
                        		<WellType>Oil</WellType>
                        		<ReserveStatusCollection/>
                        		<TEST>Guy</TEST>
                        		<WellType>Oil</WellType>
                        		<ReserveStatusCollection/>
                        	</Well>
                        </Wellcollection>
                        

                        and the new 2 contents, in the secondary view :

                        BEFORE	After
                        Guy	Guy038
                        

                        and the Python script :

                        from Npp import editor1, editor2
                        
                        replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)
                        
                        def replace_with(m):
                            try:
                                r = replacements[m.group()]
                            except KeyError:
                                r = m.group()
                            return r
                        
                        editor1.rereplace('(?<=<TEST>).+?(?=</TEST>)', replace_with)
                        

                        => On my old Win XP laptop, it took 26s to execute :

                        • 100,000 identical replacements ( values 00000 to 99999 are rewritten unchanged, as they are not in the dictionary )

                        • 2 replacements from BEFORE to After

                        • 2 replacements from Guy to Guy038

                        So, a script solution is definitely the right solution to adopt to solve this general task !

                        Best regards

                        guy038

                        However, in my example, it would be valuable to find out a method to avoid all these identical replacements ;-))
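
One possible way to avoid the identical replacements, sketched under assumptions: build the search pattern from the dictionary keys themselves, so that values missing from the table never match at all. This looks close to the "unique search strings from the list of replacements" idea mentioned earlier in the thread, but it is only a guess at it. The sample data is made up, plain `re` stands in for the editor call, and whether `re.escape` output is accepted as-is by Notepad++'s regex engine is untested.

```python
import re

replacements = {'BEFORE': 'After', 'Guy': 'Guy038'}

# Only values that are actual dictionary keys can match, so nothing is
# rewritten with itself.
keys = '|'.join(re.escape(k) for k in replacements)
pattern = r'(?<=<TEST>)(?:{})(?=</TEST>)'.format(keys)

text = "<TEST>BEFORE</TEST>\n<TEST>00000</TEST>\n<TEST>Guy</TEST>\n"
result, count = re.subn(pattern, lambda m: replacements[m.group()], text)
print(count, result)
```

With this pattern the callback can index the dictionary directly, since every match is guaranteed to be a key.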

                        • Carlos J. Segnini R.

                          Hello friends
                          So one year later I had to do the same replacements, and found that the original script did not work anymore (Same error as reported by @guy038 ). Something must have changed in Notepad++, Python or somewhere else, because the first time it worked fine.

                          Anyway, I updated the code as suggested by @Alan-Kilborn and it worked correctly.
                          Cheers everyone
