
another replacement request for help

  • E
    Ekopalypse @Carlos J. Segnini R.
    last edited by Aug 21, 2020, 9:17 PM

    @Carlos-J-Segnini-R

    By the way, if you are interested in how these regex searches work, see here for a pretty good description.

    • C
      Carlos J. Segnini R.
      last edited by Aug 21, 2020, 9:41 PM

      Thanks again for your help.
      For some reason this time it isn’t working.
      I will try and fix it.

      a4f54196-4f8f-444f-a7b7-15048ba35f66-image.png

      • E
        Ekopalypse @Carlos J. Segnini R.
        last edited by Aug 21, 2020, 9:47 PM

        @Carlos-J-Segnini-R

        Can you open the PythonScript console (Plugins -> Python Script -> Show Console) to see if there is an error?
        The replacement file has company and gas/oil separated by a tab, correct?

        • C
          Carlos J. Segnini R. @Ekopalypse
          last edited by Carlos J. Segnini R. Aug 21, 2020, 9:51 PM Aug 21, 2020, 9:50 PM

          @Ekopalypse here is the log.
          I think it is because not all the wells need to be changed: the first one in the file (Aery #B1H) does not appear in the second file, so the script stops. I will try leaving all wells in, even those that don't need to be changed.

          Traceback (most recent call last):
            File "C:\Users\AppData\Roaming\Notepad++\plugins\Config\PythonScript\scripts\find_replace.py", line 10, in <module>
              editor1.rereplace(search_for, replace_with)
            File "C:\Users\AppData\Roaming\Notepad++\plugins\Config\PythonScript\scripts\find_replace.py", line 6, in replace_with
              return replacements[m.group(1)]
          KeyError: 'Aery #B1H'

          And yes, the other file is separated by tabs.
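
The failure in the traceback can be reproduced outside Notepad++ with plain Python. This is only a sketch: the table contents (`Baker #2`, `Cobb #7`) are made up, and `build_replacements` is a hypothetical helper name, not part of PythonScript.

```python
def build_replacements(table_text):
    """Parse 'key<TAB>value' lines into a dict, skipping blank lines."""
    return dict(line.split('\t', 1) for line in table_text.splitlines() if line)

# Hypothetical replacement table: 'Aery #B1H' is deliberately absent.
table = "Baker #2\tGas\nCobb #7\tOil\n"
replacements = build_replacements(table)

try:
    replacements['Aery #B1H']
    lookup_failed = False
except KeyError:
    # Plain indexing raises KeyError for a key that is not in the dict,
    # which is what stops the script at the first unlisted well.
    lookup_failed = True
```

So the script does not need every well in the table; it needs a lookup that tolerates missing keys, or searches that only target listed wells.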

          • E
            Ekopalypse @Carlos J. Segnini R.
            last edited by Aug 21, 2020, 9:52 PM

            @Carlos-J-Segnini-R

            Yes, that makes sense. Then we need to take another approach, where we
            create the searches based on the second list. Give me a minute.

            • E
              Ekopalypse @Carlos J. Segnini R.
              last edited by Aug 21, 2020, 9:54 PM

              @Carlos-J-Segnini-R

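              from Npp import editor1, editor2

              # Build {well name: new value} from the tab-separated lines in the second view.
              replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)

              def replace_with(m):
                  # group(1) is the well name captured between the <UWI> tags.
                  return replacements[m.group(1)]

              # search_for = '(?<=<UWI>).+?(?=</UWI>)'
              # One search per table entry, so wells missing from the table are never matched.
              for company in replacements:
                  # Raw string keeps \R, \h and \K intact; \K drops the matched prefix from the replaced text.
                  search_for = r'<UWI>({})</UWI>\R\h*<WellType>\K.+?(?=</WellType>)'.format(company)
                  editor1.rereplace(search_for, replace_with)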
              
              • E
                Ekopalypse @Carlos J. Segnini R.
                last edited by Aug 21, 2020, 10:11 PM

                @Carlos-J-Segnini-R

                Depending on the file size, a faster solution might be this:

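                from Npp import editor1, editor2

                # Build {well name: new value} from the tab-separated lines in the second view.
                replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)

                def replace_with(m):
                    # Use the table value if the well is listed; otherwise keep the matched text.
                    new_value = replacements.get(m.group(1))
                    return new_value if new_value else m.group()

                # search_for = '(?<=<UWI>).+?(?=</UWI>)'
                # A single pass over the document: capture any well name, decide in the callback.
                search_for = r'<UWI>(.+)</UWI>\R\h*<WellType>\K.+?(?=</WellType>)'
                editor1.rereplace(search_for, replace_with)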
                

                This scans the text only once; if the company found is not in the
                replacements dictionary, we replace the match with what was found.
                By the way, it's midnight here. Good night.
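
For readers who want to try the single-pass idea without Notepad++, here is a rough equivalent using Python's standard `re` module. It is a sketch under assumptions: the sample XML and table are invented, and because `re` has no `\K`, `\R` or `\h`, the prefix is captured and written back instead of being dropped with `\K`.

```python
import re

replacements = {'Aery #B1H': 'Gas'}  # hypothetical table contents

xml = (
    "<UWI>Aery #B1H</UWI>\n"
    "  <WellType>Oil</WellType>\n"
    "<UWI>Unlisted #1</UWI>\n"
    "  <WellType>Oil</WellType>\n"
)

# group(1): everything up to and including <WellType>, group(2): well name,
# group(3): the current well type that may be replaced.
pattern = re.compile(r'(<UWI>(.+)</UWI>\r?\n[ \t]*<WellType>)(.+?)(?=</WellType>)')

def replace_with(m):
    # Keep the existing well type when the well is not in the table.
    return m.group(1) + replacements.get(m.group(2), m.group(3))

result = pattern.sub(replace_with, xml)
```

Only the listed well changes from `Oil` to `Gas`; the unlisted one is rewritten with its own value, which is the behaviour described in the post.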

                • C
                  Carlos J. Segnini R.
                  last edited by Aug 21, 2020, 10:14 PM

                  The first one worked!
                  Thanks so much for your help, I owe you a beer!
                  Have a good night

                  • C
                    Carlos J. Segnini R.
                    last edited by Aug 21, 2020, 10:30 PM

                    By the way, I went back and edited the script to run a second time looking for another instance below.

                    I’m sure there are better ways to jump three lines, but it’s late and I needed to finish. It worked! haha

                    from Npp import editor1, editor2
                    
                    replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)
                    
                    def replace_with(m):
                        return replacements[m.group(1)]
                    
                    # search_for = '(?<=<UWI>).+?(?=</UWI>)'
                    for company in replacements.keys():
                        search_for = r'<UWI>({})</UWI>\R\h*.*\R\h*.*\R\h*.*\R\h*<PrimaryProduct>\K.+?(?=</PrimaryProduct>)'.format(company)
                        editor1.rereplace(search_for, replace_with)
                    
                    • E
                      Ekopalypse @Carlos J. Segnini R.
                      last edited by Aug 24, 2020, 10:39 AM

                      @Carlos-J-Segnini-R

                      I’m sure there are better ways to jump three lines,

                      In the end, what really counts is whether it does what it is supposed to do, right?
                      One of several alternatives would be, for example:

                      <UWI>(.*)</UWI>(?s)(\R.+?)<PrimaryProduct>\K.+?(?=</PrimaryProduct>)

                      but whether this is better or worse depends on the real data.
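
One way the data can matter, sketched with Python's `re` module rather than the Notepad++ engine: if a well record has no `<PrimaryProduct>` element, a lazy match that is allowed to cross line breaks simply keeps going into the next record. The sample data is invented, `\R` is approximated by `\n`, and the value is captured in a group because `re` has no `\K`.

```python
import re

# Hypothetical data: the first well has no <PrimaryProduct> element.
xml = (
    "<UWI>Well A</UWI>\n"
    "<WellType>Oil</WellType>\n"
    "<UWI>Well B</UWI>\n"
    "<WellType>Gas</WellType>\n"
    "<PrimaryProduct>Gas</PrimaryProduct>\n"
)

# (?s:...) lets '.' cross line breaks inside that group only.
pattern = re.compile(
    r'<UWI>(.*)</UWI>(?s:\n.+?)<PrimaryProduct>(.+?)(?=</PrimaryProduct>)'
)

# Each tuple is (well name, product value that would be replaced).
matches = [(m.group(1), m.group(2)) for m in pattern.finditer(xml)]
```

Here the only match pairs `Well A` with the product that belongs to `Well B`. The fixed "skip exactly three lines" pattern would not match at all in this case, so which variant is safer depends on how regular the records are.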

                      • E
                        Ekopalypse
                        last edited by Aug 24, 2020, 11:00 AM

                        @Ekopalypse said in another replacement request for help:

                        <UWI>(.*)</UWI>(?s)(\R.+?)<PrimaryProduct>\K.+?(?=</PrimaryProduct>)

                        To be used in the script, it would be:

                        r'<UWI>({})</UWI>(?s)(\R.+?)<PrimaryProduct>\K.+?(?=</PrimaryProduct>)'
                        
                        • C
                          Carlos J. Segnini R.
                          last edited by Aug 28, 2020, 8:44 PM

                          Thanks @Ekopalypse I still owe you a beer
                          I’m sure I’ll come back and ask questions again, so maybe more than a beer, haha

                          • G
                            guy038
                            last edited by guy038 Aug 30, 2020, 7:10 PM Aug 29, 2020, 11:31 AM

                            Hi, @ekopalypse and All,

                            Just back from some holidays in Brittany ! Good day to everybody ;;-))

                            Looking back at the first script of this discussion, it solves an interesting and general task:

                            • Given a first file, where some words / expressions must be replaced

                            • Given a second file containing all these words / expressions and their corresponding replacements, with a tab as the separator

                            This script executes, in the first file, all the replacements from the second file’s table at once ;-))


                            Now, considering the slightly modified script, where the range of chars searched is contained in tags <TEST>......</TEST>

                            from Npp import editor1, editor2
                            
                            replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)
                            
                            def replace_with(m):
                                return replacements[m.group()]
                            
                            editor1.rereplace('(?<=<TEST>).+?(?=</TEST>)', replace_with)
                            

                            May I ask you two questions:

                            • When the file to be modified ( new 1 ) contains, in a tag <TEST>......</TEST>, a value not listed in the new 2 file, the script stops, no replacement occurs and the console message KeyError: 'ABCDE' is displayed

                            Example, with new 1 :

                            <Wellcollection>
                            	<Well replace="false">
                            		<TEST>BEFORE</TEST>
                            		<WellType>Oil</WellType>
                            		<ReserveStatusCollection>
                            		<TEST>And BEFORE</TEST>
                            		<WellType>Oil</WellType>
                            		<ReserveStatusCollection>
                            		<TEST>ABCDE</TEST>
                            		<WellType>Oil</WellType>
                            		<ReserveStatusCollection>
                            

                            and new 2

                            BEFORE	After
                            And BEFORE	After
                            

                            Could it be possible to avoid errors for all key values not listed in new 2 ?


                            • Secondly, how can the case of the key values be ignored? I mean, instead of the obvious method, below, in new 2:
                            BEFORE	After
                            And BEFORE	After
                            and before	After
                            

                            I tried this version, without success :

                            from Npp import editor1, editor2
                            import re
                            
                            replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)
                            
                            def replace_with(m):
                                return replacements[m.group()]
                            
                            editor1.rereplace('(?<=<TEST>).+?(?=</TEST>)', replace_with, re.IGNORECASE)
                            

                            with new 1 :

                            <Wellcollection>
                            	<Well replace="false">
                            		<TEST>BEFORE</TEST>
                            		<WellType>Oil</WellType>
                            		<ReserveStatusCollection>
                            		<TEST>And BEFORE</TEST>
                            		<WellType>Oil</WellType>
                            		<ReserveStatusCollection>
                            		<TEST>and before</TEST>
                            		<WellType>Oil</WellType>
                            		<ReserveStatusCollection>
                            
                            

                            and new 2 :

                            BEFORE	After
                            And BEFORE	After
                            

                            I still had the error :

                            KeyError: 'and before'

                            TIA,

                            Best Regards,

                            guy038

                            Afterwards, I’ll do additional tests with your script and the regex S/R solution, below :

                            SEARCH (?is)(?<=<TEST>)(.+?)(?=</TEST>.+^---.+^\1\t(.+?)$)|^---.+

                            REPLACE ?1\2

                            This S/R executed against this text :

                            <Wellcollection>
                            	<Well replace="false">
                            		<TEST>BEFORE</TEST>
                            		<WellType>Oil</WellType>
                            		<ReserveStatusCollection>
                            		<TEST>And BEFORE</TEST>
                            		<WellType>Oil</WellType>
                            		<ReserveStatusCollection>
                            		<TEST>and before</TEST>
                            		<WellType>Oil</WellType>
                            		<ReserveStatusCollection>
                            		<TEST>And BEFORE</TEST>
                            		<WellType>Oil</WellType>
                            		<ReserveStatusCollection>
                            		<TEST>BEFORE</TEST>
                            		<WellType>Oil</WellType>
                            		<ReserveStatusCollection>
                            ---
                            BEFORE	After
                            And BEFORE	After 2
                            

                            gives this changed text :

                            <Wellcollection>
                            	<Well replace="false">
                            		<TEST>After</TEST>
                            		<WellType>Oil</WellType>
                            		<ReserveStatusCollection>
                            		<TEST>After 2</TEST>
                            		<WellType>Oil</WellType>
                            		<ReserveStatusCollection>
                            		<TEST>After 2</TEST>
                            		<WellType>Oil</WellType>
                            		<ReserveStatusCollection>
                            		<TEST>After 2</TEST>
                            		<WellType>Oil</WellType>
                            		<ReserveStatusCollection>
                            		<TEST>After</TEST>
                            		<WellType>Oil</WellType>
                            		<ReserveStatusCollection>
                            

                            I bet that your script can manage larger files than my regex solution!

                            • E
                              Ekopalypse @guy038
                              last edited by Aug 31, 2020, 10:58 AM

                              @guy038

                              Thank you very much for checking the script.
                              A solution concerning missing replacements is to replace with what was
                              found or, as I later suggested, to create unique search strings from the
                              list of replacements.
                              The first approach might work better for large files; the second seemed
                              more reasonable at this point.

                              To ignore case, let’s change the replacement keys to lower case
                              and check against lower-cased matches, but replace with the defined replacement values. So a replacement dictionary would look like this:

                                  replacements = {'täst': 'TEST'}
                              

                              and would replace either Täst or täst with TEST always.

                              Let me think about it again.
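
A minimal sketch of the lower-cased-keys idea, using Python's standard `re` module so it can be run on its own. The table and text are sample data. It also shows why `re.IGNORECASE` alone did not help: the flag changes how the pattern matches, not how the dictionary is looked up.

```python
import re

# Hypothetical table; keys are lower-cased once, values keep their case.
table = {'BEFORE': 'After', 'And BEFORE': 'After 2'}
replacements = {key.lower(): value for key, value in table.items()}

def replace_with(m):
    # Look up the lower-cased match; keep the original text if it is not listed.
    return replacements.get(m.group().lower(), m.group())

text = "<TEST>BEFORE</TEST><TEST>and before</TEST><TEST>ABCDE</TEST>"
result = re.sub(r'(?<=<TEST>).+?(?=</TEST>)', replace_with, text)
```

In a PythonScript version, the same `replace_with` callback could presumably be passed to `editor1.rereplace` unchanged, since it only relies on `m.group()`.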

                              • Alan KilbornA
                                Alan Kilborn @Ekopalypse
                                last edited by Aug 31, 2020, 2:19 PM

                                @Ekopalypse said in another replacement request for help:

                                A solution concerning missing replacements is to replace with what was
                                found or what I later suggested, creating unique search strings from the
                                list of replacements.

                                Not sure about the “later suggested” part, but this seems like a quick solution to the former:

                                def replace_with(m):
                                    try:
                                        r = replacements[m.group()]
                                    except KeyError:
                                        r = m.group()
                                    return r
                                

                                Disclaimer: I didn’t actually try it. :-)

                                • E
                                  Ekopalypse @Alan Kilborn
                                  last edited by Aug 31, 2020, 5:23 PM

                                  @Alan-Kilborn

                                  Yep, that is another solution as well.

                                  • G
                                    guy038
                                    last edited by guy038 Aug 31, 2020, 8:43 PM Aug 31, 2020, 8:39 PM

                                    Hello @alan-kilborn, @ekopalypse and All,

                                    I’ve just tried your solution, Alan, and it worked like a charm ;-)) Clever: when an entry is missing in the dictionary, the script simply rewrites the key value.

                                    So, if you realize that you forgot some entries in the new 2 file (and so some replacements!), the easy workaround is:

                                    • Add all these new entries as well as their corresponding replacement parts

                                    • Re-run the script


                                    Now regarding the case’s problem, we may run a regex S/R to bring an expression to a single case before running the script.

                                    For instance, instead of defining, in new 2, these 5 entries:

                                    BEFORE	After
                                    Before	After
                                    before	After
                                    BEFore	After
                                    befORE	After
                                    
                                    • Firstly, we would use the regex S/R, against our text in new 1 :

                                      • SEARCH (?i)Before

                                      • REPLACE \U$0

                                    • Secondly, we would re-run the Python script, with only 1 entry, in new 2 :

                                    BEFORE	After
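
The two steps above can be sketched in plain Python, with `re.sub` standing in for the Notepad++ S/R and for `editor1.rereplace`. The sample text is invented, and `\U$0` is approximated by a callback that upper-cases the match.

```python
import re

replacements = {'BEFORE': 'After'}  # single entry, as in the post
text = "<TEST>BEFore</TEST> <TEST>befORE</TEST>"

# Step 1: normalise every case variant of the word to upper case.
normalised = re.sub(r'(?i)before', lambda m: m.group().upper(), text)

# Step 2: the case-sensitive table lookup now finds every occurrence.
result = re.sub(
    r'(?<=<TEST>).+?(?=</TEST>)',
    lambda m: replacements.get(m.group(), m.group()),
    normalised,
)
```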
                                    

                                    Best Regards

                                    guy038

                                    • Alan KilbornA
                                      Alan Kilborn @guy038
                                      last edited by Aug 31, 2020, 9:09 PM

                                      @guy038 said in another replacement request for help:

                                      Now regarding the case’s problem,

                                      Yes, I didn’t see a quick mod to the original script, and I didn’t want to alter its “simple elegance” to hack in casing support. Of course, it’s possible to do, but the original was such a work of art… :-)

                                      • G
                                        guy038
                                        last edited by Sep 1, 2020, 12:29 PM

                                        Hi, @alan-kilborn, @ekopalypse and All,

                                        Like @ekopalypse, I must have been tired yesterday while posting!

                                        Regarding the case’s problem, there is no need to re-run the script: just run the following S/R, of course:

                                        • SEARCH (?i)Before

                                        • REPLACE After


                                        So, for all the remaining expressions which must be replaced, whatever their case, it is better to use the general regex:

                                        • SEARCH (?i)(Expression_1)|(Expression_2)|(Expression_3)....

                                        • REPLACE (?1Replacement_1)(?2Replacement_2)(?3Replacement_3)...
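
The same "one alternation, one replacement per group" idea can be generated from a list of pairs instead of being typed by hand. This is a sketch with Python's `re` module, not the Notepad++ conditional-replacement syntax; the pairs are invented, and `m.lastindex` plays the role of the `(?1...)(?2...)` conditions.

```python
import re

# Hypothetical expression/replacement pairs.
pairs = [('Before', 'After'), ('Old name', 'New name')]

# One capture group per expression; re.escape keeps metacharacters literal.
pattern = re.compile(
    '|'.join('({})'.format(re.escape(expr)) for expr, _ in pairs),
    re.IGNORECASE,
)

def replace_with(m):
    # m.lastindex is the 1-based number of the group that matched.
    return pairs[m.lastindex - 1][1]

result = pattern.sub(replace_with, "BEFORE, before and OLD NAME")
```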

                                        BR

                                        guy038

                                        • G
                                          guy038
                                          last edited by Sep 2, 2020, 5:14 PM

                                          Hello, @ekopalypse and All,

                                          After some tests, I can confirm that a regex like the one below:

                                          SEARCH (?is)(?<=<TEST>)(.+?)(?=</TEST>.+^---.+^\1\t(.+?)$)|^---.+

                                          REPLACE ?1\2

                                          does not work properly (it produces a single match covering the whole content!) as soon as the scanned file contains more than about 200 lines :-(


                                          Whereas your Python script, with @alan-kilborn’s modification, works nicely!

                                          Given the initial XML file, below, in new 1 tab :

                                          <Wellcollection>
                                          	<Well replace="false">
                                          		<TEST>BEFORE</TEST>
                                          		<WellType>Oil</WellType>
                                          		<ReserveStatusCollection/>
                                          		<TEST>Guy</TEST>
                                          		<WellType>Oil</WellType>
                                          		<ReserveStatusCollection/>
                                          		<TEST>00000</TEST>
                                          		<WellType>Oil</WellType>
                                          		<ReserveStatusCollection/>
                                          		<TEST>00001</TEST>
                                          		<WellType>Oil</WellType>
                                          		<ReserveStatusCollection/>
                                          		<TEST>00002</TEST>
                                          		<WellType>Oil</WellType>
                                          		<ReserveStatusCollection/>
                                          ....
                                          ....
                                          		<TEST>99998</TEST>
                                          		<WellType>Oil</WellType>
                                          		<ReserveStatusCollection/>
                                          		<TEST>99999</TEST>
                                          		<WellType>Oil</WellType>
                                          		<ReserveStatusCollection/>
                                          		<TEST>BEFORE</TEST>
                                          		<WellType>Oil</WellType>
                                          		<ReserveStatusCollection/>
                                          		<TEST>Guy</TEST>
                                          		<WellType>Oil</WellType>
                                          		<ReserveStatusCollection/>
                                          	</Well>
                                          </Wellcollection>
                                          

                                          and the new_2 contents, in secondary view :

                                          BEFORE	After
                                          Guy	Guy038
                                          

                                          and the python script :

                                          from Npp import editor1, editor2
                                          
                                          replacements = dict(line.split('\t') for line in editor2.getText().splitlines() if line)
                                          
                                          def replace_with(m):
                                              try:
                                                  r = replacements[m.group()]
                                              except KeyError:
                                                  r = m.group()
                                              return r
                                          
                                          editor1.rereplace('(?<=<TEST>).+?(?=</TEST>)', replace_with)
                                          

                                          => On my old Win XP laptop, it took 26s to execute:

                                          • 100,000 identical replacements ( 00000 to 99999, which are not in the dictionary )

                                          • 2 replacements from BEFORE to After

                                          • 2 replacements from Guy to Guy038

                                          So, a script is definitely the right way to solve this general task!

                                          Best regards

                                          guy038

                                          However, in my example, it would be valuable to find a method that avoids all these identical replacements ;-))
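
One possible way to avoid the identical replacements is to build the search from the dictionary keys, so that unlisted values are never matched in the first place. The sketch below uses Python's `re` module on invented sample text. Whether the `re.escape` output is accepted unchanged by the Notepad++ regex engine, and whether this is actually faster there, are assumptions that would need testing.

```python
import re

replacements = {'BEFORE': 'After', 'Guy': 'Guy038'}

# Alternation of the (escaped) keys only: values such as 00000 never match.
keys = '|'.join(re.escape(key) for key in replacements)
pattern = re.compile(r'(?<=<TEST>)(?:{})(?=</TEST>)'.format(keys))

text = "<TEST>BEFORE</TEST>\n<TEST>00000</TEST>\n<TEST>Guy</TEST>\n"

# subn also returns how many replacements were made.
result, count = pattern.subn(lambda m: replacements[m.group()], text)
```

Because every match is by construction a key of the dictionary, the callback no longer needs the `try`/`except` fallback.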
