    Massive list and massive search and replace?

    23 Posts 7 Posters 10.6k Views
    • Alan Kilborn

      Ok, so I found some time to finish the script.

      Here’s how it works:

      You start with a file open and active in Notepad++ that contains your desired replacements. This file should be saved into the same folder in which you want to do the replacements. The file can have any name and needs to have the following format for your replacements list:

      blue->orange
      replace this->with this
      the delimiter is->a minus followed by a greater than
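
To sanity-check a list file before running the script, here is a minimal sketch of how such lines split into pairs. It runs in plain Python outside Notepad++; `parse_pairs` is a hypothetical helper (not part of the script) and only approximates the script's own regex-based parsing.

```python
import re

def parse_pairs(text, delimiter='->'):
    """Return (find, replace) tuples for each line shaped like find<delimiter>replace."""
    # Lazy first group: the FIRST delimiter on a line separates the two halves.
    line_re = re.compile('(.+?)' + re.escape(delimiter) + '(.+)')
    pairs = []
    for line in text.splitlines():
        m = line_re.match(line)
        if m:  # lines without a delimiter are silently skipped
            pairs.append((m.group(1), m.group(2)))
    return pairs

sample = 'blue->orange\nreplace this->with this\nno delimiter on this line\n'
print(parse_pairs(sample))
```

Because the split happens at the first delimiter, a find text that itself contains `->` cannot be expressed in this format without choosing a different delimiter.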
      

      At that point, you run the script. It will prompt you through a series of questions about the operation, giving you a chance to validate that you are doing what you intend at several steps:

      [Four screenshots: the script's prompt dialogs, Q1 of 4 through Q4 of 4]

      Somewhat obviously, after you give the final “Yes” the real work will actually be done and the indicated replacements made.
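
In outline, that final step counts the matches for each pair and then replaces them, pair by pair in list order. A minimal sketch of the idea in plain Python, operating on a string instead of a Notepad++ buffer (`apply_pairs` is a hypothetical name, and plain case-sensitive matching is an assumption):

```python
def apply_pairs(text, pairs):
    """Apply (find, replace) pairs in order; return the new text and a match count."""
    total = 0
    for find_what, replace_with in pairs:
        count = text.count(find_what)  # count first, then replace
        if count:
            text = text.replace(find_what, replace_with)
            total += count
    return text, total

new_text, n = apply_pairs('blue sky, blue sea', [('blue', 'orange')])
print(new_text, n)
```

Since the pairs are applied one after another, the order of the list matters: text produced by an earlier pair can be matched again by a later one.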

      I call the script ReplaceInFilesFromListInActiveTab.py and here is its listing:

      # -*- coding: utf-8 -*-
      from __future__ import print_function
      
      # references:
      #  https://community.notepad-plus-plus.org/topic/23638/massive-list-and-massive-search-and-replace
      #  also possibly https://community.notepad-plus-plus.org/topic/22601
      #  also possibly https://community.notepad-plus-plus.org/topic/22721
      #  also possibly https://community.notepad-plus-plus.org/topic/23495
      
      from Npp import *
      import inspect
      import os
      import re
      import glob
      
      #-------------------------------------------------------------------------------
      
      class RIFFLIAT(object):
      
          def __init__(self):
      
              self.debug = True if 0 else False
              if self.debug:
                  console.show()
                  console.clear()
      
              self.this_script_name = inspect.getframeinfo(inspect.currentframe()).filename.split(os.sep)[-1].rsplit('.', 1)[0]
      
              # the active tab has the list of the substitution pairs
              substitutions_list_file_path = notepad.getCurrentFilename()
              if not os.path.isfile(substitutions_list_file_path):
                  self.mb('Substitution list file must be a hard-named file in the file system, i.e.,  not e.g. "new 2"')
                  return
              self.print('substitutions_list_file_path:', substitutions_list_file_path)
      
              find_and_repl_match_list = []
              delimiter = '->'
              editor.research(r'(?-s)^(.+?)' + delimiter + r'(.+)', lambda m: find_and_repl_match_list.append((m.group(1), m.group(2))))
              if len(find_and_repl_match_list) == 0:
                  self.mb('\r\n'.join([
                      'The substitution list in the active file has no findwhat/replacewith pairs\r\n',
                      'Format of file is, 1 pair per line, using  {d}  as a delimiter, no extra spaces:\r\n'.format(d=delimiter),
                      'find1{d}replace1'.format(d=delimiter),
                      'find2{d}replace2'.format(d=delimiter),
                      '...{d}...'.format(d=delimiter),
                      ]))
                  return
      
              sample_repl_pairs_summary_list = []
              three_or_less_sub_pairs = min(3, len(find_and_repl_match_list))
              num_sub_pairs_above_3 = len(find_and_repl_match_list) - three_or_less_sub_pairs
              max_chars_show = 20
              for (find_what, replace_with) in find_and_repl_match_list[ 0 : three_or_less_sub_pairs ]:
                  if len(find_what) > max_chars_show: find_what = find_what[ 0 : max_chars_show] + '...'
                  if len(replace_with) > max_chars_show: replace_with = replace_with[ 0 : max_chars_show] + '...'
                  sample_repl_pairs_summary_list.append('"{fw}" with "{rw}"'.format(fw=find_what, rw=replace_with))
              if num_sub_pairs_above_3 > 0: sample_repl_pairs_summary_list.append('(and {} more)'.format(num_sub_pairs_above_3))
      
              search_folder_top_level_path = substitutions_list_file_path.rsplit(os.sep, 1)[0] + os.sep
              self.print('search_folder_top_level_path:', search_folder_top_level_path)
      
              if not self.yes_no('\r\n\r\n'.join([
                      'Q1 of 4:\r\n',
                      'Perform these replacements (specified in the active file content):',
                      '\r\n'.join(sample_repl_pairs_summary_list) + '\r\n',
                      'in the files in this folder?',
                      search_folder_top_level_path,
                      '-' * 60,
                      'IT IS STRONGLY SUGGESTED TO MAKE A BACKUP',
                      'OF ALL SOURCE FILES BEFORE RUNNING THIS!',
                      ])):
                  return
      
              process_subfolders = self.yes_no_cancel('\r\n\r\n'.join([
                  'Q2 of 4:\r\n',
                  'Do replacements in files in SUBFOLDERS of this folder also?',
                  search_folder_top_level_path,
                  ]))
              if process_subfolders is None: return  # user cancel
              self.print('process_subfolders:', process_subfolders)
      
              default_filespec = '*.txt'
              filter_input = self.prompt(
                  'Q3 of 4:\r\n' + \
                  'Supply filespec filter list         (example:    *.html *.txt *.log    )',
                  default_filespec)
              if filter_input is None: return  # user cancel
              filters_list = filter_input.split(' ')
              filters_list = [ f for f in filters_list if len(f) > 0 ]  # remove any empty entries in filters_list
              self.print('filters_list:', filters_list)
      
              pathnames_of_files_to_replace_in_list = []
              for (root, dirs, files) in os.walk(search_folder_top_level_path):
                  for filt in filters_list:
                      for p in glob.glob(os.path.join(root, filt)):
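                          # skip the list file itself, and don't add a file twice when it matches more than one filter
                          if p != substitutions_list_file_path and p not in pathnames_of_files_to_replace_in_list: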
                              pathnames_of_files_to_replace_in_list.append(p)
                  if not process_subfolders: break
              if len(pathnames_of_files_to_replace_in_list) == 0:
                  self.mb('No files matched specified filter(s)')
                  return
      
              num_files_to_examine = len(pathnames_of_files_to_replace_in_list)
      
              if not self.yes_no('\r\n\r\n'.join([
                      'Q4 of 4:\r\n',
                      '---- FINAL CONFIRM ----\r\n',
                      'Make replacements in {nfe} candidate files in this folder{b} ?'.format(
                          nfe=num_files_to_examine,
                          b=' AND below' if process_subfolders else '\r\n(but not its subfolders)'),
                      search_folder_top_level_path,
                      ])):
                  return
      
              pathname_currently_open_in_a_tab_list = []
              for (pathname, buffer_id, index, view) in notepad.getFiles():
                  if pathname not in pathname_currently_open_in_a_tab_list:
                      pathname_currently_open_in_a_tab_list.append(pathname)
      
              num_repl_made_in_all_files = 0
              pathnames_with_content_changed_by_repl_list = []
      
              for pathname in pathnames_of_files_to_replace_in_list:
      
                  if pathname in pathname_currently_open_in_a_tab_list:
                      self.print('switching active tab to', pathname)
                      notepad.activateFile(pathname)
                      editor.beginUndoAction()
                  else:
                      self.print('opening', pathname)
                      notepad.open(pathname)
                  if notepad.getCurrentFilename() != pathname: continue  # shouldn't happen
      
                  for (find_what, replace_with) in find_and_repl_match_list:
      
                      # since the editor.replace() function won't tell us how many replacements it made,
                      #  count them by searching for the matches BEFORE doing the replacement
                      self.num_repl_made_in_this_file = 0
                      def match_found(m): self.num_repl_made_in_this_file += 1
                      editor.search(find_what, match_found)
      
                      if self.num_repl_made_in_this_file > 0:
      
                          self.print('replacing "{fw}" with "{rw}" {n} times'.format(
                              fw=find_what, rw=replace_with, n=self.num_repl_made_in_this_file))
      
                          num_repl_made_in_all_files += self.num_repl_made_in_this_file
      
                          if pathname not in pathnames_with_content_changed_by_repl_list:
                              pathnames_with_content_changed_by_repl_list.append(pathname)
      
                          # FINALLY, the actual replacement!
                          editor.replace(find_what, replace_with)
      
                  if pathname in pathname_currently_open_in_a_tab_list:
                      editor.endUndoAction()
                  else:
                      if editor.getModify():
                          self.print('saving', pathname)
                          notepad.save()
                      self.print('closing', pathname)
                      notepad.close()
      
              # restore tab that was active before we started
              notepad.activateFile(substitutions_list_file_path)
      
              self.mb('\r\n\r\n'.join([
                  '---- DONE! ----',
                  '{nr} total replacements made in {nrf} files'.format(nr=num_repl_made_in_all_files,
                      nrf=len(pathnames_with_content_changed_by_repl_list)),
                  '(of {nfe} files matching filters provided)'.format(nfe=num_files_to_examine),
                  ]))
      
          def print(self, *args):
              if self.debug:
                  print('RIFFLIAT:', *args)
      
          def mb(self, msg, flags=0, title=''):  # a message-box function
              return notepad.messageBox(msg, title if title else self.this_script_name, flags)
      
          def yes_no(self, question_text):
              answer = self.mb(question_text, MESSAGEBOXFLAGS.YESNO, self.this_script_name)
              return answer == MESSAGEBOXFLAGS.RESULTYES
      
          def yes_no_cancel(self, question_text):
              retval = None
              answer = self.mb(question_text, MESSAGEBOXFLAGS.YESNOCANCEL, self.this_script_name)
              if answer == MESSAGEBOXFLAGS.RESULTYES: retval = True
              elif answer == MESSAGEBOXFLAGS.RESULTNO: retval = False
              return retval
      
          def prompt(self, prompt_text, default_text=''):
              if '\n' not in prompt_text: prompt_text = '\r\n' + prompt_text
              prompt_text += ':'
              return notepad.prompt(prompt_text, self.this_script_name, default_text)
      
      #-------------------------------------------------------------------------------
      
      if __name__ == '__main__': RIFFLIAT()
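
The file-gathering step (walk the folder, apply the filespec filters, skip the list file, optionally stop at the top level) can be tried on its own outside Notepad++. This is a minimal sketch, not the script's code: `collect_files` is a hypothetical helper, and it uses `fnmatch` on file names where the listing uses `glob`.

```python
import fnmatch
import os

def collect_files(top, filters, recurse, skip_path=None):
    """Return paths under `top` whose file names match any of the filters."""
    found = []
    for root, dirs, files in os.walk(top):
        for name in sorted(files):
            path = os.path.join(root, name)
            if path == skip_path:
                continue  # never process the substitution list file itself
            if any(fnmatch.fnmatch(name, f) for f in filters):
                found.append(path)  # added once, however many filters match
        if not recurse:
            break  # os.walk yields the top-level folder first
    return found

# Hypothetical usage:
# collect_files(r'C:\data', ['*.txt', '*.log'], recurse=True)
```

Testing each file name against all filters lists a file only once, whereas a per-filter glob can list the same file once for every filter it matches.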
      

      For basic information about setting up scripting, see the REFERENCE I provided in an earlier post in this thread.

      (And BTW, thanks to @PeterJones for some prerelease testing on this!)

      • Calvin Foo @Alan Kilborn

        @Alan-Kilborn Thanks, but wow, this seems a bit too complicated for me.
        Is it possible to simplify it to just read an Excel sheet, find A2, and replace it with B2 in every text file that is open in NPP?

        I could then just add new words to the Excel file and run the script.

        • PeterJones @Calvin Foo

          @Calvin-Foo ,

          Most of the complication is just setting up Python Script plugin and installing the script once.

          After that, you just have to run the script whenever the active tab holds a file like this:

          blue->orange
          replace this->with this
          the delimiter is->a minus followed by a greater than
          

          It’s really not hard to create that substitution file.

          Having the replacement data in Excel would make it harder for Alan to write a script for you (and this forum is not a code-writing service). And because the script would still be written in PythonScript, you would still have to install PythonScript and install that script once. Running it would be just as easy whether the search->replace pairs come from a text file or from an external Excel spreadsheet that the script has to parse (easier, actually, with the text file, because then you don’t have to also run Excel just to prepare for a search-and-replace in Notepad++). He would still need all those confirmation dialogs whether the map is in Excel or in Notepad++… and he’d also need another dialog asking you where the Excel spreadsheet is.

          He wrote this not just for you, but also for all the other people who ask for nearly the same functionality (we’ve seen similar requests a lot over the years, and he finally decided that we needed one generic script to handle them all, so that we’d stop having to write customized scripts for each user). If using this generic script is too complicated for you, you will not like any implementation that anyone here is able to give you.

          Good luck.

          • Terry R @Calvin Foo

            @Calvin-Foo said in Massive list and massive search and replace?:

            Is it possible to simplify it to just read an Excel sheet, find A2, and replace it with B2 in every text file that is open in NPP?

            If you are referring to an Excel file with extension XLSX then no, Notepad++ does NOT read files which are binary in nature very well. It is after all a TEXT editor, not a Binary editor.

            However since you refer to an “Excel page” and refer to words in 2 columns, that could also be a CSV (comma separated value) file. And whilst @Alan-Kilborn has used a TSV (tab separated value) file, the 2 are very similar. He possibly could alter his code to use the comma instead of a tab, but possibly used the tab to prevent possible confusion within words.

            But doing that minor change to his code isn’t going to simplify the process anyways. Just be thankful he has gone to such lengths to help you. Sometimes doing processes such as you outlined can be made easier, but will still require time to setup.

            Terry

            PS I see @PeterJones has stated the same.
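
For the CSV route mentioned in this post, one option is to convert a two-column CSV export into the script's `find->replace` list format up front, so the script itself stays untouched. A minimal sketch in standalone Python 3 (an assumption: this is meant to run outside Notepad++, and `csv_to_list_text` is a hypothetical helper):

```python
import csv
import io

def csv_to_list_text(csv_text, delimiter='->'):
    """Turn two-column CSV rows into find<delimiter>replace lines."""
    lines = []
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip blank or incomplete rows instead of guessing at their meaning.
        if len(row) < 2 or not row[0] or not row[1]:
            continue
        lines.append(row[0] + delimiter + row[1])
    return '\n'.join(lines)

print(csv_to_list_text('james,James\ncalvin,Calvin\nnew york,New York\n'))
```

Using the `csv` module rather than splitting on commas keeps quoted cells that contain commas intact. Cells that contain the delimiter itself would still need a different delimiter.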

            • Alan Kilborn @Terry R

              @Terry-R said in Massive list and massive search and replace?:

              And whilst @Alan-Kilborn has used a TSV (tab separated value) file

              Actually it isn’t a “tab”, although I can see why you’d think that. I chose - followed by > as the delimiter. The delimiter is specifically variable-ized in the script, so one could easily change it to whatever is desired.

              • Alan Kilborn @Calvin Foo

                @Calvin-Foo said in Massive list and massive search and replace?:

                Is it possible to simplify it to just read an Excel sheet, find A2, and replace it with B2 in every text file that is open in NPP?

                I suppose it IS possible, but not by me.
                The intent of the script was to solve kind of a general case problem, in a general way.
                Of course anyone can treat it as a demo, and feel free to modify it at will.

                • Calvin Foo

                  I guess I need to learn from the ground up. I only have experience writing ASP 3.0 and SQL Server.

                  I guess I’ll just start from there.

                  Could anyone give me a simple guide on how to write a simple replace-text script? I’ll study further from there and include a list of text (maybe imported from CSV).

                  • Neil Schipper @Calvin Foo

                    @Calvin-Foo

                    There’s very, very, very little to learn. First, follow the instructions in Alan’s second post to you (“REFERENCE” link) LIKE A MONKEY.

                    Here’s a first script you can run:

                    #! python
                    import sys
                    print "Old style print syntax"
                    sys.stdout.write("Calvin-Foo's first script -- hello from Python %s\n" % (sys.version,))
                    

                    You don’t need to know what any of the lines mean or what they do. (Once you get it running, you can hack at it for fun).

                    There’s a lot of complexity in airplanes and elevators and keyboards and phones that you don’t see and don’t need to deal with. It’s really quite similar.

                    • Alan Kilborn @Calvin Foo

                      @Calvin-Foo said in Massive list and massive search and replace?:

                      I guess I need to learn from the ground up.
                      I guess I’ll just start from there.
                      I’ll study further from there and include a list of text (maybe imported from CSV).

                      In case it isn’t obvious, you could take YOUR data, in whatever (textual) format, and use Notepad++ to change it with a replacement operation into MY demo format, and then just run with the demo solution.

                      This avoids programming and could (probably) be made into an N++ macro for easy repetitive running.

                      As an example, take your original problem statement data (I know it isn’t your real data, but we have none of that here, so…):

                      1. james - James
                      2. calvin - Calvin
                      3. new york - New York
                      

                      You could change that into the needed input format for the script by this operation:

                      Find: (?-s)^\d+\. (.+?) - (.+)
                      Replace: ${1}->${2}
                      Search mode: Regular expression
                      Wrap around: Checked
                      Action: Replace All button

                      After the replace-all, your data would then look like this:

                      james->James
                      calvin->Calvin
                      new york->New York
                      

                      which would be a direct feed-in to the demo script.
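
The same transform can be checked outside Notepad++ with Python's standard `re` module. This is only a sketch: Python's regex flavor differs from Notepad++'s, so `(?-s)` is dropped (a dot does not match a newline in Python by default), `(?m)` is added so `^` and `$` work per line, and the replacement uses `\1` instead of `${1}`.

```python
import re

numbered = '1. james - James\n2. calvin - Calvin\n3. new york - New York'

# Strip the leading "N. " and turn " - " into the "->" delimiter, line by line.
converted = re.sub(r'(?m)^\d+\. (.+?) - (.+)$', r'\1->\2', numbered)
print(converted)
```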


                      this seems a bit too complicated for me

                      Of course, this data transform may involve learning some regular expressions (don’t know your expertise), but I don’t think it is too much to ask people that request a moderately-complex solution to a problem to do some sort of learning of their own along the way.


                      How to write a simple replace text script?

                      Well about the simplest one I can think of would be a one-liner:

                      editor.replace('apple', 'Apple')
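
A small step up from the one-liner is to loop over a handful of pairs. The sketch below assumes only that the target object has a `replace(find, replace)` method, as PythonScript's `editor` does; `replace_pairs` is a hypothetical helper name.

```python
def replace_pairs(target, pairs):
    """Call target.replace(find, replace) once per pair, in list order."""
    for find_what, replace_with in pairs:
        target.replace(find_what, replace_with)

# Inside PythonScript (assumption: `editor` has been imported from Npp):
# replace_pairs(editor, [('apple', 'Apple'), ('james', 'James')])
```

Passing the target in as a parameter keeps the loop testable with a stand-in object outside Notepad++.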

                      • nerdyone255

                        this script is FANTASTIC.

                        i do have a question though- in the final output it prints how many replacements were made, but is there any way to see what the actual replacements were?

                        looking into the code i see

                        if self.num_repl_made_in_this_file > 0:

                            self.print('replacing "{fw}" with "{rw}" {n} times'.format(
                                fw=find_what, rw=replace_with, n=self.num_repl_made_in_this_file))
                        

                        but i dont see where that would get printed- it doesnt show up in the python console either

                        again huge thanks for this one

                        • Alan Kilborn @nerdyone255

                          @nerdyone255 said :

                          this script is FANTASTIC.

                          Well…glad you like it.

                          but i dont see where that would get printed- it doesnt show up in the python console either

                          The self.print() function calls are really meant as debug helpers while testing the script. Thus, in the version of the script above, they don’t do anything because the debug variable is set to False. If you change the 0 to a 1 in this line:

                          self.debug = True if 0 else False

                          or simply change it to:

                          self.debug = True

                          then the output of the self.print() calls will go to the PythonScript console window. You’ll see the output you indicated you were interested in, plus output from other things that happen while the script is running.

                          • nerdyone255 @Alan Kilborn

                            @Alan-Kilborn perfect!

                            • Yurble Vương @Alan Kilborn

                              @Alan-Kilborn

                              Thanks Alan, it works perfectly, except for one special case of mine. I’d appreciate your help if possible:

                              I want to match only at word boundaries.


                              For example:
                              Sentence: You are eating apple. The tree have a lot of apples. all the apples is green.
                              apple->cherry
                              apples->cherries


                              So how can I add code to make it change a word only where it starts and ends at a transition between a non-word character (space, comma, period, quotation mark, question mark…) and a word character?

                              thanks in advance
                              Yurble

                              • Alan Kilborn @Yurble Vương

                                @Yurble-Vương said in Massive list and massive search and replace?:

                                how can I add code to make it change a word only where it starts and ends at a transition between a non-word character and a word character

                                You can use \b in the regular expression to insist upon a word boundary; example: \bapple will match have an apple today but will not match have a crabapple today.
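
To see the effect of `\b` on the apple/apples example, here is a minimal sketch using Python's standard `re` module as a stand-in for Notepad++'s regex engine (an assumption; the two engines agree on `\b` for plain ASCII words like these). `whole_word_replace` is a hypothetical helper.

```python
import re

def whole_word_replace(text, find_what, replace_with):
    """Replace find_what only where it is bounded by word boundaries on both sides."""
    pattern = r'\b' + re.escape(find_what) + r'\b'
    # A function replacement keeps any backslashes in replace_with literal.
    return re.sub(pattern, lambda m: replace_with, text)

sentence = 'You are eating apple. The tree have a lot of apples. all the apples is green.'
step1 = whole_word_replace(sentence, 'apple', 'cherry')
step2 = whole_word_replace(step1, 'apples', 'cherries')
print(step2)
```

Without the boundaries, the first pair would also hit the `apple` inside `apples` and leave `cherrys` behind, so the second pair would never match.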

                                  • Yurble Vương @Alan Kilborn

                                    @Alan-Kilborn

                                    Sorry to bother you with another question. I tried to edit your .py code as below, but it seems to be wrong. Could you advise how to correct it?

                                    Not work:

                                        # FINALLY, the actual replacement!
                                        editor.rereplace('\b'+find_what+'\b', replace_with)
                                    

                                    Not work 2:

                                        # FINALLY, the actual replacement!
                                        editor.rereplace(r'\b'+find_what+'\b', replace_with)
                                    
                                    • Alan Kilborn @Yurble Vương

                                      @Yurble-Vương said :

                                      Not work 2

                                      This should work: editor.rereplace(r'\b'+find_what+r'\b', replace_with)

                                      I like how you showed initiative in trying to solve the problem yourself…and you had the right idea, you just didn’t take it far enough.
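
The difference between the attempts comes down to Python string literals: without the `r` prefix, `'\b'` is a single backspace character (0x08), not the two-character regex token `\b`. A small sketch in plain Python, again using the standard `re` module as a stand-in for Notepad++'s regex engine (an assumption):

```python
import re

find_what = 'apple'
not_raw = '\b' + find_what + '\b'      # backspace + apple + backspace
half_raw = r'\b' + find_what + '\b'    # word boundary + apple + backspace
fully_raw = r'\b' + find_what + r'\b'  # word boundary on both sides

text = 'an apple, two apples'
matches_not_raw = re.findall(not_raw, text)
matches_half_raw = re.findall(half_raw, text)
matches_fully_raw = re.findall(fully_raw, text)
print(len('\b'), len(r'\b'), matches_not_raw, matches_half_raw, matches_fully_raw)
```

Two side effects of this change are worth keeping in mind. Once `find_what` goes to `rereplace`, it is interpreted as a regular expression, so list entries containing characters such as `.` or `(` would presumably need escaping. And the script's match count still comes from the plain-text `editor.search` call, so the reported total may include matches that the word-boundary version skips.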

                                      • Yurble Vương @Alan Kilborn

                                        @Alan-Kilborn

                                        Many thanks for the help.
