Massive list and massive search and replace?

  • C
    Calvin Foo
    last edited by Oct 20, 2022, 6:21 PM

    I have multiple text files (over 100), and I need to make a lot of text replacements in them.

    Example:

    1. james - James
    2. calvin - Calvin
    3. new york - New York

    Is it possible to load a list of the text that needs replacing (maybe from Excel) and then search and replace from it?

    • A
      Alan Kilborn @Calvin Foo
      last edited by Oct 20, 2022, 6:59 PM

      @Calvin-Foo

      There really isn’t a native-to-Notepad++ way of doing this.
      If you’re willing to use a scripting plugin, though, we can “get 'er done” – how about it? Are you willing to go to the “complication” of setting up scripting?

      • C
        Calvin Foo @Alan Kilborn
        last edited by Oct 20, 2022, 7:16 PM

        @Alan-Kilborn I don’t mind a little scripting, but I didn’t know Notepad++ could do scripting.

        • A
          Alan Kilborn @Calvin Foo
          last edited by Oct 20, 2022, 7:20 PM

          @Calvin-Foo said in Massive list and massive search and replace?:

          I didn’t know Notepad++ could do scripting

          Yes, here’s a good starting point REFERENCE.

          If you can give me a little time, I’ll put together a demo in my upcoming spare time. We have some scripts here on this site for “replacing from a list” but I don’t think we have anything that operates over a folder tree of files. I could pull together something that does both.

          • C
            Calvin Foo @Alan Kilborn
            last edited by Oct 20, 2022, 7:30 PM

            @Alan-Kilborn It would be great if you could help me with this. Thank you very much.

            • A
              Alan Kilborn
              last edited by Alan Kilborn Oct 24, 2022, 2:13 PM Oct 24, 2022, 2:00 PM

              Ok, so I found some time to finish the script.

              Here’s how it works:

              You start with a file open and active in Notepad++ that contains your desired replacements. This file should be saved into the same folder in which you want to do the replacements. The file can have any name and needs to have the following format for your replacements list:

              blue->orange
              replace this->with this
              the delimiter is->a minus followed by a greater than
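As a rough illustration of that list format, here is a minimal plain-Python sketch that splits each line at the first `->`. The helper name `parse_pairs` is hypothetical and is not part of the script, which does its own parsing with a regular expression inside Notepad++; this version only approximates that behavior.

```python
def parse_pairs(text, delimiter='->'):
    """Return (find_what, replace_with) tuples, one per well-formed line."""
    pairs = []
    for line in text.splitlines():
        # partition() splits at the FIRST delimiter found on the line
        find_what, sep, replace_with = line.partition(delimiter)
        if sep and find_what and replace_with:
            pairs.append((find_what, replace_with))
    return pairs

sample = 'blue->orange\nreplace this->with this\na line without the delimiter\n'
print(parse_pairs(sample))
```

Lines that lack the delimiter, or that have nothing on one side of it, are skipped rather than reported.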
              

              At that point, you run the script. It will prompt you through a series of questions about the operation, giving you a chance to validate that you are doing what you intend at several steps:

              [Four screenshots of the script’s prompt dialogs were attached here.]

              Somewhat obviously, after you give the final “Yes” the real work will actually be done and the indicated replacements made.

              I call the script ReplaceInFilesFromListInActiveTab.py and here is its listing:

              # -*- coding: utf-8 -*-
              from __future__ import print_function
              
              # references:
              #  https://community.notepad-plus-plus.org/topic/23638/massive-list-and-massive-search-and-replace
              #  also possibly https://community.notepad-plus-plus.org/topic/22601
              #  also possibly https://community.notepad-plus-plus.org/topic/22721
              #  also possibly https://community.notepad-plus-plus.org/topic/23495
              
              from Npp import *
              import inspect
              import os
              import re
              import glob
              
              #-------------------------------------------------------------------------------
              
              class RIFFLIAT(object):
              
                  def __init__(self):
              
                      self.debug = True if 0 else False
                      if self.debug:
                          console.show()
                          console.clear()
              
                      self.this_script_name = inspect.getframeinfo(inspect.currentframe()).filename.split(os.sep)[-1].rsplit('.', 1)[0]
              
                      # the active tab has the list of the substitution pairs
                      substitutions_list_file_path = notepad.getCurrentFilename()
                      if not os.path.isfile(substitutions_list_file_path):
                          self.mb('Substitution list file must be a saved file in the file system, not an unsaved tab such as "new 2"')
                          return
                      self.print('substitutions_list_file_path:', substitutions_list_file_path)
              
                      find_and_repl_match_list = []
                      delimiter = '->'
                      editor.research(r'(?-s)^(.+?)' + delimiter + r'(.+)', lambda m: find_and_repl_match_list.append((m.group(1), m.group(2))))
                      if len(find_and_repl_match_list) == 0:
                          self.mb('\r\n'.join([
                              'The substitution list in the active file has no findwhat/replacewith pairs\r\n',
                              'Format of file is, 1 pair per line, using  {d}  as a delimiter, no extra spaces:\r\n'.format(d=delimiter),
                              'find1{d}replace1'.format(d=delimiter),
                              'find2{d}replace2'.format(d=delimiter),
                              '...{d}...'.format(d=delimiter),
                              ]))
                          return
              
                      sample_repl_pairs_summary_list = []
                      three_or_less_sub_pairs = min(3, len(find_and_repl_match_list))
                      num_sub_pairs_above_3 = len(find_and_repl_match_list) - three_or_less_sub_pairs
                      max_chars_show = 20
                      for (find_what, replace_with) in find_and_repl_match_list[ 0 : three_or_less_sub_pairs ]:
                          if len(find_what) > max_chars_show: find_what = find_what[ 0 : max_chars_show] + '...'
                          if len(replace_with) > max_chars_show: replace_with = replace_with[ 0 : max_chars_show] + '...'
                          sample_repl_pairs_summary_list.append('"{fw}" with "{rw}"'.format(fw=find_what, rw=replace_with))
                      if num_sub_pairs_above_3 > 0: sample_repl_pairs_summary_list.append('(and {} more)'.format(num_sub_pairs_above_3))
              
                      search_folder_top_level_path = substitutions_list_file_path.rsplit(os.sep, 1)[0] + os.sep
                      self.print('search_folder_top_level_path:', search_folder_top_level_path)
              
                      if not self.yes_no('\r\n\r\n'.join([
                              'Q1 of 4:\r\n',
                              'Perform these replacements (specified in the active file content):',
                              '\r\n'.join(sample_repl_pairs_summary_list) + '\r\n',
                              'in the files in this folder?',
                              search_folder_top_level_path,
                              '-' * 60,
                              'IT IS STRONGLY SUGGESTED TO MAKE A BACKUP',
                              'OF ALL SOURCE FILES BEFORE RUNNING THIS!',
                              ])):
                          return
              
                      process_subfolders = self.yes_no_cancel('\r\n\r\n'.join([
                          'Q2 of 4:\r\n',
                          'Do replacements in files in SUBFOLDERS of this folder also?',
                          search_folder_top_level_path,
                          ]))
                      if process_subfolders is None: return  # user cancel
                      self.print('process_subfolders:', process_subfolders)
              
                      default_filespec = '*.txt'
                      filter_input = self.prompt(
                          'Q3 of 4:\r\n' + \
                          'Supply filespec filter list         (example:    *.html *.txt *.log    )',
                          default_filespec)
                      if filter_input is None: return  # user cancel
                      filters_list = filter_input.split(' ')
                      filters_list = [ f for f in filters_list if len(f) > 0 ]  # remove any empty entries in filters_list
                      self.print('filters_list:', filters_list)
              
                      pathnames_of_files_to_replace_in_list = []
                      for (root, dirs, files) in os.walk(search_folder_top_level_path):
                          for filt in filters_list:
                              for p in glob.glob(os.path.join(root, filt)):
                                  if p != substitutions_list_file_path:
                                      pathnames_of_files_to_replace_in_list.append(p)
                          if not process_subfolders: break
                      if len(pathnames_of_files_to_replace_in_list) == 0:
                          self.mb('No files matched specified filter(s)')
                          return
              
                      num_files_to_examine = len(pathnames_of_files_to_replace_in_list)
              
                      if not self.yes_no('\r\n\r\n'.join([
                              'Q4 of 4:\r\n',
                              '---- FINAL CONFIRM ----\r\n',
                              'Make replacements in {nfe} candidate files in this folder{b} ?'.format(
                                  nfe=num_files_to_examine,
                                  b=' AND below' if process_subfolders else '\r\n(but not its subfolders)'),
                              search_folder_top_level_path,
                              ])):
                          return
              
                      pathname_currently_open_in_a_tab_list = []
                      for (pathname, buffer_id, index, view) in notepad.getFiles():
                          if pathname not in pathname_currently_open_in_a_tab_list:
                              pathname_currently_open_in_a_tab_list.append(pathname)
              
                      num_repl_made_in_all_files = 0
                      pathnames_with_content_changed_by_repl_list = []
              
                      for pathname in pathnames_of_files_to_replace_in_list:
              
                          if pathname in pathname_currently_open_in_a_tab_list:
                              self.print('switching active tab to', pathname)
                              notepad.activateFile(pathname)
                              editor.beginUndoAction()
                          else:
                              self.print('opening', pathname)
                              notepad.open(pathname)
                          if notepad.getCurrentFilename() != pathname: continue  # shouldn't happen
              
                          for (find_what, replace_with) in find_and_repl_match_list:
              
                              # since the editor.replace() function won't tell us how many replacements it made,
                              #  count them by searching for the matches BEFORE doing the replacement
                              self.num_repl_made_in_this_file = 0
                              def match_found(m): self.num_repl_made_in_this_file += 1
                              editor.search(find_what, match_found)
              
                              if self.num_repl_made_in_this_file > 0:
              
                                  self.print('replacing "{fw}" with "{rw}" {n} times'.format(
                                      fw=find_what, rw=replace_with, n=self.num_repl_made_in_this_file))
              
                                  num_repl_made_in_all_files += self.num_repl_made_in_this_file
              
                                  if pathname not in pathnames_with_content_changed_by_repl_list:
                                      pathnames_with_content_changed_by_repl_list.append(pathname)
              
                                  # FINALLY, the actual replacement!
                                  editor.replace(find_what, replace_with)
              
                          if pathname in pathname_currently_open_in_a_tab_list:
                              editor.endUndoAction()
                          else:
                              if editor.getModify():
                                  self.print('saving', pathname)
                                  notepad.save()
                              self.print('closing', pathname)
                              notepad.close()
              
                      # restore tab that was active before we started
                      notepad.activateFile(substitutions_list_file_path)
              
                      self.mb('\r\n\r\n'.join([
                          '---- DONE! ----',
                          '{nr} total replacements made in {nrf} files'.format(nr=num_repl_made_in_all_files,
                              nrf=len(pathnames_with_content_changed_by_repl_list)),
                          '(of {nfe} files matching filters provided)'.format(nfe=num_files_to_examine),
                          ]))
              
                  def print(self, *args):
                      if self.debug:
                          print('RIFFLIAT:', *args)
              
                  def mb(self, msg, flags=0, title=''):  # a message-box function
                      return notepad.messageBox(msg, title if title else self.this_script_name, flags)
              
                  def yes_no(self, question_text):
                      answer = self.mb(question_text, MESSAGEBOXFLAGS.YESNO, self.this_script_name)
                      return answer == MESSAGEBOXFLAGS.RESULTYES
              
                  def yes_no_cancel(self, question_text):
                      retval = None
                      answer = self.mb(question_text, MESSAGEBOXFLAGS.YESNOCANCEL, self.this_script_name)
                      if answer == MESSAGEBOXFLAGS.RESULTYES: retval = True
                      elif answer == MESSAGEBOXFLAGS.RESULTNO: retval = False
                      return retval
              
                  def prompt(self, prompt_text, default_text=''):
                      if '\n' not in prompt_text: prompt_text = '\r\n' + prompt_text
                      prompt_text += ':'
                      return notepad.prompt(prompt_text, self.this_script_name, default_text)
              
              #-------------------------------------------------------------------------------
              
              if __name__ == '__main__': RIFFLIAT()
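For readers who want to see the core idea without the Notepad++ dialogs, the following is a minimal plain-Python 3 sketch of the same loop: walk a folder, pick files by filespec, and apply each literal pair in order. It is not the script above and does not use the `Npp` module; the function name `replace_in_folder` and its parameters are hypothetical, and UTF-8 files are assumed.

```python
import fnmatch
import os
import tempfile

def replace_in_folder(top, pairs, patterns=('*.txt',), recurse=False, skip=()):
    """Apply literal (find_what, replace_with) pairs to matching files.

    Returns a dict mapping each changed file path to its replacement count.
    """
    skip_paths = {os.path.abspath(p) for p in skip}
    changed = {}
    for root, _dirs, files in os.walk(top):
        for name in sorted(files):
            path = os.path.join(root, name)
            if os.path.abspath(path) in skip_paths:
                continue  # e.g. the substitution list file itself
            if not any(fnmatch.fnmatch(name, pat) for pat in patterns):
                continue
            # newline='' keeps the file's own line endings untouched
            with open(path, encoding='utf-8', newline='') as f:
                text = f.read()
            count = 0
            for find_what, replace_with in pairs:
                count += text.count(find_what)  # count first, then replace
                text = text.replace(find_what, replace_with)
            if count:
                with open(path, 'w', encoding='utf-8', newline='') as f:
                    f.write(text)
                changed[path] = count
        if not recurse:
            break  # os.walk yields the top folder first; stop after it
    return changed

# Small demonstration in a throwaway folder
with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, 'notes.txt')
    with open(target, 'w', encoding='utf-8') as f:
        f.write('james met calvin in new york')
    result = replace_in_folder(folder, [('james', 'James'), ('new york', 'New York')])
    print(result[target])
    with open(target, encoding='utf-8') as f:
        print(f.read())
```

Unlike the script, this sketch asks for no confirmation and keeps no undo history, so it should only ever be pointed at a backed-up copy of the files.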
              

              For basic information about setting up scripting, see the REFERENCE I provided in an earlier post in this thread.

              (And BTW, thanks to @PeterJones for some prerelease testing on this!)

              • C
                Calvin Foo @Alan Kilborn
                last edited by Calvin Foo Oct 25, 2022, 7:25 PM Oct 25, 2022, 7:22 PM

                @Alan-Kilborn Thanks, but wow, this seems a bit too complicated for me.
                Is it possible to simplify it so that it just reads an Excel sheet and replaces A2 with B2 in every text file that is open in NPP?

                Then I could just add new words to the Excel file and run the script.

                • PeterJonesP
                  PeterJones @Calvin Foo
                  last edited by PeterJones Oct 25, 2022, 7:45 PM Oct 25, 2022, 7:39 PM

                  @Calvin-Foo ,

                  Most of the complication is just setting up Python Script plugin and installing the script once.

                  After that, you just have to run the script whenever you have a file like this:

                  blue->orange
                  replace this->with this
                  the delimiter is->a minus followed by a greater than
                  

                  It’s really not hard to create that substitution file.

                  And keeping the replacement data in Excel would only make the script harder for Alan to write (and this forum is not a code-writing service). The script would still be written for PythonScript, so you would still have to install PythonScript and install the script once. Running it would be no harder with a text file as the source of the search->replace pairs than with an external Excel spreadsheet; it would actually be easier, because you would not have to run Excel just to prepare for a search-and-replace in Notepad++. He would still need all those confirmation dialogs whether the map is in Excel or in Notepad++, plus one more dialog asking where the Excel spreadsheet is.

                  He wrote this not just for you, but also for all the other people who ask for nearly the same functionality (we’ve seen similar requests a lot over the years, and he finally decided that we needed one generic script to handle them all, so that we’d stop having to write customized scripts for each user). If using this generic script is too complicated for you, you will not like any implementation that anyone here is able to give you.

                  Good luck.

                  • Terry RT
                    Terry R @Calvin Foo
                    last edited by Oct 25, 2022, 7:41 PM

                    @Calvin-Foo said in Massive list and massive search and replace?:

                    Is it possible to simplify it so that it just reads an Excel sheet and replaces A2 with B2 in every text file that is open in NPP?

                    If you are referring to an Excel file with extension XLSX then no, Notepad++ does NOT read files which are binary in nature very well. It is after all a TEXT editor, not a Binary editor.

                    However, since you refer to an “Excel page” and to words in 2 columns, that could also be a CSV (comma separated value) file. And whilst @Alan-Kilborn has used a TSV (tab separated value) file, the 2 are very similar. He could probably alter his code to use a comma instead of a tab, but may have chosen the tab to avoid confusion with commas inside the text.

                    But making that minor change to his code isn’t going to simplify the process anyway. Just be thankful he has gone to such lengths to help you. Sometimes processes such as the one you outlined can be made easier, but they will still require time to set up.

                    Terry

                    PS I see @PeterJones has stated the same.
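The point that a two-column spreadsheet can be exported as CSV suggests one way to avoid changing the script at all: convert the CSV into the `find->replace` list it already expects. The following is a minimal sketch in plain Python 3 using the standard `csv` module; the function name `csv_to_list` and the `skip_header` option are hypothetical, and it assumes the find text is in the first column and the replacement in the second.

```python
import csv
import io

def csv_to_list(csv_text, delimiter='->', skip_header=False):
    """Convert two-column CSV text into 'find->replace' lines."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if skip_header:
        rows = rows[1:]  # drop a heading row such as "Find,Replace"
    lines = []
    for row in rows:
        # ignore blank rows and rows with an empty cell
        if len(row) >= 2 and row[0] and row[1]:
            lines.append(row[0] + delimiter + row[1])
    return '\n'.join(lines)

exported = 'james,James\ncalvin,Calvin\n"new york","New York"\n'
print(csv_to_list(exported))
```

The resulting text could be pasted into a saved file in the target folder and used as the active tab when the script is run.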

                    • A
                      Alan Kilborn @Terry R
                      last edited by Oct 25, 2022, 8:40 PM

                      @Terry-R said in Massive list and massive search and replace?:

                      And whilst @Alan-Kilborn has used a TSV (tab separated value) file

                      Actually it isn’t a “tab”, although I can see why you’d think that. I chose - followed by > as the delimiter. The delimiter is specifically variable-ized in the script, so one could easily change it to whatever is desired.

                      • A
                        Alan Kilborn @Calvin Foo
                        last edited by Oct 25, 2022, 8:41 PM

                        @Calvin-Foo said in Massive list and massive search and replace?:

                        Is it possible to simplify it so that it just reads an Excel sheet and replaces A2 with B2 in every text file that is open in NPP?

                        I suppose it IS possible, but not by me.
                        The intent of the script was to solve kind of a general case problem, in a general way.
                        Of course anyone can treat it as a demo, and feel free to modify it at will.

                        • C
                          Calvin Foo
                          last edited by Oct 26, 2022, 9:02 AM

                          I guess I need to learn from the ground up. I only have experience writing ASP 3.0 and SQL Server.

                          I guess I’ll just start from there.

                          Could anyone give me a simple guide on how to write a simple replace text script? I’ll study further from there and add a list of text (maybe imported from CSV).

                          • Neil SchipperN
                            Neil Schipper @Calvin Foo
                            last edited by Neil Schipper Oct 26, 2022, 10:16 AM Oct 26, 2022, 9:54 AM

                            @Calvin-Foo

                            There’s very, very, very little to learn. First, follow the instructions in Alan’s second post to you (“REFERENCE” link) LIKE A MONKEY.

                            Here’s a first script you can run:

                            #! python
                            import sys
                            print "Old style print syntax"
                            sys.stdout.write("Calvin-Foo's first script -- hello from Python %s\n" % (sys.version,))
                            

                            You don’t need to know what any of the lines mean or what they do. (Once you get it running, you can hack at it for fun).

                            There’s a lot of complexity in airplanes and elevators and keyboards and phones that you don’t see and don’t need to deal with. It’s really quite similar.

                            • A
                              Alan Kilborn @Calvin Foo
                              last edited by Alan Kilborn Oct 26, 2022, 12:12 PM Oct 26, 2022, 12:11 PM

                              @Calvin-Foo said in Massive list and massive search and replace?:

                              I guess I need to learn from ground up.
                              I guess I just start from there
                              I’ll further study from there and include a list of text (maybe import from CSV)

                              In case it isn’t obvious, you could take YOUR data, in whatever (textual) format, and use Notepad++ to change it with a replacement operation into MY demo format, and then just run with the demo solution.

                              This avoids programming and could (probably) be made into a N++ macro for easy repetitive running.

                              As an example, take your original problem statement data (I know it isn’t your real data, but we have none of that here, so…):

                              1. james - James
                              2. calvin - Calvin
                              3. new york - New York
                              

                              You could change that into the needed input format for the script by this operation:

                              Find: (?-s)^\d+\. (.+?) - (.+)
                              Replace: ${1}->${2}
                              Search mode: Regular expression
                              Wrap around: Checked
                              Action: Replace All button

                              After the replace-all, your data would then look like this:

                              james->James
                              calvin->Calvin
                              new york->New York
                              

                              which would be a direct feed-in to the demo script.
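The same transform can also be sketched outside Notepad++ with Python’s `re` module, which may help when checking the pattern on a sample first. This is only an approximation of the Find/Replace settings above: Python does not accept a leading `(?-s)`, so it is dropped (the dot already stops at line ends by default), `re.MULTILINE` makes `^` and `$` work per line, and `\1`/`\2` stand in for `${1}`/`${2}`.

```python
import re

numbered = '1. james - James\n2. calvin - Calvin\n3. new york - New York\n'

# "<number>. <find> - <replace>"  becomes  "<find>-><replace>"
converted = re.sub(r'^\d+\. (.+?) - (.+)$', r'\1->\2', numbered, flags=re.MULTILINE)
print(converted)
```

Notepad++ uses a different regex engine than Python, so more elaborate patterns may not carry over unchanged.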


                              this seems a bit too complicated for me

                              Of course, this data transform may involve learning some regular expressions (don’t know your expertise), but I don’t think it is too much to ask people that request a moderately-complex solution to a problem to do some sort of learning of their own along the way.


                              How to write a simple replace text script?

                              Well about the simplest one I can think of would be a one-liner:

                              editor.replace('apple', 'Apple')

                              • nerdyone255N
                                nerdyone255
                                last edited by Jun 7, 2024, 4:26 PM

                                this script is FANTASTIC.

                                i do have a question though- in the final output it prints how many replacements were made, but is there any way to see what the actual replacements were?

                                looking into the code i see

                                if self.num_repl_made_in_this_file > 0:

                                    self.print('replacing "{fw}" with "{rw}" {n} times'.format(
                                        fw=find_what, rw=replace_with, n=self.num_repl_made_in_this_file))
                                

                                but i dont see where that would get printed- it doesnt show up in the python console either

                                again huge thanks for this one

                                • A
                                  Alan Kilborn @nerdyone255
                                  last edited by Alan Kilborn Jun 8, 2024, 1:21 AM Jun 8, 2024, 1:19 AM

                                  @nerdyone255 said :

                                  this script is FANTASTIC.

                                  Well…glad you like it.

                                  but i dont see where that would get printed- it doesnt show up in the python console either

                                  The self.print() function calls are really meant as debug helpers while testing the script. Thus, in the version of the script above, they don’t do anything because the debug variable is set to False. If you change the 0 to a 1 in this line:

                                  self.debug = True if 0 else False

                                  or simply change it to:

                                  self.debug = True

                                  then the output of the self.print() calls will go to the PythonScript console window. You’ll see the output you indicated you were interested in, plus output from other things that happen while the script is running.
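The debug-flag pattern described here can be shown in isolation. This is a minimal sketch in plain Python 3, not code from the script: the class name `DebugLog` and the method name `log` are hypothetical stand-ins for the script’s `self.debug` flag and `self.print()` helper.

```python
class DebugLog(object):
    """Print messages only when debug is switched on."""

    def __init__(self, debug=False, prefix='RIFFLIAT:'):
        self.debug = debug
        self.prefix = prefix

    def log(self, *args):
        if self.debug:
            print(self.prefix, *args)

quiet = DebugLog(debug=False)
quiet.log('replacing "blue" with "orange" 3 times')    # prints nothing

verbose = DebugLog(debug=True)
verbose.log('replacing "blue" with "orange" 3 times')  # prints the prefixed message
```

Keeping the check inside the helper means the calling code can log freely and a single flag decides whether anything appears.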

                                  • nerdyone255N
                                    nerdyone255 @Alan Kilborn
                                    last edited by Jun 11, 2024, 1:39 PM

                                    @Alan-Kilborn perfect!

                                    • Yurble VươngY
                                      Yurble Vương @Alan Kilborn
                                      last edited by PeterJones Feb 9, 2025, 4:49 PM Feb 9, 2025, 4:41 PM

                                      @Alan-Kilborn

                                      Thanks Alan, it works perfectly, except for one special case of mine. I’d appreciate your help if possible:

                                      I want to match only at word boundaries.


                                      For example:
                                      Sentence: You are eating apple. The tree have a lot of apples. all the apples is green.
                                      apple->cherry
                                      apples->cherries


                                      So how can I add code to make it change only whole words, that is, text that starts and ends at a transition from a space-like character (space, comma, dot, quote marks, question mark…) to a non-space character?

                                      thanks in advance
                                      Yurble

                                      • A
                                        Alan Kilborn @Yurble Vương
                                        last edited by Feb 9, 2025, 5:16 PM

                                        @Yurble-Vương said in Massive list and massive search and replace?:

                                        how can I add code to make it change only whole words, that is, text that starts and ends at a transition from a space-like character to a non-space character

                                        You can use \b in the regular expression to insist upon a word boundary. For example, \bapple will match in “have an apple today” but will not match in “have a crabapple today”.
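To make the word-boundary idea concrete for the apple/apples example, here is a minimal sketch using Python’s standard `re` module rather than Notepad++ itself. Two points are assumptions to verify against the PythonScript documentation: the posted script calls `editor.search()` and `editor.replace()`, which appear to be the literal (non-regex) variants, so using `\b` there would presumably mean switching to `editor.research()` and `editor.rereplace()`. Note also that a leading `\b` alone would still match the `apple` inside `apples`, so the sketch puts `\b` on both sides. The helper name `replace_whole_words` is hypothetical.

```python
import re

def replace_whole_words(text, pairs):
    """Replace each find_what only where it stands as a whole word."""
    for find_what, replace_with in pairs:
        # re.escape() keeps any regex metacharacters in find_what literal
        pattern = r'\b' + re.escape(find_what) + r'\b'
        # a function replacement keeps backslashes in replace_with literal too
        text = re.sub(pattern, lambda m, r=replace_with: r, text)
    return text

sentence = 'You are eating apple. The tree have a lot of apples. all the apples is green.'
pairs = [('apple', 'cherry'), ('apples', 'cherries')]

literal = sentence.replace('apple', 'cherry')  # also hits the "apple" inside "apples"
whole = replace_whole_words(sentence, pairs)
print(literal)
print(whole)
```

`\b` only behaves as expected when the find text begins and ends with a letter, digit, or underscore; entries that start or end with punctuation would need different handling.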

                                        • Yurble VươngY
                                          Yurble Vương @Alan Kilborn
                                          last edited by Feb 10, 2025, 2:20 PM

                                          This post is deleted!