
    Massive list and massive search and replace?

    • Calvin FooC
      Calvin Foo
      last edited by

      I have multiple text files (over 100), and I need to make a lot of text replacements in them.

      example

      1. james - James
      2. calvin - Calvin
      3. new york - New York

      Is it possible to load a list of the text that needs replacing (maybe from Excel?) and then search and replace with it?

      • Alan KilbornA
        Alan Kilborn @Calvin Foo
        last edited by

        @Calvin-Foo

        There really isn’t a native-to-Notepad++ way of doing this.
        If you’re willing to use a scripting plugin, though, we can “get 'er done” – how about it? Are you willing to go to the “complication” of setting up scripting?

        • Calvin FooC
          Calvin Foo @Alan Kilborn
          last edited by

          @Alan-Kilborn I don’t mind a little scripting. But I didn’t know Notepad++ could do scripting.

          • Alan KilbornA
            Alan Kilborn @Calvin Foo
            last edited by

            @Calvin-Foo said in Massive list and massive search and replace?:

            I didn’t know Notepad++ could do scripting

            Yes, here’s a good starting point REFERENCE.

            If you can give me a little time, I’ll put together a demo in my upcoming spare time. We have some scripts here on this site for “replacing from a list” but I don’t think we have anything that operates over a folder tree of files. I could pull together something that does both.

            • Calvin FooC
              Calvin Foo @Alan Kilborn
              last edited by

              @Alan-Kilborn It would be great if you could help me with this. Thank you very much.

              • Alan KilbornA
                Alan Kilborn
                last edited by Alan Kilborn

                Ok, so I found some time to finish the script.

                Here’s how it works:

                You start with a file open and active in Notepad++ that contains your desired replacements. This file should be saved into the same folder in which you want to do the replacements. The file can have any name and needs to have the following format for your replacements list:

                blue->orange
                replace this->with this
                the delimiter is->a minus followed by a greater than
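
As a rough illustration of that list format, here is a minimal plain-Python sketch of how such a file could be read into find/replace pairs. It is not the code the script uses (the script relies on `editor.research()` inside Notepad++); the function name `parse_substitution_list` and the idea of reporting skipped lines are assumptions made for this example.

```python
def parse_substitution_list(text, delimiter="->"):
    """Split each line at the first delimiter into a (find, replace) pair.

    Returns (pairs, skipped), where skipped holds the 1-based numbers of
    non-blank lines that did not yield a usable pair.
    """
    pairs, skipped = [], []
    for number, line in enumerate(text.splitlines(), start=1):
        find_what, found, replace_with = line.partition(delimiter)
        if found and find_what and replace_with:
            pairs.append((find_what, replace_with))
        elif line.strip():
            skipped.append(number)  # no delimiter, or an empty side
    return pairs, skipped


sample = (
    "blue->orange\n"
    "replace this->with this\n"
    "the delimiter is->a minus followed by a greater than\n"
    "this line has no delimiter\n"
)
pairs, skipped = parse_substitution_list(sample)
print(pairs)
print(skipped)
```

Checking a list this way before running the real replacement can reveal lines that would be silently ignored.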
                

                At that point, you run the script. It will prompt you through a series of questions about the operation, giving you a chance to validate that you are doing what you intend at several steps:

                [Four screenshots of the script’s confirmation dialogs, Q1 through Q4]

                Somewhat obviously, after you give the final “Yes” the real work will actually be done and the indicated replacements made.

                I call the script ReplaceInFilesFromListInActiveTab.py and here is its listing:

                # -*- coding: utf-8 -*-
                from __future__ import print_function
                
                # references:
                #  https://community.notepad-plus-plus.org/topic/23638/massive-list-and-massive-search-and-replace
                #  also possibly https://community.notepad-plus-plus.org/topic/22601
                #  also possibly https://community.notepad-plus-plus.org/topic/22721
                #  also possibly https://community.notepad-plus-plus.org/topic/23495
                
                from Npp import *
                import inspect
                import os
                import re
                import glob
                
                #-------------------------------------------------------------------------------
                
                class RIFFLIAT(object):
                
                    def __init__(self):
                
                        self.debug = True if 0 else False
                        if self.debug:
                            console.show()
                            console.clear()
                
                        self.this_script_name = inspect.getframeinfo(inspect.currentframe()).filename.split(os.sep)[-1].rsplit('.', 1)[0]
                
                        # the active tab has the list of the substitution pairs
                        substitutions_list_file_path = notepad.getCurrentFilename()
                        if not os.path.isfile(substitutions_list_file_path):
                            self.mb('Substitution list file must be a hard-named file in the file system, i.e.,  not e.g. "new 2"')
                            return
                        self.print('substitutions_list_file_path:', substitutions_list_file_path)
                
                        find_and_repl_match_list = []
                        delimiter = '->'
                        editor.research(r'(?-s)^(.+?)' + delimiter + r'(.+)', lambda m: find_and_repl_match_list.append((m.group(1), m.group(2))))
                        if len(find_and_repl_match_list) == 0:
                            self.mb('\r\n'.join([
                                'The substitution list in the active file has no findwhat/replacewith pairs\r\n',
                                'Format of file is, 1 pair per line, using  {d}  as a delimiter, no extra spaces:\r\n'.format(d=delimiter),
                                'find1{d}replace1'.format(d=delimiter),
                                'find2{d}replace2'.format(d=delimiter),
                                '...{d}...'.format(d=delimiter),
                                ]))
                            return
                
                        sample_repl_pairs_summary_list = []
                        three_or_less_sub_pairs = min(3, len(find_and_repl_match_list))
                        num_sub_pairs_above_3 = len(find_and_repl_match_list) - three_or_less_sub_pairs
                        max_chars_show = 20
                        for (find_what, replace_with) in find_and_repl_match_list[ 0 : three_or_less_sub_pairs ]:
                            if len(find_what) > max_chars_show: find_what = find_what[ 0 : max_chars_show] + '...'
                            if len(replace_with) > max_chars_show: replace_with = replace_with[ 0 : max_chars_show] + '...'
                            sample_repl_pairs_summary_list.append('"{fw}" with "{rw}"'.format(fw=find_what, rw=replace_with))
                        if num_sub_pairs_above_3 > 0: sample_repl_pairs_summary_list.append('(and {} more)'.format(num_sub_pairs_above_3))
                
                        search_folder_top_level_path = substitutions_list_file_path.rsplit(os.sep, 1)[0] + os.sep
                        self.print('search_folder_top_level_path:', search_folder_top_level_path)
                
                        if not self.yes_no('\r\n\r\n'.join([
                                'Q1 of 4:\r\n',
                                'Perform these replacements (specified in the active file content):',
                                '\r\n'.join(sample_repl_pairs_summary_list) + '\r\n',
                                'in the files in this folder?',
                                search_folder_top_level_path,
                                '-' * 60,
                                'IT IS STRONGLY SUGGESTED TO MAKE A BACKUP',
                                'OF ALL SOURCE FILES BEFORE RUNNING THIS!',
                                ])):
                            return
                
                        process_subfolders = self.yes_no_cancel('\r\n\r\n'.join([
                            'Q2 of 4:\r\n',
                            'Do replacements in files in SUBFOLDERS of this folder also?',
                            search_folder_top_level_path,
                            ]))
                        if process_subfolders is None: return  # user cancel
                        self.print('process_subfolders:', process_subfolders)
                
                        default_filespec = '*.txt'
                        filter_input = self.prompt(
                            'Q3 of 4:\r\n' + \
                            'Supply filespec filter list         (example:    *.html *.txt *.log    )',
                            default_filespec)
                        if filter_input is None: return  # user cancel
                        filters_list = filter_input.split(' ')
                        filters_list = [ f for f in filters_list if len(f) > 0 ]  # remove any empty entries in filters_list
                        self.print('filters_list:', filters_list)
                
                        pathnames_of_files_to_replace_in_list = []
                        for (root, dirs, files) in os.walk(search_folder_top_level_path):
                            for filt in filters_list:
                                for p in glob.glob(os.path.join(root, filt)):
                                    if p != substitutions_list_file_path:
                                        pathnames_of_files_to_replace_in_list.append(p)
                            if not process_subfolders: break
                        if len(pathnames_of_files_to_replace_in_list) == 0:
                            self.mb('No files matched specified filter(s)')
                            return
                
                        num_files_to_examine = len(pathnames_of_files_to_replace_in_list)
                
                        if not self.yes_no('\r\n\r\n'.join([
                                'Q4 of 4:\r\n',
                                '---- FINAL CONFIRM ----\r\n',
                                'Make replacements in {nfe} candidate files in this folder{b} ?'.format(
                                    nfe=num_files_to_examine,
                                    b=' AND below' if process_subfolders else '\r\n(but not its subfolders)'),
                                search_folder_top_level_path,
                                ])):
                            return
                
                        pathname_currently_open_in_a_tab_list = []
                        for (pathname, buffer_id, index, view) in notepad.getFiles():
                            if pathname not in pathname_currently_open_in_a_tab_list:
                                pathname_currently_open_in_a_tab_list.append(pathname)
                
                        num_repl_made_in_all_files = 0
                        pathnames_with_content_changed_by_repl_list = []
                
                        for pathname in pathnames_of_files_to_replace_in_list:
                
                            if pathname in pathname_currently_open_in_a_tab_list:
                                self.print('switching active tab to', pathname)
                                notepad.activateFile(pathname)
                                editor.beginUndoAction()
                            else:
                                self.print('opening', pathname)
                                notepad.open(pathname)
                            if notepad.getCurrentFilename() != pathname: continue  # shouldn't happen
                
                            for (find_what, replace_with) in find_and_repl_match_list:
                
                                # since the editor.replace() function won't tell us how many replacements it made,
                                #  count them by searching for the matches BEFORE doing the replacement
                                self.num_repl_made_in_this_file = 0
                                def match_found(m): self.num_repl_made_in_this_file += 1
                                editor.search(find_what, match_found)
                
                                if self.num_repl_made_in_this_file > 0:
                
                                    self.print('replacing "{fw}" with "{rw}" {n} times'.format(
                                        fw=find_what, rw=replace_with, n=self.num_repl_made_in_this_file))
                
                                    num_repl_made_in_all_files += self.num_repl_made_in_this_file
                
                                    if pathname not in pathnames_with_content_changed_by_repl_list:
                                        pathnames_with_content_changed_by_repl_list.append(pathname)
                
                                    # FINALLY, the actual replacement!
                                    editor.replace(find_what, replace_with)
                
                            if pathname in pathname_currently_open_in_a_tab_list:
                                editor.endUndoAction()
                            else:
                                if editor.getModify():
                                    self.print('saving', pathname)
                                    notepad.save()
                                self.print('closing', pathname)
                                notepad.close()
                
                        # restore tab that was active before we started
                        notepad.activateFile(substitutions_list_file_path)
                
                        self.mb('\r\n\r\n'.join([
                            '---- DONE! ----',
                            '{nr} total replacements made in {nrf} files'.format(nr=num_repl_made_in_all_files,
                                nrf=len(pathnames_with_content_changed_by_repl_list)),
                            '(of {nfe} files matching filters provided)'.format(nfe=num_files_to_examine),
                            ]))
                
                    def print(self, *args):
                        if self.debug:
                            print('RIFFLIAT:', *args)
                
                    def mb(self, msg, flags=0, title=''):  # a message-box function
                        return notepad.messageBox(msg, title if title else self.this_script_name, flags)
                
                    def yes_no(self, question_text):
                        answer = self.mb(question_text, MESSAGEBOXFLAGS.YESNO, self.this_script_name)
                        return answer == MESSAGEBOXFLAGS.RESULTYES
                
                    def yes_no_cancel(self, question_text):
                        retval = None
                        answer = self.mb(question_text, MESSAGEBOXFLAGS.YESNOCANCEL, self.this_script_name)
                        if answer == MESSAGEBOXFLAGS.RESULTYES: retval = True
                        elif answer == MESSAGEBOXFLAGS.RESULTNO: retval = False
                        return retval
                
                    def prompt(self, prompt_text, default_text=''):
                        if '\n' not in prompt_text: prompt_text = '\r\n' + prompt_text
                        prompt_text += ':'
                        return notepad.prompt(prompt_text, self.this_script_name, default_text)
                
                #-------------------------------------------------------------------------------
                
                if __name__ == '__main__': RIFFLIAT()
                

                For basic information about setting up scripting, see the REFERENCE I provided in an earlier post in this thread.
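
For readers who want to see the core idea without the Notepad++ dialogs, the following is a minimal standalone sketch in Python 3. It is not the script above and does not use the PythonScript API: it swaps in plain file I/O, `os.walk`, and `fnmatch`. The function name `replace_in_folder` and its parameters are hypothetical, and it assumes the files are UTF-8 text. As with the script, back up your files before trying anything like this.

```python
import fnmatch
import os


def replace_in_folder(folder, pairs, patterns=("*.txt",), recurse=False, skip=None):
    """Apply literal (find, replace) pairs, in order, to matching files.

    Returns a dict mapping each changed file path to its replacement count.
    """
    changed = {}
    for root, _dirs, files in os.walk(folder):
        for name in sorted(files):
            path = os.path.join(root, name)
            if skip is not None and os.path.abspath(path) == os.path.abspath(skip):
                continue  # leave the substitution list file itself alone
            if not any(fnmatch.fnmatch(name, pat) for pat in patterns):
                continue
            # newline="" keeps the file's original line endings untouched
            with open(path, encoding="utf-8", newline="") as f:
                text = f.read()
            count = 0
            for find_what, replace_with in pairs:
                count += text.count(find_what)  # count before replacing
                text = text.replace(find_what, replace_with)
            if count:
                with open(path, "w", encoding="utf-8", newline="") as f:
                    f.write(text)
                changed[path] = count
        if not recurse:
            break  # os.walk yields the top-level folder first
    return changed
```

A call such as `replace_in_folder(r"C:\some\folder", [("blue", "orange")], recurse=True)` would then process `*.txt` files in that (hypothetical) folder and everything below it.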

                (And BTW, thanks to @PeterJones for some prerelease testing on this!)

                • Calvin FooC
                  Calvin Foo @Alan Kilborn
                  last edited by Calvin Foo

                  @Alan-Kilborn Thanks, but wow, this seems a bit too complicated for me.
                  Is it possible to simplify it so that it just reads an Excel sheet and replaces A2 with B2 in every text file that is open in NPP?

                  Then I could just add new words to the Excel file and run the script.

                  • PeterJonesP
                    PeterJones @Calvin Foo
                    last edited by PeterJones

                    @Calvin-Foo ,

                    Most of the complication is just setting up Python Script plugin and installing the script once.

                    After that, you just have to run the script whenever you have a file like this:

                    blue->orange
                    replace this->with this
                    the delimiter is->a minus followed by a greater than
                    

                    It’s really not hard to create that substitution file.

                    And having the replacement data in Excel would make it harder for Alan to write a script for you (and this forum is not a code-writing service), but because the script would still be written in PythonScript, you would still have to install PythonScript and install that script once. Running it would be just as easy for you whether you run it with a text file as the source of the search->replace pairs or whether you run a script that has to parse some external Excel spreadsheet (easier, actually, for the text file, because then you don’t have to also run Excel just to prepare for a search-and-replace in Notepad++). He would still have to have all those confirmation dialogs whether the map is in Excel or in Notepad++… and he’d also have to have another dialog which asks you where the Excel spreadsheet was.

                    He wrote this not just for you, but also for all the other people who ask for nearly the same functionality (we’ve seen similar requests a lot over the years, and he finally decided that we needed one generic script to handle them all, so that we’d stop having to write customized scripts for each user). If using this generic script is too complicated for you, you will not like any implementation that anyone here is able to give you.

                    Good luck.

                    • Terry RT
                      Terry R @Calvin Foo
                      last edited by

                      @Calvin-Foo said in Massive list and massive search and replace?:

                      Is it possible to simplify it so that it just reads an Excel sheet and replaces A2 with B2 in every text file that is open in NPP?

                      If you are referring to an Excel file with extension XLSX then no, Notepad++ does NOT read files which are binary in nature very well. It is after all a TEXT editor, not a Binary editor.

                      However since you refer to an “Excel page” and refer to words in 2 columns, that could also be a CSV (comma separated value) file. And whilst @Alan-Kilborn has used a TSV (tab separated value) file, the 2 are very similar. He possibly could alter his code to use the comma instead of a tab, but possibly used the tab to prevent possible confusion within words.

                      But doing that minor change to his code isn’t going to simplify the process anyways. Just be thankful he has gone to such lengths to help you. Sometimes doing processes such as you outlined can be made easier, but will still require time to setup.
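
If the replacement pairs live in a spreadsheet, one workable route is to export the sheet as CSV and convert it to the `find->replace` list format the script expects. The following is a minimal Python 3 sketch of that conversion, using only the standard `csv` module; the function name `csv_to_pairs_text` is hypothetical, and it assumes the find text is in the first column and the replacement in the second.

```python
import csv
import io


def csv_to_pairs_text(csv_text, delimiter="->"):
    """Turn two-column CSV text into 'find->replace' lines."""
    lines = []
    for row in csv.reader(io.StringIO(csv_text)):
        # skip blank rows and rows with an empty first or second column
        if len(row) >= 2 and row[0] and row[1]:
            lines.append(row[0] + delimiter + row[1])
    return "\n".join(lines) + ("\n" if lines else "")


exported = "james,James\ncalvin,Calvin\nnew york,New York\n"
print(csv_to_pairs_text(exported))
```

The resulting text can be saved into the folder to be processed and opened in Notepad++ as the substitution list.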

                      Terry

                      PS I see @PeterJones has stated the same.

                      • Alan KilbornA
                        Alan Kilborn @Terry R
                        last edited by

                        @Terry-R said in Massive list and massive search and replace?:

                        And whilst @Alan-Kilborn has used a TSV (tab separated value) file

                        Actually it isn’t a “tab”, although I can see why you’d think that. I chose - followed by > as the delimiter. The delimiter is specifically variable-ized in the script, so one could easily change it to whatever is desired.
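
One caveat when changing the delimiter: the script splices it straight into a regular expression, so a delimiter containing regex metacharacters (such as `|` or `.`) would presumably need escaping first. The sketch below illustrates the idea with Python’s own `re` module rather than the Notepad++ search engine; `build_pair_pattern` is a hypothetical helper, not part of the script.

```python
import re


def build_pair_pattern(delimiter):
    """Build a 'find<delimiter>replace' line pattern for any literal delimiter."""
    # re.escape() keeps characters such as | or . from being read as regex syntax
    return re.compile(r"^(.+?)" + re.escape(delimiter) + r"(.+)$", re.MULTILINE)


tab_pairs = build_pair_pattern("\t").findall("blue\torange\nred\tgreen\n")
pipe_pairs = build_pair_pattern("|").findall("blue|orange\nred|green\n")
print(tab_pairs)
print(pipe_pairs)
```

Both calls yield the same list of pairs, which suggests that a tab or a pipe would work as well as `->` once the delimiter is escaped.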

                        • Alan KilbornA
                          Alan Kilborn @Calvin Foo
                          last edited by

                          @Calvin-Foo said in Massive list and massive search and replace?:

                          Is it possible to simplify it so that it just reads an Excel sheet and replaces A2 with B2 in every text file that is open in NPP?

                          I suppose it IS possible, but not by me.
                          The intent of the script was to solve kind of a general case problem, in a general way.
                          Of course anyone can treat it as a demo, and feel free to modify it at will.

                          • Calvin FooC
                            Calvin Foo
                            last edited by

                            I guess I need to learn from the ground up. I only have experience writing ASP 3.0 and SQL Server code.

                            I guess I’ll just start from there.

                            Could anyone give me a simple guide on how to write a simple replace-text script? I’ll study further from there and add a list of text (maybe imported from CSV).

                            • Neil SchipperN
                              Neil Schipper @Calvin Foo
                              last edited by Neil Schipper

                              @Calvin-Foo

                              There’s very, very, very little to learn. First, follow the instructions in Alan’s second post to you (“REFERENCE” link) LIKE A MONKEY.

                              Here’s a first script you can run:

                              #! python
                              import sys
                              print "Old style print syntax"
                              sys.stdout.write("Calvin-Foo's first script -- hello from Python %s\n" % (sys.version,))
                              

                              You don’t need to know what any of the lines mean or what they do. (Once you get it running, you can hack at it for fun).

                              There’s a lot of complexity in airplanes and elevators and keyboards and phones that you don’t see and don’t need to deal with. It’s really quite similar.

                              • Alan KilbornA
                                Alan Kilborn @Calvin Foo
                                last edited by Alan Kilborn

                                @Calvin-Foo said in Massive list and massive search and replace?:

                                I guess I need to learn from ground up.
                                I guess I just start from there
                                I’ll further study from there and include a list of text (maybe import from CSV)

                                In case it isn’t obvious, you could take YOUR data, in whatever (textual) format, and use Notepad++ to change it with a replacement operation into MY demo format, and then just run with the demo solution.

                                This avoids programming and could (probably) be made into a N++ macro for easy repetitive running.

                                As an example, take your original problem statement data (I know it isn’t your real data, but we have none of that here, so…):

                                1. james - James
                                2. calvin - Calvin
                                3. new york - New York
                                

                                You could change that into the needed input format for the script by this operation:

                                Find: (?-s)^\d+\. (.+?) - (.+)
                                Replace: ${1}->${2}
                                Search mode: Regular expression
                                Wrap around: Checked
                                Action: Replace All button

                                After the replace-all, your data would then look like this:

                                james->James
                                calvin->Calvin
                                new york->New York
                                

                                which would be a direct feed-in to the demo script.
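
For anyone who would rather do the same transform in a script, here is a hedged Python sketch using the standard `re` module. Note the syntax differences from the Notepad++ dialog: Python writes the replacement groups as `\1` and `\2` instead of `${1}` and `${2}`, and it uses the `re.MULTILINE` flag so that `^` and `$` match at each line.

```python
import re

numbered = "1. james - James\n2. calvin - Calvin\n3. new york - New York\n"

# Strip the leading "N. " and turn " - " into the "->" delimiter, line by line
converted = re.sub(r"^\d+\. (.+?) - (.+)$", r"\1->\2", numbered, flags=re.MULTILINE)
print(converted)
```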


                                this seems a bit too complicated for me

                                Of course, this data transform may involve learning some regular expressions (don’t know your expertise), but I don’t think it is too much to ask people that request a moderately-complex solution to a problem to do some sort of learning of their own along the way.


                                How to write a simple replace text script?

                                Well about the simplest one I can think of would be a one-liner:

                                editor.replace('apple', 'Apple')

                                • nerdyone255N
                                  nerdyone255
                                  last edited by

                                  this script is FANTASTIC.

                                  I do have a question, though: the final output shows how many replacements were made, but is there any way to see what the actual replacements were?

                                  Looking into the code, I see

                                      if self.num_repl_made_in_this_file > 0:

                                          self.print('replacing "{fw}" with "{rw}" {n} times'.format(
                                              fw=find_what, rw=replace_with, n=self.num_repl_made_in_this_file))

                                  but I don’t see where that would get printed; it doesn’t show up in the Python console either.

                                  again huge thanks for this one

                                  • Alan KilbornA
                                    Alan Kilborn @nerdyone255
                                    last edited by Alan Kilborn

                                    @nerdyone255 said :

                                    this script is FANTASTIC.

                                    Well…glad you like it.

                                    but I don’t see where that would get printed; it doesn’t show up in the Python console either

                                    The self.print() function calls are really meant as debug helpers while testing the script. Thus, in the version of the script above, they don’t do anything because the debug variable is set to False. If you change the 0 to a 1 in this line:

                                    self.debug = True if 0 else False

                                    or simply change it to:

                                    self.debug = True

                                    then the output of the self.print() calls will go to the PythonScript console window. You’ll see the output you indicated you were interested in, plus output from other things that happen while the script is running.

                                    • nerdyone255N
                                      nerdyone255 @Alan Kilborn
                                      last edited by

                                      @Alan-Kilborn perfect!

                                      • Yurble VươngY
                                        Yurble Vương @Alan Kilborn
                                        last edited by PeterJones

                                        @Alan-Kilborn

                                        Thanks Alan, it works perfectly, except for one case that is specific to me. I’d appreciate your help if possible:

                                        I want matches to be found only at word boundaries.


                                        For example:
                                        Sentence: You are eating apple. The tree have a lot of apples. all the apples is green.
                                        apple->cherry
                                        apples->cherries


                                        So, how can I change the code so that it only replaces whole words, that is, text that starts and ends at a transition between a word and a non-word character (space, comma, period, quotation mark, question mark…)?

                                        thanks in advance
                                        Yurble

                                        • Alan KilbornA
                                          Alan Kilborn @Yurble Vương
                                          last edited by

                                          @Yurble-Vương said in Massive list and massive search and replace?:

                                          how can I change the code so that it only replaces whole words

                                          You can use \b in the regular expression to insist upon a word boundary. Example: \bapple will match “have an apple today” but will not match “have a crabapple today”.
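
To make the effect of `\b` concrete, here is a small sketch in plain Python (its `re` module, not the Notepad++ search engine) applied to the apple/apples example from the question. The helper name `replace_whole_words` is hypothetical. Note also that the script above calls `editor.search()` and `editor.replace()`, which as far as I know treat the text literally, so a `\b` version inside Notepad++ would presumably need their regex counterparts `editor.research()` and `editor.rereplace()`; treat that as an assumption to verify against the PythonScript documentation.

```python
import re


def replace_whole_words(text, pairs):
    """Replace each find-text only where it stands as a whole word."""
    for find_what, replace_with in pairs:
        # re.escape() makes the find-text literal; \b anchors both ends
        pattern = r"\b" + re.escape(find_what) + r"\b"
        # a function replacement avoids backslash handling in replace_with
        text = re.sub(pattern, lambda m, r=replace_with: r, text)
    return text


sentence = "You are eating apple. The tree have a lot of apples. all the apples is green."
pairs = [("apple", "cherry"), ("apples", "cherries")]
print(replace_whole_words(sentence, pairs))
```

Without the boundaries, the first pair would also hit the "apple" inside "apples" and produce "cherrys". The `\b` anchors only behave this way when the find-text begins and ends with a word character.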
