    Massive list and massive search and replace?

      Calvin Foo @Alan Kilborn

      @Alan-Kilborn it would be great if you could help me with this. Thank you very much.

        Alan Kilborn

        Ok, so I found some time to finish the script.

        Here’s how it works:

        You start with a file open and active in Notepad++ that contains your desired replacements. This file should be saved into the same folder in which you want to do the replacements. The file can have any name and needs to have the following format for your replacements list:

        blue->orange
        replace this->with this
        the delimiter is->a minus followed by a greater than
        

        At that point, you run the script. It will prompt you through a series of questions about the operation, giving you a chance to validate that you are doing what you intend at several steps:

        (Four screenshots of the script's prompt dialogs appeared here.)

        Somewhat obviously, after you give the final “Yes” the real work will actually be done and the indicated replacements made.

        I call the script ReplaceInFilesFromListInActiveTab.py and here is its listing:

        # -*- coding: utf-8 -*-
        from __future__ import print_function
        
        # references:
        #  https://community.notepad-plus-plus.org/topic/23638/massive-list-and-massive-search-and-replace
        #  also possibly https://community.notepad-plus-plus.org/topic/22601
        #  also possibly https://community.notepad-plus-plus.org/topic/22721
        #  also possibly https://community.notepad-plus-plus.org/topic/23495
        
        from Npp import *
        import inspect
        import os
        import re
        import glob
        
        #-------------------------------------------------------------------------------
        
        class RIFFLIAT(object):
        
            def __init__(self):
        
                self.debug = True if 0 else False
                if self.debug:
                    console.show()
                    console.clear()
        
                self.this_script_name = inspect.getframeinfo(inspect.currentframe()).filename.split(os.sep)[-1].rsplit('.', 1)[0]
        
                # the active tab has the list of the substitution pairs
                substitutions_list_file_path = notepad.getCurrentFilename()
                if not os.path.isfile(substitutions_list_file_path):
                    self.mb('Substitution list file must be a hard-named file in the file system, i.e.,  not e.g. "new 2"')
                    return
                self.print('substitutions_list_file_path:', substitutions_list_file_path)
        
                find_and_repl_match_list = []
                delimiter = '->'
                editor.research(r'(?-s)^(.+?)' + delimiter + r'(.+)', lambda m: find_and_repl_match_list.append((m.group(1), m.group(2))))
                if len(find_and_repl_match_list) == 0:
                    self.mb('\r\n'.join([
                        'The substitution list in the active file has no findwhat/replacewith pairs\r\n',
                        'Format of file is, 1 pair per line, using  {d}  as a delimiter, no extra spaces:\r\n'.format(d=delimiter),
                        'find1{d}replace1'.format(d=delimiter),
                        'find2{d}replace2'.format(d=delimiter),
                        '...{d}...'.format(d=delimiter),
                        ]))
                    return
        
                sample_repl_pairs_summary_list = []
                three_or_less_sub_pairs = min(3, len(find_and_repl_match_list))
                num_sub_pairs_above_3 = len(find_and_repl_match_list) - three_or_less_sub_pairs
                max_chars_show = 20
                for (find_what, replace_with) in find_and_repl_match_list[ 0 : three_or_less_sub_pairs ]:
                    if len(find_what) > max_chars_show: find_what = find_what[ 0 : max_chars_show] + '...'
                    if len(replace_with) > max_chars_show: replace_with = replace_with[ 0 : max_chars_show] + '...'
                    sample_repl_pairs_summary_list.append('"{fw}" with "{rw}"'.format(fw=find_what, rw=replace_with))
                if num_sub_pairs_above_3 > 0: sample_repl_pairs_summary_list.append('(and {} more)'.format(num_sub_pairs_above_3))
        
                search_folder_top_level_path = substitutions_list_file_path.rsplit(os.sep, 1)[0] + os.sep
                self.print('search_folder_top_level_path:', search_folder_top_level_path)
        
                if not self.yes_no('\r\n\r\n'.join([
                        'Q1 of 4:\r\n',
                        'Perform these replacements (specified in the active file content):',
                        '\r\n'.join(sample_repl_pairs_summary_list) + '\r\n',
                        'in the files in this folder?',
                        search_folder_top_level_path,
                        '-' * 60,
                        'IT IS STRONGLY SUGGESTED TO MAKE A BACKUP',
                        'OF ALL SOURCE FILES BEFORE RUNNING THIS!',
                        ])):
                    return
        
                process_subfolders = self.yes_no_cancel('\r\n\r\n'.join([
                    'Q2 of 4:\r\n',
                    'Do replacements in files in SUBFOLDERS of this folder also?',
                    search_folder_top_level_path,
                    ]))
                if process_subfolders is None: return  # user cancel
                self.print('process_subfolders:', process_subfolders)
        
                default_filespec = '*.txt'
                filter_input = self.prompt(
                    'Q3 of 4:\r\n' + \
                    'Supply filespec filter list         (example:    *.html *.txt *.log    )',
                    default_filespec)
                if filter_input is None: return  # user cancel
                filters_list = filter_input.split(' ')
                filters_list = [ f for f in filters_list if len(f) > 0 ]  # remove any empty entries in filters_list
                self.print('filters_list:', filters_list)
        
                pathnames_of_files_to_replace_in_list = []
                for (root, dirs, files) in os.walk(search_folder_top_level_path):
                    for filt in filters_list:
                        for p in glob.glob(os.path.join(root, filt)):
                            if p != substitutions_list_file_path:
                                pathnames_of_files_to_replace_in_list.append(p)
                    if not process_subfolders: break
                if len(pathnames_of_files_to_replace_in_list) == 0:
                    self.mb('No files matched specified filter(s)')
                    return
        
                num_files_to_examine = len(pathnames_of_files_to_replace_in_list)
        
                if not self.yes_no('\r\n\r\n'.join([
                        'Q4 of 4:\r\n',
                        '---- FINAL CONFIRM ----\r\n',
                        'Make replacements in {nfe} candidate files in this folder{b} ?'.format(
                            nfe=num_files_to_examine,
                            b=' AND below' if process_subfolders else '\r\n(but not its subfolders)'),
                        search_folder_top_level_path,
                        ])):
                    return
        
                pathname_currently_open_in_a_tab_list = []
                for (pathname, buffer_id, index, view) in notepad.getFiles():
                    if pathname not in pathname_currently_open_in_a_tab_list:
                        pathname_currently_open_in_a_tab_list.append(pathname)
        
                num_repl_made_in_all_files = 0
                pathnames_with_content_changed_by_repl_list = []
        
                for pathname in pathnames_of_files_to_replace_in_list:
        
                    if pathname in pathname_currently_open_in_a_tab_list:
                        self.print('switching active tab to', pathname)
                        notepad.activateFile(pathname)
                        editor.beginUndoAction()
                    else:
                        self.print('opening', pathname)
                        notepad.open(pathname)
                    if notepad.getCurrentFilename() != pathname: continue  # shouldn't happen
        
                    for (find_what, replace_with) in find_and_repl_match_list:
        
                        # since the editor.replace() function won't tell us how many replacements it made,
                        #  count them by searching for the matches BEFORE doing the replacement
                        self.num_repl_made_in_this_file = 0
                        def match_found(m): self.num_repl_made_in_this_file += 1
                        editor.search(find_what, match_found)
        
                        if self.num_repl_made_in_this_file > 0:
        
                            self.print('replacing "{fw}" with "{rw}" {n} times'.format(
                                fw=find_what, rw=replace_with, n=self.num_repl_made_in_this_file))
        
                            num_repl_made_in_all_files += self.num_repl_made_in_this_file
        
                            if pathname not in pathnames_with_content_changed_by_repl_list:
                                pathnames_with_content_changed_by_repl_list.append(pathname)
        
                            # FINALLY, the actual replacement!
                            editor.replace(find_what, replace_with)
        
                    if pathname in pathname_currently_open_in_a_tab_list:
                        editor.endUndoAction()
                    else:
                        if editor.getModify():
                            self.print('saving', pathname)
                            notepad.save()
                        self.print('closing', pathname)
                        notepad.close()
        
                # restore tab that was active before we started
                notepad.activateFile(substitutions_list_file_path)
        
                self.mb('\r\n\r\n'.join([
                    '---- DONE! ----',
                    '{nr} total replacements made in {nrf} files'.format(nr=num_repl_made_in_all_files,
                        nrf=len(pathnames_with_content_changed_by_repl_list)),
                    '(of {nfe} files matching filters provided)'.format(nfe=num_files_to_examine),
                    ]))
        
            def print(self, *args):
                if self.debug:
                    print('RIFFLIAT:', *args)
        
            def mb(self, msg, flags=0, title=''):  # a message-box function
                return notepad.messageBox(msg, title if title else self.this_script_name, flags)
        
            def yes_no(self, question_text):
                answer = self.mb(question_text, MESSAGEBOXFLAGS.YESNO, self.this_script_name)
                return answer == MESSAGEBOXFLAGS.RESULTYES
        
            def yes_no_cancel(self, question_text):
                retval = None
                answer = self.mb(question_text, MESSAGEBOXFLAGS.YESNOCANCEL, self.this_script_name)
                if answer == MESSAGEBOXFLAGS.RESULTYES: retval = True
                elif answer == MESSAGEBOXFLAGS.RESULTNO: retval = False
                return retval
        
            def prompt(self, prompt_text, default_text=''):
                if '\n' not in prompt_text: prompt_text = '\r\n' + prompt_text
                prompt_text += ':'
                return notepad.prompt(prompt_text, self.this_script_name, default_text)
        
        #-------------------------------------------------------------------------------
        
        if __name__ == '__main__': RIFFLIAT()
        

        For basic information about setting up scripting, see the REFERENCE I provided in an earlier post in this thread.

        (And BTW, thanks to @PeterJones for some prerelease testing on this!)

          Calvin Foo @Alan Kilborn

          @Alan-Kilborn Thanks, but wow, this seems a bit too complicated for me.
          Is it possible to simplify it to just read an Excel sheet, find the A2 value, and replace it with the B2 value in every text file that is open in NPP?

          I could just add new words to the Excel file and then run the script.

            PeterJones @Calvin Foo

            @Calvin-Foo ,

            Most of the complication is just setting up Python Script plugin and installing the script once.

            After that, you just have to run the script while a file like this one is active:

            blue->orange
            replace this->with this
            the delimiter is->a minus followed by a greater than
            

            It’s really not hard to create that substitution file.

            Having the replacement data in Excel would make it harder for Alan to write a script for you (and this forum is not a code-writing service). And because the script would still be written in PythonScript, you would still have to install PythonScript and install that script once. Running it would be just as easy for you whether the search->replace pairs come from a text file or from an external Excel spreadsheet that the script has to parse (easier, actually, with the text file, because then you don’t have to also run Excel just to prepare for a search-and-replace in Notepad++). He would still need all those confirmation dialogs whether the map is in Excel or in Notepad++… and he’d need one more dialog asking where the Excel spreadsheet is.

            He wrote this not just for you, but also for all the other people who ask for nearly the same functionality (we’ve seen similar requests a lot over the years, and he finally decided that we needed one generic script to handle them all, so that we’d stop having to write customized scripts for each user). If using this generic script is too complicated for you, you will not like any implementation that anyone here is able to give you.

            Good luck.

              Terry R @Calvin Foo

              @Calvin-Foo said in Massive list and massive search and replace?:

              Is it possible to simplify it to just read an Excel sheet, find the A2 value, and replace it with the B2 value in every text file that is open in NPP?

              If you are referring to an Excel file with extension XLSX then no, Notepad++ does NOT read files which are binary in nature very well. It is after all a TEXT editor, not a Binary editor.

              However since you refer to an “Excel page” and refer to words in 2 columns, that could also be a CSV (comma separated value) file. And whilst @Alan-Kilborn has used a TSV (tab separated value) file, the 2 are very similar. He possibly could alter his code to use the comma instead of a tab, but possibly used the tab to prevent possible confusion within words.

              But making that minor change to his code isn’t going to simplify the process anyway. Just be thankful he has gone to such lengths to help you. Processes such as the one you outlined can sometimes be made easier, but they will still require time to set up.

              Terry

              PS I see @PeterJones has stated the same.

                Alan Kilborn @Terry R

                @Terry-R said in Massive list and massive search and replace?:

                And whilst @Alan-Kilborn has used a TSV (tab separated value) file

                Actually it isn’t a “tab”, although I can see why you’d think that. I chose - followed by > as the delimiter. The delimiter is specifically variable-ized in the script, so one could easily change it to whatever is desired.
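
To illustrate the delimiter point outside Notepad++, here is a rough sketch in plain Python (not PythonScript, and not part of the script above). The function name `parse_pairs` is made up for this example. The script joins the delimiter straight into a regular expression, so a delimiter that contains regex metacharacters (a hypothetical `|`, say) would presumably need escaping, which `re.escape` does here.

```python
import re

def parse_pairs(text, delimiter='->'):
    """Return (find_what, replace_with) tuples, one per well-formed line."""
    # re.escape keeps a delimiter such as '|' from being read as regex syntax
    pattern = re.compile(r'^(.+?)' + re.escape(delimiter) + r'(.+)$')
    pairs = []
    for line in text.splitlines():
        m = pattern.match(line)
        if m:  # lines without the delimiter are silently skipped
            pairs.append((m.group(1), m.group(2)))
    return pairs

sample = "blue->orange\nreplace this->with this\nno delimiter here\n"
print(parse_pairs(sample))
print(parse_pairs("blue|orange", delimiter='|'))
```

Because the first group is non-greedy, a line with the delimiter in it twice is split at the first occurrence.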

                  Alan Kilborn @Calvin Foo

                  @Calvin-Foo said in Massive list and massive search and replace?:

                  Is it possible to simplify it to just read an Excel sheet, find the A2 value, and replace it with the B2 value in every text file that is open in NPP?

                  I suppose it IS possible, but not by me.
                  The intent of the script was to solve kind of a general case problem, in a general way.
                  Of course anyone can treat it as a demo, and feel free to modify it at will.

                    Calvin Foo

                    I guess I need to learn from ground up. I only have experience in writing ASP 3.0 and SQL Server.

                    I guess I just start from there

                    Maybe anyone can give me a simple guide on How to write a simple replace text script? I’ll further study from there and include a list of text (maybe import from CSV)

                      Neil Schipper @Calvin Foo

                      @Calvin-Foo

                      There’s very, very, very little to learn. First, follow the instructions in Alan’s second post to you (“REFERENCE” link) LIKE A MONKEY.

                      Here’s a first script you can run:

                      #! python
                      import sys
                      print "Old style print syntax"
                      sys.stdout.write("Calvin-Foo's first script -- hello from Python %s\n" % (sys.version,))
                      

                      You don’t need to know what any of the lines mean or what they do. (Once you get it running, you can hack at it for fun).

                      There’s a lot of complexity in airplanes and elevators and keyboards and phones that you don’t see and don’t need to deal with. It’s really quite similar.

                        Alan Kilborn @Calvin Foo

                        @Calvin-Foo said in Massive list and massive search and replace?:

                        I guess I need to learn from ground up.
                        I guess I just start from there
                        I’ll further study from there and include a list of text (maybe import from CSV)

                        In case it isn’t obvious, you could take YOUR data, in whatever (textual) format, and use Notepad++ to change it with a replacement operation into MY demo format, and then just run with the demo solution.

                        This avoids programming and could (probably) be made into a N++ macro for easy repetitive running.

                        As an example, take your original problem statement data (I know it isn’t your real data, but we have none of that here, so…):

                        1. james - James
                        2. calvin - Calvin
                        3. new york - New York
                        

                        You could change that into the needed input format for the script by this operation:

                        Find: (?-s)^\d+\. (.+?) - (.+)
                        Replace: ${1}->${2}
                        Search mode: Regular expression
                        Wrap around: Checked
                        Action: Replace All button

                        After the replace-all, your data would then look like this:

                        james->James
                        calvin->Calvin
                        new york->New York
                        

                        which would be a direct feed-in to the demo script.
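
For anyone who would rather do the same transform in a script, here is a minimal sketch with Python's standard `re` module (plain Python, not the Notepad++ Replace dialog). One difference to be aware of: Python writes the group references as `\1` and `\2` rather than `${1}` and `${2}`, and it needs the `(?m)` flag so that `^` and `$` match at every line.

```python
import re

numbered = "1. james - James\n2. calvin - Calvin\n3. new york - New York\n"

# (?m) makes ^ and $ match at each line; '.' does not cross line ends by default
converted = re.sub(r'(?m)^\d+\. (.+?) - (.+)$', r'\1->\2', numbered)
print(converted)
```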


                        this seems a bit too complicated for me

                        Of course, this data transform may involve learning some regular expressions (don’t know your expertise), but I don’t think it is too much to ask people that request a moderately-complex solution to a problem to do some sort of learning of their own along the way.


                        How to write a simple replace text script?

                        Well about the simplest one I can think of would be a one-liner:

                        editor.replace('apple', 'Apple')
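
Going from that one-liner to a list of pairs is just a loop. The sketch below is only an illustration: `FakeEditor` is a made-up stand-in so the example can run outside Notepad++, and `apply_pairs` is a hypothetical helper name. Inside PythonScript you would pass the real `editor` object instead.

```python
class FakeEditor(object):
    """Hypothetical stand-in for PythonScript's `editor`; holds text in a string."""
    def __init__(self, text):
        self.text = text

    def replace(self, find_what, replace_with):
        # plain-text replacement of every occurrence
        self.text = self.text.replace(find_what, replace_with)

def apply_pairs(ed, pairs):
    # pairs are applied in order, so an earlier pair can affect a later one
    for find_what, replace_with in pairs:
        ed.replace(find_what, replace_with)

demo = FakeEditor("an apple and a pear")
apply_pairs(demo, [('apple', 'Apple'), ('pear', 'Pear')])
print(demo.text)
```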

                          nerdyone255

                          this script is FANTASTIC.

                          i do have a question though- in the final output it prints how many replacements were made, but is there any way to see what the actual replacements were?

                          looking into the code i see

                          if self.num_repl_made_in_this_file > 0:

                              self.print('replacing "{fw}" with "{rw}" {n} times'.format(
                                  fw=find_what, rw=replace_with, n=self.num_repl_made_in_this_file))
                          

                          but i dont see where that would get printed- it doesnt show up in the python console either

                          again huge thanks for this one

                            Alan Kilborn @nerdyone255

                            @nerdyone255 said :

                            this script is FANTASTIC.

                            Well…glad you like it.

                            but i dont see where that would get printed- it doesnt show up in the python console either

                            The self.print() function calls are really meant as debug helpers while testing the script. Thus, in the version of the script above, they don’t do anything because the debug variable is set to False. If you change the 0 to a 1 in this line:

                            self.debug = True if 0 else False

                            or simply change it to:

                            self.debug = True

                            then the output of the self.print() calls will go to the PythonScript console window. You’ll see the output you indicated you were interested in, plus output from other things that happen while the script is running.

                              nerdyone255 @Alan Kilborn

                              @Alan-Kilborn perfect!

                                Yurble Vương @Alan Kilborn

                                @Alan-Kilborn

                                Thanks Alan, it works perfectly, except for one case that is specific to me. I would appreciate your help if possible:

                                I want to match only at word boundaries.


                                For example:
                                Sentence: You are eating apple. The tree have a lot of apples. all the apples is green.
                                apple->cherry
                                apples->cherries


                                Hence, how can I add code to make it change only words that start or end at a transition from a space to a non-space character (space, comma, dot, quote marks, question mark…)?

                                thanks in advance
                                Yurble

                                  Alan Kilborn @Yurble Vương

                                  @Yurble-Vương said in Massive list and massive search and replace?:

                                  how can I add code to make it change only words that start or end at a transition from a space to a non-space character

                                  You can use \b in the regular expression to insist upon a word boundary; for example, \bapple will match in “have an apple today” but will not match in “have a crabapple today”.
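
The effect of `\b` can be tried outside Notepad++ with Python's standard `re` module. This is only a sketch: Notepad++ uses a different regex engine, though `\b` is expected to behave the same way for a simple case like this. The helper name `replace_whole_words` is made up for the example. Without the boundaries, replacing `apple` first would turn `apples` into `cherrys`.

```python
import re

def replace_whole_words(text, pairs):
    for find_what, replace_with in pairs:
        # re.escape treats find_what as literal text; \b pins both ends to word boundaries
        pattern = r'\b' + re.escape(find_what) + r'\b'
        # a function replacement keeps any backslashes in replace_with literal
        text = re.sub(pattern, lambda m, rw=replace_with: rw, text)
    return text

sentence = "You are eating apple. The tree have a lot of apples."
pairs = [('apple', 'cherry'), ('apples', 'cherries')]
print(replace_whole_words(sentence, pairs))

print(re.search(r'\bapple', 'have an apple today') is not None)   # boundary before "apple"
print(re.search(r'\bapple', 'have a crabapple today') is not None)  # no boundary inside "crabapple"
```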

                                      Yurble Vương @Alan Kilborn

                                      @Alan-Kilborn

                                      Sorry to bother you. I tried to edit your Python code as below, but it seems to be wrong. Could you advise how to correct it?

                                      Not work:

                                           # FINALLY, the actual replacement!
                                                      editor.rereplace('\b'+find_what+'\b', replace_with)
                                      

                                      Not work 2:

                                           # FINALLY, the actual replacement!
                                                      editor.rereplace(r'\b'+find_what+'\b', replace_with)
                                      
                                        Alan Kilborn @Yurble Vương

                                        @Yurble-Vương said :

                                        Not work 2

                                        This should work: editor.rereplace(r'\b'+find_what+r'\b', replace_with)
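
The reason the `r` prefix matters on both pieces: in an ordinary Python string, `'\b'` is a single backspace character (0x08), not the two characters backslash and `b` that the regex engine needs. A small sketch in plain Python (using the standard `re` module rather than `editor.rereplace`) shows the three variants side by side.

```python
import re

find_what = 'apple'
broken  = '\b' + find_what + '\b'     # both ends are backspace characters
half    = r'\b' + find_what + '\b'    # trailing end is still a backspace
working = r'\b' + find_what + r'\b'   # both ends are regex word boundaries

text = 'have an apple today'
print(repr('\b'), repr(r'\b'))
print(re.search(broken, text))    # no match: the text contains no backspace
print(re.search(half, text))      # no match, for the same reason
print(re.search(working, text).group(0))
```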

                                        I like how you showed initiative in trying to solve the problem yourself…and you had the right idea, you just didn’t take it far enough.

                                          Yurble Vương @Alan Kilborn

                                          @Alan-Kilborn

                                          Many thanks for the help.
