    How to replace a particular "url" with "url" of each webpage in multiple files using Notepad++?
    24 Posts 5 Posters 3.4k Views
    • Ramanand JhingadeR
      Ramanand Jhingade @Ramanand Jhingade
      last edited by Ramanand Jhingade

      @Ramanand-Jhingade The name of each file should appear at the end of this “url” for that particular file.
      If I have to use the “Regular expression”, should I uncheck the “matches newline”?
      Your system is not letting me edit my question as 3 minutes have passed, so I am adding it here

      • Alan KilbornA
        Alan Kilborn @Ramanand Jhingade
        last edited by

        @Ramanand-Jhingade

        You’d probably better show some exact “before” and “after” data, to make sure your need is clear.

        Type/paste your data here, select it, then press the </> symbol in the little toolbar above the compose window. If you do this, your data will be clear and it will look something like this:

        this
        is
        my
        data,
        properly
        formatted
        
        • Ramanand JhingadeR
          Ramanand Jhingade @Alan Kilborn
          last edited by

          @Alan-Kilborn I want to search and add the name of multiple files to a url mentioned in each file. I tried what is mentioned here: https://community.notepad-plus-plus.org/topic/17035/batch-function-need-to-add-filename-at-the-end-of-each-paragraph/4 but it only adds a “4”

          • Ramanand JhingadeR
            Ramanand Jhingade @Alan Kilborn
            last edited by

            @Alan-Kilborn
            example.in to example.in/filename

            • Ramanand JhingadeR
              Ramanand Jhingade @Ramanand Jhingade
              last edited by Ramanand Jhingade

              @Ramanand-Jhingade I hope it will add the “.htm” or “.html” at the end for each file (the file names are alpha-numeric). Please also let me know if I should uncheck the “matches newline” option in Regular expression mode.

              • Alan KilbornA
                Alan Kilborn @Ramanand Jhingade
                last edited by

                @Ramanand-Jhingade

                I can’t tell what you are talking about…so I’m out.
                Perhaps someone else will come along that does understand you.

                • Ramanand JhingadeR
                  Ramanand Jhingade @Alan Kilborn
                  last edited by Ramanand Jhingade

                  @Alan-Kilborn

                  example.in to example.in/filename
                  

                  By filename I meant the files with names such as 102_Anal-Fissure.html, Alopecia.html, Anal-fissure.html, fissure.htm, anal-warts.htm, anemia.html, ankylosing-spondylitis.htm, arthritis.htm, asthma.htm, autism.htm and so on. The name of each file should appear in that file just after the url, using “Search” and “Replace” in “Regular expression” mode.

                  • Ramanand JhingadeR
                    Ramanand Jhingade @Alan Kilborn
                    last edited by Ramanand Jhingade

                    @Alan-Kilborn I tried what is advised here: https://community.notepad-plus-plus.org/topic/17035/batch-function-need-to-add-filename-at-the-end-of-each-paragraph/10 but it is not working. What user “@Meta-Chuh” advised there just adds a “4” instead of the name of the file, what user “@Terry R” advised there replaced a whole lot of matter with gibberish, what user “@guy038” advised there also replaced a whole lot of matter with gibberish, so what to do now? How to get a reply to my query above?

                    • Ramanand JhingadeR
                      Ramanand Jhingade @Alan Kilborn
                      last edited by

                      @Alan-Kilborn Can Notepad++ do what is asked here: https://www.codeproject.com/Questions/1258369/Replace-string-value-in-a-file-with-filename - it is exactly what I want?

                      • Terry RT
                        Terry R
                        last edited by

                        @Ramanand-Jhingade said in How to replace a particular "url" with "url" of each webpage in multiple files using Notepad++?:

                        How to get a reply to my query above?

                        I am happy to help, since my solution in that link was one you tried. However as @Alan-Kilborn stated, you need to provide at least a portion of one of your files within the black window. Provide the portion containing the line (and some lines either side) you want to add the filename to and show both a before and after look so we know what you intend to do. So far you have not.

                        The solution in that link was to add filename to all lines and your request is different so there will be some changes necessary to the regex.

                        Terry

                        • Ramanand JhingadeR
                          Ramanand Jhingade @Terry R
                          last edited by

                          @Terry-R Let me try to explain. I have multiple files; some of their names are 201_Alopecia.html, Anal-fissure.html, fissure.htm, anal-warts.htm, anemia.html, ankylosing-spondylitis.htm, arthritis.htm, asthma.htm, autism.htm and so on. I want the file name of each file to get added to the end of the url in this meta tag: <link rel"canonical" href="https://example.in" />

                          Change <link rel"canonical" href="https://cure4incurables.in" /> to <link rel"canonical" href="https://cure4incurables.in/201_Alopecia.html" /> in that file
                          Change <link rel"canonical" href="https://cure4incurables.in" /> to <link rel"canonical" href="https://cure4incurables.in/Anal-fissure.html" /> in that file and so on
                          • PeterJonesP
                            PeterJones @Ramanand Jhingade
                            last edited by PeterJones

                            @Ramanand-Jhingade ,

                            Saying

                            example.in to example.in/filename
                            

                            does not give us a good idea of your before and after data. The “before” and “after” data need to come in separate boxes. Otherwise, how are we supposed to know what’s “before” and what’s “after”?

                            What I think you are saying is that you have multiple files, for example a.html, b.html, and c.html. The contents of each file look like

                            • a.html
                              the contents
                              of this file
                              with URL:
                              example.in
                              
                            • b.html
                              similar
                              contents
                              for
                              a second file
                              with link:
                              example.in and maybe some text after
                              
                            • c.html
                              different contents for my third file with example.in
                              

                            And you want to transform that to

                            • a.html
                              the contents
                              of this file
                              with URL:
                              example.in/a.html
                              
                            • b.html
                              similar
                              contents
                              for
                              a second file
                              with link:
                              example.in/b.html and maybe some text after
                              
                            • c.html
                              different contents for my third file with example.in/c.html
                              

                            The short answer is you cannot do that with only regular expressions. The problem is that the regular expression engine does not know the name of the current file… it only has access to the contents of the current file.

                            Thus, in that other thread, @Terry-R advised the person who asked to use a macro, which allows you to save a sequence of keystrokes and apply the saved steps later; he then gave the contents of the macro, which would have to be saved in shortcuts.xml. @guy038 chimed in with an optimised version of the macro, which used fewer steps. The gist of either of those macros is a sequence of keystrokes / menu commands which

                            1. went to the end of the file,
                            2. added a newline sequence (CRLF), then
                            3. used the macro instruction to use Edit > Copy to Clipboard > Current Filename to Clipboard, which puts the current filename in the clipboard, then
                            4. pasted that text at the current file position (which is the end of the file).

                            For your problem, where you want it to go after the base URL, rather than going to the end of file or the end of every line, you’ll need a slightly different algorithm

                            1. search for example.in
                            2. move the cursor to after the found text (so you aren’t overwriting)
                            3. add the / separator
                            4. copy the name of the file into clipboard
                            5. paste the clipboard contents at the current position
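For readers who would rather script these steps than click through them, here is a minimal Python sketch of the same algorithm applied to a file's text. The function name `append_filename_to_url` and the default base URL are illustrative assumptions, not anything built into Notepad++. It only touches the first occurrence, and it leaves the text alone if the filename is already there, so running it twice should be harmless.

```python
def append_filename_to_url(text: str, filename: str, base_url: str = "example.in") -> str:
    """Insert '/<filename>' right after the first occurrence of base_url.

    Mirrors the five steps: search, move past the match, add '/',
    fetch the filename, insert it at that position.
    """
    pos = text.find(base_url)            # step 1: search for the base URL
    if pos == -1:
        return text                      # no URL in this file, nothing to do
    insert_at = pos + len(base_url)      # step 2: position just after the match
    addition = "/" + filename            # steps 3 and 4: separator plus filename
    if text.startswith(addition, insert_at):
        return text                      # already done, avoid adding it twice
    return text[:insert_at] + addition + text[insert_at:]  # step 5: insert
```

Called as `append_filename_to_url(contents, "a.html")`, it would turn `example.in` into `example.in/a.html` in the string it returns; reading and writing the files is left to the caller.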

                            I’d actually combine the first two steps to be a single regular expression, like FIND=(?<=example\.in), which finds the text example.in, but will leave the cursor after that text.
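To see why the lookbehind leaves the cursor after the text, it can help to try the same pattern outside Notepad++. This is a small sketch using Python's `re` module (a different regex engine from the one in Notepad++, but it treats this fixed-width lookbehind the same way); `insert_after_url` is a made-up helper name.

```python
import re

# Zero-width lookbehind: matches the empty position right after "example.in".
AFTER_URL = re.compile(r"(?<=example\.in)")

def insert_after_url(text: str, addition: str) -> str:
    """Insert `addition` at the first position that is preceded by 'example.in'."""
    # A function replacement is used so `addition` is inserted literally.
    return AFTER_URL.sub(lambda m: addition, text, count=1)
```

The match has zero length (its start and end are the same position), so nothing is overwritten; the replacement is simply inserted there. The escaped dot means a string like `exampleXin` is not matched.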

                            If you wanted to record this macro yourself, you would

                            1. Macro > Start Recording
                            2. Search > Find (or Ctrl+F), and set
                              FIND = (?<=example\.in)
                              SEARCH MODE = regular expression
                            3. Find Next
                            4. Close
                            5. type the / key
                            6. Edit > Copy to Clipboard > Current Filename to Clipboard
                            7. Edit > Paste (or Ctrl+V)
                            8. Macro > Stop Recording
                            9. Macro > Save Current Recorded Macro, and give it a name (like Append Filename after example.in).

                            Now, to use that macro, open up the next file you want to edit, use Macro > Append Filename after example.in, and it will run it for you. You will have to open each file and run that macro.

                            If you don’t want to record the macro yourself, then you can use the procedure:

                            1. File > Open %AppData%\Notepad++\shortcuts.xml

                            2. copy the following contents and paste them as the last entry before the </macros> line

                               <Macro name="Append Filename after example.in" Ctrl="no" Alt="no" Shift="no" Key="0">
                                   <Action type="3" message="1700" wParam="0" lParam="0" sParam="" />
                                   <Action type="3" message="1601" wParam="0" lParam="0" sParam="(?&lt;=example\.in)" />
                                   <Action type="3" message="1625" wParam="0" lParam="2" sParam="" />
                                   <Action type="3" message="1702" wParam="0" lParam="768" sParam="" />
                                   <Action type="3" message="1701" wParam="0" lParam="1" sParam="" />
                                   <Action type="1" message="2170" wParam="0" lParam="0" sParam="/" />
                                   <Action type="2" message="0" wParam="42030" lParam="0" sParam="" />
                                   <Action type="0" message="2179" wParam="0" lParam="0" sParam="" />
                               </Macro>
                              
                            3. save shortcuts.xml

                            4. exit Notepad++

                            5. open Notepad++

                            At this point, you can use Macro > Append Filename after example.in the same as before to run that macro.

                            This is my best guess. If it’s not sufficient, please provide the information that Alan has asked for multiple times and Terry has now asked for as well.

                            (Based on your most recent example, made long after I started my reply, I think I’ve interpreted right. But your example before and after data are still in the same black box, and you seem to be putting the contents of multiple files in the same black box, which makes it really hard to be sure.)

                            • Ramanand JhingadeR
                              Ramanand Jhingade @PeterJones
                              last edited by Ramanand Jhingade

                              @PeterJones Yes, you got it right. I want Notepad++ to find and add the file name of each file to the end of the url mentioned in the Meta canonical tag as above

                              Change <link rel"canonical" href="https://cure4incurables.in" /> to <link rel"canonical" href="https://cure4incurables.in/201_Alopecia.html" /> in that file
                              Change <link rel"canonical" href="https://cure4incurables.in" /> to <link rel"canonical" href="https://cure4incurables.in/Anal-fissure.html" /> in that file
                              Change <link rel"canonical" href="https://cure4incurables.in" /> to <link rel"canonical" href="https://cure4incurables.in/adhd.htm" /> in that file and so on
                              
                              • Ramanand JhingadeR
                                Ramanand Jhingade @PeterJones
                                last edited by

                                @PeterJones 201_Alopecia.html, Anal-fissure.html, adhd.htm, asthma.htm etc. are the file names

                                • Ramanand JhingadeR
                                  Ramanand Jhingade @PeterJones
                                  last edited by

                                  @PeterJones If I have to open each file, I might as well add the file name to that meta tag in each file manually - tell me something easier. Thanks for making time to answer

                                  • Terry RT
                                    Terry R
                                    last edited by

                                    @Ramanand-Jhingade said in How to replace a particular "url" with "url" of each webpage in multiple files using Notepad++?:

                                    If I have to open each file, I might as well add the file name to that meta tag in each file manually - tell me something easier.

                                    If you read my post in that link then you would know that in my solution each file DID need to be opened in Notepad++. However, that is not the only method of adding the filename to a text (or html) file. You can use a DOS command similar to:
                                    for %1 in (*.txt) do echo %1 >> %1
                                    For this to work, the current directory must be the one containing the html files, and you would replace (*.txt) with (*.html) so that it covers all the html files you want to edit. If some of those files are not to be included you would need to change that filter to exclude them; for example, (t*.html) would only work on html files that have a name starting with t.

                                    It adds the filename directly behind the last character in the file, so if there isn’t a blank line as the last line you will find the filename adds itself to the end of some characters and might be difficult to find with a regex at a later stage.
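One way around the "glued onto the last line" problem is to add a line break first whenever the file does not already end with one. A minimal Python sketch of that idea, working on the file's text; `append_filename_line` is a hypothetical helper, and the CRLF default is an assumption for Windows-style files.

```python
def append_filename_line(content: str, filename: str, newline: str = "\r\n") -> str:
    """Return content with the filename appended on a line of its own."""
    # Only add a separator if the existing text does not end with a line break.
    if content and not content.endswith(("\n", "\r")):
        content += newline
    return content + filename + newline
```

With this, a file whose last line is a URL ending in `10472` would get `1.txt` on the following line instead of ending in `104721.txt`.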

                                    No doubt you can see it isn’t as easy as you think. Whatever method you use, you will have some steps involved to achieve it. If you really only have a small number of files to edit, then you might be better off doing it manually.

                                    Generally the use of macros and regex are for when a task is performed over and over again and/or for a large number of files. It does sound like your task is neither.

                                    Terry

                                    • Ramanand JhingadeR
                                      Ramanand Jhingade @Terry R
                                      last edited by Ramanand Jhingade

                                      @Terry-R I have about 300 files, so I would certainly want to try what you are saying. If I open Command Prompt from the Windows “Start” menu and select/change to the directory containing these html files (using the command “cd folder name”), will the command “for %1 in (*.txt) do echo %1 >> %1” get the file name of each file and add it to the “last character in the file” as you say?

                                      • Terry RT
                                        Terry R
                                        last edited by

                                        @Ramanand-Jhingade said in How to replace a particular "url" with "url" of each webpage in multiple files using Notepad++?:

                                        will the command “for %1 in (*.txt) do echo %1 >> %1” get the file name of each file and add it to the “last character in the file” as you say?

                                        Well in a test it did it for me. As I say if the last line of your file contains text then doing this will add the filename directly behind the text. Similar to:

                                        https://community.notepad-plus-plus.org/topic/10470
                                        https://community.notepad-plus-plus.org/topic/10471
                                        https://community.notepad-plus-plus.org/topic/104721.txt
                                        

                                        Note the last line has the number followed by the filename, which is 1.txt. So a possibility is to do the echo command twice. The first time, use it to add 2 special characters such as @@; then run it again with the %1 so the filename is added. In this situation you will get the @@ directly against the last characters within the file, then a space and/or line feed (I think), then the filename.

                                        Once that’s completed then a regex would need to be constructed to find the line to add the filename, then look ahead, grab the filename and copy it to the current position. Then the regex would (as in my solution) perform a last step of erasing the additions (@@ and filename). That regex you want is NOT the one in my solution, but something similar.

                                        So in your example above, is the line that needs changing (starting with <link rel"canonical") the only one like that? I ask because the regex needs to be able to correctly identify it. If more than one line is similar, how do you identify the exact one amongst the other similar lines?

                                        Terry

                                        • PeterJonesP
                                          PeterJones @Ramanand Jhingade
                                          last edited by

                                          @Ramanand-Jhingade said in How to replace a particular "url" with "url" of each webpage in multiple files using Notepad++?:

                                          I have about 300 files, so I would certainly want to try what you are saying.

                                          Quite honestly, if you’ve got that many files, and want to just add the filename into each of the files based on the same literal text, I don’t know why you didn’t use the powershell solution you already found in the https://www.codeproject.com/Questions/1258369/Replace-string-value-in-a-file-with-filename link you already shared. This isn’t a powershell forum, so you can find somewhere else if you want more help with doing it that way.

                                          Based on what you’ve said, use the cmd script that Terry handed you (even though this isn’t a cmd forum) – including the @@ echo before the filename echo (as long as @@ isn’t anywhere in your files).

                                          for %1 in (*.txt) do echo @@ >> %1
                                          for %1 in (*.txt) do echo %1 >> %1
                                          

                                          Then use Notepad++'s Find in Files to search for (?s)(<link rel"canonical" href=".*?)(".*)@@(.*) and replace with $1/$3$2 (this works by putting the <link rel"canonical" href=" in group1, the bulk of the file in group2, and the filename in group3). However, this won’t work if your file is too big, so that group3 takes up too much memory – a 1MB or 16MB file worked for me; wow, even 100MB worked – though it took a long time to complete.
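The group shuffling can be tried on a small string before running it on real files. Below is a Python sketch of the same idea, not the exact Find in Files expression: it assumes each `echo` leaves a trailing space and a line break (as Terry suspected), so it matches the filename with `\S+` and skips the whitespace around it. That trimming is an adjustment made here and relies on filenames containing no spaces. `move_filename_into_link` is a made-up name.

```python
import re

# group 1: the link tag up to the end of the URL
# group 2: the closing quote plus the rest of the file, up to the marker
# group 3: the filename that was echoed after the @@ marker
MARKER_PATTERN = re.compile(
    r'(<link rel"canonical" href=".*?)(".*)@@\s*(\S+)\s*\Z',
    re.DOTALL,  # same role as (?s): let "." run across line breaks
)

def move_filename_into_link(text: str) -> str:
    """Put the trailing filename after the URL and drop the @@ marker lines."""
    return MARKER_PATTERN.sub(r"\1/\3\2", text, count=1)
```

Text without the marker (or without the link tag) is returned unchanged, because the pattern then has nothing to match.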

                                          • guy038G
                                            guy038
                                            last edited by guy038

                                            Hello, @ramanand-jhingade, @alan-kilborn, @terry-r, @peterjones and All,

                                            @ramanand-jhingade, I’ve found a solution to achieve what you want! My method uses the Windows version of the Unix gawk utility program.

                                            It allows you to add the name of the current file before the string " /> ending any line which begins with the string <link rel"canonical" href=" in current file

                                            Here is the road map :

                                            • Open a DOS command prompt

                                            • cd /d <Absolute_Path to Folder containing ALL your *.HTM? files>

                                            • md resul ( create a sub-folder resul which will contain the same files as original ones, after modifications )

                                            • Double-click to the link below, to download the gawk-4.1.0-bin.zip archive

                                            https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/gnu-on-windows/gawk-4.1.0-bin.zip

                                            • Move this archive to the folder containing all your *.HTM? files

                                            • Double-click on the gawk-4.1.0-bin.zip archive

                                            • Extract the only program gawk.exe from the archive, into the folder containing all your *.HTM? files

                                            • Paste the long line below into your DOS command prompt and validate with the Enter key:

                                            for %F in (*.htm?) do @del "resul\%F" 2>nul & @gawk.exe -F"\x22 />" " /^<link rel\"canonical\" href=\x22/ {$1 = $1 \"/\" FILENAME FS} ; {print} " "%F" >> "resul\%F"

                                            • After complete execution, double-click on the resul folder

                                            => Within the resul folder, you should get the list of all original *.htm? files with the current filename added at the end of any line beginning with <link rel"canonical" href=", right before the string " />

                                            For instance, if current filename is Test.htm, any line, beginning with the string <link rel"canonical" href=", in current file, will be changed into <link rel"canonical" href="•••••/Test.htm" />, where the part ••••• represents the initial link of that line


                                            Notes :

                                            • Your original files are not changed at all !

                                            • You may re-run the line for %F in •••••••••• "resul\%F", without any problem, as the process deletes any current file, in resul folder, before rewriting it !
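If gawk is not available, roughly the same thing can be done with a short Python script. This is a sketch under a few assumptions: the files are UTF-8, the tag sits at the very start of its line and is written exactly as `<link rel"canonical" href="` (as in the examples in this thread), and the output folder is called `resul` as above. The function names are made up for illustration, and this is not a line-for-line port of the gawk program.

```python
from pathlib import Path

PREFIX = '<link rel"canonical" href="'   # start of the line to change
SUFFIX = '" />'                          # the filename goes right before this

def add_filename_to_canonical(line: str, filename: str) -> str:
    """Add '/<filename>' before the first '" />' on a canonical link line."""
    if line.startswith(PREFIX) and SUFFIX in line:
        head, tail = line.split(SUFFIX, 1)
        return f"{head}/{filename}{SUFFIX}{tail}"
    return line  # every other line is copied unchanged

def convert_folder(folder: str, out_name: str = "resul") -> None:
    """Write modified copies of all .htm/.html files into a sub-folder."""
    src = Path(folder)
    out = src / out_name
    out.mkdir(exist_ok=True)
    for path in sorted(src.iterdir()):
        if not path.is_file() or path.suffix.lower() not in {".htm", ".html"}:
            continue
        # Bytes in, bytes out, so the original line endings are preserved.
        lines = path.read_bytes().decode("utf-8").splitlines(keepends=True)
        changed = [add_filename_to_canonical(line, path.name) for line in lines]
        (out / path.name).write_bytes("".join(changed).encode("utf-8"))
```

As with the gawk version, the originals are left alone and the copies in `resul` are overwritten on each run, so `convert_folder(r"C:\path\to\your\files")` (a placeholder path) can be repeated safely.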

                                            Best Regards,

                                            guy038
