    How to replace a particular "url" with "url" of each webpage in multiple files using Notepad++?

    • guy038

      Hello, @ramanand-jhingade, @alan-kilborn, @terry-r, @peterjones and All,

      @ramanand-jhingade, I’ve found a solution that does what you want! My method uses the Windows version of the Unix gawk utility.

      It adds the name of the current file before the string " /> that ends any line beginning with the string <link rel"canonical" href=" in the current file.

      Here is the road map :

      • Open a DOS command prompt

      • cd /d <Absolute_Path to Folder containing ALL your *.HTM? files>

      • md resul ( create a sub-folder resul which will contain the same files as original ones, after modifications )

      • Click the link below to download the gawk-4.1.0-bin.zip archive

      https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/gnu-on-windows/gawk-4.1.0-bin.zip

      • Move this archive to the folder containing all your *.HTM? files

      • Double-click on the gawk-4.1.0-bin.zip archive

      • Extract the only program gawk.exe from the archive, into the folder containing all your *.HTM? files

      • Paste the long line below into your DOS command prompt and confirm with the Enter key:

      for %F in (*.htm?) do @del "resul\%F" 2>nul & @gawk.exe -F"\x22 />" " /^<link rel\"canonical\" href=\x22/ {$1 = $1 \"/\" FILENAME FS} ; {print} " "%F" >> "resul\%F"

      • After complete execution, double-click on the resul folder

      => Within the resul folder, you should get copies of all the original *.htm? files, with the current filename added at the end of any line beginning with <link rel"canonical" href=", right before the string " />

      For instance, if the current filename is Test.htm, any line in that file beginning with the string <link rel"canonical" href=" will be changed into <link rel"canonical" href="•••••/Test.htm" />, where the part ••••• represents the initial link of that line
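For readers without gawk, the transformation described above can be sketched in Python. This is only an illustration, not a line-for-line port of the gawk command: the names `add_filename` and `process_folder`, the UTF-8 encoding, and the choice to leave a line untouched when it has no `" />` are assumptions of this sketch. The prefix is copied literally from the post.

```python
from pathlib import Path

PREFIX = '<link rel"canonical" href="'  # literal line start, as written in the post
SUFFIX = '" />'                         # same string the gawk command uses as separator


def add_filename(line: str, filename: str) -> str:
    """Insert '/<filename>' right before the first '" />' of a canonical line."""
    if not line.startswith(PREFIX) or SUFFIX not in line:
        return line  # assumption: anything else is copied unchanged
    head, tail = line.split(SUFFIX, 1)
    return f"{head}/{filename}{SUFFIX}{tail}"


def process_folder(folder: Path, out_name: str = "resul") -> None:
    """Write modified copies of every .htm/.html file into folder/out_name."""
    out_dir = folder / out_name
    out_dir.mkdir(exist_ok=True)
    for src in sorted(folder.iterdir()):
        if not src.is_file() or src.suffix.lower() not in {".htm", ".html"}:
            continue
        # newline="" keeps the original line endings; UTF-8 is an assumption
        with open(src, encoding="utf-8", newline="") as fh:
            lines = fh.read().splitlines(keepends=True)
        with open(out_dir / src.name, "w", encoding="utf-8", newline="") as fh:
            fh.writelines(add_filename(line, src.name) for line in lines)
```

Calling `process_folder(Path("."))` from the folder that holds the files would overwrite any earlier copies in `resul`, so, like the gawk line, it can be re-run safely while the originals stay untouched.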


      Notes:

      • Your original files are not changed at all!

      • You may re-run the line for %F in •••••••••• "resul\%F" without any problem, as the process deletes any existing file in the resul folder before rewriting it!

      Best Regards,

      guy038

      • Ramanand Jhingade @guy038

        @guy038 I did everything you wrote above, but it copied all the files to the resul folder without adding the file names. I used Solution 2 mentioned at www.codeproject.com/Answers/5301640/How-to-find-and-add-the-file-name-of-each-file-in#answer2 and it added the file names, but without the “.htm” or “.html” extension. I have read what you have posted in other threads, so I think you can help me add the filenames with their “.htm” or “.html” extension. You can give me a method which uses Command Prompt or PowerShell. Thanks for your time and help!

        • Ramanand Jhingade @Ramanand Jhingade

          @Ramanand-Jhingade OK, in Solution 2 mentioned in the link above, I changed BaseName to FullName. I then used Notepad++ (“find” and “replace”) to remove the full path while keeping the filenames and their extensions. Thanks for all the help, guys!
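The code of Solution 2 is not shown in this thread, so the following is only an illustration of the naming difference, not a reconstruction of it. In PowerShell, BaseName is the file name without its extension, FullName is the complete path, and Name is the file name with its extension. The Python sketch below shows the same three forms; the helper `name_variants` and the path `C:\site\Test.htm` are hypothetical.

```python
from pathlib import PureWindowsPath


def name_variants(path_text: str) -> dict:
    """Return the three forms of a file name discussed above."""
    p = PureWindowsPath(path_text)
    return {
        "stem": p.stem,   # like BaseName: no extension
        "name": p.name,   # like Name: file name with extension
        "full": str(p),   # like FullName: complete path
    }


# Hypothetical path, used only for illustration
print(name_variants(r"C:\site\Test.htm"))
```

A Name-style value already carries the extension without the folder, so it would presumably avoid the extra step of stripping the path afterwards.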

          • guy038

            Hello, @ramanand-jhingade, @alan-kilborn, @terry-r, @peterjones and All,

            @ramanand-jhingade, sorry that my method did not work :-( I don’t understand why!

            It’s important to note that my method works ONLY IF the beginning of the “canonical” line is exactly:

            <link rel"canonical" href="
            ^
            |
            Beginning of line
            

            If this line is instead, for instance, any of the ones below:

            <link rel "canonical" href="
            <link rel"canonical"href="
            <link rel "canonical"href="
            
              <link rel "canonical" href="
              <link rel"canonical"href="
              <link rel "canonical"href="
            

            My method would not find the canonical line! If that is your case, just tell me!
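One way to cope with such variants is a more tolerant pattern. The sketch below is an assumption of mine, not part of the gawk method: it accepts optional leading whitespace, optional blanks between the parts, and an optional = after rel. The name `is_canonical_line` is hypothetical.

```python
import re

# Tolerant pattern (assumption): leading blanks, optional blanks between
# the parts, and an optional "=" between rel and "canonical" are accepted.
CANONICAL = re.compile(r'^\s*<link\s+rel\s*=?\s*"canonical"\s*href\s*=\s*"')


def is_canonical_line(line: str) -> bool:
    """Return True if the line starts with some variant of the canonical link."""
    return CANONICAL.match(line) is not None


variants = [
    '<link rel"canonical" href="',
    '<link rel "canonical" href="',
    '<link rel"canonical"href="',
    '  <link rel "canonical"href="',
]
for v in variants:
    print(repr(v), is_canonical_line(v))
```

Matching stays case-sensitive here, like the gawk regex; loosening that would be a further assumption.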


            I even ran tests with files containing space characters in their names, like This is a test.html, and it did create the correct line:

            <link rel"canonical" href="https://cure4incurables.in/This is a test.html" />

            Of course, in this specific case, in order to get functional links, we must replace each space character with the %20 syntax, thanks to the S/R below:

            SEARCH (?-i)(?:href="|(?!\A)\G)(?:(?!").)*?\K\x20

            REPLACE %20

            Using the Regular expression search mode and ticking the Wrap around option gives, after replacement, the functional line:

            <link rel"canonical" href="https://cure4incurables.in/This%20is%20a%20test.html" />
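The same replacement can be sketched outside Notepad++. The Python function below is a minimal illustration, not a translation of the regex above: it assumes the href value is closed by a double quote on the same line, and it only converts spaces, leaving every other character as it is. The name `encode_spaces_in_href` is hypothetical.

```python
import re

# Capture the opening href=", the value up to the next quote, and that quote
HREF = re.compile(r'(href=")([^"]*)(")')


def encode_spaces_in_href(line: str) -> str:
    """Replace each space inside an href="..." value with %20."""
    return HREF.sub(
        lambda m: m.group(1) + m.group(2).replace(" ", "%20") + m.group(3),
        line,
    )


line = '<link rel"canonical" href="https://cure4incurables.in/This is a test.html" />'
print(encode_spaces_in_href(line))
```

Spaces outside the href value, such as the one before />, are left alone because only the captured value is rewritten.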

            Best Regards,

            guy038
