How to replace a particular "url" with "url" of each webpage in multiple files using Notepad++?
-
Hello, @ramanand-jhingade, @alan-kilborn, @terry-r, @peterjones and All,
@ramanand-jhingade, I’ve found a solution to achieve what you want! My method uses the Windows version of the Unix gawk utility program. It allows you to add the name of the current file before the string " /> which ends any line beginning with the string <link rel"canonical" href=" in the current file.
Here is the road map:
- Open a DOS command prompt
- cd /d <Absolute_Path to Folder containing ALL your *.HTM? files>
- md resul ( creates a sub-folder resul, which will contain the same files as the original ones, after modification )
- Double-click on the link below to download the gawk-4.1.0-bin.zip archive
- Move this archive to the folder containing all your *.HTM? files
- Double-click on the gawk-4.1.0-bin.zip archive
- Extract only the program gawk.exe from the archive, into the folder containing all your *.HTM? files
- Paste the long line below into your DOS command prompt and validate with the Enter key:
for %F in (*.htm?) do @del "resul\%F" 2>nul & @gawk.exe -F"\x22 />" " /^<link rel\"canonical\" href=\x22/ {$1 = $1 \"/\" FILENAME FS} ; {print} " "%F" >> "resul\%F"
- After complete execution, double-click on the resul folder
=> Within the resul folder, you should get the list of all the original *.htm? files, with the current filename added at the end of any line beginning with <link rel"canonical" href=", right before the string " />
For instance, if the current filename is Test.htm, any line in the current file beginning with the string <link rel"canonical" href=" will be changed into <link rel"canonical" href="•••••/Test.htm" />, where the part ••••• represents the initial link of that line.
Notes:
- Your original files are not changed at all!
- You may re-run the line for %F in •••••••••• "resul\%F" without any problem, as the process deletes any current file in the resul folder before rewriting it!
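For readers who would rather not install gawk, the same idea can be sketched in Python. This is not the method from the post, only a rough equivalent under a few assumptions: the files are UTF-8, the prefix and suffix strings are exactly the ones quoted above, and the names add_filename and process_folder are made up for this illustration.

```python
from pathlib import Path

# Prefix and suffix copied verbatim from the post. Adjust them if your
# files spell the tag differently (for example rel="canonical").
PREFIX = '<link rel"canonical" href="'
SUFFIX = '" />'


def add_filename(line: str, filename: str) -> str:
    """Insert "/<filename>" right before the first SUFFIX of a canonical line."""
    if not line.startswith(PREFIX):
        return line
    head, sep, tail = line.partition(SUFFIX)
    if not sep:  # no closing '" />' on this line: leave it untouched
        return line
    return f"{head}/{filename}{sep}{tail}"


def process_folder(folder: Path, out_name: str = "resul") -> None:
    """Write a modified copy of every .htm/.html file into folder/out_name."""
    out_dir = folder / out_name
    out_dir.mkdir(exist_ok=True)
    sources = [p for pattern in ("*.htm", "*.html") for p in folder.glob(pattern)]
    for src in sources:
        # newline="" keeps the original line endings (CRLF or LF) intact.
        # UTF-8 is an assumption; change the encoding if your files differ.
        with open(src, encoding="utf-8", newline="") as fh:
            lines = fh.readlines()
        with open(out_dir / src.name, "w", encoding="utf-8", newline="") as fh:
            fh.writelines(add_filename(line, src.name) for line in lines)


# Example call, to be run from the folder holding the HTML files:
# process_folder(Path.cwd())
```

As with the gawk line, the originals are left alone and each output file is overwritten on every run, so the script can be re-run safely.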
Best Regards,
guy038
-
-
@guy038 I did whatever you wrote above, but it copied all the files to the resul folder without adding the names of the files. I used Solution 2 mentioned at www.codeproject.com/Answers/5301640/How-to-find-and-add-the-file-name-of-each-file-in#answer2 and it added the file names, but without the “.htm” or “.html” after the names of the files. I have read what you have posted in other threads, so I think you can help me add the filenames with the “.htm” or “.html” after the names of the files. You can give me a method which uses Command Prompt or PowerShell. Thanks for your time and help!
-
@Ramanand-Jhingade OK, in Solution 2 mentioned in the link above, I changed BaseName to FullName. I then used Notepad++ to remove the full path (using “find” and “replace”) while keeping the filenames and their extensions. Thanks for all the help, guys!
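A side note on the three kinds of name involved here. In PowerShell, file objects expose BaseName (no extension), Name (file name with extension) and FullName (complete path), so, assuming the linked script reads one of these properties, Name would give the file name with its extension directly, with no path to strip afterwards. The same distinction, sketched in Python with a hypothetical path:

```python
from pathlib import PureWindowsPath

# Hypothetical file location, used only to show the three variants.
page = PureWindowsPath(r"C:\site\Test.htm")

base_name = page.stem  # "Test"             (like PowerShell's BaseName)
file_name = page.name  # "Test.htm"         (like PowerShell's Name)
full_name = str(page)  # "C:\site\Test.htm" (like PowerShell's FullName)

print(base_name, file_name, full_name)
```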
-
Hello, @ramanand-jhingade, @alan-kilborn, @terry-r, @peterjones and All,
@ramanand-jhingade, sorry that my method did not work :-( I don’t understand !
It’s important to note that my method would work ONLY IF the beginning of the “canonical” line is exactly:
<link rel"canonical" href="
^
|
Beginning of line
If this line could be, for instance, any of the ones below:
<link rel "canonical" href="
<link rel"canonical"href="
<link rel "canonical"href="
<link rel "canonical" href="
<link rel"canonical"href="
<link rel "canonical"href="
My method would not find the canonical line! So, just tell me about it!
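If the exact spacing of the tag is uncertain, one option is a pattern that tolerates optional whitespace. The sketch below is only an illustration in Python, not part of the gawk method. It also accepts an optional = sign after rel, which is an extra assumption on my side, and the name add_filename_tolerant is made up.

```python
import re

# Start of a canonical line, with optional spaces around each part and an
# optional "=" after rel (assumption: your files may contain either form).
CANONICAL = re.compile(r'^(<link\s+rel\s*=?\s*"canonical"\s*href\s*=\s*"[^"]*)(")')


def add_filename_tolerant(line: str, filename: str) -> str:
    """Insert "/<filename>" before the quote that closes the href value."""
    return CANONICAL.sub(
        lambda m: f"{m.group(1)}/{filename}{m.group(2)}", line, count=1
    )


variants = [
    '<link rel"canonical" href="https://cure4incurables.in" />',
    '<link rel "canonical" href="https://cure4incurables.in" />',
    '<link rel"canonical"href="https://cure4incurables.in" />',
    '<link rel "canonical"href="https://cure4incurables.in" />',
    '<link rel="canonical" href="https://cure4incurables.in" />',
]
for variant in variants:
    print(add_filename_tolerant(variant, "Test.htm"))
```

Lines that do not start with a canonical link tag are returned unchanged.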
I even ran tests with files containing space characters in their names, like This is a test.html, and it did create the correct line:
<link rel"canonical" href="https://cure4incurables.in/This is a test.html" />
Of course, in this specific case, in order to get functional links, we must replace any space character with the %20 syntax, thanks to the S/R below:
SEARCH (?-i)(?:href="|(?!\A)\G)(?:(?!").)*?\K\x20
REPLACE %20
Using the Regular expression search mode and ticking the Wrap around option would result, after replacement, in the functional line:
<link rel"canonical" href="https://cure4incurables.in/This%20is%20a%20test.html" />
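For anyone doing this step outside Notepad++, here is a small Python sketch of the same replacement. It is an assumption-laden illustration rather than a translation of the regex above: it simply takes whatever sits between href=" and the next double quote and swaps each space for %20. The name encode_spaces_in_href is made up.

```python
import re

# Captures the text between href=" and the next double quote.
HREF_VALUE = re.compile(r'(href=")([^"]*)(")')


def encode_spaces_in_href(line: str) -> str:
    """Replace spaces with %20 inside href values only."""
    return HREF_VALUE.sub(
        lambda m: m.group(1) + m.group(2).replace(" ", "%20") + m.group(3),
        line,
    )


line = '<link rel"canonical" href="https://cure4incurables.in/This is a test.html" />'
print(encode_spaces_in_href(line))
```

Spaces elsewhere on the line, such as the one before href, are left as they are.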
Best Regards,
guy038
-