Converting Windows file names to links
-
Hi, @tho-gru, @peterjones, @alan-kilborn and All,
First of all, I want to apologize for an obvious error in the Replacement part of the regex A. I said :
REPLACE file///\2
Of course, the correct replacement is :
REPLACE file:///\2
Secondly, to prevent this replacement, itself, from being turned into a hyperlink, I preferred this final syntax :
REPLACE file:\x2F\x2F\x2F\2
Important : I updated my previous post accordingly, so that the info is accurate !
Now, regarding your problem, I first noticed that your regex is not the same as mine, as it only considers the case of pathnames with slashes, like in D:/ABC/xYZ.txt
But, even with this restriction, if I run your regex against the INPUT text provided in my previous post, I do get 6 replacements, corresponding to all the pathnames of the test list which use the usual / separator !
So, I don't know why you found a malformed regex expression ?
Notepad++ v8.4.4 (64-bit)
Build time : Jul 15 2022 - 17:54:42
Path : E:\844_x64\notepad++.exe
Command Line :
Admin mode : OFF
Local Conf mode : ON
Cloud Config : OFF
OS Name : Windows 10 Pro (64-bit)
OS Version : 21H2
OS Build : 19044.1949
Current ANSI codepage : 1252
Plugins :
mimeTools (2.8)
NppConverter (4.4)
NppExport (0.4)
ComparePlus (1)
If you use the complete regex S/R A, given ( and updated ) in my previous post, you should get 18 replacements, as before !
Best Regards
guy038
-
Hi @guy038 and @Alan-Kilborn and everyone,
I finally found the (or my) error…
When I use the regex A from guy it does not work. After spending some additional time on the regex, I guess that the character combination \] must be converted to \\] (when writing this text, the preview function shows that \-chars must be entered twice to appear in the preview).
The following regex works for me:
(?xi) ^ (")? ( [A-Z] : [/\\\\] (?: (?: (?![/"\\\\]) \S | \x20 )* (?![/"\\\\]) \S [/\\\\] )* (?: (?: (?![/"\\\\]) \S | \x20 )* (?![/"\\\\]) \S )? ) (?(1)\x20*") $
So perhaps the solution provided by guy is now more useful for other people.
-
Hi @guy038 and @Alan-Kilborn and everyone,
I finally found the (or my) error…
When I use the regex A from guy it does not work. After spending some additional time on the regex, I simply guess that the character combinations backslash followed by closing square bracket must be converted to backslash, backslash followed by closing square bracket.
Even the live preview looks different from the final post.
Unfortunately, my attempts to provide a text representation of the new regex failed. Here is a screenshot:
So perhaps the solution provided by guy is now more useful for other people.
PS: This forum software is quite ugly to use. It often complain about spam suspicious posts and only left 5 minutes to edit. Therefore the repost.
-
This thread is hopelessly muddled.
I hope a moderator comes along and straightens it out.
the character combinations backslash followed by closing square bracket must be converted to backslash, backslash followed by closing square bracket
Yes, there is a website bug with the backslash-closingSquareBracket sequence. It is better to use the following for the closingSquareBracket to avoid the bug: 0x5D.
Guy knows about this bug and I would have thought his earlier postings would not have included such problems.
-
avoid the bug: 0x5D
I think you mean \x5D ;-)
I hope a moderator comes along and straightens it out.
@guy038 has the moderator power necessary to go back and edit his own posts, to have the regex use a reasonable escape of \x5D instead of trying to get the backslash-closebracket to go through this forum correctly. I am not about to try to edit a guy-regex for republishing, because I'd never know if I broke it while trying to edit it. ;-)
-
@tho-gru said:
This forum software is quite ugly to use. It often complain about spam suspicious posts and only left 5 minutes to edit.
It may complain about certain things until you have a reputation >= 2.
The 5 minute limit affects everyone. It's not so bad. It encourages you to proofread what you write before posting. If you see an error later (past the time limit), simply post again and state the error and the correction.
-
@tho-gru said in Converting Windows file names to links:
It often complain about spam suspicious posts
I think you mean "It rarely complains about spam suspicious posts". In my ~6000 posts here, I have seen it only a handful of times (around 1 time per thousand). That doesn't seem a bad tradeoff for the number of times the Akismet filter has caught posts which contain huge amounts of spam, or someone trying to post the contents of a binary file (presumably trying to upload a virus or similar). It's a tradeoff that the Administrators are willing to make.
The 5 minute limit affects everyone
And some of us more often than others. ;-) (I am notorious for finding typos after I post.)
I recently checked the forum settings: unfortunately, there is no way to enable the feature I’d really want: “you are allowed to edit your post until someone has replied”. The 5 minute timeout is the most reasonable compromise we can make between giving people long enough to fix typos and making sure that people don’t change history (otherwise, a nefarious user could say something innocuous, encouraging other users to chime in with “I agree”, and then go back and change the statement to something completely different, which makes the whole conversation a lie)
And despite saying, a few minutes ago, that @guy038 could go back and edit his older posts to fix the regex, at this point it would make much more sense, in the context of the discussion, for Guy to make a new reply post, with the regex fixed to use \x5D instead of trying to get the forum to handle backslash-closeSquareBracket correctly.
-
Hello, @tho-gru, @peterjones, @alan-kilborn and All,
Oh, my god ! I'm terribly sorry because I hadn't noticed that the \ character was not properly displayed in the regex A :-((
@tho-gru, you said :
I finally find the (or my) error…
It's definitely not your error, it's mine :-(
Most of the time, due to our forum and/or Markdown syntax, some characters need to be changed into their escape syntax when expressed in regexes.
So :
-
When you search for a literal [ char, preferably use the \x5B syntax
-
When you search for a literal \ char, preferably use the \x5C syntax
-
When you search for a literal ] char, preferably use the \x5D syntax
-
When you search for a literal backtick char, preferably use the \x60 syntax
Of course, in the regex A, the parts [/\x5C] and [/"\x5C] are real character classes, so the [ and ] are not literal chars but specific regex syntax !
Again, sorry for my mistake, which must have made you search for a long time !
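To illustrate the escape syntax listed above, here is a small sketch in Python. This is only an assumption for demonstration purposes: Notepad++ uses the Boost regex engine, not Python's re module, but both accept the \xNN notation for these characters.

```python
import re

# \x5C is the backslash, so this class means "slash or backslash",
# exactly like the [/\x5C] parts of the regex A.
separator = re.compile(r'[/\x5C]')
print(separator.findall(r'C:\my folder/My file.txt'))

# Outside a character class, \x5B, \x5D and \x60 stand for a literal
# opening bracket, closing bracket and backtick.
specials = re.compile(r'\x5B|\x5D|\x60')
print(specials.findall('a[b]c`d'))
```

Written this way, the pattern contains no backslash-bracket or backtick sequence for the forum software to mangle.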
So, here is the correct version of my old post :
Hello, @tho-gru, @peterjones, @alan-kilborn and All,
I would like, first, to give some information about pathnames, space chars in files or folders and characters that need the %nn syntax in links
When using a DOS command prompt, you can define a full pathname with either \ ( anti-slash ) or / ( slash ) characters, or a mix of them ! Of course, when you use normal slashes, the auto-completion mechanism of folders or files, with the TAB key, will not work.
In summary, the lines C:\my folder\My file.txt and C:/my folder/My file.txt are totally equivalent
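As a side illustration of this equivalence, Python's standard pathlib module (an assumption of mine, not something used in this thread) also treats both separators as the same thing for Windows paths:

```python
from pathlib import PureWindowsPath

# The same pathname, written once with anti-slashes and once with slashes.
with_backslashes = PureWindowsPath(r'C:\my folder\My file.txt')
with_slashes = PureWindowsPath('C:/my folder/My file.txt')

# Both objects compare equal and split into the same components.
print(with_backslashes == with_slashes)
print(with_slashes.parts)
```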
Usually, when creating a file from the explorer, you may include any space char inside, but not at the very beginning or at the very end of a file or folder name. However, if you've opened a DOS command prompt, you may use the DOS ren command, as follows :
ren File.txt " Fi le . txt "
which will rename the "File.txt" file as " Fi le . txt"
As you can see, trailing spaces, after any text, in files or folders, are not allowed but you may insert space char(s) at any other location !
Given the regex used in Notepad++ to detect links, I tried to identify all the characters which need the %nn syntax, like %20 to replace any space char. After some tries, it turns out that only two other chars need to be changed, using %nn. So, the complete list is :
SPACE %20
{ %7B
} %7D
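The three substitutions of this list can be sketched in a few lines of Python. The names PERCENT_CODES and percent_encode are hypothetical, chosen for this example only; each %nn value is simply the hexadecimal code of the character.

```python
# Translation table for the three characters of the list above.
PERCENT_CODES = str.maketrans({' ': '%20', '{': '%7B', '}': '%7D'})

def percent_encode(path):
    """Replace SPACE, { and } with their %nn form, leaving the rest as is."""
    return path.translate(PERCENT_CODES)

print(percent_encode('D:/a b/{c}.txt'))

# Each code is the hexadecimal value of the character.
for char in ' {}':
    print(repr(char), '%%%02X' % ord(char))
```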
Now, given all these preliminary elements, we can build a regex to change any full pathname into a valid hyperlink ! I assume two hypotheses :
-
Only one full pathname per line
-
Each pathname begins and ends its current line
So, here is a TEST sheet, which explains what must be matched or not matched :
- The regex must MATCH the following lines :
d:\    # FULL pathname WITHOUT quotes
"d:\"    # FULL pathname with QUOTES
d:\x\y\file.docx    # FULL pathname with SEVERAL levels of SUBDIRS and a final WORD document
"d:\ä\ö\straße.txt "    # FULL pathname with folders with ACCENTUATED characters and a SPACE char before the ENDING quote
d:/x/y z/file.txt    # FULL pathname with a SUBFOLDER containing a SPACE and / SEPARATORS
d:\x\y z\fi le.xlsx    # FULL pathname with a SUBFOLDER and the FILENAME containing a SPACE and an EXCEL sheet
C:\    # FULL pathname with a DIFFERENT DRIVE letter
"D:\_DEF\  XY   Z"    # FULL pathname with a SUBDIR containing SPACES at BEGINNING and WITHIN its name
"D:\_DEF\  XY   Z\"    # FULL pathname with a SUBDIR containing SPACES at BEGINNING and WITHIN its name and a TRAILING anti-slash
"D:/_DEF/  XY   Z"    # FULL pathname with a SUBDIR containing SPACES at BEGINNING and WITHIN its name and / SEPARATORS
"D:/_DEF/  XY   Z/"    # FULL pathname with a SUBDIR containing SPACES at BEGINNING and WITHIN its name and / SEPARATORS and a TRAILING slash
"D:\_DEF\  XY   Z\  AB   C  .   txt"    # FULL pathname with a SUBDIR and the FILE name containing LEADING and MIDDLE spaces
"D:/_DEF/  XY   Z/  AB   C  .   txt"    # FULL pathname with a SUBDIR and the FILE name containing LEADING and MIDDLE spaces and / SEPARATORS
"D:\_DEF\  XY   Z\RST"    # FULL pathname with a SUBDIR containing SPACES at BEGINNING and INSIDE its name and a FILE name WITHOUT extension
"D:/_DEF/  XY   Z/RST"    # FULL pathname with a SUBDIR containing SPACES at BEGINNING and INSIDE its name and a FILE name WITHOUT extension
D:\@@\792\!#$%&'()+,-.;@[]^_` {}~€à×.txt    # FULL pathname with SEVERAL levels of SUBDIRS and a FILE name with ALL symbols between 0x20 and 0x7F plus €à× > 0x7F
D:/@@/792/!#$%&'()+,-.;@[]^_` {}~€à×.txt    # FULL pathname with SEVERAL levels of SUBDIRS and a FILE name with ALL symbols between 0x20 and 0x7F plus €à× > 0x7F and / SEPARATORS
"d:\x\y\z\abc.txt "    # FULL pathname with SEVERAL levels of SUBDIRS and TRAILING SPACES after the file EXTENSION
- The regex must NOT match any of the following lines :
d:\     # With a TRAILING space, leading to file:///d:\%20 , identical to file:///d:\
x:    # NO slash NOR anti-slash, after the COLON
-:\    # NO DRIVE letter
Ä:\    # FORBIDDEN drive value
ä:\    # FORBIDDEN drive value
\    # NO DRIVE letter and NO COLON
 d:\    # SPACE beginning the FULL pathname
"d:\dir    # TRAILING double quote MISSING
"d:/dir with space    # TRAILING double quote MISSING
p:\dir"    # LEADING double quote MISSING
d:x\y\file.txt    # SLASH or ANTI-SLASH MISSING after the COLON
this is a test    # NO FULL pathname
d:\x\y \z\abc.txt    # Trailing SPACES at END of the Y SUBdirectory
"d:\x\y \z\abc.txt"    # Trailing SPACES at END of the Y SUBdirectory
As we suppose that only one FULL pathname must appear on each line, this leads to the following INPUT text :
d:\
"d:\"
d:\x\y\file.docx
"d:\ä\ö\straße.txt "
d:/x/y z/file.txt
d:\x\y z\fi le.xlsx
C:\
"D:\_DEF\  XY   Z"
"D:\_DEF\  XY   Z\"
"D:/_DEF/  XY   Z"
"D:/_DEF/  XY   Z/"
"D:\_DEF\  XY   Z\  AB   C  .   txt"
"D:/_DEF/  XY   Z/  AB   C  .   txt"
"D:\_DEF\  XY   Z\RST"
"D:/_DEF/  XY   Z/RST"
D:\@@\792\!#$%&'()+,-.;@[]^_` {}~€à×.txt
D:/@@/792/!#$%&'()+,-.;@[]^_` {}~€à×.txt
"d:\x\y\z\abc.txt "
d:\ 
x:
-:\
Ä:\
ä:\
\
 d:\
"d:\dir
"d:/dir with space
p:\dir"
d:x\y\file.txt
this is a test
d:\x\y \z\abc.txt
"d:\x\y \z\abc.txt"
Beware : the second d:\ is followed with a space char => NO match
We need two regex S/R in order to get functional hyperlinks :
-
The first S/R A will transform any FULL pathname into a hyperlink, beginning with the syntax file:///
-
The second S/R B will replace any SPACE, { and } character with, respectively, the strings %20, %7B and %7D, only on lines beginning with the file:/// syntax
I will use the free-spacing mode (?x), in the search regexes, to better visualize the different regex parts. So :
- The S/R A is :
SEARCH
(?xi) ^ (")? ( [A-Z] : [/\x5C] (?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S [/\x5C] )* (?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S )? ) (?(1)\x20*") $
REPLACE
file:\x2F\x2F\x2F\2
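For readers who want to try the S/R A outside Notepad++, here is a minimal sketch with Python's re module. This is an assumption for experimentation only: Notepad++ uses the Boost engine, and the names SEARCH_A, PATTERN_A and to_link are hypothetical. As far as I know, Python's replacement templates do not accept the \x2F notation, so plain slashes are used in the replacement.

```python
import re

# Same search regex as the S/R A above, unchanged.
SEARCH_A = (r'(?xi) ^ (")? ( [A-Z] : [/\x5C] '
            r'(?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S [/\x5C] )* '
            r'(?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S )? ) '
            r'(?(1)\x20*") $')

# MULTILINE lets ^ and $ work on every line, as in Notepad++.
PATTERN_A = re.compile(SEARCH_A, re.MULTILINE)

def to_link(text):
    """Turn each full pathname line into a file:/// link (group 2 only)."""
    return PATTERN_A.sub(r'file:///\2', text)

print(to_link('d:\\x\\y\\file.docx'))    # unquoted pathname
print(to_link('"D:/_DEF/  XY   Z/"'))    # quoted pathname, quotes dropped
print(to_link('p:\\dir"'))               # leading quote missing: untouched
```

Note that the SPACE chars are still present at this stage; they are handled by the S/R B.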
- The S/R B is :
SEARCH
(?x-is) ^ (?!file:/{3}) .+ (*SKIP) (*FAIL) | ( \x20 ) | ( { ) | ( } )
REPLACE
(?1%20)(?2%7B)(?3%7D)
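The S/R B relies on the (*SKIP) (*FAIL) verbs and on the conditional replacement syntax, which, to my knowledge, Python's re module does not support. The sketch below emulates the same idea under that assumption: the first alternative swallows every line which is not a link and gives it back unchanged, while the three groups catch the characters to encode. The names SEARCH_B and encode_links are hypothetical.

```python
import re

# Alternative 1 consumes whole NON-link lines (emulating SKIP/FAIL);
# groups 1, 2 and 3 catch SPACE, { and } everywhere else.
SEARCH_B = re.compile(r'^(?!file:/{3}).+|(\x20)|(\{)|(\})', re.MULTILINE)

def encode_links(text):
    """Percent-encode SPACE, { and }, only on lines beginning with file:///"""
    def replacement(match):
        if match.group(1):
            return '%20'
        if match.group(2):
            return '%7B'
        if match.group(3):
            return '%7D'
        return match.group(0)    # non-link line: give it back untouched
    return SEARCH_B.sub(replacement, text)

sample = 'file:///d:/x/y z/file.txt\n"d:/dir with space'
print(encode_links(sample))
```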
Then, the road map is :
-
Select the zone containing all your full pathnames list, which will be changed into hyperlinks
-
Open the Replace dialog ( Ctrl + H )
-
Hit the Del key to clear the Find what: zone
-
Check the In selection option
-
SEARCH (?xi) ^ (")? ( [A-Z] : [/\x5C] (?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S [/\x5C] )* (?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S )? ) (?(1)\x20*") $
-
REPLACE file:\x2F\x2F\x2F\2
-
Select the Regular expression search mode
-
Click on the Replace All button
-
=> A message appears : Replace All: 18 occurrences were replaced in selected text
-
SEARCH (?x-is) ^ (?!file:/{3}) .+ (*SKIP) (*FAIL) | ( \x20 ) | ( { ) | ( } )
-
REPLACE (?1%20)(?2%7B)(?3%7D)
-
Click again on the Replace All button
-
=> A message appears : Replace All: 69 occurrences were replaced in selected text
And it gives the expected OUTPUT text :
file:///d:\
file:///d:\
file:///d:\x\y\file.docx
file:///d:\ä\ö\straße.txt
file:///d:/x/y%20z/file.txt
file:///d:\x\y%20z\fi%20le.xlsx
file:///C:\
file:///D:\_DEF\%20%20XY%20%20%20Z
file:///D:\_DEF\%20%20XY%20%20%20Z\
file:///D:/_DEF/%20%20XY%20%20%20Z
file:///D:/_DEF/%20%20XY%20%20%20Z/
file:///D:\_DEF\%20%20XY%20%20%20Z\%20%20AB%20%20%20C%20%20.%20%20%20txt
file:///D:/_DEF/%20%20XY%20%20%20Z/%20%20AB%20%20%20C%20%20.%20%20%20txt
file:///D:\_DEF\%20%20XY%20%20%20Z\RST
file:///D:/_DEF/%20%20XY%20%20%20Z/RST
file:///D:\@@\792\!#$%&'()+,-.;@[]^_`%20%7B%7D~€à×.txt
file:///D:/@@/792/!#$%&'()+,-.;@[]^_`%20%7B%7D~€à×.txt
file:///d:\x\y\z\abc.txt
d:\ 
x:
-:\
Ä:\
ä:\
\
 d:\
"d:\dir
"d:/dir with space
p:\dir"
d:x\y\file.txt
this is a test
d:\x\y \z\abc.txt
"d:\x\y \z\abc.txt"
IMPORTANT : When you double-click on a hyperlink pointing to an existing file, it automatically opens in its default application => For instance, any .docx is launched with Microsoft Word and any .xlsx starts with Microsoft Excel !
Notes :
- For the first S/R A, we can, as well, split the search regex over several lines, like below :
(?xi) ^ (")?                                                        # An OPTIONAL leading quote at BEGINNING of line, stored in Group 1
(                                                                   # START of Group 2
[A-Z] : [/\x5C]                                                     # DRIVE letter, colon and SLASH or ANTI-SLASH
(?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S [/\x5C] )*        # OPTIONAL SUBSEQUENT subfolder(s) chars, ending with a NON-SPACE char, before a SLASH or ANTI-SLASH
(?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S )?                # OPTIONAL NON-SPACE or SPACE chars, ending with a NON-SPACE char ( LAST subfolder or FILE NAME part )
)                                                                   # END of Group 2
(?(1)\x20*") $                                                      # OPTIONAL SPACES and the ENDING quote ( if LEADING quote at START ), before the END of CURRENT line
- Regarding the second S/R B, we could also have used this other search version, for the same results :
(?x-is) (?: ^file:/{3} | (?!\A)\G ) .*? \K (?: ( \x20 ) | ( { ) | ( } ) )
Best Regards,
guy038
P.S. : This method works for both ANSI and UTF-8 / UTF-8-BOM encodings !
-
Thanks for all your help. I appreciate it.
When using this macro in my environment, it works really well.
I figured out an additional requirement while using it: the macro must leave already converted links untouched. This requirement is already fulfilled, in my opinion, because neither S/R A nor S/R B will match any more.
This is really useful!
Regards
Thomas
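The remark above, that already converted links are left untouched, can be checked with a small Python sketch. It is only an approximation under stated assumptions: Python's re module stands in for the Notepad++ engine, the S/R B is emulated with an alternation instead of (*SKIP) (*FAIL), and the names PATTERN_A, PATTERN_B and convert are hypothetical.

```python
import re

# S/R A: the search regex given by guy038, with ^ and $ acting per line.
PATTERN_A = re.compile(
    r'(?xi) ^ (")? ( [A-Z] : [/\x5C] '
    r'(?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S [/\x5C] )* '
    r'(?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S )? ) '
    r'(?(1)\x20*") $', re.MULTILINE)

# S/R B, emulated: non-link lines are swallowed whole by the first
# alternative; group 1 catches SPACE, { or } on the file:/// lines.
PATTERN_B = re.compile(r'^(?!file:/{3}).+|([\x20{}])', re.MULTILINE)

def convert(text):
    """Apply S/R A, then S/R B, to every line of text."""
    text = PATTERN_A.sub(r'file:///\2', text)
    return PATTERN_B.sub(
        lambda m: '%%%02X' % ord(m.group(1)) if m.group(1) else m.group(0),
        text)

sample = '"D:\\_DEF\\  XY   Z\\RST"\nd:x\\y\\file.txt'
once = convert(sample)
twice = convert(once)
print(once)
print(once == twice)    # a second run changes nothing
```

On the second run, the S/R A fails because a line beginning with file: is not a single drive letter followed by a colon, and the S/R B fails because no SPACE, { or } char is left on the link lines.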