    Converting Windows file names to links

      Alan Kilborn @tho-gru

      @tho-gru said in Converting Windows file names to links:

      When I click on find I got the message that the regex is syntactically wrong.
      Can you please tell me, what i did wrong?

      Well, this is dangerously close to expecting other people to do your work for you. I mean, if you get into that situation, you should remove stuff from your expression slowly, experimenting, and at some point you’ll see what you did wrong. Then you can slowly add back in, testing, and get your expression correct.

      But… If I copy your expression and press Find Next, I get no message about it being “syntactically wrong”. I get “Can’t find the text…” which means the syntax is correct.

        guy038

        Hi, @tho-gru, @peterjones, @alan-kilborn and All,

        First of all, I want to apologize for an obvious error in the replacement part of Regex A :

        I said :

        REPLACE file///\2
        

        Of course, the correct regex is

        REPLACE file:///\2

        Secondly, to prevent this replacement from being displayed as a hyperlink itself, I preferred this final syntax :

        REPLACE file:\x2F\x2F\x2F\2

        Important : I updated my previous post accordingly, so that the information is accurate !


        Now, regarding your problem, I first noticed that your regex is not the same as mine, as it only considers pathnames with slashes, like D:/ABC/xYZ.txt

        But, even with this restriction, when I run your regex against the INPUT text provided in my previous post, I do get 6 replacements, one for each pathname of the test list that uses the usual / !

        So, I don’t know why you got a malformed regex message.

        00978f72-9cc2-4152-812a-f278413d3636-Test.png

        Notepad++ v8.4.4 (64-bit)
        Build time : Jul 15 2022 - 17:54:42
        Path : E:\844_x64\notepad++.exe
        Command Line :
        Admin mode : OFF
        Local Conf mode : ON
        Cloud Config : OFF
        OS Name : Windows 10 Pro (64-bit)
        OS Version : 21H2
        OS Build : 19044.1949
        Current ANSI codepage : 1252
        Plugins :
        mimeTools (2.8)
        NppConverter (4.4)
        NppExport (0.4)
        ComparePlus (1)


        If you use the complete regex S/R A, given ( and updated ) in my previous post, you should get 18 replacements, as before !

        Best Regards

        guy038

          tho-gru @guy038

          Hi @guy038 and @Alan-Kilborn and everyone,

          I finally found the (or my) error…

          When I use regex A from guy, it does not work. After staring at the regex for a while longer, I simply guessed that the character combination \] must be converted to \\] (while writing this text, the preview function shows that \ characters must be entered twice to appear in the preview).

          The following regex works for me:

          (?xi) ^ (")? ( [A-Z] : [/\\\\] (?: (?: (?![/"\\\\]) \S | \x20 )* (?![/"\\\\]) \S [/\\\\] )* (?: (?: (?![/"\\\\]) \S | \x20 )* (?![/"\\\\]) \S )? ) (?(1)\x20*") $
          

          So perhaps the solution provided by guy is now more useful for other people.

            tho-gru @tho-gru

            Hi @guy038 and @Alan-Kilborn and everyone,

            I finally found the (or my) error…

            When I use regex A from guy, it does not work. After staring at the regex for a while longer, I simply guessed that the character combination backslash followed by closing square bracket must be converted to backslash, backslash followed by closing square bracket.

            Even the live preview looks different from the final post.

            Unfortunately, my attempt to provide a text representation of the new regex failed. Here is a screenshot:
            Notepad++Filename2Link.png

            So perhaps the solution provided by guy is now more useful for other people.

            PS: This forum software is quite ugly to use. It often complains about spam-suspicious posts and leaves only 5 minutes to edit. Hence the repost.

              Alan Kilborn @tho-gru

              @tho-gru

              This thread is hopelessly muddled.
              I hope a moderator comes along and straightens it out.

              the character combinations backslash followed by closing square bracket must be converted to backslash, backslash followed by closing square bracket

              Yes, there is a website bug with the backslash-closingSquareBracket sequence. It is better to use the following for the closingSquareBracket to avoid the bug: 0x5D.

              Guy knows about this bug and I would have thought his earlier postings would not have included such problems.

                PeterJones @Alan Kilborn

                @Alan-Kilborn ,

                avoid the bug: 0x5D

                I think you mean \x5D ;-)

                I hope a moderator comes along and straightens it out.

                @guy038 has the moderator power necessary to go back and edit his own posts, to have the regex use a reasonable escape of \x5D instead of trying to get the backslash-closebracket to go through this forum correctly. I am not about to try to edit a guy-regex for republishing, because I’d never know if I broke it while trying to edit it. ;-)

                  Alan Kilborn @tho-gru

                  @tho-gru said:

                  This forum software is quite ugly to use. It often complain about spam suspicious posts and only left 5 minutes to edit.

                  It may complain about certain things until you have a reputation >= 2.

                  The 5 minute limit affects everyone. It’s not so bad. It encourages you to proofread what you write before posting. If you see an error later (past the time limit), simply post again and state the error and the correction.

                    PeterJones @tho-gru

                    @tho-gru said in Converting Windows file names to links:

                    It often complain about spam suspicious posts

                    I think you mean “It rarely complains about spam suspicious posts”. In my ~6000 posts here, I have seen it only a handful of times (around 1 time per thousand). That doesn’t seem a bad tradeoff for the number of times the Akismet filter has caught posts which contain huge amounts of spam, or someone trying to post the contents of a binary file (presumably trying to upload a virus or similar). It’s a tradeoff that the Administrators are willing to make.

                    The 5 minute limit affects everyone

                    And some of us more often than others. ;-) (I am notorious for finding typos after I post.)

                    I recently checked the forum settings: unfortunately, there is no way to enable the feature I’d really want: “you are allowed to edit your post until someone has replied”. The 5 minute timeout is the most reasonable compromise we can make between giving people long enough to fix typos and making sure that people don’t change history (otherwise, a nefarious user could say something innocuous, encouraging other users to chime in with “I agree”, and then go back and change the statement to something completely different, which makes the whole conversation a lie).

                    And despite saying, a few minutes ago, that @guy038 could go back and edit his older posts to fix the regex, at this point, it would make much more sense in the context of the discussion, for Guy to make a new reply post, with the regex fixed to use \x5D instead of trying to get the forum to handle backslash-closeSquareBracket correctly.

                      guy038

                      Hello, @tho-gru, @peterjones, @alan-kilborn and All,

                      Oh, my god ! I’m terribly sorry because I hadn’t noticed that the \ character was not properly displayed in regex A :-((

                      @tho-gru, you said :

                      I finally find the (or my) error…

                      It’s definitely not your error, it’s mine :-(

                      Because of our forum software and/or its Markdown syntax, some characters usually need to be written in their escaped form when they appear in regexes.

                      So :

                      • When you search for a literal [ char, preferably use the \x5B syntax

                      • When you search for a literal \ char, preferably use the \x5C syntax

                      • When you search for a literal ] char, preferably use the \x5D syntax

                      • When you search for a literal backtick char, preferably use the \x60 syntax


                      Of course, in regex A, the parts [/\x5C] and [/"\x5C] are genuine character classes, so the [ and ] there are not literal chars but regex syntax !
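As a quick check of the escapes listed above, here is a minimal sketch. It uses Python's re module as a stand-in for the Notepad++ regex engine (an assumption: the thread itself only deals with Notepad++), and the names ESCAPES and matches_literal are purely illustrative.

```python
import re

# Assumption: Python's re module stands in for the Notepad++ engine here;
# both accept the \xNN hexadecimal escape syntax.
ESCAPES = {
    "[": r"\x5B",
    "\\": r"\x5C",
    "]": r"\x5D",
    "`": r"\x60",
}


def matches_literal(escape: str, char: str) -> bool:
    """Return True if the escape, used as a whole regex, matches char."""
    return re.fullmatch(escape, char) is not None


for char, escape in ESCAPES.items():
    print(char, escape, matches_literal(escape, char))

# Inside a character class, the escape still denotes the same character,
# so this class matches either a slash or an anti-slash.
print(re.fullmatch(r"[/\x5C]", "\\") is not None)
print(re.fullmatch(r"[/\x5C]", "/") is not None)
```

Writing the class as [/\x5C] avoids typing a raw backslash right before the closing bracket, which is the sequence the forum mangles.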

                      Again, sorry for my mistake, which must have cost you a lot of time searching !


                      So, here is the correct version of my old post :

                      Hello, @tho-gru, @peterjones, @alan-kilborn and All,

                      I would like, first, to give some information about pathnames, space chars in file or folder names, and characters that need the %nn syntax in links.


                      When using a DOS command prompt, you can write a full pathname with either \ ( anti-slash ) or / ( slash ) characters, or a mix of them ! Of course, when you use normal slashes, the auto-completion mechanism of the TAB key, for folders or files, will not work.

                      In summary, the lines C:\my folder\My file.txt and C:/my folder/My file.txt are totally equivalent.
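The equivalence of the two spellings can be illustrated outside the DOS prompt too. The sketch below relies on Python's pathlib model of Windows paths, which is an assumption about tooling and not something used in this thread; it runs on any operating system.

```python
from pathlib import PureWindowsPath

# PureWindowsPath only parses the text; it never touches the file system.
backslash_form = PureWindowsPath(r"C:\my folder\My file.txt")
slash_form = PureWindowsPath("C:/my folder/My file.txt")

# Both spellings are split into the same drive, folder and file parts.
print(backslash_form == slash_form)
print(slash_form.parts)
```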


                      Usually, when creating a file from the explorer, you may include space chars anywhere inside its name, but not at the very beginning or at the very end of a file or folder name. However, if you open a DOS command prompt and use the DOS ren command, you may use the following command :

                      ren File.txt "   Fi  le  .  txt   " which will rename the "File.txt" file as "   Fi  le  .  txt"
                      

                      As you can see, trailing spaces, after any text, in files or folders, are not allowed but you may insert space char(s) at any other location !


                      Given the regex used in Notepad++ to detect links, I tried to identify all the characters which need the %nn syntax, like %20 to replace any space char. After some tests, it turns out that only two other chars need to be changed using %nn. So, the complete list is :

                          SPACE    %20
                          {        %7B
                          }        %7D
                      

                      Now, given all these preliminary elements, we can build a regex to change any full pathname into a valid hyperlink ! I make two assumptions :

                      • Only one full pathname per line

                      • Each pathname begins and ends its current line


                      So, here is a TEST sheet, which explains what must be matched or not matched :

                      - The regex must MATCH the following lines :
                      
                      d:\                                        #  FULL pathname WITHOUT quotes
                      "d:\"                                      #  FULL pathname with QUOTES
                      d:\x\y\file.docx                           #  FULL pathname with SEVERAL levels of SUBDIRS and a final WORD document
                      "d:\ä\ö\straße.txt "                       #  FULL pathname with folders with ACCENTED characters and a SPACE char before the ENDING quote
                      d:/x/y z/file.txt                          #  FULL pathname with a SUBFOLDER containing a SPACE and / SEPARATORS
                      d:\x\y z\fi le.xlsx                        #  FULL pathname with a SUBFOLDER and the FILENAME containing a SPACE and an EXCEL sheet
                      C:\                                        #  FULL pathname with a DIFFERENT DRIVE letter
                      "D:\_DEF\  XY   Z"                         #  FULL pathname with a SUBDIR containing SPACES at BEGINNING and WITHIN its name
                      "D:\_DEF\  XY   Z\"                        #  FULL pathname with a SUBDIR containing SPACES at BEGINNING and WITHIN its name and a TRAILING anti-slash
                      "D:/_DEF/  XY   Z"                         #  FULL pathname with a SUBDIR containing SPACES at BEGINNING and WITHIN its name and / SEPARATORS
                      "D:/_DEF/  XY   Z/"                        #  FULL pathname with a SUBDIR containing SPACES at BEGINNING and WITHIN its name and / SEPARATORS and a TRAILING slash
                      "D:\_DEF\  XY   Z\  AB   C  .   txt"       #  FULL pathname with a SUBDIR and the FILE name containing LEADING and MIDDLE spaces
                      "D:/_DEF/  XY   Z/  AB   C  .   txt"       #  FULL pathname with a SUBDIR and the FILE name containing LEADING and MIDDLE spaces and / SEPARATORS
                      "D:\_DEF\  XY   Z\RST"                     #  FULL pathname with a SUBDIR containing SPACES at BEGINNING and INSIDE its name and a FILE name WITHOUT extension
                      "D:/_DEF/  XY   Z/RST"                     #  FULL pathname with a SUBDIR containing SPACES at BEGINNING and INSIDE its name and a FILE name WITHOUT extension
                      D:\@@\792\!#$%&'()+,-.;@[]^_` {}~€à×.txt   #  FULL pathname with SEVERAL levels of SUBDIRS and a FILE name with ALL symbols between 0X20 and 0x7F plus €à× > 0X7F
                      D:/@@/792/!#$%&'()+,-.;@[]^_` {}~€à×.txt   #  FULL pathname with SEVERAL levels of SUBDIRS and a FILE name with ALL symbols between 0X20 and 0x7F plus €à× > 0X7F and / SEPARATORS
                      "d:\x\y\z\abc.txt   "                      #  FULL pathname with SEVERAL levels of SUBDIRS and TRAILING SPACES after the file EXTENSION
                      
                      - The regex must NOT match any of the following lines :
                      
                      d:\                                        #  With a TRAILING space, leading to file:///d:\%20 , identical to file:///d:\
                      x:                                         #  NO slash NOR anti-slash, after the COLON
                      -:\                                        #  NO DRIVE letter
                      Ä:\                                        #  FORBIDDEN drive value 
                      ä:\                                        #  FORBIDDEN drive value 
                      \                                          #  NO DRIVE letter and NO COLON
                       d:\                                       #  SPACE beginning the FULL pathname
                      "d:\dir                                    #  TRAILING double quote MISSING
                      "d:/dir with space                         #  TRAILING double quote MISSING
                      p:\dir"                                    #  LEADING  double quote missing
                      d:x\y\file.txt                             #  SLASH or ANTISLASH MISSING after the COLON
                      this is a test                             #  NO FULL pathname
                      d:\x\y  \z\abc.txt                         #  Trailing SPACES at END of the Y SUBdirectory
                      "d:\x\y  \z\abc.txt"                       #  Trailing SPACES at END of the Y SUBdirectory
                      

                      As we suppose that only one FULL pathname must appear on each line, this leads to the following INPUT text :

                      d:\
                      "d:\"
                      d:\x\y\file.docx
                      "d:\ä\ö\straße.txt "
                      d:/x/y z/file.txt
                      d:\x\y z\fi le.xlsx
                      C:\
                      "D:\_DEF\  XY   Z"
                      "D:\_DEF\  XY   Z\"
                      "D:/_DEF/  XY   Z"
                      "D:/_DEF/  XY   Z/"
                      "D:\_DEF\  XY   Z\  AB   C  .   txt"
                      "D:/_DEF/  XY   Z/  AB   C  .   txt"
                      "D:\_DEF\  XY   Z\RST"
                      "D:/_DEF/  XY   Z/RST"
                      D:\@@\792\!#$%&'()+,-.;@[]^_` {}~€à×.txt
                      D:/@@/792/!#$%&'()+,-.;@[]^_` {}~€à×.txt
                      "d:\x\y\z\abc.txt   "
                      
                      d:\ 
                      x:
                      -:\
                      Ä:\
                      ä:\
                      \
                       d:\
                      "d:\dir
                      "d:/dir with space
                      p:\dir"
                      d:x\y\file.txt
                      this is a test
                      d:\x\y  \z\abc.txt
                      "d:\x\y  \z\abc.txt"
                      

                      Beware : the second d:\ is followed by a space char => NO match


                      We need two regexes S/R in order to get functional hyperlinks :

                      • The first S/R A will transform any FULL pathname into a hyperlink, beginning with the syntax file:///

                      • The second S/R B will replace any SPACE, { and } character with, respectively, the strings %20, %7B and %7D, only on lines beginning with the file:/// syntax

                      I will use the free-spacing mode (?x), in the search regexes, to better visualize the different regex parts. So :

                      • The S/R A is :

                      SEARCH (?xi) ^ (")? ( [A-Z] : [/\x5C] (?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S [/\x5C] )* (?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S )? ) (?(1)\x20*") $

                      REPLACE file:\x2F\x2F\x2F\2

                      • The S/R B is :

                      SEARCH (?x-is) ^ (?!file:/{3}) .+ (*SKIP) (*FAIL) | ( \x20 ) | ( { ) | ( } )

                      REPLACE (?1%20)(?2%7B)(?3%7D)
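For readers who want to try the S/R A logic outside Notepad++, here is a hedged sketch of a port to Python's re module. It is an assumption that the port behaves like the Notepad++ engine on these inputs; the names PATH_LINE and to_file_link are illustrative. Python's replacement templates do not accept \x2F, so the file:/// prefix is added in a function instead.

```python
import re

# Port of the S/R A search regex (assumption: Python's re module is used
# here instead of Notepad++). The (?xi) modifiers become compile flags.
PATH_LINE = re.compile(
    r"""
    ^ (")?                       # group 1: optional leading quote
    (                            # group 2: the pathname itself
      [A-Z] : [/\x5C]            # drive letter, colon, slash or anti-slash
      (?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S [/\x5C] )*
      (?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S )?
    )
    (?(1)\x20*")                 # closing quote required only if group 1 matched
    $
    """,
    re.VERBOSE | re.IGNORECASE,
)


def to_file_link(line: str) -> str:
    """Turn one full pathname line into a file:/// link, else leave it as is."""
    return PATH_LINE.sub(lambda m: "file:///" + m.group(2), line)


for sample in ['d:\\', '"d:\\x\\y\\z\\abc.txt   "', 'd:/x/y z/file.txt', 'p:\\dir"']:
    print(repr(sample), "->", repr(to_file_link(sample)))
```

The function is meant to be applied line by line, matching the assumption of one full pathname per line. Lines that already start with file:/// do not match and are returned unchanged.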


                      Then, the road map is :

                      • Select the zone containing the full list of pathnames to be changed into hyperlinks

                      • Open the Replace dialog ( Ctrl + H )

                        • Hit the Del key to clear the Find what: zone

                        • Check the In selection option

                        • SEARCH (?xi) ^ (")? ( [A-Z] : [/\x5C] (?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S [/\x5C] )* (?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S )? ) (?(1)\x20*") $

                        • REPLACE file:\x2F\x2F\x2F\2

                        • Select the Regular expression search mode

                        • Click on the Replace All button

                      => A message appears : Replace All: 18 occurrences were replaced in selected text

                        • SEARCH (?x-is) ^ (?!file:/{3}) .+ (*SKIP) (*FAIL) | ( \x20 ) | ( { ) | ( } )

                        • REPLACE (?1%20)(?2%7B)(?3%7D)

                        • Click again on the Replace All button

                      => A message appears : Replace All: 69 occurrences were replaced in selected text

                      And it gives the expected OUTPUT text :

                      file:///d:\
                      file:///d:\
                      file:///d:\x\y\file.docx
                      file:///d:\ä\ö\straße.txt
                      file:///d:/x/y%20z/file.txt
                      file:///d:\x\y%20z\fi%20le.xlsx
                      file:///C:\
                      file:///D:\_DEF\%20%20XY%20%20%20Z
                      file:///D:\_DEF\%20%20XY%20%20%20Z\
                      file:///D:/_DEF/%20%20XY%20%20%20Z
                      file:///D:/_DEF/%20%20XY%20%20%20Z/
                      file:///D:\_DEF\%20%20XY%20%20%20Z\%20%20AB%20%20%20C%20%20.%20%20%20txt
                      file:///D:/_DEF/%20%20XY%20%20%20Z/%20%20AB%20%20%20C%20%20.%20%20%20txt
                      file:///D:\_DEF\%20%20XY%20%20%20Z\RST
                      file:///D:/_DEF/%20%20XY%20%20%20Z/RST
                      file:///D:\@@\792\!#$%&'()+,-.;@[]^_`%20%7B%7D~€à×.txt
                      file:///D:/@@/792/!#$%&'()+,-.;@[]^_`%20%7B%7D~€à×.txt
                      file:///d:\x\y\z\abc.txt
                      
                      d:\ 
                      x:
                      -:\
                      Ä:\
                      ä:\
                      \
                       d:\
                      "d:\dir
                      "d:/dir with space
                      p:\dir"
                      d:x\y\file.txt
                      this is a test
                      d:\x\y  \z\abc.txt
                      "d:\x\y  \z\abc.txt"
                      

                      IMPORTANT : When you double-click on a hyperlink that points to an existing file, the file automatically opens in its default application => For instance, any .docx is launched with Microsoft Word and any .xlsx starts with Microsoft Excel !
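The second step, S/R B, uses the (*SKIP) (*FAIL) verbs, which Python's re module does not support. A rough equivalent, assuming plain string handling is acceptable, is to touch only the lines that start with file:/// (the names LINK_PREFIX, PERCENT_CODES and encode_links are hypothetical):

```python
# Emulation of S/R B without regex verbs (assumption: plain string methods
# are an acceptable substitute for the (*SKIP)(*FAIL) construct).
LINK_PREFIX = "file:///"
PERCENT_CODES = str.maketrans({" ": "%20", "{": "%7B", "}": "%7D"})


def encode_links(text: str) -> str:
    """Percent-encode SPACE, { and } on lines starting with file:/// only."""
    result = []
    for line in text.split("\n"):
        if line.startswith(LINK_PREFIX):
            line = line.translate(PERCENT_CODES)
        result.append(line)
    return "\n".join(result)


sample = "\n".join([
    "file:///d:\\x\\y z\\fi le.xlsx",
    "d:\\x\\y  \\z\\abc.txt",
    "file:///D:/@@/792/` {}~.txt",
])
print(encode_links(sample))
```

Lines without the file:/// prefix, such as the second sample line, keep their spaces, which corresponds to the non-matching part of the OUTPUT text.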


                      Notes :

                      • For the first S/R A, we can also split the search regex over several lines, like below :
                      (?xi)
                      ^  (")?                                                        #  An OPTIONAL leading quote at BEGINNING of line, stored in Group 1
                      (                                                              #  START of Group 2
                      [A-Z]  :  [/\x5C]                                              #  DRIVE letter, colon and SLASH or ANTI-SLASH
                      (?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S  [/\x5C] )*  #  OPTIONAL SUBSEQUENT subfolder(s) chars, ending with a NON-SPACE char, before a SLASH or ANTI-SLASH
                      (?: (?: (?![/"\x5C]) \S | \x20 )* (?![/"\x5C]) \S )?           #  OPTIONAL NON-SPACE or SPACE chars, ending with a NON-SPACE char ( LAST subfolder or FILE NAME part )
                      )                                                              #  END of Group 2
                      (?(1)\x20*")  $                                                #  OPTIONAL SPACES and the ENDING quote ( if LEADING quote at START ), before the END of CURRENT line
                      
                      • Regarding the second S/R B, we could also have used this other search version, for same results :

                      (?x-is) (?: ^file:/{3} | (?!\A)\G ) .*? \K (?: ( \x20 ) | ( { ) | ( } ) )

                      Best Regards,

                      guy038

                      P.S. : This method works for both ANSI and UTF-8 / UTF-8-BOM encodings !

                          tho-gru

                          Thanks for all your help. I appreciate it.

                          When using this macro in my environment, it works really well.

                          I figured out an additional requirement while using it: the macro must leave already converted links untouched. In my opinion this requirement is already fulfilled, because neither S/R A nor S/R B will match any more.

                          This is really useful!

                          Regards
                          Thomas
