    Is a filename legal?

      guy038

      Hello, All,

      Quite by chance, I came across this Stack Overflow question:

      How do I check if a given string is a legal/valid file name under Windows?

      https://stackoverflow.com/questions/62771

      So, I decided to run a few tests and "rework" the different regular expressions suggested in the answers!

      I finally came up with the multi-line regex below:

      (?xi-s)
      ^
      ("?)
      (?!
        (?: PRN | AUX | NUL | CON | COM (?!0) \d | LPT (?!0) \d )
        (?: \. .+ )? \1 $
      )
      [^\x00-\x1F"*/:<>?\\|\x7F]+
      (?<! [\x00-\x20.] )
      \1
      $
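
The regex above targets the Notepad++ search engine. As a rough sketch only, here is how it might be ported to Python's `re` module. The names `LEGAL_NAME` and `is_legal_name` are hypothetical, the inline `(?xi-s)` flags are replaced by `re.VERBOSE | re.IGNORECASE`, and, because Python's `\d` does not match the superscript digits, the digits are listed explicitly so that names like COM¹ are still rejected.

```python
import re

# Hypothetical Python port of the regex above (not the author's own code).
# Assumption: [1-9¹²³] stands in for "(?!0) \d", since Python's \d only
# matches decimal digits and would let COM¹, LPT², ... through.
LEGAL_NAME = re.compile(r'''
    ^
    ("?)                          # group 1: optional opening double quote
    (?!                           # refuse a reserved device name...
      (?: PRN | AUX | NUL | CON | COM [1-9¹²³] | LPT [1-9¹²³] )
      (?: \. .+ )? \1 $           # ...alone or followed by an extension
    )
    [^\x00-\x1F"*/:<>?\\|\x7F]+   # one or more chars outside the forbidden set
    (?<! [\x00-\x20.] )           # last char is not a control char, space or dot
    \1                            # closing quote, only if one was opened
    $
''', re.VERBOSE | re.IGNORECASE)


def is_legal_name(name: str) -> bool:
    """Return True if `name` passes the ported regex."""
    return LEGAL_NAME.match(name) is not None


if __name__ == "__main__":
    for candidate in ('abc.txt', '"abc.txt"', '"abc.txt', 'aux', 'COM0', 'COM¹.txt'):
        print(repr(candidate), is_legal_name(candidate))
```

Passing the flags to `re.compile` rather than inline keeps the pattern body identical in layout to the original, which makes the two easier to compare line by line.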
      

      Three remarks :

      • This regex treats the cases "abcd.txt and abcd.txt" (a single, unbalanced double quote) as invalid. In other words, the ONLY legal syntaxes are abcd.txt or "abcd.txt". To achieve this, group 1 is defined as the pattern "?. Thus :

        • If a " character begins the file name, group 1 holds it and the two back-references \1 match the " character itself

        • If no " character begins the file name, group 1 still exists but, like the two back-references \1, it matches only an empty string

      • The reserved names, like CON or AUX, are also reserved when written in lower case, as con or aux ! Hence the case-insensitive i flag

      • The regexes suggested in the answers end with the negative look-behind (?<! [\s.] ). However, I verified that ANY char of the \s range above \x20, like \xA0, \x{2007}, \x{205F} …, or even a char outside \s, like \x97, \x9F …, may be used as a trailing char of a file name ! Thus, I preferred the (?<! [\x00-\x20.] ) syntax
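
To illustrate the third remark, the following sketch compares the two trailing-character classes in Python. It only shows which final characters each class would reject; it says nothing about what Windows itself accepts, and Python's `\s` is not guaranteed to cover exactly the same characters as the `\s` of the Notepad++ engine. The names `SUGGESTED_TAIL`, `PREFERRED_TAIL` and `tail_rejected` are made up for this example.

```python
import re

# The two tail classes discussed in the third remark.
SUGGESTED_TAIL = re.compile(r'[\s.]\Z')         # class used in the Stack Overflow answers
PREFERRED_TAIL = re.compile(r'[\x00-\x20.]\Z')  # narrower class used in the regex above


def tail_rejected(name: str, tail: re.Pattern) -> bool:
    """Return True if the last char of `name` belongs to the given tail class."""
    return tail.search(name) is not None


if __name__ == "__main__":
    for name in ('abc ', 'abc.', 'abc\xa0', 'abc\u2007', 'abc\u205f', 'abc\x97'):
        print(repr(name),
              'suggested rejects:', tail_rejected(name, SUGGESTED_TAIL),
              'preferred rejects:', tail_rejected(name, PREFERRED_TAIL))
```

Under these assumptions, a name ending with \xA0 is refused by the `[\s.]` class but accepted by `[\x00-\x20.]`, which is the difference the remark describes.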


      To insert one of the six characters SPACE & , ; = ^ in a file name, especially at its beginning :

      • Enclose the new file name between two DOUBLE QUOTES when renaming from DOS or from within any application like N++

      • Except for the space char, you may also simply use the rename option of Windows Explorer to insert any allowed character


      For information :

      ALWAYS forbidden               :       \x00-\x1F    "    *    /    :    <    >    ?    \    |    \x7F
      
                                     :         .    at the END of file name
                                         
                                     :       SPACE  at the END of file name
      
                                     :       ALL DOTS file name
      
                                     :       PRN    AUX    NUL    CON
      
                                     :       COM1    COM2    COM3    COM4    COM5    COM6    COM7    COM8    COM9    COM¹    COM²    COM³
      
                                     :       com1    com2    com3    com4    com5    com6    com7    com8    com9    com¹    com²    com³
      
                                     :       LPT1    LPT2    LPT3    LPT4    LPT5    LPT6    LPT7    LPT8    LPT9    LPT¹    LPT²    LPT³
      
                                     :       lpt1    lpt2    lpt3    lpt4    lpt5    lpt6    lpt7    lpt8    lpt9    lpt¹    lpt²    lpt³
      
      
      Allowed WITHIN  double quotes  :       SPACE    &    ,    ;    =    ^
      
      
      Allowed WITHOUT double quotes  :       !    #    $    %    '    (    )    +    -    @    [    ]    _    `   {    }    ~ 
      
                                     :       .       if NOT at END of file name
      
                                     :       SPACE   if NOT at BEGINNING or END of file name
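
The table above can also be read as a small classification rule. The sketch below is one possible reading of it, not a definitive implementation: the names `FORBIDDEN`, `QUOTED_ONLY` and `classify` are hypothetical, a leading space is assumed to require double quotes while an inner space does not, and reserved names such as PRN or COM1 are deliberately not checked here.

```python
# Characters listed in the table as ALWAYS forbidden.
FORBIDDEN = {chr(c) for c in range(0x20)} | set('"*/:<>?\\|') | {'\x7f'}

# Characters listed as allowed only WITHIN double quotes (space handled separately).
QUOTED_ONLY = set('&,;=^')


def classify(name: str) -> str:
    """Return 'forbidden', 'quotes' or 'plain' for a bare (unquoted) name.

    Assumption: reserved device names are ignored; only the character
    rules of the table are applied.
    """
    if not name or any(ch in FORBIDDEN for ch in name):
        return 'forbidden'
    if name.endswith((' ', '.')):      # also covers all-dots names
        return 'forbidden'
    if name.startswith(' ') or any(ch in QUOTED_ONLY for ch in name):
        return 'quotes'                # must be written between double quotes
    return 'plain'


if __name__ == "__main__":
    for name in ('abc.txt', 'a&b.txt', ' abc', 'abc def.txt', 'abc.', '...', 'a:b'):
        print(repr(name), classify(name))
```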
      

      You may test the multi-line regex above against the text below, pasted in a new tab :

      ============================================= KO cases : =============================================
      
      ""
      
      PRN
      aux
      NUL
      COM1
      COM2
      com3
      COM4
      COM5
      COM6
      COM7
      COM8
      COM9
      COM¹
      COM²
      COM³
      LPT1
      LPT2
      LPT3
      LPT4
      LPT5
      lpt6
      LPT7
      LPT8
      LPT9
      LPT¹
      LPT²
      LPT³
      
      "PRN"
      "aux"
      "NUL"
      
      "COM1"
      "COM2"
      "com3"
      "COM4"
      "COM5"
      "COM6"
      "COM7"
      "COM8"
      "COM9"
      "COM¹"
      "COM²"
      "COM³"
      
      "LPT1"
      "LPT2"
      "LPT3"
      "LPT4"
      "LPT5"
      "lpt6"
      "LPT7"
      "LPT8"
      "LPT9"
      "LPT¹"
      "LPT²"
      "LPT³"
      
      PRN.
      aux.
      NUL.
      COM1.
      COM2.
      com3.
      COM4.
      COM5.
      COM6.
      COM7.
      COM8.
      COM9.
      COM¹.
      COM².
      COM³.
      LPT1.
      LPT2.
      LPT3.
      LPT4.
      LPT5.
      lpt6.
      LPT7.
      LPT8.
      LPT9.
      LPT¹.
      LPT².
      LPT³.
      
      "PRN."
      "aux."
      "NUL."
      "COM1."
      "COM2."
      "com3."
      "COM4."
      "COM5."
      "COM6."
      "COM7."
      "COM8."
      "COM9."
      "COM¹."
      "COM²."
      "COM³."
      "LPT1."
      "LPT2."
      "LPT3."
      "LPT4."
      "LPT5."
      "lpt6."
      "LPT7."
      "LPT8."
      "LPT9."
      "LPT¹."
      "LPT²."
      "LPT³."
      
      PRN.txt
      aux.txt
      NUL.txt
      COM1.txt
      COM2.txt
      com3.txt
      COM4.txt
      COM5.txt
      COM6.txt
      COM7.txt
      COM8.txt
      COM9.txt
      COM¹.txt
      COM².txt
      COM³.txt
      LPT1.txt
      LPT2.txt
      LPT3.txt
      LPT4.txt
      LPT5.txt
      lpt6.txt
      LPT7.txt
      LPT8.txt
      LPT9.txt
      LPT¹.txt
      LPT².txt
      LPT³.txt
      
      "PRN.txt"
      "aux.txt"
      "NUL.txt"
      "COM1.txt"
      "COM2.txt"
      "com3.txt"
      "COM4.txt"
      "COM5.txt"
      "COM6.txt"
      "COM7.txt"
      "COM8.txt"
      "COM9.txt"
      "COM¹.txt"
      "COM².txt"
      "COM³.txt"
      "LPT1.txt"
      "LPT2.txt"
      "LPT3.txt"
      "LPT4.txt"
      "LPT5.txt"
      "lpt6.txt"
      "LPT7.txt"
      "LPT8.txt"
      "LPT9.txt"
      "LPT¹.txt"
      "LPT².txt"
      "LPT³.txt"
      
      .
      ...
      ..txt..
      
      "."
      "..."
      "..txt.."
      
      ABCDE			
      "ABCDE			"
      
         abc def ghi . txt      
      "   abc def ghi . txt      "
      
      .abc.def.ghi...txt.
      ".abc.def.ghi...txt."
      
      ============================================= OK cases : =============================================
      
         xyz
      "   xyz"
      
           .txt
      "     .txt"
      
      abc   def   .txt
      "abc   def   .txt"
      
         abc   def   .  txt
      "   abc   def   .  txt"
      
      abc.txt
      
      lpt0
      COM0
      
      "lpt0"
      "COM0"
      
      CONt
      tCOM2
      tLPT1t
      t.NULt
      
      "CONt"
      "tCOM2"
      "tLPT1t"
      "t.NULt"
      
      ...xyz...hij..tx
      .txt
      ....txt
      
      "...xyz...hij..tx"
      ".txt"
      "....txt"
      
      a.bcdefghijklmonp
      abcdefghijlmnop.z
      a.b.c.d.e.f.g.h.i
      
      "a.bcdefghijklmonp"
      "abcdefghijlmnop.z"
      "a.b.c.d.e.f.g.h.i"
      
       abc def ghi . txt
      " abc def ghi . txt"
      
      .abc.def.ghi...txt
      ".abc.def.ghi...txt"
      
      !abc!def!ghi!.!txt!
      #abc#def#ghi#.#txt#
      $abc$def$ghi$.$txt$
      %abc%def%ghi%.%txt%
      &abc&def&ghi&.&txt&
      'abc'def'ghi'.'txt'
      (abc(def(ghi(.(txt(
      )abc)def)ghi).)txt)
      +abc+def+ghi+.+txt+
      ,abc,def,ghi,.,txt,
      -abc-def-ghi-.-txt-
      ;abc;def;ghi;.;txt;
      =abc=def=ghi=.=txt=
      @abc@def@ghi@.@txt@
      [abc[def[ghi[.[txt[
      ]abc]def]ghi].]txt]
      ^abc^def^ghi^.^txt^
      _abc_def_ghi_._txt_
      `abc`def`ghi`.`txt`
      {abc{def{ghi{.{txt{
      }abc}def}ghi}.}txt}
      ~abc~def~ghi~.~txt~
      
      "!abc!def!ghi!.!txt!"
      "#abc#def#ghi#.#txt#"
      "$abc$def$ghi$.$txt$"
      "%abc%def%ghi%.%txt%"
      "&abc&def&ghi&.&txt&"
      "'abc'def'ghi'.'txt'"
      "(abc(def(ghi(.(txt("
      ")abc)def)ghi).)txt)"
      "+abc+def+ghi+.+txt+"
      ",abc,def,ghi,.,txt,"
      "-abc-def-ghi-.-txt-"
      ";abc;def;ghi;.;txt;"
      "=abc=def=ghi=.=txt="
      "@abc@def@ghi@.@txt@"
      "[abc[def[ghi[.[txt["
      "]abc]def]ghi].]txt]"
      "^abc^def^ghi^.^txt^"
      "_abc_def_ghi_._txt_"
      "`abc`def`ghi`.`txt`"
      "{abc{def{ghi{.{txt{"
      "}abc}def}ghi}.}txt}"
      "~abc~def~ghi~.~txt~"
      

      Best Regards,

      guy038
