    what are list Regex available "named capture group" for find and replace in current version

    • Jergen Ross Estaco
      Jergen Ross Estaco last edited by Jergen Ross Estaco

      I’m sorry if my English is confusing to read; I’m deaf.
      I hope you can understand my post.

      Hi,

      I’ve read the documentation and searched the forum.
      I don’t want to use numbered group backreferences.
      When I try find and replace with a regex that uses named capture groups,
      only one syntax works:
      Find: (?<name>subexp) or (?'name'subexp)
      Replace: $+{name}

      These do not work:
      Replace: ${name} , \g{name} , \k{name} , \g<name> or \k<name>

      Why is $+{name} the only one that works? :/

      This is what I used, and it worked:
      Find:
      ^\s*(?<size>[+-]?([0-9]{1,3}(,[0-9]{3})*(\.[0-9]+)?|\d*\.\d+|\d+))\s?(?<size_type>(?i)gb|mb|m|g)?\s*(?<path>C:|D:.+\\)(?<file>.+)(?'type'\..+$)
      or
      ^\s*(?'size'[+-]?([0-9]{1,3}(,[0-9]{3})*(\.[0-9]+)?|\d*\.\d+|\d+))\s?(?'size_type'(?i)gb|mb|m|g)?\s*(?'path'C:|D:.+\\)(?'file'.+)(?'type'\..+$)

      Replace:
      $+{size} $+{size_type} | $+{file}$+{type}\r\n\t$+{path}\r\n

      but these do NOT work:
      Replace:
      \g{size} \g{size_type} | \g{file}\g{type}\r\n\t\g{path}\r\n
      or
      \k{size} \k{size_type} | \k{file}\k{type}\r\n\t\k{path}\r\n
      or
      \g<size> \g<size_type> | \g<file>\g<type>\r\n\t\g<path>\r\n
      or
      \k<size> \k<size_type> | \k<file>\k<type>\r\n\t\k<path>\r\n

      Example input (NOTICE: each line begins with spaces or other whitespace):

                5.7 GB  D:\Movies by Jen\Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP\Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP.mkv
                5.6 GB  D:\Movies by Jen\The.Lost.City.2022.1080p.AMZN.WEBRip.DDP5.1.x264-CM\The.Lost.City.2022.1080p.AMZN.WEB-DL.DDP5.1.H264-CM.mkv
      

      The output I got (NOTICE: <tab> marks a tab indent):

      5.7 GB | Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP.mkv
      <tab> D:\Movies by Jen\Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP
      5.7 GB | Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP.mkv
      <tab> D:\Movies by Jen\Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP\

      and I get no working output with \g<name> , \k<name>...

      My questions:
      Are \g<name> and \k<name> deprecated?
      If not, how do I use \g<name>?

      I want to know which other regex syntaxes work in the replacement field in the current version.

      • Jergen Ross Estaco
        Jergen Ross Estaco @Jergen Ross Estaco last edited by

        What is version for RegEx?

        Does the current Notepad++ use the Boost or the PCRE engine?

        • PeterJones
          PeterJones @Jergen Ross Estaco last edited by

          @Jergen-Ross-Estaco said in what are list Regex available "named capture group" for find and replace in current version:

          What is version for RegEx?

          Notepad++ has been using Boost 1.78 since NPP v8.2, and 1.76 before that.

          (I am not the expert on the details of named capture groups, so I’ll let someone else handle the more technical aspects of your issue)

          • guy038
            guy038 last edited by guy038

            Hello, @jergen-ross-estaco ,

            Be a little patient. I’m elaborating a reply !

            See you later,

            guy038

            • guy038
              guy038 last edited by guy038

              Hello, @jergen-ross-estaco, @peterjones and All,

              First, I advise you to read these two posts :

              https://community.notepad-plus-plus.org/post/15930

              https://community.notepad-plus-plus.org/post/52715

              As you can see, in replacement, the following regexes :

              • \g{size} \g{size_type} | \g{file}\g{type}\r\n\t\g{path}\r\n

              • \k{size} \k{size_type} | \k{file}\k{type}\r\n\t\k{path}\r\n

              • \g<size> \g<size_type> | \g<file>\g<type>\r\n\t\g<path>\r\n

              • \k<size> \k<size_type> | \k<file>\k<type>\r\n\t\k<path>\r\n

              are totally invalid !

              The only syntax that Boost accepts for named capturing groups, in the replacement regex, is :

              • $+{size} $+{size_type} | $+{file}$+{type}\r\n\t$+{path}\r\n

              Below is your search regex, slightly modified and written in free-spacing mode (?x), so that it is easy to adjust if necessary :

              In some parts, I used non-capturing groups, so that you can also use numbered groups, instead of the named groups, in the replacement regex

              SEARCH

              (?x-s)            # FREE-SPACING mode + DOT matches STANDARD chars ONLY, not EOL
              ^  [\t\x20]*      # OPTIONAL range of TAB/SPACE chars
              (?<size>          # BEGINNING of the first NAMED group "size" [ or (?'size' ]
                (?:             #   NON-CAPTURING group
                  [+-]?         #     OPTIONAL sign + or -
                  (?:           #     NON-CAPTURING group
                    [0-9]{1,3}  #       From 1 to 3 DIGITS
                    (?:         #       NON-CAPTURING group
                      ,[0-9]{3} #         COMMA followed with THREE digits
                    )*          #       END, REPEATED from 0 to MORE
                  |             #     OR
                    \d+         #       ONE or MORE digits
                  )             #     END
                  (?:           #     NON-CAPTURING group 
                    \. [0-9]+   #       a DOT followed with some DIGITS
                  )?            #     END of the OPTIONAL part
                )               #   END
              |                 # OR
                \d* \. \d+      #   OPTIONAL INTEGER part followed with a DOT and some DIGITS
              )                 # END of the first NAMED group
              [\t\x20]*         # OPTIONAL range of TAB/SPACE chars
              (?<size_type>     # BEGINNING of the second NAMED group "size_type" [ or (?'size_type' ]
                (?i) gb|mb|m|g  #   String  'gb' or 'mb' or 'm' or 'g', INSENSITIVE
              )                 # END of the second NAMED group
              [\t\x20]*         # OPTIONAL TAB or SPACE characters
              (?<path>          # BEGINNING of the third NAMED group "path" [ or (?'path' ]
                (?i:C|D): .+ \\ #   EXACT string 'C' or 'D' followed by the GREATEST range of STANDARD chars till the LAST ANTI-SLASH of CURRENT line
              )                 # END of the third NAMED group
              (?<file>          # BEGINNING of the fourth NAMED group "file" [ or (?'file' ]
                .+              #   The GREATEST range of STANDARD chars till...
              )                 # END of the fourth NAMED group
              (?<type>          # BEGINNING of the fifth NAMED group "type" [ or (?'type' ]
                \.  .+          #   The LAST '.' char of CURRENT line, followed with the GREATEST range of STANDARD chars, till the END of LINE
              )                 # END of the fifth NAMED group
              

              REPLACE $+{size} $+{size_type} | $+{file}$+{type}\r\n\t$+{path}\r\n

              or

              REPLACE \1 \2 | \4\5\r\n\t\3\r\n


              Note that, without the free-spacing mode (?x), the search regex becomes :

              (?-s)^[\t\x20]*(?<size>(?:[+-]?(?:[0-9]{1,3}(?:,[0-9]{3})*|\d+)(?:\.[0-9]+)?)|\d*\.\d+)[\t\x20]*(?<size_type>(?i)gb|mb|m|g)[\t\x20]*(?<path>(?i:C|D):.+\\)(?<file>.+)(?<type>\..+)

              So, from this INPUT text :

                        5.7 GB  D:\Movies by Jen\Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP\Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP.mkv
              

              whichever of these search and replacement regexes you use, you should always get this OUTPUT text :

              5.7 GB | Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP.mkv
              	D:\Movies by Jen\Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP\
              

              If you decide to use the regex with the free-spacing mode, simply select all the text between (?x-s) and # END of the fifth NAMED group ( 2,045 bytes ). Note that the maximum of bytes, for the search field, is limited to 2,046 bytes !


              In summary, the 6 syntaxes \g{name}, \g<name>, \g'name', \k{name}, \k<name> and \k'name' are not deprecated but can only be used in the search part ( not in replacement ) !
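For comparison only, here is a minimal sketch of the same transformation in Python's `re` module. It is not Notepad++ or Boost, and the syntax differs on purpose: Python spells named groups `(?P<name>...)`, scopes case-insensitivity with `(?i:...)`, and, unlike Boost, uses `\g<name>` in the replacement template. The names `SEARCH`, `REPLACE` and `reformat` are made up for this illustration, and `\n` stands in for the `\r\n` used above.

```python
import re

# Python spelling of the one-line search regex quoted above:
# (?<name>...) becomes (?P<name>...) and the mid-pattern (?i) becomes (?i:...).
SEARCH = re.compile(
    r"^[\t\x20]*"
    r"(?P<size>(?:[+-]?(?:[0-9]{1,3}(?:,[0-9]{3})*|\d+)(?:\.[0-9]+)?)|\d*\.\d+)"
    r"[\t\x20]*"
    r"(?P<size_type>(?i:gb|mb|m|g))"
    r"[\t\x20]*"
    r"(?P<path>(?i:C|D):.+\\)"
    r"(?P<file>.+)"
    r"(?P<type>\..+)",
    re.MULTILINE,
)

# Python's replacement template refers to named groups as \g<name>;
# in Notepad++ the equivalent replacement is written with $+{name}.
REPLACE = r"\g<size> \g<size_type> | \g<file>\g<type>\n\t\g<path>\n"


def reformat(text: str) -> str:
    """Apply the search/replace pair to every matching line of text."""
    return SEARCH.sub(REPLACE, text)


if __name__ == "__main__":
    line = "          5.7 GB  D:\\Movies by Jen\\Memory.2022\\Memory.2022.mkv"
    print(reformat(line))
```

The point of the comparison is that backreference spellings are engine-specific: a syntax that is valid in one engine's replacement string may be search-only, or invalid, in another.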

              Best Regards

              guy038

              • PeterJones
                PeterJones @guy038 last edited by

                @guy038 said in what are list Regex available "named capture group" for find and replace in current version:

                As you can see, in replacement, the following regexes … are totally invalid

                I should have remembered that. We even added a very clear note in the replacement section of the user manual:

                Please note: the \g… and \k… backreference syntaxes only work in the search expression, and are not designed or intended to work in the substitution/replacement expression.

                There is a similar note in the searching section’s “numbered backreference”:

                Numbered Backreference: These syntaxes match the ℕth capture group earlier in the same expression. (Backreferences are used to refer to the capture group contents only in the search/match expression; see the Substitution Escape Sequences for how to refer to capture groups in substitutions/replacements.)

                … but apparently not in the “Named Backreference” entry… I guess it needs to be spelled out there, too, even though it’s just a few lines down in the docs.

                • Jergen Ross Estaco
                  Jergen Ross Estaco @guy038 last edited by

                  @guy038 and @PeterJones.

                  First, I advise you to read these two posts :

                  I had already read them, since I searched extensively before posting.

                  I did not see non-capturing groups mentioned there; did I miss it?
                  I also did not know about using \x20 instead of \s.

                  Is \x20 (the space character) necessary? Why not \s?

                  You wrote “x” in (?x-s):

                  (?x-s) # FREE-SPACING mode + DOT matches STANDARD chars ONLY, not EOL

                  but you did not write “x” in (?-s):

                  (?-s)^[\t\x20]*(?<size>(?:[+-]?(?:[0-9]{1,3}(?:,[0-9]{3})*|\d+)(?:\.[0-9]+)?)|\d*\.\d+)[\t\x20]*(?<size_type>(?i)gb|mb|m|g)[\t\x20]*(?<path>(?i:C|D):.+\\)(?<file>.+)(?<type>\..+)

                  I like having the named group start directly with the subexpression,
                  as in my previous regex:
                  (?<name>subexp)

                  whereas your regex has:
                  (?<name>(subexp))

                  By the way, I can’t figure out why the search does not match lines like these:

                           4.0 GB  C:\pagefile.sys
                           1.5 GB  C:\hiberfil.sys
                  

                  Lines are also skipped when the file has no extension, because the file type cannot match, for example:

                         100.0 MB  C:\Users\Username\AppData\Local\Google\Chrome Beta\User Data\Default\Cache\Cache_Data\data_3
                  

                  Can you help me?
                  If you can’t, please disregard this; that’s fine.

                  If you decide to use the regex with the free-spacing mode, simply select all the text between (?x-s) and # END of the fifth NAMED group ( 2,045 bytes ). Note that the maximum of bytes, for the search field, is limited to 2,046 bytes !

                  How do I check the number of bytes in the search field?

                  Anyway, thank you for your help and your answer.

                  • guy038
                    guy038 last edited by guy038

                    Hi, @jergen-ross-estaco,

                    Before answering your questions, in a next post, could you tell me which of the four cases, below, will never happen ?

                    A  5.7 GB D:\Movies by Jen\Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP\Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP.mkv
                    B  5.7GB D:\Movies by Jen\Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP\Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP.mkv
                    C  5.7 GBD:\Movies by Jen\Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP\Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP.mkv		  
                    D  5.7GBD:\Movies by Jen\Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP\Memory.2022.1080p.WEBRip.DD5.1.x264-NOGRP.mkv		  
                    

                    I also revisited my post and changed the part of the regex regarding the size, as below :

                    (?<size>[+-]?(?:(?:[0-9]{1,3}(?:,[0-9]{3})*|[0-9]+)(?:\.[0-9]+)?|\.[0-9]+))

                    For instance, this regex matches all the following cases, from A to Z and @

                    A  1                     E  +1                     I  -1  
                    B  12345                 F  +12345                 J  -12345
                    C  1234.0123             G  +1234.0123             K  -1234.0123
                    D  123456.789            H  +123456.789            L  -123456.789
                    
                    M  1,234                 P  +1,234                 S  -1,234
                    N  12,345.01             Q  +12,345.01             T  -12,345.01
                    O  1,234,567.0123        R  +1,234,567.0123        U  -1,234,567.0123
                    
                    V  .0                    X  +.0                    Z  -.0
                    W  .01234                Y  +.01234                @  -.01234
                    

                    In the same way, could you tell me which cases are sure to not happen ?
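As a rough cross-check outside Notepad++, the size pattern can be run against the listed cases in Python. This is only a sketch: `SIZE`, `UNSIGNED`, `CASES` and `unmatched` are names invented here, the group name is dropped because only the body matters, and `fullmatch` anchors the pattern to the whole string, which is not how an unanchored search in an editor behaves.

```python
import re

# Body of the (?<size>...) group, unchanged apart from dropping the group name.
SIZE = re.compile(
    r"[+-]?(?:(?:[0-9]{1,3}(?:,[0-9]{3})*|[0-9]+)(?:\.[0-9]+)?|\.[0-9]+)"
)

# The nine unsigned numbers from the table; each is also tried with + and -.
UNSIGNED = [
    "1", "12345", "1234.0123", "123456.789",
    "1,234", "12,345.01", "1,234,567.0123",
    ".0", ".01234",
]
CASES = [sign + number for sign in ("", "+", "-") for number in UNSIGNED]


def unmatched(cases):
    """Return the cases that the pattern cannot match in full."""
    return [case for case in cases if SIZE.fullmatch(case) is None]


if __name__ == "__main__":
    print(len(CASES), "cases, not fully matched:", unmatched(CASES))
```

Because `fullmatch` forces the engine to backtrack until the whole string is consumed, it can hide a pattern that stops early when used in a plain search, so a find-and-replace test remains the more realistic check.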


                    I’ll provide you a correct and final regex solution, very soon !

                    See you later,

                    Best Regards,

                    guy038

                    • Jergen Ross Estaco
                      Jergen Ross Estaco last edited by Jergen Ross Estaco

                      @guy038, @PeterJones and all.

                      I will try to explain what I meant. As you know, I’m deaf, so I learn slowly and make mistakes in grammar and sentences, but I hope you understand me.

                      Before answering your questions, in a next post, could you tell me which of the four cases, below, will never happen ?

                      I tested the regex on the four cases and all of them work: ‘[\t\x20]*’ does not fail when there is no space, it simply moves on to the next step. It looks similar to ‘\s’, though.

                      I also revisited my post and changed the part of the regex regarding the size, as below :
                      (?<size>[+-]?(?:(?:[0-9]{1,3}(?:,[0-9]{3})*|[0-9]+)(?:\.[0-9]+)?|\.[0-9]+))
                      For instance, this regex matches all the following cases, from A to Z and @
                      …
                      In the same way, could you tell me which cases are sure to not happen ?

                      Cases ‘A, E, I, M-Z and @’ are matched correctly.

                      The others are not: cases ‘B-D, F-H and J-L’ each produce two matches, “123” followed by the remaining digits. If you find and replace with “foo”, the result is “foofoo”; for example, “1234.0123” is matched as “123” and “4.0123”, which become “foo” “foo”.

                      input and output:

                      A  1                     E  +1                     I  -1  
                      B  12345                 F  +12345                 J  -12345
                      C  1234.0123             G  +1234.0123             K  -1234.0123
                      D  123456.789            H  +123456.789            L  -123456.789
                      
                      M  1,234                 P  +1,234                 S  -1,234
                      N  12,345.01             Q  +12,345.01             T  -12,345.01
                      O  1,234,567.0123        R  +1,234,567.0123        U  -1,234,567.0123
                      
                      V  .0                    X  +.0                    Z  -.0
                      W  .01234                Y  +.01234                @  -.01234
                      
                      A  foo                     E  foo                     I  foo  
                      B  foofoo                 F  foofoo                 J  foofoo
                      C  foofoo             G  foofoo             K  foofoo
                      D  foofoo            H  foofoo            L  foofoo
                      
                      M  foo                 P  foo                 S  foo
                      N  foo             Q  foo             T  foo
                      O  foo        R  foo        U  foo
                      
                      V  foo                    X  foo                    Z  foo
                      W  foo                Y  foo                @  foo
                      

                      I fixed it by replacing ‘*’ with ‘+’ at the end of the comma-and-three-digits group, ‘(?:,[0-9]{3})+’, so that “1234.0123” is matched as a whole.

                      (?<size>[+-]?(?:(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)(?:\.[0-9]+)?|\.[0-9]+))
                      

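                      * means zero or more, so the group also succeeds when there is no comma, and the first alternative stops after at most three digits.
                      + means one or more, so at least one comma group is required; if there is no comma, the engine moves on to the alternative ‘|\d+’, which matches any run of digits (with an optional decimal part) when the number has no comma separators.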
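The difference between the two quantifiers can be reproduced with a short Python sketch. The names `STAR`, `PLUS` and `replace_all` are hypothetical, and Python's `re` stands in for Boost here; both engines try alternatives from left to right, which is what produces the split match.

```python
import re

# Size pattern with the comma group repeated zero or more times (original).
STAR = r"[+-]?(?:(?:[0-9]{1,3}(?:,[0-9]{3})*|[0-9]+)(?:\.[0-9]+)?|\.[0-9]+)"
# Same pattern with the comma group required at least once (the fix).
PLUS = r"[+-]?(?:(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)(?:\.[0-9]+)?|\.[0-9]+)"


def replace_all(pattern: str, text: str) -> str:
    """Replace every match of pattern in text with the marker 'foo'."""
    return re.sub(pattern, "foo", text)


if __name__ == "__main__":
    for sample in ("12345", "1234.0123", "1,234,567.0123", ".01234"):
        print(f"{sample:16} *: {replace_all(STAR, sample):8} "
              f"+: {replace_all(PLUS, sample)}")
```

With `*`, the first alternative is satisfied by the first three digits alone, so an unanchored search reports a match there and starts again on the rest of the number. With `+`, that alternative fails on numbers without a comma and the engine falls through to `[0-9]+`.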

                      Input and output: every number is now matched once, with no doubled “foo”, so replacing each number with “foo” gives the correct result.

                      A  1                     E  +1                     I  -1  
                      B  12345                 F  +12345                 J  -12345
                      C  1234.0123             G  +1234.0123             K  -1234.0123
                      D  123456.789            H  +123456.789            L  -123456.789
                      
                      M  1,234                 P  +1,234                 S  -1,234
                      N  12,345.01             Q  +12,345.01             T  -12,345.01
                      O  1,234,567.0123        R  +1,234,567.0123        U  -1,234,567.0123
                      
                      V  .0                    X  +.0                    Z  -.0
                      W  .01234                Y  +.01234                @  -.01234
                      
                      A  foo                     E  foo                     I  foo  
                      B  foo                 F  foo                 J  foo
                      C  foo             G  foo             K  foo
                      D  foo            H  foo            L  foo
                      
                      M  foo                 P  foo                 S  foo
                      N  foo             Q  foo             T  foo
                      O  foo        R  foo        U  foo
                      
                      V  foo                    X  foo                    Z  foo
                      W  foo                Y  foo                @  foo
                      

                      By the way, I put that size pattern into my regex.

                      Earlier I mentioned:

                      By the way, I can’t figure out why the search does not match lines like these:

                           4.0 GB  C:\pagefile.sys
                           1.5 GB  C:\hiberfil.sys
                      

                      What I mean is that these files have no folder, only the drive letter, so ‘(?<path>...)’ has to match just ‘C:\’.
                      I have now solved this myself by removing the ‘:’ before the dot in ‘(?i:C|D):.+\\’. Because ‘+’ means one or more, ‘.+’ must consume at least one character after ‘C:’. In ‘C:\pagefile.sys’ that character is the only ‘\’, so when ‘\\’ then looks for a backslash, none is left and the match fails. With the ‘:’ removed, ‘.+’ consumes the ‘:’ instead, and ‘\\’ then matches the ‘\’ of ‘C:\’. The path now matches with zero or more folder levels (c:\ or c:\folder1\subfolder\subfolder2\...).

                      Regex: (?<path>(?i:C|D).+\\)
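The effect on root-level files can be checked with a small Python sketch. `VARIANTS` and `path_of` are names made up for this illustration; the third variant, which keeps the colon and relaxes `.+` to `.*`, is the form that appears later in this thread.

```python
import re

# Three spellings of the body of the (?<path>...) group.
VARIANTS = {
    "colon, .+": r"(?i:C|D):.+\\",   # needs at least one character before the last backslash
    "no colon, .+": r"(?i:C|D).+\\", # the colon is absorbed by .+
    "colon, .*": r"(?i:C|D):.*\\",   # zero characters allowed after the colon
}


def path_of(pattern: str, line: str):
    """Return the text matched at the start of line, or None if there is no match."""
    match = re.match(pattern, line)
    return match.group(0) if match else None


if __name__ == "__main__":
    for line in (r"C:\pagefile.sys", r"C:\Games\MBAACC\0002.p"):
        for label, pattern in VARIANTS.items():
            print(f"{line:26} {label:14} {path_of(pattern, line)}")
```

One trade-off worth keeping in mind: without the literal colon, `.+` accepts any character in that position, so the pattern is looser than a strict drive-letter check.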

                      I also mentioned:

                      Lines are also skipped when the file has no extension, because the file type cannot match, for example:

                         100.0 MB  C:\Users\Username\AppData\Local\Google\Chrome Beta\User Data\Default\Cache\Cache_Data\data_3
                      

                      I added a lookahead and an alternative, ‘(?<file>.+(?=\.)|.+)(?<type>(?:\..+)?$)’, so that the file name matches whether or not it has an extension.

                      • (?<file>...)
                        .+(?=\.) is one or more of any character, followed by a positive lookahead for a ‘.’ (dot) character. It matches the file name up to the dot that starts the extension. If there is no dot, the engine moves on to the alternative ‘|.+’, which matches a file name without an extension.
                      • (?<type>...)
                        (?:\..+)?$ is an optional dot character followed by one or more of any character, i.e. an optional file extension.
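The two groups described above can be tried in isolation with a short Python sketch. `NAME` and `split_name` are hypothetical names, and `(?P<name>...)` is Python's spelling of the named groups.

```python
import re

# File name up to the last dot (via lookahead), or the whole name if there is
# no dot; then an optional extension that must run to the end of the string.
NAME = re.compile(r"(?P<file>.+(?=\.)|.+)(?P<type>(?:\..+)?$)")


def split_name(name: str):
    """Split a bare file name into (file, type); type is '' without an extension."""
    match = NAME.match(name)
    if match is None:
        return None
    return match.group("file"), match.group("type")


if __name__ == "__main__":
    for name in ("pagefile.sys", "Memory.2022.1080p.mkv", "data_3", "$MFT"):
        print(name, "->", split_name(name))
```

Because `.+` is greedy, the lookahead branch settles on the last dot in the name, so only the final suffix ends up in the type group.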

                      I finally made my regex :

                      (?-xs)^[\t\x20]*(?<size>[+-]?(?:(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d+)?|\.\d+))[\t\x20]*(?<size_type>(?i)gb|mb|m|g)[\t\x20]*(?<path>(?i:C|D).+\\)(?<file>.+(?=\.)|.+)(?<type>(?:\..+)?$)
                      
                      or
                      
                      (?-xs)^[\t\x20]*(?<size>[+-]?(?:(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)(?:\.[0-9]+)?|\.[0-9]+))[\t\x20]*(?<size_type>(?i)gb|mb|m|g)[\t\x20]*(?<path>(?i:C|D).+\\)(?<file>.+(?=\.)|.+)(?<type>(?:\..+)?$)
                      

                      It matched all lines, 100 out of 100, in my input text.

                                    4.0 GB  C:\pagefile.sys
                                    1.8 GB  C:\Users\Jhecrose\Downloads\uTorrent\Paws.of.Fury.The.Legend.of.Hank.2022.1080p.WEBRip.x264-RARBG\Paws.of.Fury.The.Legend.of.Hank.2022.1080p.WEBRip.x264-RARBG.mp4
                                    1.5 GB  C:\hiberfil.sys
                                  900.1 MB  C:\Games\MBAACC\0002.p
                                  851.8 MB  C:\$MFT
                                  569.6 MB  C:\Program Files (x86)\Steam\steamapps\common\Left 4 Dead 2\left4dead2\addons\workshop\1504837401.vpk
                      

                      After the replacement, the output is:

                      4.0 GB | pagefile.sys
                      	C:\
                      
                      1.8 GB | Paws.of.Fury.The.Legend.of.Hank.2022.1080p.WEBRip.x264-RARBG.mp4
                      	C:\Users\Username\Downloads\uTorrent\Paws.of.Fury.The.Legend.of.Hank.2022.1080p.WEBRip.x264-RARBG\
                      
                      1.5 GB | hiberfil.sys
                      	C:\
                      
                      900.1 MB | 0002.p
                      	C:\Games\MBAACC\
                      
                      851.8 MB | $MFT
                      	C:\
                      
                      569.6 MB | 1504837401.vpk
                      	C:\Program Files (x86)\Steam\steamapps\common\Left 4 Dead 2\left4dead2\addons\workshop\
                      

                      I’m sorry if this is confusing to read.
                      Thank you so much! This case is closed now, but I’m still waiting for an answer to my previous question.

                      • guy038
                        guy038 last edited by guy038

                        Hello, @jergen-ross-estaco, @peterjones and All,

                        I ended up with this search regex :

                        SEARCH (?-s)^[\t\x20]*(?<size>[+-]?(?:(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)(?:\.[0-9]+)?|\.[0-9]+))[\t\x20]*(?<size_type>(?i)gb|mb|m|g)[\t\x20]*(?<path>(?i:C|D):.*\\)(?<file>.+)(?<type>\..+)?

                        This regex is slightly shorter than yours :

                        • I removed the unnecessary x modifier, at the beginning of the regex, as the regex is on a single line !

                        • I kept your regexes regarding the <size> and <size_type> named groups

                        • I modified the <path>, <file> and <type> named groups as below :

                          • I used a * after the part (?i:C|D):, needed when files are located right under the root

                          • I added a ?, right after the named group (?<type>\..+), for the case of files without extension

                        Like you, my regex version matches all the cases of your INPUT file :

                                      4.0 GB  C:\pagefile.sys
                                      1.8 GB  C:\Users\Jhecrose\Downloads\uTorrent\Paws.of.Fury.The.Legend.of.Hank.2022.1080p.WEBRip.x264-RARBG\Paws.of.Fury.The.Legend.of.Hank.2022.1080p.WEBRip.x264-RARBG.mp4
                                      1.5 GB  C:\hiberfil.sys
                                    900.1 MB  C:\Games\MBAACC\0002.p
                                    851.8 MB  C:\$MFT
                                    569.6 MB  C:\Program Files (x86)\Steam\steamapps\common\Left 4 Dead 2\left4dead2\addons\workshop\1504837401.vpk
                        

                        Now, regarding your question :

                        How do I check the number of bytes in the search field?

                        It’s quite easy : Select a range of characters and, simply, look right after the indication Sel :, in the status bar, at bottom of the Notepad++ window !
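Outside Notepad++, the size of a pattern can also be estimated with a few lines of Python. This is only a sketch under assumptions: the 2,046 figure is the limit quoted in this thread, UTF-8 is assumed as the encoding that is counted, and `field_size` and `fits` are made-up helper names.

```python
SEARCH_FIELD_LIMIT = 2046  # byte limit quoted in this thread


def field_size(pattern: str, encoding: str = "utf-8"):
    """Return (characters, bytes) for a pattern; the encoding is an assumption."""
    return len(pattern), len(pattern.encode(encoding))


def fits(pattern: str, encoding: str = "utf-8") -> bool:
    """Tell whether the encoded pattern stays within the quoted limit."""
    return field_size(pattern, encoding)[1] <= SEARCH_FIELD_LIMIT


if __name__ == "__main__":
    chars, size = field_size(r"(?-s)^[\t\x20]*(?<size>[0-9]+)")
    print(chars, "characters,", size, "bytes")
```

For a pattern made only of ASCII characters the two numbers are equal; they diverge once the pattern contains accented or other non-ASCII characters.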

                        Best Regards

                        guy038

                        • Jergen Ross Estaco
                          Jergen Ross Estaco @guy038 last edited by Jergen Ross Estaco

                          Hi @guy038,

                          (?<type>\..+)?

                          I tried the same thing, but I ran into a problem: (?<type>) never matches. Instead, both the file name and the extension end up in (?<file>), because its .+ runs from the current position to the end of the line. (?<type>) is then tried at the end of the line, finds nothing, and, because it is optional, the engine does not step back through the characters.

                          Try this replacement to see what happens:
                          Replace: File:\x20\4\r\ntype:\x20\5\r\n

                          expected behavior:

                          I used my regex Find: (?-s)^[\t\x20]*(?<size>[+-]?(?:(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d+)?|\.\d+))[\t\x20]*(?<size_type>(?i)gb|mb|m|g)[\t\x20]*(?<path>(?i:C|D).+\\)(?<file>.+(?=\.)|.+)(?<type>(?:\..+)?)$

                          File: The.Sea.Beast.2022.1080p.WEBRip.x264-RARBG
                          type: .mp4
                          
                          File: The.Unbearable.Weight.of.Massive.Talent.2022.1080p.WEBRip.x264-RARBG
                          type: .mp4
                          
                          File: Paws.of.Fury.The.Legend.of.Hank.2022.1080p.WEBRip.x264-RARBG
                          type: .mp4
                          
                          File: hiberfil
                          type: .sys
                          

                          actual behavior:

                          used your regex:

                          File: The.Sea.Beast.2022.1080p.WEBRip.x264-RARBG.mp4
                          type: 
                          
                          File: The.Unbearable.Weight.of.Massive.Talent.2022.1080p.WEBRip.x264-RARBG.mp4
                          type: 
                          
                          File: Paws.of.Fury.The.Legend.of.Hank.2022.1080p.WEBRip.x264-RARBG.mp4
                          type: 
                          
                          File: hiberfil.sys
                          type: 
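The behavior shown above can be reproduced in a few lines of Python, used here only as a stand-in for Boost. `OPTIONAL`, `REQUIRED` and `groups_of` are names invented for this sketch.

```python
import re

# Greedy file group followed by an OPTIONAL extension group.
OPTIONAL = re.compile(r"(?P<file>.+)(?P<type>\..+)?")
# Same groups, but the extension group is REQUIRED.
REQUIRED = re.compile(r"(?P<file>.+)(?P<type>\..+)")


def groups_of(pattern, name: str):
    """Return (file, type) for name; type is None when the group did not take part."""
    match = pattern.match(name)
    return (match.group("file"), match.group("type")) if match else None


if __name__ == "__main__":
    for name in ("hiberfil.sys", "The.Sea.Beast.2022.mp4"):
        print(name, groups_of(OPTIONAL, name), groups_of(REQUIRED, name))
```

With the optional group, the overall match already succeeds once `.+` has consumed the whole name, so the engine has no reason to backtrack and give the extension back. Making the group required forces that backtracking, at the cost of no longer matching names without an extension.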
                          

                          Have you tried a regex debugger? It helps me find out where a regex goes wrong. By the way, I use two tools for regexes:

                          • regex101.com, for quick highlighting of matches and groups; its debugger, which shows the steps one by one, is very useful
                          • Notepad++, to test the output before real use, since the result depends on its Boost engine

                          It’s quite easy : Select a range of characters and, simply, look right after the indication Sel :, in the status bar, at bottom of the Notepad++ window !

                          Uh, I just looked: Sel : shows only a number. Is that the byte count? I don’t see “bytes” anywhere.

                          • guy038
                            guy038 last edited by guy038

                            Hi, @jergen-ross-estaco, @peterjones and All,

                            You are perfectly right about my regex : it did not respect the named groups :-((

                            I finally succeeded in building a correct search regex ( 2 characters longer than yours, if I replace \d with the [0-9] syntax ! )

                            It uses a particular feature, not very well known : the Branch Reset mechanism, with the (?|pattern_1|pattern_2|....|pattern_N) syntax

                            Refer to this link for further explanations :

                            https://www.boost.org/doc/libs/1_78_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html#boost_regex.syntax.perl_syntax.branch_reset


                            So, given this INPUT text :

                                          4.0 GB  C:\pagefile.sys
                                          1.8 GB  C:\Users\Jhecrose\Downloads\uTorrent\Paws.of.Fury.The.Legend.of.Hank.2022.1080p.WEBRip.x264-RARBG\Paws.of.Fury.The.Legend.of.Hank.2022.1080p.WEBRip.x264-RARBG.mp4
                                          1.5 GB  C:\hiberfil.sys
                                        900.1 MB  C:\Games\MBAACC\0002.p
                                        851.8 MB  C:\$MFT
                                        569.6 MB  C:\Program Files (x86)\Steam\steamapps\common\Left 4 Dead 2\left4dead2\addons\workshop\1504837401.vpk
                            
                                          1.8 MB  C:\pqr.tuv.xyz.123
                                          1.8 MB  C:\xyz.123
                                          1.8 MB  C:\xyz
                                          1.8 MB  C:\abc\def\ghi\pqr.tuv.xyz.123
                                          1.8 MB  C:\abc\def\ghi\xyz.123
                                          1.8 MB  C:\abc\def\ghi\xyz
                            

                            The following regex S/R :

                            SEARCH (?-s)^[\t\x20]*(?<size>[+-]?(?:(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)(?:\.[0-9]+)?|\.[0-9]+))[\t\x20]*(?<size_type>(?i)gb|mb|m|g)[\t\x20]*(?<path>(?i:C|D):.*\\)(?|(?<file>.+)(?<type>\..+)|([^.\r\n]+))

                            REPLACE Path      : \3\r\nFile      : \4\r\nExtension : \5\r\n

                            will give this OUTPUT text :

                            Path      : C:\
                            File      : pagefile
                            Extension : .sys
                            
                            Path      : C:\Users\Jhecrose\Downloads\uTorrent\Paws.of.Fury.The.Legend.of.Hank.2022.1080p.WEBRip.x264-RARBG\
                            File      : Paws.of.Fury.The.Legend.of.Hank.2022.1080p.WEBRip.x264-RARBG
                            Extension : .mp4
                            
                            Path      : C:\
                            File      : hiberfil
                            Extension : .sys
                            
                            Path      : C:\Games\MBAACC\
                            File      : 0002
                            Extension : .p
                            
                            Path      : C:\
                            File      : $MFT
                            Extension : 
                            
                            Path      : C:\Program Files (x86)\Steam\steamapps\common\Left 4 Dead 2\left4dead2\addons\workshop\
                            File      : 1504837401
                            Extension : .vpk
                            
                            
                            Path      : C:\
                            File      : pqr.tuv.xyz
                            Extension : .123
                            
                            Path      : C:\
                            File      : xyz
                            Extension : .123
                            
                            Path      : C:\
                            File      : xyz
                            Extension : 
                            
                            Path      : C:\abc\def\ghi\
                            File      : pqr.tuv.xyz
                            Extension : .123
                            
                            Path      : C:\abc\def\ghi\
                            File      : xyz
                            Extension : .123
                            
                            Path      : C:\abc\def\ghi\
                            File      : xyz
                            Extension : 
                            

                            If you prefer to use the named groups in replacement, the following regex S/R :

                            SEARCH (?-s)^[\t\x20]*(?<size>[+-]?(?:(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)(?:\.[0-9]+)?|\.[0-9]+))[\t\x20]*(?<size_type>(?i)gb|mb|m|g)[\t\x20]*(?<path>(?i:C|D):.*\\)(?|(?<file>.+)(?<type>\..+)|([^.\r\n]+))

                            REPLACE Path      : $+{path}\r\nFile      : $+{file}\r\nExtension : $+{type}\r\n

                            will return the same OUTPUT text :

                            Path      : C:\
                            File      : pagefile
                            Extension : .sys
                            
                            Path      : C:\Users\Jhecrose\Downloads\uTorrent\Paws.of.Fury.The.Legend.of.Hank.2022.1080p.WEBRip.x264-RARBG\
                            File      : Paws.of.Fury.The.Legend.of.Hank.2022.1080p.WEBRip.x264-RARBG
                            Extension : .mp4
                            
                            Path      : C:\
                            File      : hiberfil
                            Extension : .sys
                            
                            Path      : C:\Games\MBAACC\
                            File      : 0002
                            Extension : .p
                            
                            Path      : C:\
                            File      : $MFT
                            Extension : 
                            
                            Path      : C:\Program Files (x86)\Steam\steamapps\common\Left 4 Dead 2\left4dead2\addons\workshop\
                            File      : 1504837401
                            Extension : .vpk
                            
                            
                            Path      : C:\
                            File      : pqr.tuv.xyz
                            Extension : .123
                            
                            Path      : C:\
                            File      : xyz
                            Extension : .123
                            
                            Path      : C:\
                            File      : xyz
                            Extension : 
                            
                            Path      : C:\abc\def\ghi\
                            File      : pqr.tuv.xyz
                            Extension : .123
                            
                            Path      : C:\abc\def\ghi\
                            File      : xyz
                            Extension : .123
                            
                            Path      : C:\abc\def\ghi\
                            File      : xyz
                            Extension : 
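
                            For readers who want to try the named-group idea outside Notepad++, here is a minimal sketch in Python. It is an illustration only, not what Notepad++ runs: Python's built-in re module writes named groups as (?P<name>...) and refers to them in the replacement as \g<name>, where the Boost engine uses $+{name}. The names NAMED_RE, TEMPLATE and reformat are made up for this example, and the sketch deliberately leaves out the branch reset part, so lines without an extension are returned unchanged.

```python
import re

# Same idea as the SEARCH regex above, minus the branch reset alternative,
# which Python's stdlib `re` does not support.
NAMED_RE = re.compile(
    r"^[\t ]*"
    r"(?P<size>[+-]?(?:(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)(?:\.[0-9]+)?|\.[0-9]+))"
    r"[\t ]*(?P<size_type>(?i:gb|mb|m|g))"
    r"[\t ]*(?P<path>(?i:C|D):.*\\)"
    r"(?P<file>.+)(?P<type>\..+)$"
)

# Python refers to a named group in the replacement with \g<name>;
# the Boost engine discussed in this thread uses $+{name} instead.
TEMPLATE = r"Path      : \g<path>\nFile      : \g<file>\nExtension : \g<type>"


def reformat(line):
    """Rewrite one listing line; lines that do not match come back unchanged."""
    return NAMED_RE.sub(TEMPLATE, line)


if __name__ == "__main__":
    print(reformat("  900.1 MB  C:\\Games\\MBAACC\\0002.p"))
```

                            The greedy .* in the path group runs up to the last backslash, and the greedy .+ in the file group backs off to the last dot, which is how the file name and extension end up split.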
                            

                            To finish, note that in the status bar, the Sel : field gives you the number of characters (not bytes) of the current selection!

                            BR

                            guy038

                            P.S. :

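                            So, in the part (?|(?<file>.+)(?<type>\..+)|([^.\r\n]+)) :

                            • (?<file>.+) represents the named group file ( group 4 )

                            • (?<type>\..+) represents the named group type ( group 5 )

                            • ([^.\r\n]+) is also group 4, because the branch reset syntax restarts the group numbering in each alternative. When this second alternative matches, group 5 ( type ) stays empty, which is why the Extension field is blank for names such as $MFT or xyz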
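
                            To make the effect of the branch reset more concrete, here is a rough sketch in Python. Python's built-in re module has no (?|...) construct, so this is an emulation under that assumption, not the Boost behaviour itself: the second alternative gets its own name (bare, a name invented for this example) and the two alternatives are merged by hand, which is exactly the bookkeeping that the branch reset saves you in Notepad++. LINE_RE and split_line are also hypothetical names.

```python
import re

# Stdlib `re` has no branch reset group (?|...), so the second alternative
# gets its own (hypothetical) name, `bare`, instead of reusing group 4.
LINE_RE = re.compile(
    r"^[\t ]*"
    r"(?P<size>[+-]?(?:(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)(?:\.[0-9]+)?|\.[0-9]+))"
    r"[\t ]*(?P<size_type>(?i:gb|mb|m|g))"
    r"[\t ]*(?P<path>(?i:C|D):.*\\)"
    r"(?:(?P<file>.+)(?P<type>\..+)|(?P<bare>[^.\r\n]+))"
)


def split_line(line):
    """Return (path, file, extension) for one listing line, or None."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # Merge the two alternatives by hand; with a branch reset both would
    # simply land in the same group number.
    file_name = m.group("file") if m.group("file") is not None else m.group("bare")
    extension = m.group("type") or ""
    return m.group("path"), file_name, extension


if __name__ == "__main__":
    for sample in ("  4.0 GB  C:\\pagefile.sys", "851.8 MB  C:\\$MFT"):
        print(split_line(sample))
```

                            With the branch reset, the replacement can always use \4 (or $+{file}) for the file name, whichever alternative matched; without it, the caller has to check both groups as above.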

                            • Alan Kilborn
                              Alan Kilborn @guy038 last edited by

                              @guy038 said in what are list Regex available "named capture group" for find and replace in current version:

                              Refer to this link for further explanations :
                              https://www.boost.org/doc/libs/1_78_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html#boost_regex.syntax.perl_syntax.branch_reset

                              Why not this LINK as well?:
