Search and replace using named capturing group in regular expression



  • I want to search and replace text in a file using the named capturing group functionality of regular expressions. As an example, in the string “this is a test”, the search string “(this is)(?<name1> a )” successfully matches “this is a ”. But I am not sure how to refer to the named capturing group “name1” in the replace text box. I have tried “\k<name1>”, “${name1}” and multiple other combinations, but all have failed.

    Can someone please help me identify the correct syntax for referring to the named capturing group in the replace text box?

    Thanks,
    Akbar.



  • Try using $+{name1}. I think that \g and \k are just used within the find field to back reference a named group.
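
    For comparison outside Notepad++, here is a minimal Python sketch of the same idea: a named group in the search pattern that is reused in the replacement. Note that Python's re module is a different engine from Boost, so the spelling differs: the group is written (?P<name1>...) and the replacement refers to it as \g<name1>. This is Python syntax only, not what the N++ replace box expects.

```python
import re

text = "this is a test"

# Python spells a named group (?P<name>...), not (?<name>...) as in Boost.
pattern = re.compile(r"(this is)(?P<name1> a )")

# In a Python replacement string, \g<name1> inserts the text captured by name1.
result = pattern.sub(r"[\g<name1>]", text)
print(result)
```

    The whole match “this is a ” is replaced by the captured “ a ” wrapped in brackets, which shows that the replacement only sees the text of the named group.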



  • Hello,
    I tried your solution but it failed :( Maybe I don’t understand your post, or I am using the wrong syntax. Where can I read the current regular expression documentation?
    Thanks.





  • Hello akbarmunir, Владислав Ласский, and All,

    I’ll ONLY refer to the syntaxes, relative to named capturing groups, used by the Boost regex engine, included in N++, of course !


    A) Named capturing groups :

    • Each of the two syntaxes (?<Name>.....) or (?'Name'.....) represents a named capturing group

    • The name must be made up of word characters only ( \w ) and must not exceed 32 characters

    • The name of a capturing group is case-sensitive ! For instance, the capturing groups (?<Digits>\d\d) and (?<digitS>\d\d\d) represent two different groups

    • If a regex contains two or more named capturing groups with the same name, only the first one is taken into account, and all the subsequent groups are ignored


    B) Back-references to previous named capturing groups :

    • Each, of the six syntaxes \g{Name}, \g<Name>, \g'Name', \k{Name}, \k<Name>, \k'Name', represents a back-reference to the named capturing group, of name = “Name”, which must be located BEFORE, in the regex

    So, as there are two forms of named capturing groups and six forms of back-references, the 12 possible syntaxes below, using the named capturing group Test, would find, for instance, the string ABC surrounded by the SAME, non-empty run of digits !

    (?<Test>\d+)ABC\g{Test} , (?<Test>\d+)ABC\g<Test> , (?<Test>\d+)ABC\g'Test'

    (?<Test>\d+)ABC\k{Test} , (?<Test>\d+)ABC\k<Test> , (?<Test>\d+)ABC\k'Test'

    (?'Test'\d+)ABC\g{Test} , (?'Test'\d+)ABC\g<Test> , (?'Test'\d+)ABC\g'Test'

    (?'Test'\d+)ABC\k{Test} , (?'Test'\d+)ABC\k<Test> , (?'Test'\d+)ABC\k'Test'

    So, ANY of these 12 syntaxes, matches the four lines, below :

    1ABC1
    12345ABC12345
    456ABC456
    789ABC789
    

    C) Subroutine calls to a named capturing group :

    • Each of the two syntaxes (?&Name) or (?P>Name) represents a subroutine call to the regex pattern of the named capturing group, of name = “Name”, which may be located BEFORE or AFTER, in the regex

    So, as there are two forms of named capturing groups and two forms of subroutine calls, the 4 possible syntaxes below, using the named capturing group Test, would find, for instance, the string ABC surrounded by non-empty runs of digits !

    (?<Test>\d+)ABC(?&Test) , (?<Test>\d+)ABC(?P>Test)
    (?'Test'\d+)ABC(?&Test) , (?'Test'\d+)ABC(?P>Test)

    So, ANY of the 4 syntaxes matches the nine lines below :

    1ABC1
    12345ABC12345
    456ABC456
    789ABC789
    
    456ABC789
    789ABC456
    0ABC123456789
    0123456789ABC1
    111ABC999
    

    And, as the subroutine call can be located BEFORE its associated named capturing group, the 4 syntaxes, below, are also valid ones and would find the nine lines above, too !

    (?&Test)ABC(?<Test>\d+) , (?P>Test)ABC(?<Test>\d+)
    (?&Test)ABC(?'Test'\d+) , (?P>Test)ABC(?'Test'\d+)


    It’s important to fully understand the fundamental difference between a back-reference and a subroutine call to a group, named or not :

    • A back-reference, to a group, represents the present match of this group

    • A subroutine call, to a group, represents the regex pattern of this group

    For instance, the regex (?<Test>\d+)ABC\g<Test> would match only the first four lines of the above example ( In other words, the numbers surrounding the string ABC have to be identical, because the form \g<Test> is, simply, a reference to the present number, preceding the string ABC )

    Whereas the regex (?<Test>\d+)ABC(?&Test) would match all the lines of the above example ( In other words, the numbers, surrounding the string ABC, may be different. Indeed, the form (?&Test) is strictly equal to \d+, the pattern of the group Test ! )
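
    The difference can also be sketched in Python, with two caveats. Python's re module is not the Boost engine: it writes the group as (?P<Test>...) and the back-reference as (?P=Test), and it has no subroutine call at all. So the second regex below only emulates (?&Test) by pasting the same pattern text \d+ a second time. The names DIGITS, backref and subroutine_like are just illustrative.

```python
import re

DIGITS = r"\d+"  # the pattern of the group Test

# Back-reference: (?P=Test) must repeat the exact text that Test captured.
backref = re.compile(rf"(?P<Test>{DIGITS})ABC(?P=Test)")

# Emulated subroutine call: the pattern \d+ itself is applied a second time,
# so the digits after ABC may differ from the digits before it.
subroutine_like = re.compile(rf"(?P<Test>{DIGITS})ABC{DIGITS}")

lines = ["1ABC1", "12345ABC12345", "456ABC789", "0ABC123456789", "111ABC999"]

# fullmatch is used so that each test string must match as a whole line.
backref_hits = [s for s in lines if backref.fullmatch(s)]
subroutine_hits = [s for s in lines if subroutine_like.fullmatch(s)]

print(backref_hits)
print(subroutine_hits)
```

    Only the lines with identical numbers on both sides of ABC end up in backref_hits, whereas every sample line ends up in subroutine_hits.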

    Best Regards,

    guy038

    P.S. :

    When a subroutine call is located inside the parentheses of the group to which it refers, it operates as a recursive pattern. But this is another story … :-)



  • @gerdb42 said:

    http://www.boost.org/doc/libs/1_55_0/libs/regex/doc/html/boost_regex/format/boost_format_syntax.html

    Hello,
    How can I change many lines at once, BUT in many files?
    For example, 10000 files in a folder.
    For example, I search for 3 lines and need to put 10 different lines in their place.
    In the past this was possible with the program Homesite (including regex, wildcards and so on), but its development stopped and it destroys UTF-8 documents.

    Can I find MANY lines with a REGEX like the ones above?
    The bigger problem seems to be inserting many lines, because the “Search / Replace in files” function allows only 1 line.

    Please help
    Mayer



  • “Search / Replaces in files”-function allows only 1 line.

    Who said so? If you know where the line breaks will occur, use the \R pattern as a placeholder for them. Or check the option “. finds \r and \n”.

    In the replacement, insert \r\n at the places where you want line breaks.

    When using “. finds \r and \n”, pay special attention to greedy/non-greedy repeats.
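
    If the job ever has to be scripted outside Notepad++, the same approach (match the line breaks explicitly, write \r\n in the replacement) might look like the Python sketch below. Everything in it is an assumption for illustration: the function name replace_in_folder, the "*.txt" filter and the UTF-8 encoding are hypothetical choices. Python's re has no \R, so the line breaks in the search pattern are written as \r?\n instead.

```python
import re
from pathlib import Path


def replace_in_folder(folder, pattern, replacement, glob="*.txt"):
    """Apply one regex replacement to every matching file; return the changed paths."""
    regex = re.compile(pattern)
    changed = []
    for path in sorted(Path(folder).glob(glob)):
        # newline="" keeps the original \r\n or \n line endings untouched.
        with open(path, encoding="utf-8", newline="") as fh:
            old = fh.read()
        new = regex.sub(replacement, old)
        if new != old:
            with open(path, "w", encoding="utf-8", newline="") as fh:
                fh.write(new)
            changed.append(path)
    return changed
```

    A call such as replace_in_folder("some_folder", r"old1\r?\nold2\r?\nold3", r"new1\r\nnew2\r\nnew3\r\nnew4") would replace a block of three lines with four lines in every .txt file of that folder. Try it on a copy of the files first, since the files are overwritten in place.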

