Regular expression: can we get the number of repeats of a {n,m} quantifier for the replacement?



  • For example, I want to replace Aaaaaa…B with Accccc…B using a regular expression. I can match the text with a pattern such as A(a{5,})B, but how can I replace it with the same number of c letters, as if the replacement were A(c{5,})B?



  • I haven’t done this myself but this might give you the info you need.



  • Hello, @jakang-chen and All,

    Apparently, you need to search for any run of consecutive lowercase letters a, embedded between the uppercase letters A and B, and replace each of them with the lowercase letter c.

    Here is a possible solution. Consider the sample text below:

    aaa    AB              CB              AC              BC               AB              aaa
    aaa    AaB             CaB             AaC             BaC              AaB             aaa
    aaa    AaaB            CaaB            AaaC            BaaC             AaaB            aaa
    aaa    AaaaB           CaaaB           AaaaC           BaaaC            AaaaB           aaa
    aaa    AaaaaB          CaaaaB          AaaaaC          BaaaaC           AaaaaB          aaa
    aaa    AaaaaaB         CaaaaaB         AaaaaaC         BaaaaaC          AaaaaaB         aaa
    aaa    AaaaaaaB        CaaaaaaB        AaaaaaaC        BaaaaaaC         AaaaaaaB        aaa
    aaa    AaaaaaaaB       CaaaaaaaB       AaaaaaaaC       BaaaaaaaC        AaaaaaaaB       aaa
    aaa    AaaaaaaaaB      CaaaaaaaaB      AaaaaaaaaC      BaaaaaaaaC       AaaaaaaaaB      aaa
    aaa    AaaaaaaaaaB     CaaaaaaaaaB     AaaaaaaaaaC     BaaaaaaaaaC      AaaaaaaaaaB     aaa
    aaa    AaaaaaaaaaaB    CaaaaaaaaaaB    AaaaaaaaaaaC    BaaaaaaaaaaC     AaaaaaaaaaaB    aaa
    

    If you run this regex S/R :

    SEARCH (?-si)(A\K|\G)a(?=\w*?B)

    REPLACE c

    You should get your expected text :

    aaa    AB              CB              AC              BC               AB              aaa
    aaa    AcB             CaB             AaC             BaC              AcB             aaa
    aaa    AccB            CaaB            AaaC            BaaC             AccB            aaa
    aaa    AcccB           CaaaB           AaaaC           BaaaC            AcccB           aaa
    aaa    AccccB          CaaaaB          AaaaaC          BaaaaC           AccccB          aaa
    aaa    AcccccB         CaaaaaB         AaaaaaC         BaaaaaC          AcccccB         aaa
    aaa    AccccccB        CaaaaaaB        AaaaaaaC        BaaaaaaC         AccccccB        aaa
    aaa    AcccccccB       CaaaaaaaB       AaaaaaaaC       BaaaaaaaC        AcccccccB       aaa
    aaa    AccccccccB      CaaaaaaaaB      AaaaaaaaaC      BaaaaaaaaC       AccccccccB      aaa
    aaa    AcccccccccB     CaaaaaaaaaB     AaaaaaaaaaC     BaaaaaaaaaC      AcccccccccB     aaa
    aaa    AccccccccccB    CaaaaaaaaaaB    AaaaaaaaaaaC    BaaaaaaaaaaC     AccccccccccB    aaa
    

    It’s easy to verify that the contents have changed only inside the individual ranges A............B
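    For readers who want to cross-check this result outside Notepad++, here is a minimal Python sketch. It is not the same regex: as far as I know, Python's built-in re module supports neither \K nor \G, so this version matches the whole run of a letters after an A and rebuilds it with a replacement function. The name replace_run is hypothetical, and the sketch is only meant to give the same result on sample text like the one above.

```python
import re

# Assumption: the built-in re module has no \K or \G, so the whole run of
# a's that directly follows "A" is matched at once and rebuilt in a callback.
PATTERN = re.compile(r"(?<=A)a+(?=\w*B)")

def replace_run(text: str) -> str:
    """Turn every 'a' of a run that follows 'A' and leads to a 'B' into 'c'."""
    return PATTERN.sub(lambda m: "c" * len(m.group()), text)

sample = "aaa    AaaaB    CaaaB    AaaaC    BaaaC    AaB    aaa"
print(replace_run(sample))
```

    The callback receives each matched run and returns as many c letters as the run is long, which is what the one-letter-at-a-time S/R achieves in Notepad++.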

    Notes :

    • The in-line modifier (?-i) ensures that the search is processed in a case-sensitive way

    • The in-line modifier (?-s) forces the regex engine to consider any dot ( . ) as representing a single standard character ( and not an EOL char ). This regex contains no dot, so the modifier is only a precaution

    • The A obviously matches the literal uppercase letter A

    • Then the \K syntax discards everything matched so far from the final match and resets the regex engine working position, so the letter A itself is not part of the text to replace

    • If a letter A is not found, the second part of the alternative, \G, which represents the zero-length location right after the previous match, is invoked

    • Now, the regex engine tries to match the lower-case letter a, but ONLY IF the positive look-ahead (?=\w*?B) is true, i.e. this letter is followed by any range, even empty, of word characters, up to an upper-case letter B

    • If so, it is simply replaced with the lower-case letter c
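    Regarding the original question about reusing the repeat count of a{5,}: the S/R above replaces runs of any length, not only runs of five or more. In a language with replacement callbacks, the count can be read directly from the captured group. Below is a hedged Python sketch of that idea. It is a different technique from the Notepad++ regex, and the function name swap_run and its minimum parameter are hypothetical.

```python
import re

def swap_run(text: str, minimum: int = 5) -> str:
    """Replace A + (at least `minimum` a's) + B with A + as many c's + B."""
    pattern = re.compile(r"A(a{%d,})B" % minimum)
    # len(m.group(1)) is the number of repeats the quantifier actually matched.
    return pattern.sub(lambda m: "A" + "c" * len(m.group(1)) + "B", text)

short_run = "A" + "a" * 4 + "B"   # below the minimum, left untouched
long_run = "A" + "a" * 7 + "B"    # seven a's become seven c's
print(swap_run(short_run), swap_run(long_run))
```

    Here the lower bound of the quantifier is kept, so runs shorter than five letters are left alone.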

    Best Regards,

    guy038

