Regular expression : Can we get number of repeats of {n,m} Quantifier for replacement?

Developing_TE

For example, I want to replace Aaaaaa…B with Accccc…B using a regular expression. I can match the text with the pattern A(a{5,})B, but how can I replace it with the same number of c characters, as in A(c{5,})B ?

Ekopalypse

I haven’t done this myself but this might give you the info you need.

guy038

Hello, @jakang-chen and All,

Seemingly, you need to search for any range of consecutive lowercase letters a, embedded between the uppercase letters A and B, and replace each of them with the lowercase letter c.

Here is a possible solution :

So, given this sample text, below :

aaa    AB              CB              AC              BC               AB              aaa
aaa    AaB             CaB             AaC             BaC              AaB             aaa
aaa    AaaB            CaaB            AaaC            BaaC             AaaB            aaa
aaa    AaaaB           CaaaB           AaaaC           BaaaC            AaaaB           aaa
aaa    AaaaaB          CaaaaB          AaaaaC          BaaaaC           AaaaaB          aaa
aaa    AaaaaaB         CaaaaaB         AaaaaaC         BaaaaaC          AaaaaaB         aaa
aaa    AaaaaaaB        CaaaaaaB        AaaaaaaC        BaaaaaaC         AaaaaaaB        aaa
aaa    AaaaaaaaB       CaaaaaaaB       AaaaaaaaC       BaaaaaaaC        AaaaaaaaB       aaa
aaa    AaaaaaaaaB      CaaaaaaaaB      AaaaaaaaaC      BaaaaaaaaC       AaaaaaaaaB      aaa
aaa    AaaaaaaaaaB     CaaaaaaaaaB     AaaaaaaaaaC     BaaaaaaaaaC      AaaaaaaaaaB     aaa
aaa    AaaaaaaaaaaB    CaaaaaaaaaaB    AaaaaaaaaaaC    BaaaaaaaaaaC     AaaaaaaaaaaB    aaa

If you run this regex S/R :

SEARCH (?-si)(A\K|\G)a(?=\w*?B)

REPLACE c

You should get your expected text :

aaa    AB              CB              AC              BC               AB              aaa
aaa    AcB             CaB             AaC             BaC              AcB             aaa
aaa    AccB            CaaB            AaaC            BaaC             AccB            aaa
aaa    AcccB           CaaaB           AaaaC           BaaaC            AcccB           aaa
aaa    AccccB          CaaaaB          AaaaaC          BaaaaC           AccccB          aaa
aaa    AcccccB         CaaaaaB         AaaaaaC         BaaaaaC          AcccccB         aaa
aaa    AccccccB        CaaaaaaB        AaaaaaaC        BaaaaaaC         AccccccB        aaa
aaa    AcccccccB       CaaaaaaaB       AaaaaaaaC       BaaaaaaaC        AcccccccB       aaa
aaa    AccccccccB      CaaaaaaaaB      AaaaaaaaaC      BaaaaaaaaC       AccccccccB      aaa
aaa    AcccccccccB     CaaaaaaaaaB     AaaaaaaaaaC     BaaaaaaaaaC      AcccccccccB     aaa
aaa    AccccccccccB    CaaaaaaaaaaB    AaaaaaaaaaaC    BaaaaaaaaaaC     AccccccccccB    aaa

It’s easy to verify that the contents have changed only within the individual ranges A............B

Notes :

The in-line modifier (?-i) ensures that the search is processed in a case-sensitive way
The in-line modifier (?-s) forces the regex engine to consider any dot ( . ) as representing a single standard character ( and not an EOL char ). This regex contains no dot, so here it is only a precaution
The A obviously matches the literal uppercase letter A
Then the \K syntax discards everything matched so far from the reported match, so the A itself is not part of the text that gets replaced
If a letter A is not found at the current position, the second branch of the alternation, \G, which represents the zero-length location right after the previous match, is tried instead
Now, the regex engine tries to match the lowercase letter a, but ONLY IF the positive look-ahead (?=\w*?B) is true, i.e. this letter is followed by any range, even empty, of word characters, up to an uppercase letter B
If so, it is simply replaced with the lowercase letter c
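
For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch. It is not the S/R above: Python's built-in re module supports neither \K nor \G, so this version captures the whole run of a and uses a replacement callback to emit one c per captured a. The names replace_run and min_run are made up for this illustration, and min_run is only there to mirror the {5,} from the original question.

```python
import re

def replace_run(text: str, min_run: int = 1) -> str:
    """Replace a run of 'a' right after 'A' with the same number of 'c',
    provided an uppercase 'B' follows later in the same word.

    Hypothetical helper for illustration only; min_run is an assumed
    parameter standing in for the lower bound of the {n,} quantifier.
    """
    # A            literal uppercase A (kept by the callback)
    # (a{n,})      the run of lowercase a, captured so its length is known
    # (?=\w*?B)    look-ahead: only word characters up to an uppercase B
    pattern = r"A(a{%d,})(?=\w*?B)" % min_run
    # The callback answers the original question: the number of repeats
    # is simply the length of the captured group.
    return re.sub(pattern, lambda m: "A" + "c" * len(m.group(1)), text)

sample = "aaa    AaaaB    CaaaB    AaaaC    BaaaC    AaaaaaB    aaa"
print(replace_run(sample))
# aaa    AcccB    CaaaB    AaaaC    BaaaC    AcccccB    aaa
```

With min_run=5, only runs of five or more a are touched, so AaaaB would be left alone while AaaaaaB becomes AcccccB.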

Best Regards,

guy038