Regular expression: replace while keeping part of the match

yinjason

drop table abc ;
drop table def ;

The result should be:
drop table before abc after ;
drop table before def after ;

The find would be: drop table \w+ ;
but what should the replacement be?

Terry R

I think your find would be
^(drop table)\h\b(.+?)\b\h;
And replace would be
\1\hbefore\h\2\hafter\h;

So we look for the start of a line followed by drop table, a space, then a word boundary with some characters following until another boundary, then a space and the ;. We save the relevant text and reuse it through \1 and \2, inserting the additional text as needed.

Depending on the actual data, some adjustments may be needed, which I hope you can figure out, especially for the before and after replacement text.
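
For anyone who wants to try the capture-group idea outside Notepad++, here is a minimal Python sketch. It is only an illustration under a few assumptions: Python's re module has no \h, so [ \t] stands in for it, re.MULTILINE is needed so that ^ matches at every line start, and the blanks in the replacement are typed as literal spaces.

```python
import re

text = "drop table abc ;\ndrop table def ;"

# Group 1 keeps "drop table", group 2 keeps the table name.
# [ \t] is a stand-in for Notepad++'s \h (assumption: only spaces or tabs occur).
pattern = re.compile(r"^(drop table)[ \t](.+?)[ \t];", re.MULTILINE)

# \1 and \2 re-insert the saved text; the spaces are typed literally.
result = pattern.sub(r"\1 before \2 after ;", text)
print(result)
```

The lazy (.+?) stops at the first blank that is followed by ;, which is why the table name ends up alone in group 2.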

Terry

guy038

Hello @yinjason, @terry-r and All,

Yinjason, regarding the use of unnamed groups in the replacement, refer to this other post :

https://notepad-plus-plus.org/community/topic/12342/what-this-regex-doing/2

Terry-R, I would like to make two points :

1) In your search regex, I think that the \b assertion is useless in this specific regex ! Indeed, the \b assertion is a zero-length location between, either :

A NON-word character and a WORD character
A WORD character and a NON-word character

Assuming that a WORD character belongs, by default, in the ANSI / Windows-1252 encoding, to the range, below :

[0-9A-Z_a-zƒŠŒŽšœžŸª²³µ¹ºÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ] identical to \w

And that, of course, the very beginning and the very end of the file, as well as the beginnings and ends of lines, are considered as NON-word characters or NON-word locations

So the part of your regex \h\b(.+?)\b\h can be shortened to \h(.+?)\h as there is, automatically, a \b location right before the first WORD char AND right after the last WORD char of the range (.+?), as long as the table name begins and ends with a WORD char, as it does here !
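
The point about \b can be checked with a small Python sketch. This is an illustration only, not Notepad++ itself: [ \t] replaces \h, which Python's re module does not have, and the #tmp line is a made-up example showing the one case where the two patterns differ.

```python
import re

# Zero-length \b locations in "abc ;": before "a" and after "c".
positions = [m.start() for m in re.finditer(r"\b", "abc ;")]
print(positions)  # [0, 3]

with_b = re.compile(r"^(drop table)[ \t]\b(.+?)\b[ \t];")
without_b = re.compile(r"^(drop table)[ \t](.+?)[ \t];")

# On the data from the question, both patterns capture the same groups.
lines = ["drop table abc ;", "drop table def ;"]
same = all(with_b.match(s).groups() == without_b.match(s).groups() for s in lines)
print(same)

# Hypothetical name starting with a NON-word char: there is no \b between
# the space and "#", so only the pattern without \b matches.
odd = "drop table #tmp ;"
print(with_b.match(odd) is None, without_b.match(odd).group(2))
```

So dropping \b is safe for names made of WORD characters; it only changes the outcome when a name starts or ends with a NON-word character.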

2) In your replacement regex, unfortunately, the \h escape sequence does not represent any sort of blank character, just the literal lowercase letter h !

Indeed, in a regex replacement, the allowed escape sequences, apart from the hexadecimal or octal notation, are :

\r = \x0D (CR CARRIAGE RETURN )
\n = \x0A (LF LINE FEED )
\t = \x09 (TAB TABULATION )
\e = \x1B (ESC ESCAPE )
\a = \x07 (BEL BELL )
\v = \x0B (VT VERTICAL TABULATION )
\f = \x0C (FF FORM FEED )

So, we need to use either \t for a tabulation char, or \x20 ( or a true space ) for a space char

Of course, if you don’t know which blank character surrounds the last group, and want to rewrite it in the replacement, a possible regex could be :

SEARCH : ^(drop table)(\h)(.+?)\h;

REPLACE : \1\2before\2\3\2after\2;
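
The idea of capturing the blank and re-inserting it can be sketched in Python as follows. Again this is only an approximation under stated assumptions: [ \t] stands in for \h, and the tab-separated sample line is made up for the demonstration. The captured blank (group 2) is written back around each inserted word, so the original separator is preserved.

```python
import re

# One line separated by spaces, one by tabs (the second is a made-up sample).
text = "drop table abc ;\ndrop table\tdef\t;"

# Group 1: "drop table", group 2: the blank that follows it, group 3: the name.
pattern = re.compile(r"^(drop table)([ \t])(.+?)[ \t];", re.MULTILINE)

# Group 2 is reused wherever a separator is needed in the output.
result = pattern.sub(r"\1\2before\2\3\2after\2;", text)
print(repr(result))
```

With this input the first line keeps its spaces and the second keeps its tabs.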

Best Regards,

guy038

Terry R

Thanks for the correction guy038. It was late at night, and I didn’t test my answer before posting it. Funnily enough, I had thought of capturing the \h and reusing it, but I never included that in my solution.

Terry