Find and Replace: Multiple Replacements in Part of a String
-
Hi,
I’m using the built-in find and replace tool (CTRL+H) with case sensitivity and regular expressions turned on. I’m limited to vanilla Notepad++ (no plugins, no Python, etc.).
I have a file with sets of IDs and strings separated by a comma, structured like so:
ID, String
12345-01, A+A*2B+B*+A
12345-02, A+AB+B*+AA
I want to make the following replacements but only in the string (after the comma).
+A* --> 1
+A --> a
+B* --> 2
+B --> b
2 --> X
As an example, “12345-01, A+A*2B+B*+A” should be changed to “12345-01, A1XB2a”.
Now if the ID was not there the following works like a charm:
Search for: (\+A\*)|(\+A)|(\+B\*)|(\+B)|(2)
Replace with: (?{1}1)(?{2}a)(?{3}2)(?{4}b)(?{5}X)
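For anyone who wants to check this behaviour outside Notepad++, here is a minimal Python sketch of the same single-pass idea. It is an approximation only: Python's `re` module is not the Boost engine that Notepad++ uses and has no `(?{N}...)` conditional replacement, so a callback plays that role. The names `PATTERN`, `REPLACEMENTS` and `replace_tokens` are made up for this illustration.

```python
import re

# Same alternation as the search expression above. "+A*" is listed before
# "+A" so that the longer token is tried first.
PATTERN = re.compile(r"(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2)")

# Replacement text for each capture group, keyed by group number.
REPLACEMENTS = {1: "1", 2: "a", 3: "2", 4: "b", 5: "X"}

def replace_tokens(text: str) -> str:
    # Exactly one group takes part in each match; m.lastindex is its number.
    return PATTERN.sub(lambda m: REPLACEMENTS[m.lastindex], text)

print(replace_tokens("A+A*2B+B*+A"))            # -> A1XB2a
print(replace_tokens("12345-01, A+A*2B+B*+A"))  # -> 1X345-01, A1XB2a
```

Because everything happens in one pass, the "2" written for `+B*` is never turned into "X" afterwards. The second call shows the problem described next: the "2" inside the ID is replaced as well.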
However, when the ID is present I cannot seem to find a solution that will leave the ID unchanged while making all the replacements in the string.
Do you have any suggestions?
-
@Anos said in Find and Replace: Multiple Replacements in Part of a String:
that will leave the ID unchanged
A quick “fix” might be to use:
(\+A\*)|(\+A)|(\+B\*)|(\+B)|(\b2\b)
So here the 2 must be at a “boundary”. For the 2 example lines provided it does work; however, 2 examples does NOT a book make! It will depend on whether each “2” in the rest of the expression has a non-word character (or the edge of the line) on both sides.
Terry
-
@Terry-R said in Find and Replace: Multiple Replacements in Part of a String:
So here the 2 must be at a “boundary”
Sorry, jumped the gun slightly: it did work on the 2nd example, but I missed that it didn’t work on the first example. Yes, it IS a bit of a poser. It will involve a bit more thought.
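To see why the boundary version misses the first example, here is a small Python sketch. It assumes Python's `\b` behaves like the one in Notepad++ for these plain ASCII characters; the name `BOUNDED_TWO` is made up for the illustration.

```python
import re

# \b2\b only matches a "2" that has no letter, digit or underscore
# directly before it AND directly after it.
BOUNDED_TWO = re.compile(r"\b2\b")

for line in ("12345-01, A+A*2B+B*+A", "12345-02, A+AB+B*+AA"):
    print(line, "->", BOUNDED_TWO.findall(line))
# Both lines print an empty list.

print(BOUNDED_TWO.findall("A+2+B"))  # -> ['2']
```

In the first line the "2" after the comma is followed by "B", a word character, so it is skipped and never becomes "X". The second line only looked fine because its string part contains no "2" at all.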
Terry
-
@Terry-R said in Find and Replace: Multiple Replacements in Part of a String:
It will involve a bit more thought.
Sorry about that false start; I think I now have it. We have
FW:(?-s)((\+A\*)|(\+A)|(\+B\*)|(\+B)|(2))(?!.*?,)
RW:(?{2}1)(?{3}a)(?{4}2)(?{5}b)(?{6}X)
So as I had to add a negative lookahead the bracket numbering all changed hence a new replace with code as well.
So basically, whenever it finds a match, so long as there is no “,” after it on the line, it will be changed. As the ID is before the “,” nothing there should be changed.
Terry
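The negative lookahead can be tried out with a short Python sketch. This is an approximation under stated assumptions: Python's `re` is not the Notepad++ engine, a callback stands in for the `(?{N}...)` replacement, and the outer group is written as non-capturing here so that the group numbers line up with the mapping. `PATTERN`, `REPLACEMENTS` and `replace_after_comma` are illustrative names.

```python
import re

# (?!.*,) lets a token match only if no comma follows it on the same line.
# In Python "." does not match a newline unless re.DOTALL is set, which
# corresponds to the (?-s) switch in the Notepad++ expression.
PATTERN = re.compile(r"(?:(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2))(?!.*,)")

REPLACEMENTS = {1: "1", 2: "a", 3: "2", 4: "b", 5: "X"}

def replace_after_comma(text: str) -> str:
    return PATTERN.sub(lambda m: REPLACEMENTS[m.lastindex], text)

data = "12345-01, A+A*2B+B*+A\n12345-02, A+AB+B*+AA"
print(replace_after_comma(data))
# 12345-01, A1XB2a
# 12345-02, AaB2aA
```

The "2" inside each ID still has a comma ahead of it on its line, so the lookahead rejects it and the ID is left alone.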
PS should have paid more attention to your statement
I want to make the following replacements but only in the string (after the comma).
-
@Anos
I came up with this; seems to work but maybe has holes:
find:
(^[^,]+,)|(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2)
repl:(?{1}\1)(?{2}1)(?{3}a)(?{4}2)(?{5}b)(?{6}X)
The result of the replacement with it:
12345-01, A1XB2a
12345-02, AaB2aA
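The same "consume the prefix and put it back" idea can be sketched in Python. Again this is only an approximation: a callback replaces the `(?{N}...)` syntax, `re.MULTILINE` is assumed so that `^` matches at every line start, and `PATTERN`, `REPLACEMENTS`, `pick` and `replace_keep_prefix` are illustrative names.

```python
import re

# Group 1 swallows everything up to and including the first comma of a
# line, so the tokens inside the ID are never seen by the other groups.
PATTERN = re.compile(
    r"(^[^,]+,)|(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2)",
    re.MULTILINE,
)

REPLACEMENTS = {2: "1", 3: "a", 4: "2", 5: "b", 6: "X"}

def pick(m: re.Match) -> str:
    if m.group(1) is not None:
        return m.group(1)            # put the ID prefix back unchanged
    return REPLACEMENTS[m.lastindex]

def replace_keep_prefix(text: str) -> str:
    return PATTERN.sub(pick, text)

data = "12345-01, A+A*2B+B*+A\n12345-02, A+AB+B*+AA"
print(replace_keep_prefix(data))
# 12345-01, A1XB2a
# 12345-02, AaB2aA
```

One possible hole: `[^,]+` can also run across line breaks, so a line with no comma at all would be swallowed together with the start of the following line.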
-
@Terry-R said in Find and Replace: Multiple Replacements in Part of a String:
So as I had to add a negative lookahead the bracket numbering all changed
You could have made the wrapping parentheses a non-capturing group:
(?-s)(?:(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2))(?!.*?,)
to avoid the renumbering in the replacement.
TIMTOWTDI
-
@Alan-Kilborn said in Find and Replace: Multiple Replacements in Part of a String:
find: (^[^,]+,)|(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2)
repl: (?{1}\1)(?{2}1)(?{3}a)(?{4}2)(?{5}b)(?{6}X)
I vote for yours. As an interesting aside, using regex101.com and inputting the 2 example lines and the Find What code, my code took twice as long as @Alan-Kilborn’s to process. It’s obvious the lookahead is where the extra time is spent.
For a small file to process it may not mean a lot, but sometimes efficiency in coding can be an advantage, hence my vote for @Alan-Kilborn’s code.
Terry
-
@Terry-R This does seem to work as intended, at least with my limited testing. Thank you very much for your quick replies. I have never really familiarized myself with lookaheads, they certainly look useful though.
-
@Anos said in Find and Replace: Multiple Replacements in Part of a String:
I have never really familiarized myself with lookaheads
There are LOTS of wonderful things to try and remember, as @PeterJones just reminded me. I should have made that a non-capture group, then it would not have required a rejig of the replace with code.
As I always say
“The day you stop learning is the day you die”
Terry
-
@Alan-Kilborn Thank you for this solution. This also gets the job done, and as @Terry-R points out it seems to be more efficient.
-
Hello, @anos, @terry-r, @alan-kilborn, @peterjones and All,
And here is my solution !
If we use the FREE-SPACING mode (?x), for the SEARCH part :
SEARCH      (?x-s)  (?:  ( \+A  (\*)? )  |  ( \+B  (\*)? )  |  (2)  )  (?!.*,)
Groups -->          No   1      2           3      4           5       Look-Ahead
REPLACE     (?1(?{2}1:a))(?3(?{4}2:b))?5X
BEWARE that, in the REPLACE part, the FREE-SPACING mode is FORBIDDEN. So, ONLY for INFO :
REPLACE     ( ?1 ( ?{2} 1 : a ) ) ( ?3 ( ?{4} 2 : b ) ) ?5 X
and given the data :
12345-01, A+A*2B+B*+A
12345-02, A+AB+B*+AA
it would return :
12345-01, A1XB2a
12345-02, AaB2aA
Notes :
- The first part (?x-s) of the regex search means that :
  - The free-spacing mode is set ( Spaces are not taken into account, except for the [ ] syntax or an escaped space char )
  - Due to the (?-s) syntax, the dot regex symbol matches a single standard char only ( not an EOL char )
- Then, the (?:......) syntax defines a non-capturing group
- Now, in this non-capturing group, we have 3 alternatives and the first two contain an optional inner group (\*)? ( Remember that the ? is another form of the {0,1} quantifier )
- To end, all this regex, so far, will match ONLY IF the final negative look-ahead structure (?!.*,) is verified, that is to say if, at the current position reached by the regex engine, there is never a comma at any further position in the current line
- Now, in the replacement regex :
  - The (?1(?{2}1:a)) syntax means that if group 1 exists, then if group 2 exists, then write 1 else write a
  - The (?3(?{4}2:b)) syntax means that if group 3 exists, then if group 4 exists, then write 2 else write b
  - Finally, the ?5X means that if group 5 exists, then write an X ( The parentheses are not mandatory as this part ends the regex )
  - Note also that it’s not necessary to surround the groups 1, 3 and 5 with braces as these groups are not immediately followed by a digit !
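The nested-group logic described in these notes can be mirrored in a short Python sketch. It is only an approximation: `re.VERBOSE` is assumed to play the role of the free-spacing mode (?x), and the nested conditional replacement is written out as an ordinary function because Python's `re` has no `(?1...:...)` replacement syntax. `PATTERN`, `pick` and `replace_nested` are illustrative names.

```python
import re

PATTERN = re.compile(
    r"""
    (?:
        ( \+A (\*)? )   # groups 1 and 2: +A with an optional star
      | ( \+B (\*)? )   # groups 3 and 4: +B with an optional star
      | (2)             # group 5: a literal 2
    )
    (?!.*,)             # no comma later on the same line
    """,
    re.VERBOSE,
)

def pick(m: re.Match) -> str:
    if m.group(1) is not None:       # same test as (?1 ... )
        return "1" if m.group(2) is not None else "a"
    if m.group(3) is not None:       # same test as (?3 ... )
        return "2" if m.group(4) is not None else "b"
    return "X"                       # only group 5 is left

def replace_nested(text: str) -> str:
    return PATTERN.sub(pick, text)

data = "12345-01, A+A*2B+B*+A\n12345-02, A+AB+B*+AA"
print(replace_nested(data))
# 12345-01, A1XB2a
# 12345-02, AaB2aA
```

An optional group that took no part in the match is reported as `None` in Python, which is what the "if group N exists" tests check here.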
-
Best Regards,
guy038
-