Find and Replace: Multiple Replacements in Part of a String
-
Hi,
I’m using the built-in find and replace tool (CTRL+H) with case sensitivity turned on and regular expressions enabled. I’m limited to vanilla Notepad++ (no plugins, no Python, etc.).
I have a file with sets of IDs and strings separated by a comma, structured like so:
ID, String
12345-01, A+A*2B+B*+A
12345-02, A+AB+B*+AA
I want to make the following replacements, but only in the string (after the comma):
+A* --> 1
+A --> a
+B* --> 2
+B --> b
2 --> X
As an example, “12345-01, A+A*2B+B*+A” should be changed to “12345-01, A1XB2a”.
Now if the ID was not there the following works like a charm:
Search for: (\+A\*)|(\+A)|(\+B\*)|(\+B)|(2)
Replace with: (?{1}1)(?{2}a)(?{3}2)(?{4}b)(?{5}X)
However, when the ID is present, I cannot seem to find a solution that will leave the ID unchanged while making all the replacements in the string.
Do you have any suggestions?
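For readers who want to reproduce the problem outside Notepad++, here is a minimal Python sketch. It is only an emulation: Python's `re` has no `(?{1}1)` conditional replacement, so a lookup table keyed on the matched group stands in for it, and the function name `replace_everywhere` is made up for this illustration.

```python
import re

# The alternation from the question; Python's `re` accepts it unchanged.
PATTERN = re.compile(r"(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2)")

# Stand-in for the replacement (?{1}1)(?{2}a)(?{3}2)(?{4}b)(?{5}X).
REPLACEMENTS = {1: "1", 2: "a", 3: "2", 4: "b", 5: "X"}


def replace_everywhere(line: str) -> str:
    # Exactly one of the five groups takes part in each match,
    # and m.lastindex is the number of that group.
    return PATTERN.sub(lambda m: REPLACEMENTS[m.lastindex], line)


print(replace_everywhere("A+A*2B+B*+A"))            # A1XB2a
print(replace_everywhere("12345-01, A+A*2B+B*+A"))  # 1X345-01, A1XB2a
```

The second call shows the issue: the 2 inside the ID is replaced as well.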
-
@Anos said in Find and Replace: Multiple Replacements in Part of a String:
that will leave the ID unchanged
A quick “fix” might be to use:
(\+A\*)|(\+A)|(\+B\*)|(\+B)|(\b2\b)
So here the 2 must be at a “boundary”. For the 2 example lines provided it does work; however, 2 examples does NOT a book make! It will depend on whether the “2” in the rest of the expression has a non-word character (or the start/end of the line) on both sides.
Terry
-
@Terry-R said in Find and Replace: Multiple Replacements in Part of a String:
So here the 2 must be at a “boundary”
Sorry, I jumped the gun slightly: it did work on the 2nd example, but I missed that it didn’t work on the first example. Yes, it IS a bit of a poser. It will involve a bit more thought.
Terry
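A small Python sketch of why the `\b2\b` attempt fails on the first example (again only an emulation of the Notepad++ replacement, with a hypothetical function name): in `*2B` the 2 is followed by the word character B, so there is no boundary after it and the 2 in the string is skipped.

```python
import re

# Same alternation, but the 2 must sit between two word boundaries.
PATTERN = re.compile(r"(\+A\*)|(\+A)|(\+B\*)|(\+B)|(\b2\b)")
REPLACEMENTS = {1: "1", 2: "a", 3: "2", 4: "b", 5: "X"}


def replace_with_boundary(line: str) -> str:
    # m.lastindex is the number of the one group that matched.
    return PATTERN.sub(lambda m: REPLACEMENTS[m.lastindex], line)


# The ID is left alone, but so is the 2 that should have become X.
print(replace_with_boundary("12345-01, A+A*2B+B*+A"))  # 12345-01, A12B2a
# No 2 in the string part here, so this line happens to come out right.
print(replace_with_boundary("12345-02, A+AB+B*+AA"))   # 12345-02, AaB2aA
```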
-
@Terry-R said in Find and Replace: Multiple Replacements in Part of a String:
It will involve a bit more thought.
Sorry about that false start; I think I now have it. We have
FW: (?-s)((\+A\*)|(\+A)|(\+B\*)|(\+B)|(2))(?!.*?,)
RW: (?{2}1)(?{3}a)(?{4}2)(?{5}b)(?{6}X)
To add the negative lookahead I had to wrap the alternatives in an extra pair of brackets, so the bracket numbering all changed, hence a new Replace With code as well.
So basically, whenever it finds one of the search strings, it will be changed so long as there is no , after it on the line. As the ID is before the , nothing there should be changed.
Terry
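For anyone wanting to check the lookahead logic outside Notepad++, a rough Python equivalent might look like this. Assumptions: `(?-s)` is dropped because Python's dot already stops at line ends by default, the `(?{n}...)` replacement is emulated with a lookup table, and `replace_after_comma` is a name invented for the sketch.

```python
import re

# Outer group is number 1; the five alternatives are groups 2..6.
PATTERN = re.compile(r"((\+A\*)|(\+A)|(\+B\*)|(\+B)|(2))(?!.*?,)")
REPLACEMENTS = {2: "1", 3: "a", 4: "2", 5: "b", 6: "X"}


def replace_after_comma(text: str) -> str:
    def pick(m: re.Match) -> str:
        # Because of the wrapping group, m.lastindex is always 1 here,
        # so look for the inner group that actually took part.
        for number, replacement in REPLACEMENTS.items():
            if m.group(number) is not None:
                return replacement
        return m.group(0)

    return PATTERN.sub(pick, text)


data = "12345-01, A+A*2B+B*+A\n12345-02, A+AB+B*+AA"
print(replace_after_comma(data))
```

The 2 inside each ID still has a comma later on its line, so the negative lookahead rejects it.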
PS I should have paid more attention to your statement
I want to make the following replacements but only in the string (after the comma).
-
@Anos
I came up with this; seems to work but maybe has holes:
find:
(^[^,]+,)|(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2)
repl:
(?{1}\1)(?{2}1)(?{3}a)(?{4}2)(?{5}b)(?{6}X)
The result of the replacement with it:
12345-01, A1XB2a
12345-02, AaB2aA
-
@Terry-R said in Find and Replace: Multiple Replacements in Part of a String:
So as I had to add a negative lookahead the bracket numbering all changed
You could have made the wrapping parentheses a non-capturing group:
(?-s)(?:(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2))(?!.*?,)
to avoid the renumbering in the replacement.
TIMTOWTDI
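The effect on group numbering can be checked with a short Python sketch (an emulation under the same assumptions as before: `(?-s)` omitted, replacement table in place of `(?{n}...)`, hypothetical function name). A compiled pattern's `groups` attribute reports how many capturing groups it has.

```python
import re

capturing = re.compile(r"((\+A\*)|(\+A)|(\+B\*)|(\+B)|(2))(?!.*?,)")
non_capturing = re.compile(r"(?:(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2))(?!.*?,)")

print(capturing.groups)      # 6: the wrapper plus the five alternatives
print(non_capturing.groups)  # 5: the wrapper is not counted

# With (?: ) the alternatives are groups 1..5 again, so the original
# replacement numbering can be reused.
REPLACEMENTS = {1: "1", 2: "a", 3: "2", 4: "b", 5: "X"}


def replace_after_comma_nc(text: str) -> str:
    return non_capturing.sub(lambda m: REPLACEMENTS[m.lastindex], text)


print(replace_after_comma_nc("12345-01, A+A*2B+B*+A"))  # 12345-01, A1XB2a
```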
-
@Alan-Kilborn said in Find and Replace: Multiple Replacements in Part of a String:
find: (^[^,]+,)|(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2)
repl: (?{1}\1)(?{2}1)(?{3}a)(?{4}2)(?{5}b)(?{6}X)
I vote for yours. As an interesting aside, using regex101.com and inputting the 2 example lines and the Find What code, my code took twice as long as @Alan-Kilborn’s to process. It’s obvious the lookahead is where the extra time is spent.
For a small file it may not mean a lot, but sometimes efficiency in coding can be an advantage, hence my vote for @Alan-Kilborn’s code.
Terry
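A Python sketch of the find/repl pair quoted above, for testing outside Notepad++. The idea is that the first alternative swallows everything up to and including the comma and writes it back unchanged, so the other alternatives never see the ID. Assumptions: `re.MULTILINE` is added so that `^` matches at each line start, the replacement is emulated in a function, and `replace_keep_prefix` is an invented name.

```python
import re

PATTERN = re.compile(
    r"(^[^,]+,)|(\+A\*)|(\+A)|(\+B\*)|(\+B)|(2)",
    re.MULTILINE,  # let ^ match at the start of every line
)
REPLACEMENTS = {2: "1", 3: "a", 4: "2", 5: "b", 6: "X"}


def replace_keep_prefix(text: str) -> str:
    def pick(m: re.Match) -> str:
        if m.lastindex == 1:
            # The "ID," prefix: emit it unchanged, like (?{1}\1).
            return m.group(1)
        return REPLACEMENTS[m.lastindex]

    return PATTERN.sub(pick, text)


data = "12345-01, A+A*2B+B*+A\n12345-02, A+AB+B*+AA"
print(replace_keep_prefix(data))
```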
-
@Terry-R This does seem to work as intended, at least with my limited testing. Thank you very much for your quick replies. I have never really familiarized myself with lookaheads, they certainly look useful though.
-
@Anos said in Find and Replace: Multiple Replacements in Part of a String:
I have never really familiarized myself with lookaheads
There are LOTS of wonderful things to try and remember, as @PeterJones just reminded me. I should have made that a non-capture group; then it would not have required a rejig of the Replace With code.
As I always say
“The day you stop learning is the day you die”
Terry
-
@Alan-Kilborn Thank you for this solution. This also gets the job done, and as @Terry-R points out it seems to be more efficient.
-
Hello, @anos, @terry-r, @alan-kilborn, @peterjones and All,
And here is my solution !
If we use the FREE-SPACING mode (?x), for the SEARCH part :
SEARCH      (?x-s) (?: ( \+A (\*)? ) | ( \+B (\*)? ) | (2) ) (?!.*,)
Groups -->         No  1     2         3     4         5     Look-Ahead
REPLACE     (?1(?{2}1:a))(?3(?{4}2:b))?5X
BEWARE that, in the REPLACE part, the FREE-SPACING mode is FORBIDDEN. So, ONLY for INFO :
REPLACE     ( ?1 ( ?{2} 1 : a ) ) ( ?3 ( ?{4} 2 : b ) ) ?5 X
and given the data :
12345-01, A+A*2B+B*+A
12345-02, A+AB+B*+AA
it would return :
12345-01, A1XB2a
12345-02, AaB2aA
Notes :
- The first part (?x-s) of the regex search means that :
    - The free-spacing mode is set ( Spaces are not taken into account, except inside the [ ] syntax or for an escaped space char )
    - Due to the (?-s) syntax, the dot regex symbol matches a single standard char only ( not an EOL char )
- Then, the (?:......) syntax defines a non-capturing group
- Now, in this non-capturing group, we have 3 alternatives and the first two contain an optional inner group (\*)? ( Remember that the ? is another form of the {0,1} quantifier )
- To end, all this regex, so far, will match ONLY IF the final negative look-ahead structure (?!.*,) is verified, that is to say if, from the current position reached by the regex engine, there is no comma at any further position in the current line
- Now, in the replacement regex :
    - The (?1(?{2}1:a)) syntax means that if group 1 exists, then if group 2 exists, then write 1 else write a
    - The (?3(?{4}2:b)) syntax means that if group 3 exists, then if group 4 exists, then write 2 else write b
    - Finally, the ?5X means that if group 5 exists, then write an X ( The parentheses are not mandatory as this part ends the regex )
    - Note also that it’s not necessary to surround the groups 1, 3 and 5 with braces as these groups are not immediately followed by a digit !
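The nested conditions described in the notes can be traced with a Python sketch. It is an emulation only: `re.VERBOSE` plays the role of the free-spacing `(?x)` mode, `(?-s)` is left out because Python's dot already stops at line ends by default, and the `pick` and `replace_nested` functions are hypothetical stand-ins for the conditional replacement.

```python
import re

PATTERN = re.compile(
    r"""
    (?:
        ( \+A (\*)? )   # group 1: +A, group 2: optional *
      | ( \+B (\*)? )   # group 3: +B, group 4: optional *
      | (2)             # group 5: a literal 2
    )
    (?!.*,)             # no comma later on the same line
    """,
    re.VERBOSE,
)


def pick(m: re.Match) -> str:
    if m.group(1) is not None:                         # (?1 ...
        return "1" if m.group(2) is not None else "a"  # (?{2}1:a)
    if m.group(3) is not None:                         # (?3 ...
        return "2" if m.group(4) is not None else "b"  # (?{4}2:b)
    return "X"                                         # ?5X


def replace_nested(text: str) -> str:
    return PATTERN.sub(pick, text)


data = "12345-01, A+A*2B+B*+A\n12345-02, A+AB+B*+AA"
print(replace_nested(data))
```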
Best Regards,
guy038
-