Add a prefix to each string in a column
-
Hi guys,
I have been looking into this subject for a long time, but couldn't find anything about it. I want to add a prefix (A_) to each string in a column. The whole line looks like this:
.3 |1180| 1|1234 |abcd | 0| 0| 0| 2.0000|mm | 0 |W78 | 50|No | IC10 IC9
and the result should look like this:
.3 |1180| 1|1234 |abcd | 0| 0| 0| 2.0000|mm | 0 |W78 | 50|No | A_IC10 A_IC9
Do you know what to do?
-
Hello @yascha-badenhop,
Welcome to the Notepad++ Community !
As usual, the problem is not to build the right S/R but, simply, to define in which cases the S/R should occur ;-))
-
Must the S/R occur on the last column of each line ?
-
Must the S/R occur on the 15th column of each line, like in your example ?
-
Must the S/R occur right before any string IC## , whatever its column, for any line ?
-
Must the S/R occur right before any string IC## , in the last column of each line ?
-
Must the S/R occur right before any string IC## of the 15th column of each line ?
From your answer, it won’t be difficult to find out the correct regex S/R !
Best Regards,
guy038
-
-
Hello @guy038
This case is quite easy, but I also have, for example, R10 / C10 / vdr10 / xtal10. I would like to add the prefix to all strings in column 15 (separated with | ).
I actually wanted to edit my last post, but I couldn't anymore. Sorry for the indistinct question :)
-
Hi, @yascha-badenhop, and All,
OK, I get the problem ! I’ll try to describe my different steps in order to reach the right solution. If you’re in a hurry, just skip this section ;-))
-
Firstly, I tried to imagine a way to reach the 15th column, with the | separator
-
A column is composed of some standard characters, different from | , followed by the | separator, which can be regex-translated as [^|\r\n]+\| . Indeed, [^|\r\n] represents any char different from | and from any EOL character, repeated one or more times ( + ), and followed by a literal pipe character \| ( it must be escaped because it’s a regex meta-character )
-
To get to the 15th column, we need to count the first 14 columns, from the beginning of the line, hence the regex ^(?:[^|\r\n]+\|){14} . Note that I’m using a non-capturing group (?:........) as we do not care about the contents of these columns and we don’t need to store them for further recall !
-
Then we place the \K regex feature, which discards everything matched so far from the final match, so that the working position is located right after the | separator of the 14th column
-
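For readers who want to check the column-counting part outside Notepad++, here is a minimal Python sketch. It is only an approximation: Python's re module has no \K, so the end of the match, m.end(), is used instead to find where the 15th column starts. The function name column_15_start is made up for this illustration, and the sample line is the one from the first post.

```python
import re

# Same prefix as in the post: 14 non-empty columns, each closed by "|"
FIRST_14 = re.compile(r'^(?:[^|\r\n]+\|){14}')

def column_15_start(line):
    """Return the index right after the 14th "|", or None if there is no 15th column."""
    m = FIRST_14.match(line)
    # No \K in Python's re: the end of the match plays the same role here
    return m.end() if m else None

line = ".3 |1180| 1|1234 |abcd | 0| 0| 0| 2.0000|mm | 0 |W78 | 50|No | IC10 IC9"
start = column_15_start(line)
print(repr(line[start:]))
```

Note that a line with fewer than 15 columns, or with an empty column ( || ) among the first 14, is simply not matched, since [^|\r\n]+ needs at least one character per column.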
-
Secondly, I considered the contents of the 15th column :
-
Seemingly, it contains several words, each of them preceded by a space char ( from your second post, we learn that these may be either R10 , C10 , vdr10 or xtal10 ). However, I preferred to suppose that the first word of the 15th column could directly follow the | separator.
-
So, for the moment, the search regex is ^(?:[^|\r\n]+\|){14}\K\x20*\w+ , as \x20* stands for any range of space chars, even none, and \w+ for a non-empty run of word characters. But, as we must insert the A_ string between the possible blank character(s) and the subsequent word, I enclosed them in parentheses, which define the two groups 1 and 2 ( as, you remember, the very first group is a non-capturing one ), giving the regex ^(?:[^|\r\n]+\|){14}\K(\x20*)(\w+)
-
-
Thirdly, I thought about the way of matching the second and subsequent words of the 15th column :
-
I first thought of adding an alternative ( | ) to the search regex, in order to search for another word preceded by space character(s), also enclosed in parentheses as we need the contents for the replacement => the regex ^(?:[^|\r\n]+\|){14}\K(\x20*)(\w+)|(\x20+)(\w+) . However, this does not work : when the present working location of the regex engine is after the 15th column and your text contains other columns, the second alternative of the search regex would match any subsequent word ! Not what we want, obviously !
-
The solution is to use the \G feature, which requires that the next regex match begins at the exact location where the previous match ended. So, once all the blocks “space(s) + word” ( \x20+\w+ ) of the 15th column have been matched, the process stops. Indeed, the first word of the 16th column is not adjacent to the last word of the 15th column, because of the | separator, which breaks the \G condition !
-
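The chaining effect of \G can be imitated in Python, as a rough sketch only: Python's re module supports neither \K nor \G, but pattern.match(string, pos) only succeeds exactly at pos, which gives the same "must start where the previous match ended" behaviour. The names FIRST_14, FIRST_WORD, NEXT_WORD and words_in_column_15 are invented for this illustration.

```python
import re

FIRST_14 = re.compile(r'^(?:[^|\r\n]+\|){14}')
FIRST_WORD = re.compile(r'(\x20*)(\w+)')   # first alternative, right after \K
NEXT_WORD = re.compile(r'(\x20+)(\w+)')    # second alternative, right after \G

def words_in_column_15(line):
    """Return the words reached by the chain of matches in the 15th column."""
    words = []
    head = FIRST_14.match(line)
    if head is None:
        return words
    m = FIRST_WORD.match(line, head.end())
    while m is not None:
        words.append(m.group(2))
        # match() is anchored at the given position, like \G:
        # the chain breaks as soon as something else (here "|") follows
        m = NEXT_WORD.match(line, m.end())
    return words

line = "c|" * 14 + " R10 C10|x y"
print(words_in_column_15(line))
```

With an unanchored search instead ( NEXT_WORD.search ), the " y" of the 16th column would be found as well, which is exactly the problem of the alternative without \G.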
-
Finally a solution for the search regex could be :
SEARCH
^(?:[^|\r\n]+\|){14}\K(\x20*)(\w+)|\G(\x20+)(\w+)
-
Fourthly, I built the replacement regex :
-
Note that groups 1 and 2 store the first space character(s) and word characters of the 15th column
-
Groups 3 and 4 store the second and subsequent space character(s) and word characters of the 15th column
-
So, the use of the conditional replacement (?#........) ( # being a group number ) is needed and gives the replacement regex (?2\1A_\2)(?4\3A_\4) . This means that :
-
If group 2 exists, we rewrite the space character(s) first ( \1 ), followed by the string A_ and, finally, the word characters ( \2 )
-
If group 4 exists, we rewrite the space character(s) first ( \3 ), followed by the string A_ and, finally, the word characters ( \4 )
-
-
After a while, I realized that these two cases are mutually exclusive. And, as an undefined group n , noted \n in the replacement, is treated as an empty group by the Boost regex engine, I finally got the final replacement regex : \1\3A_\2\4
-
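The same trick can be tried in Python ( 3.5 or later ), where a group that did not take part in the match is also replaced by an empty string in the replacement template. The sketch below is a simplified stand-in, not the Notepad++ regex itself: it drops the 14-column prefix and the \G anchor, which Python's re module does not support, and only shows that \1\3A_\2\4 works because the two alternatives are mutually exclusive.

```python
import re

# Alternative 1 fills groups 1 and 2, alternative 2 fills groups 3 and 4;
# a single match never fills both pairs.
pattern = re.compile(r'^(\x20*)(\w+)|(\x20+)(\w+)')

# Unmatched groups expand to an empty string (Python 3.5+), so one
# replacement template serves both alternatives.
result = pattern.sub(r'\1\3A_\2\4', "R10 C10 vdr10")
print(result)
```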
So, @yascha-badenhop, the final regex S/R, to solve your problem, is :
SEARCH
^(?:[^|\r\n]+\|){14}\K(\x20*)(\w+)|\G(\x20+)(\w+)
REPLACE
\1\3A_\2\4
To test it, I considered the 2-column sample text R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10| , with various ranges of space chars before the words. This range is then repeated 9 times, giving 18 columns in total :
R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10| R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10| R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10| R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10| R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10| R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10| R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10| R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10| R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10|
Now :
-
Open the Replace dialog ( Ctrl + H )
-
Type in the regex ^(?:[^|\r\n]+\|){14}\K(\x20*)(\w+)|\G(\x20+)(\w+) in the Find what: zone
-
Type in the regex \1\3A_\2\4 in the Replace with: zone
-
Preferably, tick the Wrap around option
-
Choose the Regular expression search mode
-
Click once on the Replace All button, exclusively ( because of the \K syntax, you must not use the Replace button ! )
You should get your expected text :
R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10| R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10| R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10| R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10| R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10| R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10| R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10| A_R10 A_C10 A_vdr10 A_xtal10|xtal10 vdr10 C10 R10| R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10|
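The expected text can be cross-checked outside Notepad++ with a small Python sketch. This is an emulation under stated assumptions, not the S/R itself: since Python's re module has neither \K nor \G, the first 14 columns are captured and written back, and the contiguous run of “space(s) + word” blocks that opens the 15th column is rewritten in a second pass. The names RUN, BLOCK and add_prefix are invented for this illustration, and the sample is rebuilt with single spaces.

```python
import re

# Group 1: the first 14 columns. Group 2: the contiguous run of
# "space(s) + word" blocks that opens the 15th column.
RUN = re.compile(r'^((?:[^|\r\n]+\|){14})(\x20*\w+(?:\x20+\w+)*)', re.MULTILINE)
BLOCK = re.compile(r'(\x20*)(\w+)')

def add_prefix(text, prefix="A_"):
    """Insert prefix before each word of the 15th column of every line."""
    def fix(m):
        run = BLOCK.sub(lambda w: w.group(1) + prefix + w.group(2), m.group(2))
        return m.group(1) + run
    return RUN.sub(fix, text)

unit = "R10 C10 vdr10 xtal10|xtal10 vdr10 C10 R10|"
sample = " ".join([unit] * 9)          # 18 columns on a single line
result = add_prefix(sample)
print(result)
```

Only the first column of the 8th repetition, that is to say the 15th column, receives the A_ prefix; lines with fewer than 15 columns are left untouched.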
If you want to be more restrictive and match only the exact words that you spoke of in your second post, we just have to change the regex \w+ into a non-capturing group of alternatives followed by 10, so (?:xtal|vdr|R|C)10 , giving the final regex S/R :
SEARCH
^(?:[^|\r\n]+\|){14}\K(\x20*)((?:xtal|vdr|R|C)10)|\G(\x20+)((?:xtal|vdr|R|C)10)
REPLACE
\1\3A_\2\4
This second solution is even better, as it prevents a second, unwanted execution of the regex S/R from changing the text again, which, with the first solution, would lead, for instance, to a 15th column like | A_A_IC10 A_A_IC9
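The difference between the two versions on a second run can be illustrated with a short Python sketch. As Python's re module has neither \K nor \G, this is an emulation: the first 14 columns are captured and written back, and the word pattern is passed as a parameter. The function add_prefix and its word parameter are invented for this illustration.

```python
import re

def add_prefix(text, word=r'\w+', prefix="A_"):
    """Prefix each word (as defined by the word regex) of the 15th column."""
    run = re.compile(
        r'^((?:[^|\r\n]+\|){14})(\x20*(?:%s)(?:\x20+(?:%s))*)' % (word, word),
        re.MULTILINE)
    block = re.compile(r'(\x20*)(%s)' % word)
    def fix(m):
        return m.group(1) + block.sub(
            lambda w: w.group(1) + prefix + w.group(2), m.group(2))
    return run.sub(fix, text)

line = "c|" * 14 + " R10 C10 vdr10 xtal10"

once = add_prefix(line)
twice = add_prefix(once)          # \w+ also matches "A_R10": prefix doubled

strict = r'(?:xtal|vdr|R|C)10'
strict_once = add_prefix(line, strict)
strict_twice = add_prefix(strict_once, strict)   # "A_R10" no longer matches

print(twice)
print(strict_twice)
```

Note that, as written, the restrictive pattern does not match the IC10 and IC9 strings of the first post; the list of alternatives and the hard-coded 10 would have to be adapted to the real data.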
Best Regards,
guy038
-