Delete characters that exceed the number of characters between the 11th and 12th occurrence of "
-
Hello everyone,
I have many lines, which look like this:
“TRX_TT_ID”;“TRX_APP_DD”;“ORIGIN”;“SUB_TYPE”;“NUMBER”;“TABLE_TRX_LNR”;“DATABANK_NUMBER”;“STATE”;“CVBT_ISO_LKZ”;“PRODUKT”
“1000”;“1”;“1”;“”;“834600000”;“13340000 2227”;“1082803 / 13341837 2227”;“1”;“EN”
“1000”;“1”;“1”;“”;“834600000”;“13350001 33 This is very long and needs to be truncated”;“1080668 / 13341845 This is technical”;“1”;“EN”;“Call”
The text between occurrence no. 11 and 12 of " needs to be trimmed after the first 10 characters (that column must have a maximum of 10 characters).
Can anyone please help out?
Thank you!
Cristian -
Hello, @Cristian-tanasa, and All,
Some points are still not entirely clear!
- Firstly, I suppose that you’re using regular double quotes " and not the “ and ” characters!
- Secondly, judging from the header line, each row of your table should contain 10 fields. Apparently, that is not the case for your first data line, which contains 9 fields!
- Thirdly, if we keep only the first 10 characters of the 6th field concerned, we get the string 13350001 3! Do you expect such a result?
If so:
- Open the Replace dialog ( Ctrl + H )
- SEARCH (?-s)^(?:"([^";\r\n])*";){5}"(?1){10}\K(?1)+(?=")
- REPLACE Leave EMPTY
- Tick the Wrap around option
- Select the Regular expression search mode
- Click on the Replace All button ( Do not use the Replace button ! )
=> All characters of the 6th field after the 10th character are deleted!
If a row contains a 6th field with fewer than 11 chars, the line is not processed!
Best Regards,
guy038
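For anyone who needs the same truncation outside Notepad++, here is a minimal Python sketch of the Replace All step above. It is an adaptation, not the regex from the post: Python's `re` module supports neither `\K` nor the `(?1)` subroutine call, so the part to keep is captured in group 1 and written back, and the character class is repeated explicitly. The names `FIELD_CHAR`, `PATTERN` and `truncate_sixth_field` are made up for this illustration, and the sample uses regular double quotes.

```python
import re

# One character allowed inside a field: anything except ", ; and line breaks.
FIELD_CHAR = r'[^";\r\n]'

# Group 1 = the five leading fields, the opening quote of field 6 and its
# first 10 characters. Whatever follows, up to the closing quote, is dropped.
PATTERN = re.compile(
    r'^((?:"' + FIELD_CHAR + r'*";){5}"' + FIELD_CHAR + r'{10})'
    + FIELD_CHAR + r'+(?=")',
    re.MULTILINE,
)

def truncate_sixth_field(text: str) -> str:
    """Delete every character after the 10th one in field 6 of each row."""
    return PATTERN.sub(r'\1', text)

sample = (
    '"1000";"1";"1";"";"834600000";"13340000 2227";"1082803 / 13341837 2227";"1";"EN"\n'
    '"1000";"1";"1";"";"834600000";"13350001 33 This is very long and needs to be truncated";'
    '"1080668 / 13341845 This is technical";"1";"EN";"Call"\n'
)
print(truncate_sixth_field(sample))
```

Rows whose 6th field has 10 characters or fewer are left untouched, as with the Notepad++ regex. Note that a header row is treated like any other row: a 6th header field such as TABLE_TRX_LNR (13 characters) would also be cut to TABLE_TRX_, so it may be worth skipping the first line.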
-
-
Hello @guy038,
oh my God, it worked! :)
It just worked!
Thank you so much for your effective solution! :)
Best regards,
Cristian -
Hi, @Cristian-tanasa, and All,
I improved and generalized the process with these 8 new search regexes, in order to find part of a particular field n.
Of course, I assume that:
- Each row of the table contains the same number of fields
- The field delimiter is the double quote char ( " )
- The field separator is the semicolon ( ; )
- Any field is preceded and/or followed by a ;
- No char within a field is a " or a ;
Here are these generic regexes:
(?x) ^ (?: " ( [^";\r\n] )* " ; ) {n-1} " \K (?1)*                 # ALL chars, even NONE, of FIELD n
(?x) ^ (?: " ( [^";\r\n] )* " ; ) {n-1} " (?1){#} \K (?1)*         # ALL chars, even NONE, AFTER the #th char of FIELD n
(?x) ^ (?: " ( [^";\r\n] )* " ; ) {n-1} " (?1){#} \K (?1){p}       # p chars, AFTER the #th char of FIELD n
(?x) ^ (?: " ( [^";\r\n] )* " ; ) {n-1} " \K (?1){p}               # The p FIRST chars of FIELD n
(?x) ^ (?: " ( [^";\r\n] )* " ; ) {n-1} " (?1)* \K (?1){p} (?=")   # The p LAST chars of FIELD n
(?x) ^ (?: " ( [^";\r\n] )* " ; ) {n-1} " \K                       # EMPTY string, at BEGINNING of FIELD n
(?x) ^ (?: " ( [^";\r\n] )* " ; ) {n-1} " (?1){#} \K               # EMPTY string, AFTER the #th char of FIELD n
(?x) ^ (?: " ( [^";\r\n] )* " ; ) {n-1} " (?1)* \K                 # EMPTY string, at END of FIELD n
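As a rough illustration of how the first four generic regexes fit one template, the Python sketch below builds the pattern from a field number, a number of chars to skip and an optional number of chars to take. It is only an approximation of the Notepad++ regexes: Python's `re` has no `\K` or `(?1)`, so the wanted part is returned through a capturing group. The function name `part_of_field` and its parameters `skip` and `take` (standing in for # and p) are assumptions made for this example.

```python
import re
from typing import Optional

# One character allowed inside a field: anything except ", ; and line breaks.
FIELD_CHAR = r'[^";\r\n]'

def part_of_field(row: str, n: int, skip: int = 0,
                  take: Optional[int] = None) -> Optional[str]:
    """Return `take` chars (or all remaining chars when take is None) located
    after the first `skip` chars of field n, or None if the row does not match."""
    if n < 1 or skip < 0 or (take is not None and take < 0):
        raise ValueError("n must be >= 1; skip and take must be >= 0")
    count = '*' if take is None else '{%d}' % take
    pattern = (
        r'^(?:"' + FIELD_CHAR + r'*";){' + str(n - 1) + r'}"'   # fields 1 .. n-1
        + FIELD_CHAR + '{' + str(skip) + '}'                    # skipped chars
        + '(' + FIELD_CHAR + count + ')'                        # wanted part
    )
    match = re.match(pattern, row)
    return match.group(1) if match else None

row = '"1000";"1";"1";"";"834600000";"13350001 33 This is very long";"x";"1";"EN";"Call"'
print(part_of_field(row, 6))                    # all chars of field 6
print(part_of_field(row, 6, skip=10))           # all chars after the 10th
print(part_of_field(row, 6, skip=10, take=3))   # 3 chars after the 10th
print(part_of_field(row, 6, take=5))            # the 5 first chars
```

When the field is too short for the requested `skip` or `take`, the function returns None, which mirrors a row that the corresponding Notepad++ regex would simply not match.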
Notes:
- Let f be the total number of fields and let m be the maximum number of characters of the field n. Then:
  - The variable n is between the values 1 included and f included ( So n-1 is in range [0,f-1] )
  - The variable # is between the values 0 included and m included
  - The variable p is between the values 0 included and m included
- Let’s test these 8 real regexes, below:
(?x) ^ (?: " ( [^";\r\n] )* " ; ) {5} " \K (?1)*                 # ALL chars, even NONE, of FIELD 6
(?x) ^ (?: " ( [^";\r\n] )* " ; ) {5} " (?1){10} \K (?1)*        # ALL chars, even NONE, AFTER the 10th char of FIELD 6
(?x) ^ (?: " ( [^";\r\n] )* " ; ) {5} " (?1){10} \K (?1){3}      # THREE chars, AFTER the 10th char of FIELD 6
(?x) ^ (?: " ( [^";\r\n] )* " ; ) {5} " \K (?1){5}               # The 5 FIRST chars of FIELD 6
(?x) ^ (?: " ( [^";\r\n] )* " ; ) {5} " (?1)* \K (?1){7} (?=")   # The 7 LAST chars of FIELD 6
(?x) ^ (?: " ( [^";\r\n] )* " ; ) {5} " \K                       # EMPTY string, at BEGINNING of FIELD 6
(?x) ^ (?: " ( [^";\r\n] )* " ; ) {5} " (?1){10} \K              # EMPTY string, AFTER the 10th char of FIELD 6
(?x) ^ (?: " ( [^";\r\n] )* " ; ) {5} " (?1)* \K                 # EMPTY string, at END of FIELD 6
Against the following sample text :
                           Field 6
                              V
"1000";"1";"1";"";"834600000";"";"1080668 / 13341845 This is technical";"1";"EN";"Call"
"1000";"1";"1";"";"834600000";"1";"1080668 / 13341845 This is technical";"1";"EN";"Call"
"1000";"1";"1";"";"834600000";"12";"1080668 / 13341845 This is technical";"1";"EN";"Call"
"1000";"1";"1";"";"834600000";"123";"1080668 / 13341845 This is technical";"1";"EN";"Call"
"1000";"1";"1";"";"834600000";"1234";"1080668 / 13341845 This is technical";"1";"EN";"Call"
"1000";"1";"1";"";"834600000";"12345";"1080668 / 13341845 This is technical";"1";"EN";"Call"
"1000";"1";"1";"";"834600000";"123456";"1080668 / 13341845 This is technical";"1";"EN";"Call"
"1000";"1";"1";"";"834600000";"1234567";"1080668 / 13341845 This is technical";"1";"EN";"Call"
"1000";"1";"1";"";"834600000";"12345678";"1080668 / 13341845 This is technical";"1";"EN";"Call"
"1000";"1";"1";"";"834600000";"123456789";"1080668 / 13341845 This is technical";"1";"EN";"Call"
"1000";"1";"1";"";"834600000";"1234567890";"1080668 / 13341845 This is technical";"1";"EN";"Call"
"1000";"1";"1";"";"834600000";"12345678901";"1080668 / 13341845 This is technical";"1";"EN";"Call"
"1000";"1";"1";"";"834600000";"123456789012";"1080668 / 13341845 This is technical";"1";"EN";"Call"
"1000";"1";"1";"";"834600000";"1234567890123";"1080668 / 13341845 This is technical";"1";"EN";"Call"
"1000";"1";"1";"";"834600000";"12345678901234";"1080668 / 13341845 This is technical";"1";"EN";"Call"
"1000";"1";"1";"";"834600000";"123456789012345";"1080668 / 13341845 This is technical";"1";"EN";"Call"
"1000";"1";"1";"";"834600000";"1234567890123456";"1080668 / 13341845 This is technical";"1";"EN";"Call"
"1000";"1";"1";"";"834600000";"12345678901234567";"1080668 / 13341845 This is technical";"1";"EN";"Call"
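The sample table can also be checked programmatically. The sketch below regenerates the 18 rows above (field 6 growing from 0 to 17 chars) and applies a Python rewrite of the "7 LAST chars of FIELD 6" regex. Since Python's `re` lacks `\K` and `(?1)`, the last 7 chars are returned via a capturing group instead. The names `LAST_SEVEN`, `rows` and `last_seven` are invented for this illustration.

```python
import re

# One character allowed inside a field: anything except ", ; and line breaks.
FIELD_CHAR = r'[^";\r\n]'

# The greedy prefix backtracks until exactly 7 field characters remain
# before the closing quote of field 6.
LAST_SEVEN = re.compile(
    r'^(?:"' + FIELD_CHAR + r'*";){5}"' + FIELD_CHAR + r'*(' + FIELD_CHAR + r'{7})"'
)

DIGITS = "12345678901234567"
# Rebuild the sample rows: field 6 holds the first k digits, k = 0 .. 17.
rows = [
    '"1000";"1";"1";"";"834600000";"%s";"1080668 / 13341845 This is technical";"1";"EN";"Call"'
    % DIGITS[:k]
    for k in range(len(DIGITS) + 1)
]

def last_seven(row):
    """Return the 7 last chars of field 6, or None if the field is shorter."""
    match = LAST_SEVEN.match(row)
    return match.group(1) if match else None

results = [last_seven(row) for row in rows]
for k, found in enumerate(results):
    print(k, found)
```

Rows whose 6th field holds fewer than 7 chars produce no match, which corresponds to the rows that the Notepad++ regex skips.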
Super, isn’t it ?
Best Regards,
guy038
-