Help me please, how can I extract the mail and the next column?
-
678:ina_caeter@yahoo.com :Lina:S:{0}:2.0791812460:C:{5,0,2}:codepostal_23456:student:Kenty:level57:Elite:Knight
889:Dogietreats:Terry:1.5000000000000000:0.3010299957:C:{2}:1:codepostal_13567:doctor:Tygger:level34:Elder:Druid
990:Charz4you:367589:1.7500000000000000:0.6020599913:S:{0,2}:34:codepostal_45217:architect:Pog:level122:Elite:Knight
.
.
.
.
.
Friend, I need this output:
ina_caeter@yahoo.com :Lina
Dogietreats:Terry
Charz4you:367589
My file has 7286246 lines.
-
What have you tried already?
-
Hi, @oscar-remiccc,
Let's try to be logical!
- The different fields of your text are delimited with a colon character.
- This can be treated as a single-line search, as the fields of a record are never split across several lines.
- As you want to keep the 2nd and 3rd fields only, the search has to start from an anchor (the beginning-of-line location ^ seems the obvious choice!).
- To match one complete field between two : delimiters, we search for a non-empty run of consecutive characters that are neither a colon nor an EOL character, hence the negated character class [^:\n\r] followed by +

From the above, one solution could then be:
SEARCH
^[^:\n\r]+:([^:\n\r]+:[^:\n\r]+):.+
REPLACE
\1
Notes:
- From the beginning of line ^, this regex matches the whole line contents (the first three fields, followed by the remainder of the line :.+ ).
- The block [^:\n\r]+:[^:\n\r]+ (the 2nd and 3rd fields), surrounded with parentheses, defines group 1.
- So, in the replacement, the whole line is replaced with these 2nd and 3rd fields, separated by a : character.
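For anyone who would rather script this than run it in the editor, the S/R above can be sketched in Python. This is only an illustration under a few assumptions: the names FIELDS_2_3, keep_fields_2_3 and sample are made up for the example, the text uses \n line endings, and re.MULTILINE is needed so that ^ matches at the start of every line.

```python
import re

# The thread's first search pattern: skip field 1, capture fields 2 and 3
# (with the colon between them) as group 1, then consume the rest of the line.
FIELDS_2_3 = re.compile(r"^[^:\n\r]+:([^:\n\r]+:[^:\n\r]+):.+", re.MULTILINE)


def keep_fields_2_3(text: str) -> str:
    """Replace every matching line with its 2nd and 3rd fields (hypothetical helper)."""
    # Python's re module writes the replacement as \1, as in the thread.
    return FIELDS_2_3.sub(r"\1", text)


# The three sample lines from the question.
sample = (
    "678:ina_caeter@yahoo.com :Lina:S:{0}:2.0791812460:C:{5,0,2}:codepostal_23456:student:Kenty:level57:Elite:Knight\n"
    "889:Dogietreats:Terry:1.5000000000000000:0.3010299957:C:{2}:1:codepostal_13567:doctor:Tygger:level34:Elder:Druid\n"
    "990:Charz4you:367589:1.7500000000000000:0.6020599913:S:{0,2}:34:codepostal_45217:architect:Pog:level122:Elite:Knight\n"
)

result = keep_fields_2_3(sample)
print(result)
```

For a file of several million lines, the same compiled pattern could be applied line by line while reading the file, rather than loading everything at once.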
Using the lazy quantifier +?, this regex S/R is a bit shorter and becomes:
SEARCH
(?-s)^.+?:(.+?:.+?):.+
REPLACE
\1
Note that the first part
(?-s)^.+?:
searches, from the beginning of line ^, for the shortest non-empty range of standard characters that is followed by a colon. So this range cannot contain any : character ;-))
Best Regards,
guy038
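A small Python sketch of the lazy variant, for readers who want to try it outside Notepad++. It is an approximation, not the exact editor regex: the leading (?-s) is left out because Python's "." already stops at \n unless re.DOTALL is set (and, as far as I know, Python's re does not accept a bare (?-s) switch). The names LAZY_2_3, line, lazy_result and first_field are invented for the example.

```python
import re

# Lazy variant from the thread, without the (?-s) prefix (see note above).
LAZY_2_3 = re.compile(r"^.+?:(.+?:.+?):.+", re.MULTILINE)

# Second sample line from the question.
line = "889:Dogietreats:Terry:1.5000000000000000:0.3010299957:C:{2}:1:codepostal_13567:doctor:Tygger:level34:Elder:Druid"

match = LAZY_2_3.match(line)

# Text matched by the first lazy .+? : everything before the colon that
# precedes group 1. Being the shortest possible range, it holds no colon.
first_field = line[: match.start(1) - 1]

lazy_result = LAZY_2_3.sub(r"\1", line)
print(first_field)
print(lazy_result)
```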
-
-
@guy038
Thank you very much, guy038. One question: I tried the regex below on this simple example, but it did not work. It removes the last character.
990:Charz4you:367589:1.7500000000000000:0.6020599913
SEARCH: ^([^ ]+?):([^ ]+?):([^ ]+?):([^ ]+?):([^ ]+?).$
REPLACE: $2:$5
Result:
Charz4you:0.602059991
It removes the final 3. What am I doing wrong, please?
-
Hi, @oscar-remiccc,
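So, to get the 2nd and 5th fields only, just delete, in your regex, the last . before the $, as below. That final . matches one more character, which lies outside group 5, so it is consumed by the match but never written back. Removing it should do the trick!
SEARCH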
^([^ ]+?):([^ ]+?):([^ ]+?):([^ ]+?):([^ ]+?)$
REPLACE
$2:$5
You'll get the expected text:
Charz4you:0.6020599913
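The effect of that trailing dot can be reproduced with a short Python sketch. The variable names are made up for the illustration, and Python spells the $2:$5 replacement as \2:\5.

```python
import re

line = "990:Charz4you:367589:1.7500000000000000:0.6020599913"

# Original attempt: the "." before "$" matches the last character of the
# line, which therefore stays outside group 5.
with_dot = re.compile(r"^([^ ]+?):([^ ]+?):([^ ]+?):([^ ]+?):([^ ]+?).$")

# Corrected version: group 5 now runs up to the end of the line.
without_dot = re.compile(r"^([^ ]+?):([^ ]+?):([^ ]+?):([^ ]+?):([^ ]+?)$")

truncated = with_dot.sub(r"\2:\5", line)
full = without_dot.sub(r"\2:\5", line)

print(truncated)  # Charz4you:0.602059991
print(full)       # Charz4you:0.6020599913
```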
I think that, using the syntax of my previous post, we can simplify the search regex, as below :
SEARCH
^([^:\n\r]+):([^:\n\r]+):([^:\n\r]+):([^:\n\r]+):([^:\n\r]+)
REPLACE
$2:$5
But you do not need to store all the fields between parentheses! Just store fields 2 and 5 and, if you include the : in the group that captures field 2, we get the regex S/R:
SEARCH
^[^:\n\r]+:([^:\n\r]+:)[^:\n\r]+:[^:\n\r]+:([^:\n\r]+)
REPLACE
$1$2
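Finally, you do not need to spell out fields 3 and 4, either! So the part [^:\n\r]+:[^:\n\r]+ (fields 3 and 4) can simply be changed into .+ , giving the final S/R below. Note that .+ is greedy, so the second group really captures the last field of the line; that is the 5th field only because your example line has exactly five fields.
SEARCH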
^[^:\n\r]+:([^:\n\r]+:).+:([^:\n\r]+)
REPLACE
$1$2
Best Regards,
guy038
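A Python sketch of the final regex, showing one thing worth checking before running it on a whole file: because .+ is greedy, the last group ends up holding the last field of the line, which is the 5th field only on a five-field line. The names FINAL, EXPLICIT and the sample variables are invented here, and the trailing .* in EXPLICIT is my own addition (not from the thread) to drop the rest of a longer line.

```python
import re

# Final regex from the thread; the replacement $1$2 becomes \1\2 in Python.
FINAL = re.compile(r"^[^:\n\r]+:([^:\n\r]+:).+:([^:\n\r]+)", re.MULTILINE)

# Explicit-field version from the thread, plus an assumed trailing .* so the
# remaining fields of a longer line are removed as well.
EXPLICIT = re.compile(
    r"^[^:\n\r]+:([^:\n\r]+:)[^:\n\r]+:[^:\n\r]+:([^:\n\r]+).*", re.MULTILINE
)

five_fields = "990:Charz4you:367589:1.7500000000000000:0.6020599913"
all_fields = five_fields + ":S:{0,2}:34:codepostal_45217:architect:Pog:level122:Elite:Knight"

short_result = FINAL.sub(r"\1\2", five_fields)      # 2nd + 5th field
long_result = FINAL.sub(r"\1\2", all_fields)        # 2nd + LAST field
explicit_result = EXPLICIT.sub(r"\1\2", all_fields)  # 2nd + 5th field

print(short_result)
print(long_result)
print(explicit_result)
```

On the full lines of the original file, the explicit-field form looks like the safer choice when the 5th field is what is wanted.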
-
@guy038
You are a great teacher, thank you very much.