Filter a certain amount of letters out of a code
-
I have a long list of 12-character codes made up of A-Z and 0-9.
I want to delete every code that contains more than 3 letters.
For example: C9RUEOCXK60E
has more than 3 letters, so it should be filtered out.
C95678C3210E has 3 letters, so I want to keep this one. How can I do this?
-
This can be done with a regular expression. Search for
\b([[:alpha:]]+?[[:digit:]]*){4,}\b
and replace with an empty string. If you want to delete the entire line, replace the second \b with \R.
What it does: after a word boundary (\b), look for the shortest possible sequence of alphabetic characters ([[:alpha:]]+?), optionally followed by a sequence of digits ([[:digit:]]*), repeated at least four times ({4,}), up to the next word boundary.
-
Hi, @daniel-elpunkt,
@gerdb42’s regex works fine. Here is another formulation, based on the fact that your codes do not contain lower-case letters:
SEARCH
\b(\d*\u){4,}\d*\b
REPLACE
EMPTY
And, as @gerdb42 said, if you have only a single 12-character code per line, just replace the final \b with the syntax \R, which represents any end-of-line characters, whatever the file type (Windows, Unix or Mac):
SEARCH
\b(\d*\u){4,}\d*\R
REPLACE
EMPTY
After the replacement, only codes containing 0, 1, 2 or 3 capital letters are kept!
Best Regards,
guy038
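
For anyone who wants to try the same filter outside Notepad++, here is a minimal Python sketch. It is a translation of the second search pattern, not the Notepad++ syntax itself: Python's `re` module does not support `\u`, `[[:alpha:]]` or `\R`, so `[A-Z]` stands in for `\u`. The names `TOO_MANY_LETTERS`, `keep_codes` and `count_letters`, and the last two sample codes, are made up for illustration.

```python
import re

# Translation of \b(\d*\u){4,}\d*\b for Python's `re` module:
# four or more capital letters, each optionally preceded by digits,
# followed by optional trailing digits.
TOO_MANY_LETTERS = re.compile(r"\b(?:\d*[A-Z]){4,}\d*\b")


def keep_codes(codes):
    """Return only the codes that contain at most 3 capital letters."""
    return [code for code in codes if not TOO_MANY_LETTERS.fullmatch(code)]


def count_letters(code):
    """Plain letter count, used here only to cross-check the regex."""
    return sum(ch.isalpha() for ch in code)


# The first two codes come from the question; the last two are hypothetical.
codes = ["C9RUEOCXK60E", "C95678C3210E", "123456789012", "9A8B7C6D5E4F"]
kept = keep_codes(codes)
print(kept)
```

The `count_letters` helper is not needed for the filter; it only provides a regex-free way to confirm that every kept code has 3 letters or fewer.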