Remove column from CSV files
-
Hi,
I want to remove a specific column from multiple CSV files. There are 10 columns separated by commas. Please see the example below:
09/08/2022,08:49:02,WWW1_005,604876.5,6383832.7,7.5,50939.33,0,2.2,0079-099_EEE_RR_EM2040D_TTTT_CCC_WWW1_005-0005.db
09/08/2022,08:49:02,WWW1_005,604876.4,6383832.9,7.4,50939.33,0,2.2,0079-099_EEE_RR_EM2040D_TTTT_CCC_WWW1_005-0005.db
I want to remove the last column containing these values 0079-099_EEE_RR_EM2040D_TTTT_CCC_WWW1_005-0005.db.
Note: I cannot just use Alt+Shift+click as there are too many files, and I won’t be able to manually remove each independent column from 2000 csv files :)
Any help will be appreciated,
Thanks
-
Something like this maybe:
Find:
(?-s)^((.+?,){8}.+?),.+
Replace:
${1}
Search mode: Regular expression
will transform:
09/08/2022,08:49:02,WWW1_005,604876.5,6383832.7,7.5,50939.33,0,2.2,0079-099_EEE_RR_EM2040D_TTTT_CCC_WWW1_005-0005.db
09/08/2022,08:49:02,WWW1_005,604876.4,6383832.9,7.4,50939.33,0,2.2,0079-099_EEE_RR_EM2040D_TTTT_CCC_WWW1_005-0005.db
into:
09/08/2022,08:49:02,WWW1_005,604876.5,6383832.7,7.5,50939.33,0,2.2
09/08/2022,08:49:02,WWW1_005,604876.4,6383832.9,7.4,50939.33,0,2.2
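For anyone who would rather script this than use the Replace dialog, here is a minimal Python sketch of the same replacement. It is an adaptation, not the Notepad++ behaviour itself: the `(?-s)` prefix is dropped because `.` in Python's `re` module already stops at line breaks by default, and `${1}` is written as `\g<1>`. The name `keep_first_nine` is just made up for illustration.

```python
import re

# Same pattern as above, minus the (?-s) prefix (not needed in Python,
# where "." does not match line breaks unless re.DOTALL is set).
KEEP_FIRST_NINE = re.compile(r"^((.+?,){8}.+?),.+")


def keep_first_nine(line: str) -> str:
    """Keep the first nine comma-separated fields and drop the rest."""
    return KEEP_FIRST_NINE.sub(r"\g<1>", line)


row = (
    "09/08/2022,08:49:02,WWW1_005,604876.5,6383832.7,7.5,50939.33,0,2.2,"
    "0079-099_EEE_RR_EM2040D_TTTT_CCC_WWW1_005-0005.db"
)

trimmed = keep_first_nine(row)
# A second pass finds no match, because the pattern needs at least
# nine commas and the trimmed line only has eight.
again = keep_first_nine(trimmed)

print(trimmed)
print(again == trimmed)
```

Since the pattern insists on at least ten fields, running it again on an already-trimmed line should leave that line alone.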
-
@Alan-Kilborn Amazing, it worked thanks! this made my day! :D
-
Hello, @jeshua-guzman, @alan-kilborn and all,
@jeshua-guzman said :
I want to remove the last column…
So, I suppose that we can use this alternative:
- SEARCH ,[^,\r\n]+$
- REPLACE Leave that field EMPTY
Note:
- However, this regex S/R is less safe than @alan-kilborn's solution, as you must run it ONCE only! Indeed, each subsequent run would delete whatever column is currently last, until only the first column remains!
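To illustrate the "run it once only" warning, here is a small Python sketch (an illustration only, not Notepad++ itself). It applies the regex to a single line with no line-break characters, which sidesteps any difference in how `$` treats Windows line endings between regex engines. The name `drop_last_column` is made up for this example.

```python
import re

# The SEARCH regex from this post; the replacement is the empty string.
LAST_COLUMN = re.compile(r",[^,\r\n]+$")


def drop_last_column(line: str) -> str:
    """Remove the last comma-separated field of a single line."""
    return LAST_COLUMN.sub("", line)


row = (
    "09/08/2022,08:49:02,WWW1_005,604876.4,6383832.9,7.4,50939.33,0,2.2,"
    "0079-099_EEE_RR_EM2040D_TTTT_CCC_WWW1_005-0005.db"
)

once = drop_last_column(row)    # the .db column is gone
twice = drop_last_column(once)  # now the 2.2 column is gone as well

print(once)
print(twice)
```

Each pass removes one more column, which is exactly why this S/R must not be repeated on the same files.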
Best Regards,
guy038
-
-
@guy038 said in Remove column from CSV files:
,[^,\r\n]+$
Thanks! This option also works. For future tasks, would you mind explaining how this works, or how I can build it myself?
Cheers,
-
Hi, @jeshua-guzman and All,
Not very difficult:
- The $ symbol represents the zero-length position between the last character of a line and its line-break chars
- The [....] syntax represents a single-character class, which matches any one char listed inside it
- The [^....] syntax represents a negated single-character class, which matches any one char not listed inside it
- The [^....]+ syntax represents any non-empty run of characters, none of which is listed inside the class
So, finally, the ,[^,\r\n]+$ regex means: search for…
- A literal , character, followed by…
- Any non-empty run of characters ( + ) which are different ( ^ ) from any of ( [....] ) the comma and the possible line-break chars \n and \r, till…
- The very end of the current line ( $ )
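The same breakdown can be written as a commented pattern. This is a sketch using Python's `re.VERBOSE` flag (not a Notepad++ feature), with a shortened, made-up sample line:

```python
import re

LAST_COLUMN = re.compile(
    r"""
    ,            # a literal comma
    [^,\r\n]+    # one or more chars that are not a comma, \r or \n
    $            # the very end of the line
    """,
    re.VERBOSE,
)

# Hypothetical sample line, shortened for readability.
line = "7.5,50939.33,0,2.2,example.db"

match = LAST_COLUMN.search(line)
matched_text = match.group(0)      # the part that would be deleted
kept_text = line[: match.start()]  # everything before the match

print(repr(matched_text))
print(repr(kept_text))
```

Only the final comma can start a successful match, because from any earlier comma the run of non-comma characters stops before the end of the line is reached.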
Best Regards,
guy038
-
-
@guy038 Great thanks!