Can a CSV file be displayed and edited in a table fashion. Excel mangles it.
-
I have to deal with long strings of CSV data. If I use Excel to do so it “helps” by stripping doublequotes from the file rendering it useless.
The issue is that I need to see it in a table format to examine the data in aligned columns like Excel does. Can Notepad++ do this ?Below is an example of 2 lines of this data.
“DI”,“A-DI”,“”,“Digital Placeholder for Dynamo Creation”,“SIM”,“”,“1:0”,“AUTO”,“ON”,“1”,“NO”,“OPEN”,“CLOSE”,“DISABLE”,“NONE”,“LOW”,“NONE”,“DISABLE”,“NONE”,“NONE”,“NONE”,“YES”,“Other”,“”,“”,“”,“”,“”,“”,“”,“”,“”,“”,“”,“”,“”,“”,“”,“”,“NONE”,“YES”,“NO”,“REJECT”,“NO”,“1,000”,“300,000”,“0”,“NO”,“NO”,“0”,“”,“NO”,“5,000.0”,“0.0”,“Milliseconds”,“ENABLE”,“0.0000”,“Absolute”,“0”,“ENABLE”,“CHOICES”
“DI”,“AAHH_195”,“”,“Ozone Gen Room Hi Amb Ozone ALARM”,“IDS”,“”,“OzGen.33.ALM_AMBIENT_OZONE_HIGH”,“AUTO”,“ON”,“1”,“NO”,“OK”,“ALARM”,“ENABLE”,“NONE”,“CRITICAL”,“CLOSE”,“DISABLE”,“NONE”,“NONE”,“NONE”,“NO”,“Ozone”,“SMS_CRIT”,“”,“”,“”,“”,“”,“”,“”,“”,“”,“”,“”,“”,“”,“Concentration over .3 ppm - Ventilation Started - Generators Shut Down”,“”,“NONE”,“YES”,“NO”,“REJECT”,“NO”,“1,000”,“300,000”,“0”,“NO”,“NO”,“0”,“”,“NO”,“5,000.0”,“0.0”,“Milliseconds”,“DISABLE”,“0.0000”,“Absolute”,“0”,“ENABLE”,“CHOICES” -
@MyNameIsTooShort said in Can a CSV file be displayed and edited in a table fashion. Excel mangles it.:
I have to deal with long strings of CSV data. If I use Excel to do so it “helps” by stripping doublequotes from the file rendering it useless.
The issue is that I need to see it in a table format to examine the data in aligned columns like Excel does. Can Notepad++ do this ?Notepad++ cannot do this without adding a plugin, but a plugin I’m developing, called Columns++, can help you.
In my experience this plugin is stable; however, it’s not yet in Plugins Admin and it has not yet had extensive real-world testing. Be sure you have a safety copy of your data, just in case. The link above goes to the latest release on GitHub; there you will find a link to a “quick installer” which you can use if you have Notepad++ installed in the default location.
The steps would be to use Convert separated values to tabs…, enable Elastic tabstops, view and edit as needed and then use Convert tabs to separated values… to restore the original format.
You will want to be sure Preserve quotes, escapes and blanks when converting to tabbed is checked when you convert.
-
That’s a great plugin. It worked so far just as advertised and will be a great help with locating data.
It doesn’t however treat the data as columns. i.e. I can’t select a column of data to find/replace like you can in Excel - But this is not at all unexpected and probably too much to hope for in a text editor. But worth mentioning in case you had that trick in the queue.
Because your plugin still put these two in columns while still missing the doublequotes… I’m not sure yet if this is an artifact of your plugin or the entire reason I am having trouble with this file… but I expect its the latter and I will follow up. Thanks again.
Well I tried to post a pic to explain above, but I have no reputation yet so I can’t. -
@MyNameIsTooShort said in Can a CSV file be displayed and edited in a table fashion. Excel mangles it.:
It doesn’t however treat the data as columns. i.e. I can’t select a column of data to find/replace like you can in Excel
If you want that and understand SQL, have a look at CSVQuery.
Cheers.
-
@MyNameIsTooShort said in Can a CSV file be displayed and edited in a table fashion. Excel mangles it.:
I can’t select a column of data to find/replace like you can in Excel
Notepad++ can make rectangular selections. Click at one corner of the rectangle you want to select; then Alt+Shift+click at the diagonally opposite corner. See the help on Rectangular selections for a bit more information. (One tip is that if you make an ordinary selection in a single line covering the column or columns you want, when a rectangular selection is needed, Columns++ will offer to extend your selection vertically. That makes it easy to select a column, and to start on the line after the headers if that’s want you want.)
The native search in Notepad++ can’t find or replace in rectangular selections, but the Search… function in Columns++ can. (Strictly speaking, it creates a search region from a rectangular selection — a complexity I didn’t like, but found necessary to make everything consistent and usable.)
-
@MyNameIsTooShort said in Can a CSV file be displayed and edited in a table fashion. Excel mangles it.:
Because your plugin still put these two in columns while still missing the doublequotes… I’m not sure yet if this is an artifact of your plugin or the entire reason I am having trouble with this file… but I expect its the latter and I will follow up.
I’m not yet sure what you mean, but if it is stripping the double quotes when converting from comma-separated values to tabs and you don’t want that, be sure the box Preserve quotes, escapes and blanks when converting to tabbed is checked.
If you do not have that box checked, it will remove the quotes in your example when converting to tabbed. It won’t put them back when you convert back to comma-separated values because they aren’t required in a standards-compliant CSV file. I don’t have an option to quote all entries (though you could do that with regular expressions) — unless it interferes with your editing, it’s best to leave the quotes in place by checking the box if you need them.
If it’s removing quotes when you checked that box, there is a bug.
-
CSVLint is also good. It can handle some pretty pathological files.
Lastly, here’s a regular expression for matching a single column in any delimiter-separated variables file (replace all
NEWLINE
with your newline (\r
,\n
, or\r\n
),QUOTE_CHAR
with your quote character, andDELIMITER
with your delimiter.
(?'column'(?:(?!QUOTE_CHAR)(?:(?!NEWLINE)[^DELIMITER])*(?<!QUOTE_CHAR)|QUOTE_CHAR(?:[^QUOTE_CHAR]|\\QUOTE_CHAR)*QUOTE_CHAR))
Putting it into practice, here’s that regex used to match all seven columns in separate capture groups for a 7-column CSV file with"
as quote character and\r\n
newline:
^(?'column'(?:(?!")(?:(?!\r\n)[^,])*(?<!")|"(?:[^"]|\\")*")),((?&column)),((?&column)),((?&column)),((?&column)),((?&column)),((?&column))$
You can verify that a regex works for capturing 7 columns of a CSV file by testing if find/replacing your regex with
(?1a),(?2b),(?3c),(?4d),(?5e),(?6f),(?7g)
replaces every row of the file witha,b,c,d,e,f,g
-
Hello, @mark-olson and All,
I think I succeeded to simplify your generic regex. To be convinced of this fact, the magic regex, when QUOTE_CHAR =
"
and DELIMITER_CHAR =,
is :- MARK
(?:(?<=^")|(?<=","))[^\r\n]*?(?=","|"$)
Now, test this regex against the text below, with
63
examples, pasted in a new tab# With a COMMA or DOUBLE QUOTES char in MIDDLE of column : "abc,def","789" "abc"def","789" "123","abc,def","789" "123","abc"def","789" "123","abc,def" "123","abc"def" # With a COMMA or DOUBLE QUOTES char at BEGINNING of column : ",def","789" ""def","789" "123",",def","789" "123",""def","789" "123",",def" "123",""def" # With a COMMA or DOUBLE QUOTES char at END of column : "abc,","789" "abc"","789" "123","abc,","789" "123","abc"","789" "123","abc," "123","abc"" # With a COMMA or DOUBLE QUOTES char as a COMPLETE column : ",","789" ""","789" "123",",","789" "123",""","789" "123","," "123",""" # With an EMPTY column : "","789" "123","","789" "123","" # SINGLE column cases : "" "abc" ",def" "abc,def" "abc," "," ""def" "abc"def" "abc"" """ # MICELLANEOUS SINGLE columns : "ab""cd" "ab",cd" "ab,"cd" "ab,"",cd" "ab",,"cd" "ab,""",cd" "ab",,,"cd" "ab"",,cd" "ab,,""cd" """abcd" "",abcd" ","abcd" ","",abcd" "",,"abcd" ",""",abcd" "",,,"abcd" """,,abcd" ",,""abcd" "abcd""" "abcd"," # KO "abcd,"" "abcd,""," # KO "abcd",,"" "abcd,"""," # KO "abcd",,,"" "abcd"",," "abcd,,"""
-
As you can see, almost all the cases are correctly detected
-
Empty strings are also matched with the
zero length match
call-tip -
Three columns ONLY resist ( as it ends with the
","
string ). However, we should live with these very special cases ! -
The regex always get the total content of each column, minus the
3
cases, mentioned above ! -
NO need to escape a possible QUOTE_CHAR or DELIMITER_CHAR ( as
\"
or\,
) in the column contents, in order to get the all its content ;-))
So, for a seven columns
CSV
table exactly, we need this regex ( I shortened the column name tocol
) :- SEARCH
"(?'col'(?:(?:(?<=^")|(?<=","))[^\r\n]*?(?=","|"$)))","((?&col))","((?&col))","((?&col))","((?&col))","((?&col))","((?&col))"
Test it against the following text :
"abcd","efgh","ijkl","mnop","qrst","uvwx","yz01" ",abcd","efgh","ij,kl","mnop","qrst,","uvwx","," ""abcd","efgh","ij"kl","mnop","qrst"","uvwx",""" "","","","","","",""
Thus, the corresponding generic regex would be :
"(?'col'(?:(?:(?<=^QUOTE_CHAR)|(?<=QUOTE_CHARDELIMITER_CHARQUOTE_CHAR))[^NEW_LINE_CHAR(s)]*?(?=QUOTE_CHARDELIMITER_CHARQUOTE_CHAR|QUOTE_CHAR$)))","((?&col))","((?&col))","((?&col))","((?&col))","((?&col))","((?&col))"
Best Regards,
guy038
- MARK
-
@MyNameIsTooShort said in Can a CSV file be displayed and edited in a table fashion. Excel mangles it.:
It doesn’t however treat the data as columns. i.e. I can’t select a column of data to find/replace like you can in Excel - But this is not at all unexpected and probably too much to hope for in a text editor. But worth mentioning in case you had that trick in the queue.
You can also give the MultiReplace Plugin a try, ? It allows you to filter find and replace operations by a specific CSV column scope, offering precise editing within your CSV files.
-
@guy038
Your regex is better if everything is quoted, but that’s not really what I meant.In most CSV files, a column is only wrapped in quotes if it contains the separator or a newline.
In the above example (pretty colors courtesy of CSVLint), most columns aren’t wrapped in quotes except column 3 of row 5 and column 2 of row 6. The other part of my regex is the
(?:[^"]|\\")*
alternation inside the quotes, which is necessary to allow an escaped quote character to be present inside quotes.Using my column regex
(?'column'(?:(?!")(?:(?!\r\n)[^,])*(?<!")|"(?:[^"]|\\")*"))
convertsnums,names,cities,date,zone,subzone,contaminated nan,Bluds,BUS,,1,b,FALSE 0.5,dfsd,FUDG,12/13/2020 0:00,2,c,TRUE 1.2,qere,GOLAR,,3,f,TRUE 3.4,flodt,"comma,in \"column\"",,4,q,FALSE 4.6,"newline in column",QUZ,10/17/2014 0:00,5,"",TRUE 7,Unyir,MOKJI,5/11/2017 0:00,6,i,TRUE
to seven lines of
a,b,c,d,e,f,g
as expected.