Hello @texaskcfan, @Peterjones, @terry-r and All,
@texaskcfan, if I assume that :
Your list contains one record per line
The field separator is the comma, located right after the previous field and right before the next field
Any field is one of the following :
Any text, between two double quotes, beginning a line, possibly preceded by blank characters and followed by a comma, as "xxxxxx",
Any text, between two double quotes, itself surrounded by two commas, as ,"xxxxxx",
Any text, between two double quotes, preceded by a comma and ending a line or the very end of the current file, as ,"xxxxx"CRLF
Then the following regex, which contains a recursive pattern, finds every zone between two commas that contains an even number of " ( double-quote characters ) :
(?x) (?:^\h*|,) (?: ( " (?: [^"\r\n,]++ | (?1) )* " ) )+? (?=,|\R|\z)
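To make the recursion in group 1 easier to follow, here is a minimal Python sketch of the same grammar, written as a small backtracking matcher instead of a regex. The function names ( `group_ends`, `body_ends`, `zone_matches` ) are invented for this illustration, and the input is assumed to be one single zone, i.e. the text between two commas, without the commas. The sketch tries every alternative, so it may not backtrack exactly as the Notepad++ regex engine does :

```python
def group_ends(s, i):
    """Yield each index j such that s[i:j] matches group 1:
    a quote, then any mix of plain text and nested groups, then a quote."""
    if i < len(s) and s[i] == '"':
        yield from body_ends(s, i + 1)


def body_ends(s, i):
    """Continue inside a group from index i and yield each index
    located just past a possible closing quote of that group."""
    if i >= len(s):
        return
    if s[i] == '"':
        # Either this quote opens a nested group ( the (?1) branch ) ...
        for j in group_ends(s, i):
            yield from body_ends(s, j)
        # ... or it closes the current group.
        yield i + 1
    elif s[i] not in ",\r\n":
        # Plain text run, like [^"\r\n,]++ : consumed in one go.
        j = i
        while j < len(s) and s[j] not in '",\r\n':
            j += 1
        yield from body_ends(s, j)


def zone_matches(zone):
    """True if the whole zone is made of one or more consecutive groups."""
    def groups_from(i):
        for j in group_ends(zone, i):
            if j == len(zone) or groups_from(j):
                return True
        return False
    return groups_from(0)


print(zone_matches('"123 Main "St AptD""'))          # True  ( 4 double quotes )
print(zone_matches('"This fie"ld is NOT correct"'))  # False ( 3 double quotes )
```

Read this way, a zone is accepted when it begins and ends with a " and holds an even number of them, because every group contributes exactly one opening and one closing quote.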
For instance, take this sample, which you can paste into a new tab :
"0","0","","","","","","","","","","","","",""
"Field1","Field2","Field3","Field4","Field5","Field6","Field7","Field8","123 Main "St AptD"","","Dallas","TX","12345","","","","","","","","","","","","","","","","","","","","","","",""
"Field1","Field2","Field3","Field4","Field5","Field6","Field7","Field8","This fie"ld is NOT correct","","Dallas","TX","12345","","","","","","","","","","","","","","","","","","","","","","",""
," abcde","","ijk "123"," 987 "This is"a small""pie"""ce of"text for" tests" !!","","abc"de"fgh"ij","12345"
" abcde","","ijk "123"," 987 "This is"a small""pie"""ce of"text for" tests" !!","","abc"de"fgh"ij","12345"
"","","",","","","",""
","","","","",""
" """ """ """ "
" " " """ """ "
1234567890","","","","",""
" abcde","","ijk "123"," 987 "This is"a small""pie"""ce of"text for" tests" !!","","abc"de"fgh"ij","12345"
Open the Mark dialog ( Ctrl + M )
Paste the above regex in the Find what: zone
Preferably, tick the Purge for each search option
Possibly, tick the Wrap around option
Select the Regular expression search mode
Click on the Mark All button
=> All the fields ,........, containing an even number of " ( thus any correct field ) are marked in red. This means that any zone still unmarked contains an odd number of double quotes, which probably indicates one " character too many or too few ;-))
Note that, so that the text is scanned correctly, each match starts at the comma located before a zone "....." and ends at the closing ", right before the next comma !
You may, as well, use the Find dialog to step through the correct fields, one at a time !
Here is a snapshot of the sample, after the mark operation :
As you can see, every unmarked field is incorrect in some way and needs examination.
Note, also, that it’s impossible to build a regex with the opposite logic, i.e. one which would match the general case of any zone, between two commas, containing an odd number of double quotes !
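Outside of the regex engine, though, the opposite listing is easy to get. Here is a minimal Python sketch which reports the zones that the Mark operation would leave unmarked. It relies on the assumptions stated at the top of this post ( one record per line, no comma inside a field ) and on my reading that a zone is marked when it begins and ends with a " and holds an even number of them. The function name `unmarked_zones` is invented for this illustration :

```python
def unmarked_zones(text):
    """Return a list of (line_number, zone_number, zone) for every zone
    that does not look like a correct field."""
    suspects = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines, they contain no record
        for zone_no, zone in enumerate(line.split(","), start=1):
            if zone_no == 1:
                # Assumption: only spaces and tabs may precede the first field
                zone = zone.lstrip(" \t")
            well_formed = (
                len(zone) >= 2
                and zone.startswith('"')
                and zone.endswith('"')
                and zone.count('"') % 2 == 0
            )
            if not well_formed:
                suspects.append((line_no, zone_no, zone))
    return suspects


sample = '"a","b"c","d"\n"x","123 Main "St AptD"",""\n'
for line_no, zone_no, zone in unmarked_zones(sample):
    print(f"line {line_no}, zone {zone_no}: {zone}")
# prints: line 1, zone 2: "b"c"
```

Note that this sketch also reports an empty zone, such as the one before a leading comma, since it is not a quoted field.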
Best Regards
guy038