Replace comma in text between quotes
-
I have numerous fields in an input file where it is like:
field1, field2, “field3, firstname”, field4, field5The end result is that I want to replace all of the commas with | except the items within the quotes and remove the quotes.
I’m not sure this is best done in Notepad++ or not.
I wasn’t able to figure out a way to do the full change in Notepad++ so I was attempting to do this in multiple steps. Thinking I could- modify the comma within the quotes to some other delimiter,
- modify all of the commas to the or bar |
- then modify the other delimited created in step 1 back to a comma and then remove the quotes
Any assistance is appreciated, thanks
-
@Stephen-Yorke said:
I’m not sure this is best done in Notepad++ or not.
I’m thinking Excel might be a better option. The comma is the “default” delimiter, however another character can become the delimiter. The quotes exist due to the current delimiter being ALSO part of a field. So when you change the delimiter Excel would not need to quote that field anymore.
Yes if could be done with regexs, possibly as 3 steps just as you suggest although I think 1 regex might also achieve it. Unfortunately I’m not in a position to provide one right now.
I do wonder though why this is necessary. Most programs reading in delimited data know how to deal with it correctly, such as your situation.
Terry
-
@Stephen-Yorke
I have now created a regex that will only target commas (,
) outside of quoted fields.
So we have (use search mode as regular expression and wraparound ticked).
Find What:,\h*(?![^,]+”)
Replace With:|
<----- note there is a space after the|
character. This is to put a space after every |.After running that you then can remove all quotes. I haven’t given you the regex for that, I will leave that for you to do, it is a very easy one to create.
Give it a go and let us know if it suits your needs. Although your 4th field is quoted, this will work on leaving ANY field alone if quoted.
Terry