It is good that a solution here helped the OP, but as I read the problem statement, it struck me that, although sample data was provided, the larger data set may not always conform to the key parts of the sample. Only the OP knows this (or could find out by looking at more of the data set). People proposing solutions can’t know it, so they may “key off” things that aren’t true in the larger data set.
I guess my basic point is that OP should really look hard at the data set and make sure that the chosen solution does what is intended for all of it. This is somewhat obvious, but it is easy to “grab and go” with a solution, and then find out later that you haven’t covered all cases, or worse, that you’ve irretrievably corrupted your data.
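One way to do that kind of check, before running any replacement, is to list every line that does not fit the shape the solution assumes. What follows is only a rough sketch: the sample lines, the `assumed` pattern, and the function name `find_nonconforming` are all made up for illustration and have nothing to do with OP’s actual data.

```python
import re

def find_nonconforming(lines, pattern):
    """Return (line_number, text) pairs for lines the pattern does not fully match."""
    rx = re.compile(pattern)
    return [(n, text) for n, text in enumerate(lines, start=1)
            if not rx.fullmatch(text)]

# Hypothetical data: the first two lines look like the "sample",
# the last two are the kind of surprise a larger data set can hide.
sample = ["2021-01-05,alpha", "2021-02-17,beta", "17/02/2021,gamma", ""]

# Hypothetical pattern a proposed solution might silently rely on.
assumed = r"\d{4}-\d{2}-\d{2},\w+"

for n, text in find_nonconforming(sample, assumed):
    print(f"line {n} does not match the assumed pattern: {text!r}")
```

If this reports nothing on the full data set, the solution’s assumptions at least hold everywhere; if it reports something, those lines deserve a look before anything is changed. Either way, it is worth running the real conversion on a copy of the data first.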
It always seems easier to “trust” that someone else’s solution is good for your problem. (Think about “code-grab-and-go” from a Stack Overflow answer.) When you’re developing your own solution, you tend to look at things with a closer and more critical eye.
Of course, if a problem with the data conversion is noticed down the road, OP could come back here and “complain” that the solution was deficient, only to be told that the solution was adequate for the sample data and problem description given.