Advice on handling malformed CSV with FasterCSV

All,

FasterCSV 1.0.0
Ruby 1.8.4

I am writing functionality to replace an older system. Part of this
system accepts uploaded CSV files, parses them, and displays their contents.
When confronted with a file including one line that looks like this:

"data",'otherdata"

the legacy app happily parses this into two fields with the values data
and otherdata.

However, FasterCSV fails because of the quote mismatch in the second field
and raises a MalformedCSVError when you attempt to manipulate the object
returned from FasterCSV.open().

I've taken a look at the code and I can see where the quote matching is
enforced and so I have a pretty good idea of how to "fix" this (which I
am only even considering because I'm trying to match the legacy
functionality).

I need some advice. My questions are:

1) Is there a way to get FasterCSV to accept input like the above? How
can I set up a custom parsing scheme?
2) Should I modify the FasterCSV parser directly to accept this sort of
input?
3) Should I pre-process the file to eliminate/modify these sorts of
cases to improve the likelihood of a successful FasterCSV parse?

Thanks,
Wes

--
Posted via http://www.ruby-forum.com/.

Wes Gamble wrote:

/ ...

I need some advice. My questions are:

1) Is there a way to get FasterCSV to accept input like the above? How
can I set up a custom parsing scheme?
2) Should I modify the FasterCSV parser directly to accept this sort of
input?
3) Should I pre-process the file to eliminate/modify these sorts of
cases to improve the likelihood of a successful FasterCSV parse?

I strongly recommend that you normalize the database to conform to CSV
conventions, before performing any processing. This is much to be preferred
to rewriting a standard library in such a way that it accepts the malformed
data.

The original database doesn't meet normal CSV conventions. If it is expected
to exist for any length of time in its present form, you may find yourself
rewriting a lot of parsers to get them to accept it, or you can fix the
database itself, once. I think the latter approach makes more sense.
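
A minimal sketch of that kind of one-time normalization pass, assuming the fields never contain embedded commas or newlines (if they can, this naive split will not do). The helper name normalize_line is hypothetical, and the example uses the CSV library from Ruby 1.9 and later, which is FasterCSV under its newer name; with the gem on 1.8, FasterCSV.parse_line should be the equivalent call.

```ruby
require "csv"  # Ruby 1.9+; with the FasterCSV gem on 1.8, use FasterCSV instead

# Hypothetical helper, not part of FasterCSV: rewrite one raw line so that
# every field is wrapped in a matched pair of double quotes.
# Assumption: no field contains a comma or a line break.
def normalize_line(line)
  fields = line.chomp.split(",", -1).map do |raw|
    # Drop stray leading and trailing quote characters, single or double.
    value = raw.strip.sub(/\A["']+/, "").sub(/["']+\z/, "")
    # Re-quote the field, doubling any embedded double quotes.
    '"' + value.gsub('"', '""') + '"'
  end
  fields.join(",")
end

raw   = %q{"data",'otherdata"}
clean = normalize_line(raw)     # => "\"data\",\"otherdata\""
row   = CSV.parse_line(clean)   # => ["data", "otherdata"]
```

Running every line of the upload through a pass like this before handing it to the parser leaves the library untouched and keeps the cleanup rules in one place.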

--
Paul Lutus
http://www.arachnoid.com

Paul Lutus wrote:

I strongly recommend that you normalize the database to conform to CSV
conventions, before performing any processing. This is much to be preferred
to rewriting a standard library in such a way that it accepts the malformed
data.

I agree.

Thanks, Paul.

Wes

--
Posted via http://www.ruby-forum.com/.