Double quote problem in CSV

Hi all

I am using the following code to read a CSV file, and I want to save its
contents into the database.

@parsed_file = CSV.open("filename.csv", 'r', "#{col_sep}")
@parsed_file.each_with_index do |row, index|
  # more code here
end

My problem is that I am unable to read the data correctly when my data
fields look like this:
"abc", "sdfds"string"ddsfdsf", "xyz", "pqr"

It gives me an "illegal file format" error at the above line.

Regards
Salil

--
Posted via http://www.ruby-forum.com/.

Salil Gaikwad wrote:

  My problem is that I am unable to read the data correctly when my data
fields look like this:
"abc", "sdfds"string"ddsfdsf", "xyz", "pqr"

It gives me an "illegal file format" error at the above line.

That would be because that line is indeed wrongly formatted.

CSV files which have quote-delimited fields must double any quotes which
appear within the field. Your line above should be

"abc","sdfds""string""ddsfdsf","xyz","pqr"

This is the output you'll get if you export as CSV from Excel, for
example.
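To see the doubling rule in action, here is a minimal sketch using Ruby's standard csv library. The sample line is the corrected one above; force_quotes is used only so that the written line matches it exactly.

```ruby
require 'csv'

# The corrected line: embedded quotes are doubled inside the quoted field.
valid_line = '"abc","sdfds""string""ddsfdsf","xyz","pqr"'

fields = CSV.parse_line(valid_line)
# The doubled quotes come back as single quotes in the second field.
puts fields[1]   # prints sdfds"string"ddsfdsf

# Writing the fields back out doubles the embedded quotes again.
round_trip = CSV.generate_line(fields, force_quotes: true)
puts round_trip
```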

So I suggest you fix up your data source to use valid CSV. If it is not
valid CSV, you will end up having to write your own parser for it.

But there are good reasons for the doubling-up rule. Imagine, for
example, what happens if a field contains the sequence
quote-comma-quote. How could you distinguish between that being data
within one field, or the end of the field and the start of the next one?
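As a small illustration of that point (the data here is made up for the example), a field containing quote-comma-quote survives a round trip only because the writer doubles the embedded quotes:

```ruby
require 'csv'

# Hypothetical data: the middle field itself contains quote-comma-quote.
original = ['abc', 'x","y', 'pqr']

# The writer quotes the middle field and doubles its embedded quotes.
line = CSV.generate_line(original)
puts line

# Because of the doubling, the reader can tell data from delimiters.
parsed = CSV.parse_line(line)
puts parsed.length   # prints 3
```

Without the doubling, the same bytes could just as well be read as four fields.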

Thanx Brian,
  I know it's a wrongly formatted CSV file, but I just want to know
whether there is any possibility of reading a file like this. Actually, a
lot of files get uploaded in my application, and the files I receive are
formatted as above. So, thanks again. Please reply if you know anything
related to this, so that I can move in the right direction.

Regards
Salil

As multiple people have told you for days now, you will need to build your own parser for non-CSV data. I wish there was some shortcut we could give you, but that's still our best answer.

James Edward Gray II

On Mar 3, 2009, at 9:13 AM, Salil Gaikwad wrote:

I know it's a wrongly formatted CSV file, but I just want to know
whether there is any possibility of reading a file like this. Actually, a
lot of files get uploaded in my application, and the files I receive are
formatted as above. So, thanks again. Please reply if you know anything
related to this, so that I can move in the right direction.

Salil Gaikwad wrote:

  I know it's a wrongly formatted CSV file, but I just want to know
whether there is any possibility of reading a file like this. Actually, a
lot of files get uploaded in my application, and the files I receive are
formatted as above. So, thanks again. Please reply if you know anything
related to this, so that I can move in the right direction.

If the fields don't contain commas, maybe

    line.split(/\s*,\s*/)

will be sufficient.
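For instance, a rough sketch of that approach on the sample line might look like this. Stripping one leading and one trailing quote from each piece is my own addition, and it assumes no field contains a comma:

```ruby
line = '"abc", "sdfds"string"ddsfdsf", "xyz", "pqr"'

# Split on commas (with optional surrounding whitespace), then strip one
# leading and one trailing quote from each piece.
fields = line.chomp.split(/\s*,\s*/).map do |field|
  field.sub(/\A"/, '').sub(/"\z/, '')
end

p fields
```

Note that String#split drops trailing empty fields by default; passing -1 as the second argument keeps them if that matters for your data.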

Otherwise, you'll need to write something yourself, and to do this
you'll need to start by working out what rules you want to apply in
order to parse this strange format. For example, what would you expect
from parsing these?

"abc","def"","ghi"
"abc","def",","ghi"
"abc","def,","ghi"
"abc","def,",","
"abc","def,",","ghi"
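To make the question concrete, here is one possible rule turned into code. This is only a sketch under an assumed rule, not a standard: a quote closes a field only when it is followed by optional whitespace and then a comma or the end of the line. The helper name parse_sloppy_line is hypothetical.

```ruby
# Hypothetical helper; the rule it encodes is an assumption, not a standard.
# A field starts at a quote and ends at the first quote that is followed by
# optional whitespace and then a comma or the end of the line.
# Text that does not fit this pattern is silently skipped by scan.
def parse_sloppy_line(line)
  line.chomp.scan(/"(.*?)"\s*(?:,\s*|\z)/).flatten
end

p parse_sloppy_line('"abc", "sdfds"string"ddsfdsf", "xyz", "pqr"')
# => ["abc", "sdfds\"string\"ddsfdsf", "xyz", "pqr"]

# The ambiguous lines above still parse, but whether the result is what
# you want depends entirely on the rule you chose.
p parse_sloppy_line('"abc","def"","ghi"')
p parse_sloppy_line('"abc","def,",","ghi"')
```

A different rule would give different answers for the last two lines, which is exactly why the rules have to be decided first.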

I found a solution to this problem, but I don't know whether it's the
correct way, nor how to do it programmatically.

  My problem is that I am unable to read the data correctly when my data
fields look like this:
"abc", "sdfds "string"ddsfdsf", "xyz", "pqr"

It gives me an "illegal file format" error at the above line.

But when I open my file in Excel, save it, and close it, my data
looks like this:
abc, "sdfds string""ddsfdsf""", xyz, pqr

Though I don't get the data I wanted, it also doesn't give me the illegal
file format error.

The above data is an exceptional case, but because of it I can't read my
whole file, so getting somewhat manipulated data for such an exceptional
case is OK.

My question is: is it possible to do this using Rails?

Oh, sit back down James, let me take this reply. ;-)

I don't know what version of Excel you are using, but I've got Microsoft Office 2004 for Mac on my system and if I start with your text in /tmp/notcsv.csv, open it in Excel, put =LEN(A1) in cell A2 and '=LEN(A1) in cell A3 (and similarly for B2..C3), save it as ~/Documents/fromExcel.csv as a "CSV (Comma-delimited)" type, then I get this:

$ head /tmp/notcsv.csv ~/Documents/fromExcel.csv
==> /tmp/notcsv.csv <==
"abc", "sdfds "string"ddsfdsf", "xyz", "pqr"

==> /Users/rab/Documents/fromExcel.csv <==
abc," ""sdfds ""string""ddsfdsf"""," ""xyz"""," ""pqr"""
3,24,7,6
=LEN(A1),=LEN(B1),=LEN(C1),=LEN(D1)

So you seem to be missing some quotes in your output. Note that the first field had quotes in the original, but they are not required (no leading spaces, no comma or quote in the value). The original file, while technically invalid, is read by Excel and interpreted in a reasonable way. The output is strictly conforming to the CSV spec (although yours seems not to be).

As others have said, you'll have to parse it yourself if you need to accept a sloppy input format.

You also seem to have the impression that rails is somehow "more" than ruby. It isn't "more" than ruby, it just is written in ruby like any other ruby program you might write to handle the data. Just decide how each of the line types ought to be interpreted, write down the rules that result, and turn them into code.
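As a sketch of what "turn them into code" could look like in plain Ruby: try the strict parser first, and fall back to a lenient rule only for the lines it rejects. The helper name parse_with_fallback and the fallback rule (split on commas, strip the outer quotes) are assumptions for illustration; substitute whatever rules you actually decide on.

```ruby
require 'csv'

# Hypothetical helper: strict CSV parsing first, lenient fallback second.
# The fallback assumes that no field contains a comma.
def parse_with_fallback(line)
  CSV.parse_line(line)
rescue CSV::MalformedCSVError
  line.chomp.split(/\s*,\s*/).map { |f| f.sub(/\A"/, '').sub(/"\z/, '') }
end

# A valid line goes through the standard parser.
good = parse_with_fallback('"abc","def","ghi"')
p good

# The sloppy line is rejected by the standard parser and handled by the fallback.
bad = parse_with_fallback('"abc", "sdfds "string"ddsfdsf", "xyz", "pqr"')
p bad
```

This way the exceptional lines no longer prevent the rest of the file from being read.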

If you insist on asking questions here, then at least LISTEN to the responses that you receive. Follow-up questions should include CODE that demonstrates two things: that you have absorbed the previous response and that you have made some additional progress which has led you to a new problem. (Or you don't understand the response and need a particular aspect clarified.)

-Rob

Rob Biedenharn http://agileconsultingllc.com
Rob@AgileConsultingLLC.com

On Mar 3, 2009, at 11:55 PM, Salil Gaikwad wrote:

I found a solution to this problem, but I don't know whether it's the
correct way, nor how to do it programmatically.

My problem is that I am unable to read the data correctly when my data
fields look like this:
"abc", "sdfds "string"ddsfdsf", "xyz", "pqr"

It gives me an "illegal file format" error at the above line.

But when I open my file in Excel, save it, and close it, my data
looks like this:
abc, "sdfds string""ddsfdsf""", xyz, pqr

Though I don't get the data I wanted, it also doesn't give me the illegal
file format error.

The above data is an exceptional case, but because of it I can't read my
whole file, so getting somewhat manipulated data for such an exceptional
case is OK.

My question is: is it possible to do this using Rails?

Thanks very much. :-)

James Edward Gray II

On Mar 3, 2009, at 11:31 PM, Rob Biedenharn wrote:

Oh, sit back down James, let me take this reply. ;-)