FasterCSV RCR?

James_Edward_Gray_II · 5 June 2006 14:50

I can understand your frustration about this point. When I wrote csv.rb
at first, I thought all csv users would do the following when I define
reader style.

  CSV.open("filename.csv", "r") do |reader|
    reader.each do |row|
      ...do something...
    end
  end

Why don't we just write like this;

  CSV.open("filename.csv", "r") do |row|
    ...do something...
  end

That's why we have foreach(). Better to use that and gain all the familiarity of Ruby programmers who are used to things working that way.
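To make the two styles concrete, here is a minimal sketch. It assumes a current Ruby, where the standard csv library descends from FasterCSV, so the exact behavior of the 2006 csv.rb may differ. The sample file and its contents are made up for the demo.

```ruby
require "csv"
require "tempfile"

# Hypothetical sample data written to a temp file for the demo.
file = Tempfile.new(["sample", ".csv"])
file.write("name,age\nalice,30\nbob,25\n")
file.close

# Reader style: open a CSV object, then iterate over it.
reader_rows = []
CSV.open(file.path, "r") do |csv|
  csv.each { |row| reader_rows << row }
end

# foreach style: rows are yielded directly, the way IO.foreach yields lines.
foreach_rows = []
CSV.foreach(file.path) { |row| foreach_rows << row }

file.unlink
```

Both loops collect the same rows; foreach() just skips the intermediate reader object.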

I know you are considering that IO-ish methods are important. But I
don't think CSV object should handle IO methods like fcntl, fileno,
seek, tell, tty?, and so on. Would you please tell me typical and
pragmatic examples of reader style, except 'each'?

If people only did what I could think of, programming would be very boring. It took me five or ten minutes to make all those methods available and now they are there if someone needs them.

I can tell you that it has already come in handy. I got a bug report that the line numbers in errors were off, because CSV allows embedded \n characters in fields. To fix it, I overrode IO's lineno() method with correct behavior. This seems very natural and the added bonus is that you can now get a CSV aware line number.
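A small sketch of what a CSV-aware line number means, assuming a current Ruby whose standard csv library descends from FasterCSV (the 2006 release may have behaved differently). The sample data is made up.

```ruby
require "csv"

# Two CSV rows spread over three physical lines: the second field of the
# first row contains an embedded newline.
data = %Q{1,"two\nlines"\n3,4\n}

csv = CSV.new(data)
first  = csv.shift
second = csv.shift

row_count      = csv.lineno        # counts CSV rows read so far
physical_lines = data.lines.size   # counts newline-terminated lines in the raw text
```

The two counts disagree as soon as a quoted field contains a newline, which is the situation behind the bug report.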

I'm confused about why CSV does this, since it offers the foreach()
method, which normally fills this role.

foreach and readlines are added recently from IO. Now I think it was a
bad choice though...

That makes me sad to hear. foreach() is easily my most used method with CSV and FasterCSV. I like readlines() too.

I still can't think of any good reason not to just follow Ruby's interface as much as is possible and natural. To do anything else forces programmers to adapt their expectations for no reason I can understand.

* I always have to think, "Now do I want the *_line() method or the
*_row() method here..."

Users don't need to use *_line and *_row methods I think. When do you
use generate_line?

I'm pretty sure we want to have our CSV library support data not in files. Am I missing something? Is there a better way to get a CSV string with your library?
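As an illustration of CSV data that never touches a file, here is a minimal sketch using generate_line() and parse_line(). It assumes a current Ruby, where the standard csv library descends from FasterCSV; the field values are arbitrary.

```ruby
require "csv"

# Build one CSV line in memory; the field containing a comma gets quoted.
line = CSV.generate_line(["1", "2", "3,4", "5"])

# Parse that string straight back into an array of fields.
fields = CSV.parse_line(line)
```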

* Most methods take a field separator and a row separator, but
foreach() and readlines() only take the row separator.

See IO.foreach and IO.readlines.

That's comparing apples and oranges. IO.foreach() doesn't need to be aware of fields, but CSV.foreach() does. IO.open() doesn't support a field separator or a row separator, but your CSV.open() does because it is needed.

* I have to set a field separator when I really just want to set a row
separator.

csv.rb in the svn repository supports a pseudo-keyword-style method
argument. I'll merge it into Ruby's csv repository before the next release.
http://dev.ctor.org/csv/browser/trunk/lib/csv.rb

# I defined keywords :fs and :rs but it should be :col_sep and :row_sep
# in conformity with faster_csv.

:fs and :rs are fine with me. It's consistent with your interface.
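For reference, a sketch of what the keyword style buys you, using the :col_sep and :row_sep names mentioned above. It assumes a current Ruby, where the standard csv library descends from FasterCSV; the sample strings are invented.

```ruby
require "csv"

# Set both separators: ";" between fields, "|" between rows.
rows = CSV.parse("a;b|c;d|", col_sep: ";", row_sep: "|")

# Set only the row separator; the field separator keeps its default (",").
only_row_sep = CSV.parse("a,b|c,d|", row_sep: "|")
```

With named options there is no need to pass a field separator just to reach the row separator.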

Here's a selection of some features from my CHANGELOG that I am not
aware of in CSV:

Thanks. I'll look into this. I hope those features are pluggable into
csv.rb and other modules like DBI, spreadsheet related things, HTML
table formatters, etc. I think some of these features are table
specific, not CSV.

This leads me naturally to the question: is there any good reason to reinvent FasterCSV, when we could just use FasterCSV?

James Edward Gray II

···

On Jun 4, 2006, at 7:55 PM, NAKAMURA, Hiroshi wrote:

Daniel_N1 · 29 May 2006 01:30

Sorry to stray off topic here, but are there any tutorials for FasterCSV?

I could not find one from Mr Google and I would like to use it for a small
project I'm working on.

Thanx
Dan

···

On 5/28/06, James Edward Gray II <james@grayproductions.net> wrote:

On May 27, 2006, at 9:09 PM, NAKAMURA, Hiroshi wrote:

>>> Just replace csv.rb with faster_csv.rb.
>>
>> I just don't want to break a lot of software.
>
> I understand that it's a compensation of speed.

Only if you go through that interface. The FasterCSV interface is
still quite quick.

Let me rethink it a little. It was optimized for developer
productivity when I built it. I might be able to do better looking
at it from the idea of easy transitioning for the users.

Of course, I handle open() quite differently, so we're going to have
problems merging both models of that method in the CSV class. Hmm...

James Edward Gray II

NAKAMURA_Hiroshi4 · 6 June 2006 01:42

Hi,

James Edward Gray II wrote:

That's why we have foreach(). Better to use that and gain all the
familiarity of Ruby programmers who are used to things working that way.

I still can't think of any good reason not to just follow Ruby's
interface as much as is possible and natural. To do anything else
forces programmers to adapt their expectations for no reason I can
understand.

I think I still have not been able to explain the difference between our
viewpoints well. You think a CSV object is an IO. But I don't think so,
and I defined Writer and Reader in csv.rb. It's not 'natural'
from my viewpoint. That's why I think 'foreach' and 'readlines' should
not be added.

The sentence "Comma Separated Value is an IO" sounds strange to me. What
do you think about it? FasterCSV should be CSVIO or CSV::IO, no?

I know you are considering that IO-ish methods are important. But I
don't think CSV object should handle IO methods like fcntl, fileno,
seek, tell, tty?, and so on. Would you please tell me typical and
pragmatic examples of reader style, except 'each'?

If people only did what I could think of, programming would be very
boring. It took me five or ten minutes to make all those methods
available and now they are there if someone needs them.

Agreed to the first sentence. But I don't think we should do all we can
do even if it's easy.

I can tell you that it has already come in handy. I got a bug report
that the line numbers in errors were off, because CSV allows embedded \n
characters in fields. To fix it, I overrode IO's lineno() method with
correct behavior. This seems very natural and the added bonus is that
you can now get a CSV aware line number.

Thank you for the example. CSVIO#lineno or CSV::IO#lineno seems
reasonable to me.

But half of the methods you defined as delegators still do not seem
meaningful to me.

  # * binmode()
  # * close()
  # * close_read()
  # * close_write()
  # * closed?()
  # * eof()
  # * eof?()
  # * fcntl()
  # * fileno()
  # * flush()
  # * fsync()
  # * ioctl()
  # * isatty()
  # * pid()
  # * pos()
  # * reopen()
  # * rewind()
  # * seek()
  # * stat()
  # * sync()
  # * sync=()
  # * tell()
  # * to_i()
  # * to_io()
  # * tty?()

# above is excerpted from faster_csv.rb/0.2.0
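To show the general technique under discussion, here is a rough sketch of delegating a few IO methods from a wrapper object with Ruby's Forwardable module. The LineReader class is hypothetical and is not taken from faster_csv.rb; its comma splitting is deliberately naive and is not real CSV parsing.

```ruby
require "forwardable"
require "stringio"

# Hypothetical wrapper, for illustration only.
class LineReader
  extend Forwardable

  # Hand these calls straight through to the wrapped IO object.
  def_delegators :@io, :rewind, :eof?, :pos, :close, :closed?

  def initialize(io)
    @io = io
  end

  # Yield each line as an array of fields (naive split, no quoting rules).
  def each
    return enum_for(:each) unless block_given?
    @io.each_line { |line| yield line.chomp.split(",") }
  end
end

reader = LineReader.new(StringIO.new("a,b\nc,d\n"))
first_pass = reader.each.to_a
at_end = reader.eof?
reader.rewind                      # delegated to the StringIO
second_pass = reader.each.to_a
```

Each delegated name costs one symbol in the def_delegators list, which is why exposing many of them is cheap; whether all of them are meaningful for a CSV object is the design question here.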

* I always have to think, "Now do I want the *_line() method or the
*_row() method here..."

Users don't need to use *_line and *_row methods I think. When do you
use generate_line?

I'm pretty sure we want to have our CSV library support data not in
files. Am I missing something? Is there a better way to get a CSV
string with your library?

Please use CSV::Writer for that.

str = ''
writer = CSV::Writer.create(str)
writer << [1,2,3]
...
writer << [x,y,z]
writer.close
puts str
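For comparison, a sketch of the same idea in a current Ruby, where CSV::Writer no longer exists and the standard csv library descends from FasterCSV. The second row uses placeholder strings, since x, y and z above are unspecified.

```ruby
require "csv"

# CSV.generate collects the rows appended in the block into one string.
str = CSV.generate do |csv|
  csv << [1, 2, 3]
  csv << ["x", "y", "z"]   # placeholder values
end

puts str
```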

* Most methods take a field separator and a row separator, but
foreach() and readlines() only take the row separator.

See IO.foreach and IO.readlines.

That's comparing apples and oranges. IO.foreach() doesn't need to be
aware of fields, but CSV.foreach() does. IO.open() doesn't support a
field separator or a row separator, but your CSV.open() does because it
is needed.

Hmm. I think "same name and different method arguments" is a bad design
because it confuses users. But you already use (pseudo) keyword
argument style so you are thinking "but just adding arguments could be a
good design", right?

It could be. I need more time to think about it.

Here's a selection of some features from my CHANGELOG that I am not
aware of in CSV:

Thanks. I'll look into this. I hope those features are pluggable into
csv.rb and other modules like DBI, spreadsheet related things, HTML
table formatters, etc. I think some of these features are table
specific, not CSV.

This leads me naturally to the question: is there any good reason to
reinvent FasterCSV, when we could just use FasterCSV?

I wrote 'introduce' and meant 'I won't reinvent table specific
implementations. I'll just get it from faster_csv, if it is pluggable'.

Regards,
// NaHi


Logan_Capaldo · 29 May 2006 02:14

It's not exactly a tutorial, but the examples in the docs [1] should be enough to get you started.

[1] http://fastercsv.rubyforge.org/classes/FasterCSV.html

···

On May 28, 2006, at 9:30 PM, Daniel N wrote:

Sorry to stray off topic here, but are there any tutorials for FasterCSV?

I could not find one from Mr Google and I would like to use it for a small
project I'm working on.

Thanx
Dan

James_Edward_Gray_II · 6 June 2006 13:48

Yeah, to me CSV is just another data source I want to read from/write to with slightly special handling of the lines.

The good news is that our users probably don't care what we think. If we give them a quick and convenient way to read and write CSV, I think they'll be happy.

Best of luck with your upgrades!

James Edward Gray II

···

On Jun 5, 2006, at 8:42 PM, NAKAMURA, Hiroshi wrote:

I think I still have not been able to explain well what's the difference
of our viewpoint I think. You think a CSV object is an IO. But I don't
think so and I defined Writer and Reader in csv.rb. It's not 'natural'
from my viewpoint. That's why I think 'foreach' and 'readlines' should
not be added.

Daniel_N1 · 29 May 2006 03:39

Thanx Logan,

Sorry I should have been a bit clearer. I read those but what I had trouble
with was when I receive the csv file from a web form. It gives me a
StringIO object and I don't know what to do with it.

Any help is greatly appreciated.

···

On 5/29/06, Logan Capaldo <logancapaldo@gmail.com> wrote:

On May 28, 2006, at 9:30 PM, Daniel N wrote:

> Sorry to stray off topic here, but are there any tutorials for
> FasterCSV?
>
> I could not find one from Mr Google and I would like to use it for
> a small
> project I'm working on.
>
> Thanx
> Dan

It's not exactly a tutorial, but the examples in the docs [1] should
be enough to get you started.

[1] http://fastercsv.rubyforge.org/classes/FasterCSV.html

James_Edward_Gray_II · 29 May 2006 04:08

> Thanx Logan,
>
> Sorry I should have been a bit clearer. I read those but what I had trouble
> with was when I receive the csv file from a web form. It gives me a
> StringIO object and I don't know what to do with it.
>
> Any help is greatly appreciated.

FasterCSV handles StringIO objects just fine:

>> require "stringio"
=> true
>> require "fastercsv"
=> true
>> data = StringIO.new(%Q{1,2,"3,4",5})
=> #<StringIO:0x6ce300>
>> FasterCSV.parse(data)
=> [["1", "2", "3,4", "5"]]

Hope that helps.

James Edward Gray II

···

On May 28, 2006, at 10:39 PM, Daniel N wrote:

Daniel_N1 · 29 May 2006 04:12

Cheers. thanx so much for that.

I'll get out of your thread now

···

On 5/29/06, James Edward Gray II <james@grayproductions.net> wrote:

On May 28, 2006, at 10:39 PM, Daniel N wrote:

> Thanx Logan,
>
> Sorry I should have been a bit clearer. I read those but what I
> had trouble
> with was when I receive the csv file from a web form. It gives me a
> StringIO object and I don't know what to do with it.
>
> Any help is greatly appreciated.

FasterCSV handles StringIO objects just fine:

>> require "stringio"
=> true
>> require "fastercsv"
=> true
>> data = StringIO.new(%Q{1,2,"3,4",5})
=> #<StringIO:0x6ce300>
>> FasterCSV.parse(data)
=> [["1", "2", "3,4", "5"]]

Hope that helps.

James Edward Gray II
