How do you get the rows out of FasterCSV?

Hi. I want to add a normalized column to a csv file. That is, I want to
read the file, sum all of a column X, then add another column in which
each X is divided by the sum. So how do I use the CSV rows twice without
reading the file twice?

The examples in the FasterCSV documentation at
http://fastercsv.rubyforge.org/ show class methods that provide
file-like operations. I want the file opened once, read in full, and
then closed. While reading the first time, the column sum is calculated.
But then I want to go through the csv rows again, this time writing out
the rows with their new column.

Here's a sketch of what I had in mind. It doesn't work as intended...

  # read csv file, summing the values
  sum = 0
  csv = FasterCSV.new(open(csv_filename), {:headers=>true}).each_with_index do |row, c|
    sum += row["VAL"].to_f
  end

  # now write
  FasterCSV.open("test_csv_file.csv", "w", {:headers=>true}) do |csvout|
    csv.each_with_index do |row, c|
      row["NORMED"] = row["VAL"].to_f / sum
      csvout << row.headers if c==0
      csvout << row
    end
  end

Also, is there a more graceful way to have the headers written out?

···

--
Posted via http://www.ruby-forum.com/.

...snip...

# Read all data into a table of FasterCSV::Rows; may run into memory issues.
csvData = FasterCSV.read('/path/to/infile.csv', :headers=>true)

sum = 0.0
csvData.each{|row| sum += row['ColX'].to_f}

FasterCSV.open("path/to/outfile.csv", "w") do |csv|
    csvData.each{ |row|
        # Calculate the normalized value, append it, and output the row.
        csv << (row.fields << row['ColX'].to_f / sum)
    }
end

Not tested, or even executed, but I think it's close 9^)

Cheers
Chris
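For anyone reading this later: a minimal sketch of the same read-once, write-with-headers idea, assuming Ruby 1.9 or later, where FasterCSV became the standard `csv` library. The file names and the `ID`/`VAL`/`NORMED` columns are made up for illustration, and passing `:write_headers` with `:headers` is just one way to get the header row written.

```ruby
require 'csv'
require 'tmpdir'

result = Dir.mktmpdir do |dir|
  infile  = File.join(dir, 'in.csv')   # hypothetical input file
  outfile = File.join(dir, 'out.csv')  # hypothetical output file
  File.write(infile, "ID,VAL\na,1\nb,3\n")

  # Read once into memory; the file is closed when CSV.read returns.
  table = CSV.read(infile, headers: true)
  sum   = table['VAL'].sum { |v| v.to_f }

  # :write_headers emits the :headers array before the first data row.
  CSV.open(outfile, 'w', write_headers: true,
                         headers: table.headers + ['NORMED']) do |out|
    table.each { |row| out << row.fields + [row['VAL'].to_f / sum] }
  end

  File.read(outfile)
end

puts result
```

The rows stay in memory between the two passes, so this only suits files that fit comfortably in RAM.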

···

On Feb 5, 11:44 pm, Gary <gb3...@excite.com> wrote:

> Hi. I want to add a normalized column to a csv file. That is, I want to
> read the file, sum all of a column X, then add another column in which
> each X is divided by the sum. So how do I use the CSV rows twice without
> reading the file twice?

harp:~ > cat a.rb
require 'rubygems'
require 'fastercsv'

csv = <<-csv
f,x
a,0
b,1
c,2
d,3
csv

fcsv = FasterCSV.new csv, :headers => true
table = fcsv.read

xs = table['x'].map{|x| x.to_i}
sum = Float xs.inject{|sum,i| sum += i}
norm = xs.map{|x| x / sum}

table['n'] = norm
puts table

harp:~ > ruby a.rb
f,x,n
a,0,0.0
b,1,0.166666666666667
c,2,0.333333333333333
d,3,0.5

-a
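A rough equivalent of the script above for newer Rubies, assuming the standard `csv` library (the successor to FasterCSV) and its built-in `:numeric` converter. This is only a sketch of the same table-based approach, not a tested drop-in replacement.

```ruby
require 'csv'

data = <<~CSV
  f,x
  a,0
  b,1
  c,2
  d,3
CSV

# Parse once into an in-memory table; :numeric turns the x fields into numbers.
table = CSV.parse(data, headers: true, converters: :numeric)

xs  = table['x']            # the whole column as an Array
sum = xs.sum.to_f

# Assigning an Array to a new header adds one value per row.
table['n'] = xs.map { |x| x / sum }

puts table.to_csv
```

The printed floats may show more digits than the 2007 output above, since newer Rubies format floats differently.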

···

On Tue, 6 Feb 2007, Gary wrote:

--
we can deny everything, except that we have the possibility of being better.
simply reflect on that.
- the dalai lama

> require 'rubygems'
> require 'fastercsv'

> csv = <<-csv
> f,x
> a,0
> b,1
> c,2
> d,3
> csv

> fcsv = FasterCSV.new csv, :headers => true
> table = fcsv.read

Or just:

table = FCSV.parse(csv, :headers => true)

> xs = table['x'].map{|x| x.to_i}

When you want to convert a field, ask FasterCSV to do it for you while reading. This changes the above to:

table = FCSV.parse(
   csv,
   :headers => true,
   :converters => lambda { |f, info| info.header == "x" ? f.to_i : f }
)

Or using a built-in converter:

table = FCSV.parse(csv, :headers => true, :converters => :integer)

> sum = Float xs.inject{|sum,i| sum += i}

This is now simplified to:

sum = Float table['x'].inject{|sum,i| sum += i}

> norm = xs.map{|x| x / sum}

> table['n'] = norm
> puts table

James Edward Gray II
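A small sketch of the two converter styles described above, written against the standard `csv` library of Ruby 1.9+ (an assumption; the thread itself uses the FasterCSV gem). The variable names are illustrative only.

```ruby
require 'csv'

data = "f,x\na,0\nb,1\nc,2\nd,3\n"

# Custom converter: only fields under the "x" header become Integers.
to_int_x = lambda { |field, info| info.header == 'x' ? field.to_i : field }
custom   = CSV.parse(data, headers: true, converters: to_int_x)

# Built-in converter: every field that looks like an integer is converted.
builtin  = CSV.parse(data, headers: true, converters: :integer)

p custom['x']
p builtin['x']
p builtin['f']
```

The lambda is the safer choice when some other column holds digit strings that should stay strings, such as IDs with leading zeros.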

···

On Feb 5, 2007, at 11:58 PM, ara.t.howard@noaa.gov wrote:

> harp:~ > cat a.rb
> require 'rubygems'
> require 'fastercsv'

> csv = <<-csv
> f,x
> a,0
> b,1
> c,2
> d,3
> csv

> fcsv = FasterCSV.new csv, :headers => true
> table = fcsv.read

> xs = table['x'].map{|x| x.to_i}
> sum = Float xs.inject{|sum,i| sum += i}

Should be sum + i.

> norm = xs.map{|x| x / sum}

> table['n'] = norm
> puts table

> harp:~ > ruby a.rb
> f,x,n
> a,0,0.0
> b,1,0.166666666666667
> c,2,0.333333333333333
> d,3,0.5
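To spell out the `sum + i` remark: `inject` feeds the block's return value back in as the next accumulator, so the assignment in `sum += i` does no extra work. A tiny sketch, with `acc` as an illustrative name:

```ruby
xs = [0, 1, 2, 3]

# The block's return value becomes the next accumulator, so plain + is enough.
sum  = Float(xs.inject { |acc, i| acc + i })
norm = xs.map { |x| x / sum }

p sum
p norm
```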

array = "\
f,x
a,0
b,1
c,2
d,3
".split.map{|s| s.split ","}

headers = array.shift << 'n'

array = array.transpose
sum = array[1].inject{|a,b| a.to_i + b.to_i}.to_f
array << array[1].map{|x| x.to_i / sum}
array = array.transpose
([headers] + array).each{|x| puts x.join(',')}

--- output -----
f,x,n
a,0,0.0
b,1,0.166666666666667
c,2,0.333333333333333
d,3,0.5

···

On Feb 5, 11:58 pm, ara.t.how...@noaa.gov wrote:

On Tue, 6 Feb 2007, Gary wrote:

Thanks, works great!!

If I want to parse some columns as floats, some as ints, and leave the
rest alone, is this the best way to do it?

  table = FCSV.parse(open(csv_filename),
    :headers => true,
    :converters => lambda { |f, info|
      case info.header
      when "NUM"
        f.to_i
      when "RATE"
        f.to_f
      else
        f
      end
    })

How do I write the finished file? My attempts appear in the csv file
without headers.

···

ThisForum IsSpammable wrote:

> How do I write the finished file? My attempts appear in the csv file
> without headers.

open("test_csv_file.csv", "w") << table

worked and included the headers. Is there a faster way for large csv
tables?

Thanks

···

> Thanks, works great!!

> If I want to parse some columns as floats, some as ints, and leave the
> rest alone, is this the best way to do it?

>   table = FCSV.parse(open(csv_filename),
>     :headers => true,
>     :converters => lambda { |f, info|
>       case info.header
>       when "NUM"
>         f.to_i
>       when "RATE"
>         f.to_f
>       else
>         f
>       end
>     })

Instead of:

   FCSV.parse(open(csv_filename) ... )

Use:

   FCSV.read(csv_filename ... )

Your converters work fine though, yes. If your numbers can be recognized by the built-in converters, you might even be able to get away with

   FCSV.read(csv_filename, :headers => true, :converters => :numeric)

> How do I write the finished file? My attempts appear in the csv file
> without headers.

I would use:

   File.open("path/to/file", "w") { |f| f.puts table }

Hope that helps.

James Edward Gray II
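A quick sketch of why the `f.puts table` form keeps the headers, assuming the standard `csv` library of Ruby 1.9+: the table's `to_s` renders the header row followed by the data rows. The temp file only stands in for the real output path.

```ruby
require 'csv'
require 'tempfile'

table = CSV.parse("f,x\na,0\nb,1\n", headers: true)

written = Tempfile.create('table') do |f|
  f.puts table        # relies on CSV::Table#to_s, which includes the header row
  f.flush
  File.read(f.path)
end

puts written
```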

···

On Feb 6, 2007, at 12:52 PM, Thisforum Isspammable wrote:

alias to FCSV.table

perhaps?

-a

···

On Wed, 7 Feb 2007, James Edward Gray II wrote:

Alias read() to table() or read() with those options?

James Edward Gray II

···

On Feb 6, 2007, at 1:57 PM, ara.t.howard@noaa.gov wrote:

On Wed, 7 Feb 2007, James Edward Gray II wrote:

FCSV.read(csv_filename, :headers => true, :converters => :numeric)

alias to FCSV.table

perhaps?

the latter. i hate typing! ;)

actually, read with those __default__ options. so

   def FCSV.table opts = {}

     ....

     headers = opts[:headers] || opts['headers'] || true
     converters = opts[:converters] || opts['converters'] || :numeric

     ....

   end

thoughts??

-a

···

On Wed, 7 Feb 2007, James Edward Gray II wrote:

Yes, FasterCSV doesn't support goofy Rails-like Hashes. :)

Beyond that though, I released FasterCSV 1.2.0 today with the addition of:

   def self.table(path, options = Hash.new)
     read( path, { :headers => true,
                   :converters => :numeric,
                   :header_converters => :symbol }.merge(options) )
   end

Enjoy.

James Edward Gray II
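For later readers, a minimal usage sketch of `table`, assuming the standard `csv` library of Ruby 1.9+, which carries the same method as `CSV.table`. The temp file and its contents are invented for the example.

```ruby
require 'csv'
require 'tempfile'

table = Tempfile.create(['data', '.csv']) do |f|
  f.write("f,x\na,0\nb,1\n")
  f.flush
  # Defaults: headers on, numeric conversion, symbolized header names.
  CSV.table(f.path)
end

p table.headers
p table[:x]
```

Because the defaults are merged with the caller's options, passing something like `:converters => nil` should still switch the conversion off.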

···

On Feb 6, 2007, at 2:42 PM, ara.t.howard@noaa.gov wrote:
