CSV parsing, counting rows and row items

I have some CSV data which looks like the following (it is the output
from the RMTrack defect management tool):

Column titles:
Issue # Date & Time Opened Summary Created by User Assigned To
Resolution Date & Time Closed

example data:
1074 16/05/2006 Something is broken import bob Ignore 26/03/2007
1807 17/07/2006 Another thing doesn't work rsmith hmaguire Ignore
27/03/2007

Basically, I'm finding the CSV documentation
http://www.ruby-doc.org/stdlib/libdoc/csv/rdoc/index.html
of very little help...

What is the best way of going about parsing the data to get the row
count?

From that I feel I could work out the unique item counts I'm looking
for.

I was thinking about something like this...

  rowcount = 0
    CSV::Reader.parse(filehandle) do |row|
      rowcount =+ 1
      return rowcount
    end

but I'm getting tangled up here.

···

--
Posted via http://www.ruby-forum.com/.

If all you want to do is count rows in a CSV file, you're just counting
lines in a file and don't need the CSV library. That's as easy as:

$ cat test.txt
1
2
3
4
5
6
$ irb
irb(main):001:0> File.open('test.txt', 'r') do |file|
irb(main):002:1* lines = 0
irb(main):003:1> file.each_line do |line|
irb(main):004:2* lines += 1
irb(main):005:2> end
irb(main):006:1> puts "Total lines: #{lines}"
irb(main):007:1> end
Total lines: 6
=> nil

If you do want to do further operations on each row that do require the row
to be parsed into its fields, you can use CSV like so:

irb(main):008:0> require 'csv'
=> true
irb(main):009:0> lines = 0
=> 0
irb(main):010:0> CSV.open('test.txt', 'r') do |row|
irb(main):011:1* lines += 1
irb(main):012:1> # do something else that requires the CSV library
irb(main):013:1* end
=> nil
irb(main):014:0> puts "Total lines: #{lines}"
Total lines: 6
=> nil

Hope that helps,

Felix
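A note for anyone trying the above on a current Ruby: since 1.9, CSV.open with a block yields the CSV object itself rather than each row, so the loop above would run only once. A minimal sketch of the same row count using CSV.foreach, where count_csv_rows is a hypothetical helper name and the temp file only stands in for test.txt:

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: count the rows the CSV parser sees in a file.
def count_csv_rows(path)
  lines = 0
  # CSV.foreach parses the file and yields one row (an Array) at a time.
  CSV.foreach(path) { |_row| lines += 1 }
  lines
end

# Stand-in for test.txt: six single-field rows.
Tempfile.create('test') do |file|
  file.write("1\n2\n3\n4\n5\n6\n")
  file.flush
  puts "Total lines: #{count_csv_rows(file.path)}"
end
```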

···

CSV stands for comma-separated values. Where are the commas?

To get the number of lines in a file:

IO.readlines('my_file').size

···

You have the return statement in the wrong place. If you just want to
count lines in a file, you can just do "wc -l <file>". If you want to
do it in Ruby, you can do "ruby -ne 'END{puts $.}' <file>". If you want
to do it inside a script, an efficient variant is this:

  count = File.open(f) { |io| c = 0; io.each { c += 1 }; c }

Kind regards

  robert
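For reference, here is a sketch of how the loop from the question might look with the return moved out of the block. It assumes a current Ruby, where CSV::Reader no longer exists and CSV.parse takes its place; count_rows and the sample string are made up for illustration. Note also that `=+ 1` assigns +1 on every pass, while `+= 1` increments.

```ruby
require 'csv'

# Hypothetical helper: count the rows in a string of CSV text.
def count_rows(csv_text)
  rowcount = 0
  CSV.parse(csv_text) do |_row|
    rowcount += 1   # increment; `rowcount =+ 1` would just set it to 1
  end
  rowcount          # returned only after every row has been seen
end

sample = "a,b,c\n1,2,3\n4,5,6\n"
puts count_rows(sample)
```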

···

On 12.09.2007 16:05, Max Russell wrote:

I was thinking about something like this...

  rowcount = 0
    CSV::Reader.parse(filehandle) do |row|
      rowcount =+ 1
      return rowcount
    end

but I'm getting tangled up here.

CSV fields can contain line ending characters which could throw off your counts. Use a CSV parser unless you are sure about the data content.

James Edward Gray II
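To illustrate the point, a small sketch with made-up data in which one quoted field contains a line break. Counting lines and counting parsed rows then give different answers:

```ruby
require 'csv'

# Made-up data: the summary of issue 1 spans two physical lines.
data = "id,summary\n1,\"first line\nsecond line\"\n2,plain\n"

line_count = data.lines.size       # physical lines in the text
row_count  = CSV.parse(data).size  # rows as the CSV parser sees them

puts "lines: #{line_count}, rows: #{row_count}"
```

Here the text has four physical lines but only three CSV rows (the header plus two records).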

···

On Sep 12, 2007, at 9:57 AM, Felix Windt wrote:

If all you want to do is count rows in a CSV file, you're just counting
lines in a file and don't need the CSV library.

James Gray wrote:

If all you want to do is count rows in a CSV file, you're just
counting
lines in a file and don't need the CSV library.

CSV fields can contain line ending characters which could throw off
your counts. Use a CSV parser unless you are sure about the data
content.

James Edward Gray II

def parsecv(path)
  # CSV.read takes a file path and returns an array of row arrays
  csvarrays = CSV.read(path)
  # subtract one so the titles row is not counted
  csvarrays.length - 1
end

I ended up using the CSV.read method because it returns an array of
arrays. This makes it quite easy to lop off the titles row and return
a row count.

Now to start getting the count of individual terms from a column...
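One possible way to approach that next step, as a rough sketch only: it assumes the export is tab-separated (the sample shows no commas) and that the first row holds the column titles. The helper name count_values and the sample rows are invented for illustration and are not real RMTrack output.

```ruby
require 'csv'

# Made-up sample in the shape of the export, assuming tab-separated fields.
sample = <<~DATA
  Issue #\tCreated by\tResolution
  1074\timport\tIgnore
  1807\trsmith\tIgnore
  1901\trsmith\tFixed
DATA

# Hypothetical helper: how often does each value occur in one column?
def count_values(csv_text, column, col_sep: "\t")
  counts = Hash.new(0)  # missing keys start at zero
  # headers: true makes each row addressable by column title
  CSV.parse(csv_text, headers: true, col_sep: col_sep) do |row|
    counts[row[column]] += 1
  end
  counts
end

p count_values(sample, "Resolution")
p count_values(sample, "Created by")
```

CSV.foreach accepts the same headers: and col_sep: options, so the same loop should work when reading straight from a file path.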
