Parsing CSV file with ruby

Drew_Olson · 30 August 2006 15:57

I'm currently trying to do something that seems rather simple, but I'm
slightly new to Ruby. I want to read in a CSV file, find rows that are
distinct with respect to one of the elements in the row (for example,
all rows in which the first element is "A"), and then do something with
these rows (in this case, parse them, build some XML, and write it to a
file). I'm not familiar enough with iterators in Ruby, but I seem to
remember there being functionality that will allow me to get distinct
rows based on some element in the row. Let me know if this is possible
and how I should approach it.

Thanks,
Drew

--
Posted via http://www.ruby-forum.com/.

Drew_Olson · 30 August 2006 16:01

Let me be more specific: essentially I want to find the groups of rows
that share an element. Let's say each row in my CVS doc has 3 elements.
I want to iterate across every group of rows that share the same value
for the first element. Hope this makes sense.


Paul_Lutus · 30 August 2006 16:55

Drew Olson wrote:

Let me be more specific: essentially I want to find the groups of rows
that share an element. Let's say each row in my CVS doc has 3 elements.
I want to iterate across every group of rows that share the same value
for the first element. Hope this makes sense.

#!/usr/bin/ruby -w

row_hash = {}

File.open("data.txt").each { |record|
   fields = record.split(",")
   # start an empty list the first time a key is seen
   row_hash[fields.first] = [] unless row_hash[fields.first]
   row_hash[fields.first] << record
}

row_hash.keys.sort.each { |key|
   puts "Group: #{key}"
   row_hash[key].each { |record|
      puts "\t#{record}"
   }
}

data.txt:

a,this,is,one,record
a,this,is,another,record
b,this,is,one,record
b,this,is,another,record
c,this,is,one,record
c,this,is,another,record

output:

Group: a
        a,this,is,one,record
        a,this,is,another,record
Group: b
        b,this,is,one,record
        b,this,is,another,record
Group: c
        c,this,is,one,record
        c,this,is,another,record
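
One caveat with splitting on "," is that it breaks a quoted field that itself contains a comma. The sketch below is a variation on the script above, not Paul's code. It assumes Ruby's standard csv library is available, keeps sample data in a string (data_text, a made-up name) instead of reading data.txt, and stores parsed field arrays rather than raw lines.

```ruby
require "csv"

# Sample data inlined so the sketch runs without a data.txt file.
# The second record has a quoted field with an embedded comma.
data_text = <<~CSV
  a,this,is,one,record
  a,"this, too",is,another,record
  b,this,is,one,record
CSV

# The default block creates an empty array the first time a key is seen.
rows_by_key = Hash.new { |hash, key| hash[key] = [] }

# CSV.parse yields each record as an array of fields, honouring quotes.
CSV.parse(data_text) do |fields|
  rows_by_key[fields.first] << fields
end

rows_by_key.keys.sort.each do |key|
  puts "Group: #{key}"
  rows_by_key[key].each { |fields| puts "\t#{fields.inspect}" }
end
```

To read from a file instead, CSV.foreach("data.txt") yields the same kind of field arrays one record at a time.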

--
Paul Lutus
http://www.arachnoid.com

James_Edward_Gray_II · 30 August 2006 17:06

Let me be more specific: essentially I want to find the groups of rows
that share an element. Let's say each row in my CVS doc has 3 elements.
I want to iterate across every group of rows that share the same value
for the first element. Hope this makes sense.

I'm assuming you meant CSV (not CVS).

See if this gets you going:

Firefly:~/Desktop$ cat data.csv
one,1,A
one,2,B
one,3,C
two,1,A
two,2,B
three,1,A
Firefly:~/Desktop$ irb -r csv
>> rows = CSV.read("data.csv")
=> [["one", "1", "A"], ["one", "2", "B"], ["one", "3", "C"], ["two", "1", "A"], ["two", "2", "B"], ["three", "1", "A"]]
>> groups = rows.map { |row| row.first }.uniq
=> ["one", "two", "three"]
>> groups.each do |group|
?> puts group
>> rows.select { |row| row.first == group }.each { |row| puts "   #{row.inspect}" }
>> end
one
   ["one", "1", "A"]
   ["one", "2", "B"]
   ["one", "3", "C"]
two
   ["two", "1", "A"]
   ["two", "2", "B"]
three
   ["three", "1", "A"]
=> ["one", "two", "three"]
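
The session above rescans all rows once per group. A single-pass alternative, sketched here under the assumption of a Ruby recent enough to have Enumerable#group_by (1.8.7 or later; the hash-based gsub and ordered hashes need 1.9 or later), also shows one way to do the XML step from the original question. The element and attribute names (groups, group, name, row, field), the escape lambda, and the groups.xml file name are all invented for illustration.

```ruby
require "csv"

# Same rows as data.csv above, inlined so the sketch is self-contained.
csv_text = <<~CSV
  one,1,A
  one,2,B
  one,3,C
  two,1,A
  two,2,B
  three,1,A
CSV

# Hypothetical helper: escape characters that are special in XML text
# and in double-quoted attribute values.
xml_escapes = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }
escape = ->(text) { text.to_s.gsub(/[&<>"]/, xml_escapes) }

rows = CSV.parse(csv_text)

# One pass: a hash mapping each first field to the rows that share it.
groups = rows.group_by { |row| row.first }

# Build the document line by line; the layout is an assumption.
lines = ["<groups>"]
groups.each do |name, members|
  lines << %(  <group name="#{escape.call(name)}">)
  members.each do |row|
    # Skip the first field, since it is already the group name.
    fields = row.drop(1).map { |value| "<field>#{escape.call(value)}</field>" }.join
    lines << "    <row>#{fields}</row>"
  end
  lines << "  </group>"
end
lines << "</groups>"
xml = lines.join("\n") + "\n"

puts xml
# To write a file instead of printing: File.write("groups.xml", xml)
```

For anything beyond flat output like this, an XML library such as REXML would be a safer choice than string building.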

James Edward Gray II

On Aug 30, 2006, at 11:01 AM, Drew Olson wrote:

Drew_Olson · 30 August 2006 18:47

Thank you both for the responses. Both seem to be EXTREMELY helpful.
I'll be sure to post any issues I have to the forum in the future.

Thanks,
Drew

