Use multi-dim. arrays? [Questions from a Ruby Newbie (file io and data structures)...]

I’ve got the file IO hammered out. I think I can read each line into a string
and then parse into an array using:
arr = IO.readlines("printers.txt")
columns_ar = Array.new()
columns_ar = arr[0].split("\t")
p columns_ar
p columns_ar.length

I know I can read each line and split it after I read it. But I’m not sure how
to populate my arrays? Should I use one array for each column? If so, how do I
create the correct number of arrays (and name them appropriately) at runtime?

Should I use 1 multidimensional array? Or a hash of some kind? The order of
these things is important.
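
One possible shape for this, sketched under the assumption that the first line of the file holds the column names: a single Hash keyed by column name, with one Array per column created at runtime. Since Ruby 1.9 a Hash keeps its insertion order, so the column order survives. The sample data and the names `text`, `headers` and `columns` are illustrative only.

```ruby
# Illustrative tab-delimited input; in practice this would come from
# File.read or IO.readlines on the real file.
text = "type\tname\nhp\tbugs\nxerox\tfg-color\n"

lines   = text.lines.map(&:chomp)
headers = lines.shift.split("\t")

# One array per column, created at runtime and keyed by header name.
columns = {}
headers.each { |h| columns[h] = [] }

lines.each do |line|
  line.split("\t").each_with_index do |cell, i|
    # Cells beyond the last header have no column to go into and are skipped.
    columns[headers[i]] << cell if headers[i]
  end
end

p columns
```

`columns["name"]` then gives every entry of the second column, and `columns.keys` gives the headers in file order.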

Thanks very much for everyone’s help. This list is great. I want to try to
introduce Ruby at all my client sites.

Christopher

Christopher J. Meisenzahl CPS, CSTE
Senior Software Testing Consultant
Spherion
christopher.j.meisenzahl@citicorp.com
(585)-248-7749

···

-----Original Message-----
From: Meisenzahl, Christopher J.
Sent: Tuesday, January 28, 2003 7:49 AM
To: 'ruby-talk@ruby-lang.org'
Subject: Questions from a Ruby Newbie (file io and data structures)…

I’ve decided on a small project to attempt to learn Ruby beyond just flipping
through the book. By the way, I can’t say enough good things about the pickaxe
book. I picked it up in spite of having the electronic version.

I would like to read in a tab (or comma) delimited text file one line or one
item at a time. The input file might look like this:

col1,col2,col3
a,b,c
d,e,f
g,h
j

Note that there is not the same amount of data in each column. It could vary.

What is the best way to read this in from a file? When I’m done I would like to
have all the items in col1 in an array, all the items in col2 in an array, and
so on.

I’m sure this is a rudimentary task, but I would like to see the most elegant
ways in which Ruby permits something like this to be done.
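
One way this could look, as a rough sketch rather than the definitive answer: read the rows, pad the short ones so that every row has the same width, then let `Array#transpose` turn rows into columns. The padding is needed because `transpose` raises an IndexError on rows of unequal length. The data is inlined here so the sketch runs without a file, and the variable names are made up for the example.

```ruby
# Sample data from the post, inlined so the sketch runs without a file.
text = "col1,col2,col3\na,b,c\nd,e,f\ng,h\nj\n"

rows  = text.lines.map { |line| line.chomp.split(",") }
width = rows.map(&:size).max

# Pad short rows with nil so every row has the same width, then
# transpose rows into columns and drop the nil padding again.
padded  = rows.map { |r| r + [nil] * (width - r.size) }
columns = padded.transpose.map(&:compact)

columns.each { |col| p col }
```

Each element of `columns` is one column, header first, and the columns keep their different lengths.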

Any other thoughts very much appreciated!

Thanks very much in advance!
Christopher


one of the things that i LOVE, LOVE, LOVE about ruby is that it’s so easy to
write an object to encapsulate common scripting tasks. there were some slides
floating around somewhere in which matz said something to the effect that
‘what’s good for programming is good for scripting’, referring to
object-oriented programming. for example, i have a similar problem to yours,
where we have all these silly config files floating around, each with a
slightly different syntax - you could parse each one out in every script, or
you could write a class which does this for you, for example (overly simplistic)

say you have two config files

FILE : ‘in.csv’
----CUT----

# a comment

col1,col2,col3
a,b,c
d,e,f
g,h # another comment

j

----CUT----

and

FILE : ‘printers.txt’
----CUT----

# type name

hp bugs
xerox fg-color
----CUT----

‘in.csv’ is a comma-separated values file, while ‘printers.txt’ is a
tab-delimited file. both could contain comments and empty lines. here’s a
silly parser which handles both types

#!/usr/bin/env ruby

class Cfg < Array

  DEFAULT_DELIM   = ','
  DEFAULT_COMMENT = '#'

  attr_reader   :max_cols
  attr_accessor :comment
  attr_accessor :delim

  def initialize path, delim = DEFAULT_DELIM, comment = DEFAULT_COMMENT
    @max_cols   = 0
    @delim      = delim
    @comment    = comment
    comment_pat = %r{#{comment}.*$}
    lines       = IO.readlines path

    lines.each do |line|
      line.strip!
      next if line.size == 0
      # sub! does nothing when the line has no comment, whereas
      # line[comment_pat] = '' raises IndexError in that case
      line.sub! comment_pat, ''
      row = line.split delim
      next unless row.size > 0
      @max_cols = [row.size, @max_cols].max
      self << row
    end
  end

  # rows joined by the delimiter, one row per line
  def to_s
    map { |r| r.join(delim) }.join($/) << $/
  end

  # fill short rows with value up to the widest row
  def pad value = nil
    map! { |r| r.fill(value, r.size...max_cols) }
  end
end

if $0 == __FILE__
  csv = Cfg.new 'in.csv'
  print csv
  csv.pad '!'
  print csv

  printers = Cfg.new 'printers.txt', "\t"
  print printers
end

the program outputs this :

/usr/home/howardat/eg/ruby > ./Csv.rb

col1,col2,col3
a,b,c
d,e,f
g,h
j
col1,col2,col3
a,b,c
d,e,f
g,h ,!
j,!,!
hp bugs
xerox fg-color

my point is simply that, although you can do things really tersely and slickly
in ruby (as you can in perl) i think its ability to use OO programming for
everyday tasks is where it really shines. if you wrote a class similar to
the one above, and made it recognize which type of config file it was reading
based on file extension (for example), and saved this class into your
site_ruby/ directory, all your scripts which require cfg information start
looking really short and sweet, something like :

require 'cfg.rb'

cfgpath = ARGV.shift

cfg = Cfg.new cfgpath

cfg.each do |row|
  # … do something
end

which is exactly what i’ve done at our site.
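
the 'recognize the type by file extension' idea might be sketched roughly like this. the helper name `delim_for` and the extension-to-delimiter mapping are assumptions made up for illustration, they are not part of the class above:

```ruby
# Hypothetical helper: choose a field delimiter from the file extension.
# The mapping below is an assumption for illustration only.
def delim_for(path)
  case File.extname(path).downcase
  when ".csv"         then ","
  when ".txt", ".tsv" then "\t"
  else ","            # fall back to the comma default
  end
end

p delim_for("in.csv")
p delim_for("printers.txt")
```

the result could then be handed to the constructor as its second argument, e.g. Cfg.new(path, delim_for(path)).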

just thought i’d point all that out since, IMHO, it’s the OO aspect of ruby
which is the most useful, and yet also the aspect which requires the biggest
paradigm shift.

-a

···

On Wed, 29 Jan 2003 christopher.j.meisenzahl@citicorp.com wrote:

I’ve got the file IO hammered out. I think I can read each line into a string
and then parse into an array using:

====================================

Ara Howard
NOAA Forecast Systems Laboratory
Information and Technology Services
Data Systems Group
R/FST 325 Broadway
Boulder, CO 80305-3328
Email: ahoward@fsl.noaa.gov
Phone: 303-497-7238
Fax: 303-497-7259
====================================

Haven’t you read my previous mail? I answered all your questions with 2 lines of
ruby code.

I had forgotten chomp, so here it is:

# path: name of the input file
a = []
IO.foreach(path) { |l| l.chomp.split(",").each_with_index { |s, i| (a[i] ||= []) << s } }

a[i] is an array containing the strings in the i-th column.
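
A small check of how this behaves on ragged rows, as a sketch: the StringIO below is only a stand-in for the real file, loaded with the sample data from the original question. A short row simply contributes nothing to the later columns, so the column arrays end up with different lengths.

```ruby
require "stringio"

# Stand-in for the input file, using the sample data from the thread.
input = StringIO.new("col1,col2,col3\na,b,c\nd,e,f\ng,h\nj\n")

a = []
input.each_line do |l|
  # (a[i] ||= []) creates column i the first time a row reaches it.
  l.chomp.split(",").each_with_index { |s, i| (a[i] ||= []) << s }
end

p a.map(&:size)   # one length per column
p a[2]
```

With this data the three columns hold 5, 4 and 3 entries, header included.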

···

christopher.j.meisenzahl@citicorp.com wrote:

I know I can read each line and split it after I read it. But I’m not sure how
to populate my arrays? Should I use one array for each column? If so, how do I
create the correct number of arrays (and name them appropriately) at runtime?

Thank you for posting that script. I learned a fair bit from it.
I’ll keep it for my own use.

Ruby rocks.

···

On Wed, Jan 29, 2003 at 01:41:55AM +0900, ahoward wrote:


Daniel Carrera
Graduate Teaching Assistant. Math Dept.
University of Maryland. (301) 405-5137