Feature request: Array.to_h

I use the following snippet a lot, and think it’s worth including in
the Ruby built-in Array class.

class Array
  def to_h(default = nil)
    result = Hash.new(default)
    each_with_index do |value, index|
      result[index] = value
    end
    result
  end
end

Example usage:

[:a, :b, :c].to_h # => {0=>:a, 1=>:b, 2=>:c}

I often use it when processing comma-separated values (CSV) files with
a header row, for example:

header_row = "year,month,day,price\n"
index_of = header_row.chomp.split(',').to_h.invert

# => {"month"=>1, "price"=>3, "day"=>2, "year"=>0}

data_row = "2002,1,17,1.5\n"
cells = data_row.chomp.split(',')
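
The post is cut off at this point. A minimal sketch of how the lookup might continue, assuming the goal is to fetch cells by column name; `index_hash` is a hypothetical stand-in for the proposed `Array#to_h`, written as a plain method so that it does not override any built-in:

```ruby
# Hypothetical helper: builds {index => value}, like the proposed Array#to_h.
def index_hash(array, default = nil)
  result = Hash.new(default)
  array.each_with_index { |value, index| result[index] = value }
  result
end

header_row = "year,month,day,price\n"
# Inverting {index => name} gives {name => index}.
index_of = index_hash(header_row.chomp.split(',')).invert

data_row = "2002,1,17,1.5\n"
cells = data_row.chomp.split(',')

# Assumed continuation: look cells up by column name.
price = cells[index_of["price"]]   # => "1.5"
year  = cells[index_of["year"]]    # => "2002"
```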

Hi –

I use the following snippet a lot, and think it’s worth including in
the Ruby built-in Array class.

class Array
  def to_h(default = nil)
    result = Hash.new(default)
    each_with_index do |value, index|
      result[index] = value
    end
    result
  end
end

Example usage:

[:a, :b, :c].to_h # => {0=>:a, 1=>:b, 2=>:c}

My inclination would be for a basic #to_h to operate more like:

%w{a b c d} => {"a"=>"b", "c"=>"d"}

and for the method that added in the array indices to be called
something else. Also, I think you’re doing it backwards :-) For one
thing, if you’ve got duplicate values in your array, your hash will
end up truncated. Also, I think an indexed #to_h sort of grows out of
the idea that an array is, or can be viewed as, a hash whose keys
happen to be positive integers. That means that you’d want the
indices to be the keys of the new hash, not the values.
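
The truncation concern applies whenever the values end up as keys, for example after calling invert on an index-keyed hash. A small sketch with made-up data:

```ruby
values = [:a, :b, :a]            # note the duplicate :a

by_index = {}
values.each_with_index { |value, index| by_index[index] = value }
# by_index is {0=>:a, 1=>:b, 2=>:a}: one entry per element

by_value = by_index.invert
# by_value is {:a=>2, :b=>1}: the two :a entries collapsed into one,
# and the later index (2) overwrote the earlier one (0)
```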

There’s a bunch of versions of this floating around… See the
thread starting at http://www.ruby-talk.org/6582, also an RCR at
http://www.rubygarden.com/article.php?sid=61 (which was posted by
me, though my name isn’t on it; I take full responsibility for the
idea of the name “hashify”, though I can’t really remember why I
proposed that name).

David

On Fri, 17 Jan 2003, Tom Payne wrote:


David Alan Black
home: dblack@candle.superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav

Hello dblack,

Friday, January 17, 2003, 3:26:01 PM, you wrote:

Example usage:

[:a, :b, :c].to_h # => {0=>:a, 1=>:b, 2=>:c}

My inclination would be for a basic #to_h to operate more like:

%w{a b c d} => {"a"=>"b", "c"=>"d"}

p Hash[*%w{a b c d}] # => {"a"=>"b", "c"=>"d"}


Best regards,
Bulat mailto:bulatz@integ.ru

dblack@candle.superlink.net wrote in message news:Pine.LNX.4.44.0301170722590.17053-100000@candle.superlink.net

Also, I think you’re doing it backwards :-) For one
thing, if you’ve got duplicate values in your array, your hash will
end up truncated. Also, I think an indexed #to_h sort of grows out of
the idea that an array is, or can be viewed as, a hash whose keys
happen to be positive integers. That means that you’d want the
indices to be the keys of the new hash, not the values.

The code I posted does use the indices as the keys:

irb(main):014:0> h = [:a, :b, :c].to_h
=> {0=>:a, 1=>:b, 2=>:c}
irb(main):015:0> h.keys
=> [0, 1, 2]

Glad we’re otherwise in agreement :-)

Tom

Hi –

dblack@candle.superlink.net wrote in message news:Pine.LNX.4.44.0301170722590.17053-100000@candle.superlink.net

Also, I think you’re doing it backwards :-) For one
thing, if you’ve got duplicate values in your array, your hash will
end up truncated. Also, I think an indexed #to_h sort of grows out of
the idea that an array is, or can be viewed as, a hash whose keys
happen to be positive integers. That means that you’d want the
indices to be the keys of the new hash, not the values.

The code I posted does use the indices as the keys:

irb(main):014:0> h = [:a, :b, :c].to_h
=> {0=>:a, 1=>:b, 2=>:c}
irb(main):015:0> h.keys
=> [0, 1, 2]

Whoops, sorry, I inverted your inversion.

Glad we’re otherwise in agreement :-)

Almost :-) See http://www.ruby-talk.org/6663. I seem to evade
naming any of the variants just “to_h” (they’re to_h_raw, to_h_ikeys,
etc.). I’m inclined to think “to_h” would be what I called “to_h_raw”,
i.e.: [1,2,3,4].to_h => { 1 => 2, 3 => 4 }, but I’m not sure.

David

On Sun, 19 Jan 2003, Tom Payne wrote:



dblack@candle.superlink.net wrote in message news:Pine.LNX.4.44.0301181254230.23130-100000@candle.superlink.net

Almost :-) See http://www.ruby-talk.org/6663. I seem to evade
naming any of the variants just “to_h” (they’re to_h_raw, to_h_ikeys,
etc.). I’m inclined to think “to_h” would be what I called “to_h_raw”,
i.e.: [1,2,3,4].to_h => { 1 => 2, 3 => 4 }, but I’m not sure.

I think of Arrays as mapping integers >= 0 to values, and of Hashes as
mapping arbitrary keys to values. So, for me, the POLS for Array.to_h
would be to produce a Hash that maps the same keys as the array to the
same values, i.e. [:a, :b].to_h => {0 => :a, 1 => :b}.
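
A short sketch of that view, using a plain loop rather than the proposed method. It also shows one place where the analogy stops holding: negative indices, which an Array resolves from the end and a Hash treats as ordinary (missing) keys.

```ruby
arr = [:a, :b, :c]

as_hash = {}
arr.each_with_index { |value, index| as_hash[index] = value }

# For every valid non-negative index, the hash and the array agree.
same_mapping = arr.each_index.all? { |i| as_hash[i] == arr[i] }

# Out of range: both give nil.
arr_missing  = arr[5]
hash_missing = as_hash[5]

# Negative index: the array counts from the end, the hash has no such key.
arr_last  = arr[-1]      # :c
hash_last = as_hash[-1]  # nil
```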

Your to_h ([1,2,3,4].to_h => { 1 => 2, 3 => 4 }) might be more
familiar to a PERL programmer. IIRC, in PERL hashes can be written
[key0, value0, key1, value1, …], but I might be wrong.

N.B. Your implementation of to_h_raw from
http://www.ruby-talk.org/6663:

module Enumerable
  # The simplest to_h: array elements are taken two at a time.
  # (Not even indices -- just elements.)
  def to_h_raw
    a = dup
    h = {}
    h[a.shift] = a.shift || nil until a.empty?
    h
  end
end

should really be in the Array class, not the Enumerable module,
because it uses the shift method: Arrays have this, but Enumerables
don’t.
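
A quick check of that point, using a Range as an example of an Enumerable that is not an Array:

```ruby
range = (1..4)

range_is_enumerable = range.is_a?(Enumerable)          # true
range_has_each      = range.respond_to?(:each)         # true
range_has_shift     = range.respond_to?(:shift)        # false
array_has_shift     = [1, 2, 3, 4].respond_to?(:shift) # true
```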

An implementation using each (provided by Enumerable):

module Enumerable
  def to_h_raw(default = nil)
    result = Hash.new(default)
    key = nil
    index = 0
    each do |element|
      case index % 2
      when 0 then key = element
      when 1 then result[key] = element
      end
      index += 1
    end
    result[key] = nil if index % 2 == 1
    result
  end
end

[1, 2, 3, 4].to_h_raw    # => {1=>2, 3=>4}
[1, 2, 3, 4, 5].to_h_raw # => {5=>nil, 1=>2, 3=>4}
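
For comparison, a sketch of the same pairing written with each_slice, which current Ruby versions provide on Enumerable. It is a plain method with a hypothetical name (`pairs_to_hash`), not the code from the thread, and it relies only on each, so it also accepts a Range:

```ruby
# Hypothetical helper: takes elements two at a time as key, value.
def pairs_to_hash(enum, default = nil)
  result = Hash.new(default)
  # For a trailing slice of one element, value is nil.
  enum.each_slice(2) { |key, value| result[key] = value }
  result
end

even_case  = pairs_to_hash([1, 2, 3, 4])     # {1=>2, 3=>4}
odd_case   = pairs_to_hash([1, 2, 3, 4, 5])  # 5 is stored with a nil value
range_case = pairs_to_hash(1..4)             # works without shift
```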

Regards,

Tom

Hi –

dblack@candle.superlink.net wrote in message news:Pine.LNX.4.44.0301181254230.23130-100000@candle.superlink.net

Almost :-) See http://www.ruby-talk.org/6663. I seem to evade
naming any of the variants just “to_h” (they’re to_h_raw, to_h_ikeys,
etc.). I’m inclined to think “to_h” would be what I called “to_h_raw”,
i.e.: [1,2,3,4].to_h => { 1 => 2, 3 => 4 }, but I’m not sure.

I think of Arrays as mapping integers >= 0 to values, and of Hashes as
mapping arbitrary keys to values. So, for me, the POLS for Array.to_h
would be to produce a Hash that maps the same keys as the array to the
same values, i.e. [:a, :b].to_h => {0 => :a, 1 => :b}.

Your to_h ([1,2,3,4].to_h => { 1 => 2, 3 => 4 }) might be more
familiar to a PERL programmer. IIRC, in PERL hashes can be written
[key0, value0, key1, value1, …], but I might be wrong.

(s/PERL/Perl/ :-) Yes, there’s something like that in Perl, but I’m
thinking more about what’s familiar to a Ruby programmer:

Hash[1,2,3,4] # => { 1 => 2, 3 => 4 }

and taking that as the model for the simplest case of creating a hash
from a list (literal or array-derived). It’s possible to argue that,
since we already have the above, we don’t need Array#to_h to emulate
it. I suppose I like the sort of organic quality of a to_h that does
what Hash[*arr] does, and then variants that build on that simplest
construct.
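
A small sketch of what Hash[*arr] does with even and odd element counts; the odd case is the one any to_h built on it has to decide how to handle:

```ruby
arr = [1, 2, 3, 4]
even = Hash[*arr]            # {1=>2, 3=>4}

# With an odd number of elements, Hash[] raises instead of padding.
odd_error = nil
begin
  Hash[*[1, 2, 3]]
rescue ArgumentError => e
  odd_error = e
end
```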

I don’t think POLS is a big issue here. Since there are several
versions of this in circulation already, whoever uses them pretty much
has to be willing to read the code and/or documentation anyway :-)

N.B. Your implementation of to_h_raw from
http://www.ruby-talk.org/6663:
should really be in the Array class, not the Enumerable module,
because it uses the shift method: Arrays have this, but Enumerables
don’t.

Whoops. Glitches from 60000 messages ago come back to haunt
me… :-)

An implementation using each (provided by Enumerable):

module Enumerable
  def to_h_raw(default = nil)
    result = Hash.new(default)
    key = nil
    index = 0
    each do |element|
      case index % 2
      when 0 then key = element
      when 1 then result[key] = element
      end
      index += 1
    end
    result[key] = nil if index % 2 == 1
    result
  end
end

[1, 2, 3, 4].to_h_raw    # => {1=>2, 3=>4}
[1, 2, 3, 4, 5].to_h_raw # => {5=>nil, 1=>2, 3=>4}

You can trim this down a bit. Accessing an uninitialized array
element will give you nil, so you don’t have to test for that. Here’s
a version on the very-short-and-probably-inefficient end of the scale
– there’s probably a nice middle ground somewhere :-)

module Enumerable
  def to_h_raw(default=nil)
    a = to_a  # Enumerable provides to_a, but not necessarily size
    Hash[*(if (a.size % 2).zero? then a else a + [default] end)]
  end
end

p [1, 2, 3, 4].to_h_raw # => {1=>2, 3=>4}
p [1, 2, 3, 4, 5].to_h_raw # => {5=>nil, 1=>2, 3=>4}

h = [1,2,3].to_h_raw(10)
p h[3] # => 10
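
Note that the default argument means different things in the two to_h_raw versions in this thread: the each-based one passes it to Hash.new, while the short one uses it to pad an odd-length list. A sketch of both readings as plain methods (the names `raw_hash_default` and `raw_padded` are hypothetical, chosen only for this comparison):

```ruby
# Reading 1: default becomes the Hash's default; a dangling key gets nil.
def raw_hash_default(enum, default = nil)
  result = Hash.new(default)
  key = nil
  count = 0
  enum.each do |element|
    if count.even?
      key = element
    else
      result[key] = element
    end
    count += 1
  end
  result[key] = nil if count.odd?
  result
end

# Reading 2: default is the value paired with a dangling key.
def raw_padded(enum, default = nil)
  items = enum.to_a
  items += [default] if items.size.odd?
  Hash[*items]
end

a = raw_hash_default([1, 2, 3], 10)
b = raw_padded([1, 2, 3], 10)

a_dangling = a[3]    # nil: stored explicitly for the dangling key
a_missing  = a[99]   # 10: the Hash default
b_dangling = b[3]    # 10: the pad value
b_missing  = b[99]   # nil: no Hash default set
```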

David

On Mon, 20 Jan 2003, Tom Payne wrote:

