Hash keys

I ran into something I hadn't realized (common occurrence). Keying a
hash with a symbol is not the same as using a string. I guess it makes
sense, but I've seen it done both ways, and I had been always using
symbols for my keys. But then I ran into an issue on loading an external
YAML object into a hash, and didn't realize it was keyed with strings so
I got errors the first time around.
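
For anyone skimming the thread, the mismatch can be reproduced with a minimal sketch like the one below. The YAML text and the variable names are made up for illustration; the only library call used is YAML.load from the standard library.

```ruby
require "yaml"

# Hypothetical YAML document standing in for the external file.
yaml_text = "name: widget\ncount: 3\n"

config = YAML.load(yaml_text)

# Plain YAML mapping keys arrive as Strings, not Symbols.
key_classes = config.keys.map { |k| k.class }

by_string = config["name"]  # matches the key as it was loaded
by_symbol = config[:name]   # no such key, so the lookup quietly returns nil

p key_classes
p by_string
p by_symbol
```

The symbol lookup does not raise by itself; the errors tend to show up later, when the nil is used as if it were the real value.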

So two questions:
1) What is the preferred method of keying hashes? Symbols, strings,
other?
2) Is there a smooth way to handle hashes that may have been keyed in
either fashion?

Thanks

···

--
Posted via http://www.ruby-forum.com/.

So two questions:
1) What is the preferred method of keying hashes? Symbols, strings,
other?

   Symbol keys
         + :a is one less keystroke than "a"
         +? performance of Hash with symbol keys MIGHT be slightly
            faster, but probably insignificant.
         - More symbols get interned which can't get garbage
           collected, even after the hash is.

   String keys

        + keys can be GCed when removed from hash, or when the hash is GCed.

2) Is there a smooth way to handle hashes that may have been keyed in
either fashion?

Rails is probably most responsible for popularizing symbol keys.
ActiveSupport implements a HashWithIndifferentAccess which can use
either symbols or strings interchangeably in access methods.
Internally it uses string keys.
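
To make the idea concrete without pulling in Rails, here is a rough sketch of the same approach in plain Ruby. IndifferentHash is a hypothetical class written for this post, not ActiveSupport's implementation; it only assumes the behaviour described above, namely that symbol keys are turned into strings internally.

```ruby
# Hypothetical helper, NOT ActiveSupport's code: every Symbol key is
# converted to a String on the way in and on the way out.
class IndifferentHash
  def initialize(source = {})
    @data = {}
    source.each { |key, value| self[key] = value }
  end

  def [](key)
    @data[normalize(key)]
  end

  def []=(key, value)
    @data[normalize(key)] = value
  end

  def key?(key)
    @data.key?(normalize(key))
  end

  private

  # Symbols become strings; every other key is left untouched.
  def normalize(key)
    key.is_a?(Symbol) ? key.to_s : key
  end
end

settings = IndifferentHash.new("host" => "localhost", :port => 8080)
host = settings[:host]   # stored under a String key, read with a Symbol
port = settings["port"]  # stored under a Symbol key, read with a String
```

A real implementation has to cover many more methods (fetch, merge, delete and so on), which is a good reason to use the ActiveSupport class when it is available.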

···

On 2/11/08, J. Cooper <nefigah@gmail.com> wrote:

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

So two questions:
1) What is the preferred method of keying hashes? Symbols, strings,
other?

----------------- 8< -----------------
Symbols. Look at this code:

# Backport of Symbol#to_proc for interpreters that lack it
# (it is built in from Ruby 1.8.7 on).
Symbol.send( :define_method, :to_proc ){
  lambda{|x| x.send self }
} unless Symbol.method_defined?(:to_proc)

string_keys = %w{a b c}
symbol_keys = string_keys.map(&:to_sym)

string_hash = Hash[ *string_keys.zip([42]*3).to_a.flatten ]
symbol_hash = Hash[ *symbol_keys.zip([42]*3).to_a.flatten ]

p [:symbol, symbol_hash]
p [:string, string_hash]
puts "So far everything looks fine"
# The next line raises: the hash stores its String keys frozen.
string_hash.each_pair do |k,v| k << "..." end
------------------------ 8< ----------------------
Ruby does a fine job here by freezing the String keys it stores in a
hash, so the last line raises an error instead of silently corrupting
the hash. For that very reason I prefer to use immutable objects as keys
whenever possible, and in our case that favors Symbols.
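
The freezing can be seen more directly with a small sketch like this one. The variable names are only for illustration; the behaviour shown is that Hash#[]= stores a frozen copy of an unfrozen String key.

```ruby
key = String.new("color")  # an explicitly mutable string
hash = {}
hash[key] = 42

stored_key = hash.keys.first

stored_is_frozen = stored_key.frozen?      # the hash holds a frozen key
same_object      = stored_key.equal?(key)  # ...and it is a copy, not our object

key << "!"                                 # mutating the original is still allowed
still_found = hash["color"]                # the stored key was not affected

p stored_is_frozen
p same_object
p still_found
```

Because the hash keeps its own frozen copy, changing the string afterwards cannot leave an entry filed under a stale hash code.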

2) Is there a smooth way to handle hashes that may have been keyed in
either fashion?

Handle them? If I pretended to be even more stupid than I actually
believe myself to be, I would say: yes, sure,
a_hash.clear ;)

But I guess that you want to change from one to the other, let me show
you from String to Symbol
------------------------ 8< ----------------------
### Do not do this at home :)
# Ruby 1.9 and later already return an Enumerator from a blockless
# each_with_index, so the backport below is only for older interpreters.
Array.send :define_method, :each_with_index do
  count = -1
  inject([]){|iwi,e| iwi << [e, count += 1]}
end unless defined?(Enumerator)
string_hash = %w{A Brave New World}.each_with_index.inject({}){|h,(v,i)| h.update v => i }
p string_hash

symbol_hash = Hash[ *string_hash.to_a.map{|k,v|[k.to_sym,v]}.flatten ]
p symbol_hash
------------------------ 8< ----------------------
HTH
Robert

···

On Feb 11, 2008 8:32 PM, J. Cooper <nefigah@gmail.com> wrote:


--
http://ruby-smalltalk.blogspot.com/

---
Whereof one cannot speak, thereof one must be silent.
Ludwig Wittgenstein

So two questions:
1) What is the preferred method of keying hashes? Symbols, strings,
other?

It depends. My personal convention is this: use symbols if the set of keys is limited and probably known beforehand; use strings if the data is read from an external resource (e.g. a file) and the keys could take arbitrary values.

2) Is there a smooth way to handle hashes that may have been keyed in
either fashion?

I do not think there is a smooth, one-size-fits-all way. You could of course convert a Hash keyed one way into a Hash keyed the other way. I don't think it is worthwhile, though, and I haven't seen it done so far.
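
For completeness, such a conversion can be sketched as follows. This assumes Ruby 2.5 or later, where Hash#transform_keys is available; the sample data is invented for the example.

```ruby
string_keyed = { "name" => "widget", "count" => 3 }

# Hash#transform_keys returns a new hash and leaves the receiver untouched.
symbol_keyed = string_keyed.transform_keys(&:to_sym)
round_trip   = symbol_keyed.transform_keys(&:to_s)

p symbol_keyed
p round_trip
```

Note that only the top-level keys are converted; hashes nested inside the values keep whatever keys they had.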

Kind regards

  robert

···

On 11.02.2008 20:32, J. Cooper wrote:

Hmm, very interesting, but is this not rather an implementation choice?
That does not make the information any less valuable, of course; I am
just curious.

Cheers
Robert

···

On Feb 11, 2008 9:51 PM, Rick DeNatale <rick.denatale@gmail.com> wrote:

On 2/11/08, J. Cooper <nefigah@gmail.com> wrote:
> So two questions:
> 1) What is the preferred method of keying hashes? Symbols, strings,
> other?

   Symbol keys
         + :a is one less keystroke than "a"
         +? performance of Hash with symbol keys MIGHT be slightly
            faster, but probably insignificant.
         - More symbols get interned which can't get garbage
           collected, even after the hash is.

Alright, so in general if the hash is going to interact with the outside
world, I should use string keys, and it's not worth it particularly to
worry about handling mismatch (unless I'm embarking on a Rails-sized
framework)?

I guess I had figured symbols made more sense, as a key is kinda just an
identifier and there isn't a reason to perform string functions on it.
But I didn't realize the deal with the GC

···


Rick DeNatale wrote:

  

So two questions:
1) What is the preferred method of keying hashes? Symbols, strings,
other?
    
   Symbol keys
         + :a is one less keystroke than "a"
         +? performance of Hash with symbol keys MIGHT be slightly
            faster, but probably insignificant.
         - More symbols get interned which can't get garbage
           collected, even after the hash is.

   String keys

        + keys can be GCed when removed from hash, or when the hash is GCed.
  
Another thing to look at is symbols are unique, whereas each time you use a string literal to access the hash, you are creating a new object:

h1 = { :a => 1, :b => 2 }
h2 = { "a" => 1, "b" => 2 }

h2["a"]  # Creates a one-time-use string "a"
h1[:a]   # No new object created, :a already exists

But if you are creating lots and lots of symbols, then that's lots of unique objects being created which are not going to be garbage collected.
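
A small sketch of the identity difference, for illustration only. It uses String.new rather than bare literals so that the result does not depend on any frozen-string-literal setting.

```ruby
first  = String.new("a")
second = String.new("a")

equal_content    = first == second             # same characters
same_hash_code   = first.hash == second.hash   # so they find the same hash slot
distinct_objects = !first.equal?(second)       # but they are two separate objects

# A symbol with a given name is always the very same object.
same_symbol = :a.equal?("a".to_sym)

p equal_content
p same_hash_code
p distinct_objects
p same_symbol
```

This is why two different string objects with the same content still reach the same hash entry, while every reference to a given symbol is the same object.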

Of course, strings and symbols are not the only choices for hash keys; any object can be used. What you want may depend on the circumstances.

-Justin

···

On 2/11/08, J. Cooper <nefigah@gmail.com> wrote:

http://api.rubyonrails.org/classes/HashWithIndifferentAccess.html

http://facets.rubyforge.org/rdoc/core/classes/Hash.html#M000070

(Neither is a refutation of your statements, just throwing a few data
points into a discussion that I'm too busy to formally join at the
moment.)

···

On Feb 11, 2:27 pm, Robert Klemme <shortcut...@googlemail.com> wrote:

I do not think there is a smooth one size fits all way. You could of
course convert a Hash containing one set of keys to the other one. I
don't think it is worthwhile though and haven't seen it so far.

Alright, so in general if the hash is going to interact with the outside world, I should use string keys, and it's not worth it particularly to worry about handling mismatch (unless I'm embarking on a Rails-sized framework)?

I am not sure what you mean by "handling mismatch". If by mismatch you mean access with both symbols and strings: I usually do not worry about this, because I write the code that puts data into the Hash and reads it, so I know (or can control) what happens. Personally I prefer uniform access.

If by "outside world" you mean a data source that you do not control (e.g. web server logfiles, CSV data), then yes, in those cases I would use Strings, namely the strings I read from that source.

I guess I had figured symbols made more sense, as a key is kinda just an identifier and there isn't a reason to perform string functions on it. But I didn't realize the deal with the GC

Well, if there is a limited set (e.g. states of an object like :open and :closed for an IO stream) then it makes perfect sense to use symbols.

Kind regards

  robert

···

On 11.02.2008 22:58, J. Cooper wrote: