Symbol vs string for hash keys

What are the important factors to consider when deciding whether to
use symbols or strings for hash keys?

I ask b/c I noticed that fileutils.rb uses strings[1], though the keys
represent method names. Seems to me that symbols would be more
appropriate.

Even so it would be good to know the general criteria to consider.

[1] https://github.com/ruby/ruby/blob/trunk/lib/fileutils.rb
(OPT_TABLE)

> Even so it would be good to know the general criteria to consider.

Conceptually, I like to use these rules of thumb (I think Jim Weirich
noted it originally, but I am not sure):

1.) If the content and exact sequence of characters is the important
part, use a string.

2.) If the identity is the important part, use a symbol.

~ jf

···

--
John Feminella
Principal Consultant, BitsBuilder
LI: http://www.linkedin.com/in/johnxf
SO: User John Feminella - Stack Overflow

On Sun, Jul 3, 2011 at 11:28, Intransition <transfire@gmail.com> wrote:

> What are the important factors to consider when deciding whether to
> use symbols or strings for hash keys?
>
> I ask b/c I noticed that fileutils.rb uses strings[1], though the keys
> represent method names. Seems to me that symbols would be more
> appropriate.
>
> Even so it would be good to know the general criteria to consider.
>
> [1] https://github.com/ruby/ruby/blob/trunk/lib/fileutils.rb
> (OPT_TABLE)

Strings are mutable; symbols are not.

There is only one instance of any given literal symbol; there can be many
instances of a given literal string.
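
A minimal sketch of those two points, for anyone who wants to try it in irb. The variable names are only illustrative, and the `frozen?` result assumes Ruby 2.x or later:

```ruby
# Two strings with the same characters are still two separate, mutable objects.
a = "verbose".dup
b = "verbose".dup
a == b          # => true, same content
a.equal?(b)     # => false, different objects
a << "!"        # strings can be changed in place; b is unaffected

# A given symbol always refers to one and the same object.
x = :verbose
y = :verbose
x.equal?(y)     # => true
x.frozen?       # => true, symbols cannot be mutated
```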

Basically, if it is more important to be able to operate on the names of
your hash keys, strings are appropriate; if performance is more
important, or you are sure no such operation on the names will be
necessary, symbols are appropriate.

It also makes sense to consider the fact that if you're selecting hash
keys based on some kind of input, strings are easier -- because inputs
tend to be strings rather than symbols, and would thus need to be
translated into symbols if you use symbols as your hash keys.
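
To make the input point concrete, here is a small sketch. The two tables and the `input` value are made up for illustration:

```ruby
# The same hypothetical lookup table, once keyed by strings and once by symbols.
by_string = { "copy" => :cp, "move" => :mv }
by_symbol = { copy: :cp, move: :mv }

input = "copy"            # e.g. something read from ARGV or a file

by_string[input]          # => :cp, works directly
by_symbol[input]          # => nil, "copy" and :copy are different keys
by_symbol[input.to_sym]   # => :cp, but only after an explicit conversion
```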

That's the ugly, hand-wavy, implementation-aware answer, I guess. The
more conceptual answer is that symbols are identities and strings are
data.

I suspect strings are used in the example you provided because it is
expected that when dealing with OPT_TABLE one might operate on a string
in some manner before using it as a hash key. I'm really not sure,
though.

···

On Mon, Jul 04, 2011 at 12:28:03AM +0900, Intransition wrote:

> What are the important factors to consider when deciding whether to
> use symbols or strings for hash keys?
>
> I ask b/c I noticed that fileutils.rb uses strings[1], though the keys
> represent method names. Seems to me that symbols would be more
> appropriate.
>
> Even so it would be good to know the general criteria to consider.
>
> [1] https://github.com/ruby/ruby/blob/trunk/lib/fileutils.rb
> (OPT_TABLE)

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

> Basically, if it is more important to be able to operate on the names of
> your hash keys, strings are appropriate; if performance is more
> important, or you are sure no such operation on the names will be
> necessary, symbols are appropriate.

And, of course, you can convert to and from symbols/strings. Beware
the spaces, though:

"a space".to_sym

=> :"a space"

:"a space".to_s

=> "a space"

:symbolic_snake.to_s

=> "symbolic_snake"

···

On Sun, Jul 3, 2011 at 6:33 PM, Chad Perrin <code@apotheon.net> wrote:

--
Phillip Gawlowski

twitter.com/phgaw
phgaw.posterous.com

A method of solution is perfect if we can foresee from the start,
and even prove, that following that method we shall attain our aim.
-- Leibniz

Hi all,

First, thanks to Jeremy for the Ruby Koans link.

Now I'm puzzled, Sensei!

I'm trying to write the code for the cases on lines 12 & 13 (see below) and I don't understand why they should be expected to raise exceptions. They appear to be isosceles triangles and pass the isosceles test.

Am I missing something subtle? Or something obvious?

**Leigh

"Things are not as they appear; nor are they otherwise."

···

---------------

The Master says:
  You have not yet reached enlightenment.
  I sense frustration. Do not be afraid to ask for help.

The answers you seek...
  TriangleError expected but nothing was raised.

Please meditate on the following code:
  /Users/leigh/koans/about_triangle_project_2.rb:12:in `test_illegal_triangles_throw_exceptions'

  def test_illegal_triangles_throw_exceptions
    assert_raise(TriangleError) do triangle(0, 0, 0) end
    assert_raise(TriangleError) do triangle(3, 4, -5) end
    assert_raise(TriangleError) do triangle(1, 1, 3) end # line 12
    assert_raise(TriangleError) do triangle(2, 4, 2) end # line 13
  end

These tests all pass, including two from the illegal exceptions test:

  def test_isosceles_triangles_have_exactly_two_sides_equal
    assert_equal :isosceles, triangle(3, 4, 4)
    assert_equal :isosceles, triangle(4, 3, 4)
    assert_equal :isosceles, triangle(4, 4, 3)
    assert_equal :isosceles, triangle(10, 10, 2)
    
    assert_equal :isosceles, triangle(1, 1, 3) # these two are from
    assert_equal :isosceles, triangle(2, 4, 2) # test_illegal_triangles_throw_exceptions
  end

Here's the triangle method:

def triangle(a, b, c)
  if (a <= 0) || (b <= 0) || (c <= 0)
    raise TriangleError
  end
  case
  when (a == b) && (a == c) && (b == c)
    return :equilateral
  when (b == c) || (a == c) || (a == b)
    return :isosceles
  when !(a == b) && !(a == c) && !(b == c)
    return :scalene
  else
    return :oops!
  end
end

<snip>

> Strings are mutable; symbols are not.

And, IIRC, strings are GCed, while syms are not. I still use symbols
whenever it seems to make sense though.
But in case of enormous transient data that might become a problem.
Cheers
Robert

···

--
I'm not against types, but I don't know of any type systems that
aren't a complete pain, so I still like dynamic typing.
--
Alan Kay

I use a different rule of thumb which has been circulated before as
well - I can't really remember who came up with this.

1. If the set is not fixed (typically because it depends on input
obtained from somewhere) use String.

2. If the set of values is fixed (typically because the program
includes just a fixed number of keys) use Symbols.

According to that standard, FileUtils should really be using Symbols
as keys and not Strings, since OPT_TABLE is only used internally
(it contains the valid options of methods).

Btw, if conversion between the two is needed at all, I'd do it
as part of input processing. Typically, if you have a
configuration file which may select a few items, then the conversion from
String to Symbol would be part of the process which reads the
configuration file. Internally I would make the program always work
with Symbols.
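
A rough sketch of what converting at the input boundary could look like. `KNOWN_OPTIONS`, `normalize_options` and the `raw` hash are hypothetical names for illustration and are not taken from fileutils.rb:

```ruby
# Hypothetical fixed set of option names the program knows about.
KNOWN_OPTIONS = %w[verbose noop force].freeze

# Turn string keys (as they might come out of a config file) into symbols
# once, while reading the input, and reject anything outside the fixed set.
def normalize_options(raw)
  raw.each_with_object({}) do |(key, value), opts|
    name = key.to_s
    # Check before converting, so arbitrary input never becomes a symbol.
    raise ArgumentError, "unknown option: #{name}" unless KNOWN_OPTIONS.include?(name)
    opts[name.to_sym] = value
  end
end

raw = { "verbose" => true, "force" => false }   # pretend this was parsed from a file
options = normalize_options(raw)
options[:verbose]   # => true
```

Everything after `normalize_options` only ever sees Symbol keys, and because the name is checked against the fixed list first, unknown input is rejected before it is turned into a symbol.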

Kind regards

robert

···

On Sun, Jul 3, 2011 at 5:43 PM, John Feminella <johnf@bitsbuilder.com> wrote:

>> Even so it would be good to know the general criteria to consider.
>
> Conceptually, I like to use these rules of thumb (I think Jim Weirich
> noted it originally, but I am not sure):
>
> 1.) If the content and exact sequence of characters is the important
> part, use a string.
>
> 2.) If the identity is the important part, use a symbol.

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Are these really triangles? Try to draw them.

-- Bill

···

On 2011-07-03 1:32 PM, Leigh Daniels wrote:

> Hi all,
>
> First, thanks to Jeremy for the Ruby Koans link.
>
> Now I'm puzzled, Sensei!
>
> I'm trying to write the code for the cases on lines 12 & 13 (see below) and I don't understand why they should be expected to raise exceptions. They appear to be isosceles triangles and pass the isosceles test.
>
> Am I missing something subtle? Or something obvious?
>
> **Leigh
>
>     assert_equal :isosceles, triangle(1, 1, 3) # these two are from
>     assert_equal :isosceles, triangle(2, 4, 2) # test_illegal_triangles_throw_exceptions

Yea, that's what I was wondering about. In the case of fileutils, I
don't think they are ever going to be GCed, or don't need to be. So I
thought, well maybe there is an upper limit to the number of symbols? And
so symbols were avoided simply to not add to that count? But that
seems unlikely. Probably this fileutils code was originally written
before 1.9 moved all method lists to symbols, and it's just stayed
that way.
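
For what it's worth, the 1.9 behaviour mentioned above is easy to check in irb. A quick sketch, with results that assume Ruby 1.9 or later:

```ruby
# Method lists are returned as symbols, not strings.
names = "".methods
names.first.class          # => Symbol
names.include?(:upcase)    # => true
names.include?("upcase")   # => false
```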

···

On Jul 3, 2:31 pm, Robert Dober <robert.do...@gmail.com> wrote:

> <snip>
>> Strings are mutable; symbols are not.
>
> And, IIRC, strings are GCed, while syms are not. I still use symbols
> whenever it seems to make sense though.
> But in case of enormous transient data that might become a problem.

Thanks robert (et al.). Makes perfect sense. If my current patch is
accepted I will make a new one using symbols.

···

On Jul 4, 12:08 pm, Robert Klemme <shortcut...@googlemail.com> wrote:

> On Sun, Jul 3, 2011 at 5:43 PM, John Feminella <jo...@bitsbuilder.com> wrote:
>>> Even so it would be good to know the general criteria to consider.
>>
>> Conceptually, I like to use these rules of thumb (I think Jim Weirich
>> noted it originally, but I am not sure):
>>
>> 1.) If the content and exact sequence of characters is the important
>> part, use a string.
>>
>> 2.) If the identity is the important part, use a symbol.
>
> I use a different rule of thumb which has been circulated before as
> well - I can't really remember who came up with this.
>
> 1. If the set is not fixed (typically because it depends on input
> obtained from somewhere) use String.
>
> 2. If the set of values is fixed (typically because the program
> includes just a fixed number of keys) use Symbols.
>
> According to that standard, FileUtils should really be using Symbols
> as keys and not Strings, since OPT_TABLE is only used internally
> (it contains the valid options of methods).
>
> Btw, if conversion between the two is needed at all, I'd do it
> as part of input processing. Typically, if you have a
> configuration file which may select a few items, then the conversion from
> String to Symbol would be part of the process which reads the
> configuration file. Internally I would make the program always work
> with Symbols.

I love the pencil!

Thanks, Bill.

**Leigh

···

>> Am I missing something subtle? Or something obvious?
>>
>> **Leigh
>
>> assert_equal :isosceles, triangle(1, 1, 3) # these two are from
>> assert_equal :isosceles, triangle(2, 4, 2) #
>
> Are these really triangles? Try to draw them.
>
> -- Bill

The pencil test is great, and now that you understand the reasoning you can
learn the generalization, the triangle inequality:

Since 1 + 1 < 3, you know that this can't be a valid triangle. Likewise, 2
+ 2 = 4, which is still not good enough. There are exceptions to this rule,
but those exceptions are in non-Euclidean, non-spherical coordinate
systems. You're probably not going to encounter triangles in such a
coordinate system unless you're working in advanced math/physics.

TLDR: Any triangle with sides A, B and C must satisfy this property

A + B > C (you can rearrange any of the letters and this should be true)
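
Applied to the koan, one possible sketch of the check looks like this. It assumes the `TriangleError` class that the koans provide (redefined here only so the snippet runs on its own), and it is just one way to do it, not the official solution:

```ruby
# The koans supply this class; it is defined here so the sketch is self-contained.
class TriangleError < StandardError; end

def triangle(a, b, c)
  x, y, z = [a, b, c].sort
  # Non-positive sides are illegal, and the two shorter sides together
  # must be strictly longer than the longest side.
  raise TriangleError if x <= 0 || x + y <= z

  # Classify by the number of distinct side lengths.
  case [a, b, c].uniq.size
  when 1 then :equilateral
  when 2 then :isosceles
  else        :scalene
  end
end
```

Sorting first means only one comparison is needed, because if the two shortest sides beat the longest one, the other two rearrangements hold automatically.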

···

On Sun, Jul 3, 2011 at 11:13 AM, Leigh Daniels <lcdpublic@comcast.net> wrote:

> I love the pencil!
>
> Thanks, Bill.
>
> **Leigh
>
>>> Am I missing something subtle? Or something obvious?
>>>
>>> **Leigh
>>
>>> assert_equal :isosceles, triangle(1, 1, 3) # these two are from
>>> assert_equal :isosceles, triangle(2, 4, 2) #
>>
>> Are these really triangles? Try to draw them.
>>
>> -- Bill