Why was the "Symbol is a String"-idea dropped?

Robert Klemme wrote:

Or, in other words: if the decision to unify Symbol and String would
have been taken at early stages of Ruby development, then the
general usage would have adapted to this, and ...
we might be happier with the result today.

I am in no way unhappy with the way it is today. Strings and symbols serve different purposes although there is some overlap. I rarely feel the need to convert between the two.

I see.
And I am quite surprised. Because judging from your online activity
you seem to have some experience.
Perhaps it is also my programming style: I may use symbols where one
normally would use strings.

I am not aware of a situation where you would need to mix them as hash keys. And to make the distinction is pretty easy most of the time IMHO.

Not aware? I mean Rails mixes them, right?

Frankly, I believe there is an inherent advantage that you can use symbols vs. strings in code. And I mean not only performance wise but also readability wise.

Readability-wise: precisely what advantage?
The only thing that comes to my mind just now, is
that a separated Symbol class easily provides
distinct special values for a parameter that would normally carry a String.

Note though, that all these issues have nothing to do with the question
whether String and Symbol should be connected inheritance wise. IMHO that's mostly an implementation decision in Ruby.

Yes, I agree.
I am actually interested in the implications for the programmer.
My original question just arose out of the notion
that this implementation decision could have been a move
in a (to my mind) favourable direction.

Yes, I sometimes think of that separation of Symbol from String
as a tiny impurity in the Ruby crystal.

Personally I believe it creates more expressiveness. If you view this as impurity, there are a lot of them in Ruby because Ruby's focus
has always been on pragmatism and not purity

1. The core structure must of course be large enough, and a large structure may look impure.
2. But regarding this particular question: My original notion was that keeping
     Symbol and String too separate is not pragmatic.
     (I may change my mind on that, if I read more posts like yours, though.)

So, I'll just have to come to terms with it. :slight_smile:
(And I will, of course -- there are enough other fascinating issues... :slight_smile: )

The capability to adjust to reality is a useful one IMHO. :slight_smile:

Well, yes, sometimes I'm glad someone tells me that. :slight_smile:

create a class hierarchy similar to the Float/Integer hierarchy?
String < Stringlike
Symbol < Stringlike

Why not? StringLike could even be a module that relies solely on [] and length to do all the non mutating stuff.

Ah, interesting. Can't follow the implications right now.

Given the fact that I don't mix symbols and strings as Hash keys I wouldn't benefit -
but it would not hurt me either. :slight_smile: YMMV

Yes that was the idea behind it: to benefit some and not to hurt the others.

Credits also go to the community that is still among the most civilized online communities I know so far!

Indeed, I'm experiencing it right now!
Thanks a lot!

Sven

···

On 15.05.2007 03:07, enduro wrote:

Ooops!

sorry if I came across rude in any way.

I don't want to "own" the thread.
But I am interested in my question,
so I was glad that someone repeated it,
at a time when all the answers up to that point had not yet answered it.

Robert Dober wrote:

The fact that the original idea is a big paradigm shift does not
answer your question?

Sorry, no. If someone had told me that this fact was the basis for the decision of the core team,
that would have answered my question.
(Because the fact alone is not compelling: If a paradigm shift is possible and good then why not shift?)

And also, I thought that this was the right place for posting the question.
(Actually, until yesterday I didn't know that I could post on ruby-core,
I thought it was just for "cracks", because it's read-only on ruby-forum.com)

Kind Regards
Sven

And here again, Robert Dober's full text:

Thank you all for your replies.

And thank you, Xavier, for keeping the focus on my original intention.
Yes, I was not asking about general arguments for designing a class
hierarchy, but for the reasons for this particular decision of the
ruby-core team.

I really have not taken offense. However if you are interested in that
only you might post to ruby-core only.
I am kind of surprised that the considerations of Rick and YHS are
considered as OT.
If you do not like them maybe it would be polite to ignore them. But
talking about the topic on *this* list and ignoring all background
information about what symbols are and have been is kind of weird.
Please remember that Ruby has its inheritance in other languages
owning symbols as I believe to have pointed out.
The fact that the original idea is a big paradigm shift does not
answer your question?

I honestly do not understand that.

Threads just evolve; I do not feel that they belong to the OP :).
They do not belong to me either of course ;).
Cheers
Robert

Another question:
Who is

YHS

?

Regards, Sven

···

On 5/15/07, enduro <sven715rt@suska.org> wrote:

Robert Klemme wrote:

Or, in other words: if the decision to unify Symbol and String would
have been taken at early stages of Ruby development, then the
general usage would have adapted to this, and ...
we might be happier with the result today.

I am in no way unhappy with the way it is today. Strings and symbols serve different purposes although there is some overlap. I rarely feel the need to convert between the two.

I see.
And I am quite surprised. Because judging from your online activity
you seem to have some experience.
Perhaps it is also my programming style: I may use symbols where one
normally would use strings.

Yeah, maybe. So where are you using symbols where one normally would use strings?

I am not aware of a situation where you would need to mix them as hash keys. And to make the distinction is pretty easy most of the time IMHO.

Not aware? I mean Rails mixes them, right?

I don't use Rails. :-)))

Frankly, I believe there is an inherent advantage that you can use symbols vs. strings in code. And I mean not only performance wise but also readability wise.

Readability-wise: precisely what advantage?

If I see a symbol being used as a Hash key I immediately know (or rather guess) that there is only a limited amount of them and they are known beforehand, like with options.

# silly example
opts = {
   :length => 12,
   :width => 30,
}
# other code
resize( opts[:length] )

Whereas when strings are used it's typically stuff that is read from somewhere, like (another silly example):

ruby -aF: -ne 'BEGIN { $c=Hash.new(0) }; $c[$F[1]]+=1; END { $c.each {|k,v| print k, "=", v, "\n"}}' /etc/passwd

The only thing that comes to my mind just now, is
that a separated Symbol class easily provides
distinct special values for a parameter that would normally carry a String.

Don't forget the optical distinction between using 'string', "string" and :symbol.

Note though, that all these issues have nothing to do with the question
whether String and Symbol should be connected inheritance wise. IMHO that's mostly an implementation decision in Ruby.

Yes, I agree.
I am actually interested in the implications for the programmer.
My original question just arose out of the notion
that this implementation decision could have been a move
in a (to my mind) favourable direction.

As we all have different habits what may be favorable for one may be regrettable for the other. :slight_smile:

Yes, I sometimes think of that separation of Symbol from String
as a tiny impurity in the Ruby crystal.

Personally I believe it creates more expressiveness. If you view this as impurity, there are a lot of them in Ruby because Ruby's focus
has always been on pragmatism and not purity

1. The core structure must of course be large enough, and a large structure may look impure.

This somehow reminds me of http://en.wikipedia.org/wiki/Gödel's_incompleteness_theorem

2. But regarding this particular question: My original notion was that keeping
    Symbol and String too separate is not pragmatic.
    (I may change my mind on that, if I read more posts like yours, though.)

Just reread mine a few times - then you don't need the other postings any more. That's more efficient - you'll save bandwidth and reading is actually faster if you know the text already. :-))

So, I'll just have to come to terms with it. :slight_smile:
(And I will, of course -- there are enough other fascinating issues... :slight_smile: )

The capability to adjust to reality is a useful one IMHO. :slight_smile:

Well, yes, sometimes I'm glad someone tells me that. :slight_smile:

:-)) No sweat - following visions is useful as well. As always it's the mix...

create a class hierarchy similar to the Float/Integer hierarchy?
String < Stringlike
Symbol < Stringlike

Why not? StringLike could even be a module that relies solely on [] and length to do all the non mutating stuff.

Ah, interesting. Can't follow the implications right now.

For example regexp matching might be implemented similarly for both (i.e. just in one place). But then again, since RX functionality is highly integrated into the language that might not be a good idea - or the C code needs to become more complex to react differently if it sees a String or Symbol vs. some custom class that includes this module. Hm...
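A minimal sketch of the StringLike idea, assuming Ruby 1.9 or later (where String#[] with an integer index returns a one-character string). Only the module name comes from the thread; the methods chosen and the Label class are hypothetical, picked to show that non-mutating behaviour can be built on nothing but [] and length.

```ruby
# Hypothetical mixin: everything here is derived from #[] and #length only.
module StringLike
  # Yield each character in turn, using only index access.
  def each_char
    return enum_for(:each_char) unless block_given?
    length.times { |i| yield self[i] }
    self
  end

  def empty?
    length.zero?
  end

  # True when the receiver begins with the given prefix.
  def start_with?(prefix)
    prefix.length <= length &&
      (0...prefix.length).all? { |i| self[i] == prefix[i] }
  end

  # Non-mutating: builds and returns a new String.
  def reverse
    each_char.to_a.reverse.join
  end
end

# A toy class that is neither String nor Symbol but can still take part.
class Label
  include StringLike

  def initialize(text)
    @text = text.to_s.dup.freeze
  end

  def [](index)
    @text[index]
  end

  def length
    @text.length
  end
end

label = Label.new(:width)
puts label.start_with?("wid")
puts label.reverse
```

Whether the built-in String and Symbol could share such a module without slowing down the C-level fast paths is exactly the open question raised above; the sketch says nothing about that.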

Given the fact that I don't mix symbols and strings as Hash keys I wouldn't benefit -
but it would not hurt me either. :slight_smile: YMMV

Yes that was the idea behind it: to benefit some and not to hurt the others.

The next best thing to a win win situation. :-))

Credits also go to the community that is still among the most civilized online communities I know so far!

Indeed, I'm experiencing it right now!
Thanks a lot!

You're welcome. Thank /you/!

Kind regards

  robert

···

On 15.05.2007 12:31, enduro (Sven Suska) wrote:

On 15.05.2007 03:07, enduro wrote:

Ooops!

sorry if I came across rude in any way.

I don't want to "own" the thread.
But I am interested in my question,
so I was glad that someone repeated it,
at a time when all the answers up to that point had not yet answered it.

Robert Dober wrote:

> The fact that the original idea is a big paradigm shift does not
> answer your question?

Sorry, no. If someone had told me that this fact was the basis for the
decision of the core team,
that would have answered my question.
(Because the fact alone is not compelling: If a paradigm shift is
possible and good then why not shift?)

Sure that was exactly the thing I wanted to discuss and suddenly
someone told me hey stay On Topic. That was strange but not rude at
all. I mean neither Xavier nor you, you are very civilized and polite
people- maybe much more than YHS :wink:
I just had the feeling that the answers you will get on this list will
never correspond to your exact question, and I was wrong as Matz
stepped by.

I admit that personally I have a big problem with "A symbol is a
string", but brighter people than me like Tom and Matz have not or did
not have, so maybe indeed I am making too much noise while thinking
:(.

But please remember too that there are only complicated answers to
simple questions ;).

And also, I thought that this was the right place for posting the question.
(Actually, until yesterday I didn't know that I could post on ruby-core,
I thought it was just for "cracks", because it's read-only on
ruby-forum.com)

I definitely should have pointed that out first and then I could have
taken all the time to rant/argue/discuss the technical points, oh boy
how difficult communication can be sometimes!

Kind Regards
Sven

Cheers
Robert

···

On 5/15/07, enduro (Sven Suska) <sven715rt@suska.org> wrote:

--
You see things; and you say Why?
But I dream things that never were; and I say Why not?
-- George Bernard Shaw

>The programs for which it makes sense to convert strings (received from some
>external source, e.g. a database) to symbols for optimisation purposes, i.e.
>where the benefits are measurable, will be pretty few.
>
Yes, I agree.
(That's what I tried to address by the two lines after the quote above,
perhaps I should have put a smiley in there :slight_smile: )

>And you also open yourself to a symbol exhaustion denial-of-service.
>
>
Yes, of course.
But my point is: Let the system take care of that.
I want a Ruby that just works - crystal-clear, transparently, reliably.
:slight_smile:
And it already does in most cases. And there is a lot that can be improved.
And one such improvement could be a garbage collection for symbols. (I think.)

But then what you want are not symbols, but true immutable strings. By that
I mean: some object where I can write 10MB of binary dump. If I want to add
one character to the end of it, then I create another object containing
10MB+1byte of binary dump, and the old 10MB object is garbage-collected.
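A small sketch of the difference described here, assuming Ruby 1.9 or later (Ruby 1.8 raised a TypeError instead). A frozen string rejects in-place appends, while + builds a separate object and leaves the original for the garbage collector once nothing refers to it. The variable names are illustrative.

```ruby
original = "binary dump".freeze

# Non-mutating: + builds a brand-new object and leaves the original alone.
extended = original + "!"

# Mutating: << would change the receiver in place, so a frozen string refuses.
begin
  original << "!"
rescue RuntimeError => e   # FrozenError in newer Rubies is a RuntimeError subclass
  puts "refused: #{e.class}"
end

puts original
puts extended
puts original.equal?(extended)
```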

Now, there have been arguments that *all* strings in Ruby should have been
immutable in the first place, and I can sympathise with them. After all,
numbers are immutable, and so are certain other classes. But pragmatically,
there are cases where it is just so *useful* to append to a string. Besides,
maintaining the singleton property is hard for large binary objects - i.e.
when I create another 10MB binary dump, I have to check whether it's the
same as any other object which already exists.

(And of course, very large numbers are Bignums, which are not singletons)

>That is, as far as I know, the symbol table is never garbage collected.
>Once a symbol, always a symbol.
>
I'm not a core programmer, maybe I am asking too much,
but I think it should be possible without slowing anything down.
One very simple idea I can think of is the following:
Set a limit to the number of symbols, and if it is reached
the GC will be invoked in a special symbol mode, marking all symbols that are
still in use and completely regenerating the symbol table from scratch.

Yes, but why??? In real life, real world programs, only a few hundred unique
method names are used. So let them be symbols.

If you are going to create a million different symbols, or symbols which are
millions of bytes long, then use a String. That's what they are there for!
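The growth being warned about can be watched with Symbol.all_symbols. This is only a sketch: the "request_param_" names stand in for data arriving from outside, and the exact count printed depends on the interpreter. As far as I know, later Ruby versions (2.2 onwards) can collect symbols that were created dynamically like this, so the "once a symbol, always a symbol" rule describes the Ruby of this thread.

```ruby
# Untrusted input turned into symbols: every distinct string adds a table entry.
incoming = (1..1000).map { |i| "request_param_#{i}" }   # stand-in for external data

before  = Symbol.all_symbols.size
symbols = incoming.map { |name| name.to_sym }
after   = Symbol.all_symbols.size

puts "symbol table grew by #{after - before} entries"
puts Symbol.all_symbols.include?(symbols.first)
```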

"Doctor, it hurts when I do this" -- "Then don't do that!"

What you seem to be saying is "I don't want there to be two different types
of object, one for method names and one for holding blobs of data", but I
don't understand this. Symbols work, are fast, and personally I find them
aesthetically pleasing: one is a sort of tag for method names, and one is a
holder of blobs of data which may come from the outside world or from my own
computations.

Yes, I really must admit, I also like the cleanness of current Symbols.
But then, my experience is that this clearness is not worth a lot,
because the border towards "dirty" strings must be crossed often.
(That's why I called sticking to the clearness "tempting" in my last post.)

I don't think so. The examples I've seen so far are:

(1) Method names which are created algorithmically. That is, you know you
have a method called "foo" and you want to call another method called
"foo=". It works, where's the problem?

    send("#{mname}=")

Yes, you've made a conversion to a string, and back again. Big deal. The
only way to improve this would be to have symbol algebra, e.g.
    (:foo + :=) == :foo=

But internally it would almost certainly be implemented the same way,
because you'd have to look up the symbol ID to convert it into its character
representation, manipulate the characters, and then lookup back into a
symbol.
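For reference, the round trip in example (1) looks like this in full. The Shape class and its attributes are made up for the illustration; send and to_sym are the only library calls involved.

```ruby
class Shape
  attr_accessor :length, :width
end

shape = Shape.new
mname = :length                 # the getter's name, held as a symbol

# Symbol -> String via interpolation; send accepts the string as a method name.
shape.send("#{mname}=", 12)
puts shape.length

# The same conversion spelled out: build the string, then intern it again.
setter = "#{mname}=".to_sym
puts setter.inspect
puts shape.respond_to?(setter)
```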

Or, you'd have to drop symbols entirely and make *every* method call use a
string of characters as the method name - which would be very expensive.

Or, you'd have to make all Strings immutable, so that the string ID
could be used as a method call tag. See above for reasons why that is
undesirable.

(2) Rails, which allows you to be inconsistent between :foo=>:bar and
:foo=>"bar" and "foo"=>:bar and "foo"=>"bar" (at least sometimes - not
always). IMO it would have been better if Rails had stuck to one or the
other, but that's too late to undo.

Rails has introduced its own bast^H^H^H^Hextensions to the language anyway.

Ruby is not yet good in many other aspects:
speed, threads, documentation.

There is really *excellent* documentation for Ruby. You have to pay for it,
but the books I am thinking of are well worth the money.

You may not like the idea that the language designer and contributors are
not getting any money directly for their work, whilst book publishers are. I
can live with that.

I find that speed is good enough, and threads are better than most (have you
tried writing threaded programs in Perl?)

The language is the crystal. It must be good in the beginning,
it becomes more solid with every project written in that language.

Many people don't seem to realise that Ruby is, what, 15 years old now?

Regards,

Brian.

···

On Tue, May 15, 2007 at 06:42:04PM +0900, enduro (Sven Suska) wrote:

Hello again,

Robert Klemme wrote:

Perhaps it is also my programming style: I may use symbols where one
normally would use strings.

Yeah, maybe. So where are you using symbols where one normally would use strings?

Let me guess, because I don't know if I am really the only one:
1. Multipurpose-names:
  Like option-names, used as hash keys but also as names and labels for the corresponding graphics control etc.
2. Logging:
  Giving a brief hint in the form of a symbol (not the log level), well just because it is easier to type and looks nice

Not aware? I mean Rails mixes them, right?

I don't use Rails. :-)))

Oops :-), offending again, am I? :slight_smile:
:slight_smile:

Frankly, I believe there is an inherent advantage that you can use symbols vs. strings in code.
And I mean not only performance wise but also readability wise.

Readability-wise: precisely what advantage?

If I see a symbol being used as a Hash key I immediately know (or rather guess)
that there is only a limited amount of them and they are known beforehand,
like with options.

# silly example
opts = {
  :length => 12,
  :width => 30,
}
# other code
resize( opts[:length] )

Sorry, don't get me wrong:

I DID NOT MEAN TO REMOVE the Symbol class.
Nor Symbol literals.

Thus, your examples would be valid and semantically equivalent code
after a "unification" of the classes (regardless if Symbol < String or not).
Or I'd better not call it "unification", I don't have a good word,
perhaps "joining" would be better.

Don't forget the optical distinction between using 'string', "string" and :symbol.

Also, this won't be affected, see above.

[...] on pragmatism and not purity

1. The core structure must of course be large enough, and a large structure may look impure.

This somehow reminds me of http://en.wikipedia.org/wiki/Gödel's_incompleteness_theorem

... mystery will always remain ...

2. But regarding this particular question: My original notion was that keeping
    Symbol and String too separate is not pragmatic.
    (I may change my mind on that, if I read more posts like yours, though.)

Just reread mine a few times - then you don't need the other postings any more. That's more efficient - you'll save bandwidth and reading is actually faster if you know the text already. :-))

Well,
as Ruby-users,
we don't sacrifice our fun to the god of efficiency, do we... :slight_smile:

Cheers,
Sven

···

On 15.05.2007 12:31, enduro (Sven Suska) wrote:

<snip>

But then what you want are not symbols, but true immutable strings. By that
I mean: some object where I can write 10MB of binary dump. If I want to add
one character to the end of it, then I create another object containing
10MB+1byte of binary dump, and the old 10MB object is garbage-collected.

But of course we have immutable strings already :)))

class IString < String
   def initialize str
     super(str)
     freeze
   end
end

HTIOI (Hope this is of interest :wink:

<snip>
Cheers
Robert

···

On 5/15/07, Brian Candler <B.Candler@pobox.com> wrote:

Not responding to any particular posting.

One of the false memes that some folks on this thread seem to hold is
that Symbols are integers.

They aren't.

Any more than they are strings.

A given ruby symbol has both a string and an integer representation,
which can be obtained by using the to_s and to_i methods. But one wouldn't say
that the object 1.2 is a string because it has a string
representation, or that the object "123" was an integer because it has
an integer representation.

The essential fact about symbols is that if two symbols have the same
string representation they are the same object, and that two different
symbols have two different integer representations. Or more formally

     sym1.to_s == sym2.to_s iff sym1.object_id == sym2.object_id
     sym1.to_i == sym2.to_i iff sym1.object_id == sym2.object_id

One way to implement this is to keep internal tables which map the
string and integer representations of symbols to each other, and to
have functional mappings between the object_ids and integer
representations of symbols. This is how ruby does it. Creating a
symbol from a string consists of looking for the string in the mapping
from strings to integer representations, and if it's not found
assigning the next integer rep and adding the string and integer rep
to the internal tables. This operation, called interning, happens
either at parse time when :foo is encountered, or later when an
expression like 'foo'.to_sym is executed.
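The interning step can be modelled in a few lines. This is a toy, not the interpreter's code: the real tables live in C, and the class and method names below are invented for the sketch.

```ruby
# Toy model of interning; the real table lives in the interpreter's C code.
class ToyInternTable
  def initialize
    @id_for   = {}   # string representation -> integer representation
    @name_for = []   # integer representation -> string representation
  end

  # Return the existing integer for name, or assign the next free one.
  def intern(name)
    @id_for.fetch(name) do
      id = @name_for.length
      frozen = name.dup.freeze
      @name_for << frozen
      @id_for[frozen] = id
    end
  end

  # Reverse mapping, the analogue of to_s.
  def name_of(id)
    @name_for.fetch(id)
  end
end

table = ToyInternTable.new
a = table.intern("foo")
b = table.intern("bar")
c = table.intern("foo")   # already present: no new entry is made

puts a == c
puts a == b
puts table.name_of(b)
```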

The meme that "Symbols are Integers" probably lingers from an earlier
version of Ruby before there was an actual Symbol class. Back then,
symbols really were instances of Fixnum, but no more. This lives on
vestigially in that Symbol does have a to_int method as well as to_i,
but to_int is deprecated; using it produces a warning:

rick@frodo:~$ ruby -w -e"p :sym.to_int"
-e:1: warning: treating Symbol as an integer
10409

while to_i does not.
rick@frodo:~$ ruby -w -e"p :sym.to_i"
10409

Other languages, like Smalltalk, with similar concepts don't associate
integer representations with Symbols, in these languages the internal
mapping simply maps string representations to object id's, or to the
symbol objects themselves. I suspect that this feature of Ruby symbols
is simply due to the earlier implementation.

Now what are the useful properties of Symbols:

    1. Detecting whether or not two symbols are equal is as fast as
       comparing their object_ids. This is an O(1) operation.
       Detecting whether or not two strings are equal requires a scan of
       both strings until either an unequal character is found or the end
       of both strings is reached. This is an O(n) operation.
    2. Having 1000 'instances' of a symbol with a particular string
       representation takes no more space than having 1.
Property 1 means that things like hashes with symbol keys are somewhat
faster than hashes with string keys. This is why symbols are used as
method selectors, since dispatching a method call requires repeated
lookup in the method tables going up the inheritance chain. This is a
win if the key is looked up multiple times, there is an initial cost
of interning the symbol (which essentially consists of looking for the
string representation in an internal global symbol table) but this
cost is amortized over subsequent lookups.

It seems that the HashWithIndifferentAccess class added by Rails in
ActiveSupport, which allows symbols and strings to be used
interchangeably as keys, doesn't actually take advantage of this since
it uses symbols converted to strings as the actual keys rather than
the other way around. This provides a bit of syntactic sugar, without
getting either the performance or space advantages of using symbols.
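The behaviour described can be imitated with a tiny wrapper. This is a hypothetical, much-simplified stand-in and not the ActiveSupport implementation; the IndifferentHash name and its methods are invented for the sketch. It shows why the stored keys end up being strings.

```ruby
# Hypothetical, much-simplified stand-in; not the ActiveSupport implementation.
class IndifferentHash
  def initialize
    @store = {}
  end

  def []=(key, value)
    @store[normalize(key)] = value
  end

  def [](key)
    @store[normalize(key)]
  end

  def keys
    @store.keys
  end

  private

  # Symbols are converted to strings, so every stored key is a String.
  def normalize(key)
    key.is_a?(Symbol) ? key.to_s : key
  end
end

h = IndifferentHash.new
h[:foo] = "bar"
puts h["foo"]
puts h[:foo]
puts h.keys.inspect
```

Normalizing in the other direction (strings to symbols) would buy the faster lookups, at the price of interning every key that ever arrives.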

As for incompatibilities caused by the experiment, I'm not sure exactly
what Matz and the core team ran into but certainly this would break
code like:

case arg
when String
    # do something
when Symbol
    # do something else
end

Code like this exhibits the fragility of doing discrimination based on
classes in the face of refactoring.
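The breakage can be reproduced without touching the real classes. In the sketch below, Text and Tag are made-up stand-ins for String and a Symbol that inherits from it; the point is only that case tests branches in order with ===, so the superclass branch swallows the subclass.

```ruby
# Stand-ins for a hypothetical "Symbol < String" world; the real classes are untouched.
class Text; end
class Tag < Text; end      # plays the role of a Symbol that inherits from String

def classify(arg)
  case arg
  when Text then :text_branch   # tested first; Text === Tag.new is true
  when Tag  then :tag_branch    # never reached for a Tag
  end
end

# The same shape of code with today's unrelated String and Symbol classes.
def classify_today(arg)
  case arg
  when String then :string_branch
  when Symbol then :symbol_branch
  end
end

puts classify(Tag.new).inspect
puts classify_today(:foo).inspect
```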

···

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

<snip>

But then what you want are not symbols, but true immutable strings. By that
I mean: some object where I can write 10MB of binary dump. If I want to add
one character to the end of it, then I create another object containing
10MB+1byte of binary dump, and the old 10MB object is garbage-collected.

But of course we have immutable strings already :)))

class IString < String
  def initialize str
    super(str)
    freeze
  end
end

What advantages does this have over using "freeze" directly?

str = "foo".freeze

It seems using a new class will increase the likelihood of things to break.

HTIOI (Hope this is of interest :wink:

LOL

You see things; and you say Why?
But I dream things that never were; and I say Why not?
-- George Bernard Shaw

Greetings to George, btw. :slight_smile:

  robert

···

On 15.05.2007 15:54, Robert Dober wrote:

On 5/15/07, Brian Candler <B.Candler@pobox.com> wrote:

<lots of interesting stuff snipped>

It seems that the HashWithIndifferentAccess class added by Rails in
ActiveSupport, which allows symbols and strings to be used
interchangeably as keys, doesn't actually take advantage of this since
it uses symbols converted to strings as the actual keys rather than
the other way around. This provides a bit of syntactic sugar, without
getting either the performance or space advantages of using symbols.

Is this whole String vs. Symbol idea motivated by Rails stuff?
I just do not know Rails but I would guess it is a dangerous thing if
paradigms that are useful in an application framework - even if it is
such a Great One as Rails - are to be applied to a General Purpose
Language.

I will rephrase OP's question now, why the h[ae]ck did the Core team
think about unifying Strings and Symbols in the first place ???
That is for sure something very interesting.
<more stuff snipped>

Robert

···

On 5/15/07, Rick DeNatale <rick.denatale@gmail.com> wrote:

Yes, but it's not a singleton.

It would only be of interest as a Symbol replacement if IString.new("foo")
always returned the same object. You could implement this using the Multiton
pattern I think.

Then you could safely use IString#object_id as a method name key.
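One way the Multiton suggestion might look, as a sketch only. The POOL constant and the overridden new are assumptions of this example, it is not thread-safe, and the pool keeps every instance alive forever, which is the same trade-off the symbol table makes.

```ruby
class IString < String
  POOL = {}   # content -> the one shared instance (not thread-safe in this sketch)

  # Hand back the pooled instance when one exists, otherwise create and freeze it.
  def self.new(str)
    POOL[str] ||= super(str).freeze
  end
end

a = IString.new("foo")
b = IString.new("foo")
c = IString.new("bar")

puts a.equal?(b)
puts a.equal?(c)
puts a.frozen?
```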

Regards,

Brian.

···

On Tue, May 15, 2007 at 10:54:05PM +0900, Robert Dober wrote:

On 5/15/07, Brian Candler <B.Candler@pobox.com> wrote:
<snip>
>But then what you want are not symbols, but true immutable strings. By that
>I mean: some object where I can write 10MB of binary dump. If I want to add
>one character to the end of it, then I create another object containing
>10MB+1byte of binary dump, and the old 10MB object is garbage-collected.
But of course we have immutable strings already :)))

class IString < String
  def initialize str
    super(str)
    freeze
  end
end


What advantages does this have over using "freeze" directly?

Dunno :slight_smile:

x = IString.new("Hello World") # Not even tested yet
vs.
x = "Hello World".freeze

Well the first one has the advantage that I thought about it :wink:

Now I reckon that the subclass stuff is baaad

def blah str
    raise ArgumentError unless IString === str
    ...
end

but now someone does
class MString < IString
   # get rid of the freeze (e.g. by calling superclass.superclass.new
   # in self.class.new)
end

and my code is broken, while in

def blah str
    raise ArgumentError unless str.respond_to?(:frozen?) && str.frozen?
   ...
end

frozen is frozen forever.

So do what Robert told you and beware of what Robert told you;)

str = "foo".freeze

It seems using a new class will increase the likelihood of things to break.

> HTIOI (Hope this is of interest :wink:

LOL

> You see things; and you say Why?
> But I dream things that never were; and I say Why not?
> -- George Bernard Shaw

Greetings to George, btw. :slight_smile:

Well last time I met him he was admiring your posts to the list :stuck_out_tongue:

        robert

idem

···

On 5/15/07, Robert Klemme <shortcutter@googlemail.com> wrote:

On 15.05.2007 15:54, Robert Dober wrote:
> On 5/15/07, Brian Candler <B.Candler@pobox.com> wrote:

P.S. I'm aware of Symbol#to_i, but to_i and object_id appear to be
intimately related:

irb(main):001:0> :foo.to_i
=> 14817
irb(main):002:0> :foo.object_id
=> 148178
irb(main):003:0> :bar.to_i
=> 16081
irb(main):004:0> :bar.object_id
=> 160818
irb(main):005:0> :zzzzzzzzzzzzzzzz.to_i
=> 16089
irb(main):006:0> :zzzzzzzzzzzzzzzz.object_id
=> 160898
irb(main):007:0> :puts.to_i
=> 7345
irb(main):008:0> :puts.object_id
=> 73458
irb(main):009:0>

i.e. I don't think the symbol table maintains an explicit integer key for
each symbol.

···

On Tue, May 15, 2007 at 11:53:08PM +0900, Brian Candler wrote:

Yes, but it's not a singleton.

It would only be of interest as a Symbol replacement if IString.new("foo")
always returned the same object. You could implement this using the Multiton
pattern I think.

Then you could safely use IString#object_id as a method name key.

You've stated or implied a couple of times in this discussion that
symbols are 'singletons', but I thought the conventional definition
of 'singleton' was of a class with only a single instance, where the
instance is called a singleton. That doesn't describe Ruby's symbols.

I think what you are getting at is the idea that identity and
equality are one and the same for symbols. Fixnum instances also
have this property but floats don't. Is there a standard term for
that characteristic? I think in mathematics it would be an equivalence
relation ~ such that If x ~ y then x = y for all x, y in the set.
In this case ~ represents Ruby's == and = represents Ruby's equal?.
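The property can be checked directly by comparing == with equal? on values that were built separately. A short sketch, assuming Ruby 1.9 or later; floats are left out on purpose because their identity behaviour depends on the implementation.

```ruby
# For each pair of separately constructed but equal values, compare == with equal?.
pairs = {
  "Symbol"  => [:foo, "foo".to_sym],
  "Integer" => [42, 40 + 2],
  "String"  => ["foo", String.new("foo")],
}

pairs.each do |label, (a, b)|
  puts "#{label}: == #{a == b}, equal? #{a.equal?(b)}"
end
```

For symbols and small integers the two answers agree; for strings, == holds while equal? does not.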

···

On May 15, 2007, at 10:53 AM, Brian Candler wrote:

On Tue, May 15, 2007 at 10:54:05PM +0900, Robert Dober wrote:

On 5/15/07, Brian Candler <B.Candler@pobox.com> wrote:
<snip>

But then what you want are not symbols, but true immutable strings. By that
I mean: some object where I can write 10MB of binary dump. If I want to add
one character to the end of it, then I create another object containing
10MB+1byte of binary dump, and the old 10MB object is garbage-collected.

But of course we have immutable strings already :)))

class IString < String
  def initialize str
    super(str)
    freeze
  end
end

Yes, but it's not a singleton.

<lots of interesting stuff snipped>
>
> It seems that the HashWithIndifferentAccess class added by Rails in
> ActiveSupport, which allows symbols and strings to be used
> interchangeably as keys,

Is this whole String vs. Symbol idea motivated by Rails stuff?

I will rephrase OP's question now, why the h[ae]ck did the Core team
think about unifying Strings and Symbols in the first place ???

I don't know. Probably not motivated, but on the other hand it no
doubt stimulated a reconsideration of the relationship between String
and Symbol.

Whether or not Strings and Symbols have an inheritance relationship is
a bit of an accidental design choice. Keeping in mind that in a
language like Ruby or Smalltalk, the class hierarchy is really about
implementation factoring and not type specification, as a first
approximation, it doesn't matter that much. In Smalltalk-80 Symbol is
a subclass of String, but I believe that Symbol overrode the methods
which mutate the instance to cause errors.

But once the decision was made, secondary effects ensue. If
programmers write code which depends on a particular inheritance
relationship like the case statement in my earlier post, then changes
to the decision will break things. It's like the story about how
Stuart Feldman decided to use tab as a lexical element in makefiles
and treat them differently from the equivalent whitespace. He
realized that this was a bad decision, but too late.

  "No discussion of make(1) would be complete without an
   acknowledgement that it includes one of the worst design botches
   in the history of Unix. The use of tab characters as a required leader
   for command lines associated with a production means that the
   interpretation of a makefile can change drastically on the basis of
   invisible differences in whitespace.

        Why the tab in column 1? Yacc was new, Lex was brand new. I hadn't
        tried either, so I figured this would be a good excuse to learn. After
        getting myself snarled up with my first stab at Lex, I just did something
        simple with the pattern newline-tab. It worked, it stayed. And then a
        few weeks later I had a user population of about a dozen, most of them
        friends, and I didn't want to screw up my embedded base. The rest,
        sadly, is history.
                                               -- Stuart Feldman"

Not that I'm saying that Matz's decision on Symbol not being a
subclass of String was a bad one, I'm not, and it's certainly not in
the class of the tab/whitespace 'decision' in make. What I am saying
is that once made these decisions can quickly generate their own
requirements to exist once a user base has been established.
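
To illustrate how code can come to depend on the inheritance relationship, here is a sketch. It is not the case statement from the earlier post (which isn't reproduced here); StringySymbol and kind_of_key are hypothetical names, with StringySymbol standing in for "a Symbol that inherits from String":

```ruby
# Hypothetical stand-in for a Symbol class that is a subclass of String.
class StringySymbol < String; end

# Dispatches on class. The order of the branches only "works" as long as
# symbols are not also Strings.
def kind_of_key(key)
  case key
  when String then :string   # also matches every String subclass
  when Symbol then :symbol
  else :other
  end
end

puts kind_of_key("name")
puts kind_of_key(:name)
puts kind_of_key(StringySymbol.new("name"))
```

With today's hierarchy :name reaches the Symbol branch, while the String-derived stand-in is caught by the String branch first. Code written this way would silently change behaviour if the relationship between the two classes changed.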

···

On 5/15/07, Robert Dober <robert.dober@gmail.com> wrote:

On 5/15/07, Rick DeNatale <rick.denatale@gmail.com> wrote:

From: http://www.faqs.org/docs/artu/ch15s04.html

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

> <snip>
>> But then what you want are not symbols, but true immutable strings. By
>> that
>> I mean: some object where I can write 10MB of binary dump. If I want
>> to add
>> one character to the end of it, then I create another object containing
>> 10MB+1byte of binary dump, and the old 10MB object is garbage-collected.
> But of course we have immutable strings already :)))
>
> class IString < String
> def initialize str
> super(str)
> freeze
> end
> end

What advantages does this have over using "freeze" directly?

Dunno :slight_smile:

x = IString.new("Hello World") # Not even tested yet
vs.
x="HelloWorld".freeze

Well the first one has the advantage that I thought about it :wink:

Now I reckon that the subclass stuff is baaad

def blah str
   raise ArgumentError unless IString === str
   ...
end

but now someone does
class MString < IString
  # get rid of the freeze (e.g. by calling superclass.superclass.new
  # in self.class.new)
end

and my code is broken, while in

def blah str
   raise ArgumentError unless str.respond_to?(:frozen?) && str.frozen?
  ...
end

frozen is frozen forever.

Correct. And since #frozen? is defined in Kernel you can skip the first test.
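
A runnable sketch of the simplified check, assuming a reasonably recent Ruby. The method body and the error message are made up for illustration (the original elides the body with "..."):

```ruby
# Accepts any frozen object, whatever its class; no IString required.
def blah(str)
  raise ArgumentError, "expected a frozen string" unless str.frozen?
  str.length   # placeholder for the real work
end

frozen  = String.new("Hello World").freeze
mutable = String.new("Hello World")

puts blah(frozen)

begin
  blah(mutable)
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

String.new is used instead of bare literals so the result does not depend on how a given Ruby version treats string literals.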

So do what Robert told you and beware of what Robert told you;)

:slight_smile:

> You see things; and you say Why?
> But I dream things that never were; and I say Why not?
> -- George Bernard Shaw

Greetings to George, btw. :slight_smile:

Well last time I met him he was admiring your posts to the list :stuck_out_tongue:

Wow! So he didn't die but just went home like this other guy who invented a vi clone (or at least provided his name for the operation)... :slight_smile:

        robert

idem

:slight_smile:

While we're at it: *if* you want to define something (and are a fan of C++) you can do this:

irb(main):001:0> module Kernel
irb(main):002:1> private
irb(main):003:1> def const(*a) a.each {|x| x.freeze } end
irb(main):004:1> end
=> nil
irb(main):005:0> nil
=> nil
irb(main):006:0> foo, bar = const "foo", "bar"
=> ["foo", "bar"]
irb(main):007:0> ["foo", "bar"]
=> ["foo", "bar"]
irb(main):008:0> foo << bar
TypeError: can't modify frozen string
         from (irb):8:in `<<'
         from (irb):8
irb(main):009:0> bar << foo
TypeError: can't modify frozen string
         from (irb):9:in `<<'
         from (irb):9
irb(main):010:0>

Hihi...

Kind regards

  robert

···

On 15.05.2007 16:34, Robert Dober wrote:

On 5/15/07, Robert Klemme <shortcutter@googlemail.com> wrote:

On 15.05.2007 15:54, Robert Dober wrote:
> On 5/15/07, Brian Candler <B.Candler@pobox.com> wrote:


P.S. I'm aware of Symbol#to_i, but to_i and object_id appear to be
intimately related:

irb(main):001:0> :foo.to_i
=> 14817
irb(main):002:0> :foo.object_id
=> 148178
irb(main):003:0> :bar.to_i
=> 16081
irb(main):004:0> :bar.object_id
=> 160818
irb(main):005:0> :zzzzzzzzzzzzzzzz.to_i
=> 16089
irb(main):006:0> :zzzzzzzzzzzzzzzz.object_id
=> 160898

Here's part of the Ruby 1.8.5 code which computes an object's object_id
from its reference value.

    if (TYPE(obj) == T_SYMBOL) {
        return (SYM2ID(obj) * sizeof(RVALUE) + (4 << 2)) | FIXNUM_FLAG;
    }

where SYM2ID is a C macro which shifts the value right 8 bits.

And here's the code for Symbol#to_i:
static VALUE
sym_to_i(sym)
    VALUE sym;
{
    ID id = SYM2ID(sym);

    return LONG2FIX(id);
}
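
The arithmetic in the object_id snippet can be checked against the irb numbers quoted above. This is only a sketch: it assumes sizeof(RVALUE) is 20 bytes (a 32-bit build, which is what the quoted values are consistent with) and that a Fixnum VALUE is decoded by shifting right one bit. The constant and method names are made up:

```ruby
SIZEOF_RVALUE = 20  # assumption: 32-bit Ruby 1.8 build
FIXNUM_FLAG   = 1   # low bit that tags a VALUE as a Fixnum

# Mirrors the quoted C expression, then decodes the tagged Fixnum
# the way the interpreter would before showing it to Ruby code.
def symbol_object_id(id)
  value = (id * SIZEOF_RVALUE + (4 << 2)) | FIXNUM_FLAG
  value >> 1
end

# The ids are the Symbol#to_i results from the transcript.
[14817, 16081, 16089].each do |id|
  puts "#{id} -> #{symbol_object_id(id)}"
end
```

Under those assumptions the formula reduces to id * 10 + 8, which matches the pairs in the transcript (14817 and 148178, 16081 and 160818, 16089 and 160898).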

i.e. I don't think the symbol table maintains an explicit integer key for
each symbol.

Actually it does, based on having recently read the ruby 1.8.5 code.

It keeps two internal hashes, one maps the string representation to
the integer representation, and the other maps the other way around.

The code for String#to_sym basically does this:

It calls rb_intern to get the integer representation, called id, and
returns ID2SYM(id), which just returns id shifted left 8 bits; in other
words it's the inverse of SYM2ID.

rb_intern searches for the string in the symbol table and returns
the id found there if it finds it.

Otherwise, it calculates the integer representation by shifting the
next available id left by 3 bits and ORing in some flag bits which
depend on the contents of the string; for example, if the string starts
with a single "@" it's flagged as an instance variable name.

It then makes a copy of the string and does the equivalent of
    sym_table[stringcopy] = newly_computed_id
    sym_rev_table[newly_computed_id] = stringcopy

Although these two aren't Ruby hash objects but C hash tables.

FWIW, Ruby hash objects use the same C hash code internally.

What's interesting is that a reference to a symbol doesn't actually
point to an allocated object.
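
The two-table scheme described above can be modelled in plain Ruby. This is a toy sketch, not the interpreter's code: the class name, the flag values and the "@" test are assumptions chosen for illustration, and ordinary Ruby hashes stand in for the C hash tables:

```ruby
class ToySymbolTable
  SCOPE_SHIFT   = 3      # "shifting the next available id left by 3 bits"
  FLAG_PLAIN    = 0b000  # made-up flag values, not Ruby's real ones
  FLAG_INSTANCE = 0b001

  def initialize
    @sym_table     = {}  # string -> id
    @sym_rev_table = {}  # id -> string
    @next_id       = 1
  end

  # Returns the existing id for str, or allocates and records a new one.
  def intern(str)
    return @sym_table[str] if @sym_table.key?(str)

    # Flag names that start with a single "@" as instance variable names.
    flags = str.match?(/\A@[^@]/) ? FLAG_INSTANCE : FLAG_PLAIN
    id = (@next_id << SCOPE_SHIFT) | flags
    @next_id += 1

    copy = str.dup.freeze          # the table keeps its own copy
    @sym_table[copy]   = id
    @sym_rev_table[id] = copy
    id
  end

  # Reverse lookup: id back to its string.
  def name(id)
    @sym_rev_table[id]
  end
end

table = ToySymbolTable.new
foo_id = table.intern("foo")
puts foo_id == table.intern("foo")  # same string, same id
puts table.name(foo_id)
```

Interning the same characters twice hits the first table and returns the stored id, which is why every occurrence of a symbol ends up with the same identity.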

···

On 5/15/07, Brian Candler <B.Candler@pobox.com> wrote:

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

No, that's not exactly what I meant, but sorry for not being more precise.
What I meant was: there is only ever one symbol object in existence for a
particular sequence of characters. :foo.object_id in one part of the program
is always the same as :foo.object_id elsewhere.

If it were Symbol.new("foo") always returning the same object then I guess
it would probably be called the multiton pattern.
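
For comparison, a minimal multiton sketch in Ruby. InternedName and its get method are hypothetical names; the point is only that one instance exists per distinct sequence of characters:

```ruby
class InternedName
  @instances = {}   # characters -> the one instance for them

  # The only way to obtain an instance: look it up, creating it on first use.
  def self.get(chars)
    @instances[chars] ||= new(chars.dup.freeze)
  end

  # Hide the normal constructor so callers cannot bypass the registry.
  private_class_method :new

  attr_reader :chars

  def initialize(chars)
    @chars = chars
  end
end

a = InternedName.get("foo")
b = InternedName.get("foo")
puts a.equal?(b)                         # same characters, same object
puts a.equal?(InternedName.get("bar"))   # different characters
```

This sketch is not thread-safe and never releases entries, so it only illustrates the pattern.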

Regards,

Brian.

···

On Wed, May 16, 2007 at 12:23:09AM +0900, Gary Wright wrote:

On May 15, 2007, at 10:53 AM, Brian Candler wrote:

>On Tue, May 15, 2007 at 10:54:05PM +0900, Robert Dober wrote:
>>On 5/15/07, Brian Candler <B.Candler@pobox.com> wrote:
>><snip>
>>>But then what you want are not symbols, but true immutable
>>>strings. By that
>>>I mean: some object where I can write 10MB of binary dump. If I
>>>want to add
>>>one character to the end of it, then I create another object
>>>containing
>>>10MB+1byte of binary dump, and the old 10MB object is garbage-
>>>collected.
>>But of course we have immutable strings already :)))
>>
>>class IString < String
>> def initialize str
>> super(str)
>> freeze
>> end
>>end
>
>Yes, but it's not a singleton.

You've stated or implied a couple of times in this discussion that
symbols are 'singletons', but I thought the conventional definition
of 'singleton' was of a class with only a single instance, where the
instance is called a singleton. That doesn't describe Ruby's symbols.

I think what you are getting at is the idea that identity and
equality are one and the same for symbols.

<snip>

> frozen is frozen forever.

Correct. And since #frozen? is defined in Kernel you can skip the first
test.

No, you are an optimist Robert :wink:

irb(main):003:0> Kernel.send :remove_method, :frozen?
=> Kernel
irb(main):004:0> "a".frozen?
NoMethodError: undefined method `frozen?' for "a":String
        from (irb):4

But maybe we should not worry too much about that kind of meta-hackery
in our design, because one could trick us anyway, e.g.

class String; def frozen?; true end end

So you are right after all :wink:

Cheers
Robert

···

On 5/15/07, Robert Klemme <shortcutter@googlemail.com> wrote:

<snip>

>> > You see things; and you say Why?
>> > But I dream things that never were; and I say Why not?
>> > -- George Bernard Shaw
>>
>> Greetings to George, btw. :slight_smile:
> Well last time I met him he was admiring your posts to the list :stuck_out_tongue:

Wow! So he didn't die but just went home like this other guy who
invented a vi clone (or at least provided his name for the operation)... :slight_smile:

Your conclusions are jumped :wink:
But sure would have liked to talk to this guy. As to Gödel or
Hemingway, well maybe I am OT *now*.

<snip>

While we're at it: *if* you want to define something (and are a fan of
C++) you can do this:

irb(main):001:0> module Kernel
irb(main):002:1> private
irb(main):003:1> def const(*a) a.each {|x| x.freeze } end
irb(main):004:1> end

hey that is quite nice!!!
<snip>

···

On 5/15/07, Robert Klemme <shortcutter@googlemail.com> wrote:

Hihi...

Kind regards

        robert

--
You see things; and you say Why?
But I dream things that never were; and I say Why not?
-- George Bernard Shaw