Symbols and Strings

Hello, all...

A modest proposal here... I think (tentatively) that Symbol should inherit
from String, and a symbol should basically be an immutable string.

Reply here and/or on my blog as you see fit: http://rubyhacker.wordpress.com

Cheers,
Hal Fulton

This happened (see thread at [ruby-core:9188]) but was reverted. I can't think of the proper keywords to find the reason for the reversion.

···

On Jun 18, 2012, at 15:09, Hal Fulton wrote:

Hello, all...

A modest proposal here... I think (tentatively) that Symbol should inherit
from String, and a symbol should basically be an immutable string.

Reply here and/or on my blog as you see fit: http://rubyhacker.wordpress.com

I've liked the distinction between symbols and strings since I first
encountered it in LISP. Symbols to me are abstract entities which
happen to be typically represented with characters, while strings are
explicitly series of characters. With this perspective, it makes sense
that "b" > "a", but :b > :a gives an error, and :a + :b gives an error.
If symbols became immutable strings, this distinction would be lost.
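
The comparison above can be checked with a minimal sketch. One caveat comes from Ruby itself, not from this thread: on Ruby 1.9 and later Symbol includes Comparable, so `:b > :a` no longer raises there, while `:a + :b` still does. The `outcome` helper is a hypothetical name used only for this illustration.

```ruby
# Run a block and report either its value or the class of error it raised.
# (`outcome` is a hypothetical helper for this example only.)
def outcome
  yield
rescue NoMethodError => e
  e.class
end

string_compare = outcome { "b" > "a" }   # strings compare character by character
string_concat  = outcome { "a" + "b" }   # strings concatenate
symbol_compare = outcome { :b > :a }     # defined through Comparable on Ruby 1.9+
symbol_concat  = outcome { :a + :b }     # Symbol has no + method

puts "\"b\" > \"a\" => #{string_compare.inspect}"
puts "\"a\" + \"b\" => #{string_concat.inspect}"
puts ":b > :a   => #{symbol_compare.inspect}"
puts ":a + :b   => #{symbol_concat.inspect}"
```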

But in the end, the proof is in the code. I suspect functionally
there'd not be much difference, but I'm just a beginner.

···

--
Posted via http://www.ruby-forum.com/.

A discussion on this just came up on Stack Overflow:

···

On Mon, Jun 18, 2012 at 5:09 PM, Hal Fulton <rubyhacker@gmail.com> wrote:

Hello, all...

A modest proposal here... I think (tentatively) that Symbol should inherit
from String, and a symbol should basically be an immutable string.

Reply here and/or on my blog as you see fit: http://rubyhacker.wordpress.com

Cheers,
Hal Fulton

Silly of me. I do not recall this at all, and even believed it was an
original idea...

Those who cannot remember history are condemned to Google it...

Hal

···

On Mon, Jun 18, 2012 at 5:31 PM, Eric Hodel <drbrain@segment7.net> wrote:

On Jun 18, 2012, at 15:09, Hal Fulton wrote:
> Hello, all...
>
> A modest proposal here... I think (tentatively) that Symbol should inherit
> from String, and a symbol should basically be an immutable string.
>
> Reply here and/or on my blog as you see fit: http://rubyhacker.wordpress.com

This happened (see thread at [ruby-core:9188]) but was reverted. I can't
think of the proper keywords to find the reason for the reversion.

If one has Symbol inherit from String then the "is a" relationship is
violated. You cannot use the subclass wherever you can use the superclass
(e.g. try to append to a frozen String). See David's remark
http://ruby.11.n6.nabble.com/Bikeshed-No-more-Symbol-String-tp3558662p3558699.html
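
The substitution problem can be sketched as follows. This is only an illustration under assumptions: `shout` is a hypothetical method written against String's public contract, and the rescue relies on FrozenError (newer Rubies) being a subclass of RuntimeError (which older Rubies raise directly).

```ruby
# A method written against String's public contract: it appends in place.
# (`shout` is a hypothetical example method.)
def shout(str)
  str << "!"
end

mutable = String.new("hello")          # explicitly unfrozen String
shout(mutable)
puts mutable                           # the append succeeded

frozen = String.new("hello").freeze
begin
  shout(frozen)
  frozen_result = :appended
rescue RuntimeError => e               # FrozenError is a RuntimeError subclass
  frozen_result = e.class
end
puts frozen_result

# A Symbol does not offer the method at all, whatever its state.
symbol_has_append = :hello.respond_to?(:<<)
puts symbol_has_append
```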

Cheers

robert

···

On Tue, Jun 19, 2012 at 12:31 AM, Eric Hodel <drbrain@segment7.net> wrote:

This happened (see thread at [ruby-core:9188]) but was reverted. I can't think of the proper keywords to find the reason for the reversion.

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

I'm pretty sure I'm not a beginner (although programming changes so
fast it's almost impossible to be old-hat in anything relevant) but I
agree pretty much in whole with what you're saying. In a
compiled/pre-parsed/JIT environment there _could_ be a dramatic
difference between symbols and immutable strings – symbols could
easily be replaced with some enumerator type, or integers, or even
completely optimised away, depending on context.

To my way of thinking symbols, when they're called symbols, are
"valueless data" – their worth is in their name. Immutable strings
_do_ contain valuable data, even if you're not allowed to modify it.
(If that wasn't the case, Java would be an even harder place to get
anything useful done.)

By extension I'd argue that a symbol is atomic; the whole name is
valuable, but no part of it is. As such an .each_char iterator could
make perfect sense for an immutable string, but not for a symbol.

I always thought the :"foo" syntax was handy (for :"foo#{bar}baz"
cases), but it only ever served to confuse the issue for me.
Personally I tend to use String#to_sym for those cases that I want to
dynamically generate a symbol; it's an explicit cast, and provides a
clear boundary between "creating the name" and then "using it". Note:
it may be more or less optimal at runtime to do it this way, but I
don't care; I'm after maintainability and easing my own understanding
here.
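
For what it's worth, the two spellings can be compared side by side. A small sketch, with `bar` as a made-up value; it also shows the each_char point from the previous paragraph.

```ruby
bar = "_middle_"                        # made-up value for illustration

literal_form = :"foo#{bar}baz"          # interpolated symbol literal
cast_form    = "foo#{bar}baz".to_sym    # build the name first, then cast

same_value  = (literal_form == cast_form)
same_object = literal_form.equal?(cast_form)   # one Symbol object per name
puts same_value
puts same_object

# Character iteration exists on String but not on Symbol.
chars = "foo".each_char.to_a
symbol_iterates = :foo.respond_to?(:each_char)
puts chars.inspect
puts symbol_iterates
```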

I should get back to work.

···

On 20 June 2012 02:58, Dan Connelly <lists@ruby-forum.com> wrote:

I've liked the distinction between symbols and strings since I first
encountered it in LISP. Symbols to me are abstract entities which
happen to be typically represented with characters, while strings are
explicitly series of characters. With this perspective, it makes sense
that "b" > "a", but :b > :a gives an error, and :a + :b gives an error.
If symbols became immutable strings, this distinction would be lost.

But in the end, the proof is in the code. I suspect functionally
there'd not be much difference, but I'm just a beginner.

--
Matthew Kerwin, B.Sc (CompSci) (Hons)
http://matthew.kerwin.net.au/
ABN: 59-013-727-651

"You'll never find a programming language that frees
you from the burden of clarifying your ideas." - xkcd

For the record, this:
<http://ruby.11.n6.nabble.com/Bikeshed-No-more-Symbol-lt-String-td3558662.html>

···

On 19 June 2012 09:00, Hal Fulton <rubyhacker@gmail.com> wrote:

Silly of me. I do not recall this at all, and even believed it was an
original idea...

Those who cannot remember history are condemned to Google it...

Hal

On Mon, Jun 18, 2012 at 5:31 PM, Eric Hodel <drbrain@segment7.net> wrote:

On Jun 18, 2012, at 15:09, Hal Fulton wrote:
> Hello, all...
>
> A modest proposal here... I think (tentatively) that Symbol should inherit
> from String, and a symbol should basically be an immutable string.
>
> Reply here and/or on my blog as you see fit: http://rubyhacker.wordpress.com

This happened (see thread at [ruby-core:9188]) but was reverted. I can't
think of the proper keywords to find the reason for the reversion.

Personally, I don't have a problem with "reducing the contract"
of a String. Freezing an object also reduces its contract.

Hal

···

On Tue, Jun 19, 2012 at 8:35 AM, Robert Klemme <shortcutter@googlemail.com> wrote:

On Tue, Jun 19, 2012 at 12:31 AM, Eric Hodel <drbrain@segment7.net> wrote:
> This happened (see thread at [ruby-core:9188]) but was reverted. I
> can't think of the proper keywords to find the reason for the reversion.

If one has Symbol inherit from String then the "is a" relationship is
violated. You cannot use the subclass wherever you can use the superclass
(e.g. try to append to a frozen String). See David's remark

http://ruby.11.n6.nabble.com/Bikeshed-No-more-Symbol-String-tp3558662p3558699.html

Cheers

robert

Personally, I don't have a problem with "reducing the contract"
of a String.

It's just that Ruby is not particularly suited to using inheritance in
different ways. In Eiffel you can do all this and do it visibly. In
Ruby there's just mixing in modules and class inheritance.

Freezing an object also reduces its contract.

Kind of. But I consider that a special case because freeze prevents
all mutations but not other operations. It also does not restrict the
range of valid state - it just freezes it.

Kind regards

robert

···

On Tue, Jun 19, 2012 at 5:17 PM, Hal Fulton <rubyhacker@gmail.com> wrote:

> Freezing an object also reduces its contract.

Kind of. But I consider that a special case because freeze prevents
all mutations but not other operations. It also does not restrict the
range of valid state - it just freezes it.

Hmm. If a Symbol is-a frozen String, how does that reduce the range
of valid state any more than a "real" frozen String? What operations
other than mutations are prohibited?

Hal

> Freezing an object also reduces its contract.

Kind of. But I consider that a special case because freeze prevents
all mutations but not other operations. It also does not restrict the
range of valid state - it just freezes it.

Hmm. If a Symbol is-a frozen String, how does that reduce the range
of valid state any more than a "real" frozen String?

Freezing does _not_ reduce the state, which is what I said above.
With inheritance Symbol is-a String - not "frozen String". If at all
the inheritance would be the other way round: a String is-a Symbol and
it would extend the contract by mutating methods.

What operations
other than mutations are prohibited?

For example: String has method << which is part of the public contract
and that will be broken by _all_ Symbol instances. So you give
someone something and say "this is a String" but in reality it is not
because all instances lack much of the functionality of String.

Whereas with actual Strings that may be frozen, << works most of the
time, and there are just some instances which are in a state (frozen)
which does not allow successful execution of the method.
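
The difference between "a state some instances are in" and "a method every instance lacks" can be sketched like this. It is only an illustration; the variable names are invented for the example.

```ruby
original = String.new("abc").freeze
copy = original.dup                 # dup does not carry over the frozen state
copy << "d"                         # so the copy accepts << again

original_frozen = original.frozen?
copy_frozen     = copy.frozen?
puts original_frozen
puts copy_frozen
puts copy

# By contrast, no Symbol instance responds to the mutating method at all.
symbols = [:abc, :"with space", "made_at_runtime".to_sym]
any_appendable = symbols.any? { |s| s.respond_to?(:<<) }
puts any_appendable
```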

Kind regards

robert

···

On Tue, Jun 19, 2012 at 10:12 PM, Hal Fulton <rubyhacker@gmail.com> wrote:

This is a very good point. String should be a sub class of Symbol. I wonder if this was considered.

Henry

···

On 20/06/2012, at 7:12 PM, Robert Klemme wrote:

If at all
the inheritance would be the other way round: a String is-a Symbol and
it would extend the contract by mutating methods.

This is an extremely bad idea.

Symbol is just that – a symbol. Internally it is stored as a *number*
and in C code passed by-value, like Fixnum, instead of by-ref, like
all other objects. It is only shown as a bit of text for display.

Symbols are not immutable strings, and Strings are not mutable
symbols. You should not confuse the two, and they should not inherit
from each other. They are extremely different entities. Think of
symbols as a crossover between enums, magic numeric constants and – as
the last place – strings.
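
Leaving the internal representation aside (it is an implementation detail and may differ between Ruby versions), the effect visible from Ruby code is object identity. A minimal sketch, with invented variable names:

```ruby
sym_a = :status
sym_b = "status".to_sym             # same name, arrived at differently
str_a = String.new("status")
str_b = String.new("status")

same_symbol_object = sym_a.equal?(sym_b)   # one object per symbol name
same_string_object = str_a.equal?(str_b)   # two separate String objects
equal_string_value = (str_a == str_b)      # equal contents all the same

puts same_symbol_object
puts same_string_object
puts equal_string_value
```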

-- Matma Rex

···

2012/6/20 Henry Maddocks <hmaddocks@me.com>:

On 20/06/2012, at 7:12 PM, Robert Klemme wrote:

If at all
the inheritance would be the other way round: a String is-a Symbol and
it would extend the contract by mutating methods.

This is a very good point. String should be a sub class of Symbol. I wonder if this was considered.

This is an extremely bad idea.

Well a lot of people, including Matz, disagree with you.

Symbol is just that – a symbol. Internally it is stored as a *number*
and in C code passed by-value, like Fixnum, instead of by-ref, like
all other objects. It is only shown as a bit of text for display.

The internal representation is of no concern to the programmer, only its utility.
There seems to be a desire to be able to use String and Symbol interchangeably, hence this discussion.

Henry

···

On 21/06/2012, at 8:50 AM, Bartosz Dziewoński wrote:

2012/6/20 Henry Maddocks <hmaddocks@me.com>:

On 20/06/2012, at 7:12 PM, Robert Klemme wrote:

This is an extremely bad idea.

Well a lot of people, including Matz, disagree with you.

I'd like to see where he explicitly disagrees with this.

The internal representation is of no concern to the programmer, only its utility.

Wrong. It's as if you said that a linked list and an array are the
same thing and the difference in the implementation is of no concern
to the programmer, since both can support the same interface. Ruby
Strings and Symbols are fundamentally different on every level (as I
explained), and should be used in different contexts, for both code
clarity and performance.

There seems to be a desire to be able to use String and Symbol interchangeably, hence this discussion.

If Symbol and String have the same function, then one of them should
probably be removed. (Except they don't, in most cases; in the few
cases where they do the programmer should probably suck it up, choose
one representation, stick to it and convert input data to it himself.)

-- Matma Rex

···

2012/6/20 Henry Maddocks <hmaddocks@me.com>:

On 21/06/2012, at 8:50 AM, Bartosz Dziewoński wrote:

Given that Strings and Symbols are different entities (as has been
discussed at length here and other places), it's not possible to expect
them to be interchangeable in *all* cases. This even holds true if one
class inherits from the other. After all, the point of descending one
class from another is to make something related but at least a little
different. Therefore, interchangeability must be limited to some extent
as long as Symbol and String are different things.

So then, in what cases would it be helpful to use them interchangeably?
Is it possible to define a common, useful interface that both Symbol
and String could implement? If so, perhaps a class or module could be
defined for that interface and incorporated into both classes as
appropriate.

From what I've seen, the most frequently voiced desire is to be able to
use apparently equivalent symbols and strings as fully equivalent hash
keys. What else is there?
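
The hash key case looks like this in practice. A minimal sketch: `symbolize_keys` here is a hypothetical plain-Ruby helper written for the example, showing one way to normalize keys at the boundary; it is not something proposed in this thread.

```ruby
options = { name: "Hal" }

by_symbol = options[:name]          # the key that was stored
by_string = options["name"]         # a different key as far as Hash is concerned
keys_equal = (:name == "name")

puts by_symbol.inspect
puts by_string.inspect
puts keys_equal

# Hypothetical helper: convert every key to a Symbol once, on input.
def symbolize_keys(hash)
  hash.each_with_object({}) { |(key, value), out| out[key.to_sym] = value }
end

mixed = { "name" => "Hal", :blog => "rubyhacker" }
normalized = symbolize_keys(mixed)
puts normalized.inspect
```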

-Jeremy

···

On 06/20/2012 03:58 PM, Henry Maddocks wrote:

On 21/06/2012, at 8:50 AM, Bartosz Dziewoński wrote:

Symbol is just that – a symbol. Internally it is stored as a *number*
and in C code passed by-value, like Fixnum, instead of by-ref, like
all other objects. It is only shown as a bit of text for display.

The internal representation is of no concern to the programmer, only its utility.
There seems to be a desire to be able to use String and Symbol interchangeably, hence this discussion.

I suppose that's the biggest area of concern for me.

I do sometimes want to perform stringlike operations, e.g., on the parameter
into method_missing, but that merely requires a single to_s. Likewise I
sometimes want to add an equal sign onto a symbol in metaprogramming:

    (name.to_s << "=").to_sym
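
Both uses can be put together in a rough sketch. Everything here is hypothetical and for illustration only: the `Settings` class and the `setter_name` helper are invented names, and the helper builds the name by interpolation instead of an in-place append, which is purely a stylistic choice.

```ruby
# Hypothetical class: stores any attribute assigned to it.
class Settings
  def initialize
    @values = {}
  end

  def method_missing(name, *args)
    text = name.to_s                          # the single to_s mentioned above
    if text.end_with?("=")
      @values[text.chomp("=").to_sym] = args.first
    elsif @values.key?(name)
      @values[name]
    else
      super                                   # unknown reader: usual NoMethodError
    end
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.end_with?("=") || @values.key?(name) || super
  end
end

# Hypothetical helper: turn :color into :color=
def setter_name(name)
  "#{name}=".to_sym
end

settings = Settings.new
settings.color = "red"
settings.public_send(setter_name(:size), 3)
puts settings.color
puts settings.size
```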

Hal

···

On Wed, Jun 20, 2012 at 5:06 PM, Jeremy Bopp <jeremy@bopp.net> wrote:

So then, in what cases would it be helpful to use them interchangeably?
Is it possible to define a common, useful interface that both Symbol
and String could implement? If so, perhaps a class or module could be
defined for that interface and incorporated into both classes as
appropriate.

From what I've seen, the most frequently voiced desire is to be able to
use apparently equivalent symbols and strings as fully equivalent hash
keys. What else is there?

Even that would not be solved by inheritance because instances of
different classes can never be equivalent.

Kind regards

robert

PS: I said "If at all..." - so I am not a fan of String as subclass of
Symbol either.

···

On Thu, Jun 21, 2012 at 12:25 AM, Hal Fulton <rubyhacker@gmail.com> wrote:

On Wed, Jun 20, 2012 at 5:06 PM, Jeremy Bopp <jeremy@bopp.net> wrote:

So then, in what cases would it be helpful to use them interchangeably?
Is it possible to define a common, useful interface that both Symbol
and String could implement? If so, perhaps a class or module could be
defined for that interface and incorporated into both classes as
appropriate.

From what I've seen, the most frequently voiced desire is to be able to
use apparently equivalent symbols and strings as fully equivalent hash
keys.

With regard to metaprogramming, perhaps the problem could be better
solved by allowing the things that currently consume only symbols to
also consume strings or really anything that provides a to_sym method.
That would also better follow the notion of duck typing. I'm not sure
if this would help you in your particular case very much though.

I haven't heard any arguments for or against such an approach before.
As long as the related documentation was clear about the requirement of
a to_sym implementation that returns a Symbol instance, this wouldn't
appear to be a problem right off (... he says naively...).
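
A rough sketch of what such a consumer might do. `coerce_name` and `ColumnName` are hypothetical names invented for this example; the check that to_sym really returns a Symbol mirrors the documentation requirement described above.

```ruby
# Hypothetical helper: accept anything that provides to_sym, and verify
# that the result really is a Symbol.
def coerce_name(name)
  unless name.respond_to?(:to_sym)
    raise TypeError, "#{name.inspect} does not respond to to_sym"
  end
  result = name.to_sym
  unless result.is_a?(Symbol)
    raise TypeError, "to_sym returned #{result.class}, expected Symbol"
  end
  result
end

# Hypothetical duck type: neither a String nor a Symbol, but it has to_sym.
ColumnName = Struct.new(:text) do
  def to_sym
    text.downcase.to_sym
  end
end

from_symbol = coerce_name(:email)
from_string = coerce_name("email")
from_duck   = coerce_name(ColumnName.new("Email"))
puts [from_symbol, from_string, from_duck].inspect

begin
  coerce_name(Object.new)
  rejected = false
rescue TypeError
  rejected = true                   # plain objects have no to_sym
end
puts rejected
```

Several core methods, send and define_method among them, already accept a String where a Symbol is expected, so part of this is existing behaviour.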

-Jeremy

···

On 06/20/2012 05:25 PM, Hal Fulton wrote:

On Wed, Jun 20, 2012 at 5:06 PM, Jeremy Bopp <jeremy@bopp.net> wrote:

    So then, in what cases would it be helpful to use them interchangeably?
    Is it possible to define a common, useful interface that both Symbol
    and String could implement? If so, perhaps a class or module could be
    defined for that interface and incorporated into both classes as
    appropriate.

    From what I've seen, the most frequently voiced desire is to be able to
    use apparently equivalent symbols and strings as fully equivalent hash
    keys. What else is there?

I suppose that's the biggest area of concern for me.

I do sometimes want to perform stringlike operations, e.g., on the parameter
into method_missing, but that merely requires a single to_s. Likewise I
sometimes want to add an equal sign onto a symbol in metaprogramming:

    (name.to_s << "=").to_sym