Hash Surprises with Fixnum, #hash, and #eql?

Clifford_Heath5 · 7 April 2011 04:05

Folk,

I have a class which delegates for Integer, and wants to behave as much
like a real Integer as possible (except for being able to be subclassed).
It *mostly* works... but falls foul of Ruby's various hacks, errors, and
internal optimisations in the Fixnum and Hash classes.

In particular, the Hash implementations work (and break!) differently in
MRI, Rubinius and JRuby. It's documented to use only #hash and #eql?,
but that's not always true (sometimes these have hard-wired optimisations).

The Hash documentation does not say whether #eql? will be called only on
items in the hash, or only on keys being used to probe the hash. It should
be one or the other, since a.eql?(b) might not always mean b.eql?(a).
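
A minimal sketch of the asymmetry in question. The `Int` class here is a hypothetical stand-in, not the code from the gist, and it assumes a Ruby where plain integers are instances of Integer. The two direct `eql?` calls are well defined; the two Hash lookups at the end are printed without a claimed result, because that depends on which side the implementation calls `eql?` on.

```ruby
# Hypothetical stand-in for the wrapper class; not the code from the gist.
class Int
  def initialize(v)
    @v = v.to_i
  end

  def to_i
    @v
  end

  # Reuse the wrapped integer's hash value.
  def hash
    @v.hash
  end

  # Accepts another Int or a plain Integer with the same value.
  def eql?(other)
    (other.is_a?(Int) || other.is_a?(Integer)) && @v == other.to_i
  end
  alias == eql?
end

a = Int.new(1)
p a.eql?(1)   # true: Int#eql? unwraps its argument
p 1.eql?(a)   # false: Integer#eql? only accepts another Integer

# Whether these succeed depends on which side the Hash calls eql? on.
p({ a => :found }[1])
p({ 1 => :found }[a])
```

Looking up an `Int` key with another `Int` of the same value works either way, since `eql?` is symmetric between two `Int` instances.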

Please peruse this code: <https://gist.github.com/906998>, try it on the
various Ruby versions, and also try it with the Fixnum monkey-patches
removed.

You'll see that the behaviour is very unpredictable.

Clifford Heath.

Robert_K1 · 7 April 2011 09:19

I have a class which delegates for Integer, and wants to behave as much
like a real Integer as possible (except for being able to be subclassed).

There's still a lot missing for a number replacement. Please see
http://blog.rubybestpractices.com/posts/rklemme/019-Complete_Numeric_Class.html

I also doubt whether it is a good idea to allow for subclassing of an
integer like class. What use case do you have in mind which would
make this necessary?

It *mostly* works... but falls foul of Ruby's various hacks, errors, and
internal optimisations in the Fixnum and Hash classes.

In particular, the Hash implementations work (and break!) differently in
MRI, Rubinius and JRuby. It's documented to use only #hash and #eql?,
but that's not always true (sometimes these have hard-wired optimisations).

When you violate contracts you cannot expect code to work properly.

The Hash documentation does not say whether #eql? will be called only on
items in the hash, or only on keys being used to probe the hash. It should
be one or the other, since a.eql?(b) might not always mean b.eql?(a).

But that is the contract as far as I can see. Having different
results for both violates the equivalence relation which means all
bets are off.

Please peruse this code: <https://gist.github.com/906998>, try it on the
various Ruby versions, and also try it with the Fixnum monkey-patches
removed.

You'll see that the behaviour is very unpredictable.

Yes, because of your violation of the contract. You have there a nice
demonstration why it is a bad idea most of the time to fiddle with
core class method implementations. They are used everywhere and you
cannot foresee the effects of changing their implementation on other
code.

Kind regards

robert

···

On Thu, Apr 7, 2011 at 6:05 AM, Clifford Heath <no@spam.please.net> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Mazi_Ayisigi · 8 April 2011 00:22

Folk,

I have a class which delegates for Integer, and wants to behave as much like a real Integer as possible (except for being able to be subclassed). It *mostly* works... but falls foul of Ruby's various hacks, errors, and internal optimisations in the Fixnum and Hash classes.

In particular, the Hash implementations work (and break!) differently in MRI, Rubinius and JRuby. It's documented to use only #hash and #eql?, but that's not always true (sometimes these have hard-wired optimisations).

The Hash documentation does not say whether #eql? will be called only on items in the hash, or only on keys being used to probe the hash. It should be one or the other, since a.eql?(b) might not always mean b.eql?(a).

Please peruse this code: <https://gist.github.com/906998>, try it on the various Ruby versions, and also try it with the Fixnum monkey-patches removed.

You'll see that the behaviour is very unpredictable.

Clifford Heath.

···

On Apr 7, 7:01 am, Clifford Heath <n...@spam.please.net> wrote:

Phil · 7 April 2011 11:20

Complex numbers come to mind:

3 - 2j [+|*] 45^(j * e * 44°).

Very different semantics for addition and multiplication of those than
for your normal space numbers, including conversion from Cartesian to
polar form. Bit of a textbook case for the benefits of inheritance and
function overloading.

It'd be better to have those be a sub-class of Float, though.

···

On Thu, Apr 7, 2011 at 11:19 AM, Robert Klemme <shortcutter@googlemail.com> wrote:

I also doubt whether it is a good idea to allow for subclassing of an
integer like class. What use case do you have in mind which would
make this necessary?

--
Phillip Gawlowski

Though the folk I have met,
(Ah, how soon!) they forget
When I've moved on to some other place,
There may be one or two,
When I've played and passed through,
Who'll remember my song or my face.

Clifford_Heath5 · 8 April 2011 07:30

I have a class which delegates for Integer, and wants to behave as much
like a real Integer as possible (except for being able to be subclassed).

There's still a lot missing for a number replacement. Please see
http://blog.rubybestpractices.com/posts/rklemme/019-Complete_Numeric_Class.html

Yes, you wrote that about the time we discussed it last time.

I also doubt whether it is a good idea to allow for subclassing of an
integer like class. What use case do you have in mind which would
make this necessary?

What's wrong with the case you use in that blog post? But as it turns out,
I'm implementing a fact-based modeling DSL, where it's sensible to have
classes like "AgeInYears" being a subclass of an integer like class.
The formalism for this comes directly from sorted first-order logic,
which makes a good deal more sense than the broken O-O paradigm discussed
elsewhere in this thread.

I suspect that you "doubt it is a good idea" only because Ruby's object
model for numbers is inconsistent, and you're defensive about that. Not
because Ruby 2.0 shouldn't move in the direction of fixing it, where
possible. (BTW, I tried to join Ruby Core to discuss this, but all
possible means of subscription are silently failing me).

Note that I'm not actually subclassing any core integer class. I'm just
defining a new base class "Int" which contains an integer, and so far
as is possible, acts like one, including being found in a Hash using a
Fixnum/Bignum key.

If Fixnum and Bignum can act like Integer subclasses, why can't my class?

In particular, the Hash implementations work (and break!) differently in
MRI, Rubinius and JRuby. It's documented to use only #hash and #eql?,
but that's not always true (sometimes these have hard-wired optimisations).

When you violate contracts you cannot expect code to work properly.

I have not violated that (unstated!) contract. Read again; I redefine
Fixnum#eql? as self.orig_eql?(i.to_i) - the to_i makes it symmetrical.
(Debate the wisdom if you wish, it's just for demonstration purposes.)

However the Ruby interpreters do not honor that. In short, *all three*
mentioned Ruby interpreters violate the Hash contract, which states that
hash and eql? are used for Hash lookups. Not just sometimes, but all the
time, including for integers.

MRI uses a Fixnum as its own hash value, even if you've monkey-patched a
hash method into Fixnum. This optimisation should not be always-on. Instead,
Ruby should detect when Fixnum has been patched, and bypass the optimisation.
That would require a single test and branch, with insignificant impact on
performance. MRI does however use a monkey-patched Fixnum#eql? method.

Rubinius does the opposite. It calls a patched Fixnum#hash, but not Fixnum#eql?

JRuby calls neither.
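
One way to check what a given interpreter does is to record the dispatches. This is only a sketch under assumptions: it uses `Module#prepend` instead of the alias-based patch in the gist, the names `IntegerSpy` and `CALLS` are made up for illustration, and it patches Integer because Fixnum and Bignum were unified into Integer in Ruby 2.4. The recorded list will differ between implementations and versions.

```ruby
CALLS = []

# Records dispatches to Integer#hash and Integer#eql?, then defers to the
# original implementations so that behaviour is otherwise unchanged.
module IntegerSpy
  def hash
    CALLS << :hash
    super
  end

  def eql?(other)
    CALLS << :eql?
    super
  end
end

Integer.prepend(IntegerSpy)

h = { 1 => :one, 2 => :two }
CALLS.clear
value = h[2]

p value        # :two
p CALLS.uniq   # which patched methods, if any, the Hash dispatched to
```

An empty list means the Hash bypassed both patched methods for this lookup; a list containing only one of the two names corresponds to the half-optimised behaviour described for MRI and Rubinius.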

The Ruby interpreters should behave the way the Hash documentation says they do.

The Ruby documentation should explicitly state that eql? must be defined
symmetrically, or should require that the Hash implementation uses it only
in a known direction or both.

The Hash documentation does not say whether #eql? will be called only on
items in the hash, or only on keys being used to probe the hash. It should
be one or the other, since a.eql?(b) might not always mean b.eql?(a).

But that is the contract as far as I can see.

That's not documented anywhere I can see. Certainly not in TRPL, see sections
3.4.2 on page 68, and section 3.8.5.3 page 77. It makes sense, but it's not
stated.

Having different
results for both violates the equivalence relation which means all
bets are off.

No. I can fix the asymmetry. I can't make the interpreters honor that fix.

You'll see that the behaviour is very unpredictable.

Yes, because of your violation of the contract.

No. Because the Ruby interpreters don't honor the Hash contract.

Please try to read more carefully.

Clifford Heath.

···

On 04/07/11 19:19, Robert Klemme wrote:

On Thu, Apr 7, 2011 at 6:05 AM, Clifford Heath<no@spam.please.net> wrote:

Robert_K1 · 7 April 2011 11:53

I also doubt whether it is a good idea to allow for subclassing of an
integer like class. What use case do you have in mind which would
make this necessary?

Complex numbers come to mind:

Why bother, it has been done already.

irb(main):006:0> x = Complex(0,-1)
=> (0-1i)
irb(main):007:0> x * x
=> (-1+0i)
irb(main):008:0> (x * x)+0
=> (-1+0i)
irb(main):009:0> (x * x).to_int
=> -1

And they do play nicely as ints - as long as it's possible:

irb(main):011:0> %w{foo bar baz}[x*x]
=> "baz"
irb(main):012:0> %w{foo bar baz}[x]
RangeError: can't convert 0-1i into Integer
        from (irb):12:in `to_i'
        from (irb):12:in `to_int'
        from (irb):12:in `[]'
        from (irb):12
        from /opt/bin/irb19:12:in `<main>'

3 - 2j [+|*] 45^(j * e * 44°).

Very different semantics for addition and multiplication of those than
for your normal space numbers, including conversion from Cartesian to
polar form. Bit of a textbook case for the benefits of inheritance and
function overloading.

It'd be better to have those be a sub-class of Float, though.

Actually it's Numeric which is correct because not every Complex _is a_ Float!

irb(main):010:0> Complex.ancestors
=> [Complex, Numeric, Comparable, Object, Kernel, BasicObject]

Cheers

robert

···

On Thu, Apr 7, 2011 at 1:20 PM, Phillip Gawlowski <cmdjackryan@googlemail.com> wrote:

On Thu, Apr 7, 2011 at 11:19 AM, Robert Klemme <shortcutter@googlemail.com> wrote:

Robert_K1 · 8 April 2011 10:12

I also doubt whether it is a good idea to allow for subclassing of an
integer like class. What use case do you have in mind which would
make this necessary?

What's wrong with the case you use in that blog post?

You mean, make HexNum a subclass of Integer? Yes, actually that's
what I had attempted at the time but failed for technical reasons
(explained in the blog). As it turns out it's generally not necessary
to inherit Integer in Ruby to create a class which behaves like an
integer (most of the time).

But as it turns out,
I'm implementing a fact-based modeling DSL, where it's sensible to have
classes like "AgeInYears" being a subclass of an integer like class.
The formalism for this comes directly from sorted first-order logic,
which makes a good deal more sense than the broken O-O paradigm discussed
elsewhere in this thread.

I suspect that you "doubt it is a good idea" only because Ruby's object
model for numbers is inconsistent, and you're defensive about that.

Where exactly do you see the inconsistency? I can see that a few
things in that area do not match common expectations. But I don't
think it's really inconsistent.

Your problem is not so much with numeric classes IMHO but rather with
implementations of class Hash in different versions of Ruby. Namely,
they have issues treating instances of different classes as
equivalent.

Note that I'm not actually subclassing any core integer class. I'm just
defining a new base class "Int" which contains an integer, and so far
as is possible, acts like one, including being found in a Hash using a
Fixnum/Bignum key.

If Fixnum and Bignum can act like Integer subclasses, why can't my class?

Fixnum and Bignum do not share common values so you never have
instances of different classes representing the same numeric integer
value:

irb(main):003:0> (1<<100).class
=> Bignum
irb(main):004:0> (1<<100)>>99
=> 2
irb(main):005:0> ((1<<100)>>99).class
=> Fixnum

So that situation is a bit different.

In particular, the Hash implementations work (and break!) differently in
MRI, Rubinius and JRuby. It's documented to use only #hash and #eql?,
but that's not always true (sometimes these have hard-wired optimisations).

When you violate contracts you cannot expect code to work properly.

I have not violated that (unstated!) contract. Read again; I redefine
Fixnum#eql? as self.orig_eql?(i.to_i) - the to_i makes it symmetrical.
(Debate the wisdom if you wish, it's just for demonstration purposes.)

Yes, you're right. I probably mixed in a discussion about equals() in
Java needing to test for the same class (and not instanceof) to
achieve real equivalence. At least we had a nice discussion about OO
and inheritance because of that.

However the Ruby interpreters do not honor that. In short, *all three*
mentioned Ruby interpreters violate the Hash contract, which states that
hash and eql? are used for Hash lookups. Not just sometimes, but all the
time, including for integers.

Apparently there are optimizations done under the hood (similarly to
duping an unfrozen String as key) which is probably OK from a
pragmatic point of view (what you attempt seems rather seldom done).

The Ruby interpreters should behave the way the Hash documentation says they do.

Well, they do - most of the time.

The Ruby documentation should explicitly state that eql? must be defined
symmetrically, or should require that the Hash implementation uses it only
in a known direction or both.

Right, there is certainly room for improvement.

The Hash documentation does not say whether #eql? will be called only on
items in the hash, or only on keys being used to probe the hash. It should
be one or the other, since a.eql?(b) might not always mean b.eql?(a).

But that is the contract as far as I can see.

That's not documented anywhere I can see. Certainly not in TRPL, see sections
3.4.2 on page 68, and section 3.8.5.3 page 77. It makes sense, but it's not
stated.

Right again. Maybe the requirement can be inferred from other
properties but it would certainly make sense to stress it.

Having different
results for both violates the equivalence relation which means all
bets are off.

No. I can fix the asymmetry. I can't make the interpreters honor that fix.

But since you already embarked on monkey patching core classes you can
easily extend that a bit to Hash#[] and Hash#fetch. That should work
on all platforms. Another possible remedy would be to wrap Hash
instances in something else which adds logic to [] and fetch() to
convert types if necessary.
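
The wrapping remedy could look roughly like this. It is a sketch only: `Int` and `IntKeyHash` are hypothetical names, and the wrapper covers just `[]`, `[]=` and `fetch`, which is far from the full Hash interface.

```ruby
# Hypothetical integer wrapper, standing in for the "Int" class discussed above.
class Int
  def initialize(v)
    @v = v.to_i
  end

  def to_i
    @v
  end
end

# Hypothetical wrapper around a Hash that unwraps Int keys before every access.
class IntKeyHash
  def initialize
    @h = {}
  end

  def []=(key, value)
    @h[normalize(key)] = value
  end

  def [](key)
    @h[normalize(key)]
  end

  def fetch(key, *default, &block)
    @h.fetch(normalize(key), *default, &block)
  end

  private

  # Int keys are stored and probed as plain Integers; other keys pass through.
  def normalize(key)
    key.is_a?(Int) ? key.to_i : key
  end
end

table = IntKeyHash.new
table[Int.new(1)] = :one
table[2] = :two

p table[1]                        # :one
p table[Int.new(2)]               # :two
p table.fetch(Int.new(3), :none)  # :none
```

Because only plain Integers ever reach the underlying Hash, the result does not depend on how an interpreter optimises #hash and #eql? for integer keys.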

You'll see that the behaviour is very unpredictable.

Yes, because of your violation of the contract.

No. Because the Ruby interpreters don't honor the Hash contract.

As said, they do it most of the time. You introduced a corner case
here by fiddling with a core class, which is known to lead into deep
water. Treating instances of different classes as equivalent does work
for other classes:

12:07:22 Temp$ allruby ha.rb
CYGWIN_NT-5.1 padrklemme2 1.7.9(0.237/5/3) 2011-03-29 10:10 i686 Cygwin

···

On Fri, Apr 8, 2011 at 9:30 AM, Clifford Heath <no@spam.please.net> wrote:

On 04/07/11 19:19, Robert Klemme wrote:

On Thu, Apr 7, 2011 at 6:05 AM, Clifford Heath<no@spam.please.net> wrote:

========================================
ruby 1.8.7 (2008-08-11 patchlevel 72) [i386-cygwin]
[[#<B:0x7ff9faa4 @v=1>, 3, 1],
[#<A:0x7ff9fa90 @v=2>, 5, 2],
[1, 3, nil],
[2, 5, nil]]

ruby 1.9.2p180 (2011-02-18 revision 30909) [i386-cygwin]
[[#<B:0x1003a290 @v=1>, 129497438, 1],
[#<A:0x1003a27c @v=2>, 602294680, 2],
[1, 129497438, nil],
[2, 602294680, nil]]

jruby 1.6.0 (ruby 1.8.7 patchlevel 330) (2011-03-15 f3b6154) (Java
HotSpot(TM) Client VM 1.6.0_24) [Windows XP-x86-java]
[[#<B:0xfa8 @v=1>, 1, 1], [#<A:0xfb0 @v=2>, 2, 2], [1, 1, nil], [2, 2, nil]]

jruby 1.6.0 (ruby 1.9.2 patchlevel 136) (2011-03-15 f3b6154) (Java
HotSpot(TM) Client VM 1.6.0_24) [Windows XP-x86-java]
[[#<B:0x000000 @v=1>, 1, 1],
[#<A:0x000000 @v=2>, 2, 2],
[1, 1, nil],
[2, 2, nil]]

12:07:36 Temp$ cat ha.rb
require 'pp'

A, B = 2.times.map do
  Class.new do
    def initialize(x)
      @v = x.to_i
    end

    def to_i
      @v
    end

    def hash
      @v.hash
    end

    def eql? o
      case o
      when A, B
        @v == o.to_i
      end
    end

    alias == eql?
  end
end

h = {A.new(1) => 1, B.new(2) => 2}

keys = [B.new(1), A.new(2), 1, 2]

pp keys.map {|k| [k, k.hash, h[k]]}

12:07:40 Temp$

Please try to read more carefully.

Will do.

Cheers

robert

Xavier_Noria · 9 April 2011 00:00

On 8 Apr 2011, at 09:30, Clifford Heath <no@spam.please.net> wrote:

I suspect that you "doubt it is a good idea" only because Ruby's object
model for numbers is inconsistent, and you're defensive about that. Not
because Ruby 2.0 shouldn't move in the direction of fixing it, where
possible. (BTW, I tried to join Ruby Core to discuss this, but all
possible means of subscription are silently failing me).

Just in case: that happened to me too. The problem was that all the confirmation mails were in the spam folder.

Phil · 7 April 2011 13:21

Unless you convert from polar form* to Cartesian form:

Let's take "45e^(j * 44°)":

32.37029101523930127102246035055203587038080515238845970760... +
31.25962667065487789953828347402088034639160866937838052053... i

From Wolfram|Alpha, for the query 45*e^(44°i).

* Which you are much more likely to encounter than the Cartesian form,
considering complex numbers are most useful when dealing with wave
forms.
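
The same conversion can be reproduced with Ruby's built-in `Complex.polar`, which takes a magnitude and an angle in radians. This is a small sketch; the variable names are illustrative and the results are rounded to four decimal places.

```ruby
theta = 44 * Math::PI / 180      # 44 degrees in radians
z = Complex.polar(45, theta)     # 45 * e^(i * 44°)

puts z.real.round(4)             # 32.3703
puts z.imaginary.round(4)        # 31.2596

# Converting back recovers the polar form (up to floating-point error).
magnitude, angle = z.polar
puts magnitude.round(4)
puts (angle * 180 / Math::PI).round(4)

p z.is_a?(Numeric)               # true
p z.is_a?(Float)                 # false
```

Both forms describe the same Complex object, whose real and imaginary parts are stored as Floats here.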

···

On Thu, Apr 7, 2011 at 1:53 PM, Robert Klemme <shortcutter@googlemail.com> wrote:

Actually it's Numeric which is correct because not every Complex _is a_ Float!

Clifford_Heath5 · 9 April 2011 01:25

Thanks, that was the problem (/me hides face).
A few more attempts and I'm subscribed.

···

On 04/09/11 10:00, Xavier Noria wrote:

On 8 Apr 2011, at 09:30, Clifford Heath <no@spam.please.net> wrote:

(BTW, I tried to join Ruby Core to discuss this, but all
possible means of subscription are silently failing me).

Just in case: that happened to me too. The problem was that all the confirmation mails were in the spam folder.

Clifford_Heath5 · 9 April 2011 01:25

I also doubt whether it is a good idea to allow for subclassing of an
integer like class. What use case do you have in mind which would
make this necessary?

What's wrong with the case you use in that blog post?

You mean, make HexNum a subclass of Integer? Yes, actually that's
what I had attempted at the time but failed for technical reasons

No, I don't mean making HexNum a subclass of Integer, but making it an
"integer like class" which can be subclassed.

(explained in the blog). As it turns out it's generally not necessary
to inherit Integer in Ruby to create a class which behaves like an
integer (most of the time).

Right. I'd like to see that work *more* of the time :). Or at least,
that each Ruby interpreter should fail in the same way.

I suspect that you "doubt it is a good idea" only because Ruby's object
model for numbers is inconsistent, and you're defensive about that.

Where exactly do you see the inconsistency? I can see that a few
things in that area do not match common expectations. But I don't
think it's really inconsistent.

By inconsistent, I mean that Ruby doesn't make it possible to make
subclasses of Integer that play nicely with other Integers. Fixnum
and Bignum are mutually compatible and automatically and invisibly
convert back and forth, but it's not possible for a user's class to
do the same. That's inconsistent. A few more calls to coerce and some
more circumspect interpreter optimisations and it would all be pretty
ok.

Note that I expect there will still be a need for Java-style boxed
and unboxed integer values. C# makes the boxing even more transparent
than Java, but Ruby doesn't even try.

Your problem is not so much with numeric classes IMHO but rather with
implementations of class Hash in different versions of Ruby. Namely,
they have issues treating instances of different classes as
equivalent.

Yes. It's documented to use #hash and #eql?, so that's what it should do.
If it also has invisible optimisations, fine. So long as they're invisible.

Fixnum and Bignum do not share common values so you never have
instances of different classes representing the same numeric integer
value:

Yes. But I never need to know where the cut-over is, and it can be different
with different Ruby build targets. It's almost completely transparent.

Apparently there are optimizations done under the hood (similarly to
duping an unfrozen String as key)

Except that the case of String is documented, and works the same in all
interpreters.

which is probably OK from a
pragmatic point of view (what you attempt seems rather seldom done).

Mainly because it doesn't work

Clifford Heath.

···

On 04/08/11 20:12, Robert Klemme wrote:

On Fri, Apr 8, 2011 at 9:30 AM, Clifford Heath<no@spam.please.net> wrote:

On 04/07/11 19:19, Robert Klemme wrote:

On Thu, Apr 7, 2011 at 6:05 AM, Clifford Heath<no@spam.please.net> wrote:

Robert_K1 · 7 April 2011 13:43

My math is a bit rusty in that area, but I don't think your argument
holds: cartesian and polar are just two ways to represent a complex
number. But this does not change numeric properties of the class of
complex numbers. No matter what representation you use, (0+1i) is
neither a real number (what float conceptually models) nor a rational
number (what float technically implements). Instead, real and
rational numbers are both subsets of complex.

If you see inheritance as "is a" relationship then inheritance would
be Rational < Real < Complex but not the other way round. Otherwise
you cannot use a subclass instance everywhere you were using a
superclass instance.

Further links:

Kind regards

robert

···

On Thu, Apr 7, 2011 at 3:21 PM, Phillip Gawlowski <cmdjackryan@googlemail.com> wrote:

On Thu, Apr 7, 2011 at 1:53 PM, Robert Klemme <shortcutter@googlemail.com> wrote:

Actually it's Numeric which is correct because not every Complex _is a_ Float!

Unless you convert from polar form* to Cartesian form:

Let's take "45e^(j * 44°)":

32.37029101523930127102246035055203587038080515238845970760... +
31.25962667065487789953828347402088034639160866937838052053... i

From Wolfram|Alpha, for the query 45*e^(44°i).

* Which you are much more likely to encounter than the Cartesian form,
considering complex numbers are most useful when dealing with wave
forms.

Charles_Nutter · 11 April 2011 00:02

Top-replying with a general observation: you can't please everyone all the time.

The special-cased logic for Fixnums and Symbols in hashes is obviously
done for performance purposes. No matter what you do, checking for
method redefinitions every single time will have a performance impact.
Even checking an inline cache has an impact. When you look at how
frequently hashes are used with Fixnum or Symbol keys, you'd basically
be asking everyone to take a perf hit to do it the "right way" for a
tiny minority of use cases.

There are also plenty of other cases in all the implementations where
modifying critical core classes does not get reflected during
execution. For example, some impls treat operator calls against Fixnum
as always being the Fixnum version, regardless of modifications. This
allows using a fast type-identity check rather than a cache check or
class-modification check, and it can make a *huge* difference for raw
numeric performance.

In this case I think there's a fine line between consistency and
zealotry. The *vast* majority of Ruby users will never reopen and
modify Fixnum or Symbol, so it's a 99%-safe assumption that "fast"
logic for those types is just fine, especially if it's a noticeable
perf boost for the 99% of users. We're talking about the lowest-level
values in the system...if they can't be made fast, everything else
suffers.

JRuby follows MRI largely because of the perf improvement, but also
partially because MRI does it this way. If MRI always dispatched, we'd
do what we need to do to always dispatch (and we do have other ways
internally to reduce -- but not eliminate -- the modification check).

A side note on JRuby's optimization strategy over the years:

1. We find a largely-invariant piece of logic that could be optimized,
like fixnum operators or hashes of symbols
2. We come up with an optimization that may diverge slightly from
"pure" behavior and add an opt-in flag for that optimization
3. Based on user reports, test runs, and so on, we may eventually turn
the optimization on all the time and make the flag be opt-out

We've been more conservative than other impls, even.

- Charlie

···

On Fri, Apr 8, 2011 at 8:25 PM, Clifford Heath <no@spam.please.net> wrote:

On 04/08/11 20:12, Robert Klemme wrote:

On Fri, Apr 8, 2011 at 9:30 AM, Clifford Heath<no@spam.please.net> wrote:

On 04/07/11 19:19, Robert Klemme wrote:

On Thu, Apr 7, 2011 at 6:05 AM, Clifford Heath<no@spam.please.net> wrote:

Phil · 7 April 2011 14:02

My math is a bit rusty in that area, but I don't think your argument
holds: cartesian and polar are just two ways to represent a complex
number. But this does not change numeric properties of the class of
complex numbers. No matter what representation you use, (0+1i) is
neither a real number (what float conceptually models) nor a rational
number (what float technically implements). Instead, real and
rational numbers are both subsets of complex.

Given that there are infinitely more irrational than rational numbers,
it's much more common to represent a complex number with irrational
numbers than rational ones (i.e. floats instead of integers). Thus,
most (for want of a better word) real mathematical operations done
with complex numbers are done with irrational numbers. Cartesian and
polar forms make certain mathematical operations easier, but that's
more or less it (the truth is more complex, but I CBA to look into
transcendental numbers, Euler's number, &c.).

And yes, both rational and irrational numbers are subsets of complex,
obviously (with rational numbers being a subset of irrational numbers,
to simplify extremely).

If you see inheritance as "is a" relationship then inheritance would
be Rational < Real < Complex but not the other way round. Otherwise
you cannot use a subclass instance everywhere you were using a
superclass instance.

Though, does the "is a" relationship hold up? I think it's more of a
"kind of" relationship, where subsequent classes are defined in ever
more detail (so, you'd inherit Floats from Integers, and Complex from
Float).

Of course, the clean world of maths doesn't map 1:1 to computational
systems, so I see the value in both approaches.

But, frankly, given the differences and additional properties of
complex numbers, I'd derive it from Numeric as well, simply to limit
the side effects the other numeric classes introduce (Floats and their
CPU-internal representation give me nightmares :P).

···

On Thu, Apr 7, 2011 at 3:43 PM, Robert Klemme <shortcutter@googlemail.com> wrote:

Clifford_Heath5 · 11 April 2011 03:20

The special-cased logic for Fixnums and Symbols in hashes is obviously
done for performance purposes. No matter what you do, checking for
method redefinitions every single time will have a performance impact.

Yes.

Even checking an inline cache has an impact.

You are mixing up the situations where there is no sane
case for allowing modifications, from the many fewer
ones where there is. No sane person would want to change
the implementation of 1+1; but someone can and has
implemented Fixnum+Complex. That works because Fixnum
will always call coerce where needed, so there's no need
to guard the optimisations.
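
A small sketch of that coerce protocol, with a hypothetical `Int` wrapper as the user-defined class. When Integer#+ receives an argument it does not recognise, it calls `coerce` on that argument and retries the operation with the returned pair.

```ruby
# Hypothetical integer wrapper; only + is implemented for illustration.
class Int
  def initialize(v)
    @v = v.to_i
  end

  def to_i
    @v
  end

  def +(other)
    Int.new(@v + other.to_i)
  end

  # Called by Integer#+ with the Integer receiver as argument.
  # Returns [argument converted to Int, self], and + is retried on that pair.
  def coerce(other)
    [Int.new(other), self]
  end
end

sum = 1 + Int.new(2)   # Integer#+ -> Int#coerce(1) -> Int(1) + Int(2)
p sum.class            # Int
p sum.to_i             # 3
```

Arithmetic therefore already has a hook that lets a user class take part; the complaint in this thread is that Hash lookup has no equivalent that every interpreter honours.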

In the few remaining cases where there is a good case
for supporting modifications, the minuscule cost of
a check would be justified. A single variable (saying
"this class has been modified from its standard form")
would take up a cache line, but the test would play
into branch prediction, so the actual effect would be
tiny.

I know you guys have done amazing things to achieve the
performance that we now have, but please don't forget
why people choose Ruby; it's clean and consistent.

Another thing that could be done to assist; make sure
that a Hash only ever calls eql? on objects in the hash,
not on lookup keys. At least that way, if a non-standard
object is in the hash, it can still be found using values
that *it* considers equivalent. This change would cost
*nothing*, it would just make Ruby more consistent.
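
Which side actually receives eql? during a lookup can be observed rather than assumed. The following is a sketch only: the class Tracked and its CALLS log are hypothetical names, and the result may differ between implementations and versions, which is the point of the complaint.

```ruby
# Hypothetical probe class: records every receiver of eql? so the call
# direction can be observed instead of assumed.
class Tracked
  CALLS = []

  attr_reader :name, :value

  def initialize(name, value)
    @name = name
    @value = value
  end

  # Equal values must produce equal hashes, or Hash never compares them.
  def hash
    @value.hash
  end

  def eql?(other)
    CALLS << @name
    other.is_a?(Tracked) && other.value == @value
  end
end

stored = Tracked.new(:stored, 1)
probe  = Tracked.new(:probe, 1)

table = { stored => "found" }
Tracked::CALLS.clear           # ignore any calls made while inserting
result = table[probe]

puts result                    # found
puts Tracked::CALLS.inspect    # lists which object(s) received eql?
```

If the log shows :probe, the lookup key decides the match; if it shows :stored, the key already in the hash does.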

In this case I think there's a fine line between consistency and
zealotry. The *vast* majority of Ruby users will never reopen and
modify Fixnum or Symbol,

People don't do it because it doesn't work, not because
it wouldn't be useful. Inability to make classes that act
like numbers is perhaps the biggest wart on an otherwise
clean language, on a par with Javascript using float for
all numbers.

JRuby follows MRI largely because of the perf improvement, but also
partially because MRI does it this way.

But JRuby does it differently from MRI. Try the code in
the gist I previously sent, you'll see that's true.

We've been more conservative than other impls, even.

and yet JRuby's Hash optimises both eql? and hash for Fixnums,
where MRI only optimises hash.

Clifford Heath.

···

On 04/11/11 10:02, Charles Oliver Nutter wrote:

Robert_K1 · 7 April 2011 14:28

If you see inheritance as "is a" relationship then inheritance would
be Rational < Real < Complex but not the other way round. Otherwise
you cannot use a subclass instance everywhere you were using a
superclass instance.

Though, does the "is a" relationship hold up? I think it's more of a
"kind of" relationship, where subsequent classes are defined in ever
more detail (so, you'd inherit Floats from Integers, and Complex from
Float).

Well, even with technical inheritance ("kind of") subclasses often add state
(i.e. member variables) but only restrict valid values of
superclass state, if at all. They cannot do otherwise because then
superclass methods may break. Silly example: superclass holds an
index which must be >= 0. All superclass methods use that index for
some kind of lookup. Assuming a sub class would suddenly set that
value to -13 the superclass contract would be violated. Now, if you
let Complex inherit from Real (trying to avoid "irrational" :-)) you
would add another field for imaginary part. So far so good, but
method to_f would sometimes throw an exception in Complex which it
would never do in Real. So suddenly Complex breaks Real's contract.

Of course, all those considerations are far less important in a nicely
duck typed language like Ruby compared to a statically typed language.
Assuming you would do the same in Java you would have to declare the
exception (if you use checked exceptions) on Real class but state at
the same time that this class would never throw it. Even worse, all
code using Real would have to deal with this exception by either
catching or propagating it. Not nice.

That's why I prefer to look at inheritance as "is a" relationship:
after all OO is about better abstraction capabilities and to be able
to hide implementation details behind a clearly defined clean
interface. If you let yourself get dragged too much into technical
issues chances are that the design comes out awful. Only languages
which allow to inherit without publishing all features of the
inherited class (private inheritance e.g. in Eiffel) do not
necessarily suffer from these issues. But then, inheritance is just
an implementation detail in such cases.

Of course, the clean world of maths doesn't map 1:1 to computational
systems, so I see the value in both approaches.

That's true. I remember debates about the very question how to model
inheritance hierarchies for numeric types. Unfortunately I can't
produce a reference right now. Maybe someone else can.

But, frankly, given the differences and additional properties of
complex numbers, I'd derive it from Numeric as well, simply to limit
the side effects the other numeric classes introduce (Floats and their
CPU-internal representation give me nightmares :P).

Also, with Ruby's concept of coercion inheritance between numeric
types is probably less of an issue.

Kind regards

robert

···

On Thu, Apr 7, 2011 at 4:02 PM, Phillip Gawlowski <cmdjackryan@googlemail.com> wrote:

On Thu, Apr 7, 2011 at 3:43 PM, Robert Klemme <shortcutter@googlemail.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Brian_Candler · 7 April 2011 16:06

Phillip Gawlowski wrote in post #991471:

If you see inheritance as "is a" relationship then inheritance would
be Rational < Real < Complex but not the other way round. Otherwise
you cannot use a subclass instance everywhere you were using a
superclass instance.

Though, does the "is a" relationship hold up? I think it's more of a
"kind of" relationship, where subsequent classes are defined in ever
more detail (so, you'd inherit Floats from Integers, and Complex from
Float).

Isn't this the old "ellipse is_a circle, or vice versa" debate?

If you make Circle the top class, then Ellipse reimplements pretty much
everything (draw, area, etc); there's no useful code sharing. If you
make Ellipse the top class, then Circle is just a special constrained
case of Ellipse.

Translating to the current discussion, substitute Float for Circle and
Ellipse for Complex.

Ruby's answer is: neither is a subclass of the other. Both inherit from
Numeric. That is, Circle and Ellipse are both a Shape. Or in other
words, "who cares"?

Eventually you come to realise that a lot of what is taught in object
oriented classes and textbooks is tosh

···

--
Posted via http://www.ruby-forum.com/.

Robert_K1 · 12 April 2011 09:09

The special-cased logic for Fixnums and Symbols in hashes is obviously
done for performance purposes. No matter what you do, checking for
method redefinitions every single time will have a performance impact.

Yes.

Even checking an inline cache has an impact.

You are conflating the situations where there is no sane
case for allowing modifications with the far fewer
ones where there is. No sane person would want to change
the implementation of 1+1; but someone can and has
implemented Fixnum+Complex. That works because Fixnum
will always call coerce where needed, so there's no need
to guard the optimisations.

I think Charly got it exactly right.

In the few remaining cases where there is a good case
for supporting modifications, the minuscule cost of
a check would be justified. A single variable (saying
"this class has been modified from its standard form")
would take up a cache line, but the test would play
into branch prediction, so the actual effect would be
tiny.

Frankly, since what you are attempting seems a rather rare case I'd
say it should be the way it is. After all, you can easily monkeypatch
Hash etc. to get the behavior you desire. That way the 99% of
Hash usages do not have to suffer so that the 1% can write less code. I think
that is a fair balance.

I don't understand why you insist on changing a core class for your
rare case of making different classes equivalent and causing potential
harm for many, many users of Ruby instead of just going ahead and also
monkey patching Hash, since you already did so for Fixnum. On one hand
you use Ruby's openness to change core classes to achieve what you
want but on the other you seem to refuse to change another to make
your change complete. The only reason for this that I can detect is
that you were surprised and your expectations were not met. But now
since you have learned otherwise what stops you from dealing with the
situation in the pragmatic way that is so typical for Ruby?
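
A rough sketch of that pragmatic route follows. It deliberately swaps in a Hash subclass instead of patching Hash itself, so the change stays local. IntLike and IntKeyHash are hypothetical names; IntLike merely stands in for the delegating class from the gist, which is not reproduced here.

```ruby
# Hypothetical integer-like delegate, standing in for the class discussed
# in the thread.
class IntLike
  def initialize(value)
    @value = value
  end

  def to_int
    @value
  end
end

# Hypothetical Hash subclass that converts IntLike keys to plain Integers
# before storing or looking them up, so both kinds of key hit the same entry.
class IntKeyHash < Hash
  def [](key)
    super(normalise(key))
  end

  def []=(key, value)
    super(normalise(key), value)
  end

  def key?(key)
    super(normalise(key))
  end

  private

  def normalise(key)
    key.is_a?(IntLike) ? key.to_int : key
  end
end

table = IntKeyHash.new
table[IntLike.new(1)] = "one"
puts table[1]                 # one
puts table[IntLike.new(1)]    # one
```

Only [], []= and key? are covered here; fetch, store, delete and friends would need the same treatment in real use.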

I know you guys have done amazing things to achieve the
performance that we now have, but please don't forget
why people choose Ruby; it's clean and consistent.

... most of the time. But it also tries to balance reasonable
usability with better than awful performance.

Another thing that could be done to assist; make sure
that a Hash only ever calls eql? on objects in the hash,
not on lookup keys. At least that way, if a non-standard
object is in the hash, it can still be found using values
that *it* considers equivalent. This change would cost
*nothing*, it would just make Ruby more consistent.

But it also would not have any positive impact for anyone else, and
any change brings a certain risk of introducing
errors. Note, though, that there might be situations where you want
the exact opposite: you have a Fixnum key and pass a
SomethingFixnumLike lookup key and want the match to succeed. You
can only have it one way. You happen to need the way that is not
possible right now.

Apart from that, it feels more natural to me to let the key passed in
compare itself against the internal key for equivalence, because this
key is what should determine whether I have a match or not.

In this case I think there's a fine line between consistency and
zealotry. The *vast* majority of Ruby users will never reopen and
modify Fixnum or Symbol,

People don't do it because it doesn't work, not because
it wouldn't be useful.

How do you know?

Inability to make classes that act
like numbers is perhaps the biggest wart on an otherwise
clean language, on a par with Javascript using float for
all numbers.

We *can* make classes that act like numbers (as has been demonstrated
often enough). And this works remarkably well.

We've been more conservative than other impls, even.

and yet JRuby's Hash optimises both eql? and hash for Fixnums,
where MRI only optimises hash.

It is in the nature of optimizations that they are done differently on
different platforms. Actually they have to because characteristics of
all platforms are different.

Cheers

robert

···

On Mon, Apr 11, 2011 at 5:20 AM, Clifford Heath <no@spam.please.net> wrote:

On 04/11/11 10:02, Charles Oliver Nutter wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Phil · 7 April 2011 14:49

Well, even with technical inheritance ("kind of") subclasses often add state
(i.e. member variables) but only restrict valid values of
superclass state, if at all. They cannot do otherwise because then
superclass methods may break. Silly example: superclass holds an
index which must be >= 0. All superclass methods use that index for
some kind of lookup. Assuming a sub class would suddenly set that
value to -13 the superclass contract would be violated. Now, if you
let Complex inherit from Real (trying to avoid "irrational" :-)) you
would add another field for imaginary part. So far so good, but
method to_f would sometimes throw an exception in Complex which it
would never do in Real. So suddenly Complex breaks Real's contract.

But that's a failure of implementation, isn't it?

If I were to implement my own class Complex, I'd have to deal with the
edge-cases that my sub-class has and can produce.

Thus, I either undefine #to_f, or redefine it so that it throws an
Exception. The value of inheritance is, after all, generalization, so
that I don't have to reinvent the wheel all the time, instead
making the wheel bigger or smaller, as the implementation requires.
Conversely, that also means that a more specialized sub-class
derived from a more generic super-class *has* to implement an interface
that works, and works consistently.

To stay with Complex as an example:
#to_f would require an additional argument to work properly: Either
convert the real, or the imaginary part into a Float, and so would
anything derived from Complex, whatever that may be, if it has the
same properties.
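
Both options can be sketched in a few lines. MyReal, MyComplex and MyStrictComplex are hypothetical classes made up for this illustration; they are not Ruby's built-in numerics.

```ruby
# Hypothetical classes for illustration only.
class MyReal
  attr_reader :real

  def initialize(real)
    @real = real
  end

  def to_f
    @real.to_f
  end
end

class MyComplex < MyReal
  attr_reader :imaginary

  def initialize(real, imaginary)
    super(real)
    @imaginary = imaginary
  end

  # Option 1: redefine to_f so it fails loudly when the value is not real.
  def to_f
    unless imaginary.zero?
      raise RangeError, "can't convert #{real}+#{imaginary}i into Float"
    end
    super
  end
end

# Option 2: remove the method from the subclass's interface entirely.
class MyStrictComplex < MyComplex
  undef_method :to_f
end

puts MyComplex.new(1.5, 0).to_f                      # 1.5
puts MyStrictComplex.new(1, 2).respond_to?(:to_f)    # false
```

Either way, code written against MyReal can no longer rely on to_f always succeeding, which is the contract problem raised earlier.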

That's why I prefer to look at inheritance as "is a" relationship:
after all OO is about better abstraction capabilities and to be able
to hide implementation details behind a clearly defined clean
interface. If you let yourself get dragged too much into technical
issues chances are that the design comes out awful. Only languages
which allow to inherit without publishing all features of the
inherited class (private inheritance e.g. in Eiffel) do not
necessarily suffer from these issues. But then, inheritance is just
an implementation detail in such cases.

But isn't it always?

Regarding technical issues: Design is a bit of an art; knowing when to
stop abstracting is important.

···

On Thu, Apr 7, 2011 at 4:28 PM, Robert Klemme <shortcutter@googlemail.com> wrote:

--
Phillip Gawlowski

Though the folk I have met,
(Ah, how soon!) they forget
When I've moved on to some other place,
There may be one or two,
When I've played and passed through,
Who'll remember my song or my face.

Vincent_Manis · 7 April 2011 22:42

And a lot of what is done by practitioners using object-oriented languages
is tosh as well; I know, I've seen it. (You haven't lived until you've had
to review a C++ class with 9-way multiple inheritance, without using rude
words!)

The Circle and Ellipse example is a good one. In fact, a Circle is no more
than an Ellipse with a constraint (eccentricity = 0, or, equivalently, the
two foci (`focuses') of the Ellipse are at the same point). So in almost
all cases, I wouldn't have two separate classes, but one, Ellipse.
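
A small sketch of that single-class design, assuming an axis-aligned ellipse described by its two semi-axes. The class layout and method names are made up for the illustration.

```ruby
# Hypothetical class: a circle is simply an Ellipse whose semi-axes are equal.
class Ellipse
  attr_reader :semi_major, :semi_minor

  def initialize(axis_a, axis_b)
    @semi_minor, @semi_major = [axis_a, axis_b].minmax
  end

  # Convenience constructor instead of a separate Circle class.
  def self.circle(radius)
    new(radius, radius)
  end

  def area
    Math::PI * semi_major * semi_minor
  end

  # 0.0 for a circle, approaching 1.0 as the ellipse flattens.
  def eccentricity
    Math.sqrt(1.0 - (semi_minor.to_f**2) / (semi_major**2))
  end

  def circle?
    semi_major == semi_minor
  end
end

round = Ellipse.circle(2)
puts round.circle?        # true
puts round.eccentricity   # 0.0
```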

In the case of Complex and Float, the operative design principle is the
Liskov Substitution Principle, which can be roughly stated in OO form as
`you can derive class Sub from class Super if and only if every instance
of Sub can be regarded as an instance of Super'.

Thus it's perfectly reasonable to derive JetPlane from Airplane, because
every JetPlane should be able to respond to all Airplane operations.
However, you can't derive Airplane from Wheel, or Wheel from Airplane,
even though there is some connection between wheels and planes. Like all
design principles, there are exceptional cases where the LSP doesn't apply,
but it seems to be the best heuristic for permissible subclassing.

In the Complex/Float case, the LSP tells us that we _could_ consider Float
a subclass of Complex (because every Float is, as has been pointed out, a
Complex with an imaginary part of zero), but that Complex can't reasonably
be considered a subclass of Float. A good designer would go further and say
`yes, the LSP allows me to derive Float from Complex, but that's a waste of
storage, because it means I must store imaginary parts that are always zero.'
Thus a better design (which Ruby follows) derives Complex from Numeric.
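
This is easy to check in irb. The snippet below only uses core classes; on the Ruby versions current in this thread, Fixnum sat between small integers and Integer, but Integer itself has Numeric as its direct superclass either way.

```ruby
# Each core numeric class hangs directly off Numeric.
[Integer, Float, Rational, Complex].each do |klass|
  puts "#{klass}.superclass = #{klass.superclass}"
end

# Neither class is an ancestor of the other.
puts Complex.ancestors.include?(Float)   # false
puts Float.ancestors.include?(Complex)   # false

# A Complex with a non-zero imaginary part refuses to become a Float.
begin
  Complex(1, 2).to_f
rescue RangeError => e
  puts "RangeError: #{e.message}"
end
```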

That's the OO theory, and it's not tosh

-- vincent manis

···

On 2011-04-07, at 09:06, Brian Candler wrote:

Eventually you come to realise that a lot of what is taught in object
oriented classes and textbooks is tosh


========================================
ruby 1.8.7 (2008-08-11 patchlevel 72) [i386-cygwin]
[[#<B:0x7ff9faa4 @v=1>, 3, 1],
[#<A:0x7ff9fa90 @v=2>, 5, 2],
[1, 3, nil],
[2, 5, nil]]

ruby 1.9.2p180 (2011-02-18 revision 30909) [i386-cygwin]
[[#<B:0x1003a290 @v=1>, 129497438, 1],
[#<A:0x1003a27c @v=2>, 602294680, 2],
[1, 129497438, nil],
[2, 602294680, nil]]

jruby 1.6.0 (ruby 1.8.7 patchlevel 330) (2011-03-15 f3b6154) (Java
HotSpot(TM) Client VM 1.6.0_24) [Windows XP-x86-java]
[[#<B:0xfa8 @v=1>, 1, 1],
[#<A:0xfb0 @v=2>, 2, 2],
[1, 1, nil],
[2, 2, nil]]

jruby 1.6.0 (ruby 1.9.2 patchlevel 136) (2011-03-15 f3b6154) (Java
HotSpot(TM) Client VM 1.6.0_24) [Windows XP-x86-java]
[[#<B:0x000000 @v=1>, 1, 1],
[#<A:0x000000 @v=2>, 2, 2],
[1, 1, nil],
[2, 2, nil]]