CompareByValue

I put a challenge up at:
http://www.rubygarden.org/ruby?CompareByValue

Nathaniel Talbott and I were pairing a while back, and he said,
“When I’m test-firsting a class, I always start with .==.” This
made me think: Three Strikes and you Automate… so how do you
automate this?

We set off to write a simple mixin that would allow us to forget
about writing simple versions of .== for the rest of our lives.

Unfortunately, this turns out to be a little bit tougher than we
thought. To make a long story short, after several versions,
none pass all the tests on
http://www.rubygarden.org/ruby?CompareByValueTests (which isn’t
even a thorough test suite, necessarily).

I’m curious to see if this list can come up with something that’s
elegant and that works.

If we came up with something good enough, might this ever find
its way into the standard distribution?

Thanks, all!

  • Ryan King

I put a challenge up at:
http://www.rubygarden.org/ruby?CompareByValue

I don’t think this is the /best/ solution, but it’s one
that makes all ten tests pass:

module CompareByValue
  def ==(other)
    # return true if self.id == other.id
    return false unless self.type == other.type
    return false unless self.instance_variables == other.instance_variables
    self.instance_variables.each { |name|
      next if self.id == self.instance_eval(name).id
      next if other.id == other.instance_eval(name).id
      next if self.id == other.instance_eval(name).id
      next if other.id == self.instance_eval(name).id
      return false unless self.instance_eval(name) ==
        other.instance_eval(name)
    }
    true
  end
end

The commented-out line is not required to make the tests pass (that’s
why it’s commented out) but it’s an optimization that might make the
comparison go faster.

http://www.rubygarden.org/ruby?CompareByValueTests (which isn’t
even a thorough test suite, necessarily).

What kind or class of tests might be missing?

I’m curious to see if this list can come up with something that’s
elegant and that works.

I’d rather see the list come up with a comprehensive list of tests that
would really give confidence that whatever code passes the test will
work. :wink:

– Dossy

···

On 2002.08.25, Ryan King rking@panoptic.com wrote:


Dossy Shiobara mail: dossy@panoptic.com
Panoptic Computer Network web: http://www.panoptic.com/
“He realized the fastest way to change is to laugh at your own
folly – then you can let go and quickly move on.” (p. 70)

Curiosity struck me, and I discovered that what I
posted wasn’t the minimal set to make all tests pass. I could
comment out two more lines without making the tests break:

module CompareByValue
  def ==(other)
    # return true if self.id == other.id
    return false unless self.type == other.type
    return false unless self.instance_variables == other.instance_variables
    self.instance_variables.each { |name|
      next if self.id == self.instance_eval(name).id
      next if self.id == other.instance_eval(name).id
      # next if other.id == other.instance_eval(name).id
      # next if other.id == self.instance_eval(name).id
      return false unless self.instance_eval(name) ==
        other.instance_eval(name)
    }
    true
  end
end

– Dossy

···



“Dossy” wrote

I don’t think this is the /best/ solution, but it’s one
that makes all ten tests pass:

module CompareByValue
  def ==(other)
    # return true if self.id == other.id
    return false unless self.type == other.type
    return false unless self.instance_variables == other.instance_variables
    self.instance_variables.each { |name|
      next if self.id == self.instance_eval(name).id
      next if other.id == other.instance_eval(name).id
      next if self.id == other.instance_eval(name).id
      next if other.id == self.instance_eval(name).id
      return false unless self.instance_eval(name) ==
        other.instance_eval(name)
    }
    true
  end
end

The commented-out line is not required to make the tests pass (that’s
why it’s commented out) but it’s an optimization that might make the
comparison go faster.

Just a stupid style comment - you are using ``self’’
too excessively (for my taste ;-) ). Also, you might
want to use the #equal? method instead of
comparing the ids directly?

http://www.rubygarden.org/ruby?CompareByValueTests (which isn’t
even a thorough test suite, necessarily).

What kind or class of tests might be missing?

The recursive test case looks insufficient - it seems to me that
one needs to test that the generated local variables name
``labeled’’ id-graphs are ``the same’’.

/Christoph

I added another test, which makes this implementation fail, too.

The implementation could be hacked so that this passes. However,
I wrote it as a simple case to represent an arbitrarily recursive
structure.

The recursion is what makes this problem a “Challenge” rather
than a “Trivial Chunk of Code”. =)
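To make the recursion problem concrete, here is a hedged sketch in present-day Ruby. It is not one of the implementations from the wiki page; the names DeepCompare, same_value? and Node are made up for illustration. It walks the instance variables itself, remembers which pairs of objects it is already comparing, and treats a revisited pair as "equal so far" so that a cycle cannot loop forever. Objects without instance variables (numbers, strings, arrays) fall back to their own ==.

```ruby
# Hypothetical sketch: cycle-safe comparison with an explicit "seen" table.
module DeepCompare
  def self.same_value?(a, b, seen = {})
    return true if a.equal?(b)
    return false unless a.instance_of?(b.class)
    names = a.instance_variables.sort
    return false unless names == b.instance_variables.sort
    # Leaf objects have nothing to walk, so use their own ==.
    return a == b if names.empty?

    key = [a.object_id, b.object_id]
    return true if seen[key]          # pair already under comparison
    seen[key] = true
    names.all? do |name|
      same_value?(a.instance_variable_get(name),
                  b.instance_variable_get(name), seen)
    end
  end
end

class Node
  attr_accessor :peer

  def initialize(name)
    @name = name
    @peer = nil
  end
end

a = Node.new("x")
b = Node.new("x")
a.peer = b                            # a and b point at each other
b.peer = a
c = Node.new("y")
c.peer = c                            # c points at itself

cyclic_equal   = DeepCompare.same_value?(a, b)
cyclic_unequal = DeepCompare.same_value?(a, c)
```

Because the table is passed as an argument, this sketch needs no shared state, but it only works when the comparison drives the recursion itself rather than going through each object's ==.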

  • Ryan King
···

On 2002.08.25, Dossy dossy@panoptic.com wrote:

On 2002.08.25, Ryan King rking@panoptic.com wrote:

I put a challenge up at:
http://www.rubygarden.org/ruby?CompareByValue

I don’t think this is the /best/ solution, but it’s one
that makes all ten tests pass:
[…]

I added another test, which makes this implementation fail, too.

Terrific! I updated the code that makes the new test pass.

The implementation could be hacked so that this passes. However,
I wrote it as a simple case to represent an arbitrarily recursive
structure.

The recursion is what makes this problem a “Challenge” rather
than a “Trivial Chunk of Code”. =)

I don’t think my solution is a “hack” that just makes this one
test pass; it should make the whole class of tests involving
recursion pass.

http://www.rubygarden.org/ruby?DossysCompareByValue

Here’s the code for folks to review:

module CompareByValue
  def CompareByValue.compare(a, b)
    return true if a.equal? b
    return false unless a.instance_of? b.type
    return false unless a.instance_variables == b.instance_variables
    a.instance_variables.each { |name|
      next if a.equal? a.instance_eval(name)
      next if a.equal? b.instance_eval(name)
      return false unless CompareByValue.compare(a.instance_eval(name),
                                                 b.instance_eval(name))
    }
    true
  end
  def ==(other)
    return CompareByValue.compare(self, other)
  end
end

– Dossy

···

On 2002.08.27, Ryan King rking@panoptic.com wrote:



It causes them to return “equal”, yes. =)

I added two assert_not_equal’s that I should have had originally.
Maybe I’ll break down and just add some more thorough tests. I
was trying to keep the size down so the intention of the module
was clearer… but I guess that was a silly tradeoff.

Also, I noticed someone added a version that uses a global
variable to keep track of "seen"s. This version passes all
tests, but isn’t thread-safe… which is a limitation I’d like to
overcome. We could easily fix it by using thread-specific
storage, but the more interesting question: How would we devise a
test that verifies the thread-safety? I suppose we could come up
with something involving Thread.pass…
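As a rough illustration of the thread-specific storage idea, the sketch below keeps the table of pairs being compared in Thread.current instead of a global. This is an assumption about how such a fix might look in present-day Ruby, not the code from the wiki; the module name, the :compare_by_value_seen key and the Link class are hypothetical.

```ruby
# Hypothetical sketch: the "seen" table lives in per-thread storage.
module ThreadLocalCompareByValue
  def ==(other)
    return true if equal?(other)
    return false unless other.instance_of?(self.class)

    # Thread.current[:key] is fiber-local in current Ruby;
    # Thread#thread_variable_get / thread_variable_set would be the
    # strictly thread-local alternative.
    seen = (Thread.current[:compare_by_value_seen] ||= {})
    key = [object_id, other.object_id]
    return true if seen[key]          # already comparing this pair
    seen[key] = true
    begin
      names = instance_variables.sort
      return false unless names == other.instance_variables.sort
      names.all? do |name|
        instance_variable_get(name) == other.instance_variable_get(name)
      end
    ensure
      seen.delete(key)                # leave the table clean for later calls
    end
  end
end

class Link
  include ThreadLocalCompareByValue
  attr_accessor :peer

  def initialize(name)
    @name = name
    @peer = nil
  end
end

x = Link.new("x")
y = Link.new("x")
x.peer = y
y.peer = x
z = Link.new("z")
z.peer = z

equal_cycle   = (x == y)
unequal_cycle = (x == z)
```

Each thread gets its own table, so two threads comparing the same pair at the same time cannot see each other's bookkeeping.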

  • Ryan King
···

On 2002.08.29, Dossy dossy@panoptic.com wrote:

[…] http://www.rubygarden.org/ruby?CompareByValue

I don’t think my solution is a “hack” that just makes this one
test pass; it should make the whole class of tests involving
recursion pass.

It causes them to return “equal”, yes. =)

Hee.

I added two assert_not_equal’s that I should have had originally.
Maybe I’ll break down and just add some more thorough tests. I
was trying to keep the size down so the intention of the module
was clearer… but I guess that was a silly tradeoff.

Add more tests. Please. :wink:

This is a good exercise in expressing the minimal tests required
to describe the behavior of CompareByValue.

Also, I noticed someone added a version that uses a global
variable to keep track of "seen"s. This version passes all
tests, but isn’t thread-safe… which is a limitation I’d like to
overcome. We could easily fix it by using thread-specific
storage, but the more interesting question: How would we devise a
test that verifies the thread-safety? I suppose we could come up
with something involving Thread.pass…

Create two threads, run a CompareByValue of the same two objects,
one in each thread … it might require some trickery to expose
the bug (or, multiple test runs, whatnot).

Of course, on a single-CPU machine, I’m not sure even this will
work without cooperation of the code-under-test itself …
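One possible form of the "trickery" is to get the cooperation from the test data rather than from the code under test. The sketch below is only an assumption about how that could look in present-day Ruby: GlobalSeenCompare is a deliberately unsafe stand-in for the global-variable version (not the actual wiki code), and Gate is a hypothetical test double whose == parks the first thread that calls it, so a second comparison can run while the first is half done.

```ruby
# Deliberately unsafe: one process-wide table of pairs being compared.
$seen = {}

module GlobalSeenCompare
  def ==(other)
    return true if equal?(other)
    return false unless other.instance_of?(self.class)
    key = [object_id, other.object_id]
    return true if $seen[key]         # pair already in progress
    $seen[key] = true
    begin
      # Sorted order means @gate is compared before @n below.
      instance_variables.sort.all? do |name|
        instance_variable_get(name) == other.instance_variable_get(name)
      end
    ensure
      $seen.delete(key)
    end
  end
end

# Test double: the first call to == signals the test and then waits.
class Gate
  def initialize(entered, release)
    @entered = entered
    @release = release
    @used = false
  end

  def ==(other)
    unless @used
      @used = true
      @entered << :in                 # tell the test we are mid-comparison
      @release.pop                    # wait until the test lets us go on
    end
    true
  end
end

class Pair
  include GlobalSeenCompare

  def initialize(gate, n)
    @gate = gate
    @n = n
  end
end

entered = Queue.new
release = Queue.new
a = Pair.new(Gate.new(entered, release), 1)
b = Pair.new(Gate.new(entered, release), 2)

worker = Thread.new { a == b }        # parks inside Gate#==, pair marked
entered.pop                           # worker is now mid-comparison
racing_result = (a == b)              # sees the worker's mark
release << :go
worker_result = worker.value
```

Here the racing comparison reports the two objects as equal even though @n differs, while the worker correctly reports them as different. The interleaving is forced with queues, so it does not depend on the scheduler or on the number of CPUs.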

– Dossy

···

On 2002.08.29, Ryan King rking@panoptic.com wrote:

On 2002.08.29, Dossy dossy@panoptic.com wrote:



Hi:

Sorry to butt in, but I don’t understand exactly what you are
trying to accomplish with CompareByValue.

Do you want to compare any two objects?
Are you including all the contained data as well in the compare?

Thanks

···


Jim Freeze
If only I had something clever to say for my comment…
~

Dossy wrote:

Also, I noticed someone added a version that uses a global
variable to keep track of "seen"s. This version passes all
tests, but isn’t thread-safe… which is a limitation I’d like to
overcome. We could easily fix it by using thread-specific
storage, but the more interesting question: How would we devise a
test that verifies the thread-safety? I suppose we could come up
with something involving Thread.pass…

Create two threads, run a CompareByValue of the same two objects,
one in each thread … it might require some trickery to expose
the bug (or, multiple test runs, whatnot).

Depends on how thread scheduling is done in Ruby. Manually calling #pass
introduces a Heisenberg effect, so ideally you would want to keep the
original code and somehow randomize thread scheduling.

This brings back memories of stress testing an object database engine,
using 32 threads on a dual SMP box, running billions of db operations
over the course of several days, and then hitting an assertion failure
and wondering how the *&%$# you’re ever going to be able to recreate it
because of the indeterminate scheduling…

In Ruby we’d have the advantage of being able to add a pseudo-random
element to the thread scheduler, and avoid this nightmare.

Of course, on a single-CPU machine, I’m not sure even this will
work without cooperation of the code-under-test itself …

AFAIK, dual-CPU won’t make a difference to Ruby.

···

On 2002.08.29, Ryan King rking@panoptic.com wrote:

Forgive me – you’re not the only one to ask this question.

I’ll clarify (and put it on
http://www.rubygarden.org/ruby?CompareByValue for posterity).

When you’re unit-testing a class into existence, you often see
brain-dead .=='s that look like this:

def == other
  @a == other.a && @b == other.b …
end

This is time-consuming, and redundant. So, I want to
automate it.

Just simply loop over all the current instance variables,
comparing them via “==”, and making sure you don’t screw anything
up (like get stuck in an infinite loop, or do anything that
wouldn’t work with concurrency).

To me, this is the most DWIMmy .== definition. Maybe I’m
crazy, though.
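A minimal sketch of that idea in present-day Ruby, for illustration only: the module name NaiveCompareByValue and the Point class are made up here, and this version deliberately ignores the recursion and concurrency problems discussed elsewhere in the thread.

```ruby
# Hypothetical sketch, not one of the implementations posted to the wiki.
module NaiveCompareByValue
  def ==(other)
    return true if equal?(other)                      # same object
    return false unless other.instance_of?(self.class)
    names = instance_variables.sort
    return false unless names == other.instance_variables.sort
    # Compare each instance variable using that value's own ==.
    names.all? do |name|
      instance_variable_get(name) == other.instance_variable_get(name)
    end
  end
end

class Point
  include NaiveCompareByValue

  def initialize(x, y)
    @x = x
    @y = y
  end
end

same      = (Point.new(1, 2) == Point.new(1, 2))
different = (Point.new(1, 2) == Point.new(1, 3))
```

With the mixin included, Point needs no hand-written ==. An object that refers back to itself through its instance variables would still send this version into unbounded recursion.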

  • Ryan King
···

On 2002.08.29, Jim Freeze jim@freeze.org wrote:

Sorry to butt in, but I don’t understand exactly what you are
trying to accomplish with CompareByValue.

Do you want to compare any two objects?
Are you including all the contained data as well in the compare?

Thanks for the quick overview. I remember reading it early on,
but I guess it didn’t sink in.

Concerning unit testing, I don’t seem to use #=='s much
on classes. I think I usually test whether functions return
the right thing or not.

However, I suppose that you would have to put some limitations
on testing the instance variables. For example, do you trace
through arrays?

class MyClass
  def initialize(some_object)
    @var = [some_object]
  end
end

m1 = MyClass.new(obj1)
m2 = MyClass.new(obj2)

Now the m1 == m2 has to apply the #== to the contents
of @var. If this was a tree class, it could be a mess.

Do I understand this correctly?

Hmm…I wonder if one could sort the output of Marshal.dump
and compare…

···

On Thu, Aug 29, 2002 at 01:08:13PM +0900, Ryan King wrote:

On 2002.08.29, Jim Freeze jim@freeze.org wrote:

Sorry to butt in, but I don’t understand exactly what you are
trying to accomplish with CompareByValue.

Do you want to compare any two objects?
Are you including all the contained data as well in the compare?

Forgive me – you’re not the only one to ask this question.

I’ll clarify (and put it on
http://www.rubygarden.org/ruby?CompareByValue for posterity).

When you’re unit-testing a class into existence, you often see
brain-dead .=='s that look like this:

def == other
  @a == other.a && @b == other.b …
end

This is time-consuming, and redundant. So, I want to
automate it.

Just simply loop over all the current instance variables,
comparing them via “==”, and making sure you don’t screw anything
up (like get stuck in an infinite loop, or do anything that
wouldn’t work with concurrency).

To me, this is the most DWIMmy .== definition. Maybe I’m
crazy, though.



[…] http://www.rubygarden.org/ruby?CompareByValue

Concerning unit testing, I don’t seem to use #=='s much
on classes. I think I usually test whether functions return
the right thing or not.

What do you use to compare the return values to the expected
values?

If you use one of the built-in types (Strings, Arrays, Hashes,
etc.), they all have decent .=='s for this purpose. However, the
minute you move onto less primitive types, you end up defining
the == method.

Especially as a system grows, abstractions are built. You end up
seeing fewer and fewer primitives… and hence define more and
more .== methods. =)

However, I suppose that you would have to put some limitations
on testing the instance variables. For example, do you trace
through arrays?

Arrays don’t get special-cased. You just call Array#== the same
as you’d call Burro#==. Fortunately, Array#== is appropriate for
this case, because it compares by value.
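A small sketch of what "compares by value" means for arrays here. The Burro class below is a hypothetical stand-in with a hand-written ==; Array#== compares the arrays element by element and asks each element's own == for the answer.

```ruby
# Hypothetical Burro class with a hand-written ==.
class Burro
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def ==(other)
    other.is_a?(Burro) && name == other.name
  end
end

# Array#== delegates to Burro#== for each pair of elements.
same_herd      = ([Burro.new("a"), Burro.new("b")] == [Burro.new("a"), Burro.new("b")])
different_herd = ([Burro.new("a")] == [Burro.new("z")])
```

So a class whose instance variables hold arrays of other value-comparable objects gets the nested comparison without any special-casing.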

If this was a tree class, it could be a mess.

Do I understand this correctly?

I’m pretty sure you understand correctly, because you know it
will be a mess. =)

However, it’s normally what I mean, for [contrived] example:

expected = ImaginarySqlStatementClass.new(
  :select => [ '*' ],
  :from => [ 'animals' ],
  :where => "name = 'burro'"
)
actual = "select * from animals where name = 'burro'".to_sql
assert_equal expected, actual

Yes, the member variables of expected/actual are trees… but I
want to compare them by value.

Hmm…I wonder if one could sort the output of Marshal.dump
and compare…

Actually, using .inspect was Nathaniel’s first thought. Then, we
got tripped up by the way it puts its id into the string. Then,
we worked around that, only to find out that the member
variables can get dumped in differing orders depending
(IIRC) on the order in which they’re initialized.
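A hedged sketch of those two stumbling blocks in present-day Ruby. The Animal class is hypothetical, and its two code paths only exist to assign the same instance variables in a different order.

```ruby
# Hypothetical class that assigns the same variables in either order.
class Animal
  def initialize(name_first)
    if name_first
      @name = "burro"
      @legs = 4
    else
      @legs = 4
      @name = "burro"
    end
  end
end

a = Animal.new(true)
b = Animal.new(false)

# The default inspect string embeds an object-specific address,
# so two distinct objects never produce the same string.
inspect_differs = (a.inspect != b.inspect)

# The raw name lists may come back in assignment order, so an
# order-insensitive check sorts them before comparing.
sorted_names_match = (a.instance_variables.sort == b.instance_variables.sort)
```

The same ordering concern applies to any implementation that compares instance_variables lists directly without sorting them first.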

The second attempt was to use Marshal.dump, but I ran into some
minor glitch just like with .inspect.
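For reference, a rough sketch of what a Marshal.dump comparison looks like in present-day Ruby. The Mule class is hypothetical, and the limitation shown at the end is just one known restriction of Marshal, not necessarily the glitch mentioned above.

```ruby
# Hypothetical class used only to try the byte-string comparison.
class Mule
  def initialize(name, legs)
    @name = name
    @legs = legs
  end
end

# Objects built the same way serialize to the same bytes.
same      = (Marshal.dump(Mule.new("eeyore", 4)) == Marshal.dump(Mule.new("eeyore", 4)))
different = (Marshal.dump(Mule.new("eeyore", 4)) == Marshal.dump(Mule.new("eeyore", 3)))

# Some objects cannot be dumped at all, for example a Proc.
begin
  Marshal.dump(proc { 1 })
  proc_dumpable = true
rescue TypeError
  proc_dumpable = false
end
```

A comparison built this way inherits every restriction of Marshal, so it would only suit classes whose state is plain data.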

It might end up being worth hacking around these problems to get
a concise implementation, though.

  • Ryan King
···

On 2002.08.29, Jim Freeze jim@freeze.org wrote:

On Thu, Aug 29, 2002 at 01:08:13PM +0900, Ryan King wrote:

On 2002.08.29, Jim Freeze jim@freeze.org wrote: