Overloading Array Subtraction operator

Hi,

I have two arrays of hashes, and I'd like to subtract them to find the
elements that differ between them, e.g.:

···

-----------
array1 = Array.new
array2 = Array.new

tmp = {:name => "fred", :phone => "545334"}
array1.push(tmp)
tmp2 = tmp.dup

array2.push(tmp2)
tmp3 = {:name => "stan", :phone => "hehe"}
array1.push(tmp3)

arraydiff = array1 - array2
--------------

What methods would I have to overload to accomplish this task? I
could not find an example like this anywhere!

Nicko

arraydiff = array1 - array2
--------------

What methods would I have to overload to accomplish this task? I
could not find an example like this anywhere!

array1 - array2

is just a shorthand for

array1.-(array2)

where "-" is a normal method name.

So, you can just do

class Array
  def -(other)
    ...
  end
end
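
A minimal sketch of the point above, that "-" is an ordinary method. The Playlist class and its names are hypothetical, chosen only for illustration; defining "-" on your own class rather than reopening Array leaves the core behaviour untouched for other code.

```ruby
# Hypothetical class, used only to illustrate defining "-" as a method.
class Playlist
  attr_reader :tracks

  def initialize(tracks)
    @tracks = tracks
  end

  # "-" is defined like any other instance method.
  def -(other)
    Playlist.new(tracks.reject { |track| other.tracks.include?(track) })
  end
end

a = Playlist.new(%w[x y z])
b = Playlist.new(%w[y])

via_operator = a - b           # operator syntax
via_method   = a.-(b)          # explicit method-call syntax
via_send     = a.send(:-, b)   # dynamic dispatch by method name
```

All three calls reach the same method, so they produce equal track lists.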

···

From: Nicko [mailto:anko.com@gmail.com]
Sent: Sunday, June 10, 2007 10:30 AM

--
zverok.

The - operator compares objects by their ID, so they aren't removed
unless they are instances of the same object. They may have the same
value, but be separate instances like this example. You can accomplish
what you want like this:

array1.select{|x| !array2.include? x}
# => [{:name=>"stan", :phone=>"hehe"}]

Array#include? compares using == so they are compared by value, not by
their #object_id.

[:a, :b, :c].object_id # => 2711200
[:a, :b, :c].object_id # => 2690960 ... a new instance, same value
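
A small sketch of why the two lookups can disagree. FileEntry is a hypothetical class that defines only ==. As comes up later in the thread, Array#- relies on #eql? and #hash rather than ==, so it keeps both entries here, while the include?-based filter removes the match.

```ruby
# Hypothetical class that defines == but leaves eql? and hash at their defaults.
class FileEntry
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def ==(other)
    other.is_a?(FileEntry) && name == other.name
  end
end

left  = [FileEntry.new("a.txt"), FileEntry.new("b.txt")]
right = [FileEntry.new("a.txt")]

# Array#- does not consult ==, so nothing is removed.
by_minus = left - right

# Array#include? compares with ==, so the matching entry is filtered out.
by_include = left.reject { |entry| right.include?(entry) }
```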

Regards,
Erwin

Wow!
Thank you both!

I ended up with

class SuperArray < Array
  def -(other)
    self.select{|x| !other.include? x}
  end
end

and it works great :) I can optimise it later :)

Nicko

···

On Jun 10, 5:56 pm, "Erwin Abbott" <erwin.abb...@gmail.com> wrote:

The - operator compares objects by their ID, so they aren't removed
unless they are instances of the same object. They may have the same
value, but be separate instances like this example. You can accomplish
what you want like this:

array1.select{|x| !array2.include? x}
# => [{:name=>"stan", :phone=>"hehe"}]

Array#include? compares using == so they are compared by value, not by
their #object_id.

[:a, :b, :c].object_id # => 2711200
[:a, :b, :c].object_id # => 2690960 ... a new instance, same value

Regards,
Erwin

Is that so? Then why does this work?

irb(main):001:0> %w{a b c} - %w{b}
=> ["a", "c"]

And any number of similar examples.

···

On Jun 10, 2:56 am, "Erwin Abbott" <erwin.abb...@gmail.com> wrote:

The - operator compares objects by their ID, so they aren't removed
unless they are instances of the same object.

--
-yossef

It is usually not such a good idea to inherit base classes like Array and Hash. Here are two more healthy approaches.

1. wrap Array with a class that represents the concept (which one btw?) your Array is used for. Then implement #- (and all the other methods).

2. wrap Hash with a class that represents the concept (which one btw?) your Hash is used for. Then implement #==, #hash and #eql? accordingly.

The basic reason why your code does not work as you would like it to work is that Hash does not implement #eql? and #hash in a way that considers Hash content (for the reasons please search the archives, the topic has come up frequently). Note:

irb(main):037:0> h={:foo=>:bar}
=> {:foo=>:bar}
irb(main):038:0> h == h.dup
=> true
irb(main):039:0> h.eql? h.dup
=> false
irb(main):040:0> h.hash == h.dup.hash
=> false
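
A sketch of approach 2, assuming a hypothetical FileRecord class with illustrative field names: once #==, #eql? and #hash all agree on the content, the built-in Array#- does what the original poster wanted. As far as I know, later Ruby releases (1.9 and up) made Hash#eql? and Hash#hash content-based as well, so the irb session above reflects the Ruby of its time; the last line checks that on whatever Ruby runs this.

```ruby
# Hypothetical value class: equality and hashing are based on its fields.
class FileRecord
  attr_reader :name, :size, :md5

  def initialize(name, size, md5)
    @name, @size, @md5 = name, size, md5
  end

  def ==(other)
    other.is_a?(FileRecord) && fields == other.fields
  end
  alias_method :eql?, :==

  # Equal records must produce equal hash values.
  def hash
    fields.hash
  end

  protected

  def fields
    [name, size, md5]
  end
end

usb   = [FileRecord.new("a.txt", 10, "aaa"), FileRecord.new("b.txt", 20, "bbb")]
share = [FileRecord.new("a.txt", 10, "aaa")]

only_on_usb = usb - share
names = only_on_usb.map(&:name)

# Content-based on recent Ruby versions, unlike the session quoted above.
plain_hashes_eql = { :foo => :bar }.eql?({ :foo => :bar })
```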

Kind regards

  robert

···

On 10.06.2007 10:25, Nicko wrote:

Wow!
Thank you both!

I ended up with

class SuperArray < Array
  def -(other)
    self.select{|x| !other.include? x}
  end
end

and it works great :) I can optimise it later :)

Because all instances of the same string are indeed instances of the
same object.

-s

···

In message <1181491239.357509.233710@q66g2000hsg.googlegroups.com>, Yossef Mendelssohn writes:

Is that so? Then why does this work?

Yes, I responded hastily. The rdocs for Array#- don't say how objects
are compared so I made a bad assumption. I only meant to convey it
wasn't being done by comparing values.

Thanks for pointing that out.

···

On 6/10/07, Yossef Mendelssohn <ymendel@pobox.com> wrote:

Is that so? Then why does this work?

irb(main):001:0> %w{a b c} - %w{b}
=> ["a", "c"]

[...]

It is usually not such a good idea to inherit base classes like Array
and Hash.

[...]

That is an interesting statement. I don't think I agree with it, but I'd
like to hear your reasoning behind it.

Kind regards
  robert

--Greg

···

On Sun, Jun 10, 2007 at 07:50:35PM +0900, Robert Klemme wrote:

... at least with the array of Hashes, Hash#hash is used and not
Hash#== or some value based comparison. I'm not sure how it's done
with Strings or Fixnums, I'd have to check the source code probably.
Check it out with the profiler:

$ ruby -rprofile -e '[{:a=>3}] - [{:b=>0,:a=>0}]'
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  0.00     0.00      0.00        2     0.00     0.00  Kernel.hash
  0.00     0.00      0.00        1     0.00     0.00  Array#-
  0.00     0.01      0.00        1     0.00    10.00  #toplevel

$ ruby -rprofile -e '%w[a b c] - %w[b d e f]'
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  0.00     0.00      0.00        1     0.00     0.00  Array#-
  0.00     0.01      0.00        1     0.00    10.00  #toplevel

$ ruby -rprofile -e '[1,2,3] - [0,3,5]'
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  0.00     0.00      0.00        1     0.00     0.00  Array#-
  0.00     0.01      0.00        1     0.00    10.00  #toplevel

Regards,

Erwin

···

On 6/10/07, Erwin Abbott <erwin.abbott@gmail.com> wrote:

Yes, I responded hastily. The rdocs for Array#- don't say how objects
are compared so I made a bad assumption. I only meant to convey it
wasn't being done by comparing values.

Peter Seebach wrote:

Because all instances of the same string are indeed instances of the
same object.

"bla".object_id=="bla".object_id

=> false

Or did I misunderstand what you're saying?
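
A short sketch that may help reconcile the two observations: two strings with the same content are distinct objects, yet they are eql? and hash to the same value, which is what Array#- looks at. The variable names are illustrative, and String.new is used so that each string is certainly a separate object.

```ruby
first  = String.new("bla")
second = String.new("bla")

same_object = first.equal?(second)         # identity check (object_id)
same_value  = first.eql?(second)           # content check
same_hash   = first.hash == second.hash    # hash derived from the content

# Subtraction therefore works on strings without them being the same object.
difference = %w[a b c] - %w[b]
```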

···

--
Ist so, weil ist so
Bleibt so, weil war so

It is usually not such a good idea to inherit base classes like Array
and Hash. Here are two more healthy approaches.

The code is meant to get two lists of files, one on a USB stick
and one on a network share, put them in hashes (filename, size
and MD5 hash), and now I want a list of the files that are in one list
but not in the other.

If the hashes are the same, they won't be the same instance because
they were generated separately.

Why is inheriting from Array not a healthy approach?

Sorry, I'm a Ruby newbie.

Thanks for the info below; it just seems like overkill for what I
am doing.

Nicko

···

On Jun 10, 8:46 pm, Robert Klemme <shortcut...@googlemail.com> wrote:

1. wrap Array with a class that represents the concept (which one btw?)
your Array is used for. Then implement #- (and all the other methods).

2. wrap Hash with a class that represents the concept (which one btw?)
your Hash is used for. Then implement #==, #hash and #eql? accordingly.

The basic reason why your code does not work as you would like it to
work is that Hash does not implement #eql? and #hash in a way that
considers Hash content (for the reasons please search the archives, the
topic has come up frequently). Note:

irb(main):037:0> h={:foo=>:bar}
=> {:foo=>:bar}
irb(main):038:0> h == h.dup
=> true
irb(main):039:0> h.eql? h.dup
=> false
irb(main):040:0> h.hash == h.dup.hash
=> false

Kind regards

        robert

I could just be wrong. I should not answer questions in the morning.

-s

···

In message <200706101915.16027.sepp00@web.de>, Sebastian Hungerecker writes:

Peter Seebach wrote:

Because all instances of the same string are indeed instances of the
same object.

"bla".object_id=="bla".object_id

=> false

Or did I misunderstand what you're saying?

This has been discussed numerous times, even here. On a conceptual level, more often than not a user-defined class XYZ /is not/ an Array but /uses/ an Array (for storing something). More practically, by inheriting from Array you conveniently publish all the methods you might consider useful, but you also publish methods that allow direct Array manipulation, which is especially bad if you want to enforce additional constraints (e.g. a certain element order). While you can /unpublish/ methods in Ruby, IMHO it is less error-prone to explicitly define the methods that you want to allow on your class. (Just consider a new version of Ruby that adds methods to Array which you do not want to be available to your clients, but which /are/ available by default unless you change your code as well. If you use delegation, you do not have to do anything about it.)
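
A rough sketch of the delegation approach described above, using Forwardable from the standard library. FileList and the choice of delegated methods are assumptions made for illustration, not something proposed in the thread.

```ruby
require "forwardable"

# Hypothetical wrapper: a FileList *uses* an Array instead of inheriting from it.
class FileList
  extend Forwardable
  include Enumerable

  # Only these Array methods are exposed to clients.
  def_delegators :@entries, :each, :size, :empty?

  def initialize(entries = [])
    @entries = entries.dup
  end

  # Value-based difference: include? compares elements with ==.
  def -(other)
    FileList.new(@entries.reject { |entry| other.include?(entry) })
  end
end

usb   = FileList.new([{ :name => "fred" }, { :name => "stan" }])
share = FileList.new([{ :name => "fred" }])

difference   = usb - share
remaining    = difference.to_a
exposes_push = difference.respond_to?(:push)
```

Mutating methods such as push are simply not part of the public interface, so a new Array method in a future Ruby release would not leak through either.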

If you disagree then you might be sharing a camp with Bertrand Meyer, whom I regard highly for his book OOSC, where he also promotes implementation inheritance (which you find in Eiffel). Note though that in Eiffel you have more options to control visibility of methods and inheritance than in Ruby, and the compiler will catch many mistakes you can make in this area.

Kind regards

  robert

···

On 10.06.2007 17:38, Gregory Seidman wrote:

On Sun, Jun 10, 2007 at 07:50:35PM +0900, Robert Klemme wrote:
[...]

It is usually not such a good idea to inherit base classes like Array and Hash.

[...]

That is an interesting statement. I don't think I agree with it, but I'd
like to hear your reasoning behind it.

It is usually not such a good idea to inherit base classes like Array
and Hash. Here are two more healthy approaches.

The code is meant to get two lists of files, one on a USB stick
and one on a network share, put them in hashes (filename, size
and MD5 hash), and now I want a list of the files that are in one list
but not in the other.

Why then don't you just subtract the key arrays (assuming that your keys are file names)? Or are size and MD5 important for your comparison? In that case I'd probably do this:

FileInfo = Struct.new :file_name, :size, :md5

If you put instances of this class in an Array or Set, your subtraction logic will work.
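
A minimal sketch of both suggestions, with made-up sample data. Struct supplies ==, eql? and hash based on its members, so Array#- compares FileInfo instances by value.

```ruby
FileInfo = Struct.new(:file_name, :size, :md5)

# Sample data is invented for illustration only.
usb = [
  FileInfo.new("a.txt", 10, "aaa"),
  FileInfo.new("b.txt", 20, "bbb")
]
share = [
  FileInfo.new("a.txt", 10, "aaa")
]

# Compare on name, size and MD5 together.
only_on_usb = usb - share

# Compare on file names alone.
names_only_on_usb = usb.map(&:file_name) - share.map(&:file_name)
```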

If the hashes are the same, they won't be the same instance because
they were generated separately.

Why is inheriting from Array not a healthy approach?

See my other reply.

Kind regards

  robert

···

On 11.06.2007 03:25, Nicko wrote:

On Jun 10, 8:46 pm, Robert Klemme <shortcut...@googlemail.com> wrote: