A small problem for arrays

I have two arrays, ar_1 and ar_2.
How can I use them to create a third array, ar_3, containing all the
elements that are in ar_1 but not in ar_2?

···

--
Posted via http://www.ruby-forum.com/.

Just take the difference of the two arrays:

a1 = (1..10).to_a

=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

a2 = (4..7).to_a

=> [4, 5, 6, 7]

a1 - a2

=> [1, 2, 3, 8, 9, 10]

a2 - a1

=> []
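
Applied to the names used in the question, this might look like the sketch below. The sample values are made up for illustration; the point is that `-` returns a new array, keeps the order of the left-hand array, and leaves both operands untouched.

```ruby
# Sample data; the values are assumptions chosen for illustration.
ar_1 = [10, 20, 30, 40, 50]
ar_2 = [20, 40, 60]

# Everything that is in ar_1 but not in ar_2, in ar_1's order.
ar_3 = ar_1 - ar_2

p ar_3   # => [10, 30, 50]
p ar_1   # => [10, 20, 30, 40, 50] (unchanged)
p ar_2   # => [20, 40, 60] (unchanged)
```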

Cheers,

···

2010/8/21 Unc88 Unc88 <unc88@mail.ru>:

--
JJ Fleck
PCSI1 Lycée Kléber

Can array subtraction be done without using the "-" operator?
Is there a standard method for this in class Array?

···

--
Posted via http://www.ruby-forum.com/.

"-" *is* a standard method of class Array. The apparent "operator"
behavior is just syntactic sugar on top of it:

a1 = (1..10).to_a

=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

a2 = (5..7).to_a

=> [5, 6, 7]

a1.-(a2)

=> [1, 2, 3, 4, 8, 9, 10]

The last line calls the "-" method of object "a1" with object "a2" as its argument.

~> ri Array.-

---------------------------------------------------------------- Array#-
     array - other_array -> an_array
------------------------------------------------------------------------
     Array Difference---Returns a new array that is a copy of the
     original array, removing any items that also appear in other_array.
     (If you need set-like behavior, see the library class Set.)

        [ 1, 1, 2, 2, 3, 3, 4, 5 ] - [ 1, 2, 4 ] #=> [ 3, 3, 5 ]
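
To make the note about duplicates in the ri text concrete, here is a small sketch comparing Array#- with the Set equivalent. Array#- removes every occurrence of a matching element but keeps duplicates among the elements that survive, whereas a Set holds each value only once. The variable names are illustrative.

```ruby
require 'set'

list  = [1, 1, 2, 2, 3, 3, 4, 5]
other = [1, 2, 4]

# Array difference: duplicates of surviving elements are kept.
array_result = list - other
p array_result   # => [3, 3, 5]

# Set difference: each surviving value appears once.
set_result = (list.to_set - other.to_set).to_a
p set_result     # => [3, 5]
```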

Cheers,

--
JJ Fleck
PCSI1 Lycée Kléber

The method is `-`, and using it as an infix operator is just syntactic
sugar. If you want it to look like standard .method syntax, do this:

    > ar_3 = ar_1.-(ar_2)
    => [1, 2, 3, 8, 9, 10]

Of course, the parentheses are optional:

    > ar_3 = ar_1.- ar_2
    => [1, 2, 3, 8, 9, 10]
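
Two other spellings may be worth knowing. Ruby 2.6 and later also provide a named method, Array#difference, and on any version a similar result can be obtained with reject and include?, although that variant rescans ar_2 for every element and compares with == rather than hash lookup. For the integer sample data assumed here, all three give the same answer.

```ruby
ar_1 = (1..10).to_a
ar_2 = (4..7).to_a

# Named method, available in Ruby 2.6 and later.
if ar_1.respond_to?(:difference)
  p ar_1.difference(ar_2)   # => [1, 2, 3, 8, 9, 10]
end

# Works on any Ruby version; slower for large ar_2.
p ar_1.reject { |item| ar_2.include?(item) }   # => [1, 2, 3, 8, 9, 10]

# Calling "-" by name, which shows it is an ordinary method.
p ar_1.public_send(:-, ar_2)   # => [1, 2, 3, 8, 9, 10]
```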

···

On Sun, Aug 22, 2010 at 01:00:54AM +0900, Unc88 Unc88 wrote:

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

Just adding to that: if the arrays are large and/or set operations are needed frequently, then using class Set might yield better performance.

Kind regards

  robert
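
As a rough illustration of that suggestion: when the same exclusion list is applied to many arrays, it can be converted to a Set once and reused for membership checks. The helper name `without` and the data are hypothetical, and whether this is actually faster than Array#- depends on the Ruby version and the data, so it is worth measuring.

```ruby
require 'set'

# Hypothetical helper: keep the items that are not in the excluded set.
def without(items, excluded)
  items.reject { |item| excluded.include?(item) }
end

excluded = [4, 5, 6, 7].to_set   # built once, reused below

p without((1..10).to_a, excluded)    # => [1, 2, 3, 8, 9, 10]
p without([5, 5, 9, 12], excluded)   # => [9, 12]
```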

···

On 21.08.2010 18:27, Jean-Julien Fleck wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Robert Klemme wrote:

Just adding to that: if Arrays are large and / or there are frequent set
operations needed then using class Set might yield better performance.

Kind regards

  robert

I don't quite understand what you mean. If it's not too much trouble,
could you give an example?
--
Posted via http://www.ruby-forum.com/.

It means that there are some operations that are more efficient in Set
than in Array, and that if you need a lot of those, it would be better
to use Set instead. For example, the intersection of two Sets is
faster than the intersection of two Arrays:

require 'benchmark'
require 'set'

n = 1_000

a1 = (1..10_000).map {|x| rand(100)}
a2 = (1..10_000).map {|x| rand(100)}
s1 = Set.new.merge a1
s2 = Set.new.merge a2

Benchmark.bmbm do |x|
    x.report("array minus") do
      n.times {a1 - a2}
    end
    x.report("set &") do
      n.times {s1 & s2}
    end
end

$ ruby set_bm.rb
Rehearsal -----------------------------------------------
array minus  0.900000   0.000000   0.900000 (  0.935476)
set &        0.280000   0.070000   0.350000 (  0.361684)
-------------------------------------- total: 1.250000sec

                  user     system      total        real
array minus  0.880000   0.010000   0.890000 (  0.890552)
set &        0.280000   0.070000   0.350000 (  0.353687)

Jesus.

···

On Mon, Aug 30, 2010 at 5:45 PM, Ruby Users Ruby Users <unc88@mail.ru> wrote:

Jesús Gabriel y Galán wrote:

It means that there are some operations that are more efficient in Set
than in Array, and that if you need a lot of those, it would be better
to use Set instead. For example, the intersection of two Sets is
faster than the intersection of two Arrays:

require 'benchmark'
require 'set'

n = 1_000

a1 = (1..10_000).map {|x| rand(100)}
a2 = (1..10_000).map {|x| rand(100)}
s1 = Set.new.merge a1
s2 = Set.new.merge a2

Here's another (probably more efficient) way to write that:

a1 = Array.new(10_000) { rand(100) }
a2 = Array.new(10_000) { rand(100) }

s1 = a1.to_set
s2 = a2.to_set

It would probably be better to apply #uniq! to those Arrays (or do "a1
= s1.to_a" and "a2 = s2.to_a" after set creation) to get collections
with identical sizes.

Benchmark.bmbm do |x|
    x.report("array minus") do
      n.times {a1 - a2}
    end
    x.report("set &") do
      n.times {s1 & s2}
    end
end

I'm sorry, but you are comparing apples and oranges here:

irb(main):001:0> a=[1,2,3]; b=[2,3,4]
=> [2, 3, 4]
irb(main):002:0> a & b
=> [2, 3]
irb(main):003:0> a.to_set & b.to_set
=> #<Set: {2, 3}>
irb(main):004:0> a - b
=> [1]
irb(main):005:0> a.to_set - b.to_set
=> #<Set: {1}>

The operators - and & do not do the same thing. But each of them behaves
identically for Array and Set!
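
One way to check that claim is to compare the results directly. The sketch below converts the Array results to Sets before comparing, so that ordering and duplicates do not matter; the sample arrays are the ones from the irb session.

```ruby
require 'set'

a = [1, 2, 3]
b = [2, 3, 4]

# Difference and intersection give different answers...
p a - b   # => [1]
p a & b   # => [2, 3]

# ...but each operator agrees between Array and Set.
same_difference   = (a - b).to_set == (a.to_set - b.to_set)
same_intersection = (a & b).to_set == (a.to_set & b.to_set)

p same_difference     # => true
p same_intersection   # => true
```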

Kind regards

robert

···

2010/8/30 Jesús Gabriel y Galán <jgabrielygalan@gmail.com>:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Yes, in hindsight it would have been better to build two arrays,
for example (1..5000).to_a and (3000..8000).to_a, and shuffle them.

I totally brainfarted! Here is the revised version, with results that
surprised me at least: Set#- is less efficient than Array#- (unless
I'm doing something wrong again):

require 'benchmark'
require 'set'

n = 1_000

a1 = (1..5_000).sort_by { rand }
a2 = (3_000..8_000).sort_by { rand }
s1 = a1.to_set
s2 = a2.to_set

Benchmark.bmbm do |x|
    x.report("array minus") do
      n.times {a1 - a2}
    end
    x.report("set minus") do
      n.times {s1 - s2}
    end
end

$ ruby set_bm.rb
Rehearsal -----------------------------------------------
array minus  1.370000   0.010000   1.380000 (  1.398643)
set minus   10.880000   3.060000  13.940000 ( 14.100127)
------------------------------------- total: 15.320000sec

                  user     system      total        real
array minus  1.410000   0.010000   1.420000 (  1.428664)
set minus   10.990000   3.070000  14.060000 ( 14.188415)

Could it be because Array is implemented in C, while Set is implemented
in Ruby and iterates over an Enumerable object? Did I do something wrong again?

Jesus.

···

On Tue, Aug 31, 2010 at 2:34 PM, Robert Klemme <shortcutter@googlemail.com> wrote:

Jesús Gabriel y Galán wrote:

Could it be because Array is written in C, while Set is in Ruby
iterating over an Enumerable object?

Probably. It could also be that your collections were not large enough
to show the advantage of Set, or that the value distribution is
unfavorable for Set.

Please keep in mind that performance is not the only advantage of
using Set: it also guarantees that each value appears only once,
and using it helps document that requirement.
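
A brief sketch of those semantics: a Set silently ignores repeated values, and Set#add? reports whether a value was new. The variable name `seen_ids` is only an example.

```ruby
require 'set'

seen_ids = Set.new([1, 2, 2, 3, 3, 3])
p seen_ids.size   # => 3
p seen_ids.to_a   # => [1, 2, 3]

# add? returns nil when the value is already present, and the set otherwise.
p seen_ids.add?(3).nil?   # => true
p seen_ids.add?(4).nil?   # => false
p seen_ids.size           # => 4
```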

Did I do something wrong again?

Not as far as I can see. I changed it a bit to look at different sizes:

require 'benchmark'
require 'set'

n = 100

Benchmark.bmbm 30 do |x|
  size = 1000

  while size < 100_000

    a1 = (1..size).sort_by { rand }
    a2 = ((size / 2)..(size / 2 + size)).sort_by { rand }
    s1 = a1.to_set
    s2 = a2.to_set

    x.report("array minus #{size}") do
      n.times {a1 - a2}
    end

    x.report("set minus #{size}") do
      n.times {s1 - s2}
    end

    x.report("array & #{size}") do
      n.times {a1 & a2}
    end

    x.report("set & #{size}") do
      n.times {s1 & s2}
    end

    size *= 2
  end
end

Kind regards

robert

···

2010/8/31 Jesús Gabriel y Galán <jgabrielygalan@gmail.com>:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/