Detecting duplicates in an array, anything in the standard library?

Jimmy Kofler wrote:

Jeremy Woertink wrote:
I actually had to ... find all the duplicate account
numbers and the number of times they were duplicated and ... .
...
~Jeremy

A much less verbose 'nil' fix of the original version would be to use
[v] instead of v:

a = [nil,1,2,2,3,nil]
p a.uniq.map {|v| (a - [v]).size < (a.size - 1) ? [v] :
nil}.compact.flatten
=> [nil, 2]

This fix does not work for a = [nil,1,2,[7],2,[7],3,nil], because the final
flatten also unwraps the nested [7], but the previous version using
"(a.size - a.nitems > 1) ? ..." does. Ruby 1.9, though, is said to introduce
an Array#flatten that takes a depth argument:

# Ruby 1.9
a = [nil,1,[7],2,2,[7],3,nil]
p a.uniq.map {|v| (a - [v]).size < (a.size - 1) ? [v] :
nil}.compact.flatten(1)
=> [nil, [7], 2]
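
For comparison, here is a minimal sketch that skips the wrap-and-flatten step altogether. It assumes a Ruby with Enumerable#group_by (1.8.7 or later) and insertion-ordered hashes (1.9 or later); the method name dups is only illustrative and does not come from the thread. Because the elements are never wrapped or flattened, nil and nested arrays come through unchanged:

```ruby
# Illustrative helper (hypothetical name, not from the thread).
# Groups equal elements together and keeps the values seen more than once.
def dups(array)
  array.group_by { |v| v }                     # equal elements share a group
       .select { |_, group| group.size > 1 }   # keep groups with repeats
       .map { |value, _| value }               # return just the repeated values
end

p dups([nil, 1, [7], 2, 2, [7], 3, nil])  # => [nil, [7], 2]
```

The result is ordered by first occurrence, which matches the Ruby 1.9 output shown above.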

Cheers,

j.k.

--
Posted via http://www.ruby-forum.com/.

Hi --

On Sun, 19 Aug 2007, Wolfgang Nádasi-Donner wrote:

David A. Black wrote:

Hi --

On Sun, 19 Aug 2007, Wolfgang Nádasi-Donner wrote:

: [r[0], r[1][1..-1]]}[0].uniq # => ["c"]

How about:

  >> a = [1,2,3,4,5,4,2,2]
   => [1, 2, 3, 4, 5, 4, 2, 2]
  >> a.inject([]) {|acc,e| acc << e unless acc.include?(e); acc }
   => [1, 2, 3, 4, 5]

David

The problem is that he wants all non-unique elements. Unfortunately, the
difference of two arrays doesn't care about double elements,

Sorry, just ignore me. I've reinvented Array#uniq :) /me reaches for
coffee....

David

--
* Books:
   RAILS ROUTING (new! http://www.awprofessional.com/title/0321509242)
   RUBY FOR RAILS (http://www.manning.com/black)
* Ruby/Rails training
     & consulting: Ruby Power and Light, LLC (http://www.rubypal.com)

It does, but as far as I can see OP wanted exactly the duplicates back.

Cheers

robert

2007/8/20, thomas.macklin@gmail.com <thomas.macklin@gmail.com>:

On Aug 19, 5:16 pm, Robert Klemme <shortcut...@googlemail.com> wrote:
> On 19.08.2007 23:15, Robert Klemme wrote:
>
>
>
> > On 19.08.2007 12:38, Thibaut Barrère wrote:
> >> Hi!
>
> >> Just wondering if there is something simple already built in the std
> >> library to remove duplicates from an array (or an enumerable). I've
> >> seen and used various approaches, like:
>
> >> module Enumerable
> >> def dups
> >> inject({}) {|h,v| h[v]=h[v].to_i+1; h}.reject{|k,v| v==1}.keys
> >> end
> >> end
>
> >> which will give:
>
> >>> %w(a b c c).dups
> >> => ["c"]
>
> > Actually you are not deleting duplicates as far as I can see.
>
> Did I say it's too late? Man, I should've worn my glasses...
>
> > Here's another one
>
> > irb(main):012:0> a.inject(Hash.new(0)) {|h,x|
> > h[x]+=1;h}.inject([]){|h,(k,v)|h<<k if v>1;h}
> > => ["c"]
>
> > You could even change that to need just one iteration through the
> > original array but it's too late and I'm too lazy. :)
>
> Cheers
>
> robert
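
The single-iteration variant mentioned above was never posted. One possible sketch, assuming Ruby 1.9 or later for each_with_object and using a made-up method name, counts occurrences and collects the duplicates in the same pass:

```ruby
# Hypothetical helper name; counts occurrences and collects duplicates
# in one pass over the input.
def dups_single_pass(items)
  counts = Hash.new(0)
  items.each_with_object([]) do |x, dups|
    counts[x] += 1
    # Record x the moment it is seen for the second time, so each
    # duplicated value appears exactly once in the result.
    dups << x if counts[x] == 2
  end
end

p dups_single_pass(%w(a b c c))  # => ["c"]
```

Results come out in the order in which each value is first repeated, rather than in order of first appearance.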

or...

require 'set'

new_ary = ary.to_set.to_a #set strips dups.

how about calling the uniq method:

[1,2,2,3].uniq

or did I miss the point again? ;)

On 20 Aug 2007, at 13:45, thomas.macklin@gmail.com wrote:

On Aug 19, 5:16 pm, Robert Klemme <shortcut...@googlemail.com> wrote:

On 19.08.2007 23:15, Robert Klemme wrote:

On 19.08.2007 12:38, Thibaut Barrère wrote:

Hi!

Just wondering if there is something simple already built in the std
library to remove duplicates from an array (or an enumerable).

# irb(main):015:0> a.select{|e| (a-[e]).size < a.size - 1}.uniq
# => [1, 2]

oops,

irb(main):014:0> a
=> [1, 1, 2, 2, 2, 4, 3]
irb(main):015:0> a.uniq.select{|e| (a-[e]).size < a.size - 1}
=> [1, 2]

From: Peña, Botp [mailto:botp@delmonte-phil.com]

Nice! But I'd think this is more efficient:

irb(main):001:0> a = [1, 1, 2, 2, 2, 4, 3]
=> [1, 1, 2, 2, 2, 4, 3]
irb(main):002:0> a.uniq.select{|e| (a-[e]).size < a.size - 1}
=> [1, 2]
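
The (a-[e]).size test works because Array#- removes every element equal to e, so the size drops by more than one exactly when e occurs more than once. The same check can be sketched with Array#count (available since Ruby 1.8.7), which states the intent more directly. The method name below is hypothetical. Both forms rescan the array for each distinct element, so neither is cheap on large inputs:

```ruby
# Hypothetical helper: selects each distinct value that occurs more than once.
def repeated_values(a)
  a.uniq.select { |e| a.count(e) > 1 }
end

p repeated_values([1, 1, 2, 2, 2, 4, 3])  # => [1, 2]
```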

Kind regards

robert

2007/8/21, Peña, Botp <botp@delmonte-phil.com>:

From: Jimmy Kofler [mailto:koflerjim@mailinator.com]
# uniq.map {|v| (self - [v]).size < (self.size - 1) ? v : nil}.compact

cool.
could we simplify it like,

irb(main):014:0> a
=> [1, 1, 2, 2, 2, 4, 3]
irb(main):015:0> a.select{|e| (a-[e]).size < a.size - 1}.uniq
=> [1, 2]

Posted by Peña, Botp (Guest) on 21.08.2007 10:31

could we simplify it like

irb(main):014:0> a
=> [1, 1, 2, 2, 2, 4, 3]
irb(main):015:0> a.uniq.select{|e| (a-[e]).size < a.size - 1}
=> [1, 2]

Sure.

ruby -e 'a = [nil,1,2,2,3,nil]' -e 'p a.uniq.select{|e| (a-[e]).size <
a.size - 1}'
=> [nil, 2]

So we do not need to fix the original version to handle nil correctly:

ruby -e 'a = [nil,1,2,2,3,nil]' -e 'p (a.size - a.nitems > 1) ? ([nil]
+ a.uniq.map {|v| (a - [v]).size < (a.size - 1) ? v : nil}.compact) :
(a.uniq.map {|v| (a - [v]).size < (a.size - 1) ? v : nil}.compact)'
=> [nil, 2]

Cheers,

j.k.

I'm a n00b, sorry if I'm poking nose in. Couldn't the op do something using &, like so:

[1,2,3] & [2,3,4] == [2,3]

?

Regards Gabe

On 20 Aug 2007, at 14:50, Robert Klemme wrote:

require 'set'

new_ary = ary.to_set.to_a #set strips dups.

It does, but as far as I can see OP wanted exactly the duplicates back.

Cheers

robert

# irb(main):002:0> a.uniq.select{|e| (a-[e]).size < a.size - 1}

compare also,

irb(main):056:0> b=a.dup
=> [1, 1, 2, 2, 2, 4, 3]
irb(main):057:0> b.uniq.select{|e| (b.reject!{|f| f == e}).size > 1}
=> [1, 2]
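
One caveat with the reject! variant: the block compares the size of what is left in b, not the number of elements just removed. It gives the right answer for this input, but on an array with no duplicates such as [1,2,3,4,5] it would still select the leading elements. A sketch that keeps the destructive approach but compares the size before and after each removal (the method name is made up for illustration):

```ruby
# Hypothetical helper: removes each distinct value from a working copy and
# reports the values whose removal shrank the copy by more than one element.
def dups_by_removal(a)
  b = a.dup
  b.uniq.select do |e|
    before = b.size
    b.reject! { |f| f == e }
    before - b.size > 1
  end
end

p dups_by_removal([1, 1, 2, 2, 2, 4, 3])  # => [1, 2]
p dups_by_removal([1, 2, 3, 4, 5])        # => []
```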

From: Robert Klemme [mailto:shortcutter@googlemail.com]

Hi --

On Tue, 21 Aug 2007, Gabriel Dragffy wrote:

On 20 Aug 2007, at 14:50, Robert Klemme wrote:

require 'set'

new_ary = ary.to_set.to_a #set strips dups.

It does, but as far as I can see OP wanted exactly the duplicates back.

Cheers

robert

I'm a n00b, sorry if I'm poking nose in. Couldn't the op do something using &, like so:

[1,2,3] & [2,3,4] == [2,3]

The original question was how to get all dups occurring in one array:

   [1,2,3,2,4,5,5,6] => [2,5]
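
For readers on a current Ruby, here is a short sketch of that exact example. It assumes Ruby 2.7 or later, since Enumerable#tally did not exist when this thread was written. It also gives the number of times each value occurs, which was part of the earlier request about account numbers:

```ruby
a = [1, 2, 3, 2, 4, 5, 5, 6]

# tally builds a hash of value => occurrence count;
# keep only the entries that occur more than once.
counts = a.tally.select { |_, n| n > 1 }

p counts.keys  # => [2, 5]
counts.each { |value, n| puts "#{value} occurs #{n} times" }
```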

David

From: Robert Klemme [mailto:shortcutter@googlemail.com]
# irb(main):002:0> a.uniq.select{|e| (a-[e]).size < a.size - 1}

compare also,

irb(main):056:0> b=a.dup
=> [1, 1, 2, 2, 2, 4, 3]
irb(main):057:0> b.uniq.select{|e| (b.reject!{|f| f == e}).size > 1}
=> [1, 2]

I came up with something vaguely similar:

class Array
   def dupes
     a = self.dup
     self.partition { |o| a.delete(o) }.last
   end
end

>> [1,2,2,3,4,4].dupes
=> [2, 4]

On Aug 21, 2007, at 01:59 , Peña, Botp wrote:

I still think it's easier just to intersect it with itself...

a = [1,2,3,2,1]
b = a & a
b # => [1, 2, 3]
---------------------------------------------------------------|
~Ari
"I don't suffer from insanity. I enjoy every minute of it" --1337est man alive

On Aug 21, 2007, at 4:59 AM, Peña, Botp wrote:

From: Robert Klemme [mailto:shortcutter@googlemail.com]
# irb(main):002:0> a.uniq.select{|e| (a-[e]).size < a.size - 1}

compare also,

irb(main):056:0> b=a.dup
=> [1, 1, 2, 2, 2, 4, 3]
irb(main):057:0> b.uniq.select{|e| (b.reject!{|f| f == e}).size > 1}
=> [1, 2]

Hi --

From: Robert Klemme [mailto:shortcutter@googlemail.com]
# irb(main):002:0> a.uniq.select{|e| (a-[e]).size < a.size - 1}

compare also,

irb(main):056:0> b=a.dup
=> [1, 1, 2, 2, 2, 4, 3]
irb(main):057:0> b.uniq.select{|e| (b.reject!{|f| f == e}).size > 1}
=> [1, 2]

I came up with something vaguely similar:

class Array
def dupes
   a = self.dup
   self.partition { |o| a.delete(o) }.last
end
end

[1,2,2,3,4,4].dupes

=> [2, 4]

You'd want to throw a .uniq on there; otherwise, non-consecutive dupes
get processed twice:

[1,2,2,3,4,4,2].dupes

=> [2, 4, 2]
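
A sketch of the partition idea with the .uniq applied. This is a variation rather than the code posted above: it tracks seen elements in a Set instead of calling Array#delete on a copy, and the method name is hypothetical. The swap is deliberate, since Array#delete returns nil when asked to delete nil, so the posted version would report a lone nil as a dupe:

```ruby
require 'set'

# Hypothetical helper, not the posted Array#dupes.
# Returns each repeated element once, in the order it is first repeated.
def dupes_once(array)
  seen = Set.new
  # Set#add? returns nil when the element is already in the set, so
  # partition's second group collects every repeat occurrence.
  array.partition { |o| seen.add?(o) }.last.uniq
end

p dupes_once([1, 2, 2, 3, 4, 4, 2])  # => [2, 4]
```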

David

On Tue, 21 Aug 2007, Ryan Davis wrote:

On Aug 21, 2007, at 01:59 , Peña, Botp wrote:

...but that's not what the OP wanted. What you've written is the same
as the #uniq method.

Don't feel bad, this thread has been filled with people answering the
wrong question. :P The original question was roughly "How do I find
out all the elements in the array that are duplicates?"

Solutions to that question would not include '3' in the above results.
It's unclear to me if %w| a b b b | should include 'b' once or twice
in the output, though, and the original poster has not clarified that,
that I can see.
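
To make the two readings concrete, here is a small sketch of both, assuming Ruby 1.9 or later. The variable names are illustrative only:

```ruby
words = %w[a b b b]
groups = words.group_by { |w| w }

# Reading 1: each repeated value is reported once.
once = groups.select { |_, g| g.size > 1 }.map { |w, _| w }

# Reading 2: every occurrence after the first is reported.
extras = groups.map { |_, g| g.drop(1) }.flatten(1)

p once    # => ["b"]
p extras  # => ["b", "b"]
```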

On Aug 21, 10:04 am, Ari Brown <a...@aribrown.com> wrote:

On Aug 21, 2007, at 4:59 AM, Peña, Botp wrote:

> From: Robert Klemme [mailto:shortcut...@googlemail.com]
> # irb(main):002:0> a.uniq.select{|e| (a-[e]).size < a.size - 1}

> compare also,

> irb(main):056:0> b=a.dup
> => [1, 1, 2, 2, 2, 4, 3]
> irb(main):057:0> b.uniq.select{|e| (b.reject!{|f| f == e}).size > 1}
> => [1, 2]

I still think it's easier just to intersect it with itself...

a = [1,2,3,2,1]
b = a & a
b # => [1, 2, 3]

Hi --

On Wed, 22 Aug 2007, Phrogz wrote:

On Aug 21, 10:04 am, Ari Brown <a...@aribrown.com> wrote:

On Aug 21, 2007, at 4:59 AM, Peña, Botp wrote:

From: Robert Klemme [mailto:shortcut...@googlemail.com]
# irb(main):002:0> a.uniq.select{|e| (a-[e]).size < a.size - 1}

compare also,

irb(main):056:0> b=a.dup
=> [1, 1, 2, 2, 2, 4, 3]
irb(main):057:0> b.uniq.select{|e| (b.reject!{|f| f == e}).size > 1}
=> [1, 2]

I still think it's easier just to intersect it with itself...

a = [1,2,3,2,1]
b = a & a
b # => [1, 2, 3]

...but that's not what the OP wanted. What you've written is the same
as the #uniq method.

Don't feel bad, this thread has been filled with people answering the
wrong question. :P The original question was roughly "How do I find
out all the elements in the array that are duplicates?"

Solutions to that question would not include '3' in the above results.
It's unclear to me if %w| a b b b | should include 'b' once or twice
in the output, though, and the original poster has not clarified that,
that I can see.

I think once, since it's just the quality of being non-unique in
the array that qualifies an object for inclusion. At least, that's my
understanding, though as one of the people who reimplemented
Array#uniq, I may not be the right person to listen to :)

David
