Is this a sensible implementation for Array#group_by?

I was looking for a method to split an array into smaller arrays based
on some property of the members (set with a code block). I came up with
the following and wanted to know what you guys think about a) whether
it's sensible b) whether it's been done already:

class Array
  def group_by
    result_array = []
    self.each do |element|
      key = yield element
      if (found = result_array.assoc(key))
        found[1] << element
      else
        result_array << [key, [element]]
      end
    end
    return result_array.collect{|a| a[1]}
  end
end

arr = %w(apple banana pear plum nectarine orange melon)

=> ["apple", "banana", "pear", "plum", "nectarine", "orange", "melon"]
#group by length of name

arr.group_by{|s| s.length}

=> [["apple", "melon"], ["banana", "orange"], ["pear", "plum"],
["nectarine"]]
#group by whether it has an e

arr.group_by{|s| (s =~ /e/) == nil}

=> [["apple", "pear", "nectarine", "orange", "melon"], ["banana",
"plum"]]

One thing I haven't done is make it sort, since the grouping key is just
an object and most objects aren't sortable. I could get round this, but
not without slowing it down, and the user can always sort the results by
comparing the first member of each subarray.

I'm just after some feedback really. I'm guessing I just couldn't find
the good implementation of it :)

--
Posted via http://www.ruby-forum.com/.

It seems sensible to me, although I'd use a hash instead of an
associative array - it just looks cleaner. I didn't check the
performance difference though.

class Array
  def group_by
    result = {}
    self.each do |element|
      (result[yield(element)] ||= []) << element
    end
    return result.values
  end
end
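
Since the performance difference wasn't checked, here is a rough sketch of how the two approaches could be compared side by side. The helper names (group_with_assoc, group_with_hash) and the benchmark data are made up for illustration, and the timings will vary by machine and Ruby version, so treat them as indicative only.

```ruby
require 'benchmark'

# Hypothetical helper names for illustration; neither appears in the thread.

# Array-of-pairs version: assoc scans the pairs linearly and compares with ==.
def group_with_assoc(items)
  pairs = []
  items.each do |item|
    key = yield(item)
    if (found = pairs.assoc(key))
      found[1] << item
    else
      pairs << [key, [item]]
    end
  end
  pairs.collect { |pair| pair[1] }
end

# Hash version: one hash lookup per element, keys compared with hash/eql?.
def group_with_hash(items)
  groups = {}
  items.each { |item| (groups[yield(item)] ||= []) << item }
  groups.values
end

# Made-up data: 5000 integers falling into 500 groups.
numbers = (1..5000).to_a
assoc_time = Benchmark.realtime { group_with_assoc(numbers) { |n| n % 500 } }
hash_time  = Benchmark.realtime { group_with_hash(numbers)  { |n| n % 500 } }
puts "assoc: #{assoc_time}s, hash: #{hash_time}s"
```

Two caveats: the hash version only keeps first-seen group order on Ruby 1.9 and later (hash order is undefined on 1.8), and because assoc compares with == while a hash uses eql?, keys such as 1 and 1.0 end up in the same group with assoc but in different groups with a hash.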

I think the Facets library already has a similar method.
-Adam

On Mon, Sep 8, 2008 at 10:04 AM, Max Williams <toastkid.williams@gmail.com> wrote:

I was looking for a method to split an array into smaller arrays based
on some property of the members (set with a code block). I came up with
the following and wanted to know what you guys think about a) whether
it's sensible b) whether it's been done already:

class Array
def group_by
   result_array = []
   self.each do |element|
     key = yield element
     if (found = result_array.assoc(key))
       found[1] << element
     else
       result_array << [key, [element]]
     end
   end
   return result_array.collect{|a| a[1]}
end
end

I'm just after some feedback really. I'm guessing i just couldn't find
the good implementation of it :)

Adam Shelly wrote:

It seems sensible to me, although I'd use a hash instead of a
associative array - it just looks cleaner. I didn't check the
performance difference though.

-Adam

It's also in Ruby 1.8.7 Enumerable.
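
A minimal sketch of what that looks like, assuming Ruby 1.8.7 or later. The built-in returns a Hash of key => members, so calling values on it gives the array-of-arrays form asked for in the original post. The helper name grouped_values is hypothetical.

```ruby
# Requires Ruby 1.8.7 or later, where Enumerable#group_by is built in.

# Hypothetical helper: drop the keys and keep only the groups.
def grouped_values(items, &block)
  items.group_by(&block).values
end

fruits = %w(apple banana pear plum nectarine orange melon)

# The built-in returns a Hash mapping each key to its members.
p fruits.group_by { |s| s.length }

# Only the groups, in first-seen key order on Ruby 1.9 and later.
p grouped_values(fruits) { |s| s.length }
```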

Another possibility: use a set.

require 'set'
fruits = %w(apple banana pear plum nectarine orange melon).to_set
p fruits.classify{|s| s.length}
p fruits.classify{|s| s.include?("e")}
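
For reference, Set#classify returns a Hash whose values are Sets rather than Arrays, so one more step is needed to get plain arrays back. A small sketch, with classify_to_arrays as a hypothetical helper name:

```ruby
require 'set'

fruits = %w(apple banana pear plum nectarine orange melon)

# classify returns a Hash of key => Set.
classified = fruits.to_set.classify { |s| s.length }
classified.each do |length, group|
  puts "#{length}: #{group.to_a.inspect}"
end

# Hypothetical helper: turn the classified Sets back into plain Arrays.
def classify_to_arrays(items, &block)
  items.to_set.classify(&block).values.map { |group| group.to_a }
end

p classify_to_arrays(fruits) { |s| s.include?("e") }
```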

regards,

Siep

Adam Shelly wrote:

It seems sensible to me, although I'd use a hash instead of a
associative array - it just looks cleaner. I didn't check the
performance difference though.

I thought about a hash first, but for some reason shied away from a hash
where the keys could be any object, including nil. There's no reason to
be afraid of that though, is there? I think a hash would probably be
faster.
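
A quick sketch suggesting there is nothing to fear: nil and most other objects work as hash keys, as long as hash and eql? behave consistently for them. The helper name group_values is hypothetical; the key here is the match position returned by =~, which is nil when there is no match.

```ruby
# Hypothetical helper: group into a Hash, keeping the keys.
def group_values(items)
  groups = {}
  items.each { |item| (groups[yield(item)] ||= []) << item }
  groups
end

fruits = %w(apple banana pear plum nectarine orange melon)

# =~ returns the index of the first match, or nil when there is none,
# so nil becomes an ordinary key alongside the integers.
by_position = group_values(fruits) { |s| s =~ /e/ }

p by_position[nil]  # ["banana", "plum"]
p by_position[1]    # ["pear", "nectarine", "melon"]
```

The one thing to watch for is a key that is mutated after insertion (an Array used as a key, say), since the hash will no longer find it without a rehash.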

Indeed, Facets does have an Enumerable#group_by. And it has an
Enumerable#cluster_by as well. And the latter is the one you're
looking for, because you want an Array and not a Hash.

Group_by uses each, because it's faster than inject.

gegroet,
Erik V. - http://www.erikveen.dds.nl/

----------------------------------------------------------------

module Enumerable
  def group_by
    res = {}
    each{|e| (res[yield(e)] ||= []) << e}
    res
  end

  def cluster_by(&block)
    #group_by(&block).values # In case of unsortable keys.
    group_by(&block).sort.transpose.pop || []
  end
end

----------------------------------------------------------------

a = %w(apple banana pear plum nectarine orange melon)

a.group_by{|e| e.length}
# ==> {5=>["apple", "melon"], 6=>["banana", "orange"],
#      9=>["nectarine"], 4=>["pear", "plum"]}

a.cluster_by{|e| e.length}
# ==> [["pear", "plum"], ["apple", "melon"],
#      ["banana", "orange"], ["nectarine"]]

----------------------------------------------------------------
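
For comparison with the remark about each versus inject, an inject-based version might look roughly like the sketch below. The method name group_by_inject is hypothetical, chosen so it doesn't clobber the group_by above or the built-in one. The block has to hand the accumulator back on every iteration, which is extra work that the each version avoids; no timings are claimed here.

```ruby
module Enumerable
  # Hypothetical name, so that any existing group_by is left alone.
  def group_by_inject
    inject({}) do |res, e|
      (res[yield(e)] ||= []) << e
      res  # inject needs the accumulator returned from the block each time
    end
  end
end

a = %w(apple banana pear plum nectarine orange melon)
p a.group_by_inject { |e| e.length }
```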

It's also in Ruby 1.8.7 Enumerable.

Ah... we're still on 1.8.6 round these parts. We need to upgrade,
really... I keep seeing this cool stuff.

Another possibility: use a set.

Investigates... ah yes, sets, I'd completely overlooked those. For some
reason the Set class is hard to find in the API, or at least hard for me
to find in this particular API - RDoc Documentation

Converting to_set, then calling divide, then calling to_a again can't be
very efficient though, can it?
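
A sketch of that round trip, assuming a one-argument block, in which case Set#divide groups elements whose block results are equal. The helper name divide_to_arrays is hypothetical. Apart from the extra conversions, note that going through a Set silently drops duplicate elements, which may matter more than the speed.

```ruby
require 'set'

# Hypothetical helper: the to_set / divide / to_a round trip.
def divide_to_arrays(items, &block)
  items.to_set.divide(&block).map { |group| group.to_a }
end

fruits = %w(apple banana pear plum nectarine orange melon)
p divide_to_arrays(fruits) { |s| s.length }

# Duplicates collapse: the second "pear" is lost when converting to a Set,
# so this yields a single group holding one "pear" and one "plum".
p divide_to_arrays(%w(pear pear plum)) { |s| s.length }
```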

thanks

Adam Shelly wrote:

I think the Facets library already has a similar method.

Erik Veenstra wrote:

Indeed, Facets does have an Enumerable#group_by. And it has an
Enumerable#cluster_by as well. And the latter is the one you're
looking for, because you want an Array and not a Hash.

Facets - investigates again... now that is *very* good indeed. Wow. I
had a feeling this would exist already in a better form :)

thanks a lot everyone.

If I wanted to do it myself, I'd probably do:

module Enumerable
  def group_by
    result = Hash.new {|h,k| h[k] = []}
    each {|el| result[yield(el)] << el}
    result
  end
end
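
One thing worth knowing about the default-block approach, sketched below with a hypothetical standalone helper (group_with_default) so the built-in group_by is left untouched: the returned hash keeps its default block, so merely looking up a missing key creates an empty group for it.

```ruby
# Hypothetical standalone helper using the same default-block idea.
def group_with_default(items)
  result = Hash.new { |h, k| h[k] = [] }
  items.each { |el| result[yield(el)] << el }
  result
end

fruits = %w(apple banana pear plum nectarine orange melon)
groups = group_with_default(fruits) { |s| s.length }

p groups.key?(42)  # false: no fruit has 42 letters
p groups[42]       # [] and, as a side effect, key 42 is now stored
p groups.key?(42)  # true
```

Callers who only want to read can use key? or fetch with a fallback instead of [] to avoid growing the hash.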

Kind regards

robert

2008/9/9 Max Williams <toastkid.williams@gmail.com>:

Adam Shelly wrote:

I think the Facets library already has a similar method.

Erik Veenstra wrote:

Indeed, Facets does have an Enumerable#group_by. And it has an
Enumerable#cluster_by as well. And the latter is the one you're
looking for, because you want an Array and not a Hash.

Facets - investigates again... now that is *very* good indeed. Wow. I
had a feeling this would exist already in a better form :)

--
use.inject do |as, often| as.you_can - without end