How to make combinations of an array to produce all possible expressions?

I have an array ‘conds’, which contains some sub-expressions for an
xpath query:

conds = ["@title=‘Foo’", “@edition=‘Bar’”, “@date=‘20040513’”]

Is there an existing library that lets me construct all possible
combinations like this?

puts conds..collect{|n| n.join ’ and '}

which produces:

@title=‘Foo’ and @edition=‘Bar’ and @date=‘20040513’
@title=‘Foo’ and @edition=‘Bar’
@title=‘Foo’ and @date=‘20040513’
@edition=‘Bar’ and @date=‘20040513’
@title=‘Foo’
@edition=‘Bar’
@date=‘20040513’

“Erik Terpstra” erik@terpnet.nl schrieb im Newsbeitrag
news:40a333fc$0$65124$e4fe514c@news.xs4all.nl…

I have an array ‘conds’, which contains some sub-expressions for an
xpath query:

conds = [“@title=‘Foo’”, “@edition=‘Bar’”, “@date=‘20040513’”]

Is there an existing library that lets me construct all possible
combinations like this?

puts conds..collect{|n| n.join ’ and '}

which produces:

@title=‘Foo’ and @edition=‘Bar’ and @date=‘20040513’
@title=‘Foo’ and @edition=‘Bar’
@title=‘Foo’ and @date=‘20040513’
@edition=‘Bar’ and @date=‘20040513’
@title=‘Foo’
@edition=‘Bar’
@date=‘20040513’

Not yet, but:

module Enumerable
def combine
masks = inject([, 1]){|(ar, m), e| [ar<<m, m<<1]}[0]
all = masks.inject(0){|al, m| al|m}

result = []

for i in 1..all do
  tmp = []

  each_with_index do |e, idx|
    tmp << e unless (masks[idx] & i) == 0
  end

  result << tmp
end

result

end
end

irb(main):098:0> puts conds.combine.collect{|n| n.join ’ and '}
@title=‘Foo’
@edition=‘Bar’
@title=‘Foo’ and @edition=‘Bar’
@date=‘20040513’
@title=‘Foo’ and @date=‘20040513’
@edition=‘Bar’ and @date=‘20040513’
@title=‘Foo’ and @edition=‘Bar’ and @date=‘20040513’
=> nil

robert

Erik Terpstra wrote:

I have an array ‘conds’, which contains some sub-expressions for an
xpath query:

conds = [“@title=‘Foo’”, “@edition=‘Bar’”, “@date=‘20040513’”]

There’s no method I know of, but this seems to work (note that I’ve explicitly avoided generating an empty set, because you didn’t have one in your example, but it should probably be included if we wanted to call this method “subsets” as I have) …

module Enumerable
def subsets
values =

(2 << length - 1).times do |n|
  items = []

  length.times do |i|
    if n[i] == 1
      items << self[i]
    end
  end

  if items.length > 0  # I'd omit this test for a real "subsets"
    values << items
  end
end

return values

end
end

conds = [“@title=‘Foo’”, “@edition=‘Bar’”, “@date=‘20040513’”]

conds.subsets.collect { |subset| p subset.join(" and ") }

Hi,
I have found at least to ways.
Using recursion (maybe not so efficient):

class Array
def combine
comb = Proc.new do |a, *t|
tail = (t.empty? ? : comb[t])
[[a]] + tail.collect{ |e| [a] + e} + tail
end
comb[self]
end
end

The following should perform better:

class Array
def combine2
(1…(2**(size) - 1)).collect do |i|
i <<= 1
self.select { (i >>= 1) & 1 == 1}
end
end
end

irb(main):018:0> puts conds.combine.collect{|n| n.join ’ and '}
@title=‘Foo’
@title=‘Foo’ and @edition=‘Bar’
@title=‘Foo’ and @edition=‘Bar’ and @date=‘20040513’
@title=‘Foo’ and @date=‘20040513’
@edition=‘Bar’
@edition=‘Bar’ and @date=‘20040513’
@date=‘20040513’
=> nil
irb(main):019:0> puts conds.combine2.collect{|n| n.join ’ and '}
@title=‘Foo’
@edition=‘Bar’
@title=‘Foo’ and @edition=‘Bar’
@date=‘20040513’
@title=‘Foo’ and @date=‘20040513’
@edition=‘Bar’ and @date=‘20040513’
@title=‘Foo’ and @edition=‘Bar’ and @date=‘20040513’
=> nil

Kristof

···

On Thu, 13 May 2004 10:38:19 +0200, Erik Terpstra wrote:

I have an array ‘conds’, which contains some sub-expressions for an
xpath query:

conds = [“@title=‘Foo’”, “@edition=‘Bar’”, “@date=‘20040513’”]

I have an array ‘conds’, which contains some sub-expressions for an
xpath query:

conds = [“@title=‘Foo’”, “@edition=‘Bar’”, “@date=‘20040513’”]

Something i made some weeks ago seems related to this. In my case, i
wanted to generate a number of combinations by assigning a set of
possible values to a number of ‘fields’ (you also seem to have fields
and values you want to assign to them). I don’t know whether this is
relevant to you, but it might give you some ideas…

So, the first thing i had, was a class ‘Profile’ which is merely a
placeholder for the ‘fields’. The constructor of ‘Profile’ takes a
Hash as parameter. This Hash is used to assign a certain value to
certain fields (the fields are just instance variables), fields not
included as key in the Hash recieve a default value.

I needed to build a lot of these Profiles and my idea was to define a
number of ‘constraints’ on certain fields, like this:

constraints = {“size” => [22,23,24,25],
“tidy” => [true,false],
“new” => [true],
“freq” => [[2,2],[2,4],[2,8]] }

With those constraints, i want to generate Profiles initialized with
the following Hashes:

{“size” => 22, “tidy” => true, “new” => true, “freq” => [2,2]}
{“size” => 22, “tidy” => true, “new” => true, “freq” => [2,4]}
{“size” => 22, “tidy” => true, “new” => true, “freq” => [2,8]}
{“size” => 22, “tidy” => false, “new” => true, “freq” => [2,2]}
{“size” => 22, “tidy” => false, “new” => true, “freq” => [2,4]}
{“size” => 22, “tidy” => false, “new” => true, “freq” => [2,8]}
{“size” => 23, “tidy” => true, “new” => true, “freq” => [2,2]}
{“size” => 23, “tidy” => true, “new” => true, “freq” => [2,4]}
{“size” => 23, “tidy” => true, “new” => true, “freq” => [2,8]}
{“size” => 23, “tidy” => false, “new” => true, “freq” => [2,2]}
{“size” => 23, “tidy” => false, “new” => true, “freq” => [2,4]}
{“size” => 23, “tidy” => false, “new” => true, “freq” => [2,8]}

To do this, i made the following ‘GeneratingHash’. You construct it
with a constraints-Hash, and then you can use it as an iterator with
the ‘each’ method. You could use the same idea to generate
combinations of Arrays. A good thing about this way of working is that
you can use it to generate one Hash at a time, do stuff, forget about
the first Hash, generate the next one, do stuff,… might be important
for memory usage in case you have lots of combinations. (and yeah, i
know, it is recursive and will already create x Hashes with ‘x’ the
number of keys you use…)

class GeneratingHash

attr_accessor :constraints

def initialize(constraints)
@constraints = constraints
end

def recursive_each(keys,hash,&block)
if keys.empty?
yield hash
else
key = keys.pop
@constraints[key].each { |val|
my_hash = hash.clone
my_hash[key]=val
my_keys = keys.clone
recursive_each(my_keys,my_hash,&block)
}
end
end

def each(&block)
mykeys = @constraints.keys
recursive_each(mykeys,Hash.new(),&block)
end

end

For those who are interested, i also included the rest of the
(relevant) code. (this is used in a program to perform a number of
benchmarks based on a lot of different ‘setups’(Profiles))
Comments on the code are welcome :slight_smile:

Ruben

···

==============================

class Profile

attr_accessor :size, :tidy, :new, :lo_freq, :hi_freq

def initialize(values)
@size = 22 unless @size = values[“size”]
@tidy = false unless @tidy = values[“tidy”]
@new = false unless @new = values[“new”]
if values[“freq”]
@lo_freq = values[“freq”][0]
@hi_freq = values[“freq”][1]
else
@lo_freq = @hi_freq = 0
end
end

end

class ProfileCollection

attr_accessor :constraints

def initialize(constraints)
@constraints = constraints
end

def each(&block)
gh = GeneratingHash.new(@constraints)
gh.each { |hash|
prof = Profile.new(hash)
yield prof
}
end

end

=========================================

and those classes are used like this:

constraints = {“size” => [22,23,24,25],
“tidy” => [true,false],
“new” => [true],
“freq” => [[2,2],[2,4],[2,8]] }
prof_collection = ProfileCollection.new(constraints)
prof_collection.each { |profile|

do stuff with the profile

}

“Harry Ohlsen” harryo@qiqsolutions.com schrieb im Newsbeitrag
news:40A349EB.8040002@qiqsolutions.com

Erik Terpstra wrote:

I have an array ‘conds’, which contains some sub-expressions for an
xpath query:

conds = [“@title=‘Foo’”, “@edition=‘Bar’”, “@date=‘20040513’”]

There’s no method I know of, but this seems to work (note that I’ve
explicitly avoided generating an empty set, because you didn’t have one in
your example, but it should probably be included if we wanted to call this
method “subsets” as I have) …

module Enumerable
def subsets
values =

(2 << length - 1).times do |n|
  items = []

  length.times do |i|
    if n[i] == 1
      items << self[i]
    end
  end

  if items.length > 0  # I'd omit this test for a real "subsets"
    values << items
  end
end

return values

end
end

Ah, similar idea but nicer coding. I like especially your calculation of
the counting range and int[idx] as bit test. I didn’t know that one.

Btw, you don’t need the test for length 0 if you do

for n in 1 … (2<<length) do

Regards

robert

Robert Klemme wrote:

Ah, similar idea but nicer coding.

Coming from you, that’s a much appreciated compliment!

I like especially your calculation of
the counting range and int[idx] as bit test. I didn’t know that one.

I discovered during a re-browse of pickaxe a couple of months ago, but this is the first time I’ve actually had a chance to use it.

Btw, you don’t need the test for length 0 if you do

for n in 1 … (2<<length) do

Good point.

Something I was thinking about was that the power set (set of subsets) becomes large very quickly (obvious from the “2 << length”, I guess). It might be nice to have an iterator, too. The method that returns the collection of subsets could then just use it.

Something like this (I’ve removed the empty set test, since I think this is cleaner in a mathematical sense) …

module Enumerable
def each_subset(&block)
(2 << length - 1).times do |n|
subset =

  length.times do |i|
    if n[i] == 1
      subset << self[i]
    end
  end

  yield subset
end

end

def subsets
subsets =

each_subset { |s| subsets << s }

return subsets

end
end

if FILE == $0

conds = [“@title=‘Foo’”, “@edition=‘Bar’”, “@date=‘20040513’”]

puts “Using iterator …”
puts

conds.each_subset { |subset| p subset.join(" and ") }

puts
puts “Using aggregate …”
puts

conds.subsets.collect { |subset| p subset.join(" and ") }

end

“Harry Ohlsen” harryo@qiqsolutions.com schrieb im Newsbeitrag
news:40A41302.9090307@qiqsolutions.com

Robert Klemme wrote:

Ah, similar idea but nicer coding.

Coming from you, that’s a much appreciated compliment!

You’re welcome! :slight_smile:

I like especially your calculation of
the counting range and int[idx] as bit test. I didn’t know that one.

I discovered during a re-browse of pickaxe a couple of months ago, but
this is the first time I’ve actually had a chance to use it.

Btw, you don’t need the test for length 0 if you do

for n in 1 … (2<<length) do

Good point.

Something I was thinking about was that the power set (set of subsets)
becomes large very quickly (obvious from the “2 << length”, I guess). It
might be nice to have an iterator, too. The method that returns the
collection of subsets could then just use it.

Another good point!

Something like this (I’ve removed the empty set test, since I think this
is cleaner in a mathematical sense) …

That might be handled by an additional flag with a default value (which
one, the mathematical or the practical?). Practically you often don’t
need / want the empty set.

Funny to see how this evolves. My current vesion looks like this:

module Enumerable
def each_subset(skip_empty = false)
for n in (skip_empty ? 1 : 0) … (1 << size) do
subset =

  each_with_index do |elem, i|
    subset << elem if n[i]
  end

  yield subset
end

self

end

def powerset(skip_empty = false)
subsets =

each_subset(skip_empty) { |s| subsets << s }

return subsets

end
end

Changes / Improvements:

  • self[index] is not used since it is not available with all Enumerables
  • flag ‘skip_empty’ added
  • self returned from iterator
  • renamed #subsets to #powerset (this is the mathematical correct term,
    isn’t it)
  • changed (2 << length - 1) to (1 << length)

Here’s another experimental version that works even if #size is not
supported. This really needs only #each. Alternatively one could do an
initial iteration to calculate the size - that saves the intermediate
array.

module Enumerable
def each_subset(skip_empty = false)
enum = respond_to?( :size ) ? self : to_a

for n in (skip_empty ? 1 : 0) ... (1 << enum.size) do
  subset = []

  enum.each_with_index do |elem, i|
    subset << elem if n[i] == 1
  end

  yield subset
end

self

end

def powerset(skip_empty = false)
subsets =

each_subset(skip_empty) { |s| subsets << s }

return subsets

end
end

test class

class Test
include Enumerable

def initialize(n); @n=n; end
def each; @n.times {|c| yield c}; self; end
end

p Test.new(3).powerset

Kind regards

robert

module Enumerable
def each_subset(skip_empty = false)
enum = respond_to?( :size ) ? self : to_a

for n in (skip_empty ? 1 : 0) ... (1 << enum.size) do
  subset = []

  enum.each_with_index do |elem, i|
    subset << elem if n[i] == 1
  end

  yield subset
end

self

end

def powerset(skip_empty = false)
subsets =

each_subset(skip_empty) { |s| subsets << s }

return subsets

end

irb(main):013:0> def powerset( x )
irb(main):014:1> x.inject([]) {|m, n| m.map {|b| [n] + b} + m }
irb(main):015:1> end
irb(main):017:0> a = [1,2,3]
=> [1, 2, 3]
irb(main):018:0> powerset a
=> [[3, 2, 1], [3, 2], [3, 1], [3], [2, 1], [2], [1], ]
irb(main):019:0>

Only works for arrays, but it might be possible to generalize it?

Robert Klemme wrote:

Funny to see how this evolves. My current vesion looks like this:

module Enumerable
def each_subset(skip_empty = false)
for n in (skip_empty ? 1 : 0) … (1 << size) do
subset =

  each_with_index do |elem, i|
    subset << elem if n[i]
  end

  yield subset
end

self

end

def powerset(skip_empty = false)
subsets =

each_subset(skip_empty) { |s| subsets << s }

return subsets

end
end

The first version in your latest e-mail didn’t work for me. It just gave me the complete array every time.

I think it’s because when n[i] is zero, that’s not the same as false, hence the “if” always succeeds.

You just need to change that line to

    subset << elem if (n[i] == 1)

The second version worked fine.

I like all of your clean-ups. I wonder whether re-coding in C would increase performance?

Anyway, it might be worth creating an RCR to have this added to Enumerable.

Cheers,

Harry O.

“Harry Ohlsen” harryo@qiqsolutions.com schrieb im Newsbeitrag
news:40A8445E.5020403@qiqsolutions.com

The first version in your latest e-mail didn’t work for me. It just
gave me the complete array every time.

I think it’s because when n[i] is zero, that’s not the same as false,
hence the “if” always succeeds.

You just need to change that line to

    subset << elem if (n[i] == 1)

The second version worked fine.

Yeah, I noticed the error but apparently didn’t recheck the first version.
You’re right of course.

I like all of your clean-ups.

Thank you!

I wonder whether re-coding in C would increase performance?

Dunno. I didn’t write a ruby extension yet but I heard it’s pretty easy.
:slight_smile:

Anyway, it might be worth creating an RCR to have this added to
Enumerable.

Maybe. The crucial point is, is this general enough to place it there?
Well, the vote might show it.

Regards

robert