Coding challenge (on Ruby Garden)

No, I think it can.

David

···

On Tue, 6 Aug 2002, Brian D. Baker wrote:

Is it legal to assume that the first array contains no duplicate items?


David Alan Black
home: dblack@candle.superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav

damn! i went to measure the speed on that one liner, but alas i run on
stable.

what’s inject do?

also, it would be nice to have a brief description of the main idea
behind the solution. (all of the solutions, actually)

~transami

···

On Tue, 2002-08-06 at 05:41, George Ogata wrote:

David Alan Black wrote:

Hi –

On Tue, 6 Aug 2002, Avi Bryant wrote:

David Alan Black dblack@candle.superlink.net wrote in message
news:Pine.LNX.4.30.0208052056270.15880-100000@candle.superlink.net

Hello –

Coding Challenge 0002 (my numbering scheme :-) has been posted to
http://www.rubygarden.org/ruby?CodingChallenge0002.

So, what are the criteria for the challenge, apart from passing the
tests? This obviously isn’t golf, but the stats for my solution are:

bytes: 139
lines: 3
expressions: 1
assignments: 0

The criteria are passing the tests, and the mysterious “meeting the
requirements” (which is a way of saying I doubt that the tests are
comprehensive enough). I think you get the honorary golf trophy,
though :-)

David

I think Avi’s doesn’t quite meet the requirements. This should be true,
shouldn’t it?:

one_in_each([1,2,3],[[1,2],[1,2],[1,3]])

I have a 118 char solution (including the “def…end”!) which I think meets
all requirements. Anyone care to shorten it?

http://www.rubygarden.org/ruby?G00002


~transami

algorithms with order N vs order N^2: 10x the number of items means
1/10 vs 1/100 the speed…

···

On Tuesday 06 August 2002 11:00 am, Bill Kelly wrote:

. . . Anyway, I’d be interested in anyone’s thoughts on what’s
going on here… (Hope I didn’t botch the tests somehow -
but I don’t think so!)


Ned Konz
http://bike-nomad.com
GPG key ID: BEEA7EFE

Ah, nice catch. The only way I can think to fix that would be to use
reject{|e|e.id==s.id} as you do, instead of rows-[s]. Along with
Pit’s suggestion, and adding a != nil, that would make our solutions
identical - so I think I’ll just leave mine in its (broken) state.
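
To make the difference concrete, here is a minimal sketch (not from the thread; `equal?` stands in for the 1.6-era `.id` comparison quoted above):

# Illustration (mine): why `rows - [s]` can drop too much.  Array#- removes
# *every* element equal to s, so duplicate rows disappear together, while
# rejecting by object identity removes only the one row just used.
rows = [[1, 2], [1, 2], [1, 3]]
s = rows[0]

p rows - [s]                       # => [[1, 3]]          -- both [1, 2] rows gone
p rows.reject { |e| e.equal?(s) }  # => [[1, 2], [1, 3]]  -- only the first [1, 2] gone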

···

George Ogata g_ogata@optushome.com.au wrote:

I think Avi’s doesn’t quite meet the requirements. This should be true,
shouldn’t it?:

one_in_each([1,2,3],[[1,2],[1,2],[1,3]])

bill can you tell where mine is failing? it passes david’s tests plus
the one avi’s missed --which your test doesn’t seem to catch. would like
to fix mine though if there’s a bug.

these times are very interesting though. how did david’s gain so much
speed? on smaller probs it was the slowest. very interesting. just goes
to show…

~transami

···

On Tue, 2002-08-06 at 12:00, Bill Kelly wrote:

(ruby 1.6.6 (2001-12-26) [i586-mswin32])

dblack (1000 iterations) 19.188 (0.019188 per iter) [failed test]

transami (1000 iterations) 20.469 (0.020469 per iter) [failed test]

dnaseby (1000 iterations) 61.609 (0.061609 per iter)

george (10 iterations) 40.638 (4.0638 per iter)

avi (10 iterations) 68.238 (6.8238 per iter)

bwk (10 iterations) 78.503 (7.8503 per iter)

“Bill Kelly” billk@cts.com wrote in

class Array # from http://www.rubygarden.org/ruby?OneLiners
def shuffle!
each_index {|j| i = rand(size-j); self[j], self[j+i] = self[j+i], self[j]}
end
end

Note that this one liner does not create an even distribution
of shuffles.

. . . Anyway, I’d be interested in anyone’s thoughts on what’s
going on here… (Hope I didn’t botch the tests somehow -
but I don’t think so!)

From the output of my little benchmark script, which generates
``one_in_each’’ random square (matrix) examples, I’d say that most
published solutions (including yours) are on average O(n^3) (n x n
= size of the matrix), but one can write solutions which are
on average O(n^2).

···

Other good values to try are 50, 100, 150, 300, and (if you have the time) 1000.

N = 200

def one_in_each(i,r)
  # (stub -- substitute the solution to be benchmarked)
end

class Array
def shuffle!
s = size
0.upto(size-1) do |i|
j = i + rand(size-i)
tmp = self[i]
self[i] = self[j]
self[j] = tmp
end
self
end
end

Const = 2<<29
def create_problem(m)
  items = Array.new(m)
  rows = (0...m).collect do |i|
    tmp = Array.new(m) do rand(Const) end
    items[i] = tmp[rand(m)] = -rand(Const) - 1
    tmp
  end
  return items.shuffle!, rows
end

require 'benchmark'
include Benchmark

bmbm { |x|
  i,r = nil, nil
  x.report("create"){ i,r = create_problem(N) }
  GC.start
  x.report("one_in_each") { one_in_each(i,r) }
}

/Christoph

Hi –

···

On Tue, 6 Aug 2002, Bill Kelly wrote:

… BTW also I’m noticing the objects in the unit tests are all
Enumerable. Can we depend on the objects in the arrays being
“sortable”, or just testable for equality?

I think just testable for equality. Whatever’s harder :-)
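
As an illustration of that constraint (my own sketch, not from the thread), an item type that supports only equality might look like this; any solution that tries to sort such items would blow up:

# Sketch (mine): items that are testable for equality but not sortable.
class Token
  attr_reader :name
  def initialize(name); @name = name; end
  def ==(other); other.is_a?(Token) && other.name == name; end
end

p [Token.new("a"), Token.new("b")].include?(Token.new("a"))  # => true
# [Token.new("a"), Token.new("b")].sort  # would raise -- Token has no <=>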

David


David Alan Black
home: dblack@candle.superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav

Tom Sawyer wrote:

damn! i went to measure the speed on that one liner, but alas i run on
stable.

?

I don’t know how you’re measuring them, but I think it’d be quite difficult
to compare them fairly. Each algorithm probably goes through the
search-space in a different order, making different algorithms
discover a solution quicker than others for different inputs.
Granted, the recursive types may have an edge since they effectively
use the ruby stack as a search-tree, making the underlying C do more
of the work for them. (?)

what’s inject do?

Inject is explained in the Pickaxe book in the chapter “Containers,
Blocks, and Iterators.” I changed #inject to #find, though, since
it’s cleaner, shorter, and supported in 1.6.
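
For anyone else wondering, a small illustration (mine, not George’s): `find` returns the first element for which the block is true, while `inject` threads an accumulator through the collection.

# Illustration (mine) of the two methods mentioned above.
p [3, 7, 12, 5].find { |n| n > 10 }          # => 12   (first match)
p [3, 7, 5].find     { |n| n > 10 }          # => nil  (no match)
p [1, 2, 3, 4].inject { |sum, n| sum + n }   # => 10   (accumulate)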

Having taken another look at Avi’s now though, I realize our solutions
are essentially the same. There probably isn’t a better way short of
introducing search heuristics.

(Note: I’m viewing the problem as a constraint satisfaction problem,
hence terms like “search-tree” and “search heuristics”. Search Google
for an explanation of CSPs.)

also, it would be nice to have a brief description of the main idea
behind the solution. (all of the solutions, actually)

Ok, I put a short explanation on mine.
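
Roughly, the main idea as I read it (a commented paraphrase, not George’s posted code): a depth-first search that matches the first item to some row and recurses on the rest, backtracking via `find`.

# Commented paraphrase (mine) of the recursive backtracking idea.
def one_in_each_sketch(items, rows)
  return true if items.empty?   # every item has found its own row
  # Try each row containing the first item; recurse on the remaining items
  # with that row removed by identity (so equal-looking duplicate rows stay
  # available).  `find` backtracks for us: it moves on to the next candidate
  # row whenever the recursive call returns false.
  !rows.find { |row|
    row.include?(items.first) &&
      one_in_each_sketch(items[1..-1], rows.reject { |r| r.equal?(row) })
  }.nil?
end

p one_in_each_sketch([1, 2, 3], [[1, 2], [1, 2], [1, 3]])   # => true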

. . . Anyway, I’d be interested in anyone’s thoughts on what’s
going on here… (Hope I didn’t botch the tests somehow -
but I don’t think so!)

algorithms with order N vs order N^2: 10x the number of items means
1/10 vs 1/100 the speed…

Hehe, yeah… sorry to come off like a dumbass… :-) I hadn’t
studied the other solutions yet, and incorrectly assumed they
were all pretty much “brute force”…

Bill

···

From: “Ned Konz” ned@bike-nomad.com

On Tuesday 06 August 2002 11:00 am, Bill Kelly wrote:

Hi,

bill can you tell where mine is failing? it passes david’s tests plus
the one avi’s missed --which your test doesn’t seem to catch. would like
to fix mine though if there’s a bug.

I tried reducing the dataset size as far as possible and having it
still fail. It seems to fail on this:

big_items = [1, 3, 2, 4]
big_rows = [[ 2 ], [ 1 ], [ 3 ], [ 4 ]]

Hope this helps,

Bill

···

From: “Tom Sawyer” transami@transami.net

Hi,

“Bill Kelly” billk@cts.com wrote in

class Array # from http://www.rubygarden.org/ruby?OneLiners
def shuffle!
each_index {|j| i = rand(size-j); self[j], self[j+i] = self[j+i], self[j]}
end
end

Note that this one liner does not create an even distribution
of shuffles.

Aiyeee!!! =) I’d specifically copied the one from the wiki page
that claimed to be, “O(N) and non-biased” because I recalled a prior
discussion about certain solutions . . . er, not being evenly-
distributed in their shuffelation. :-)

. . . Anyway, I’d be interested in anyone’s thoughts on what’s
going on here… (Hope I didn’t botch the tests somehow -
but I don’t think so!)

From the output of my little benchmark script, which generates
``one_in_each’’ random square (matrix) examples, I’d say that most
published solutions (including yours) are on average O(n^3) (n x n
= size of the matrix), but one can write solutions which are
on average O(n^2).

O(n^3) sounds about right. . . . or at least, after the fact,
it seems to mesh with the weird properties I was seeing. I’m not
used to dealing with more than n^2, and was flabbergasted -- kind of --
when increasing the item count from 60 to 70 incurred such a
horrendous penalty.

[…]

require 'benchmark'
include Benchmark

Ah! Thanks, didn’t know about this module…
(Like your invoking of GC.start at an appropriate point, too…)

Thanks,

Bill

···

From: “Christoph” chr_news@gmx.net

Christoph wrote:

. . . Anyway, I’d be interested in anyone’s thoughts on what’s
going on here… (Hope I didn’t botch the tests somehow -
but I don’t think so!)

From the output of my little benchmark script, which generates
``one_in_each’’ random square (matrix) examples, I’d say that most
published solutions (including yours) are on average O(n^3) (n x n
= size of the matrix), but one can write solutions which are
on average O(n^2).


Those tests weren’t necessarily representative of the “average” case. What is the
“average” case anyway? Why should it be a returns-true case? There are more
combinations which should return false than return true, given the array/matrix
sizes, aren’t there? If you take (non-trivial) best and worst cases, you actually
see quite different patterns. On the one hand:

···

Using SlowFailArgs, nitems = 7
dnaseby (10 iterations) 0.055341 (0.0055341 per iter)
george (10 iterations) 1.627122 (0.1627122 per iter)


Using SlowFailArgs, nitems = 7
dnaseby (10 iterations) 0.051418 (0.0051418 per iter)
george (10 iterations) 1.614257 (0.1614257 per iter)


Using SlowFailArgs, nitems = 7
dnaseby (10 iterations) 0.050687 (0.0050687 per iter)
george (10 iterations) 1.639815 (0.1639815 per iter)

… and on the other:


Using QuickFailArgs, nitems = 300
dnaseby (10 iterations) 3.198134 (0.3198134 per iter)
george (10 iterations) 0.658672 (0.0658672 per iter)


Using QuickFailArgs, nitems = 300
dnaseby (10 iterations) 3.216936 (0.3216936 per iter)
george (10 iterations) 0.659457 (0.0659457 per iter)


Using QuickFailArgs, nitems = 300
dnaseby (10 iterations) 3.227569 (0.3227569 per iter)
george (10 iterations) 0.662572 (0.0662572 per iter)

(The script used to generate these is below.)

Yes, mine is choking on a mere 7 elements in the worst case (and starts turning purple at
double figures), but at least it’s correct. After looking into dnaseby’s, I found he cut
one too many corners — this should be false, right?:

one_in_each([1,2,3,4,5], [[1,2,3,4,5],[1,2,3,4,5],[1,2,0,0,0],[1,2,0,0,0],[1,2,0,0,0]])

To correct this, his worst-case complexity must increase dramatically, I think.


The test script (Bill’s slightly modified; I don’t have the unit test stuff installed):

#!/usr/bin/ruby

# The algorithms

def one_in_each(items,rows)
  return false if items.length != rows.length
  return true if rows.empty?
  item_matches = []
  items.each_with_index do |item, index|
    item_matches << find_matches(item, rows)
    return false if item_matches[index].empty?
  end
  taken_indices = []
  rows.length.times do
    item_matches.each_with_index do |matches, index|
      if matches.length == 1
        return false if taken_indices.include? matches[0]
        taken_indices << matches[0]
        item_matches.delete_at(index)
      else
        matches.delete_if {|i| taken_indices.include?(i)}
      end
    end
    return true if taken_indices.length == rows.length
  end
  return item_matches.flatten.uniq.length + taken_indices.length == rows.length
end

def find_matches(item, rows)
  indexArr = []
  rows.each_with_index do |row, index|
    indexArr << index if row.detect {|toMatch| toMatch == item }
  end
  return indexArr
end

def one_in_each2(i,r);i==[]||r.find{|s|s.include?(i[0])&&one_in_each2(i[1..-1],r.reject{|e|s.id==e.id})}!=nil;end

# Arg generators -- "Quick"/"Slow" correspond to one_in_each2

QuickFailArgs = proc {|n| [Array.new(n).map!{0}, Array.new(n).map!{Array.new(n).map!{1}}] }
SlowFailArgs = proc {|n|
  if n == 0
    [[], []]
  else
    i = Array.new(n,0)
    r = Array.new(n).map{e = Array.new(n,1); e[-1] = 0; e}
    r[-1][-1] = 1
    [i,r]
  end
}
QuickPassArgs = proc {|n|
  [(1..n).to_a, (1..n).map{|i|Array.new(n,i)}]
}
SlowPassArgs = proc {|n|
  [(1..n).to_a, (0...n).to_a.map!{|i| (1..(n-i)).to_a + Array.new(i,0)}]
}

# BK's test (modified)

class Array # from http://www.rubygarden.org/ruby?OneLiners
  def shuffle!
    each_index {|j| i = rand(size-j); self[j], self[j+i] = self[j+i], self[j]}
  end
end

#require 'test/unit'
#class TestMe < Test::Unit::TestCase
def test_big
  puts "\n------------------"
  srand 314159

nitems = 300  # if increased to just 70, the slow solutions take "forever" !

big_items = [*1..nitems]
big_items.shuffle!
big_rows = []
big_items.each_with_index do |elm,idx|
  big_rows[idx] = []
  5.times {|i| big_rows[idx] << rand(1000) }
  big_rows[idx] << elm
  5.times {|i| big_rows[idx] << rand(1000) }
end
big_items.shuffle!
argspec = 'Random pass'

# Uncomment one.
#big_items, big_rows = QuickPassArgs.call(nitems); argspec = 'QuickPassArgs'
#big_items, big_rows = SlowPassArgs.call(nitems);  argspec = 'SlowPassArgs'
#big_items, big_rows = QuickFailArgs.call(nitems); argspec = 'QuickFailArgs'
#big_items, big_rows = SlowFailArgs.call(nitems);  argspec = 'SlowFailArgs'
puts "Using #{argspec}, nitems = #{nitems}"

niter = 10
#auths = %w(dblack dnaseby transami avi bwk george)
auths = %w(dnaseby george)
auths.each do |auth|
  eval <<-EOE
    print " #{auth} "
    t1 = Time.now
    #niter.times { assert( #{auth}_one_in_each(big_items, big_rows) ) }
    niter.times {
      case auth
      when 'dnaseby'
        one_in_each(big_items, big_rows)
      when 'george'
        one_in_each2(big_items, big_rows)
      end
    }
    t2 = Time.now
    

    puts "(\#{niter} iterations) \#{(t2-t1)} (\#{(t2-t1)/niter.to_f} per iter)"

  EOE
end

end
#end

3.times{test_big}
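
As a side note (my own illustration, not part of the posted script), here is what the argument generators above produce for a small n, which makes the quick/slow naming easier to see:

# Illustration (mine): the shape of the generated arguments for n = 3.
p QuickFailArgs.call(3)
# => [[0, 0, 0], [[1, 1, 1], [1, 1, 1], [1, 1, 1]]]
#    no row contains 0, so a search can fail almost immediately

p SlowFailArgs.call(3)
# => [[0, 0, 0], [[1, 1, 0], [1, 1, 0], [1, 1, 1]]]
#    only two rows contain 0; a backtracking search only notices after
#    exhausting the orderings of the rows that do match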

just figured it out.

i thought i was producing every permutation, but it turns out i’m only
getting a portion (i think 1/nitems of them).

i’ll fix it.

~transami

p.s. perhaps david’s has a similar problem, though i haven’t looked at
it too deeply.

···

On Tue, 2002-08-06 at 14:09, Bill Kelly wrote:

Hi,

From: “Tom Sawyer” transami@transami.net

bill can you tell where mine is failing? it passes david’s tests plus
the one avi’s missed --which your test doesn’t seem to catch. would like
to fix mine though if there’s a bug.

I tried reducing the dataset size as far as possible and having it
still fail. It seems to fail on this:

big_items = [1, 3, 2, 4]
big_rows = [[ 2 ], [ 1 ], [ 3 ], [ 4 ]]

Hope this helps,

Bill


~transami

Your algorithm chooses any of the n possibilities for element 0, then
chooses any of the n-1 remaining possibilities for element 1, and so on.
There are no duplicate possibilities, and there are
(1..n).inject{|a,b|a*b} possibilities covered. Which is factorial of n.
Which is every possible permutation.

In short, the one above is non-biased, if I’m not mistaken.

And, it occurs to me that the one in Christoph’s program is exactly the
same algorithm.
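
If anyone wants to convince themselves empirically, here is a quick sketch (mine) that tallies the permutations the one-liner produces for a 3-element array; an unbiased shuffle should hit each of the 3! = 6 permutations about equally often.

# Empirical check (mine) of the one-liner's distribution.
class Array
  def shuffle!   # the wiki one-liner under discussion
    each_index { |j| i = rand(size - j); self[j], self[j + i] = self[j + i], self[j] }
  end
end

counts = Hash.new(0)
100_000.times { counts[[1, 2, 3].shuffle!] += 1 }
counts.sort.each { |perm, n| puts "#{perm.inspect}  #{n}" }
# each of the six permutations should come out near 16666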

···

On Wed, 7 Aug 2002, Bill Kelly wrote:

From: “Christoph” chr_news@gmx.net

“Bill Kelly” billk@cts.com wrote in

class Array # from http://www.rubygarden.org/ruby?OneLiners
def shuffle!
each_index {|j| i = rand(size-j); self[j], self[j+i] = self[j+i], self[j]}
end
end
Note that this one liner does not create an even distribution
of shuffles.
Aiyeee!!! =) I’d specifically copied the one from the wiki page
that claimed to be, “O(N) and non-biased” because I recalled a prior
discussion about certain solutions . . . er, not being evenly-
distributed in their shuffelation. :-)


Mathieu Bouchard http://artengine.ca/matju

“George Ogata” wrote

Those tests weren’t necessarily representative of the “average” case. What is
the “average” case anyway? Why should it be a returns-true case? There are more

Hm, I probably should have written the ``average behavior’’ with respect to my
test setup.

combinations which should return false than return true, given the
array/matrix sizes, aren’t there? If you take (non-trivial) best and worst
cases, you actually see quite different patterns. On the one hand:

In a variation of an old statistics saying: “there are lies, damned lies, and
benchmarks” ;-). On the other hand, these types of benchmarks do have
their place. If you positively have to verify that a given 1000 x 1000
matrix is in a ``one_in_each’’ relation to a 1000-element item array,
or more likely have to actually generate the corresponding bijection,
you’ll be damned (i.e. you will never finish in a reasonable amount
of time) if you use an asymptotically (worst case or not) slower algorithm.

(The script used to generate these is below.)

Yes, mine is choking on a mere 7 elements in the worst case (and starts
turning purple at double figures), but at least it’s correct. After looking
into dnaseby’s, I found he cut one too many corners — this should be false,
right?:

one_in_each([1,2,3,4,5],
[[1,2,3,4,5],[1,2,3,4,5],[1,2,0,0,0],[1,2,0,0,0],[1,2,0,0,0]])

I think so …

To correct this, his worst-case complexity must increase dramatically, I think.

In my ``benchmark’’ his test was rather slow.


The test script (Bill’s slightly modified; I don’t have the unit test stuff installed):

Hm, maybe you want to benchmark my solution, Chr0002, as well?

/Christoph

In short, the one above is non-biased, if I’m not mistaken.

And, it occurs to me that the one in Christoph’s program is exactly the
same algorithm.

All true, and thanks for catching this. I promise to check more closely
before making erroneous claims about other people’s scripts (and will
probably continue to be my sloppy self ;-).

/Christoph

···

“Mathieu Bouchard” matju@sympatico.ca wrote

speaking of which, turned out there was a bad bug in my script and after
fixing it, it was dog slow. of course i took the brute force approach
and determined every permutation and then looked to see if any of them
fit the criteria.

well, disappointed with that. i tried my hand at a few other methods
with the goal of creating a very fast version no matter the “coding
cost”. but everything i tried turned out slower. at one point i had a
thread running for every array to array-of-array node! my machine about
keeled over. :-) threads were not helpful at all. so i have been trying
some heuristics, essentially reducing the problem down after eliminating
a number of unnecessaries. still, it’s turned out to be quite a task and
i haven’t finished yet.

just FYI.

~transami

···

On Wed, 2002-08-07 at 16:04, Christoph wrote:

In a variation of an old statistics saying: “there are lies, damned lies, and
benchmarks” ;-). On the other hand, these types of benchmarks do have
their place. If you positively have to verify that a given 1000 x 1000
matrix is in a ``one_in_each’’ relation to a 1000-element item array,
or more likely have to actually generate the corresponding bijection,
you’ll be damned (i.e. you will never finish in a reasonable amount
of time) if you use an asymptotically (worst case or not) slower algorithm.

Hi,

Hm, maybe you want to benchmark my solution, Chr0002, as well?

Whoops, I don’t seem to have Array#any? and Array#all?. :-) New
methods in 1.7 I guess?

What do they do? I’d been mulling over a hash-based solution to
the problem, but yours looks more succinct than where mine was
headed anyway…
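
For what it’s worth, the kind of hash-based preprocessing alluded to above might look roughly like this (my own guess at the idea, not Bill’s code): index which rows contain each value, so each item’s candidate rows can be looked up without rescanning the matrix.

# Rough sketch (mine) of a hash-based first pass: value -> row indices.
def candidate_rows(items, rows)
  where = Hash.new { |h, k| h[k] = [] }
  rows.each_with_index do |row, idx|
    row.each { |e| where[e] << idx }
  end
  # For each item, the rows it could possibly be matched to.  A complete
  # solution still has to pick one distinct row per item, e.g. by
  # backtracking over these candidate sets.
  items.map { |item| where[item].uniq }
end

p candidate_rows([1, 2, 3], [[1, 2], [1, 2], [1, 3]])
# => [[0, 1, 2], [0, 1], [2]]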

Thanks,

Bill

“Bill Kelly” billk@cts.com wrote in message
news:002901c23e70$402b3d30$fa7b1b42@musicbox…

Hi,

Hm, maybe you want to benchmark my solution, Chr0002, as well?

Whoops, I don’t seem to have Array#any? and Array#all?. :-) New
methods in 1.7 I guess?

Yup, they are a generalization of && and || to Enumerables.

module Enumerable
  def all?
    each {|e| return false unless yield e }
    return true
  end
  def any?
    each {|e| return true if yield e }
    return false
  end
end
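
A tiny usage illustration (mine, not Christoph’s):

p [1, 2, 3].all? { |x| x > 0 }   # => true   (every element passes the block)
p [1, 2, 3].any? { |x| x > 2 }   # => true   (at least one element passes)
p [1, 2, 3].any? { |x| x > 5 }   # => false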

/Christoph

Hi,

“Bill Kelly” billk@cts.com wrote in message

Hm, maybe you want to benchmark my solution, Chr0002, as well?

Whoops, I don’t seem to have Array#any? and Array#all?. :-) New
methods in 1.7 I guess?

Yup, they are a generalization of && and || to Enumerables.
[…]

Thanks!

Unfortunately both solutions there failed one or more of the tests.
The first (larger in lines of code) failed on these:

items = [ 2,2 ]
rows = [ [1,2], [2,3] ]

items = [ 1,2,2 ]
rows = [ [1,2], [1,2], [1,3] ]

The second failed on just the latter dataset, above. (And both
failed on my larger ‘benchmark’ dataset…)

. . . Those unit tests are merciless, huh? :-)

Regards,

Bill

···

From: “Christoph” chr_news@gmx.net