>sort_by{ rand } is actually biased, since sort_by will preserve the
>relative order of the elements for which rand() returned the same value.
I did some numbers and found the following probability estimates:
array size P(#rand() collision)
1000 5.54558e-11
10000 5.55056e-09
1000000 5.55096e-05
10000000 5.53574e-03
A collision implies that the associated pair of elements has got the same
ordering in the output array, instead of 50-50 chances of it being reversed.
More info, including the Ruby code I used, available at
http://eigenclass.org/hiki.rb?sort_by+rand+is+biased
does
%w( a b c d ).sort{ rand <=> rand }
help?
I'm not totally sure it's really unbiased, but at first sight it looks good.
When called with no arguments, rand will use
static double
genrand_real(void)
{
unsigned long a=genrand_int32()>>5, b=genrand_int32()>>6;
return(a*67108864.0+b)*(1.0/9007199254740992.0);
}
so it would seem that two consecutive calls to rand() cannot return the
same value, but I still have to think about the effect of qsort.
It is however somewhat slower than sort_by{ rand }.
[100, 1000, 10000, 100000].each do |n|
GC.start
t0 = Time.new
(1..n).sort_by{rand}
a = (t1 = Time.new) - t0 # => 0.000425, 0.003002, 0.054392, 0.939692
(1..n).sort{rand <=> rand}
b = Time.new - t1 # => 0.001161, 0.026275, 0.289807, 3.791833
"%4.2f" % [b / a] # => "2.73", "8.75", "5.33", "4.04"
end
RUBY_VERSION # => "1.8.3"
RUBY_RELEASE_DATE # => "2005-09-24"
···
On Sun, Dec 04, 2005 at 10:21:02AM +0900, ara.t.howard@noaa.gov wrote:
On Sun, 4 Dec 2005, Mauricio [iso-8859-1] Fernández wrote:
--
Mauricio Fernandez