the big difference is that a value will have rand called multiple times for it
- in otherwords an element compareds differently every time. this is bound to
help shuffle more uniformly it seems:
harp:~ > cat a.rb
%w( a b c d ).sort{|a,b| ra = rand; rb = rand; p a => ra; p b => rb; ra <=> rb }
harp:~ > ruby a.rb
{"a"=>0.181439380218727}
{"c"=>0.579403886629126}
{"c"=>0.713513989939958}
{"d"=>0.573687508540169}
{"a"=>0.851534740906118}
{"d"=>0.00156074151427565}
{"b"=>0.803591007691647}
{"a"=>0.494066110872563}
{"b"=>0.834676789626288}
{"c"=>0.792106134460754}
so, note that "a" was sorted using a different value each time.
can anyone who remembers stats and knows the sort algorithm weigh in?
-a
···
On Tue, 6 Dec 2005, Mauricio [iso-8859-1] Fernández wrote:
On Sun, Dec 04, 2005 at 10:21:02AM +0900, ara.t.howard@noaa.gov wrote:
On Sun, 4 Dec 2005, Mauricio [iso-8859-1] Fernández wrote:
sort_by{ rand } is actually biased, since sort_by will preserve the
relative order of the elements for which rand() returned the same value.I did some numbers and found the following probability estimates:
array size P(#rand() collision)
1000 5.54558e-11
10000 5.55056e-09
1000000 5.55096e-05
10000000 5.53574e-03A collision implies that the associated pair of elements has got the same
ordering in the output array, instead of 50-50 chances of it being reversed.More info, including the Ruby code I used, available at
http://eigenclass.org/hiki.rb?sort_by+rand+is+biaseddoes
%w( a b c d ).sort{ rand <=> rand }
help?
I'm not totally sure it's really unbiased, but at first sight it looks good.
--
ara [dot] t [dot] howard [at] noaa [dot] gov
all happiness comes from the desire for others to be happy. all misery
comes from the desire for oneself to be happy.
-- bodhicaryavatara
===============================================================================