Feature request: stable sort

I’ve just been poking around and I see that ruby’s sort is implemented by
calling its own implementation of qsort()

This is fine except that qsort is not a ‘stable’ sort - i.e. elements with
equal keys do not have their existing order preserved.

I can work around this but it would be nice to have an alternative sort
implementation with this property. FreeBSD has a mergesort() function with
the same API as qsort(); the source code is 350 lines, including the
copyright notice. Would it perhaps be worth merging in?

Regards,

Brian.

irb(main):001:0> a=[“one”,“two”,“one”,“two”,“one”,“two”]
=> [“one”, “two”, “one”, “two”, “one”, “two”]
irb(main):002:0> a.collect {|e| e.id}
=> [67715604, 67715594, 67715584, 67715574, 67715564, 67715554]
#1 #2 #3 #4 #5 #6
irb(main):003:0> b=a.sort
=> [“one”, “one”, “one”, “two”, “two”, “two”]
irb(main):004:0> b.collect {|e| e.id}
=> [67715604, 67715564, 67715584, 67715574, 67715594, 67715554]
#1 #5 #3 #4 #2 #6

Hi,

···

At Thu, 31 Jul 2003 21:54:31 +0900, Brian Candler wrote:

I’ve just been poking around and I see that ruby’s sort is implemented by
calling its own implementation of qsort()

This is fine except that qsort is not a ‘stable’ sort - i.e. elements with
equal keys do not have their existing order preserved.

It’s a well known trade-off. Fast sorts are unstable or need
work area.

I guess it is not required as stable in all cases. How about
new methods, #stable_sort and #stable_sort!?


Nobu Nakada

Yes, new methods would be fine. Perhaps if you use mergesort it should just
be called “mergesort”. I believe it could give better performance where the
input is already partially sorted (e.g. two sorted arrays concatenated), so
you might want to choose it even if stability isn’t important.

Also, doesn’t quicksort have some nasty input cases where it degrades to
O(n^2), or am I thinking of something else?

Regards,

Brian.

···

On Thu, Jul 31, 2003 at 10:48:49PM +0900, nobu.nokada@softhome.net wrote:

Hi,

At Thu, 31 Jul 2003 21:54:31 +0900, > Brian Candler wrote:

I’ve just been poking around and I see that ruby’s sort is implemented by
calling its own implementation of qsort()

This is fine except that qsort is not a ‘stable’ sort - i.e. elements with
equal keys do not have their existing order preserved.

It’s a well known trade-off. Fast sorts are unstable or need
work area.

I guess it is not required as stable in all cases. How about
new methods, #stable_sort and #stable_sort!?

Also, doesn’t quicksort have some nasty input cases where it degrades to
O(n^2), or am I thinking of something else?

If I recall correctly, I think if the input is sorted exactly wrong (ie it’s
in descending order and you want ascending) AND the pivot is chosen
particularly badly, it goes to O(n^2).

Most modern qsort() implementations use a “middle of 3” choice for pivot, so
even in the “worst case” scenario, it’s better than n^2 (albeit probably
worse than n log n)

But then, it’s been a few years since I studied the theory of algorithms, so
I might be misremembering.

“Michael Campbell” michael_s_campbell@yahoo.com wrote in message news:LEEEIDNEKIDGNLMOEIJOEEBNCDAA.michael_s_campbell@yahoo.com

Also, doesn’t quicksort have some nasty input cases where it degrades to
O(n^2), or am I thinking of something else?

If I recall correctly, I think if the input is sorted exactly wrong (ie it’s
in descending order and you want ascending) AND the pivot is chosen
particularly badly, it goes to O(n^2).

Most modern qsort() implementations use a “middle of 3” choice for pivot, so
even in the “worst case” scenario, it’s better than n^2 (albeit probably
worse than n log n)

Correct. Also, in cases where the list is “almost sorted” in one
direction or another, quicksort can have a very nasty running time.
These cases can be alleviated either through manipulation of the
pivot, as you suggested, or by simply randomizing the list before
giving it to quicksort. Of course, with randomizing, there’s always
the (extremely slim) chance that you’ll end up with a sorted list (a
la bogosort), but it’s an easier hack than changing the quicksort
algorithm.

Dave Dembinski

“Michael Campbell” michael_s_campbell@yahoo.com wrote in message
news:LEEEIDNEKIDGNLMOEIJOEEBNCDAA.michael_s_campbell@yahoo.com

Also, doesn’t quicksort have some nasty input cases where it degrades
to
O(n^2), or am I thinking of something else?

If I recall correctly, I think if the input is sorted exactly wrong (ie
it’s
in descending order and you want ascending) AND the pivot is chosen
particularly badly, it goes to O(n^2).

Most modern qsort() implementations use a “middle of 3” choice for
pivot, so
even in the “worst case” scenario, it’s better than n^2 (albeit probably
worse than n log n)

Correct. Also, in cases where the list is “almost sorted” in one
direction or another, quicksort can have a very nasty running time.
These cases can be alleviated either through manipulation of the
pivot, as you suggested, or by simply randomizing the list before
giving it to quicksort. Of course, with randomizing, there’s always
the (extremely slim) chance that you’ll end up with a sorted list (a
la bogosort), but it’s an easier hack than changing the quicksort
algorithm.

Going back to the stable sort issue: Is there any
reason you couldn’t just map to an array that has
the original elements along with the indices, and
use the indices as a secondary key? I would think
that would work unless you were going for blazing
speed.

Hal

···

----- Original Message -----
From: “Dave Dembinski” thealmightydaev@hotmail.com
Newsgroups: comp.lang.ruby
To: “ruby-talk ML” ruby-talk@ruby-lang.org
Sent: Monday, August 11, 2003 3:39 PM
Subject: Re: Feature request: stable sort


Hal Fulton
hal9000@hypermetrics.com

It’s what I did to work around my original problem, but it’s a pain having
to do that. An existing array of [key,value] needs to be changed to
[key,original_sequence,value]

(I was doing “sort these E-mail addresses by dated created, but any groups
which had the same creation date should be kept in the same order”)

Cheers,

Brian.

···

On Tue, Aug 12, 2003 at 06:06:19AM +0900, Hal E. Fulton wrote:

Going back to the stable sort issue: Is there any
reason you couldn’t just map to an array that has
the original elements along with the indices, and
use the indices as a secondary key? I would think
that would work unless you were going for blazing
speed.