Are Hash speeds documented?

Are the lookup, insertion, deletion, and sort costs of Hash objects
documented anywhere? I would be interested in average-case
and worst-case times... are they linear, logarithmic, exponential, etc.
in time?

Also, is there a way to 'tune' hashes so that they use underlying
algorithms which are faster at, say, look-ups over inserts or
vice-versa? That would be killer.

···

--
Posted via http://www.ruby-forum.com/.

Are the lookup, insertion, deletion, and sort costs of Hash objects
documented anywhere? I would be interested in average-case
and worst-case times... are they linear, logarithmic, exponential, etc.
in time?

Are you familiar with hashing in general? If not, please have a look here:

If you want real numbers you need to benchmark, because any theory may fail you for particular data or access patterns.
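
For example, a quick sketch with the standard Benchmark module (the key count and labels here are arbitrary, just to show the shape of such a measurement):

```ruby
require 'benchmark'

n = 100_000
h = {}

Benchmark.bm(10) do |bm|
  bm.report("insert:") { n.times { |i| h[i] = i } }   # n insertions
  bm.report("lookup:") { n.times { |i| h[i] } }       # n lookups
  bm.report("delete:") { n.times { |i| h.delete(i) } } # n deletions
end
```

Run it against your own keys and access pattern; that tells you more than any big-O table.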

Also, is there a way to 'tune' hashes so that they use underlying
algorithms which are faster at, say, look-ups over inserts or
vice-versa? That would be killer.

You can't modify the internal workings of the built-in Hash class unless you start patching the C code. But of course you can create your own implementation for custom cases.

Cheers

  robert

···

On 25.02.2011 17:57, Nick Brown wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Lookup, insertion, deletion -> O(1)
Sort -> O(N * logN)
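
(For what it's worth, Hash picks up sort from Enumerable, so sorting copies the pairs into an array rather than returning a Hash:)

```ruby
h = { "b" => 2, "a" => 1, "c" => 3 }
sorted = h.sort   # Enumerable#sort on [key, value] pairs, by key
# => [["a", 1], ["b", 2], ["c", 3]]
```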

···

On Fri, Feb 25, 2011 at 5:57 PM, Nick Brown <nick@nick-brown.com> wrote:

Are the lookup, insertion, deletion, and sort costs of Hash objects
documented anywhere? I would be interested in average-case
and worst-case times... are they linear, logarithmic, exponential, etc.
in time?

--
Pozdrawiam

Radosław Bułat
http://radarek.jogger.pl - mój blog

Nick Brown wrote:

Are the lookup, insertion, deletion, and sort costs of Hash objects
documented anywhere? I would be interested in average-case
and worst-case times... are they linear, logarithmic, exponential, etc.
in time?

No, as far as I know, performance characteristics are not part of the
Ruby Language Specification. Neither the Draft ISO Ruby Specification
nor RubySpec describe any kind of performance characteristics.

The popular Ruby implementations currently have the following
asymptotic time complexities and asymptotic step complexities:

    lookup: Ο(n) worst-case, Ο(1) amortized worst-case
    insertion: Ο(n) worst-case, Ο(1) amortized worst-case
    deletion: Ο(n) worst-case, Ο(1) amortized worst-case
    sort: Hash has no sort method of its own; Enumerable#sort copies the pairs into an array and sorts that in Ο(n log n)
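
A sketch of how the Ο(n) worst case can be provoked: a key class whose hash method always returns the same value forces every entry into one bucket, so each lookup degenerates into a linear scan (the class here is illustrative, not taken from any spec):

```ruby
# Keys that all hash to the same value collide into a single bucket.
# Hash uses #hash to pick the bucket and #eql? to compare keys in it.
class BadKey
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def hash
    0   # every key collides
  end

  def eql?(other)
    other.is_a?(BadKey) && id == other.id
  end
end

h = {}
1000.times { |i| h[BadKey.new(i)] = i }
h[BadKey.new(999)]   # scans all 1000 colliding entries before matching
```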

If you want to know the average case, you need to supply a probability
distribution. Average case analysis is always relative to a
probability distribution.

But note that these are *not* part of the language specification. A
Ruby implementation can have much slower implementations and still be
a fully compliant Ruby implementation, similar to what happened with
JScript, where array lookup was Ο(n) worst-case, Ο(n) amortized
worst-case.

jwm

Robert Klemme wrote in post #983928:

Are you familiar with hashing in general?

Sure (abstractly), but there are different ways it could be implemented.
A hash map to binary trees will have faster lookup times than a hash map
to arrays ("associative array")... at least it would when you're using
huge amounts of data and dealing with substantial hash collisions (like
me).

You can't modify internal workings of the built in Hash class unless you
start patching the C code. But of course you can create your own
implementation for custom cases.

I believe Java at least lets you tune its hashmap by altering some
constants at creation time. I hadn't heard about anything like this in
Ruby but it sure would be nice.

So it sounds like there is neither data on costs nor tuning capability
available in Ruby. Maybe I'll write an uber-hash while I'm waiting on
this script to finish running and share it with you all :)
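
A starting point for such a tunable hash might look like this: a minimal separate-chaining table with capacity and load-factor knobs, in the spirit of Java's HashMap(initialCapacity, loadFactor). This is a hypothetical sketch, not an existing library:

```ruby
# Minimal separate-chaining hash with a tunable initial capacity and
# load factor. When size exceeds capacity * load_factor, the table
# doubles and rehashes every entry.
class TunableHash
  def initialize(capacity = 16, load_factor = 0.75)
    @capacity    = capacity
    @load_factor = load_factor
    @size        = 0
    @buckets     = Array.new(@capacity) { [] }
  end

  def []=(key, value)
    resize if @size >= @capacity * @load_factor
    bucket = @buckets[key.hash % @capacity]
    pair = bucket.find { |k, _| k.eql?(key) }
    if pair
      pair[1] = value          # overwrite existing key
    else
      bucket << [key, value]
      @size += 1
    end
    value
  end

  def [](key)
    pair = @buckets[key.hash % @capacity].find { |k, _| k.eql?(key) }
    pair && pair[1]
  end

  private

  def resize
    @capacity *= 2
    old = @buckets.flatten(1)  # all [key, value] pairs
    @buckets = Array.new(@capacity) { [] }
    old.each { |k, v| @buckets[k.hash % @capacity] << [k, v] }
  end
end
```

Something like `TunableHash.new(1024, 0.5)` would trade memory for fewer collisions and resizes up front, which is roughly the tuning Java exposes.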

···

--
Posted via http://www.ruby-forum.com/.