3) Judy is in general not faster than a hash lookup. It is just more
memory efficient.
I cannot fully agree with that characterization, even if it's correct
considering
only the operation count.
It's not something you see on typical benchmarks [1], as they tend to access all
memory with the same frequency, but real programs' memory access patterns
more or less follow the power law - very few data structures are
accessed all the time,
and gradually bigger groups of data structures are accessed less frequently,
ending with most of the program memory being accessed very rarely.
For typical programs computers are pretty good at having the more commonly
accessed things in faster caches. This means even modest decrease in
data structure sizes will pull frequently accessed data structures into faster
caches. In most benchmarks all items are equally probably, so the items pulled
into freed caches tends to be uninteresting.
I'd like to see Judy vs hash tables performance comparison on real programs,
or at least on more realistic benchmarks.
One more thing - Ruby uses numeric hash tables for symbol lookup.
Symbols ids are highly non-random:
ids = ObjectSpace.enum_for(:each_object,
Module).inject(){|ms,md| ms + md.instance_methods}.uniq.map{|x|
x.to_sym.object_id}.sort; [ids.min, ids.max, ids.size]
=> [378, 269218, 1061]
That's exactly the situation Judy is optimized for.
[1] - A performance comparison of Judy to hash tables
···
On 22/02/07, gga <GGarramuno@aol.com> wrote:
--
Tomasz Wegrzanowski [ http://t-a-w.blogspot.com/ ]