Ruby and Judy

Joe:

[Joseph McDonald]
Judy looks cool: http://www.sourcejudy.com/ docs at:
http://www.sourcejudy.com/application/

I think it would be great to have a ruby interface to Judy. Someone
has done a SWIG interface to Judy, pointer here:
http://www.nclug.org/pipermail/nclug/2002-August/004079.html

I’ve never used SWIG before; Lyle could probably whip it into shape
faster than I could read the SWIG docs :-)

I sure hope somebody does it. I haven’t had the time to read it (318 pages).
But, I would be willing to help someone. (I am Ruby/Python ignorant).

I was also thinking… what if ruby internal hash and array used Judy?
I wonder how much faster it would be? I’ve CC:d the author of Judy on
this note, he may have some insight.

I’ve had some recent email exchanges with Tim Peters about the
applicability of using Judy (SL or L?) in Python for replacing internal
dictionaries. My take on Tim’s feedback is: he seems to think they
have a good handle on their performance with insert/lookups and memory
management inside Python. Right now I am buried with malloc() tests.
As soon as I finish, I will be interested in testing (and possibly
tuning) JudySL for suitability inside an interpreter
(Python/Perl/Java/Ruby/etc.). JudyL and Judy1 have been tuned. I
believe they are unbeatable. JudySL is implemented as a simple digital
tree that uses JudyL to do the work. I have never had a complaint
about JudySL performance. I suspect there is room for about 30%
improvement for both speed and memory for shorter strings. I need to
get my hands on a ‘typical’ data set in order to proceed.

I will shortly add two benchmarks (updated from the Judy source download
versions) to the web site. <www.sourcejudy.com/benchmarks>

  1. SLcompare.c - measure speed and memory usage with the best JudySL
    competitors I know of, using any data set you supply. I am
    presently testing JudySL up to 1/2 billion strings. (yes, that’s a
    lot of disk space and not possible to put on the web)

  2. Judy1LTime.c - measures Judy1 & JudyL speed and memory usage with
    gusto. I am presently testing them with populations in the billions.

I will publish the results soon. (This is actually about testing a new
malloc() design. I already know Judy scales well into the peta-range).

thanks,
-joe

Doug Baskins doug@sourcejudy.com

Hi,

In message “Re: Ruby and Judy” on 02/08/26, dougbaskins@frii.com writes:

As soon as I finish, I will be interested in testing (and possibly
tuning) JudySL for suitability inside an interpreter
(Python/Perl/Java/Ruby/etc.).

As Ruby’s author, I am pretty interested in replacing the internal hash
functions with Judy, but I need support for hashing length-specified
strings, since strings in Ruby can contain NUL.

						matz.

According to this [http://www.sourcejudy.com/application/judysl.pdf] paper,
the JudySL* functions are implemented atop JudyL. This means we could
write a JudySL-like layer that takes Ruby’s length-specified strings
instead of NUL-terminated (ASCIIZ) ones.
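
To make the idea concrete, here is a minimal sketch of a digital tree keyed on length-specified byte strings. It is not Judy code: a plain Python dict stands in for each JudyL array, and the 4-byte chunk size, the `LenTrie` class, and the way the leftover bytes are packed together with their count are all assumptions made for illustration. The point is only that once the length travels with the key, embedded NUL bytes stop being a problem.

```python
WORD = 4  # bytes consumed per tree level; an assumed stand-in for Judy's word size


class LenTrie:
    """Digital tree over byte strings with explicit length (no NUL terminator)."""

    def __init__(self):
        self.root = self._new_node()

    @staticmethod
    def _new_node():
        # Two word-indexed maps per node; each dict stands in for a JudyL array.
        return {"next": {}, "tail": {}}

    @staticmethod
    def _tail_index(tail):
        # Pack the 0..3 leftover bytes together with their count into one word,
        # so that b"a" and b"a\x00" map to different indexes.
        return (len(tail) << 24) | int.from_bytes(tail, "big")

    def _walk(self, key, create):
        node = self.root
        full = len(key) - len(key) % WORD  # bytes covered by whole chunks
        for i in range(0, full, WORD):
            idx = int.from_bytes(key[i:i + WORD], "big")
            child = node["next"].get(idx)
            if child is None:
                if not create:
                    return None, None
                child = node["next"][idx] = self._new_node()
            node = child
        return node, self._tail_index(key[full:])

    def insert(self, key, value):
        node, idx = self._walk(key, create=True)
        node["tail"][idx] = value

    def get(self, key, default=None):
        node, idx = self._walk(key, create=False)
        if node is None:
            return default
        return node["tail"].get(idx, default)
```

With this layout, `b"ab"`, `b"ab\x00"` and `b"ab\x00\x00"` are three distinct keys, which is what Ruby strings would need.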

This other paper [http://www.sourcejudy.com/application/Judy_hashing.pdf]
explains how to use Judy to handle collision chains in a hash table. That
way it could be integrated into the current st_* family, and the hashing
functions could be simplified, since collisions would be cheaper…
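
A rough sketch of that second idea, under stated assumptions: a small bucket table where each bucket is a word-indexed map from the full hash value to the entries sharing it. Again a Python dict stands in for a JudyL array, `zlib.crc32` is just a convenient deterministic hash, and `BucketedHash` is a hypothetical name. This is not the scheme from the paper or from st.c, only an illustration of why long collision chains get cheaper when each bucket is itself an indexed structure.

```python
import zlib


class BucketedHash:
    """Hash table whose buckets are word-indexed maps instead of linked lists."""

    def __init__(self, nbuckets=64):
        # One dict per bucket; each dict stands in for a JudyL array.
        self.buckets = [dict() for _ in range(nbuckets)]

    def _slot(self, key):
        h = zlib.crc32(key)  # full 32-bit hash of the byte string
        return self.buckets[h % len(self.buckets)], h

    def insert(self, key, value):
        bucket, h = self._slot(key)
        entries = bucket.setdefault(h, [])
        for i, (k, _) in enumerate(entries):
            if k == key:  # same key: overwrite in place
                entries[i] = (key, value)
                return
        entries.append((key, value))

    def get(self, key, default=None):
        bucket, h = self._slot(key)
        # Only keys with an identical full hash are compared byte by byte.
        for k, v in bucket.get(h, ()):
            if k == key:
                return v
        return default
```

Even with a single bucket (`BucketedHash(nbuckets=1)`), a lookup only compares keys whose full hash matches, so a cheap hash function with many bucket collisions costs less than it would with a linear chain.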

On Mon, Aug 26, 2002 at 03:46:27PM +0900, Yukihiro Matsumoto wrote:

Hi,

In message “Re: Ruby and Judy” on 02/08/26, dougbaskins@frii.com writes:

As soon as I finish, I will be interested in testing (and possibly
tuning) JudySL for suitability inside an interpreter
(Python/Perl/Java/Ruby/etc.).

As Ruby’s author, I am pretty interested in replacing the internal hash
functions with Judy, but I need support for hashing length-specified
strings, since strings in Ruby can contain NUL.


Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

Absolutely nothing should be concluded from these figures except that
no conclusion can be drawn from them.
– Joseph L. Brothers, Linux/PowerPC Project