Hashes sensitive to similarity


(Berger, Daniel) #1

Perhaps Napster will open-source their fingerprinting technology now that
they’re bankrupt? Hmmm…maybe not.

A Google search turned up precious little. The Perl Soundex modules
all appear to be based on the Knuth algorithm.

That’s an interesting algorithm, though. Maybe teachers should use it to
see if students are cheating off each other. :) (I used to teach.)

Regards,

Dan


-----Original Message-----
From: Thomas Hurst [mailto:tom.hurst@clara.net]
Sent: Thursday, June 06, 2002 11:42 AM
To: ruby-talk@ruby-lang.org
Subject: Re: Hashes sensitive to similarity

Perhaps Levenshtein distance?
http://www.merriampark.com/ld.htm

Mmmn, the thing is, I really need a key I can use for a database and a
hash. Maybe something like an overlong soundex, or a more generic
hashing algorithm that produces identical hashes up to a certain
threshold.
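
A minimal sketch of what an "overlong soundex" key might look like in Ruby, assuming the usual Soundex letter groups and simply letting the caller choose the key length. The name `long_soundex` and its `length` parameter are made up for illustration; this is not taken from any existing module.

```ruby
# Standard Soundex letter groups; vowels, y, h and w carry no code.
SOUNDEX_CODES = {
  "b" => "1", "f" => "1", "p" => "1", "v" => "1",
  "c" => "2", "g" => "2", "j" => "2", "k" => "2",
  "q" => "2", "s" => "2", "x" => "2", "z" => "2",
  "d" => "3", "t" => "3",
  "l" => "4",
  "m" => "5", "n" => "5",
  "r" => "6"
}.freeze

# Hypothetical helper: a Soundex-style key of configurable length.
# With length = 4 it behaves like ordinary Soundex.
def long_soundex(word, length = 4)
  letters = word.downcase.delete("^a-z")
  return "" if letters.empty?

  key  = letters[0].upcase
  prev = SOUNDEX_CODES[letters[0]]

  letters[1..-1].each_char do |ch|
    break if key.length >= length
    next if ch == "h" || ch == "w"   # h and w do not separate equal codes
    code = SOUNDEX_CODES[ch]
    key << code if code && code != prev
    prev = code                      # a vowel resets prev to nil
  end

  key.ljust(length, "0")             # pad short keys with zeros
end

puts long_soundex("Robert")          # R163
puts long_soundex("Rupert")          # R163
puts long_soundex("Washington", 8)   # W2523500
```

The result is a plain string, so it can serve as a Hash key or an indexed database column. A longer key separates more names, but it also makes a single differing consonant more likely to produce a different key.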

I suppose I might be able to use a less accurate algorithm to narrow
the candidates, and then run a similarity matching algorithm like
Levenshtein distance on the smaller set.
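
The two-stage idea above could be sketched roughly as follows: bucket words under a cheap, lossy key, then run Levenshtein distance only within the matching bucket. The `coarse_key` function here is a hypothetical stand-in (first letter plus de-vowelled consonants, truncated), and `build_index`, `similar` and the default threshold of 2 are likewise assumptions for illustration.

```ruby
# Hypothetical coarse key: first letter, then the remaining letters with
# vowels (and y) removed, repeated letters squeezed, cut to 4 characters.
def coarse_key(word)
  letters = word.downcase.delete("^a-z")
  return "" if letters.empty?
  (letters[0] + letters[1..-1].delete("aeiouy")).squeeze[0, 4]
end

# Classic dynamic-programming Levenshtein distance, one row at a time.
def levenshtein(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1,          # deletion
               curr[j - 1] + 1,      # insertion
               prev[j - 1] + cost].min # substitution or match
    end
    prev = curr
  end
  prev.last
end

# Stage 1: group every word under its coarse key (a Hash of arrays).
def build_index(words)
  words.group_by { |w| coarse_key(w) }
end

# Stage 2: compare the query only against words sharing its coarse key.
def similar(index, query, max_distance = 2)
  candidates = index.fetch(coarse_key(query), [])
  candidates.select do |w|
    levenshtein(w.downcase, query.downcase) <= max_distance
  end
end

index = build_index(%w[smith smyth smithson jones])
p similar(index, "smithe")   # ["smith", "smyth"]
```

The trade-off is that anything the coarse key puts in a different bucket, such as a word with a different first letter, is never compared at all, so the cheap key needs to err on the side of grouping too much rather than too little.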