I’m after something like nilsimsa[1]: a hashing algorithm that allows you to
determine how similar two strings are. Ideally similar strings would
have identical hashes so it’s a simple hash lookup; otherwise I might
have to do some work, and we can’t have that.
Before I try to make use of nilsimsa, can anyone point out similar
algorithms that might be of use, especially if they’re already available
as Ruby modules?
On Thursday 06 June 2002 08:51 am, Thomas Hurst wrote:
> I’m after something like nilsimsa[1]: a hashing algorithm that
> allows you to determine how similar two strings are. Ideally
> similar strings would have identical hashes so it’s a simple hash
> lookup; otherwise I might have to do some work, and we can’t have
> that.
> Before I try to make use of nilsimsa, can anyone point out similar
> algorithms that might be of use, especially if they’re already
> available as Ruby modules?
Mmmn, the thing is, I really need a key I can use for a database and a
hash. Maybe something like an overlong soundex, or a more generic
hashing algorithm that produces identical hashes for strings that are
similar up to a certain threshold.
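
To make the "overlong soundex" idea a bit more concrete, here is a rough Ruby sketch. It is a simplified Soundex variant, not the standard algorithm (it skips the special h/w rule), and the name `long_soundex` and the default key length of 8 are assumptions made up for illustration. Strings that collapse to the same key land in the same Hash bucket, so the key would also work as a database column.

```ruby
# Soundex-style consonant classes; vowels, h, w and y carry no code.
SOUNDEX_CODES = {
  'b' => '1', 'f' => '1', 'p' => '1', 'v' => '1',
  'c' => '2', 'g' => '2', 'j' => '2', 'k' => '2',
  'q' => '2', 's' => '2', 'x' => '2', 'z' => '2',
  'd' => '3', 't' => '3',
  'l' => '4',
  'm' => '5', 'n' => '5',
  'r' => '6'
}.freeze

# Hypothetical helper: a simplified Soundex with a configurable key length.
# Keeps the first letter, then appends consonant codes, skipping a code
# that directly repeats the previous letter's code.
def long_soundex(str, length = 8)
  letters = str.downcase.scan(/[a-z]/)
  return '' if letters.empty?

  key = letters.first.upcase
  previous = SOUNDEX_CODES[letters.first]
  letters.drop(1).each do |ch|
    code = SOUNDEX_CODES[ch]
    key << code if code && code != previous
    previous = code
    break if key.length >= length
  end
  key.ljust(length, '0') # pad short keys so every key has the same width
end

# Bucket strings by key; a lookup is then a plain Hash access.
index = Hash.new { |hash, key| hash[key] = [] }
%w[Robert Rupert Levenshtein Levenstein].each do |name|
  index[long_soundex(name)] << name
end
```

A longer key separates more strings, at the cost of missing matches whose difference falls inside the key, so the length is a tuning knob rather than a fixed choice.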
I suppose I might be able to use a less accurate algorithm to narrow
things down, then run a similarity measure like Levenshtein distance on
the smaller set.
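
The second stage of that could look something like the sketch below: a plain dynamic-programming Levenshtein distance, applied only to the candidates that survived the coarse lookup. The method names and the `max_distance` default of 2 are assumptions for illustration, not taken from any existing Ruby module.

```ruby
# Levenshtein (edit) distance between two strings, keeping only two
# rows of the dynamic-programming table at a time.
def levenshtein(a, b)
  return b.length if a.empty?
  return a.length if b.empty?

  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,       # deletion
        current[j - 1] + 1,    # insertion
        previous[j - 1] + cost # substitution (or match)
      ].min
    end
    previous = current
  end
  previous.last
end

# Hypothetical helper: keep the candidates within max_distance edits of
# the query, nearest first. `candidates` is the smaller set returned by
# whatever coarse key lookup ran beforehand.
def close_matches(query, candidates, max_distance = 2)
  candidates
    .map { |candidate| [candidate, levenshtein(query, candidate)] }
    .select { |_, distance| distance <= max_distance }
    .sort_by { |_, distance| distance }
    .map(&:first)
end
```

Since each comparison costs time proportional to the product of the two string lengths, this only stays cheap if the first pass keeps the candidate set small.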