What is the purpose of string hash? What would you use it for?
EXAMPLE: "This is a test.".hash RETURNS -649841898. WHAT EXACTLY DOES
THIS REPRESENT? WHAT IS IT GOOD FOR? COMPARISONS?
--
Posted via http://www.ruby-forum.com/.
Ron Green wrote:
> What is the purpose of string hash? What would you use it for?
> EXAMPLE: "This is a test.".hash RETURNS -649841898. WHAT EXACTLY DOES
> THIS REPRESENT? WHAT IS IT GOOD FOR? COMPARISONS?
Run "another test".hash - you get a different number.
It's good for hashes; look them up in any "data structures" textbook. It's a unique number useful for rapidly accessing that string.
And please DON'T SCREAM!
--
Phlip
Test Driven Ajax (on Rails) [Book]
"Test Driven Ajax (on Rails)"
assert_xpath, assert_javascript, & assert_ajax
It should be noted, though, that String#hash isn't guaranteed to be unique.
Marcel Molina Jr. wrote:
> It should be noted though that String#hash isn't guaranteed to be
> unique.
Then, again I ask, what is it good for?
Ron Green wrote:
> > It should be noted though that String#hash isn't guaranteed to be
> > unique.
> Then, again I ask, what is it good for?
Then I answer again: A tutorial on data structures tells how to use them.
(Maybe I shouldn't post when I'm having a bad day, huh?)
--
Phlip
It's still useful as a hash. Marcel wasn't wrong, but *no* fixed size hash
is "guaranteed" to be unique as that's absolutely impossible, per the
pigeonhole principle (http://en.wikipedia.org/wiki/Pigeonhole_principle).
String#hash's hash is of a far lower "quality" than that offered by, say,
SHA-1 or SHA-2.
Regards,
Peter Cooper
Ron Green wrote:
> Then, again I ask, what is it good for?
It's for internal use: Hash uses it to make storing and finding string keys faster.
Peter Cooper wrote:
> It's still useful as a hash. Marcel wasn't wrong, but *no* fixed size
> hash is "guaranteed" to be unique as that's absolutely impossible, per the
> pigeonhole principle (http://en.wikipedia.org/wiki/Pigeonhole_principle).
> String#hash's hash is of a far lower "quality" than that offered by, say,
> SHA-1 or SHA-2.
> Regards,
> Peter Cooper
> http://www.rubyinside.com/
Peter,
If it's not guaranteed to be unique, then it can't be used for identity.
Can you give me an example of how I would use string.hash?
Let's put it this way. MD5 and SHA-* hashes aren't *guaranteed* to be
unique either. There are just many more cases where strings will share a
hash with String#hash than with something like MD5/SHA-*.
Hash values are useful for locating strings in hash tables. You use this every
time you say something like:
foo = {"bar" => "baz"}
foo["bar"] # => "baz"
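To make the point above concrete, here is a small sketch (the variable names are made up for illustration) showing that the lookup goes by content: two separate String objects with the same characters report the same hash value within one Ruby process, so either one finds the entry.

```ruby
# Two distinct String objects with the same content.
key_a = "bar"
key_b = "bar".dup

foo = { key_a => "baz" }

same_object = key_a.equal?(key_b)       # identity check: these are different objects
same_hash   = key_a.hash == key_b.hash  # equal strings hash alike (within one process)
looked_up   = foo[key_b]                # lookup succeeds by content, not by identity

puts same_object  # false
puts same_hash    # true
puts looked_up    # baz
```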
HTH,
--
Konrad Meyer <konrad@tylerc.org> http://konrad.sobertillnoon.com/
Ron Green wrote:
> If it's not guaranteed to be unique, then it can't be used for identity.
> Can you give me an example of how I would use string.hash?
In general, you wouldn't use String#hash, although you might conceivably want to override it. It's there for Hash. From the documentation on Object#hash:
"Generates a Fixnum hash value for this object. This function must have the property that a.eql?(b) implies a.hash == b.hash. The hash value is used by class Hash."
Note the direction of implication: a.eql?(b) => a.hash == b.hash, not a.hash == b.hash => a.eql?(b).
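A minimal sketch of that contract in a user-defined class. The Point class is hypothetical and not from the thread; the only assumption is that you want two points with the same coordinates to act as the same Hash key.

```ruby
# Hypothetical value class used as a Hash key.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Hash calls eql? to confirm that two keys really are the same key.
  def eql?(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end
  alias == eql?

  # Equal points must return equal hash values, so derive it from the same fields.
  def hash
    [x, y].hash
  end
end

p1 = Point.new(1, 2)
p2 = Point.new(1, 2)
visited = { p1 => true }

puts p1.eql?(p2)          # true
puts p1.hash == p2.hash   # true, as the contract requires
puts visited[p2].inspect  # true: p2 finds the entry stored under p1
```

Without the hash override, p1 and p2 would most likely land on different hash values and the last lookup would return nil, even though eql? says they are equal.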
--
Alex
No hash function is guaranteed to be unique for all inputs. I don't know
what exactly you mean by "can't be used for identity".
The hash method on string (and any other object) is used by ruby when
storing that type of object in a hash table. Haven't looked through the
ruby sources, but here's how this sort of thing generally works and the
problem that we're trying to solve.
We want a data structure that lets us retrieve some key/value pair in
(hopefully) constant time. This means that, no matter how many items of
data we're storing, it will always take us (roughly) the same amount of time
to retrieve an arbitrary item (i.e., O(1)). A simple way to do this is to
have some hash function that breaks our inputs up into "buckets". Then,
when we get a given input, we feed it into that function and look into the
"bucket" that it gives us back.
An example of a very simple hash function for ASCII strings would be
something like this:
Sum the ASCII values of each character in the string, then take that sum
modulo 10. This would give us ten "buckets" to look in.
Obviously, there will be a whole lot of overlap here, as we have arbitrary
length strings as inputs to our hash function and only 10 possible outputs
(so this is not a reversible function.)
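The toy function just described can be sketched in a few lines. The name toy_hash is made up here, and this is emphatically not what String#hash does; it only shows how little information ten buckets can carry.

```ruby
# Toy hash function from the description above: sum the byte values, then mod 10.
# NOT Ruby's String#hash; toy_hash is a hypothetical name for illustration.
def toy_hash(string, buckets = 10)
  string.bytes.sum % buckets
end

puts toy_hash("cat")  # (99 + 97 + 116) % 10  => 2
puts toy_hash("act")  # same letters, same sum => 2
puts toy_hash("dog")  # (100 + 111 + 103) % 10 => 4
```

Anagrams always collide under this scheme, since addition ignores order. That is one reason real hash functions mix the position of each byte into the result.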
We get around this problem by storing the values in each bucket in some
other data structure, say a linked list. We then get the bucket (using the
hash function) and search through the list for our value.
A simple example using the hash function I described above (which is *not*
ruby's builtin string hash function.)... Both "a" and "k" would fall into
the same bucket:
irb(main):021:0> "a".ord % 10
=> 7
irb(main):022:0> "k".ord % 10
=> 7
So if we wanted to store the keys "a" and "k" in our hash table, they would
both live in bucket seven. Within bucket seven, they would have some
internal organization that lets us retrieve them once we know the bucket
(say a linked list). So if we want to find the value associated with the
key "a" we would perform our hash function on "a" and get bucket 7. We
would then look through the list located in bucket 7 until we find the key
"a", and retrieve the value associated with that key.
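Putting the bucket walk-through into code, here is a rough sketch of such a table. The ToyHashTable class and its method names are invented for illustration, plain arrays stand in for the linked lists, and Ruby's own Hash is implemented differently.

```ruby
# A toy hash table with ten buckets, each holding a list of [key, value] pairs.
# Hypothetical class; Ruby's built-in Hash does not work exactly like this.
class ToyHashTable
  BUCKETS = 10

  def initialize
    @buckets = Array.new(BUCKETS) { [] }
  end

  # Same toy function as in the post: sum of byte values modulo the bucket count.
  def bucket_index(key)
    key.bytes.sum % BUCKETS
  end

  def store(key, value)
    bucket = @buckets[bucket_index(key)]
    pair = bucket.find { |k, _| k == key }
    if pair
      pair[1] = value          # key already present: overwrite its value
    else
      bucket << [key, value]   # new key: append to this bucket's list
    end
    value
  end

  def fetch(key)
    bucket = @buckets[bucket_index(key)]
    pair = bucket.find { |k, _| k == key }  # linear search inside one bucket only
    pair && pair[1]
  end
end

table = ToyHashTable.new
table.store("a", 1)
table.store("k", 2)

puts table.bucket_index("a")  # 7
puts table.bucket_index("k")  # 7
puts table.fetch("a")         # 1
puts table.fetch("k")         # 2
```

Both keys share bucket 7, yet each fetch returns the right value, because the final step compares the actual keys rather than trusting the bucket number.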
Ruby's hash tables would work in a similar way, though I don't know what
sort of data structure they use internally to the buckets. (It could be
anything, of course, not just a linked list.)
So you probably would not use string.hash directly often (if ever), but
every time you do something like aHash = {"foo" => 42}, it's being used by
ruby.
Again, any decent data structures book will explain this better than I can,
if you're truly interested.
MBL
Alex Young wrote:
> In general, you wouldn't use String#hash, although you might conceivably
> want to override it. It's there for Hash. From the documentation on
> Object#hash:
> "Generates a Fixnum hash value for this object. This function must have
> the property that a.eql?(b) implies a.hash == b.hash. The hash value is
> used by class Hash."
I think I understand. In other words it's not something I would use
directly. I just ran across it in Peter's book and wanted to make sure I
understood. Thanks everybody. Sorry if my ignorance pissed you off,
Phlip.
Ron Green wrote:
Michael Bevilacqua-Linn wrote:
MBL
I think I'll try Data Structures and Algorithms (Addison-Wesley Series
in Computer Science and Information Pr)
When I said it couldn't be used for identity, I meant: if you can't
guarantee uniqueness, how would you know you retrieved the correct
data?
I might suggest you should *never* overwrite String#hash, though you
may want to overwrite Object#hash in your own classes. It's a bit
tricky though since the hash should be unique (or close enough), never
change, and two objects that share the same hash should be
Object#equal. You might also overwrite String#hash in a singleton
class instance, but never the base declaration. That's just askin' for
trouble.
From: "Ron Green" <rongreen1@mac.com>
> I think I understand. In other words it's not something I would use directly. I just ran across it in Peter's book and wanted to make sure I understood.
More info:
Hash function - Wikipedia
Hash table - Wikipedia
Regards,
Bill
Ron Green wrote:
> When I said it couldn't be used for identity I meant if you can't
> guarantee uniqueness. How would you know if you retrieved the correct
> data.
You don't need uniqueness. The hash did its job when you can almost instantly chop billions of strings down to a short list of candidate strings. After the hash collision, you trivially search the list for the actual target. The point is to access the stored value, at the target location, quickly!
This is how Google works, for example...
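As a sketch of why uniqueness isn't needed, here is a deliberately bad key class (CollidingKey is a made-up name) whose hash is the same for every instance. Ruby's Hash still returns the correct values, because after matching the hash it confirms the key with eql?.

```ruby
# Hypothetical key class whose hash is deliberately terrible: every key collides.
class CollidingKey
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def hash
    42  # same value for every instance, so all keys share one hash
  end

  def eql?(other)
    other.is_a?(CollidingKey) && name == other.name
  end
end

lookup = {
  CollidingKey.new("apple")  => :fruit,
  CollidingKey.new("carrot") => :vegetable
}

puts lookup.size                                 # 2
puts lookup[CollidingKey.new("carrot")].inspect  # :vegetable
puts lookup[CollidingKey.new("pear")].inspect    # nil
```

The cost of a poor hash is speed, not correctness: with every key colliding, each lookup degrades into a search through all the entries.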
--
Phlip
Ron Green wrote the following on 09.09.2007 21:48 :
> I think I understand. In other words it's not something I would use
> directly.
You could. Hashes are mainly used to restrict the set of objects you
have to look into to find objects identical to one you have or detect
changes in values (with a small margin for false negatives you must be
able to afford).
The Hash class uses it (I'm guessing storing the objects in a balanced
tree using object hashes as keys for quick access).
I used a hash method (not Ruby's default one, because I wasn't sure
it would still use the same algorithm in Ruby 3.0...) not so long
ago to code a correlation algorithm across text contents. I'll spare the
details, but I used hashes to get both item lookup speed and storage
space efficiency.
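The post doesn't say which hash method was used, so the following is only an illustration of the change-detection idea with a digest from Ruby's standard library. SHA-256 is an assumption chosen for the example, and fingerprint is a made-up helper name. The caution about the default is well founded: the Object#hash documentation notes that the value may differ across invocations or implementations of Ruby, so it is unsuitable for anything stored between runs.

```ruby
require "digest/sha2"

# Fingerprint text with SHA-256 so the value is stable across processes and Ruby versions.
# (fingerprint is a hypothetical helper, not something from the thread.)
def fingerprint(text)
  Digest::SHA256.hexdigest(text)
end

old_text = "This is a test."
new_text = "This is a test!"

stored = fingerprint(old_text)  # keep this short value instead of the whole text

puts fingerprint(old_text) == stored  # true: content unchanged
puts fingerprint(new_text) == stored  # false: content changed
puts stored.length                    # 64 hex characters
```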
Lionel
No, just the other way around. Two objects that are equal should have
the same hash value. But there's no requirement that two objects with
the same hash value be equal.
The way hash and equal interact in the implementation of the Hash
class is that comparing two objects hash values acts as a quick test
to rule out the objects being equal. If they don't have the same hash
they are assumed NOT to be equal. If they do then equality is tested
for the final determination.
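The quick-test-then-confirm sequence can be sketched in plain Ruby. Everything here is hypothetical: CountedString uses the string length as a stand-in hash so the outcome is predictable, and same_key? only mimics the order of checks described above rather than reproducing the interpreter's internals.

```ruby
# Hypothetical class that counts how often the full equality check runs.
class CountedString
  attr_reader :text

  @eql_calls = 0
  class << self
    attr_accessor :eql_calls
  end

  def initialize(text)
    @text = text
  end

  # Stand-in hash for the demo: just the length, so collisions are easy to arrange.
  def hash
    text.length
  end

  def eql?(other)
    self.class.eql_calls += 1
    text == other.text
  end
end

# Two-step test: cheap hash comparison first, full equality only if the hashes match.
def same_key?(a, b)
  return false unless a.hash == b.hash  # different hashes: cannot be equal
  a.eql?(b)                             # same hash: confirm with a real comparison
end

cat   = CountedString.new("cat")
horse = CountedString.new("horse")
dog   = CountedString.new("dog")
cat2  = CountedString.new("cat")

r1 = same_key?(cat, horse)  # hashes differ (3 vs 5), eql? is skipped
r2 = same_key?(cat, dog)    # hashes match, eql? runs and says no
r3 = same_key?(cat, cat2)   # hashes match, eql? runs and says yes

puts r1                       # false
puts r2                       # false
puts r3                       # true
puts CountedString.eql_calls  # 2
```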
If you want to think of it from an analogy with criminal law, if an
object is suspected of being equal to another object, the hashes must
be the same for an indictment, and the trial consists of actually
testing for equality.
On 9/9/07, Sam Smoot <ssmoot@gmail.com> wrote:
I might suggest you should *never* overwrite String#hash, though you
may want to overwrite Object#hash in your own classes. It's a bit
tricky though since the hash should be unique (or close enough), never
change, and two objects that share the same hash should be
Object#equal.
--
Rick DeNatale
My blog on Ruby
http://talklikeaduck.denhaven2.com/
Bill Kelly wrote:
> More info:
> Hash function - Wikipedia
> Hash table - Wikipedia
Thanks Bill.
Phlip wrote:
> You don't need uniqueness. The hash did its job when you can almost
> instantly chop billions of strings down to a short list of candidate
> strings. After the hash collision, you trivially search the list for the
> actual target. The point is to access the stored value, at the target
> location, quickly!
> This is how Google works, for example...
Thank you.