Hashes and ordering

I've been wondering something today...

Do people test equality of hashes very often? I, for one,
do not.

For example, does anyone depend on these hashes being
equal?

   x = {1=>2, 3=>4}
   y = {3=>4, 1=>2}
   x == y # => true

Or does anyone have code that depends in some other way
on the fact that the ordering in a hash is NOT predictable?

Curiously,
Hal

I am writing some layout manager classes for wxRuby and I have one base class, AppSizer (pending rename) which provides most of the basic functionality for laying out widgets. I have just finished my first rendition of the ParagraphLayout which inherits from AppSizer, and I override 1 method. My question is in regards to the program design for distribution (when other people come to download a layout manager for wxRuby).

Should I avoid inheritance and have each layout manager hold their own, so users can download only the layout manager they want?
-or-,
Should I keep my design nice, clean and simple and expect users to download the layout managers as a package?

I am asking because I have never programmed for a larger distribution then myself and my guys in the IS dept. at work and I'd like to do a good job.

Thanks,

Zach

Do people test equality of hashes very often? I, for one,
do not.

For example, does anyone depend on these hashes being
equal?

   x = {1=>2, 3=>4}
   y = {3=>4, 1=>2}
   x == y # => true

Or does anyone have code that depends in some other way
on the fact that the ordering in a hash is NOT predictable?

Where are you going with this? There seem to be three unrelated things
here:

1. The question "does anyone use Hash.==(other)?"
2. The implication that you think the order in which pairs are added to
a hash is (or should be) significant.
3. The question of whether anyone depends on hashes not being "ordered"
in some sense

To see why this seems odd, let me recast your question in terms of
arrays:

<analogy>
Does anyone depend on these arrays being equal?

   x = Array.new; x[1] = 2; x[3] = 4
   y = Array.new; y[3] = 4; y[1] = 2
   x == y # => true

Or does anyone have code that depends in some other way on the fact that
the order in which items are added too an array is NOT significant?
</analogy>

-- MarkusQ

Hi,

For example, does anyone depend on these hashes being
equal?

  x = {1=>2, 3=>4}
  y = {3=>4, 1=>2}
  x == y # => true

Hash value orders (if any) should not affect equality check to honor
tradition, I think.

Or does anyone have code that depends in some other way
on the fact that the ordering in a hash is NOT predictable?

Predictable order is a subset of unpredictable orders :wink:

              matz.

···

In message "Re: Hashes and ordering" on Sat, 4 Sep 2004 11:33:04 +0900, Hal Fulton <hal9000@hypermetrics.com> writes:

Hal Fulton wrote:

I've been wondering something today...

Do people test equality of hashes very often? I, for one,
do not.

Nor I, though I'm curious where one might want to do this, other than perhaps to see if two things are really the same thing.

For example, does anyone depend on these hashes being
equal?

  x = {1=>2, 3=>4}
  y = {3=>4, 1=>2}
  x == y # => true

Or does anyone have code that depends in some other way
on the fact that the ordering in a hash is NOT predictable?

Is there a reason that ordering should be unpredictable? I'm wonder if, given a set of name/value pairs, using them to create a hash should always give the same ordering, as the hash should be following some specific algorithm to ensure, say, the fastest insertion/retrieval or some such thing. Not that it just randomly decides how and where to store something.

So, your example may be plausible, unless it breaks some fundamental aspect of hashes. (Though relying on a side effect of implementation might not be the best idea; for this to be useful, the language must assure that such behavior is part of the API.)

James

Markus wrote:

Where are you going with this? There seem to be three unrelated things
here:

Well, I have been thinking along the lines of: What if we had a built-in
ordered indexable collection in Ruby?

To see why this seems odd, let me recast your question in terms of
arrays:

<analogy>

[snip]

Clever analogy, but you're presupposing an unordered arbitrarily indexable
collection. A hash, like our Hash, is inherently unordered, but I am
imagining a similar ordered data structure.

It's more like:

   x = [1,2]
   y = [2,1]
   x == y # false - we *do* depend on this being false

I'm envisioning a data structure that has an order, but is otherwise (in
behavior, not implementation) like our current Hash.

I don't want to go into detail about what I'm thinking at the moment.

But I can see where my post was unclear or vague and would raise
very valid questions in your mind.

Hal

Markus wrote:

To see why this seems odd, let me recast your question in terms of
arrays:

<analogy>
Does anyone depend on these arrays being equal?

   x = Array.new; x[1] = 2; x[3] = 4
   y = Array.new; y[3] = 4; y[1] = 2
   x == y # => true

Or does anyone have code that depends in some other way on the fact that
the order in which items are added too an array is NOT significant?
</analogy>

This seems like apples and oranges. One definition of an array is that it's a contiguous series of memory storage locations; people take for granted that order doesn't matter. Change this and it's not an array any more.

But with hashes, allowing two hashes to have some notion of equality by virtue of having the exact same key/value pairs does not conflict with the essential behavior of a hash.

James

Is there a reason that ordering should be unpredictable? I'm wonder if,
given a set of name/value pairs, using them to create a hash should
always give the same ordering, as the hash should be following some
specific algorithm to ensure, say, the fastest insertion/retrieval or

···

On Sat, Sep 04, 2004 at 12:40:42PM +0900, James Britt wrote:
  ==================
the normal algo. with buckets/bins and collisions lists; take a look at
st.c.

some such thing. Not that it just randomly decides how and where to
store something.

It's not as if it was doing rand() just for the sake of it :slight_smile:

When two hash values collide, the order in the collision list will
depend on the insertion order of the corresponding elements.
If you keep adding elements, the hash will have to use more bins and
rehash the items, so that two elements that were in the same bin could
end in different ones afterwards.

So, your example may be plausible, unless it breaks some fundamental
aspect of hashes.

Ordered iteration comes at a cost; if you're willing to pay it, you can
use rbtree...

--
Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

I think I agree with Matz that as Hash.equal? shouldn't be changing, but an
ordered hash could be useful even if its .equal? was limited in that way. It
wouldn't break Hash for order to be maintained. But Hal's right in thinking
that this the ordered hash ideally is a new beast, a slightly different
concept from Hash, that deserves its own .equal?.

···

"Yukihiro Matsumoto" <matz@ruby-lang.org> wrote:

Hash value orders (if any) should not affect equality check to honor
tradition, I think.

Predictable order is a subset of unpredictable orders :wink:

"Hal Fulton" <hal9000@hypermetrics.com> wrote:

does anyone depend on these hashes being equal?
   x = {1=>2, 3=>4}
   y = {3=>4, 1=>2}
   x == y # => true

Or does anyone have code that depends in some other way
on the fact that the ordering in a hash is NOT predictable?

Hal Fulton wrote:

I'm envisioning a data structure that has an order, but is otherwise (in
behavior, not implementation) like our current Hash.

I don't want to go into detail about what I'm thinking at the moment.

But I can see where my post was unclear or vague and would raise
very valid questions in your mind.

Java may have such a thing now (it seems to have a Collection object for everything else), but I recall writing one for myself because the notion of an ordered hash is quite useful, mostly for reliable iteration.

James

php's arrays are like this:

     PHP: Arrays - Manual

They are pretty useful primitives. I wouldn't mind having one of these in ruby :slight_smile:

Cheers,

Patrick

···

On Friday, September 3, 2004, at 11:39 PM, Hal Fulton wrote:

Markus wrote:

Where are you going with this? There seem to be three unrelated things
here:

Well, I have been thinking along the lines of: What if we had a built-in
ordered indexable collection in Ruby?

To see why this seems odd, let me recast your question in terms of
arrays:
<analogy>

[snip]

Clever analogy, but you're presupposing an unordered arbitrarily indexable
collection. A hash, like our Hash, is inherently unordered, but I am
imagining a similar ordered data structure.

It's more like:

  x = [1,2]
  y = [2,1]
  x == y # false - we *do* depend on this being false

I'm envisioning a data structure that has an order, but is otherwise (in
behavior, not implementation) like our current Hash.

I don't want to go into detail about what I'm thinking at the moment.

But I can see where my post was unclear or vague and would raise
very valid questions in your mind.

Markus wrote:

Where are you going with this? There seem to be three unrelated things
here:

Well, I have been thinking along the lines of: What if we had a built-in
ordered indexable collection in Ruby?

What about an association list (Array)?

[["one", 1], ["two", 2], ["three", 3]].assoc("one")
=> ["one", 1]

You could wrap it in a class to have it behave more like a Hash.

To see why this seems odd, let me recast your question in terms of
arrays:

<analogy>

[snip]

Clever analogy, but you're presupposing an unordered arbitrarily indexable
collection. A hash, like our Hash, is inherently unordered, but I am
imagining a similar ordered data structure.

It's more like:

   x = [1,2]
   y = [2,1]
   x == y # false - we *do* depend on this being false

I'm envisioning a data structure that has an order, but is otherwise (in
behavior, not implementation) like our current Hash.

I don't want to go into detail about what I'm thinking at the moment.

But I can see where my post was unclear or vague and would raise very
valid questions in your mind.

Hal

Regards,
KB

···

On Sat, 04 Sep 2004 12:39:31 +0900, Hal Fulton wrote:

"Dave Burt" <burtdav@hotmail.com> schrieb im Newsbeitrag
news:Gye_c.18929$D7.7399@news-server.bigpond.net.au...

I think I agree with Matz that as Hash.equal? shouldn't be changing, but

an

ordered hash could be useful even if its .equal? was limited in that way.

It

wouldn't break Hash for order to be maintained. But Hal's right in

thinking

that this the ordered hash ideally is a new beast, a slightly different
concept from Hash, that deserves its own .equal?.

> Hash value orders (if any) should not affect equality check to honor
> tradition, I think.
>
> Predictable order is a subset of unpredictable orders :wink:

> does anyone depend on these hashes being equal?
> x = {1=>2, 3=>4}
> y = {3=>4, 1=>2}
> x == y # => true
>
> Or does anyone have code that depends in some other way
> on the fact that the ordering in a hash is NOT predictable?

As far as I understood Hal he doesn't want to mandate Hash to change but
rather a new class that implements an ordered map. (experimental impl
attached)

    robert

TreeMap.rb (1.91 KB)

···

"Yukihiro Matsumoto" <matz@ruby-lang.org> wrote:
"Hal Fulton" <hal9000@hypermetrics.com> wrote:

But my point was, without some details it is very hard to respond
meaningfully to your questions. At least, you seem to see a single
meaningful was to connect what to me are marginally related things.

Do you mean only that Hash.== should act as if there were some
consistent ordering of the key=>value pairs? If so, this could be
accomplished easily with out ever actually ordering them.

On the other hand, do you mean that you would like Hash.each et. al. to
proceed through the items in the hash in a predictable way? If so:

        Do you mean that the order should be determined by the keys,
        regardless of how they are inserted? If so, how do you want to
        order items whose keys are not comparable?
        
        Do you mean that the order should be determined by the order in
        which the pairs are added? If so, what do you do when the value
        at a key is replaced? Or just changed (internally modified)?
        
        Do you mean that the order should be determined by some other
        factor, and if so, what?
        
Or are you just wanting arrays that take arbitrary subscripts?

-- MarkusQ

···

On Fri, 2004-09-03 at 20:39, Hal Fulton wrote:

I don't want to go into detail about what I'm thinking at the moment.

Hi Hal,

Well, I have been thinking along the lines of: What if we had a built-in
ordered indexable collection in Ruby?

[...]

It's more like:

   x = [1,2]
   y = [2,1]
   x == y # false - we *do* depend on this being false

I'm envisioning a data structure that has an order, but is otherwise (in
behavior, not implementation) like our current Hash.

There's a nice structure in stl called a multimap... I've used
it recently to model a priority queue.

Quoting from Generic Programming and the STL by M H Austern:

  multimap<Key, T, Compare, Allocator>
    A multimap is a Sorted Associative Container that associates
  objects of type Key with objects of type T. It is a Pair
  Associative Container, meaning that its value type is
  pair<const Key, T>. It is also a Multiple Associative
  Container, meaning that it may contain two or more identical
  elements. (This is the only way in which map and multimap
  differ.) That is, the multimap class doesn't map a value of
  type Key to _an_ object of type T, but rather to _one or more_
  objects of type T.
    Since multimap is a Sorted Associatiave Container, its
  elements are always sorted in ascending order. The template
  parameter Compare defines the ordering relation.
    Like list, multimap has the important property that inserting
  a new element does not invalidate iterators that point to
  existing elements. Erasing an element from multimap does not
  invalidate any iterators either, except, of course, for
  iterators that point to the element that is being erased.

It's pretty cool - you can just toss values into it with
insert() as you would a hash. But you can iterate forward
and backward through it, seeing the elements in sorted
order (sorted by Key).

So for my priority queue, my key is simply an integer and
my value is... well just any ol' object.

Since it's a *multi* map instead of just map, I can toss
in multiple elements having the same priority. I.e.
  pri_q.insert(pair(9, "a"))
  pri_q.insert(pair(10, "b"))
  pri_q.insert(pair(10, "c"))
  pri_q.insert(pair(11, "d"))

...The 9, 10, and 11 are the keys. It'll store both 10's for
me without conflict. (Instead of the 10=>"c" replacing the
10=>"b".)

Anyway - dunno if this is anything like what you're
thinking of.

Regards,

Bill

···

From: "Hal Fulton" <hal9000@hypermetrics.com>

"James Britt" <jamesUNDERBARb@neurogami.com> schrieb im Newsbeitrag
news:41393AFC.4090001@neurogami.com...

Hal Fulton wrote:
>
> I'm envisioning a data structure that has an order, but is otherwise (in
> behavior, not implementation) like our current Hash.
>
> I don't want to go into detail about what I'm thinking at the moment.
>
> But I can see where my post was unclear or vague and would raise
> very valid questions in your mind.

Java may have such a thing now (it seems to have a Collection object for
everything else), but I recall writing one for myself because the notion
of an ordered hash is quite useful, mostly for reliable iteration.

What do you mean by "now"? java.util.TreeMap is quite old (at least since
1.3).

    robert

"Kristof Bastiaensen" <kristof@vleeuwen.org> schrieb im Newsbeitrag
news:pan.2004.09.04.07.58.02.69935@vleeuwen.org...

···

On Sat, 04 Sep 2004 12:39:31 +0900, Hal Fulton wrote:

> Markus wrote:
>> Where are you going with this? There seem to be three unrelated things
>> here:
>
> Well, I have been thinking along the lines of: What if we had a built-in
> ordered indexable collection in Ruby?
>

What about an association list (Array)?

[["one", 1], ["two", 2], ["three", 3]].assoc("one")
=> ["one", 1]

You could wrap it in a class to have it behave more like a Hash.

Very inefficient if you have many lookups (O(n)). And it doesn't maintain
order automatically.

The typical and most efficient implementation in the general case of an
ordered map is a tree AFAIK.

Kind regards

    robert

Markus wrote:

But my point was, without some details it is very hard to respond
meaningfully to your questions. At least, you seem to see a single
meaningful was to connect what to me are marginally related things.

Yes, true.

On the other hand, do you mean that you would like Hash.each et. al. to
proceed through the items in the hash in a predictable way? If so:

        Do you mean that the order should be determined by the keys,
        regardless of how they are inserted? If so, how do you want to
        order items whose keys are not comparable?

I want insertion order.

        Do you mean that the order should be determined by the order in
        which the pairs are added? If so, what do you do when the value
        at a key is replaced? Or just changed (internally modified)?

Replaced/modified values -- a very good question to which I have
given no thought.

        Or are you just wanting arrays that take arbitrary subscripts?

Probably so, yes. With the condition that a convenient syntax for
literals should exist.

Hal

jib:~ > cat a.rb
   require 'arrayfields'

   a = [0,1,2]
   a.fields = %w(zero one two)
   a['three'] = 3
   a.each_with_field{|k,v| puts "#{ k } -> #{ v }"}

   jib:~ > ruby a.rb
   0 -> zero
   1 -> one
   2 -> two
   3 -> three

arrayfields does a whole lot more that this

   jib:~ > cat b.rb
   require 'arrayfields'

   fa =
     FieldedArray[
       'zero', 0,
       'one', 1,
       'two', 2,
       'three', 3,
     ]

   fa.each_pair{|k,v| puts "#{ k } -> #{ v }"}

   a = [0,1,2,3]
   a.fields = fa.fields

   a.store 'three', 42
   p a['three']
   p a.keys
   p a.values

   jib:~ > ruby b.rb
   zero -> 0
   one -> 1
   two -> 2
   three -> 3
   42
   ["zero", "one", "two", "three"]
   [0, 1, 2, 42]

i think the literal syntax is pretty clear:

   jib:~ > cat c.rb
   require 'arrayfields'

   fa = FieldedArray[
     ['zero', 0],
     ['one', 1],
     ['two', 2],
     ['three', 3],
   ].each_pair{|k,v| puts "#{ k } -> #{ v }"}

   fa = FieldedArray[
     'zero', 0,
     'one', 1,
     'two', 2,
     'three', 3,
   ].each_pair{|k,v| puts "#{ k } -> #{ v }"}

   jib:~ > ruby c.rb
   zero -> 0
   one -> 1
   two -> 2
   three -> 3
   zero -> 0
   one -> 1
   two -> 2
   three -> 3

kind regards.

-a

···

On Sun, 5 Sep 2004, Hal Fulton wrote:

Or are you just wanting arrays that take arbitrary subscripts?

Probably so, yes. With the condition that a convenient syntax for
literals should exist.

--

EMAIL :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
PHONE :: 303.497.6469
A flower falls, even though we love it;
and a weed grows, even though we do not love it. --Dogen

===============================================================================

"Kristof Bastiaensen" <kristof@vleeuwen.org> schrieb im Newsbeitrag
news:pan.2004.09.04.07.58.02.69935@vleeuwen.org...

> Markus wrote:
>> Where are you going with this? There seem to be three unrelated
>> things here:
>
> Well, I have been thinking along the lines of: What if we had a
> built-in ordered indexable collection in Ruby?
>
>
What about an association list (Array)?

[["one", 1], ["two", 2], ["three", 3]].assoc("one") => ["one", 1]

You could wrap it in a class to have it behave more like a Hash.

Very inefficient if you have many lookups (O(n)). And it doesn't maintain
order automatically.

Yes, that's true. I mentioned it because it is already in the language.

The typical and most efficient implementation in the general case of an
ordered map is a tree AFAIK.

Yes! And a balanced tree. It is probably the best when you want
the map ordered by key.

If you wanted insertion order, what would you think of the combination
of a Hash and a linked list? The hash could provide key lookup, and the
linked list would keep the order. (It would need to be only a
single-linked list).

Kind regards

    robert

Regards,
Kristof

···

On Sat, 04 Sep 2004 11:33:54 +0200, Robert Klemme wrote:

On Sat, 04 Sep 2004 12:39:31 +0900, Hal Fulton wrote: