A use case for an ordered hash

Austin Ziegler wrote:

In the case that Hal is talking about, and in the case where
PDF::Writer uses an ordered hash, this would be inappropriate.
Insertion-ordered associations are generally better. I will probably
be modifying PDF::Writer a bit to make it clear that it's using an
ordered association, not an ordered "hash", but it would be useful to
have an ordered association. Maybe we *could* take advantage of a
literal for this:

ordered_assoc = ( 1 => 2, 2 => 3, 'a' => 5 )

Maybe. For Ruby 2.0.

Either (=>) or [=>] would be fine with me. I'd slightly prefer
the latter.

But the big question is, what would the class be called? I dislike
long names such as Association or AssociativeArray or even
OrderedHash. (I suppose the last might be best of those.)

I once suggested Map -- a nice short name -- but there was some
reason this wasn't acceptable.

I'd even suggest (gasp) letting it inherit from Hash.

Hal

...and which could have been seen by someone who looked at more than
the last two messages of a thread.

-austin

···

On 8/24/06, Hal Fulton <hal9000@hypermetrics.com> wrote:

surf wrote:
> Can't you create your own using mixins and including enumerable or
> something like that ?
Yes, but I would lose the convenient literal notation.

--
Austin Ziegler * halostatue@gmail.com * http://www.halostatue.ca/
               * austin@halostatue.ca * You are in a maze of twisty little passages, all alike. // halo • statue
               * austin@zieglers.ca

Hal Fulton wrote:

surf wrote:
>
> Can't you create your own using mixins and including enumerable or
> something like that ?
>

Yes, but I would lose the convenient literal notation.

Hal

I was curious about this, so I had to try it:

···

================================================

class OrderedHash

include Enumerable

def initialize(*parm)
@arr = parm
@hash = {}
parm.each do |p|
   p.each do |k,v|
     @hash[k] = v
   end
end
end

def =(idx, val)
el = {idx => val}
@arr << el
@hash[idx] = val
end

def (idx)
@hash[idx]
end

def each
  @arr.each do |e|
    e.each do |k,v|
      yield k,v
    end
  end
end

end

# need each paramter in {}, could also work out a way to do
# [:frog,"green", :sky, "blue" etc ]

thingColor = OrderedHash.new({:frog => "green"},
                             {:sky => "blue"},
                             {:elephant => "pink"},
                             {:scarf => "red"}

                    )

puts "the sky is #{thingColor[:sky]}\n\n"

thingColor[:limo] = "black"
thingColor[:snowman] = "white"

thingColor.each do |k,v|
  puts "#{k} = #{v}"
end

Mauricio Fernandez wrote:

# But this would be nicer
class Array
  def cassoc(x) # "case assoc"
    each{|arr| return arr if arr[0] === x}
    nil
  end
end

re, action = actions.cassoc("abcdef")
action and action.call # => 1
re, action = actions.cassoc("abcx")
action and action.call # => 3

Now, that's clever. That's probably the least painful
solution I've seen using arrays.

I still wish we had a so-called ordered hash. :wink:

Hal

You could use find() there, right?

class Array
   def cassoc(match)
     find { |arr| arr.first === match }
   end
end

James Edward Gray II

···

On Aug 13, 2006, at 5:03 AM, Mauricio Fernandez wrote:

# But this would be nicer
class Array
  def cassoc(x) # "case assoc"
    each{|arr| return arr if arr[0] === x}
    nil
  end
end

Robert Klemme wrote:

Hal Fulton wrote:

I never use a hash for fast lookup. I use it because it
allows me easily to associate an arbitrary key with a
value.

Um... So you're saying that you wouldn't bother if lookups were O(n)? Wow!

I think you mean "wouldn't care"?

Anyway: I'm happy that lookups are fast. But 99% of the time,
my hashes are small enough that the time savings is probably
very small in terms of program execution time.

If I were storing 20 million elements, I wouldn't use a Hash
either. I'd either use a database or flat file; or if I
needed it in RAM I might customize something (or likely use
the red-black tree implementation).

I'm a late-optimizer. Until you mentioned it now, I had never
thought of speed as a reason to use a hash. (Although, if you
had asked me, I'd have said it was faster than O(n), of course.)

Hal

Either (=>) or [=>] would be fine with me. I'd slightly prefer
the latter.

Or we could use another %?{ } constructor, which are, after all, being
reserved for future expansion.

But the big question is, what would the class be called? I dislike
long names such as Association or AssociativeArray or even
OrderedHash. (I suppose the last might be best of those.)

OHash or Assoc or even Dict (though the latter at least implies an O(1) lookup).

Also, it would probably include RCR 306: include rbtree in the stdlib
along the way :slight_smile:

m.

···

On 8/14/06, Hal Fulton <hal9000@hypermetrics.com> wrote:

Austin Ziegler wrote:

···

On 8/24/06, Hal Fulton <hal9000@hypermetrics.com> wrote:

surf wrote:
> Can't you create your own using mixins and including enumerable or
> something like that ?
Yes, but I would lose the convenient literal notation.

...and which could have been seen by someone who looked at more than
the last two messages of a thread.

Heh heh. That's OK. We were at "that point" in the fugue.

Hal

Note: There are about a dozen implementations of an insertion ordered
hash-like objects. I've written one. Hal has probably written one. Ara
Howard has written one, I think.

All that's being discussed is whether it's worthwhile making a C
version that has a *new* literal syntax to create insertion ordered
hash-like objects. The main concerns are (1) what literal syntax, and
(2) what name. Since we're talking something *new*, we're not talking
something that would affect the performance for people who don't need
the functionality.

Just saying that we can implement this in Ruby -- or proviing such an
implementation -- isn't helpful in the least. We know this. We've done
this. We're looking for something different.

-austin

···

On 8/26/06, surf <surfunbear@yahoo.com> wrote:

I was curious about this, so I had to try it:

--
Austin Ziegler * halostatue@gmail.com * http://www.halostatue.ca/
               * austin@halostatue.ca * You are in a maze of twisty little passages, all alike. // halo • statue
               * austin@zieglers.ca

Hal Fulton wrote:

I still wish we had a so-called ordered hash. :wink:

Hal

We do:

http://raa.ruby-lang.org/project/orderedhash/

Tom

···

--
Tom Werner
Helmets to Hardhats
Software Developer
tom@helmetstohardhats.org
www.helmetstohardhats.org

Hal Fulton wrote:

Robert Klemme wrote:

Hal Fulton wrote:

I never use a hash for fast lookup. I use it because it
allows me easily to associate an arbitrary key with a
value.

Um... So you're saying that you wouldn't bother if lookups were O(n)? Wow!

I think you mean "wouldn't care"?

Probably. I'm not a native speaker so I'm certainly not authoritative on this matter. :slight_smile:

Anyway: I'm happy that lookups are fast. But 99% of the time,
my hashes are small enough that the time savings is probably
very small in terms of program execution time.

That probably also depends on the frequency of lookups. :slight_smile: But yes, my hashes are typically larger (although I also use small ones).

If I were storing 20 million elements, I wouldn't use a Hash
either. I'd either use a database or flat file; or if I
needed it in RAM I might customize something (or likely use
the red-black tree implementation).

>

I'm a late-optimizer. Until you mentioned it now, I had never
thought of speed as a reason to use a hash. (Although, if you
had asked me, I'd have said it was faster than O(n), of course.)

Interesting.

GoOd (n)ight

  robert

:wink:

Unfortunately for you some of us do things that involve searching
through 30,000 sales records to pull out Joe Bob's statistics.
Sometimes you just don't want to do that on the production database,
it just uses far too many resources. Using an ordered hash for one
person's benefit would kill the speed and jack up memory usage.

I know an ordered associative array (which I think is a better
description of the functionality) would make some things very easy,
but replacing the Hash is not the way to go.

Something like this would probably sit better, and is still shorter
than the PHP equiv.:

# a = [['key 1','1'],['key 2', '2'],['key 3',3]].toAssoc

# a['key 1']
=> 1

# p a
=> [['key 1','1'],['key 2', '2'],['key 3',3]]

···

On 8/14/06, Hal Fulton <hal9000@hypermetrics.com> wrote:

Robert Klemme wrote:
> Hal Fulton wrote:
>>
>> I never use a hash for fast lookup. I use it because it
>> allows me easily to associate an arbitrary key with a
>> value.
>
> Um... So you're saying that you wouldn't bother if lookups were O(n)?
> Wow!

I think you mean "wouldn't care"?

Anyway: I'm happy that lookups are fast. But 99% of the time,
my hashes are small enough that the time savings is probably
very small in terms of program execution time.

--
Phillip Hutchings
http://www.sitharus.com/

Wouldn't it be the same as Hash lookup? It's still got every other
characteristic of a Hash; it just guarantees an order that defaults to
insertion order.

-austin

···

On 8/14/06, Martin DeMello <martindemello@gmail.com> wrote:

OHash or Assoc or even Dict (though the latter at least implies an O(1) lookup).

--
Austin Ziegler * halostatue@gmail.com * http://www.halostatue.ca/
               * austin@halostatue.ca * You are in a maze of twisty little passages, all alike. // halo • statue
               * austin@zieglers.ca

"Martin DeMello" <martindemello@gmail.com> writes:

Either (=>) or [=>] would be fine with me. I'd slightly prefer
the latter.

Or we could use another %?{ } constructor, which are, after all, being
reserved for future expansion.

But the big question is, what would the class be called? I dislike
long names such as Association or AssociativeArray or even
OrderedHash. (I suppose the last might be best of those.)

OHash or Assoc or even Dict (though the latter at least implies an O(1) lookup).

Fun history fact:

#524<5|>lilith:~/src$ grep Dict ruby-0.49/dict.c
...
    C_Dict = rb_define_class("Dict", C_Object);
    rb_name_class(C_Dict, rb_intern("Hash")); /* alias */
...

···

On 8/14/06, Hal Fulton <hal9000@hypermetrics.com> wrote:

m.

--
Christian Neukirchen <chneukirchen@gmail.com> http://chneukirchen.org

Austin Ziegler wrote:

···

On 8/26/06, surf <surfunbear@yahoo.com> wrote:
> I was curious about this, so I had to try it:

Note: There are about a dozen implementations of an insertion ordered
hash-like objects. I've written one. Hal has probably written one. Ara
Howard has written one, I think.

All that's being discussed is whether it's worthwhile making a C
version that has a *new* literal syntax to create insertion ordered
hash-like objects. The main concerns are (1) what literal syntax, and
(2) what name. Since we're talking something *new*, we're not talking
something that would affect the performance for people who don't need
the functionality.

Just saying that we can implement this in Ruby -- or proviing such an
implementation -- isn't helpful in the least. We know this. We've done
this. We're looking for something different.

I started to get the idea to such an effect, or that some short hand
notation was desired although on my last post Hal just said it wouldn't
work due to literal notation.

up
I also need ordered hash too,use like {} literal syntax

···

--
Posted via http://www.ruby-forum.com/.

Tom Werner wrote:

Hal Fulton wrote:

I still wish we had a so-called ordered hash. :wink:

Hal

We do:

http://raa.ruby-lang.org/project/orderedhash/

I knew we were reaching that point in the fugue
again. :wink:

I wrote something similar to this six years ago.

This is not a solution for me because there is no
convenient notation for literals.

Hal

Phillip Hutchings wrote:

Anyway: I'm happy that lookups are fast. But 99% of the time,
my hashes are small enough that the time savings is probably
very small in terms of program execution time.

Unfortunately for you some of us do things that involve searching
through 30,000 sales records to pull out Joe Bob's statistics.
Sometimes you just don't want to do that on the production database,
it just uses far too many resources. Using an ordered hash for one
person's benefit would kill the speed and jack up memory usage.

Your point is well taken.

However, it remains to be seen whether it would kill the speed.
Personally I doubt it would be significant, but I've seen no
numbers one way or the other.

Memory I would consider a more serious issue. Every pair in the
hash would occupy an extra N bytes (addmitedly a low value of N,
like 4). In the case of 30,000 pairs, that's an additional (say)
120K of RAM used.

As for "one person's benefit" -- it's true that the people
wanting this functionality are in the minority. But we are
far more than one person. :slight_smile:

I know an ordered associative array (which I think is a better
description of the functionality) would make some things very easy,
but replacing the Hash is not the way to go.

That is definitely a better description of the finctionality.

I would even say that "associative array" alone would be a sufficient
term, since "array" implies order. (But I may be wrong there.)

I'd be happy with a separate class for that functionality *IF* there
were a convenient notation for literals. (For example, the combination
of square brackets and arrows that I mentioned previously, which Ruby
currently interprets as an array with a single element which is a hash.)

Without the literal notation, it would be to me just another kludge.

Hal

True, I suppose you could do it with a doubly linked list overlaid on
a hash. Was thinking of sorted order.

martin

···

On 8/14/06, Austin Ziegler <halostatue@gmail.com> wrote:

On 8/14/06, Martin DeMello <martindemello@gmail.com> wrote:
> OHash or Assoc or even Dict (though the latter at least implies an O(1) lookup).

Wouldn't it be the same as Hash lookup? It's still got every other
characteristic of a Hash; it just guarantees an order that defaults to
insertion order.

> OHash or Assoc or even Dict (though the latter at least implies an O(1)
lookup).

Wouldn't it be the same as Hash lookup? It's still got every other
characteristic of a Hash; it just guarantees an order that defaults to
insertion order.

That too is a point that needs definition. I'd like to have two behaviors.
I call the first one static, and the second one dynamic (and I describe the
second one first of course).
Than I would like to be able to write, I do not need literals, but that is
another story that shall be told another day.

dyn = D.new( :a => 42, :b => 42, :c => 42 ) #using future named params of
course
dyn[:b] = 42
puts dyn.keys.join("*") = > "a*c*b"
while the static bersion would of course result in => "a*b*c"
Even the static version has no memory of course and
static[k]=static.delete(k) would become a very meaningful statement,
suddenly :slight_smile:

Cheers
Robert

BTW
Do you think one(*42) can generalize Hash values to 42? I think that would
be a nice Set(up).

R.

···

On 8/14/06, Austin Ziegler <halostatue@gmail.com> wrote:

On 8/14/06, Martin DeMello <martindemello@gmail.com> wrote:

-austin
--
Austin Ziegler * halostatue@gmail.com * http://www.halostatue.ca/
               * austin@halostatue.ca * You are in a maze of twisty little passages, all alike. // halo • statue
               * austin@zieglers.ca

--
Deux choses sont infinies : l'univers et la bêtise humaine ; en ce qui
concerne l'univers, je n'en ai pas acquis la certitude absolue.

- Albert Einstein