Subclass of hash, that times last read

Hi all. I'm rather unfamiliar with the subclass / superclass principle,
but I just tried to write one. The result is below here, and should be a
hash-like class, with the exception that it stores the last-accessed
time (the purpose of this being that I can more easily delete entries
from the hash that haven't been used in a while - I'm using a hash for
caching a costly function, but over time the hash grows big and slow,
with a lot of old entries in it.. I'm trying out different ways of
pruning the hash, and this seemed to be an interesting option).

The below code seems to work fine, but it would be a comfort to me to
know if I'm introducing several hairy bug-prone things if I add this to
my code, or that this is the relatively safe and correct way to do it.

class Myhash < Hash
   alias :old_write :[]=
   def []=(key,value)
      old_write(key,[value,Time.now])
   end

   alias :old_read :[]
   def [](key)
      value = old_read(key)
      old_write(key,[value[0],Time.now]) if value.respond_to?(:[])
      old_read(key)
   end
end

hash = Myhash.new
1000.times {hash[rand] = rand}
sleep 7
1000.times {hash[rand] = rand}
hash.delete_if {|k,v| Time.now - v[1] > 5}

Which should delete the 1000 entries from the hash that were created in
the first loop.

···

--
Posted via http://www.ruby-forum.com/.

Hi all. I'm rather unfamiliar with the subclass / superclass principle,
but I just tried to write one. The result is below here, and should be a
hash-like class, with the exception that it stores the last-accessed
time (the purpose of this being that I can more easily delete entries
from the hash that haven't been used in a while - I'm using a hash for
caching a costly function, but over time the hash grows big and slow,
with a lot of old entries in it.. I'm trying out different ways of
pruning the hash, and this seemed to be an interesting option).

The below code seems to work fine, but it would be a comfort to me to
know if I'm introducing several hairy bug-prone things if I add this to
my code, or that this is the relatively safe and correct way to do it.

The only bug I've found so far is that it won't work if the user provides a
default value for the hash.

But allow me to nitpick a bit:

class Myhash < Hash
   alias :old_write :[]=
   def []=(key,value)
      old_write(key,[value,Time.now])
   end

I would avoid aliasing things out of the way. It pollutes the namespace -- you
now have an old_write method, which is really only used by your one writer
method.

And I think a conventional way to do that is to choose names that reflect what
you're doing to change it -- for example, here, you'd choose something
like :write_without_myhash. (Imagine someone inherits from the stack, and you
both do the exact same thing, and define a method called old_write. You'll
create an infinite loop.)

Since you're in a subclass anyway, though, you get this for free:

def []=(key,value)
  super(key,[value,Time.now])
end

   alias :old_read :[]
   def [](key)
      value = old_read(key)
      old_write(key,[value[0],Time.now]) if value.respond_to?(:[])
      old_read(key)
   end
end

The same can be done here, reducing some code duplication:

def [](key)
  value = super(key)
  value[1] = Time.now if self.include? key
  value
end

If that makes sense, you can skip the part where I take five times as long to
explain it...

I'm setting value[1] to Time.now. This will stay, because it's still the same
array object -- I don't have to change anything in the hash.

I'm checking for self.include?, because this will tell me if that particular
key has been assigned yet. If it hasn't, we're dealing with a default value,
and I'm not sure quite what you want to do -- probably store it and assign a
timestamp anyway? If so:

def [](key)
  value = super(key)
  if self.include? key
    value[1] = Time.now
  else
    self[key] = value
    value = self[key]
  end
  value
end

And since you already have the result of super(key) in a value, I'm returning
that value, rather than looking it up again, when I an. Because it's still
the same array, it will have the new timestamp, which is what your old code
did.

hash.delete_if {|k,v| Time.now - v[1] > 5}

You might consider putting this in your class. For example:

def delete_before(time)
  delete_if { |k,v| v.last < time }
end
hash.delete_before(Time.now - 5)

Maybe even:

def delete_older_than(seconds)
  delete_before(Time.now - seconds)
end
hash.delete_older_than(5)

···

On Wednesday 28 May 2008 13:28:58 Boris Schmid wrote:

My approach would be quite different. I'd use files to store the values
("costly functions"), with the name of the file being determined by the
key. (A sort of mini-database I suppose.)

It's very easy to determine when a file was last accessed, so you could
have a separate process running in the background looking for files to
delete. When I did this, I used a cron job, but other approaches are
possible of course.

File accesses are going to be slower than hash lookups, but whether this
is important or not depends on your application. And if some keys are
used very frequently, the associated files should get cached in memory
by the OS, so they should be quick to read.

Of course you could wrap this in a class called MyHash. :wink:

···

--
Posted via http://www.ruby-forum.com/.