Yet another caching mechanism. The one without the cache

(Erik Veenstra) #1

I had a list of points. Many points. Too many points. It had to
be reduced, in order for my GPS to be able to store it. 250
points, max.

One of the algorithms I came up with, looked for "the worst
point", removed it and started over again. Until the list was
short enough. It's a clean algorithm and not hard to implement.

But...

Determining the "worst point" involves a lot of creating lines
to neighboring points, calculating cross products, calculating
angles, comparing angles and a lot of other expensive stuff.

It was slow. Very slow. Of course it was. Search the list,
shorten it by one element and search the list again. Sounds
like 99% of the work in an iteration is already done in the
previous iteration. To avoid this, we can a. alter the
algorithm, or b. add a cache to the points.

I decided to add the cache to keep the algorithm clean. (I was
still tinkering with it. For result, not speed.) But I didn't
want to pollute the "business classes" either, just for
speeding up this particular algorithm. Asking a point to
calculate _the_ angle by giving it two points, with
"Point#angle(p1,p2)", sounds reasonable. But what's the meaning
of "Point@angle"? By not storing the result of a method call,
we avoid side-effects and keep the algorithm pure functional.
That's a good thing.

We could use "memoize" or "once" to cache the result of a
method call, but that pollutes the class Point, while we only
want to add the cache to the objects involved in this
particular algorithm, not to all instances of the class. Okay,
we could apply memoize on specific objects with
"point.extend(Memoize).memoize(:angle)", but how can we "undo"
this memoization in order to get rid of the burden of the cache
after we've finished our reducing-algorithm?

Is there another way to do it? Well, yes, there is.

points = points.collect{|point| Cache::Cache.cache(point)}

This replaces every point in the list with a Cache object. This
cache object is catches every call to a method of a point (as
in "business methods", not Object.instance_methods, on purpose)
and uses its @cache and @real_object to return the proper
value.

The Cache objects are a kind of "delegators-with-cache".

After our algorithm has finished, we want to get rid of these
Cache objects. That's easy:

points = points.collect{|point| Cache::Cache.un_cache(point)}

In short: Don't change the business classes, don't change the
algorithm. Just add one line before the algorithm and one line
after the algorithm. That's it.

The implementation is easy. In my particular situation, the
speed is more than doubled. Not bad for such a simple thing.

I don't pretend that this will work in every situation, but it
worked for me. It's _a_ solution for _my_ problem, not _the_
solution for _any_ problem.

I just wanted to share... Ruby is fun... ;]

gegroet,
Erik V. - http://www.erikveen.dds.nl/

PS: Proposal for Ruby Quiz: An algorithm for reducing GPS
    tracks or Google Earth routes in order to be able to use
    them as routes.

···

----------------------------------------------------------------

module Cache
   class Cache
     def initialize(real_object)
       @real_object = real_object
       @cache = {}
     end

     def method_missing(method_name, *args, &block)
       # Not thread-safe! Speed is important in caches... ;]
       @cache[[method_name, args, block]] ||=
@real_object.__send__(method_name, *args, &block)
     end

     def self.cache(real_object)
       Cache.new(real_object)
     end

     def self.un_cache(cached_object)
       cached_object.instance_variable_get("@real_object")
     end
   end
end

----------------------------------------------------------------

class Foo
   def bar(n)
     puts "Calculating... (Very expensive!)"

     5*n
   end
end

foo = Foo.new

puts
puts foo.class
puts foo.bar(7)
puts foo.bar(8)
puts foo.bar(7)
puts foo.bar(8)

foo = Cache::Cache.cache(foo)

puts
puts foo.class
puts foo.bar(7)
puts foo.bar(8)
puts foo.bar(7)
puts foo.bar(8)

foo = Cache::Cache.un_cache(foo)

puts
puts foo.class
puts foo.bar(7)
puts foo.bar(8)
puts foo.bar(7)
puts foo.bar(8)

----------------------------------------------------------------

(7rans) #2

Erik Veenstra wrote:

[snip]

The Cache objects are a kind of "delegators-with-cache".

After our algorithm has finished, we want to get rid of these
Cache objects. That's easy:

points = points.collect{|point| Cache::Cache.un_cache(point)}

In short: Don't change the business classes, don't change the
algorithm. Just add one line before the algorithm and one line
after the algorithm. That's it.

The implementation is easy. In my particular situation, the
speed is more than doubled. Not bad for such a simple thing.

I don't pretend that this will work in every situation, but it
worked for me. It's _a_ solution for _my_ problem, not _the_
solution for _any_ problem.

I just wanted to share... Ruby is fun... ;]

That's a nice solution. Thanks! If you don't mind I'd like to add it to
Facets.

T.

(7rans) #3

Funny! I didn't even realize that this is essentially the same as the
Once class I wrote just a week ago!

  class Once

    # Privatize a few Kernel methods that are most likely to clash,
    # but arn't essential here.

    private :class, :clone, :display, :type, :method, :to_a, :to_s

    def initialize( target )
      @self = target
      @cache = {}
    end

    def self ; @self ; end

    def method_missing( meth, *args, &blk )
      return @cache[ [meth,*args] ] if @cache.key?( [meth, *args] )
      @cache[ [meth,*args] ] = @self.send( meth, *args, &blk )
    end

  end

So I was looking at the differences with your implementation. What does
the un_cahce do for us?

Thanks,
T.

PS. I like the name Cache better and will use that, and will merge your
implmentation and mine together for Facets.

(Erik Veenstra) #4

> The Cache objects are a kind of "delegators-with-cache".

That's a nice solution. Thanks! If you don't mind I'd like to
add it to Facets.

That's OK.

gegroet,
Erik V. - http://www.erikveen.dds.nl/

(Benjohn Barnes) #5

So I was looking at the differences with your implementation. What does
the un_cahce do for us?

It seemed to be an easy way to recover the original objects - ie, pull
them out of the caching decorator.

(7rans) #6

benjohn@fysh.org wrote:

> So I was looking at the differences with your implementation. What does
> the un_cahce do for us?

It seemed to be an easy way to recover the original objects - ie, pull
them out of the caching decorator.

Oh right, so you can use the same var name, eg. points. I see. the use
of #uncache threw me beacuase I was thinking it did something more than
just return the original object. With my implementation its just:

  points = points.collect{|point| point.self}

I've found #self to be a pretty good public method to use for this in
delegators, although prehaps it's a touch misleading, but it's short,
to the point, and very unlikely to get in the way of method_missing.
Hmmm... But I can also see how that might not always be the case, say
if you are delgating another delegator, so myabe your class method
approach is best? It's subtle. I'll just offer both options for unless
someone can show how #self is really a bad idea.

Thanks,
T.