[SOLUTION] SerializableProc (#38)

Hi,

This is the second solution that I could finish in time. Well, it was pretty easy.

I imagine my solution is not very fast, as each time a method on the SerializableProc is called, a new Proc object is created.
The object could be cached in an instance variable @proc so that only the first call pays for the eval. But that would require defining custom dump methods for each serializer so that none of them attempts to dump @proc.

Here's my solution...
(Question: Is it better if I attach it or just paste it like this?)

class SerializableProc

   def initialize( block )
     @block = block
     # Test if block is valid.
     to_proc
   end

   def to_proc
     # Raises exception if block isn't valid, e.g. SyntaxError.
     eval "Proc.new{ #{@block} }"
   end

   def method_missing( *args )
     to_proc.send( *args )
   end

end

if $0 == __FILE__

   require 'yaml'
   require 'pstore'

   code = SerializableProc.new %q{ |a,b| [b,a] }

   # Marshal
   # Marshal data is binary, so open the file in binary mode.
   File.open('proc.marshalled', 'wb') { |file| Marshal.dump(code, file) }
   code = File.open('proc.marshalled', 'rb') { |file| Marshal.load(file) }

   p code.call( 1, 2 )

   # PStore
   store = PStore.new('proc.pstore')
   store.transaction do
     store['proc'] = code
   end
   store.transaction do
     code = store['proc']
   end

   p code.call( 1, 2 )

   # YAML
   File.open('proc.yaml', 'w') { |file| YAML.dump(code, file) }
   code = File.open('proc.yaml') { |file| YAML.load(file) }

   p code.call( 1, 2 )

   p code.arity

end
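
The caching idea mentioned before the code can be sketched as follows. This is only an illustration, not part of the original solution: the class name CachedSerializableProc is made up, and only the Marshal side is covered (PStore uses Marshal internally, YAML would need its own hook).

```ruby
# Hypothetical variant: cache the compiled Proc, but dump only the source.
class CachedSerializableProc
  def initialize(source)
    @source = source
    to_proc # fail early if the source does not compile
  end

  # Compile once, then reuse the cached Proc on later calls.
  def to_proc
    @proc ||= eval("Proc.new { #{@source} }")
  end

  def call(*args)
    to_proc.call(*args)
  end

  # Marshal hooks: only the source string is written, never @proc.
  def marshal_dump
    @source
  end

  def marshal_load(source)
    @source = source
  end
end

copy = Marshal.load(Marshal.dump(CachedSerializableProc.new("|a, b| [b, a]")))
p copy.call(1, 2) # => [2, 1]
```

After loading, @proc is unset, so the first call on the copy compiles the source again and later calls reuse it.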

Robin Stocker wrote:

Here's my solution...

And mine's attached to this mail.

I wrote this a while ago. It works by extracting a proc's origin file name and line number from its .inspect string and then using the source code (which usually does not have to be read from disc). It works with procs generated in IRB, eval() calls and regular files. It does not work from ruby -e, and stuff like "foo".instance_eval "lambda {}".source probably doesn't work either.

Usage:

   code = lambda { puts "Hello World" }
   puts code.source
   Marshal.load(Marshal.dump(code)).call
   YAML.load(code.to_yaml).call

proc_source.rb (6.86 KB)
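
The attachment itself is not reproduced here. As a rough sketch of the file-and-line idea only, assuming a Ruby recent enough to have Proc#source_location (so no parsing of .inspect is needed), the first line of a proc's source can be looked up like this. It does not find where the block ends, which is the hard part the attachment deals with.

```ruby
code = lambda { |name| "Hello #{name}" }

# Core Ruby reports where the block was defined as [file, line].
file, line = code.source_location
puts "defined at #{file}:#{line}"

# Only a real file on disk can be read back; "-e" or "(eval)" cannot.
if File.file?(file)
  puts File.readlines(file)[line - 1]
end
```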

Nice idea: by not storing the Proc object in an instance variable, you can just use the default serialization. But I guess this is quite slow ;)

So, here is my solution. It should be almost as fast as normal procs, but I had to implement custom serializing methods. I also implemented a custom ==, because that doesn't really work with method_missing/delegate.

require "delegate"
require "yaml"

class SProc < DelegateClass(Proc)

     attr_reader :proc_src

     def initialize(proc_src)
         super(eval("Proc.new { #{proc_src} }"))
         @proc_src = proc_src
     end

     def ==(other)
         @proc_src == other.proc_src rescue false
     end

     def inspect
         "#<SProc: #{@proc_src.inspect}>"
     end
     alias :to_s :inspect

     def marshal_dump
         @proc_src
     end

     def marshal_load(proc_src)
         initialize(proc_src)
     end

     def to_yaml(opts = {})
         YAML::quick_emit(self.object_id, opts) { |out|
             out.map("!rubyquiz.com,2005/SProc" ) { |map|
                 map.add("proc_src", @proc_src)
             }
         }
     end

end

YAML.add_domain_type("rubyquiz.com,2005", "SProc") { |type, val|
     SProc.new(val["proc_src"])
}

if $0 == __FILE__
     require "pstore"

     code = SProc.new %q{ |*args|
         puts "Hello world"
         print "Args: "
         p args
     }

     orig = code

     code.call(1)

     # Marshal data is binary, so open the file in binary mode.
     File.open("proc.marshalled", "wb") { |file| Marshal.dump(code, file) }
     code = File.open("proc.marshalled", "rb") { |file| Marshal.load(file) }

     code.call(2)

     store = PStore.new("proc.pstore")
     store.transaction do
         store["proc"] = code
     end
     store.transaction do
         code = store["proc"]
     end

     code.call(3)

     File.open("proc.yaml", "w") { |file| YAML.dump(code, file) }
     code = File.open("proc.yaml") { |file| YAML.load(file) }

     code.call(4)

     p orig == code
end
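
A small illustration of why a custom == is needed in the solution above: two Procs compiled from the same source string are still distinct objects, so comparing them directly does not report equality, while comparing the stored source does. SourceBacked is a hypothetical stand-in for SProc, used only to keep the sketch short.

```ruby
source = "|x| x + 1"

a = eval("Proc.new { #{source} }")
b = eval("Proc.new { #{source} }")

puts a == b                 # two separately compiled blocks are not equal
puts a.call(1) == b.call(1) # but they behave the same

# Hypothetical wrapper that compares by source text instead.
class SourceBacked
  attr_reader :proc_src

  def initialize(proc_src)
    @proc_src = proc_src
  end

  def ==(other)
    other.respond_to?(:proc_src) && @proc_src == other.proc_src
  end
end

puts SourceBacked.new(source) == SourceBacked.new(source)
```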

On Sun, 10 Jul 2005 21:25:36 +0200, Robin Stocker <robin-lists-ruby-talk@nibor.org> wrote:

   def to_proc
     # Raises exception if block isn't valid, e.g. SyntaxError.
     eval "Proc.new{ #{@block} }"
   end

   def method_missing( *args )
     to_proc.send( *args )
   end

Robin Stocker <robin-lists-ruby-talk@nibor.org> writes:

Here's my solution...

My code is very similar, but only eval()s once:

require 'delegate'

class SerializableProc < DelegateClass(Proc)
  attr_reader :__code

  def initialize(code)
    @__code = code.lstrip
    super eval("lambda { #@__code }")
  end

  def marshal_dump; @__code; end
  def marshal_load(code); initialize code; end

  def to_yaml
    Object.instance_method(:to_yaml).bind(self).call
  end

  def to_yaml_properties; ["@__code"]; end
  def to_yaml_type; "!ruby/serializableproc"; end
end

# .oO(Is there no easier way to do this?)
YAML.add_ruby_type( /^serializableproc/ ) { |type, val|
  type, obj_class = YAML.read_type_class( type, SerializableProc )
  o = YAML.object_maker( obj_class, val )
  o.marshal_load o.__code
}

Usage:

code = SerializableProc.new %{
  puts "this is serialized!"
  p binding; p caller
}
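
For readers on a current Ruby, here is a hedged sketch of the same idea using the encode_with / init_with hooks, assuming the YAML library is Psych. The hooks used in the code above belong to the YAML library of the time and may not be available there. The class name YamlProc is invented for this example.

```ruby
require "yaml"

# Hypothetical class: only the source string goes into the YAML document.
class YamlProc
  def initialize(source)
    @source = source
    @proc = eval("lambda { #{@source} }")
  end

  def call(*args)
    @proc.call(*args)
  end

  # Called by Psych when dumping: emit just the source.
  def encode_with(coder)
    coder["source"] = @source
  end

  # Called by Psych on an allocated object when loading: recompile.
  def init_with(coder)
    initialize(coder["source"])
  end
end

text = YAML.dump(YamlProc.new("|a, b| [b, a]"))

# Newer Psych versions restrict YAML.load; fall back for older ones.
loader = YAML.respond_to?(:unsafe_load) ? :unsafe_load : :load
copy = YAML.public_send(loader, text)
p copy.call(1, 2) # => [2, 1]
```

Loading evaluates the stored string, so this should only be used on YAML from a trusted source.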

Obvious problems of this approach are the lack of closures and editor
support (depending on the inverse quality of your editor :P). Better
results can be reached with flgr's hack to look for the source on disk
or by using nodewrap to serialize the AST. See
http://rubystuff.org/nodewrap/ for details.
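
To make the "lack of closures" point concrete, here is a small sketch. The helper build is hypothetical and stands in for any of the eval-based constructors in this thread: a proc compiled from a string inside a method cannot see the caller's local variables.

```ruby
# Hypothetical stand-in for an eval-based SerializableProc constructor.
def build(source)
  eval("lambda { #{source} }")
end

factor = 10

closure = lambda { |x| x * factor } # a real closure sees factor
rebuilt = build("|x| x * factor")   # compiled from text, it does not

p closure.call(2) # => 20

error = begin
  rebuilt.call(2)
  nil
rescue NameError => e
  e
end
puts "#{error.class}: factor is not visible inside the rebuilt proc"
```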

That was a nice quiz.

--
Christian Neukirchen <chneukirchen@gmail.com> http://chneukirchen.org

Here's my solution...

Here's what I came up with while building the quiz:

class SerializableProc
     def self._load( proc_string )
         new(proc_string)
     end

     def initialize( proc_string )
         @code = proc_string
         @proc = nil
     end

     def _dump( depth )
         @code
     end

     def method_missing( method, *args )
         if to_proc.respond_to? method
             @proc.send(method, *args)
         else
             super
         end
     end

     def to_proc( )
         return @proc unless @proc.nil?

         # The /m flag lets .* span newlines, so multi-line sources match too.
         if @code =~ /\A\s*(?:lambda|proc)(?:\s*\{|\s+do).*(?:\}|end)\s*\Z/m
             @proc = eval @code
         elsif @code =~ /\A\s*(?:\{|do).*(?:\}|end)\s*\Z/m
             @proc = eval "lambda #{@code}"
         else
             @proc = eval "lambda { #{@code} }"
         end
     end

     def to_yaml( )
         @proc = nil
         super
     end
end
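
The three branches of to_proc above accept three spellings of the source. A standalone sketch of just that dispatch follows. The helper name compile is made up, and the patterns carry the /m flag so that multi-line sources are recognized as well.

```ruby
# Hypothetical helper mirroring the three cases handled by to_proc.
def compile(code)
  case code
  when /\A\s*(?:lambda|proc)(?:\s*\{|\s+do).*(?:\}|end)\s*\z/m
    eval(code)                 # already a complete lambda/proc expression
  when /\A\s*(?:\{|do).*(?:\}|end)\s*\z/m
    eval("lambda #{code}")     # a bare block: prepend lambda
  else
    eval("lambda { #{code} }") # plain statements: wrap in a block
  end
end

p compile("lambda { |x| x * 2 }").call(3) # => 6
p compile("{ |x| x * 2 }").call(3)        # => 6
p compile("x = 3; x * 2").call            # => 6
p compile("lambda do |x|\n  x * 2\nend").call(3) # => 6
```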

(Question: Is it better if I attach it or just paste it like this?)

It doesn't much matter, but I favor inlining it when it's a single file.

James Edward Gray II

On Jul 10, 2005, at 2:25 PM, Robin Stocker wrote:

Granted, we cheated, quite a bit at that, but I think the solution we came up with is pretty:

require 'r2c_hacks'

class ProcStore # We have to have this because yaml calls allocate on Proc
   def initialize(&proc)
     @p = proc.to_ruby
   end

   def call(*args)
     eval(@p).call(*args)
   end
end

code = ProcStore.new { |x| return x+1 }
=> #<ProcStore:0x3db25c @p="proc do |x|\n return (x + 1)\nend">

The latest release of ZenHacks added Proc#to_ruby among other things. Granted, it doesn't preserve the actual closure, just the code, but it looks like that is a limitation of the other solutions as well, so we aren't crying too much.

Our original solution just patched Proc and added _load/_dump on it, but it choked on the YAML serialization side of things. Not entirely sure why, and we were too tired to care at the time.

To see what we do to implement Proc#to_ruby:

class Proc
   ProcStoreTmp = Class.new unless defined? ProcStoreTmp
   def to_ruby
     ProcStoreTmp.send(:define_method, :myproc, self)
     m = ProcStoreTmp.new.method(:myproc)
     result = m.to_ruby.sub!(/def myproc\(([^\)]+)\)/, 'proc do |\1|')
     return result
   end
end
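
The first two lines of to_ruby use only core Ruby: a proc is turned into a real method via define_method, and that method is then fetched as a Method object. A small sketch of just that step follows. The names holder and body are illustrative, and the Method#to_ruby call from ZenHacks is left out because it needs that library.

```ruby
holder = Class.new
body = proc { |x| x + 1 }

# Turn the proc into an instance method, then fetch it as a Method object.
holder.send(:define_method, :myproc, body)
m = holder.new.method(:myproc)

p m.arity    # => 1
p m.call(41) # => 42
```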

--
ryand-ruby@zenspider.com - Seattle.rb - http://www.zenspider.com/seattle.rb
http://blog.zenspider.com/ - http://rubyforge.org/projects/ruby2c

Florian Groß wrote:

And mine's attached to this mail.

Interesting. I was considering taking this approach until I realized I'd have to implement a partial Ruby parser, which is what I see you did. Still, it is pretty cool, though obviously a bit hackish.

I wonder if YARV and Ruby byte-code will make it easier for procs to be serialized? I'm not sure how the binding would work (hmmm, if it is just objects maybe they could be serialized as normal), but the proc itself could just be serialized as is if it is self-contained Ruby byte-code.

Does anyone know if this is how YARV will be? Because I'm just guessing here.

Ryan

Proc's documentation tells us that "Proc objects are blocks of code that
have been bound to a set of local variables." (That is, they are "closures"
with "bindings".) Do any of the proposed solutions so far store local
variables?

# That is, can the following Proc be serialized?
  local_var = 42
  code = proc { local_var += 1 } # <= what should that look like in YAML?
  code.call #=> 43
  File.open("proc.marshalled", "w") { |file| Marshal.dump(code, file) }

# New context, e.g. new file:
  code = File.open("proc.marshalled") { |file| Marshal.load(file) }
  code.call #=> 44
  local_var #=> NameError - undefined here
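
For reference, a short sketch of what plain Ruby does with the example above. Marshal refuses to dump a Proc at all, which is why every solution in this thread wraps the source text instead. The captured variable can be read through the proc's binding, but carrying it across serialization is left to the caller.

```ruby
local_var = 42
code = proc { local_var += 1 }
code.call # local_var is now 43

# A plain Proc cannot be marshalled.
error = begin
  Marshal.dump(code)
  nil
rescue TypeError => e
  e
end
puts "Marshal.dump raised #{error.class}"

# The closure's variable is reachable through the binding.
p code.binding.local_variable_get(:local_var) # => 43
```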

AFAICT, the only one is Christian Neukirchen's Nodewrap suggestion, which
looks very cool. From <http://rubystuff.org/nodewrap/>:

Sample code
This will dump the class Foo (including its instance methods, class
variables, etc.) and re-load it as an anonymous class:
  class Foo
    def foo; puts "this is a test..."; end
  end

  s = Marshal.dump(Foo)
  p Marshal.load(s) #=> #<Class 0lx4027be20>

Here's another, trickier test for SerializableProcs. Can multiple Procs
sharing context, as returned by the following method, be made to behave
consistently across serialization? If the Procs are serialized
independently, I believe this is impossible - an inherent problem with the
idea of serializing Procs (or anything with shared context).
  def two_procs
    x = 1
    [proc { x }, proc { x += 1 }]
  end

  p1, p2 = two_procs
  [p1.call, p2.call, p1.call, p2.call] #=> [1, 2, 2, 3]
  q1, q2 = Marshal.load(Marshal.dump(p1)), Marshal.load(Marshal.dump(p2))
  [q1.call, q2.call, q1.call, q2.call] #=> [3, 4, 4, 5]
  # I expect Nodewrap can get [3, 4, 3, 5] for this last result.
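
The difference between shared and independently restored context can be simulated without any serialization. In this sketch two_procs takes a starting value (a change from the version above, made only for the simulation), and calling it twice stands in for restoring each proc with its own copy of x.

```ruby
def two_procs(x = 1)
  [proc { x }, proc { x += 1 }]
end

# Shared context: both procs see the same x.
p1, p2 = two_procs
shared = [p1.call, p2.call, p1.call, p2.call]
p shared # => [1, 2, 2, 3]

# Simulated independent restore: each proc gets its own x, starting at 3.
q1, _unused = two_procs(3)
_unused, q2 = two_procs(3)
independent = [q1.call, q2.call, q1.call, q2.call]
p independent # => [3, 4, 3, 5]
```

The reader proc never sees the increments made by the writer proc once their contexts are separate.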

Dave