Marshaling objects partially

Hi!

I’ve got a question about streaming objects with Marshal.

The structure I’d like to stream contains cyclic references, and the
objects that I want to dump have some data members that I would not
like to stream.

I had a look at _dump and _load, but it seems to me that I’d have to
solve the cyclic references myself again. Is this true?

What is the recommended way of streaming such a structure?
Can I simply tag some members as being ‘volatile’ so they are no longer
streamed?

Regards,

Ronald

Hi,

I had a look at _dump and _load, but it seems to me that I’d have to
solve the cyclic references myself again. Is this true?

Yes.

But if you use Marshal.dump for _dump, Marshal resolves cycles
(potential duplication might happen).

class Foo
  def _dump(limit)
    Marshal.dump([…data in array…], limit-1)
  end
  def Foo._load(str)
    data = Marshal.load(str)
    …build instance using data…
  end
end
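
A concrete version of this outline might look as follows. It is only a sketch: the Account class and its @cache member are hypothetical stand-ins for a class with one member that should not be streamed, and no cycles are involved here.

```ruby
# Hypothetical class: @cache is transient and is left out of the dump.
class Account
  attr_reader :name, :balance, :cache

  def initialize(name, balance)
    @name = name
    @balance = balance
    @cache = {}          # rebuilt on load, never streamed
  end

  # Called by Marshal.dump; must return a String.
  def _dump(limit)
    Marshal.dump([@name, @balance], limit - 1)
  end

  # Called by Marshal.load with the String produced by _dump.
  def self._load(str)
    name, balance = Marshal.load(str)
    new(name, balance)
  end
end

original = Account.new("ann", 10)
original.cache[:tmp] = 1
copy = Marshal.load(Marshal.dump(original))
p copy.name, copy.balance, copy.cache
```

Only the members listed in the array travel through the stream; everything else is whatever the constructor sets up on load.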

						matz.
···

In message “Marshaling objects partially” on 03/07/07, Ronald Pijnacker rhp@dse.nl writes:

Hi,

I had a look at _dump and _load, but it seems to me that I’d have to
solve the cyclic references myself again. Is this true?

Yes.

But if you use Marshal.dump for _dump, Marshal resolves cycles
(potential duplication might happen).

Thanks, this works fine.

Do you have any suggestions on how to organize _load and _dump such
that in a class hierarchy every class level is responsible for streaming
its own members?

Ronald.
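
One possible arrangement, sketched here on top of the nested Marshal.dump idea above, is to keep _dump and _load in the base class and have them delegate to a pair of overridable hooks. The hook names dump_fields and load_fields are made up for this illustration, as are the Base and Derived classes. Each level calls super and then adds or reads only its own members. Like the plain nested dump, this sketch does not deal with cycles between such objects.

```ruby
class Base
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def _dump(limit)
    Marshal.dump(dump_fields, limit - 1)
  end

  def self._load(str)
    obj = allocate                     # self is the class named in the stream
    obj.load_fields(Marshal.load(str))
    obj
  end

  # Hypothetical hooks: each class level handles its own members.
  def dump_fields
    { :name => @name }
  end

  def load_fields(fields)
    @name = fields[:name]
  end
end

class Derived < Base
  attr_reader :size

  def initialize(name, size)
    super(name)
    @size = size
  end

  def dump_fields
    super.merge(:size => @size)
  end

  def load_fields(fields)
    super
    @size = fields[:size]
  end
end

copy = Marshal.load(Marshal.dump(Derived.new("box", 3)))
p copy.class, copy.name, copy.size
```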

···

In message “Marshaling objects partially” on 03/07/07, Ronald Pijnacker rhp@dse.nl writes:

Hi,

I had a look at _dump and _load, but it seems to me that I’d have to
solve the cyclic references myself again. Is this true?

Yes.

But if you use Marshal.dump for _dump, Marshal resolves cycles
(potential duplication might happen).

Thanks, this works fine.

Do you have any suggestions on how to organize _load and _dump such
that in a class hierarchy every class level is responsible for streaming
its own members?

On second thought… it does not seem to work fine for cycles. The
following example breaks down with a SystemStackError:

class Foo
  attr_accessor :bar
  def _dump(limit)
    Marshal.dump([@bar], limit-1)
  end
  def Foo._load(str)
    data = Marshal.load(str)
    obj = self.new
    obj.bar = data.shift
    obj
  end
end

foo1 = Foo.new
foo2 = Foo.new
foo1.bar = foo2
foo2.bar = foo1
str = Marshal.dump(foo1)

It works ok without the _dump/_load part.

Ronald.

···

In message “Marshaling objects partially” on 03/07/07, Ronald Pijnacker rhp@dse.nl writes:

On second thought... it does not seem to work fine for cycles. The
following example breaks down with a SystemStackError:

Well, you must do like marshal. Try something like this (*not tested*)

svg% cat b.rb
#!/usr/bin/ruby
class Foo
   attr_accessor :bar
   @@dumped = {}
   @@loaded = []
   @@num = 0

   def _dump(limit)
      if @@dumped[self]
         "\000#{@@dumped[self]}\000"
      else
         @@dumped[self] = @@num
         @@num += 1
         Marshal.dump([@bar], limit-1)
      end
   end

   def self._load(str)
      if /\A\000(\d+)\000\z/ =~ str
         obj = @@loaded[$1.to_i]
      else
         obj = self.new
         @@loaded.push(obj)
         obj.bar = Marshal.load(str).shift
      end
      obj
   end
end

foo1 = Foo.new
foo2 = Foo.new
foo1.bar = foo2
foo2.bar = foo1
p foo1
p Marshal.load(Marshal.dump(foo1))
svg%

svg% b.rb
#<Foo:0x40099100 @bar=#<Foo:0x40099010 @bar=#<Foo:0x40099100 ...>>>
#<Foo:0x40098ebc @bar=#<Foo:0x40098e30 @bar=#<Foo:0x40098ebc ...>>>
svg%

The marker string "\000#{@@dumped[self]}\000" must be something that
Marshal itself never produces.
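
To see why a marker of this shape is safe, note that every string produced by Marshal.dump starts with the two format version bytes (currently 4 and 8), so it cannot begin with a NUL byte followed by digits. A small check, using only the Marshal::MAJOR_VERSION and Marshal::MINOR_VERSION constants; the variable names are just for illustration:

```ruby
marker_pattern = /\A\000(\d+)\000\z/

payload = Marshal.dump([nil])
version = payload.bytes.first(2)

# The stream header is the format version, never a NUL byte.
p version == [Marshal::MAJOR_VERSION, Marshal::MINOR_VERSION]

# A marker as built in _dump matches the pattern...
p marker_pattern.match?("\0007\000")

# ...while a real Marshal payload does not.
p marker_pattern.match?(payload)
```

Note also that the class-level @@dumped and @@loaded tables in the example are never reset, so as written it is a sketch for a single dump and a single load per process.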

Guy Decoux

Hi,

On second thought… it does not seem to work fine for cycles. The
following example breaks down with a SystemStackError:

Well, you must do like marshal. Try something like this (not tested)

Ah, let me think…

To whom it may concern: can I increase the marshal format minor version
in 1.8.0 to solve this issue? This change will cause a compatibility
problem, but the minor version number will change in 1.8.0 anyway, so
the damage would be minimal.

With this change, _dump() and _load() could return/receive arbitrary
objects, not only strings.

						matz.
···

In message “Re: Marshaling objects partially” on 03/07/08, ts decoux@moulon.inra.fr writes:

In article 1057720032.125208.13426.nullmailer@picachu.netlab.jp,
matz@ruby-lang.org (Yukihiro Matsumoto) writes:

With this change, _dump() and _load() could return/receive arbitrary
objects, not only strings.

When objects A and B refer to each other, which _load is called first?

class C
  def _dump … end
  def C._load(obj) … end
end

A = C.new
B = C.new
A.instance_eval { @ref = B; @xxx = STDIN }
B.instance_eval { @ref = A; @xxx = STDIN }

m = Marshal.dump(A)

Marshal.load(m)

If _load for A is called first, it cannot initialize @ref to the copy of B
because that copy does not exist yet.

If _load for B is called first, it cannot initialize @ref to the copy of A
because that copy does not exist yet.

So, I think _load should be an instance method.
(See [ruby-dev:18312] for details.)
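
The ordering problem described here goes away when the object is allocated first and filled in afterwards. Ruby's Marshal also recognises an instance-level pair of hooks, marshal_dump and marshal_load, that work this way; the sketch below assumes a Ruby version that provides them (they are not part of the proposal quoted above). The Node class and its peer and handle members are hypothetical.

```ruby
class Node
  attr_accessor :peer, :handle

  # Return any object describing the state to stream; @handle is left out.
  def marshal_dump
    [@peer]
  end

  # Called on an already allocated instance, so a cyclic reference
  # back to this object can be resolved while its data is being read.
  def marshal_load(data)
    @peer = data[0]
  end
end

a = Node.new
b = Node.new
a.peer = b
b.peer = a
a.handle = $stderr    # an IO cannot be marshaled, but it is never dumped

copy = Marshal.load(Marshal.dump(a))
p copy.peer.peer.equal?(copy)
p copy.handle
```

Because the hooks exchange an ordinary object rather than a string, Marshal tracks the cycle itself and no private numbering scheme is needed.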

···


Tanaka Akira

When objects A and B refer to each other, which _load is called first?

I must say that I've not understood

svg% cat b.rb
#!/usr/bin/ruby
class C
   attr_accessor :ref
   @@dumped = {}
   @@loaded = []
   @@num = 0

   def _dump(limit)
      if @@dumped[self]
         "\000#{@@dumped[self]}\000"
      else
         @@dumped[self] = @@num
         @@num += 1
         Marshal.dump([@ref], limit-1)
      end
   end

   def self._load(str)
      if /\A\000(\d+)\000\z/ =~ str
         obj = @@loaded[$1.to_i]
      else
         obj = self.new
         @@loaded.push(obj)
         obj.ref = Marshal.load(str).shift
      end
      obj
   end
end

a = C.new
b = C.new

a.instance_eval { @ref = b; @xxx = STDIN }
b.instance_eval { @ref = a; @xxx = STDIN }

p a
m = Marshal.dump(a)
c = Marshal.load(m)
p c
p c.ref

svg%

svg% b.rb
#<C:0x40098e30 @ref=#<C:0x40098d40 @ref=#<C:0x40098e30 ...>, @xxx=#<IO:0x4009f348>>, @xxx=#<IO:0x4009f348>>
#<C:0x40098b60 @ref=#<C:0x40098ad4 @ref=#<C:0x40098b60 ...>>>
#<C:0x40098ad4 @ref=#<C:0x40098b60 @ref=#<C:0x40098ad4 ...>>>
svg%

Guy Decoux

In article 200307091409.h69E9Qn21340@moulon.inra.fr,
ts decoux@moulon.inra.fr writes:

I must say that I’ve not understood

svg% cat b.rb

This works well because the object is allocated before its child object
is loaded, as in:

     obj = self.new
     @@loaded.push(obj)
     obj.ref = Marshal.load(str).shift

But marshal.c cannot know about an object allocated by C._load before
it calls C._load.

···


Tanaka Akira

“Tanaka Akira” akr@m17n.org wrote in message
news:87znjnvl2e.fsf@serein.a02.aist.go.jp…

In article 200307091409.h69E9Qn21340@moulon.inra.fr,
ts decoux@moulon.inra.fr writes:

I must say that I’ve not understood

svg% cat b.rb

This works well because the object is allocated before its child object
is loaded, as in:

     obj = self.new
     @@loaded.push(obj)
     obj.ref = Marshal.load(str).shift

But marshal.c cannot know about an object allocated by C._load before
it calls C._load.

If I run Guy’s example through a fairly well tested (minus the thread part ;-)
“Compare by Value” module


module CompareByValue

  def ==(other)
    if not instance_of?(other.class)
      false
    elsif equal?(other)
      true
    elsif id < other.id
      CompareByValue.comp?(self,other)
    else
      CompareByValue.comp?(other,self)
    end
  end

end

class << CompareByValue

  # Seen is a book keeping class for currently
  # active thread local CompareByValue comparisons.
  class Seen < Hash

    # Helper class to store Id Pairs.
    const_set :Pair, Struct.new(:l,:r)

    def initialize(l,r)
      store(Pair.new(l,r),true)
    end

    def default(key)
      store(key,true)
      nil
    end

    def seen?(l,r)
      self[Pair.new(l,r)]
    end

    def unseen(l,r)
      delete(Pair.new(l,r))
    end

  end

  # symbol mangling
  const_set :SEEN_COMPS, "CBV#{id}_comps".intern
  const_set :NUM_CALLS, "CBV#{id}_calls".intern

  def comp?(l,r)
    return true if thread_local_seen?(l,r)
    begin
      Thread.current[NUM_CALLS] +=1
      return Comp.new.comp?(l,r)
    rescue
      Thread.current[SEEN_COMPS].unseen(l,r)
      raise
    ensure
      if (Thread.current[NUM_CALLS] -= 1).zero?
        Thread.current[SEEN_COMPS].clear
      end
    end
  end

  def thread_local_seen?(l,r)
    if comps = Thread.current[SEEN_COMPS]
      comps.seen?(l,r)
    else
      Thread.current[NUM_CALLS] = 0
      Thread.current[SEEN_COMPS] = Seen.new(l,r)
      false
    end
  end

  class Comp < Hash

    def comp?(l,r)
      return true if l.equal?(r)
      vars = l.instance_variables.sort!
      return false  unless vars == r.instance_variables.sort!
      #
      store(l.id,r.id)
      vars.each { |name|
        ll = l.instance_eval(name)
        rr = r.instance_eval(name)
        if CompareByValue === ll
          return false unless ll.instance_of?(rr.class)
          if rr_seen_id = self[ll.id]
            return false unless rr_seen_id == rr.id
          else
            return false unless comp?(ll,rr)
          end
        else
          return false   unless ll == rr
        end
      }
      return true
    end

  end

end

class C
  include CompareByValue
  attr_accessor :ref
  @@dumped = {}
  @@loaded = []
  @@num = 0

  def _dump(limit)
    if @@dumped[self]
      "\000#{@@dumped[self]}\000"
    else
      @@dumped[self] = @@num
      @@num += 1
      Marshal.dump([@ref], limit-1)
    end
  end

  def self._load(str)
    if /\A\000(\d+)\000\z/ =~ str
      obj = @@loaded[$1.to_i]
    else
      obj = self.new
      @@loaded.push(obj)
      obj.ref = Marshal.load(str).shift
    end
    obj
  end
end

Kernel.const_set :A, C.new
Kernel.const_set :B, C.new

A.instance_eval { @ref = B; @xxx = 1 }
B.instance_eval { @ref = A; @xxx = 2 }

p A
p B
p a = Marshal.load(Marshal.dump(A))
p b = Marshal.load(Marshal.dump(B))
p B == A # false
p b == a # true
p B == a # false
p b == a.ref # true
p B == A.ref # true
p B == a.ref # false
p b == A.ref # false

I am getting an output that does not look quite right to me


#<C:0x2788020 @ref=#<C:0x2787918 @ref=#<C:0x2788020 ...>, @xxx=2>, @xxx=1>
#<C:0x2787918 @ref=#<C:0x2788020 @ref=#<C:0x2787918 ...>, @xxx=1>, @xxx=2>
#<C:0x2786d30 @ref=#<C:0x2786928 @ref=#<C:0x2786d30 ...>>>
#<C:0x2786928 @ref=#<C:0x2786d30 @ref=#<C:0x2786928 ...>>>
false
true
false
true
true
false
false

but I am not sure if I would blame this on the current “_dump, _load”
scheme.

/Christoph

“Christoph” <swap(news_chr)@gmx.net> wrote in message
news:bei96e$2r6q$1@ulysses.news.tiscali.de

I am getting an output that does not look quite right to me

Oh well never mind, I just read the beginning of the thread …
i.e. the output looks right to me now;-)

/Christoph