Stability of Marshaling format

Hi,

I’ve been working on a new networked application in Ruby that heavily uses
Ruby’s internal Marshaling for inter-node communication. I’d like to
better understand the issues involved in that design decision and would
like your input.

Basically my question is:
How stable can one expect the marshaling format itself to be?

The risk with my application is that a future version of the system might
run on a newer Ruby version. If that version uses a different Marshal
format, old installations of the app might not be able to communicate
with newer ones.

Ok, I can hear you scream YAML/Syck and/or XML already, but I don’t feel
that should be needed here: readability is not a requirement, marshaling
is very fast and compact (ok, ok, “premature optimization”…), and it’s
easier since Marshal is in core Ruby (ok, ok, both yaml and rexml are now
in the core install).

One interesting “data point” would be to know if/when/how the marshal
format has changed in the past. Has it ever been changed so that older
marshaled objects couldn’t be read back? What is the likelihood of that
happening again? Matz?

Regards,

Robert Feldt

Basically my question is:
How stable can one expect the marshaling format itself to be?

[snip]

One interesting “data point” would be to know if/when/how the marshal
format has changed in the past. Has it ever been changed so that older
marshaled objects couldn’t be read back? What is the likelihood of that
happening again? Matz?

My impression is that it’s not stable across
versions. There have been times I had to re-run
a generator to create a new marshal file.

But I don’t have any specific data to back that
up, sorry.

Hal

···

----- Original Message -----
From: “Robert Feldt” feldt@ce.chalmers.se
To: “ruby-talk ML” ruby-talk@ruby-lang.org
Sent: Wednesday, June 18, 2003 12:30 AM
Subject: Stability of Marshaling format

Hi,

···

In message “Stability of Marshaling format” on 03/06/18, Robert Feldt feldt@ce.chalmers.se writes:

Basically my question is:
How stable can one expect the marshaling format itself to be?

The format last changed on Sep. 2 2002. I have no specific plan to
change the format in the future. On the contrary, I am reluctant to
change it, since any minor upgrade may cause dRuby to fail.

						matz.

Would including readers for older marshal formats somewhere in the
distribution make sense? I can definitely imagine scenarios where my
marshalled data might outlive my ruby version.

martin

···

Hal E. Fulton hal9000@hypermetrics.com wrote:

My impression is that it’s not stable across
versions. There have been times I had to re-run
a generator to create a new marshal file.

Hi,

Hi matz and thanks for answering,

Basically my question is:
How stable can one expect the marshaling format itself to be?

The format last changed on Sep. 2 2002. I have no specific plan to
change the format in the future. On the contrary, I am reluctant to
change it, since any minor upgrade may cause dRuby to fail.

Great, exactly what I wanted to hear.

Thanks,

Robert

···

On Thu, 19 Jun 2003, Yukihiro Matsumoto wrote:

Or better yet (perhaps), an “on-the-fly” converter of older formats to
the newer format.

···

On Wednesday, June 18, 2003, at 01:55 PM, Martin DeMello wrote:

[snip]

Would including readers for older marshal formats somewhere in the
distribution make sense? I can definitely imagine scenarios where my
marshalled data might outlive my ruby version.

martin

I’m unconvinced this is better - we don’t want to encourage data to hang
around in the old format, after all.

martin

···

Mark Wilson mwilson13@cox.net wrote:

On Wednesday, June 18, 2003, at 01:55 PM, Martin DeMello wrote:

Would including readers for older marshal formats somewhere in the
distribution make sense? I can definitely imagine scenarios where my
marshalled data might outlive my ruby version.

Or better yet (perhaps), an “on-the-fly” converter of older formats to
the newer format.

···

Mark Wilson mwilson13@cox.net wrote:

On Wednesday, June 18, 2003, at 01:55 PM, Martin DeMello wrote:

Would including readers for older marshal formats somewhere in the
distribution make sense? I can definitely imagine scenarios where my
marshalled data might outlive my ruby version.

Or better yet (perhaps), an “on-the-fly” converter of older formats to
the newer format.

I’m unconvinced this is better - we don’t want to encourage data to hang
around in the old format, after all.

Is there currently a ‘format version’ field recorded with the
object somewhere? That lets future generations decide for you.

“never put off till tomorrow what you can dump on your grandchildren” :)


What garlic is to salad, insanity is to art.
Rasputin :: Jack of All Trades - Master of Nuns

How about a callable method that reads the old format when it is
encountered and writes the object back in the new format (a permanent
conversion)? Initially there is a conversion load, which decreases as more
marshalled objects are encountered. One could also have an option to
permanently convert all marshalled objects that can be found at a
particular time to the new format. I’m assuming that, because this is
for a networked application, not all marshalled objects may be
simultaneously available, and that there may be so many marshalled
objects that one would want to defer conversion until necessary. I’m
also assuming that the newer format has been adopted because it
provides superior features justifying the resources used to convert
formats. If not, one could just make sure that the older format is
available to the application (although this may not be a trivial thing
to do).
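
A minimal sketch of the "convert on first read" idea, under stated assumptions: `load_and_upgrade` and the `legacy_reader` callable are hypothetical names, and Ruby itself does not ship a reader for older Marshal formats, so the caller would have to supply one. The helper compares the first two bytes of the stored data with the running interpreter's `Marshal::MAJOR_VERSION` and `Marshal::MINOR_VERSION`, and rewrites the file only when they differ.

```ruby
# Sketch only: lazy, permanent conversion of marshalled files.
# `load_and_upgrade` and `legacy_reader` are hypothetical; nothing like
# them exists in the Ruby distribution.

# Two-byte header written by the running interpreter's Marshal.dump.
CURRENT_HEADER = [Marshal::MAJOR_VERSION, Marshal::MINOR_VERSION].pack("CC")

def load_and_upgrade(path, legacy_reader)
  data = File.binread(path)

  # Already in the current format: nothing to convert.
  return Marshal.load(data) if data[0, 2] == CURRENT_HEADER

  # Older format: let the caller-supplied reader decode it, then write
  # the object back in the current format so the cost is paid only once.
  obj = legacy_reader.call(data)
  File.binwrite(path, Marshal.dump(obj))
  obj
end
```

The conversion cost is paid the first time each file is touched, which matches the decreasing load described above. A batch "convert everything now" option could simply call the same helper for every file it can find.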

···

On Wednesday, June 18, 2003, at 03:36 PM, Martin DeMello wrote:

[snip]

Or better yet (perhaps), an “on-the-fly” converter of older formats to
the newer format.

I’m unconvinced this is better - we don’t want to encourage data to hang
around in the old format, after all.

martin

As this indicates

$ ruby -v -e 'p Marshal.dump(nil).unpack("H*")'
ruby 1.8.0 (2003-06-14) [i386-mingw32]
["040830"]

$ ruby -e 'p Marshal.dump(true).unpack("H*")'
["040854"]

$ ruby -e 'p Marshal.dump(1).unpack("H*")'
["04086906"]

the first two bytes are version info of some sort. Taking a look in
marshal.c reveals the relevant lines:

w_byte(MARSHAL_MAJOR, &arg);
w_byte(MARSHAL_MINOR, &arg);

rb_ensure(dump, (VALUE)&c_arg, dump_ensure, (VALUE)&arg);

return port;

}

and

#define MARSHAL_MAJOR 4
#define MARSHAL_MINOR 8

which indicates that the first byte is the major rev number and the 2nd
one is the minor rev number. Looking a bit further indicates that the 3rd
byte seems to be used to indicate the class of the object => nil/true/false
need only three bytes to be uniquely spec’ed. Looking even further (ok,
last time ;)) reveals that Marshal.load bails out if the major rev is not
the same as the expected one or if the minor rev is higher than the
expected one.
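
The same check can be sketched from the Ruby side. This is only an illustration: `marshal_version` and `loadable?` are hypothetical helper names, while `Marshal::MAJOR_VERSION` and `Marshal::MINOR_VERSION` are the constants Ruby exposes for the values defined in marshal.c.

```ruby
# Sketch only: inspect the two-byte version header of a Marshal stream.
# `marshal_version` and `loadable?` are hypothetical helper names.

# Return [major, minor] read from the first two bytes of the stream.
def marshal_version(data)
  data.unpack("CC")
end

# Mirror the rule described above: same major rev, minor rev not newer.
def loadable?(data)
  major, minor = marshal_version(data)
  major == Marshal::MAJOR_VERSION && minor <= Marshal::MINOR_VERSION
end

blob = Marshal.dump(nil)
p marshal_version(blob)   # [4, 8]
p loadable?(blob)         # true

# Fake a stream that claims a newer minor rev than this interpreter knows.
newer = [Marshal::MAJOR_VERSION, Marshal::MINOR_VERSION + 1].pack("CC") + blob[2..-1]
p loadable?(newer)        # false

begin
  Marshal.load(newer)
rescue TypeError => e
  puts "Marshal.load refused the stream: #{e.class}"
end
```

A node could run such a check on incoming data before calling Marshal.load, so that a version mismatch is reported explicitly rather than surfacing as a TypeError deep inside the application.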

It would be interesting to know how many minor revs there have been for
the major rev numbers before 04.

Regards,

Robert

···

On Thu, 19 Jun 2003, Rasputin wrote:

Mark Wilson mwilson13@cox.net wrote:

On Wednesday, June 18, 2003, at 01:55 PM, Martin DeMello wrote:

Would including readers for older marshal formats somewhere in the
distribution make sense? I can definitely imagine scenarios where my
marshalled data might outlive my ruby version.

Or better yet (perhaps), an “on-the-fly” converter of older formats to
the newer format.

I’m unconvinced this is better - we don’t want to encourage data to hang
around in the old format, after all.

Is there currently a ‘format version’ field recorded with the
object somewhere? That lets future generations decide for you.

Robert Feldt wrote:


MOD PARENT UP +5 Informative

… uhm… wrong forum, ignore me…

/Anders


dc -e
4ddod3dddn1-89danrn10-dan3+ann6dan2an13dn1+dn2-dn3+5ddan2/9+an13nap