The following small test program:
[1.5, 4.0/3.0].each { |a|
ma = Marshal.dump(a)
b = Marshal.load(ma)
if a == b
puts "Everything is working fine for #{a}"
else
puts "PROBLEM: a is #{a}, b is #{b}, and ma is #{ma.dump}"
end
}
Exposes the following strange behavior:
ruby test.rb
Everything is working fine for 1.5
PROBLEM: a is 1.33333333333333, b is 1.33333333333333, and ma is
"\004010f\0261.333333333333333"
So apparently Marshal sometimes changes the value of a
Float when dumped. The problem looks like it is caused by
an attempt to make a platform-independent Marshal format
using an ASCII-decimal like representation, which cannot
accurately express all possible double’s. I understand that
this is an impossible problem since different floating point
formats have different sets of expressible points on the real
line. I think it would be more convenient and useful in
scientific computing, however, if we could exactly save and
load floating-point values when we are on a consistent
platform like the IEEE 754 standard, and just want to use
Marshal to save/load on disk or across a network with no
numerical instability. Here is one way to achieve it using a
special byte:
Choose the three most popular floating-point formats, and
give them codes, like
0 = IEE754x86, 1 = SPARC, 2 = MAC68k, 3 = OTHER
Then add a single byte to the end of the Marshal string that
bitwise copies in a platform-dependent format the
6 least-significant bits of the significand, wherever they
may be according to the spec. Then, when loading up a Float,
first read the decimal expansion as usual to get a "close"
value. Next, check the 2-bit format code to see if the native
Float format is the same. If it is, copy over the six bits of
platform-dependent significand over the close value. Now it’s
precisely right when the format is the same, and “pretty
close” in other cases. If we’re concerned about the size of the
Marshal format, then it would probably do as well to simply
replace the last decimal digit of a maximum-sized decimal
expansion with this special byte, as it contains more information
than a single decimal digit. Then, this special byte would only
incur an extra cost in the relatively small fraction of Float’s
that have submaximal decimal expansions. Finally, it’s not
hard to see how a single bit can be used instead to indicate
the presence or abscence of the special byte, to further cut
down on space used.
This technique would allow me to do long scientific simulations
that have to load and save lots of intermediate files and over
many different computers working in parallel and communicating
results via network.
What do you all think?
Rudi
(another ruby hack)
nobu.nokada@softhome.net wrote in message news:200304161751.h3GHpIHQ017185@sharui.nakada.kanuma.tochigi.jp…
Hi,
What about just adding an extra digit than DBL_DIG?
Yet another idea is to use original binary format: divide the
value into sign, exponent and mantissa, then encode them
severally.
[ patch follows ]
The DBL_DIG patch makes the problem less common, but does not solve it;
[[1,2,3,4], [81, 2, 118, 3146]].each { |w,x,y,z|
a = (x.to_f + y.to_f / z.to_f) * Math.exp(w.to_f / (x.to_f + y.to_f / z.to_f))
ma = Marshal.dump(a)
b = Marshal.load(ma)
if a == b
puts “Everything is working fine for #{a}”
else
puts “PROBLEM: a is #{a}, b is #{b}, and ma is #{ma.dump}”
end
}
Produces
Everything is working fine for 3.95601527633861
PROBLEM: a is 3.7517675036461267e+17, b is 3.7517675036461261e+17,
and ma is “\004\010f\e3.7517675036461267e+17”
The problem remains that certain numbers that are
exactly representable in binary are not exactly
representable in decimal without more precise rounding
rules than sprintf gives us. I think that your other
suggestion about dividing into sign, mantissa, and
exponent would work, but I am not sure I see the
advantage, as it seems it would be more complex and
perhaps larger than just saving the least significant
few bits of the mantissa.
Wow, you’re so fast. Please commit this fix.
matz.
···
In message “Re: Roundoff problem with Float and Marshal” on 03/04/18, nobu.nokada@softhome.net nobu.nokada@softhome.net writes:
This is caused by bad implementation of strtod that Ruby 1.8 uses (in
util.c). I will seek for the better one.
Hmmm, it seems so.
$ ruby -e ‘p 3.7517675036461267e+17’
3.7517675036461261e+17
Index: util.c
RCS file: /pub/cvs/ruby/src/ruby/util.c,v
retrieving revision 1.32
diff -u -2 -p -w -r1.32 util.c
— util.c 16 Jan 2003 07:34:03 -0000 1.32
+++ util.c 17 Apr 2003 16:10:57 -0000
nobu.nokada@softhome.net wrote in message news:200304171653.h3HGrgHQ002407@sharui.nakada.kanuma.tochigi.jp…
Since it was wrong a little, I fixed and committed it.
Hmmm interesting numerical developments. I continue to test it by
doing
a cvs update (with and without the DBL_DIG patch) and do see perhaps
some further problems:
100000.times {
w = rand(111111111)
x = rand(222222223)
y = rand(33333333334)
z = rand(311414313)
a = (x.to_f + y.to_f / z.to_f) *
Math.exp(w.to_f / (x.to_f + y.to_f / z.to_f))
ma = Marshal.dump(a)
b = Marshal.load(ma)
if a == b
puts “Everything is working fine for #{a}”
else
puts “PROBLEM: a is #{a}, b is #{b}, and ma is #{ma.dump}”
puts “w is #{w}, x is #{x}, y is #{y}, z is #{z}”
end
}
produces many PROBLEM: lines on my system.
I hope your tests are being added to some set of unit tests
for ruby.
···
On Friday, 18 April 2003 at 8:28:05 +0900, Rudi Cilibrasi wrote:
nobu.nokada@softhome.net wrote in message news:200304171653.h3HGrgHQ002407@sharui.nakada.kanuma.tochigi.jp…
Since it was wrong a little, I fixed and committed it.
Hmmm interesting numerical developments. I continue to test it by
doing
a cvs update (with and without the DBL_DIG patch) and do see perhaps
some further problems:
100000.times {
w = rand(111111111)
x = rand(222222223)
y = rand(33333333334)
z = rand(311414313)
a = (x.to_f + y.to_f / z.to_f) *
Math.exp(w.to_f / (x.to_f + y.to_f / z.to_f))
ma = Marshal.dump(a)
b = Marshal.load(ma)
if a == b
puts “Everything is working fine for #{a}”
else
puts “PROBLEM: a is #{a}, b is #{b}, and ma is #{ma.dump}”
puts “w is #{w}, x is #{x}, y is #{y}, z is #{z}”
end
}
produces many PROBLEM: lines on my system.
–
Jim Freeze
It is the business of little minds to shrink.
– Carl Sandburg
Hi,
nobu.nokada@softhome.net wrote in message news:200304171653.h3HGrgHQ002407@sharui.nakada.kanuma.tochigi.jp…
Since it was wrong a little, I fixed and committed it.
Hmmm interesting numerical developments. I continue to test it by
doing
a cvs update (with and without the DBL_DIG patch) and do see perhaps
some further problems:
produces many PROBLEM: lines on my system.
It’s the limitation of base 10 representation of float values that
Marshal uses. We need to use binary representation like XDR to solve
this. I’m not sure if it is worth for the cost of its implementation
and portability. Float tend to be inexact, after all.
matz.
···
In message “Re: Roundoff problem with Float and Marshal” on 03/04/18, Rudi Cilibrasi cilibrar@ugcs.caltech.edu writes:
Hi,
I’d tried C99 style hexadecimal float. I expect this is
portable (except for depending on frexp() and ldexp()), but
could not be loaded by current version.
frexp(), ldexp() dependence is OK, since Ruby uses them in math.c
anyway. But changing format version may cause problems, especially
for things like dRuby. Hmmm. Comments, anyone?
But still, strtod() problem remains.
It is unavoidable as long as we use binary representation.
matz.
···
In message “Re: Roundoff problem with Float and Marshal” on 03/04/18, nobu.nokada@softhome.net nobu.nokada@softhome.net writes:
I have implemented the idea I gave earlier. I had to use 2 bytes to
pass all my test cases. There is a 2-bit space for other floating formats
also. I hope this technique preserves portability as well as before,
yet also gives numerical stability which is essential for some types of
scientific computing, e.g. distributed and long-term simulations.
What do you think of this solution?
Rudi
— marshal.c 2003-04-18 10:25:29.000000000 +0200
+++ marshal.c 2003-04-18 10:25:02.000000000 +0200
@@ -182,4 +182,59 @@ w_long(x, arg)
}
+#define FT_OTHER 0
+#define FT_IEEE754 1
+#define FT_UNDEFINED 5
···
+static int checkFloatingType()
+{
+static double fixMantissaBits(double d, unsigned char *buf)
+{
- int floatingType = checkFloatingType();
- int encodedType = buf[1] >> 6;
- if (floatingType != encodedType)
- return d;
- switch(floatingType) {
- case FT_IEEE754:
-
((unsigned char *) &d)[0] = buf[0];
-
((unsigned char *) &d)[1] &= 0xc0;
-
((unsigned char *) &d)[1] |= buf[1] & ~0xc0;
-
break;
- default:
-
break;
- }
- return d;
+}
-
+/* Saves 2 bytes total of mantissa and floatingType */
+void saveMantissaBits(double d, unsigned char *buf)
+{
static void
w_float(d, arg)
@@ -202,7 +257,8 @@ w_float(d, arg)
else {
/* xxx: should not use system’s sprintf(3) */
- sprintf(buf, “%.16g”, d);
- saveMantissaBits(d, buf);
- sprintf(buf+2, “%.17g”, d);
}
- w_bytes(buf, strlen(buf), arg);
- w_bytes(buf, strlen(buf+2)+2, arg);
}
@@ -930,5 +986,6 @@ r_object0(arg, proc)
}
else {
Hi,
I’d tried C99 style hexadecimal float. I expect this is
portable (except for depending on frexp() and ldexp()), but
could not be loaded by current version.
But still, strtod() problem remains.
How about pack unpack method? Not Portable?
Packing float is not portable at all.
For Float f,
f == [f].pack(“G”).unpack(“G”)[0]
is always true.
True on same system. “Portability” means, you can unmarshal a
data which was marshaled on another architecture.
···
At Fri, 18 Apr 2003 15:00:58 +0900, Park Heesob wrote:
–
Nobu Nakada
Hi,
I have implemented the idea I gave earlier. I had to use 2 bytes to
pass all my test cases. There is a 2-bit space for other floating formats
also. I hope this technique preserves portability as well as before,
yet also gives numerical stability which is essential for some types of
scientific computing, e.g. distributed and long-term simulations.
What do you think of this solution?
Interesting. Is this endian safe, i.e. does this work properly on
both little endian and big endian machine?
No, unfortunately, and assuming binary format is less portable.
If it is endian safe, then we can put mantissa bits at the tail of
float representation, e.g. “2.3847655 \001\001”, to make it work well
with older Marshal.
Implemented with “\0” separator.
Index: marshal.c
···
At Fri, 18 Apr 2003 18:33:37 +0900, Yukihiro Matsumoto wrote:
RCS file: //sharui/cvs/ruby/src/ruby/marshal.c,v
retrieving revision 1.84
diff -u -2 -p -r1.84 marshal.c
— marshal.c 9 Apr 2003 05:08:25 -0000 1.84
+++ marshal.c 18 Apr 2003 11:22:01 -0000
@@ -17,4 +17,7 @@
#include <math.h>
+#ifdef HAVE_FLOAT_H
+#include <float.h>
+#endif
#define BITSPERSHORT (2*CHAR_BIT)
@@ -182,4 +185,49 @@ w_long(x, arg)
}
+#ifdef DBL_MANT_DIG
+static double
+load_least_mantissa(d, buf)
- double d;
- char *buf;
+{
- int e, m, s = 0;
- if (d < 0) {
- d = -d;
- s = 1;
- }
- modf(ldexp(frexp(d, &e), DBL_MANT_DIG - 16), &d);
- m = ((buf[0] & 0xff) << 8) | (buf[1] & 0xff);
- d = ldexp(frexp(d + ldexp((double)m, -16), &m), e);
- if (s) {
- d = -d;
- }
- return d;
+}
+static int
+save_least_mantissa(d, buf)
- double d;
- char *buf;
+{
- int e, m;
- m = (int)(modf(ldexp(frexp(fabs(d), &e), DBL_MANT_DIG - 16), &d) * 0x10000);
- *buf++ = 0;
- *buf++ = m >> 8;
- *buf++ = m;
- return 3;
+}
+#else
+#define load_least_mantissa(d, buf) (d)
+#define save_least_mantissa(d, buf) 0
+#endif
+#ifdef DBL_DIG
+#define FLOAT_DIG (DBL_DIG+1)
+#else
+#define FLOAT_DIG 17
+#endif
+
static void
w_float(d, arg)
@@ -188,21 +236,27 @@ w_float(d, arg)
{
char buf[100];
-
int len;
if (isinf(d)) {
if (d < 0) strcpy(buf, “-inf”);
else strcpy(buf, “inf”);
-
len = strlen(buf);
}
else if (isnan(d)) {
strcpy(buf, “nan”);
-
len = strlen(buf);
}
else if (d == 0.0) {
if (1.0/d < 0) strcpy(buf, “-0”);
else strcpy(buf, “0”);
-
len = strlen(buf);
}
else {
/* xxx: should not use system’s sprintf(3) */
- sprintf(buf, “%.16g”, d);
- sprintf(buf, “%.*g”, FLOAT_DIG, d);
- len = strlen(buf);
- len += save_least_mantissa(d, buf + len);
}
- w_bytes(buf, strlen(buf), arg);
- w_bytes(buf, len, arg);
}
@@ -930,5 +984,9 @@ r_object0(arg, proc)
}
else {
-
char *e;
-
d = strtod(RSTRING(str)->ptr, &e);
-
if (!*e++ && e - RSTRING(str)->ptr < RSTRING(str)->len) {
-
d = load_least_mantissa(d, e);
-
}
}
v = rb_float_new(d);
–
Nobu Nakada
Hi Matz and Nobu again,
This is a tricky problem. I have rewritten my patch in a way that I hope
is endian-independent, and furthermore allows you to load and save exactly
across any IEEE754 system, whether big or little endian floats.
Unfortunately, I could find only one bigendian system to test on, and
when my friend tried to compile ruby on it, it failed, so I was
unable to complete the testing on this new version of the patch.
(against the current version in the CVS tree) Maybe there is some kind
soul out there that would like to help? I also took Matz suggestion and
followed Nobu’s example of how to properly put the special two bytes at
the end of the string after the nul for backwards compatibility. All I
can say with confidence is that this patch also passes all my tests on the
x86 linux machines at my disposal. Cheers,
Rudi
P.S. I am embarrassed to use some extra globals to make this patch work.
— marshal.c 2003-04-19 12:48:14.000000000 +0200
+++ marshal.c 2003-04-19 12:42:48.000000000 +0200
@@ -182,4 +182,72 @@ w_long(x, arg)
}
+#define FT_OTHER 0
+#define FT_IEEE754 1
+#define FT_UNDEFINED 5
···
+static int floatingType = FT_UNDEFINED;
+static int floatingStart, floatingDir; /* forgive me */
+
+static void checkFloatingType()
+{
- if (floatingType == FT_UNDEFINED) {
- volatile double t = 1.234;
- volatile unsigned char *b = (unsigned char *) &t;
- if (b[0] == 0x58 && b[1] == 0x39 && b[2] == 0xb4 && b[3] == 0xc8 &&
-
b[4] == 0x76 && b[5] == 0xbe && b[6] == 0xf3 && b[7] == 0x3f) {
-
floatingType = FT_IEEE754;
-
floatingStart = 0;
-
floatingDir = 1;
- }
- if (b[7] == 0x58 && b[6] == 0x39 && b[5] == 0xb4 && b[4] == 0xc8 &&
-
b[3] == 0x76 && b[2] == 0xbe && b[1] == 0xf3 && b[0] == 0x3f) {
-
floatingType = FT_IEEE754;
-
floatingStart = 7;
-
floatingDir = -1;
- }
- if (floatingType == FT_UNDEFINED)
-
floatingType = FT_OTHER;
- }
- return;
+}
-
+#define GETBYTE(d, x) ((unsigned char *) &d)[floatingStart + floatingDir * x]
+
+static double fixMantissaBits(double d, unsigned char *buf)
+{
+static double saveMantissaBits(double d, unsigned char *buf)
+{
- unsigned char result;
- checkFloatingType();
- switch (floatingType) {
- case FT_IEEE754:
-
buf[0] = GETBYTE(d, 0);
-
buf[1] = GETBYTE(d, 1);
-
break;
- default:
-
buf[0] = 1;
-
buf[1] = 1;
-
break;
- }
- buf[1] = (buf[1] & ~0xc0) | (floatingType << 6);
+}
-
-
static void
w_float(d, arg)
@@ -202,7 +270,8 @@ w_float(d, arg)
else {
/* xxx: should not use system’s sprintf(3) */
- sprintf(buf, “%.16g”, d);
- sprintf(buf, “%.17g”, d);
- saveMantissaBits(d, buf+strlen(buf)+1);
}
- w_bytes(buf, strlen(buf), arg);
- w_bytes(buf, strlen(buf)+3, arg);
}
@@ -931,4 +1000,5 @@ r_object0(arg, proc)
else {
d = strtod(RSTRING(str)->ptr, 0);
- d = fixMantissaBits(d, RSTRING(str)->ptr + strlen(RSTRING(str)->ptr) + 1);
}
v = rb_float_new(d);
My patch is “convervative”;
I believe the idea itself is endian-safe, and the patch as posted
it almost safe. It does not assume a binary format, it checks; The line
if (b[0] == 0xc8b43958 && b[1] == 0x3ff3be76)
may need to be adjusted to include the check that it is a little-endian
machine (if floating point numbers are reversed on big-endian machines).
I don’t want to solve the problem for every machine to convert
between little and big endian. I just want it to work on the most common
machine, the one I am using, and not break the others. I am confident it
won’t break the other platforms because they will set the floatingType
bits to FT_OTHER, which means that the 2 binary bytes at the end are
simply ignored, and so we are back to the original behavior. I would like
to emphasize that I am not assuming any particular binary format for
floating points, but instead testing at runtime what format is in use,
and only applying this fix if it happens to be the right one. There is
still room for two other possible formats to be added, at least. And
it is definately possible to cover the big-endian case as well without an
extra type code, but I would need access to a big-endian machine to test it.
The important point would be that the least significant floating-point
mantissa 8 bits are put in the first byte, then the next least significant
6 bits in the second byte. This is endian-safe so long as you put them
back right, and are using IEEE754.