Bignum Troubles

I'm running a Ruby program that relies on Bignum values.

I have a correct output to compare against and test my work. My current output agrees with the correct output for some time, but eventually, they begin to diverge.

I've tried to track the problem down and it SEEMS to be that the Bignum values stop matching their counterparts. (Note: I can't actually see the large numbers in the correct output, so I'm guessing from behavior, but I'm fairly certain now.)

The Bignum manipulations involve / and %, so I'm guessing it's a precision error. I know in Perl, I can adjust the precision of it's Bignum-like library.

So, my question is, how do I adjust Bignum's precision for thinks like division?

Thanks.

James Edward Gray II

"James Edward Gray II" <james@grayproductions.net> schrieb im Newsbeitrag
news:0F54A0B4-0723-11D9-B224-000A95BA45F8@grayproductions.net...

I'm running a Ruby program that relies on Bignum values.

I have a correct output to compare against and test my work. My
current output agrees with the correct output for some time, but
eventually, they begin to diverge.

I've tried to track the problem down and it SEEMS to be that the Bignum
values stop matching their counterparts. (Note: I can't actually see
the large numbers in the correct output, so I'm guessing from behavior,
but I'm fairly certain now.)

Err, can't you just compare them? If it only *seems* that values differ
then there could be any number of other reasons for the differences you
observe.

The Bignum manipulations involve / and %, so I'm guessing it's a
precision error. I know in Perl, I can adjust the precision of it's
Bignum-like library.

So, my question is, how do I adjust Bignum's precision for thinks like
division?

Bignum ist just integer numbers. AFAIK there is no such thing as a
precision for this - all digits are correct. Since I don't know the math
you're doing it's hard to guess. One candidate that does make a
difference with interger divisions is order of execution: you might not
execute all operations in the same order thus cutting off something which
isn't cut off in another calculation, hence they differ.

I guess you're aware what I mean:

33 / 2 * 2

=> 32

33 * 2 / 2

=> 33

Another source of suble errors might be multiple conversions between float
and integer numbers.

Mabye you better use Rational:

(Rational(33) / 2 * 2).to_i

=> 33

Or are you talking about BigDecimal?

Kind regards

    robert

Err, can't you just compare them? If it only *seems* that values differ
then there could be any number of other reasons for the differences you
observe.

My project is a simulation. I have a "dump" from a correct simulator. I can compare my dump with theirs.

Part of the simulation includes random number generation. To solve this, the spec gives you their random number generation system to use for the purposes of testing. Dumps do not include random numbers that were generated.

My simulation agrees with theirs for 938 turns. At 939, we diverge. The step we part on is 90% about random number generation and I've tried to verify everything else to the best of my ability.

I am "guessing", but I think it's a good guess.

Bignum ist just integer numbers. AFAIK there is no such thing as a
precision for this - all digits are correct.

Well, when you divide two Integers you can get a decimal, right? My understanding was that "big math" libraries cut off the division at some point, to keep from choking on repeating decimal results. Please correct me, if I misunderstand.

Here's a look at my math:

  def initialize()
    # non-related setup work here

    @random_seeds = [ 12345 ]
  end

  # and later

  def test_random(limit)
    until @random_seeds.size == 5
      @random_seeds.push @random_seeds[-1] * 22695477 + 1
    end
    @random_seeds.shift
    
    return ((@random_seeds[-1] / 65536) % 16384) % limit
  end

test_random() is written to the provided specification, returning an integer between 0 and limit.

The spec include the first 100 answers, which I agree with. In fact, we agree much farther than that. As I said it takes 939 turns for us to disagree and the random number generator can be used many times in a turn.

I believe we see above that @random_seeds should always contain Bignums (after the initial few Fixnums, of course).

What you've made me wonder is, what stages does the last line go through? Does that first division produce a Float? If so, that's probably my error right there.

Thanks for your help.

James Edward Gray II

···

On Sep 15, 2004, at 10:39 AM, Robert Klemme wrote:

Err, can't you just compare them? If it only *seems* that values differ
then there could be any number of other reasons for the differences you
observe.

My project is a simulation. I have a "dump" from a correct simulator. I can compare my dump with theirs.

Part of the simulation includes random number generation. To solve this, the spec gives you their random number generation system to use for the purposes of testing. Dumps do not include random numbers that were generated.

My simulation agrees with theirs for 938 turns. At 939, we diverge. The step we part on is 90% about random number generation and I've tried to verify everything else to the best of my ability.

I am "guessing", but I think it's a good guess.

Bignum ist just integer numbers. AFAIK there is no such thing as a
precision for this - all digits are correct.

Well, when you divide two Integers you can get a decimal, right? My understanding was that "big math" libraries cut off the division at some point, to keep from choking on repeating decimal results. Please correct me, if I misunderstand.

Here's a look at my math:

  def initialize()
    # non-related setup work here

    @random_seeds = [ 12345 ]
  end

  # and later

  def test_random(limit)
    until @random_seeds.size == 5
      @random_seeds.push @random_seeds[-1] * 22695477 + 1
    end
    @random_seeds.shift
    
    return ((@random_seeds[-1] / 65536) % 16384) % limit
  end

test_random() is written to the provided specification, returning an integer between 0 and limit.

The spec include the first 100 answers, which I agree with. In fact, we agree much farther than that. As I said it takes 939 turns for us to disagree and the random number generator can be used many times in a turn.

I believe we see above that @random_seeds should always contain Bignums (after the initial few Fixnums, of course).

What you've made me wonder is, what stages does the last line go through? Does that first division produce a Float? If so, that's probably my error right there.

That first division should always produce an Integer (either Fixnum or Bignum). Are you expecting to get a decimal or a float somewhere? I don't see anywhere in this code where you will get one... It all looks like integer math to me; they should be precise, but with any decimal part truncated.

HTH,
Mark

···

On Sep 15, 2004, at 9:52 AM, James Edward Gray II wrote:

On Sep 15, 2004, at 10:39 AM, Robert Klemme wrote:

Thanks for your help.

James Edward Gray II

James Edward Gray II said:

Here's a look at my math:

  def initialize()
    # non-related setup work here

    @random_seeds = [ 12345 ]
  end

  # and later

  def test_random(limit)
    until @random_seeds.size == 5
      @random_seeds.push @random_seeds[-1] * 22695477 + 1
    end
    @random_seeds.shift

    return ((@random_seeds[-1] / 65536) % 16384) % limit
  end

Several things look wrong here.

(1) I don't see why you are using an array of seeds. Only the last seed
is ever accessed, so you could get by with a single variable. If the
original algorithm used an array, then I suspect that the algorithm wasn't
transliterated into Ruby correctly.

(2) The math doesn't need bignums. It explicit throws away the bottow 16
bits (with the divide by 65535) and the tosses all the bits above the
remaining low 14 (with the mod 16384). So a 30 bit number (16 + 14) would
be adequate for this particular algorithm.

Here's a simplier version of the algorithm. It only uses one seed and the
seed never gets big enough to become a BigNum. I tested it against the
first 10000 numbers in your sequence.

  class R2
    TWO_16 = (2**16)
    TWO_14 = (2**14)
    TWO_30 = (2**30)

    def initialize
      @seed = 12345
      3.times { @seed = next_seed }
    end

    def test_random(limit)
      @seed = next_seed
      ((@seed / TWO_16) % TWO_14) % limit
    end

    def next_seed
      (@seed * 22695477 + 1) % TWO_30
    end
  end

···

--
-- Jim Weirich jim@weirichhouse.org http://onestepback.org
-----------------------------------------------------------------
"Beware of bugs in the above code; I have only proved it correct,
not tried it." -- Donald Knuth (in a memo to Peter van Emde Boas)

I was in a *very* similar situation (except I had the random stream
as part of the validation dump suite and thus could drill down deeper)
and eventually (with much appreciated help from the folks here) tracked
it down to a bug in bignum.c (see
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/111123 and
the proceeding thread).

     Executive summary: the bignum and & or operations were not getting
along with the garbage collector, causing infrequent errors after a
program had run for a while. Basically, if garbage collection happened
during certain bignum operations, the results could be corrupted.

     I just patched my copy (I can't change to a new version
mid-validation) but I understand that the change is in CVS as well.

     Hope this help.

     -- MarkusQ

···

On Wed, 2004-09-15 at 09:52, James Edward Gray II wrote:

On Sep 15, 2004, at 10:39 AM, Robert Klemme wrote:

My project is a simulation. I have a "dump" from a correct simulator.
I can compare my dump with theirs.

Part of the simulation includes random number generation. To solve
this, the spec gives you their random number generation system to use
for the purposes of testing. Dumps do not include random numbers that
were generated.

That's because your math is a lot better than mine. <laughs> I recognized the division as >> 2**16, but I didn't get the 2**14 modulo.

You're right about the array, it was dumb.

You also optimized my routine incredibly with the % to the seed. Very nice.

Ironically, as it turns out, my number generator was fine, if super slow. I did fine the problem elsewhere. Guess it wasn't such a good guess after all.

Thanks to all who helped!

James Edward Gray II

···

On Sep 15, 2004, at 2:32 PM, Jim Weirich wrote:

Several things look wrong here.

"James Edward Gray II" <james@grayproductions.net> schrieb im Newsbeitrag news:DDD9DF2E-0761-11D9-9076-000A95BA45F8@grayproductions.net...

Several things look wrong here.

That's because your math is a lot better than mine. <laughs> I recognized the division as >> 2**16, but I didn't get the 2**14 modulo.

You're right about the array, it was dumb.

You also optimized my routine incredibly with the % to the seed. Very nice.

The algorithm looks to me, as if random numbers were not equally distributed because of the "% limit". Is that so and if so is that a desired effect?

Ironically, as it turns out, my number generator was fine, if super slow. I did fine the problem elsewhere. Guess it wasn't such a good guess after all.

Ha! -)

Thanks to all who helped!

You're welcome.

Just one other remark, there is a lib that changes the semantics of int division. I don't remember the name though. So 5 / 3 is not always a Fixnum.

Kind regards

    robert

···

On Sep 15, 2004, at 2:32 PM, Jim Weirich wrote:

Robert Klemme wrote:

Just one other remark, there is a lib that changes the semantics of int division. I don't remember the name though. So 5 / 3 is not always a Fixnum.

$ irb -r mathn
irb(main):001:0> 5/3
=> 5/3

Is this a standard lib?

James Edward Gray II

···

On Sep 15, 2004, at 6:05 PM, Robert Klemme wrote:

Just one other remark, there is a lib that changes the semantics of int division. I don't remember the name though. So 5 / 3 is not always a Fixnum.

Just one other remark, there is a lib that changes the semantics of int division. I don't remember the name though. So 5 / 3 is not always a Fixnum.

Is this a standard lib?

Yes, Rational numbers are defined in rational.rb. Just require 'rational'. 'mathn', also in the stdlib, includes 'rational', 'complex' and 'matrix', and adds a little extra glue (as well as implementing a "Prime" class).

Aside:
It's amazing what you can find in the stdlib; when I was first getting familiar with Ruby, I kept getting surprised at what I found there :slight_smile: It's also amazing how nicely things tie in; all the new Numeric types are automatically created in appropriate circumstances:

   require 'mathn'
     ==>true
   5/8
     ==>5/8
   5/8*2
     ==>5/4
   5/8+4/3
     ==>47/24
   Math.sqrt -4
     ==>Complex(0, 2)

cheers,
Mark

···

On Sep 15, 2004, at 4:12 PM, James Edward Gray II wrote:

On Sep 15, 2004, at 6:05 PM, Robert Klemme wrote:

James Edward Gray II