Microrant on Ruby's Math Skills

I'm really impressed with that Wrong library; well done!

Just throwing this out there: judging float equality based on
_difference_ is incorrect. You should use _ratio_, or perhaps in some
cases a combination of both.

For instance, if my standard test library defines a default tolerance
of 0.00001, that seems pretty good, right? OK, but what if the floats
I'm testing are actually really close to zero?

   assert { x.close_to? 0.0000000135 }

Well... x could have the value 0.0000000134999999999999994243561, in
that crazy way that floats behave. That's clearly "close enough" to
the intended value -- but so is almost any number near zero, because
the default tolerance of 0.00001 is orders of magnitude larger than
the values being compared. The test passes vacuously; the default
tolerance is inappropriate for this case.

Of course, I can set a different tolerance for a given test, but the
deeper problem is this: numerate people use ratio instead of
difference to judge the proximity of one number to another, and that's
how we should implement tests for float pseudo-equality. You
shouldn't need any parameter then; the implementation should work no
matter the scale of the floats involved.
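
As a minimal sketch of what I mean -- a hypothetical helper, not any
library's actual API -- a ratio-based check needs no per-test tuning:

   # Compare by ratio so the check is scale-independent.
   def roughly_equal?(a, b, epsilon = 1e-12)
     return a == b if a.zero? || b.zero?  # zero gives no scale to work with
     (a / b - 1.0).abs < epsilon
   end

   roughly_equal?(0.0000000134999999999999994243561, 0.0000000135)
   # => true, at any scale, with no tolerance argument supplied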

Discuss.

···

On Wed, Jan 25, 2012 at 6:05 AM, Alex Chaffee <alexch@gmail.com> wrote:

So how to test around this in unit tests? In RSpec, use be_within (née
be_close) [1]; in Wrong (which works inside many test frameworks), use
close_to? [2]

Well, you only convert during reading from and writing to the
database. The rest of the software works with instances of a proper
class which handles all the nifty details internally. That's the
whole point of OO, isn't it?
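
As a sketch of that idea -- a hypothetical class, not any particular
library's -- integer cents live inside the object and conversion happens
only at the boundaries:

   require 'bigdecimal'

   class Money
     include Comparable
     attr_reader :cents

     def initialize(cents)
       @cents = Integer(cents)
     end

     # Parse a decimal string so no binary float error sneaks in.
     def self.from_dollars(str)
       new((BigDecimal(str) * 100).to_i)
     end

     def +(other)
       Money.new(cents + other.cents)
     end

     def <=>(other)
       cents <=> other.cents
     end

     def to_s
       format("$%.2f", cents / 100.0)  # float used for display only
     end
   end

   (Money.from_dollars("1.10") + Money.from_dollars("2.20")).to_s
   # => "$3.30" -- no float arithmetic between the boundaries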

Cheers

robert

···

On Mon, Jan 30, 2012 at 5:52 PM, Tony Arcieri <tony.arcieri@gmail.com> wrote:

On Mon, Jan 30, 2012 at 1:01 AM, Florian Gilcher <flo@andersground.net> wrote:

On Jan 30, 2012, at 9:32 AM, Tony Arcieri wrote:
> I have found many uses for BigDecimal before and have seen Fixnums used
> where BigDecimal would probably be more appropriate (i.e. "count cents,
> not dollars!") where having (Big)Decimal literals would probably change
> people's minds about that sort of thing.

Counting cents is perfectly valid, fits every database (even your fancy
NoSQL database that doesn't have a decimal type) and is the proper way
to do math involving money. The base value is not the dollar, it is the
cent.

Having worked on these sorts of systems, I really hate them. You have to
constantly multiply and divide by 100 because, sorry, in the real world
it's dollars, not cents, that people actually work with and are familiar
with, and you leave yourself open to all sorts of off-by-two-orders-of-magnitude
errors doing these conversions all over the place.

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

suppose we are writing a unit test method

   def test_something
     a = pre_condition()              # => a is a Float
     assert_approx_equal a, expected1 # => ok
     a = process(a)                   # => do the real work
     assert_approx_equal a, expected2 # => the assertion may fail, if
                                      #    expected2 is evaluated in my head
   end

do you consider this a surprising result that may happen a lot?

···

On Tue, Jan 31, 2012 at 4:03 PM, Chad Perrin <code@apotheon.net> wrote:

On Tue, Jan 31, 2012 at 11:35:10AM +0900, Yong Li wrote:

On Tue, Jan 31, 2012 at 1:27 AM, Chad Perrin <code@apotheon.net> wrote:
> On Mon, Jan 30, 2012 at 05:22:47PM +0900, Robert Klemme wrote:
>> On Mon, Jan 30, 2012 at 6:56 AM, Chad Perrin <code@apotheon.net> wrote:
>> > On Mon, Jan 30, 2012 at 10:03:04AM +0900, Gary Wright wrote:
>
> Can someone please explain to me in clear, direct terms why there is
> opposition to the simple expedient of adding at least one additional
> comparison method to Float to address the vast majority of casual use
> cases without having to resort to reimplementing such comparison methods
> over and over again or bringing in additional libraries with onerous
> syntactic verbosity?

With all due respect, your suggestion is good, but we do need careful
analysis of the precise mathematical properties of such a supplement.
Consider, for example, how you would define the behavior in the
following situation:

# assuming your new Float#approx_equal returns true
# if two floats are equal up to the first digit after the decimal point
# e.g. 1.11.approx_equal(1.10) # => true
a = 1.11
b = 1.1
a.approx_equal(b)               #=> true
(a * 100).approx_equal(b * 100) #=> false

I guess this again breaks the simple math.
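
An aside, easy to check in irb: a ratio-based comparison, by contrast,
is essentially unaffected by that scaling --

   a = 1.11; b = 1.1
   (a / b - 1).abs                 #=> ~0.00909
   ((a * 100) / (b * 100) - 1).abs #=> ~0.00909, the scale cancels out

-- so the surprise above is specific to digit-based comparison.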

What exactly are you arguing here -- that there's no such thing as a
solution that is easier to evaluate in one's head so that the results are
not surprising a lot of the time?

Good Chad,
There is nothing wrong with your suggestion, just as there is nothing
wrong with many people saying it won't help much.
It's just different opinions, and I treat them as food for thought.
Reading this long conversation has led me to think about problems I
never imagined, and that's a good thing for me.

I was not trying to say you are wrong. In fact, your voice is
necessary to remind me that Float is imperfect and we should improve
it. I was just trying to give more opinions.

As for my little contrived example, if it is worthless to you, I am
sorry for wasting your time reading it.

···

On Tue, Jan 31, 2012 at 4:03 PM, Chad Perrin <code@apotheon.net> wrote:

On Tue, Jan 31, 2012 at 11:35:10AM +0900, Yong Li wrote:

On Tue, Jan 31, 2012 at 1:27 AM, Chad Perrin <code@apotheon.net> wrote:
> On Mon, Jan 30, 2012 at 05:22:47PM +0900, Robert Klemme wrote:
>> On Mon, Jan 30, 2012 at 6:56 AM, Chad Perrin <code@apotheon.net> wrote:
>> > On Mon, Jan 30, 2012 at 10:03:04AM +0900, Gary Wright wrote:

What exactly are you arguing here -- that there's no such thing as a
solution that is easier to evaluate in one's head so that the results are
not surprising a lot of the time?

Excellent points, well written.

Gavin

···

On Wed, Feb 1, 2012 at 4:23 AM, Chad Perrin <code@apotheon.net> wrote:

The following is, of course, only my evaluation of the situation and how
best to handle it; I'm sure wiser heads than mine in the realm of
language design could find deficiencies in my suggestions. I'd like to
know what's deficient, though, so if someone has something to say, please
share.

It'd be nice if you read the thread before chipping in. Adam Prescott nailed this one. Here's another example:

% ruby -e 'puts "%.60f" % 1.1'
1.100000000000000088817841970012523233890533447265625000000000
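
(For the curious, Float#to_r exposes the same exact value as a fraction:

   1.1.to_r  #=> (2476979795053773/2251799813685248)

i.e. the 1.10000000000000008881... value above, written as an integer
over 2**51.)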

···

On Jan 23, 2012, at 16:57 , Chad Perrin wrote:

On Tue, Jan 24, 2012 at 08:19:06AM +0900, Ryan Davis wrote:

Except for that whole "parse time is different from run time" part you
seem to be blithely ignoring. If you've already parsed float literals,
then they're floats and are already lossy.

Are you telling me that 1.1 is automatically lossy, regardless of how you
got there?

1.1 is a decimal floating point literal. This is translated into a
floating point number at some point in time (likely parse time). From
there on there is only a floating point number which has a binary
representation. Since certain decimal values cannot be exactly
represented as binary numbers there is loss the very moment the
translation occurs.

I guess I need to go back and refresh my understanding of the
math, because I thought a literal decimal number was fine but one
achieved by arithmetic was likely to contain subtle errors due to the way
the binary math is handled.

There is imprecision in the translation and further numeric effects
which affect precision later. The resulting imprecision (compared to
what pure math or symbolic calculations would yield) depends on

- translation of decimal literals to binary representation
- limits of the binary representation
- nature of the operations
- order of the operations
- translation back to decimal representation
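
A tiny illustration of the last two points, easy to verify in irb:

   (0.1 + 0.2) + 0.3  #=> 0.6000000000000001
   0.1 + (0.2 + 0.3)  #=> 0.6

The same operands, associated differently, give different floats.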

Given all this it's rather surprising that output results are so close
to what one would expect. :-) You can study the details at

and for example for one format:

Kind regards

robert

···

On Tue, Jan 24, 2012 at 1:57 AM, Chad Perrin <code@apotheon.net> wrote:

On Tue, Jan 24, 2012 at 08:19:06AM +0900, Ryan Davis wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

So how to test around this in unit tests? In RSpec, use be_within (née
be_close) [1]; in Wrong (which works inside many test frameworks), use
close_to? [2]

I'm really impressed with that Wrong library; well done!

Glad you like it! It would work even better if MRI attached ASTs
and/or source code to procs/lambdas/methods, rather than merely
source_location. I'm thinking of logging a bug or two about that.

Just throwing this out there: judging float equality based on
_difference_ is incorrect. You should use _ratio_, or perhaps in some
cases a combination of both.

Fascinating! So if we used division-and-proximity-to-1 instead of
subtraction-and-proximity-to-0 then we could possibly do away with the
tolerance parameter altogether... or at least redefine it.

Of course, your argument reverses itself when dealing with very large
numbers! Let's say I'm dealing with Time. If I say time a should be
close to time b, then I probably want the same default precision (say,
10 seconds) no matter when I'm performing the test, but using ratios
will give me quite different tolerances depending on whether my
baseline is epoch (0 = 1/1/1970) or Time.now.

   >> now = Time.now.to_i.to_f; (now/(now+10))
   => 0.9999999924671322
   >> now = 1.to_f; (now/(now+10))
   => 0.09090909090909091

In any case... I will be happy to review your patch! :-)

- A

···

On Tue, Jan 24, 2012 at 11:15 PM, Gavin Sinclair <gsinclair@gmail.com> wrote:

On Wed, Jan 25, 2012 at 6:05 AM, Alex Chaffee <alexch@gmail.com> wrote:

--
Alex Chaffee - alex@stinky.com
http://alexchaffee.com
http://twitter.com/alexch

I have tried this, but recently discovered the same issues arise.

To review, instead of:

    (b-d) <= a && (b+d) >= a

We use ratios:

    (a / b - 1).abs <= d

But try 1.1, 1.0 and d=0.1

  (1.1 / 1.0 - 1).abs <= 0.1

and it is false though it should be true b/c

  (1.1 / 1.0 - 1).abs #=> 0.10000000000000009

Which is exactly what instigated my micro-rant.

Any advice?
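
One conventional remedy -- a sketch only, not Wrong's or whitestone's
actual API -- is to combine a relative check with an absolute floor, in
the style later adopted by Python's math.isclose:

   def close?(a, b, rel_tol = 1e-9, abs_tol = 0.0)
     (a - b).abs <= [rel_tol * [a.abs, b.abs].max, abs_tol].max
   end

   close?(1.1, 1.0, 0.1)
   # => true: 0.10000000000000009 <= 0.11000000000000001

Scaling the tolerance by the operands means the representation error in
the difference no longer tips the comparison.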

Agreed. I'm only using it on a few experimental projects, so far.

Anyone else using Alex's Wrong?

···

On 01/24/2012 11:15 PM, Gavin Sinclair wrote:

On Wed, Jan 25, 2012 at 6:05 AM, Alex Chaffee <alexch@gmail.com> wrote:

So how to test around this in unit tests? In RSpec, use be_within (née
be_close) [1]; in Wrong (which works inside many test frameworks), use
close_to? [2]

I'm really impressed with that Wrong library; well done!

Yeah, basically. Of course, "a lot" is relative -- but the upshot is
that it happens enough to be a problem when offering an additional
comparison method should yield something much easier to evaluate in one's
head.

Of course, I don't know how you get a unit test to make use of your
brain's arithmetic capabilities. . . .

···

On Tue, Jan 31, 2012 at 06:35:29PM +0900, Yong Li wrote:

On Tue, Jan 31, 2012 at 4:03 PM, Chad Perrin <code@apotheon.net> wrote:
>
> What exactly are you arguing here -- that there's no such thing as a
> solution that is easier to evaluate in one's head so that the results are
> not surprising a lot of the time?

suppose we are writing a unit test method

   def test_something
     a = pre_condition()              # => a is a Float
     assert_approx_equal a, expected1 # => ok
     a = process(a)                   # => do the real work
     assert_approx_equal a, expected2 # => the assertion may fail, if
                                      #    expected2 is evaluated in my head
   end

do you consider this a surprising result that may happen a lot?

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

I don't get it. I've been polite and reasonable -- and evidently
overlooked one fucking statement. For this crime, I've got a couple
people crawling up my ass with pitchforks and torches.

Have fun. You don't need my involvement to "justify" your prickish
behavior any longer.

···

On Tue, Jan 24, 2012 at 10:23:36AM +0900, Ryan Davis wrote:

It'd be nice if you read the thread before chipping in. Adam Prescott
nailed this one. Here's another example:

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

Generally, you should use ratio instead of difference when comparing
floats, but in many cases - such as the time one provided by Alex - a
difference is obviously a better idea.

There is no silver bullet, just use whatever works for you in a particular case.

-- Matma Rex

Having spent untold hours creating my own testing library
(whitestone), I'm afraid I am only admiring Wrong from a distance.

Credit where it's due, of course: I admired assert2.0 from a distance as well :-)

···

On Thu, Jan 26, 2012 at 6:31 AM, Joel VanderWerf <joelvanderwerf@gmail.com> wrote:

I'm really impressed with that Wrong library; well done!

Agreed. I'm only using it on a few experimental projects, so far.

Anyone else using Alex's Wrong?

This isn't a problem. The aim here is to test float equality. The
test should return true iff a human would look at the two floats and
say "yep, they're meant to be the same thing, it's just the electrical
engineering that got in the way".

1.0 and 1.1 are not float-equal in this sense -- not even close -- and
0.1 is a ridiculous tolerance for testing float ratio. I'm sure you
were just experimenting, but what I'm saying is: your counterexample
doesn't undermine the overall approach.

···

On Thu, Jan 26, 2012 at 5:53 AM, Intransition <transfire@gmail.com> wrote:

[...]

But try 1.1, 1.0 and d=0.1

(1.1 / 1.0 - 1).abs <= 0.1

and it is false though it should be true b/c

(1.1 / 1.0 - 1).abs #=> 0.10000000000000009

Which is exactly what instigated my micro-rant. Any advice?

Just throwing this out there: judging float equality based on
_difference_ is incorrect. You should use _ratio_, or perhaps in some
cases a combination of both.

Fascinating! So if we used division-and-proximity-to-1 instead of
subtraction-and-proximity-to-0 then we could possibly do away with the
tolerance parameter altogether... or at least redefine it.

Of course, your argument reverses itself when dealing with very large
numbers!

I don't think so. Floats are implemented using a coefficient and an
exponent. So are the following two floats essentially equal?

  A: 6.30912402 E 59
  B: 6.30912401999999999999 E 59

I'd say yes. What about these two?

  A: 6.30912402 E 59
  B: 6.30912401999999999999 E 58

Of course not! There is an order-of-magnitude difference. So perhaps
a unit testing float comparison should work like this (pseudo-code):

   def float_equal? a, b
      c1, m1 = coefficient(a), exponent(a)
      c2, m2 = coefficient(b), exponent(b)
      m1 == m2 and (c1/c2 - 1).abs < 0.000000000001
   end

If you take the magnitude away, then dealing with very large numbers
shouldn't be a problem.
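
To make the pseudo-code concrete -- my sketch, not a library method --
Ruby's Math.frexp splits a float into a fraction and a binary exponent:

   def float_equal?(a, b, epsilon = 1e-12)
     f1, e1 = Math.frexp(a)
     f2, e2 = Math.frexp(b)
     e1 == e2 && (f1 / f2 - 1).abs < epsilon
   end

   float_equal?(6.30912402e59, 6.30912401999999999999e59)  # => true
   float_equal?(6.30912402e59, 6.30912401999999999999e58)  # => false

One caveat: two essentially-equal values that straddle a power of two
get different binary exponents, so treat this as illustrative only.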

Let's say I'm dealing with Time. If I say time a should be
close to time b, then I probably want the same default precision (say,
10 seconds) no matter when I'm performing the test, but using ratios
will give me quite different tolerances depending on whether my
baseline is epoch (0 = 1/1/1970) or Time.now.

If your application or library wants to know whether two times are
within 10 seconds of each other, then that's a property of _your code_
and has nothing to do with float implementations. In other words, to
compare Time objects, use Time objects, not Float objects :-)

In any case... I will be happy to review your patch! :-)

Hard to offer a patch to code I don't even have installed :), but here
is an excerpt from my implementation. See
whitestone/lib/whitestone/assertion_classes.rb (at master in the
gsinclair/whitestone repository on GitHub), line 274, for context [1].

      def run
        if @actual.zero? or @expected.zero?
          # There's no scale, so we can only go on difference.
          (@actual - @expected).abs < @epsilon
        else
          # We go by ratio. The ratio of two equal numbers is one, so the ratio
          # of two practically-equal floats will be very nearly one.
          @ratio = (@actual/@expected - 1).abs
          @ratio < @epsilon
        end
      end

The problem with this is that it's using @epsilon for two different
purposes: a "difference" epsilon and a "ratio" epsilon. That is
clearly wrong, but I just implemented something that would work for
me. I figured there _must_ be a best-practice approach out there
somewhere that I could learn from. I firmly believe this problem
should be solved once and for all, and it won't be by testing
difference, and there should be a value for epsilon that is justified
by the engineering. [2]
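
A sketch of that separation, with hypothetical ABS_EPSILON and
REL_EPSILON constants (they are not in whitestone):

      def run
        if @actual.zero? or @expected.zero?
          (@actual - @expected).abs < ABS_EPSILON      # "difference" epsilon
        else
          (@actual / @expected - 1).abs < REL_EPSILON  # "ratio" epsilon
        end
      end

Each tolerance could then be justified on its own engineering grounds.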

I also believe the built-in Float class should provide methods to
assist us. It gives us inaccuracy, so it should give us the tools to
deal with it.

  class Float
    def essentially_equal_to?(other)
      # Best-practice implementation here, with a scientifically valid
      # value for epsilon.
    end

    def within_delta_of?(other, delta)
      (self - other).abs < delta
      # No default value for delta because it is entirely
      # context-dependent. This is a convenience method only.
    end
  end

  a = 1.1 - 1.0
  a.essentially_equal_to?(0.1) # true

  4.7.within_delta_of?(4.9251, 0.2) # false

[1] Full link for posterity:

[2] While "one epsilon to rule them all" is appealing, the problem is
that the errors inherent in float representation get magnified by
computation. However, even raising two "essentially equal" floats to
the power of 50 doesn't change their essential equality, assuming a
ratio of 1e-10 is good enough:

  a = 0.1 # 0.1
  b = 1.1 - 1.0 # 0.10000000000000009

  xa = a ** 50 # 1.0000000000000027e-50
  xb = b ** 50 # 1.0000000000000444e-50

  proximity_ratio = (xa/xb - 1).abs
                     # 4.163336342344337e-14

  proximity_ratio < 1e-10
                     # true

By the way, the proximity_ratio for the original a and b was
7.77156e-16, so I hastily conclude:
* The engineering compromises in the representation of floats give
us a proximity ratio of around 1e-15 (7.77156e-16 above).
* Raising to an enormous power changes the proximity ratio to around
1e-13 (4.163e-14 above).
* A reasonable value for epsilon might therefore be 1e-12.

I expect this conclusion might depend on my choice of values for a and
b, though.

If you made it this far, congratulations.

···

On Thu, Jan 26, 2012 at 5:29 AM, Alex Chaffee <alex@stinky.com> wrote:

Sorry, my bad for confusing you. Second try:

   # process(i) is supposed to return 100.0 * i,
   # and I am going to test it
   assert_with_approx_equal(process(1.1), 110.0)
   # => oops, this fails, but I thought 100.0 * 1.1 == 110.0

that '110.0' Float literal is calculated by my brain without using a
computer, and I may naively claim that the process method is wrong,
but actually it is this unit test which is wrong.

does this unit test qualify under your "98% of the problem for 98% of
casual use cases"?
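
(For the record, the arithmetic really does come out that way:

   100.0 * 1.1          #=> 110.00000000000001
   100.0 * 1.1 == 110.0 #=> false

so the "110.0" computed in one's head and the Float computed by the
machine are simply different numbers.)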

···

On Wed, Feb 1, 2012 at 12:49 AM, Chad Perrin <code@apotheon.net> wrote:

On Tue, Jan 31, 2012 at 06:35:29PM +0900, Yong Li wrote:

On Tue, Jan 31, 2012 at 4:03 PM, Chad Perrin <code@apotheon.net> wrote:
>
> What exactly are you arguing here -- that there's no such thing as a
> solution that is easier to evaluate in one's head so that the results are
> not surprising a lot of the time?

suppose we are writing a unit test method

   def test_something
     a = pre_condition()              # => a is a Float
     assert_approx_equal a, expected1 # => ok
     a = process(a)                   # => do the real work
     assert_approx_equal a, expected2 # => the assertion may fail, if
                                      #    expected2 is evaluated in my head
   end

do you consider this a surprising result that may happen a lot?

Yeah, basically. Of course, "a lot" is relative -- but the upshot is
that it happens enough to be a problem when offering an additional
comparison method should yield something much easier to evaluate in one's
head.

Of course, I don't know how you get a unit test to make use of your
brain's arithmetic capabilities. . . .

It'd be nice if you read the thread before chipping in. Adam Prescott
nailed this one. Here's another example:

I don't get it. I've been polite and reasonable -- and evidently
overlooked one fucking statement. For this crime, I've got a couple
people crawling up my ass with pitchforks and torches.

Have fun. You don't need my involvement to "justify" your prickish
behavior any longer.

I think it's come to the point where a forum is better than a mailing
list for general discussions about Ruby. I never would have said that
a few years ago, but I'm saying it now. In a mailing list, every
message gets beamed into your inbox, so some people think they have
the right to react negatively to ignorance/oversights/off-topic
chatter, etc. In a forum, you opt in to discussions, and have to
accept that it's a discussion, which typically involves repetition and
redundancy, as it's conducted by humans.

Different sections to a forum make great sense as well.

Maybe there already is a vibrant one out there that I don't know about.

Typed by clumsy thumbs on tiny keys.

···

On 24 Jan 2012, at 02:28 AM, Chad Perrin <code@apotheon.net> wrote:

On Tue, Jan 24, 2012 at 10:23:36AM +0900, Ryan Davis wrote:

It'd be nice if you read the thread before chipping in. Adam Prescott
nailed this one. Here's another example:

I don't get it. I've been polite and reasonable -- and evidently
overlooked one fucking statement.

*massively irrelevant reply…*

Yep, that Ryan Davis was a massive prick, there. You called it! I'm not even on the list any more: I'm just thumbing through this old thread I bumped into in my inbox and "Whazam!" there're two pricky posts by the same "Ryan Davis" author. Anyway, I'm going to go and eat soup. With my son. How cool is that!?

Have fun storming the castle!

I doubt that difference is a better idea "in many cases" and I don't
think the time example is even valid, though I could be wrong. Got
any other examples?

As for no silver bullet, the problem of comparing floats arises from
engineering and should be solved by engineering, not by whatever works
in a particular case. So while I don't _have_ a silver bullet, my gut
feeling says there _is_ one.

I'm sure in particular programs, near enough is good enough. If Bob's
program thinks 39.5 and 39.47 are close enough to be called "equal",
then that's a test Bob needs to implement himself, in both his program
and his tests. It has nothing whatsoever to do with "float equality"
in a standard unit testing sense.

···

2012/1/26 Bartosz Dziewoński <matma.rex@gmail.com>:

Generally, you should use ratio instead of difference when comparing
floats, but in many cases - such as the time one provided by Alex - a
difference is obviously a better idea.

There is no silver bullet, just use whatever works for you in a particular case.