As far as my experience goes, using Float#== is always an error. Floats should be compared to within an epsilon because of rounding errors, etc. etc.
This is something that I'd refer to as a code smell rather than an error. Floating point numbers can represent and operate on many values with no loss of precision. For a trivial example, 2.0 + 2.0 will always equal exactly 4.0, and any other result from that operation, no matter how small the difference, would itself be an error.
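To make that concrete, a quick irb-style illustration (my examples, not from the original post) of values that are and aren't exactly representable in binary floating point:

```ruby
2.0 + 2.0 == 4.0     # => true:  small integers are exact in binary
0.25 + 0.5 == 0.75   # => true:  sums of powers of two are exact
0.1 + 0.2 == 0.3     # => false: none of 0.1, 0.2, 0.3 has an exact binary form
```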
Floating point errors arise in specific cases: when a calculation exceeds the precision of the notation. These errors lead to problems with equality only when the two numbers being compared have been calculated in a series of steps which exceed that precision in different ways. For example:
2.0**53 == 2.0**53+1.0+(-1.0) # => false
2.0**53 == 2.0**53+(-1.0)+1.0 # => true
2.0**53 is the boundary case for exactly representable integers in double-precision floating point. You can subtract one and get the expected result, but adding one runs into absorption, where large_number + 1 == large_number.
But the question is: why keep Float#== if it is basically useless? Why not have Float#== be defined in the core by:
class Float; def ==(o); ((o - self).abs < 0.0000001); end; end
Interestingly, for small values of epsilon, this could magnify the problem, because subtracting two nearly-equal values is itself a problematic operation (catastrophic cancellation). In other words, the difference used for this comparison wouldn't always represent the actual difference between the numbers.
This seems at least as counter-intuitive to me as the current behaviour of ==. You could have two numbers which differ by less than epsilon in mathematical terms, but whose computed difference comes out greater than epsilon (or vice-versa). I'm not sure that this would be an actual improvement.
Also, you'll note that it fails to solve the example given above for any value of epsilon less than 1.0.
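A sketch of the point (my code; the EPSILON value and method names are arbitrary choices of mine): a fixed absolute epsilon can't serve both small and large magnitudes, whereas a tolerance scaled to the operands handles the 2.0**53 case. Relative tolerance has its own weak spot near zero, so it's an illustration, not a fix.

```ruby
EPSILON = 0.0000001

# The proposed redefinition: fixed absolute tolerance.
def approx_abs(a, b)
  (a - b).abs < EPSILON
end

# Alternative: tolerance scaled to the magnitude of the operands.
def approx_rel(a, b)
  (a - b).abs <= EPSILON * [a.abs, b.abs].max
end

big = 2.0**53
approx_abs(big, big + 1.0 + -1.0)  # => false: the difference is 1.0, far above EPSILON
approx_rel(big, big + 1.0 + -1.0)  # => true:  1.0 is tiny relative to 2**53
```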
As an example, some unit tests crashed today with this message:
1) Failure:
test_read_coords(TestScaffoldReorder) [./test/test_scaffold_reorder.rb:50]:
<{"9876"=>{"8153"=>[19.1, [63.7, 580.0]], "8154"=>[15.0, [1612.5]]}}> expected but was
<{"9876"=>{"8153"=>[19.1, [63.7, 580.0]], "8154"=>[15.0, [1612.5]]}}>.
It looks stupid, doesn't it? It failed because the computed floats were not exactly equal to the constants entered in the testing code, even though they display the same. I could drill down into the hashes and compare with assert_within_epsilon (or whatever it is), but that makes the code much uglier and more complicated than a simple assert_equal. Or wrap the structure in a class and give it a proper equality operator. Or...
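For what it's worth, the drilling-down can be factored into one small recursive helper. This is a sketch of mine (the name deep_in_delta? and the default delta are my inventions), returning a boolean so it can be dropped into a plain assert:

```ruby
# Recursively compare two nested structures (hashes, arrays, scalars),
# treating Floats as equal when they are within +delta+ of each other.
def deep_in_delta?(expected, actual, delta = 1e-6)
  case expected
  when Float
    actual.is_a?(Numeric) && (expected - actual).abs <= delta
  when Hash
    actual.is_a?(Hash) &&
      expected.keys.sort == actual.keys.sort &&
      expected.all? { |k, v| deep_in_delta?(v, actual[k], delta) }
  when Array
    actual.is_a?(Array) &&
      expected.size == actual.size &&
      expected.zip(actual).all? { |e, a| deep_in_delta?(e, a, delta) }
  else
    expected == actual
  end
end

expected = {"9876"=>{"8153"=>[19.1, [63.7, 580.0]], "8154"=>[15.0, [1612.5]]}}
computed = {"9876"=>{"8153"=>[19.1000000001, [63.7, 580.0]], "8154"=>[15.0, [1612.5]]}}
deep_in_delta?(expected, computed)  # => true
```

In the test it would read assert(deep_in_delta?(expected, computed)), keeping the comparison in one place instead of redefining Float#== globally.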
But all this seems too complicated for the quick program at hand. And redefining Float#== as shown above made the test pass with much less pain.
So any good reason to keep Float#== the way it is? Or is there any real danger of breaking existing libraries if I redefine Float#== this way?
As I mentioned before, Float#== isn't necessarily an error, it's just an indication that there may be errors - a code smell.
Another code smell is an overuse of literals and constants, and this is equally the case for testing code as for production code. This seems to me to be where the error really lies: you're using a constant in the testing code where you should be using a computed value based on the input.
Knowing (as we do) that floating point calculations are not 100% accurate, a more reliable approach is to compute the expected results with a calculation parallel to the code under test, rather than from a constant. In a trivial example:
def foo(a, b); a / b; end
# test setup
a = 22.0
b = 7.0
const_result = 3.14285714285714
calc_result = 22.0/7.0
foo(a, b) == const_result # => false
foo(a, b) == calc_result # => true
matthew smillie.
···
On Jul 6, 2006, at 16:39, Guillaume Marcais wrote: