Huge Floating Point Error

loftusoft · Nov 2, 2006

While debugging the following code:

#include <iostream>
using namespace std;
int main()
{
    float f = 583312.0F * 0.1F;
    cout << f << endl;
    return 0;
}

The debugger reports the value of "f" as "58331.199" instead of
"58331.2", although the console output correctly displays "58331.2".
This is not just a "debugger-display" problem, however: I found it
because the errant data is actually being written to Matlab data files.

I've tested this sample code on VC++ 6, 2002, and 2005. Both 2002 and
2005 exhibit this problem. Version 6 at least shows the correct result
while debugging.

Can this be "fixed" with some compile/link option? (Needed most for
2002)

Is there a patch available? A reasonable work-around?

This is so surprising I must be overlooking something very basic.

Mike

William DePalo [MVP VC++] · Nov 2, 2006

What you are overlooking is that floating point arithmetic is inexact.

Declare your variable as a double rather than a float and you get greater
precision.

Regards,
Will

Bruno van Dooren [MVP VC++] · Nov 2, 2006

There is no correct result. Your value is probably something like
58331.1999999, and cout rounds it to its default precision of six
significant digits.
Floating point math is different from integer math.

Google for 'what every programmer should know about floating point' and the
first n hits will explain exactly why you are having this problem, why it is
not a bug, and why there is nothing you can do about it.

--

Kind regards,
Bruno van Dooren

loftusoft · Nov 2, 2006

My mistake - hopefully this will make it more clear:

#include <iostream>
using namespace std;
int main(int argc, char **argv)
{
    float f1 = (583312.0F * 0.1F);
    cout << f1 << endl;
    float f2 = (583313.0F * 0.1F);
    cout << f2 << endl;
    float f3 = f2 - f1;
    cout << f3 << endl;
    return 0;
}

f3 is 0.101563 instead of "0.1". I can accept the normally minuscule
epsilon. However, this is unacceptable - thus the statement of "Huge"
in regard to the error.

Mike

Bruno van Dooren [MVP VC++] · Nov 2, 2006

Floats (4 bytes) have only about 7 digits of precision.
Subtracting 58331.2 from 58331.3 and ending up with 0.101563 is correct,
because the last part, '1563', starts from the 8th digit of precision.

Floats are simply too small for what you want to do. You should either use
doubles, or multiply your result by 10, truncate it to its integer part,
and then divide by 10 again.

Another symptom of this behaviour: if you have to add a collection of
floats, the order in which you add them determines the outcome.
Starting with the smallest values will usually yield the most precise
answer; adding them in other orders tends to give larger errors.

These things are implied by the floating point standard itself. VC6
yields other results because it does floating point stuff differently
from later compilers, but both are correct as far as floating point
behavior is concerned.

--

Kind regards,
Bruno van Dooren

loftusoft · Nov 2, 2006

I understand now. Thank you for your help.

Mike