Excel's COMBIN and integers

  • Thread starter Thread starter Henry
  • Start date Start date
H

Henry

In theory COMBIN works with integers in Excel (2003 and 2007).

So "=COMBIN(9,3)" gives 84 and so does "=COMBIN(9.7,3.6)".
Similarly "=COMBIN(9,3)-84" gives 0.

But "=COMBIN(9,3)-84-0" gives -1.42109E-14
and there are many similar examples suggesting that some sort of
rounding is involved.
The sign can even change so for example "=COMBIN(15,3)-455-0" gives
5.68434E-14

=IF(COMBIN(9,3)=84,"same","different") and
=IF(COMBIN(9,3)-84=0,"same","different") do not give identical
results.
Nor do =COMBIN(9,3)-67-17 and =COMBIN(9,3)-68-16.




It gets stranger:
 
Henry said:
Similarly "=COMBIN(9,3)-84" gives 0.
But "=COMBIN(9,3)-84-0" gives -1.42109E-14

Interesting find! What that tells us is that COMBIN(9,3) is not
returning an integer(!). Indeed, COMBIN(9,3) returns exactly
83.9999999999999,857891452847979962825775146484375 in IEEE 64-bit
floating point form.

Off-hand, I cannot think of any reason for the computational
inaccuracy in __this__ case. Even if COMBIN computes this the hard
way (which it shouldn't), FACT(9)/FACT(6)/6 and FACT(9)/6/FACT(6)
yield exactly 84, as they should since 362880/720 (9!/6!) and 362880/6
(9!/3!) are both integers, and all factors are well within the
computational limitations of 64-bit floating point arithmetic.

I can only guess that COMBIN uses some approximation formula, which
might be more accurate for larger factors that exceed the
computational limitations. I am not aware of any such approximation
formula.

FYI, given the fact that COMBIN(9,3) does not return an exact integer,
the reason why COMBIN(9,3)-84 is different from COMBIN(9,3)-84-0 is
because of the half-baked heuristic described (poorly) in KB 78113.
See the section "Example When the Value Reaches Zero" at
support.microsoft.com/kb/78113.

=IF(COMBIN(9,3)=84,"same","different") and
=IF(COMBIN(9,3)-84=0,"same","different")
do not give identical

That is a common side-effect of the heuristic, which is why I call it
half-baked.

The work-around, as you might realize, is to compute ROUND(COMBIN(9,3),
0).

But like you, I would never have expected that is necessary for such
small numbers.


----- original message -----
 
joeu2004 said:
Interesting find! What that tells us is that COMBIN(9,3) is not
returning an integer(!). Indeed, COMBIN(9,3) returns exactly
83.9999999999999,857891452847979962825775146484375 in IEEE 64-bit
floating point form.


There are several ways to reproduce such errors in Excel. E.g.
NORMSINV(NORMSDIST(0)). Rather than the zero that one would expect, one
gets -1.39213763529183E-16.
 
There are several ways to reproduce such errors in Excel. E.g.
NORMSINV(NORMSDIST(0)). Rather than the zero that one would expect, one
gets -1.39213763529183E-16.

Or even NORMSINV(1/2), But you don't usually expect NORMSINV to return
an integer for other values. You do expect COMBIN to do so.
 
Dana DeLouis said:
It does seem strange...
?2^-46
1.4210854715202E-14

What's your point?

Yes, COMBIN(9,3) is 84 - 2^-46 (according to Excel). 2^-46 represents the
least-significant bit of the 64-bit floating-point representation of 84. So
COMBIN(9,3) differs from what we expect by only 1 bit, the LSB.

What's "strange" about that?

If COMBIN(n,k) were implemented as FACT(n)/FACT(n-k)/FACT(k), then
COMBIN(23,2) would be 253 + 2^-45, where 2^-45 represents the LSB.

So, yes, the non-integer result can be off by just 1 bit, the LSB, due to
the floating-point algorithm.

Of course, that shouldn't be the case for COMBIN(9,3), as I explained
previously.

And ironically, Excel's COMBIN(23,2) returns the exact integer 253. Now
__that__ I find "strange", given the result of COMBIN(9,3).


----- original message -----
 
Henry said:
Or even NORMSINV(1/2), But you don't usually expect NORMSINV
to return an integer for other values.

Or just about any expression involving numbers with decimal fractions.

(Although I would like an implementation of functions like NORMSINV to
recognize special cases and side-step the usual approximation method when an
exact value can be returned.)

you don't usually expect NORMSINV to return an integer
for other values. You do expect COMBIN to do so.

Yes, with the understanding that the ability to return an exact and accurate
integer depends, in part, on the magnitude of the values involved.

With the standard way that Excel represents numbers internally -- IEEE 754
64-bit floating point -- generally we can only be assured of representing
integers up to 15 digits accurately. (There are exceptions, as demonstrated
below.)

But the degree to which that impacts the result of COMBIN depends on its
internal implementation.

If COMBIN(n,k) were implemented internally effectively like
FACT(n)/FACT(n-k)/FACT(k), we might not expect exact and accurate integer
results for n>17 because FACT(18) exceeds 15 digits.

(The operative word is "expect". It depends on other computational factors,
as well. As it turns out, FACT(n) is accurate for n<=20.)

However, a "good" implementation of COMBIN(n,k) might be implemented as
PRODUCT[x, for x=MAX(k,n-k)+1,...,n]/FACT(MIN(k,n-k)). That would greatly
extend the range of n for which we can expect exact integer results within
the computational limitations of 64-bit floating-point arithmetic.

Empirically, incrementing n starting from 1 and for k=1 to n, with the first
(factorial) method implemented in VBA, n=23 and k=2 is the first combination
that should result in a non-integer result. With the second ("good")
method, n=31 and k=14 is the first combination that should result in a
non-integer result.

(I am only checking to see if the result is an integer. I did not vet the
correctness of the integer results.)

PS: This is a digression. None of the above explains why COMBIN(9,3)
returns a non-integer. As I noted previously, even FACT(9)/FACT(6)/FACT(3)
is well within the computational limitations of 64-bit floating-point
arithmetic.


----- original message -----

There are several ways to reproduce such errors in Excel. E.g.
NORMSINV(NORMSDIST(0)). Rather than the zero that one would expect, one
gets -1.39213763529183E-16.

Or even NORMSINV(1/2), But you don't usually expect NORMSINV to return
an integer for other values. You do expect COMBIN to do so.
 
I said:
As it turns out, FACT(n) is accurate for n<=20.

Actually, FACT(n) is accurate for n<=22. But the displayed value will
appear inaccurate due to Excel's rounding after 15 significant digits.


----- original message -----

Joe User said:
Henry said:
Or even NORMSINV(1/2), But you don't usually expect NORMSINV
to return an integer for other values.

Or just about any expression involving numbers with decimal fractions.

(Although I would like an implementation of functions like NORMSINV to
recognize special cases and side-step the usual approximation method when
an
exact value can be returned.)

you don't usually expect NORMSINV to return an integer
for other values. You do expect COMBIN to do so.

Yes, with the understanding that the ability to return an exact and
accurate
integer depends, in part, on the magnitude of the values involved.

With the standard way that Excel represents numbers internally -- IEEE 754
64-bit floating point -- generally we can only be assured of representing
integers up to 15 digits accurately. (There are exceptions, as
demonstrated
below.)

But the degree to which that impacts the result of COMBIN depends on its
internal implementation.

If COMBIN(n,k) were implemented internally effectively like
FACT(n)/FACT(n-k)/FACT(k), we might not expect exact and accurate integer
results for n>17 because FACT(18) exceeds 15 digits.

(The operative word is "expect". It depends on other computational
factors,
as well. As it turns out, FACT(n) is accurate for n<=20.)

However, a "good" implementation of COMBIN(n,k) might be implemented as
PRODUCT[x, for x=MAX(k,n-k)+1,...,n]/FACT(MIN(k,n-k)). That would greatly
extend the range of n for which we can expect exact integer results within
the computational limitations of 64-bit floating-point arithmetic.

Empirically, incrementing n starting from 1 and for k=1 to n, with the
first
(factorial) method implemented in VBA, n=23 and k=2 is the first
combination
that should result in a non-integer result. With the second ("good")
method, n=31 and k=14 is the first combination that should result in a
non-integer result.

(I am only checking to see if the result is an integer. I did not vet the
correctness of the integer results.)

PS: This is a digression. None of the above explains why COMBIN(9,3)
returns a non-integer. As I noted previously, even
FACT(9)/FACT(6)/FACT(3)
is well within the computational limitations of 64-bit floating-point
arithmetic.


----- original message -----

There are several ways to reproduce such errors in Excel. E.g.
NORMSINV(NORMSDIST(0)). Rather than the zero that one would expect, one
gets -1.39213763529183E-16.

Or even NORMSINV(1/2), But you don't usually expect NORMSINV to return
an integer for other values. You do expect COMBIN to do so.
 
Henry said:
In theory COMBIN works with integers in Excel (2003 and 2007).

So "=COMBIN(9,3)" gives 84 and so does "=COMBIN(9.7,3.6)".
Similarly "=COMBIN(9,3)-84" gives 0.

But "=COMBIN(9,3)-84-0" gives -1.42109E-14
and there are many similar examples suggesting that some sort of
rounding is involved.
The sign can even change so for example "=COMBIN(15,3)-455-0" gives
5.68434E-14

=IF(COMBIN(9,3)=84,"same","different") and
=IF(COMBIN(9,3)-84=0,"same","different") do not give identical
results.
Nor do =COMBIN(9,3)-67-17 and =COMBIN(9,3)-68-16.




It gets stranger:

As you have noted,
"=COMBIN(9,3)" gives 84
"=COMBIN(9,3)-84" gives 0
but
"=COMBIN(9,3)-84-0" gives -1.42109E-14
yet if I reverse "-84-0" to be "-0-84" I get the correct answer
"=COMBIN(9,3)-0-84" gives 0
Why is this so?
I use Excel 2003, Windows Vista Ultimate 32 bit
Brian (a novice)
 
someone said:
"=COMBIN(9,3)-84-0" gives -1.42109E-14
yet if I reverse "-84-0" to be "-0-84" I get the
correct answer
"=COMBIN(9,3)-0-84" gives 0
Why is this so?

I already answered that question in this thread -- well, as well as it can
be, I think.

It is due to the half-baked heuristic described (poorly) in KB 78113. See
the section "Example When the Value Reaches Zero" at
support.microsoft.com/kb/78113.

Your observation is a common side-effect of the heuristic, which is why I
call it
half-baked.

In a nutshell, the heuristic applies only if the last operation is
subtraction (or addition of operands with opposite signs). The exact nature
of the heuristic is difficult to infer. It is even more difficult to
explain, especially to anyone who is not familiar with computer arithmetic
at the bit level. For grins, see my response at
http://groups.google.com/group/microsoft.public.excel.worksheet.functions/msg/68caa51ae5025678 .
As I reread that today, I am not sure even that is a complete explanation.

Suffice it to say that the heuristic is intended to hide the small
aberrations that arise in IEEE 754 64-bit floating-point arithmetic. But it
is implemented so poorly that it creates more seemingly inexplicable
anomalies.

For example, =COMBIN(9,3)-84 results in exactly zero, but =(COMBIN(9,3)-84)
does not.

There simply is no good reason for such differences. It is simply a poor
implementation of a well-intentioned, albeit dubious heuristic.


----- original message -----
 
But "=COMBIN(9,3)-84-0" gives -1.42109E-14

Hi. Just to mention.
This behavior appears in some math programs as well.
Therefore, it would appear now that Excel is not really that far off.

At common low values, Excel and others agree (ie 9 & 3)

Binomial[9., 3] - 84

-1.42109*10^-14

So, I guess Excel is ok.

What I thought was strange occurs at higher values.
At some combinations, Excel was more exact, and math programs were off.
At other numbers, Math programs were more exact, and Excel was off.

Here is one that was flagged near the upper limit of machine precision.

=Combin(53, 21) - 0 - 317986441828055
=Combin(53, 21) - 317986441828055 - 0

Returns:
0
0.125

Math...

Binomial[53., 21] - 317986441828055
2.625

Hmmm. Excel was off by 1/8, and the Math program was off by 2+5/8.

Just for geewiz, I believe their documentation applies to
Arbitrary-precision... (and not to machine-precision above)

"...Binomial and related functions use a divide-and-conquer algorithm to
balance the number of digits in subproducts."
= = = = =
Interesting...
Dana DeLouis
 
Back
Top