Why does 28.08 show up as 28.080000000000002 in DataGridView

  • Thread starter Thread starter Daniel Manes
  • Start date Start date
Jon Skeet said:
How
you use something doesn't, to my mind, define what it *is*.

By that logic, a computer that hosts applications on a network shouldn't be
called a "server" and a computer that connects to it shouldn't be called a
"client", since they're all "computers" anyway and how you use them is
irrelevant to what they are? Some people might claim that how you use
something might be very closely tied to how we classify it.
32-bit integers I use in .NET will never actually have numbers above
1000 in them, but that doesn't mean that the type itself can't store
anything above 1000 - what it can and can't store is precisely defined
regardless of how I choose to use it.

However when you store 989 in it, you are certain to get back 989. Not 988
or 990 "approximations" of 989.
My concern is that float is being arbitrarily labeled as "approximate"
just because it can't store all values of one particular base, despite
the fact that decimal can't store all values of other bases.

Base 10 is what we use in SQL for floating point numbers, most likely
because no one has introduced a better system. Perhaps you should recommend
Base 3 to ANSI, or recommend that people work out their own binary
representations of floating point values and store them directly using the
BINARY data type?
Yes, base
10 is obviously the most widely used base, but it *is* just a base.
We're lucky that it's a multiple of 2 - otherwise exact float values
wouldn't be exactly representable in decimal...

I thought your argument was that Base was irrelevant? Now you have a bias
for Base 2? What about Base 3?
Any float value is exact in and of itself. If it's an approximation to
some other value it was originally converted from, so be it - that
doesn't, to my mind, make the type itself "approximate".

And in the case of 28.08, the float that is stored is *exactly* the wrong
number.
The value in the float is only approximate equal to the decimal value
which was originally converted, but in itself it is an exact number.
Given a floating point value, there is a well-defined, precise number
that value represents.

And in SQL we tend to concern ourselves with little things like data
integrity; i.e., am I going to get out what I put in? Or am I going to get
back something different?
Frankly, how many *decimals* a float stores is irrelevant to me. What
I'm interested in is whether the value stored in a float can be
regarded as "exact" or merely "approximate". Given a float
representing, say, 1.0101, that is an exact value.

The fact that you can't store many values that are coincident to the data
type exactly without data loss or introducing error (i.e., "approximation")
makes them "approximate".
 
Shuurai said:
So your argument is that they store SOME numbers exactly?

From his posts so far his argument seems to boil down to "all data types are
stored as bits which are either *exactly* on or *exactly* off".

Of course by that reasoning you could create a data type that completely
disregards anything you put in and stores something entirely different, but
it would be just as *exact* as any other data type. It would be interesting
(though admittedly pointless) to have a SURPRISE data type where you could
put in "28.08" and retrieve "A suffusion of yellow", "999999.9283746", "c^2
= a^2 + b^2", "3.141592", or "9i + 12"; depending on what the computer
decided to store in it at any given moment. But since bits are exactly
either on or off, this would be another "exact" data type.

Another one of those "It's technically accurate, but completely useless"
type things.
 
Erland Sommarskog said:
No, it is not arbitrary. It relates to what humans would like to be
able to store, but which is not always possible. The fact that the floating
number actually belongs to the set of decimal numbers is completely
rrelevant when it comes to usage.

In many cases, it's what humans want to store - but not always. What
about if I'm storing some data in the database when it's been computed
(or measured) using binary floating point to start with? At that point,
assuming you've got the right precision, you will lose no data at all.
 
Mike C# said:
Yet even with C# you can't always retrieve the *exact* value you originally
assigned. You retrieve an approximation of that value... You put 28.08
into the bucket and pull out 28.079999923706055, which is not the "right
answer".

You can retrieve the exact value which was actually assigned, but not
the "pre-conversion" value. That's only to be expected with a lossy
conversion.
I wouldn't choose to use the words "exact" or "approximate" in the
first place.

<Message Author="Jon Skeet [C# MVP]">
"No, by my definition pretty much all datatypes are exact."
</Message>

I'm still waiting to hear your definition, but only because you indicated
that you actually have one.

If I were forced into one, I'd say your definition of exact (which I
believe includes float) is pretty reasonable.
 
Shuurai said:
While making it more clear what can be expected as a result. Those of
us who know how it's working don't need the definition as much as
someone who's just looking to get the same number out that they put in.

Well, I don't think it makes it any clearer, but that's clearly a
matter of opinion.
And you'll note that 28.0 !=
28.0799999999999982946974341757595539093017578125

(I assume you meant 28.08 rather than 28.0?)

Yes, just as 28 != 28.08 in your integer example. Just as with integer
you are storing exactly 28 (i.e. the closest integer to 28.08 with
appropriate rounding behaviour), with float you're storing exactly the
closest float to 28.08 with the specified rounding behaviour.
Yes, it's an exact number. It's just not the exact number that we
stored.

Well, it's the exact number that you *stored*, just not the exact
number which was the source of the conversion.

My point is that the number stored in the float example is just as
exact as the number stored in the integer example. Both types have
rules for converting an arbitrary number to an exact value within the
set of representable values for that type. Where do you see the
difference that makes integer exact and float approximate?
So your argument is that they store SOME numbers exactly?

All types available only store *some* numbers exactly. Decimal doesn't
store all numbers exactly: "a third" is a number, but it can't be
stored exactly. Be careful to distinguish between "decimal numbers" and
"numbers".
Definitions need to have a balance between usefulness and accuracy.
Your definition, even if technically accurate (which I don't believe it
is) just isn't useful.

Unsurprisingly, I disagree. I don't expect either of us to persuade the
other.
Which is absolutely correct. The definition is used to describe what
can be expected of the data type. It doesn't address (or need to
address) anything that is going on in the background. The point of the
definition is to tell the user which types will store 28.08 as 28.08.

It's not clear to me what you're regarding as "absolutely correct". The
bit in quotes in my post was my suggested alternative to just saying
that float is an approximate type. If you agree that my text is
correct, and tells the user which type will store 28.08 as 28.08,
what's the disadvantage of using my text?
Not misleading; just not particularly useful.

What does it *not* address that would be useful?
 
Yet even with C# you can't always retrieve the *exact* value you
You can retrieve the exact value which was actually assigned,

BUT not the value that was assigned *by the user.* And that's the only
value *the user* tends to care about.

A
 
Mike C# said:
Jon Skeet said:
Frankly, I don't have the energy. I don't know whether you genuinely
don't believe it's possible or whether you really want me to waste my
time on it. There are plenty of descriptions of how floats are stored,
and it's far from difficult to reverse engineer that to work out what
*can* be stored.

Why would it possibly be worth my time writing it up?

<Message Author = "Jon Skeet [C# MVP]">
"Well, in that case I could (somewhat painstakingly) write a similar
description in terms of binary strings which are representable as floats."
</Message>
...
<Message Author = "Jon Skeet [C# MVP]">
"No, by my definition pretty much all datatypes are exact. "
</Message>
...
<Message Author = "Jon Skeet [C# MVP]">
"I'll do that (publish a complete list of valid values for the FLOAT and
REAL types) if you publish a complete list of valid values for
NUMERIC(38,20)."
</Message>

I'm only interested in your definitions and descriptions because you
indicate that you either have them on hand, or are able to produce them. If
you don't have any definitions and/or can't produce what you said you can,
then I'd say the whole issue is fairly well resolved.

Don't be absurd. It would take time to write down all the rules for
which binary strings are exactly representable as float due to
denormalisation etc, but I *hope* that from descriptions of the format
(eg in http://www.yoda.arachsys.com/csharp/floatingpoint.html) that
it's obvious that such a rule is *possible* to write down with effort.

If you accept that it's possible - that there is inherently such a
rule, even though it would take a while to write it out accurately -
then you should accept that float falls within your definition of an
exact type. What good would having the actual rule do?

Now, do you actually believe it's not possible to write down the rule
which says which values are allowed? (Heck, I've given you a program
which would generate them all given long enough - so the theoretical
rule could be "if it appears in this list"; the list clearly exists,
even if it takes a long time to produce.)
 
My point is that the number stored in the float example is just as
exact as the number stored in the integer example. Both types have
rules for converting an arbitrary number to an exact value within the
set of representable values for that type. Where do you see the
difference that makes integer exact and float approximate?

Maybe it's the surprise factor.

Maybe it's that we are used to being able to supply *less* than the number
of acceptable significant digits, and not having the values change on us
after the fact.

Maybe it's because integers are simpler, and the rounding of 28.08 -> 28 is
much more intuitive, clear, expected, and predictable. This is stuff we all
learned in grade school. I think it takes a much better understanding of
mathematics to draw the same conclusions from the approximation of 28.08 ->
28.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

A
 
Mike C# said:
By that logic, a computer that hosts applications on a network shouldn't be
called a "server" and a computer that connects to it shouldn't be called a
"client", since they're all "computers" anyway and how you use them is
irrelevant to what they are? Some people might claim that how you use
something might be very closely tied to how we classify it.

That's a fair point, although I'd argue there is a difference here when
you talk about a type *being* approximate.
However when you store 989 in it, you are certain to get back 989. Not 988
or 990 "approximations" of 989.

But if you store 989.1 in it, you won't get 989.1 back. No different to
float in that respect.
Base 10 is what we use in SQL for floating point numbers, most likely
because no one has introduced a better system. Perhaps you should recommend
Base 3 to ANSI, or recommend that people work out their own binary
representations of floating point values and store them directly using the
BINARY data type?

Well, communicating with SQL server from another program, if I were
storing or retrieving a float I almost certainly *wouldn't* use base 10
- I'd use base 2, as that's the "native" base of float within SQL
server and would almost certainly be the "native" base of my storage
format within the other program, too.

It's only SQL *literals* (and values of type decimal and numeric, of
course) which are explicitly base 10, isn't it?
I thought your argument was that Base was irrelevant? Now you have a bias
for Base 2? What about Base 3?

No, I don't have a bias for base 2 - I'm just saying that the fact that
we *can* represent all floats accurately as decimals comes from the
fact that 10 is divisible by 2. It's always nice when conversions *can*
be lossless.
And in the case of 28.08, the float that is stored is *exactly* the wrong
number.

Yes. Just as if I try to store 28.08 as an integer, I'll be storing
exactly the wrong number (28), and just as if I try to store 1.234567
in a numeric (2,3) I'll be storing exactly the wrong number.

As we've already agreed, you can only expect a type to store numbers
within a particular set of available values. That's true for *all* the
types involved.
And in SQL we tend to concern ourselves with little things like data
integrity; i.e., am I going to get out what I put in? Or am I going to get
back something different?

Which is why it's important for people to understand the difference
between float and decimal - but they can understand that without
believing that float doesn't store its values exactly. It just has a
different (but still well-defined) set of valid values.
The fact that you can't store many values that are coincident to the data
type exactly without data loss or introducing error (i.e., "approximation")
makes them "approximate".

No, you can store *all* values that are coincident to the data type.
You just can't store *any* values which *aren't* coincident to the data
type - which is the same for all other types as well.
 
Aaron Bertrand said:
Maybe it's the surprise factor.

Maybe it's that we are used to being able to supply *less* than the number
of acceptable significant digits, and not having the values change on us
after the fact.

FWIW, I've always thought that describing floating binary point types
as being accurate to a certain number of *decimal* digits is asking for
trouble.
Maybe it's because integers are simpler, and the rounding of 28.08 -> 28 is
much more intuitive, clear, expected, and predictable. This is stuff we all
learned in grade school. I think it takes a much better understanding of
mathematics to draw the same conclusions from the approximation of 28.08 ->
28.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Yes, I dare say it takes a bit more understanding if you want to know
exactly why it stores the values it does.

I would even have been *somewhat* happy with an original answer (when I
started this whole business) of "yes, we know that float is exact
really, but explaining it as approximate is easier than bringing
conversions into the conversation". I would still have argued against
that, *but* the fact is that everyone's been arguing that float really
*is* inherently approximate, and from a mathematical standpoint which
doesn't have a base 10 bias I just can't accept that as being true.
Just because a conversion takes a bit more understanding doesn't mean
the value stored isn't exact.

All the types we've talked about (integer, decimal, float) have the
following in common:

1) They have a well-defined set of values they can exactly represent.

2) If you ask them to store a value which isn't in that set, they will
store (exactly) an approximation to that value. The rules for that
approximation are also well-defined.

3) If you ask them to store a value which *is* in that set, they will
store that value exactly.
 
(I assume you meant 28.08 rather than 28.0?)
Yes.

Yes, just as 28 != 28.08 in your integer example. Just as with integer
you are storing exactly 28 (i.e. the closest integer to 28.08 with
appropriate rounding behaviour), with float you're storing exactly the
closest float to 28.08 with the specified rounding behaviour.

I don't want to store the closest anything to 28.08 -- I want to store
28.08. That is why DECIMAL is called an "exact" data type and FLOAT is
called an approximate data type. Listen to what you are saying above.
You spell it out yourself -- you're storing the *closest* thing to what
you intended to store.

As for the rest of your statement about INT, that is the intended
behavior. When you put 28.08 into an INT, you know it's going to be
converted to INT. You know that it's going to be exactly 28 when you
read it, not just "the closest thing" to 28.
Well, it's the exact number that you *stored*, just not the exact
number which was the source of the conversion.

You have to understand that we database types are kind of biased
towards knowing what we're storing and how to get it back. Part of
that is knowing that we've got to built the right container for what
we're storing.
My point is that the number stored in the float example is just as
exact as the number stored in the integer example. Both types have
rules for converting an arbitrary number to an exact value within the
set of representable values for that type. Where do you see the
difference that makes integer exact and float approximate?

The difference is that with an INT, I know that if I store a valid
number I'll get that exact same number back, not just the closest
number to it. And when I store 28.08 as an INT, I know that it's going
to be converted to the nearest integer - that's by design.
All types available only store *some* numbers exactly. Decimal doesn't
store all numbers exactly: "a third" is a number, but it can't be
stored exactly. Be careful to distinguish between "decimal numbers" and
"numbers".

DECIMAL will store all decimals within the range you specify, and will
store them exactly as entered. FLOAT will not. In other words, if I
define a column as DECIMAL(4,2) I know that when I store 28.08 it will
remain 28.08, and not just something close. I know that when I store
28.082 as DECIMAL(4,2) it will also be stored as 28.08 -- that is by
design.
Unsurprisingly, I disagree. I don't expect either of us to persuade the
other.

What good is an accurate definition if it isn't useful?
It's not clear to me what you're regarding as "absolutely correct". The
bit in quotes in my post was my suggested alternative to just saying
that float is an approximate type. If you agree that my text is
correct, and tells the user which type will store 28.08 as 28.08,
what's the disadvantage of using my text?

Again, read what you wrote... "Use the DECIMAL type instead to keep
the *exact* decimal value."
What does it *not* address that would be useful?

When the average user ought to use which data type.
 
I'm only interested in your definitions and descriptions because you
indicate that you either have them on hand, or are able to produce them. If
you don't have any definitions and/or can't produce what you said you can,
then I'd say the whole issue is fairly well resolved.

As I suspect we'll keep going for a very long time if I don't do this,
here's a first pass at a definition of what binary strings are exactly
representable in a float(53). There *may* be some off-by-one errors -
I've tried a "mock-up" on paper with a floating point type with a 2-bit
mantissa and a 3-bit exponent, so *hopefully* it's correct. There'd be
a similar definition for a 32-bit floating binary point value, but I at
least hope you will be satisfied with one specific type:

A binary number is representable in float(53) if either:

1) (Normal numbers) It has 53 or fewer significant digits, *and* it has
at most 1024 binary digits after the binary point when trailing
insignificant 0s after the binary point are discarded, and it has at
most 1024 binary digits before the binary point when leading
insignificant 0s are discarded.

or

2) (Subnormal numbers) It is of the form 0.xy where x is 1023 zeroes,
and y is up to 52 binary digits (any sequence is acceptable)

3) (Zero) It is 0.

I'm not certain of SQL Server's behaviour with regards to NaN and
infinity - we'd have to decide on whether they count as "numbers" in
the first place to work out whether or not to include them in the
rules. It's easy to do if you want to though.


Right, are you satisfied that float(53) is an exact type by your
definition then?
 
Daniel said:
Thanks, everyone, I think :) At least I can't complain that no one
responded to my question! Anyway, I definitely need to rethink when I
use float/double and when I use decimal. As for when to use numeric,
SQL Server docs says "numeric is functionally equivalent to decimal" so
I'm thinking...never?

I believe the ANSI standards makes a distinction between numeric and
decimal, although it's a bit of a hair-splitting thing. In SQL Server
numeric and decimal are equvivalent.

--
Erland Sommarskog, SQL Server MVP, (e-mail address removed)

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
 
Shuurai said:
I don't want to store the closest anything to 28.08 -- I want to store
28.08. That is why DECIMAL is called an "exact" data type and FLOAT is
called an approximate data type. Listen to what you are saying above.
You spell it out yourself -- you're storing the *closest* thing to what
you intended to store.

As for the rest of your statement about INT, that is the intended
behavior. When you put 28.08 into an INT, you know it's going to be
converted to INT.

If you put 28.08 into a FLOAT, you should jolly well be aware that it's
going to be converted to FLOAT too!
You know that it's going to be exactly 28 when you read it, not just
"the closest thing" to 28.

It's the closest thing to 28.08 though, and 28.08 is the number you
were trying to store, supposedly.

Really, other than awareness of how the conversion takes place (and
being able to spot by eye what value will actually be stored), what's
the difference here?
You have to understand that we database types are kind of biased
towards knowing what we're storing and how to get it back. Part of
that is knowing that we've got to built the right container for what
we're storing.

Which is why if you're trying to store exact decimal numbers, you
should indeed use the decimal type. I've never argued against that.
I've argued against the idea that float is approximate, just because
when it's asked to store something it can't store exactly, it stores
the closest thing to it (or whatever the rule for the type is - I'm
thinking of the different rounding mechanisms for int) - just as
decimal would if you could ask it to store "a third" and just as
integer does when you ask it to store 28.08.
The difference is that with an INT, I know that if I store a valid
number I'll get that exact same number back, not just the closest
number to it.

That's exactly the same with float though: your idea of "a valid
number" is presumably "an integer", i.e. one of the sets of exact
values that the type can hold. The same is true with float. "28.08"
isn't a valid int, and it isn't a valid float either.
And when I store 28.08 as an INT, I know that it's going
to be converted to the nearest integer - that's by design.

And so you should know that when you store 28.08 as a float, it's going
to be converted to the neareset float - that's by design too.
DECIMAL will store all decimals within the range you specify, and will
store them exactly as entered. FLOAT will not.

Indeed - the key thing being "all decimals". That's not the same as
"all numbers".
In other words, if I define a column as DECIMAL(4,2) I know that when
I store 28.08 it will remain 28.08, and not just something close. I
know that when I store 28.082 as DECIMAL(4,2) it will also be stored
as 28.08 -- that is by design.

Indeed, just as it's by design that when you ask FLOAT to store a value
which it can't cope with exactly, it will take the nearest value it
*can* cope with.
What good is an accurate definition if it isn't useful?

Perhaps I didn't make myself clear: I believe my definition is both
technically accurate *and* useful.
Again, read what you wrote... "Use the DECIMAL type instead to keep
the *exact* decimal value."

Yes - what's wrong with that? It gives practical advice based on the
previous two sentences. It doesn't go against anything I've said
anywhere in the thread.
When the average user ought to use which data type.

Okay, so we could expand slightly it to say when to use FLOAT. It
already says when to use DECIMAL (i.e. when you want to preserve an
exact decimal value).
 
Jon said:
In many cases, it's what humans want to store - but not always. What
about if I'm storing some data in the database when it's been computed
(or measured) using binary floating point to start with? At that point,
assuming you've got the right precision, you will lose no data at all.

That does not change anything. The float data type is designed to fit
a broad set of approxamite values. In the case of measurements of
physical phenomens, the measurements themselves are often approxamite.
If you are able to find a fringe case where you can actually use float
to represent something exact, that does not change anything.

As a parallel, consider pieces of paper with a number and some complex
pattern on it, and often with a small metal ribbon in it. We call such
paper "money". But in fact, it's perfectly possible to use that paper
to clean your rear parts when you have been to the loo. Nevertheless,
no one get the idea to refer to it as "toilet paper".



--
Erland Sommarskog, SQL Server MVP, (e-mail address removed)

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
 
Jon Skeet said:
FWIW, I've always thought that describing floating binary point types
as being accurate to a certain number of *decimal* digits is asking for
trouble.


Yes, I dare say it takes a bit more understanding if you want to know
exactly why it stores the values it does.

I would even have been *somewhat* happy with an original answer (when I
started this whole business) of "yes, we know that float is exact
really, but explaining it as approximate is easier than bringing
conversions into the conversation". I would still have argued against
that, *but* the fact is that everyone's been arguing that float really
*is* inherently approximate, and from a mathematical standpoint which
doesn't have a base 10 bias I just can't accept that as being true.
Just because a conversion takes a bit more understanding doesn't mean
the value stored isn't exact.

All the types we've talked about (integer, decimal, float) have the
following in common:

1) They have a well-defined set of values they can exactly represent.

2) If you ask them to store a value which isn't in that set, they will
store (exactly) an approximation to that value. The rules for that
approximation are also well-defined.

3) If you ask them to store a value which *is* in that set, they will
store that value exactly.

And the difference is: If you store a value in an INTEGER or DECIMAL that
is coincident with that type you do not risk introducing error or data loss.
With REAL or FLOAT you risk introducing error or losing data when storing
values that are coincident with the type.
 
Jon Skeet said:
You can retrieve the exact value which was actually assigned, but not
the "pre-conversion" value. That's only to be expected with a lossy
conversion.

LOFL. Now you want to tell me that when I type in "f = 28.08" in C# or "SET
@f = 28.08" that it is not an "assignment statement"? Are you serious?
Please do tell what you call it.
<Message Author="Jon Skeet [C# MVP]">
"No, by my definition pretty much all datatypes are exact."
</Message>

I'm still waiting to hear your definition, but only because you indicated
that you actually have one.

If I were forced into one, I'd say your definition of exact (which I
believe includes float) is pretty reasonable.

I have to say I'm pleasantly surprised that you feel my definition is
reasonable! Unfortunately by my definition not all data types are "exact".
It doesn't include FLOAT and REAL, as I'll show you here:

First we'll consider my definition of "exact" types, which you agree is
"pretty reasonable":

"Exact data types are those data types which can store and retrieve single
values coincident with their type without loss of data, and without the
introduction of error."

To further expand, let's consider a definition for "coincident". To
coincide with a data type, a singleton value should meet these requirements:

1) The value must be composed of digits/characters that are defined as
valid, in the proper format defined for the data type,
2) For numeric types, the scale (digits after the decimal point) of the
value needs to be less than or equal to the scale of the data type,
3) For numeric types, the difference between precision (total number of
digits) and scale of the value needs to be less than or equal to the
difference between the precision and scale of the data type,
4) For numeric types, the value must be greater than or equal to the lower
bound defined for the data type, and less than or equal to the upper bound
defined for the data type,
5) For non-numeric types, the length of the value must be less than or
equal to the length of the data type

I think we can agree that by this definition, the singleton value 28.08
coincides with the following data types:

* DECIMAL (4, 2) since it consists of valid characters and the precision and
scale are less than or equal to that defined for the data type,

* FLOAT since it consists of valid characters and the precision and scale
are less than that defined for the data type,

* REAL since it consists of valid characters and the precision and scale are
less than that defined for the data type

Use of the DECIMAL(4, 2) will not cause any loss of data or introduction of
error when storing values that coincide with the type, therefore by my
definition it is an "exact" type. Use of the FLOAT and REAL types can cause
lost data and introduction of error when attempting to store values that
coincide with the type, so by this definition they are not "exact".

Additionally by this definition 28.08 does not coincide with the following
data types:

* TINYINT, SMALLINT, INT, BIGINT since the scale of 2 is larger than that
defined for these data types (0)

* DECIMAL (3, 1) since the scale of 2 for the value is larger than the scale
of 1 defined for this type

Because 28.08 does not coincide with these data types, rounding errors and
data loss are to be expected and have no bearing on the "exactness" of the
data type.

Additionally, the result of the expression 1/3, by this definition, does not
coincide with any data types since it's scale is infinite and the scale for
all of these data types is finite. Therefore rounding errors and/or data
loss from trying to assign the result of this expression (or rather the
finite truncated/rounded result of this expression) to any of these data
types has no bearing on the "exactness" of the data type.
 
Jon Skeet said:
<Message Author = "Jon Skeet [C# MVP]">
"Well, in that case I could (somewhat painstakingly) write a similar
description in terms of binary strings which are representable as
floats."
</Message>
...
<Message Author = "Jon Skeet [C# MVP]">
"No, by my definition pretty much all datatypes are exact. "
</Message>
...
<Message Author = "Jon Skeet [C# MVP]">
"I'll do that (publish a complete list of valid values for the FLOAT
and
REAL types) if you publish a complete list of valid values for
NUMERIC(38,20)."
</Message>

I'm only interested in your definitions and descriptions because you
indicate that you either have them on hand, or are able to produce them.
If
you don't have any definitions and/or can't produce what you said you
can,
then I'd say the whole issue is fairly well resolved.

Don't be absurd. It would take time to write down all the rules for
which binary strings are exactly representable as float due to
denormalisation etc, but I *hope* that from descriptions of the format
(eg in http://www.yoda.arachsys.com/csharp/floatingpoint.html) that
it's obvious that such a rule is *possible* to write down with effort.

What I see as absurd is your promise to do exactly what you rail against
above when you never intended to do it; apparently under the assumption that
I either couldn't or wouldn't keep up my end of the bargain (which I did, by
the way). They have a saying down south - don't let your mouth write checks
your @$$ can't cash.
If you accept that it's possible - that there is inherently such a
rule, even though it would take a while to write it out accurately -
then you should accept that float falls within your definition of an
exact type. What good would having the actual rule do?

I'm sure that a stickler like yourself, with your demands for stringent
proofs and strict definitions, will fully understand that I'm under no
obligation to make any assumptions about what you can or cannot do, or what
is or is not possible at all. Just as you are not willing to take anyone
else's arguments at face value without proof, you shouldn't expect others to
take your word as Gospel.
Now, do you actually believe it's not possible to write down the rule
which says which values are allowed? (Heck, I've given you a program
which would generate them all given long enough - so the theoretical
rule could be "if it appears in this list"; the list clearly exists,
even if it takes a long time to produce.)

Like you, I have no reason to believe *anything* until I see it.
 
Jon Skeet said:
As I suspect we'll keep going for a very long time if I don't do this,
here's a first pass at a definition of what binary strings are exactly
representable in a float(53). There *may* be some off-by-one errors -
I've tried a "mock-up" on paper with a floating point type with a 2-bit
mantissa and a 3-bit exponent, so *hopefully* it's correct. There'd be
a similar definition for a 32-bit floating binary point value, but I at
least hope you will be satisfied with one specific type:

A binary number is representable in float(53) if either:

1) (Normal numbers) It has 53 or fewer significant digits, *and* it has
at most 1024 binary digits after the binary point when trailing
insignificant 0s after the binary point are discarded, and it has at
most 1024 binary digits before the binary point when leading
insignificant 0s are discarded.

or

2) (Subnormal numbers) It is of the form 0.xy where x is 1023 zeroes,
and y is up to 52 binary digits (any sequence is acceptable)

3) (Zero) It is 0.

I'm not certain of SQL Server's behaviour with regards to NaN and
infinity - we'd have to decide on whether they count as "numbers" in
the first place to work out whether or not to include them in the
rules. It's easy to do if you want to though.


Right, are you satisfied that float(53) is an exact type by your
definition then?

No, because it cannot store some values that coincide with the FLOAT type
without loss of data or introduction of error.
 
Jon Skeet said:
That's a fair point, although I'd argue there is a difference here when
you talk about a type *being* approximate.

LOL My first response included something about a knife being classified as a
"murder weapon", but decided to use the computer analogy instead. Figured I
might scare the locals :)
But if you store 989.1 in it, you won't get 989.1 back. No different to
float in that respect.

However, 989.1 does not coincide with the INTEGER data type since it's scale
is 1, and the INTEGER data type's scale is 0. 989.1 is well within the
precision and scale limitations of FLOAT, so it does coincide with that
type.
Well, communicating with SQL server from another program, if I were
storing or retrieving a float I almost certainly *wouldn't* use base 10
- I'd use base 2, as that's the "native" base of float within SQL
server and would almost certainly be the "native" base of my storage
format within the other program, too.

And when storing these numbers in Base-2, are you going to manually convert
them as well? Again what's the point of a FLOAT data type if you have to
create your own operators for it, and do your own conversions to and from
decimal?
It's only SQL *literals* (and values of type decimal and numeric, of
course) which are explicitly base 10, isn't it?

Yes, SQL *literals*, which just happen to be used to represent values of
type DECIMAL, NUMERIC, INT, FLOAT, REAL, etc.
No, I don't have a bias for base 2 - I'm just saying that the fact that
we *can* represent all floats accurately as decimals comes from the
fact that 10 is divisible by 2. It's always nice when conversions *can*
be lossless.

But you can't represent all floats accurately as decimals. Take, for
instance, 28.08...
Yes. Just as if I try to store 28.08 as an integer, I'll be storing
exactly the wrong number (28), and just as if I try to store 1.234567
in a numeric (2,3) I'll be storing exactly the wrong number.

Except that 28.08 has a larger scale (2) than the INTEGER data type (0), so
that assigning 28.08 to it has no bearing on the "exactness" of the data
type. NUMERIC(2, 3) is an invalid language construct, so you'll never be
able to create an column or variable of that type, much less assign a value
to it anyway. 28.08 and 1.234567 are both well within the precision and
scale limits for the FLOAT data type, but can FLOAT accurately store those
values or can it store only the exact wrong number?
As we've already agreed, you can only expect a type to store numbers
within a particular set of available values. That's true for *all* the
types involved.

You can expect exact types to store the set of all values that coincide with
the type. You can expect approximate types like FLOAT to not be able to
store some values from the set of values that coincide with the type.
Which is why it's important for people to understand the difference
between float and decimal - but they can understand that without
believing that float doesn't store its values exactly. It just has a
different (but still well-defined) set of valid values.

And it can only store approximations of many values that are coincident with
the type.
No, you can store *all* values that are coincident to the data type.
You just can't store *any* values which *aren't* coincident to the data
type - which is the same for all other types as well.

28.08 is coincident with the data type FLOAT, but it cannot be stored in a
FLOAT; only an "approximation" can be stored in it making the data type
"approximate".
 
Back
Top