Logical Test: Text Against Number

  • Thread starter Thread starter rslaughter5
  • Start date Start date
R

rslaughter5

Can someone provide some insight into why Excel will always return TRUE when,
for example, we have ="a">27 (or any other number) and FALSE when ="a"<27?
 
Text has a higher value than a number.

Try this...

Enter this data:

A1 = a
A2 = 1
A3 = b
A4 = 2
A5 = c

Sort A1:A5 in ascending order and you'll see how Excel orders the text as
being a higher value than the numbers.
 
ryguy7272 said:
Anything in quotes is going to be text,
and a small 'a' is equivalent to 97,
and 97>27...so...TRUE.

Then by your rationale, why does "a">255 also return TRUE?

I doubt that your rationale is even close to right.

BTW, how do you rationalize "12">37 returns TRUE, yet "12"-37>0 returns FALSE?

NOTE: I know that in the latter case, the numeric string is "converted" to
a number. The point of my question is: why isn't the numeric string
converted to a number in the comparison case?

It is really a rhetorical question. I doubt that there is a "good reason".
That is simply the way it is. Among UNIX programmers, there is a saying:
"don't ask why" :-).


----- original message -----
 
Teethless mama said:
any text is larger than largest number
eg. ="a">9.99999999999999E+307

I'm sure the OP knows that. I interpret the point of his/her posting to
mean: why did Excel choose to do it that way? Why isn't text always less
than a number? Why doesn't "12">37 give the same result as "12"-37>0?

For the latter, the answer is: numeric strings are "converted" to numbers
in numeric expression. The question is: why aren't numeric strings
converted to numbers in the first case as well? Or why isn't the number 37
treated as a string, and a string comparison performed?

One "good reason" for text to always be treated larger than any number is
that it facilitates the use of a formula to find the last cell of a column.

At least, that's how some people exploit Excel's choice of comparing text
and numbers. I'm not sure that's why Excel chose to do it that way.


----- original message -----
 
T. Valko said:
An over-simplified theory...
Excel stores numbers in the lower bits of a byte.

Of course! My point was (as I demonstrated), that theory does not hold
water.

And by the way, Excel does not store numbers in the lower bits of "a byte"
per se. As I'm sure you know, it stores numbers in a 64-bit floating-point
format.

The only way that Ryan could be rationalizing this is that for the "string"
comparison, Excel is converting the number to a string of bytes in some
manner.

But as I demonstrated with "a">255, that is just plain wrong.

Excel's behavior when comparing text and numbers is arbitrary. That should
not be misinterpreted to mean "wrong". It is simply a design decision that
needed to be made. There were at least 4 choices. They chose "door number
one". ;-)

Whether or not Excel designer has some "logical" reason for their choice,
I'm sure we will never know (unless someone can uncover a KB article that
explains their rationale). But I explained one possible explanation -- at
least one way that we exploit their choice, namely: it facilitates the use
of a formula to find the last value of a column.

(Which might be what Teethlessmama was trying to suggest.)


----- original message -----
 
any text is larger than largest number
I'm sure the OP knows that. I interpret the point of his/her posting to
mean: why did Excel choose to do it that way? Why isn't text always less
than a number? Why doesn't "12">37 give the same result as "12"-37>0?

For the latter, the answer is: numeric strings are "converted" to numbers
in numeric expression. The question is: why aren't numeric strings
converted to numbers in the first case as well? Or why isn't the number
37
treated as a string, and a string comparison performed?

The reason, I think, has to do with the underlying logic built into the
language by the programmer for handling comparison operators. As you
mentioned, for numerical expressions, the strings are converted to numbers
before the operation takes place. Why? Because you can't perform numerical
operations on strings, so the underlying logic tries to force all operands
to be numeric and raises an error if that can't be done. Now, comparisons
are a different story... you can check to see if "ABD" is greater than "ABC"
as well as if 123 is greater than 89. The underlying logic for comparisons
seems to be... if the two operands are numbers, then a numerical comparison
results; otherwise a string comparison takes place. This allows the
underlying code to skip over error checking for mis-matched operands (an
apparent programming choice made by the creators of Excel)... if one operand
is a string, then both are compared as strings. Why? There is really no way
for the program to "know" which type of comparison you really wanted if one
is a string representation of a number and the other is an actual number, so
it assumes for all cases, no matter what the operands are, if one of them is
a deliberately provided to the comparison as a string, then the Excel user
probably wanted to do a string comparison. If that is not what the
programmer wanted, then he/she has conversion functions that can used on the
operands to force a numerical comparison. This gives the Excel user the
maximum flexibility without having to deal with "extra" errors. Anyway, that
is how I make sense of it all.
 
While I know you know this, for the readers of this thread... string
comparisons are done character by character, left to right. So when
comparing "A">255, the actual comparison that Excel is performing is
"A">"2"... since that is True (the ASCII value of "A" is greater than the
ASCII value of "2"), no further characters need to be checked and a TRUE
result is returned from the comparison.
 
Rick Rothstein said:
While I know you know this, for the readers of this thread... string
comparisons are done character by character, left to right. So when
comparing "A">255, the actual comparison that Excel is performing
is "A">"2"...

Test your theory: then why does "1">3 return TRUE?

Answer: your theory is wrong. Text objects are always treated larger than
number objects. Period. As I have said repeatedly, there is no "logic" to
the choice. It is an arbitrary choice.


----- original message -----
 
Rick Rothstein said:
an apparent programming choice made by the creators of Excel

Which is all that I was saying.

This gives the Excel user the maximum flexibility without having to deal
with "extra" errors.

Excel is fickle about when it is and is not tolerant of "mismatches", as you
correctly describe it. Example: text in a range argument for SUMPRODUCT is
treated as a mismatch, resulting in a #VALUE error.

But IMHO, this particular choice (text is always larger than numbers) has
proved to be unfortunate, primarily because of common user error. How many
times in this forum have we had to explain (often only by inference) that
some comparison seems to work sometimes but not others because the cell
contains a numeric string instead of a number?

I am unwavering in my opinion that either Excel should treat text-number
comparisons as numeric expressions, resulting in a #VALUE error when the
text is not a numeric string, or Excel should treat it simply as a mismatch,
resulting in a #VALUE error all the time. Arguably, a "logical" third
choice is to treat the text as zero, as some Excel functions do. (I would
not like that choice in this context.)


----- original message -----
 
Hmm! You are right... there is no logic to it (a string is automatically
bigger than a number just doesn't make sense to me). What I posted is true
for the VB world... I made the mistaken assumption that it would apply over
in the worksheet world... and I was wrong!
 
Rick Rothstein said:
Hmm! You are right... there is no logic to it (a string is automatically
bigger than a number just doesn't make sense to me). What I posted is true
for the VB world... I made the mistaken assumption that it would apply over
in the worksheet world... and I was wrong!
....

Of course there's logic to it. You just may not appreciate the logic.

Excel comparisons are by TYPE first, then by value if the types on
both sides of the comparison operator are the same. For types, ERRORS
BOOLEAN > STRING > NUMERIC. Also note that Excel NEVER converts the
type of either operand in comparison operations, so "1" > 2 is TRUE
because "1" remains a STRING while 2 remains a NUMBER.

Within ERROR type, there's no order, i.e., Excel doesn't compare
different error types - the result of a comparison of different error
types is the error on the left side of the comparison operator.

Within BOOLEAN type, TRUE > FALSE.

Within STRING type, comparisons are made character by character USING
EXCEL'S CURRENT COLLATION SEQUENCE, *NOT* ASCII character codes.
That's why "Z" > "\" is TRUE while CODE("Z") > CODE("\") is FALSE, at
least for the standard US English collation sequence. This is also
different from VB[A].

Within NUMERIC type, comparisons work as expected given Excel's
limitations on representing real numbers.
 
Harlan Grove said:
Of course there's logic to it. You just may not appreciate the logic. [....]
ERRORS > BOOLEAN > STRING > NUMERIC.

Horsepucky! :-)

First, "ERROR > ..." is simply wrong. If ERROR > NUMERIC does not result in
TRUE. And if it did, then ERROR < NUMERIC would result in FALSE. Instead,
of course, both comparisons simply result in ERROR, neither TRUE nor FALSE.

(To give you the benefit of the doubt, perhaps you are using ">" to express
a precedence relationship. But this thread is about Excel's comparison
operator per se, not some similar-looking meta-notation.)

Second, you are simply regurgitating facts. "Logic" implies an explanation,
and in particular an explanation based on reasoning. "There are a
disproportionate number of black men in American prisons" is a statement of
fact. In response to "what's the logic behind that?" (or simply "why?"),
the answer is __not__ "because there is a higher percentage of black men in
prison than in the population". Well, that __is__ an answer; but it is not
what most people expect as a "logical" explanation. It is simply a
restatement of the fact. You didn't even give us that much.

BOOLEAN > STRING > NUMERIC is simply an arbitrary choice that Excel made.
(Actually, I suspect the choice was made by one of its predecessors in the
market.) There is no logic behind the choice. There is no "right" or
"wrong" choice based on reason.

(Well, I have offered a different choice that I think makes more sense (=
"logical") in some context. But given Excel's inconsistent treatment of
mixing boolean, text and numbers, it is hard for me say that it makes more
sense in Excel ;->.)


----- original message -----

Harlan Grove said:
Rick Rothstein said:
Hmm! You are right... there is no logic to it (a string is automatically
bigger than a number just doesn't make sense to me). What I posted is true
for the VB world... I made the mistaken assumption that it would apply
over
in the worksheet world... and I was wrong!
...

Of course there's logic to it. You just may not appreciate the logic.

Excel comparisons are by TYPE first, then by value if the types on
both sides of the comparison operator are the same. For types, ERRORS
BOOLEAN > STRING > NUMERIC. Also note that Excel NEVER converts the
type of either operand in comparison operations, so "1" > 2 is TRUE
because "1" remains a STRING while 2 remains a NUMBER.

Within ERROR type, there's no order, i.e., Excel doesn't compare
different error types - the result of a comparison of different error
types is the error on the left side of the comparison operator.

Within BOOLEAN type, TRUE > FALSE.

Within STRING type, comparisons are made character by character USING
EXCEL'S CURRENT COLLATION SEQUENCE, *NOT* ASCII character codes.
That's why "Z" > "\" is TRUE while CODE("Z") > CODE("\") is FALSE, at
least for the standard US English collation sequence. This is also
different from VB[A].

Within NUMERIC type, comparisons work as expected given Excel's
limitations on representing real numbers.
 
Back
Top