average, eliminating zero values

Chris · May 14, 2004

I am trying to calculate averages for rows and I don't
want the zero values contained within the rows to skew
the average. I have been trying the following formula
which I believed would work but does not:

=average(if(D9:AD9 <> 0,D9:AD9,""))

have any ideas?

thanks

Dave R. · May 14, 2004

That formula will work fine as long as you enter it with Ctrl+Shift+Enter
and not just Enter.

The ,"" isn't necessary, though:

=AVERAGE(IF(D9:AD9<>0,D9:AD9))

When entered correctly, it will look like this in the formula bar:
{=AVERAGE(IF(D9:AD9<>0,D9:AD9))}
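For readers who want to see what the array formula does step by step, here is a rough Python sketch. It assumes a made-up row standing in for D9:AD9, and the helper names excel_if_array and excel_average are hypothetical, not Excel functions. The IF with no third argument yields FALSE for cells that fail the test, and AVERAGE skips logical values inside an array, which is why the ,"" can be dropped.

```python
def excel_if_array(cells, test):
    """Element-wise IF with no value_if_false: failing cells become False."""
    return [v if test(v) else False for v in cells]


def excel_average(array):
    """Average only the numbers in an array, skipping logical values."""
    numbers = [
        v for v in array
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    return sum(numbers) / len(numbers)


row = [4, 0, 6, 0, 2]  # made-up stand-in for D9:AD9

masked = excel_if_array(row, lambda v: v != 0)
print(masked)                 # [4, False, 6, False, 2]
print(excel_average(masked))  # 4.0
print(sum(row) / len(row))    # 2.4, the plain average with zeros included
```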

Harlan Grove · May 14, 2004

If you want to average only positive values, try the array formula

=AVERAGE(IF(D9:AD9>0,D9:AD9))

which you need to enter by holding down the [Ctrl] and [Shift] keys before pressing
the [Enter] key. If you have positive and negative values, zero values don't
skew the average.
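As a small illustration of how the >0 and <>0 conditions differ once a row contains negative values, the following Python sketch uses an invented row; the numbers are assumptions chosen only to make the arithmetic easy to follow.

```python
from statistics import mean

row = [5, 0, -3, 0, 4]  # invented values: positives, negatives and zeros

all_values = mean(row)                    # every cell, zeros included
nonzero = mean(v for v in row if v != 0)  # mirrors the <>0 condition
positive = mean(v for v in row if v > 0)  # mirrors the >0 condition

print(all_values)  # 1.2
print(nonzero)     # 2
print(positive)    # 4.5
```

With negatives present, the >0 condition drops the negative cells as well as the zeros, so the three results all differ.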

Dave R. · May 14, 2004

Harlan Grove said:
[. . .] If you have positive and negative values, zero values don't
skew the average.

Can you explain that last sentence?

Harlan Grove · May 14, 2004

If there are positive and negative values in a set of observations to be averaged,
then it's possible the average will be zero. If the numbers come from a
continuous distribution, then it's necessarily the case that zero is also a
valid possible value. Excluding valid values from an average biases the result.

Dave R. · May 14, 2004

Well, all those sentences appear to be true. I was hoping for a
clarification (or a retraction) of the original sentence, which appears
false, since including a value (0 or otherwise) in a set of numbers to
average will always 'skew' the average from where it was without that value,
unless the value happens to equal the average of the original set (a
condition not covered by "If you have positive and negative values"). So
your original sentence seems to apply only when the set of positive
and negative numbers averages to 0; then you can add all the 0s you want
and the average won't be skewed.
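The arithmetic behind this point can be checked with a short Python sketch. The two sets below are invented for illustration, and mean and with_zeros are hypothetical helper names. Appending k zeros to n values with mean m gives a new mean of n*m/(n+k), which equals m only when m is 0.

```python
from fractions import Fraction


def mean(values):
    """Exact arithmetic mean as a Fraction."""
    return Fraction(sum(values), len(values))


def with_zeros(values, k):
    """Return a copy of values with k zeros appended."""
    return list(values) + [0] * k


nonzero_mean_set = [3, -1, 4]   # invented set, mean 2
zero_mean_set = [-5, 5, -2, 2]  # invented set, mean 0

for k in range(3):
    print(k,
          mean(with_zeros(nonzero_mean_set, k)),
          mean(with_zeros(zero_mean_set, k)))
# 0 2 0
# 1 3/2 0
# 2 6/5 0
```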

Harlan Grove · May 14, 2004

If the zero values are actually garbage rather than observed values, then they
shouldn't be included in the average. If the garbage values were -1 or +1, they
shouldn't be included either. However, if zero could be a legitimate observed
value as well as garbage, how would one decide which zeros to include and which
to exclude?

Unless the OP follows up to state what those zero values may be (noting that
blank cells are sometimes treated as zeros but are excluded from AVERAGE), I'm
going to assume they're legitimate observed values and, therefore, should be
included in any meaningful average, along with the positive and negative values.

You seem to be assuming that the zero values are obviously not observed values.
If there were only zeros and positive values, you'd have a basis for your belief
(and since I've already provided a formula to average only the positive values,
it's not an unreasonable inference that I might also believe that). However, if
the sample values include both positive and negative values, why would you
believe that zero values wouldn't also be possible observed values? If they were
observed values, then excluding them while including both positive *AND*
negative values would bias the average.

In a nutshell, we're arguing about whether the zero values are more reasonably
considered garbage or legitimate observed values. From my perspective, if there
were no negative values, then assuming zeros were garbage isn't unreasonable;
but if there were positive *and* negative values, then it's unreasonable just to
assume that all zeros are garbage rather than observed values. I'm not arguing
to include values that clearly aren't part of the sample, rather I'm arguing not
to exclude any values that fall between the highest and lowest sample values. In
other words, if the zeros are legit, excluding them biases the average. Unless
slight bias allows reduction of mean squared error (not obviously the case
here), bias is bad and should be avoided.
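A toy calculation shows the kind of bias meant here. It assumes, purely for illustration, that the observations are equally likely integer readings from -2 to 4, so that zero is a legitimate value; nothing in the thread says the OP's data look like this.

```python
from fractions import Fraction

# Assumed population: integer readings -2..4, each equally likely.
population = list(range(-2, 5))

true_mean = Fraction(sum(population), len(population))

# Drop the zeros, as the <>0 condition would.
kept = [v for v in population if v != 0]
mean_without_zeros = Fraction(sum(kept), len(kept))

bias = mean_without_zeros - true_mean

print(true_mean)           # 1
print(mean_without_zeros)  # 7/6
print(bias)                # 1/6
```

Under that assumption, excluding the zeros pushes the average away from the true mean; if the zeros were garbage instead, the same exclusion would be the right call.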
