Well, all those sentences appear to be true. I was hoping for a
clarification (or a retraction) of the original sentence, which appears
false: including a value (0 or otherwise) in a set of numbers to
average will always 'skew' the average from where it was without that value,
unless the value happens to equal the average of the original set (a
condition not covered by "If you have positive and negative values"). So
your original sentence seems to apply only when the set of + and -
numbers averages to 0; then you can add all the 0s you want and the
average won't be skewed.
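
To make the arithmetic concrete, here is a minimal Python sketch. The sample
numbers are made up for illustration (they are not from the original
question), and the helper name mean_with is hypothetical.

```python
from statistics import mean

# Hypothetical sample with positive and negative values; its mean is 2.
values = [4, -1, 3]
base = mean(values)

def mean_with(extra):
    """Mean of the sample after one extra value is appended."""
    return mean(values + [extra])

# Appending a 0 moves the mean, because 0 is not equal to the current mean.
shift_from_zero = mean_with(0) - base

# Appending a value equal to the current mean leaves the mean unchanged.
shift_from_mean = mean_with(base) - base

# Special case: a sample that already averages to 0 is unaffected by zeros.
balanced = [5, -5, 2, -2]
balanced_with_zeros = mean(balanced + [0, 0, 0])

print(shift_from_zero, shift_from_mean, balanced_with_zeros)
```

Only the last case, where the + and - values already average to 0, lets you
add zeros freely without moving the average.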
...
If the zero values are actually garbage rather than observed values, then they
shouldn't be included in the average. If the garbage values were -1 or +1, they
shouldn't be included either. However, if zero could be a legitimate observed
value as well as garbage, how would one decide which zeros to include and which
to exclude?
Unless the OP follows up to state what those zero values may be (noting that
blank cells are sometimes treated as zeros but are excluded from AVERAGE), I'm
going to assume they're legitimate observed values and, therefore, should be
included in any meaningful average, along with the positive and negative values.
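
As a rough illustration of why the blank-versus-zero distinction matters, here
is a small Python sketch. It assumes blanks are represented as None and mimics
only the one AVERAGE behavior mentioned above (empty cells in a range are
skipped, zeros are counted); the function name and the sample column are
hypothetical.

```python
def average_skipping_blanks(cells):
    """Mean of the numeric cells, ignoring blanks (None) but keeping zeros."""
    numbers = [c for c in cells if c is not None]
    return sum(numbers) / len(numbers)

# Hypothetical column: two blanks and one genuine zero.
cells = [3, None, 0, -1, None, 2]

# Blanks skipped, zero kept: (3 + 0 - 1 + 2) / 4
blanks_skipped = average_skipping_blanks(cells)

# Blanks wrongly treated as zeros: (3 + 0 + 0 - 1 + 0 + 2) / 6
blanks_as_zeros = sum(0 if c is None else c for c in cells) / len(cells)

print(blanks_skipped, blanks_as_zeros)
```

The two results differ, so it matters whether a given "zero" in the sheet is a
typed-in 0 or a blank that something upstream converted to 0.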
You seem to be assuming that the zero values are obviously not observed values.
If there were only zeros and positive values, you'd have a basis for your belief
(and since I've already provided a formula to average only the positive values,
it's not an unreasonable inference that I might also believe that). However, if
the sample values include both positive and negative values, why would you
believe that zero values wouldn't also be possible observed values? If they were
observed values, then excluding them while including both positive *AND*
negative values would bias the average.
In a nutshell, we're arguing about whether the zero values are more reasonably
considered garbage or legitimate observed values. From my perspective, if there
were no negative values, then assuming zeros were garbage isn't unreasonable;
but if there were positive *and* negative values, then it's unreasonable just to
assume that all zeros are garbage rather than observed values. I'm not arguing
to include values that clearly aren't part of the sample; rather, I'm arguing not
to exclude any values that fall between the highest and lowest sample values. In
other words, if the zeros are legit, excluding them biases the average. Unless a
slight bias allows a reduction in mean squared error (not obviously the case
here), bias is bad and should be avoided.
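
A quick way to see the size of that bias is the following Python sketch. It
assumes a made-up population in which -1, 0, 1 and 2 are equally likely
observed values; nothing here comes from the OP's data.

```python
from fractions import Fraction

# Hypothetical population: each value is an equally likely, legitimate observation.
population = [-1, 0, 1, 2]

# Mean when every observed value, including zero, is counted.
true_mean = Fraction(sum(population), len(population))

# Mean when zeros are discarded as if they were garbage.
nonzero = [v for v in population if v != 0]
mean_without_zeros = Fraction(sum(nonzero), len(nonzero))

# Systematic error introduced by throwing away the legitimate zeros.
bias = mean_without_zeros - true_mean

print(true_mean, mean_without_zeros, bias)
```

With these assumed values the zero-free average overstates the mean, and more
data would not shrink that gap, since the error comes from the exclusion rule
rather than from sampling noise.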