Can someone explain how an SD of 2.7 is
different from an SD of 4.8?
There is not much that can be said about an SD of 2.7 v. 4.8 per se.
The SD is a measure of how far the data deviate from the mean, taken
over the entire data set. A larger SD means the data is more widely
dispersed relative to the mean.
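For example, here is a minimal sketch in Python (with made-up numbers)
of two data sets with the same mean, where the one with the larger SD
is clearly more spread out:

    from statistics import mean, stdev

    tight = [9, 10, 10, 10, 11]   # values clustered near the mean
    loose = [2, 6, 10, 14, 18]    # values spread far from the mean

    print(mean(tight), stdev(tight))   # 10, SD about 0.71
    print(mean(loose), stdev(loose))   # 10, SD about 6.32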
But even though 4.8 is about 1.78 times as large as 2.7, that by
itself does not mean the difference matters. For example, if the mean
is 10000, some portion of the data lies within 10000+/-2.7 v.
10000+/-4.8 -- a negligible difference either way. On the other hand,
if the mean is 5, that same portion lies within 5+/-2.7 v. 5+/-4.8 --
a large difference relative to the mean.
So in general, it is useful to look at the __relative__ SD: SD
divided by the mean.
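That relative SD is also known as the coefficient of variation. A
minimal sketch in Python, using the hypothetical means and SDs from
the example above:

    for m, sd in [(10000, 2.7), (10000, 4.8), (5, 2.7), (5, 4.8)]:
        print(f"mean={m:>6}, SD={sd}: relative SD = {sd / m:.4f}")

    # mean= 10000, SD=2.7: relative SD = 0.0003
    # mean= 10000, SD=4.8: relative SD = 0.0005
    # mean=     5, SD=2.7: relative SD = 0.5400
    # mean=     5, SD=4.8: relative SD = 0.9600

Relative to the mean, the two SDs are nearly indistinguishable in the
first case and dramatically different in the second.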
I read that an SD of 3 includes 99% of the entries, but I do not get it.
You are not alone. Most people misuse such information without
understanding what they are talking about.
If you have a "normal" distribution, about 99.7% of the data is within
+/-3*SD of the mean. Note that that is 3*SD, not an SD of 3. Using
your numbers, 99.7% of the data would be within +/-8.1 or +/-14.4 of
the mean if the SD is 2.7 or 4.8 respectively.
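Here is a minimal sketch in Python that checks this on simulated
"normal" data (the mean of 10000 and SD of 2.7 are just the
hypothetical numbers used above):

    import random

    random.seed(1)
    mu, sd = 10000, 2.7
    data = [random.gauss(mu, sd) for _ in range(100_000)]

    within = sum(1 for x in data if abs(x - mu) <= 3 * sd)
    print(within / len(data))   # roughly 0.997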
But the operative words are "if you have a 'normal' distribution".
Here, "normal" does not mean "typical" or "not abnormal". A "normal"
distribution refers to a specific bell-shaped curve; data are
"normally" distributed when the frequencies of the data divided into
"buckets" -- a histogram -- approximately follow that curve.
So in order to derive specific conclusions about the dispersion of the
data -- that is, that x% of it is within +/-y times the SD of the mean
-- first you need to know whether you have a "normal" distribution.
However, that is not to say that the SD is undefined or useless for
other kinds of distributions. The SD still measures the typical
deviation from the mean (strictly speaking, the root-mean-square
deviation, not the simple average deviation). It is just that we can
say more about the data with certain distributions -- the "normal"
distribution being the most common one.
Having said that, there are certain kinds of data that we know
__tend to__ have a "normal" distribution. Therefore, sometimes we
assume a "normal" distribution even when the sample of data might not
be "normally" distributed. Nevertheless, it is prudent to look at a
histogram of the data sample to be sure that it is "somewhat" normally
distributed.
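A crude text histogram is often enough to eyeball this. A minimal
sketch in Python -- the simulated sample here is just a stand-in for
real data:

    import random
    from statistics import mean, stdev

    random.seed(2)
    sample = [random.gauss(50, 10) for _ in range(2000)]   # stand-in for real data

    m, s = mean(sample), stdev(sample)
    width = s / 2                      # buckets half an SD wide
    counts = {}
    for x in sample:
        bucket = int((x - m) // width)
        counts[bucket] = counts.get(bucket, 0) + 1

    for bucket in sorted(counts):
        lo = m + bucket * width
        print(f"{lo:7.1f} .. {lo + width:7.1f} | {'#' * (counts[bucket] // 20)}")

If the bars rise to a single peak near the mean and fall off roughly
symmetrically, the "normal" assumption is not unreasonable; a lopsided
or multi-peaked shape is a warning sign.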
For example, it has been proven mathematically (the Central Limit
Theorem) that if we take large-enough samples of data, the averages of
those samples are approximately "normally" distributed regardless of
how the underlying data is distributed. This is what allows
statisticians to attach a margin of error -- the familiar +/-x% -- to
their results.
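A minimal sketch of that idea in Python: the raw data below is
exponential (very skewed), yet the means of repeated samples cluster
symmetrically around the true mean, and the 3*SD rule roughly holds
for those means. The sample sizes are arbitrary choices:

    import random
    from statistics import mean, stdev

    random.seed(3)
    sample_means = [mean(random.expovariate(1.0) for _ in range(200))
                    for _ in range(5000)]

    m, s = mean(sample_means), stdev(sample_means)
    within = sum(1 for x in sample_means if abs(x - m) <= 3 * s)
    print(m, s, within / len(sample_means))   # mean near 1.0, roughly 99.7% within +/-3*SD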