Standard Deviation Question

  • Thread starter: Skip

Skip

I cannot figure out SD.
I have two sets of numbers (both with 50 entries).

Set 1: the average is 8.2 and the SD is 3.98.
Set 2: the average is 8.7 and the SD is 4.89.
What does this information tell you about the two sets?
Thanks
 
It can give you an indication of how much the numbers in the set vary.

In set 1, about 2/3 of the numbers fall between 4.2 and 12.2
In set 2, about 2/3 of the numbers fall between 3.8 and 13.6
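
The two ranges above are just the average minus one SD and the average plus one SD. A minimal sketch of that arithmetic in Python follows; the function name `one_sd_interval` is made up for illustration. Note that it only computes the endpoints. Whether about 2/3 of the numbers really fall inside depends on the shape of the data.

```python
# Summary statistics quoted in the question: (average, SD).
sets = {"Set 1": (8.2, 3.98), "Set 2": (8.7, 4.89)}


def one_sd_interval(mean, sd):
    """Return (mean - sd, mean + sd), rounded to two decimals."""
    return (round(mean - sd, 2), round(mean + sd, 2))


for name, (mean, sd) in sets.items():
    low, high = one_sd_interval(mean, sd)
    print(f"{name}: {low} to {high}")
# Set 1: 4.22 to 12.18
# Set 2: 3.81 to 13.59
```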

Set 1 seems to be a little more consistent.
(high school or college class?)
--
Jim Cone
Portland, Oregon USA
http://www.mediafire.com/PrimitiveSoftware

 
It can give you an indication of how much the numbers in the set vary.
In set 1, about 2/3 of the numbers fall between 4.2 and 12.2
In set 2, about 2/3 of the numbers fall between 3.8 and 13.6

That is a typical misunderstanding, or jump to a conclusion, that many
people make about the standard deviation.

What you wrote would be correct (almost) __only_if__ both sets of data
(or the population from which sample data was taken) are "normally
distributed".

Skip said nothing about that; and it is incorrect to jump to that
conclusion.

Set 1 seems to be a little more consistent.

That much is correct, given the similar magnitude of the two
averages.

But if the average for set 2 were, say, 87, set 2 would be "more
consistent" (less widely dispersed) despite the larger SD.
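
One common way to put numbers on "dispersion relative to the average" is the ratio SD / mean, often called the coefficient of variation. That name and the helper function below are my additions, not something from the original question, and the ratio is only meaningful when the data are positive and measured from a true zero. A rough sketch:

```python
def coefficient_of_variation(mean, sd):
    """Return SD as a fraction of the mean (assumes mean > 0)."""
    return sd / mean


cv_set1 = coefficient_of_variation(8.2, 3.98)
cv_set2 = coefficient_of_variation(8.7, 4.89)
# Hypothetical case from the text: same SD as set 2, but an average of 87.
cv_hypothetical = coefficient_of_variation(87, 4.89)

print(round(cv_set1, 3))          # 0.485
print(round(cv_set2, 3))          # 0.562
print(round(cv_hypothetical, 3))  # 0.056
```

With the actual averages, set 1 has the smaller ratio. With the hypothetical average of 87, set 2 would have by far the smaller ratio despite its larger SD.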
 
I cannot figure out SD.
I have two sets of numbers (both with 50 entries).
Set 1: the average is 8.2 and the SD is 3.98.
Set 2: the average is 8.7 and the SD is 4.89.
What does this information tell you about the two sets?

See my more complete response to your earlier question in the thread
at http://groups.google.com/group/micr...functions/browse_frm/thread/08856119bd7ba24a#.

SD is a measure of the typical difference between the data and their
mean; more precisely, it is the square root of the average squared
difference from the mean.

Given the similar magnitude of the means of the two sets, the larger
SD suggests that the data in set 2 are more widely dispersed -- or at
least further from the mean.

The reason that I equivocate is: in set 1, half the data could be
4.22 and half the data could be 12.18; and in set 2, half the data
could be 3.81 and half the data could be 13.59. Is either really
"more widely dispersed"?

Rhetorical question; the answer depends on your definition of "widely
dispersed".

In both sets, the data are hypothetically organized into two
clusters. The only difference is: the clusters in set 2 are farther
apart than the clusters in set 1.
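
The two-cluster case is easy to check. Here is a small sketch using Python's standard `statistics` module; the lists and the helper `fraction_within` are constructed purely for illustration. It uses the population SD (`pstdev`), which is what reproduces 3.98 and 4.89 exactly; the sample SD (`stdev`) of the first list comes out slightly larger, about 4.02.

```python
from statistics import mean, pstdev

# Hypothetical data: 50 entries each, split evenly between two values.
set1 = [4.22] * 25 + [12.18] * 25
set2 = [3.81] * 25 + [13.59] * 25


def fraction_within(data, k):
    """Fraction of values strictly closer than k population SDs to the mean."""
    m, s = mean(data), pstdev(data)
    return sum(abs(x - m) < k * s for x in data) / len(data)


for name, data in (("Set 1", set1), ("Set 2", set2)):
    print(name, round(mean(data), 2), round(pstdev(data), 2))
# Set 1 8.2 3.98
# Set 2 8.7 4.89

# Every value sits one SD from the mean, so nothing lies inside 0.9 SD
# and everything lies inside 1.1 SD.
print(fraction_within(set1, 0.9), fraction_within(set1, 1.1))  # 0.0 1.0
```

So these data match the stated averages and SDs, yet none of the numbers are anywhere near the average.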

In any case, the point is: the average and SD tell us very little out
of context.

They would tell us more if we knew (or could reasonably assume) that
the data -- or the population from which the sample data were randomly
selected -- are "normally distributed".

Here, "normal" does not mean "typical" or "not unusual". It is a
technical term that describes a particular organization of the data --
the so-called "bell curve". See http://en.wikipedia.org/wiki/Normal_distribution.
 
I cannot figure out SD.
I have two sets of numbers (both with 50 entries).

Set 1: the average is 8.2 and the SD is 3.98.
Set 2: the average is 8.7 and the SD is 4.89.
What does this information tell you about the two sets?

The first set averages a little less than the second and has a little
less variability. If these were two quizzes given to the same
students, for example, the first quiz would be a little harder and
there would be less difference between good and bad performers.

Without knowing more about the data, that's about all you can say.
It is *not* correct to say that 2/3 of the numbers are between 4.2
and 12.2 in set 1, because that rule applies only to the normal
distribution (bell curve), and you haven't been told it's a normal
distribution.
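
To see how much the "2/3 within one SD" figure depends on the shape of the data, here is a rough simulation sketch. The three distributions, their parameters, the sample size, and the seed are all arbitrary choices of mine for illustration; nothing here says what the original data look like.

```python
import random
from statistics import mean, pstdev


def fraction_within_one_sd(data):
    """Fraction of values no farther than one population SD from the mean."""
    m, s = mean(data), pstdev(data)
    return sum(abs(x - m) <= s for x in data) / len(data)


rng = random.Random(1)  # arbitrary fixed seed so the run is repeatable
n = 20000               # large sample, so the fractions are fairly stable

normal_data = [rng.gauss(8.2, 3.98) for _ in range(n)]      # bell curve
uniform_data = [rng.uniform(0.0, 16.4) for _ in range(n)]   # flat
skewed_data = [rng.expovariate(1 / 8.2) for _ in range(n)]  # long right tail

for name, data in (("normal", normal_data),
                   ("uniform", uniform_data),
                   ("exponential", skewed_data)):
    print(name, round(fraction_within_one_sd(data), 3))
```

In theory the fractions are about 0.68 for the normal, about 0.58 for the uniform, and about 0.86 for the exponential distribution, so the simulated values should land near those figures.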
 
It can give you an indication of how much the numbers in the set vary.

In set 1, about 2/3 of the numbers fall between 4.2 and 12.2
In set 2, about 2/3 of the numbers fall between 3.8 and 13.6

"Objection, Your Honor! Assuming facts not in evidence!"

The empirical rule or 68-95-99.7 rule (68% of data, about 2/3, are
within one s.d. of the mean) applies only to normal distributions,
and the OP didn't tell us it was a normal distribution. Even if it
was drawn from a normal distribution, a measly 50 data points are
going to have a lot of bumpiness.
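
The bumpiness of small samples can be illustrated with a quick simulation. This is only a sketch: the mean and SD are borrowed from set 1, while the seed and the number of repetitions are arbitrary. It draws many samples of 50 from a true normal distribution and records what fraction of each sample lies within one sample SD of the sample mean.

```python
import random
from statistics import mean, stdev


def fraction_within_one_sd(data):
    """Fraction of values no farther than one sample SD from the sample mean."""
    m, s = mean(data), stdev(data)
    return sum(abs(x - m) <= s for x in data) / len(data)


rng = random.Random(0)  # arbitrary fixed seed so the run is repeatable
fractions = []
for _ in range(1000):
    # 50 draws from a normal distribution with set 1's mean and SD.
    sample = [rng.gauss(8.2, 3.98) for _ in range(50)]
    fractions.append(fraction_within_one_sd(sample))

print("smallest:", min(fractions))
print("average: ", round(mean(fractions), 3))
print("largest: ", max(fractions))
```

The average across samples should come out near 68%, but individual samples of 50 can stray noticeably in either direction.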

Set 1 seems to be a little more consistent.

This is correct. If the two were two quizzes given to the same
students, then the second one, having more variability, does a better
job of differentiating between students who know the subject well and
those who don't.
 