Statistics 2.4-2.6 Quiz
What is the primary disadvantage of using the range to compare the variability of data sets?
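It is a rather insensitive measure of data variation: it depends only on the largest and smallest values, so two data sets can have the same range yet differ greatly in how the remaining values are spread.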
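A small sketch of this point, using two made-up data sets (the values and the helper name `data_range` are illustrative assumptions, not from the quiz). Both have the same range, but their standard deviations differ:

```python
from statistics import stdev

# Hypothetical data sets chosen to share the same minimum and maximum.
clustered = [1, 5, 5, 5, 5, 5, 9]   # most values sit at the center
spread_out = [1, 1, 1, 5, 9, 9, 9]  # most values sit at the extremes

def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

print("ranges:", data_range(clustered), data_range(spread_out))
print("standard deviations:", stdev(clustered), stdev(spread_out))
```

The range cannot tell the two sets apart, while the standard deviation can.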
Chebyshev's Rule
For any number k ≥ 1, at least 100(1 − 1/k²)% of the observations in any data set lie within k standard deviations of the mean. The bound is typically conservative: the actual percentages often considerably exceed the stated lower bound.
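As a quick check of the formula, the lower bound can be computed for a few values of k. This is a minimal sketch; the function name `chebyshev_lower_bound` is an assumption made for illustration:

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of observations within k standard deviations of the mean."""
    if k < 1:
        raise ValueError("Chebyshev's rule is stated for k >= 1")
    return 100 * (1 - 1 / k**2)

for k in (1, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1f}%")
```

For k = 1 the bound is 0%, so it says nothing useful. The rule becomes informative only for k greater than 1, for example at least 75% within 2 standard deviations.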
Explain how the relationship between the mean and median provides information about the symmetry or skewness of the data's distribution.
The mean is affected by extreme values, while the median is not. If the data set is skewed to the right, then the median is less than the mean. If the data set is symmetric, the mean equals the median. If the data set is skewed to the left, the mean is less than the median.
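The three cases can be illustrated with small made-up data sets (the values below are hypothetical and chosen only to show the direction of the comparison):

```python
from statistics import mean, median

# Hypothetical data sets, one per case.
right_skewed = [1, 2, 2, 3, 3, 4, 20]   # one large value pulls the mean up
symmetric = [1, 2, 3, 4, 5]
left_skewed = [-20, 4, 5, 5, 6, 6, 7]   # one small value pulls the mean down

for name, values in [("right-skewed", right_skewed),
                     ("symmetric", symmetric),
                     ("left-skewed", left_skewed)]:
    print(f"{name}: mean = {mean(values):.2f}, median = {median(values)}")
```

In each case the extreme value moves the mean but leaves the median where it was.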
Empirical Rule
The rule gives the approximate percentage of observations within 1 standard deviation (68%), 2 standard deviations (95%), and 3 standard deviations (99.7%) of the mean when the histogram is well approximated by a normal curve.
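One way to see these percentages is to simulate roughly normal data and count. This is only a sketch: the mean of 50, standard deviation of 10, sample size, and seed are arbitrary assumptions, and the observed shares will be close to, not exactly, 68%, 95%, and 99.7%.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the run is repeatable
data = [random.gauss(50, 10) for _ in range(10_000)]  # simulated, hypothetical data

center = mean(data)
spread = pstdev(data)

def share_within(k):
    """Fraction of observations within k standard deviations of the mean."""
    return sum(abs(x - center) <= k * spread for x in data) / len(data)

for k in (1, 2, 3):
    print(f"within {k} SD: {100 * share_within(k):.1f}%")
```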
Describe the sample variance using words rather than a formula. Do the same with the population variance.
The sample variance is the sum of the squared deviations from the mean divided by the number of measurements minus one. The population variance is the average of the squared distances of the measurements on all units in the population from the mean.
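The two descriptions translate directly into code. The sketch below uses a made-up data set and hypothetical function names, and compares the results with Python's `statistics.variance` (sample) and `statistics.pvariance` (population):

```python
import math
from statistics import mean, variance, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements

def sum_squared_deviations(values):
    """Sum of the squared deviations of each value from the mean."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values)

def sample_variance(values):
    """Sum of squared deviations divided by (number of measurements - 1)."""
    return sum_squared_deviations(values) / (len(values) - 1)

def population_variance(values):
    """Average of the squared deviations over all units in the population."""
    return sum_squared_deviations(values) / len(values)

print("sample variance:", sample_variance(data))
print("population variance:", population_variance(data))
print("match library:",
      math.isclose(sample_variance(data), variance(data)),
      math.isclose(population_variance(data), pvariance(data)))
```

The only difference between the two is the divisor: n − 1 for a sample, n for a population.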
Can the variance of a data set ever be negative? Explain. Can the variance ever be smaller than the standard deviation? Explain.
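The variance of a data set cannot be negative because it is a sum of squared deviations, which is never negative, divided by a positive value. The variance is smaller than the standard deviation when the variance is strictly between 0 and 1, since the square root of such a number is larger than the number itself.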
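A brief illustration with made-up data sets (all values are hypothetical): one with little spread, one with a lot, and one with no variation at all.

```python
from statistics import variance, stdev

# Hypothetical data sets.
small_spread = [0.1, 0.2, 0.3, 0.4, 0.5]  # variance below 1
large_spread = [10, 20, 30]               # variance above 1
constant = [5, 5, 5]                      # no variation

for name, values in [("small spread", small_spread),
                     ("large spread", large_spread),
                     ("constant", constant)]:
    print(f"{name}: variance = {variance(values):.3f}, "
          f"standard deviation = {stdev(values):.3f}")
```

With the small-spread data the variance is less than the standard deviation, with the large-spread data it is greater, and with constant data both are 0.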