3-2
Round-off Rule for the Coefficient of Variation
Round the coefficient of variation to one decimal place (such as 25.3%).
Notation Summary
s = sample standard deviation; s² = sample variance; σ = population standard deviation; σ² = population variance
Notation
s = sample standard deviation; σ = population standard deviation
Standard Deviation of a Population
A different formula is used to calculate the standard deviation σ of a population: instead of dividing by n − 1 as for a sample, we divide by the population size N, so that σ = √( Σ(x − µ)² / N ).
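To see the effect of the two divisors, here is a minimal sketch using Python's standard `statistics` module, where `pstdev` divides by N and `stdev` divides by n − 1. The list `values` is hypothetical data invented for illustration.

```python
import statistics

# Hypothetical data, treated first as a whole population and then as a sample.
values = [2, 4, 4, 4, 5, 5, 7, 9]

sigma = statistics.pstdev(values)  # population formula: divide by N
s = statistics.stdev(values)       # sample formula: divide by n - 1

print(f"population sigma = {sigma:.3f}")  # 2.000
print(f"sample s         = {s:.3f}")      # 2.138
```

For the same numbers, s is slightly larger than σ because it divides the same sum of squared deviations by a smaller number.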
Range
The range of a set of data values is the difference between the maximum data value and the minimum data value: Range = (maximum data value) − (minimum data value).
Chebyshev's Theorem
The proportion of any set of data lying within K standard deviations of the mean is always at least 1 − 1/K², where K is any number greater than 1. For K = 2 and K = 3, we get the following statements:
- At least 3/4 (or 75%) of all values lie within 2 standard deviations of the mean.
- At least 8/9 (or 89%) of all values lie within 3 standard deviations of the mean.
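The bound 1 − 1/K² can be evaluated directly. The sketch below is only an illustration; the function name `chebyshev_lower_bound` is made up for this example.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k**2

for k in (2, 3):
    print(f"K = {k}: at least {chebyshev_lower_bound(k):.1%} of values")
```

This reproduces the 75% figure for K = 2 and the roughly 89% figure for K = 3.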
Biased and Unbiased Estimators
- The sample standard deviation s is a biased estimator of the population standard deviation σ, which means that values of the sample standard deviation s do not tend to center around the value of the population standard deviation σ.
- The sample variance s² is an unbiased estimator of the population variance σ², which means that values of s² tend to center around the value of σ² instead of systematically tending to overestimate or underestimate σ².
Range Rule of Thumb for Estimating a Value of the Standard Deviation s
To roughly estimate the standard deviation from a collection of known sample data, use s ≈ range / 4.
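As a rough sketch, assuming the usual form of the rule (s ≈ range / 4), the estimate can be compared with the exact sample standard deviation. The data and the function name `estimate_std_from_range` are hypothetical.

```python
import statistics

def estimate_std_from_range(values):
    """Range rule of thumb: estimate s as (max - min) / 4."""
    return (max(values) - min(values)) / 4

# Hypothetical sample data.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

estimate = estimate_std_from_range(sample)  # (9 - 2) / 4
exact = statistics.stdev(sample)            # divides by n - 1

print(f"range rule estimate = {estimate:.2f}")
print(f"exact s             = {exact:.2f}")
```

The estimate is crude by design, so it should be expected to land near s, not on it.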
Range Rule of Thumb for Identifying Significant Values
- Significantly low values are µ − 2σ or lower.
- Significantly high values are µ + 2σ or higher.
- Values not significant are between (µ − 2σ) and (µ + 2σ).
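The three cases above can be sketched as a small classifier. The function name `classify_value` and the mean of 100 with standard deviation 15 are assumptions chosen only for illustration.

```python
def classify_value(value, mean, std):
    """Apply the range rule of thumb: cutoffs at mean - 2*std and mean + 2*std."""
    low_cutoff = mean - 2 * std
    high_cutoff = mean + 2 * std
    if value <= low_cutoff:
        return "significantly low"
    if value >= high_cutoff:
        return "significantly high"
    return "not significant"

# Hypothetical mean and standard deviation: cutoffs are 70 and 130.
for value in (65, 100, 130):
    print(value, "->", classify_value(value, mean=100, std=15))
```

A value exactly at a cutoff counts as significant, matching the "or lower" and "or higher" wording.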
Important Property of Range
- The range uses only the maximum and the minimum data values, so it is very sensitive to extreme values. The range is not resistant.
- Because the range uses only the maximum and minimum values, it does not take every value into account and therefore does not truly reflect the variation among all of the data values.
Important Properties of Standard Deviation
- The standard deviation is a measure of how much data values deviate away from the mean.
- The value of the standard deviation s is never negative. It is zero only when all of the data values are exactly the same.
- Larger values of s indicate greater amounts of variation.
- The standard deviation s can increase dramatically with one or more outliers.
- The units of the standard deviation s (such as minutes, feet, pounds) are the same as the units of the original data values.
- The sample standard deviation s is a biased estimator of the population standard deviation σ, which means that values of the sample standard deviation s do not center around the value of σ.
Important Properties of Variance
- The units of the variance are the squares of the units of the original data values.
- The value of the variance can increase dramatically with the inclusion of outliers. (The variance is not resistant.)
- The value of the variance is never negative. It is zero only when all of the data values are the same number.
- The sample variance s² is an unbiased estimator of the population variance σ².
Variance
- The variance of a set of values is a measure of variation equal to the square of the standard deviation.
- Sample variance: s² = square of the standard deviation s.
- Population variance: σ² = square of the population standard deviation σ.
Why Divide by (n - 1)?
- There are only n − 1 values that can be assigned without constraint. With a given mean, we can use any numbers for the first n − 1 values, but the last value will then be automatically determined.
- With division by n − 1, sample variances s² tend to center around the value of the population variance σ²; with division by n, sample variances s² tend to underestimate the value of the population variance σ².
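One way to check the second point is to list every possible sample from a tiny population and average the resulting variances. The sketch below assumes a hypothetical population of three values and samples of size 2 drawn with replacement; the helper `variance_with_divisor` is invented for this example.

```python
from itertools import product
from statistics import mean, pvariance

# Hypothetical population, kept tiny so every sample can be enumerated.
population = [4, 5, 9]
n = 2  # sample size

def variance_with_divisor(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    xbar = sum(sample) / len(sample)
    return sum((x - xbar) ** 2 for x in sample) / divisor

# All 9 equally likely samples of size 2, drawn with replacement.
samples = list(product(population, repeat=n))

avg_using_n_minus_1 = mean(variance_with_divisor(s, n - 1) for s in samples)
avg_using_n = mean(variance_with_divisor(s, n) for s in samples)
sigma_squared = pvariance(population)  # population variance, divides by N

print(f"population variance       = {sigma_squared:.3f}")        # 4.667
print(f"average s^2 (divide n-1)  = {avg_using_n_minus_1:.3f}")  # 4.667
print(f"average s^2 (divide by n) = {avg_using_n:.3f}")          # 2.333
```

In this small case the average of the n − 1 variances equals σ² exactly, while the average of the divide-by-n variances falls well below it.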
Coefficient of Variation
The coefficient of variation (or CV) for a set of nonnegative sample or population data, expressed as a percent, describes the standard deviation relative to the mean, and is given by the following: Sample: CV = (s / x̄) · 100%. Population: CV = (σ / µ) · 100%.
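A minimal sketch of the sample version, assuming the standard definition CV = (s / x̄) · 100% and rounding the result to one decimal place. The data and the function name `coefficient_of_variation` are hypothetical.

```python
import statistics

def coefficient_of_variation(sample):
    """Sample CV as a percent: standard deviation relative to the mean."""
    return statistics.stdev(sample) / statistics.mean(sample) * 100

# Hypothetical nonnegative sample data.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

cv = coefficient_of_variation(sample)
print(f"CV = {cv:.1f}%")  # CV = 42.8%
```

Because the CV has no units, it can be used to compare variation between data sets measured on different scales.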
Empirical Rule for Data with a Bell-Shaped Distribution
The empirical rule states that for data sets having a distribution that is approximately bell-shaped, the following properties apply.
- About 68% of all values fall within 1 standard deviation of the mean.
- About 95% of all values fall within 2 standard deviations of the mean.
- About 99.7% of all values fall within 3 standard deviations of the mean.
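The three intervals can be computed for any bell-shaped data set once the mean and standard deviation are known. In this sketch the mean of 100 and standard deviation of 15 are hypothetical, as is the function name `empirical_rule_intervals`.

```python
def empirical_rule_intervals(mean, std):
    """Map k = 1, 2, 3 to the interval (mean - k*std, mean + k*std)."""
    return {k: (mean - k * std, mean + k * std) for k in (1, 2, 3)}

approx_percent = {1: 68, 2: 95, 3: 99.7}

# Hypothetical bell-shaped data with mean 100 and standard deviation 15.
for k, (low, high) in empirical_rule_intervals(100, 15).items():
    print(f"about {approx_percent[k]}% of values fall between {low} and {high}")
```

These percentages apply only to approximately bell-shaped data; for other distributions, Chebyshev's theorem gives the weaker guarantee.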
Range Rule of Thumb
The range rule of thumb is a crude but simple tool for understanding and interpreting standard deviation. The vast majority (such as 95%) of sample values lie within 2 standard deviations of the mean.
Standard Deviation of a Sample
The standard deviation of a set of sample values, denoted by s, is a measure of how much data values deviate away from the mean.
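The calculation can be sketched step by step, assuming the standard sample formula s = √( Σ(x − x̄)² / (n − 1) ). The three data values are hypothetical and were picked so the arithmetic is easy to follow by hand.

```python
import math

def sample_std(values):
    """Sample standard deviation: divide squared deviations by n - 1."""
    n = len(values)
    xbar = sum(values) / n                              # step 1: the mean
    squared_devs = [(x - xbar) ** 2 for x in values]    # step 2: squared deviations
    variance = sum(squared_devs) / (n - 1)              # step 3: divide by n - 1
    return math.sqrt(variance)                          # step 4: square root

# Hypothetical sample: mean 6, squared deviations 25, 9, 64 (sum 98).
print(sample_std([1, 3, 14]))  # 7.0
```

Here 98 / (3 − 1) = 49, and the square root of 49 gives s = 7.0, in the same units as the original data.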
Round-off Rule for Measures of Variation
When rounding the value of a measure of variation, carry one more decimal place than is present in the original set of data.