Stats. Ch. 3
Coefficient of Variation
- allows us to compare the variation of two or more different variables, because it has no units (the units of the standard deviation and the mean cancel)
- larger CV = more relative variability (riskier)
Sample Variance (Define and give formula.)
- an approximate average of the squared deviations of the data values from the sample mean
s^2 = (Σ(x - x̅)^2) ÷ (n - 1)
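As a rough illustration of the sample variance formula, the Python sketch below computes it by hand and compares it with the standard library. The data values and the function name sample_variance are made up for this example.

```python
import statistics

def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up sample, mean = 5
s2 = sample_variance(data)        # 32 / 7, about 4.57
print(s2, statistics.variance(data))
```

Dividing by n - 1 instead of n is what makes this the sample (rather than population) variance.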
Interquartile Range (Define and give formula.)
- measures the spread of the middle 50% of an ordered data set
IQR = Q3 - Q1
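A minimal Python sketch of the IQR calculation follows. Note that textbooks and software use slightly different conventions for locating Q1 and Q3. This example assumes the default ("exclusive") method of statistics.quantiles, and the data values are made up.

```python
import statistics

data = [1, 3, 5, 7, 9, 11, 13]    # made-up, already ordered

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1
print(q1, q3, iqr)
```

For this data set Q1 = 3 and Q3 = 11, so the IQR is 8. Other quartile conventions can give slightly different values on other data.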
Variance and Standard Deviation (What they provide, when are they small, and when are they large.)
- provide information on how the data vary about the mean
- both are small when the data are clustered closely about the mean
- both are large when the data are widely scattered about the mean
Population Standard Deviation (Define and give formula.)
- the positive square root of the population variance
σ = √(σ^2)
Sample Standard Deviation (Define and give formula.)
- the positive square root of the sample variance; roughly the typical distance of the data values from the mean
s = √(s^2)
The Mean Absolute Deviation (MAD) (Define and give formula.)
- the average of the absolute deviations from the mean of the data set
MAD = (Σ |x - x̅|) ÷ n
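The MAD formula can be sketched in Python as below. The function name mean_absolute_deviation and the data values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def mean_absolute_deviation(values):
    """Average of the absolute deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up data, mean = 5
mad = mean_absolute_deviation(data)
print(mad)  # absolute deviations sum to 12, so MAD = 12 / 8 = 1.5
```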
Population Variance (Define and give formula.)
- the average of the squared deviations of the data values from the population mean
σ^2 = (Σ(x - μ)^2) ÷ N
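As a small sketch, the population variance and population standard deviation can be computed in Python as follows, treating a made-up data set as the entire population. The variable names are illustrative only.

```python
import math
import statistics

population = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up, treated as the whole population
N = len(population)
mu = sum(population) / N                # population mean = 5

# Divide by N (not N - 1) because every member of the population is included
pop_var = sum((x - mu) ** 2 for x in population) / N
pop_sd = math.sqrt(pop_var)

print(pop_var, pop_sd)
print(statistics.pvariance(population), statistics.pstdev(population))
```

Here the squared deviations sum to 32, so the population variance is 32 ÷ 8 = 4 and the population standard deviation is 2.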
Range (Define and give formula.)
- the difference between the largest and smallest values in the data set
- simplest measure of dispersion
- not a very useful measure of spread because it uses only the two extreme values, not all of the information in the data set
R = largest value - smallest value
If all observations have the same value, then the sample variance (and sample standard deviation) = ?
0 (no variability)
Population Coefficient of Variation Formula
Population CV = (σ ÷ μ) × 100%
Sample Coefficient of Variation Formula
Sample CV = (s ÷ x̅) × 100%
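To see why the CV is useful for comparing variables measured in different units, here is a hedged Python sketch. The heights and weights are invented values, and sample_cv is a hypothetical helper name.

```python
import statistics

def sample_cv(values):
    """Sample coefficient of variation, as a percentage."""
    return statistics.stdev(values) / statistics.mean(values) * 100

heights_cm = [160, 170, 180]   # made-up: mean 170, s = 10
weights_kg = [50, 70, 90]      # made-up: mean 70,  s = 20

cv_height = sample_cv(heights_cm)   # about 5.9%
cv_weight = sample_cv(weights_kg)   # about 28.6%
print(cv_height, cv_weight)
```

The standard deviations (10 cm and 20 kg) cannot be compared directly, but the unitless CVs can: in this invented example the weights vary more relative to their mean than the heights do.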
What are the most common and useful measures of variability?
Variance and Standard Deviation
One Sigma Rule
for a bell-shaped distribution, about 68% of the data lie within 1 standard deviation of the mean
Two Sigma Rule
for a bell-shaped distribution, about 95% of the data lie within 2 standard deviations of the mean
Three Sigma Rule
for a bell-shaped distribution, about 99.7% of the data lie within 3 standard deviations of the mean
Empirical Rule
gives general statements about how much of the data lies within 1, 2, and 3 standard deviations of the mean for a bell-shaped distribution
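The 68%, 95%, and 99.7% figures can be checked against an exact normal (bell-shaped) curve. The sketch below assumes a standard normal distribution via statistics.NormalDist (Python 3.8+). The helper name share_within is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 2))
```

The exact proportions are close to, but not exactly, the rounded values in the rule. Real data that are only roughly bell-shaped will match even less precisely.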
If Data Set A's MAD > Data Set B's MAD, then the values in Data Set A are _____ (variable) than the values in Data Set B.
more spread out
A measure of variability for a collection of data values is a number that is meant to convey the idea of spread for the data set. The most commonly used measures of variability for sample data are the _____, _____, _____, _____ or _____, and _____. Most of these measures are affected by outliers.
range, interquartile range, mean absolute deviation, variance or standard deviation, and coefficient of variation