Chapter 4: Summarizing Data: Variability
How does calculating the sample variance differ from calculating the population variance?
The denominator for sample variance is (n-1), not N.
A sample of 60 scores is distributed with SS=240. What is the sample variance and sample standard deviation for this distribution?
s²= 240 ÷ (60-1)= 4.07; SD=√4.07=2.02
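The arithmetic on this card can be checked with a short Python sketch. The function names below are illustrative only and do not come from the chapter; the sketch assumes SS has already been computed.

```python
import math

def sample_variance_from_ss(ss, n):
    """Sample variance: SS divided by the degrees of freedom (n - 1)."""
    if n < 2:
        raise ValueError("a sample needs at least two scores")
    return ss / (n - 1)

def population_variance_from_ss(ss, big_n):
    """Population variance: SS divided by the population size (N)."""
    return ss / big_n

s2 = sample_variance_from_ss(240, 60)   # 240 / 59
sd = math.sqrt(s2)                      # standard deviation is the square root
print(round(s2, 2), round(sd, 2))       # 4.07 2.02
```

The only difference between the two functions is the denominator, which is the point of the first card in this set.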
A researcher measures the following data: 3,3,3,4,4,4. What is the sample variance for these data?
s²=0.3
When data are divided into four equal parts the data are split into ___.
Quartiles
A researcher collects the following scores: 1,2,3,4,5,6,7,8. What is the range of these scores?
Range = 8-1=7
___ are four equal parts or sections, each containing 25% of the data.
quartiles
The ___ is most informative for data sets without outliers.
range
The ___ is the difference between the largest value (L) and smallest value (S) in a data set.
range
The ___ ___ ___ is the sum of the squared deviations of scores from their mean. The SS is the numerator in the variance formula.
sum of squares (SS)
If the value of the SS remains constant, state whether each of the following will increase, decrease, or have no effect on the sample variance: (a) the sample size increases; (b) the degrees of freedom decrease; (c) the size of the population increases.
(a) decrease (b) increase (c) no effect
Suppose the population variance for a given population is 36. If we select all possible samples of a certain size from this population, then, on average, what will be the value of the sample variance?
36
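A small enumeration can make "on average" concrete. The sketch below uses a made-up population of three scores (2, 4, 6) and lists every possible sample of size 2 drawn with replacement; the population, the sample size, and the helper name `ss` are assumptions chosen for illustration, not values from the chapter.

```python
from fractions import Fraction
from itertools import product

def ss(scores):
    """Sum of squared deviations of scores from their mean (exact arithmetic)."""
    mean = Fraction(sum(scores), len(scores))
    return sum((x - mean) ** 2 for x in scores)

population = [2, 4, 6]   # hypothetical population, for illustration only
n = 2                    # hypothetical sample size

pop_var = ss(population) / len(population)        # SS / N

# Every ordered sample of size n, sampling with replacement
samples = list(product(population, repeat=n))

# Average of the sample variances with (n - 1) versus n in the denominator
avg_unbiased = sum(ss(s) / (n - 1) for s in samples) / len(samples)
avg_biased = sum(ss(s) / n for s in samples) / len(samples)

print(pop_var, avg_unbiased, avg_biased)  # 8/3 8/3 4/3
```

With (n-1) in the denominator, the average sample variance matches the population variance exactly; with n in the denominator, it falls short.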
For normal distributions, most scores ___ fall within one standard deviation of the mean, and almost all scores ___ fall within three standard deviations of the mean.
68%, 99.7%
How many scores are free to vary in a sample?
All scores, except one, are free to vary.
A researcher records the number of times that 10 students cough during a final exam. He records the following data: 0,0,0,3,3,5,5,7,8,11. True/False: In this example, the range will be smaller than the interquartile range.
False (The range will be larger than the IQR)
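The comparison can be sketched in Python. Quartile conventions differ between textbooks and software, so this sketch assumes one common hand method: split the sorted data into a lower and an upper half and take the median of each half. The function name is hypothetical.

```python
from statistics import median

def quartiles_by_halves(scores):
    """Q1, Q2, Q3 using the median-of-halves method (an assumed convention)."""
    data = sorted(scores)
    if len(data) < 2:
        raise ValueError("need at least two scores")
    half = len(data) // 2          # for odd n, the middle score is left out of both halves
    lower, upper = data[:half], data[-half:]
    return median(lower), median(data), median(upper)

coughs = [0, 0, 0, 3, 3, 5, 5, 7, 8, 11]
q1, q2, q3 = quartiles_by_halves(coughs)

data_range = max(coughs) - min(coughs)   # largest value minus smallest value
iqr = q3 - q1                            # interquartile range
siqr = iqr / 2                           # semi-interquartile range

print(data_range, iqr, siqr)  # 11 7 3.5
```

Under this convention the range (11) is larger than the IQR (7), in line with the answer on the card.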
True/False: The mean is a preferred measure of variability because all scores are included in its computation.
False (The variance is the preferred measure)
A ___ is the difference of each score from its mean.
deviation
The empirical rule is stated for data with what type of distribution?
Normal distribution
How do you compute standard deviation?
Take the square root of the variance.
How is the computational formula different from the definitional formula for variance?
The computational formula does not require that we compute the mean to compute SS in the numerator.
The standard deviation is a measure used to determine the average distance that each score deviates from ___.
The mean
State four characteristics of the standard deviation.
The standard deviation is always positive; is used to describe quantitative variables; is typically reported with the mean; and is affected by the value of every score in a distribution.
What does the standard deviation measure?
The standard deviation measures the average distance that scores deviate from their mean.
The expression x - µ is a ___; it is the difference of each score from its mean.
deviation
Why do we square each deviation in the numerator of variance?
We want to compute how far scores are from their mean without ending up with a solution equal to zero every time (unsquared deviations always sum to zero). Taking the square root of the variance then corrects for having squared each deviation in the numerator.
At least ___ of all scores lie within one standard deviation of the mean. a. 68% b. 95% c. 99.7%
a. 68%
At least ___ of all scores lie within two standard deviations of the mean. a. 68% b. 95% c. 99.7%
b. 95%
When we divide SS by (n-1), the sample variance is an ___ estimator of the population variance. This is the reason why researchers place (n-1) in the denominator of sample variance. a. biased b. unbiased c. neither is correct
b. unbiased
The ___ ___ is the range of values between the upper (Q₃) and lower (Q₁) quartiles of a data set.
interquartile range (IQR)
The standard deviation is most informative when reported with the ___.
mean
___ ___ is a measure of variability for the average squared distance that scores in a sample deviate from the mean. It is computed when only a portion or sample of data is measured in a population.
sample variance
The ___ standard deviation is a measure of variability for the average distance that scores in a population deviate from their mean. It is calculated by taking the square root of the population variance.
population
The ____ variance is represented by the square of the Greek symbol σ (or σ², stated as "sigma squared").
population
True/False: The value for the standard deviation is affected by the value of each score in a distribution.
true
How many scores are included to compute the range?
two scores (the largest and smallest score in a distribution)
The ___ ___ is a measure of variability for the average distance that scores deviate from their mean. It is calculated by taking the square root of the variance.
standard deviation
The ___ ___ is the square root of the variance. It is used to determine the average distance that scores deviate from their mean.
standard deviation
A behavioral scientist measures attention in a sample of 31 participants. To measure the variance of attention, she computes SS=120 for this sample. Compute the variance and standard deviation.
s²=120÷30=4.0; SD=√4=2
A researcher measures the following sample of scores (n=3): 1,4,7. Use the computational formula to calculate variance.
9
A researcher measures the following sample of scores (n=3): 1,4,7. Use the definitional formula to calculate variance.
9
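The two formulas can be compared side by side in Python. This is a minimal sketch with illustrative function names; it assumes the usual forms SS = Σ(x − M)² for the definitional formula and SS = Σx² − (Σx)²/n for the computational formula.

```python
def ss_definitional(scores):
    """SS = sum of squared deviations of each score from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

def ss_computational(scores):
    """SS = sum of squared scores minus (sum of scores) squared over n; no mean needed."""
    n = len(scores)
    return sum(x * x for x in scores) - sum(scores) ** 2 / n

scores = [1, 4, 7]
ss_def = ss_definitional(scores)             # (1-4)^2 + (4-4)^2 + (7-4)^2
ss_comp = ss_computational(scores)           # 66 - 144/3
variance = ss_def / (len(scores) - 1)        # sample variance: SS / (n - 1)

print(ss_def, ss_comp, variance)  # 18.0 18.0 9.0
```

Both routes give the same SS, so the variance is 9 either way, give or take rounding error on less tidy data.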
True/False: Calculations of the range consider only the largest value in a data set.
False (the range considers both the largest value and the smallest value in a data set)
What is the formula for finding the range?
Range = Largest Value (L) - Smallest Value (S)
A social psychologist records the age (in years) that a sample of eight participants first experienced peer pressure. The recorded ages for the participants are 14,20,17,16,12,16,15,16. Compute the SS, the variance, and the standard deviation for this sample using the definitional and computational formula.
SS=37.50; s²=37.50÷7=5.36; SD=√5.36=2.31
An instructor measures the following quiz scores: 6,8,7,9 (SD=1.29). If the instructor subtracts two points from each quiz score, how will the value for the standard deviation change?
The standard deviation will not change (SD=1.29)
Why is variance, as a measure of variability, preferred to the range, the IQR, and the SIQR?
The variance is preferred because it includes all scores to estimate variability
What does it mean to say that the sample variance is unbiased?
The variance of the sample will equal the variance of the population from which the sample was selected, on average.
True/False: Multiplying or dividing each score using the same constant will cause the standard deviation to change by that constant.
True
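The effect of a constant on the standard deviation can be checked with Python's standard `statistics` module. The sketch reuses the quiz scores 6, 8, 7, 9 from an earlier card; the variable names are illustrative.

```python
from statistics import stdev  # sample standard deviation (n - 1 in the denominator)

scores = [6, 8, 7, 9]
shifted = [x - 2 for x in scores]   # subtract the same constant from each score
doubled = [x * 2 for x in scores]   # multiply each score by the same constant

sd_original = stdev(scores)
sd_shifted = stdev(shifted)
sd_doubled = stdev(doubled)

print(round(sd_original, 2), round(sd_shifted, 2), round(sd_doubled, 2))  # 1.29 1.29 2.58
```

Subtracting 2 leaves the SD unchanged, while doubling each score doubles the SD.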
True/False: The range is the difference between the largest value and smallest value in a distribution.
True
True/False: The sample variance is unbiased when dividing SS by (n-1)
True
True/False: To compute the sample variance, we divide the SS by one less than the sample size (n-1)
True
Adding or subtracting the same constant to each score will not change the distance that scores deviate from the mean. Hence, the standard deviation remains ___. a. changed b. unchanged c. neither are correct
b. unchanged
A(n) ___ estimator is any sample statistic, such as the sample variance when we divide SS by n, obtained from a randomly selected sample that does not equal the value of its respective population parameter, such as a population variance, on average.
biased
___ ___ is a measure of variability for the average squared distance that scores in a population deviate from the mean. It is computed only when all scores in a given population are recorded.
population variance
Computations of SS are the same for ___ variance and ___ variance. The change in computation is whether we divide SS by N or n-1.
population, sample
The standard deviation is always ___.
positive
A(n) ___ estimator is any sample statistic, such as the sample variance when we divide SS by n-1, obtained from a randomly selected sample that equals the value of its respective population parameter, such as a population variance, on average.
unbiased
___ is a measure of the dispersion or spread of scores in a distribution and ranges from 0 to +∞.
variability
The computational formula for ___ is a quicker way to compute variance by hand.
variance
___ is a measure of variability for the average squared distance that scores deviate from their mean.
variance
The population variance is 121. What is the standard deviation for this population?
11
The sample variance is 121. What is the standard deviation for this sample?
11
Why is the variance a preferred measure of variability?
Because it includes all scores in its computation.
___ theorem defines the percentage of data from any distribution that will be contained within any number of standard deviations of the mean (where the number of SDs is greater than 1).
Chebyshev's
True/False: A scientist measures the following data: 23,23,23,23,23,23. The value for the sample variance and population variance will be the same.
True (because the variance is 0)
A student computes a variance of 9. Will the standard deviation differ if 9 is the value for a population versus a sample variance? Explain.
No, in both cases, the standard deviation is the square root of the variance. Hence, the population and the sample standard deviation will be √9=3.0
A school administrator has students rate the quality of their education on a scale from 1 (poor) to 7 (exceptional). She claims that 99.7% of students rated the quality of their education between 3.5 and 6.5. If the mean rating is 5.0, then what is the standard deviation assuming the data are normally distributed?
SD=0.5
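The reasoning behind this answer can be sketched in Python: under the empirical rule, the interval holding 99.7% of normally distributed scores runs from three SDs below the mean to three SDs above it, so it spans six SDs in total. The function names are hypothetical.

```python
def sd_from_three_sd_interval(lower, upper):
    """The 99.7% interval spans mean - 3*SD to mean + 3*SD, i.e. six SDs."""
    return (upper - lower) / 6

def empirical_interval(mean, sd, k):
    """Interval from k SDs below the mean to k SDs above the mean."""
    return mean - k * sd, mean + k * sd

sd = sd_from_three_sd_interval(3.5, 6.5)
low, high = empirical_interval(5.0, sd, 3)   # rebuild the interval as a check

print(sd, low, high)  # 0.5 3.5 6.5
```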
A psychologist measures a sample of scores on a love quiz, where SD=4 points. State the new value for SD if the psychologist adds 2 points to each quiz score.
SD=4
A psychologist measures a sample of scores on a love quiz, where SD=4 points. State the new value for SD if the psychologist doubles each quiz score.
SD=8
The __ produces the smallest possible positive value for deviations of scores from their mean. It is computed in the same way for sample variance and population variance.
SS
Describe the sum of squares (SS) in words.
SS is the sum of the squared deviations of scores from their mean
A researcher measures the following scores: 12, 14, 16, 18, 20. Compute the SS for these data.
SS=40
True/False: The interquartile range (IQR) is the range of scores between the upper and lower quartiles of a distribution.
True
True/False: Variance can be computed for data in populations and samples.
True
True/False: When all scores in a population are the same, the variance will always be equal to 0.
True
True/False: With identical data sets, the definitional and computational formula for sample variance will always produce the same solution, give or take rounding error.
True
At least ___ of all scores lie within three standard deviations of the mean. a. 68% b. 95% c. 99.7%
c. 99.7%
Each of the following is a characteristic of standard deviation, except: a. the standard deviation is always positive. b. the standard deviation is affected by the value of every score in a distribution. c. the standard deviation is used to describe qualitative variables. d. standard deviations are almost always reported with the mean.
c. the standard deviation is used to describe qualitative variables.
The ___ for variance is a way to calculate the population variance and sample variance without needing to sum the squared differences of scores from their mean to compute the SS in the numerator.
computational formula
To compute the population variance a. the SS is added to the population size (N) b. the SS is subtracted from the population size (N) c. the SS is multiplied by the population size (N) d. the SS is divided by the population size (N)
d. the SS is divided by the population size (N)
The ___ for variance is a way to calculate the population variance and sample variance that requires summing the squared differences of scores from their mean to compute the SS in the numerator.
definitional formula
The ___ for sample variance are the number of scores in a sample that are free to vary. All scores except one are free to vary in a sample: n-1.
degrees of freedom (df)
A behavioral scientist measures attention in a sample of 31 participants. To measure the variance of attention, she computes SS=120 for this sample. What are the degrees of freedom for variance?
df=30
The ___ ___ states that for data that are normally distributed, at least 99.7% of data lie within three standard deviations of the mean, at least 95% of data lie within two standard deviations of the mean, and at least 68% of data lie within one standard deviation of the mean.
empirical rule
The standard deviation is used to describe ___ data.
quantitative
Another name for computational formula for variance is ___.
raw scores method for variance
Another name for the standard deviation is ___.
root mean square deviation
The ___ standard deviation is a measure of variability for the average distance that scores in a sample deviate from their mean. It is calculated by taking the square root of the sample variance.
sample
The degrees of freedom for ___ variance tell us that all scores are free to vary in a sample except one (n-1). This term is placed in the denominator of the formula for sample variance.
sample
The ___ ___ is a measure of half the distance between the upper quartile (Q₃) and lower quartile (Q₁) of a data set, and is computed by dividing the IQR in half.
semi-interquartile range (SIQR)
A researcher records five scores: 3,4,5,6,x. If the mean in this distribution is 5, then what is the value for x?
x=7
How many standard deviations from the mean will contain at least 99% of data for any type of distribution?
±10 SD
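This answer follows from Chebyshev's theorem, which guarantees that at least 1 − 1/k² of the data lie within k standard deviations of the mean for any distribution (k > 1). A minimal Python sketch, with illustrative function names:

```python
import math

def chebyshev_min_proportion(k):
    """Minimum proportion of data within k SDs of the mean, for any distribution."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem is informative only for k > 1")
    return 1 - 1 / k ** 2

def chebyshev_k_for(proportion):
    """Number of SDs needed to guarantee at least the given proportion of data."""
    if not 0 < proportion < 1:
        raise ValueError("proportion must be between 0 and 1")
    return 1 / math.sqrt(1 - proportion)

print(chebyshev_min_proportion(2))        # 0.75
print(round(chebyshev_k_for(0.99), 6))    # 10.0
```

Solving 1 − 1/k² = 0.99 gives k = 10, which is far wider than the three SDs needed for 99.7% of normally distributed data under the empirical rule.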
Suppose you want to determine how much your six closest friends actually know about you. This is the group of interest to you; your six closest friends constitute the population of interest. You quiz all six close friends about 10 facts you think they should know about you. Their scores on the quiz are 5, 10, 3, 7, 2, 3. Calculate the population variance (σ²) of these scores.
σ²=7.67. Step 1: Calculate the SS. µ = (5+10+3+7+2+3) ÷ 6 = 5. Then, x=5: (5-5)²=0; x=10: (10-5)²=25; x=3: (3-5)²=4; x=7: (7-5)²=4; x=2: (2-5)²=9; x=3: (3-5)²=4. Sum: SS = 0+25+4+4+9+4 = 46. Step 2: Divide the SS by the population size (N). σ² = SS ÷ N = 46 ÷ 6 = 7.67
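The same result can be checked with Python's standard `statistics` module, where `pvariance` divides SS by N and `variance` divides SS by n-1. The variable names are illustrative.

```python
from statistics import pstdev, pvariance, variance

quiz_scores = [5, 10, 3, 7, 2, 3]   # all six friends: the whole population

pop_var = pvariance(quiz_scores)    # SS / N = 46 / 6
pop_sd = pstdev(quiz_scores)        # square root of the population variance

# For contrast only: treating the same scores as a sample divides SS by n - 1
sample_var = variance(quiz_scores)  # 46 / 5

print(round(pop_var, 2), round(pop_sd, 2), round(sample_var, 2))  # 7.67 2.77 9.2
```

Because the six friends are the entire group of interest, the population formula is the appropriate one here; the sample value is shown only to highlight how the denominator changes the answer.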
A researcher selects a population of eight scores where SS=72. What is the population variance in this example?
σ²=9