Week 3 Homework
What is interquartile range (IQR)?
Q3 minus Q1. It is the range of the middle 50% of the observations.
What is an advantage that the standard deviation has over the interquartile range?
An advantage of the standard deviation is that it uses all the observations in its computation.
Explain the circumstances for which the interquartile range is the preferred measure of dispersion.
The interquartile range is preferred when the data are skewed or have outliers.
The _______ represents the number of standard deviations an observation is from the mean.
Z-score
What can be said about a set of data with a standard deviation of 0?
All of the observations are the same value. If all observations have the same value, then that value will also be the mean of the data. Therefore, the sum of the squared differences from the mean will be 0, and the standard deviation will be 0.
T/F: When comparing two populations, the larger the standard deviation, the more dispersion the distribution has, provided that the variable of interest from the two populations has the same unit of measure.
True, because the standard deviation describes how far, on average, each observation is from the typical value. A larger standard deviation means that observations are more distant from the typical value, and therefore, more dispersed.
What are the steps for finding quartiles?
1. Arrange the data in ascending order. 2. Determine the median (M) or second quartile (Q2). 3. Divide the data into two halves. Q1 is the median of the bottom half. Q3 is the median of the top half.
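The steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `quartiles` and the sample values are made up, and it assumes the convention that the median itself is left out of both halves when the number of observations is odd (other textbooks and software may use a different convention).

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    xs = sorted(data)                  # step 1: ascending order
    n = len(xs)
    q2 = median(xs)                    # step 2: median (Q2)
    half = n // 2
    lower = xs[:half]                  # step 3: bottom half
    # Assumption: when n is odd, the median is excluded from both halves.
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return median(lower), q2, median(upper)

sample = [7, 1, 3, 9, 5, 11, 13, 15]   # made-up data
q1, q2, q3 = quartiles(sample)
print(q1, q2, q3)                      # 4.0 8.0 12.0
```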
What are the steps for checking for outliers using quartiles?
1. Determine Q1 and Q3. 2. Compute the IQR (Q3 - Q1). 3. Determine the fences (cutoff points). Lower fence = Q1 - (1.5 * IQR). Upper fence = Q3 + (1.5 * IQR). 4. If a data value is less than the lower fence or greater than the upper fence, it is considered an outlier.
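A short Python sketch of the fence check, assuming Q1 and Q3 have already been found. The function name `find_outliers` and the sample data are hypothetical and only illustrate the arithmetic.

```python
def find_outliers(data, q1, q3):
    """Return the values in data that fall outside the 1.5 * IQR fences."""
    iqr = q3 - q1                      # step 2: interquartile range
    lower_fence = q1 - 1.5 * iqr       # step 3: cutoff points
    upper_fence = q3 + 1.5 * iqr
    # step 4: keep only the values beyond either fence
    return [x for x in data if x < lower_fence or x > upper_fence]

# Made-up data whose quartiles are Q1 = 4 and Q3 = 12 (median-of-halves method)
sample = [1, 3, 5, 7, 9, 11, 13, 40]
print(find_outliers(sample, q1=4, q3=12))   # [40], because the fences are -8 and 24
```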
What does it mean when the 10th percentile of the weight of males 36 months of age in a certain city is 13.0 kg?
10% of 36-month-old males in that city weigh 13.0 kg or less, and 90% weigh more than 13.0 kg.
If a variable has a distribution that is bell-shaped with mean 22 and standard deviation 3, then according to the Empirical Rule, 99.7% of the data will lie between which values?
Between 13 and 31, which is within 3 standard deviations of the mean: 22 - 3(3) = 13 and 22 + 3(3) = 31.
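The arithmetic can be checked with a small Python sketch. The helper name `empirical_rule_intervals` is made up for illustration; it simply computes the mean plus or minus 1, 2, and 3 standard deviations, the intervals that the Empirical Rule associates with roughly 68%, 95%, and 99.7% of bell-shaped data.

```python
def empirical_rule_intervals(mean, sd):
    """Map k = 1, 2, 3 to the interval (mean - k*sd, mean + k*sd)."""
    return {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}

intervals = empirical_rule_intervals(mean=22, sd=3)
print(intervals[3])   # (13, 31): about 99.7% of the data
```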
Which of the following are resistant measures of dispersion?
Interquartile Range
For a distribution that is skewed right, the median is _______ of the box.
Left of the center
A cellular phone company monitors monthly phone usage. The following data represent the monthly phone use in minutes of one particular customer for the past 20 months. (a) Determine the standard deviation and interquartile range of the data. (b) Suppose the 325 minutes recorded for one month did not actually come from that customer's phone. That month the customer did not use their phone at all, so 0 minutes were used. How does changing the observation from 325 to 0 affect the standard deviation and interquartile range? (c) What property does this illustrate?
(a) s = 59.83, IQR = 84.5 (b) The standard deviation increases and the interquartile range is not affected. (c) Resistance: the interquartile range is resistant to extreme values, but the standard deviation is not.
The U.S. Department of Housing and Urban Development (HUD) uses the median to report the average price of a home in the United States. Why do you think HUD uses the median?
HUD uses the median because home prices are skewed to the right. The median is resistant to the extreme high prices in the right tail, so it describes the typical price better than the mean does.
The standard deviation is used in conjunction with the ______ to numerically describe distributions that are bell shaped. The ______ measures the center of the distribution, while the standard deviation measures the ______ of the distribution.
Mean, mean, spread
Which of the following are resistant measures of central tendency?
Median
What values make up the five-number summary?
Minimum value, Q1, Median, Q3, Maximum value
A histogram of a set of data indicates that the distribution of the data is skewed right. Which measure of central tendency will likely be larger, the mean or the median? Why?
The mean will likely be larger because the extreme values in the right tail tend to pull the mean in the direction of the tail.
The sum of the deviations about the mean always equals _______.
Zero
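A quick numeric check in Python, using made-up values chosen so that the mean is a whole number. With data whose mean is not exactly representable as a float, the sum may come out as a tiny nonzero number because of rounding.

```python
from statistics import mean

data = [3, 5, 10]                         # made-up values; their mean is 6
x_bar = mean(data)
deviations = [x - x_bar for x in data]    # [-3, -1, 4]
print(sum(deviations))                    # 0
```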
For a distribution that is skewed left, the left whisker is _______ the right whisker.
Longer than
How do you calculate a z-score?
(value - mean) / standard deviation
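As a minimal Python sketch of this formula (the function name `z_score` and the example numbers are illustrative only):

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / sd

print(z_score(28, mean=22, sd=3))   # 2.0, two standard deviations above the mean
print(z_score(19, mean=22, sd=3))   # -1.0, one standard deviation below the mean
```

A positive z-score means the observation is above the mean, and a negative z-score means it is below.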
The following is a summary of the violent-crime rate (violent crimes per 100,000 population) for all states of a country in a certain year: Q1 = 271.8, Q2 = 387.4, Q3 = 529.7. What do these numbers mean?
25% of the states have a violent-crime rate that is 271.8 crimes per 100,000 population or less. 50% of the states have a violent-crime rate that is 387.4 crimes per 100,000 population or less. 75% of the states have a violent-crime rate that is 529.7 crimes per 100,000 population or less.
According to the Empirical Rule, if a distribution is bell-shaped, then approximately _______ of the data will lie within 1 standard deviation of the mean; approximately _______ of the data will lie within 2 standard deviations of the mean; approximately _______ of the data will lie within 3 standard deviations of the mean.
68%, 95%, 99.7%
What does it mean if a statistic is resistant?
Extreme values (very large or small) relative to the data do not affect its value substantially.
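Resistance can be seen in a small Python experiment. This is a sketch on made-up data: the helper names `iqr` and `summarize` are hypothetical, and the quartiles assume the median-of-halves convention (the median is left out of both halves when the count is odd). Replacing the largest value with an extreme one moves the mean and standard deviation but leaves the median and interquartile range unchanged.

```python
from statistics import mean, median, stdev

def iqr(data):
    """Interquartile range (Q3 - Q1) via the median-of-halves method."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return median(upper) - median(lower)

def summarize(data):
    """Collect two measures of center and two measures of spread."""
    return {
        "mean": mean(data),
        "median": median(data),
        "stdev": stdev(data),    # sample standard deviation
        "iqr": iqr(data),
    }

original = [2, 4, 6, 8, 10, 12, 14, 16]     # made-up values
with_extreme = original[:-1] + [160]        # replace the maximum with an extreme value

before = summarize(original)
after = summarize(with_extreme)

for name in before:
    print(name, before[name], after[name])
# The median and iqr rows are identical before and after;
# the mean and stdev rows are both larger after the change.
```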
T/F: A data set will always have exactly one mode.
False. The mode of a variable is the most frequent observation of the variable that occurs in the data set. To compute the mode, tally the number of observations that occur for each data value. The data value that occurs most often is the mode. A set of data can have no mode, one mode, or more than one mode. If no observation occurs more than once, the data have no mode.
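The tallying procedure can be sketched in Python. The function name `modes` is made up for this illustration, and it follows the convention stated above that data in which no value repeats have no mode.

```python
from collections import Counter

def modes(data):
    """Return a list of modes: empty if no value occurs more than once."""
    counts = Counter(data)               # tally each data value
    if not counts:
        return []                        # empty data: nothing to report
    top = max(counts.values())
    if top == 1:
        return []                        # no observation repeats, so no mode
    return [value for value, c in counts.items() if c == top]

print(modes([1, 2, 3]))          # []      no mode
print(modes([1, 2, 2, 3]))       # [2]     one mode
print(modes([1, 1, 2, 2, 3]))    # [1, 2]  more than one mode
```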
T/F: The standard deviation is a resistant measure of spread.
False. Since extreme values will increase the standard deviation greatly, the standard deviation cannot be a resistant measure of spread.
Why is the median resistant, but the mean is not?
The mean is not resistant because when data are skewed, there are extreme values in the tail, which tend to pull the mean in the direction of the tail. The median is resistant because the median of a variable is the value that lies in the middle of the data when arranged in ascending order and does not depend on the extreme values of the data.
For a distribution that is symmetric, the left whisker is _______ the right whisker.
The same length as
T/F: The standard deviation can be negative.
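False. The standard deviation is the non-negative square root of the variance, and the variance is built from squared deviations, so neither the population nor the sample standard deviation can be negative. This makes intuitive sense because the standard deviation measures how far the data spread from the mean, and a distance cannot be negative.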