Chapter 2 Homework
Use the given frequency distribution to find the (a) class width, (b) class midpoints, and (c) class boundaries.
Temperature   Frequency
40-44         1
45-49         3
50-54         5
55-59         11
60-64         7
65-69         7
70-74         1
(a) 5 (b) 42,47,52,57,62,67,72 (c) 39.5-44.5,44.5-49.5,49.5-54.5,54.5-59.5,59.5-64.5,64.5-69.5,69.5-74.5
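The three answers above can be checked with a short sketch. The variable names are illustrative, and the 0.5 adjustment for boundaries assumes integer data, as in this table.

```python
# Class limits copied from the temperature table above.
lower_limits = [40, 45, 50, 55, 60, 65, 70]
upper_limits = [44, 49, 54, 59, 64, 69, 74]

# (a) Class width: distance between consecutive lower limits.
class_width = lower_limits[1] - lower_limits[0]

# (b) Midpoint: average of a class's lower and upper limits.
midpoints = [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]

# (c) Boundaries: for integer data, move each limit outward by 0.5.
boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in zip(lower_limits, upper_limits)]

print(class_width)
print(midpoints)
print(boundaries)
```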
During a quality assurance check, the actual contents (in grams) of six containers of protein powder were recorded as 1531, 1530, 1498, 1510, 1526, and 1509. (a) Find the mean and the median of the contents. (b) The third value was incorrectly measured and is actually 1511. Find the mean and the median of the contents again. (c) Which measure of central tendency, the mean or the median, was affected more by the data entry error?
(a) The mean is 1517.3 and the median is 1518. (b) The mean is 1519.5 and the median is 1518.5. (c) After correcting the error, the mean increased by about 2.2 and the median increased by 0.5. The mean was affected more.
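A minimal sketch of the comparison, using Python's standard statistics module. The list names are illustrative.

```python
from statistics import mean, median

# Contents in grams as first recorded, then with the third value corrected.
recorded = [1531, 1530, 1498, 1510, 1526, 1509]
corrected = [1531, 1530, 1511, 1510, 1526, 1509]

# How far each measure moved once the error was fixed.
mean_change = mean(corrected) - mean(recorded)
median_change = median(corrected) - median(recorded)

print(round(mean(recorded), 1), median(recorded))
print(round(mean(corrected), 1), median(corrected))
print(round(mean_change, 1), median_change)
```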
Use the stem-and-leaf plot to list the actual data entries. What is the maximum data entry? What is the minimum data entry?
Key: 2 | 8 = 28
2 | 8
3 | 3
4 | 1 2 2 5 7 7 8
5 | 0 1 1 2 3 3 3 4 4 4 4 5 6 6 8 9
6 | 8 8 8
7 | 3 8 8
8 | 5
Actual data entries: 28, 33, 41, 42, 42, 45, 47, 47, 48, 50, 51, 51, 52, 53, 53, 53, 54, 54, 54, 54, 55, 56, 56, 58, 59, 68, 68, 68, 73, 78, 78, 85. Maximum data entry: 85 (largest number). Minimum data entry: 28 (smallest number).
The coefficient of variation CV describes the standard deviation as a percent of the mean. Because it has no units, you can use the coefficient of variation to compare data with different units. Find the coefficient of variation for each sample data set. What can you conclude? CV = (standard deviation / mean) × 100%
CV height = 7.0% CV weight = 10.1% Weight is more variable than height.
Use the box-and-whisker plot to identify the five-number summary.
Min = 11 Q1= 14 Q2= 16 Q3= 17 Max= 22
(a) Find the five-number summary, and (b) draw a box-and-whisker plot of the data. 3 8 8 5 2 9 8 7 9 6 9 3 2 6 2 9 8 7 7 9
Min = 2 Q1= 4 Q2= 7 Q3= 8.5 Max= 9
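The summary above can be reproduced with a short sketch. It assumes the textbook convention in which Q1 and Q3 are the medians of the lower and upper halves of the ordered data (the middle entry is left out of both halves when n is odd). Other software may use a different quartile convention and give slightly different values. The function name is illustrative.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, Q2, Q3, max) using the median-of-halves convention."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    # For odd n, skip the middle entry so it belongs to neither half.
    upper = values[half:] if n % 2 == 0 else values[half + 1:]
    return (values[0], median(lower), median(values), median(upper), values[-1])

data = [3, 8, 8, 5, 2, 9, 8, 7, 9, 6, 9, 3, 2, 6, 2, 9, 8, 7, 7, 9]
print(five_number_summary(data))
```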
What is the difference between relative frequency and cumulative frequency?
Relative frequency of a class is the percentage of the data that falls in that class, while cumulative frequency of a class is the sum of the frequencies of that class and all previous classes.
How is a Pareto chart different from a standard vertical bar graph?
The bars are positioned in order of decreasing height with the tallest bar on the left.
T OR F The difference between two consecutive midpoints is equal to the class width.
The statement is true. The midpoint is the sum of the lower and upper limits of the class divided by two. The distance between two consecutive lower (or upper) class limits is equal to the class width.
T OR F : A data set can have the same mean, median, and mode.
True
T OR F : The mean is the measure of central tendency most likely to be affected by an outlier.
True
T OR F : When each data class has the same frequency, the distribution is symmetric.
True
T OR F Class boundaries ensure that consecutive bars of a histogram touch.
True
The medal counts for five countries at a recent international sports competition include countries A (41 medals), B (68 medals), C (114 medals), D (63 medals), and E (86 medals). Use a Pareto chart to display the data. Describe any patterns.
The correct Pareto chart (not reproduced here) orders the bars from tallest to shortest: C, E, B, D, A. Country C won the greatest number of medals and country A won the fewest.
The numbers of courses taught per semester by a random sample of university professors are shown in the histogram. Make a frequency distribution for the data. Then use the table to estimate the sample mean and the sample standard deviation of the data set.
f = 3, 19, 21, 15. The sample mean is x̄ = 2.8. The sample standard deviation is s = 0.9.
T OR F In a frequency distribution, the class width is the distance between the lower and upper limits of a class.
False. In a frequency distribution, the class width is the distance between the lower or upper limits of consecutive classes.
T OR F The second quartile is the mean of an ordered data set.
False. The second quartile is the median of an ordered data set.
Without performing any calculations, determine which measure of central tendency best represents the graphed data. Explain your reasoning.
The mean is the best measure because the data are approximately symmetric.
The numbers of regular season wins for 10 football teams in a given season are given below. Determine the range, mean, variance, and standard deviation of the population data set. 2,8,15,5,11,6,11,8,3,6
The range is 13. The population mean is 7.5. The population variance is 14.3. The population standard deviation is 3.8.
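As a check on these values, here is a sketch using the population functions from Python's standard statistics module. The variable names are illustrative.

```python
from statistics import mean, pstdev, pvariance

# Regular season wins for the 10 teams, treated as a population.
wins = [2, 8, 15, 5, 11, 6, 11, 8, 3, 6]

data_range = max(wins) - min(wins)
population_mean = mean(wins)
population_variance = pvariance(wins)  # divides by N
population_sd = pstdev(wins)           # square root of the population variance

print(data_range, population_mean, population_variance, population_sd)
```

The unrounded variance is 14.25 and the unrounded standard deviation is about 3.775, which give the rounded answers above.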
T OR F It is impossible to have a z-score of 0.
The statement is false. A z-score of 0 is a standardized value that is equal to the mean.
After constructing an expanded frequency distribution, what should the sum of the relative frequencies be? Explain.
The sum should be 1, since in the sum of the relative frequencies ∑f/n, the numerator will be n=∑f.
Why is the standard deviation used more frequently than the variance?
The units of variance are squared, which makes them hard to interpret. The standard deviation has the same units as the data.
The scores and their percent of the final grade for a statistics student are given. What is the student's weighted mean score?
The student's weighted mean score is 93.4
Given a data set, how do you know whether to calculate σ or s?
When given a data set, one would have to determine if it represented the population or if it was a sample taken from the population. If the data are a population, then σ is calculated. If the data are a sample, then s is calculated.
Use the accompanying data set to complete the following actions. a. Find the quartiles. b. Find the interquartile range. c. Identify any outliers. 40 52 37 44 41 37 39 46 43 37 34 55 44 35 15 51 39 50 30 30
(a) Find the quartiles. Q1 = 36 Q2 = 39.5 Q3 = 45 (b) Find the interquartile range. 9 (c) Identify any outliers. There exists at least one outlier in the data set at 15
Use the accompanying data set to complete the following actions. (a) Find the quartiles. (b) Find the interquartile range. (c) Identify any outliers. 63 58 57 59 64 62 61 56 62 57 58 54 57 58 75
(a) Find the quartiles. Q1 = 57 Q2 = 58 Q3 = 62 (b) Find the interquartile range. 5 (c) Identify any outliers. There exists at least one outlier in the data set at 75
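The quartile, IQR, and outlier steps for the two data sets above can be sketched as follows. The sketch assumes the median-of-halves convention for quartiles and the 1.5(IQR) fences for outliers. The function names are illustrative.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    # For odd n, the middle entry is excluded from both halves.
    upper = values[half:] if n % 2 == 0 else values[half + 1:]
    return median(lower), median(values), median(upper)

def outliers(data):
    """Return entries below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR)."""
    q1, _, q3 = quartiles(data)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

first_set = [40, 52, 37, 44, 41, 37, 39, 46, 43, 37,
             34, 55, 44, 35, 15, 51, 39, 50, 30, 30]
second_set = [63, 58, 57, 59, 64, 62, 61, 56, 62, 57, 58, 54, 57, 58, 75]

for data in (first_set, second_set):
    q1, q2, q3 = quartiles(data)
    print(q1, q2, q3, q3 - q1, outliers(data))
```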
Use the Empirical Rule. The mean speed of a sample of vehicles along a stretch of highway is 65 miles per hour, with a standard deviation of 5 miles per hour. Estimate the percent of vehicles whose speeds are between 55 miles per hour and 75 miles per hour. (Assume the data set has a bell-shaped distribution.)
Approximately 95% of vehicles travel between 55 miles per hour and 75 miles per hour.
What is the difference between class limits and class boundaries?
Class limits are the least and greatest numbers that can belong to the class. Class boundaries are the numbers that separate classes without forming gaps between them. For integer data, the corresponding class limits and class boundaries differ by 0.5.
Explain how to find the range of a data set. What is an advantage of using the range as a measure of variation? What is a disadvantage?
Explain how to find the range of a data set. The range is found by subtracting the minimum data entry from the maximum data entry. What is an advantage of using the range as a measure of variation? It is easy to compute. What is a disadvantage of using the range as a measure of variation? It uses only two entries from the data set.
The number of credits being taken by a sample of 13 full-time college students are listed below. Find the mean, median, and mode of the data, if possible. If any measure cannot be found or does not represent the center of the data, explain why. 9 11 12 12 9 10 8 8 8 8 8 8 9
Find the mean. The mean is 9.2. Does the mean represent the center of the data? The mean represents the center. Find the median. 9 Does the median represent the center of the data? The median represents the center. Find the mode. 8 Does the mode represent the center of the data? The mode does not represent the center because it is the smallest data value.
A student receives the following grades, with an A worth 4 points, a B worth 3 points, a C worth 2 points, and a D worth 1 point. What is the student's weighted mean grade point score? B in 3 three-credit classes D in 1 two-credit class A in 1 four-credit class C in 1 three-credit class
The mean grade point score is 2.8. In this data set, grades for courses with more credits have a greater effect on the mean than grades for courses with fewer credits, so the appropriate measure is the weighted mean. A weighted mean is given by x̄ = Σ(x•w)/Σw, where w is the weight of each entry x. Let x be the number of points received in each class based on the letter grade, and let w be the number of credits for the corresponding class. The sum of the weights equals the total number of class credits: Σw = 3•3 + 4•1 + 2•1 + 3•1 = 18. Multiply each score by its weight and sum the products: Σ(x•w) = 3•(3•3) + 4•(4•1) + 1•(2•1) + 2•(3•1) = 51. The weighted mean, rounded to the nearest tenth, is 51/18 ≈ 2.8.
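The same weighted-mean calculation as a minimal sketch. The tuple layout and variable names are illustrative choices, not part of the original problem.

```python
# Each entry: (grade points x, credits per class, number of such classes).
grades = [
    (3, 3, 3),  # B in 3 three-credit classes
    (1, 2, 1),  # D in 1 two-credit class
    (4, 4, 1),  # A in 1 four-credit class
    (2, 3, 1),  # C in 1 three-credit class
]

# Sum of the weights: total credits taken.
total_weight = sum(credits * count for _, credits, count in grades)

# Sum of each score times its weight.
weighted_sum = sum(points * credits * count for points, credits, count in grades)

weighted_mean = weighted_sum / total_weight
print(total_weight, weighted_sum, round(weighted_mean, 1))
```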
The gas mileages (in miles per gallon) for 27 cars are shown in the frequency distribution. Approximate the mean of the frequency distribution. Gas Mileage (in miles per gallon) Frequency 27-30--------------------------------- 11 31-34--------------------------------- 12 35-38--------------------------------- 1 39-42--------------------------------- 3
The approximate mean of the frequency distribution is 31.9
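The approximation treats every entry in a class as if it sat at the class midpoint. A minimal sketch of that calculation, with illustrative names:

```python
# Each entry: (lower limit, upper limit, frequency), from the table above.
classes = [
    (27, 30, 11),
    (31, 34, 12),
    (35, 38, 1),
    (39, 42, 3),
]

# Total number of cars.
n = sum(freq for _, _, freq in classes)

# Sum of (class midpoint) x (frequency) over all classes.
midpoint_total = sum((lo + hi) / 2 * freq for lo, hi, freq in classes)

grouped_mean = midpoint_total / n
print(n, midpoint_total, round(grouped_mean, 1))
```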
Use the given minimum and maximum data entries, and the number of classes, to find the class width, the lower class limits, and the upper class limits. minimum=7, maximum=90, 6 classes
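The class width is 14: (90 − 7)/6 ≈ 13.83, rounded up to 14. Lower class limits: 7, 21, 35, 49, 63, 77 (add the class width, 14, to each previous lower limit). Upper class limits: 20, 34, 48, 62, 76, 90 (each is one less than the next lower limit; the first is 7 + 14 − 1 = 20).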
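A sketch of the same procedure for integer data. It assumes the width is found by rounding the quotient up to the next whole number, and that each upper limit is one less than the next lower limit. The variable names are illustrative.

```python
import math

minimum, maximum, num_classes = 7, 90, 6

# Class width: range divided by number of classes, rounded up.
class_width = math.ceil((maximum - minimum) / num_classes)

# Lower limits start at the minimum and step by the class width.
lower_limits = [minimum + i * class_width for i in range(num_classes)]

# For integer data, each upper limit is one less than the next lower limit.
upper_limits = [lo + class_width - 1 for lo in lower_limits]

print(class_width)
print(lower_limits)
print(upper_limits)
```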
Explain how the interquartile range of a data set can be used to identify outliers.
The interquartile range (IQR) of a data set can be used to identify outliers because data values that are greater than Q3 + 1.5(IQR) or less than Q1 − 1.5(IQR) are considered outliers.
Without performing any calculations, determine which measure of central tendency best represents the graphed data. Explain your reasoning.
The median is the best measure because the data are skewed.
The ages (in years) of a random sample of shoppers at a gaming store are shown. Determine the range, mean, variance, and standard deviation of the sample data set. 12,21,23,15,13,17,20,16,14,17
The range is 11. The sample mean is 16.8. The sample variance is 12.8. The deviation of an entry x in a data set is the difference between the entry and the sample mean x̄. The sample variance is the sum of the squares of these deviations divided by the number of entries minus one (n − 1): s^2 = Σ(x − x̄)^2/(n − 1). The sample standard deviation is 3.6.
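Because these ages are a sample, the sum of squared deviations is divided by n − 1. Here is a sketch that computes the variance both by hand and with Python's standard statistics module. The variable names are illustrative.

```python
from statistics import mean, stdev, variance

# Ages of the sampled shoppers.
ages = [12, 21, 23, 15, 13, 17, 20, 16, 14, 17]

data_range = max(ages) - min(ages)
sample_mean = mean(ages)

# By hand: sum of squared deviations divided by (n - 1).
squared_deviations = sum((x - sample_mean) ** 2 for x in ages)
variance_by_hand = squared_deviations / (len(ages) - 1)

# The library's sample functions use the same n - 1 divisor.
sample_variance = variance(ages)
sample_sd = stdev(ages)

print(data_range, sample_mean, round(sample_variance, 1), round(sample_sd, 1))
```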
Determine whether the approximate shape of the distribution in the histogram shown is symmetric, uniform, skewed left, skewed right, or none of these. Justify your answer.
The shape of the distribution is approximately uniform because the bars are approximately the same height.
Determine whether the approximate shape of the distribution in the histogram shown is symmetric, uniform, skewed left, skewed right, or none of these. Justify your answer.
The shape of the distribution is skewed left because the bars have a tail to the left.
Determine whether the approximate shape of the distribution in the histogram shown is symmetric, uniform, skewed left, skewed right, or none of these. Justify your answer.
The shape of the distribution is skewed right because the bars have a tail to the right.
Determine whether the approximate shape of the distribution in the histogram shown is symmetric, uniform, skewed left, skewed right, or none of these. Justify your answer.
The shape of the distribution is symmetric, but not uniform, because a vertical line can be drawn down the middle, creating two halves that look approximately the same.
Explain the relationship between variance and standard deviation. Can either of these measures be negative? Explain.
The standard deviation is the positive square root of the variance. The standard deviation and variance can never be negative. Squared deviations can never be negative.
T OR F : Some quantitative data sets do not have medians.
The statement is false. All quantitative data sets have medians.
T OR F An outlier is any number above Q3 or below Q1.
This statement is false. A true statement is "An outlier is any number above Q3 + 1.5(IQR) or below Q1 − 1.5(IQR)."
What are some benefits of representing data sets using frequency distributions? What are some benefits of using graphs of frequency distributions?
What are some benefits of representing data sets using frequency distributions? Organizing the data into a frequency distribution can make patterns within the data more evident. What are some benefits of using graphs of frequency distributions? It can be easier to identify patterns of a data set by looking at a graph of the frequency distribution.
Discuss the similarities and the differences between the Empirical Rule and Chebychev's Theorem.
What is a similarity between the Empirical Rule and Chebychev's Theorem? Both estimate proportions of the data contained within k standard deviations of the mean. What is a difference between the Empirical Rule and Chebychev's Theorem? The Empirical Rule assumes the distribution is approximately symmetric and bell-shaped, whereas Chebychev's Theorem makes no assumptions about the shape of the distribution.
What is an advantage of using a stem-and-leaf plot instead of a histogram? What is a disadvantage?
What is an advantage of using a stem-and-leaf plot instead of a histogram? Stem-and-leaf plots contain the original data values, whereas histograms do not. What is a disadvantage? Histograms easily organize data sets of all sizes, whereas stem-and-leaf plots do not.
Find the mean, median, and mode of the data, if possible. If any of these measures cannot be found or a measure does not represent the center of the data, explain why. A sample of seven admission test scores for a professional school are listed below. 10.4 11.4 9.6 9.6 10.6 9.6 11.2
What is the mean score? 10.3 Does the mean represent the center of the data? The mean represents the center. Find the median. 10.4 Does the median represent the center of the data? The median represents the center. What is the mode of the scores? 9.6 Does (Do) the mode(s) represent the center of the data? The mode(s) does (do) not represent the center because it (one) is the smallest data value.
The ages of the winners of a cycling tournament are approximately bell-shaped. The mean age is 28.7 years, with a standard deviation of 3.4 years. The winner in one recent year was 28 years old. (a) Transform the age to a z-score. (b) Interpret the results. (c) Determine whether the age is unusual.
(a) Transform the age to a z-score. -0.21 (b) An age of 28 is 0.21 standard deviations below the mean. (c) No, this value is not unusual. A z-score between -2 and 2 is not unusual.
The ages of the winners of a cycling tournament are approximately bell-shaped. The mean age is 28.9 years, with a standard deviation of 3.8 years. The winner in one recent year was 24 years old. (a) Transform the age to a z-score. (b) Interpret the results. (c) Determine whether the age is unusual.
(a) Transform the age to a z-score. -1.29 (b) An age of 24 is 1.29 standard deviations below the mean. (c) No, this value is not unusual. A z-score between -2 and 2 is not unusual.
The ages of the winners of a cycling tournament are approximately bell-shaped. The mean age is 27.6 years, with a standard deviation of 3.7 years. The winner in one recent year was 33 years old. (a) Transform the age to a z-score. (b) Interpret the results. (c) Determine whether the age is unusual.
(a) Transform the age to a z-score. 1.46 (b) An age of 33 is 1.46 standard deviations above the mean. (c) No, this value is not unusual. A z-score between -2 and 2 is not unusual.
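The three cycling problems above use the same formula, z = (x − mean)/(standard deviation). A minimal sketch, treating a z-score outside the interval from -2 to 2 as unusual as in the answers. The function names are illustrative.

```python
def z_score(value, mean, standard_deviation):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / standard_deviation

def is_unusual(z):
    """Flag z-scores outside the interval from -2 to 2."""
    return abs(z) > 2

# (winner's age, mean age, standard deviation) for the three problems.
cases = [
    (28, 28.7, 3.4),
    (24, 28.9, 3.8),
    (33, 27.6, 3.7),
]

z_values = [round(z_score(age, mu, sd), 2) for age, mu, sd in cases]
print(z_values)
print([is_unusual(z) for z in z_values])
```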