STATS HW3
At one hospital there is some concern about the high turnover of nurses. A survey was done to determine how long (in months) nurses had been in their current positions. The responses (in months) of 20 nurses were as follows.
See notebook.
Consider the following ordered data. 2 5 5 6 7 7 8 9 10 (a) Find the low, Q1, median, Q3, and high. (b) Find the interquartile range. (c) Make a box-and-whisker plot.
See notebook.
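A quick way to check the notebook work is a short Python sketch. It assumes the median-of-halves rule for quartiles (the middle value is left out of both halves when n is odd); other software may use a different quartile convention and give slightly different Q1 and Q3. The function name five_number_summary is made up for this illustration.

```python
from statistics import median

def five_number_summary(values):
    """Return (low, Q1, median, Q3, high) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # values above it (middle value skipped when n is odd)
    return data[0], median(lower), median(data), median(upper), data[-1]

data = [2, 5, 5, 6, 7, 7, 8, 9, 10]
low, q1, med, q3, high = five_number_summary(data)
iqr = q3 - q1                      # interquartile range = Q3 - Q1
print("low:", low, "Q1:", q1, "median:", med, "Q3:", q3, "high:", high, "IQR:", iqr)
```

Under this convention the lower half is 2 5 5 6 and the upper half is 7 8 9 10, so Q1 and Q3 are the medians of those two halves.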
What symbol is used for the standard deviation when it is a sample statistic? What symbol is used for the standard deviation when it is a population parameter?
Sample statistic: s. Population parameter: 𝜎.
Which average - mean, median, or mode - is associated with the standard deviation?
Mean
How large is a wolf pack? The following information is from a random sample of winter wolf packs. Winter pack sizes are given below. Compute the mean, median, and mode for the size of winter wolf packs. (Round your mean to four decimal places.)
Mean = 11.1667 Median = 11 Mode = 12
Find the mean, median, and mode of the data set. 9 6 8 6 7
Mean = 7.2 Median = 7 Mode = 6
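The same three averages can be checked with Python's standard statistics module. This is only a verification sketch, not part of the assigned work.

```python
from statistics import mean, median, multimode

data = [9, 6, 8, 6, 7]

avg = mean(data)          # sum of the values divided by how many there are
mid = median(data)        # middle value after sorting: 6 6 7 8 9
modes = multimode(data)   # list of the most frequent value(s)

print("mean:", avg, "median:", mid, "mode(s):", modes)
```

multimode returns a list, so a data set with a tie for most frequent value would show every mode.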
What symbol is used for the arithmetic mean when it is a sample statistic? What symbol is used when the arithmetic mean is a population parameter?
statistic, x̄; parameter, 𝜇
Consider the mode, median, and mean. (a) Which average represents the middle value of a data distribution? (b) Which average represents the most frequent value of a data distribution? (c) Which average takes all the specific values into account?
(a) median (b) mode (c) mean
Each of the following data sets has a mean of x̄ = 10. (i) 8 9 10 11 12 (ii) 7 9 10 11 13 (iii) 7 8 10 12 13 (a) Without doing any computations, order the data sets according to increasing value of standard deviations. (b) Why do you expect the difference in standard deviations between data sets (i) and (ii) to be greater than the difference in standard deviations between data sets (ii) and (iii)? Hint: Consider how much the data in the respective sets differ from the mean.
(a) (i), (ii), (iii) (b) The data change between data sets (i) and (ii) increased the sum of squared differences Σ(x − x̄)² by more than the data change between data sets (ii) and (iii).
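Although the question says to reason without computing, the claim in (b) can be confirmed afterwards with a small Python sketch. The helper name sum_sq_dev is made up for this illustration.

```python
def sum_sq_dev(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)

sets = {
    "i": [8, 9, 10, 11, 12],
    "ii": [7, 9, 10, 11, 13],
    "iii": [7, 8, 10, 12, 13],
}
ss = {name: sum_sq_dev(vals) for name, vals in sets.items()}

jump_i_to_ii = ss["ii"] - ss["i"]       # change caused by moving 8, 12 out to 7, 13
jump_ii_to_iii = ss["iii"] - ss["ii"]   # change caused by moving 9, 11 out to 8, 12
print(ss, jump_i_to_ii, jump_ii_to_iii)
```

The sums come out to 10, 20, and 26, so the step from (i) to (ii) adds 10 while the step from (ii) to (iii) adds only 6.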
Angela took a general aptitude test and scored in the 88th percentile for aptitude in accounting. (a) What percentage of the scores were at or below her score? (b) What percentage were above?
(a) 88% (b) 12%
Consider a data set of 15 distinct measurements with mean A and median B. (a) If the highest number were increased, what would be the effect on the median and mean? (b) If the highest number were decreased to a value still larger than B, what would be the effect on the median and mean? (c) If the highest number were decreased to a value smaller than B, what would be the effect on the median and mean?
(a) The mean would increase while the median would remain the same. (b) The mean would decrease while the median would remain the same. (c) Both the mean and median would decrease.
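The three cases can be tried on a concrete example. The data below (the integers 1 through 15, with mean 8 and median 8) are hypothetical values chosen only for illustration, and replace_max is a made-up helper name.

```python
from statistics import mean, median

base = list(range(1, 16))   # hypothetical data: 15 distinct values, mean 8, median 8

def replace_max(values, new_value):
    """Return a sorted copy of values with the highest entry replaced by new_value."""
    out = sorted(values)
    out[-1] = new_value
    return sorted(out)

case_a = replace_max(base, 100)   # (a) highest value increased
case_b = replace_max(base, 8.5)   # (b) decreased, but still above the median
case_c = replace_max(base, 0)     # (c) decreased below the median

for label, data in (("a", case_a), ("b", case_b), ("c", case_c)):
    print(label, "mean:", mean(data), "median:", median(data))
```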
In your biology class, your final grade is based on several things: a lab score, scores on two major tests, and your score on the final exam. There are 100 points available for each score. However, the lab score is worth 19% of your total grade, each major test is worth 24.5%, and the final exam is worth 32%. Compute the weighted average for the following scores: 61 on the lab, 77 on the first major test, 99 on the second major test, and 86 on the final exam.
Weighted avg = (61*0.19+77*0.245+99*0.245+86*0.32)/(0.19+0.245+0.245+0.32) = 82.23
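The same arithmetic as a small Python sketch. The function name weighted_average is made up for this illustration; it divides by the sum of the weights, so the weights do not have to add up to 1.

```python
def weighted_average(scores, weights):
    """Sum of score*weight divided by the sum of the weights."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)

# lab, first major test, second major test, final exam
scores = [61, 77, 99, 86]
weights = [0.19, 0.245, 0.245, 0.32]

grade = weighted_average(scores, weights)
print(round(grade, 2))
```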
When computing the standard deviation, does it matter whether the data are sample data or data comprising the entire population? Explain.
Yes. The formula for s divides by n − 1, while the formula for 𝜎 divides by N.
One indicator of an outlier is that an observation is more than 2.5 standard deviations from the mean. Consider the data value 80. (a) If a data set has mean 70 and standard deviation 5, is 80 a suspect outlier? (b) If a data set has mean 70 and standard deviation 3, is 80 a suspect outlier?
(a) (mean + s*2.5) = (70+5*2.5) = 82.5 where 80 < 82.5; therefore, No, since 80 is less than 2.5 standard deviations above the mean. (b) (mean + s*2.5) = (70+3*2.5) = 77.5 where 80 > 77.5; therefore, Yes, since 80 is more than 2.5 standard deviations above the mean.
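The 2.5 standard deviation rule can be written as a one-line check. This sketch uses a made-up function name, is_suspect_outlier, and tests the distance from the mean in either direction.

```python
def is_suspect_outlier(x, mean, sd, k=2.5):
    """True when x lies more than k standard deviations from the mean."""
    return abs(x - mean) > k * sd

part_a = is_suspect_outlier(80, mean=70, sd=5)   # |80 - 70| = 10 vs 2.5 * 5 = 12.5
part_b = is_suspect_outlier(80, mean=70, sd=3)   # |80 - 70| = 10 vs 2.5 * 3 = 7.5
print(part_a, part_b)
```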
Consider two data sets. Set A: n = 5; x̄ = 4 Set B: n = 50; x̄ = 4 (a) Suppose the number 20 is included as an additional data value in Set A. Compute x̄ for the new data set. Hint: Σx = nx̄. To compute x̄ for the new data set, add 20 to Σx of the original data set and divide by 6. (Round your answer to two decimal places.) (b) Suppose the number 20 is included as an additional data value in Set B. Compute x̄ for the new data set. (Round your answer to two decimal places.) (c) Why does the addition of the number 20 to each data set change the mean for Set A more than it does for Set B?
(a) (n*x̄+20)/(n+1) = (5*4+20)/6 = 6.67 (b) (n*x̄+20)/(n+1) = (50*4+20)/51 = 4.31 (c) Set B has a larger number of data values than set A, so to find the mean of B we divide the sum of the values by a larger value than for A.
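The calculation in (a) and (b) rebuilds the old total from n times the mean, adds the new value, and divides by n + 1. A minimal Python sketch, with a made-up function name:

```python
def mean_after_adding(n, mean, new_value):
    """New mean after one extra value joins a data set of size n with the given mean."""
    old_total = n * mean                   # the sum of the original values
    return (old_total + new_value) / (n + 1)

new_mean_a = mean_after_adding(5, 4, 20)    # Set A
new_mean_b = mean_after_adding(50, 4, 20)   # Set B
print(round(new_mean_a, 2), round(new_mean_b, 2))
```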
One standard for admission to Redfield College is that the student must rank in the upper quartile of his or her graduating high school class. What is the minimal percentile rank of a successful applicant?
75%
The town of Butler, Nebraska, decided to give a teacher-competency exam and defined the passing scores to be those in the 70th percentile or higher. The raw test scores ranged from 0 to 100. Was a raw score of 82 necessarily a passing score? Explain.
No, it might have a percentile rank less than 70.
When a distribution is mound-shaped symmetrical, what is the general relationship among the values of the mean, median, and mode?
The mean, median, and mode are approximately equal.
What is the age distribution of adult shoplifters (21 years of age or older) in supermarkets? The following is based on information taken from the National Retail Federation. A random sample of 895 incidents of shoplifting gave the following age distribution. Estimate the mean age, sample variance, and sample standard deviation for the shoplifters. For the class 41 and over, use 45.5 as the class midpoint. (Round your answers to one decimal place.)
x̄ = 35.33 s^2 = 60.93 s = 7.81
Consider a data set with at least three data values. Suppose the highest value is increased by 10 and the lowest is decreased by 10. (a) Does the mean change? Explain. (b) Does the median change? Explain. (c) Is it possible for the mode to change? Explain.
(a) No, the sum of the data does not change. (b) No, changing the extreme data values does not affect the median. (c) Yes, depending on which data value occurs most frequently after the data are changed.
Given the sample data. x: 21, 17, 13, 32, 25 (a) Find the range. (b) Verify that Σx = 108 and Σx² = 2,548. (c) Use the results of part (b) and appropriate computation formulas to compute the sample variance s² and sample standard deviation s. (Round your answers to two decimal places.) (d) Use the defining formulas to compute the sample variance s² and sample standard deviation s. (Round your answers to two decimal places.) (e) Suppose the given data comprise the entire population of all x values. Compute the population variance 𝜎² and population standard deviation 𝜎. (Round your answers to two decimal places.)
(a) Range = 19 (b) Σx = 108 and Σx^2 = 2,548 (c) s^2 = 53.8, s = 7.33 (d) s^2 = 53.8, s = 7.33 (e) 𝜎^2 = 43.04, 𝜎 = 6.56
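Parts (b) through (e) can be checked side by side in Python. The sketch below computes the sample variance twice, once with the computation formula (Σx^2 − (Σx)^2/n)/(n − 1) and once with the defining formula Σ(x − x̄)^2/(n − 1), then divides the same sum of squares by N for the population version. Variable names are chosen for this illustration.

```python
from math import sqrt

x = [21, 17, 13, 32, 25]
n = len(x)

data_range = max(x) - min(x)
sum_x = sum(x)                      # should be 108
sum_x2 = sum(v * v for v in x)      # should be 2548

# Computation formula: uses only the two sums above
s2_comp = (sum_x2 - sum_x ** 2 / n) / (n - 1)

# Defining formula: squared deviations from the mean
xbar = sum_x / n
ss = sum((v - xbar) ** 2 for v in x)
s2_def = ss / (n - 1)

# Population version: same sum of squares, divided by N instead of n - 1
sigma2 = ss / n

print(data_range, sum_x, sum_x2)
print(round(s2_comp, 2), round(sqrt(s2_comp), 2))
print(round(s2_def, 2), round(sqrt(s2_def), 2))
print(round(sigma2, 2), round(sqrt(sigma2), 2))
```

Both formulas give the same sample variance up to floating-point rounding, which is the point of parts (c) and (d).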
Consider the data set. 2, 4, 7, 8, 9 (a) Find the range. (b) Use the defining formula to compute the sample standard deviation s. (Round your answer to two decimal places.) (c) Use the defining formula to compute the population standard deviation 𝜎. (Round your answer to two decimal places.)
(a) Range = 7 (b) s = 2.92 (c) 𝜎 = 2.61
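Python's standard statistics module has both versions built in: stdev divides by n − 1 (sample) and pstdev divides by N (population). A short check of the answers above:

```python
from statistics import stdev, pstdev

data = [2, 4, 7, 8, 9]

data_range = max(data) - min(data)
s = stdev(data)        # sample standard deviation, divisor n - 1
sigma = pstdev(data)   # population standard deviation, divisor N

print(data_range, round(s, 2), round(sigma, 2))
```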
Some data sets include values so high or so low that they seem to stand apart from the rest of the data. These data are called outliers. Outliers may represent data collection errors, data entry errors, or simply valid but unusual data values. It is important to identify outliers in the data set and examine the outliers carefully to determine if they are in error. One way to detect outliers is to use a box-and-whisker plot. Data values that fall beyond the limits Lower limit: Q1 − 1.5 ✕ (IQR) Upper limit: Q3 + 1.5 ✕ (IQR) where IQR is the interquartile range, are suspected outliers. In the computer software package Minitab, values beyond these limits are plotted with asterisks (*). Students from a statistics class were asked to record their heights in inches. The heights (as recorded) were as follows. 65, 72, 68, 64, 60, 55, 73, 71, 52, 63, 61, 74, 69, 67, 74, 50, 4, 75, 67, 62, 66, 80, 64, 65 (a) Make a box-and-whisker plot of the data. (b) Find the value of the interquartile range (IQR). (c) Multiply the IQR by 1.5 and find the lower and upper limits. (Enter your answers to one decimal place.) (d) Are there any data values below the lower limit? Above the upper limit? List any suspected outliers. What might be some explanations for the outliers?
(a) See notebook. (b) IQR = 10 (c) lower limit = 46.5, upper limit = 86.5 (d) Yes, 4 is below the lower limit and is probably an error. No data values are above the upper limit.
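The limits in (c) and the outlier in (d) can be reproduced with a short Python sketch. It assumes the median-of-halves rule for Q1 and Q3; software such as Minitab may use a different quartile convention and report slightly different limits.

```python
from statistics import median

heights = [65, 72, 68, 64, 60, 55, 73, 71, 52, 63, 61, 74,
           69, 67, 74, 50, 4, 75, 67, 62, 66, 80, 64, 65]

data = sorted(heights)
n = len(data)
half = n // 2

q1 = median(data[:half])             # median of the lower half
q3 = median(data[half + n % 2:])     # median of the upper half
iqr = q3 - q1

lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Values beyond either limit are suspected outliers
outliers = [h for h in data if h < lower_limit or h > upper_limit]
print(q1, q3, iqr, lower_limit, upper_limit, outliers)
```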
Consumer Reports rated automobile insurance companies and gave annual premiums for top-rated companies in several states. The figure below shows box plots for annual premiums for urban customers (married couple with one 17-year-old son) in three states. The box plots were all drawn using the same scale on a TI-84Plus/TI-83Plus calculator. (a) Which state has the lowest premium? The highest? (b) Which state has the highest median premium? (c) Which state has the smallest range of premiums? the smallest interquartile range? (d) The other set of figures give the five-number summaries generated on the TI-84Plus/TI-83Plus calculators for the box plots. Match the five-number summaries to the appropriate box plots.
(a) lowest, California; highest, Pennsylvania (b) Pennsylvania (c) smallest range, California; smallest IQR, Texas (d) Texas (a); Pennsylvania (b); California (c)
What is the relationship between the variance and the standard deviation for a sample data set?
The standard deviation is the square root of the variance.
Find the weighted average of a data set where 10 has a weight of 2, 20 has a weight of 3, and 30 has a weight of 5.
Weighted avg = (10*2+20*3+30*5)/(2+3+5) = 23