6B Measures of Variation
Describe the process of calculating a standard deviation.
Compute the mean of the data set. Then find the deviation from the mean for every data value by subtracting the mean from the data value. Square each deviation, and then add the squares together. Divide this sum by the total number of data values minus 1. The standard deviation is the square root of this quotient. For example, the standard deviation of 2, 3, 4, 4, 6 is about 1.483. If all of the sample values are the same, then the standard deviation is 0.
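The steps above can be sketched in Python. This is a minimal illustration rather than a reference implementation; the function name `sample_standard_deviation` is made up for this example, and it assumes the sample formula that divides by the number of values minus 1.

```python
import math

def sample_standard_deviation(values):
    """Return the sample standard deviation (divides by n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data values")
    mean = sum(values) / n                           # step 1: mean
    deviations = [x - mean for x in values]          # step 2: deviations from the mean
    sum_of_squares = sum(d * d for d in deviations)  # step 3: square and add
    variance = sum_of_squares / (n - 1)              # step 4: divide by n - 1
    return math.sqrt(variance)                       # step 5: square root

print(round(sample_standard_deviation([2, 3, 4, 4, 6]), 3))  # 1.483
print(sample_standard_deviation([5, 5, 5, 5]))               # 0.0
```

Python's built-in `statistics.stdev` uses the same n − 1 convention, so it can be used to check the result.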
What are the quartiles of a distribution? How do we find them?
The quartiles are values that divide the data distribution into quarters. The lower quartile is the median of the data values in the lower half of the data set. The middle quartile is the overall median. The upper quartile is the median of the data values in the upper half of the data set.
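A short Python sketch of this median-of-halves procedure follows. The function name `quartiles` and the sample data are hypothetical. It also assumes that, when the number of values is odd, the middle value is left out of both halves; the card does not state this, and other conventions exist.

```python
from statistics import median

def quartiles(values):
    """Return (lower quartile, median, upper quartile) using medians of halves."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data values")
    half = n // 2
    lower_half = data[:half]        # values below the middle position
    upper_half = data[n - half:]    # values above the middle position
    # Assumption: with an odd count, the middle value belongs to neither half.
    return median(lower_half), median(data), median(upper_half)

# Hypothetical data, chosen only for illustration.
print(quartiles([1, 3, 5, 7, 9, 11, 13, 15]))  # (4.0, 8.0, 12.0)
print(quartiles([2, 4, 6, 8, 10, 12, 14]))     # (4, 8, 12)
```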
Consider two grocery stores at which the mean time in line is the same but the variation is different. At which store would you expect the customers to have more complaints about the waiting time?
The customers would have more complaints about the waiting time at the store that has more variation because some customers would have longer waits and might think they are being treated unequally.
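To make the idea concrete, here is a small Python sketch with two made-up sets of waiting times that share the same mean but differ in variation. The numbers are hypothetical and are not taken from any real store.

```python
from statistics import mean, stdev

# Hypothetical waiting times in minutes (not from any real store).
store_a = [4, 5, 5, 5, 6]   # times cluster near the mean
store_b = [1, 2, 5, 8, 9]   # same mean, much wider spread

for name, times in [("Store A", store_a), ("Store B", store_b)]:
    print(name,
          "mean:", mean(times),
          "standard deviation:", round(stdev(times), 2),
          "longest wait:", max(times))
```

Both stores average 5 minutes, but the store with the larger standard deviation also has the longest individual wait, which is what customers tend to notice.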
A report claims that the returns for investment portfolios with a single stock have a standard deviation of 0.51, while the returns for portfolios with 35 stocks have a standard deviation of 0.329. Explain how the standard deviation measures the risk in these two types of portfolios.
A lower standard deviation means more certainty in the return and less risk. Hence, the returns for portfolios with 35 stocks have less risk than the ones with a single stock.
Each night you total the day's sales and the total volume of ice cream sold in your shop. You notice that when an employee named Ben works, the mean price of the ice cream sold is $2.95 per pint with a standard deviation of $0.35. On nights when an employee named Jerry works, the mean price of the ice cream sold is $2.90 per pint with a standard deviation of $0.10. Which employee likely receives more complaints that his servings are too small? Explain.
The means are nearly equal. However, because Ben's standard deviation is larger, the price per pint varies more on his shifts. A higher price per pint means a smaller serving for the same money, so Ben likely serves more small portions and receives more complaints.
How is the range of a distribution defined and calculated?
The range of a data set is the difference between its highest and lowest data values: range = highest value (max) − lowest value (min).
Two airlines have data on the arrival times of their flights. An arrival time of +8 minutes means the flight arrived 8 minutes early. An arrival time of −6 minutes means the flight arrived 6 minutes late. Skyview Airlines has a mean arrival time of −4 minutes with a standard deviation of 4 minutes. SkyHigh Airlines has a mean arrival time of 1.5 minutes with a standard deviation of 9.3 minutes. Explain the meaning of these figures and why they would affect your choice of airline.
If you want to avoid long delays, Skyview Airlines might be the better choice. Skyview Airlines on average runs slightly behind schedule, but it tends to have smaller delays. While SkyHigh Airlines on average runs ahead of schedule, it can also have long delays (late arrivals).
After recording the pizza delivery times for two different pizza shops, you conclude that one pizza shop has a mean delivery time of 45 minutes with a standard deviation of 3 minutes. The other shop has a mean delivery time of 43 minutes with a standard deviation of 19 minutes. Interpret these figures. If you liked the pizzas from both shops equally well, which one would you order from? Why?
The means are nearly equal, but the variation is significantly greater for the second shop than for the first. Choose the first shop. The delivery time is more reliable because it has a lower standard deviation.
Briefly describe the use of the range rule of thumb for interpreting the standard deviation. What are its limitations?
The standard deviation is approximately the range divided by four. The range rule of thumb does not work well when the highest or lowest value is an outlier.
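The rule and its limitation can be illustrated with a short Python sketch. The helper name `range_rule_estimate` and both data sets are hypothetical, chosen only to show the effect of a single outlier on the estimate.

```python
from statistics import stdev

def range_rule_estimate(values):
    """Estimate the standard deviation as range / 4 (range rule of thumb)."""
    return (max(values) - min(values)) / 4

# Hypothetical data: 30 values clustered around 16.
typical = ([12] + [13] * 2 + [14] * 4 + [15] * 5 + [16] * 6
           + [17] * 5 + [18] * 4 + [19] * 2 + [20])
# Same data with the highest value replaced by an outlier.
with_outlier = typical[:-1] + [60]

for label, data in [("typical", typical), ("with outlier", with_outlier)]:
    print(label,
          "estimate:", range_rule_estimate(data),
          "actual:", round(stdev(data), 2))
```

For the clustered data the estimate (2.0) is close to the computed standard deviation (about 1.95). With the outlier, the estimate jumps to 12.0 while the computed standard deviation is only about 8.26, so the rule overstates the variation in this example.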
Decide whether the following statement makes sense or does not make sense. The mean gas mileage of the compact cars we tested was 34 miles per gallon, with a standard deviation of 5 gallons.
The statement does not make sense because the standard deviation must have the same units as the mean and the data. The standard deviation measures how far data values typically lie from the mean, so here it should be given in miles per gallon, not gallons.
Decide whether the following statement makes sense or does not make sense. For the 30 students who took the test, the high score was 80, the median was 75, and the low score was 40.
The statement makes sense because it is possible that, when the 30 scores are sorted from low to high, the lowest value was 40, the highest value was 80, and 75 was halfway between the 15th and the 16th scores.
Decide whether the following statement makes sense or does not make sense. The highest exam score was in the upper quartile of the distribution.
The statement makes sense because the highest score will be in the highest quartile.
Decide whether the following statement makes sense or does not make sense. I examined the data carefully, and the range was greater than the standard deviation.
The statement makes sense because the range is never smaller than the standard deviation. By the range rule of thumb, the range is approximately four times the standard deviation.
Decide whether the following statement makes sense or does not make sense. The standard deviation for the heights of a group of 5-year-old children is smaller than the standard deviation for the heights of a group of children who range in age from 3 to 15.
The statement makes sense because the range of data for the heights of a group of 5-year-old children is smaller than the range of data for the heights of a group of children who range in age from 3 to 15.
Decide whether the following statement makes sense or does not make sense. Both exams had the same range, so they must have had the same median.
This does not make sense because the range is the difference between the highest and lowest data values. It has nothing to do with the median.
Define the five-number summary, and explain how to depict it visually with a boxplot.
The five-number summary consists of the low value, lower quartile, median, upper quartile, and high value. To draw a boxplot: Step 1: Draw a number line that spans all the values in the data set. Step 2: Enclose the values from the lower to upper quartile in a box. The thickness of the box has no meaning. Step 3: Draw a vertical line through the box at the median. Step 4: Add "whiskers" extending to the low and high values.
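As a rough illustration, the sketch below computes a five-number summary and prints a one-line text boxplot. The function names `five_number_summary` and `text_boxplot` and the sample data are hypothetical. The quartiles use the median-of-halves method and assume that, for an odd number of values, the middle value is left out of both halves.

```python
from statistics import median

def five_number_summary(values):
    """Return (low, lower quartile, median, upper quartile, high)."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data values")
    half = n // 2
    # Assumption: with an odd count, the middle value is left out of both halves.
    return (data[0], median(data[:half]), median(data),
            median(data[n - half:]), data[-1])

def text_boxplot(summary, width=41):
    """Draw a rough one-line boxplot: whiskers '-', box '=', median 'M'."""
    low, q1, med, q3, high = summary
    if high == low:
        return "M"

    def column(value):
        # Map a data value to a character position on the number line.
        return round((value - low) / (high - low) * (width - 1))

    line = ["-"] * width                      # whiskers span low to high
    for i in range(column(q1), column(q3) + 1):
        line[i] = "="                         # box spans the quartiles
    line[0] = line[-1] = "|"                  # whisker ends at low and high
    line[column(med)] = "M"                   # median marker inside the box
    return "".join(line)

# Hypothetical data, chosen only for illustration.
summary = five_number_summary([1, 3, 5, 7, 9, 11, 13, 15, 21])
print(summary)
print(text_boxplot(summary))
```

The text rendering is only a stand-in for a drawn boxplot; a plotting library would normally be used, and many libraries compute quartiles with a different convention.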