Unit 6
What was the mean price of the pasta sauce, as seen in the video?
$2.00 (Add the amounts and divide by the number of amounts)
According to the video, what is the average wage of pharmacists as stated by management?
$70.00
What is the purpose of a hypothesis test? How do we formulate the null and alternative hypotheses for a test?
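A hypothesis test is used to determine whether the null hypothesis should be rejected in favor of an alternative hypothesis. *How do we formulate the null and alternative hypotheses for a test? -The null hypothesis is the starting assumption, typically that a population parameter equals a specific value. The alternative hypothesis is the claim that the parameter is greater than, less than, or different from that value.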
What is a normal distribution? Briefly describe the conditions that make a normal distribution.
A normal distribution is a symmetric, bell-shaped distribution with a single peak. Its peak corresponds to the mean, median, and mode of the distribution. Its variation is characterized by the standard deviation of the distribution.
In a normal distribution, what is true about data values farther from the mean?
Data values farther from the mean are less common than data values close to the mean by the definition and bell-shaped construction of a normal distribution.
The weights of babies born at Belmont Hospital are normally distributed with a mean of 6.8 pounds and a standard deviation of 7 pounds.
Does not make sense.
Which of the following is a true statement about the graphs of normal distributions?
Graphs of normal distributions always have the same characteristic bell shape: the single central peak corresponds to the mean, median, and mode of the distribution, and the variation is characterized by the standard deviation.
The lower quartile for wages at a coffee shop is $9.25, and the upper quartile is $11.25. What can you conclude?
Half the workers earn between $9.25 and $11.25, because the lower quartile has one-quarter of the workers below it while the upper quartile has three-quarters of the workers below it.
Interpret the results. The test ▼ significant at the 0.05 level. It ▼ reasonable to conclude that the preparation course improves the score. The test ▼ evidence for rejecting the null hypothesis.
Is, Is, offers moderate
An airline with a 93% on-time departure rate has 11 out of 200 flights with late departures.
It is not significant because the difference between the expected number of late departures, 14, and the actual number of late departures, 11, can be explained by natural variation.
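The expected count of 14 comes from applying the 7% late rate to the 200 flights. A minimal sketch of that arithmetic, using only the figures from the question (the variable names are illustrative):

```python
# Expected number of late departures if the stated 93% on-time rate holds.
on_time_rate = 0.93   # from the question
flights = 200         # from the question
observed_late = 11    # from the question

# 7% of 200 flights; round() absorbs floating-point error in 1 - 0.93.
expected_late = round((1 - on_time_rate) * flights)
difference = expected_late - observed_late

print(expected_late, difference)  # 14 3
```

A gap of 3 flights out of 200 is small enough to be attributed to chance, which is the reasoning behind the answer.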
For the following event, state whether you think the difference between what occurred and what you would expect by chance is statistically significant. Explain. Nine winners of the Power Ball lottery in a row bought their tickets at the same 7-Eleven store.
It is significant because one winner is unlikely, two winners is very unlikely, and 9 winners is certainly below a 0.01 probability level.
The distribution of wages at a company is right-skewed with outliers at high values. Assuming you would like a high wage, would you hope that your wage was closer to the mean, median or mode?
It should be closer to the mean because the mean is greater than the median and mode when the distribution is right-skewed.
The heights of male basketball players at a local college are normally distributed with a mean of 6 feet 3 inches and a standard deviation of 3 inches.
Makes sense
What type of distribution has a negative standard deviation?
None. The standard deviation cannot be negative, because each deviation from the mean is squared.
State whether you would expect the following data set to be normally distributed or not. Scores on an easy statistics exam
Not normally distributed
What is statistical inference? Why is it important?
Statistical inference is the process of making a conclusion about a population from results for a sample. It is important because the goal of most statistical studies is to learn something about an entire population.
According to the video, what type of distribution would represent the prices of 24-ounce containers of 30 assorted brands of dish soap?
Symmetric
Consider two grocery stores at which the mean time in line is the same but the variation is different. At which store would you expect the customers to have more complaints about the waiting time? Explain.
The customers would have more complaints about the waiting time at the store that has more variation because some customers would have longer waits and might think they are being treated unequally.
The scores on a psychology exam were normally distributed with a mean of 58 and a standard deviation of 7. What is the standard score for an exam score of 67?
The standard score is (67 - 58)/7 ≈ 1.29.
The scores on a psychology exam were normally distributed with a mean of 52 and a standard deviation of 8. What is the standard score for an exam score of 64?
The standard score is 1.5.
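Both standard-score answers above follow from the same formula: subtract the mean from the data value, then divide by the standard deviation. A short sketch that reproduces them (the function name `standard_score` is an illustrative choice, not from the source):

```python
def standard_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev

z_first = standard_score(67, 58, 7)    # exam with mean 58, standard deviation 7
z_second = standard_score(64, 52, 8)   # exam with mean 52, standard deviation 8

print(round(z_first, 2), z_second)  # 1.29 1.5
```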
The mean gas mileage of the compact cars we tested was 34 miles per gallon, with a standard deviation of 5 gallons.
The statement does not make sense because the standard deviation should have the same units as the mean and the data.
Consider the distribution of exam scores (graded from 0 to 100) for 84 students when 37 students got an A, 27 students got a B, and 20 students got a C. Complete parts (a) through (d) below.
There would probably be one peak because there are no obvious reasons why the exam scores would form different groups. *Make a sketch of the distribution. Choose the correct answer below. -A Smallest stroke with small peak. *What shape would you expect for the distribution? -The distribution would probably be left-skewed because many of the students got an A, and very few got a C. *What variation would you expect in the distribution? -The variation would probably be large because many students got an A, some got a B, and a small number got a C, and so the data are not clustered.
In a normal distribution, where do about 2/3 of the data values fall?
They fall within 1 standard deviation of the mean. This is because by the 68-95-99.7 rule, approximately 68% or two-thirds of all data points in a normal distribution fall within 1 standard deviation of the mean.
One hundred students take a chemistry exam. All but two of the students score between 58 and 70 points, but one student gets a 26 and one student gets a 96. What do the scores 26 and 96 represent?
They represent outliers. One is much lower than most of the values, and one is much higher than most of the values.
Decide whether the following statement makes sense or does not make sense. Explain your reasoning. My professor graded the final score on a curve, and she gave a grade of A to anyone who had a standard score of 2 or more.
This makes sense because a standard score of 2 or more corresponds to roughly the 97.7th percentile. Though this curve is stingy on giving out A's to students, it is still giving the top students the highest grade.
Briefly describe two possible sources of confusion about the "average."
Two possible sources of confusion are not knowing whether the reported average is the mean or the median, and not having enough information about how the average was computed.
An acquaintance tells you that his IQ is in the 102nd percentile. What can you conclude from this information?
You can conclude that he doesn't understand percentiles because it is impossible to be in the 102nd percentile.
Exam results for 100 students are given below. For the given exam grades, briefly describe the shape and variation of the distribution. median=85, mean=78, low score=19, high score=87
left-skewed, high
The numbers of words defined on randomly selected pages from a dictionary are shown below. Find the mean, median, and mode of the listed numbers. 43 71 78 58 45 74 44 49 63 52
The mean is 57.7. *What is the median? 55 *What is(are) the mode(s)? There is no mode.
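These three answers can be checked with Python's standard `statistics` module. The sketch below treats "no mode" as meaning that every value occurs exactly once, which is an assumption about how the question defines it:

```python
from statistics import mean, median, multimode

words = [43, 71, 78, 58, 45, 74, 44, 49, 63, 52]

data_mean = mean(words)      # 577 / 10
data_median = median(words)  # average of the two middle values, 52 and 58
modes = multimode(words)     # all most-frequent values

# If every value is tied for "most frequent", no value stands out as a mode.
has_mode = len(modes) < len(words)

print(data_mean, data_median, has_mode)  # 57.7 55.0 False
```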
Do the following for the case below. a. State the null and alternative hypotheses for a hypothesis test. b. Describe the two possible outcomes of the test, using the context of the given situation. The governor claims that the percentage of adults over 21 who have graduated from high school is greater than 73%, the national average.
The percentage of adults over 21 who have graduated from high school is 73% *What is the alternative hypothesis? -The percentage of adults over 21 who have graduated from high school is greater than 73%. *What is the outcome if the null hypothesis is rejected? -There is evidence that the percentage of high school graduates exceeds 73%. *What is the outcome if the null hypothesis is not rejected? -There is insufficient evidence to conclude that the percentage of high school graduates exceeds 73%.
After recording the pizza delivery times for two different pizza shops, you conclude that one pizza shop has a mean delivery time of 46 minutes with a standard deviation of 2 minutes. The other shop has a mean delivery time of 45 minutes with a standard deviation of 18 minutes. Interpret these figures. If you liked the pizzas from both shops equally well, which one would you order from? Why?
The means are nearly equal, but the variation is significantly greater for the second shop than for the first. *If you liked the pizzas from both shops equally well, which one would you order from? Why? Choose the correct answer below. -Choose the first shop. The delivery time is more reliable because it has a lower standard deviation.
Explain why accepting the null hypothesis is not a possible outcome. Choose the correct answer below.
The null hypothesis is the starting assumption. The hypothesis test may not give us reason to reject this starting assumption, but it cannot by itself give us reason to conclude that the starting assumption is true.
Assume that a set of test scores is normally distributed with a mean of 110 and a standard deviation of 15. Use the 68-95-99.7 rule to find the following quantities.
The percentage of scores less than 110 is 50%. *The percentage of scores greater than 125 is 16%. *The percentage of scores between 80 and 125 is 81.5%.
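The three percentages come from splitting the 68% and 95% bands in half around the mean. A rough sketch of that bookkeeping, which only handles cutoffs that sit a whole number of standard deviations from the mean (the helper name `percent_from_mean_to` is hypothetical):

```python
# Approximate percent of values within k standard deviations of the mean.
WITHIN = {1: 68.0, 2: 95.0, 3: 99.7}

def percent_from_mean_to(k):
    """Percent of values between the mean and k standard deviations on one side."""
    return WITHIN[abs(k)] / 2

mean, sd = 110, 15
k_80 = (80 - mean) // sd     # -2: 80 is two standard deviations below the mean
k_125 = (125 - mean) // sd   # +1: 125 is one standard deviation above the mean

below_110 = 50.0                                   # symmetry about the mean
above_125 = 50.0 - percent_from_mean_to(k_125)     # 50 - 34
between_80_and_125 = (percent_from_mean_to(k_80)   # 47.5 below the mean
                      + percent_from_mean_to(k_125))  # plus 34 above it

print(below_110, above_125, between_80_and_125)  # 50.0 16.0 81.5
```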
What is a percentile? Describe how the accompanying table allows you to relate standard scores and percentiles. Click the icon to view the table of standard scores and percentiles.
The percentile of a data value is the percentage of all data values in a data set that are less than or equal to it. For the standard scores given in the table, the table gives the percentage of values in the distribution less than or equal to that value.
The distribution of grades was left-skewed, but the mean, median, and mode were all the same.
This does not make sense because the mean and median should lie somewhere to the left of the mode if the distribution is left-skewed.
Define the five-number summary, and explain how to depict it visually with a boxplot.
low value, lower quartile, median, upper quartile, and high value. *Explain how to depict the five numbers visually with a boxplot. Choose the correct answer below. Select all that apply. -Draw a number line that spans all the values in the data set. Enclose the values from the lower to upper quartile in a box. Draw a vertical line through the box at the median. Add "whiskers" extending to the low and high values.
Big Bank (three lines): 4.1 5.2 5.6 6.2 6.7 7.2 7.7 7.7 8.5 9.3 11.0
mean = sum of all values/total number of values *The sum of all values is 79.2 * The total number of values is 11. *The mean is 7.2 *Notice that the data are given in ascending order. How should the median be found in this case? Select the correct choice below and fill in the answer box(es) to complete your choice. -The median is the 6th value in the sorted data set. *Thus, the median is 7.2 *The mean is equal to the median.
Describe the process of calculating a standard deviation. Give a simple example of its calculation (such as calculating the standard deviation of the numbers 2, 3, 4, 4, and 6). What is the standard deviation if all of the sample values are the same?
Compute the mean of the data set. Then find the deviation from the mean for every data value by subtracting the mean from the data value. *Find the squares of all the deviations from the mean, and then add them together. *Divide this sum by the total number of data values minus 1. *The standard deviation is the square root of this quotient. *The standard deviation of the numbers 2,3,4,4, and 6 is approximately 1.483. *If all of the sample values are the same, then the standard deviation is 0.
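The steps above can be followed line by line for the numbers 2, 3, 4, 4, and 6. This sketch uses the sample formula (dividing by the number of values minus 1) as described in the answer, and compares the result with `statistics.stdev` as a cross-check:

```python
from math import sqrt
from statistics import stdev

values = [2, 3, 4, 4, 6]

mean = sum(values) / len(values)              # step 1: the mean, 3.8
deviations = [x - mean for x in values]       # step 2: deviation of each value
sum_of_squares = sum(d ** 2 for d in deviations)  # step 3: about 8.8
quotient = sum_of_squares / (len(values) - 1)     # step 4: divide by n - 1
sample_sd = sqrt(quotient)                    # step 5: square root

print(round(sample_sd, 3))  # 1.483

# When all sample values are the same, every deviation is 0.
constant_sd = stdev([5, 5, 5, 5])
```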
Assume an example where the population size is 100 people, and 80% of those people attended college. You are interested in determining the portion of the population that attended college. Interpret the margin of error in terms of 95% confidence.
The margin of error is 1/√100 = 0.1 = 10%. Therefore, you can say with 95% confidence that between 70% and 90% of the population attended college.
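The interval in this answer can be reproduced with a few lines. The sketch assumes the approximate 95% margin of error is 1 divided by the square root of the number of people surveyed, which matches the 10% figure in the answer; the function name is illustrative:

```python
from math import sqrt

def margin_of_error(n):
    """Approximate 95% margin of error for a proportion (assumed 1/sqrt(n) rule)."""
    return 1 / sqrt(n)

n = 100          # number of people, from the question
p_hat = 0.80     # proportion who attended college, from the question

moe = margin_of_error(n)                  # 0.1
low, high = p_hat - moe, p_hat + moe      # confidence interval endpoints

print(f"{low:.0%} to {high:.0%}")  # 70% to 90%
```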
In a data set of 10 exam scores, in which no two people got the same score, the mean was equal to the score of the person with the third highest grade.
The statement makes sense because the data could be skewed towards the higher scores, causing the mean to be equal to the third highest grade.
The highest exam score was in the upper quartile of the distribution.
The statement makes sense because the highest score will be in the highest quartile.
Is the result of the test statistically significant at the 0.01 level?
The test is not statistically significant at the 0.01 level, because the chance of a sample mean at least as extreme as the observed one is greater than 0.01, assuming the null hypothesis is true.
Is the result of the test statistically significant at the 0.05 level?
The test is statistically significant at the 0.05 level, because the chance of a sample mean at least as extreme as the observed one is less than 0.05, assuming the null hypothesis is true.
Consider the following set of three distributions, all of which are drawn to the same scale. Identify the two distributions that are normal. Of the two normal distributions, which one has the larger variation?
The two normal distributions are b and c where c has the larger standard deviation.
If you compared the distribution of weights of 20 elite female gymnasts to the distribution of weights of 20 randomly selected women, you would expect the variation of the weights to be greater for the gymnasts, greater for the randomly selected women, or the same for both groups?
The variation of the weights should be greater for the randomly selected women. Because female gymnasts tend to be petite and train so hard, their weights are all very similar. This is not the case for the female population as a whole, whose weights are all very different.
Briefly describe the use of the formula for margin of error. Give an example in which you interpret the margin of error in terms of 95% confidence.
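The margin of error for 95% confidence is approximately 1/√n, where n is the sample size. The 95% confidence interval is found by subtracting the margin of error from the sample proportion and adding the margin of error to it. You can be 95% confident that the true population proportion lies within this interval.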
State, with an explanation, whether you would expect the following data set to be normally distributed. The delay in departure of trains from a station (note that trains, buses, and airplanes cannot leave early)
This data set is not normally distributed. There is no reason to assume that the mean, median, and mode of this distribution coincide at a central peak in a symmetric, bell-shaped distribution. Because trains cannot leave early, the delays cannot fall below zero, and trains at this station may usually leave on time but occasionally have delays far above the average.
Exam results for 100 students are given below. For the given exam grades, briefly describe the shape and variation of the distribution. median=67, mean=74, low score=64, high score=81
right skewed - low
According to the video, about how many people paid between $15,000 and $16,000 for the Kia Soul?
13,600
A survey asks students to state how many sodas they drink per week. The results show a mean of 13 sodas per week and a median of 9 sodas per week. What can be concluded?
At least one student drinks more than 17 sodas per week. An outlier value that is higher than most of the other values would explain why the mean is higher than the median.
The histogram in the figure shows times between eruptions of a geyser. Draw a smooth curve that captures its important features. Then classify the distribution according to its number of peaks, symmetry or skewness, and variation.
B -Close examination of the given histogram shows that the distribution is double-peaked (or bimodal). The distribution has 2 peaks, is not symmetric and has moderate to large variation.
You want to find the median weight of the apples in a barrel. What do you need to do?
Lay the apples out in order of increasing weight and find the weight of the apple in the middle. The definition of median is the middle value in a sorted data set.
Suppose you are given the mean and just one data value from a distribution. What can you calculate?
The deviation for the single value can be calculated because it depends only on that value and the mean.
The histogram of a sample of the weights of 363 rugby players is shown to right. Draw a smooth curve that captures its important features. Then classify the distribution according to its number of peaks, symmetry or skewness, and variation. * One curve at 90 in the middle of the graph.
The distribution has 1 peak, is symmetric and has fairly high variation.
Which data set would you expect to have the highest standard deviation: heights (lengths) of newborn infants, heights of all elementary-school children, or heights of first-grade boys?
The heights of all elementary-school children, since it has the largest range.
According to the video, which statement is false?
The high value of 167 is an outlier.
What is a standard score? How do you find the standard score for a particular data value?
A standard score is the number of standard deviations a data value lies above or below the mean. *Choose the correct formula for computing a standard score below. -The standard score for a particular data value is given by z = (data value - mean) / standard deviation
What do we mean when we say that a distribution is symmetric? Give simple examples of a symmetric distribution, a left-skewed distribution, and a right-skewed distribution.
A single-peaked distribution is symmetric if its left half is a mirror image of its right half. *Give a simple example of a symmetric distribution. Choose the correct answer below. The heights of a sample of 100 women is a symmetric distribution. *Give a simple example of a left-skewed distribution. Choose the correct answer below. The speed of cars on a road where a visible patrol car is using radar to detect speeders is a left-skewed distribution. *Give a simple example of a right-skewed distribution. Choose the correct answer below. The number of books read during the school year by fifth graders is a right-skewed distribution.
What are the two possible conclusions of a hypothesis test? Explain, and also explain why accepting the null hypothesis is not a possible outcome.
Not rejecting the null hypothesis, in which case we lack sufficient evidence to support the alternative hypothesis. Rejecting the null hypothesis, in which case we have evidence in support of the alternative hypothesis.
A study of 85 students who took an SAT preparation course concluded that the mean improvement on the SAT was 30 points. If we assume that the preparation course has no effect, the probability of getting a mean improvement of 30 points by chance is 0.04, or 4 in 100. Discuss whether this preparation course results in statistically significant improvement. Use significance levels of 0.05 and 0.01.
Null hypothesis: The preparation course has no effect on the score Alternative hypothesis: The preparation course improves the score
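Whether the result counts as statistically significant depends on which level the 0.04 probability is compared against. A minimal sketch of that comparison, assuming a result is significant at a level when its probability is at or below that level (the function name `is_significant` is hypothetical):

```python
def is_significant(p_value, level):
    """True when the chance probability is at or below the significance level."""
    return p_value <= level

p_value = 0.04  # probability of a 30-point mean improvement by chance, from the question

for level in (0.05, 0.01):
    verdict = "significant" if is_significant(p_value, level) else "not significant"
    print(f"At the {level} level: {verdict}")
```

With a probability of 0.04, the improvement is significant at the 0.05 level but not at the stricter 0.01 level.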
The owner of a rental car company claims that the mean annual mileage for the population of all cars in his fleet is more than 11,659 miles (which is the mean annual mileage for all cars in the country). A random sample of n=271 cars from his fleet has a mean annual mileage of 11,834 miles. Assuming that the mean annual mileage for all cars in his fleet is 11,659 miles, the probability of selecting a random sample of this size with a mean annual mileage of 11,834 or more is 0.0808.
The mean annual mileage of all cars in the owner's fleet is 11,659 miles. *Formulate the alternative hypothesis. Choose the correct answer below. -The mean annual mileage of all cars in the owner's fleet is greater than 11,659 miles. *Discuss whether the sample provides evidence for rejecting or not rejecting the null hypothesis. Choose the correct answer below. -The test does not provide sufficient grounds for rejecting the null hypothesis.
Define and distinguish among mean, median, and mode.
The mean is the sum of all the values divided by the number of values. It can be strongly affected by outliers. Choose the correct description of the median below. The median is the middle value in a data set. It is not affected by outliers. Choose the correct description of the mode below. The mode is the most common value in a data set. It is not affected by outliers. Choose the correct answer below. An outlier in a data set is a value that is much higher or much lower than almost all other values. An outlier can change the mean of a data set, but does not affect the median or mode.
For the following distribution, decide whether you expect the mean, median, or mode to give the best representation of the center of the distribution, and explain why. Per capita earnings in New York City.
The median because it is unaffected by outliers.
What are the quartiles of a distribution? How do we find them?
The quartiles are values that divide the data distribution into quarters. *How do we find them? -The lower quartile is the median of the data values in the lower half of a data set. The middle quartile is the overall median. The upper quartile is the median of the data values in the upper half of the data set.
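The median-of-halves procedure can be sketched in a few lines and applied to the Big Bank values listed earlier. One assumption to note: when the data set has an odd number of values, this version leaves the overall median out of both halves, which is one common convention among several.

```python
from statistics import median

def five_number_summary(data):
    """Return (low, lower quartile, median, upper quartile, high)."""
    s = sorted(data)
    half = len(s) // 2
    lower = s[:half]
    # For an odd count, skip the middle value so it belongs to neither half.
    upper = s[half + 1:] if len(s) % 2 else s[half:]
    return s[0], median(lower), median(s), median(upper), s[-1]

big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
summary = five_number_summary(big_bank)

print(summary)  # (4.1, 5.6, 7.2, 8.5, 11.0)
```

These five values are the ones a boxplot draws: the box spans the lower to upper quartile, with a line at the median and whiskers out to the low and high values.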
The lowest score on an exam was 63, the median score was 73, and the high score was 93. What was the range?
The range is 30, because that is the difference between the lowest score and the highest score.
What is the 68-95-99.7 rule for normal distributions? Explain how it can be used to answer questions about frequencies of data values in a normal distribution.
The rule states that about 68%, 95%, and 99.7% of the data points in a normal distribution lie within 1, 2, and 3 standard deviations of the mean, respectively.
Two extremely tall people skewed the distribution of heights to the smaller values.
The statement does not make sense because the tall people are outliers at the higher values, and so will skew the data towards the higher values.
I made a distribution of 15 apartment rents in my neighborhood. One apartment had a much higher rent than all of the others, and this outlier caused the mean rent to be higher than the median rent.
The statement makes sense because an outlier with a large value increases the mean, but does not affect the median.
For the 30 students who took the test, the high score was 80, the median was 75, and the low score was 40.
The statement makes sense because it is possible that when sorting the 30 scores from low to high, the first value was 40, the highest value was 80, and 75 was halfway between the 15th and the 16th score.
Both exams had the same range, so they must have had the same median.
This does not make sense because the range is the difference between the highest and lowest data values. It has nothing to do with the median.