chapter 9-15 statistics
What proportion of observations under a smoothed normal curve are exactly over the point 1.5 (z = 1.5)? - 0 - 0.0668 - 0.9332
0 For a standard Normal (or any Normal distribution) the chance of the variable exactly equaling any value is 0. This is a consequence of smoothing histograms from data. We are interested in exactly 1.5 as the value of the variable.
Which of the following is a conservative choice for α? - 0.50 - 0.01 - 0.0 - 0.25
0.01 This value is a conservative value for α.
Normal body temperature for a healthy adult has a standard deviation of 0.7 ºF. We can be 95% confident that the average body temperature for 100 healthy adults will be at most _____ degrees away from the true mean body temperature. - 1.40 - 0.70 - 0.14 - 0.07
0.14
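A minimal sketch of the arithmetic behind this answer, assuming the 68-95-99.7 rule (about 95% of sample means fall within 2 standard deviations of the true mean) and using the values from the question:

```python
from math import sqrt

sigma = 0.7   # population standard deviation in degrees F (from the question)
n = 100       # sample size (from the question)

# The standard deviation of the sample mean is sigma / sqrt(n);
# the 95% margin is roughly 2 of those standard deviations.
margin = 2 * sigma / sqrt(n)
print(round(margin, 2))  # 0.14
```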
Which of the following is not needed to calculate the sample size for a desired margin of error in a confidence interval for μ when σ is known? - value of the population standard deviation (σ) - value of the population mean (μ) - size of desired margin of error - confidence level
value of the population mean (μ) If we knew the value of the population mean, we would not need to estimate it, and we would not need to determine the sample size.
A density curve can be obtained by smoothing a __________ of data. Please choose the correct answer from the following choices, and then select the submit answer button. - scatterplot - histogram - bar graph - line graph
histogram
The P-value is the probability of obtaining a statistic as extreme or more extreme than what was actually __________ if the null hypothesis were true.
observed The P-value is the probability of obtaining certain values of the statistic, given that the null hypothesis is true, not a probability on either hypothesis. It is NOT the probability of rejecting a true null hypothesis.
Usually the null hypothesis is a statement of _____.
"no effect" or "no difference"
Scientists discovered a new group of proteins in an animal species. They found that the distribution of the number of amino acids these proteins were made of was approximately Normal, with mean 530 and standard deviation 80. Approximately what percent of these new proteins will be between 370 and 690 amino acids long? - 95% - 68% - 99.7%
95%
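As a quick check, 370 and 690 are each 2 standard deviations from the mean of 530. The sketch below uses `statistics.NormalDist` from the Python standard library to compute the area between them:

```python
from statistics import NormalDist

# Number of amino acids: approximately Normal with mean 530 and sd 80 (from the question).
proteins = NormalDist(mu=530, sigma=80)

# Area between 370 and 690, i.e. within 2 standard deviations of the mean.
share = proteins.cdf(690) - proteins.cdf(370)
print(f"{share:.4f}")  # 0.9545, which the 68-95-99.7 rule rounds to 95%
```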
critical values
The values that lie exactly on the boundary of the region of rejection.
Decreasing the level of confidence will ___________ the sample size required to maintain the same margin of error.
decrease Decreasing the level of confidence will result in a decrease in z*, which will require a smaller sample size.
Suppose we are testing H0: µ = 100 versus Ha: µ < 100. Which of the following will have the smallest P-value? - x¯ = 100 - x¯ = 80 - x¯ = 90 - x¯ = 70
x¯ = 70 Since this test is one-sided with less than in the alternative, an x¯ of 70 is the farthest below µ = 100 and will have the smallest P-value.
Sample counts and sample proportions are common statistics when studying __________. - categorical data - quantitative data - both categorical and quantitative data - neither categorical nor quantitative data
categorical data Sample counts and proportions are useful for summarizing categorical data. For quantitative data, the sample mean is frequently used.
Use a _____ when your goal is to estimate a population parameter
confidence interval
Spontaneous genetic mutations are one cause of cancer. They can happen at any time for various reasons such as exposure to certain carcinogens. Researchers are working to understand more about what causes them. If X is how many mutations there are in a person's body at time Y, X is a __________ random variable.
discrete Because we can think of counting the mutations, this is a discrete random variable.
If P(A or B) = P(A) + P(B), then these events must be __________. - disjoint - unlikely - complementary - independent
disjoint This is the addition rule for disjoint events.
For random samples of size 100 from a population, the mean of the sample means from all possible samples is __________ to the population mean.
equal What a wonderful fact! The mean of all possible sample means equals the mean of the population.
Decreasing the desired margin of error will __________ the required sample size.
increase Decreasing the desired margin of error will require a larger sample size. The desired margin of error is in the denominator of the formula for the sample size. Making that smaller makes the result larger.
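A small sketch of the sample-size formula n = (z* σ / m)², rounded up. The function name `required_n` and the example values (σ = 0.7, margins of 0.10 and 0.05) are illustrative assumptions, not part of the question:

```python
from math import ceil
from statistics import NormalDist

def required_n(sigma, margin, confidence=0.95):
    """Smallest n whose margin of error z* * sigma / sqrt(n) is at most `margin`."""
    # Two-sided critical value z* for the given confidence level.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_star * sigma / margin) ** 2)

# Halving the desired margin of error roughly quadruples the required sample size.
print(required_n(0.7, 0.10))
print(required_n(0.7, 0.05))
```

Because the margin sits in the denominator and is squared, small reductions in the margin are expensive in sample size.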
An airline wants to know the average time it takes their passengers to claim their luggage. The time to claim luggage for this airline is known to be Normally distributed with mean μ and standard deviation σ = 5 minutes. The airline took a simple random sample of 10 passengers and calculated a 95% confidence interval to be 21.9 to 28.1 minutes. If the airline increased their confidence level from 95% to 99%, the margin of error would __________.
increase Increasing the confidence level while keeping the sample size fixed will result in an increased z* which will increase the margin of error and confidence interval width.
When we use a z-procedure for inference, we are assuming that the population distribution: - is bimodal. - is Normal. - is skewed. - has no particular shape.
is normal
Chance behavior does not have expected patterns therefore we say chance behavior is ________ in the short run. - predictable - known - not random - unpredictable
unpredictable But over the long run, chance behaviors have predictable patterns.
Suppose we collect a random sample of size n from a population with standard deviation σ and, from the data collected, compute a 95% confidence interval for the mean of the population. Which of the following would produce a new confidence interval with smaller width (smaller margin of error) based on these same data? - Increase σ. - Use a smaller sample size. - Use a lower confidence level.
use a lower confidence level
Suppose we are testing a claim that µ = 100 and that the standard deviation of the sampling distribution of x¯ equals 5. Which of the following sample means gives the least evidence against the claim? - x¯ = 106 - x¯ = 108 - x¯ = 97 - x¯ = 83
x¯ = 97 This is less than one standard deviation of x¯ from the claim and, as such, is the closest to the claim. It is the option that provides the least evidence against the claim that the mean is 100.
The test statistic z measures .....
how far the observed sample mean x¯ deviates from the hypothesized population value μ0, in relative terms.
The distribution of bladder volumes in men is approximately Normal with mean 550 milliliters (ml) and standard deviation 100 ml. In women, bladder volumes are approximately Normal with mean 400 ml and standard deviation 75 ml. Which of the two distributions is the widest? Should the widest distribution also be the tallest or the shortest? - The two distributions are identical, because both are approximately Normal. - The women's bladder volume distribution is the narrowest and the shortest. - The women's bladder volume distribution is the widest and the shortest. - The men's bladder volume distribution is the widest and the shortest.
The men's bladder volume distribution is the widest and the shortest. The men's distribution has a larger standard deviation, which controls the spread. The area under each Normal curve is equal to 1, so a larger spread results in a shorter curve.
Researchers are testing a new type of surgical technique that they hope will decrease the rate of post-surgical medical complications among patients. Suppose that (unknown to the researchers) 2% of all patients will develop post-surgical medical complications when the new technique is used. The researchers apply the new technique to a random sample of 50 surgical patients in order to test its effectiveness. What is the standard deviation of the sample proportion of patients who develop complications, p̂ ? Enter your answer to four decimal places.
0.0198 Calculate the standard deviation using the formula √((p(1-p))/n). We are told that the population proportion, p, is 0.02, and the sample size, n, is 50. So the standard deviation is √((.02(1-.02))/50) = √(0.0196/50) = 0.0198.
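The same calculation as a short sketch, using p = 0.02 and n = 50 from the question:

```python
from math import sqrt

p = 0.02   # population proportion of complications (from the question)
n = 50     # sample size (from the question)

# Standard deviation of the sample proportion p-hat.
sd_p_hat = sqrt(p * (1 - p) / n)
print(round(sd_p_hat, 4))  # 0.0198
```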
Mensa is the "high IQ" society. Their rules for eligibility for membership state that an individual must have an IQ in the upper 2%. The Wechsler Adult Intelligence Scale is approximately Normal with mean 100 and standard deviation 15. On this scale, what IQ will qualify a person for membership in Mensa? (IQ's do not have decimal places.) The upper 2% (which means 98% is below) relates to a z-score of _____which translates to a qualifying membership score of 131. (Round to two decimal places.)
2.05 Using the Normal table, locate 0.98 as closely as possible in the body of the table (2.05 has area 0.9798 to the left of it). Now, use X = zσ + μ to solve for X = 130.8, or 131.
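For readers using software instead of the table, the following sketch finds the same z-score with `statistics.NormalDist.inv_cdf`. The unrounded z differs slightly from the table value, but the qualifying score still rounds to 131:

```python
from statistics import NormalDist

# z-score with 98% of the standard Normal distribution below it.
z = NormalDist().inv_cdf(0.98)

# Convert back to the IQ scale with X = z * sigma + mu.
cutoff = z * 15 + 100

print(round(z, 2))    # 2.05
print(round(cutoff))  # 131
```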
Below are four confidence intervals for the mean head circumference for adult males, all based on a sample size of 25. Which interval has the highest level of confidence? - 22.8 ± 0.66 inches - 22.8 ± 0.44 inches - 22.8 ± 0.88 inches - 22.8 ± 0.22 inches
22.8 ± 0.88 inches
A consumer advocate is interested in evaluating the claim that a new granola cereal contains 4 ounces of cashews in every bag. The advocate recognizes that amounts of cashews will vary slightly from bag to bag, but she suspects that the mean amount of cashews per bag is less than 4 ounces. To check the claim, the advocate purchases a random sample of 40 bags of cereal and calculates a sample mean of 3.79 ounces of cashews. She finds the P-value to be 0.12 for testing H0: µ = 4 versus Ha: µ < 4. The P-value is the probability of observing a sample mean less than or equal to _____ ounces, assuming that the true mean amount of cashews per box is _____ ounces. - 3.79; 3.79 - 3.79; 4 - 4; 3.79 - 4; 4
3.79; 4 This interpretation is correct because it has both of the following: (1) the probability is about getting the sample statistic (sample mean of 3.79 or less) and (2) the probability is based on the null hypothesis being true (true mean is 4 ounces).
The "bell" of a Normal curve is approximately _______ standard deviations wide. - 8 - 2 - 6 - 4
6
Three Normal distributions all have mean 20. Distribution A has standard deviation 1, distribution B has standard deviation 5, and distribution C has standard deviation 10. The distribution with the sharpest peak is distribution ___.
A Because this distribution has the smallest standard deviation, it means that more observations will be clustered near the mean.
We always express __________ in terms of population parameters.
hypotheses Hypotheses are claims about population parameters, not sample outcomes.
A statistical hypothesis test is analogous to a criminal trial in the American justice system. There, the null hypothesis is that the defendant is innocent. In this case, convicting an innocent person is a Type _____ error.
I A Type I error is when the null hypothesis is wrongly rejected. Here, the null hypothesis is that the person is innocent.
One measure of the physical abilities of football players is the time in which they run a 40 yard dash. In one league, the mean time is 4.4 seconds with standard deviation 0.15 seconds. The fastest anyone has ever run the 40 yard dash in this league is 4.2 seconds. Could the variable "Time to run the 40 yard dash" be Normally distributed in this league? Also, why or why not? - Yes, because it only takes positive values. - No, because it is continuous. - No, because it doesn't obey the 68-95-99.7 rule. - Yes, because it is continuous.
No, because it doesn't obey the 68-95-99.7 rule. If it were Normally distributed, the variable would have to obey the 68-95-99.7 Rule. The fastest time ever is only 1.33 standard deviations below the mean time of 4.4 seconds.
If the mean µ is increased to 5 what will happen to the curve? - The curve will not change. - The peak of the curve will get higher. - The peak of the curve will get lower. - The curve will move to the right along the horizontal axis.
The curve will move to the right along the horizontal axis. Increasing the mean will cause the curve to move to the right along the horizontal axis, so that the peak of the curve will be over the value 5.
A very large school district in Connecticut wanted to estimate the average IQ of this year's graduating class. The district took a simple random sample of 100 seniors and calculated the 95% confidence interval for µ as 95 to 105 points. Consider this interpretation of the confidence interval: For all graduating seniors, the mean IQ is between 95 and 105 points with 95% confidence. What needs to be done to fix this statement? - The true parameter needs to be stated correctly and in the correct setting. - The actual confidence interval is not reported. - The value of the estimate is not given. - Nothing needs to be fixed; the interpretation is correct.
The true parameter needs to be stated correctly and in the correct setting. Because the sample came from only one school district, we can only address the average IQ in this district, and not all graduating seniors as the interpretation implies. A better interpretation is: With 95% confidence, the mean IQ of all graduating students in this Connecticut school district is between 95 and 105 points.
A test of significance uses evidence provided by sample data to assess whether a claim about __________ is supported or refuted. - a confidence interval - a confidence level - the value of a sample statistic - a parameter
a parameter This is what a test of significance does in a nutshell.
A _____ is designed to assess the strength of the evidence against the null hypothesis.
hypothesis test
In which of these cases would the confidence interval become wider? - If the sample size decreased - None of the above - If the confidence level decreased - Both of the above
if the sample size decreased
To determine whether a test is one-sided or two-sided, we look at the ___________ hypothesis.
alternative This is the hypothesis that gives the direction against the claim in the null hypothesis.
A certain manufacturer of paints uses an additive to get the drying time for a specific paint to be 75 minutes. If there's too much additive, the drying time could be longer than specified but too little additive will decrease the drying time. In testing the amount of additive, they use these hypotheses: H0: μ = 7 vs Ha: μ ≠ 7 ml. An implication of having a ______ α would be concluding that the mean amount of additive is different from 7 ml more often.
larger When the α level of a test increases, it is easier to reject the null hypothesis.
We take _____ of size n, and rely on the known properties of the sampling distribution
one random sample
margin of error
represents the accuracy of our guess for the parameter
confidence level
the long-run proportion of intervals, computed by the same method from repeated samples, that capture the population parameter
A __________ distribution of the sample mean tells us the values of the sample mean from all possible samples.
sampling This is exactly what a sampling distribution of the sample mean tells us.
The __________ distribution of a statistic is the distribution of values taken by a statistic in all possible samples of the same size from the same population.
sampling This is the definition of a sampling distribution.
The distribution of the averages calculated for repeated samples of size 10 of two-year-old male children is called a: - population distribution. - parametric distribution. - statistical distribution. - sampling distribution.
sampling distribution
α is called the __________ level.
significance The name for α is level of significance for the test.
The _____ is an arbitrary threshold that can be used when reporting whether a P-value is statistically significant.
significance level (α)
Results that are unlikely to happen due to chance if a claim were true are called __________.
significant Significance does not mean important in statistics. Rather it means that the difference between the observed statistic and the claimed parameter value is too BIG to be due to chance.
If P-value < α, the results are statistically __________.
significant We declare the results to be significant whenever P-value is less than α.
Using technology or other random procedures to imitate chance behavior is called __________.
simulation This is what we use to imitate taking many, many random samples to investigate the behavior of the sample mean.
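A minimal simulation sketch of this idea. The population here is an assumption chosen for illustration: a right-skewed exponential distribution with mean 110 (and therefore standard deviation 110). The sample size of 9 and the number of repetitions are likewise illustrative:

```python
import random
from statistics import fmean, pstdev

rng = random.Random(0)   # fixed seed so the run is repeatable
POP_MEAN = 110           # assumed exponential population: mean = sd = 110
n = 9                    # size of each random sample
num_samples = 20_000     # number of simulated samples

# Draw many random samples and record the mean of each one.
sample_means = [
    fmean(rng.expovariate(1 / POP_MEAN) for _ in range(n))
    for _ in range(num_samples)
]

# The average of the sample means should land near the population mean (110),
# and their spread should land near sigma / sqrt(n) = 110 / 3.
print(round(fmean(sample_means), 1))
print(round(pstdev(sample_means), 1))
```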
if the critical value increases
the confidence level increases
null hypothesis
the hypothesis that there is no significant difference between specified populations, any observed difference being due to sampling or experimental error. It is the specific claim about a population of interest that is tested by a hypothesis test.
how to correctly interpret the p-value?
A correct interpretation has two parts: (1) the probability is about getting the statistic, and (2) the assumption is that the null hypothesis is true. Example: The P-value is the probability of observing a sample mean less than or equal to 32.0 olives, assuming that the true mean number of olives per can is 33 olives.
The P-value is .....
the probability of obtaining certain values of the statistic, given that the null hypothesis is true, not a probability on either hypothesis
In statistics, the odds of an outcome is __________. - the probability of the outcome - the probability that the outcome does not occur - the probability that the outcome does not occur, divided by the probability that the outcome does occur - the probability that the outcome occurs, divided by the probability that the outcome does not occur
the probability that the outcome occurs, divided by the probability that the outcome does not occur
Let X be a random variable whose distribution is Normal with a mean of 100 and a standard deviation of 10. Which of the following is equivalent to the proportion of observations above 115? - the proportion less than or equal to 115 - the proportion less than 115 - the proportion less than 85 - the proportion between 85 and 115
the proportion less than 85 The proportion more than 15 units below the mean is the same as the proportion more than 15 units above the mean, because of symmetry in Normal distributions.
Two or more events are disjoint if they have _____ outcomes in common.
zero Events that are disjoint cannot happen together.
Select the statements that describe a normal distribution. - The normal distribution is a continuous distribution. - The density curve is a flat line extending from the minimum value to the maximum value. - The normal distribution is a discrete distribution. - Two parameters define a normal distribution—the median and the range. - The density curve is symmetric and bell‑shaped. - Approximately 32% of values fall more than one standard deviation from the mean.
- The normal distribution is a continuous distribution - The density curve is symmetric and bell-shaped - Approximately 32% of values fall more than one standard deviation from the mean
Suppose a radar device tracks the speeds of cars traveling through a city intersection. After recording the speeds of over 10,000 different cars, local police determine that the speeds of cars through this intersection, in kilometers per hour, follow a normal distribution with a mean μ= 45 and standard deviation σ= 5. The area under the normal curve between 40 and 50 is equal to 0.68. Select all of the correct interpretations regarding the area under the normal curve. - In any sample of cars from this city intersection, 68% travel between 40 and 50 km/h. - The probability that a randomly selected car is traveling between 40 and 50 km/h is equal to 0.68. - The proportion of cars traveling faster than 40 km/h is equal to 0.68. - In the long run, 68% of cars passing through this city intersection travel either 55 or 45 km/h. - The long-run proportion of all cars traveling between 40 and 50 km/h is equal to 0.68.
- The probability that a randomly selected car is traveling between 40 and 50 km/h is equal to 0.68. - The long-run proportion of all cars traveling between 40 and 50 km/h is equal to 0.68.
A level C confidence interval for a parameter has two parts:
- a confidence interval (estimate ± margin of error), where the estimate is a sample statistic and the margin of error represents the accuracy of our guess for the parameter - A confidence level C (ex. 95%), which gives the probability that the interval will capture the true parameter value in repeated samples. That is, the confidence level is the success rate for the method
Select the statements that would lead to a smaller margin of error, assuming the other factors remain the same
- the population standard deviation turns out to be lower than expected - the researcher lowers the confidence level - the researcher increases the sample size
The amount of caffeine consumed per day by children aged eight to twelve years old has a right skewed distribution with mean μ = 110 mg and standard deviation σ = 30 mg. The standard deviation of the sampling distribution of x¯ for a random sample size of 9 is _____ mg.
10 The standard deviation of the sampling distribution of x¯ equals σ/√n, where σ = 30 is the standard deviation of the population and n = 9. So σ/√n = 30/3 = 10.
For a Normal distribution, the chance of the variable z equaling exactly 1.5 is _____.
0 For a standard Normal (or any Normal distribution) the chance of the variable exactly equaling any value is 0.
The Normal approximation for the sampling distribution of p̂ is least accurate when p is close to _____. - 0 - 1 - 0 or 1 - The value of p does not change the accuracy of the Normal approximation.
0 or 1 The approximation is least accurate when p is close to 0 or 1.
The distribution of comprehensive scores of all students taking an IQ test in 2010 is approximately Normal with mean μ = 100 and standard deviation σ = 15. The probability of getting a sample mean from a random sample size of 16 that is greater than 112 is _____. Use the standard Normal table or software. Give your answer to four decimal places.
0.0007 To find the probability, we first compute a z-score and look it up on the standard Normal table. z = (x¯ − μ)/(σ/√n) = (112 − 100)/(15/√16) = 3.2. Look up 3.2 to get 0.9993. Since we want the "greater than" probability, subtract from 1 to get 1 - 0.9993 = 0.0007.
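A short software check of this answer, modeling the sample mean as Normal with mean μ and standard deviation σ/√n (values from the question):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 100, 15, 16

# Sampling distribution of the sample mean: Normal(mu, sigma / sqrt(n)).
x_bar_dist = NormalDist(mu, sigma / sqrt(n))

# "Greater than" probability is the area to the right of 112.
p_greater = 1 - x_bar_dist.cdf(112)
print(round(p_greater, 4))  # 0.0007
```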
Four ounce bags of cashews packaged in a cashew plant have a standard deviation of 0.03 ounce and have a Normal distribution. Suppose a random sample of 9 bags is to be taken from production and the mean weight of the sample computed. Consider all possible values that this sample mean could be. Approximately 95% of these possible sample means will be between µ - m and µ + m. What is the value of m? - 0.01 ounces - 0.02 ounces - 0.06 ounces - 0.03 ounces
0.02 ounces 95% of all possible sample means will be within two standard deviations of the mean. So, m = 2 × σ/√n = 2 × 0.03/√9 = 0.02.
According to the 68-95-99.7 Rule, approximately 99.7% of the area in any Normal distribution is within 3 standard deviations of the mean μ. The actual number z of standard deviations for 99.7% of all observations within z σ of μ is 2.97. This is different from the Empirical Rule by _____ standard deviations.
0.03 If 99.7% is in the center of the distribution, there is 0.0015 (0.15%) in either tail. Read the Standard Normal table to find that z = 2.97. This is a difference of 3 - 2.97 = 0.03 from the Empirical Rule.
In a certain marathon, the average time to complete the race was Normally distributed with mean μ = 4.15 hours and standard deviation σ = 0.84 hours. If we wanted to give medals to the 4% of the runners with the best times, what time in hours would be the slowest time to get a medal? Note: The "best" times are the smallest times, we are looking for the cut-off score for the bottom 4% of the distribution of running times. Therefore, you start by locating _____ in the z-score table to find the accompanying z-score. This leads to the "best" times of 2.68 hours. Round your answer to two decimal places.
0.04 Locate 0.0400 as closely as possible in the body of the table (-1.75 has area 0.0401 to the left of it). Read to the left for -1.7 and up for .05. Now, use X = zσ + μ to solve for X = 2.68 hours. When you substitute -1.75 into the equation you have -1.75*0.84 + 4.15 = 2.68.
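The table lookup can be reproduced in software. This sketch uses `inv_cdf` to find the z-score with 4% of the area below it, then converts it to hours:

```python
from statistics import NormalDist

mu, sigma = 4.15, 0.84   # marathon times in hours (from the question)

# z-score with 4% of the standard Normal distribution to its left.
z = NormalDist().inv_cdf(0.04)

# Slowest medal-winning time: X = z * sigma + mu.
cutoff = z * sigma + mu

print(round(z, 2))       # -1.75
print(round(cutoff, 2))  # 2.68
```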
Suppose that (unknown to scientists) 35% of all Western White Pine trees in a national forest have white pine blister rust (a fungus disease). A researcher samples 100 trees in order to estimate the proportion of trees with the disease. What is the standard deviation of the sample proportion, p̂ ? - 0.048 - 0.182 - 0.002 - 0.228
0.048 Calculate the standard deviation using the formula √((p(1-p))/n). We are told that the population proportion, p, is 0.35, and the sample size, n, is 100. So the standard deviation is √((.35(1-.35))/100)= √((.2275)/100)= √0.002275= 0.0477 which rounds to 0.048.
The amount of caffeine consumed per day by children aged eight to twelve years old has a right skewed distribution with mean μ = 110 mg and standard deviation σ = 30 mg. The mean of the sampling distribution of x¯ for a random sample size of 9 is _____ mg.
110 The mean of the sampling distribution of x¯ equals the mean of the population.
Weights of adult female California sea lions are approximately Normal with mean μ = 220 pounds and standard deviation σ = 8 pounds. The mean of the sampling distribution of x¯ created from random samples of size 36 is _____.
220.0 The mean of the sampling distribution of x¯ equals μ, the mean of the population.
Researchers are testing a new type of surgical technique that they hope will decrease the rate of post-surgical medical complications among patients. Suppose that (unknown to the researchers) 2% of all patients will develop post-surgical medical complications when the new technique is used. The researchers apply the new technique to a random sample of 20 surgical patients in order to test its effectiveness. What is the probability that more than 5% of patients in the sample will develop post-surgical medical complications? - 0.8315 - 0.0313 - 0.9687 - 0.1685
0.1685 We are told the sample size is n = 20 and the population proportion is p = 0.02. The sample proportion p̂ has mean 0.02 (equal to p) and standard deviation √((p(1-p))/n) = √((0.02*0.98)/20) = 0.0313. We want the probability p̂ is 0.05 or greater. Standardize p̂ by subtracting the mean and dividing by the standard deviation. This gives the z-score z = (0.05-0.02)/0.0313 = 0.96. Using table B in the back of the book we see the area to the left of z = 0.96 is 0.8315. We need the area to the right, so subtract this value from 1: 1-0.8315 = 0.1685. If the researchers take repeated samples of size 20, more than 5% of the patients will develop complications about 16.85% of the time.
Soda bottles are filled with volumes that are normally distributed with mean volume 12 ounces and a standard deviation of 0.15 ounce. Acceptable limits of volume are between 11.80 and 12.20 ounces. What proportion of bottles are filled to an unacceptable limit, given to 4 decimal places?
0.1836 Because of the symmetry of the Normal distribution we can find the area to the left of 11.8 ounces in this distribution and double it. The z-score for 11.8 is z = (11.8 - 12)/.15 = -1.33. From the table, the area to the left of -1.33 is 0.0918. 2(0.0918) = 0.1836. Using technology (not rounding the z-score), we get a slightly different value of 0.1824.
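The "using technology" value mentioned above can be reproduced with a short sketch that skips the rounding of the z-score:

```python
from statistics import NormalDist

# Fill volumes: Normal with mean 12 oz and sd 0.15 oz (from the question).
bottles = NormalDist(mu=12, sigma=0.15)

# Proportion inside the acceptable limits, then its complement.
acceptable = bottles.cdf(12.20) - bottles.cdf(11.80)
unacceptable = 1 - acceptable
print(round(unacceptable, 4))  # 0.1824
```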
Normal body temperature for a healthy adult has a standard deviation of 0.7 ºF. We can be 95% confident that the average body temperature for 49 healthy adults will be at most _____ degrees away from the true mean body temperature. - 1.4 - 0.2 - 0.7 - 0.1
0.2
Mensa is a society for "geniuses." One way to qualify for membership is having an IQ at least 2.5 standard deviations above average. IQs are approximately Normally distributed, according to the WAIS scale. The area to the left of a qualifying Mensa score is 0.9938, what percent of people should qualify?
0.62% To qualify, a person's z-score must be at least 2.5. Find the area above z = 2.5 by subtracting the table entry (0.9938) from 1. This gives 1-0.9938 = 0.0062. Now multiply by 100 to convert to a percent: 100*0.0062 = 0.62%
Use the standard normal table (use the table reader, the table in your textbook, or a graphing calculator) to find the area between the following two z-scores: -1.99 and 0.50.
0.6682 The area to the left of 0.50 is 0.6915. The area to the left of -1.99 is 0.0233. For the area between the two, subtract 0.6915 - 0.0233 = 0.6682.
The length of human pregnancies from conception to birth is known to be normally distributed with a mean of 266 days and standard deviation of 16 days. The proportion of pregnancies that last between 250 and 274 days is 0.5328 because _____ - 0.1587 = 0.5328.
0.6915 The z-score for 274 is (274 - 266)/16 = 8/16 = 0.50. The z-score for 250 is (250 - 266)/16 = -1.00. The area to the left of 0.50 is 0.6915. The area to the left of -1 is 0.1587. Subtract to find 0.6915 - 0.1587 = 0.5328.
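A brief sketch confirming the subtraction of the two left-tail areas, using the mean and standard deviation from the question:

```python
from statistics import NormalDist

# Pregnancy length in days: Normal with mean 266 and sd 16 (from the question).
pregnancy = NormalDist(mu=266, sigma=16)

left_of_274 = pregnancy.cdf(274)   # area to the left of z = 0.50
left_of_250 = pregnancy.cdf(250)   # area to the left of z = -1.00

print(round(left_of_274, 4))                # 0.6915
print(round(left_of_250, 4))                # 0.1587
print(round(left_of_274 - left_of_250, 4))  # 0.5328
```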
Maggie believes that the probability of her research being published on the front page of this month's International Research Journal of Applied Life Sciences is approximately 45%. Robert believes that his chances of being published under the same circumstances are 35%. Sophia thinks that she has a 20% chance. The personal probability that either Robert or Maggie gets published on the front page is _____.
0.80
Weights of adult female California sea lions are approximately Normal with mean μ = 220 and standard deviation σ = 8. The probability of getting a sample mean from a random sample size of 9 adult female California sea lions that is less than 223 pounds is ___________. Use the standard Normal table or software. Give your answer to four decimal places.
0.8686 To find the probability, we first compute a z-score and look it up on the standard Normal table. We use z = (x¯ − μ)/(σ/√n) = (223 − 220)/(8/√9) = 1.125 to find the z-score for a sample mean. Look up 1.12 (the table entry just below 1.125) to get 0.8686. Since we want the "less than" probability, this is the answer.
Suppose that (unknown to scientists) 20% of all Western White Pine trees in a national forest have white pine blister rust (a fungus disease). A researcher samples 100 trees in order to estimate the proportion of trees with the disease. What is the probability that at least 15% of the trees in the sample have white pine blister rust? - 0.8944 - 0.1056 - 0.0016 - 0.9984
0.8944 We are told the sample size is n = 100 and the population proportion is p = 0.20. The sample proportion p̂ has mean 0.20 (equal to p) and standard deviation √((p(1-p))/n) = √((0.2*0.8)/100) = 0.04. We want the probability p̂ is 0.15 or greater. Standardize p̂ by subtracting the mean and dividing by the standard deviation. This gives the z-score z = (0.15-0.20)/0.04 = -1.25. Using table B in the back of the book we see the area to the left of z = -1.25 is 0.1056. We need the area to the right, so subtract this value from 1: 1-0.1056 = 0.8944. If the researcher takes repeated samples of size 100 from the forest, at least 15% of the trees sampled will have white pine blister rust about 89.44% of the time.
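The Normal approximation for p̂ used in this answer can be sketched as follows, with p = 0.20 and n = 100 taken from the question:

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.20, 100

# Normal approximation to the sampling distribution of p-hat.
sd_p_hat = sqrt(p * (1 - p) / n)   # about 0.04
p_hat_dist = NormalDist(mu=p, sigma=sd_p_hat)

# Probability that the sample proportion is 0.15 or greater.
p_at_least = 1 - p_hat_dist.cdf(0.15)
print(round(p_at_least, 4))  # 0.8944
```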
The amount of caffeine consumed per day by children aged eight to twelve years old has a right skewed distribution with mean μ = 110 mg and standard deviation σ = 30 mg. The probability that the mean caffeine consumption of a random sample of 36 eight to twelve year olds is greater than 97 mg is _____. (Use the standard Normal table or software that provides Normal probabilities.) Give your answer to four decimal places.
0.9953 The sample size of 36 is large enough to apply the central limit theorem. To solve, we first compute z = (x¯ − μ)/(σ/√n) = (97 − 110)/(30/√36) = −2.60. Looking up the z-score on the standard Normal table, we get 0.0047. Since we want "greater than", the probability is 1 - 0.0047 = 0.9953.
Hemoglobin is the compound in red blood cells that carries oxygen to the body. The distribution of hemoglobin in women in g/dl of blood is approximately normally distributed with mean 14 and standard deviation 1. Too little hemoglobin (below 12), and you're anemic. Too much (above 15), and (unless you live at high altitudes), you can have other problems. Using the 68-95-99.7% rule, 2.5% of women will have hemoglobin levels below 12 (be anemic) because __________. - 12 is one standard deviation below the mean - 12 is three standard deviations below the mean - 12 is the mean - 12 is two standard deviations below the mean
12 is two standard deviations below the mean 12 is 2 standard deviations below the mean. Normal distributions have 95% of their area within two standard deviations of the mean, so there is 5% total below 12 and above 16. Because of symmetry, there will be 2.5% below 12.
A researcher is estimating the mean resting heart rate of a certain type of rabbit. She samples 58 rabbits and calculates the mean heart rate in the sample is x¯ = 230 beats per minute. Suppose the resting heart rates of all rabbits in this population are Normally distributed with standard deviation σ = 50 beats per minute. Using the exact z* value from Table C, a 95% confidence interval for the mean is 217.14 to 242.86 beats per minute. The margin of error is _____.
12.86 The formula for a confidence interval is x¯ ± z* σ/√n. We are told x¯ = 230, σ = 50, and n = 58. For a 95% confidence interval z* = 1.96. This gives 230 ± 1.96 × 50/√58, which is 230 ± 12.86. Subtracting gives 230 - 12.86 = 217.14; adding gives 230 + 12.86 = 242.86. But the margin of error is just 12.86.
Scientists discovered a new group of proteins in an animal species. They found that the distribution of the number of amino acids these proteins were made of was approximately Normal, with mean 530 and standard deviation 80. Approximately what percent of these new proteins will be more than 690 amino acids long? - 16% - 5% - Do not have enough information - 2.5%
2.5%
A researcher would like to estimate the mean number of hours students at a large university sleep per night. She takes a simple random sample of 113 students and asks them to report the number of hours they slept the previous night. The mean of the sample is x¯ = 7.2 hours. Assume that the number of hours of sleep for all students in the population has an exactly Normal distribution with standard deviation 1.4 hours. The 95 part of the 68-95-99.7 rule says that x¯ is within _________ hours of the mean µ in 95% of all samples.
2.8 The 95 part of the 68-95-99.7 rule says that x¯ is within 2 standard deviations of the mean µ in 95% of all samples.
For any Normal distribution, the two z-scores that divide the area into the middle 99.7% are z = ±_____. (Round to two decimal places.) - 1.96 because there is 2.5% in each tail - 2.05 because there is 2% in each tail - 2.75 because there is 0.3% in each tail - 2.97 because there is 0.15% in each tail
2.97 because there is 0.15% in each tail If 99.7% is in the middle of the distribution, there is 0.0015 (0.15%) in either tail. Read the Standard Normal table to find that z = 2.97.
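Reading the table "backwards" like this corresponds to an inverse-CDF call in software. A small sketch, assuming Python's standard-library `NormalDist`; the helper name `central_z` is hypothetical.

```python
from statistics import NormalDist


def central_z(middle_area):
    """Return z such that the area between -z and +z equals middle_area."""
    tail = (1 - middle_area) / 2           # area left in each tail
    return NormalDist().inv_cdf(1 - tail)  # z with that much area above it


print(round(central_z(0.997), 2))  # 2.97
print(round(central_z(0.95), 2))   # 1.96
```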
A researcher is estimating the mean resting heart rate of a certain type of rabbit. She samples 12 rabbits and calculates the mean heart rate in the sample is x¯ = 230 beats per minute. Suppose the resting heart rates of all rabbits in this population are Normally distributed with standard deviation σ = 50 beats per minute. Using the exact z* value from Table C, a 90% confidence interval for the mean is: - 218 to 242 beats per minute. - 206.3 to 253.7 beats per minute. - 211.7 to 248.3 beats per minute. - 118.6 to 341.4 beats per minute.
206.3 to 253.7 beats per minute. The formula for a confidence interval is x¯ ± z* σ/√n. We are told x¯ = 230, σ = 50, and n = 12. For a 90% confidence interval z* = 1.645. This gives 230 ± 1.645 × 50/√12, which is 230 ± 23.7. Subtracting gives 230 - 23.7 = 206.3; adding gives 230 + 23.7 = 253.7.
The distribution of comprehensive scores of all students taking the ACT exam in 2010 is approximately Normal with mean μ = 21 and standard deviation σ = 5. The z-score for the probability that a randomly selected student scored less than 25 is (25 - ____)/5. Fill in the missing value.
21 To find the probability, we first compute a z-score and look it up on the standard Normal table. We use z = (x − μ)/σ = (25 − 21)/5 = 0.8 to find the z-score for an individual. Look up 0.8 to get 0.7881. Since we want the "less than" probability, this is the answer.
A researcher is estimating the mean resting heart rate of a certain type of rabbit. She samples 58 rabbits and calculates the mean heart rate in the sample is x¯ = 230 beats per minute. Suppose the resting heart rates of all rabbits in this population are Normally distributed with standard deviation σ = 50 beats per minute. Using the exact z* value from Table C, a 95% confidence interval for the mean is - 223.4 to 236.6 beats per minute. - 172 to 288 beats per minute. - 217.1 to 242.9 beats per minute. - 130 to 230 beats per minute.
217.1 to 242.9 beats per minute. The formula for a confidence interval is x¯ ± z* σ/√n. We are told x¯ = 230, σ = 50, and n = 58. For a 95% confidence interval z* = 1.96. This gives 230 ± 1.96 × 50/√58, which is 230 ± 12.86. Subtracting gives 230 - 12.86 = 217.14; adding gives 230 + 12.86 = 242.86.
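The rabbit heart-rate intervals can be reproduced with a short function. This is a sketch only: `z_interval` is a made-up name, and z* comes from `NormalDist().inv_cdf` rather than Table C, so the last digit may differ from the hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for a mean when sigma is known.

    Returns (lower, upper, margin_of_error).
    """
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin, margin


# 58 rabbits, 95% confidence
low, high, margin = z_interval(230, 50, 58, 0.95)
print(round(low, 1), round(high, 1))  # 217.1 242.9

# 12 rabbits, 90% confidence
low, high, margin = z_interval(230, 50, 12, 0.90)
print(round(low, 1), round(high, 1))  # 206.3 253.7
```

The third return value is the margin of error on its own, which is what the "The margin of error is _____" card asks for.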
High blood cholesterol increases your risk of heart attack and stroke. Cholesterol levels in young women are approximately Normal with mean 189 mg/dl and standard deviation 40 mg/dl. About 34% of women will have levels between the mean, 189, and _______________. - 229 - 269 - 309 - 349
229 In a Normal distribution, about 68% of all observations are within one standard deviation of the mean. 229 is one standard deviation above the mean and 189 is the mean. Half of the 68% will be between 189 and 229.
Suppose you roll a six-sided die 1000 times. The average face value of the die will most likely be close to _____.
3.5 You can have face values of 1, 2, 3, 4, 5 or 6. The population mean μ value is 21/6=3.5. The Law of Large Numbers dictates that a large sample size will yield a sample mean that is close to the population mean.
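The Law of Large Numbers can be watched in a quick simulation. This sketch uses Python's `random` module with a fixed seed (chosen arbitrarily, only so that reruns match); `mean_of_rolls` is a hypothetical helper name.

```python
import random


def mean_of_rolls(n_rolls, seed=0):
    """Average face value over n_rolls simulated rolls of a fair die."""
    rng = random.Random(seed)  # fixed seed only so the run is repeatable
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
    return sum(rolls) / n_rolls


# The sample mean tends to settle near 3.5 as the number of rolls grows.
for n in (10, 1000, 100000):
    print(n, mean_of_rolls(n))
```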
The amount of caffeine consumed per day by children aged eight to twelve years old has a right skewed distribution with mean μ = 110 mg and standard deviation σ = 30 mg. The probability statement that the mean caffeine consumption of a random sample of 36 eight to twelve year olds is greater than 118 mg is 1 - P(Z < (118 - 110)/(_____/6)). Fill in the missing value.
30 The sample size of 36 is large enough to apply the central limit theorem. To solve, we first compute z = (x¯ − μ)/(σ/√n) = (118 − 110)/(30/√36) = 1.60. Looking up the z-score on the standard Normal table, we get 0.9452. Since we want "greater than", the probability is 1 - 0.9452 = 0.0548.
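Both caffeine cards (greater than 97 mg and greater than 118 mg) follow the same pattern, sketched below. The function name `prob_sample_mean_greater` is invented for illustration, and the code assumes the central limit theorem applies, as the cards state.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_greater(x_bar, mu, sigma, n):
    """P(sample mean > x_bar) using the Normal approximation from the CLT."""
    z = (x_bar - mu) / (sigma / sqrt(n))  # z-score for a sample mean
    return 1 - NormalDist().cdf(z)        # "greater than" is the right tail


print(round(prob_sample_mean_greater(118, 110, 30, 36), 4))  # 0.0548
print(round(prob_sample_mean_greater(97, 110, 30, 36), 4))   # 0.9953
```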
The label on a can of a particular brand of extra large olives states that there are about 33 olives in each can. A gourmet cook feels that the claim of 33 olives per can is too high, and that the average number of olives per can is less than 33. He samples 35 cans and finds x¯ = 32.9 with a P-value of 0.119 for testing H0: µ = 33 versus Ha: µ < 33. Is the following interpretation of this P-value correct or incorrect? The P-value is the probability of observing a sample mean less than or equal to _____ olives, assuming that the true mean number of olives per can is 33 olives.
32.9 This interpretation is correct because it has both of the following: (1) the probability is about getting the sample statistic (sample mean of 32.9 olives or less) and (2) the assumption is on the null hypothesis being true (true mean is 33 olives).
We can use a Normal distribution to model the length of gestation for pregnant women. The model has a mean of 280 days and a standard deviation of 20 days. Ninety-five percent of women will have gestation periods between 240 and ________ days. - 340 - 280 - 360 - 320
320
Researchers worry that exposure to pollutants from an environmental waste site may lead to an increase in the incidence of Disease X among residents of the nearby community. In a random sample of 7,900 community members, 316 people had this disease. Among members of this community, the odds of having the disease are _____%. Round to the nearest tenth of a percent.
4.2 The proportion of community members with the disease is 316/7900 = 0.04. The proportion of community members without the disease is 1 - 0.04 = 0.96. To find the odds of the disease, divide: 0.04/0.96 = 0.0417, which rounds to 0.042 or 4.2%.
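The difference between a proportion and odds is easy to see in code. A minimal sketch; `odds` is a hypothetical helper name.

```python
def odds(cases, total):
    """Odds of an event: P(event) / P(no event)."""
    p = cases / total
    return p / (1 - p)


disease_odds = odds(316, 7900)
print(f"proportion = {316 / 7900:.3f}")             # proportion = 0.040
print(f"odds = {disease_odds:.4f}")                 # odds = 0.0417
print(f"odds as a percent = {disease_odds:.1%}")    # odds as a percent = 4.2%
```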
If a coin is flipped four times and we are interested in the event "two heads and two tails," the event will have _____ outcomes in it. - 8 - 6 - 2 - 16
6
A researcher would like to estimate the mean number of hours students at a large university sleep per night. She takes a simple random sample of 561 students and asks them to report the number of hours they slept the previous night. The mean of the sample is x¯ = 6.4 hours. Assume that the number of hours of sleep for all students in the population has an exactly Normal distribution with standard deviation 2.4 hours. The 95 part of the 68-95-99.7 rule says that ___________is an approximate 95% confidence interval for µ. - 6.4 ± 2.4 - 6.4 ± 1.2 - 1.2 ± 2 - 1.2 ± 0.8
6.4 ± 2.4 The 95 part of the 68-95-99.7 rule says that x¯ is within 2 standard deviations of the mean µ in 95% of all samples. The standard deviation of the sample mean here is 2.4/µ = 1.2, and 2*1.2 = 2.4 hours. The 95 part of the 68-95-99.7 rule says that 6.4 ± 2.4 is an approximate 95% confidence interval for µ.
According to the empirical rule, if 𝑧 is a random variable that has a standard normal distribution, then approximately 68% of all values fall between -1 and 1. Use a table of 𝑧‑critical values or software to find this percentage precise to at least two decimal places.
68.26% One method for finding the percentage of values in a standard normal distribution that will fall between -1 and 1, or 𝑃 (−1 ≤ 𝑧 ≤ 1), is to begin by finding the percentage of values that fall below 1, or 𝑃 (𝑧 ≤ 1). You can find this probability using a table of 𝑧‑critical values. Because you want only values that are between -1 and 1, as opposed to all values below 1, subtract the percentage of values that fall below -1, or 𝑃 (𝑧 ≤ −1). Thus, calculate the percentage of values in a standard normal distribution that will fall between -1 and 1 as follows. 𝑃 (−1 ≤ 𝑧 ≤ 1) = 𝑃 (𝑧 ≤ 1)−𝑃 (𝑧 ≤ −1) = 84.13%−15.87%=68.26%
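The same subtraction can be done with software instead of a table. A sketch using Python's standard-library `NormalDist`; `area_between` is a made-up name. Software keeps more decimal places than a four-digit table, so it reports about 68.27% rather than the table-based 68.26%.

```python
from statistics import NormalDist


def area_between(a, b):
    """P(a <= Z <= b) for a standard Normal variable Z."""
    z = NormalDist()
    return z.cdf(b) - z.cdf(a)


print(f"{area_between(-1, 1):.4%}")  # 68.2689%
```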
If a coin is flipped three times, and the outcome of each flip is recorded in order, the sample space will have _____ possible outcomes. - 8 - 4 - 3 - 2
8 Each flip has two possible outcomes (heads or tails). There are two possibilities for the first, then two for the second, then two for the third, so the total number of possible outcomes is 2*2*2=8. We can list out the outcomes as: TTT, THH, HTH, HHT, TTH, THT, HTT, HHH
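Sample spaces for coin flips can be listed by machine. This sketch uses `itertools.product`; `coin_outcomes` is a hypothetical helper. It covers this card (three flips) and the earlier "two heads and two tails" card (four flips).

```python
from itertools import product


def coin_outcomes(n_flips):
    """List every ordered outcome of n_flips coin flips, e.g. 'HTH'."""
    return ["".join(seq) for seq in product("HT", repeat=n_flips)]


three = coin_outcomes(3)
print(len(three), three)          # 8 outcomes

# Event "two heads and two tails" in four flips
two_heads = [s for s in coin_outcomes(4) if s.count("H") == 2]
print(len(two_heads), two_heads)  # 6 outcomes
```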
Weights of adult female California sea lions are approximately Normal with mean μ = 220 pounds and standard deviation σ = 8 pounds. The standard deviation of the sampling distribution of x¯ created from random samples of size 36 is _____/6. Fill in the missing value.
8 The standard deviation of the sampling distribution of x¯ equals σ/√n, which equals 8/√36 or 1.33.
The distribution of bladder volume in men is approximately Normal with mean 550 ml and standard deviation 100 ml. What percent of men have a bladder volume larger than 450 ml? Round to the nearest whole number.
84% 450 ml is one standard deviation below the mean. We expect about 68% of all men to have bladder volumes within one standard deviation of 550. The remaining 32% is split in half because of symmetry. Add the upper 16% to the 68% in the middle of the distribution.
Let X be a random variable whose distribution is Normal with a mean of 100 and a standard deviation of 10. The proportion of observations above 115 is equal to the proportion below what value? Use the fact that the normal distribution is symmetric about the mean.
85 The proportion more than 15 units below the mean is the same as the proportion more than 15 units above the mean, because of symmetry in Normal distributions.
Soda bottles are filled with volumes that are normally distributed with mean volume 12 ounces and a standard deviation of 0.15 ounce. Acceptable limits of volume are between 11.80 and 12.20 ounces. The proportion of bottles that are under filled is __________. Please choose the correct answer from the following choices, and then select the submit answer button. - 91% - 9% - 82% - 18%
9% We find the area to the left of 11.8 ounces in this distribution. The z-score for 11.8 is z = (11.8 - 12)/.15 = -1.33. From the table, the area to the left of -1.33 is 0.0918.
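For a Normal distribution that is not standard, software can skip the z-score step. A sketch assuming Python's `NormalDist`; `prob_below` is a made-up name. The result differs slightly from 0.0918 because the table lookup rounds z to -1.33.

```python
from statistics import NormalDist


def prob_below(x, mu, sigma):
    """P(X < x) for X ~ Normal(mu, sigma)."""
    return NormalDist(mu, sigma).cdf(x)


under_filled = prob_below(11.80, 12, 0.15)
print(f"under filled: {under_filled:.4f}")  # about 0.09, i.e. roughly 9%
```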
The distribution of bladder volume in men is approximately Normal with mean 550 ml and standard deviation 100 ml. _____% of all men will be expected to have bladder volumes between 350 and 750 ml because this is two standard deviations from the mean.
95 350 is two standard deviations below the mean of 550 and 750 is two standard deviations above the mean. Approximately 95% of the area in a Normal distribution is within two standard deviations of the mean.
A researcher would like to estimate the mean number of hours students at a large university sleep per night. She takes a simple random sample of 312 students and asks them to report the number of hours they slept the previous night. Using the sample data she reports that she is 95% confident that the true average amount of nightly sleep students at the university get is between 6.1 and 8.1 hours. In statistics, the phrase "95% confident" means that these estimates were obtained using a method that gives correct results _____% of the time.
95 This is the meaning of the phrase "95% confident" in statistics.
You compare P-value with α in the __________ step of the four-step process for a test on a population mean.
CONCLUDE In order to draw conclusions, you have to compare the P-value with α in this step.
Three Normal distributions all have mean 20. Distribution A has standard deviation 1, distribution B has standard deviation 5, and distribution C has standard deviation 10. The distribution with the flattest peak is - Distribution C. - Distribution B. - They will all have the same shape. - Distribution A.
Distribution C. Because this distribution has the largest standard deviation, it will have the flattest peak in order to have the entire area under the curve remain 1.
HYPOTHESIS TESTS: THE FOUR-STEP PROCESS
STATE: What is the practical question that requires a statistical test? PLAN: Identify the parameter, state the null and alternative hypotheses, and choose the type of test that fits your situation. SOLVE: Carry out the test in two phases: Check the conditions for the test you plan to use. Obtain the test statistic and the P-value. CONCLUDE: Return to the practical question to describe your results in this setting.
If you set alpha to 0.05 and the p-value is 0.06, what would you conclude? - Fail to reject the null hypothesis; we will continue to assume it. - The null hypothesis has been proven. - Reject the null hypothesis in favor of the alternative hypothesis.
Fail to reject the null hypothesis; we will continue to assume it.
In 1995, the average level of mercury uptake in wading birds in the Everglades was 15 parts per million. A researcher is looking for evidence that this average has changed. What is the correct set of hypotheses? - H0: µ = 15, Ha: µ > 15 - H0: µ = 15, Ha: µ < 15 - H0: µ = 15, Ha: µ ≠ 15
H0: µ = 15, Ha: µ ≠ 15
A consumer advocate is interested in evaluating the claim that a new granola cereal contains ''4 ounces of cashews in every bag.'' The advocate recognizes that amounts of cashews will vary slightly from bag to bag, but she suspects that the mean amount of cashews per bag is actually less than 4 ounces. To check the claim, the advocate purchases a random sample of 40 bags of cereal and calculates a sample mean of 3.68 ounces of cashews. What null and alternative hypotheses should she test? - H0: µ = 4 versus Ha: µ > 4 - H0: µ = 4 versus Ha: µ < 4 - H0: µ < 4 versus Ha: µ = 4 - H0: µ = 4 versus Ha: µ ≠ 4
H0: µ = 4 versus Ha: µ < 4 She really suspects that the mean amount of cashews per bag is less than 4 ounces, so she wants to determine if this is correct. There would be no cause for her to be concerned if there were actually more cashews than claimed.
The times for untrained rats to run a standard maze have an N (65, 15) distribution, where the times are measured in seconds. The researchers hope to show that training improves the times, and they will do so by testing statistical hypotheses. What is the appropriate alternative hypothesis? - Ha: µ < 65 seconds - Ha: µ > 65 seconds - Ha: µ < 15 seconds - Ha: µ = 65 seconds
Ha: µ < 65 seconds
An endocrinologist is interested in the effects of depression on the thyroid. It is believed that healthy subjects have a mean thyroxin (a hormone related to thyroid function) level of 7.0 micrograms/100 ml and a standard deviation of 1.6 micrograms/100 ml. The endocrinologist wants to assess whether the mean thyroxin level is different for those with depression. She samples 35 subjects with depression and obtains a sample mean of 7.82 micrograms/100 ml for thyroxin. What alternative hypothesis does she want to test? - H0: µ = 7 - Ha: µ = 7 - Ha: µ ≠ 7 - H0: µ ≠ 7
Ha: µ ≠ 7 She wants to assess whether the mean thyroxin level differs from 7.0 micrograms/100 ml, so she wants ≠ in the alternative hypothesis.
A statistical hypothesis test is analogous to a criminal trial in the American justice system. There, the null hypothesis is that the defendant is innocent. In this case, letting a guilty person go free is a Type _____ error.
II A Type II error is when the null hypothesis is wrongly NOT rejected. Here, the null hypothesis is that the person is innocent.
Which of the following statements concerning areas under the standard Normal curve is true? - If a z-score is negative, the area to its right is greater than 0.5. - If the area to the right of a z-score is less than 0.5, the z-score is negative. - If a z-score is positive, the area to its left is less than 0.5
If a z-score is negative, the area to its right is greater than 0.5. If a z-score is negative, the value is below the mean. The area to the right will be more than 0.5.
A research study tested the effectiveness of a new weight loss program for obese adults. 86 obese adults participated in the program for 90 days and the amount of weight each individual lost was recorded. The researchers used the results of the study to test the null hypothesis that the mean weight loss is equal to 0 pounds, versus the alternative hypothesis that mean weight loss is not equal to 0 pounds. They rejected the null hypothesis at the 5% significance level. Which of the following could NOT be a 95% confidence interval for the mean weight loss? - Interval A: 1.5 pounds to 7.4 pounds - Interval B: 15.6 pounds to 37.8 pounds - Both interval A and interval B - Interval C: -10.4 to 12.8 pounds.
Interval C: -10.4 to 12.8 pounds. This interval contains the value 0. If this was the 95% confidence interval the researchers would not have been able to reject the null hypothesis that mean weight loss was 0 pounds.
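The link between a two-sided test at the 5% level and a 95% confidence interval reduces to one check: does the interval contain the null value? A sketch with a hypothetical function name.

```python
def consistent_with_rejection(low, high, null_value=0.0):
    """True if a confidence interval could accompany a rejected H0.

    A two-sided test at level alpha rejects H0 exactly when the matching
    (1 - alpha) confidence interval does not contain the null value.
    """
    return not (low <= null_value <= high)


print(consistent_with_rejection(1.5, 7.4))     # True  (Interval A)
print(consistent_with_rejection(15.6, 37.8))   # True  (Interval B)
print(consistent_with_rejection(-10.4, 12.8))  # False (Interval C contains 0)
```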
What effect does increasing the sample size have on margin of error? - It increases margin of error. - It decreases margin of error. - It has no effect on margin of error. - None of the other choices are correct.
It decreases margin of error. Increasing the sample size increases the denominator in the margin of error formula, which decreases the margin of error and the width of the confidence interval.
Decreasing the sample size has what effect on the margin of error? - It decreases the margin of error. - It has no effect on the margin of error. - It increases the margin of error. - The effect depends on the amount of decrease in the sample size.
It increases the margin of error. The sample size appears in the denominator of the formula as a square root. Decreasing the sample size will increase the margin of error, because dividing by a smaller quantity makes the result larger.
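The effect of sample size on margin of error can be tabulated directly. A sketch: `margin_of_error` is a made-up name, and σ = 50 with z* = 1.96 are example values chosen for illustration, not taken from this card.

```python
from math import sqrt


def margin_of_error(sigma, n, z_star=1.96):
    """Margin of error z* * sigma / sqrt(n) for a mean with known sigma."""
    return z_star * sigma / sqrt(n)


# Quadrupling n halves the margin of error.
for n in (25, 100, 400):
    print(n, round(margin_of_error(50, n), 2))
# 25 19.6
# 100 9.8
# 400 4.9
```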
Suppose a botanist grows many individually potted eggplants, all treated identically and arranged in groups of four pots on the greenhouse bench. After 30 days of growth, she measures the total leaf area X of each plant. Assume that the population distribution of X is approximately Normal, with mean = 800 cm2 and SD = 90 cm2. Which of the following is a true statement about the sampling distribution for the average of each group of four plants? - Mean = 800 cm2 and SD = 90/4 cm2 - Mean = 800/4 cm2 and SD = 90/4 cm2 - Mean = 800 cm2 and SD = 90/2 cm2 - Mean = 800 cm2 and SD = 90 cm2
Mean = 800 cm2 and SD = 90/2 cm2
Researchers are testing a new drug that they believe will help patients recover from infections more quickly. A group of patients are randomly selected to receive the new drug, while another group of patients are treated with an existing drug. The recovery times for patients in each group are recorded. On average patients who were treated with the new drug recovered 15 days sooner than patients who were treated with the old drug. Researchers will use this data to test the null hypothesis that the recovery times in both groups are the same, versus the alternative hypothesis that patients who received the new drug recover faster. Because patients who were treated with the new drug recovered 15 days sooner than patients treated with the old drug, we know that: - There is a statistically significant difference between the groups. - We need to know more about the sample standard deviations before a result can be determined. - There is not a statistically significant difference between the groups. - More information is needed to know if the difference between the groups is statistically significant.
More information is needed to know if the difference between the groups is statistically significant. The effect size was large (15 days) but this alone does not tell us if the difference is statistically significant. Even a large effect size may not be statistically significant if the sample sizes are small.
Rachael got a 550 on the analytical portion of the Graduate Record Exam (GRE). If GRE scores are normally distributed and have mean μ = 600 and standard deviation σ = 30, what is true about other test takers? - More than 50% of test takers scored higher than Rachael. - Less than 50% of test takers scored higher than Rachael. - More than 50% of test takers scored lower than Rachael. - Exactly 50% of test takers scored higher than Rachael.
More than 50% of test takers scored higher than Rachael. Rachael scored below the mean, so more than 50% of test takers scored higher than her.
If the population is N(μ, σ), the sampling distribution is _____
N(μ, σ/√n) (If the population is not Normal, the sampling distribution is approximately N(μ, σ/√n) when n is large enough.)
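A simulation can make the σ/√n result concrete. This sketch reuses the eggplant numbers from the earlier card (mean 800 cm2, SD 90 cm2, groups of four); the function name, the seed, and the number of simulated groups are arbitrary choices for illustration.

```python
import random
from statistics import mean, stdev


def simulate_sample_means(mu, sigma, n, n_samples=20000, seed=1):
    """Draw n_samples samples of size n from N(mu, sigma); return their means."""
    rng = random.Random(seed)  # fixed seed only so the run is repeatable
    return [
        sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        for _ in range(n_samples)
    ]


means = simulate_sample_means(800, 90, 4)
# Theory says the means should centre on 800 with SD 90 / sqrt(4) = 45.
print(round(mean(means), 1), round(stdev(means), 1))
```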
A group of researchers wanted to know if there was a difference in average yearly income taxes paid between residents of two very large cities in the Midwestern United States. The average for the first city was $6,505 and for the second city, it was $6,511. Can we conclude the difference is statistically significant because the difference of $6 is greater than 0? - Yes, since 6 > 0. - No, since significance can only be determined by performing a statistical test. - Yes, since there was a difference in the means. - No, since no sample sizes were provided
No, since significance can only be determined by performing a statistical test. You cannot base your conclusion on the observed difference. Is that difference large enough that it should (or should not) have arisen by chance?
Another friend thinks it means that "we can be 95% confident that the true mean needle length of Torrey pine trees is 27 cm." Is that right?
No. It is the whole interval that estimates the population mean needle length
One friend believes it means that "95% of all Torrey pine needles have lengths between 25 and 29 cm." Is that right?
No. The confidence interval refers to the population mean needle length μ
The label on a can of a particular brand of extra large olives states that there are about 33 olives in each can. A gourmet cook feels that the claim of 33 olives per can is too high, and that the average number of olives per can is less than 33. He samples 35 cans and finds x¯ = 32.9. Assuming that the P-value is 0.119 for testing H0: µ = 33 versus Ha: µ < 33, what value of α would you reject the null hypothesis at? - 0.01 - 0.05 - 0.1 - None of these values are correct.
None of these values are correct. Since P-value = 0.119 is greater than the largest listed value, α = 0.1, the results are not statistically significant for any of the values.
The shape of the sampling distribution of p̂ becomes more nearly __________ as the size of the sample increases
Normal When the sample size is large, the shape of the distribution is approximately Normal.
Consider this confidence interval interpretation: We are 90% confident that the true mean weight of all black bears is between 344.8 and 361.2 pounds. What needs to be done to fix this statement? - The true parameter needs to be stated correctly and in context. - The actual confidence interval is not reported. - The confidence level is either not reported or not reported correctly. - Nothing needs to be fixed; the interpretation is correct.
Nothing needs to be fixed; the interpretation is correct. The confidence level is given, the parameter is stated correctly and in context, and the interval is given. This is a correct interpretation of the confidence interval.
__________ is the probability of rejecting the null hypothesis when the alternative hypothesis is, in fact, correct.
Power Power is the probability of rejecting the null hypothesis when the alternative hypothesis is, in fact, correct.
A consumer advocate is interested in evaluating the claim that a new granola cereal contains 4 ounces of cashews in every bag. The advocate recognizes that amounts of cashews will vary slightly from bag to bag, but she suspects that the mean amount of cashews per bag is less than 4 ounces. To check the claim, the advocate purchases a random sample of 40 bags of cereal and calculates a sample mean of 3.68 ounces of cashews. Assuming that the P-value is 0.026 for testing H0: µ = 4 versus Ha: µ < 4, what conclusions should be drawn at α = 0.05? - Reject H0: µ = 4 and conclude that the mean amount of cashews is less than 4 ounces. - Reject H0: µ = 4 and conclude that the mean amount of cashews is 4 ounces. - Fail to reject H0: µ = 4 and conclude that the mean amount of cashews is not statistically different from 4 ounces. - Fail to reject H0: µ = 4 and conclude that the mean amount of cashews is less than 4 ounces.
Reject H0: µ = 4 and conclude that the mean amount of cashews is less than 4 ounces. Since P-value = 0.026 < α = 0.05, we reject H0: µ = 4, and conclude that Ha: µ < 4 is correct.
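The decision rule used in the P-value cards (cashews, olives, and the α = 0.05 versus P = 0.06 card) is a single comparison. A sketch with a hypothetical `decide` function; it treats P = α as not significant, which is one common convention and an assumption here.

```python
def decide(p_value, alpha):
    """Return the conclusion of a significance test at level alpha."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


print(decide(0.026, 0.05))  # reject H0           (cashews)
print(decide(0.060, 0.05))  # fail to reject H0
for alpha in (0.01, 0.05, 0.10):
    print(alpha, decide(0.119, alpha))  # olives: fail to reject at every level
```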
We are interested in the process "flip a coin twice." The sample space for what is observed in this process is __________. - S = {Heads, Tails} - S = {Head and Tail, Tail and Head} - S = {one head, two heads, no heads} - S = {Head and Head, Head and Tail, Tail and Head, Tail and Tail}
S = {Head and Head, Head and Tail, Tail and Head, Tail and Tail}
You calculate the test statistic in the __________ step of the four-step process for a test on a population mean.
SOLVE
You calculate the test statistic in the __________ step of the four-step process for a test on a population mean. (The answer is the name of the step.)
SOLVE You need to calculate the test statistic in order to find the P-value in the SOLVE step.
CONFIDENCE INTERVALS: THE FOUR-STEP PROCESS
STATE: What is the practical question that requires estimating a parameter? PLAN: Identify the parameter and choose a level of confidence. SOLVE: Carry out the work in two phases: Check the conditions for the interval you plan to use. Obtain the confidence interval. CONCLUDE: Return to the practical question to describe your results in this setting.
An endocrinologist is interested in the effects of depression on the thyroid. It is believed that healthy subjects have a mean thyroxin (a hormone related to thyroid function) level of 7.0 micrograms/100 ml and a standard deviation of 1.6 micrograms/100 ml. The endocrinologist wants to assess whether the mean thyroxin level is different for those with depression. She samples 35 subjects with depression and obtains a sample mean of 7.82 micrograms/100 ml for thyroxin. She finds the P-value to be 0.0024 for testing H0: µ = 7 versus Ha: µ ≠ 7. What is a correct interpretation of this P-value in context? - The P-value is the probability of observing a sample mean more extreme than 7.82 micrograms/100 ml, assuming that the true mean thyroxin is 7 micrograms/100 ml. - The P-value is the probability of observing a sample mean that is more extreme than 7 micrograms/100 ml, assuming that the true mean thyroxin is 7.82 micrograms/100 ml. - The P-value is the probability that the true mean thyroxin is different from 7 micrograms/100 ml, assuming random sampling from the population of subjects suffering from depression. - The P-value is the probability that the true mean thyroxin is 7 micrograms/100 ml, assuming random sampling from the population of subjects suffering from depression.
The P-value is the probability of observing a sample mean more extreme than 7.82 micrograms/100 ml, assuming that the true mean thyroxin is 7 micrograms/100 ml. This interpretation is correct because it has both of the following: (1) the probability is about getting the sample statistic (sample mean of 7.82) or more extreme and (2) the probability is conditional on the null hypothesis being true (true mean is 7).
A potato chip company packages bags labeled as containing 1.5 ounces of chips. These are actually filled according to a Normal distribution with σ = 0.2 ounces and mean μ. A consumer advocate suspects the company of under-filling the bags. She obtains a random sample of n = 50 bags and weighs the contents of each, obtaining a sample mean x¯ = 1.45 ounces. Her P-value is 0.038 from a test statistic z = -1.77. The company refuses to accept the results of her test as conclusive, and says the results can be attributed to random sampling error. Why? - The alternate hypotheses are different. - She had bad luck in obtaining her sample; anyway, who would notice a difference of 0.05 ounces? - The advocate miscalculated the P-value. - The company is lying to us. They are purposely under-filling the packages.
The alternate hypotheses are different. The advocate is only concerned with packages being underweight (she is testing H0: μ = 1.5 and Ha: μ < 1.5) while the company is concerned with both under- and over-filling bags and uses H0: μ = 1.5 and Ha: μ ≠ 1.5. This example points out the difference between a one- and a two-sided alternate. Rejecting H0 in favor of a two-sided alternate requires more evidence, because the two-sided P-value is twice the one-sided P-value.
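The gap between the advocate's and the company's conclusions shows up when both P-values are computed from the same data. A sketch: `z_test_p_values` is a made-up name, and comparing against α = 0.05 is an assumption, since the card does not state a significance level.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_values(x_bar, mu0, sigma, n):
    """Return (z, lower-tail P-value, two-sided P-value) for a z test."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    lower_tail = NormalDist().cdf(z)            # Ha: mu < mu0
    two_sided = 2 * NormalDist().cdf(-abs(z))   # Ha: mu != mu0
    return z, lower_tail, two_sided


z, one_sided, two_sided = z_test_p_values(1.45, 1.5, 0.2, 50)
print(f"z = {z:.2f}")                      # z = -1.77
print(f"one-sided P = {one_sided:.3f}")    # below 0.05
print(f"two-sided P = {two_sided:.3f}")    # above 0.05
```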
When we say that our confidence is 95%, what is our confidence in? - The sample estimate. - The confidence interval procedure. - The probability that a specific confidence interval contains µ. - Our confidence interval.
The confidence interval procedure. Our confidence is in the procedure used to calculate our confidence interval. For 95% confidence, the confidence interval procedure creates confidence intervals that capture the population parameter 95% of the time.
alternative hypothesis
The hypothesis that states there is a difference between two or more sets of data. The more general claim about the population that we are trying to find evidence for
A requirement for inference with z-procedures is: - The population should have a Normal distribution. - The sample should have a Normal distribution. - The sample should be representative of the population. - Both A. and C. are correct
The population should have a Normal distribution.
What tells us how close x¯ is likely to be to µ? - The sample standard deviation, s - The standard deviation of the population σ - The mean of the population, µ - The sampling distribution of x¯
The sampling distribution of x¯ This provides information so that we can figure out how close x¯ might be to µ.
A population distribution is flat (or uniform) with mean μ = 2 and standard deviation σ = 0.4. The sampling distribution of x¯ is created from the sample means from all possible samples of size 64. How does the shape of this sampling distribution compare with the shape of a second sampling distribution with samples of size 10? - The shapes are the same, both are flat. - The shapes are both approximately Normal but the distribution of samples of size 64 is less Normal. - The shapes are the same, both are perfectly Normal. - The shapes are both approximately Normal but the distribution of samples of size 10 is less Normal.
The shapes are both approximately Normal but the distribution of samples of size 10 is less Normal. According to the central limit theorem, the shape of a sampling distribution gets closer to Normal as sample size increases. With a sample size of 64, the sampling distribution of x¯ is approximately Normal instead of flat like the population distribution.
Suppose that a manager is interested in estimating the average amount of money customers spend in her store. After sampling 36 transactions at random, she found that the average amount spent was $37.85. She then computed a 99% confidence interval to be between $32.89 and $42.81. Give a valid interpretation of the interval.
The store manager is 99% confident that the average amount spent by all customers is between $32.89 and $42.81.
A very large school district in Connecticut wanted to estimate the average IQ of this year's graduating class. The district took a simple random sample of 100 seniors and calculated the 95% confidence interval for µ as 95 to 105 points. Consider this interpretation of the confidence interval: The school district can be 95% confident that the mean IQ of the 100 students sampled is contained in the interval of 95 to 105 points. What needs to be done to fix this statement? - The true parameter needs to be stated correctly and in the correct setting. - The confidence level is either not reported or not reported correctly. - The value of the estimate is not given. - Nothing needs to be fixed; the interpretation is correct.
The true parameter needs to be stated correctly and in the correct setting. You know the mean of the sample, so you don't need to estimate it. The confidence interval gives a range of values for the parameter (the population mean), not the sample mean. A better interpretation is, With 95% confidence, the mean IQ of all graduating students in this school district is between 95 and 105 points.
A researcher would like to estimate the mean number of hours students at a large university sleep per night. She takes a simple random sample of 312 students and asks them to report the number of hours they slept the previous night. Using the sample data she reports that she is 95% confident that the true average amount of nightly sleep students at the university get is between 6.1 and 8.1 hours. What does the phrase "95% confident" mean in statistics? - Of the 312 students, 95% of them slept between 6.1 and 8.1 hours. - If other researchers repeat this experiment, they will get the same confidence interval (6.1 to 8.1) 95% of the time. - These estimates were obtained using a method that gives correct results 95% of the time. - When the researcher showed her results to 100 other researchers, 95% of them thought they were correct.
These estimates were obtained using a method that gives correct results 95% of the time. This is the meaning of the phrase "95% confident" in statistics
An endocrinologist is interested in the effects of depression on the thyroid. It is believed that healthy subjects have a mean thyroxin (a hormone related to thyroid function) level of 7.0 micrograms/100 ml and a standard deviation of 1.6 micrograms/100 ml. The endocrinologist wants to assess whether the mean thyroxin level is different for those with depression. She samples 35 subjects with depression and obtains a sample mean of 7.82 micrograms/100 ml for thyroxin. She finds the P-value to be 0.0024 for testing H0: µ = 7 versus Ha: µ ≠ 7. Can she conclude that the mean thyroxin differs significantly from 7 micrograms/100 ml at α = 0.05? - They would lead to the alternative not being accepted. - They would lead to the null not being rejected. - They are not statistically significant. - They are statistically significant.
They are statistically significant. Since P-value = 0.0024 < α = 0.05, the results are statistically significant.
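The P-value quoted in this card can be reproduced from the numbers in the question. The following is a rough sketch, assuming a two-sided z test with the population standard deviation of 1.6 treated as known; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 7.0      # mean thyroxin level for healthy subjects (null value)
sigma = 1.6    # standard deviation, treated as known
n = 35         # subjects with depression
xbar = 7.82    # observed sample mean
alpha = 0.05

z = (xbar - mu0) / (sigma / sqrt(n))

# Two-sided P-value: area in both tails beyond |z|
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, P-value = {p_value:.4f}")
print("statistically significant" if p_value < alpha else "not significant")
```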
Financial aid offices are interested in how much a student might reasonably be able to earn when figuring aid packages. A simple random sample of students at one university in southern Georgia was interviewed about how much money they earn each month. The resulting 95% confidence interval was ($150.75, $223.74) for those with part-time jobs. Which of the following statements is true? - This confidence interval should contain the true mean monthly earnings for all students with part-time jobs at that university. - This confidence interval should contain the true mean monthly earnings for all college students with part-time jobs. - This confidence interval should contain the true mean monthly earnings for all college students. - This confidence interval should contain the true mean monthly earnings for all Georgia college students with part-time jobs.
This confidence interval should contain the true mean monthly earnings for all students with part-time jobs at that university. The sample was obtained from only one university. The results of inference extend only to the population from which the sample was taken.
If we are testing a null hypothesis that a production line is "in control" (meaning that output is satisfactory) versus an alternative that it has gone out of control, the consequences of rejecting the null mean having to shut down to adjust the machinery. This means a loss of production time and money. In which situation below would we want to set the α level the lowest (to avoid unnecessary shutdowns)?
We're making potato chips and the hypothesis test is concerned with the amount of fat in the chips. Too much fat content might be a cause for concern. However, that is not as serious a concern as some of the others listed.
A polling company uses random digit dialing to select the households to be interviewed. In one city, of 1000 calls, 15% of the calls reach an unlisted number. This is not surprising, since 18% of the residential phones in that city are unlisted. The number 18% is: - a statistic. - a parameter. - unknown. - a sampling distribution.
a parameter
What type of sample is required for our inference methods?
a simple random sample
Suppose that Jason recently landed job offers at two companies. Company A reports an average salary of $51,500 with a standard deviation of $2,175. Company B reports an average salary of $46,820 with a standard deviation of $5,920. Assume that salaries at each company are normally distributed. a) Jason's goal is to secure a position that pays $55,000 per year. What are the 𝑧‑scores for Jason's desired salary at Company A and Company B? Company A z: Company B z: b) At which company is Jason more likely to obtain his desired salary of $55,000 per year? - Company B, because the 𝑧‑score for $55,000 at Company B is less than the 𝑧‑score for $55,000 at Company A. - Company B, because the 𝑧‑score for $55,000 at Company B is greater than the 𝑧‑score for $55,000 at Company A. - Company A, because the 𝑧‑score for $55,000 at Company A is greater than the 𝑧‑score for $55,000 at Company B. - Company A, because the 𝑧‑score for $55,000 at Company A is less than the 𝑧‑score for $55,000 at Company B.
a) Company A z: 1.61; Company B z: 1.38 b) Company B, because the 𝑧‑score for $55,000 at Company B is less than the 𝑧‑score for $55,000 at Company A.
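The two z-scores, and the reason the smaller one favors Company B, can be sketched as follows. The upper-tail probabilities are not part of the card; they are computed here under the card's assumption that salaries are normally distributed.

```python
from statistics import NormalDist

target = 55_000  # Jason's desired salary

# (mean, standard deviation) reported by each company
companies = {"A": (51_500, 2_175), "B": (46_820, 5_920)}

z_scores = {}
tail_probs = {}
for name, (mean, sd) in companies.items():
    z_scores[name] = (target - mean) / sd
    # P(salary >= target) under the assumed Normal model
    tail_probs[name] = 1 - NormalDist(mean, sd).cdf(target)
    print(f"Company {name}: z = {z_scores[name]:.2f}, "
          f"P(salary >= target) = {tail_probs[name]:.4f}")
```

A smaller positive z-score means the target sits closer to the mean in standard-deviation units, so more of the distribution lies above it.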
The distribution of bladder volume in men is approximately Normal with mean 550 milliliters (ml) and standard deviation 100 ml. In women, bladder volumes are approximately Normal with mean 400 ml and standard deviation 75 ml. Refer to the plot of these two distributions and use the 68-95-99.7 rule to answer the questions. a) Between what volumes do the middle 95% of men's bladders fall? lower value: upper value: b) What percent of men's bladders have a volume larger than 650 ml? c) Between what values do almost all (99.7%) of women's bladder volumes fall? lower value: upper value: d) How small are the smallest 2.5% of all bladders among women?
a) lower value: 350; upper value: 750 b) 16% c) lower value: 175; upper value: 625 d) 250 (a) The middle 95% of men's bladder volumes fall within plus or minus two standard deviations from the mean, so 550 ± 2(100) = 350 to 750 ml. (b) The value 650 is one standard deviation above the mean. So, 16% (50% minus half of 68% ) of men's bladders have a volume larger than 650 ml. (c) Almost all of women's bladder volumes fall within plus or minus three standard deviations from the mean, so 400 ± 3(75) = 175 to 625 ml. (d) The smallest 2.5% would be below 250 ml (two standard deviations below the mean).
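The 68-95-99.7 rule is a rounded version of exact Normal areas. As a hedged comparison, the sketch below recomputes the four answers from the Normal model itself; the results should land close to, but not exactly on, the rule-based answers in the card.

```python
from statistics import NormalDist

men = NormalDist(mu=550, sigma=100)
women = NormalDist(mu=400, sigma=75)

# (a) middle 95% of men: cut off 2.5% in each tail
men_low, men_high = men.inv_cdf(0.025), men.inv_cdf(0.975)
# (b) share of men above 650 ml (one standard deviation above the mean)
men_above_650 = 1 - men.cdf(650)
# (c) central 99.7% of women: cut off 0.15% in each tail
women_low, women_high = women.inv_cdf(0.0015), women.inv_cdf(0.9985)
# (d) cutoff below which the smallest 2.5% of women fall
women_small = women.inv_cdf(0.025)

print(f"(a) {men_low:.0f} to {men_high:.0f} ml")
print(f"(b) {men_above_650:.1%}")
print(f"(c) {women_low:.0f} to {women_high:.0f} ml")
print(f"(d) below {women_small:.0f} ml")
```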
When pulling a single card out of a well-shuffled deck, the events {face card} and {spade} are not disjoint, which is NOT demonstrated by which card? - jack of spades - ace of spades - queen of spades - king of spades
ace of spades An ace is not a face card.
In a test of significance we typically want to find evidence __________ the null hypothesis.
against The null hypothesis is usually the hypothesis that we are trying to find evidence against.
In a test of significance we make a claim and ask if the data give evidence __________ it.
against This is what a test of significance does in a nutshell.
If the P-value is very small, then the difference between the observed statistic value and the claimed parameter value must be __________. - large - too big to be due to chance - statistically significant - all of the above
all of the above A very small P-value (think < 0.05) means the observed statistic is far from the claimed parameter value relative to its sampling variability. The difference is therefore large, too big to plausibly be due to chance, and statistically significant.
Facts about the mean and standard deviation of the sampling distribution of x¯ __________. - are useless no matter what we know - allow us to know nothing unless we know the shape of the population - allow us to relate the mean and standard deviation of the sampling distribution to the population - allow us to use the standard Normal distribution to compute probabilities
allow us to relate the mean and standard deviation of the sampling distribution to the population These facts only tell us how the mean and standard deviation of the sampling distribution of x¯ relate to the mean and standard deviation of the population. While these facts are important, by themselves, they do not allow us to use the standard Normal distribution to compute probabilities on x¯.
In a test of significance we typically want to find evidence for the __________ hypothesis.
alternative When the data give strong evidence against the null hypothesis, we reject it in favor of the alternative hypothesis.
When performing a hypothesis test on a data set, whether or not the results are statistically significant depends on: - neither the sample size nor the alternative hypothesis. - both the sample size and the alternative hypothesis. - the sample size. - the alternative hypothesis.
both the sample size and the alternative hypothesis. Both the sample size and the alternative hypothesis affect the significance of the test.
Julie was waiting in line to buy tickets to a concert. When she got to the window, she remarked that she'd been waiting 45 minutes. What kind of variable is "time spent waiting in line?"
continuous
Increasing the desired margin of error will __________ the required sample size.
decrease Increasing the desired margin of error will require a smaller sample size. The margin of error is in the denominator of the formula for sample size. Making that larger will lower the result of the calculation.
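The sample size formula this card refers to is n = (z*σ/m)², rounded up to a whole number. The sketch below shows how n shrinks as the margin m grows. The function name `required_n` and the value σ = 5 are assumptions chosen for illustration, not values from this card.

```python
from math import ceil
from statistics import NormalDist

def required_n(sigma, margin, confidence=0.95):
    """Smallest n for which z* * sigma / sqrt(n) is at most the margin."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_star * sigma / margin) ** 2)

sigma = 5  # hypothetical population standard deviation
for margin in (0.5, 1, 2):
    print(f"margin {margin}: n = {required_n(sigma, margin)}")
```

Doubling the margin cuts the required sample size to roughly a quarter, because the margin is squared in the denominator.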
An airline wants to know the average time it takes their passengers to claim their luggage. The time to claim luggage for this airline is known to be Normally distributed with mean μ and standard deviation σ = 5 minutes. The margin of error for a 95% confidence interval would __________ if the airline took a simple random sample of 100 passengers instead of 10 passengers.
decrease Increasing the sample size while keeping the confidence level fixed will decrease the margin of error and the confidence interval width because the sample size is in the denominator of the margin of error formula.
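For the airline example, the margin of error m = z*σ/√n can be computed directly for both sample sizes. This is a minimal sketch using σ = 5 minutes from the card; the helper name `margin_of_error` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma = 5                              # minutes, from the card
z_star = NormalDist().inv_cdf(0.975)   # critical value for 95% confidence

def margin_of_error(n):
    return z_star * sigma / sqrt(n)

m10 = margin_of_error(10)
m100 = margin_of_error(100)

print(f"n = 10:  margin = {m10:.2f} minutes")
print(f"n = 100: margin = {m100:.2f} minutes")
```

Multiplying the sample size by 10 divides the margin of error by √10, not by 10.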
For a specified sample size, decreasing α will: - decrease power. - sometimes increase power and sometimes decrease it. - not affect power. - increase power.
decrease power. Decreasing α results in rejecting a false null hypothesis less often, thus automatically decreasing power.
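The link between α and power can be made concrete for a one-sided z test. The sketch below is only an illustration: the null mean, true mean, σ, and n are all hypothetical numbers chosen to show the direction of the effect, and `power_upper_tail` is an illustrative helper name.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

def power_upper_tail(alpha, mu0, mu_true, sigma, n):
    """Power of the one-sided z test of H0: mu = mu0 vs Ha: mu > mu0."""
    z_crit = std_normal.inv_cdf(1 - alpha)          # rejection cutoff for z
    shift = (mu_true - mu0) / (sigma / sqrt(n))     # mean of z when mu = mu_true
    # We reject when z > z_crit; under mu_true, z is Normal(shift, 1)
    return 1 - std_normal.cdf(z_crit - shift)

# Hypothetical scenario: mu0 = 0, true mean 0.5, sigma = 1, n = 25
powers = {
    alpha: power_upper_tail(alpha, mu0=0, mu_true=0.5, sigma=1, n=25)
    for alpha in (0.10, 0.05, 0.01)
}
for alpha, power in powers.items():
    print(f"alpha = {alpha:.2f}: power = {power:.3f}")
```

A smaller α pushes the rejection cutoff further out, so a false null hypothesis is rejected less often.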
If the population standard deviation were 2 instead of 5, the margin of error would be _______.
decreased Smaller standard deviations will make the margin of error smaller, because we are changing σ in the formula for the margin of error.
Decreasing the level of confidence __________ the margin of error.
decreases Decreasing the confidence level results in a decrease in z* which decreases the margin of error and the width of the confidence interval.
For a specified sample size, decreasing α __________ power.
decreases Decreasing α (changing from 10% to 5%, for example) results in rejecting a false null hypothesis less often, thus automatically decreasing power.
As the sample size increases, the standard deviation of the sampling distribution __________.
decreases The standard deviation of the sampling distribution is σ/√n. Thus, as n increases, the standard deviation of the sampling distribution gets smaller.
A correct interpretation of the statement, "The probability that a child delivered in a certain hospital is a girl is 0.50" would be that over a long period of time, it is predicted that there will be a(n) __________ proportion of boys and girls born at that hospital.
equal A probability of 0.50 means equal proportions.
A population distribution is very left skewed with mean μ = 40 and standard deviation σ = 10. Consider two sampling distributions from this population distribution. Sampling distribution #1 is created from the sample means from all possible random samples of size n = 9; sampling distribution #2 is created from the sample means from all possible random samples of size 81. How do the means compare? The mean of sampling distribution #1 is __________ to the mean of sampling distribution #2.
equal The mean of the sampling distribution of x¯ is always equal to the mean of the population distribution, regardless of sample size. So, the means are equal.
A population distribution is flat (or uniform) with mean μ = 2 and standard deviation σ = 0.4. The sampling distribution of x¯ is created from the sample means from all possible samples of size 64. How does the mean of this sampling distribution compare with the mean of the population distribution? The mean of the sampling distribution is __________ the mean of the population distribution.
equal to The mean of the sampling distribution of x¯ is always equal to the mean of the population distribution.
A population distribution has a Normal shape with mean μ = 50 and standard deviation σ = 4. Consider two sampling distributions from this population distribution. Sampling distribution #1 is created from the sample means from all possible random samples of size n = 8; sampling distribution #2 is created from the sample means from all possible random samples of size 64. The mean of sampling distribution #1 is _____________ the mean of sampling distribution #2. - greater than - slightly greater than - equal to - less than
equal to The mean of the sampling distribution of x¯ is always equal to the mean of the population distribution. So, the means are equal.
From a computer simulation of rolling a fair die ten times, the following data were collected on the showing face: 5 5 1 3 2 1 5 6 5 1 The probability of rolling 4 is __________. - 0 - 0.5 - twice the probability of rolling a two - equal to the probability of rolling anything else
equal to the probability of rolling anything else Not having seen a four on these 10 rolls does not mean it won't happen. The die is fair.
Statistical inference is used to: - compute a sample statistic. - estimate an unknown population parameter. - describe a sampling distribution.
estimate an unknown population parameter.
If the height of 20-year-old American women is Normally distributed with a mean of 64 inches and a standard deviation of 3 inches, then the sampling distribution for the average height of 36 randomly chosen women will be: - exactly Normal, with a mean of 64 inches and a standard deviation of 0.5 inch. - approximately Normal, with a mean of 64 inches and a standard deviation of 0.5 inch. - approximately Normal, with a mean of 64 inches and a standard deviation of 3 inches. - exactly Normal, with a mean of 64 inches and a standard deviation of 3 inches.
exactly Normal, with a mean of 64 inches and a standard deviation of 0.5 inch.
How do you correctly interpret a confidence interval?
Give the confidence level, make sure the parameter is stated correctly and in context, and give the interval. (Ex.) We are 90% confident that the true mean weight of all black bears is between 344.8 and 361.2 pounds.
Suppose that a regulatory agency will propose that Congress cut federal funding to a metropolitan area if its mean level of NOx is unsafe—that is, if it exceeds 5.0 ppt. The agency gathers sample NOx concentrations on 60 different days and calculates a test of significance to assess whether the mean level of NOx is greater than 5.0 ppt. The implication of a Type II error here is providing federal funding when in fact the level of NOx is __________ than 5.0 ppt.
greater Continuing to provide funding means the null hypothesis was not rejected here, when in fact the air is unsafe. This is a Type II error.
A population distribution is very left skewed with mean μ = 40 and standard deviation σ = 10. Consider two sampling distributions from this population distribution. Sampling distribution #1 is created from the sample means from all possible random samples of size n = 9; sampling distribution #2 is created from the sample means from all possible random samples of size 81. How do the standard deviations compare? The standard deviation of sampling distribution #1 is __________ than the standard deviation of sampling distribution #2.
greater The standard deviation of the sampling distribution of x¯ equals σ/√n. The standard deviation for sampling distribution #1 is 10/√9 = 3.33 and the standard deviation for sampling distribution #2 is 10/√81 = 1.11.
A population distribution has a Normal shape with mean μ = 50 and standard deviation σ = 4. Consider two sampling distributions from this population distribution. Sampling distribution #1 is created from the sample means from all possible random samples of size n = 8; sampling distribution #2 is created from the sample means from all possible random samples of size 64. How do the standard deviations compare? The standard deviation of sampling distribution #1 is __________ than the standard deviation of sampling distribution #2.
greater The standard deviation of the sampling distribution of x¯ equals σ/√n. The standard deviation for sampling distribution #1 is 4/√8 = 1.414 and the standard deviation for sampling distribution #2 is 4/√64 = 0.5.
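The standard deviation of the sample mean is the population standard deviation divided by the square root of the sample size. As a quick check, the sketch below evaluates that formula for the (σ, n) pairs in this card and the preceding one; the helper name `sd_of_sample_mean` is illustrative.

```python
from math import sqrt

def sd_of_sample_mean(sigma, n):
    """Standard deviation of x-bar for samples of size n."""
    return sigma / sqrt(n)

# (population sd, sample size) pairs taken from the two cards
cases = [(10, 9), (10, 81), (4, 8), (4, 64)]
results = {(sigma, n): sd_of_sample_mean(sigma, n) for sigma, n in cases}

for (sigma, n), sd in results.items():
    print(f"sigma = {sigma}, n = {n}: sd of x-bar = {sd:.3f}")
```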
We fail to reject the null hypothesis whenever the P-value is __________ α.
greater than
Mensa is the "high IQ" society. Their rules for eligibility for membership state that an individual must have an IQ in the upper 2%, which corresponds to a z-score of 2.05. The Wechsler Adult Intelligence Scale is approximately Normal with mean 100 and standard deviation 15. On this scale, an IQ _____ 130 will qualify. (Mensa does not use decimal points in their eligibility scores.) - less than or equal to - greater than - less than - greater than or equal to
greater than 100 + 2.05(15) = 130.75, so 130 is too low to be a qualifying member of Mensa.
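The cutoff in this card comes from converting a z-score back to the original scale, x = mean + z × sd. The sketch below repeats that arithmetic and, as a cross-check that is not part of the card, also asks the Normal model for its 98th percentile directly.

```python
from statistics import NormalDist

mean, sd = 100, 15

# Cutoff using the z-score quoted in the question
cutoff_from_card = mean + 2.05 * sd

# Cross-check: the 98th percentile of the Normal(100, 15) model
cutoff_from_model = NormalDist(mean, sd).inv_cdf(0.98)

print(f"cutoff from z = 2.05:   {cutoff_from_card:.2f}")
print(f"98th percentile (model): {cutoff_from_model:.2f}")
```

Both cutoffs fall between 130 and 131, so a whole-number score has to be greater than 130 to qualify.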
If a z-score is negative, the area to its right is __________ 0.5. - equal to - less than - greater than - approximately
greater than If a z-score is negative, the value is below the mean. The area to the right will be more than 0.5.
Researchers worry that exposure to pollutants from an environmental waste site may lead to an increase in the incidence of Disease X among residents of the nearby community. In a random sample of 8360 community members, 418 people had this disease. Among members of this community, the odds of having the disease are __________ 5%. - less than - greater than - exactly - half of
greater than The proportion of community members with the disease is 418/8360 = 0.05. The proportion of community members without the disease is: 1-0.05 = 0.95. To find the odds of the disease divide: 0.05/0.95 = 0.0526 which rounds to 0.053.
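The distinction between a proportion and odds in this card can be sketched in a few lines. Odds compare the number with the disease to the number without it, so they are always larger than the proportion when the proportion is between 0 and 1.

```python
cases = 418     # community members with Disease X
n = 8360        # community members sampled

proportion = cases / n          # share of the sample with the disease
odds = cases / (n - cases)      # with-disease count over without-disease count

print(f"proportion = {proportion:.4f}")
print(f"odds       = {odds:.4f}")
```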
Aldo takes a nationally administered college readiness test and finds out his z-score was 1.7. Does this mean Aldo's score was lower or higher than the national average? His score was __________ than the national average.
higher Aldo's z-score was 1.7, this is a positive number, so Aldo's score was higher than the national average.
The sampling distribution of x¯ tells us .....
how close to μ the sample mean x¯ is likely to be
Increasing the level of confidence will __________ the sample size required to maintain the same margin of error.
increase Increasing the level of confidence results in an increase in z*, which requires a larger sample size to keep the same margin of error.
Decreasing the sample size __________ the margin of error.
increases Decreasing the sample size will increase the margin of error.
Increasing the level of confidence __________ the margin of error.
increases Increasing the confidence level results in an increase in z* which increases the margin of error and the width of the confidence interval.
For a specified sample size, increasing α __________ power.
increases Increasing α (changing from 5% to 10%, for example), results in rejecting a false null hypothesis more often, thus automatically increasing power.
Vigorous exercise helps people live several years longer on average. It is not clear whether mild activities like slow walking extend life. Suppose that the added life expectancy from a regular slow walk is just two months. A statistical test is more likely to find a significant increase in mean life if: - it is based on a very large random sample. - it is based on a very small random sample. - The size of the sample does not have any effect on the significance of the test.
it is based on a very large random sample
Very small population effects can be statistically significant if the sample size is __________.
large Small effects may still be statistically significant if the sample size is large.
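To see why a tiny effect can become statistically significant, it helps to watch the z statistic grow with n. The sketch below is illustrative only: the two-month effect comes from the slow-walking card, but the standard deviation of 120 months is a hypothetical value, and the calculation assumes the sample mean lands exactly on the true effect.

```python
from math import sqrt
from statistics import NormalDist

effect = 2     # months of added life expectancy
sigma = 120    # hypothetical sd of lifespan in months (an assumption)

def one_sided_p(n):
    """One-sided P-value if the observed mean increase equals the true effect."""
    z = effect / (sigma / sqrt(n))
    return 1 - NormalDist().cdf(z)

p_values = {n: one_sided_p(n) for n in (100, 10_000, 40_000)}
for n, p in p_values.items():
    print(f"n = {n:>6}: P-value = {p:.4f}")
```

The effect size never changes in this sketch; only the standard error shrinks, which is why statistical significance alone says little about practical importance.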
The length of human pregnancies from conception to birth is known to be normally distributed with a mean of 266 days and standard deviation of 16 days. A "nine month" pregnancy might be considered as 274 days (four months have 30 days, and February has 28 in most years). What proportion of pregnancies last less than "9 months"? First, find the z-score for 274: (274 - 266)/16 = 0.5. Then, find the area to the __________ of 0.5, using the table. The proportion of pregnancies that last less than "9 months" is 0.6915.
left The z-score for 274 days is (274 - 266)/16 = 0.5. From the table, the area to the left of 0.5 is 0.6915.
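The table lookup in this card corresponds to evaluating the Normal cumulative distribution function at 274 days. A minimal sketch using the standard library:

```python
from statistics import NormalDist

pregnancy = NormalDist(mu=266, sigma=16)   # days, from the card

z = (274 - 266) / 16                # z-score for a "nine month" pregnancy
area_left = pregnancy.cdf(274)      # proportion lasting less than 274 days

print(f"z = {z}, area to the left = {area_left:.4f}")
```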
A population distribution is flat (or uniform) with mean μ = 2 and standard deviation σ = 0.4. The sampling distribution of x¯ is created from the sample means from all possible samples of size 64. How does the standard deviation of this sampling distribution compare with the standard deviation of the population distribution? The standard deviation of the sampling distribution is __________ the standard deviation of the population distribution.
less than The standard deviation of the sampling distribution of x¯ equals σ/√n = 0.4/√64 = 0.05, which is less than the standard deviation of the population distribution, σ = 0.4.
John takes a nationally administered college readiness test and finds out his z-score was -2.1. Does this mean John's score was lower or higher than the national average? His score was __________ than the national average.
lower John's z-score was -2.1, this is a negative number, so John's score was less than the national average.
A confidence interval is of the form "estimate ± __________."
margin of error We add and subtract the margin of error from the sample estimate to get a confidence interval estimate of a parameter.
A Normal distribution is specified by which of the following numbers? - Mean and range - Median and standard deviation - Mean and standard deviation - Mean and median
mean and standard deviation
A population distribution is very left skewed with mean μ = 40 and standard deviation σ = 10. Consider two sampling distributions from this population distribution. Sampling distribution #1 is created from the sample means from all possible random samples of size n = 9; sampling distribution #2 is created from the sample means from all possible random samples of size 81. The shape of #1 is __________ skewed than #2. - less - more - equally - similarly
more According to the central limit theorem, both sampling distributions are closer to Normal than the population distribution. But sampling distribution #1 with sample size 9 is not approximately Normal. Sampling distribution #2 with sample size 81 is approximately Normal. The shape of #1 is more skewed.
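The claim that the n = 9 sampling distribution stays more skewed than the n = 81 one can be explored with a small simulation. This is a sketch under stated assumptions: the card does not say what the left-skewed population is, so the code uses a hypothetical one (50 minus an exponential variable with mean 10), which has mean 40 and standard deviation 10. The seed, the number of repetitions, and the helper names are arbitrary choices.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(12345)  # fixed seed so the run is repeatable

def draw():
    # Hypothetical left-skewed population with mean 40 and sd 10
    return 50 - rng.expovariate(1 / 10)

def sample_means(n, reps=5000):
    """Means of `reps` random samples, each of size n."""
    return [fmean([draw() for _ in range(n)]) for _ in range(reps)]

def skewness(values):
    """Average cubed standardized deviation (negative means left skew)."""
    m = fmean(values)
    s = pstdev(values, mu=m)
    return fmean([((v - m) / s) ** 3 for v in values])

means9 = sample_means(9)
means81 = sample_means(81)
skew9, skew81 = skewness(means9), skewness(means81)

print(f"n = 9:  mean {fmean(means9):.2f}, sd {pstdev(means9):.2f}, skew {skew9:.2f}")
print(f"n = 81: mean {fmean(means81):.2f}, sd {pstdev(means81):.2f}, skew {skew81:.2f}")
```

Both sets of sample means should center near 40, with the n = 81 set showing a smaller spread and a skewness closer to zero.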
Oscar and Felix are handicapping next year's NFL season. Oscar claims that the chance the Indianapolis Colts will win the Super Bowl is 30%, and the chance that the New England Patriots win is 20%. Felix says the chance that New Orleans will win the Super Bowl is 25%, the chance that the Dallas Cowboys will win is 20%, and the chance that the Chicago Bears will win is 10%. Whose probabilities are consistent? - Felix - Oscar - both - neither
neither Neither has probabilities that sum to 100%.
A requirement of our inference methods is that the sample has a __________ distribution. - skewed - binomial - normal - uniform
normal Normality is a requirement to use z.
An outcome that would rarely happen if a claim were true is good evidence that the claim is ___________.
not true An observed outcome that would rarely happen if a claim were true is good evidence that the claim is not true and thus should be rejected.
If P-value > α, we fail to reject the __________ hypothesis.
null Large P-values do not give evidence against the null hypothesis or for the alternative hypothesis, so we "fail to reject" the null hypothesis.
The P-value is the probability of obtaining a statistic as extreme or more extreme than what was actually observed if the __________ hypothesis were true.
null The P-value is the probability of obtaining certain values of the statistic, given that the null hypothesis is true, not a probability on either hypothesis
In a test of significance we typically want to find evidence against the __________ hypothesis.
null This is the hypothesis that the test assesses the strength of evidence against.
A consumer advocate is interested in evaluating the claim that a new granola cereal contains 4 ounces of cashews in every bag. The advocate recognizes that amounts of cashews will vary slightly from bag to bag, but she suspects that the mean amount of cashews per bag is less than 4 ounces. To check the claim, the advocate purchases a random sample of 40 bags of cereal and calculates a sample mean of 3.68 ounces of cashews. The consumer advocate should declare statistical significance only if there is a small probability of __________. - observing a sample mean less than 4 ounces when µ = 3.68 ounces - observing a sample mean of exactly 3.68 ounces when µ = 4 ounces - observing a sample mean of 3.68 ounces or less when µ = 4 ounces - observing a sample mean of 3.68 ounces or greater when µ = 4 ounces
observing a sample mean of 3.68 ounces or less when µ = 4 ounces Results are statistically significant when the sample results are unlikely if the null hypothesis were true. In this case, unlikely is having a small probability of observing a sample mean of 3.68 ounces or less if µ were 4 ounces.
An endocrinologist is interested in the effects of depression on the thyroid. It is believed that healthy subjects have a mean thyroxin (a hormone related to thyroid function) level of 7.0 micrograms/100 ml and a standard deviation of 1.6 micrograms/100 ml. The endocrinologist wants to assess whether the mean thyroxin level is different for those with depression. She samples 35 subjects with depression and obtains a sample mean of 7.82 micrograms/100 ml for thyroxin. She should declare statistical significance only if there is a small probability of __________. - observing a sample mean of 7.82 or greater when µ = 7. - observing a sample mean of exactly 7.82 when µ = 7. - observing a sample mean of 7.82 or more extreme when µ = 7.0 micrograms/100 ml. - observing a sample mean less than 7 when µ = 7.82.
observing a sample mean of 7.82 or more extreme when µ = 7.0 micrograms/100 ml. Results are statistically significant when the sample results are unlikely if the claim were true. In this case, unlikely is having a small probability of observing a sample mean of 7.82 or more extreme if µ were 7.
The alternative hypothesis is _____ if it states that a parameter is larger than or that it is smaller than the null hypothesis value
one-sided
An observed __________ that would rarely happen if a claim were true is good evidence that the claim is not true.
outcome An outcome that would rarely happen if a claim were true is good evidence that the claim is not true.
An event is random if individual __________ are uncertain but happen in a predictable manner through time.
outcomes This is the definition of randomness. We do not know what outcome will happen on any one trial, but over the long run, a pattern of sorts emerges.
In a Normal quantile plot, points that are far away from the overall pattern of the plot represent __________.
outliers Outliers will appear as points that are far away from the overall pattern of the other points.
The mean of the sampling distribution of p̂ is _____.
p p̂ is an unbiased estimator of p. This means that the mean of the sampling distribution of p̂ is p, the population proportion of successes.
A numerical value summarizing information about an entire population is called a __________.
parameter A parameter is a number that summarizes information about the entire population. It is not a word phrase describing the individuals.
A test of significance uses evidence provided by sample data to assess whether a claim about a population __________ is supported or refuted.
parameter Hypothesis tests are all about the believability of claims about population parameters.
The average IQ score among adults is 100. The number 100 is a __________.
parameter It is a parameter because it is a number that summarizes information about the entire population, the population being all adults.
Your friends follow the Atlanta Braves baseball team. Donna says the chance they will make the playoffs this year is 75%. This is an example of __________ probability.
personal The number is based on her beliefs, not a long series of repeated trials, so it is personal probability.
The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples that could be taken from the __________.
population This is the definition of a sampling distribution.
For random samples of size 7 from a population, the mean of the sample means from all possible samples is equal to the __________ mean.
population What a wonderful fact! The mean of all possible sample means equals the mean of the population.
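This fact can be verified by brute force on a small population. The sketch below uses a made-up population of 10 values and lists every possible sample of size 7 drawn without replacement; the population values are an assumption for illustration only.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical small population (values are made up for illustration).
population = [3, 8, 12, 15, 21, 22, 30, 41, 47, 50]
n = 7  # sample size, as in the card above

def exact_mean(values):
    """Mean as an exact fraction, to avoid rounding error."""
    values = list(values)
    return Fraction(sum(values), len(values))

# One sample mean for every possible sample of size n.
sample_means = [exact_mean(s) for s in combinations(population, n)]

mean_of_sample_means = exact_mean(sample_means)
population_mean = exact_mean(population)

print(mean_of_sample_means == population_mean)
```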
Hypotheses always refer to a _____ , not to a particular outcome. Be sure to state H0 and Ha in terms of _____.
population ; population parameters
If the area to the right of a z-score is less than 0.5, the z-score is __________. - negative - positive - 0 - less than 0.5
positive If the area to the right is less than 0.5, the value is above the mean so the z-score is positive.
A __________ model is a mathematical description of a random process.
probability Probability models specify what can happen (the sample space), and give a way of assigning probabilities.
The label on a can of a particular brand of extra large olives states that there are about 33 olives in each can. A gourmet cook feels that the claim of 33 olives per can is too high, and that the average number of olives per can is less than 33. He samples 35 cans and finds x¯ = 32.9 with a P-value of 0.119 for testing H0: µ = 33 versus Ha: µ < 33. Is the following interpretation of this P-value correct or incorrect? The P-value is the __________ of observing a sample mean less than or equal to 32.9 olives, assuming that the true mean number of olives per can is 33 olives.
probability This interpretation is correct because it has both of the following: (1) the probability is about getting the sample statistic (sample mean of 32.9 olives or less) and (2) the assumption is on the null hypothesis being true (true mean is 33 olives).
The probability of a particular outcome is the __________ of times the outcome would happen in a very long series of repetitions.
proportion
Suppose that a regulatory agency will propose that Congress cut federal funding to a metropolitan area if its mean level of NOx is unsafe—that is, if it exceeds 5.0 ppt. The agency gathers sample NOx concentrations on 60 different days and calculates a test of significance to assess whether the mean level of NOx is greater than 5.0 ppt. Which of the following best describes the implications of a Type II error? - cutting federal funding when in fact the level of NOx is greater than 5.0 ppt - providing federal funding when in fact the level of NOx is equal to 5.0 ppt or less - cutting federal funding when in fact the level of NOx is equal to 5.0 ppt or less - providing federal funding when in fact the level of NOx is greater than 5.0 ppt
providing federal funding when in fact the level of NOx is greater than 5.0 ppt Continuing to provide funding means the null hypothesis was not rejected here, when in fact the air is unsafe. This is a Type II error.
The sample mean, x¯, is a __________ variable.
random Because the sample mean is a random variable, we model its behavior with a probability model.
An event is __________ if individual outcomes are uncertain but happen in a predictable manner through time.
random This is the definition of randomness. We do not know what will happen on any one trial, but over the long run, a pattern of sorts emerges.
The margin of error includes only: - error caused by non-response bias. - error caused by the wording of the question. - random variation in the act of selecting an SRS. - error caused by the act of selecting a convenience sample.
random variation in the act of selecting an SRS.
If P-value < α, we __________ the null hypothesis.
reject Small P-values give evidence against the null hypothesis or conversely for the alternative hypothesis.
We are testing a null hypothesis that a production line is "in control" (meaning that output is satisfactory) versus an alternative hypothesis that it has gone out of control. The consequence of rejecting the null hypothesis is having to shut down to adjust the machinery. This results in a loss of production time and money. If this production line is making blood pressure medication, we'd want to set the α level __________ when the test is concerned with the amount of active drug in each pill.
relatively high This would be a case where the consequences of not rejecting a false null hypothesis could be severe. We would NOT want a low α level in this case. The wrong dosage in the pills could have severe consequences (including death) for the people who take these!
The length of human pregnancies from conception to birth is known to be normally distributed with a mean of 266 days and standard deviation of 16 days. A pregnancy longer than "nine months" might be considered as 274 days or longer (four months have 30 days, and February has 28 in most years). What proportion of pregnancies last more than "9 months?" First, find the z-score for 274: (274 - 266)/16 = 0.5. Then, find the area to the __________ of 0.5, using the table. The proportion of pregnancies that last more than "9 months" is 0.3085.
right The z-score for 274 days is (274 - 266)/16 = 0.5. From the table, the area to the right of 0.5 is 1 - 0.6915 = 0.3085.
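The table lookup can be reproduced with Python's standard library. This is a minimal sketch; the variable names are illustrative, and `NormalDist` stands in for the printed Normal table.

```python
from statistics import NormalDist

mean_days = 266   # mean pregnancy length from the card
sd_days = 16      # standard deviation from the card
cutoff = 274      # "nine months" in days

# Standardize the cutoff, then take the area to its right.
z = (cutoff - mean_days) / sd_days
area_right = 1 - NormalDist().cdf(z)

print(z, round(area_right, 4))
```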
A group of researchers wanted to know if there was a difference in average yearly income taxes paid between residents of two very large cities in the Midwestern United States. The average for the first city was $6,505 and for the second city, it was $6,511. The difference provided a P-value of 0.0007. These results are NOT practically significant because the actual difference in the two sample means is too __________.
small A small P-value means the result is statistically significant. That does not necessarily mean it is practically meaningful.
Even large population effects can fail to be statistically significant if the sample size is __________.
small Large population effects can fail to be statistically significant if the sample size is small.
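A rough sketch of why sample size matters: the same observed effect is tested with a small and a large sample. The effect size, σ, and sample sizes below are hypothetical, and a two-sided one-sample z test with known σ is assumed.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical numbers, for illustration only.
observed_effect = 10   # sample mean minus null value
sigma = 20             # known population standard deviation (assumed)

def two_sided_p_value(effect, sigma, n):
    """P-value of a two-sided one-sample z test."""
    z = effect / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

p_small_n = two_sided_p_value(observed_effect, sigma, n=4)
p_large_n = two_sided_p_value(observed_effect, sigma, n=100)

print(p_small_n, p_large_n)
```

With only 4 observations the effect is not significant at the 5% level, while with 100 observations the same effect is.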
P-values that are ___________ give evidence against the null hypothesis.
small Small P-values give evidence against the null hypothesis and for the alternative hypothesis. These indicate our data (and statistics calculated from them) were unlikely to happen if the null hypothesis is true.
The standard deviation of the sampling distribution of x¯ gets ________ as the sample size increases.
smaller The sample size is in the denominator of the formula for the standard deviation of the sampling distribution of x¯, which equals σ/√n. Thus, the standard deviation gets smaller as the sample size increases.
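A quick numerical look at σ/√n, using a hypothetical σ = 16 and a few sample sizes chosen so the square roots are exact:

```python
from math import sqrt

sigma = 16                        # hypothetical population standard deviation
sample_sizes = [4, 16, 64, 256]   # each is 4 times the previous one

# Standard deviation of the sampling distribution of the sample mean.
standard_deviations = [sigma / sqrt(n) for n in sample_sizes]

print(standard_deviations)  # [8.0, 4.0, 2.0, 1.0]
```

Quadrupling the sample size halves the standard deviation of x¯.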
The only component of the margin of error in a confidence interval for μ that a researcher CANNOT manipulate is the ________________.
standard deviation A researcher can select the confidence level and sample size. A researcher has no control over the population standard deviation, σ. Variability is what it is and the best we can do is measure it.
The alternative hypothesis should always be: - set by someone other than the researcher. - stated after the researcher has looked at sample data. - stated before the researcher has looked at the sample data.
stated before the researcher has looked at the sample data.
A sampling distribution is a distribution of a(n) __________.
statistic A sampling distribution gives the distribution of a statistic.
The average number of hours of sleep per night was 9.46 hours for a sample of 104 five to seven year-old children. The number 9.46 is a __________.
statistic It is a statistic because it is a number that can be computed from the sample of 104 children.
The average amount of caffeine consumed per day for a sample of 97 eight to twelve year-olds was 78 mg. The number 78 is a __________.
statistic It is a statistic because it is a number that can be computed from the sample of 97 children.
If P-value < α, the results are __________.
statistically significant
If the P-value is as small as or smaller than α, we say that the data are _____ at level α.
statistically significant
A group of researchers wanted to know if there was a difference in average yearly income taxes paid between residents of two very large cities in the Midwestern United States. The average for the first city was $6,505 and for the second city, it was $6,511. The difference provided a P-value of 0.0007. The difference is __________. - statistically significant and practically significant - not statistically significant but practically significant - neither statistically nor practically significant - statistically significant but NOT practically significant
statistically significant but NOT practically significant A difference of only $6 in over $6000 is less than 0.1%. This is probably too small to really matter.
In a matched pairs experiment, subjects pushed a button as quickly as they could after taking a caffeine pill and also after taking a placebo pill. The mean pushes per minute were 283 for the placebo and 311 for caffeine. The numbers 283 and 311 are: - statistics. - confounded. - random numbers. - parameters.
statistics
A Normal distribution is: - skewed right. - bimodal. - symmetric. - skewed left.
symmetric
α denotes level of significance for the statistical _________.
test α is the symbol for the level of significance. The P-value is sometimes called the "observed significance."
The most important condition for reaching sound conclusions from statistical inference is usually: - that the data can be thought of as a random sample from the population of interest. - that the population distribution is exactly Normal. - that no outliers are found in the sample.
that the data can be thought of as a random sample from the population of interest.
The survival times of guinea pigs inoculated with an infectious viral strain vary from animal to animal. The distribution of survival times is strongly skewed to the right. The central limit theorem says that: - as we study more and more infected guinea pigs, their average survival time gets even closer to the mean μ for all infected guinea pigs. - the average survival time of a large number of infected guinea pigs has a distribution of the same shape (strongly skewed) as the distribution for individual infected guinea pigs. - the average survival time of a large number of infected guinea pigs has a distribution that is close to Normal.
the average survival time of a large number of infected guinea pigs has a distribution that is close to Normal.
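The central limit theorem can be illustrated by simulation. The sketch below uses an exponential distribution as a stand-in for strongly right-skewed survival times; that choice, the sample size of 50, and the number of repetitions are all assumptions, not part of the original problem.

```python
import random
from statistics import mean, pstdev

def skewness(values):
    """Average cubed z-score; near 0 for a symmetric distribution."""
    m = mean(values)
    s = pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

rng = random.Random(0)  # fixed seed so the run is repeatable

# 2000 individual "survival times" from a right-skewed distribution.
individuals = [rng.expovariate(1.0) for _ in range(2000)]

# 2000 averages, each over a sample of 50 individuals.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(50)])
    for _ in range(2000)
]

print(skewness(individuals), skewness(sample_means))
```

The averages should show much less skew than the individual values, which is what "close to Normal" predicts.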
The area under a density curve above a range of values tells us __________. - the average value of the variable - the proportion of sample values in the range - the proportion of all possible observations in the range - how often each of the values occurs
the proportion of all possible observations in the range
In a distribution, the cumulative probability for a value x is __________. - the proportion of observations greater than or equal to x - the proportion of observations smaller than or equal to x - the number of observations smaller than or equal to x - the number of observations greater than or equal to x
the proportion of observations smaller than or equal to x. The proportion of observations that lie at or below x gives the cumulative probability of x. Observations at or below x are smaller than or equal to x.
When comparing different studies, it is more useful to know __________. - the count of successes in each study - the proportion of successes in each study - the count and proportion are both equally useful - Neither the count nor the proportion are useful.
the proportion of successes in each study The count of successes in a study is meaningful only if we know the total number of observations in the study. The proportion of successes is a much more useful statistic when comparing the results of multiple studies.
What is captured by the confidence interval?
the true mean of the population, μ
hypothesis tests (or sometimes significance tests), goal:
to assess the strength of the evidence provided by data against some claim concerning a population
In a Normal distribution, the mean equals the median. - true - false
true
At a certain university, the National Science Foundation awarded a large grant to create environmental science laboratory courses. The purpose of these was to educate students about the impacts of certain activities on the environment. In assessing the impact of the courses on students' attitudes, a special survey was administered the first few semesters the courses were taught. The data from these were analyzed using multiple hypothesis tests. In all, 51 tests were performed to attempt to connect student demographics with increased environmental awareness. Three of the test results were significant at the 5% level. We should exercise caution in looking at these results because at the 5% level, we'll expect 0.05*51 = 2.55 "significant" test results even if all the null hypotheses of "no difference" are _______.
true An α level of 5% means results this extreme will happen 5% of the time by chance. Sometimes we get the "unlucky" sample (one in the tails of the distribution). Finding only three "significant" results is not strong evidence of an impact on the students.
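The arithmetic above can be extended slightly. The sketch below computes the expected number of false positives and, under the added assumption that the 51 tests are independent (not stated in the problem), the chance of seeing three or more "significant" results when every null hypothesis is true.

```python
from math import comb

alpha = 0.05      # significance level for each test
num_tests = 51    # number of tests performed

# Expected number of "significant" results if all nulls are true.
expected_false_positives = alpha * num_tests

def prob_at_least(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p); assumes independent tests."""
    return 1 - sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min))

p_three_or_more = prob_at_least(3, num_tests, alpha)

print(round(expected_false_positives, 2), p_three_or_more)
```

Under these assumptions, three or more significant results is not an unusual outcome, which supports the caution in the answer.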
If the symbol ≠ is found in the alternative hypothesis, we declare the test to be _____-sided.
two ≠ could be either smaller or larger so it is two-sided.
The alternative hypothesis is _____ if it states that the parameter is different from the null value (it could be either smaller or larger).
two-sided
A 95% confidence interval for the mean normal body temperature (in ºF) was computed as (98.123, 98.375), based on a sample of 130 observations of body temperature. The correct interpretation of this interval is: - we got the number using a method that gives correct results 95% of the time. - 95% of observations in the sample have body temperature between 98.123 and 98.375 ºF. - 95% of the individuals in the population should have body temperature between 98.123 and 98.375 ºF.
we got the number using a method that gives correct results 95% of the time.
Suppose we are testing H0: µ = 100 versus Ha: µ < 100. Which of the following will have the largest P-value? - x¯ = 70 - x¯ = 75 - x¯ = 80 - x¯ = 85
x¯ = 85 Since this test is one-sided with less than in the alternative, the x¯ value of 85 is the closest to µ = 100 and will have the largest P-value.
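To see the ordering numerically, the sketch below computes one-sided P-values for each x¯. The problem gives no σ or n, so σ = 60 and n = 16 are hypothetical values, and a z test with known σ is assumed.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 100    # null hypothesis value from the card
sigma = 60   # hypothetical population standard deviation
n = 16       # hypothetical sample size

standard_error = sigma / sqrt(n)

# For Ha: mu < mu0, the P-value is the area to the LEFT of the z statistic.
p_values = {
    x_bar: NormalDist().cdf((x_bar - mu0) / standard_error)
    for x_bar in (70, 75, 80, 85)
}

largest = max(p_values, key=p_values.get)
print(p_values, largest)
```

Whatever σ and n are, the sample mean closest to 100 gives the z statistic closest to 0 and therefore the largest P-value.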
If you picked different samples from a population, .....
you would probably get different sample means and virtually none of them would actually equal the true population mean, μ.