Stats Exam 2
A researcher takes a random sample of 500 young men between the ages of 18 and 20 and calculates the mean NAEP math score to be 280. The standard deviation of math scores for the sample of young men was 25. a. What are the appropriate null and alternative hypotheses? b. What would be an example of a Type I error in the context of this problem?
a. H0: μ = 275 vs. Ha: μ > 275 b. Believing that the mean NAEP math score is higher than 275 when it is not actually higher
Suppose we are testing the following hypotheses: H0: It is not your friend's birthday. Ha: It is your friend's birthday. a. What is a type I error for these hypotheses? b. What constitutes a type II error for these hypotheses?
a. Saying "Happy Birthday" when it isn't their birthday. (rejecting Ho when it is true) b. Saying nothing when it is in fact their birthday.
H0: The patient is healthy. Ha: The patient has a medical problem. a. Which of the following describes a Type I error in this situation? b. What describes statistical power in this situation?
a. Sending a healthy patient to the doctor. b. The probability of sending a patient with a medical problem to the doctor.
The weight of a carton of a dozen eggs produced by a certain breed of hens is supposed to be normally distributed with a mean of 780 grams. A quality manager randomly checks thirty-five cartons of eggs (n = 35) to see whether the mean weight differs from 780 grams. She finds x̄ = 796. a. Parameter of interest? b. What are the null and alternative hypotheses? c. Which of the following shows the conditions that must be met by one-sample t procedures? d. Assume that this study is a two-sided test. A 95% confidence interval estimate was computed to be (787.82 grams, 804.18 grams). On the basis of this interval, at α = 0.05, what can she conclude about H0: μ = 780 versus Ha: μ ≠ 780? e. Assume that the p-value was calculated to be 0.03. Interpret this p-value in context.
a. The mean weight of all cartons of eggs produced by a certain breed of hens b. H0: μ = 780 vs. Ha: μ ≠ 780 c. Normality of the population or large sample size, and randomness in the data collection d. Reject H0 since 780 is outside the given interval. e. The probability of getting a sample mean as extreme as or more extreme than 796 g is 0.03, assuming the population mean is 780 g.
Key word in interval testing (statistic and proportion)?
between
Relationship between β and n
Inversely related: as n increases, β decreases (and power increases)
The purpose of a confidence interval is to provide _________.
plausible values that a parameter could take
What is the first step in statistical hypothesis testing?
stating the claims
x̅ ± t* (s/sqrtn) where is the margin of error in this equation
t* (s/sqrtn)
Standard error of x̄ refers to
the estimate of the standard deviation of the sampling distribution of x̄
A certain brand of fishing line claims to have an average breaking strength of 30 pounds. A group of fishermen become angry because this brand of line seems to break so easily and test 25 randomly selected lines of this brand. The mean breaking strength is 27.994 pounds with a standard deviation of 0.846 pounds. A plot of the data follows. What parameter is being estimated?
μ = the true mean breaking strength of this brand of fishing line
The Federal Pell Grant Program provides need-based grants to low-income undergraduate and certain postbaccalaureate students to promote access to postsecondary education. According to the National Postsecondary Student Aid Study conducted by the U.S. Department of Education in 2008, the average Pell grant award for 2007-2008 was $2,600. Assume that the standard deviation, σ, in Pell grant awards was $500 and that the distribution of awards is left skewed. --Suppose we take a sample of size n = 50 from this same population, and we calculate x̅ = $2,800. How many standard deviations, (σ/ √n), away from μ is this sample mean? -----How do you solve this?
----Solved via: (x̅-μ)/(σ/ √n) where x̅=2800 n=50 and mu and sigma are given. You just solve the above equation and it'll give you SDs (2.83)
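The calculation above can be checked with a short Python sketch. The function name is made up for illustration; the numbers are the ones given in the question.

```python
import math

def standard_errors_from_mean(x_bar, mu, sigma, n):
    """How many standard deviations of x-bar (sigma / sqrt(n)) x_bar lies from mu."""
    return (x_bar - mu) / (sigma / math.sqrt(n))

z = standard_errors_from_mean(x_bar=2800, mu=2600, sigma=500, n=50)
print(round(z, 2))  # 2.83
```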
What is the Statistical Dogma for Processes?
-All processes have natural variation (common causes or normal sources of variation) -All processes occasionally susceptible to unnatural variation (special causes or assignable sources of variation)
Relationship between α, β, and power?
-α and power move together (raising α raises power; lowering α lowers power) -α and β are inversely related (as one goes up, the other goes down) -Power = 1 − β
The Federal Pell Grant Program provides need-based grants to low-income undergraduate and certain postbaccalaureate students to promote access to postsecondary education. According to the National Postsecondary Student Aid Study conducted by the U.S. Department of Education in 2008, the average Pell grant award for 2007-2008 was $2,600. Assume that the standard deviation, σ, in Pell grant awards was $500. -1. Suppose we take random samples of size 75. What will the mean of the sampling distribution of x̄ be? -2. For samples of size 75, what will the standard deviation of the sampling distribution of x̄ be? -3. If we take samples of size 200 rather than 75, what will happen to the mean of the sampling distribution of x̄? -4. If we take samples of size 200 rather than 75, what will happen to the standard deviation of the sampling distribution of x̄? -5. If we take random samples of size 75 from this population, what will the shape of the sampling distribution of x̄ be?
1. 2600 (the mean of the sampling distribution of x̄ is the same as μ) 2. σ/(sqrt n) = 500/sqrt75 ≈ 57.74 3. Mean stays the same 4. SD decreases (500/(sqrt 200) ≈ 35.36) 5. Approx. normal (n = 75 is large enough for the CLT)
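To see how the spread of the sampling distribution shrinks as n grows, here is a minimal sketch using the Pell grant numbers. The helper name is hypothetical.

```python
import math

def sd_of_sample_mean(sigma, n):
    """Standard deviation of the sampling distribution of the sample mean."""
    return sigma / math.sqrt(n)

sd_75 = sd_of_sample_mean(500, 75)
sd_200 = sd_of_sample_mean(500, 200)
print(round(sd_75, 2), round(sd_200, 2))  # 57.74 35.36
```

The mean of the sampling distribution stays at 2600 for both sample sizes; only the standard deviation changes.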
Suppose we have an extremely left-skewed population with a mean of 45 and a standard deviation of 7. -1. For random samples of size 15, what will the shape of the sampling distribution of x̄ be? -2. Can we compute the probability that x̄ is less than 32 for a random sample of size 15? -3. Suppose we took samples of size 45. What will the shape of the sampling distribution of x̄ be?
1. not approx. normal--CLT does not apply 2. No, because the population is not Normally distributed and we cannot apply the Central Limit Theorem. 3. Approx. normal
What does a confidence interval give us?
A range of plausible values for the population parameter
The ________ hypothesis generally represents what the researcher wants to check, or suspects might actually be the case.
Alternative
We can never compute probabilities on x̅ when the population is skewed (T/F)
False-CLT
If we set α = 0.05, what can we do to increase power?
Increase sample size
Researchers want to estimate the amount of time teenagers spend watching television during one week. A random sample of 500 teenagers yielded a sample mean of 12.60 hours of television per week. -What type of statistical inference is being used?
Point Estimation
The sampling distribution of x̄ created from small random samples from a Normally distributed population is Normal (t/F)
T
If the p-value is less than α, then the results are statistically significant. T/F
T (rejecting Ho means the difference you're seeing is real and not due to chance)
A student takes a random sample of freshmen at BYU and records their age at their first kiss. He calculates a 95% confidence interval of (15.5, 21.2). a. What parameter is the student trying to estimate? b. Can we say that 95% of the ages are included in the interval (15.5, 21.2)?
a. The mean age at first kiss for all BYU freshmen b. No (the interval estimates the mean, not the range of individual ages)
The sampling distribution of x̅ gives _____ from all possible samples of the same size from the same population.
all x̅ values
The larger the sample size, the ________ the degrees of freedom, and the ________ the t distribution is to a normal z distribution.
higher, closer
What is the test statistic formula?
t = (x̅ - μ)/(s/sqrt(n)), where μ is the value claimed in H0
A test with ≠ in Ha is ____
two-sided test
The sample mean, ________, is used to estimate the population mean, _________.
x̄, μ
Equation for the standard deviation of the sampling distribution of x̄?
σ/(sqrt n)
Scores on the math portion of the SAT follow a Normal distribution with a mean of 507 and a standard deviation of 111. -What is the probability that any random sample of 4 students has an average SAT math score between 400 and 625?
-Between (take difference of both Z values) -Use z = (x̄-μ)/ (σ/sqrtn) for both where x̄ is 400 and 625. You'll then get the z numbers for both values, find the probability on z table and take the differences.
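The "between" steps can be sketched in Python with the standard library's `statistics.NormalDist` standing in for the z table. This is one way to check the table work, not the method the course requires.

```python
import math
from statistics import NormalDist

mu, sigma, n = 507, 111, 4

# Sampling distribution of x-bar: Normal with mean mu and SD sigma / sqrt(n)
x_bar_dist = NormalDist(mu, sigma / math.sqrt(n))

z_low = (400 - mu) / x_bar_dist.stdev
z_high = (625 - mu) / x_bar_dist.stdev
print(round(z_low, 2), round(z_high, 2))  # -1.93 2.13

# Area to the left of 625 minus area to the left of 400
p_between = x_bar_dist.cdf(625) - x_bar_dist.cdf(400)
print(round(p_between, 3))
```

The result comes out at roughly 0.956; a z table gives nearly the same value after rounding the z-scores to two decimals.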
What are the necessary aspects of the interpretation of confidence levels?
-Confidence level (must say: Confidence...this does not equal probability) -Parameter in Context -Calculated Interval
Scores on the math portion of the SAT follow a Normal distribution with a mean of 507 and a standard deviation of 111. -What is the probability that the mean SAT math score of a sample of 4 students is more than 600? ---What are the steps to solve this?
-It's a normal distribution so you can still find probability. -Use z = (x̄-μ)/ (σ/sqrtn) x̄=600 μ=507 n=4 σ=111 -This will give you a z-value of 93/55.5 = 1.68. But because it asked for GREATER THAN we find the value for Z of 1.68 on the chart, .9535, and subtract it from 1. So 1-.9535=.0465
What does the quantitative data example and categorical data example look like for point estimation?
-Quantitative: Based on a sample of n = 47 policies, we estimate that the average premium at this agency is approximately $1800. -Categorical: Based on a sample of n = 144 households, we estimate that the proportion of infected bamboo cutting boards is approximately 10.4%.
Suppose we have a very right skewed population distribution where μ = 80 and σ = 20. 1. For random samples of size n = 100, what is the mean of the sampling distribution of x̅? 2. For random samples of size n = 100, what is the shape of the sampling distribution of x̅? 3. For random samples of size n = 100, what is the standard deviation of the sampling distribution of x̅? 4. If all possible samples of size 200 are taken instead of 100, how would this change the mean and standard deviation of the sampling distribution of x̅?
1. 80 2. Approx. normal 3. SD of x̅ = σ/(sqrt(n)) = 20/sqrt(100) = 2 4. Mean stays the same and SD would decrease (plug n = 200 into the above formula)
Describe the 4 elements of tests of significance: 1. Claim 1 and Claim 2: 2. Outcome: 3. Assessment of Evidence: 4. Conclusion:
1. Claim 1 and Claim 2: opposing claims about an unknown parameter. Presumption is for claim 1 unless there is strong evidence against it. 2. Outcome: standardized outcome that measures how far the outcome diverges from claim 1 3. Assessment of Evidence: How likely is it to get this outcome if claim 1 is true? 4. Conclusion: An outcome that would rarely happen if claim 1 is true is good evidence that claim 1 is not true; hence we believe claim 2 is true.
A tire store advertises that the average price of a new set of their tires is only $150. One of their recent customers believes their advertised average is too low - that the true mean price for a set of tires exceeds $150. He plans to carry out a hypothesis test at α = 0.05. In order to perform the test, the customer took an SRS of 8 sets of tires recently sold. The mean of these sets of tires was x̅ = $156.90 and the standard deviation was s = $11.80. 1. What are the appropriate null and alternative hypotheses for this test? 2. Are the conditions of randomness and normality met for this test? How? 3. What are the appropriate degrees of freedom for this test? 4. What is the value of the t test statistic for this test? 5. Suppose the value for the t test statistic is t = 1.79. What is the p-value for a one-sided test? 6. Suppose the p-value of the test is found to be 0.0715 using statistical software. What is the appropriate conclusion at α = 0.05.
1. Ho: μ = 150 vs. Ha: μ > 150 2. Randomness is met (SRS), but normality is not verified, because the sample data was not plotted and it was not stated that the population is normal 3. 8-1=7 4. Use t = (x̅ - μ)/(s/sqrt(n)) = (156.90 - 150)/(11.80/sqrt(8)) = 1.65 5. On the t table, go to the DF=7 row and find the two values that t=1.79 falls between (1.415 and 1.895), then read the one-sided tail areas for those columns, which are .10 and .05, so 0.05 < p-value < 0.10 6. Fail to reject the null hypothesis. We do not have sufficient evidence to conclude that the mean price of a set of tires is greater than $150.
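Steps 4 and 5 can be sketched as follows. The critical values are the DF = 7 row of a standard t table (upper-tail areas), typed in by hand; the function names are illustrative only.

```python
import math

def t_statistic(x_bar, mu0, s, n):
    """One-sample t statistic for H0: mu = mu0."""
    return (x_bar - mu0) / (s / math.sqrt(n))

# (upper-tail area, critical value) pairs for DF = 7, from a standard t table
T_TABLE_DF7 = [(0.10, 1.415), (0.05, 1.895), (0.025, 2.365),
               (0.01, 2.998), (0.005, 3.499)]

def bracket_one_sided_p(t, table):
    """Return (low, high) bounds on the one-sided p-value for a positive t."""
    upper = 0.5  # a positive t always has an upper-tail area below 0.5
    for tail_area, critical in table:
        if t < critical:
            return (tail_area, upper)
        upper = tail_area
    return (0.0, upper)

t_stat = t_statistic(156.90, 150, 11.80, 8)
print(round(t_stat, 2))                        # 1.65
print(bracket_one_sided_p(1.79, T_TABLE_DF7))  # (0.05, 0.1)
```

The bracket matches the table answer 0.05 < p-value < 0.10, and the software p-value of 0.0715 quoted in the question falls inside it.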
A fast food chain claims their regular hamburgers have an average of 310 calories. One consumer believes this average is actually much higher and takes a random sample of 45 hamburgers. The mean of this sample is x¯ = 314, and standard deviation is s = 27. 1. What are the appropriate hypotheses? 2. What is the standard error of the sample mean, x-bar? (and how-to?) 3. Appropriate t-test statistic? (how to)
1. Ho: μ = 310 vs. Ha: μ > 310 2. s/sqrt(n) = 27/sqrt(45) = 4.02 3. Use formula t = (x̅ - μ)/(s/sqrt(n)) = (314 - 310)/4.02 = .99
To discourage students from driving to campus, a university claims students spend an average of 20 minutes looking for a parking spot. Students believe the actual time is less than this. After taking a random sample of 45 students, a sample mean of 17.4 minutes to find a parking spot was calculated. 1. To assess the evidence provided by the sample data, what is the appropriate question to ask? 2. What is the alternative hypothesis in this example?
1. How likely is it that, in a sample of 45, the sample mean amount of time needed to find a parking spot is 17.4 minutes or less if the true mean is 20? 2. The mean amount of time needed to find a parking spot is less than 20 minutes.
The administrator reports that the mean GPA (grade point average) of a random sample of 40 male scholarship athletes is 3.02 and the mean GPA of a random sample of 36 female scholarship athletes is 3.11. If there is no difference in the mean GPA of male and female athletes, the probability of obtaining this difference (3.11 - 3.02 = 0.09) or more extreme is approximately 0.287. 1. What is Ho? 2. What is Ha? 3. In order to assess the evidence provided by the sample data, what is the appropriate question to ask? 4. With a p-value of 0.287, what is the appropriate conclusions to make?
1. Male and female scholarship athletes have the same mean GPA 2. Male and female scholarship athletes do not have the same mean GPA 3. How likely is it to observe a difference of 0.09 or more extreme if there is no difference in the mean GPA for male and female scholarship athletes? 4. The data do not provide strong enough evidence for rejecting H0
Suppose we have a right-skewed population distribution with a mean of 222 and a standard deviation of 33. -1. For random samples of size 19, what will be the shape of the sampling distribution of x̅? -2.For a random sample of size 19, can we compute the probability that x̅ is less than 200? -3. Suppose we took samples of size 100 instead of 19. What will be the shape of the sampling distribution of x̅?
1. Not approx. normal--CLT does not apply 2. No-both conditions are not met; population is not normal and we cannot apply CLT with small n 3. Approx normal
Researchers want to estimate the amount of time teenagers spend watching television during one week. A random sample of 500 teenagers yielded a sample mean of 12.60 hours of television per week. 1. What is the parameter the researchers are trying to estimate? 2. What statistic is used to estimate it?
1. The mean number of hours each week all teenagers spend watching television 2. The mean number of hours each week teenagers in their sample spent watching television
One of your professors claims 90% of BYU students are currently enrolled in a religion course. To test this claim, you randomly sample 300 BYU students and find that only 78% of them are enrolled in a religion course. Based on these sample results, you have evidence against your professor's claim. 1. What type of statistical inference did you use? 2. parameter of interest? 3. statistic to measure the parameter of interest?
1. Hypothesis testing 2. Proportion of all BYU students currently enrolled in a religion course 3. Proportion of BYU students in the sample currently enrolled in a religion course
Based on sample results, a 90% confidence interval for the mean servings of fruit per day consumed by grade school children is (0.21, 2.45). What is the margin of error? -How do you solve for this?
2.45-.21=2.24 2.24/2=1.12
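The same arithmetic as a small Python sketch. The helper name is hypothetical, and the midpoint is included only because it recovers the sample mean the interval was built around.

```python
def margin_and_center(lower, upper):
    """Margin of error is half the interval width; the center is the point estimate."""
    margin = (upper - lower) / 2
    center = (upper + lower) / 2
    return margin, center

margin, center = margin_and_center(0.21, 2.45)
print(round(margin, 2), round(center, 2))  # 1.12 1.33
```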
To discourage students from driving to campus, a university claims students spend an average of 20 minutes looking for a parking spot. Students believe the actual time is less than this. After taking a random sample of 45 students, a sample mean of 17.4 minutes to find a parking spot was calculated. 3. Suppose the student analyzes the data and finds that the probability of obtaining this difference (17.4 - 20), if the true mean actually is 20 minutes, is 0.012. What is the appropriate conclusion at α = 0.05? 4. If the probability of obtaining our sample data, assuming the null hypothesis were true, is large, we have enough evidence to accept the null hypothesis. (T/F)
3. Our data provides strong evidence for rejecting H0. 4. False--fail to reject null Hypothesis
A run of ________ or more points in a row on the same side of the center line indicates an out-of-control process.
9
Scores on the math portion of the SAT follow a Normal distribution with a mean of 507 and a standard deviation of 111. -Below what score do 25% of students fall? (Round your answer to the nearest whole number.) ---What formula will you use?
Answer: 433 -Formula: z = (x-μ)/σ where you are solving for x (an individual score, so there is no sqrt(n)) -First step: the 25% it gives you is what you'll need to look up in the body of your Z table. (Because it's asking below/to the left, you can just find the table value that's closest to .25.) You find -.67 is the closest Z value that represents .25. You'll then just solve for x: x = 507 + (-.67)(111) = 432.63, which rounds to 433.
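A minimal sketch of working backward from an area to a score. The first calculation uses the table value z = -0.67 from the answer above; the second uses `statistics.NormalDist().inv_cdf`, which returns an unrounded z, so the two can differ slightly.

```python
from statistics import NormalDist

mu, sigma = 507, 111

# Table approach: closest z for a left-tail area of 0.25 is -0.67
score_from_table = mu + (-0.67) * sigma
print(round(score_from_table))  # 433

# Software approach: exact 25th-percentile z of the standard Normal
exact_z = NormalDist().inv_cdf(0.25)
score_exact = mu + exact_z * sigma
print(round(exact_z, 4))  # -0.6745
```

Because the table rounds z to two decimals, the table answer (433) lands one point above the software value, which rounds to 432. On an exam that uses the z table, 433 is the expected answer.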
According to the Central Limit Theorem, for random samples, what is the approximate shape of the sampling distribution of x̅ when the population distribution is non-Normal?
Approximately Normal if the sample size is large
We want to test the hypotheses H0 : μ = 50 versus Ha : μ > 50 to determine whether a new variety of corn will yield more than 50 bushels per acre. We plan to sample 100 plots and measure yield per acre on each plot. Assuming H0 is true and that σ = 5, describe the sampling distribution of x̄
Approximately normal with mean 50 and standard deviation 0.5.
β
Beta -Probability of a Type II error
Definition of an Interval Estimation:
Confidence Intervals. -Range of plausible values for a population parameter -Way to estimate the parameter by giving likely values -Research questions ask for a value
Which of the following confidence levels and significance levels are appropriate for using a confidence interval approach to hypothesis testing? Confidence Level = 90% and α = 0.01 Confidence Level = 99% and α = 0.01 Confidence Level = 95% and α = 0.5 Confidence Level = 95% and α = 0.01
Confidence Level = 99% and α = 0.01 (must add up to 100%)
A certain brand of fishing line claims to have an average breaking strength of 30 pounds. A group of fishermen become angry because this brand of line seems to break so easily and test 25 randomly selected lines of this brand. The mean breaking strength is 27.994 pounds with a standard deviation of 0.846 pounds. A plot of the data follows What is an estimate of the mean breaking strength of this brand of fishing line? What type of inference should be performed?
Confidence interval—research question asks to estimate a value
What is the quantity z*?
Confidence multiplier
The weekly oral dosage of anabolic steroids was measured on a sample of 20 body builders. Consider the following confidence interval interpretation: "We are 95% confident that the average weekly oral dose of anabolic steroids used by all body builders is between 152 mg and 194 mg." Is this interpretation of a confidence interval correct or incorrect? Why or why not?
Correct. It gives all three parts of confidence interval interpretation
Which of the following is NOT a step of hypothesis testing? Choosing a sample and collecting data Assessing the evidence Making conclusions Stating the claims Creating an interval estimate
Creating an interval estimate
T/F the point estimate will always equal the parameter
F
The p-value gives the probability that the null hypothesis is true. T/F
F
True or False: If results are statistically significant, then they are always practically significant.
F
True or False: Important differences are always statistically significant if a large sample size is used.
F
If there is not enough evidence to support the alternative hypothesis, we can accept the null hypothesis. T/F
F (don't accept Ho, just do not reject)
Suppose the p-value was found to be 0.1629 using statistical software. What is the appropriate conclusion when α = .10?
Fail to reject the null hypothesis (0.1629 > 0.10). We have insufficient evidence to conclude that the alternative hypothesis is true -Difference could be due to chance
The statement, "We are 90% confident that the interval (119.5,128.1) captures the true mean yield in bushels per acre" is a proper interpretation of a confidence interval.
False
What is a type II error?
False negative -failing to reject the null hypothesis when it should have been rejected, stating that no difference exists when in actuality there is a difference
Which alternative hypotheses will allow you to determine statistical significance at α = 0.05 using a 95% confidence interval? H a: μ = 30 H a: μ > 30 H a: μ <30 H a: μ ≠ 30
H a: μ ≠ 30
If the P-value is less than or equal to α, what do you do to the null hypothesis?
If P is low, Reject Ho and declare the observed difference to be statistically significant (likely a real difference is present, not due to chance)
Consider the following confidence interval interpretation: "90% of the time the true mean number of Utah high school students involved in a car accident per month falls between 1523.78 and 1539.56." Is this interpretation of a confidence interval correct or incorrect? Why or why not?
Incorrect. It does not state the confidence level correctly
A tire manufacturer has a 60,000 mile warranty for tread life. The manufacturer considers the overall tire quality to be acceptable if less than 8% are worn out at 60,000 miles. A study was done and researchers were 98% confident that the proportion of tires that are worn out at 60,000 miles lies between 7.8% and 9.6%. What type of statistical inference is this?
Interval estimation
Relationship between β and α
Inversely related (as α goes up, β goes down, and vice versa)
Ha Alternative Hypothesis Definition
Involves inequalities, like <,>, or not equal to. State of difference or non compliance
What two things do we need in order to compute margin of error for a one-sample t confidence interval for μ?
Level of confidence and the standard error of x̄.
What is a Type I error?
Rejecting a null hypothesis that is actually true (false positive). The probability of making this error is α.
If all possible samples of size 20 are taken instead of size 100, how would this change the mean and standard deviation of the sampling distribution of x̄? --What is equation we use here?
Rule: The mean of the sampling distribution of x̄ equals the population mean (μ), so it does not change. The SD would increase, because a smaller n value (20) is in the denominator --Standard deviation of sampling distribution of x̄ = σ/(sqrt n)
All students in the US who took the ACT in 2014 had a mean score of μ = 21.0. Suppose you randomly select two samples of students from this population, and you calculate the sample mean for each. Sample 1 has a size of n = 40, and Sample 2 has a size of n = 250. Which sample is more likely to get a sample mean of 18 or less?
Sample 1
What is a point estimate? Provide example.
Specific value of an estimator -Ex: the proportion of infected cutting boards for n = 144 households is 10.4%
Ho Null Hypothesis definition
Statement of innocence or of no difference. The original idea.
A fast food chain claims that a large order of french fries has 540 calories. To test the claim that the true mean is actually higher, a sample of 15 large orders of french fries is taken. The sample mean is 560 and the sample standard deviation is 21. ---Suppose the t test statistic was 2.94. What is the appropriate p-value for this test?
Steps: 1. DF=15-1=14 2. In the DF=14 row of the t table, find the two values that the test statistic 2.94 falls between (2.624 and 2.977). 3. Determine whether this is a one-sided test, where Ha: μ < or >, or a two-sided test, where Ha: μ not equal to. 4. This is one-sided, as it says HIGHER. 5. Follow those two columns to the one-sided row and you'll see the two numbers that make up the p-value interval. Answer: .005<pvalue<.01
How is level of confidence determined?
Subjectively determined by the researcher.
T/F Increasing the confidence level will lead to a wider margin of error.
T
T/F Statistically significant means that there is enough evidence to reject the null hypothesis, whereas practical significance can only be determined by the researcher if the results are worth acting upon.
T
True or False: Power is when we reject a false null hypothesis.
T
True or False: Statistical inference can be defined as making generalizations about the population based on sample data.
T
A 95% confidence interval estimate for the mean weight was computed to be (781.82 grams, 812.18 grams). On the basis of this interval, at α = 0.05, the researcher can reject H0 : μ = 780 and conclude that Ha : μ ≠ 780 is correct T/F
T (The value 780 is not found between 781.82 and 812.18. Because the interval gives us possible values for μ based on the sample and because 780 is not one of them, we can reject H0 and say the μ does not equal 780.)
A certain brand of fishing line claims to have an average breaking strength of 30 pounds. A group of fishermen become angry because this brand of line seems to break so easily and test 25 randomly selected lines of this brand. The mean breaking strength is 27.994 pounds with a standard deviation of 0.846 pounds. A plot of the data follows. Do these data provide sufficient evidence for the fishermen to conclude that the average breaking strength is less than claimed? What type of inference should be performed?
Test of significance—research question is asking whether sample data support or contradict a claim about the parameter
Definition of hypothesis testing
Tests of Significance. -States a claim and then checks whether sample data provides evidence for or against the claim -Uses sample data to check whether a claim about the population parameter is true or false -Yes or No research questions
The weekly oral dosage of anabolic steroids was measured on a sample of 20 body builders. A 95% confidence interval estimate for the average weekly oral dose of anabolic steroids obtained from these results was 152 mg to 194 mg. The mean of the sample is 173 mg and the margin of error for the confidence interval given in the above question is 21 mg. Which one of the following is a correct interpretation of margin of error?
The maximum difference we expect between our sample result and the true average weekly oral dose is no more than 21 mg.
In order to estimate the mean age to be diagnosed with diabetes, a researcher takes a sample and finds the mean age to be 16.4. What is the parameter of interest? ---What statistic is used to estimate the parameter of interest?
The mean age of all people at which they were diagnosed with diabetes ---mean age of diagnoses for the sample
An administrator in a very large company wants to estimate the mean level of nitrogen oxides (NOX) emitted in the exhaust of a particular car model in their very large fleet of cars. Historically, nitrogen oxide levels have been known to be Normally distributed with a standard deviation of 0.15 g/ml. What statistic is used to estimate the parameter of interest? ---What is the parameter of interest that the administrator wants to estimate?
The mean level of nitrogen oxide of a sample of cars of a particular model in the very large fleet ---The mean level of nitrogen oxide of all cars of a particular model in the very large fleet.
A manufacturing process produces bags of cookies. The weights of these bags are known to be normally distributed and should have a mean of μ = 15.0 ounces with a standard deviation of σ = 0.4 ounces. In order to monitor the process, four bags are selected periodically and their average weight (x̄) is computed. -What is the parameter of interest?
The mean weight of all bags of cookies produced by this manufacturing process.
Two studies were done on the same set of data, where study I was a one-sided test and study II was a two-sided test. The p-value of the test corresponding to study I was found to be 0.030. What is the p-value for study II?
The p-value must be 0.060
What does s/sqrt n estimate?
The standard deviation of the sampling distribution (of x-bar)
An article claims that teenagers on average will check their cellphones 150 times in one day. A student decides to test this claim using the hypotheses H0: μ = 150 vs. Ha: μ ≠ 150. A 95% confidence interval for the true mean is found to be (154.3, 167.5). On the basis of this interval, what should the student conclude at α=0.05?
The true mean is not equal to 150 since the claimed value, 150, is not in the interval.
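The confidence interval approach to a two-sided test reduces to checking whether the claimed value sits inside the interval. A minimal sketch follows; the function name is made up, and it assumes the confidence level and α match (for example 95% with α = 0.05).

```python
def reject_two_sided(null_value, ci):
    """Reject H0: mu = null_value when it lies outside the confidence interval.

    Only valid for a two-sided Ha and a confidence level equal to 1 - alpha.
    """
    lower, upper = ci
    return not (lower <= null_value <= upper)

print(reject_two_sided(150, (154.3, 167.5)))    # True: 150 is outside, reject H0
print(reject_two_sided(780, (787.82, 804.18)))  # True: 780 is outside, reject H0
print(reject_two_sided(160, (154.3, 167.5)))    # False: 160 is a plausible value
```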
What is the purpose of a statistical control chart?
To distinguish between natural and unnatural variation.
If we fail to reject a false null hypothesis, what type of error are we making?
Type II
The weights of Cougar Tail donuts are known to have a normal distribution with a mean of 5.78 oz and a standard deviation of 0.21 oz. -How many standard deviations away from the mean is a donut that weighs 6 oz?
Use Z=(x-μ)/σ with x=6, μ=5.78, σ=0.21: Z = (6-5.78)/0.21 ≈ 1.05
A researcher is interested in the mean height of all fifth graders in Utah. She randomly samples 35 students and calculates a sample mean of 49 inches and sample standard deviation of 3.22 inches. Compute the margin of error (t* s/sqrt n) for 95% confidence. ---Explain how you would solve
Use the equation: margin of error = t* × (s/sqrt(n)), where you find t* by looking up 34 DF on the chart (but not going over, so use the DF=30 row): t*=2.042, s=3.22, n=35. This gives 2.042 × 3.22/sqrt(35) ≈ 1.11 inches.
A professor reported that students in a class had scores on the final exam that were left-skewed with a mean of 81.9 and a standard deviation of 6.1. A student took a random sample of 50 students and calculated the mean score to be 80.3. -What is the probability that the sample this student obtained would have a mean of 80.3 or lower? --What equation? How will you find it? What are each of the values?
Use your Xbar probabilities flowchart, and you'll use the CLT equation: Z=(x̄-μ)/ (σ/sqrt(n)). This gives you a Z value which you can just look up directly μ=81.9 σ=6.1 x̄=80.3 n=50
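As a check on the table lookup, here is a sketch using `statistics.NormalDist` for the sampling distribution of x̄. It relies on the CLT (n = 50) to treat that distribution as approximately Normal even though the population is left-skewed.

```python
import math
from statistics import NormalDist

mu, sigma, n, x_bar = 81.9, 6.1, 50, 80.3

# Approximate sampling distribution of x-bar under the CLT
x_bar_dist = NormalDist(mu, sigma / math.sqrt(n))

z = (x_bar - mu) / x_bar_dist.stdev
p_lower = x_bar_dist.cdf(x_bar)  # area to the left of 80.3
print(round(z, 2))  # -1.85
print(round(p_lower, 3))
```

The left-tail probability comes out at about 0.032, in line with the z table entry for -1.85.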
The weekly oral dosage of anabolic steroids was measured on a sample of 20 body builders. A 95% confidence interval estimate for the average weekly oral dose of anabolic steroids obtained from these results was 152 mg to 194 mg. Which one of the following is a correct interpretation of this confidence interval?
We are 95% confident that the average weekly dose of anabolic steroids used by all body builders is between 152 mg. and 194 mg
Suppose the 95% confidence interval estimate for the mean monthly cost for Internet service for all Internet users is ($19.90, $21.90). Which of the following is a correct interpretation of this 95% confidence interval?
We are 95% confident that the mean monthly cost for Internet service paid by all Internet users is between $19.90 and $21.90.
A professor is interested in the mean amount of money BYU students spend on groceries per week. He randomly samples 200 students and calculates a sample mean of $42. He then computes a 95% confidence interval of ($34.2, $49.8). Which of the following is a correct interpretation of this confidence interval?
We are 95% confident that the true mean amount of money BYU students spend on groceries per week is between $34.2 and $49.8.
To estimate the average speed of cars traveling on a certain stretch of highway, officers randomly select 75 cars and record their speed. Suppose the 95% confidence interval is (64.5, 78.2). --What is the appropriate interpretation of this confidence interval?
We are 95% confident that the true mean speed of cars on this certain stretch of highway is between 64.5 and 78.2 miles per hour.
In practice, if we don't know whether the population is normal and our sample size is less than 30, when can we proceed with inference for confidence intervals and hypothesis testing?
When the data is single-peaked and there are no outliers
Y/N Consider the following interpretation of 99% confidence: "99% of intervals calculated with this method will capture the population parameter." Is this a correct interpretation of the confidence level?
Y
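A small simulation can make the "99% of intervals" idea concrete. This is a sketch under made-up assumptions: the population values (μ = 50, σ = 10), the sample size, and the function name `coverage_rate` are all hypothetical, and it uses a z interval with known σ (z* = 2.576) to keep the code short.

```python
import math
import random

def coverage_rate(mu, sigma, n, z_star, trials, seed=0):
    """Fraction of simulated z intervals (known sigma) that capture mu."""
    rng = random.Random(seed)
    margin = z_star * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n and compute its mean.
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Count the interval as a success if it contains the true mean.
        if sample_mean - margin <= mu <= sample_mean + margin:
            hits += 1
    return hits / trials

# Hypothetical population; 2.576 is the z* value for 99% confidence.
rate = coverage_rate(mu=50.0, sigma=10.0, n=25, z_star=2.576, trials=2000)
print(rate)
```

The printed rate should land close to 0.99. Any single interval either captures μ or it does not; the 99% describes the method over many samples.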
Suppose we take samples of size 40 from a population with a mean of 200 and a standard deviation of 50. Can we compute the probability that x̄ is greater than 270?
Yes, because the sample size (n = 40) is larger than 30, so by the CLT the sampling distribution of x̄ is approximately normal.
A fast food chain claims that a large order of french fries has 540 calories. To test the claim that the true mean is actually higher, a sample of 15 large orders of french fries is taken. The sample mean is 560 and the sample standard deviation is 21. ---Calculate a 95% confidence interval for the mean number of calories for all large orders of fries.
You'll use the x̄ ± t*(s/sqrt(n)) equation, with x̄=560, s=21, n=15, and t*=2.145 (95% confidence, DF of 14). This gives 560 ± 2.145(21/sqrt(15)) ≈ 560 ± 11.63, or (548.37, 571.63).
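As a quick check of the arithmetic, here is a minimal sketch of the x̄ ± t*(s/sqrt(n)) calculation. The function name `t_interval` is made up for illustration, and t* = 2.145 is typed in from the t table (df = 14, 95%) rather than computed.

```python
import math

def t_interval(xbar, s, n, t_star):
    """Return (lower, upper) for xbar +/- t* * s / sqrt(n)."""
    margin = t_star * s / math.sqrt(n)  # margin of error
    return xbar - margin, xbar + margin

# French fries problem: n = 15 orders, so df = 14 and t* = 2.145 from the table.
lower, upper = t_interval(xbar=560, s=21, n=15, t_star=2.145)
print(f"({lower:.2f}, {upper:.2f})")  # (548.37, 571.63)
```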
What is a point estimator? -Provide example
a statistic that provides an estimate of a population parameter -EX: Estimator of the population mean µ is the sample mean, x̄
The IQ level of students at a particular university has an unknown mean. A simple random sample of 100 students is found to have a sample mean IQ of x̅ = 115 and a sample standard deviation of s = 15. a). Calculate a 95% confidence interval for the mean IQ level of all students in the university. b). If the researcher wanted to have 95% confidence in the results with a margin of error of 5.1, how many students must be sampled? (Assume σ = 15)
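a) σ is not given, so use x̄ ± t*(s/sqrt(n)). With df = 99, the 95% t chart gives t* ≈ 1.984 (the z value 1.960 is a close approximation at this sample size), so 115 ± 1.984(15/sqrt(100)) ≈ (112.0, 118.0). b) Use n=(z*(σ)/m)^2 where m is the margin of error: (1.96(15)/5.1)^2 ≈ 33.23. Always round up for sample size, so n = 34.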
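The sample size step in part b can be sketched in Python. The function name `required_sample_size` is hypothetical; the key point is that the result is rounded up, never to the nearest whole number, so the margin of error is no larger than requested.

```python
import math

def required_sample_size(z_star, sigma, margin):
    """Return the raw value of (z* * sigma / m)^2 and the rounded-up sample size."""
    raw = (z_star * sigma / margin) ** 2
    return raw, math.ceil(raw)

# IQ problem, part b: 95% confidence (z* = 1.96), sigma = 15, margin of error 5.1.
raw, n_needed = required_sample_size(z_star=1.96, sigma=15, margin=5.1)
print(round(raw, 2), n_needed)  # 33.23 34
```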
Suppose the thickness of the boards produced by a certain factory process varies normally. The distribution of thickness of the circuit boards is supposed to have the mean μ = 12 mm if the manufacturing process is working correctly. A random sample of five circuit boards is selected and measured, and the average thickness is found to be 9.13 mm. The standard deviation for the sample is computed to be 1.11 mm. a. Calculate a 95% confidence interval for the mean thickness of the circuit boards. b. Suppose the 95% confidence interval for the mean thickness of the circuit boards is (8.1, 11.1). What would be the correct conclusion to make for our two-sided hypothesis test at significance level 5%, given the confidence interval?
a. DF=4 at 95% confidence gives t* = 2.776. Use this in the equation x̄ ± t*(s/sqrt(n)): 9.13 ± 2.776(1.11/sqrt(5)) ≈ 9.13 ± 1.38, which gives (7.75, 10.51). b. Reject Ho because 12 mm is not included in the confidence interval; therefore, we have sufficient evidence to conclude that the mean thickness of the circuit boards is different from 12 mm.
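The reasoning in part b (check whether the null value falls inside the interval) can be written as a tiny helper. This is a sketch with a made-up function name, and it only applies to a two-sided test whose α matches the confidence level (here 95% confidence and α = 0.05).

```python
def two_sided_decision(mu0, interval):
    """Decide a two-sided test from a confidence interval with matching alpha."""
    lower, upper = interval
    if lower <= mu0 <= upper:
        return "fail to reject H0"  # the claimed value is a plausible value of the mean
    return "reject H0"              # the claimed value is outside the plausible range

# Circuit boards: H0 says mu = 12, the 95% interval is (8.1, 11.1).
print(two_sided_decision(12, (8.1, 11.1)))  # reject H0
```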
A sample of 100 sales receipts is taken from two competing grocery stores. Store A has a mean total of $49.60 on each receipt and Store B has a mean total of $50.23 on each receipt. Store B claims they have a higher mean total per customer than Store A. If there is no difference between the two stores, the probability of obtaining this difference (49.60 - 50.23 = -0.63) is 0.356. a. What is the null hypothesis? b. With a p-value of 0.356, what is the appropriate conclusion to make?
a. Remember the null hypothesis means "no difference": Store A and Store B have the same mean total per customer. b. The p-value is very high (well above .05), therefore the data do not provide strong evidence to reject H0.
The null hypothesis states that the average time full-time corporate employees work per week is 40 hours. The alternative hypothesis states that the average time full-time corporate employees work per week is more than 40 hours. To substantiate his claim, the researcher randomly selected 40 corporate employees and finds that they work an average of 43 hours per week with a standard deviation of 9.6 hours. a. Are the conditions for this test met? Why or why not? b. What is the test statistic for testing the hypotheses H0: μ =40 vs. Ha: μ > 40? c. Suppose the test statistic is 1.37. What is the p-value for this one-sided test? d. Suppose the p-value is 0.135. What is the correct interpretation of this p-value? e. Suppose the p-value is 0.0677. At α = 0.05, what should the researcher conclude? f. Referring to e, Suppose α =0.10 rather than 0.05. What should the researcher conclude?
a. Yes, because it was a random sample and n>30. b. Use the test statistic formula: t = (x̄ - μ0)/(s/sqrt(n)) = (43 - 40)/(9.6/sqrt(40)) ≈ 1.98. c. Find where 1.37 falls in the DF=39 row of the t table and read off the one-sided tail probabilities, which gives you: .05 < p-value < .10. d. Assuming the null hypothesis is true, there is a 0.135 probability of obtaining a sample statistic as extreme or more extreme than what we calculated. e. Fail to reject the null hypothesis. There is insufficient evidence to conclude that the average time full-time corporate employees work per week is greater than 40 hours. (could be chance) f. Reject the null hypothesis. There is sufficient evidence to conclude that the average time full-time corporate employees work per week is greater than 40 hours.
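Parts b, e, and f can be checked with a short sketch. The function names are made up for illustration; the p-value itself is taken from the problem statement rather than computed, since the standard library has no t-distribution.

```python
import math

def t_statistic(xbar, mu0, s, n):
    """Number of standard errors the sample mean is from the claimed mean."""
    return (xbar - mu0) / (s / math.sqrt(n))

def decide(p_value, alpha):
    """Compare the p-value to the significance level."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Work hours problem: x-bar = 43, mu0 = 40, s = 9.6, n = 40.
t = t_statistic(xbar=43, mu0=40, s=9.6, n=40)
print(round(t, 2))            # 1.98
print(decide(0.0677, 0.05))   # fail to reject H0
print(decide(0.0677, 0.10))   # reject H0
```

The same p-value leads to different conclusions at α = 0.05 and α = 0.10, which is why α should be chosen before looking at the data.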
A certain brand of fishing line claims to have an average breaking strength of 30 pounds. A group of fishermen become angry because this brand of line seems to break so easily and test 25 randomly selected lines of this brand. The mean breaking strength is 27.994 pounds with a standard deviation of 0.846 pounds. A plot of the data follows. Suppose the p-value is 0.0001. Interpret this P-value in context. a. The probability that the mean breaking strength is 30 pounds is 0.0001. b. If the mean breaking strength of the line is 30 pounds as claimed, the probability of obtaining a sample mean breaking strength of 27.994 lbs or less is 0.0001. c. The probability is 0.0001 of rejecting the hypothesis that the mean breaking strength is 30 pounds when that hypothesis is correct. d. 0.01% of the time will we get a sample mean that leads us to reject the hypothesis that the mean breaking strength is 30 pounds when it is not 30 pounds.
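b-because the p-value is calculated assuming Ho is true: if the mean breaking strength really were 30 pounds, we would get a sample mean of 27.994 lbs or less only .0001 of the time. But we got it here, so that supports rejecting Ho.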
The weekly oral dosage of anabolic steroids was measured on a sample of 20 body builders. A 95% confidence interval estimate for the average weekly oral dose of anabolic steroids obtained from these results was 152 mg to 194 mg. Which of the following is a correct interpretation of 95% confidence? (a) There is a 0.95 probability that the average weekly oral dose of anabolic steroids is somewhere between 152 mg to 194 mg. (b) Ninety-five percent of the time, the average weekly oral dose of anabolic steroids is somewhere between 152 mg to 194 mg. (c) Using the same procedure as was used to obtain the computed interval in repeated sampling, we will obtain intervals that contain the average weekly oral dose of anabolic steroids 95 percent of the time.
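c-because 95% confidence describes how often the procedure captures the parameter over repeated sampling. The parameter is fixed, so (a) and (b) are wrong: a specific computed interval either contains it or it doesn't.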
If p-value > α then ...?
Do not reject Ho. You do not declare the observed difference statistically significant (the difference could plausibly be due to chance).
Statistically significant is equivalent to all of the following except one. Which one is not equivalent? a) P-value < α. b) The difference between the observed value of the statistic and the value of the parameter as given in H0 is too large to attribute to just chance variation. c) The probability of obtaining a sample statistic as extreme or more extreme than actually observed if H0 were true is too small for us to believe that H0 is correct. d) The observed statistic is inconsistent with the null hypothesis. e) The difference between an observed statistic and the true parameter value is due to chance variation.
e) The difference between an observed statistic and the true parameter value is due to chance variation.
The test statistic t = (x̄ - μ0)/(s/sqrt(n)) measures
how many standard errors the observed x̄ is from the claimed parameter value μ0
t*(s/sqrt(n))
margin of error for estimating μ
Requirements to use CLT?
Random sample with n>30, or a normally distributed population
A test with < or > in Ha is _____
one-sided test
In order to estimate the mean age to be diagnosed with diabetes, a researcher takes a sample and finds the mean age to be 16.4. What type of statistical inference is this?
point estimation
μ
population mean
σ
population standard deviation
The p-value gives the _____
probability of observing an outcome as extreme or more extreme if Ho is true
Key word in hypothesis testing (statistic and proportion)?
proportion
Relationship between effect size and power?
Directly related: the larger the effect size, the higher the power
The sample proportion, ____, is used to estimate the population proportion, ____.
p̂, p
Whenever performing a one sample t procedure on means, we should check for
randomization and no outliers in the data
Consider this formula for a confidence interval when σ is unknown: x̄ ± t*(s/sqrt(n)). Which part of this formula is the standard error of x̄?
s/sqrt(n)
What is the correct formula for the standard error of x̄?
s/sqrt(n)
What is x̄
sample mean
What is p̂?
sample proportion -Estimator of the population proportion, p
n
sample size
s
sample standard deviation
σ/sqrt(n)
standard deviation of the sampling distribution of x̄
s/sqrt(n)
standard error of x bar
Sample results are said to be statistically significant whenever
the difference between the observed statistic and the claimed parameter value given in Ho is too large to be due to chance
Margin of error for 99% confidence tells us
the most a statistic differs from the parameter for the middle 99% of all possible statistic values.
Level of confidence can be defined as
the percentage of the time that the procedure will produce intervals that contain the parameter value.
What is power?
the probability of correctly rejecting a false null hypothesis =1-β
Fill in the blank: The t-distribution with 6 degrees of freedom has ___________the standard Normal distribution.
the same center but is more spread out than
A university administrator obtains a report of the academic records of past scholarship athletes at the university. From the report the administrator believes that the mean GPA (grade point average) of current male scholarship athletes is 3.02. If a researcher believes that the mean GPA of male athletes is significantly different than 3.02, what type of test should be conducted? ----Refer to the situation above. Suppose the researcher actually believed the true mean is much lower than 3.02. What type of test should be conducted?
two sided ---one sided lower tail
Definition of point estimation:
uses the sample statistic as the estimate of the population parameter with NO measure of uncertainty -only used as step 1 in valid inference
The Federal Pell Grant Program provides need-based grants to low-income undergraduate and certain postbaccalaureate students to promote access to postsecondary education. According to the National Postsecondary Student Aid Study conducted by the U.S. Department of Education in 2008, the average Pell grant award for 2007-2008 was $2,600. Assume that the standard deviation, σ, in Pell grant awards was $500. --What is the probability that the mean of n = 75 Pell grant awards will exceed (is greater than) $2700? ------What is the equation and the parts/steps?
z = (x̄ - μ)/(σ/sqrt(n)), with x̄=2700, μ=2600, σ=500, n=75. z = (2700-2600)/(500/sqrt(75)) ≈ 1.73. Look up 1.73 in the z chart and it gives you .9582. But because the question asks for greater than, you do 1 - .9582 = .0418
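For a "greater than" probability like this one, the steps can be sketched as follows. The helper names are made up, and the error function stands in for the z chart. Because the code does not round z to 1.73 first, its answer (about 0.0416) is slightly different from the chart answer of .0418.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, P(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_xbar_greater(cutoff, mu, sigma, n):
    """P(x-bar > cutoff) using the normal sampling distribution of x-bar."""
    z = (cutoff - mu) / (sigma / math.sqrt(n))
    return z, 1.0 - normal_cdf(z)  # upper tail = 1 - lower tail

# Pell grant problem: mu = 2600, sigma = 500, n = 75, cutoff = 2700.
z, p_upper = prob_xbar_greater(cutoff=2700, mu=2600, sigma=500, n=75)
print(round(z, 2), round(p_upper, 4))
```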
What is the probability of making a type II error?
β
What is the center line of the control chart for this process?
μ
mean of the sampling distribution of x̅ = ___?
μ (population mean)
What are the limits to control processes?
μ ± 3(σ/sqrt(n))
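Since the cards give no numbers for the control chart, here is a sketch with made-up values (μ = 100, σ = 8, samples of size n = 4) just to show how the center line and limits fit together. The function name `control_limits` is also hypothetical.

```python
import math

def control_limits(mu, sigma, n):
    """Return (lower limit, center line, upper limit) for an x-bar control chart."""
    half_width = 3 * sigma / math.sqrt(n)  # three standard deviations of x-bar
    return mu - half_width, mu, mu + half_width

# Hypothetical process values, not taken from the flashcards.
lcl, center, ucl = control_limits(mu=100.0, sigma=8.0, n=4)
print(lcl, center, ucl)  # 88.0 100.0 112.0
```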
An insurance agent collects a random sample of n = 47 auto insurance premiums and finds the average premium to be $1800 with a standard deviation of $500. What is the statistic in this study?
The average amount of the n = 47 sampled auto insurance premiums ($1800). *A statistic is a fact about the sample.
An insurance agent collects a random sample of n = 47 auto insurance premiums and finds the average premium to be $1800 with a standard deviation of $500. What is the parameter of interest for the insurance agent?
-The average amount of all auto insurance premiums
Relationship between n, β, and power?
-As n increases, power increases -As n increases, β decreases (power = 1-β)
A proper interpretation of a confidence interval should have the following 3 things:
-statement of confidence -parameter in context -calculated interval
α
Alpha, level of significance. -Probability of a type I error
T/F The null hypothesis is the claim that the researcher wants to prove.
F (this is Ha)
The purpose of a confidence interval is to estimate the value of a sample statistic. T/F
F-parameter
What do we obtain from the sampling distribution of x̄, created assuming the null hypothesis is true, in order to perform a test of hypothesis?
P-value.
When comparing the z-distribution and the t-distribution, the t-distribution has a wider spread. T/F
T
If all possible samples of size 80 are taken from a population instead of size 20, how would this change the mean and standard deviation of the sampling distribution of x̅?
The mean would stay the same and the standard deviation would decrease
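To see how much the standard deviation decreases, here is a sketch with a made-up population standard deviation (σ = 10, not from the cards). Going from n = 20 to n = 80 multiplies the sample size by 4, so σ/sqrt(n) is cut in half.

```python
import math

def sd_of_xbar(sigma, n):
    """Standard deviation of the sampling distribution of x-bar."""
    return sigma / math.sqrt(n)

sigma = 10.0  # hypothetical population standard deviation
sd_20 = sd_of_xbar(sigma, 20)
sd_80 = sd_of_xbar(sigma, 80)
print(round(sd_20, 3), round(sd_80, 3), round(sd_20 / sd_80, 3))
```

The mean of the sampling distribution is μ for either sample size, so only the spread changes.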