Foundations of Business Analytics Exam 3
A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size _____. Multiple Choice: a. n has a probability of 0.05 of being selected b. N and n have the same probability of being selected c. n has the same probability of being selected d. n has a probability of 0.5 of being selected
c. n has the same probability of being selected
A media group has conducted a survey asking voters if the election were held today would you vote for "Harvey the Rabbit" versus the best potential alternative candidate. They would like to show that "Harvey" would get a majority of the vote. The survey has 800 responses of which 424 said they would vote for "Harvey". Let alpha = .05
Ho: P ≤ .5 Ha: P > .5
Decision rule using critical value method
If tcalc is more extreme than tcrit then Reject Ho Otherwise, Do not Reject Ho. (Note: MindTap uses terminology Fail to Reject Ho)
coverage error
If the research objective and the population from which a sample is to be drawn are not aligned, the data collected will not help accomplish the research objective
Using an α = 0.04, a confidence interval for a population proportion is determined to be 0.65 to 0.75. If the level of significance is decreased, the interval for the population proportion _____. Multiple Choice: a. becomes wider b. remains the same c. becomes narrower d. does not change
a. becomes wider
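A minimal sketch of why the interval widens, assuming the 0.65 to 0.75 interval was built with a normal (z) critical value. The helper `z_critical` and the comparison level α = 0.01 are illustrative choices, not part of the original question.

```python
from statistics import NormalDist

def z_critical(alpha: float) -> float:
    """Two-sided critical value z_(alpha/2) of the standard normal."""
    return NormalDist().inv_cdf(1 - alpha / 2)

lower, upper = 0.65, 0.75            # interval from the question (alpha = 0.04)
center = (lower + upper) / 2
margin_04 = (upper - lower) / 2
# Standard error implied by the original interval (assumes a z-based interval).
std_error = margin_04 / z_critical(0.04)

# Hypothetical smaller level of significance: the critical value grows,
# so the margin of error grows with it.
margin_01 = z_critical(0.01) * std_error

print(f"alpha = 0.04: {center - margin_04:.4f} to {center + margin_04:.4f}")
print(f"alpha = 0.01: {center - margin_01:.4f} to {center + margin_01:.4f}")
```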
When the expected value of the point estimator is equal to the population parameter it estimates, it is said to be _____. Multiple Choice: a. unbiased b. symmetric c. precise d. predicted
a. unbiased
A parameter is a numerical measure from a population, such as _____. Multiple Choice: a. μ b. s c. x̄ d. p̄
a. μ
measurement error
an incorrect measurement of the characteristic of interest
What are the two decisions that you can make from performing a hypothesis test? Multiple Choice: a. Accept the null hypothesis; Accept the alternative hypothesis b. Reject the alternative hypothesis; Accept the null hypothesis c. Reject the null hypothesis; Fail to reject the null hypothesis d. Make a Type I error; Make a Type II error
c. Reject the null hypothesis; Fail to reject the null hypothesis
A Type I error is committed when _____. Multiple Choice: a. the validity of a claim was rejected b. the critical value is greater than the value of the test statistic c. a true null hypothesis is rejected d. a true alternative hypothesis is not accepted
c. a true null hypothesis is rejected
For a population with an unknown distribution, the form of the sampling distribution of the sample mean is _____. Multiple Choice: a. approximately normal for small sample sizes b. exactly normal for large sample sizes c. approximately normal for large sample sizes d. exactly normal for small sample sizes
c. approximately normal for large sample sizes
An estimate of a population parameter that provides an interval of values believed to contain the value of the parameter is known as the _____. Multiple Choice: a. population estimate b. parameter level c. interval estimate d. confidence level
c. interval estimate
Fill in the blanks with the correct word: In many practical sampling situations, the finite population correction factor is close to ____ , so the difference between the values of the standard deviation for the finite and infinite populations is ____________.
1; negligible
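A short sketch of the finite population correction factor, sqrt((N - n) / (N - 1)). The population size, sample size, and σ below are hypothetical values chosen only to show how close the factor is to 1 when n is small relative to N.

```python
import math

def fpc(N: int, n: int) -> float:
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))

def std_error_mean(sigma: float, n: int, N=None) -> float:
    """Standard error of x-bar; applies the correction when N is given."""
    se = sigma / math.sqrt(n)
    return se if N is None else se * fpc(N, n)

# Hypothetical example: N = 10,000, n = 100, sigma = 8.
factor = fpc(10_000, 100)
print(f"correction factor: {factor:.4f}")
print(f"infinite-population SE: {std_error_mean(8, 100):.4f}")
print(f"finite-population SE:   {std_error_mean(8, 100, N=10_000):.4f}")
```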
The purpose of statistical inference is to make estimates or draw conclusions about a _____. Multiple Choice: a. mean of the sample based upon the mean of the population b. sample based upon information obtained from the population c. population based upon information obtained from the sample d. statistic based upon information obtained from the population
c. population based upon information obtained from the sample
A random sample selected from an infinite population is a sample selected such that each element selected comes from the same _____ and each element is selected _____. Multiple Choice: a. population; simultaneously b. sample; independently c. population; independently d. sample; simultaneously
c. population; independently
The value of the _____ is used to estimate the value of the population parameter. Multiple Choice: a. population statistic b. sample parameter c. sample statistic d. population estimate
c. sample statistic
The processes that generate big data can be described by the following four attributes or dimensions: Multiple Choice: a. volume, variability, veracity, and velocity b. variety, vectors, veracity, and velocity c. volume, variety, veracity, and velocity d. tall data, wide data, narrow data, and big data
c. volume, variety, veracity, and velocity
Fill in the blanks with the correct word: When the population does not have a normal distribution, the ______________________ is helpful in identifying the shape of the sampling distribution of x̅ .
central limit theorem
census
collects data from every element in the population of interest
In order to determine an interval for the mean of a population with unknown standard deviation, a sample of 24 items is selected. The mean of the sample is determined to be 23. The number of degrees of freedom for reading the t value is _____. Multiple Choice: a. 22 b. 21 c. 24 d. 23
d. 23 Explanation: DOF = n - 1 = 24 - 1 = 23
When using interval estimation, what is always important?
it is always important to carefully consider whether a random sample of the population of interest has been taken.
Sample Size
large enough to satisfy CLT
α =
level of significance
Confidence intervals for the population mean and population proportion become more ________ as the size of the sample __________.
narrow; increases
The results of any hypothesis test, no matter the sample size, are only reliable if the sample is relatively free of ______________.
nonsampling errors
No business decision should be based solely on statistical inference; ________________ should always be considered in conjunction with statistical significance.
practical significance
Sampling error is unavoidable when collecting a...
random sample
When the sample size n is very large, almost any difference between the sample mean and the hypothesized population mean results in ....
rejection of the null hypothesis
If repeated independent random samples of the same size are collected from the population of interest using probability sampling techniques, on average the samples will be ...
representative of the population.
sources of big data
sensors, mobile devices, Internet activities, digital processes, and social media
People end up tossing 12% of what they buy at the grocery store. Assume this is the true population proportion and that you plan to take a sample survey of 540 grocery shoppers to further investigate their behavior. Show the sampling distribution of p, the proportion of groceries thrown out by your sample respondents. If required, round your answer to four decimal places. standard error of p-bar = ?
standard error of p-bar = .0140 Explanation: standard error of p-bar = sqrt(p(1 - p) / n), where E(p-bar) = p = 12% or .12 and n = 540. standard error of p-bar = sqrt(.12(1 - .12) / 540) = .0140
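The calculation can be checked with a few lines of Python. The function name `std_error_proportion` is just an illustrative helper.

```python
import math

def std_error_proportion(p: float, n: int) -> float:
    """Standard error of the sample proportion: sqrt(p(1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

se = std_error_proportion(0.12, 540)
print(f"standard error of p-bar = {se:.4f}")
```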
Test Statistic for Hypothesis Test about a population proportion
tcalc = (p bar - hyp value) / std error, where std error = sqrt(hyp value * (1 - hyp value) / n)
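As a sketch of this formula in use, the code below applies it to the "Harvey the Rabbit" poll (424 of 800 in favor, hypothesized value 0.5, α = .05, upper-tail test). It assumes the standard normal distribution for the critical value and p-value; the function name is illustrative.

```python
import math
from statistics import NormalDist

def proportion_test_statistic(successes: int, n: int, p0: float) -> float:
    """(p-bar - hypothesized value) / std error, with std error built from p0."""
    p_bar = successes / n
    std_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_bar - p0) / std_error

alpha = 0.05
calc = proportion_test_statistic(424, 800, 0.5)
crit = NormalDist().inv_cdf(1 - alpha)      # upper-tail critical value
p_value = 1 - NormalDist().cdf(calc)        # upper-tail p-value
reject = calc > crit

print(f"calc = {calc:.3f}, crit = {crit:.3f}, p-value = {p_value:.4f}")
print("Reject Ho" if reject else "Do not reject Ho")
```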
Sampling error
the deviation of the sample statistic from the population parameter that results from random sampling
df (degrees of freedom) = n - 1 refers to
the number of data values that are free to vary if the mean is known
level of significance error
the probability of making a Type I error when the null hypothesis is true
Convenience Sampling
using students
Effective Application of Confidence Intervals
• Interval estimates become increasingly precise as the sample size increases; extremely large samples will yield extremely precise estimates. • No interval estimate, no matter how precise, will accurately reflect the parameter being estimated unless the sample is relatively free of nonsampling error.
four attributes or dimensions of big data
• Volume- the amount of data generated. • Variety- the diversity in types and structures of data generated. • Veracity- the reliability of the data generated. • Velocity- the speed at which the data are generated.
Health insurers are beginning to offer telemedicine services online that replace the common office visit. Wellpoint provides a video service that allows subscribers to connect with a physician online and receive prescribed treatments. Wellpoint claims that users of its LiveHealth Online service saved a significant amount of money on a typical visit. The data shown below ($), for a sample of 20 online doctor visits, are consistent with the savings per visit reported by Wellpoint. 92, 34, 40, 105, 83, 55, 56, 49, 40, 76, 48, 96, 93, 74, 73, 78, 93, 100, 53, 82 Assuming that the population is roughly symmetric, construct a 95% confidence interval for the mean savings for a televisit to the doctor as opposed to an office visit. If required, round your answers to two decimal places.
$60.54 to $81.46 Explanation: Sample mean = 71 (sum / n). Sample standard deviation s = sqrt(Σ(xi - x̄)^2 / (n - 1)) = 22.351. Sample size n = 20. Standard error = s / sqrt(n) = 22.351 / sqrt(20) = 4.998. DOF = n - 1 = 20 - 1 = 19. Alpha = 1 - .95 = .05, so alpha/2 = .05/2 = .025 and the t score is 2.093 (from t table). Margin of error = t(a/2) * s / sqrt(n) = 2.093 * 22.351 / sqrt(20) = 10.461. Formula: sample mean (+/-) margin of error: 71 - 10.461 = 60.54 and 71 + 10.461 = 81.46
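The interval can be reproduced from the raw data. This is a sketch that takes the t value 2.093 (19 degrees of freedom) from the explanation as a table lookup rather than computing it; `t_interval` is an illustrative helper name.

```python
import math
from statistics import mean, stdev

savings = [92, 34, 40, 105, 83, 55, 56, 49, 40, 76,
           48, 96, 93, 74, 73, 78, 93, 100, 53, 82]

T_CRIT = 2.093  # t(0.025) with 19 degrees of freedom, from a t table

def t_interval(data, t_crit):
    """Return (lower, upper) for x-bar +/- t * s / sqrt(n)."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)                      # sample standard deviation (n - 1 divisor)
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

low, high = t_interval(savings, T_CRIT)
print(f"mean = {mean(savings)}, s = {stdev(savings):.3f}")
print(f"95% CI: ${low:.2f} to ${high:.2f}")
```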
The state of California has a mean annual rainfall of 22 inches, whereas the state of New York has a mean annual rainfall of 42 inches. Assume that the standard deviation for both states is 8 inches. A sample of 30 years of rainfall for California and a sample of 45 years of rainfall for New York has been taken. If required, round your answer to three decimal places. (a)Show the sampling distribution of the sample mean annual rainfall for California. California E(x) = _________ inches σx̄ = __________ inches (b)Show the sampling distribution of the sample mean annual rainfall for New York. New York E(x) = _________ inches σx̄ = __________ inches (c)In which of the preceding two cases, part (a) or part (b), is the standard error of x smaller? Why? Fill in the blanks: The standard error is _____________ for New York because the sample size is ___________ than for California.
(a) California E(x) = 22 inches σx̄ = 1.461 inches (b) New York E(x) = 42 inches σx̄ = 1.193 inches (c) smaller; larger Explanation: (a) E(x) = mean. The answer is in the question: "The state of California has a mean annual rainfall of 22 inches", so E(x) = 22 inches. σx̄ = standard error = σ / sqrt(n) with σ = 8 and n = 30: 8 / sqrt(30) = 1.461 inches (b) E(x) = mean. The answer is in the question: "whereas the state of New York has a mean annual rainfall of 42 inches", so E(x) = 42 inches. σx̄ = standard error = σ / sqrt(n) with σ = 8 and n = 45: 8 / sqrt(45) = 1.193 inches (c) A larger sample size with the same standard deviation leads to a smaller standard error
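A quick check of both standard errors, using an illustrative helper `std_error_mean` for σ / sqrt(n):

```python
import math

def std_error_mean(sigma: float, n: int) -> float:
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

se_california = std_error_mean(8, 30)
se_new_york = std_error_mean(8, 45)

print(f"California: {se_california:.3f} inches")
print(f"New York:   {se_new_york:.3f} inches")
# Same sigma, larger n, so New York has the smaller standard error.
print("New York smaller:", se_new_york < se_california)
```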
A production line operation is designed to fill cartons with laundry detergent to a mean weight of 32 ounces. A sample of cartons is periodically selected and weighed to determine whether underfilling or overfilling is occurring. If the sample data lead to a conclusion of underfilling or overfilling, the production line will be shut down and adjusted to obtain proper filling. (a)Choose the null and alternative hypotheses that will help in deciding whether to shut down and adjust the production line. H0: ? Ha: ? (b)Comment on the conclusion and the decision when H0 cannot be rejected. Fill in the blanks: Conclude that there ___________ statistical evidence that the production line is not operating properly. ___________ the production process to continue. (c)Comment on the conclusion and the decision when H0 can be rejected. Fill in the blanks: Conclude that there ___________ statistical evidence that overfilling or underfilling exists. ____________ down and adjust the production line.
(a) H0: u = 32 Ha: u ≠ 32 (b) is not; Allow (c) is; Shut Explanation: (a) H0 is the desired state, a mean fill of 32 ounces (u = 32); Ha is the undesired state, any mean other than 32 ounces (u ≠ 32). (b) When H0 cannot be rejected, the sample does not provide statistical evidence that the mean differs from 32 ounces, so the process is allowed to continue. (c) When H0 is rejected, the sample provides statistical evidence that u ≠ 32 (overfilling or underfilling), so the line should be shut down and adjusted.
A shareholders' group, in lodging a protest, claimed that the mean tenure for a chief executive officer (CEO) was at least nine years. A survey of companies reported in The Wall Street Journal found a sample mean tenure of x = 7.47 years for CEOs with a standard deviation of s = 6.38 years. (a)Choose the correct hypotheses that can be used to challenge the validity of the claim made by the shareholders' group. H0: ? Ha: ? (b)Assume that 85 companies were included in the sample. What is the p value for your hypothesis test? If required, round your answer to four decimal places. (c)At α = 0.01, what is your conclusion? Fill in the blanks: We ___________ the null hypothesis. We ____________ conclude that the mean tenure for a CEO is shorter than nine years.
(a) H0: u ≥ 9 years Ha: u < 9 years (b) 0.0149 (c) fail to reject; cannot Explanation: (a) H0: u ≥ 9 years (the claim as given; "at least 9 years" means ≥ 9 years) Ha: u < 9 years (what you are testing for; if the claim is not valid, the mean is less than 9 years) (b) Test statistic formula: t = (x-bar - hyp value) / (s / sqrt(n)) with x-bar = 7.47, hyp value = 9, n = 85, s = 6.38. t = (7.47 - 9) / (6.38 / sqrt(85)) = -2.211 DOF = 85 - 1 = 84 p-value = T.DIST(-2.211,84,TRUE) = 0.0149 (c) If the p-value is less than or equal to α, reject the null hypothesis; if it is greater than α, do not reject it. Here p (.0149) is greater than α (.01), so you fail to reject the null hypothesis.
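For readers without Excel's T.DIST, here is a rough standard-library sketch of the same lower-tail p-value. The helpers `t_pdf` and `t_cdf` are illustrative: they integrate the t density numerically with Simpson's rule, which is an approximation and not how Excel computes the value.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t: float, df: int, steps: int = 2000) -> float:
    """P(T <= t): Simpson's rule on [0, |t|] plus symmetry (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

x_bar, mu0, s, n = 7.47, 9, 6.38, 85
t_stat = (x_bar - mu0) / (s / math.sqrt(n))
p_value = t_cdf(t_stat, n - 1)           # lower-tail test (Ha: u < 9)

print(f"t = {t_stat:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value <= 0.01 else "Fail to reject H0")
```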
The Wall Street Journal reported that the age at first startup for 35% of entrepreneurs was 29 years of age or less and the age at first startup for 65% of entrepreneurs was 30 years of age or more. (a)Suppose a sample of 200 entrepreneurs will be taken to learn about the most important qualities of entrepreneurs. Show the sampling distribution of p where p is the sample proportion of entrepreneurs whose first startup was at 29 years of age or less. If required, round your answers to four decimal places. np = ? n (1 - p) = ? E(p-bar) = ? standard error of p-bar = ? (b)Suppose a sample of 200 entrepreneurs will be taken to learn about the most important qualities of entrepreneurs. Show the sampling distribution of p where p is now the sample proportion of entrepreneurs whose first startup was at 30 years of age or more. If required, round your answers to four decimal places. np = ? n (1 - p) = ? E(p-bar) = ? standard error of p-bar = ? (c)Are the standard errors of the sampling distributions different in parts (a) and (b)? Yes or No? Justify your answer: Fill in the blanks: The sample size and the product of p(1 - p) are ____________ in both calculations.
(a) np = 70 n (1 - p) = 130 E(p-bar) = .35 standard error of p-bar = .0337 (b) np = 130 n (1 - p) = 70 E(p-bar) = .65 standard error of p-bar = .0337 (c) No; the same Explanation: (a) np = n * p with n = 200 and p = 35% or .35: 200 * .35 = 70 n (1 - p) = 200 (1 - .35) = 130 E(p-bar) = p = .35 standard error of p-bar = sqrt(p(1 - p) / n) = sqrt(.35(1 - .35) / 200) = .0337 (b) np = n * p with n = 200 and p = 65% or .65: 200 * .65 = 130 n (1 - p) = 200 (1 - .65) = 70 E(p-bar) = p = .65 standard error of p-bar = sqrt(p(1 - p) / n) = sqrt(.65(1 - .65) / 200) = .0337 (c) No (.0337 for both); the same (.35(1 - .35) = .65(1 - .65) = .2275)
For the year 2010, 33% of taxpayers with adjusted gross incomes between $30,000 and $60,000 itemized deductions on their federal income tax return. The mean amount of deductions for this population of taxpayers was $16,642. Assume that the standard deviation is σ = $2,260. If required, round your answer to two decimal places. (a)What are the sampling distributions of x for itemized deductions for this population of taxpayers for each of the following sample sizes: 30, 50, 100, and 400? E(x) = n = 30, standard deviation of x-bar = ? n = 50, standard deviation of x-bar = ? n = 100, standard deviation of x-bar = ? n = 400, standard deviation of x-bar = ? (b)What is the advantage of a larger sample size when attempting to estimate the population mean? Fill in the blanks: A larger sample _____________ the standard error and results in a(n) ______________ precise estimate of the population mean.
(a) E(x) = $16,642 n = 30, standard deviation of x-bar = $412.62 n = 50, standard deviation of x-bar = $319.61 n = 100, standard deviation of x-bar = $226 n = 400, standard deviation of x-bar = $113 (b) reduces; more Explanation: (a) E(x) = mean. The answer is in the question: "The mean amount of deductions for this population of taxpayers was $16,642." μ = $16,642 σ = $2,260 standard error = σ / sqrt(n) n = 30: 2,260 / sqrt(30) = $412.62 n = 50: 2,260 / sqrt(50) = $319.61 n = 100: 2,260 / sqrt(100) = $226 n = 400: 2,260 / sqrt(400) = $113 (b) As n increases, the standard error decreases.
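The pattern in part (b) is easy to see by computing the standard error for each sample size in a loop. This is a small sketch with illustrative variable names.

```python
import math

sigma = 2260
sample_sizes = [30, 50, 100, 400]

# Standard error of x-bar for each n: sigma / sqrt(n).
std_errors = {n: sigma / math.sqrt(n) for n in sample_sizes}

for n, se in std_errors.items():
    print(f"n = {n:>3}: standard error = ${se:.2f}")
```

Quadrupling the sample size from 100 to 400 halves the standard error, since it shrinks with the square root of n.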
The Economic Policy Institute periodically issues reports on wages of entry-level workers. The institute reported that entry-level wages for male college graduates were $21.68 per hour and for female college graduates were $18.80 per hour in 2011. Assume that the standard deviation for male graduates is $2.30, and for female graduates it is $2.05. (a)What is the sampling distribution of x for a random sample of 50 male college graduates? If required, round your answer to four decimal places. (b)What is the sampling distribution of x for a random sample of 50 female college graduates? If required, round your answer to four decimal places. (c)In which of the preceding two cases, part (a) or part (b), is the standard error of x smaller? Why? Fill in the blanks: The standard error is smaller for ___________ college graduates because the standard deviation of entry level hourly wage is ____________ for female graduates than it is for male graduates.
(a) .3253 (b) .2899 (c) Part b, female; smaller Explanation: σx̄ = standard error standard error = σ / sqrt(n) (a) σ = 2.3 n = 50 2.3 / sqrt(50) = .3253 (b) σ = 2.05 n = 50 2.05 / sqrt(50) = .2899 (c) .2899 is smaller than .3253 .2899 is female, .3253 is male
One of the questions in the Pew Internet & American Life Project asked adults if they used the Internet at least occasionally. The results showed that 454 out of 478 adults aged 18-29 answered Yes; 741 out of 833 adults aged 30-49 answered Yes; and 1,058 out of 1,644 adults aged 50 and over answered Yes. If required, round your answers to four decimal places. (a)Develop a point estimate of the proportion of adults aged 18-29 who use the Internet. (b)Develop a point estimate of the proportion of adults aged 30-49 who use the Internet. (c)Develop a point estimate of the proportion of adults aged 50 and over who use the Internet. (d)Comment on any apparent relationship between age and Internet use. Fill in the blanks: _____________ adults are more likely to use the Internet. (e)Suppose your target population of interest is that of all adults (18 years of age and over). Develop an estimate of the proportion of that population who use the Internet.
(a) .9498 (b) .8896 (c) .6436 (d) Younger (e) .7624 Explanation: p-bar = the total number of successes divided by the total number of trials (a) 454 / 478 = .9498 (b) 741 / 833 = .8896 (c) 1058 / 1644 = .6436 (d) Younger because .9498 is a larger number than the others (e) (454 + 741 + 1058) / (478 + 833 + 1644) = .7624
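A small sketch of the same point estimates, including the pooled estimate in part (e). The dictionary layout and names are illustrative.

```python
# (number answering Yes, number surveyed) for each age group
groups = {
    "18-29": (454, 478),
    "30-49": (741, 833),
    "50+": (1058, 1644),
}

# Point estimate for each group: successes / trials.
estimates = {age: yes / total for age, (yes, total) in groups.items()}

# Pooled estimate for all adults: total successes / total trials.
total_yes = sum(yes for yes, _ in groups.values())
total_n = sum(total for _, total in groups.values())
pooled = total_yes / total_n

for age, p_bar in estimates.items():
    print(f"{age}: {p_bar:.4f}")
print(f"all adults: {pooled:.4f}")
```

Note that the pooled estimate weights each group by its sample size; it only estimates the proportion for all adults if the age mix in the sample resembles the age mix in the population.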
The College Board reported the following mean scores for the three parts of the SAT: Critical Reading - 502 Mathematics - 515 Writing - 494 Assume that the population standard deviation on each part of the test is σ = 100. If required, round your answers to two decimal places. (a)For a random sample of 30 test takers, what is the sampling distribution of x for scores on the Critical Reading part of the test? (b)For a random sample of 60 test takers, what is the sampling distribution of x for scores on the Mathematics part of the test? (c)For a random sample of 90 test takers, what is the sampling distribution of x for scores on the Writing part of the test?
(a) 18.26 (b) 12.91 (c) 10.54 Explanation: σx̄ = standard error standard error = σ / sqrt(n) (a) σ = 100 n = 30 100 / sqrt(30) = 18.26 (b) σ = 100 n = 60 100 / sqrt(60) = 12.91 (c) σ = 100 n = 90 100 / sqrt(90) = 10.54
The national mean annual salary for a school administrator is $90,000 a year. A school official took a sample of 25 school administrators in the state of Ohio to learn about salaries in that state to see if they differed from the national average. Salaries: 77600 76000 90700 97200 90700 101800 78700 81300 84200 97600 77500 75700 89400 84300 78700 84600 87700 103400 83800 101300 94700 69200 95400 61500 68800 (a)Choose the hypotheses that can be used to determine whether the population mean annual administrator salary in Ohio differs from the national mean of $90,000. H0: ? Ha: ? (b)The sample data for 25 Ohio administrators is contained in the file named Administrator. What is the p value for your hypothesis test in part (a)? If required, round your answer to four decimal places. Do not round your intermediate calculations. (c)At α = 0.05, can your null hypothesis be rejected? Yes or No? What is your conclusion? Fill in the blanks: There is _______________ evidence to conclude that the national mean annual salary for a school administrator in the state of Ohio _________________ the national average.
(a) H0: u = 90,000; Ha: u ≠ 90,000 (b) .0426 (c) Yes; sufficient; differs Explanation: (a) H0: u = 90,000 (what is given); Ha: u ≠ 90,000 (what you are testing for: whether the mean is not equal to 90,000) (b) Test statistic formula: t = (x-bar - hyp value) / (s / sqrt(n)) with x-bar = 85,272, hyp value = 90,000, n = 25, s = sqrt(Σ(xi - x̄)^2 / (n - 1)) = 11,039.23. t = (85,272 - 90,000) / (11,039.23 / sqrt(25)) = -2.14 DOF = 25 - 1 = 24 p-value = T.DIST.2T(abs(-2.14), 24) = .0426 (c) If the p-value is less than or equal to α, reject the null hypothesis. Here p (.0426) is less than α (.05), so H0 is rejected.
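The sample statistics and test statistic can be recomputed from the 25 salaries. As a sketch, the decision below uses the critical value method with t = 2.064 (two-tailed, α = .05, 24 degrees of freedom), a standard t-table value that is an added assumption and does not appear in the card.

```python
import math
from statistics import mean, stdev

salaries = [77600, 76000, 90700, 97200, 90700, 101800, 78700, 81300, 84200,
            97600, 77500, 75700, 89400, 84300, 78700, 84600, 87700, 103400,
            83800, 101300, 94700, 69200, 95400, 61500, 68800]

MU_0 = 90_000    # hypothesized national mean
T_CRIT = 2.064   # t(0.025), 24 df, from a t table (assumed lookup value)

n = len(salaries)
x_bar = mean(salaries)
s = stdev(salaries)                       # sample standard deviation
t_stat = (x_bar - MU_0) / (s / math.sqrt(n))
reject = abs(t_stat) > T_CRIT             # two-tailed decision rule

print(f"x-bar = {x_bar}, s = {s:.2f}, t = {t_stat:.2f}")
print("Reject H0" if reject else "Do not reject H0")
```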
Which is cheaper: eating out or dining in? The mean cost of a flank steak, broccoli, and rice bought at the grocery store is $13.04. A sample of 100 neighborhood restaurants showed a mean price of $12.65 and a standard deviation of $2 for a comparable restaurant meal. (a)Choose the appropriate hypotheses for a test to determine whether the sample data support the conclusion that the mean cost of a restaurant meal is less than fixing a comparable meal at home. H0: ? Ha: ? (b)Using the sample from the 100 restaurants, what is the p value? If required, round your answer to four decimal places. (c)At α = 0.05, what is your conclusion? Fill in the blanks: We _______________ the null hypothesis. We __________ conclude that the cost of a restaurant meal is significantly cheaper than a comparable meal fixed at home.
(a) H0: u ≥ 13.04; Ha: u < 13.04 (b) .0270 (c) reject; can Explanation: (a) H0: u ≥ 13.04 (the mean restaurant meal costs at least as much as the $13.04 home-cooked meal); Ha: u < 13.04 (what you are testing for: the mean restaurant meal costs less than $13.04) (b) n = 100, x-bar = 12.65, s = 2.00. Standard error of x-bar = s / sqrt(n) = 2 / sqrt(100) = .20 t = (mean - hypothesized value) / standard error = (12.65 - 13.04) / .20 = -1.95 p-value = T.DIST(-1.95,99,TRUE) = .0270 (c) p = .0270, α = .05. When the p-value is less than or equal to the significance level, you reject the null hypothesis and conclude in favor of the alternative hypothesis (the mean cost of eating out is less than $13.04).
Suppose a new production method will be implemented if a hypothesis test supports the conclusion that the new method reduces the mean operating cost per hour. (a)Choose the appropriate null and alternative hypotheses if the mean cost for the current production method is $220 per hour. H0: ? Ha: ? (b)What is the Type I error in this situation? What are the consequences of making this error? Fill in the blanks: The Type I error is _____________ H0 when it is ____________. This error occurs when it is concluded that the new production method ______________ the mean operating cost per hour, when in fact it does not. This error could lead to __________________ a production method that ___________________ to _______________ operating costs. (c) What is the Type II error in this situation? What are the consequences of making this error? Fill in the blanks: The Type II error is ____________ H0 when it is _____________. This error occurs when it is concluded that the new production method _____________ the mean operating cost per hour, when in fact it does. This error could lead to __________________ a production method that _______________ have helped to ______________ operating costs.
(a) H0: u ≥ 220; Ha: u < 220 (b) rejecting; true; reduces; implementing; does not help; reduce (c) accepting; false; does not reduce; not implementing; would; reduce Explanation: (a) H0 is what you are testing against (if it is greater than or equal to 220); Ha is what you are testing for (if it is less than 220)
A simple random sample of 5 months of sales data provided the following information: Month & Units Sold 1 - 94 2 - 80 3 - 85 4 - 94 5 - 92 (a)Develop a point estimate of the population mean number of units sold per month. x-bar = ? (b)Develop a point estimate of the population standard deviation. If required, round your answer to two decimal places. s = ?
(a) x-bar = 89 (b) s = 6.24 Explanation: (a) x-bar = mean: 94 + 80 + 85 + 94 + 92 = 445 and 445 / 5 = 89 (b) s = sqrt(Σ(xi - x̄)^2 / (n - 1)) = sqrt(((94 - 89)^2 + (80 - 89)^2 + (85 - 89)^2 + (94 - 89)^2 + (92 - 89)^2) / (5 - 1)) = sqrt(156 / 4) = 6.24
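Both point estimates are available directly from Python's `statistics` module; `stdev` uses the n - 1 divisor, matching the sample standard deviation formula above.

```python
from statistics import mean, stdev

units_sold = [94, 80, 85, 94, 92]

x_bar = mean(units_sold)    # point estimate of the population mean
s = stdev(units_sold)       # point estimate of the population standard deviation

print(f"x-bar = {x_bar}")
print(f"s = {s:.2f}")
```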
The president of Doerman Distributors, Inc., believes that 30% of the firm's orders come from first-time customers. A random sample of 100 orders will be used to estimate the proportion of first-time customers. Assume that the president is correct and p = 0.30. What is the sampling distribution of p for this study? If required, round your answer to four decimal places.
.0458 Explanation: standard error of p-bar = sqrt(p(1 - p) / n) with p = .30 and n = 100: sqrt(.30(1 - .30) / 100) = .0458
level of significance =
1 - confidence coefficient
Three possible reasons that a sample mean differs from population mean
1. Sampling error. 2. Nonsampling error. 3. Population mean has changed since prior study.
The International Air Transport Association surveys business travelers to develop quality ratings for transatlantic gateway airports. The maximum possible rating is 10. Suppose a simple random sample of 50 business travelers is selected and each traveler is asked to provide a rating for the Miami International Airport. The ratings obtained from the sample of 50 business travelers follow. Ratings: 6, 4, 6, 8, 7, 7, 6, 3, 3, 8, 10, 4, 8, 7, 8, 7, 5, 9, 5, 8, 4, 3, 8, 5, 5, 4, 4, 4, 8, 4, 5, 6, 2, 5, 9, 9, 8, 4, 8, 9, 9, 5, 9, 7, 8, 3, 10, 8, 9, 6 Develop a 95% confidence interval estimate of the population mean rating for Miami. If required, round your answers to two decimal places. Do not round intermediate calculations.
5.73 to 6.95 Explanation: Sample mean = 6.34 (sum / n). Sample standard deviation s = sqrt(Σ(xi - x̄)^2 / (n - 1)) = 2.163. Sample size n = 50. Standard error = s / sqrt(n) = 2.163 / sqrt(50) = 0.306. DOF = n - 1 = 49. Alpha = 1 - .95 = .05, so alpha/2 = .05/2 = .025 and the t score is 2.010 (from t table). Margin of error = t(a/2) * s / sqrt(n) = 2.010 * 2.163 / sqrt(50) = 0.6148. Formula: sample mean (+/-) margin of error: 6.34 - 0.6148 = 5.73 and 6.34 + 0.6148 = 6.95
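A sketch that recomputes the interval from the 50 ratings without rounding intermediate values. The t value 2.010 (49 degrees of freedom) is taken from the explanation as a table lookup.

```python
import math
from statistics import mean, stdev

ratings = [6, 4, 6, 8, 7, 7, 6, 3, 3, 8, 10, 4, 8, 7, 8, 7, 5, 9, 5, 8,
           4, 3, 8, 5, 5, 4, 4, 4, 8, 4, 5, 6, 2, 5, 9, 9, 8, 4, 8, 9,
           9, 5, 9, 7, 8, 3, 10, 8, 9, 6]

T_CRIT = 2.010  # t(0.025) with 49 degrees of freedom, from a t table

n = len(ratings)
x_bar = mean(ratings)
s = stdev(ratings)                        # sample standard deviation
margin = T_CRIT * s / math.sqrt(n)        # margin of error
low, high = x_bar - margin, x_bar + margin

print(f"x-bar = {x_bar:.2f}, s = {s:.3f}, margin = {margin:.4f}")
print(f"95% CI: {low:.2f} to {high:.2f}")
```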
tall data
A data set that has so many observations that traditional statistical inference has little meaning
wide data
A data set that has so many variables that simultaneous consideration of all variables is infeasible
significance tests
Applications of hypothesis testing that only control the Type I error
Point Estimation
Calculating sample mean, sample standard deviation, and sample proportion
We wish to show that more than 60% of the target population are in favor of making Big Light the official drink. A SRS of 300 is taken. 200 respond they think Big Light should be the official drink. Alpha = .05
Ho: P ≤ .6 Ha: P > .6
nonsampling error
Deviations of the sample from the population that occur for reasons other than random sampling
Expected Value of p̄
E(p̄) = p
The Expected Value of x̄ where...
E ( x̄ ) = μ
Randomization
Each element has the same probability of being selected
Independence
Each element is selected independently
nonresponse error
Even when the sample is taken from the appropriate population, nonsampling error can occur when segments of the target population are systematically underrepresented or overrepresented in the sample
Fill in the blanks with the correct word: __________________ is a necessary ingredient of sound statistical practice.
Good judgement
The three forms for a hypothesis test about a population proportion are:
H0: p ≥ p0, Ha: p < p0; H0: p ≤ p0, Ha: p > p0; H0: p = p0, Ha: p ≠ p0
The manager of the Danvers-Hilton Resort Hotel stated that the mean guest bill for a weekend is $600 or less. A member of the hotel's accounting staff noticed that the total charges for guest bills have been increasing in recent months. The accountant will use a sample of future weekend guest bills to test the manager's claim. Which form of the hypotheses should be used to test the manager's claim? H0: ? Ha: ?
H0: u ≤ 600 Ha: u > 600 Explanation: H0 = what is given/desired = less than or equal to 600 (u ≤ 600) Ha = the claim that you are testing = if it is greater than 600 (u > 600)
Because of high production-changeover time and costs, a director of manufacturing must convince management that a proposed manufacturing method reduces costs before the new method can be implemented. The current production method operates with a mean cost of $220 per hour. A research study will measure the cost of the new method over a sample production period. Develop the null and alternative hypotheses most appropriate for this study. H0: ? Ha: ?
H0: u ≥ 220 Ha: u < 220 Explanation: H0 = what you don't want from your test = anything other than what you want from your test (u ≥ 220) Ha = what you want from your test = less than 220 (u < 220)
Central Limit Theorem
In selecting random samples of size n from a population, the sampling distribution of the sample mean can be approximated by a normal distribution as the sample size becomes large.
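A small simulation can illustrate the theorem. This sketch assumes an exponential population with mean 1 (a strongly skewed, non-normal shape), a sample size of 50, and 2,000 repeated samples; all of these are arbitrary illustrative choices.

```python
import math
import random
from statistics import mean, stdev

random.seed(0)               # fixed seed so the run is repeatable
n, repetitions = 50, 2000

# Draw repeated samples of size n from an exponential population
# (mean 1, standard deviation 1) and record each sample mean.
sample_means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(repetitions)
]

print(f"mean of sample means:  {mean(sample_means):.3f} (population mean 1.0)")
print(f"std dev of sample means: {stdev(sample_means):.3f} "
      f"(sigma / sqrt(n) = {1 / math.sqrt(n):.3f})")
```

The sample means cluster around the population mean with a spread close to σ / sqrt(n), even though the population itself is far from normal.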
For many years businesses have struggled with the rising cost of health care. But recently, the increases have slowed due to less inflation in health care prices and employees paying for a larger portion of health care benefits. A recent Mercer survey showed that 52% of U.S. employers were likely to require higher employee contributions for health care coverage. Suppose the survey was based on a sample of 600 companies. Compute the margin of error and a 95% confidence interval for the proportion of companies likely to require higher employee contributions for health care coverage. If required, round your answer to four decimal places. Round intermediate calculations to four decimal places. Margin of Error: ? Confidence Interval: ____________ to ___________
Margin of Error: 0.0400 Confidence Interval: .48 to .56 Explanation: Alpha = 1 - .95 = .05, so alpha/2 = .05/2 = .025 and the z score is 1.96 (from the standard normal table; an interval estimate for a proportion uses z, so no degrees of freedom are needed). Standard error of p-bar = sqrt(p-bar(1 - p-bar) / n) with p-bar = .52 and n = 600: sqrt(.52(1 - .52) / 600) = 0.0204. Margin of error = z score * SE = 1.96 * 0.0204 = .0400. Formula: p-bar (+/-) margin of error: .52 - .0400 = .48 and .52 + .0400 = .56
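A sketch of the same margin-of-error calculation, with the z value taken from Python's `statistics.NormalDist` instead of a table. Variable names are illustrative.

```python
import math
from statistics import NormalDist

p_bar, n, confidence = 0.52, 600, 0.95

alpha = 1 - confidence
z = NormalDist().inv_cdf(1 - alpha / 2)            # about 1.96
std_error = math.sqrt(p_bar * (1 - p_bar) / n)     # standard error of p-bar
margin = z * std_error
low, high = p_bar - margin, p_bar + margin

print(f"z = {z:.2f}, SE = {std_error:.4f}, margin of error = {margin:.4f}")
print(f"95% CI: {low:.4f} to {high:.4f}")
```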
Example of CI for Population Mean: I take a SRS of 49 books and find the sample mean price is $150 with a sample standard deviation of $28. Estimate the 95% confidence interval for true population mean price.
The 95% Confidence Interval for M is $141.96 to $158.04
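A short Python sketch of how this interval could be reproduced. The critical value 2.011 (DOF = 48, Alpha/2 = .025) is typed in from a t table as an assumption, since the standard library has no t distribution; the function name mean_ci is illustrative only.

```python
from math import sqrt


def mean_ci(x_bar, s, n, t_crit):
    """Confidence interval for a mean when sigma is unknown: x_bar +/- t * s / sqrt(n)."""
    me = t_crit * s / sqrt(n)   # margin of error
    return x_bar - me, x_bar + me


T_CRIT_48_DOF = 2.011  # assumed t table value for DOF = 49 - 1 = 48, 95% confidence

lower, upper = mean_ci(150, 28, 49, T_CRIT_48_DOF)
print(round(lower, 2), round(upper, 2))
```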
big data
The data sets that are generated are so large or complex that current data processing capacity and/or analytic methods are not adequate for analyzing the data
Start with X bar ± Z * σ / sqrt(n): How can you calculate σ if you do not know µ? X bar ± Z * s / sqrt(n) There are now 2 sources of standard error, σ and s. To adjust for the standard error using s to estimate σ what new distribution do we use?
To adjust for the standard error using s to estimate σ we use a new distribution - the t distribution
As the number of degrees of freedom for a t distribution increases, the difference between the t distribution and the standard normal distribution _____. Multiple Choice: a. becomes smaller b. fluctuates c. stays the same d. becomes larger
a. becomes smaller
An article indicated that the biggest issue facing e-retailers is the ability to turn browsers into buyers. The article stated that less than 10% of browsers buy something from a particular website. A SRS of 2000 browsers was taken of which 180 made a purchase. Let alpha = .10. Ho: P ≥ 0.1 Ha: P < 0.1 a) The test statistic = ___________________________ b) The critical value = ____________________________ c) P-value = ________________ Multiple Choice: d) If the decision is Reject Ho, which of the following statements should we conclude? 1) Over 10% of all browsers will likely make a purchase. 2) Less than 10% of all browsers will likely make a purchase. 3) At least 10% of all browsers will likely make a purchase. 4) 10% or less of all browsers will likely make a purchase.
a) -1.49 b) -1.28 c) .0681 d) 2 Explanation: (a) Test Statistic Formula: tcalc = (p bar - Hyp value) / std error Std error = sqrt(hyp value * (1 - hyp value) / n) Std error = sqrt(0.1(1 - 0.1) / 2000) Std error = .006708 p bar = 180 / 2000 = .09 tcalc = (.09 - .1) / .006708 = -1.49 (b) DOF = n - 1 = 2000 - 1 = 1999 a = .10 (one-tailed, so do not divide a by 2) t score left tailed = -1.28 (c) Left-tailed test, so p-value = area to the left of -1.49 = 1 - .9319 = .0681 (d) Rejecting Ho means we conclude Ha, which is P < .10
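The three quantities on this card (test statistic, critical value, p-value) can be checked with a small Python sketch. It assumes the normal approximation for a one-tailed proportion test; the function name and the tail argument are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_test(successes, n, p0, alpha, tail):
    """One-tailed test of a proportion; returns (statistic, critical value, p-value)."""
    p_bar = successes / n
    se = sqrt(p0 * (1 - p0) / n)      # standard error uses the hypothesized value
    z = (p_bar - p0) / se
    std_normal = NormalDist()
    if tail == "left":
        return z, std_normal.inv_cdf(alpha), std_normal.cdf(z)
    if tail == "right":
        return z, std_normal.inv_cdf(1 - alpha), 1 - std_normal.cdf(z)
    raise ValueError("tail must be 'left' or 'right'")


# e-retailer card: 180 buyers out of 2000, Ho: P >= 0.1, alpha = .10
z, crit, p_value = one_proportion_test(180, 2000, 0.10, 0.10, "left")
decision = "Reject Ho" if z < crit else "Do not reject Ho"
print(round(z, 2), round(crit, 2), round(p_value, 3), decision)
```

Because the statistic is more extreme than the critical value (and the p-value is below alpha), both methods lead to the same decision.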
A research study will be done to see if the average number of hours undergraduate students study per week differs from 14. A random sample of 36 students showed an average of 12 hours per week studying with a standard deviation of 2.46 hours. Let α = .05 to test the following hypotheses. Ho: μ = 14 Ha: μ ≠ 14 a) The test statistic = _________________________________ b) The decision would be: Multiple Choice: 1) DNR Ho 2) Reject Ho 3) DNR Ha 4) Reject Ha c) If the decision is Reject Ho, what type of error could we have made?
a) -4.88 b) 2 c) Type I Explanation: (a) Test Statistic Formula: (X bar - Hyp value) / (s / sqrt(n)) (12 - 14) / (2.46 / sqrt(36)) = -4.88 (b) DOF = 36 - 1 = 35 and alpha/2 = .025, so the critical value (t score) = ± 2.030. If the test statistic is more extreme in the direction of the alternative than the critical value, reject the null hypothesis in favor of the alternative hypothesis. If the test statistic is less extreme than the critical value, do not reject the null hypothesis. -4.88 is more extreme than -2.030, so we reject Ho. (c) A Type I Error occurs when we Reject Ho when, in fact, Ho is True. In this case, we may have mistakenly rejected a true null hypothesis.
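A minimal Python sketch of the critical value method for a two-tailed test of a mean. The critical value 2.030 is taken from the t table (DOF = 35) as an assumption, and the helper names are illustrative.

```python
from math import sqrt


def t_statistic(x_bar, mu0, s, n):
    """Test statistic for a mean: (x_bar - hypothesized value) / (s / sqrt(n))."""
    return (x_bar - mu0) / (s / sqrt(n))


def two_tailed_decision(t_calc, t_crit):
    """Reject Ho only when the statistic is more extreme than the critical value."""
    return "Reject Ho" if abs(t_calc) > t_crit else "Do not reject Ho"


T_CRIT_35_DOF = 2.030  # assumed t table value, alpha/2 = .025, DOF = 35

# Study-hours card: n = 36, x_bar = 12, s = 2.46, Ho: mu = 14
t_study = t_statistic(12, 14, 2.46, 36)
print(round(t_study, 2), two_tailed_decision(t_study, T_CRIT_35_DOF))

# Store-demand card: n = 36, x_bar = 38.3, s = 10.52, Ho: mu = 40
t_demand = t_statistic(38.3, 40, 10.52, 36)
print(round(t_demand, 2), two_tailed_decision(t_demand, T_CRIT_35_DOF))
```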
A manufacturer claims the proportion of products that do not meet minimum government standards is less than 3%. To test the manufacturer's claim a random sample of 600 products is taken of which 15 did not meet the standard. Let a= .05. Ho: P ≥ .03 Ha: P < .03 a) The sample proportion = ___________________________ b) The critical value = _________________________ c) The p-value = __________________ d) If the decision is Reject Ho, which of the following statements should we conclude? Multiple Choice: 1) Less than 3% do not meet standards. 2) We don't have enough evidence to say 3% or more do not meet standards. 3) More than 3% do not meet standards. 4) 3% or less do not meet standards.
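a) .025 b) -1.647 c) .2365 d) 1 Explanation: (a) p bar = x / n p bar = 15 / 600 p bar = .025 (b) Critical Value DOF = n - 1 = 600 - 1 = 599 Alpha = .05 (one-tailed, so do not divide by 2) Right tailed = 1.647 so Left Tailed = -1.647 (c) Std error = sqrt(.03(1 - .03) / 600) = .006964 tcalc = (.025 - .03) / .006964 = -.72 p value = area to the left of -.72 = about .2365 (d) If we reject Ho, we reject the statement that 3% or more do not meet standards, so we conclude Ha: less than 3% do not meet standards.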
A new brand of yogurt is being market tested. A random sample of 300 taste-test the new yogurt and 75 said they liked it. Construct a 90% confidence interval for the population proportion that will like the new yogurt. a) sample proportion (p bar) = _______________ b) t score = __________________ c) Margin of Error = ________________ d) Lower limit = ________________ Upper limit = __________________
a) .25 b) 1.65 c) .04 d) .21 to .29 Explanation: (a) p bar = number of successes / the total number of trials p bar = 75 / 300 p bar = .25 (b) DOF = 300 - 1 = 299 Alpha = 1 - .90 = .10 Alpha/2 = .10/2 = .05 t score = 1.65 (from t table) (c) Margin of Error = t score * sqrt(p bar * (1 - p bar) / n) 1.65 * sqrt(.25 * (1 - .25) / 300) = .04125 or .04 rounded (d) Formula: p-bar (+/-) margin of error .25 - .04 = .21 (lower limit) .25 + .04 = .29 (upper limit)
Insurance company records indicate that 10% of its policyholders file claims involving theft of personal property from their homes. Suppose a random sample of 400 policyholders is selected. a) What is the probability the sample proportion is more than 8%? b) What is the probability the sample proportion is less than 11%? c) P (.07 < p bar < .13) =
a) .9088 b) .7475 c) .9545 Explanation: Standard error for all three parts: σp̄ = √(p(1 - p) / n) = √(.10(1 - .10) / 400) = .015 (a) Z score = (p bar - p) / σp̄ = (.08 - .10) / .015 = -1.33333 From the Z table: P(p bar > .08) = 1 - P(p bar < .08) = 0.90879 (b) Z score = (.11 - .10) / .015 = 0.66667 From the Z table: P(p bar < .11) = 0.7475 (c) Find the Z score of both values: Z score for .07 = (.07 - .10) / .015 = -2.0, so P(p bar < .07) = 0.02275 Z score for .13 = (.13 - .10) / .015 = 2.0, so P(p bar < .13) = 0.97725 P(.07 < p bar < .13) = .97725 - .02275 = .9545
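The three probabilities can be checked by modelling the sampling distribution of p bar as a normal distribution with mean p and standard deviation sqrt(p(1 - p)/n). A minimal Python sketch, assuming that approximation holds; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.10, 400
sigma_p_bar = sqrt(p * (1 - p) / n)            # standard error = .015
sampling_dist = NormalDist(mu=p, sigma=sigma_p_bar)

prob_more_than_8 = 1 - sampling_dist.cdf(0.08)                    # part (a)
prob_less_than_11 = sampling_dist.cdf(0.11)                       # part (b)
prob_between = sampling_dist.cdf(0.13) - sampling_dist.cdf(0.07)  # part (c)

print(round(prob_more_than_8, 4), round(prob_less_than_11, 4), round(prob_between, 4))
```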
I wish to establish that the average business student exceeded 900 points on a competency test. I took a simple random sample of 36 and found the sample mean was 915 with a standard deviation of 78. Let alpha = .05 Ho: μ ≤ 900 Ha: μ > 900 a) test statistic = _____________________ b) P-value = ___________________ c) Critical value = _____________________ d) IF the decision is DNR Ho, which one of the following statements should we conclude? Multiple Choice: 1) We do not have enough evidence to say it is more than 900. 2) We have enough evidence to say the average score is greater than 900. 3) The average score is 900 or more. 4) The average score is 900.
a) 1.15 b) .129 c) 1.69 d) 1 Explanation: (a) Test Statistic Formula: (X bar - Hyp value) / (s / sqrt(n)) (915 - 900) / (78 / sqrt(36)) = 1.153846 or 1.15 rounded (b) Right Tailed p value of 1.15 From the z table: Z of 1.15 = .8749 1 - .8749 = .1251 Using the t distribution with DOF = 35 instead gives about .129, the answer used here (c) Critical Value (t score) DOF = n - 1 = 36 - 1 = 35 a = .05 (don't divide a by 2 for this) t score = 1.69 (from t table) (d) If we do not reject H0, we conclude that we do not have significant evidence to show that Ha is true. We do not conclude that H0 is true.
In the past, the average age of employees of a large corporation has been 40 years. Recently, the company has been hiring older individuals. In order to determine whether there has been an increase in the average age of all the employees, a SRS of 25 employees was selected. The average age in the sample was 42 years with a standard deviation of 5 years. Assume population is normally distributed. Let α = .05. Ho: μ ≤ 40 Ha: μ > 40 a) The test statistic = _________________________________ b) The critical value = _______________________________ c) The p-value = ___________________ d) The decision is ____________________________ Multiple Choice: e) If the decision is Do Not Reject Ho, what type of error may have been committed? 1) Type I error 2) Type II error 3) Type III error 4) None of the above
a) 2 b) 1.71 c) .0285 d) Reject Ho e) 2 Explanation: (a) Test Statistic Formula: (X bar - Hyp value) / (s / sqrt(n)) (42 - 40) / (5 / sqrt(25)) = 2.0 (b) DOF = 25 - 1 = 24 Alpha = .05 (one-tailed, so do not divide by 2) t score = 1.711 (from t table) (c) Right-tail area for t = 2.0 with DOF = 24: on the t table, 2.0 falls between 1.711 (area .05) and 2.064 (area .025), so the p-value is between .025 and .05; software gives .0285 (d) You can reject Ho at the significance level 0.05, because the p-value (.0285) does not exceed 0.05 and the test statistic (2.0) is more extreme than the critical value (1.711). (e) A Type II error occurs when one fails to reject a null hypothesis that is actually false. If the decision is Do Not Reject Ho, that is the only type of error that could have been made.
Suppose a random sample of 496 women showed 25% were in favor of more part-time jobs. Test to see if the proportion of women favoring part-time jobs is greater than 20%. Let alpha =.05. Ho: P ≤ .2 Ha: P > .2 a) The test statistic = ___________________________ b) The critical value = ____________________________ c) P-value = ________________ d) If the decision is Reject Ho, which of the following statements should we conclude? 1) More than 20% are in favor of more part time jobs. 2) Less than 20% are in favor of more part time jobs. 3) 20% or more are in favor of more part time jobs. 4) 20% or less are in favor of more part time jobs.
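a) 2.78 b) 1.65 c) .0028 d) 1 Explanation: (a) Test Statistic Formula: tcalc = (p bar - Hyp value) / std error Std error = sqrt(hyp value * (1 - hyp value) / n) Std error = sqrt(0.20(1 - 0.20) / 496) Std error = .01796 tcalc = (.25 - .2) / .01796 = 2.78 (b) DOF = n - 1 = 496 - 1 = 495 alpha = .05 (one-tailed, so do not divide by 2) 1.65 = t score (c) Right-tailed test, so p-value = area to the right of 2.78 From the z table: 1 - .9973 = .0027; the t distribution with DOF = 495 gives about .0028 (d) Rejecting Ho means we conclude Ha, which is P > .2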
An insurance company randomly sampled 64 policyholders. The sample mean age for a policyholder was 45 with a standard deviation of 4.2 years. Construct a 95% confidence interval for the population mean age of a policyholder. a) Lower limit = _____________________ Upper limit = ____________________ b) If n increases, what happens to the width of the confidence interval? Multiple Choice: 1) it would get wider 2) it would get narrower 3) it would not change at all 4) None of the above
a) 44 to 46 b) 2 Explanation: Mean = 45 S = 4.2 n = 64 (a) DOF = n - 1 = 64 - 1 = 63 Alpha = 1 - .95 = .05 Alpha/2 = .05/2 = .025 t score = 1.998 (from t table) Margin of Error = t score * s / sqrt(n) 1.998 * 4.2/sqrt(64) = 1.05 Formula: Sample Mean (+/-) Margin of Error 45 - 1.05 = 43.95 or 44 rounded 45 + 1.05 = 46.05 or 46 rounded (b) From the formula, it should be clear that: - The width of the confidence interval decreases as the sample size increases. - The width increases as the standard deviation increases. - The width increases as the confidence level increases (0.5 towards 0.99999 - stronger). - The width increases as the significance level decreases (0.5 towards 0.00000...01 - stronger).
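The width relationships listed in (b) can be seen numerically. The Python sketch below holds s = 4.2 fixed and varies n and the confidence level; it uses the normal critical value as a simplifying assumption (the card itself uses the t table), and margin_of_error is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(s, n, confidence=0.95):
    """Approximate margin of error z * s / sqrt(n) (normal critical value assumed)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * s / sqrt(n)


# Width = 2 * margin of error; quadrupling n cuts the width in half
widths = [2 * margin_of_error(4.2, n) for n in (64, 256, 1024)]
print([round(w, 3) for w in widths])

# Raising the confidence level (lowering alpha) widens the interval
width_95 = 2 * margin_of_error(4.2, 64, 0.95)
width_99 = 2 * margin_of_error(4.2, 64, 0.99)
print(width_99 > width_95)
```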
A producer of high quality teas wants to be sure they are filling tea bags with an average weight of 5.5 grams. Any underfilling or overfilling is undesirable. The correct set of hypotheses is Multiple Choice: a) Ho: M = 5.5, Ha: M ≠ 5.5 b) Ho: M ≤ 5.5, Ha: M > 5.5 c) Ho: M < 5.5, Ha: M ≥ 5.5 d) Ho: M ≠ 5.5, Ha: M = 5.5
a) Ho: M = 5.5, Ha: M ≠ 5.5 Explanation: The Alternative Hypothesis (Ha) is what is undesirable. What is undesirable is not 5.5 (any underfilling or overfilling). The Null Hypothesis (Ho) is what is desirable. What is desirable is 5.5.
Assembly time (in minutes) for a random sample of 10 parts is shown below. Assume the population is normally distributed. Provide a 95% confidence interval for the population mean assembly time. Minutes: 9, 6, 8, 11, 10, 6, 5, 12, 8, 10 Mean: 8.5 Standard Error: .734091 Median: 8.5 Mode: 6 Standard Deviation: 2.321398 Sample Variance: 5.3888889 Kurtosis: -1.12543 Skewness: -0.09992 Range: 7 Minimum: 5 Maximum: 12 Sum: 85 Count: 10 Confidence Level (95%): 1.660628 a) Lower limit = _______________ Upper limit = ____________________ b) Which of the following is the correct interpretation of this confidence interval? Multiple Choice: 1) We are 95% sure that the mean time is between 6.84 and 10.16 minutes. 2) 95% of the sample means are between 6.84 and 10.16 minutes. 3) The method used to calculate the confidence interval is correct 95% of the time (i.e., the interval will include the population mean). 4) The true mean time is 8.5 minutes. c) .95 is the probability the sampling error is __________ or less. Multiple Choice: 1) .05 2) 1.66 3) .73 4) .95
a) LL = 6.84 UL = 10.16 b) 3 c) 2 Explanation: (a) Margin of Error = 1.660628 Formula: Sample Mean (+/-) Margin of Error 8.5 - 1.660628 = 6.84 (Lower Limit) 8.5 + 1.660628 = 10.16 (Upper Limit)
A company believes demand for a new product will be 40 units per store. They randomly sample 36 stores and ask each store to specify an anticipated demand quantity. The sample mean is 38.3 units with a standard deviation of 10.52. Use a .05 level of significance to test the following. Ho: µ = 40 Ha: µ ≠ 40 a) Critical value = _____________________ b) Test statistic = ______________________ c) P-value = _________________ d) Decision ______________________________ e) Should they stay with a production plan for 40 units per store? Yes or No
a) ± 2.03 b) -.97 c) .3389 d) DNR Ho e) Yes Explanation: (a) DOF = 36 - 1 = 35 alpha / 2 = .025 2.030 = t score (b) Test Statistic Formula: (X bar - Hyp value) / (s / sqrt(n)) (38.3 - 40) / (10.52 / sqrt(36)) = -.97 (d) -.97 is less extreme than -2.03 (and the p-value .3389 is greater than .05), so Do Not Reject Ho
If the expected value of the sample statistic is equal to the population parameter being estimated, the sample statistic is said to _____. Multiple Choice: a. be an unbiased estimator of the population parameter b. be a random estimator of the population parameter c. have high precision d. have low variability
a. be an unbiased estimator of the population parameter
In a random sample of 1200 students 300 responded that they have a part-time job. What is the 95% confidence interval for the true population proportion of students that have a part-time job. A) Upper Limit _____________________ B) Lower Limit _____________________
a. .275 b. .225 Explanation: DOF = n - 1 = 1200 - 1 = 1199 Alpha = 1 - .95 = .05 Alpha/2 = .05/2 = .025 t score = 1.962 (from t table) Margin of Error = t α/2 * sqrt(p bar * (1 - p bar) / n) p-bar = 300 / 1200 = .25 1.962 * sqrt(.25 * (1 - .25) / 1200) = .02453 (a) Formula: p-bar + margin of error .25 + .02453 = .27453 or .275 rounded (b) Formula: p-bar - margin of error .25 - .02453 = .22547 or .225 rounded
A Time/CNN telephone poll of 1400 American adults asked, "Where would you rather go in your spare time?" (Time, April 1992). The top response by 504 adults was a shopping mall. Construct a 95% confidence interval. a) sample proportion (p bar) = b) t score = c) standard error = d) margin of error = e) upper limit = ___________________ lower limit = _____________________ f) What happens to the width of the CI if alpha changes from .05 to .01?
a. .36 b. 1.96 c. .0128 d. .0252 e. UL = .385 LL = .335 f. wider Explanation: (a) 504 / 1400 = .36 (b) DOF = n - 1 = 1400 - 1 = 1399 Alpha = 1 - .95 = .05 Alpha/2 = .05/2 = .025 t score = 1.96 (from t table) (c) SE = sqrt(p-bar(1 - p-bar) / n) sqrt(.36(1 - .36) / 1400) = .0128 (d) Margin of Error = t score * SE 1.96 * .0128 = .0252 (e) Formula: p bar ± margin of error .36 + .0252 = .385 .36 - .0252 = .335 (f) From the formula, it should be clear that: The width increases as the confidence level increases (0.5 towards 0.99999 - stronger). The width increases as the significance level decreases (0.5 towards 0.00000...01 - stronger).
In a random sample of 700 professors 400 responded that they believed students needed more math skills. What is the 99% confidence interval for the true population proportion of professors that believe students need more math skills? A. _____________________ to B. _____________________ is the C. _____________________ confidence interval for D. _____________________ E. Can you conclude with 99% confidence that a majority of professors believe students need more math skills? _______ yes/no
a. .52 b. .62 c. 99% d. population proportion e. Yes Explanation: DOF = n - 1 = 700 - 1 = 699 Alpha = 1 - .99 = .01 Alpha/2 = .01/2 = .005 t score = 2.583 (from t table) Margin of Error = t α/2 * sqrt(p bar * (1 - p bar) / n) p-bar = 400 / 700 = .5714 2.583 * sqrt(.5714 * (1 - .5714) / 700) = .0483 (a) Formula: p-bar - margin of error .5714 - .0483 = .52 (b) Formula: p-bar + margin of error .5714 + .0483 = .62 (e) The whole interval, .52 to .62, is above .50, so it is a majority
A simple random sample of 400 students showed that 80 used smart phones. Estimate the 90% confidence interval for the population proportion of students who use smart phones. A. The t score used is = ________________ B. The margin of error is = _______________ C. Upper limit of the confidence interval = ______________ D. Lower limit of the confidence interval = _________________ E. If we estimated a 95% confidence interval it would be Multiple Choice: A. Narrower B. Wider C. The same width D. Could be narrower or wider
a. 1.65 b. .033 c. .233 d. .167 e. Wider Explanation: (a) DOF = n - 1 = 400 - 1 = 399 Alpha = 1 - .90 = .10 Alpha/2 = .10/2 = .05 t score = 1.649 or 1.65 rounded (from t table) (b) Margin of Error = t α/2 * sqrt(p bar * (1 - p bar) / n) p-bar = 80 / 400 = .20 1.65 * sqrt(.20 * (1 - .20) / 400) = .033 (c) Formula: p-bar + margin of error .20 + .033 = .233 (d) Formula: p-bar - margin of error .20 - .033 = .167 (e) From the formula, it should be clear that: The width increases as the confidence level increases (0.90 to .95 - stronger).
In interval estimation, as the sample size becomes larger, the interval estimate _____. Multiple Choice: a. becomes narrower b. gets closer to 1.96 c. remains the same, since the mean is not changing d. becomes wider
a. becomes narrower
A statistics teacher started class one day by drawing the names of 10 students out of a hat and asked them to do as many pushups as they could. The 10 randomly selected students averaged 15 pushups per person with a standard deviation of 9 pushups. Suppose the distribution of the population of number of pushups that can be done is approximately normal. If we would like to capture the population mean with 95% confidence the margin of error would be _____. Multiple Choice: a. 2.262 (9 / sqrt(10)) b. 2.125 (15 / sqrt(10)) c. 1.960 (9 / sqrt(10)) d. 3.250 (15 / sqrt(10))
a. 2.262 (9 / sqrt(10)) Explanation: Margin of Error = ta/2 * s / sqrt(n) First Find: Sample Mean = 15 Sample Standard Deviation (s) = 9 Sample size (n) = 10 standard error = s / sqrt(n) = 9 / sqrt(10) = 2.85 DOF = n - 1 = 10 - 1 = 9 Alpha = 1 - .95 = .05 Alpha/2 = .05/2 = .025 t score = 2.262 (from t table) Now: Margin of Error = ta/2 * s / sqrt(n) ta/2 = t score = 2.262 s / sqrt(n) = 9 / sqrt(10) So the form is: 2.262 (9 / sqrt(10))
The t value for a 99% confidence interval estimation based upon a sample of size 10 is _____. Multiple Choice: a. 3.249 b. 2.576 c. 1.812 d. 1.645
a. 3.249 Explanation: DOF = n - 1 = 10 - 1 = 9 Alpha = 1 - .99 = .01 Alpha/2 = .01/2 = .005 t score = 3.249 (from t table)
The BIG BURGER regional manager wishes to determine how long people are waiting in the drive thru at the Statesboro BIG BURGER. He gives the following data. I took a random sample of 36 cars and found the average length of time spent waiting was 8 minutes with standard deviation of 14 minutes. What is the confidence interval for the true mean time customers spend waiting in the drive thru? (alpha = .1). A. _____________________ to B. _____________________ is the C. _____________________ confidence interval for D. _______________________
a. 4.06 b. 11.94 c. 90% d. population mean Explanation: Alpha = .10 Confidence Level = 1 - Alpha = 1 - .10 = .90 = 90% DOF = 36 - 1 = 35 Alpha/2 = .10/2 = .05 t score = 1.690 (from t table) Margin of Error = ta/2 * s / sqrt(n) 1.690 * 14 / sqrt(36) = 3.94 (a) Formula: x̅ - margin of error 8 - 3.94 = 4.06 (b) Formula: x̅ + margin of error 8 + 3.94 = 11.94 (c) 1 - .10 = .90, so this is a 90% Confidence Interval
The manager of a grocery store has taken a random sample of 36 customers. The average length of time it took these 36 customers to check out was 4 minutes with a sample standard deviation of 1.2 minutes. Construct a 90% confidence interval. a) The upper limit = b) The lower limit = c) If we want the same level of confidence (i.e., 90%) but want the interval to be narrower, we need to increase ________________________________________.
a. 4.34 b. 3.66 c. n (sample size) Explanation: n = 36 Mean = 4 s = 1.2 DOF = 36 - 1 = 35 Alpha = 1 - .90 = .10 Alpha/2 = .10/2 = .05 t score = 1.690 (from t table) Margin of Error = ta/2 * s / sqrt(n) 1.690 * 1.2 / sqrt(36) = .338 (a) Formula: x̅ + margin of error 4 + .338 = 4.338 or 4.34 (b) Formula: x̅ - margin of error 4 - .338 = 3.662 or 3.66
In a random sample of 81 on-line orders the sample mean order is $55 with a sample standard deviation of $24. Estimate the 90% confidence interval for the true population mean order? A) Upper Limit _____________________ B) Lower Limit _____________________
a. 59.44 b. 50.56 Explanation: DOF = 81 - 1 = 80 Alpha = 1 - .90 = .10 Alpha/2 = .10/2 = .05 t score = 1.664 (from t table) Margin of Error = ta/2 * s / sqrt(n) 1.664 * 24 / sqrt(81) = 4.437 (a) Formula: x-bar + margin of error 55 + 4.437 = 59.44 (b) Formula: x-bar - margin of error 55 - 4.437 = 50.56
A local university administers a comprehensive exam to the recipients of a B.S. degree. A sample of 8 examinations is selected at random and scored. The scores are shown below. Provide a 95% confidence interval for the population mean score. Assume the population is normally distributed: Scores: 55 89 62 78 92 81 75 45 Multiple Choice: a) The point estimate of the population mean = b) The margin of error = c) upper limit = d) lower limit = e)For a 99% confidence interval the margin of error =
a. 72.13 b. 13.89 c. 86.02 d. 58.23 e. 20.56 Explanation: n = 8 (a) Point estimate = sample mean: 55 + 89 + 62 + 78 + 92 + 81 + 75 + 45 = 577 577 / 8 = 72.125 or 72.13 (b) DOF = n - 1 = 7 Alpha/2 = .05/2 = .025 t score = 2.365 (from t table) Sample Standard Deviation: 16.617 Margin of Error = ta/2 * s / sqrt(n) 2.365 * 16.617 / sqrt(8) = 13.89 (c) Upper limit: x̅ + margin of error 72.125 + 13.894 = 86.02 (d) Lower limit: x̅ - margin of error 72.125 - 13.894 = 58.23 (e) For 99%: Alpha/2 = .01/2 = .005 t score = 3.499 (from t table, DOF = 7) 3.499 * 16.617 / sqrt(8) = 20.56
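When the raw data are given, the whole calculation can be scripted. A minimal Python sketch follows; the two critical values are typed in from a t table (DOF = 7) as assumptions, since the standard library does not provide the t distribution.

```python
from math import sqrt
from statistics import mean, stdev

scores = [55, 89, 62, 78, 92, 81, 75, 45]

n = len(scores)
x_bar = mean(scores)          # point estimate of the population mean
s = stdev(scores)             # sample standard deviation (divides by n - 1)
se = s / sqrt(n)              # standard error of the mean

T_95_7_DOF = 2.365            # assumed t table value, Alpha/2 = .025
T_99_7_DOF = 3.499            # assumed t table value, Alpha/2 = .005

me_95 = T_95_7_DOF * se
me_99 = T_99_7_DOF * se
lower, upper = x_bar - me_95, x_bar + me_95

print(x_bar, round(me_95, 2), round(upper, 2), round(lower, 2), round(me_99, 2))
```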
Which of the following sources of big data is not publicly available? Multiple Choice: a. Medical records b. Sports records c. Weather data d. Twitter
a. Medical records
Larger values of α have the disadvantage of increasing the probability of making a _____. Multiple Choice: a. Type I error b. normal probability error c. random sampling error d. Type II error
a. Type I error
A researcher believes that more than 30% of retirees supplement Medicare with some form of employer health insurance coverage. The correct set of hypotheses to test this is Multiple Choice: a) Ho: P < 0.30; Ha: P ≥ 0.30 b) Ho: P ≤ 0.30; Ha: P > 0.30 c) Ho: P ≥ 0.30; Ha: P < 0.30 d) Ho: P > 0.30; Ha: P ≤ 0.30
b) Ho: P ≤ 0.30; Ha: P > 0.30 Explanation: H0 = what you aren't testing = anything other than what you want from your test (P ≤ 0.30) Ha = what you believe, what you are testing for = more than 30% (P > 0.30)
If nonsampling error is introduced in the data collection process, the likelihood of making a Type I or Type II error may be ___________ than if the sample data are free of nonsampling errors.
higher
A statistics teacher started class one day by drawing the names of 10 students out of a hat and asked them to do as many pushups as they could. The 10 randomly selected students averaged 15 pushups per person with a standard deviation of 9 pushups. Suppose the distribution of the population of number of pushups that can be done is approximately normal. The 95% confidence interval for the true mean number of pushups that can be done is _____. Multiple Choice: a. 5.75 to 24.25 b. 8.56 to 21.40 c. 11.31 to 18.55 d. 13.02 to 16.98
b. 8.56 to 21.40 Explanation: DOF = n - 1 = 10 - 1 = 9 Alpha = 1 - .95 = .05 Alpha/2 = .05/2 = .025 t score = 2.262 (from t table) standard error = s / sqrt(n) 9 / sqrt(10) = 2.846 Margin of Error = t score * SE 2.262 * 2.846 = 6.44 Formula: Sample Mean (+/-) Margin of Error 15 - 6.44 = 8.56 15 + 6.44 = 21.44 (closest to choice b, which lists the upper limit as 21.40)
A student wants to determine if pennies are really fair when flipped, meaning equally likely to land heads up or tails up. He flips a random sample of 50 pennies and finds that 28 of them land heads up. If p denotes the true probability of a penny landing heads up when flipped, what are the appropriate null and alternative hypotheses? Multiple Choice: a. H0: p ≥ 0.5, Ha: p < 0.5. b. H0: p = 0.5, Ha: p ≠ 0.5. c. H0: p ≤ 0.5, Ha: p ≠ 0.5. d. H0: p ≥ 28, Ha: p < 28.
b. H0: p = 0.5, Ha: p ≠ 0.5. Explanation: H0 is what is given (p = 0.5); Ha is what you are testing for (if it is fair or not, so p ≠ 0.5). Basically: fair vs. not fair (H0 vs Ha)
A sample of 37 AA batteries had a mean lifetime of 584 hours. A 95% confidence interval for the population mean was 579.2 < μ < 588.8. Which statement is the correct interpretation of the results? Multiple Choice: a. The probability that the population mean is between 579.2 hours and 588.8 hours is 0.95. b. We are 95% confident that the mean lifetime of all the bulbs in the population is between 579.2 hours and 588.8 hours. c. 95% of the light bulbs in the sample had lifetimes between 579.2 hours and 588.8 hours. d. None of these statements correctly interpret the results.
b. We are 95% confident that the mean lifetime of all the bulbs in the population is between 579.2 hours and 588.8 hours.
The proportion of dental procedures that are extractions is 0.16. Which of the following exemplifies a Type I error in this situation? Multiple Choice: a. We fail to reject the claim that the proportion of dental procedures that are extractions is 0.16 when the proportion is actually 0.16. b. We reject the claim that the proportion of dental procedures that are extractions is 0.16 when the proportion is actually 0.16. c. We fail to reject the claim that the proportion of dental procedures that are extractions is 0.16 when the proportion is actually different from 0.16. d. We reject the claim that the proportion of dental procedures that are extractions is 0.16 when the proportion is actually different from 0.16.
b. We reject the claim that the proportion of dental procedures that are extractions is 0.16 when the proportion is actually 0.16. Explanation: If we reject the claim that the proportion of dental procedures that are extractions is 0.16 when the proportion is actually 0.16, we will have committed a Type I error because this is equivalent to rejecting the null hypothesis when the null hypothesis is true.
A simple random sample of 31 observations was taken from a large population. The sample mean equals 5. Five is a _____. Multiple Choice: a. population parameter b. point estimate c. standard error d. population mean
b. point estimate
The population parameter value and the point estimate differ because a sample is not a census of the entire population, but it is being used to develop the _____. Multiple Choice: a. population parameter b. point estimate c. population mean d. standard error
b. point estimate
You wish to show the average apartment rent in Statesboro is greater than $800. You plan on taking a sample to test your claim. The correct set of hypotheses is Multiple Choice: a) Ho: M = 800, Ha: M ≠ 800 b) Ho: M ≥ 800, Ha: M < 800 c) Ho: M ≤ 800, Ha: M > 800 d) Ho: M > 800, Ha: M ≤ 800
c) Ho: M ≤ 800, Ha: M > 800 Explanation: The Alternative Hypothesis (Ha) is what you're testing. You're testing if the mean is greater than 800. The Null Hypothesis (Ho) is the opposite of what you're testing: if the mean is not greater than 800, it has to be less than or equal to 800.
A researcher believes less than 52% of the population feel that the government should be more involved in the regulation of private enterprise. The correct set of hypotheses is Multiple Choice: a) Ho: P < 0.52, Ha: P ≥ 0.52 b) Ho: P ≤ 0.52, Ha: P > 0.52 c) Ho: P ≥ 0.52, Ha: P < 0.52 d) Ho: M ≥ 0.52, Ha: M < 0.52
c) Ho: P ≥ 0.52, Ha: P < 0.52 Explanation: The Alternative Hypothesis (Ha) is what you're testing. The researcher believes the population proportion is less than 52%. The Null Hypothesis (Ho) is the opposite of what you're testing: if the population proportion is not less than 52%, it has to be greater than or equal to 52%.
Crunchy Nut Cereal wants to fill boxes with a mean weight of 32 ounces. Any underfilling or overfilling will result in a shutdown and readjustment of the machine. A random sample will be periodically taken to be sure the mean weight is being met. The correct set of hypotheses is Multiple Choice: a) Ho: m ≥ 32; Ha: m < 32 b) Ho: m ≤ 32; Ha: m > 32 c) Ho: m = 32; Ha: m ≠ 32 d) Ho: m ≠ 32; Ha: m = 32
c) Ho: m = 32; Ha: m ≠ 32 Explanation: H0 = what is desired = 32 ounces Ha = what is undesired = anything that is not 32 ounces Ho: m = 32; Ha: m ≠ 32
The CEO of a company wants to estimate the percent of employees that use company computers to go on Facebook during work hours with 95% confidence. He selects a random sample of 150 of the employees and finds that 53 of them logged onto Facebook that day. What is the estimate of the standard error of the proportion σp̄? Multiple Choice: a. 0.050 b. 0.350 c. 0.039 d. 0.455
c. 0.039 Explanation: standard error of p-bar = sqrt(p(1-p) / n)) p = 53 / 150 = .353 sqrt(.353(1-.353) / 150)) = 0.039
The owners of a fast food restaurant have automatic drink dispensers to help fill orders more quickly. When the 12 ounce button is pressed, they would like for exactly 12 ounces of beverage to be dispensed. There is, however, some variation in this amount. The company does not want the machine to systematically over fill or under fill the cups. Which of the following gives the correct set of hypotheses? Multiple Choice: a. H0: u ≤ 12, Ha: u > 12 b. H0: u > 12, Ha: u < 12 c. H0: u = 12, Ha: u ≠ 12 d. H0: u ≥ 12, Ha: u < 12
c. H0: u = 12, Ha: u ≠ 12 Explanation: H0 = what is desired = 12 ounces Ha = what is undesired = anything that is not 12 ounces H0: u = 12, Ha: u ≠ 12
The American League consists of 15 baseball teams. Suppose a sample of 5 teams is to be selected to conduct player interviews. The following table lists the 15 teams and the random numbers assigned by Excel's RAND function. Team & Random Number New York - 0.178624 Boston - 0.290197 Baltimore - 0.578370 Tampa Bay - 0.867778 Toronto - 0.965807 Minnesota - 0.811810 Chicago - 0.562178 Cleveland - 0.960271 Detroit - 0.253574 Kansas City - 0.326836 Oakland - 0.288287 Los Angeles - 0.895267 Texas - 0.500879 Seattle - 0.839071 Houston - 0.713682 Sort the table by random number from smallest to largest and select a sample of size 5. The simple random sample consists of: Multiple Choice: a. New York, Toronto, Detroit, Texas, and Houston b. Baltimore, Toronto, Detroit, Oakland, and Houston c. New York, Detroit, Oakland, Boston, and Kansas City d. Toronto, Minnesota, Cleveland, Seattle, and Houston
c. New York, Detroit, Oakland, Boston, and Kansas City
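The sort-and-select procedure described in the question can be sketched in Python. The random numbers are copied from the table in the question; in practice they would come from a random number generator such as Excel's RAND.

```python
# Random numbers assigned to each team, copied from the table in the question
random_numbers = {
    "New York": 0.178624, "Boston": 0.290197, "Baltimore": 0.578370,
    "Tampa Bay": 0.867778, "Toronto": 0.965807, "Minnesota": 0.811810,
    "Chicago": 0.562178, "Cleveland": 0.960271, "Detroit": 0.253574,
    "Kansas City": 0.326836, "Oakland": 0.288287, "Los Angeles": 0.895267,
    "Texas": 0.500879, "Seattle": 0.839071, "Houston": 0.713682,
}

# Sort teams by their random number (smallest first) and keep the first 5
sample = sorted(random_numbers, key=random_numbers.get)[:5]
print(sample)
```

Because every team is equally likely to receive one of the five smallest random numbers, each possible sample of 5 teams has the same probability of being selected.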
A pizza shop advertises that they deliver in 30 minutes or less or it is free. People who live in homes that are located on the opposite side of town believe it will take the pizza shop longer than 30 minutes to make and deliver the pizza. Write the null and alternative hypotheses that can be used to conduct a significance test. Multiple Choice: a. H0: u ≥ 30, Ha: u < 30 b. H0: u < 30, Ha: u > 30 c. H0: u > 30, Ha: u < 30 d. H0: u ≤ 30, Ha: u > 30
d. H0: u ≤ 30, Ha: u > 30 Explanation: H0 = what is desired = 30 minutes or less (u ≤ 30) Ha = what is undesired = longer than 30 minutes (u > 30)
A simple random sample of 31 observations was taken from a large population. The sample mean equals 5. Five is a _____. Multiple Choice: a. population mean b. population parameter c. standard error d. point estimate
d. point estimate
The basis for using a normal probability distribution to approximate the sampling distribution of the sample means and population mean is _____. Multiple Choice: a. Chebyshev's theorem b. the empirical rule c. Bayes' theorem d. the central limit theorem
d. the central limit theorem
The standard error of the associated sampling distributions ____________ as the sample size increases
decreases
The margins of error ___________ as the sample sizes ____________.
decreases; increases
The potential sampling error also __________ as the sample size ___________.
decreases; increases
The shape of t distribution depends on what parameter?
degrees of freedom
Sampling Distributions
- Sampling Distribution of x̄ - Sampling Distribution of p̄
A simple random sample of 400 students showed that 100 used tablets. Estimate the 90% confidence interval for the population proportion of students who use tablets. 1. The t score used is = 2. The margin of error is = 3. Upper limit of the confidence interval = 4. Lower limit of the confidence interval =
1. 1.649 2. .0357 3. .2857 (28.57%) 4. .2143 (21.43%) Explanation: (1) DOF = n - 1 = 399, α/2 = .10/2 = .05, t score = 1.649 (from t table) (2) Margin of error = t(α/2) * sqrt(p̄ * (1 - p̄) / n), with p̄ = 100/400 = .25: 1.649 * sqrt(.25 * (1 - .25) / 400) = .0357 (3) Upper limit = p̄ + margin of error = .25 + .0357 = .2857, or 28.57% (4) Lower limit = p̄ - margin of error = .25 - .0357 = .2143, or 21.43%
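A minimal sketch of the same interval calculation in Python, assuming the critical value (1.649, taken from the card's t table lookup) is supplied by hand because the standard library has no t distribution. The function name `proportion_ci` is hypothetical.

```python
import math


def proportion_ci(successes, n, critical_value):
    """Return (p_bar, margin, lower, upper) for a proportion interval."""
    p_bar = successes / n
    std_error = math.sqrt(p_bar * (1 - p_bar) / n)
    margin = critical_value * std_error
    return p_bar, margin, p_bar - margin, p_bar + margin


# 100 of 400 students use tablets; 1.649 is the table value used on the card.
p_bar, margin, lower, upper = proportion_ci(100, 400, 1.649)
print(round(margin, 4), round(lower, 4), round(upper, 4))
```

The lower limit is always the point estimate minus the margin of error and the upper limit is the point estimate plus it, so the two limits sit symmetrically around p̄ = .25.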
A simple random sample of 49 college professors shows the average number of hours spent on research per week is 18 with a sample standard deviation of 10.5 hours. Estimate the 95% confidence interval for the population mean time spent doing research. 1. The t score used is = 2. The margin of error is = 3. Upper limit of the confidence interval = 4. Lower limit of the confidence interval =
1. 2.011 2. 3.0165 3. 21.0165 4. 14.9835 Explanation: (1) DOF = n - 1 = 48, α/2 = .05/2 = .025, t score = 2.011 (from t table) (2) Margin of error = t(α/2) * s / sqrt(n) = 2.011 * 10.5 / sqrt(49) = 3.0165 (3) Upper limit = x̄ + t(α/2) * s / √n = 18 + 3.0165 = 21.0165 (4) Lower limit = x̄ - t(α/2) * s / √n = 18 - 3.0165 = 14.9835
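The interval for a mean follows the same pattern. This is a sketch under the assumption that the t critical value (2.011 for 48 degrees of freedom, as read from the table on the card) is passed in by hand; `mean_ci` is a hypothetical helper name.

```python
import math


def mean_ci(x_bar, s, n, t_critical):
    """Return (margin, lower, upper) for x_bar ± t * s / sqrt(n)."""
    margin = t_critical * s / math.sqrt(n)
    return margin, x_bar - margin, x_bar + margin


# 49 professors, sample mean 18 hours, sample standard deviation 10.5 hours.
margin, lower, upper = mean_ci(18, 10.5, 49, 2.011)
print(round(margin, 4), round(lower, 4), round(upper, 4))
```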
Non-random Sampling Methods (non-probabilistic)
1. Convenience Sampling 2. Judgement Sampling 3. Snowball Sampling
Assumptions for Sampling Distributions
1. Independence 2. Randomization 3. Sample size 4. 10% condition
In a random sample of 700 professors 400 responded that they believed students needed more quantitative skills. 1. What is the 90% confidence interval for the true population proportion of professors that believe students need more quantitative skills? 2. Can you conclude with 90% confidence that the majority of professors believe students need more quantitative skills? Yes or No?
1. Sample proportion p̄ = x/n = 400/700 = 0.5714 Lower bound = 0.5398 Upper bound = 0.6026 2. Yes, because the entire interval lies above 0.5
Probabilistic Sampling Methods
1. Random Samples 2. Systematic Sampling 3. Stratified Samples 4. Cluster Sampling
Steps of Hypothesis Testing
1. State Ho and Ha 2. Specify α (level of significance) - common levels = .1, .05 or .01 3. Determine rejection region(s) by critical value(s) - tcrit 4. Take a sample and calculate test statistic (tcalc) t calc = (X bar - Hyp value)/ std error 5. Calculate p-value (the probability of observing a test statistic equal to or more extreme than the one observed). 6. Make a decision - either Reject Ho or Do Not Reject Ho
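The six steps can be sketched for an upper-tail test (Ho: μ ≤ μ0, Ha: μ > μ0). Everything numeric here is hypothetical and not from the cards: a sample of 64 with mean 31.5 and standard deviation 6, tested against μ0 = 30 at α = .05. As a further assumption, the standard normal distribution stands in for the t distribution, since Python's standard library has no t distribution; for a sample this large the two are close but not identical.

```python
import math
from statistics import NormalDist


def upper_tail_test(x_bar, mu_0, s, n, alpha):
    """Upper-tail test of Ho: mu <= mu_0 using a normal approximation."""
    std_error = s / math.sqrt(n)
    # Step 3: critical value that bounds the rejection region.
    critical = NormalDist().inv_cdf(1 - alpha)
    # Step 4: test statistic = (x_bar - hypothesized value) / standard error.
    test_stat = (x_bar - mu_0) / std_error
    # Step 5: p-value = probability of a statistic at least this extreme.
    p_value = 1 - NormalDist().cdf(test_stat)
    # Step 6: decision. Both decision rules agree for an upper-tail test.
    reject = p_value <= alpha
    return test_stat, critical, p_value, reject


# Hypothetical sample summary, for illustration only.
test_stat, critical, p_value, reject = upper_tail_test(31.5, 30, 6, 64, 0.05)
print(test_stat, round(critical, 3), round(p_value, 4), reject)
```

With these made-up numbers the test statistic exceeds the critical value and the p-value is below α, so both decision rules lead to Reject Ho.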
I take a SRS of 400 people and 210 respond yes, they like statistics. Estimate the 99% confidence interval for P (population proportion). p bar (sample proportion) =
210/400 = .525
Fill in the blanks with the correct word: General statistical practice is to assume that, for most applications, the sampling distribution can be approximated by normal distribution whenever the sample size is ______ or more.
30
Ha =
Alternative Hypothesis
Interval Estimation
Because a point estimator cannot be expected to provide the exact value of a population parameter, interval estimation is frequently used to generate an estimate of the value of a population parameter.
The Big-Mart Distribution Center reported that their average hourly salary is less than $7.50. You plan on taking a sample to test Big-Mart's claim. The correct set of hypotheses is Ho = Ha =
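H0: μ ≥ 7.50 Ha: μ < 7.50 Explanation: The null hypothesis must contain the equality, so the claim being tested (less than $7.50) goes in Ha.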
I wish to establish that the average alcohol content of the micro-brew beer I produce does not exceed .08. Formulate the null and alternative hypothesis. Ho = Ha =
H0: μ ≤ .08 Ha: μ > .08
A soft drink filling machine, when in perfect adjustment, fills the bottles with 16 ounces of soft drink. Any overfilling or underfilling results in the shutdown and readjustment of the machine. To determine whether or not the machine is properly adjusted, the correct set of hypotheses is Ho = Ha =
H0: u = 16 Ha: u ≠ 16
H0 =
Null Hypothesis
The general form of an interval estimate
Point estimate ± Margin of error
Decision rule using p-value method
Reject H0 if p-value ≤ α
What does the shape of t distribution look like?
Similar in shape to the standard normal distribution, but wider
confidence coefficient
The confidence level expressed as a decimal value. For example, .95 is the confidence coefficient for a 95% confidence level.
E ( x̄ ) =
The expected value of x̄
point estimate
The numerical value obtained for x̄, s, or p̄
confidence interval
The range of values within which a population parameter is estimated to lie. For example, 90% confidence interval.
Central Limit Theorem of p ̅
The sampling distribution of sample proportion can be approximated by a normal distribution whenever np ≥ 5 and n(1 - p) ≥ 5
Two Types of Errors: with Hypothesis Tests
Type I error - rejecting Ho when Ho is true. Type II error - not rejecting Ho when Ha is true (Ho is false).
sample statistic
a corresponding characteristic of a sample
population proportion
a fraction of the population that has a certain characteristic
sampling frame
a list of elements from which the sample will be selected
random variable
a quantity whose values are not known with certainty
Hypothesis Test
a statistical method that uses sample data to evaluate a hypothesis about a population
If I take a SRS of size 500 from a Population with P = .7 what is the probability that: a. P( p ̅ < .66 ) = b. P( p ̅ > .73) = c. P( .67 < p ̅ < .73) =
a. .0287 b. .0793 c. .8690 Explanation: Setup for all three parts: 1. Find the standard deviation of p̄: sqrt(0.7(1 - 0.7)/500) = .0205 2. Check the normal approximation: np = 500 * 0.7 = 350 ≥ 10 and nq = 500 * 0.3 = 150 ≥ 10, so the condition is met. 3. Apply a continuity correction, cf = 0.5 / n = 0.5 / 500 = .001, to the end points of each probability interval. (a) Adjusted left limit = -infinity, adjusted right limit = .66 + .001 = .661. P(p̄ ≤ .66) ≈ P(p̄ ≤ .661) = P(Z ≤ (.661 - .7) / .0205) = P(Z ≤ -1.90) = .0287 (b) Adjusted left limit = .73 - .001 = .729, adjusted right limit = infinity. P(p̄ ≥ .73) ≈ P(p̄ ≥ .729) = P(Z ≥ (.729 - .7) / .0205) = P(Z ≥ 1.41) = .0793 (c) Adjusted left limit = .67 - .001 = .669, adjusted right limit = .73 + .001 = .731. P(.67 ≤ p̄ ≤ .73) ≈ P(.669 ≤ p̄ ≤ .731) = P((.669 - .7) / .0205 ≤ Z ≤ (.731 - .7) / .0205) = P(Z ≤ 1.51) - P(Z ≤ -1.51) = .9345 - .0655 = .8690
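The same three probabilities can be checked with Python's standard library. This is a sketch that mirrors the card's method (normal approximation plus a continuity correction of 0.5 / n); the helper name `p_bar_distribution` is hypothetical. Because the code keeps full precision instead of rounding z to two decimals, the results differ slightly from the table-based answers.

```python
import math
from statistics import NormalDist


def p_bar_distribution(p, n):
    """Normal approximation to the sampling distribution of the sample proportion."""
    return NormalDist(mu=p, sigma=math.sqrt(p * (1 - p) / n))


p, n = 0.7, 500
dist = p_bar_distribution(p, n)
cf = 0.5 / n  # continuity correction used on the card

prob_a = dist.cdf(0.66 + cf)                        # P(p_bar < .66)
prob_b = 1 - dist.cdf(0.73 - cf)                    # P(p_bar > .73)
prob_c = dist.cdf(0.73 + cf) - dist.cdf(0.67 - cf)  # P(.67 < p_bar < .73)
print(round(prob_a, 4), round(prob_b, 4), round(prob_c, 4))
```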
If I am sampling from a Population that has P=.6 and my random sample size is 400. Find the following probabilities. a. P ( p ̅ < .56) = b. P ( p ̅ > .63) = c. P (.58 < p ̅ < .62) =
a. .0571 b. .1210 c. .6156 Explanation: Setup for all three parts: 1. Find the standard deviation of p̄: sqrt(0.6(1 - 0.6)/400) = .0245 2. Check the normal approximation: np = 400 * 0.6 = 240 ≥ 10 and nq = 400 * 0.4 = 160 ≥ 10, so the condition is met. 3. Apply a continuity correction, cf = 0.5 / n = 0.5 / 400 = .00125, to the end points of each probability interval. (a) Adjusted left limit = -infinity, adjusted right limit = .56 + .00125 = .56125. P(p̄ ≤ .56) ≈ P(p̄ ≤ .56125) = P(Z ≤ (.56125 - .6) / .0245) = P(Z ≤ -1.58) = .0571 (b) Adjusted left limit = .63 - .00125 = .62875, adjusted right limit = infinity. P(p̄ ≥ .63) ≈ P(p̄ ≥ .62875) = P(Z ≥ (.62875 - .6) / .0245) = P(Z ≥ 1.17) = .1210 (c) Adjusted left limit = .58 - .00125 = .57875, adjusted right limit = .62 + .00125 = .62125. P(.58 ≤ p̄ ≤ .62) ≈ P(.57875 ≤ p̄ ≤ .62125) = P((.57875 - .6) / .0245 ≤ Z ≤ (.62125 - .6) / .0245) = P(-.87 ≤ Z ≤ .87) = P(Z ≤ .87) - P(Z ≤ -.87) = .8078 - .1922 = .6156
If I take a SRS size 36 from a population with mean = 80 and σ = 18 what is the probability that: a. P (x̅ < 77 ) b. P (x̅ > 86 ) c. P (74 < x̅ < 86) d. P ( 77 < ( x̅ ) < 83)
a. .15866 b. .0228 c. .9545 d. .6827 Explanation: (a) 1. Find the z-score: (77 - 80) / (18 / sqrt(36)) = -1 2. From the z table: P(Z < -1) = .15866 (b) 1. Find the z-score: (86 - 80) / (18 / sqrt(36)) = 2 2. From the z table: P(Z < 2) = .9772, so P(Z > 2) = 1 - .9772 = .0228 (c) 1. Find the z-score of both: (74 - 80) / (18 / sqrt(36)) = -2 and (86 - 80) / (18 / sqrt(36)) = 2 2. From the z table: P(Z < -2) = .0228 and P(Z < 2) = .9772 3. Subtract: .9772 - .0228 = .9544 (.9545 with unrounded table values) (d) 1. Find the z-score of both: (77 - 80) / (18 / sqrt(36)) = -1 and (83 - 80) / (18 / sqrt(36)) = 1 2. From the z table: P(Z < -1) = .1587 and P(Z < 1) = .8413 3. Subtract: .8413 - .1587 = .6826 (.6827 with unrounded table values)
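As a cross-check on the z table lookups, the sampling distribution of x̄ can be built directly with `statistics.NormalDist`, using mean 80 and standard error 18 / sqrt(36) = 3. This is only a sketch of the calculation on this card; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 80, 18, 36
# Sampling distribution of x_bar: mean mu, standard error sigma / sqrt(n).
x_bar_dist = NormalDist(mu=mu, sigma=sigma / math.sqrt(n))

prob_a = x_bar_dist.cdf(77)                       # P(x_bar < 77)
prob_b = 1 - x_bar_dist.cdf(86)                   # P(x_bar > 86)
prob_c = x_bar_dist.cdf(86) - x_bar_dist.cdf(74)  # P(74 < x_bar < 86)
prob_d = x_bar_dist.cdf(83) - x_bar_dist.cdf(77)  # P(77 < x_bar < 83)
print(round(prob_a, 4), round(prob_b, 4), round(prob_c, 4), round(prob_d, 4))
```

Note that a "greater than" probability is one minus the cumulative value, which is why part (b) uses `1 - cdf(86)`.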
Population percentage is 52% if I take a SRS of 200 what is the probability that the sample percentage will be, a) Less than 49% ? b) Between 50% and 54% ? c) Greater than 53% ?
a. .2177 b. .4778 c. .4168 Explanation: 52% = .52. Setup for all three parts: 1. Find the standard deviation of p̄: sqrt(0.52(1 - 0.52)/200) = .0353 2. Check the normal approximation: np = 200 * .52 = 104 ≥ 10 and nq = 200 * .48 = 96 ≥ 10, so the condition is met. 3. Apply a continuity correction, cf = 0.5 / n = 0.5 / 200 = .0025, to the end points of each probability interval. (a) Adjusted left limit = -infinity, adjusted right limit = .49 + .0025 = .4925. P(p̄ ≤ .49) ≈ P(p̄ ≤ .4925) = P(Z ≤ (.4925 - .52) / .0353) = P(Z ≤ -.78) = .2177 (b) Adjusted left limit = .50 - .0025 = .4975, adjusted right limit = .54 + .0025 = .5425. P(.50 ≤ p̄ ≤ .54) ≈ P(.4975 ≤ p̄ ≤ .5425) = P((.4975 - .52) / .0353 ≤ Z ≤ (.5425 - .52) / .0353) = P(-.64 ≤ Z ≤ .64) = P(Z ≤ .64) - P(Z ≤ -.64) = .7389 - .2611 = .4778 (c) Adjusted left limit = .53 - .0025 = .5275, adjusted right limit = infinity. P(p̄ ≥ .53) ≈ P(p̄ ≥ .5275) = P(Z ≥ (.5275 - .52) / .0353) = P(Z ≥ .21) = 1 - .5832 = .4168
Given a population with mean of 75 and standard deviation of 12. A simple random sample of 36 is taken. Find the following probabilities: a. P ( x̅ < 77) = b. P ( 72 < x̅ < 78) =
a. .8413 b. .8664 Explanation: (a) 1. Find the z-score: (77 - 75) / (12 / sqrt(36)) = 1 2. From the z table: P(Z < 1) = .8413 (b) 1. Find the z-score of both: (72 - 75) / (12 / sqrt(36)) = -1.5 and (78 - 75) / (12 / sqrt(36)) = 1.5 2. From the z table: P(Z < -1.5) = .0668 and P(Z < 1.5) = .9332 3. Subtract: .9332 - .0668 = .8664
You take a simple random sample of size 64 from a population with a mean of 100 and a standard deviation of 32. Find the answers to these questions: a. The standard error of x̅ is _______ . What is the probability that x̅ will be: b. Less than 110? c. Greater than 96? d. Within 1.7 standard errors of μ?
a. 4 b. .9938 c. .8413 d. .9108 Explanation: (a) 32 / sqrt(64) = 4 (b) 1. Find the z-score: (110 - 100) / (32 / sqrt(64)) = 2.5 2. From the z table: P(Z < 2.5) = .9938 (c) 1. Find the z-score: (96 - 100) / (32 / sqrt(64)) = -1 2. From the z table: P(Z < -1) = .1587, so P(Z > -1) = 1 - .1587 = .8413 (d) 1. 1.7 standard errors = 1.7 * 4 = 6.8, so "within" means between 100 - 6.8 = 93.2 and 100 + 6.8 = 106.8 2. Find the z-score of both: (93.2 - 100) / (32 / sqrt(64)) = -1.7 and (106.8 - 100) / (32 / sqrt(64)) = 1.7 3. From the z table: P(Z < -1.7) = .0446 and P(Z < 1.7) = .9554 4. Subtract: .9554 - .0446 = .9108
How do you estimate the value of a population parameter?
compute a corresponding characteristic of the sample - a sample statistic
t distribution formula
df (degrees of freedom) = n - 1
Random Samples
each element in the population has the same probability of being selected
Stratified Samples
elements in the population are divided into groups (strata), a random sample is taken from each stratum, and the results are combined into one sample
Cluster Sampling
elements in the population are divided into separate groups or clusters, usually naturally forming; a random sample of clusters is taken, and the elements in the selected clusters form the sample
Fill in the correct words: When making ____________, it is important to have a close _______________ between the ____________ and the __________________.
inference; correspondence; sampled population; target population
10% Condition
n should be no more than 10% of population when drawing samples without replacement
Fill in the blanks with the correct word: When the population has a normal distribution, the sampling distribution of x̅ is _____________________ for any sample size.
normally distributed
interval estimate
often computed by adding and subtracting a value, called the margin of error (ME or MOE), to the point estimate
General form of an interval estimate of a population proportion
p ̅ ± Margin of error
General Format for CI for Proportion
p̄ ± margin of error Simplified: p̄ ± t(α/2) * √( p̄ * (1 - p̄) / n )
Target Population
population about which we want to make inferences
The sampling distribution of x̄ is the...
probability distribution of all possible values of the sample mean
Judgement Sampling
researcher picks or chooses subjects
Fill in the blanks with the correct word: Knowledge of the ______________ and its _______________ enables us to make ________________ about how close the _________________ is to the _________________.
sampling distribution; properties; probability statements; sample mean; population mean
Snowball Sampling
subjects refer other subjects
Systematic Sampling
select elements according to a system or pattern, e.g., every 10th person or element; to keep the sample random, start the selection with a randomly chosen person or element
t(α/2) * s / √n is known as
the Margin of Error (MOE)
confidence level
the estimated probability that a population parameter lies within a given confidence interval
E ( p ̅ ) =
the expected value of p ̅
The sample mean x̄
the point estimator of the population mean μ
The sample proportion p ̅
the point estimator of the population proportion p
The sample standard deviation s
the point estimator of the population standard deviation σ
sampled population
the population from which the sample is drawn
μ =
the population mean
p =
the population proportion
The sampling distribution of p ̅
the probability distribution of all possible values of the sample proportion p ̅
level of significance
the probability that the interval estimation procedure will generate an interval that does not contain the population mean
margin of error (ME or MOE)
the range of percentage points in which the sample accurately reflects the population
the probability distribution x̄
the sampling distribution of X ̅
What happens as the degrees of freedom increase?
the t distribution narrows, its peak becomes higher, and it becomes more similar to the standard normal distribution.
Fill in the blanks with the correct word: When the expected value of a point estimator equals the population parameter, we say the point estimator is _____________.
unbiased
statistical inference
uses sample data to make estimates of or draw conclusions about one or more characteristics of a population
What does the formula for the standard deviation of x bar depend on?
whether the population is finite or infinite.
General form of an interval estimate of a population mean
x̅ ± Margin of error
The final formula to use to estimate a confidence interval for μ is
x̅ ± t(α/2) * s / √n
Standard Deviation of p ̅ Infinite Population Formula
σp ̅ = √(p(1 - p) / n)
Standard Deviation of x̅ Finite Population Formula
σx̅ = √((N - n) / (N - 1)) * (σ / √n)
Standard Deviation of x̅ Infinite Population Formula
σx̅ = σ/√n
Standard Deviation of p ̅ Finite Population Formula
σp̄ = √((N - n) / (N - 1)) * √(p(1 - p) / n)
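The four standard deviation formulas differ only by the finite population correction factor √((N - n) / (N - 1)). A minimal sketch, assuming hypothetical helper names and treating the population as infinite whenever no population size N is given:

```python
import math


def finite_correction(N, n):
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))


def std_error_mean(sigma, n, N=None):
    """Standard deviation of x_bar; pass N for a finite population."""
    se = sigma / math.sqrt(n)
    return se if N is None else finite_correction(N, n) * se


def std_error_proportion(p, n, N=None):
    """Standard deviation of p_bar; pass N for a finite population."""
    se = math.sqrt(p * (1 - p) / n)
    return se if N is None else finite_correction(N, n) * se


print(std_error_mean(32, 64))                       # infinite population
print(round(std_error_mean(32, 64, N=1000), 4))     # hypothetical N = 1000
print(round(std_error_proportion(0.7, 500), 4))     # infinite population
```

The population size N = 1000 is made up for illustration. The correction factor is always below 1 when n > 1, so the finite population standard error is smaller than the infinite population one.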
A sampling distribution has....
• An expected value or mean. • A standard deviation. • A characteristic shape or form.
Difficulties Associated with Taking A Census
• Expensive. • Time consuming. • Misleading. • Unnecessary. • Impractical.
3 General forms of H0 and Ha
• Ho: μ ≥ μ0, Ha: μ < μ0 (left or lower-tail test) • Ho: μ ≤ μ0, Ha: μ > μ0 (right or upper-tail test) • Ho: μ = μ0, Ha: μ ≠ μ0 (two-tailed test)