Statistics Test #3
Determine the margin of error for the confidence interval for the proportion: 0.517 < p < 0.87
0.87 - 0.517 = 0.353; 0.353 / 2 = 0.1765
First thing to determine when you are doing confidence interval & hypothesis testing:
1. Do you have the mean or the proportion? 2. If you have the mean, you have to then decide if you have sigma or only the s.
Easy way: (7.1 & 7.2) How to find Z a/2 given the confidence interval: Table F
1. Find % of confidence interval you're looking for 2. Scroll all the way to the bottom row
A random sample of 41 cars in the drive-thru of a popular fast food restaurant revealed an average bill of $18.63. The population standard deviation is $5.66. Estimate the mean bill for all cars from the drive-thru with 97% confidence. Round the answer to two decimal places.
1. Find Z a/2 for the 97% confidence interval: 2.17 (z value). 2. Calculate the confidence interval: 2.17 x (5.66 / square root of 41) = 1.92, so 18.63 +/- 1.92 gives 16.71 < mu < 20.55
The t distribution is similar to the standard normal distribution in these ways:
1. It is bell-shaped 2. It is symmetric about the mean 3. The mean, median, and mode are equal to 0 and are located at the center of the distribution. 4. The curve never touches the x axis.
Three properties of a good estimator:
1. The estimator should be an unbiased estimator. That is, the expected value or the mean of the estimates obtained from samples of a given size is equal to the parameter being estimated. 2. The estimator should be consistent. For a consistent estimator, as the sample size increases, the value of the estimator approaches the value of the parameter estimated. 3. The estimator should be a relatively efficient estimator. That is, of all the statistics that can be used to estimate a parameter, the relatively efficient estimator has the smallest variance. **will not be on test
Three methods used to test hypothesis
1. The traditional method (this is the only one we are going to do) 2. The P-value method 3. The confidence interval method
The t distribution differs from the standard normal distribution in the following ways
1. The variance is greater than 1. 2. The t distribution is actually a family of curves based on the concept of degrees of freedom, which is related to sample size. 3. As the sample size increases, the t distribution approaches the standard normal distribution. (A smaller sample size flattens the curve; the bigger the sample size gets, the taller the curve.)
Three different things you're going to see for confidence interval & hypothesis testing:
1. confidence interval for the mean --> going to have sigma 2. confidence interval for the mean and ONLY have s 3. confidence interval for the proportion
For a 90% confidence interval, Z a/2 =
1.65
For a 95% confidence interval, Z a/2 =
1.96
The following data represent a sample of assets (in millions of dollars) of 30 credit unions in southwestern Pennsylvania. Find the 90% confidence interval of mean. Sample mean is 11.091 and standard deviation is 14.405.
11.091 -/+ 1.65 x (14.405 / square root of 30) 11.091 -/+ 4.339 (margin of error) 6.752 < mu < 15.430
For a 99% confidence interval, Z a/2 =
2.58
A large department store found that it averages 362 customers per hour. Assume the standard deviation is 29.6 and a random sample of 40 hours was used to determine the average. Find the 99% confidence interval of the population mean.
362 -/+ 2.58 x (29.6 / square root of 40) 362 -/+ 12 350 < mu < 374 One can be 99% confident that the mean number of customers the store averages is between 350 and 374 customers per hour.
Determine the margin of error for the confidence interval for the mean: 69 < mu < 80
80 - 69 = 11; 11 / 2 = 5.5
Z a/2 for confidence interval %: 90% 95% 99%
90% - 1.65 95% - 1.96 99% - 2.58 (show how to get them from tables/formula sheet)
When should a one-tailed test be used?
A one-tailed test is used when the null hypothesis should be rejected if the test value is in the critical region on one side of the mean.
Best description and example of the null hypothesis in a hypothesis test:
A statistical hypothesis that there is no difference between a parameter and a specific value, or between two parameters. Example: H0: mu = 90
When should a two-tailed test be used?
A two-tailed test is used when the null hypothesis should be rejected if the test value is in the critical region on either side of the mean.
For a random sample of 60 overweight men, the mean of the number of pounds that they were overweight was 32. The standard deviation of the population is 3.8 pounds. Round intermediate answers to at least three decimal places. Round your final answers to one decimal place. A. The best point estimate of the mean is ___ pounds. B. Find the
A. 32 B. C. D.
Using Table E, find the critical value (or values) for the left-tailed test with a = .07. Round to two decimal places, and enter the answers separated by a comma if needed.
Find the area closest to .0700 in Table E. In this case, .0694 so we select the z value corresponding to this area. That is -1.48
Summary of Hypothesis Testing and Critical Values: Left Tailed H1: mu < k (less than) & Right Tailed H1: mu > k (greater than) What are the critical values for the following: a = .10, a = .05, a = .01
Left Tailed a = .10, C.V. = -1.28 a = .05, C.V. = -1.65 a = .01, C.V. = -2.33 Right Tailed a = .10, C.V. = +1.28 a = .05, C.V. = +1.65 a = .01, C.V. = +2.33
Determine the margin of error for the confidence interval for the mean: 60 < mu < 87
Margin of error is half the width of the confidence interval. Find the width by subtracting the lower limit from the upper limit, then divide by 2. 87 - 60 = 27; 27 / 2 = 13.5
For the following conjecture, state the null and alternative hypothesis. The average experience (in seasons) for an NBA player is 4.71
Null - mu = 4.71 Alternative mu does not equal 4.71
For the following conjecture, state the null and alternative hypothesis. The average cost of a DVD player is at most $79.95.
Null - mu = 79.95 alternative mu > 79.95
For the following conjecture, state the null and alternative hypothesis. The average age of first-year medical school students is less than 27 years.
Null hypothesis is H0 mu = 27 Alternative hypothesis is H1: mu < 27
Reject the null means
There is enough evidence to conclude that the null hypothesis is not true
For the following conjecture, state the null and alternative hypothesis. The average resting pulse rate of male marathon runners is greater than 70 beats per minute.
Null mu =70 alternative mu > 70
Type I Error
Rejecting null hypothesis when it is true (the one we want to avoid the most), alpha error
A random sample of 10 children found that their average growth for the first year was 9.8 inches. Assume the variable is normally distributed and the sample standard deviation is 0.96 inch. Find the 95% confidence interval of the population mean for growth during the first year.
Since s (and not sigma) is given, we use the t distribution: sample mean -/+ t a/2 x (sample std dev / square root of n). The degrees of freedom = n - 1. For 95% and d.f. = 9, t a/2 = 2.262. 9.8 -/+ 2.262 x (0.96 / square root of 10) 9.8 -/+ .7 9.1 < mu < 10.5
Using Table E, find the critical value (or values) for the right-tailed test with a = .10.
Since this is a right tailed test, we need to find the area closest to 1-.10 = .9000 in Table E. In this case it's .8997 for z value = 1.28.
Using Table E, find the critical value (or value) for the two tailed test with a = .04. Round to two decimal places, and enter the answer separated by comma if needed.
Since this is a two-tailed test, there are two areas, each equal to a/2 or .04/2 = .02. For the left critical value, find the area closest to .02. For the right critical value, find the area closest to 1 - .02 = .9800. The answer is +/- 2.05.
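The Table E lookups in the left-tailed, right-tailed, and two-tailed cards can be cross-checked in code. The following is a minimal sketch, assuming Python's standard library `statistics.NormalDist` in place of the printed table; the function name and the tail labels are hypothetical, chosen only for this illustration.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Return the z critical value(s) for significance level alpha.

    tail is "left", "right", or "two" (labels invented for this sketch).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        # Area alpha lies to the left of the critical value.
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        # Area 1 - alpha lies to the left of the critical value.
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        # Area alpha / 2 sits in each tail.
        z = std_normal.inv_cdf(1 - alpha / 2)
        return (-z, z)
    raise ValueError("tail must be 'left', 'right', or 'two'")
```

Rounded to two decimal places, this agrees with the cards for a = .07 (left), a = .10 (right), and a = .04 (two-tailed). Exact values can differ from the table in the last digit when the area falls between two table entries, as with a = .05.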
A researcher wishes to estimate the number of days it takes an automobile dealer to sell a Chevrolet Aveo. A sample of 50 cars had a mean time on the dealer's lot of 54 days. Assume the population standard deviation to be 6 days. Find the best point estimate of the population mean and the 95% confidence interval of the population mean.
The best point estimate of the mean is 54 days. Confidence Interval: 54 -/+ 1.96 x (6 / square root of 50) 54 -/+ 1.66 54 -/+ 2 (round rule) 52 < mu < 56 days One can say with 95% confidence that the interval between 52 and 56 days contains the population mean, based on a sample of 50 automobiles.
Determine the margin of error for the confidence interval for the mean: 24 +/- 1.5
The confidence interval is already written in the form point estimate +/- 1.5. Therefore, the margin of error is 1.5
In hypothesis testing, why can't the hypothesis be proved true?
The only way to prove something would be to use the entire population under study.
How to find ^p in a word problem
X / n
What statistic best estimates mu?
X bar
Formula for the Confidence Interval of the Mean for a Specific a
X bar (sample mean) +/- Z a/2 (standard deviation of population / square root of # samples) Z a/2 for confidence interval %: 90% - 1.65 95% - 1.96 99% - 2.58 (show how to get them from table)
A researcher wishes to estimate with 95% confidence the proportion of people who own a home computer. A previous study shows that 40% of those interviewed had a computer at home. The researcher wishes to be accurate within 2% of the true proportion. Find the minimum sample size necessary.
Za/2 = 1.96 p̂ = .40 q hat = .60 E- = 2% n = p hat x q hat x (Za/2 / E-)^2 n = .40 x .60 x (1.96 /.02)^2 n = 2305 people
Find the ^p and ^q for 27%
^p = .27 ^q = 1 - ^p, 1 - .27 = .73
How to find ^p and ^q for percentage
^p = percentage in decimal form q = 1 - ^p
Summary of Hypothesis Testing and Critical Values: Two Tailed What are the critical values for the following a = .10 a = .05 a = .01
a = .10, C.V. = -1.65, +1.65 a = .05, C.V. = -1.96, +1.96 a = .01, C.V. = -2.58, +2.58
Find Z a/2 for the 90% confidence interval
a = 1 - .90 = .10; a/2 = .10 / 2 = .05; 1 - .05 = .9500. Find this area in Table E. It falls midway between the areas for z = 1.64 and z = 1.65; these notes use 1.65.
Find Z alpha/2 for the 95% confidence interval:
a = 1 - .95 (confidence level) = .05; a/2 = .05 / 2 = .025; 1 - .025 = .9750. Find this area in Table E; the corresponding z value is 1.96.
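The same three steps (find a, halve it, look up the area 1 - a/2) can be sketched in code. This is only an illustration, assuming the standard library's `statistics.NormalDist` stands in for Table E; the function name is hypothetical.

```python
from statistics import NormalDist


def z_alpha_over_2(confidence):
    """Return Z a/2 for a confidence level given as a decimal (e.g. 0.95)."""
    alpha = 1 - confidence          # total area in both tails
    area_to_left = 1 - alpha / 2    # the area looked up in Table E
    return NormalDist().inv_cdf(area_to_left)
```

For 95% and 99% the result rounds to 1.96 and 2.58. For 90% the exact value is about 1.645, which is why the table lookup lands between two entries.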
Confidence interval
a specific interval estimate of a parameter determined by using data obtained from a sample and by using the specific confidence level of the estimate
The greek letter alpha represents _____ and the alpha/2 represents
alpha - the total area in both tails of the standard normal distribution curve alpha/2 - the area in each one of the tails
H1
alternative hypothesis; states that there is a difference between a parameter and a specific value, or between two parameters
Confidence interval:
an interval estimate of a parameter is an interval or a range of values, this estimate may or may not contain the value of the parameter being estimated.
Decrease sample size and/or increase confidence interval % will cause the confidence interval range to become
bigger (This is worse) ***will be a multiple choice question on the test
The symbol d.f. will be used for
degrees of freedom
A sample of 7 years of data was taken on the number of home fires started by candles. Find the 99% confidence interval for the mean number of home fires started by candles each year. The sample mean is 7041.4 and the standard deviation is s = 1610.3.
d.f. = 7 - 1 = 6, 99%, t a/2 = 3.707 7041.4 -/+ 3.707 x (1610.3 / square root of 7) 7041.4 -/+ 2256.2 4785.2 < mu < 9297.6
Type II Error
failing to reject a false null hypothesis, beta error
If a problem gives a d.f. that is not on the chart,
go down to the nearest d.f. that is on the chart. Example: n = 44, d.f. = 44 - 1 = 43 --> use 40 (round down to the next lower d.f. on the chart)
How to find the margin of error
half the width of the confidence interval; find the width of the confidence interval by taking the upper limit and subtracting the lower limit. Next, divide the width by 2, since the margin of error is half of the total distance between the endpoints.
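As a quick check on the margin-of-error cards, here is a minimal sketch of that rule in code. The function names are hypothetical, chosen only for this illustration.

```python
def margin_of_error(lower, upper):
    """Half the width of a confidence interval given its two limits."""
    width = upper - lower
    return width / 2


def point_estimate(lower, upper):
    """The midpoint of the interval, which is the point estimate."""
    return (lower + upper) / 2
```

For example, `margin_of_error(69, 80)` gives 5.5 and `margin_of_error(60, 87)` gives 13.5, matching the worked cards.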
Interval estimate
interval or range of values used to estimate the parameter; this estimate may or may not contain the value of the parameter being estimated. It is a better estimate than the point estimate. The bigger the interval, the better the chance that the mean is within it, but if the interval is too big, it is kind of pointless. Increasing the confidence level (say, to 99%) gives a bigger net and a bigger interval, so you are more likely to catch the mean; a 90% confidence interval has a smaller net than a 99% confidence interval. The center of the confidence interval is the point estimate. Sometimes the confidence interval can miss the population mean. You can never be 100% certain with a confidence interval. Sometimes the sample average can be the same as the population average, but not very often.
Margin of error
maximum likely difference between the point estimate of a parameter and the actual value of the parameter. It is half of the width of the confidence interval.
Calculate the confidence interval
mean - z value ( std dev / square root of n) < mu < mean + z value ( std dev / square root of n)
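A minimal sketch of this formula in code, assuming sigma is known and Z a/2 has already been looked up; the function and parameter names are hypothetical.

```python
import math


def z_interval(sample_mean, sigma, n, z):
    """Confidence interval for the mean when sigma is known.

    z is the Z a/2 value for the chosen confidence level (e.g. 1.96 for 95%).
    """
    margin = z * sigma / math.sqrt(n)  # margin of error
    return sample_mean - margin, sample_mean + margin
```

With the drive-thru numbers from these notes, `z_interval(18.63, 5.66, 41, 2.17)` rounds to 16.71 < mu < 20.55.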
If you have a desired margin of error (E), then you can calculate ahead of time the size of the sample you need, formula is
n = (Z a/2 x sigma / E)^2
A scientist wishes to estimate the average depth of a river. He wants to be 99% confident that the estimate is accurate within 2 feet (margin of error). From the previous study, the standard deviation of the depths measured was 4.33 feet. How large a sample is required?
n = (Z a/2 x sigma / E)^2 n = (2.58 x 4.33 / 2)^2 n = 31.2 --> 32 (these types of problems always get rounded up to the next whole number). Therefore, to be 99% confident that the estimate is within 2 feet of the true mean depth, the scientist needs a sample of at least 32 measurements.
A researcher wishes to estimate the average number of minutes per day a person spends on the internet. How large a sample must she select if she wishes to be 99% confident that the population mean is within 8 minutes of the sample mean? Assume the population standard deviation is 42 minutes.
n = (Z a/2 x standard deviation / E)^2, where E is the margin of error. n = (2.58 x 42 / 8)^2 = 183.47 --> 184
Formula for the Minimum Sample Size Needed for an Interval Estimate of the Population Mean
n = (Z a/2 x standard deviation / margin of error E)^2. If necessary, round the answer up to obtain a whole number. Use the next whole number for the sample size.
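The round-up rule is easy to get wrong by hand, so here is a minimal sketch of the formula using `math.ceil`. The function and parameter names are hypothetical.

```python
import math


def sample_size_for_mean(z, sigma, margin):
    """Minimum sample size to estimate a mean within the given margin of error.

    z is Z a/2 for the chosen confidence level; the result is always rounded up.
    """
    return math.ceil((z * sigma / margin) ** 2)
```

With the numbers from these notes, the river-depth problem (z = 2.58, sigma = 4.33, E = 2) gives 32 and the internet-minutes problem (z = 2.58, sigma = 42, E = 8) gives 184.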
Formula for Minimum Sample Size Needed for Interval Estimate of Population Proportion
n = p hat x q hat x (Za/2 / E-)^2 Find p hat and q hat from a prior study. If there is no prior study, use p hat and q hat as 50% (because that's the biggest you can get). If necessary, round up to the next whole number.
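A minimal sketch of the proportion version, assuming p hat defaults to 0.5 when there is no prior study, as the card describes. The function and parameter names are hypothetical.

```python
import math


def sample_size_for_proportion(z, margin, p_hat=0.5):
    """Minimum sample size to estimate a proportion within the given margin.

    p_hat comes from a prior study; 0.5 is the fallback when none exists.
    The result is always rounded up to the next whole number.
    """
    q_hat = 1 - p_hat
    return math.ceil(p_hat * q_hat * (z / margin) ** 2)
```

For the home-computer problem in these notes (z = 1.96, E = .02, p hat = .40), this gives 2305.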
Noncritical region
nonrejection region; the range of test values that indicates that the difference was probably due to chance and that the null hypothesis should not be rejected
H0
null hypothesis
Type I error
occurs if you reject the null hypothesis when it is true; an example would be convicting (rejecting the null hypothesis of innocence) a criminal defendant when the defendant is actually innocent (the null hypothesis is true)
population proportion symbol
p
Proportion
percentage
sample proportion symbol
p̂ (p hat)
For a sample proportion,
p̂ (p hat) = X / n and q̂ (q hat) = (n - X) / n or 1 - p̂, where X = number of sample units that possess the characteristic of interest and n = sample size.
Formula for a Specific Confidence Interval for a Proportion
p̂ -/+ Za/2 x square root of (p hat x q hat / n), when np hat and nq hat are each greater than or equal to 5.
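A minimal sketch of the proportion interval in code, including the check that n x p hat and n x q hat are both at least 5. The function and parameter names are hypothetical.

```python
import math


def proportion_interval(p_hat, n, z):
    """Confidence interval for a proportion.

    p_hat is the sample proportion as a decimal, n the sample size,
    and z the Z a/2 value for the chosen confidence level.
    """
    q_hat = 1 - p_hat
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("n * p_hat and n * q_hat should both be at least 5")
    margin = z * math.sqrt(p_hat * q_hat / n)  # margin of error
    return p_hat - margin, p_hat + margin
```

For the dandelion survey in these notes (p hat = .45, n = 1898, Z a/2 = 1.96 for 95%), the result rounds to 0.428 < p < 0.472.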
A survey of 1898 people found that 45% of the adults said that dandelions were the toughest weeds to control in their yards. Find the 95% confidence interval of the true proportion who said that dandelions were the toughest weeds to control in their yards.
p̂ = .45 q hat = .55 Za/2 for 95% = 1.96 n = 1898 .45 -/+ 1.96 x square root of (.45 x .55 / 1898) .45 -/+ .022 0.428 < p < 0.472
A survey conducted by Sallie Mae and Gallup of 1404 respondents found that 323 students paid for their education by student loans. Find the 90% confidence interval of the true proportion of students who paid for their education by student loans.
p̂ = 323/1404 = .23 q hat = .77 Za/2 for 90% = 1.65 n = 1404 .23 -/+ 1.65 x square root of (.23 x .77 / 1404) .23 -/+ .019 .211 < p < .249
A random sample of 200 workers found that 128 drove to work alone. Find p hat and q hat, where p hat is the proportion of workers who drove to work alone.
p̂ = X / n = 128/200 = 0.64 = 64% q hat = (200 - 128)/200 = 72/200 = 0.36, or 100% - 64% = 36%. 64% of the people in the survey drive to work alone and 36% drive with others.
Complement of sample proportion
q hat
Rounding rule when computing confidence interval
round off to the same number of decimal places as the sample mean given
The value of sigma, when it is not known, must be estimated using
s, the standard deviation of the sample
The best point estimate of the population mean (mu) is the
sample mean (X bar)
Formula for a Specific Confidence Interval for the Mean When sigma is Unknown and n < 30
sample mean -/+ t a/2 x (sample std dev / square root of n). The degrees of freedom = n - 1
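A minimal sketch of the t interval in code. The standard library has no t distribution, so this sketch assumes a tiny lookup table holding only the two t a/2 values quoted in these notes; a real t table has many more rows. The names `T_TABLE` and `t_interval` are hypothetical.

```python
import math

# t a/2 values keyed by (confidence level, degrees of freedom).
# Only the two values used in the worked examples are included here.
T_TABLE = {
    (0.95, 9): 2.262,
    (0.99, 6): 3.707,
}


def t_interval(sample_mean, s, n, confidence):
    """Confidence interval for the mean when only s is known."""
    df = n - 1                          # degrees of freedom
    t = T_TABLE[(confidence, df)]       # KeyError if the row is not in the table
    margin = t * s / math.sqrt(n)       # margin of error
    return sample_mean - margin, sample_mean + margin
```

For the child-growth problem (mean 9.8, s = 0.96, n = 10, 95%), d.f. = 9 and the result rounds to 9.1 < mu < 10.5. For the candle-fire problem (n = 7, 99%), d.f. = 6 and the result rounds to 4785.2 < mu < 9297.6.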
Difference between sigma and s
sigma (standard deviation for the population); s (standard deviation for the sample)
Increase sample size and or decrease your confidence interval % will cause the confidence interval range/width to become
smaller "net smaller" (This is better) ***will be a multiple choice question on the test
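The effect of sample size and confidence level on the width can be seen numerically. This is a minimal sketch assuming sigma is known; the function name is hypothetical, and any sigma and n passed in are made-up illustration values, not data from these notes.

```python
import math
from statistics import NormalDist


def interval_width(sigma, n, confidence):
    """Full width of a z confidence interval for the mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z a/2
    return 2 * z * sigma / math.sqrt(n)                 # twice the margin of error
```

Holding everything else fixed, `interval_width(10, 100, 0.95)` is smaller than `interval_width(10, 25, 0.95)` (a larger sample narrows the interval), and `interval_width(10, 25, 0.90)` is smaller than `interval_width(10, 25, 0.99)` (lower confidence narrows it).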
Point estimate
specific numerical value estimate of a parameter (a confidence interval is better than a point estimate); the best point estimate of the population mean mu is the sample mean X bar
Confidence interval is a
spread out estimate about what you are looking at (confidence is better than a point estimate)
When s is used, especially when the sample size is small (less than 30), critical values greater than the values for z a/2 are used in confidence intervals in order to keep the interval at a given level, such as 95%. These values are taken from the
student t distribution, most often called the t distribution
The degrees of freedom for a confidence interval for the mean are found by
subtracting 1 from the sample size. That is, d.f. = n - 1.
What do you use when a hypothesis testing problem gives the sample mean and the sample standard deviation (s)?
t test
What test do we use for confidence intervals that give you the sample mean and the sample standard deviation (s)?
t-test
If you only have the sample standard deviation (s), you're going to use
t-test
Margin of error (also called the maximum error of the estimate)
the maximum likely difference between the point estimate of a parameter and the actual value of the parameter (the number that you are adding to or subtracting from the sample mean: Z a/2 x (sigma / square root of n))
Type II error
the null hypothesis is not rejected when it is false
the value Z a/2 is
the positive z value which is the right endpoint of the desired confidence interval
parameter
what you are looking at, mean, proportion, or standard deviation
Formula for Confidence Interval for a Mean for a Specific a
x bar (sample mean) +/- Z a/2 (sigma / square root of n) Z a/2 for 90% - 1.65 95% - 1.96 99% - 2.58
If you have the sigma or proportion, you're going to use
z
A random sample of 335 college students were asked if they believed that places could be haunted, and 131 responded yes. Estimate the true proportion of college students who believe in the possibility of haunted places with 99% confidence. According to Time magazine, 37% of Americans believe that places can be haunted. Round intermediate and final answers to three decimal places.
z a/2 for 99% confidence = 2.58 ^p = 131/335 = .391 ^q = .609 ^p +/- z a/2 x square root of (^p x ^q / n) .391 +/- 2.58 x square root of (.391 x .609 / 335) = .391 +/- .069 .322 < p < .460
What do you use when hypothesis testing problem gives a proportion?
z test
What do you use when hypothesis testing problem gives sample mean and sigma?
z test
What test do we use for proportions?
z test
What test do we use for confidence intervals that give you sample mean and sigma?
z-test
Z test vs. t-test vs. proportions
The z-test is used when you're given sigma (the population standard deviation) and the sample mean; it is the stronger test. (Go to the bottom row of the table to find Z a/2.) The t-test is used when you're given only s (the sample standard deviation), not the population standard deviation, along with the sample mean; it is not as strong. For the t-test, you have to use the degrees of freedom to find t a/2 on the t chart. For proportions, we use a different formula, but we go back to using z.
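The decision rule from the "what test do we use" cards (z when sigma is known or for a proportion, t when only s is known) can be summarized as a small sketch. The function name and the string labels are hypothetical, chosen only for this illustration.

```python
def choose_distribution(parameter, sigma_known=False):
    """Return "z" or "t" following the decision rule in these notes.

    parameter is "mean" or "proportion" (labels invented for this sketch).
    sigma_known says whether the population standard deviation is given.
    """
    if parameter == "proportion":
        return "z"                       # proportions always use z
    if parameter == "mean":
        return "z" if sigma_known else "t"
    raise ValueError("parameter must be 'mean' or 'proportion'")
```

For example, a problem that gives a sample mean and only s maps to `choose_distribution("mean", sigma_known=False)`, which returns "t".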