Lesson 31: The t-test for the Population Mean (sigma unknown)
requirements for t-test (4)
1. The conclusion/claim is about the population mean μ 2. SRS 3. σ is UNKNOWN 4. Population distribution is single-peaked with no excessively long tails
properties of t distribution (4)
1. Symmetrical 2. Bell-shaped (unimodal) 3. Mean = 0 (it is a standardized score) 4. Has degrees of freedom (df) [this is where z & t are DIFFERENT]: the smaller the df (based on SAMPLE size), the larger the spread (because of the extra uncertainty from using s); the larger the df, the smaller the spread and the closer the t distribution is to the standard normal.
implications of t-tests (2)
1. The null distribution of our t-test statistic, t = (x̄ − μ0) / (s/√n), is the t distribution with (n-1) d.f. In other words, when Ho is true (i.e., when μ = μ0), our test statistic has a t distribution with (n-1) d.f., and this is the distribution under which we find p-values. 2. For a large sample size (n), the null distribution of the test statistic is approximately Z, so whether we use t(n-1) or Z to calculate the p-values should not make a big difference. If we have a large n, our sample has more information about the population. Therefore, we can expect the sample standard deviation s to be close enough to the population standard deviation σ that for practical purposes we can use s as the known σ, and we're back to the z-test.
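A minimal sketch of implication 2 in Python (assuming scipy is available; the test statistic value 2.0 is arbitrary): for the same test statistic, the p-value under t(n-1) is larger than under Z but approaches it as n grows.

```python
# Sketch: one-sided p-values for the same test statistic value under t(n-1)
# vs. the standard normal Z. The value 2.0 is arbitrary.
from scipy import stats

t_stat = 2.0
print(f"Z:              p = {stats.norm.sf(t_stat):.4f}")
for n in (5, 15, 30, 100, 1000):
    p_t = stats.t.sf(t_stat, df=n - 1)   # p-value under the t(n-1) null distribution
    print(f"t, n = {n:4d}:   p = {p_t:.4f}")
# The t p-values are larger (heavier tails) but converge to the Z p-value as n grows.
```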
t-test
Any hypothesis test in which the test statistic follows Student's t distribution if the null hypothesis is true.
Why is the spread greater on a t-test?
Because we don't know σ, we are less sure, and that gives us a larger spread and greater variability.
How to look up degrees of freedom (df = n-1)?
1. Find the test statistic (t).
2. Calculate the degrees of freedom: df = n (sample size) - 1.
3. Look up the df on the left side of the t distribution table (n-1, e.g. 91). If the exact df isn't listed, always round DOWN to be conservative.
4. Find the p-value range the test statistic (t) falls between (the table will rarely give the exact number), e.g. the p-value is between .400 and .500.
5. Draw conclusions in context (.400 < p < .500): Is the p-value < alpha? Reject/fail to reject. Statistically significant? State it in context.
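Software gives the exact p-value rather than a table range; a small sketch (the t value and sample size below are hypothetical, echoing the df = 91 example above):

```python
# Sketch: exact two-sided p-value for a hypothetical t statistic with df = n - 1.
from scipy import stats

t_stat = 0.76        # hypothetical test statistic
n = 92               # hypothetical sample size, so df = n - 1 = 91
alpha = 0.05

df = n - 1
p_value = 2 * stats.t.sf(abs(t_stat), df)    # two-sided, by symmetry of the t distribution
print(f"df = {df}, p-value = {p_value:.3f}")  # lands in the .400-.500 range from the card

print("Reject Ho" if p_value < alpha else "Fail to reject Ho")
```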
T/F A t-test is used for income.
False. Income distributions are typically strongly right-skewed with an excessively long tail, which violates the t-test's conditions.
T/F The difference between a t and a z score formula is in the numerator of the equation.
False. The change in the t-score formula is in the denominator (we divide by the standard error of x̄, namely s/√n).
T/F Calculating a t-score, the standard distribution of a sample is used.
False. The sample histogram is used, NOT the sampling distribution (we only have a single sample mean...).
four steps for the t-test:
I. State the claim (about the parameter of interest).
II. a. Choose the procedure b. Specify Ho and Ha c. Specify alpha.
III. Solve: a. Check the conditions under which the t-test can be used: 1. SRS 2. Normality (not too skewed, or a high enough n, so that x̄ is at least approximately normal) 3. s (replaces sigma, which is UNKNOWN) b. Calculate the test statistic (t), summarizing the data (the standard score assuming Ho is true) c. Calculate the p-value.
IV. Conclusions: a. Compare the p-value to alpha b. Reject/fail to reject c. Is it statistically significant? d. State the conclusion in context.
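A compact sketch of the four steps on made-up numbers (the hypothesized mean μ0 = 100 and the sample values are hypothetical; scipy's ttest_1samp is used as a cross-check):

```python
# Sketch of the four steps with hypothetical data: Ho: mu = 100 vs Ha: mu != 100.
import numpy as np
from scipy import stats

# I-II. Claim about mu; one-sample t-test; Ho: mu = mu0, Ha: mu != mu0; alpha = 0.05.
mu0, alpha = 100.0, 0.05

# III. Solve (conditions assumed checked: SRS, histogram not too skewed).
sample = np.array([103.2, 98.7, 101.5, 97.9, 105.1, 99.4, 102.8, 100.6, 96.5, 104.0])
n = len(sample)
xbar = sample.mean()
s = sample.std(ddof=1)                            # s replaces the unknown sigma

t_stat = (xbar - mu0) / (s / np.sqrt(n))          # test statistic
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)   # two-sided p-value under t(n-1)

res = stats.ttest_1samp(sample, popmean=mu0)      # same test in one call
print(f"by hand: t = {t_stat:.3f}, p = {p_value:.3f}")
print(f"scipy:   t = {res.statistic:.3f}, p = {res.pvalue:.3f}")

# IV. Conclude.
print("Reject Ho" if p_value < alpha else "Fail to reject Ho")
```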
the only difference between the formula for the Z statistic and the formula for the t statistic:
In the formula for the Z statistic, sigma (the standard deviation of the population) must be known; whereas, when sigma isn't known, then "s" (the standard deviation of the sample data) is used in place of the unknown sigma. That's the change that causes the statistic to be a t statistic.
Can we just use S instead of σ, and assume everything else is the same?
No. Replacing σ with s changes the null distribution of the test statistic: it follows the t distribution with n-1 degrees of freedom rather than the standard normal Z.
N(μ, σ) =
Normal distribution with mean (μ) and standard deviation (σ)
In only a few cases is σ known. What can we use to replace σ?
The sample standard deviation (s). (Note that this is exactly what we did when we discussed confidence intervals).
How is the t distribution fundamentally different from the normal z-distribution?
The spread. Greater spread, fatter tails, the center is not as peaked.
How does changing from a z-test to a t-test change the distribution of the test statistic?
The t-test statistic in the test for the mean does NOT follow a standard NORMAL distribution. Rather, it follows another bell-shaped distribution called the t DISTRIBUTION.
T/F Like all distributions that are used as probability models, the normal and the t distribution are both scaled, so the total area under each of them is 1.
True
T/F In the z-test we divide the numerator by sigma/sq. rt. of n. For a t-test we divide the numerator by the standard error of x¯.
True. For the z-test we divide by the standard deviation of x̄, namely σ/√n; for the t-test we divide by the standard error of x̄, namely s/√n.
T/F A t test may be used if data is somewhat skewed.
True. As long as it isn't TOO skewed (tails aren't excessively long) AND sample size is LARGE. (90 is ample...I'm not sure how low it can go)
in the context of a test for the mean, the larger the sample size, the higher the degrees of freedom, and the closer the t distribution is to
the standard normal (z) distribution
t-statistic is used to test hypotheses about . . .
a population when the value of the population variance (and standard deviation) is unknown. It uses the same formula as the z-statistic except that the estimated standard error is substituted for the standard error in the denominator
the effect of the t distribution is most important for a study with
a relatively small sample size.
Confidence intervals ___ be used to carry out the two-sided test a. Cannot b. Can
can
higher degrees of freedom indicate that the t distribution is
closer to normal
The p-value of the t-test is found
exactly the same way as it is found for the z-test, except that the t distribution is used instead of the Z distribution
t distribution is more appropriate in cases where there is
more variability.
degrees of freedom (df)
n-1. [Quizlet:] Look up on the table. The number of scores free to vary when estimating a population parameter; usually part of a formula for making that estimate. For example, in the formula for estimating the population variance from a single sample, the degrees of freedom is the number of scores minus 1 (n-1).
when sigma is unknown, the test statistic in the test for a mean uses
s (the sample standard deviation) instead of σ (sigma)
standard error of x¯
s / √n
p-values for a t test are calculated under the
t distribution instead of under the Z distribution.
In the t-test, the test statistic is t = (x̄ − μ0) / (s/√n), whose null distribution is
t(n - 1) (under which the p-values are calculated).
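A small simulation sketch of this fact (the population values are arbitrary): drawing many samples with Ho true, the simulated t statistics match the tail probabilities of t(n-1).

```python
# Sketch: simulate the t statistic under Ho many times; its distribution
# matches t(n-1). Population parameters here are arbitrary.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu0, sigma, n, reps = 50.0, 10.0, 8, 100_000

samples = rng.normal(mu0, sigma, size=(reps, n))   # Ho is true: mu = mu0
xbar = samples.mean(axis=1)
s = samples.std(axis=1, ddof=1)
t_stats = (xbar - mu0) / (s / np.sqrt(n))

for cutoff in (1.0, 2.0, 3.0):
    simulated = (t_stats > cutoff).mean()
    model = stats.t.sf(cutoff, df=n - 1)
    print(f"P(t > {cutoff}): simulated {simulated:.4f} vs t({n - 1}) {model:.4f}")
```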
On a t test, the unit for x̄ is
the same as the response variable
On a t test, the unit for s is
the same as the response variable.
In the z-test, the test statistic is z = (x̄ − μ0) / (σ/√n), whose null distribution is
the standard normal distribution (under which the p-values are calculated).
1-sample t-test is used . . .
to determine whether a hypothesized population mean differs significantly from an observed sample mean
due to the symmetry of the t distribution, for a given value of the test statistic t, the p-value for the two-sided test is ____ as large as the p-value of either of the one-sided tests.
twice
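A quick sketch of this symmetry (the t value and df below are arbitrary):

```python
# Sketch: the two-sided p-value is twice the one-sided (upper-tail) p-value,
# by symmetry of the t distribution. Values of t and df are arbitrary.
from scipy import stats

t_stat, df = 1.8, 24
p_one_sided = stats.t.sf(t_stat, df)             # Ha: mu > mu0
p_two_sided = 2 * stats.t.sf(abs(t_stat), df)    # Ha: mu != mu0

print(f"one-sided p = {p_one_sided:.4f}")
print(f"two-sided p = {p_two_sided:.4f}  (exactly twice the one-sided p)")
```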
The main difference between the z-test and the t-test for the population mean is that
we use the sample standard deviation s instead of the unknown population standard deviation σ.
the _____ is a good approximation for the _____ for large sample sizes
z-test; t-test
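One numerical way to see this (a sketch, assuming scipy): the two-sided 5% critical value of t(n-1) approaches the z critical value 1.96 as n grows.

```python
# Sketch: two-sided 5% critical values of t(n-1) approach the z value 1.96.
from scipy import stats

print(f"z critical value: {stats.norm.ppf(0.975):.3f}")
for n in (5, 15, 30, 100, 1000):
    print(f"n = {n:4d}: t critical value = {stats.t.ppf(0.975, df=n - 1):.3f}")
```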
The major difference in the "4 Steps for Test Significance" between a z-test and a t test is
σ is no longer known (we use s, the standard deviation of the sample)