Stats Final Study Guide Ch. 10 and 11
Steps and Questions to know for the final
(a) State the null hypothesis and the alternate hypothesis. (b) What is the decision rule? (c) What is the value of the test statistic? (d) What is your decision regarding the null hypothesis? (e) What is the p-value? (f) Interpret the result. (g) What are the assumptions necessary for this test?
choosing a level of significance rough guide
- .05 for consumer research projects - .01 for quality assurance - .10 for political polling
Hypothesis
- A hypothesis is a statement about a population parameter subject to verification. - A hypothesis is a proposed explanation for a phenomenon.
Hypothesis Testing
- Significance testing helps us judge a claim by asking: can the observed difference be attributed to chance? It is a procedure based on sample evidence and probability theory to determine whether the hypothesis is a reasonable statement. Hypothesis testing uses statistics to measure how likely the observed sample result would be if the hypothesis were true; it does not give the probability that the hypothesis itself is true.
The probability of finding a z value of 1.55 or more
.0606, found by .5000 − .4394, where .4394 is the table area between 0 and z = 1.55.
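The table lookup above can be checked numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

# Upper-tail area P(Z >= 1.55) under the standard normal curve.
z = 1.55
area_0_to_z = NormalDist().cdf(z) - 0.5   # the z-table entry, about .4394
upper_tail = 0.5 - area_0_to_z            # about .0606

print(round(area_0_to_z, 4), round(upper_tail, 4))
```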
Five-Step procedure for testing a hypothesis
1. State the null (H0) and alternate (H1) hypotheses. 2. Select a level of significance (e.g., 1%, 5%, 10%). 3. Identify the test statistic (z or t). 4. Formulate a decision rule (establish the critical value). 5. Take a sample and make a decision (do not reject H0 or reject H0).
one-tailed 0.05 level of significance
Critical value of z = 1.65 (1.645 to three decimals)
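Critical z values like this one can be computed instead of read from a table. The helper below is a sketch; `z_critical` is a hypothetical name, not something from the course materials.

```python
from statistics import NormalDist

def z_critical(alpha, two_tailed=False):
    """Return the positive critical z value for significance level alpha."""
    # A two-tailed test splits alpha evenly between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

one_tail_05 = z_critical(0.05)                   # about 1.645
one_tail_01 = z_critical(0.01)                   # about 2.33
two_tail_05 = z_critical(0.05, two_tailed=True)  # about 1.96
print(round(one_tail_05, 3), round(one_tail_01, 2), round(two_tail_05, 2))
```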
tests concerning proportion using the z-distribution
A proportion is the fraction or percentage that indicates the part of the population or sample having a particular trait of interest. The sample proportion is denoted by p and is found by x/n. It is assumed that the binomial assumptions discussed in Chapter 6 are met: (1) the sample data collected are the result of counts; (2) the outcome of an experiment is classified into one of two mutually exclusive categories, a "success" or a "failure"; (3) the probability of a success is the same for each trial; and (4) the trials are independent. Both nπ and n(1 − π) are at least 5. When the above conditions are met, the normal distribution can be used as an approximation to the binomial distribution.
alternative (research) Hypothesis (H1)
A theory that contradicts the null hypothesis. The theory generally represents that which we adopt only when sufficient evidence exists to establish its truth.
Why do we prefer dependent samples to independent samples?
By using dependent samples, we are able to reduce the variation in the sampling distribution.
Making a decision
Compute the test statistic, compare it to the critical value, and decide whether to reject or not reject the null hypothesis.
Two-sample tests of hypothesis: dependent samples
Dependent samples are samples that are paired or related in some fashion. For example: If you wished to buy a car you would look at the same car at two (or more) different dealerships and compare the prices. If you wished to measure the effectiveness of a new diet you would weigh the dieters at the start and at the finish of the program.
One-tail vs. 2-tail
Depends on your hypothesis
Comparing Population Means with Equal but unknown population standard deviations (the pooled t-test) Test Statistic
Finding the value of the test statistic requires two steps: 1. Pool the sample standard deviations. 2. Use the pooled standard deviation in the formula.
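The two steps can be sketched in code, assuming the standard pooled formulas sp² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²] / (n₁ + n₂ − 2) and t = (x̄₁ − x̄₂) / √[sp²(1/n₁ + 1/n₂)]. The function name and the sample values are hypothetical; the data are not given in this guide.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(sample1, sample2):
    """Pooled two-sample t statistic and its degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Step 1: pool the two sample variances, weighting each by its own df.
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    # Step 2: use the pooled variance in the t formula.
    t = (mean(sample1) - mean(sample2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df

# Hypothetical assembly times in minutes for two methods (illustration only).
method_w = [2, 4, 9, 3, 2]
method_a = [3, 7, 5, 8, 4, 3]
t, df = pooled_t(method_w, method_a)
print(round(t, 3), df)
```

With 9 degrees of freedom and a two-tailed .10 level, the critical values are ±1.833, so a t this close to zero would not lead to rejecting H0.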
Important things to remember about H0 and H1
H0 and H1 are mutually exclusive and collectively exhaustive. H0 is always presumed to be true. H1 has the burden of proof. A random sample (n) is used to "reject H0". If we conclude "do not reject H0", this does not necessarily mean that the null hypothesis is true; it only suggests that there is not sufficient evidence to reject H0. Rejecting the null hypothesis, then, suggests that the alternative hypothesis may be true. Equality is always part of H0 (e.g. "=", "≥", "≤"). "≠", "<" and ">" are always part of H1. In actual practice, the status quo is set up as H0.
Dependent vs independent samples
How do we tell between dependent and independent samples? 1. A dependent sample is characterized by a measurement followed by an intervention of some kind and then another measurement. This could be called a "before" and "after" study. 2. A dependent sample is characterized by matching or pairing observations.
Comparing two population means: known population standard deviations (z-test)
No assumptions about the shape of the populations are required. The samples are from independent populations. The formula for computing the value of z is: z = (x̄₁ − x̄₂) / √(σ₁²/n₁ + σ₂²/n₂)
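For two independent samples whose population standard deviations are known, the usual statistic is z = (x̄₁ − x̄₂) / √(σ₁²/n₁ + σ₂²/n₂). The sketch below assumes that formula; the function name and the summary numbers are hypothetical and are not taken from this guide.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2):
    """z statistic for two independent samples with known population sigmas."""
    return (mean1 - mean2) / sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)

# Hypothetical summary data: mean checkout time, known sigma, sample size.
z = two_sample_z(5.50, 5.30, 0.40, 0.30, 50, 100)

# One-tailed test at the .01 level: reject H0 if z exceeds the critical value.
critical = NormalDist().inv_cdf(0.99)   # about 2.33
reject = z > critical
print(round(z, 2), round(critical, 2), reject)
```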
Type I error
Rejecting the null hypothesis H0 when it is true. In a courtroom: an innocent person is convicted.
Comparing Population means (Pooled t-test) example
Step 1: State the null and alternate hypotheses. (Keyword: "Is there a difference") H0: μW = μA; H1: μW ≠ μA. Step 2: State the level of significance. The 0.10 significance level is stated in the problem. Step 3: Select the appropriate test statistic. Because the population standard deviations are not known but are assumed to be equal, we use the pooled t-test. Step 4: State the decision rule. Reject H0 if t > t(α/2, nW + nA − 2) or t < −t(α/2, nW + nA − 2); that is, t > t(.05, 9) or t < −t(.05, 9), so t > 1.833 or t < −1.833. Step 5: Take a sample and make a decision. The decision is not to reject the null hypothesis because −0.662 falls in the region between −1.833 and 1.833. Step 6: Interpret the result. The data show no evidence that there is a difference in the mean times to mount the engine on the frame between the Welles and Atkins methods.
Comparing two pop. means ex: Checkout time
Step 1: State the null and alternate hypotheses. (Keyword: "longer than") H0: μS ≤ μU; H1: μS > μU. Step 2: Select the level of significance. The .01 significance level is stated in the problem. Step 3: Determine the appropriate test statistic. Because both population standard deviations are known, we can use the z-distribution as the test statistic. Step 4: Formulate a decision rule. Reject H0 if z > z(α); that is, z > 2.33. Step 5: Compute the value of z and make a decision. The computed value of 3.13 is larger than the critical value of 2.33. Our decision is to reject the null hypothesis. The difference of .20 minutes between the mean checkout times of the two methods (S, the standard method, and U) is too large to have occurred by chance.
two sample test of proportions- example
Step 1: State the null and alternate hypotheses. (Keyword: "there is a difference") H0: π1 = π2; H1: π1 ≠ π2. Step 2: Select the level of significance. The .05 significance level is stated in the problem. Step 3: Determine the appropriate test statistic. We will use the z-distribution. Step 4: Formulate the decision rule. Reject H0 if z > z(α/2) or z < −z(α/2); that is, z > z(.05/2) or z < −z(.05/2), so z > 1.96 or z < −1.96. Step 5: Select a sample and make a decision. The computed value of −2.21 is in the area of rejection. Therefore, the null hypothesis is rejected at the .05 significance level. To put it another way, we reject the null hypothesis that the proportion of young women who would purchase Heavenly is equal to the proportion of older women who would purchase Heavenly. (Formulas are on the slide.)
Hypothesis testing involving dependent samples- example
Step 1: State the null and alternate hypotheses. H0: μd = 0; H1: μd ≠ 0. Step 2: State the level of significance. The .05 significance level is stated in the problem. Step 3: Find the appropriate test statistic. We use the t-test. Step 4: State the decision rule. Reject H0 if t > t(α/2, n − 1) or t < −t(α/2, n − 1); that is, t > t(.025, 9) or t < −t(.025, 9), so t > 2.262 or t < −2.262. Step 5: Compute the value of t and make a decision. The computed value of t is greater than the higher critical value, so our decision is to reject the null hypothesis. We conclude that there is a difference in the mean appraised values of the homes.
Comparing population means with unknown and unequal population standard deviations- Example
Step 1: State the null and alternate hypotheses. H0: μ1 = μ2; H1: μ1 ≠ μ2. Step 2: State the level of significance. The .10 significance level is stated in the problem. Step 3: Find the appropriate test statistic. We use a t-test adjusted for unequal variances. Step 4: State the decision rule. Reject H0 if t > t(α/2, d.f.) or t < −t(α/2, d.f.); that is, t > t(.05, 10) or t < −t(.05, 10), so t > 1.812 or t < −1.812. Step 5: Compute the value of t and make a decision. The computed value of t (−2.474) is less than the lower critical value, so our decision is to reject the null hypothesis. Step 6: Interpret the result. We conclude that the mean absorption rate for the two towels is not the same.
Testing a hypothesis of two equal population variances: the F distribution
The distribution is named to honor Sir Ronald Fisher, one of the founders of modern-day statistics. The distribution is: (1) used to test a hypothesis of equal population variances; (2) used to simultaneously test a hypothesis that several population means are equal. The simultaneous comparison of several population means is called analysis of variance (ANOVA). The question in the two-variance test: are the variances of the two populations equal?
Comparing Population Means with Equal but unknown population standard deviations (the pooled t-test)
The t distribution is used as the test statistic if one or more of the samples have fewer than 30 observations. The required assumptions are: 1. Both populations must follow the normal distribution. 2. The populations must have equal standard deviations. 3. The samples are from independent populations. The method described does not require that we know the standard deviations of the populations. This gives us a great deal more flexibility when investigating the difference in sample means. There are two major differences between this test and the previous test described in this chapter: (1) We assume the sampled populations have equal but unknown standard deviations. Because of this assumption, we combine or "pool" the sample standard deviations. (2) We use the t distribution as the test statistic.
Comparing population means with unknown and unequal population standard deviations
Use the formula for the t-statistic shown if it is not reasonable to assume the population standard deviations are equal. The degrees of freedom are adjusted downward by a rather complex approximation formula. The effect is to reduce the number of degrees of freedom in the test, which will require a larger value of the test statistic to reject the null hypothesis.
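A sketch of the unequal-variance test, assuming the usual formulas t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂) and df = (s₁²/n₁ + s₂²/n₂)² / [(s₁²/n₁)²/(n₁ − 1) + (s₂²/n₂)²/(n₂ − 1)], with df truncated to a whole number. The function name and the data are hypothetical and are not given in this guide.

```python
from math import sqrt
from statistics import mean, variance

def unequal_variance_t(sample1, sample2):
    """t statistic and approximate df when the variances are not pooled."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1   # s1^2 / n1
    v2 = variance(sample2) / n2   # s2^2 / n2
    t = (mean(sample1) - mean(sample2)) / sqrt(v1 + v2)
    # Approximation formula for df; truncating adjusts it downward.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, int(df)

# Hypothetical absorption measurements for two brands (illustration only).
brand_1 = [8, 8, 3, 1, 9, 7, 5, 5, 12]
brand_2 = [12, 11, 10, 6, 8, 9, 9, 10, 11, 9, 8, 10]
t, df = unequal_variance_t(brand_1, brand_2)
print(round(t, 2), df)
```

Here the pooled test would use 9 + 12 − 2 = 19 degrees of freedom, while the approximation gives only 10, which illustrates the downward adjustment described above.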
Two sample tests of proportion
We investigate whether two samples came from populations with an equal proportion of successes. The two samples are pooled using the following formula: pc = (x₁ + x₂)/(n₁ + n₂), where x₁ = number possessing the trait in the first sample, x₂ = number possessing the trait in the second sample, n₁ = number of observations in the first sample, n₂ = number of observations in the second sample.
Formulate the decision rule
a statement of specific conditions under which the null hypothesis is rejected or not rejected
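A decision rule for a z test can be written as a small function. This is a sketch only; `decide` and its `tail` argument are hypothetical names chosen for illustration.

```python
from statistics import NormalDist

def decide(z, alpha, tail="two"):
    """Apply the decision rule for a z test and return the decision."""
    nd = NormalDist()
    if tail == "two":
        # Rejection region is split between both tails.
        reject = abs(z) > nd.inv_cdf(1 - alpha / 2)
    elif tail == "right":
        reject = z > nd.inv_cdf(1 - alpha)
    elif tail == "left":
        reject = z < nd.inv_cdf(alpha)   # critical value is negative here
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    return "Reject H0" if reject else "Do not reject H0"

print(decide(-2.21, 0.05))            # two-tailed, critical values about ±1.96
print(decide(3.13, 0.01, "right"))    # one-tailed, critical value about 2.33
print(decide(1.55, 0.05, "right"))    # one-tailed, critical value about 1.645
```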
Type II error
Failing to reject the null hypothesis when it is false. In a courtroom: a guilty person is acquitted.
Hypothesis testing involving dependent samples
t = d̄ / (sd / √n), where d̄ = mean of the differences, sd = standard deviation of the differences, n = number of pairs (differences)
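A sketch of the paired test, assuming the standard statistic t = d̄ / (sd / √n) with n − 1 degrees of freedom. The function name and the paired values are hypothetical and are not given in this guide.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Paired t statistic and degrees of freedom for two matched lists."""
    if len(first) != len(second):
        raise ValueError("dependent samples must have the same number of values")
    diffs = [a - b for a, b in zip(first, second)]   # one difference per pair
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical appraisals (in $000) of the same ten homes by two firms.
firm_1 = [235, 210, 231, 242, 205, 230, 231, 210, 225, 249]
firm_2 = [228, 205, 219, 240, 198, 223, 227, 215, 222, 245]
t, df = paired_t(firm_1, firm_2)
print(round(t, 3), df)
```

With 9 degrees of freedom and a two-tailed .05 level, the critical values are ±2.262, so a t of this size would fall in the rejection region.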
p-value in hypothesis testing
The "P" stands for probability. The p-value is the probability of seeing the observed difference, or a greater one, just by chance if the null hypothesis is true. Being a probability, P can take any value between 0 and 1.
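For a z test, the p-value is a tail area of the standard normal curve. The helper below is a sketch; `p_value` and its `tail` argument are hypothetical names.

```python
from statistics import NormalDist

def p_value(z, tail="two"):
    """p-value for an observed z statistic."""
    nd = NormalDist()
    if tail == "two":
        return 2 * (1 - nd.cdf(abs(z)))   # area in both tails
    if tail == "right":
        return 1 - nd.cdf(z)              # area to the right of z
    if tail == "left":
        return nd.cdf(z)                  # area to the left of z
    raise ValueError("tail must be 'two', 'right', or 'left'")

p_two = p_value(-2.21)            # about .027, below a .05 significance level
p_right = p_value(3.13, "right")  # well below a .01 significance level
print(round(p_two, 3), p_right < 0.01)
```

Rejecting H0 when the p-value is less than the significance level gives the same decision as comparing the test statistic to the critical value.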
the test statistic for a proportion is computed:
z = (p − p0) / √[p0(1 − p0)/n], where p0 = hypothesized population proportion, p = sample proportion, n = sample size
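A sketch of the one-sample proportion test, assuming the standard statistic z = (p − p0) / √[p0(1 − p0)/n] and the rule that n·p0 and n(1 − p0) should both be at least 5. The function name and the counts are hypothetical.

```python
from math import sqrt

def proportion_z(x, n, p0):
    """z statistic for testing a sample proportion x/n against p0."""
    # The normal approximation needs both expected counts to be at least 5.
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("normal approximation to the binomial is not appropriate")
    p = x / n   # sample proportion
    return (p - p0) / sqrt(p0 * (1 - p0) / n)

# Hypothetical poll: 1550 of 2000 sampled voters, hypothesized proportion .80.
z = proportion_z(1550, 2000, 0.80)
print(round(z, 2))
```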
when population standard deviation is unknown
t-statistic is used
Identify Test Statistic
test statistic is a value, determined from sample information, used to determine whether to reject the null hypothesis.
Critical Value
the dividing point between the region where the null hypothesis is rejected and the region where it is not rejected (will either be a z-score or a t-statistic)
Level of Significance (alpha, α)
the probability of rejecting the null hypothesis when it is true (level of risk)
Null (H0) Hypothesis
theory about the specific values of one or more population parameters. The theory generally represents the status quo, which we adopt until it is proven false.
two-tailed
used when direction is unknown
testing for population mean: Population Standard deviation unknown
When the population standard deviation (σ) is unknown, the sample standard deviation (s) is used in its place. The t-distribution is used as the test statistic, which is computed using the formula t = (x̄ − μ) / (s / √n), where x̄ = sample mean, μ = hypothesized population mean, s = sample standard deviation, n = number of observations in the sample.
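A sketch of the one-sample t statistic from summary values, assuming the standard formula t = (x̄ − μ) / (s / √n) with n − 1 degrees of freedom. The function name and the numbers are hypothetical.

```python
from math import sqrt

def one_sample_t(xbar, mu, s, n):
    """t statistic and degrees of freedom from summary statistics."""
    standard_error = s / sqrt(n)   # estimated standard error of the mean
    return (xbar - mu) / standard_error, n - 1

# Hypothetical summary: sample mean 56.42, hypothesized mean 60, s = 10.04, n = 26.
t, df = one_sample_t(56.42, 60, 10.04, 26)
print(round(t, 3), df)
```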
one-tail
Used when your research hypothesis states the direction of the difference or relationship
the value of the test-statistic for two-sample proportions
z = (p₁ − p₂) / √{[pc(1 − pc)/n₁] + [pc(1 − pc)/n₂]}, where pc is the pooled proportion, pc = (x₁ + x₂)/(n₁ + n₂)
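The formula can be sketched in code, assuming the pooled proportion is pc = (x₁ + x₂)/(n₁ + n₂). The function name and the counts are hypothetical and are not given in this guide.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for the difference between two sample proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pc = (x1 + x2) / (n1 + n2)   # pooled proportion
    z = (p1 - p2) / sqrt(pc * (1 - pc) / n1 + pc * (1 - pc) / n2)
    return z, pc

# Hypothetical counts: 19 of 100 in the first sample, 62 of 200 in the second.
z, pc = two_proportion_z(19, 100, 62, 200)
print(round(z, 2), round(pc, 2))
```

For a two-tailed test at the .05 level the critical values are ±1.96, so a z of this size falls in the rejection region.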
At the start of the procedure, there are two hypotheses:
1. The null hypothesis (H0) is "the defendant is not guilty". 2. The alternative hypothesis (H1) is "the defendant is guilty".