Chapter 8
Concerns About Hypothesis Testing
there are two serious limitations with using a hypothesis test to establish the significance of a treatment effect -when the null hypothesis is rejected, we are actually making a strong probability statement about the sample data, not about the null hypothesis -demonstrating a significant treatment effect does not necessarily indicate a substantial treatment effect
Scientific Method
specify a problem -form a hypothesis -test a hypothesis -collect data -analyze data -infer conclusions -make a theory
Measuring Effect Size
a measure of effect size is intended to provide a measurement of the absolute magnitude of a treatment effect, independent of the size of the sample(s) being used -Cohen's d measures the size of the mean difference in terms of the standard deviation -estimated Cohen's d = mean difference / standard deviation
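As a rough illustration of the formula above, the sketch below computes an estimated Cohen's d. The function name `cohens_d` and the numbers (a sample mean of 54, a population mean of 50, a standard deviation of 10) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def cohens_d(sample_mean, population_mean, sigma):
    """Estimated Cohen's d: the mean difference in standard-deviation units."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (sample_mean - population_mean) / sigma


# Hypothetical values, for illustration only.
d = cohens_d(sample_mean=54, population_mean=50, sigma=10)
print(f"d = {d:.2f}")  # d = 0.40
```

Note that the sample size does not appear anywhere in the calculation, which is what makes d independent of the number of scores.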
More About Hypothesis Tests
a result is said to be significant or statistically significant if it is very unlikely to occur when the null hypothesis is true -that is, the result is sufficient to reject the null hypothesis -thus, a treatment has a significant effect if the decision from the hypothesis test is to reject H0.
Hypothesis test
a statistical method that uses sample data to evaluate a hypothesis about a population; the general goal of a hypothesis test is to rule out chance(sampling error) as a plausible explanation for the results from a research study -state: hypothesis about the population -use: hypothesis to predict the characteristics the sample should have -obtain: a sample from the population -compare: data with the hypothesis prediction
Factors That Affect Statistical Power
as the effect size increases, the probability of rejecting H0 also increases, which means that the power of the test increases -one factor that has a huge influence on power is the size of the sample -reducing the alpha level for a hypothesis test also reduces the power of the test -if the treatment effect is in the predicted direction, changing from a two tailed test to a one tailed test increases power
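The four factors listed above can be checked numerically. The following is a minimal sketch, assuming a one-sample z test with a known population standard deviation; the function name `z_test_power` and all the numbers are hypothetical. For the one tailed case the critical region is placed in the upper tail, so the effect is assumed to be in the predicted (positive) direction.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05, two_tailed=True):
    """Power of a one-sample z test when the true mean shift is `effect`."""
    std_normal = NormalDist()
    shift = effect / (sigma / sqrt(n))  # true shift in standard-error units
    if two_tailed:
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        upper = 1 - std_normal.cdf(z_crit - shift)  # upper critical region
        lower = std_normal.cdf(-z_crit - shift)     # lower critical region
        return upper + lower
    z_crit = std_normal.inv_cdf(1 - alpha)
    return 1 - std_normal.cdf(z_crit - shift)


# Hypothetical baseline: a 4-point effect, sigma = 10, n = 25, alpha = .05.
baseline = z_test_power(effect=4, sigma=10, n=25)
bigger_effect = z_test_power(effect=8, sigma=10, n=25)
bigger_sample = z_test_power(effect=4, sigma=10, n=100)
smaller_alpha = z_test_power(effect=4, sigma=10, n=25, alpha=0.01)
one_tailed = z_test_power(effect=4, sigma=10, n=25, two_tailed=False)

for label, value in [
    ("baseline", baseline),
    ("larger effect", bigger_effect),
    ("larger sample", bigger_sample),
    ("smaller alpha", smaller_alpha),
    ("one tailed", one_tailed),
]:
    print(f"{label:>14}: power = {value:.3f}")
```

Under these assumptions, power rises with a larger effect, a larger sample, or a one tailed test, and falls when alpha is reduced.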
Hypothesis Test: Step 3
compare the sample means(data) with the null hypothesis -compute the test statistic -the test statistic(z score) forms a ratio comparing the obtained difference between the sample mean and the hypothesized population mean vs. the amount of difference we would expect without any treatment effect
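The ratio described above can be sketched in a few lines. This assumes the population standard deviation is known, so the standard error is sigma divided by the square root of n. The function name `z_statistic` and the numbers are hypothetical.

```python
from math import sqrt


def z_statistic(sample_mean, mu_h0, sigma, n):
    """z = (obtained difference) / (difference expected by chance)."""
    standard_error = sigma / sqrt(n)  # expected M-to-mu distance with no effect
    return (sample_mean - mu_h0) / standard_error


# Hypothetical data: M = 54 from n = 25 scores, H0 says mu = 50, sigma = 10.
z = z_statistic(sample_mean=54, mu_h0=50, sigma=10, n=25)
print(f"z = {z:.2f}")  # z = 2.00
```

Here the obtained difference (4 points) is twice as large as the standard error (2 points), which is what z = 2.00 expresses.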
Uncertainty and Errors in Hypothesis Testing
hypothesis testing is an inferential process, which means that it uses limited information as the basis for reaching a general conclusion -specifically, a sample provides only limited or incomplete information about the whole population, and yet a hypothesis test uses a sample to draw a conclusion about the population -in this situation, there is always the possibility that an incorrect conclusion will be made
Hypothesis testing
if the individuals in a sample are noticeably different from the individuals in the original population, we have evidence that the treatment has an effect -however, it is also possible that the difference between the sample and the population is simply sampling error -the purpose of the hypothesis test is to decide between 2 explanations: -the difference between the sample and the population can be explained by sampling error -the difference between the sample and the population is too large to be explained by sampling error
Hypothesis Test: Step 4
if the test statistic is in the critical region, we conclude that the difference is significant or that the treatment has a significant effect; in this case, we reject the null hypothesis -if the test statistic is not in the critical region, we conclude that the evidence from the sample is not sufficient, and the decision is to fail to reject the null hypothesis
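The decision rule above can be written as a small function. This is a sketch for a two tailed z test only; the name `decide` and the example z values are hypothetical.

```python
from statistics import NormalDist


def decide(z, alpha=0.05):
    """Return the Step 4 decision for a two tailed z test."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # boundary of the critical region
    if abs(z) > z_crit:
        return "reject H0"
    return "fail to reject H0"


print(decide(2.0))              # reject H0
print(decide(1.5))              # fail to reject H0
print(decide(2.0, alpha=0.01))  # fail to reject H0
```

The same z score of 2.0 leads to different decisions at alpha = .05 and alpha = .01, because the stricter alpha level moves the critical boundary further out.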
Directional Tests
in a directional hypothesis test, or a one tailed test, the statistical hypotheses (H0 and H1) specify either an increase or decrease in the population mean -that is, they make a statement about the direction of the effect -when a specific direction is expected for the treatment effect, it is possible for the researcher to perform a directional test -the first step (and the most critical step) is to state the statistical hypotheses -the null hypothesis states that there is no treatment effect in the predicted direction, and the alternative hypothesis states that there is an effect in that direction -the two hypotheses are mutually exclusive and cover all the possibilities -the critical region is defined by sample outcomes that are very unlikely to occur if the null hypothesis is true (that is, if the treatment has no effect) -because the critical region is contained in one tail of the distribution, a directional test is commonly called a one tailed test -also note that the proportion specified by the alpha level is not divided between two tails, but rather is contained entirely in one tail
A Closer Look at the Z score Statistic cont'd
in the context of a hypothesis test, the z score formula has the following structure: z = (sample mean - hypothesized population mean) / standard error between M and μ -thus, the z score formula forms a ratio
Assumptions for Hypothesis Tests with Z scores
it is assumed that the participants used in the study were selected randomly -the values in the sample must consist of independent observations -two events(or observations) are independent if the occurrence of the first event has no effect on the probability of the second event -the standard deviation for the unknown population is assumed to be the same as it was for the population before treatment -to evaluate hypotheses with z scores, we have used the unit normal table to identify the critical region -the table can be used only if the distribution of sample means is normal
Type II errors
occurs when a researcher fails to reject a null hypothesis that is really false -in a typical research situation, a Type II error means that the hypothesis test has failed to detect a real treatment effect -a Type II error occurs when the sample mean is not in the critical region even though the treatment has an effect on the sample; this often happens when the effect of the treatment is relatively small -the consequences of a Type II error are usually not as serious as those of a Type I error -in general terms, a Type II error means that the research data do not show the results that the researcher had hoped to obtain -the researcher can accept this outcome and conclude that the treatment either has no effect or has only a small effect that is not worth pursuing, or the researcher can repeat the experiment and try to demonstrate that the treatment really does work
Type I errors
occurs when a researcher rejects a null hypothesis that is actually true -in a typical research situation, a Type I error means the researcher concludes that a treatment does have an effect when in fact it has no effect -a Type I error occurs when a researcher unknowingly obtains an extreme, nonrepresentative sample -fortunately, the hypothesis test is structured to minimize the risk that this will occur -the alpha level for a hypothesis test is the probability that the test will lead to a Type I error -that is, the alpha level determines the probability of obtaining sample data in the critical region even though the null hypothesis is true
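The claim that alpha is the probability of landing in the critical region when the null hypothesis is true can be checked with a small simulation. This is only a sketch: the function name `type_i_error_rate`, the population values, the number of trials, and the seed are all hypothetical choices, and the test is assumed to be a two tailed z test.

```python
from math import sqrt
from statistics import NormalDist


def type_i_error_rate(mu=50, sigma=10, n=25, alpha=0.05, trials=20_000, seed=1):
    """Fraction of sample means in the critical region when H0 is true."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sigma / sqrt(n)
    # With H0 true (no treatment effect), sample means are normally
    # distributed around mu with a spread equal to the standard error.
    sample_means = NormalDist(mu, standard_error).samples(trials, seed=seed)
    rejections = sum(
        abs((m - mu) / standard_error) > z_crit for m in sample_means
    )
    return rejections / trials


rate = type_i_error_rate()
print(f"observed Type I error rate: {rate:.3f}")
```

With alpha = .05 the observed rate should come out close to .05, although the exact figure depends on the seed and the number of trials.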
Hypothesis Test: Step 1
state the hypothesis about the unknown population -the null hypothesis states that there is no change in the general population before and after an intervention. In the context of the experiment, H0 predicts that the independent variable had no effect on the dependent variable. -the alternative hypothesis states that there is a change in the general population following an intervention. in the context of an experiment, it predicts that the independent variable did have an effect on the dependent variable.
Hypothesis Test: Step 2
the alpha level establishes a criterion, or "cut off", for making a decision about the null hypothesis. The alpha level also determines the risk of a Type I error. Commonly used alpha levels are alpha = .05, alpha = .01, and alpha = .001 -the critical region consists of outcomes that are very unlikely to occur if the null hypothesis is true. That is, the critical region is defined by sample means that are almost impossible to obtain if the treatment has no effect.
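To make the link between the alpha level and the critical region concrete, the sketch below looks up the z score boundaries for the three alpha levels mentioned above. It assumes a two tailed test, so alpha is split evenly between the two tails; the function name `critical_z` is hypothetical.

```python
from statistics import NormalDist


def critical_z(alpha):
    """Boundary of the two tailed critical region for a given alpha level."""
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: reject H0 if |z| > {critical_z(alpha):.2f}")
# alpha = 0.05: reject H0 if |z| > 1.96
# alpha = 0.01: reject H0 if |z| > 2.58
# alpha = 0.001: reject H0 if |z| > 3.29
```

These are the same boundaries that the unit normal table gives for two tailed proportions of .05, .01, and .001.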
Factors that Influence a Hypothesis Test
the final decision in a hypothesis test is determined by the value obtained for the z score statistic -two factors help determine whether the z score will be large enough to reject H0: the variability of the scores and the number of scores in the sample -in a hypothesis test, higher variability can reduce the chances of finding a significant treatment effect -increasing the number of scores in the sample produces a smaller standard error and a larger value for the z score
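A quick way to see both factors at work is to hold the mean difference fixed and vary only the standard deviation or the sample size. The sketch below does this for a hypothetical 4-point difference; the function name and all values are illustrative assumptions.

```python
from math import sqrt


def z_statistic(sample_mean, mu_h0, sigma, n):
    """z score for a sample mean, using standard error = sigma / sqrt(n)."""
    return (sample_mean - mu_h0) / (sigma / sqrt(n))


# Same 4-point mean difference (M = 54, mu = 50) in every case.
cases = [
    ("baseline", 10, 25),
    ("higher variability", 20, 25),
    ("larger sample", 10, 100),
]
results = {}
for label, sigma, n in cases:
    results[label] = z_statistic(54, 50, sigma, n)
    print(f"{label:>18}: sigma = {sigma}, n = {n}, z = {results[label]:.2f}")
# baseline: z = 2.00, higher variability: z = 1.00, larger sample: z = 4.00
```

Doubling the standard deviation halves the z score, while quadrupling the sample size doubles it, even though the mean difference never changes.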
Comparison of one tailed vs. two tailed tests
the major distinction between one tailed and two tailed tests is in the criteria they use for rejecting H0. -a one tailed test allows you to reject the null hypothesis when the difference between the sample and the population is relatively small, provided the difference is in the specified direction -a two tailed test requires a relatively large difference independent of direction
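The difference in rejection criteria can be seen by comparing the critical boundaries directly. This sketch assumes alpha = .05 and a predicted increase (critical region in the upper tail); the z score of 1.80 is a hypothetical result.

```python
from statistics import NormalDist

ALPHA = 0.05
std_normal = NormalDist()

# One tailed: all of alpha sits in the predicted (upper) tail.
one_tailed_crit = std_normal.inv_cdf(1 - ALPHA)      # about 1.645
# Two tailed: alpha is divided between the two tails.
two_tailed_crit = std_normal.inv_cdf(1 - ALPHA / 2)  # about 1.960

z = 1.80  # hypothetical test statistic, in the predicted direction
rejects_one_tailed = z > one_tailed_crit
rejects_two_tailed = abs(z) > two_tailed_crit

print(f"one tailed: reject H0? {rejects_one_tailed}")  # True
print(f"two tailed: reject H0? {rejects_two_tailed}")  # False
```

The same data are significant with the one tailed test but not with the two tailed test, because the one tailed boundary is closer to zero.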
Statistical power
the power of a statistical test is the probability that the test will correctly reject a false null hypothesis; power is the probability that the test will identify a treatment effect if one really exists -researchers typically calculate power as a means of determining whether a research study is likely to be successful, i.e., before they actually conduct the research -to calculate power, however, it is first necessary to make assumptions about a variety of factors that influence the outcome of a hypothesis test -factors such as the sample size, the size of the treatment effect, and the value chosen for the alpha level can all influence a hypothesis test
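One common use of a power calculation before a study is to ask how many participants would be needed. The sketch below searches for the smallest sample size that reaches a target power, assuming a two tailed one-sample z test with known sigma. The function name `smallest_n_for_power`, the target of .80, and the assumed 5-point effect are illustrative assumptions, not values from these notes.

```python
from math import sqrt
from statistics import NormalDist


def smallest_n_for_power(effect, sigma, target_power=0.80, alpha=0.05,
                         max_n=100_000):
    """Smallest n at which a two tailed z test reaches the target power."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    for n in range(1, max_n + 1):
        shift = effect / (sigma / sqrt(n))  # assumed effect in standard errors
        power = (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)
        if power >= target_power:
            return n
    raise ValueError("target power not reached within max_n")


# Assumed effect of 5 points with sigma = 10 (hypothetical planning values).
n_needed = smallest_n_for_power(effect=5, sigma=10)
print(f"participants needed: {n_needed}")
```

The answer is only as good as the assumptions fed in: a smaller assumed effect or a stricter alpha level would push the required sample size up.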
Selecting an Alpha Level
the primary concern when selecting an alpha level is to minimize the risk of a Type I error -thus, alpha levels tend to be very small probability values -by convention, the largest permissible value is alpha = .05 -however, as the alpha level is lowered, the hypothesis test demands more evidence from the research results
A Closer Look at the Z score statistic
the z score statistic that is used in the hypothesis test is the first specific example of what is called a test statistic -the term test statistic simply indicates that the sample data are converted into a single, specific statistic that is used to test the hypotheses -in a hypothesis test with z scores, we have a formula for z scores but we don't know the value for the population mean -therefore, we try the following steps: -make a hypothesis about the value of the population mean; this is the null hypothesis -plug the hypothesized value into the formula along with the other values -if the formula produces a z score near zero, we conclude that the data are consistent with the hypothesis (we fail to reject it) -on the other hand, if the formula produces an extreme value, we conclude that the hypothesis was wrong
Type II errors, cont'd
unlike a Type I error, it is impossible to determine a single, exact probability for a Type II error -instead, the probability of a Type II error depends on a variety of factors and therefore is a function, rather than a specific number -nonetheless, the probability of a Type II error is represented by the symbol β, the Greek letter beta
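To see why β is a function rather than a single number, the sketch below computes it as 1 minus power for several possible sizes of the true treatment effect. It assumes a two tailed one-sample z test with known sigma; the function name `type_ii_error_probability` and the values used are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error_probability(effect, sigma, n, alpha=0.05):
    """beta = 1 - power for a two tailed one-sample z test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect / (sigma / sqrt(n))  # true effect in standard-error units
    power = (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)
    return 1 - power


# beta for several hypothetical effect sizes, with sigma = 10 and n = 25.
betas = {effect: type_ii_error_probability(effect, sigma=10, n=25)
         for effect in (1, 2, 4, 8)}
for effect, beta in betas.items():
    print(f"true effect = {effect} points: beta = {beta:.3f}")
```

Under these assumptions β is large when the true effect is small and shrinks as the effect grows, which matches the point that Type II errors are most likely with relatively small treatment effects.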