Stats Ch. 8 Intro to Hypothesis Testing


When do researchers calculate the power of a hypothesis test?

Before they actually conduct the research study.

True or false. In a research report, the term significant is used when the null hypothesis is rejected.

True.

To differentiate between real, systematic patterns and random, chance occurrences, researchers rely on a statistical technique known as ________________ ____________

hypothesis testing

Alpha is .01. What are the corresponding z scores?

plus and minus 2.58

Alpha is .001. What are the corresponding z scores?

plus and minus 3.30 (more precisely, 3.29)
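
The critical z-scores quoted on these cards can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `two_tailed_critical_z` is made up for illustration.

```python
from statistics import NormalDist


def two_tailed_critical_z(alpha: float) -> float:
    """Return the positive boundary z when alpha is split evenly between both tails."""
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: z = ±{two_tailed_critical_z(alpha):.2f}")
```

For α = .001 the computed boundary is about 3.29, which is commonly rounded to 3.30.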

Definition of a "test statistic"

A statistic that summarizes the sample data in a hypothesis test. The test statistic is used to determine whether or not the data are in the critical region.

The extremely unlikely values, as defined by the alpha level, make up what is called the _______ ______.

critical region.

The first, and most important, of the two hypotheses is called the ____ hypothesis.

null

The power of a test is defined as

the probability that the test will reject the null hypothesis if the treatment really has an effect.

The alpha level for a hypothesis test is the probability that the test will lead to a Type __ error. That is, the alpha level determines the probability of obtaining sample data in the critical region even though the null hypothesis is true.

Type I
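
One way to see that alpha is the Type I error rate is to simulate many studies in which the null hypothesis is true and count how often a two-tailed z-test rejects it anyway. This is only a sketch: the population values (μ = 100, σ = 15), the sample size, and the function name `type_i_error_rate` are assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, mean


def type_i_error_rate(alpha=0.05, n=25, mu=100.0, sigma=15.0, trials=10_000, seed=0):
    """Estimate how often a two-tailed z-test rejects H0 when H0 is actually true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        # Every sample comes from the null population, so any rejection is a Type I error.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (mean(sample) - mu) / standard_error
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

The observed rate should land close to the chosen alpha of .05, with some simulation noise.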

Z score formula in words

The z-score equals the sample mean minus the hypothesized population mean, divided by the standard error between M and μ.
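
Written in symbols, with σ_M denoting the standard error and assuming the usual definition of the standard error for a sample of n scores from a population with standard deviation σ:

```latex
z = \frac{M - \mu}{\sigma_M}, \qquad \sigma_M = \frac{\sigma}{\sqrt{n}}
```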

Commonly used alpha levels are:

α = .05 (5%), α = .01 (1%), and α = .001 (0.1%)

The p-value must be smaller than ____ to be considered statistically significant.

.05

Hypothesis test definition

A hypothesis test is a statistical method that uses sample data to evaluate a hypothesis about a population.

As the power of a test increases, what happens to the probability of a Type II error?

As power increases, the probability of a Type II error decreases.

How does increasing sample size influence the value of Cohen's d?

Cohen's d is not influenced at all by the sample size.
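
A small sketch makes the contrast concrete: Cohen's d divides the mean difference by the standard deviation, while the z-score divides it by the standard error, which shrinks as n grows. The numbers below (M = 105, μ = 100, σ = 10) and the function names are hypothetical.

```python
def cohens_d(sample_mean, mu, sigma):
    """Cohen's d: the mean difference measured in standard deviation units."""
    return (sample_mean - mu) / sigma


def z_score(sample_mean, mu, sigma, n):
    """z-score for a sample mean: the mean difference in standard error units."""
    return (sample_mean - mu) / (sigma / n ** 0.5)


for n in (4, 25, 100):
    d = cohens_d(105, 100, 10)
    z = z_score(105, 100, 10, n)
    print(f"n = {n:3d}: d = {d:.2f}, z = {z:.2f}")
```

The value of d stays at 0.50 for every n, while z grows as the sample gets larger.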

True or false. In a research report, the results of a hypothesis test include the phrase "z = 3.15, p < .01." This means that the test failed to reject the null hypothesis.

False. The probability is less than .01, which means it is very unlikely that the result occurred without any treatment effect. In this case, the data are in the critical region, and H0 is rejected.

How does increasing sample size influence the power of a hypothesis test?

Increasing sample size increases power.

How does increasing sample size influence the outcome of a hypothesis test?

Increasing sample size increases the likelihood of rejecting the null hypothesis.

A value of d = 1.00 is considered a ____ treatment effect.

large

Consequences of choosing a low alpha level

Specifically, a lower alpha level means less risk of a Type I error, but it also means that the hypothesis test demands more evidence from the research results.

For a particular hypothesis test, the power is .50 (50%) for a 5-point treatment effect. Will the power be greater or less than .50 for a 10-point treatment effect?

The hypothesis test is more likely to detect a 10-point effect, so power will be greater.
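
The comparison can be sketched numerically for a two-tailed z-test. The card does not give σ or n, so the values below (σ = 10, n = 16) are assumptions picked so that a 5-point effect has power near .50; `z_test_power` is a hypothetical helper.

```python
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-tailed z-test when the true mean is shifted by `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect / (sigma / n ** 0.5)  # treatment effect in standard-error units
    # Probability that z falls beyond either boundary under the shifted distribution.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


power_5 = z_test_power(effect=5, sigma=10, n=16)
power_10 = z_test_power(effect=10, sigma=10, n=16)
print(f"5-point effect:  power = {power_5:.2f}")
print(f"10-point effect: power = {power_10:.2f}")
```

Under these assumed values the 10-point effect is detected far more often than the 5-point effect.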

____ is the probability of a type 2 error.

beta

One of the simplest and most direct methods for measuring effect size is _____ ____

Cohen's d. Estimated Cohen's d measures the distance between two means and is typically reported as a positive number even when the formula produces a negative value.

True or false. If the alpha level is increased from α = .01 to α = .05, then the boundaries for the critical region move farther away from the center of the distribution.

False. A larger alpha means that the boundaries for the critical region move closer to the center of the distribution.

True or false. If the sample data are sufficient to reject the null hypothesis for a one-tailed test, then the same data would also reject H0 for a two-tailed test.

False. Because a two-tailed test requires a larger mean difference, it is possible for a sample to be significant for a one-tailed test but not for a two-tailed test.

True or false. If a sample mean is in the critical region with α = .05, it would still (always) be in the critical region if alpha were changed to α = .01.

False. With α = .01, the boundaries for the critical region move farther out into the tails of the distribution. It is possible that a sample mean could be beyond the .05 boundary but not beyond the .01 boundary.

Suppose that you do a hypothesis test and reject the null hypothesis with α = .05. Can you conclude that there is a 5% probability that you are making a Type I error? Can you also conclude that there is a 95% probability that your decision is correct and the treatment does have an effect?

For both questions, the answer is no.

Symbol for null hypothesis

H0

In the context of an experiment, ___ predicts that the independent variable (treatment) has no effect on the dependent variable (scores) for the population.

H0, or the null hypothesis

How does variability influence a hypothesis test?

In a hypothesis test, higher variability can reduce the chances of finding a significant treatment effect. In general, increasing the variability of the scores produces a larger standard error and a smaller value (closer to zero) for the z-score. If other factors are held constant, then the larger the variability, the lower the likelihood of finding a significant treatment effect.

How does the number of scores in the sample influence a hypothesis test?

In general, increasing the number of scores in the sample produces a smaller standard error and a larger value for the z-score. If all other factors are held constant, the larger the sample size, the greater the likelihood of finding a significant treatment effect.
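
Both influences act through the standard error in the denominator of the z-score. The sketch below holds a hypothetical 3-point mean difference constant and varies σ and n; all values are illustrative assumptions.

```python
def z_score(sample_mean, mu, sigma, n):
    """z-score for a sample mean; the standard error is sigma / sqrt(n)."""
    standard_error = sigma / n ** 0.5
    return (sample_mean - mu) / standard_error


# Same 3-point difference, different variability (n fixed at 25).
for sigma in (2, 10):
    print(f"sigma = {sigma:2d}: z = {z_score(103, 100, sigma, 25):.2f}")

# Same 3-point difference, different sample sizes (sigma fixed at 10).
for n in (4, 25, 100):
    print(f"n = {n:3d}: z = {z_score(103, 100, 10, n):.2f}")
```

Larger variability pulls z toward zero, and a larger sample pushes z away from zero.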

The consequences of a Type 1 error

In most research situations, the consequences of a Type I error can be very serious. Because the researcher has rejected the null hypothesis and believes that the treatment has a real effect, it is likely that the researcher will report or even publish the research results. A Type I error, however, means that this is a false report. Thus, Type I errors lead to false reports in the scientific literature. Other researchers may try to build theories or develop other experiments based on the false results. A lot of precious time and resources may be wasted.

The treatment with alcohol had a significant effect on the birth weight of newborn rats, z = 3.00, p < .05. What is meant by the word significant?

In statistical tests, a significant result means that the null hypothesis has been rejected, which means that the result is very unlikely to have occurred merely by chance. For this example, the null hypothesis stated that the alcohol has no effect; however, the data clearly indicate that the alcohol did have an effect. Specifically, it is very unlikely that the data would have been obtained if the alcohol did not have an effect.

True or false. If a researcher predicts that a treatment will increase scores, then the critical region for a one-tailed test would be located in the right-hand tail of the distribution.

True. A large sample mean, in the right-hand tail, would indicate that the treatment worked as predicted.

True or false: If other factors are held constant, increasing the size of the sample increases the likelihood of rejecting the null hypothesis.

True. A larger sample produces a smaller standard error, which leads to a larger z-score.

True or false. A small value (near zero) for the z-score statistic is evidence that the sample data are consistent with the null hypothesis.

True. A z-score near zero indicates that the data support the null hypothesis.

True or false. A z-score value in the critical region means that you should reject the null hypothesis.

True. A z-score value in the critical region means that the sample is not consistent with the null hypothesis.

True or false. If a sample mean is in the critical region with α = .01, it would still (always) be in the critical region if alpha were changed to α = .05.

True. With α = .01, the boundaries for the critical region are farther out into the tails of the distribution than for α = .05. If a sample mean is beyond the .01 boundary it is definitely beyond the .05 boundary.

True or false. A researcher obtains z = 2.43 for a hypothesis test. Using α = .01, the researcher should reject the null hypothesis for a one-tailed test but fail to reject for a two-tailed test.

True. The one-tailed critical value is z = 2.33 and the two-tailed value is z = 2.58.
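
The two decisions can be checked with a few lines of Python. This sketch assumes the one-tailed prediction is for an increase, so the critical region sits in the upper tail; variable names are illustrative.

```python
from statistics import NormalDist

z_obtained = 2.43
alpha = 0.01
std_normal = NormalDist()

# One-tailed: all of alpha in the upper tail. Two-tailed: alpha split between tails.
one_tailed_crit = std_normal.inv_cdf(1 - alpha)
two_tailed_crit = std_normal.inv_cdf(1 - alpha / 2)

reject_one_tailed = z_obtained > one_tailed_crit
reject_two_tailed = abs(z_obtained) > two_tailed_crit

print(f"one-tailed critical z = {one_tailed_crit:.2f}, reject H0: {reject_one_tailed}")
print(f"two-tailed critical z = {two_tailed_crit:.2f}, reject H0: {reject_two_tailed}")
```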

In a hypothesis test, there are two different kinds of errors that can be made.

Type 1 errors and Type 2 errors

Describe a type 2 error

A Type II error is the failure to reject a false null hypothesis. In more straightforward English, a Type II error means that a treatment effect really exists, but the hypothesis test fails to detect it.

The null hypothesis and the alternative hypothesis are ____________ __________and exhaustive. They cannot both be true, and one of them must be true. The data determine which one should be rejected.

mutually exclusive

Alpha is .05. What are the corresponding z scores?

plus and minus 1.96

Instead of measuring effect size directly, an alternative approach to determining the size or strength of a treatment effect is to measure the ______ of the statistical test.

power

The assumptions for hypothesis tests with z-scores are summarized as follows:

random sampling, independent observations, the value of the standard deviation is unchanged by the treatment, and normal sampling distribution

Hypothesis testing is a statistical procedure that allows researchers to use ______ data to draw inferences about the population of interest.

sample

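Factors that affect power

Size of the treatment effect (a larger effect increases power), sample size (a larger sample increases power), alpha level (reducing alpha reduces power), and direction of the test (a one-tailed test increases power when the effect is in the predicted direction).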
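
A rough sketch of how sample size, alpha, and the choice of a one-tailed test move power for a z-test. The baseline values (5-point effect, σ = 10, n = 16) and the function `power` are assumptions for illustration, and the one-tailed branch assumes the effect is predicted to be an increase.

```python
from statistics import NormalDist


def power(effect, sigma, n, alpha=0.05, tails=2):
    """Power of a z-test for a true mean shift of `effect` (assumed positive)."""
    std_normal = NormalDist()
    shift = effect / (sigma / n ** 0.5)  # effect in standard-error units
    if tails == 1:
        # Critical region entirely in the upper tail.
        return 1 - std_normal.cdf(std_normal.inv_cdf(1 - alpha) - shift)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


baseline = power(5, 10, 16)
larger_n = power(5, 10, 36)
lower_alpha = power(5, 10, 16, alpha=0.01)
one_tailed = power(5, 10, 16, tails=1)

print(f"baseline (n=16, alpha=.05, two-tailed): {baseline:.2f}")
print(f"larger sample (n=36):                   {larger_n:.2f}")
print(f"lower alpha (.01):                      {lower_alpha:.2f}")
print(f"one-tailed test:                        {one_tailed:.2f}")
```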

Whenever the data from a research study produce a sample mean that is located in the critical region, we conclude that:

the data are not consistent with the null hypothesis, and we reject the null hypothesis. We can also define the critical region as sample values that provide convincing evidence that the treatment really does have an effect.

The primary concern when selecting an alpha level is

to minimize the risk of a Type I error.

The power of a hypothesis test is equal to 1 − ___

β

If other factors are held constant, are you more likely to reject the null hypothesis with a standard deviation of σ = 2 or with σ = 10?

σ = 2. A smaller standard deviation produces a smaller standard error, which leads to a larger z-score.

Definition of a result being "significant"

A result is said to be significant, or statistically significant, if it is very unlikely to occur when the null hypothesis is true. That is, the result is sufficient to reject the null hypothesis. Thus, a treatment has a significant effect if the decision from the hypothesis test is to reject H0.

In very simple terms, the logic underlying the hypothesis-testing procedure is as follows (4):

1. First, we state a hypothesis about a population. Usually the hypothesis concerns the value of a population parameter. For example, we might hypothesize that American adults gain an average of μ = 7 pounds between Thanksgiving and New Year's Day each year.
2. Before we select a sample, we use the hypothesis to predict the characteristics that the sample should have. For example, if we predict that the average weight gain for the population is μ = 7 pounds, then we would predict that our sample should have a mean around 7 pounds. Remember: The sample should be similar to the population, but you always expect a certain amount of error.
3. Next, we obtain a random sample from the population. For example, we might select a sample of n = 200 American adults and measure the average weight change for the sample between Thanksgiving and New Year's Day.
4. Finally, we compare the obtained sample data with the prediction that was made from the hypothesis. If the sample mean is consistent with the prediction, then we conclude that the hypothesis is reasonable. But if there is a big discrepancy between the data and the prediction, then we decide that the hypothesis is wrong.

Notice that a directional (one-tailed) test requires two changes in the step-by-step hypothesis-testing procedure.

1. In the first step of the hypothesis test, the directional prediction is incorporated into the statement of the hypotheses.
2. In the second step of the process, the critical region is located entirely in one tail of the distribution.

The following four steps outline the hypothesis-testing procedure that allows us to use sample data to answer questions about an unknown population.

1. State the hypotheses: two opposing hypotheses, the null hypothesis and the scientific (alternative) hypothesis.
2. Set the criteria for a decision: the alpha value (level of significance).
3. Collect data and compute sample statistics, particularly the z-score of the sample mean.
4. Make a decision. There are two possible outcomes.
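
The four steps can be sketched as a short function. This is a minimal illustration for a two-tailed z-test with a known population standard deviation; the function name `z_test`, the sample mean of 7.6, and σ = 3 are assumptions for illustration, while μ = 7 and n = 200 are borrowed from the weight-gain example on another card.

```python
from statistics import NormalDist


def z_test(sample_mean, mu_0, sigma, n, alpha=0.05):
    """Run a two-tailed z-test and return the z-score and the decision."""
    # Step 1: H0 states the population mean equals mu_0; H1 states it does not.
    # Step 2: set the criterion by locating the critical region for this alpha.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 3: compute the test statistic from the sample data.
    z = (sample_mean - mu_0) / (sigma / n ** 0.5)
    # Step 4: make a decision.
    decision = "reject H0" if abs(z) > z_crit else "fail to reject H0"
    return z, decision


z, decision = z_test(sample_mean=7.6, mu_0=7, sigma=3, n=200)
print(f"z = {z:.2f}: {decision}")
```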

Two possible outcomes from a hypothesis test.

1. The sample data are located in the critical region. Reject the null hypothesis.
2. The sample data are not in the critical region. Fail to reject the null hypothesis.

Jury trial analogy with hypothesis testing

1. The test begins with a null hypothesis stating that there is no treatment effect. The trial begins with a null hypothesis that the defendant did not commit a crime (innocent until proven guilty).
2. The research study gathers evidence to show that the treatment actually does have an effect, and the police gather evidence to show that the defendant really did commit a crime. Note that both are trying to refute the null hypothesis.
3. If there is enough evidence, the researcher rejects the null hypothesis and concludes that there really is a treatment effect. If there is enough evidence, the jury rejects the hypothesis and concludes that the defendant is guilty of a crime.
4. If there is not enough evidence, the researcher fails to reject the null hypothesis. Note that the researcher does not conclude that there is no treatment effect, simply that there is not enough evidence to conclude that there is an effect. Similarly, if there is not enough evidence, the jury fails to find the defendant guilty. Note that the jury does not conclude that the defendant is innocent, simply that there is not enough evidence for a guilty verdict.

List factors that influence a hypothesis test

1. The difference between the sample mean and the hypothesized population mean from the null hypothesis.
2. The variability of the scores (standard deviation or variance), which influences the standard error in the denominator.
3. The number of scores in the sample, which influences the size of the standard error in the denominator.

Describe a type 1 error

A Type I error occurs when a researcher rejects a null hypothesis that is actually true. In a typical research situation, a Type I error means that the researcher concludes that a treatment does have an effect when, in fact, it has no effect.

Describe a directional hypothesis test, or a one-tailed test

A directional test is a hypothesis test that includes a directional prediction in the statement of the hypotheses and places the critical region entirely in one tail of the distribution. The statistical hypotheses (H0 and H1) specify either an increase or a decrease in the population mean. That is, they make a statement about the direction of the effect.

Describe effect size

A measure of the size of the treatment effect that is separate from the statistical significance of the effect

The goal of the hypothesis test:

is to determine whether the treatment has any effect on the individuals in the population

When Type 2 errors happen

Often this happens when the effect of the treatment is relatively small. In this case, the treatment does influence the sample, but the magnitude of the effect is not big enough to move the sample mean into the critical region.

Differences between a one tailed and two tailed test

One group of researchers contends that a two-tailed test is more rigorous and, therefore, more convincing than a one-tailed test. Remember that the two-tailed test demands more evidence to reject H0 and thus provides a stronger demonstration that a treatment effect has occurred. Other researchers feel that one-tailed tests are preferable because they are more sensitive. That is, a relatively small treatment effect may be significant with a one-tailed test but fail to reach significance with a two-tailed test. Also, there is the argument that one-tailed tests are more precise because they test hypotheses about a specific directional effect instead of an indefinite hypothesis about a general effect. In general, two-tailed tests should be used in research situations when there is no strong directional expectation or when there are two competing predictions.

What does the null hypothesis state?

That the treatment has no effect, no change, no difference, no relationship.

What is the alpha value?

The alpha (α) value is a small probability that is used to identify the low probability samples. The alpha level, or the level of significance, is a probability value that is used to define the concept of "very unlikely" in a hypothesis test.

Describe the alternative hypothesis

The alternative hypothesis (H1) states that there is a change, a difference, or a relationship for the general population. In the context of an experiment, H1 predicts that the independent variable (treatment) does have an effect on the dependent variable.

The consequences of a Type 2 error

The consequences of a Type II error are usually not as serious as those of a Type I error. In general terms, a Type II error means that the research data do not show the results that the researcher had hoped to obtain. The researcher can accept this outcome and conclude that the treatment either has no effect or has only a small effect that is not worth pursuing, or the researcher can repeat the experiment (usually with some improvement, such as a larger sample) and try to demonstrate that the treatment really does work.

Does adding a constant change the mean, shape of the population, or standard deviation?

The mean will change but shape and standard deviation do not change.
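
A quick check with a small made-up data set, using the standard-library `statistics` module:

```python
from statistics import mean, pstdev

scores = [2, 4, 6, 8, 10]            # hypothetical scores
shifted = [x + 5 for x in scores]    # add a constant of 5 to every score

print(f"mean: {mean(scores)} -> {mean(shifted)}")
print(f"standard deviation: {pstdev(scores):.3f} -> {pstdev(shifted):.3f}")
```

The mean moves up by exactly the constant, while the standard deviation is the same before and after.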

What is the p-value?

The p value is the probability that a result at least this extreme would occur if H0 were true (without any treatment effect). Rejecting H0 only when p is smaller than alpha keeps the probability of a Type I error at or below alpha. It is essential that this probability be very small.

The treatment with alcohol had a significant effect on the birth weight of newborn rats, z = 3.00, p < .05. What is the meaning of z = 3.00?

The z indicates that a z-score was used as the test statistic to evaluate the sample data and that its value is 3.00. That is, the sample mean was 3 standard errors above the hypothesized population mean.

The treatment with alcohol had a significant effect on the birth weight of newborn rats, z = 3.00, p < .05. What is meant by p < .05?

This part of the statement is a conventional way of specifying the alpha level that was used for the hypothesis test. It also acknowledges the possibility (and the probability) of a Type I error. Specifically, the researcher is reporting that the treatment had an effect but admits that this could be a false report. That is, it is possible that the sample mean was in the critical region even though the alcohol had no effect. However, the probability (p) of obtaining a sample mean in the critical region is extremely small (less than .05) if there is no treatment effect.
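
For reference, the exact tail probability behind a report like "z = 3.00, p < .05" can be computed directly. The sketch assumes a two-tailed test, which the card does not state.

```python
from statistics import NormalDist

z = 3.00
# Two-tailed p-value: probability of a z-score at least this far from zero if H0 is true.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, two-tailed p = {p_two_tailed:.4f}")
print("significant at alpha = .05:", p_two_tailed < 0.05)
```

The computed probability is roughly .003, comfortably below the reported .05 threshold.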

Z score formula as a recipe: analogy

This situation is similar to trying to follow a cake recipe in which one of the ingredients is not clearly listed. For example, the recipe may call for flour but there is a grease stain that makes it impossible to read how much flour. Faced with this situation, you might try the following steps:
1. Make a hypothesis about the amount of flour. For example, hypothesize that the correct amount is 2 cups.
2. To test your hypothesis, add the rest of the ingredients along with the hypothesized amount of flour and bake the cake.
3. If the cake turns out to be good, you can reasonably conclude that your hypothesis was correct. But if the cake is terrible, you conclude that your hypothesis was wrong.
In a hypothesis test with z-scores, we do essentially the same thing. We have a formula (recipe) for z-scores but one ingredient is missing. Specifically, we do not know the value for the population mean, μ.

The ____ level determines the probability of a Type I error.

alpha

