stats ch. 5

setting up the null and alternative hypotheses

We need to define how unusual a sample mean must be to cause you to reject H0 and accept HA (by "unusual" we mean unlikely to occur in the null hypothesis distribution).

significance criterion

Note that the area in each rejection region is equal to α/2, or .05/2 or .025, and that the total area of rejection (or critical region) is equal to the significance criterion (α) or .05. A less common, and more stringent, criterion is the 1% or .01 criterion of significance. Using this criterion, two-tailed hypotheses about the population mean are rejected if, when H0 is true, an obtained sample mean is so unlikely to occur by chance that no more than 1% of sample means would be so extreme. Here again, α/2 or .005 (or 1/2 of 1%) is in each tail.
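
As a rough illustration, the critical z values that bound these rejection regions can be computed from the standard normal distribution. This sketch uses Python's standard-library `statistics.NormalDist`; the function name `two_tailed_critical_z` is made up for this example.

```python
from statistics import NormalDist

def two_tailed_critical_z(alpha):
    """Return the positive critical z that leaves alpha/2 in each tail."""
    # inv_cdf gives the z below which a given proportion of the curve lies
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.05, 0.01):
    z_crit = two_tailed_critical_z(alpha)
    print(f"alpha = {alpha}: area per tail = {alpha / 2}, critical z = +/-{z_crit:.2f}")
```

The stricter .01 criterion pushes the critical values farther out (to about ±2.58, versus about ±1.96 for .05).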

H0 and z score

Once the sample mean has been converted to a z score, the .05 criterion of significance states that H0 should be rejected if z is less than or equal to −1.96 or z is greater than or equal to +1.96.
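
The decision rule above can be written as a one-line check. This is only a sketch; the function name `reject_h0` and the sample z values are invented for illustration.

```python
def reject_h0(z, critical_z=1.96):
    """Two-tailed rule: reject H0 when z is at or beyond either critical value."""
    return z <= -critical_z or z >= critical_z

# Illustrative z scores, not taken from any real study
for z in (-2.30, -1.96, 0.85, 1.88, 2.10):
    decision = "reject H0" if reject_h0(z) else "retain H0"
    print(f"z = {z:+.2f}: {decision}")
```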

POWER

The (conditional) probability of reaching this correct decision is called the power of the statistical test and is equal to 1−β.

null hypothesis testing (NHT)

The decision process in which we use a p value to decide whether our study found a true effect.

criterion of significance (alpha level)

The proportion of null experiments that will be considered "significant" is called the criterion of significance and is symbolized by the first letter of the Greek alphabet, alpha (α).

two-tailed test example

We have just seen that the p value for sample means 34.25 points above the mean is .0304, and the p value for sample means 34.25 below the mean is also .0304. Therefore, the two-tailed p value for this experiment is .0304 + .0304 = .0608. Thus, you can see how the symmetry of the normal curve allows you to double your one-sided p value to go from a one-tailed to a two-tailed significance test. Note that .0608 is not less than .05, so H0 cannot be rejected at the .05 level for a two-tailed test.
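
The arithmetic in this example can be checked in a few lines. The helper name `two_tailed_p` is hypothetical; the p value is the one quoted above.

```python
ALPHA = 0.05

def two_tailed_p(one_tailed_p):
    """Double a one-tailed p value; valid because the normal curve is symmetric."""
    return 2 * one_tailed_p

p_one = 0.0304               # one-tailed p value from the example above
p_two = two_tailed_p(p_one)  # area in both tails combined

print(f"one-tailed: p = {p_one:.4f}, below .05? {p_one < ALPHA}")
print(f"two-tailed: p = {p_two:.4f}, below .05? {p_two < ALPHA}")
```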

One- Versus Two-Tailed Tests of Significance

The use of one-tailed tests of significance is sometimes justified in behavioral science research. If the results are in the direction predicted by the researcher, it is more likely that statistical significance will be obtained. But if the results are in the opposite direction, the entire experiment should be repeated using a two-tailed design before any conclusions are drawn. Therefore, the two-tailed test is more commonly reported, especially in the more selective psychological journals.

standard error of the mean

To test null hypotheses about the mean of one population, first compute the standard error of the mean. This is a measure of how accurate any one sample mean is likely to be as an estimate of the population mean. Then use the standard error of the mean to determine whether the difference between an observed sample mean and the hypothesized value of the population mean is or is not "sufficiently unlikely" to occur if H0 is true (i.e., if the hypothesized value of the population mean is correct).
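
A minimal sketch of the first step, assuming the population standard deviation σ is known, in which case the standard error of the mean is σ divided by the square root of the sample size. The values of σ and N below are hypothetical.

```python
import math

def standard_error_of_mean(sigma, n):
    """Standard error of the mean when the population SD (sigma) is known."""
    return sigma / math.sqrt(n)

# Hypothetical values, chosen only for illustration
sigma = 100.0
for n in (25, 100, 400):
    print(f"N = {n}: standard error = {standard_error_of_mean(sigma, n):.1f}")
```

Quadrupling the sample size halves the standard error, so larger samples give more accurate estimates of the population mean.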

Null Hypothesis Testing: General Considerations

1. State the null hypothesis (denoted by the symbol H0) and the alternative hypothesis (denoted by HA). It is at this point that you should decide whether to perform a one-tailed or two-tailed test. (It is important to understand that you cannot prove whether H0 or HA is true because you cannot measure the entire population.)
2. Begin with the assumption that H0 is true. Then obtain the data and test this assumption. If this assumption leads to unlikely results, you will abandon it and switch your bets to HA; otherwise, you will retain it.
3. Before you collect any data, it is necessary to define in numerical terms what is meant by "unlikely results." In statistical terminology, this is called selecting a criterion (or level) of significance, represented by the symbol α.
   (a) Using the .05 criterion of significance, "unlikely" is defined as having a probability of .05 or less. In symbols, α = .05.
   (b) Using the .01 criterion of significance, "unlikely" is defined as having a probability of .01 or less. In symbols, α = .01.
   Other criteria of significance can be used, but these are the most common.
4. Having selected a criterion of significance, and either a one- or two-tailed test, obtain your data and compute the appropriate statistical test. If the test shows that your results would be unlikely if H0 were true, reject H0 in favor of HA. Otherwise, retain H0.
5. Though you are siding with the hypothesis indicated by the statistical test (H0 or HA), you could be wrong. You could be making either of the following two kinds of error, depending on your decision:
   (a) Type I error: Rejecting H0 when H0 is in fact true. This error is less likely when the .01 criterion of significance is used. The Type I error rate is a conditional probability; it is the probability of rejecting H0 given that H0 is actually true, and it is equal to the chosen criterion of significance, α. p is also a conditional probability: it is the probability of obtaining results as extreme as, or more extreme than, the ones you got, if H0 is true. It is easy to confuse p with α; try not to.
   (b) Type II error: Retaining H0 when H0 is in fact false. This error is less likely when the .05 rather than the .01 criterion of significance is chosen. The probability of this kind of error, denoted by the symbol β, is not conveniently determined, but is discussed in detail in Chapter 11.
It is the relative importance and consequences of each kind of error that help to determine which criterion of significance to use.
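
The steps above can be sketched as a single function for the simplest case, a z test of one mean with σ known. Everything named here (`z_test_one_mean`, the `tail` parameter) is invented for illustration. The sample mean of 534.25 and the hypothesized mean of 500 come from this chapter's example, but σ = 100 and N = 30 are assumptions, not values stated in these notes.

```python
import math
from statistics import NormalDist

def z_test_one_mean(sample_mean, mu0, sigma, n, alpha=0.05, tail="two"):
    """One-sample z test with known sigma; returns (z, p, decision)."""
    # Step 2: assume H0 is true, so sample means are centered on mu0
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / standard_error
    std_normal = NormalDist()
    if tail == "two":
        p = 2 * (1 - std_normal.cdf(abs(z)))  # area beyond |z| in both tails
    elif tail == "upper":
        p = 1 - std_normal.cdf(z)             # area above z only
    elif tail == "lower":
        p = std_normal.cdf(z)                 # area below z only
    else:
        raise ValueError("tail must be 'two', 'upper', or 'lower'")
    # Step 4: reject H0 only if the result is "unlikely" under H0
    decision = "reject H0" if p <= alpha else "retain H0"
    return z, p, decision

# sigma and n are ASSUMED values for illustration
for tail in ("two", "upper"):
    z, p, decision = z_test_one_mean(534.25, 500.0, sigma=100.0, n=30, tail=tail)
    print(f"{tail}-tailed: z = {z:.2f}, p = {p:.4f}, {decision}")
```

Note that α and the choice of tail are arguments fixed before the data are examined, mirroring steps 1 and 3.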

Why do scientists use samples?

Although behavioral researchers are usually interested in what the average response of an entire population would be to some treatment or experimental condition, practical constraints generally lead them to select and study a relatively small, random sample from the population of interest. Their studies usually involve a group of individuals, all of whom share some common characteristic (e.g., the same psychopathology diagnosis), or all of whom have been subjected to the same experimental treatment. After the members of the sample have been measured, the steps of null hypothesis testing can be applied in order to draw some conclusion about the population.

one-tailed test

Although behavioral researchers often have a theory that predicts the direction of their results, and could therefore argue for performing a one-tailed test (also called a directional test), they usually test their results in terms of two-tailed tests in order to be conservative (which in this context means particularly cautious about Type I errors). A one-tailed test implies a promise that we would not have presented our results if our sample mean had been very much lower than the mean of the population, even if it were so low as to easily exceed the critical value for a two-tailed test in the other (i.e., lower) direction.

Point estimate

Any statistic computed from a single sample (such as X̄ or s) that provides an estimate of the corresponding population parameter (such as μ or σ) is called a point estimate.

simpler way to make decisions about the null hypothesis

If you are planning to conduct a two-tailed test with, say, α = .05, you can find the minimum z score needed for significance ahead of time and then just check the z score for your sample to see if it is larger than the borderline z score. The z scores that fall right at the borderline between the significant and nonsignificant portions of the null hypothesis distribution are called critical values (or, less formally, cutoff scores). For a two-tailed test there is a critical value on each side of the distribution, as illustrated in Figure 5.4.

alternative hypothesis (symbolized by HA)

It specifies another (usually complementary) value or set of values for the population parameter. In the present study, HA: μ ≠ 500. This alternative hypothesis states simply that the population from which the sample comes does not have a μ equal to 500. That is, the difference of 34.25 between the sample mean and the null-hypothesized population mean is due to the fact that the null hypothesis is false. (As stated above, the alternative hypothesis is nondirectional: no expectation is implied about whether the mean of the tested population will be larger or smaller than that of the general population. However, in the current example, we actually expect the tested population to have a higher mean. We deal with directional as compared to nondirectional hypotheses a little later.) Usually, the theory that the scientist hopes to support is identified with the alternative hypothesis.

Alternative vs. Null Hypothesis

Rejecting H0 will cause the scientist to conclude that his theory has been supported, and failing to reject H0 will cause him to conclude that his theory has not been supported. This is a sensible procedure, because scientific caution dictates that researchers should not jump to conclusions. They should claim success only when there is a strong indication to that effect—that is, when the results are sufficiently striking to cause rejection of H0.

Assumptions Required by the Statistical Test for the Mean of a Single Population

The most important of these is that the individuals in our sample were selected randomly (and independently) from the population. Without random selection, the validity of the experiment would depend critically on the compliance of those who were selected for the new experimental condition, and the results could easily become biased in one direction. The next most important assumption is that the variable being measured (e.g., math SAT) is normally distributed in the population. The Central Limit Theorem tells us that for fairly large samples, we will get reasonably accurate p values even if the original distribution is not very "normal." However, to use the z-score formula at all, we had to know σ, the standard deviation of the individual scores in the population. Unfortunately, we will know σ only for a few variables that have been studied extensively in the population. More often we will not know σ, and we will have to estimate it from the data we do have. Estimating σ is an additional step that slightly complicates our test of a single population mean.

Why, then, don't researchers always use the .01 criterion, instead of the .05 criterion?

The short answer is that as you make alpha smaller, the critical value that an experimental result has to exceed to be statistically significant becomes larger. Therefore, more and more experiments for which the null hypothesis is not true will nonetheless fail to yield significant results. This is the undesirable consequence of setting a smaller value for alpha that we alluded to earlier.
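
To see the cost of a smaller alpha, the sketch below compares the two criteria for one hypothetical true effect. It assumes the test statistic is normally distributed with standard deviation 1 around some expected z value when H0 is false; the value 2.5 and the function name `power_two_tailed` are made up for illustration.

```python
from statistics import NormalDist

def power_two_tailed(expected_z, alpha):
    """Probability of a significant result when the test statistic is
    normally distributed around expected_z (SD 1) instead of around 0."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    actual = NormalDist(mu=expected_z, sigma=1)
    # Area of the actual distribution that falls in either rejection region
    return (1 - actual.cdf(z_crit)) + actual.cdf(-z_crit)

expected_z = 2.5  # hypothetical size of a true effect, in standard-error units
for alpha in (0.05, 0.01):
    power = power_two_tailed(expected_z, alpha)
    print(f"alpha = {alpha}: power = {power:.3f}, Type II error rate = {1 - power:.3f}")
```

With the same true effect, the .01 criterion yields lower power, that is, a higher Type II error rate, than the .05 criterion.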

critical values

The z scores that fall right at the borderline between the significant and nonsignificant portions of the null hypothesis distribution are called critical values (or, less formally, cutoff scores). For a two-tailed test there is a critical value on each side of the distribution, as illustrated in Figure 5.4.

two-tailed, or nondirectional, tests of significance.

This means that the null hypothesis regarding the population mean, μ, is rejected if a z value is obtained that is either extremely high (far up in the upper tail of the curve) or extremely low (far down in the lower tail of the curve). Consequently, there is a rejection area equal to α/2 in each tail of the distribution. As we have seen, the corresponding null and alternative hypotheses are: H0: μ = a specific value; HA: μ ≠ this value (i.e., μ is greater than the value specified by H0 or is less than this value).

The z Score for Sample Means

To draw inferences about the population mean, first calculate the z score for your sample mean. Then look in Table A to find the percentage of the normal curve that extends beyond your calculated z. Converted to a proportion, this is the one-tailed p value corresponding to your sample mean. Double this value when you want to obtain a two-tailed p value.
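
The formula itself is not reproduced in these notes. The conventional z score for a sample mean is shown below, assuming that σ (the population standard deviation) is known, that N is the sample size, and that μ0 stands for the value of μ specified by H0; these symbols are assumptions rather than notation taken from the chapter.

```latex
z = \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}},
\qquad
\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}}
```

Here the denominator is the standard error of the mean.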

Type II error (beta)

A Type II error occurs when the null hypothesis is not true but you fail to reject H0 because your experimental results are not extreme enough. Whereas alpha is fixed by the researcher, the value for beta associated with any particular experiment depends on a number of factors.

inferential parametric statistics.

When the testing of a statistical hypothesis is designed to help make a decision about a population parameter (here, μ), the overall method falls under the category of inferential parametric statistics.

one-tailed p value

applies only to the upper end of the distribution.

Type I error

A false positive: rejecting the null hypothesis for an experiment in which the null hypothesis is actually true.

alpha is .05

Note that setting alpha to .05 fixes only the Type I error rate and does not determine the total number of Type I errors. Although no attempt is made to control how many null experiments are performed, universal adoption of the .05 level ensures that only (about) 5% of null experiments (i.e., experiments for which the null hypothesis is actually true) that are performed will lead to a rejection of H0 and therefore to results that are (mistakenly) regarded as statistically significant (i.e., Type I errors). Routinely setting a lower value for alpha (e.g., .01) is an option that would sometimes result in undesirable consequences that will be discussed shortly, and a larger value for alpha would lead to a rather high rate of Type I errors, so the .05 significance level is considered a reasonable compromise. It may seem that because .0304 is less than .05, we should reject the null hypothesis using the .05 decision rule.
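
The "about 5% of null experiments" claim can be checked with a small simulation. This is a sketch under stated assumptions: scores are drawn from a normal population in which H0 is exactly true, and the values σ = 100, N = 25, the number of experiments, and the function name `simulate_null_experiments` are all hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean

def simulate_null_experiments(n_experiments=10_000, n=25, mu=500.0,
                              sigma=100.0, alpha=0.05, seed=0):
    """Return the proportion of null experiments declared significant."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(n_experiments):
        # H0 is true here: every sample really comes from a population with mean mu
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (fmean(sample) - mu) / standard_error
        if abs(z) >= z_crit:
            rejections += 1  # a Type I error
    return rejections / n_experiments

rate = simulate_null_experiments()
print(f"Proportion of null experiments rejected: {rate:.3f}")
```

The proportion should land close to .05. Running more experiments changes the total number of Type I errors but not the rate.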

inferential statistics

Techniques for drawing inferences about an entire population, based on data obtained from a sample drawn from that population. One useful procedure is to estimate the values of population parameters, for example by using inferential statistics to determine an interval that is likely to include the population mean. The other major use of inferential statistics is to assess the probability of obtaining certain kinds of sample results under certain population conditions in order to test a specific hypothesis.

two-tailed test

The implication of a two-tailed test is that we would also have tested our sample mean for statistical significance if it were below the population mean. The two-tailed test is the more cautious option. To perform a two-tailed instead of a one-tailed test, all you have to do is double your one-tailed p value (which you obtained from Table 5.1) and compare it to your alpha level. A two-tailed p is defined as the probability of obtaining a result as extreme as the one you got, or more extreme, if the null hypothesis is true. Therefore, if you have a two-tailed hypothesis (one for which you would be interested in an extreme result in either direction from the mean), then you should add the area beyond one extreme score to the area beyond the opposite extreme score to get the total p value.

statistically significant

We can try to control the rate at which Type I errors are made. We do this by deciding on the proportion of all null experiments we will allow to be declared statistically significant (i.e., experiments in which the null hypothesis is rejected).

null hypothesis (symbolized by H0)

The null hypothesis specifies the hypothesized population parameter. It states that there is no effect or no difference of the kind that the experiment is seeking to establish. In the present study, H0: μ = 500. This null hypothesis implies that the sample with mean equal to 534.25 is a random sample from the population with μ equal to 500 (and that the difference between 534.25 and 500 is due to chance factors).
