Unit 3: Principles of the Scientific Method: Validity and Control Book
selection
A confound that can occur due to assignment of subjects to groups
experimenter bias
A large number of studies indicate that the experimenter can unintentionally bias the results of an experiment.
What is experimenter bias? How might such a bias be overcome?
A large number of studies indicate that the experimenter can unintentionally bias the results of an experiment. Experiments are often designed to allow bias to operate freely. In some studies, evidence exists that the experimenters may simply have fudged the data rather than biasing them. Nevertheless, there is widespread agreement that an experimenter's biases can subtly influence experiments. The effects of experimenter bias are so ubiquitous that a standard procedure in many disciplines is for the experimenter to be "blind" to the condition a subject experiences. This method of preventing experimenter bias is excellent and foolproof, but it is not always possible in a psychological experiment. Only after the experiment did we inquire about smoking habits. This technique was probably effective in most cases, but it could not make the experimenter blind to obvious tobacco stains or tobacco odor. So, as you can see, trying to blind an experimenter isn't always easy. Blind experimenters may also devise their own hypotheses about experiments and thus unintentionally bias the participants in the direction of their concocted hypotheses. Another basic strategy for reducing experimenter bias is to standardize or automate experiments as much as possible. In some experiments, testing participants in all conditions at the same time may be possible. The various conditions can be induced by written instructions given to each subject. If participants must be tested individually, instructions can be tape-recorded or presented via computer, so that each subject receives the same experience. Variations on these basic strategies can be constructed for use in particular situations.
What is replication? Distinguish two types of replication: direct and systematic.
A method of control seldom described as such is replication—the repeating of an experiment to see if the same results are found the second time. Laypeople sometimes assume that once a result has been found by a scientific experiment, the conclusions are fixed permanently. The truth is that an experiment seldom stands by itself, particularly if the results are surprising. In fact, an unusual result remains in a kind of limbo until other experimenters have successfully replicated the experiment. If other experimenters obtain the same results, they become part of our scientific knowledge. If the replication is not successful, the supposed facts found in the original experiment are invalid and are forgotten. Two types of replication are commonly distinguished: direct and systematic. Direct replication occurs when someone repeats essentially the identical experiment in an attempt to obtain the same results. Systematic replication occurs when Researcher B says, "If A's theory is correct, then the following should happen." Then B performs an experiment different from A's but based on it. If A's results and theory are correct, B should find a result that supports the theory. Direct replication is seldom carried out because finding exactly the same thing as someone else did brings little glory. More specifically, it is difficult to get grants for replications, journals tend to avoid publishing such research, and professors who spend time replicating other people's work do not get promoted. Direct replication is usually attempted only when systematic replication has failed. Investigators then go back and repeat the original method more exactly to pinpoint the source of the difference in results. Systematic replication is the usual way that experiments are replicated. Researcher B will do an experiment similar to Researcher A's but with different types of subjects, with different values of the stimulus, or with different ways of measuring the theoretical concepts. 
All of these approaches are considered systematic replication. As long as results consistent with A's are found, A's original experiment is supported by B's work. You will notice that systematic replication tests external validity by using different subjects, species, or situations. Construct validity is tested when different ways are used to measure the theoretical concepts. Statistical validity is tested in all replications, both direct and systematic.
maturation
A source of error in an experiment related to the amount of time between measurements
Briefly describe the major threats to internal validity.
Ambiguous Temporal Precedence: Temporal precedence is important in determining whether one thing caused another. The cause must always come before the effect, but in some cases it is unclear which variable came first. This often happens in correlational studies, in which two things are clearly related, but which one caused the other is not obvious. Although two variables are related, it is not clear which one is the cause and which one is the effect. History: Whenever an experiment is conducted in such a way that different experimental conditions are presented to subjects at different times, it is possible for events outside the laboratory to influence the results, and this type of confound is called history. Maturation: Maturation is a more critical problem in research involving children because they change more rapidly over time than do adults. Yet the fact that adults change with age is now becoming widely appreciated. Effects of Testing: Simply being in an experiment or being tested will influence people's performance in a later experiment or administration of the test. The participants may become sophisticated about the testing procedure or may learn how to take tests, so that there is an effect of repeat testing and their later behavior is changed by the earlier experience. Regression Effect: The regression effect, one of the most insidious threats to validity, arises in many situations. The regression effect operates when there is less than a perfect correlation between two variables. Individuals who performed at the extremes on one test tend to score closer to the mean on the other test. Selection: Many studies compare two or more groups on some dependent variable. Any bias in the selection of the members of the groups can undermine internal validity. Mortality: Mortality is sometimes called selective subject loss or attrition. 
Even if there is no bias in selecting participants and you are able to constitute groups that are the same in every respect, your study may be invalid if all subjects do not complete all phases of it. Mortality is a threat to validity because the participants who drop out of a study may be different from those who complete it. Biases can result if particular kinds of participants drop out.
What is random assignment? How can random assignment be performed?
Another powerful control method is random assignment of subjects to groups in a between-subjects experimental design. The term random assignment, or random allocation, is used here in a specific sense: The allocation of subjects to conditions is random when each subject has an equal and independent chance of being assigned to every condition. The advantage of random allocation of subjects to conditions is that once subjects have been randomly assigned, the only way that confounding of subject-related variables with the experimental variable can occur is by chance. Although it is often used in common speech, the statistical term random has a specific meaning that is not to be confused with haphazard, arbitrary, or without a plan. The first step is to assign numbers to individuals. If you have 20 subjects, you assign numbers one through 20 to the individuals in a way that is nonsystematic. Then look at a random-number table. You could decide that the first subject would go into Group 1 or Group 2 depending on whether a one or a two came up first in the random-number table. An alternative would be to consider Group 1 to be odd and Group 2 even. You would then put a subject into Group 1 if an odd number came up and into Group 2 if an even number came up.
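The odd/even procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `assign_by_random_digits` is hypothetical, and a pseudorandom digit from Python's `random` module stands in for reading a digit from a printed random-number table. Because each subject gets an independent draw, the two groups will not necessarily end up the same size.

```python
import random


def assign_by_random_digits(subject_ids, seed=None):
    """Assign each subject to Group 1 (odd digit) or Group 2 (even digit).

    Hypothetical helper: a pseudorandom digit replaces the random-number table.
    """
    rng = random.Random(seed)
    groups = {1: [], 2: []}
    for subject in subject_ids:
        digit = rng.randint(0, 9)            # one "table entry" per subject
        group = 1 if digit % 2 == 1 else 2   # odd -> Group 1, even -> Group 2
        groups[group].append(subject)
    return groups


# Example: 20 subjects numbered 1 through 20, as in the text.
groups = assign_by_random_digits(range(1, 21), seed=42)
print("Group 1:", groups[1])
print("Group 2:", groups[2])
```

If equal group sizes are required, a common variation is to shuffle the list of subjects and deal them out in turn, although the assignments are then no longer fully independent of one another.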
control
Any means used to rule out threats to the validity of research
validity
By validity, we mean simply that the researcher's conclusion is true or correct—that it corresponds to the actual state of the world. An indication of accuracy in terms of the extent to which a research conclusion corresponds with reality
What does the concept of control mean in experimental research?
Control is the other side of the validity coin. The heart of the experimental approach to knowledge is to ask the following two questions: (1) What are the threats to the validity of a contemplated piece of research? (2) What means are available to neutralize those threats? This approach is so basic that anyone who is acquainted with research has heard of control groups. Every experiment must have a control group, right? Wrong. It is true that you must have some method of countering every plausible alternative explanation of the results of your experiment. It is also true that this often involves the use of a group of participants who do not experience the manipulation (that is, a control group) to serve as a standard against which to compare the effect of the variable of interest. We define control as any means used to rule out possible threats to the validity of a piece of research. The concept of control is essentially a way of establishing that two individuals (or groups, or conditions) are identical except for the variable of interest. When that is the case, the research is internally valid. In psychology, the concept of control is used in two rather different ways: first, as a standard of comparison, and second, as a way of reducing variability.
matching
Control procedure to ensure that experimental and control groups are equated on one or more variables before the experiment
What is meant by confounding?
Confounding is an error that occurs when the effects of two variables in an experiment cannot be separated, resulting in a confused interpretation of the results. Confounding is one of the biggest threats to validity in experimentation. This experiment lacks internal validity because you cannot conclude that feedback caused any differences between the groups. This problem is known as confounding. When some condition co-varies with the independent variable in such a way that their separate effects cannot be sorted out, the two variables are confounded.
history
Events that occur outside of the experiment that could influence the results of the experiment
Why is matching a strategy for achieving control? Under what conditions should matching of subjects be done? Describe the proper matching procedure.
Experimental precision can sometimes be improved by matching subjects on a pretest before randomly allocating them to conditions. When the subjects differ among themselves on an independent variable known or suspected to affect the dependent variable of interest, matching may be necessary. The first requirement to justify matching is a strong suspicion that there is an important variable on which the subjects differ that can be controlled. Further, you must believe that a substantial correlation will be present between the matching variable and the dependent variable. A second condition necessary to justify matching is that it must be feasible to present a pretest to the subjects before assigning them to the conditions. Let us emphasize a final point about the mechanics of matching. Even when you have matched your subjects, you must still randomly allocate the members of the pairs to conditions. If you have 10 pairs of rats matched for weight, you must flip a coin or follow some procedure that will ensure that the members of each pair are allocated to groups randomly. Otherwise, your procedure for placing them into groups could introduce confounding results.
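The matching procedure above (rank subjects on the pretest, pair neighbors, then flip a coin within each pair) might look like this in Python. It is a minimal sketch, not a prescribed procedure: `allocate_matched_pairs` is a hypothetical name, the rat weights are made-up illustration data, and with an odd number of subjects the last one is simply left unassigned.

```python
import random


def allocate_matched_pairs(pretest_scores, seed=None):
    """Pair subjects with adjacent pretest scores, then randomly split each pair.

    pretest_scores: dict mapping subject id -> score on the matching variable.
    Returns (experimental, control) lists; index i in each list is one pair.
    """
    rng = random.Random(seed)
    ranked = sorted(pretest_scores, key=pretest_scores.get)
    experimental, control = [], []
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)                 # the "coin flip" within the pair
        experimental.append(pair[0])
        control.append(pair[1])
    return experimental, control


# Hypothetical data: 20 rats with weights in grams, giving 10 matched pairs.
data_rng = random.Random(1)
weights = {f"rat{i:02d}": round(data_rng.uniform(250, 400), 1) for i in range(1, 21)}
experimental, control = allocate_matched_pairs(weights, seed=7)
for e, c in zip(experimental, control):
    print(f"{e} ({weights[e]} g)  vs  {c} ({weights[c]} g)")
```

The shuffle inside the loop is the step the text insists on: without it, a rule such as "the heavier rat always goes to the experimental group" would confound weight with the treatment.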
internal validity
Extent to which a study provides evidence of a cause-effect relationship between the independent and dependent variables
statistical conclusion validity
Extent to which data are shown to be the result of cause-effect relationships rather than accident
construct validity
Extent to which the results support the theory behind the research
Define external validity.
External validity concerns whether the results of the research can be generalized to another situation: different subjects, settings, times, treatments, observations, and so forth. The idea that experimental results obtained in a laboratory setting might be different from those obtained in a natural setting reflects a question about ecological validity. This type of validity is closely related, but not identical, to external validity, and is concerned with how close an experimental situation is to the real world.
Distinguish the meanings of control: control experiment and experimental control.
Group 1 receives Treatment A (say, 5 mg of caffeine); it is the experimental group. Group 2 receives no treatment; it is the control group. The control group serves as the basis of comparison for the experimental group. If the two groups were equal before the experimental treatment, then any post-experimental difference between them can be attributed to the treatment. An experiment can also have a control condition without a separate control group: when each subject experiences every condition, we say that each subject serves as his or her own control. An experiment of this kind is called a within-subjects experiment because the differences between conditions are tested within individual subjects. An experiment in which different groups of subjects experience different conditions is a between-subjects experiment because the differences between conditions are tested between different subjects. A second meaning of the term control is distinct from the first but closely related: namely, the ability to restrain or guide sources of variability in research. Why is it important to reduce variability? Limiting the things that change to those mandated by the experimental design (such as the independent variable) reduces the chances of confounding variables or measurement error and increases our confidence in the experimental results. We can characterize the difference between the two meanings by use of the terms control experiment and experimental control. When we have experimental control (secondary meaning), we have a much more sensitive situation in which to rule out alternative explanations of the experimental results (primary meaning).
external validity
How well the findings of an experiment generalize to other situations or populations
Briefly describe building nuisance variables in an experiment as a strategy for achieving control.
In addition to random assignment or matching, another way to handle variables that cannot easily be removed from the experiment is to design the experiment so that these nuisance variables become independent variables in the study. Nuisance variables are variables that are known or suspected to affect the dependent variable but in which you have no theoretical interest. Left uncontrolled, these variables may affect the dependent variable so strongly that they hide the true effects of the independent variable. Building these nuisance variables into your study allows you to measure their effects and to examine the effects of your independent variable. Nuisance variables should not be confused with confounded variables. A confounded variable is one that varies with the independent variable. A nuisance variable is treated as a second independent variable that is varied separately from the first one.
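Treating a nuisance variable as a second independent variable amounts to crossing it with the first one, so that every level of one occurs with every level of the other. The sketch below illustrates the idea; the function name `build_factorial_design` and the example levels (feedback as the independent variable, time of day as the nuisance variable) are assumptions chosen for illustration. It uses shuffle-and-deal so that the cells stay the same size.

```python
import itertools
import random


def build_factorial_design(subject_ids, iv_levels, nuisance_levels, seed=None):
    """Cross the independent variable with a nuisance variable and fill the cells.

    Returns a dict mapping (iv_level, nuisance_level) -> list of subject ids.
    """
    rng = random.Random(seed)
    cells = list(itertools.product(iv_levels, nuisance_levels))
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)                      # random order of subjects
    design = {cell: [] for cell in cells}
    for i, subject in enumerate(shuffled):
        design[cells[i % len(cells)]].append(subject)  # deal into cells in turn
    return design


# Hypothetical 2 x 2 design with 20 subjects, 5 per cell.
design = build_factorial_design(
    range(1, 21),
    iv_levels=("feedback", "no feedback"),
    nuisance_levels=("morning", "afternoon"),
    seed=3,
)
for cell, members in design.items():
    print(cell, members)
```

Because time of day now varies separately from feedback rather than along with it, its effect can be measured instead of being confounded with the variable of interest.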
Define internal validity.
Internal validity is the most fundamental type because it concerns the logic of the relationship between the independent and dependent variables. An experiment has internal validity if there are sound reasons to believe that a cause-and-effect relationship really is present between the independent and dependent variables. In other words, in an experiment with high internal validity, it really was the independent variable that caused the dependent variable to change.
Briefly describe two threats to construct validity.
Loose Connection Between Theory and Method: The experiment described earlier for testing the effects of anxiety on learning was an extreme example of a loose connection between a theoretical construct and method. Nail biting is a poor method of measuring anxiety, and writing with the toes is likely to be a poor measure of learning. Much psychological research suffers from poor operational definition of theoretical concepts. Ambiguous Effect of Independent Variables: An experimenter may carefully design an experiment in which all reasonable confounding variables seem to be well controlled, only to have the results compromised because the participants perceive the situation differently than the experimenter does. Because some participants may see the situation in the same way as the experimenter, but others understand it differently, the experimental circumstances are ambiguous and the independent variable may affect participants differently. The ambiguous effect of the independent variables results from the fact that any psychological experiment for which a person has volunteered must be considered to be a social situation in which both the participant and the experimenter have preconceived ideas about what is expected.
statistical control
Mathematical means of comparing subjects on paper when they cannot be equated as they exist in fact
What is mortality? When is it a threat to internal validity?
Mortality is sometimes called selective subject loss or attrition. Even if there is no bias in selecting participants and you are able to constitute groups that are the same in every respect, your study may be invalid if all subjects do not complete all phases of it. Mortality is a threat to validity because the participants who drop out of a study may be different from those who complete it. Biases can result if particular kinds of participants drop out.
Briefly describe three threats to external validity.
Other Subjects: A common indictment of psychological research is that it uses mainly college students and white rats as experimental subjects. The reason psychologists rely on these two species is their accessibility to researchers and presumed representativeness. Human participants should be chosen with equal attention to their representativeness relative to some larger population. If you are doing an experiment with college students on bargaining and negotiation, will the results validly predict what a secretary of state or a general would do? Other Times: Would the same experiment conducted at another time produce the same results? Many historical trends render particular research findings invalid, whether they concern use of language, attitudes toward foreign countries, or perception of deviant groups. Other Settings: A pervasive problem that can hinder external validity involves the question of how the phenomenon observed in one laboratory can be related to a similar phenomenon observed in another laboratory or in the real world. Many psychologists have given up laboratory work altogether in favor of field research, or even armchair speculation, for this very reason. Though laboratory research ensures a higher level of control, it is sometimes not easy to decide if a certain effect is simply a laboratory effect or whether it would survive transplantation to the world outside the laboratory.
subject bias
Participants act the way they think the experimenter wants them to act. They may deliberately feign a naive attitude about the expected results even though they can guess the true purpose of the experiment.
role demands (or demand characteristics)
Participants' expectations of what an experiment requires them to do.
testing
Performance on a second test is influenced by simply having taken a first test
Define construct validity.
Recall that the construct validity of a measurement concerns whether it measures what it is intended to measure and nothing else. Construct validity of research concerns the question of whether the results support the theory behind the research. In other words, can you generalize from the specific operations of your experiment (including people and settings) to the general theoretical construct about the population in question? Would another theory predict the same experimental results? You can see that the two types of construct validity are related. Both are concerned with how well the underlying idea (or theory) is reflected in the measurement. If the measurement used in some research lacks construct validity, the research as a whole will also lack construct validity.
replicability
Repeating an experiment to see if the results will be the same
Define statistical conclusion validity.
Statistical conclusion validity is similar to internal validity. Here the question is, did the independent variable truly cause a change in the dependent variables, or was the result accidental, and thus caused by pure chance? It also asks how strong the relationship is between the independent and dependent variables. To establish statistical conclusion validity, appropriate sampling and measurement techniques must be used, and inferential statistics must be used properly, in keeping with their underlying assumptions.
Briefly describe the strategy of subject as own control. What are the limitations of this strategy?
Subject as Own Control (Within-Subjects Control): One of the most powerful control techniques is to have each participant experience every condition of the experiment. In this way, variation caused by differences between people is greatly reduced. The experimenter is wise to adopt the strategy of using participants as their own controls whenever possible. In many experiments, however, using subjects as their own controls simply is not possible. For example, once the participant has learned something by one method, learning the same problem again by using a different method is impossible. The information cannot be unlearned, so using the subjects as their own controls is not possible. Another situation in which using subjects as their own controls is not feasible occurs when contrast effects exist between the conditions of the experiment, so that experiencing one condition may carry over and influence the response to another condition. Sometimes contrast effects simply exaggerate an outcome that would also occur between subjects. Other times they produce outcomes that would not otherwise be found. The difference between using a within-subjects design and a between-subjects design can cause puzzling discrepancies in the results of experiments. In summary, you should consider using subjects as their own controls whenever three conditions can be met: 1. Using subjects as their own controls is logically possible. 2. Participating in all conditions of the experiment will not destroy the naiveté of the subject. 3. Serious contrast effects between conditions will not be present.
regression effects
Tendency of subjects with extreme scores on a first measure to score closer to the mean on a second testing
mortality
The dropping out of some subjects before an experiment is completed, causing a threat to validity
What are role demands (or demand characteristics)? How might they be overcome?
The participants' knowledge that they are participating in an experiment constitutes a set of expectations about how they are to behave. These expectations are called role demands, or demand characteristics, of the experiment. Perhaps the most prevalent type of role demand is the good-subject tendency. Participants act the way they think the experimenter wants them to act. They may deliberately feign a naive attitude about the expected results even though they can guess the true purpose of the experiment. Perhaps they have heard about the experiment or have learned of similar experiments conducted elsewhere. Another kind of participant expectancy is the concern that the experimental procedure in some way measures the participant's competence. Some participants are convinced that the experiment is a carefully disguised measure of intelligence or emotional adjustment. This expectancy gives rise to evaluation apprehension, in which participants tailor their behavior to make themselves look as normal as possible. Another name for this problem is social desirability. Much ingenuity has been devoted to keeping the influence of role demands from undermining the validity of experiments. The most obvious and seemingly simplest solution is to deceive the subject about the experiment's purpose. A cover story is devised that provides a plausible rationale, and the true hypothesis is not revealed. This ploy often works but has several drawbacks, not the least of which involves ethics. The story may cause participants to behave in a way that interacts with the true hypothesis. Inevitably, too, people hear that many psychological experiments are not what they seem to be on the surface. This knowledge increases the difficulty of devising a believable cover story and may even influence the results of experiments that do not use deception. Another approach is to divide the experiment in such a way that parts of the data are obtained in another setting. 
This design makes it less likely that participants will put two and two together and surmise the hypothesis. An additional method of counteracting bias is to use a measure that is unlikely to be influenced by participants' guesses about the hypothesis.
Why is confounding particularly acute in research in which a subject variable is used?
The problem of confounding is particularly acute in research in which the experimenter cannot control the independent variable, that is, when participants are selected according to the presence or absence of a condition and not selected simply to have a condition assigned to them. Such variables are called subject variables. A subject variable is a difference between subjects that cannot be controlled but can only be selected.
What is the regression effect? When does it arise?
The regression effect, one of the most insidious threats to validity, arises in many situations. The regression effect operates when there is less than a perfect correlation between two variables. Individuals who performed at the extremes on one test tend to score closer to the mean on the other test.
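A small simulation can make the regression effect concrete. The sketch below rests on assumptions that are not in the text: each score is modeled as a stable ability plus independent random error, so the two tests correlate less than perfectly, and the function name `simulate_regression_effect` and all numeric settings are illustrative choices. It picks the top 10 percent of scorers on the first test and compares that group's mean on both tests.

```python
import random
import statistics


def simulate_regression_effect(n=10000, noise_sd=10.0, seed=0):
    """Return (top-group mean on test 1, same group's mean on test 2, overall mean on test 2)."""
    rng = random.Random(seed)
    ability = [rng.gauss(100, 15) for _ in range(n)]
    # Independent error on each test keeps the test-retest correlation below 1.
    test1 = [a + rng.gauss(0, noise_sd) for a in ability]
    test2 = [a + rng.gauss(0, noise_sd) for a in ability]
    cutoff = sorted(test1)[int(0.9 * n)]          # 90th percentile on test 1
    top = [i for i in range(n) if test1[i] >= cutoff]
    top_mean_1 = statistics.mean([test1[i] for i in top])
    top_mean_2 = statistics.mean([test2[i] for i in top])
    return top_mean_1, top_mean_2, statistics.mean(test2)


top_mean_1, top_mean_2, overall_mean_2 = simulate_regression_effect()
print(f"Top group on test 1: {top_mean_1:.1f}")
print(f"Same group on test 2: {top_mean_2:.1f}")
print(f"Everyone on test 2:  {overall_mean_2:.1f}")
```

No treatment was applied between the two tests, yet the extreme group's second score falls back toward the overall mean. In a real study, that drop could easily be mistaken for an effect of whatever happened between measurements.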
List several specific strategies for achieving control.
There are three general strategies for achieving control in psychological research: using a laboratory setting, considering the research setting as a preparation, and instrumenting the response. Specific strategies: Subject as Own Control (Within-Subjects Control); Random Assignment (Between-Subjects Control); Matching (Between-Subjects Control); Building Nuisance Variables into the Experiment; Application, Replication.
What are the two broad categories of bias resulting from the interaction between subject and experimenter?
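This concern comes from the realization that an experiment is a social situation with its own set of rules. Both the participant and the experimenter have expectations about how they should behave in an experiment. The two broad categories are bias arising from the subject (role demands, or demand characteristics) and bias arising from the experimenter (experimenter bias).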
Briefly describe the several ways to determine if a test has construct validity.
To improve the validity of your experiment, you might have used a manipulation check, such as including the Beck Anxiety Inventory as a way to make sure that the fingernail biting was a good way to classify your subjects on anxiety. Manipulation checks aim to see that a variable (usually the independent variable) is working in the way that you think that it is, and these checks are often built right into the experimental design. Manipulation Check: Aspect of an experiment designed to make certain that variables have changed in the way that was intended. New Study: Construct validity is similar to internal validity. In internal validity, you strive to rule out alternative variables as potential causes of the behavior of interest; in construct validity, you must rule out other possible theoretical explanations of the results. In either case, you may have to perform another study to rule out a threat to validity. For internal validity, you may find it possible to redesign the study to control for the source of confounding. In the case of construct validity, you must design a new study that will permit a choice between the two competing theoretical explanations of the results.
randomization
Unbiased assignment process that gives each subject an equal and independent chance of being placed in every condition
How is statistical control used as a strategy for achieving control?
Usually, however, that old devil variability cannot be completely exorcised from the experiment. Then it is necessary to use statistical control. Statistical control in the broad sense is synonymous with inferential statistics, the branch of statistics that deals with making decisions in the face of uncertainty.