Psych Chapter 2


Validity, Reliability, and Power

• There are many ways to define and detect a property such as happiness, so which ways are best? The most important feature of an operational definition is validity, the goodness with which a concrete event defines a property. For example, the concrete event called frequency of smiling is a valid way to define the property called happiness because, as we all know, people tend to smile more often when they feel happy. Do they eat more or talk more or spend more money? Well, maybe. But maybe not. And that's why food consumption or verbal output or financial expenditures would probably be regarded by most people as invalid measures of happiness (though perfectly valid measures of something else). Validity is to some extent in the eye of the beholder, but most beholders would agree that the frequency of smiles is a more valid way to operationally define happiness than is frequency of eating, talking, or spending. • What then is the most important feature of an instrument? Actually, there are two. First, a good instrument has reliability, which is the tendency for an instrument to produce the same measurement whenever it is used to measure the same thing. For example, if a person smiles just as much on Tuesday as on Wednesday, then a smile-detecting instrument should produce identical results on those two days. If it produced different results (i.e., if the instrument detected differences that weren't actually there), it would lack reliability. Second, a good instrument has power, which is an instrument's ability to detect small magnitudes of the property. If a person smiled just slightly more often on Tuesday than on Wednesday, then a good smile-detector should produce different results on those two days. If it produced the same result (i.e., if it failed to detect a small difference that was actually there), then it would lack power.

Observation: Discovering What People Do

• To observe means to use one's senses to learn about the properties of an event (e.g., a storm or a parade) or an object (e.g., an apple or a person). For example, when you observe a round, red apple, your brain is using the pattern of light that is coming into your eyes to draw an inference about the apple's identity, shape, and color. That kind of informal observation is fine for buying fruit but not for doing science. Why? First, casual observations are notoriously unstable. The same apple may appear red in the daylight and crimson at night or spherical to one person and elliptical to another. Second, casual observations can't tell us about all of the properties that might interest us. No matter how long and hard you look, you will never be able to discern an apple's crunchiness or pectin content simply by watching it. • Luckily, scientists have devised techniques that allow them to overcome these problems. In the first section (Measurement), we'll see how psychologists design instruments and then use them to make measurements. In the second section (Description), we'll see what psychologists do with their measurements once they've made them.

PREVIEW

• We'll start by examining the general principles that guide scientific research and distinguish it from every other way of knowing. Next, we'll see that the methods of psychology are meant to answer two basic questions: what do people do, and why do they do it? Psychologists answer the first question by observing and measuring, and they answer the second question by looking for relationships between the things they measure. We'll see that scientific research allows us to draw certain kinds of conclusions and not others, and we'll see that most people have problems thinking critically about scientific evidence. Finally, we'll consider the unique ethical questions that confront scientists who study people and other animals.

Causation

• We observe correlations all the time: between automobiles and pollution, between bacon and heart attacks, between sex and pregnancy. Natural correlations are the correlations observed in the world around us, and although such observations can tell us whether two variables have a relationship, they cannot tell us what kind of relationship these variables have. For example, many studies have found a positive correlation between the amount of violence to which a child is exposed through media such as television, movies, and video games (variable X) and the aggressiveness of the child's behavior (variable Y; Anderson & Bushman, 2001; Anderson et al., 2003; Huesmann et al., 2003). The more media violence a child is exposed to, the more aggressive that child is likely to be. These variables clearly have a relationship— they are imperfectly positively correlated—but why?
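The phrase "imperfectly positively correlated" can be made concrete with a small computation. The sketch below, using invented data (the exposure and aggression numbers are not from the studies cited above), computes a Pearson correlation coefficient: a value near +1 means a strong positive relationship, near 0 means no linear relationship, and near -1 means a strong negative relationship.

```python
# Sketch: computing a Pearson correlation coefficient for two variables.
# The data here are invented purely for illustration -- hypothetical hours
# of violent media per week (X) and hypothetical aggression ratings (Y).

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x ** 0.5 * var_y ** 0.5)

media_hours = [1, 3, 5, 8, 12, 15]  # hypothetical exposure scores
aggression = [2, 2, 4, 5, 7, 6]     # hypothetical aggression ratings

r = pearson_r(media_hours, aggression)
print(round(r, 2))  # close to +1, but not exactly +1: a strong yet imperfect positive correlation
```

Note what the number does and does not tell you: a high r establishes that the two variables have a relationship, but, as the text explains, it cannot by itself tell you what kind of relationship (causal or otherwise) that is.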

Empiricism: How to Know Stuff

• When ancient Greeks sprained their ankles, caught the flu, or accidentally set their togas on fire, they had to choose between two kinds of doctors: dogmatists (from dogmatikos, meaning "belief"), who thought that the best way to understand illness was to develop theories about the body's functions, and empiricists (from empeirikos, meaning "experience"), who thought that the best way to understand illness was to observe sick people. The rivalry between these two schools of medicine didn't last long because the people who went to see dogmatists tended to die, which was bad for business. Today we use the word dogmatism to describe the tendency for people to cling to their assumptions, and the word empiricism to describe the belief that accurate knowledge can be acquired through observation. The fact that we can answer questions about the natural world by examining it may seem painfully obvious to you, but this painfully obvious fact has only recently gained wide acceptance. For most of human history, people trusted authority to answer important questions, and it is only in the last millennium (and especially in the past three centuries) that people have begun to trust their eyes and ears more than their elders.

We Expect and Want

• When two people are presented with the same evidence, they often draw different conclusions. Sir Francis Bacon knew why. "The human understanding, once it has adopted opinions... draws everything else to support and agree with them," thus our "first conclusion colors and brings into conformity with itself all that come after." In other words, our preexisting beliefs color our view of new evidence, causing us to see what we expect to see. As such, evidence often seems to confirm what we believed all along. • This tendency has been widely documented in psychological science. For instance, participants in one study (Darley & Gross, 1983) learned about a little girl named Hannah. One group of participants was told that Hannah came from an affluent family and another group was told that Hannah came from a poor family. All participants were then shown some evidence about Hannah's academic abilities (specifically, they watched a video of Hannah taking a reading test) and were then asked to rate Hannah. Although the video was exactly the same for all participants, those who believed that Hannah was affluent rated her performance more positively than did those who believed that Hannah was poor. What's more, both groups of participants defended their conclusions by citing evidence from the video! Experiments like this one suggest that when we consider evidence, what we see depends on what we expected to see. • Our beliefs aren't the only things that color our views of evidence. Those views are also colored by our preferences and prejudices, our ambitions and aversions, our hopes and needs and wants and dreams. As Bacon noted, "The human understanding is not a dry light, but is infused by desire and emotion which give rise to wishful science. For man prefers to believe what he wants to be true." Research suggests that Bacon was right about this as well. 
For example, participants in one study (Lord, Ross, & Lepper, 1979) were shown some scientific evidence about the effectiveness of the death penalty. Some of the evidence suggested that the death penalty deterred crime, and some suggested it did not. What did participants make of this mixed bag of evidence? Participants who originally supported the death penalty became even more supportive, and participants who originally opposed the death penalty became even more opposed. In other words, when presented with exactly the same evidence, participants saw what they wanted to see and ended up feeling even more sure about their initial views. Subsequent research has shown that the same pattern emerges when professional scientists are asked to rate the quality of scientific studies that either confirm or disconfirm what they want to believe (Koehler, 1993). • Exactly how do beliefs and desires shape our view of the evidence? People hold different kinds of evidence to different standards. When evidence confirms what we believe or want to believe, we tend to ask ourselves, "can I believe it?" and our answer is usually yes; but when evidence disconfirms what we believe or want to believe, we tend to ask ourselves, "must I believe it?" and the answer is often no (Gilovich, 1991). Can you believe that people with college degrees are happier than people without them? Yes! There are plenty of surveys showing that just such a relationship exists and a reasonable person who studied the evidence could easily defend this conclusion. Now, must you believe it? Well, no. After all, those surveys didn't measure every single person on earth, did they? And if the survey questions had been asked differently they might well have produced different answers, right? A reasonable person who studied the evidence could easily conclude that the relationship between education and happiness is not yet clear enough to warrant an opinion. 
• Our beliefs and desires also influence which evidence we consider in the first place. Most people surround themselves with others who believe what they believe and want what they want, which means that our friends and families are much more likely to validate our beliefs and desires than to challenge them. Studies also show that when given the opportunity to search for evidence, people preferentially search for evidence that confirms their beliefs and fulfills their desires (Hart et al., 2009). What's more, when people find evidence that confirms their beliefs and fulfills their desires, they tend to stop looking, but when they find evidence that does the opposite, they keep searching for more evidence (Kunda, 1990). • What all of these studies suggest is that evidence leaves room for interpretation, and that's the room in which our beliefs and desires spend most of their time. Because it is so easy to see what we expect to see or to see what we want to see, the first step in critical thinking is simply to doubt your own conclusions. One of the best ways to reduce your own certainty is to seek out people who doubt you and listen carefully to what they have to say. Scientists go out of their way to expose themselves to criticism by sending their papers to the colleagues who are most likely to disagree with them or by presenting their findings to audiences full of critics, and they do this in large part so they can achieve a more balanced view of their own conclusions. If you want to be happy, take your friend to lunch; if you want to be right, take your enemy.

The Skeptical Stance

• Winston Churchill once said that democracy is the worst form of government, except for all the others. Similarly, science is not an infallible method for learning about the world; it's just a whole lot less fallible than the other methods. Science is a human enterprise, and humans make mistakes. They see what they expect to see, they see what they want to see, and they rarely consider what they can't see at all. • What makes science different than most other human enterprises is that science actively seeks to discover and remedy its own biases and errors. Scientists are constantly striving to make their observations more accurate and their reasoning more rigorous, and they invite anyone and everyone to examine their evidence and challenge their conclusions. As such, science is the ultimate democracy—one of the only institutions in the world in which the lowliest nobody can triumph over the most celebrated someone. When an unknown Swiss patent clerk named Albert Einstein challenged the greatest physicists of his day, he didn't have a famous name, a fancy degree, powerful friends, or a fat wallet. He just had evidence. And he prevailed for one reason: His evidence was right. • So think of the remaining chapters in this book as a report from the field—a description of the work that psychological scientists have done as they stumble toward knowledge. These chapters tell the story of the men and women who have put their faith in Sir Francis Bacon's method and used it to pry loose small pieces of the truth about who we are, how we work, and what we are all doing here together on the third stone from the sun. Read it with interest, but also with skepticism. Some of the things we are about to tell you simply aren't true; we just don't yet know which things they are. We invite you to think critically about what you read here, and everywhere else. Now, let the doubting begin.

Description

• You now know how to generate a valid operational definition, how to design a reliable and powerful instrument, and how to use that instrument while avoiding demand characteristics and observer bias. So where does that leave you? With a big page filled with numbers—and if you are like most people, a big page filled with numbers just doesn't seem very informative. Don't worry, most psychologists feel the same way, and that's why they have two techniques for making sense of big pages full of numbers: graphic representations and descriptive statistics.

Explanation: Discovering Why People Do What They Do

It would be interesting to know whether happy people are healthier than unhappy people, but it would be even more interesting to know why. Does happiness make people healthier? Does being healthy make people happier? Does being rich make people healthy and happy? These are the kinds of questions scientists often wish to answer, and scientists have developed some clever ways of using their measurements to do just that. In the first section (Correlation) we'll examine techniques that can tell us whether two things are related. In the second section (Causation), we'll examine techniques that can tell us whether the relationship between two things is one of cause and effect. In the third section (Drawing Conclusions) we'll see what kinds of conclusions these techniques allow us to draw. Finally, in the fourth section, we'll discuss the difficulty that most of us have thinking critically about scientific evidence.

Descriptive Statistics

• A frequency distribution depicts every measurement and thus provides a full and complete picture of those measurements. But sometimes a full and complete picture is just TMI. When we ask a friend how she's been, we don't want her to show us a frequency distribution of her happiness scores on each day of the previous 6 months. We want a brief summary statement that captures the essential information that such a graph would provide (e.g., "I've been doing pretty well," or, "I've been having some ups and downs lately"). In psychology, brief summary statements that capture the essential information from a frequency distribution are called descriptive statistics. There are two important kinds of descriptive statistics: those that describe the central tendency of a frequency distribution and those that describe the variability in a frequency distribution. • Descriptions of central tendency are statements about the value of the measurements that tend to lie near the center or midpoint of the frequency distribution. When a friend says that she's been "doing pretty well," she is describing the central tendency (or approximate location of the midpoint) of the frequency distribution of her happiness over time (see FIGURE 2.3). The three most common descriptions of central tendency are: the mode (the value of the most frequently observed measurement); the mean (the average value of all the measurements); and the median (the value that is in the middle; i.e., greater than or equal to half the measurements and less than or equal to half the measurements). FIGURE 2.4 shows how each of these descriptive statistics is calculated. When you hear a descriptive statistic such as "the average American college student sleeps 8.3 hours per day," you are hearing about the central tendency of a frequency distribution (in this case, the mean).
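The three measures of central tendency described above can be sketched in a few lines. The sleep measurements below are invented for illustration (they are not the 8.3-hour survey figure mentioned in the text).

```python
# A minimal sketch of the three measures of central tendency.
# The data are invented hypothetical nightly sleep measurements.
from statistics import mean, median, mode

sleep_hours = [7, 8, 8, 8, 9, 9, 10]

print(mean(sleep_hours))    # the average value of all the measurements
print(median(sleep_hours))  # the value in the middle of the sorted measurements
print(mode(sleep_hours))    # the most frequently observed value
```

In this small sample the median and mode happen to coincide while the mean is pulled slightly upward by the 10-hour night, a preview of the skew effect discussed next.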
• In a normal distribution, the mean, median, and mode all have the same value, but when the distribution is not normal, these three descriptive statistics can differ. For example, imagine that you measured the net worth of 40 college professors, and Mark Zuckerberg. The frequency distribution of your measurements would not be normal, but positively skewed. As you can see in FIGURE 2.5, the mode and the median of a positively skewed distribution are much lower than the mean because the mean is more strongly influenced by the value of a single extreme measurement (which, in case you've been sleeping for the last few years, would be the net worth of Mark Zuckerberg). When distributions become skewed, the mean gets dragged off toward the tail, the mode stays home at the hump, and the median goes to live between the two. When distributions are skewed, a single measure of central tendency can paint a misleading picture of the measurements. For example, the average net worth of the people you measured is probably about a billion dollars each, but that statement makes the college professors sound a whole lot richer than they are. You could provide a much better description of the net worth of the people you measured if you also mentioned that the median net worth is $300,000 and that the modal net worth is $288,000. Indeed, you should always be suspicious when you hear some new fact about "the average person" but don't hear anything about the shape of the frequency distribution. • Whereas descriptions of central tendency are statements about the location of the measurements in a frequency distribution, descriptions of variability are statements about the extent to which the measurements differ from each other. When a friend says that she has been "having some ups and downs lately," she is offering a brief summary statement that describes how measurements of her happiness taken at different times tend to differ from one another. 
The simplest description of variability is the range, which is the value of the largest measurement in a frequency distribution minus the value of the smallest measurement. When the range is small, the measurements don't vary as much as when the range is large. The range is easy to compute, but like the mean it can be dramatically affected by a single measurement. If you said that the net worth of people you had measured ranged from $40,000 to $14 billion, a listener might get the impression that these people were all remarkably different from each other when, in fact, they were all quite similar save for one very rich guy from California. • Other descriptions of variability aren't quite as susceptible to this problem. For example, the standard deviation is a statistic that describes the average difference between the measurements in a frequency distribution and the mean of that distribution. In other words, on average, how far are the measurements from the center of the distribution? As FIGURE 2.6 shows, two frequency distributions can have the same mean, but very different ranges and standard deviations. • For example, studies show that men and women have the same mean IQ, but that men have a larger range and standard deviation, which is to say that a man is more likely than a woman to be much more or much less intelligent than the average person of his or her own gender.
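The outlier problem described above (one billionaire dragging the mean and the range while the median stays put) can be demonstrated directly. The net-worth figures below are invented stand-ins for the professors-plus-Zuckerberg example; the standard deviation here is the population standard deviation, one common way of summarizing average distance from the mean.

```python
# Sketch of how a single extreme value distorts the mean and the range
# but barely moves the median. All net-worth figures are invented.
from statistics import mean, median, pstdev

professors = [250_000, 288_000, 300_000, 310_000, 350_000]
with_billionaire = professors + [50_000_000_000]  # add one extreme measurement

print(mean(professors), median(professors))              # both near $300,000
print(mean(with_billionaire), median(with_billionaire))  # mean explodes; median barely moves

# The range is the largest measurement minus the smallest -- and, like the
# mean, it is dominated by the single extreme value.
value_range = max(with_billionaire) - min(with_billionaire)
print(value_range)

# The standard deviation summarizes how far measurements lie from the mean.
print(pstdev(professors))
```

This is exactly why the text warns you to be suspicious of claims about "the average person" that say nothing about the shape of the distribution: the mean alone can make five professors and a billionaire look like six billionaires.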

Graphic Representations

• A picture may be worth a thousand words, but it is worth a million digits. As you'll learn in the Sensation and Perception chapter, vision is our most sophisticated sense, and human beings typically find it easier to understand things when they are represented visually than numerically or verbally. Psychologists are people too, and they often create graphic representations of the measurements they collect. The most common kind is the frequency distribution, which is a graphic representation of measurements arranged by the number of times each measurement was made. FIGURE 2.2 shows a pair of frequency distributions that represent the hypothetical performances of a group of men and women who took a test of fine motor skills (i.e., the ability to manipulate things with their hands). Every possible test score is shown on the horizontal axis. The number of times (or the frequency with which) each score was observed is shown on the vertical axis. Although a frequency distribution can have any shape, a common shape is the bell curve, technically known as the Gaussian distribution or the normal distribution: a mathematically defined distribution in which the frequency of measurements is highest in the middle and decreases symmetrically in both directions. The mathematical definition of the normal distribution isn't important. (Well, for you anyway. For statisticians it is slightly more important than breathing.) What is important for you is what you can easily see for yourself: the normal distribution is symmetrical (i.e., the left half is a mirror image of the right half), has a peak in the middle, and trails off at both ends. • The picture in Figure 2.2 reveals in a single optical gulp what a page full of numbers never can. For instance, the shape of the distributions instantly tells you that most people have moderate motor skills, and that only a few have exceptionally good or exceptionally bad motor skills.
You can also see that the distribution of men's scores is displaced a bit to the left of the distribution of women's scores, which instantly tells you that women tend to have somewhat better motor skills than men. And finally, you can see that the two distributions have a great deal of overlap, which tells you that although women tend to have better motor skills than men, there are still plenty of men who have better motor skills than plenty of women.
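A frequency distribution like the one in Figure 2.2 is built by counting how many times each score was observed. The sketch below does this for a small set of invented test scores and prints a crude text histogram; with these particular data the result is symmetrical with a peak in the middle, the signature shape of a roughly normal distribution.

```python
# Sketch: building a frequency distribution by counting observations.
# The test scores are invented for illustration.
from collections import Counter

scores = [3, 4, 4, 5, 5, 5, 5, 6, 6, 7]  # hypothetical motor-skill test scores
freq = Counter(scores)

for score in sorted(freq):
    # score goes on the horizontal axis, frequency on the vertical axis
    print(score, "#" * freq[score])
```

Reading the output the way the text reads Figure 2.2: most measurements cluster near the middle value, and frequencies trail off symmetrically toward both extremes.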

Respecting People

• During World War II, Nazi doctors performed truly barbaric experiments on human subjects, such as removing organs or submerging them in ice water just to see how long it would take them to die. When the war ended, the international community developed the Nuremberg Code of 1947 and then the Declaration of Helsinki in 1964, which spelled out rules for the ethical treatment of human subjects. Unfortunately, not everyone obeyed them. For example, from 1932 until 1972, the U.S. Public Health Service conducted the infamous Tuskegee experiment in which 399 African American men with syphilis were denied treatment so that researchers could observe the progression of the disease. As one journalist noted, the government "used human beings as laboratory animals in a long and inefficient study of how long it takes syphilis to kill someone" (Coontz, 2008). • In 1974, the U.S. Congress created the National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. In 1979, the U.S. Department of Health, Education and Welfare released what came to be known as the Belmont Report, which described three basic principles that all research involving human subjects should follow. First, research should show respect for persons and their right to make decisions for and about themselves without undue influence or coercion. Second, research should be beneficent, which means that it should attempt to maximize benefits and reduce risks to the participant. Third, research should be just, which means that it should distribute benefits and risks equally to participants without prejudice toward particular individuals or groups. • The specific ethical code that psychologists follow incorporates these basic principles and expands them. (You can find the American Psychological Association's Ethical Principles of Psychologists and Code of Conduct (2002) at http://www.apa.org/ethics/code/index.aspx.)
Here are a few of the most important rules that govern the conduct of psychological research: o Informed consent: Participants may not take part in a psychological study unless they have given informed consent, which is a written agreement to participate in a study made by an adult who has been informed of all the risks that participation may entail. This doesn't mean that the person must know everything about the study (e.g., the hypothesis), but it does mean that the person must know about anything that might potentially be harmful or painful. If people cannot give informed consent (e.g., because they are minors or are mentally incapable), then informed consent must be obtained from their legal guardians. And even after people give informed consent, they always have the right to withdraw from the study at any time without penalty. o Freedom from coercion: Psychologists may not coerce participation. Coercion includes not only physical and psychological coercion but monetary coercion as well. It is unethical to offer people large amounts of money to persuade them to do something that they might otherwise decline to do. College students may be invited to participate in studies as part of their training in psychology, but they are ordinarily offered the option of learning the same things by other means. o Protection from harm: Psychologists must take every possible precaution to protect their research participants from physical or psychological harm. If there are two equally effective ways to study something, the psychologist must use the safer method. If no safe method is available, the psychologist may not perform the study. o Risk-benefit analysis: Although participants may be asked to accept small risks, such as a minor shock or a small embarrassment, they may not be asked to accept large risks, such as severe pain, psychological trauma, or any risk that is greater than the risks they would ordinarily take in their everyday lives.
Furthermore, even when participants are asked to take small risks, the psychologist must first demonstrate that these risks are outweighed by the social benefits of the new knowledge that might be gained from the study. o Deception: Psychologists may only use deception when it is justified by the study's scientific, educational, or applied value and when alternative procedures are not feasible. They may never deceive participants about any aspect of a study that could cause them physical or psychological harm or pain. o Debriefing: If a participant is deceived in any way before or during a study, the psychologist must provide a debriefing, which is a verbal description of the true nature and purpose of a study. If the participant was changed in any way (e.g., made to feel sad), the psychologist must attempt to undo that change (e.g., ask the person to do a task that will make him or her happy) and restore the participant to the state he or she was in before the study. o Confidentiality: Psychologists are obligated to keep private and personal information obtained during a study confidential. • These are just some of the rules that psychologists must follow. But how are those rules enforced? Almost all psychology studies are done by psychologists who work at colleges and universities. These institutions have institutional review boards (IRBs) that are composed of instructors and researchers, university staff, and laypeople from the community (e.g., business leaders or members of the clergy). If the research is federally funded (as much research is), then the law requires that the IRB include at least one nonscientist and one person who is not affiliated with the institution. A psychologist may conduct a study only after the IRB has reviewed and approved it. • As you can imagine, the code of ethics and the procedure for approval are so strict that many studies simply cannot be performed anywhere, by anyone, at any time. 
For example, psychologists would love to know how growing up without exposure to language affects a person's subsequent ability to speak and think, but they cannot ethically manipulate that variable in an experiment. They can only study the natural correlations between language exposure and speaking ability, and so may never be able to firmly establish the causal relationships between these variables. Indeed, there are many questions that psychologists will never be able to answer definitively because doing so would require unethical experiments that violate basic human rights.

The Scientific Method

• Empiricism is the essential element of the scientific method, which is a procedure for finding truth by using empirical evidence. In essence, the scientific method suggests that when we have an idea about the world—about how bats navigate, or where the moon came from, or why people can't forget traumatic events—we should gather empirical evidence relevant to that idea and then modify the idea to fit with the evidence. Scientists usually refer to an idea of this kind as a theory, which is a hypothetical explanation of a natural phenomenon. We might theorize that bats navigate by making sounds and then listening for the echo, that the moon was formed when a small planet collided with the Earth, or that the brain responds to traumatic events by producing chemicals that facilitate memory. Each of these theories is an explanation of how something in the natural world works. • When scientists set out to develop a theory, they generally follow the rule of parsimony, which says that the simplest theory that explains all the evidence is the best one. Parsimony comes from the Latin word parcere, meaning "to spare," and the rule is often credited to the 14th-century logician William Ockham, who wrote "Plurality should not be posited without necessity." Ockham wasn't suggesting that nature is simple or that complex theories are wrong. He was merely suggesting that it makes sense to start with the simplest theory and then make the theory more complicated only if one must. Part of what makes E = mc² such a lovely theory is that it has exactly three letters and one number. • We want our theories to be as simple as possible, but we also want them to be right. How do we decide if a theory is right? Theories make specific predictions about what we should observe in the world. For example, if bats really do navigate by making sounds and then listening for echoes, then we should observe that deaf bats can't navigate.
That "should statement" is technically known as a hypothesis, which is a falsifiable prediction made by a theory. The word falsifiable is a critical part of that definition. Some theories, such as "God created the universe," simply do not specify what we should observe if they are true, and thus no observation can ever falsify them. Because these theories do not give rise to hypotheses, they can never be the subject of scientific investigation. That doesn't mean they're wrong—it just means that we can't evaluate them by using the scientific method. • So what happens when we test a hypothesis? Albert Einstein is reputed to have said, "No amount of experimentation can ever prove me right, but a single experiment can prove me wrong." Why should that be? Well, just imagine what you could possibly learn about the navigation-by-sound theory if you observed a few bats. If you saw the deaf bats navigating every bit as well as the hearing bats, then the navigation-by-sound theory would instantly be proved wrong; but if you saw the deaf bats navigating more poorly than the hearing bats, your observation would be consistent with the navigation-by-sound theory but would not prove it. After all, even if you didn't see a deaf bat navigating perfectly today, it is still possible that someone else did, or that you will see one tomorrow. We can't observe every bat that has ever been and will ever be, which means that even if the theory wasn't disproved by your observation there always remains some chance that it will be disproved by some other observation. When evidence is consistent with a theory, it increases our confidence in it, but it never makes us completely certain. The next time you see a newspaper headline that says "Scientists prove theory X correct," you are hereby authorized to roll your eyes.
• The scientific method suggests that the best way to learn the truth about the world is to develop theories, derive hypotheses from them, test those hypotheses by gathering evidence, and then use that evidence to modify the theories. But what exactly does gathering evidence entail?

The Art of Looking

• For centuries, people rode horses. And for centuries when they got off their horses they sat around and argued about whether all four of a horse's feet ever leave the ground at the same time. Some said yes, some said no, and some said they really wished they could talk about something else for a change. In 1877, Eadweard Muybridge invented a technique for taking photographs in rapid succession, and his photos showed that when horses gallop, all four feet do indeed leave the ground. And that was that. Never again did two riders have the pleasure of a flying-horse debate because Muybridge had settled the matter, once and for all time.
• But why did it take so long? After all, people had been watching horses gallop for quite a few years, so why did some say that they clearly saw the horse going airborne while others said that they clearly saw at least one hoof on the ground at all times? Because as wonderful as eyes may be, there are a lot of things they cannot see and a lot of things they see incorrectly. We can't see germs but they are very real. The Earth looks perfectly flat but it is imperfectly round. As Muybridge knew, we have to do more than just look if we want to know the truth about the world. Empiricism is the right approach, but to do it properly requires an empirical method, which is a set of rules and techniques for observation.
• In many sciences, the word method refers primarily to technologies that enhance the powers of the senses. Biologists use microscopes and astronomers use telescopes because the things they want to observe are invisible to the naked eye. Human behavior, on the other hand, is quite visible, so you might expect psychology's methods to be relatively simple. In fact, the empirical challenges facing psychologists are among the most daunting in all of modern science, and thus psychology's empirical methods are among the most sophisticated in all of modern science.
These empirical challenges arise because people have three qualities that make them unusually difficult to study:
o Complexity: No galaxy, particle, molecule, or machine is as complicated as the human brain. Scientists can describe the birth of a star or the death of a cell in exquisite detail, but they can barely begin to say how the billions of interconnected neurons that constitute the brain give rise to the thoughts, feelings, and actions that are psychology's core concerns.
o Variability: In almost all the ways that matter, one E. coli bacterium is pretty much like another. But people are as varied as their fingerprints. No two individuals ever do, say, think, or feel exactly the same thing under exactly the same circumstances, which means that when you've seen one, you've most definitely not seen them all.
o Reactivity: An atom of cesium-133 oscillates 9,192,631,770 times per second regardless of whether anyone is watching. But people often think, feel, and act one way when they are being observed and a different way when they are not. When people know they are being studied, they don't always behave as they otherwise would.
• The fact that human beings are complex, variable, and reactive presents major challenges to the scientific study of their behavior, and psychologists have developed two kinds of methods that are designed to meet these challenges head-on: methods of observation, which allow them to determine what people do, and methods of explanation, which allow them to determine why people do it. We'll examine both of these methods in the sections that follow.

Measurement

• For most of human history, people had no idea how old they were because there was no simple way to keep track of time—or weight, or volume, or density, or temperature, or anything else for that matter. Today we live in a world of rulers, clocks, calendars, odometers, thermometers, and mass spectrometers. Measurement is not just a basic part of science, it is a basic part of modern life. But what exactly does measurement require? Whether we want to measure the intensity of an earthquake, the distance between molecules, or the attitude of a registered voter, we must always do two things—define the property we wish to measure and then find a way to detect it.

Correlation

• How much sleep did you get last night? Okay, now, how many U.S. presidents can you name? If you asked a dozen college students those two questions, you'd probably find that the students who got a good night's sleep are better president namers than are students who pulled an all-nighter. A pattern of responses like the one shown in TABLE 2.1 would probably lead you to conclude that sleep deprivation causes memory problems. But on what basis did you draw that conclusion? How did you manage to use measurement to tell you not only about how much sleeping and remembering had occurred among the students you measured, but also about the relationship between sleeping and remembering?

Drawing Conclusions

• If we applied all the techniques discussed so far, we could design an experiment that had a very good chance of establishing the causal relationship between two variables. That experiment would have internal validity, which is an attribute of an experiment that allows it to establish causal relationships. When we say that an experiment is internally valid, we mean that everything inside the experiment is working exactly as it must in order for us to draw conclusions about causal relationships. But what exactly are those conclusions? If our imaginary experiment revealed a difference between the aggressiveness of children in the exposed and unexposed groups, then we could conclude that media violence as we defined it caused aggression as we defined it in the people whom we studied. Notice those phrases in italics. Each corresponds to an important restriction on the kinds of conclusions we can draw from an experiment, so let's consider each in turn.

Measuring the Direction and Strength of a Correlation

• If you predict that a well-rested person will have better memory than a sleep-deprived person, you will be right more often than wrong. But you won't be right in every single instance. Statisticians have developed a way to estimate how accurate such predictions are likely to be by measuring the direction and strength of the correlation on which the predictions are based.
• Direction is easy to measure because the direction of a correlation is either positive or negative. A positive correlation exists when two variables have a "more-is-more" or "less-is-less" relationship. So, for example, when we say that more sleep is associated with more memory or that less sleep is associated with less memory, we are describing a positive correlation. Conversely, a negative correlation exists when two variables have a "more-is-less" or "less-is-more" relationship. When we say that more cigarette smoking is associated with less longevity or that less cigarette smoking is associated with more longevity, we are describing a negative correlation.
• The direction of a correlation is easy to measure, but the strength is a little more complicated. The correlation coefficient is a mathematical measure of both the direction and strength of a correlation and it is symbolized by the letter r (as in "relationship"). Like most measures, the correlation coefficient has a limited range. What does that mean? Well, if you were to measure the number of hours of sunshine per day in your hometown, that number could range from 0 to 24. Numbers such as -7 and 36.8 would be meaningless. Similarly, the value of r can range from -1 to 1, and numbers outside that range are meaningless. What, then, do the numbers inside that range mean?
o If every time the value of one variable increases by a fixed amount the value of the second variable also increases by a fixed amount, then the relationship between the variables is called a perfect positive correlation and r = 1 (see FIGURE 2.7a).
For example, if every 30-minute increase in sleep was associated with a 2-president increase in memory, then sleep and memory would be perfectly positively correlated.
o If every time the value of one variable increases by a fixed amount the value of the second variable decreases by a fixed amount, then the relationship between the variables is called a perfect negative correlation and r = -1 (see FIGURE 2.7b). For example, if every 30-minute increase in sleep was associated with a 2-president decrease in memory, then sleep and memory would be perfectly negatively correlated.
o If every time the value of one variable increases by a fixed amount the value of the second variable neither increases nor decreases systematically, then the two variables are said to be uncorrelated and r = 0 (see FIGURE 2.7c). For example, if a 30-minute increase in sleep was sometimes associated with an increase in memory, sometimes associated with a decrease in memory, and sometimes associated with no change in memory at all, then sleep and memory would be uncorrelated.
• Perfect correlations are extremely rare. As you'll learn in the Consciousness chapter, sleep really does enhance memory performance, but the relationship is not perfect. It isn't as though every 18 minutes of sleep buys you exactly one third of a remembered president! Sleep and memory are positively correlated (i.e., as one increases, the other also increases), but they are imperfectly correlated, thus r will lie somewhere between 0 and 1. But where? That depends on how many exceptions there are to the "X more minutes of sleep = Y more presidents remembered" rule. If there are only a few exceptions, then r will lie much closer to 1 than to 0. But as the number of exceptions increases, the value of r will begin to move toward 0.
• FIGURE 2.8 shows four cases in which two variables are positively correlated but have different numbers of exceptions, and as you can see, the number of exceptions changes the value of r quite dramatically. Two variables can have a perfect correlation (r = 1), a strong correlation (e.g., r = .90), a moderate correlation (e.g., r = .70), or a weak correlation (e.g., r = .30). The correlation coefficient, then, is a measure of both the direction and strength of the relationship between two variables. The sign of r (plus or minus) tells us the direction of the relationship and the absolute value of r (between 0 and 1) tells us about the number of exceptions and hence about how confident we can be when using the correlation to make predictions.
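The arithmetic behind r can be sketched in a few lines. This is a hypothetical illustration, not part of the chapter: the sleep and president-naming numbers below are invented, and the function simply implements the standard Pearson correlation coefficient.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient: the sign gives the direction,
    the magnitude (0 to 1) gives the strength."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical data: hours slept and presidents remembered for six students.
sleep = [0, 2, 4, 6, 8, 10]
memory_perfect = [1, 3, 5, 7, 9, 11]  # every 2 hours buys exactly 2 presidents
memory_noisy = [2, 1, 6, 5, 7, 10]    # same upward trend, but with exceptions

print(pearson_r(sleep, memory_perfect))  # a perfect positive correlation: r ≈ 1
print(pearson_r(sleep, memory_noisy))    # exceptions pull r down toward 0
```

Flipping the noisy list so memory falls as sleep rises would flip the sign of r, giving a negative correlation of the same strength.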

Thinking Critically about Evidence

• In 1620, Sir Francis Bacon published a book called Novum Organum in which he described a new method for discovering the truth about the natural world. Today, his so-called Baconian Method is more simply known as the scientific method, and that method has allowed human beings to learn more in the last four centuries than in all the previous centuries combined.
• As you've seen in this chapter, the scientific method allows us to produce empirical evidence. But empirical evidence is only useful if we know how to think about it, and the fact is that most of us don't. Using evidence requires critical thinking, which involves asking ourselves tough questions about whether we have interpreted the evidence in an unbiased way, and about whether the evidence tells not just the truth, but the whole truth. Research suggests that most people have trouble doing both of these things, and that educational programs designed to teach or improve critical thinking skills are not particularly effective (Willingham, 2007). Why do people have so much trouble thinking critically?
• Consider the armadillo. Some animals freeze when threatened, and others duck, run, or growl. Armadillos jump. This natural tendency served armadillos quite well for millennia because for millennia the most common threat to an armadillo's well-being was a rattlesnake. Alas, this natural tendency serves armadillos rather poorly today because when they wander onto a Texas highway and are threatened by a speeding car, they jump up and hit the bumper. This is a mistake no armadillo makes twice.
• Human beings also have natural tendencies that once served them well but no longer do. Our natural and intuitive ways of thinking about evidence, for example, worked just fine when we were all hunter-gatherers living in small groups on the African savannah. But now most of us live in large-scale, complex societies, and these natural ways of thinking interfere with our ability to reason in the modern world.
Sir Francis Bacon understood this quite well. In the very same book in which he developed the scientific method, he argued that two ancient and all-too-human tendencies, to see what we expect or want to see and to ignore what we can't see, are the enemies of critical thinking.

We Consider What We See and Ignore What We Don't

• In another part of his remarkable book, Sir Francis Bacon recounted an old story about a man who visited a Roman temple. The priest showed the man a portrait of several sailors who had taken religious vows and then miraculously survived a shipwreck, and suggested that this was clear evidence of the power of the gods. The visitor paused a moment and then asked precisely the right question: "But where are the pictures of those who perished after taking their vows?" According to Bacon, most of us never think to ask this kind of question. We consider the evidence we can see and forget about the evidence we can't. Bacon claimed that "little or no attention is paid to things invisible" and he argued that this natural tendency was "the greatest impediment and aberration of the human understanding."
• Bacon was right when he claimed that people rarely consider what they can't see. For example, participants in one study (Newman, Wolff, & Hearst, 1980) played a game in which they were shown a set of trigrams, which are three-letter combinations such as SXY, GTR, BCG, and EVX. On each trial, the experimenter pointed to one of the trigrams in the set and told the participants that this trigram was the special one. The participants' job was to figure out what made the special trigram so special. How many trials did it take before participants figured it out? It depended on the trigram's special feature. For half the participants, the special trigram was always the one that contained the letter T, and participants in this condition needed to see about 34 sets of trigrams before they figured out that the presence of T was what made the trigram special. But for the other half of the participants, the special trigram was always the one that lacked the letter T. How many trials did it take before participants figured it out? They never figured it out. Never. What this study shows is that we naturally consider the evidence we see and rarely, if ever, consider the evidence we don't.
• The tendency to ignore missing evidence can cause us to draw all kinds of erroneous conclusions. Consider a study in which participants were randomly assigned to play one of two roles in a game (Ross, Amabile, & Steinmetz, 1977). The "quizmasters" were asked to make up a series of difficult questions, and the "contestants" were asked to answer them. If you give this a quick try, you will discover that it's very easy to generate questions that you can answer but that most other people cannot. For example, think of the last city you visited. Now give someone the name of the hotel you stayed in and ask them what street it's on. Very few will know.
• So participants who were cast in the role of quizmaster asked lots of clever-sounding questions and participants who were cast in the role of contestant gave lots of wrong answers. Now comes the interesting part. Quizmasters and contestants played this game while another participant—the observer—watched. After the game was over, the observer was asked to make some guesses about what the players were like in their everyday lives. The results were clear: observers consistently concluded that the quizmaster was a more knowledgeable person than the contestant! Observers saw the quizmaster asking sharp questions and saw the contestant saying "um, gosh, I don't know," and observers considered this evidence. What they failed to consider was the evidence they did not see. Specifically, they failed to consider what would have happened if the person who had been assigned to play the role of quizmaster had instead been assigned to play the role of contestant, and vice versa. If that had happened, then surely the contestant would have been the one asking clever questions and the quizmaster would have been the one struggling to answer them. Bottom line? If the first step in critical thinking is to doubt what you do see, then the second step is to consider what you don't.

Respecting Truth

• Institutional review boards ensure that data are collected ethically. But once the data are collected, who ensures that they are ethically analyzed and reported? No one does. Psychology, like all sciences, works on the honor system. No authority is charged with monitoring what psychologists do with the data they've collected, and no authority is charged with checking to see if the claims they make are true. You may find that a bit odd. After all, we don't use the honor system in stores ("Take the television set home and pay us next time you're in the neighborhood"), banks ("I don't need to look up your account, just tell me how much money you want to withdraw"), or courtrooms ("If you say you're innocent, well then, that's good enough for me"), so why would we expect it to work in science? Are scientists more honest than everyone else?
• Definitely! Okay, we just made that up. But the honor system depends not on scientists being especially honest, but on the fact that science is a community enterprise. When scientists claim to have discovered something important, other scientists don't just applaud, they start studying it too. When physicist Jan Hendrik Schön announced in 2001 that he had produced a molecular-scale transistor, other physicists were deeply impressed—that is, until they tried to replicate his work and discovered that Schön had fabricated his data (Agin, 2007). Schön lost his job and his doctoral degree was revoked, but the important point is that such frauds can't last long because one scientist's conclusion is the next scientist's research question. This doesn't mean that all frauds are uncovered swiftly: psychologist Diederik Stapel lied, cheated, and made up his data for decades before people became suspicious enough to investigate (Levelt Committee, Noort Committee, Drenth Committee, 2012). But it does mean that the important frauds are uncovered eventually.
The psychologist who fraudulently claims to have shown that chimps are smarter than goldfish may never get caught because no one is likely to follow up on such an obvious finding, but the psychologist who fraudulently claims to have shown the opposite will soon have a lot of explaining to do.
• What exactly are psychologists on their honor to do? At least three things. First, when they write reports of their studies and publish them in scientific journals, psychologists are obligated to report truthfully on what they did and what they found. They can't fabricate results (e.g., claiming to have performed studies that they never really performed) or fudge results (e.g., changing records of data that were actually collected), and they can't mislead by omission (e.g., by reporting only the results that confirm their hypothesis and saying nothing about the results that don't). Second, psychologists are obligated to share credit fairly by including as co-authors of their reports the other people who contributed to the work, and by mentioning in their reports the other scientists who have done related work. And third, psychologists are obligated to share their data. The American Psychological Association's code of conduct states that ethical psychologists "do not withhold the data on which their conclusions are based from other competent professionals who seek to verify the substantive claims through reanalysis." The fact that anyone can check up on anyone else is part of why the honor system works as well as it does.

Respecting Animals

• Not all research participants have human rights because not all research participants are human. Some are chimpanzees, rats, pigeons, or other nonhuman animals. The American Psychological Association's code specifically describes the special rights of these nonhuman participants, and some of the more important ones are these:
o All procedures involving animals must be supervised by psychologists who are trained in research methods and experienced in the care of laboratory animals and who are responsible for ensuring appropriate consideration of the animal's comfort, health, and humane treatment.
o Psychologists must make reasonable efforts to minimize the discomfort, infection, illness, and pain of animals.
o Psychologists may use a procedure that subjects an animal to pain, stress, or privation only when an alternative procedure is unavailable and when the procedure is justified by the scientific, educational, or applied value of the study.
o Psychologists must perform all surgical procedures under appropriate anesthesia and must minimize an animal's pain during and after surgery.
• That's good—but is it good enough? Some people don't think so. For example, philosopher Peter Singer (1975) argued that all creatures capable of feeling pain have the same fundamental rights, and that treating nonhumans differently than humans is a form of speciesism that is every bit as abhorrent as racism or sexism. Singer's philosophy has inspired groups such as People for the Ethical Treatment of Animals to call for an end to all research involving nonhuman animals. Unfortunately, it has also inspired some groups to attack psychologists who legally conduct such research. As two researchers (Ringach & Jentsch, 2009, p. 11417) recently reported: We have seen our cars and homes firebombed or flooded, and we have received letters packed with poisoned razors and death threats via e-mail and voicemail.
Our families and neighbors have been terrorized by angry mobs of masked protesters who throw rocks, break windows, and chant that "you should stop or be stopped" and that they "know where you sleep at night." Some of the attacks have been cataloged as attempted murder. Adding insult to injury, misguided animal-rights militants openly incite others to violence on the Internet, brag about the resulting crimes, and go as far as to call plots for our assassination "morally justifiable."
• Where do most people stand on this issue? The vast majority of Americans consider it morally acceptable to use nonhuman animals in research and say they would reject a governmental ban on such research (Kiefer, 2004; Moore, 2003). Indeed, most Americans eat meat, wear leather, and support the rights of hunters, which is to say that most Americans see a sharp distinction between animal and human rights. Science is not in the business of resolving moral controversies, and every individual must draw his or her own conclusions about this issue. But whatever position you take, it is important to note that only a small percentage of psychological studies involve animals, and only a small percentage of those studies cause animals pain or harm. Psychologists mainly study people, and when they do study animals, they mainly study their behavior.

Demand characteristics

• Once we have a valid definition and a reliable and powerful instrument, are we finally ready to measure behavior? Yes, as long as we want to measure the behavior of an amoeba or a raindrop or anything else that doesn't care if we are watching it. But if we want to measure the behavior of a human being, then we still have some work to do, because while we are trying to discover how people normally behave, normal people will be trying to behave as they think we want or expect them to. Demand characteristics are those aspects of an observational setting that cause people to behave as they think someone else wants or expects. We call these demand characteristics because they seem to "demand" or require that people say and do certain things. When someone you love asks, "Do these jeans make me look fat?" the right answer is always no, and if you've ever been asked this question, then you have experienced demand characteristics. Demand characteristics make it hard to measure behavior as it typically unfolds.
• One way that psychologists avoid the problem of demand characteristics is by observing people without their knowledge. Naturalistic observation is a technique for gathering scientific information by unobtrusively observing people in their natural environments. For example, naturalistic observation has shown that the biggest groups leave the smallest tips in restaurants (Freeman et al., 1975), that hungry shoppers buy the most impulse items at the grocery store (Gilbert, Gill, & Wilson, 2002), that golfers are most likely to cheat when they play several opponents at once (Erffmeyer, 1984), that men do not usually approach the most beautiful woman at a singles' bar (Glenwick, Jason, & Elman, 1978), and that Olympic athletes smile more when they win the bronze medal than the silver medal (Medvec, Madey, & Gilovich, 1995). Each of these conclusions is the result of measurements made by psychologists who observed people who didn't know they were being observed.
It seems unlikely that the same observations could have been made if the diners, shoppers, golfers, singles, and athletes had realized that they were being scrutinized.
• Unfortunately, naturalistic observation isn't always a viable solution to the problem of demand characteristics. First, some of the things psychologists want to observe simply don't occur naturally. If we wanted to know whether people who have undergone sensory deprivation perform poorly on motor tasks, we would have to hang around the shopping mall for a very long time before a few dozen blindfolded people with earplugs just happened to wander by and start typing. Second, some of the things that psychologists want to observe can only be gathered from direct interaction with a person, for example, by administering a survey, giving tests, conducting an interview, or hooking someone up to a machine. If we wanted to know how often people worry about dying, how accurately they can remember their high school graduations, how quickly they can solve a logic puzzle, or how much electrical activity their brains produce when they feel jealous, then simply watching them from the bushes won't do. Luckily, there are other ways to avoid demand characteristics. For instance, people are less likely to be influenced by demand characteristics when they cannot be identified as the originators of their actions, and psychologists often take advantage of this fact by allowing people to respond privately (e.g., by having them complete questionnaires when they are alone) or anonymously (e.g., by not collecting personal information, such as the person's name or address). Another technique that psychologists often use to avoid demand characteristics is to measure behaviors that cannot easily be demanded. For instance, a person's behavior can't be influenced by demand characteristics if that behavior isn't under the person's voluntary control.
You may not want a psychologist to know that you are extremely interested in the celebrity gossip magazine that she's asked you to read, but you can't prevent your pupils from dilating, which is what they do when you are engaged. Behaviors are also unlikely to be influenced by demand characteristics when people don't know that the demand and the behavior are related. For example, you may want the psychologist to believe that you are concentrating hard on the Wall Street Journal article that she's asked you to read, but you probably don't realize that your blink rate slows when you are concentrating, thus you probably won't fake a slow blink.
• One of the best ways to avoid demand characteristics is to keep the people who are being observed from knowing the true purpose of the observation. When people are "blind" to the purpose of an observation, they can't behave the way they think they should behave because they don't know how they should behave. For instance, if you didn't know that a psychologist was studying the effects of music on mood, you wouldn't feel obligated to smile when music was played. This is why psychologists typically don't reveal the true purpose of an observation to the people who are being observed until the study is over.
• Of course, people are clever and curious, and when psychologists don't tell them the purpose of their observations, people generally try to figure it out for themselves. That's why psychologists sometimes use cover stories, or misleading explanations that are meant to keep people from discerning the true purpose of an observation. For example, if a psychologist wanted to know how music influenced your mood, he or she might falsely tell you that the purpose of the study was to determine how quickly people can do logic puzzles while music plays in the background. (We will discuss the ethical implications of deceiving people later in this chapter.)
In addition, the psychologist might use filler items, or pointless measures that are designed to mislead you about the true purpose of the observation. So, for example, the psychologist might ask you a few questions whose answers are of real interest to him or her (How happy are you right now?), as well as a few questions whose answers are not (Do you like cats more or less than dogs?). This makes it difficult for you to guess the true purpose of the observation from the nature of the questions you were asked.

Random Assignment

• Once we have manipulated an independent variable and measured a dependent variable, we've done one of the two things that experimentation requires. The second thing is a little less intuitive but equally important. Imagine that we began our exposure and aggression experiment by finding a group of children and asking each child whether he or she would like to be in the experimental group or the control group. Imagine that half the children said that they'd like to play violent video games and the other half said they would rather not. Imagine that we let the children do what they wanted to do, measured aggression some time later, and found that the children who had played the violent video games were more aggressive than those who had not. Would this experiment allow us to conclude that playing violent video games causes aggression? Definitely not—but why not? After all, we switched exposure on and off like a cell phone, and we watched to see whether aggression went on and off too. So where did we go wrong?
• We went wrong when we let the children decide for themselves whether or not they would play violent video games. After all, children who ask to play such games are probably different in many ways from those who ask not to. They may be older, or stronger, or smarter—or younger, weaker, or dumber—or less often supervised or more often supervised. The list of possible differences goes on and on. The whole point of doing an experiment was to divide children into two groups that differed in only one way, namely, in terms of their exposure to media violence. The moment we let the children decide for themselves whether they would be in the experimental group or the control group, we had two groups that differed in countless ways, and any of those countless differences could have been a third variable that was responsible for any differences we may have observed in their measured aggression.
Self-selection is a problem that occurs when anything about a person determines whether he or she will be included in the experimental or control group. Just as we cannot allow nature to decide which of the children in our study is exposed to media violence, we cannot allow the children to decide either. Okay, then who decides? • The answer to this question is a bit spooky: No one does. If we want to be sure that there is one and only one difference between the children in our study who are and are not exposed to media violence, then their inclusion in the experimental or control groups must be randomly determined. If you flipped a coin and a friend asked what had caused it to land heads up, you would correctly say that nothing had. This is what it means for the outcome of a coin flip to be random. Because the outcome of a coin flip is random, we can put coin flips to work for us to solve the problem that self-selection creates. If we want to be sure that a child's inclusion in the experimental or control group was not caused by nature, was not caused by the child, and was not caused by any of the countless third variables we could name if we only had the time to name them, then all we have to do is let it be caused by the outcome of a coin flip—which itself has no cause! For example, we could walk up to each child in our experiment, flip a coin, and, if the coin lands heads up, assign the child to play violent video games. If the coin lands heads down, then we could assign the child to play no violent video games. Random assignment is a procedure that lets chance assign people to the experimental or the control group. • What would happen if we assigned children to groups with a coin flip? As FIGURE 2.12 shows, the first thing we would expect is that about half the children would be assigned to play violent video games and about half would not. 
Second—and much more important—we could expect the experimental group and the control group to have roughly equal numbers of supervised kids and unsupervised kids, roughly equal numbers of emotionally stable and unstable kids, roughly equal numbers of big kids and small kids, of active kids, fat kids, tall kids, funny kids, and kids with blue hair named Larry McSweeny. In other words, we could expect the two groups to have roughly equal numbers of kids who are anything-you-can-ever-name-and-everything-you-can't! Because the kids in the two groups will be the same on average in terms of height, weight, emotional stability, adult supervision, and every other variable in the known universe except the one we manipulated, we can be sure that the variable we manipulated (exposure) was the one and only cause of any changes in the variable we measured (aggression). Because exposure was the only difference between the two groups of children when we started the experiment, it must be the cause of any differences in aggression we observe at the end of the experiment.
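The logic of random assignment can be sketched in a short simulation (all names and numbers here are hypothetical): with enough participants, a coin flip splits any pre-existing trait, such as adult supervision, roughly evenly between the two groups without anyone having to measure that trait.

```python
import random

random.seed(42)

# Hypothetical children: each has a trait ("supervised") that could act
# as a third variable if it piled up in one group.
children = [{"supervised": random.random() < 0.5} for _ in range(10_000)]

experimental, control = [], []
for child in children:
    # The coin flip, not nature and not the child, decides group membership.
    (experimental if random.random() < 0.5 else control).append(child)

def share_supervised(group):
    return sum(c["supervised"] for c in group) / len(group)

# With a large sample, the two groups end up roughly equal on the trait.
print(round(share_supervised(experimental), 2),
      round(share_supervised(control), 2))
```

Note that the coin balances the groups only "roughly" and only on average; the Significance section below deals with the rare occasions when chance produces a lopsided split.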

The Third-Variable Problem

• One possibility is that exposure to media violence (X) causes aggressiveness (Y). For example, media violence may teach children that aggression is a reasonable way to vent anger and solve problems. A second possibility is that aggressiveness (Y) causes children to be exposed to media violence (X). For example, children who are naturally aggressive may be especially likely to seek opportunities to play violent video games or watch violent movies. A third possibility is that a third variable (Z) causes children to be aggressive (Y) and to be exposed to media violence (X), neither of which is causally related to the other. For example, lack of adult supervision (Z) may allow children to get away with bullying others and to get away with watching television shows that adults would normally not allow. If so, then being exposed to media violence (X) and behaving aggressively (Y) may not be causally related to each other at all and may instead be the independent effects of a lack of adult supervision (Z). In other words, the relation between aggressiveness and exposure to media violence may be a case of third-variable correlation, which means that two variables are correlated only because each is causally related to a third variable. FIGURE 2.9 shows three possible causes of any correlation. • How can we determine by simple observation which of these three possibilities best describes the relationship between exposure to media violence and aggressiveness? Take a deep breath. The answer is: We can't. When we observe a natural correlation, the possibility of third-variable correlation can never be dismissed. But don't take this claim on faith. Let's try to dismiss the possibility of third-variable correlation and you'll see why such efforts are always doomed to fail.
• The most straightforward way to determine whether a third variable, such as lack of adult supervision (Z), causes both exposure to media violence (X) and aggressive behavior (Y) is to eliminate differences in adult supervision (Z) among a group of children and see if the correlation between exposure (X) and aggressiveness (Y) is eliminated too. For example, we could observe children using the matched samples technique, which is a technique whereby the participants in two groups are identical in terms of a third variable (see FIGURE 2.10). For instance, we could measure only children who are supervised by an adult exactly Q% of the time, thus ensuring that every child who was exposed to media violence had exactly the same amount of adult supervision as every child who was not exposed. Alternatively, we could observe children using the matched pairs technique, which is a technique whereby each participant is identical to one other participant in terms of a third variable. We could measure children who have different amounts of adult supervision, but we could make sure that for every child we measure who is exposed to media violence and is supervised Q% of the time, we also observe a child who is not exposed to media violence and is supervised Q% of the time, thus ensuring that children who are and are not exposed to media violence have the same amount of adult supervision on average. Regardless of which technique we used, we would know that children who were and were not exposed to media violence had equal amounts of adult supervision on average. So if those who were exposed are on average more aggressive than those who were not exposed, we can be sure that lack of adult supervision was not the cause of this difference. • So we solved the problem, right? Well, not exactly. The matched samples technique and the matched pairs technique can be useful, but neither eliminates the possibility of third-variable correlation entirely. Why? 
Because even if we used these techniques to dismiss a particular third variable (such as lack of adult supervision), we would not be able to dismiss all third variables. For example, as soon as we finished making these observations, it might suddenly occur to us that emotional instability could cause children to gravitate toward violent television or video games and to behave aggressively. Emotional instability would be a new third variable (Z) and we would have to design a new test to investigate whether it explains the correlation between exposure (X) and aggression (Y). Unfortunately, we could keep dreaming up new third variables all day long without ever breaking a sweat, and every time we dreamed one up, we would have to rush out and do a new test using matched samples or matched pairs to determine whether this third variable was the cause of the correlation between exposure and aggressiveness. • Are you starting to see the problem? There are a humongous number of third variables, so there are a humongous number of reasons why X and Y might be correlated. • And because we can't perform a humongous number of studies with matched samples or matched pairs, we cannot be absolutely sure that the correlation we observe between X and Y is evidence of a causal relationship between them. The third-variable problem refers to the fact that a causal relationship between two variables cannot be inferred from the naturally occurring correlation between them because of the ever-present possibility of third variable correlation. In other words, if we care about causality, then naturally occurring correlations just won't tell us what we really want to know. Luckily, another technique will.
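The matched pairs technique described above can be illustrated with a toy sketch (all data here are made up): for every exposed child, we find an unexposed child with the same supervision level, so the two groups are equal on that one third variable by construction.

```python
# Hypothetical supervision percentages for exposed and unexposed children.
exposed   = [{"supervision": s} for s in [10, 30, 30, 50, 70]]
unexposed = [{"supervision": s} for s in [70, 10, 50, 30, 30, 90]]

# Matched pairs: pair each exposed child with an unexposed child who has
# the identical supervision level, removing each match from the pool.
pairs = []
pool = list(unexposed)
for child in exposed:
    match = next(c for c in pool if c["supervision"] == child["supervision"])
    pool.remove(match)
    pairs.append((child, match))

avg = lambda group: sum(c["supervision"] for c in group) / len(group)
matched_controls = [m for _, m in pairs]

# Supervision is now identical on average across the matched groups,
# so it cannot explain any difference in aggression between them.
print(avg(exposed), avg(matched_controls))
```

The sketch also makes the chapter's point vivid: the matching eliminates exactly one third variable, and a new loop would be needed for every other third variable we could dream up.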

Representative People

• Our imaginary experiment on exposure to media violence and aggression would allow us to conclude that exposure as we defined it caused aggression as we defined it in the people whom we studied. That last phrase represents another important restriction on the kinds of conclusions we can draw from experiments. Who are the people whom psychologists study? Psychologists rarely observe an entire population, which is a complete collection of people, such as the population of human beings (about 7 billion), the population of Californians (about 38 million), or the population of people with Down syndrome (about 1 million). Rather, they observe a sample, which is a partial collection of people drawn from a population. How big can a sample be? The size of a population is signified by the uppercase letter N, the size of a sample is signified by the lowercase letter n, so 0 < n < N. If you read this as an emoticon it means . . . oh well, never mind. • In most studies n is closer to 0 than to N, and in some cases n = 1. For example, sometimes single individuals are so remarkable that they deserve close study, and when psychologists study them they use the case method, which is a procedure for gathering scientific information by studying a single individual. We can learn a lot about memory by studying someone like Akira Haraguchi, who can recite the first 100,000 digits of pi; about consciousness by studying someone like Henry Molaison, whose ability to look backward and forward in time was destroyed by damage to his brain; about intelligence and creativity by studying someone like 14-year-old Jay Greenberg, whose musical compositions have been recorded by the Juilliard String Quartet and the London Symphony Orchestra. Cases such as these are interesting in their own right, but they also provide important insights into how the rest of us work.
• Of course, most of the psychological studies you will read about in this book included samples of ten, a hundred, a thousand, or a few thousand people. So how do psychologists decide which people to include in their samples? One way to select a sample from a population is by random sampling, which is a technique for choosing participants that ensures that every member of a population has an equal chance of being included in the sample. When we randomly sample participants from a population, the sample is said to be representative of the population. This allows us to generalize from the sample to the population—that is, to conclude that what we observed in our sample would also have been observed if we had measured the entire population. You probably already have solid intuitions about the importance of random sampling. For example, if you stopped at a farm stand to buy a bag of cherries and the farmer offered to let you taste a few that he had handpicked from the bag, you'd be reluctant to generalize from that sample to the population of cherries in the bag. But if the farmer invited you to pull a few cherries from the bag at random, you'd probably be willing to take those cherries as representative of the cherry population. • Random sampling sounds like such a good idea that you might be surprised to learn that most psychological studies involve nonrandom samples—and that most psychologists don't mind. Indeed, virtually every participant in every psychology experiment you will ever read about was a volunteer, and most were college students who were significantly younger, smarter, healthier, wealthier, and Whiter than the average Earthling. About 96% of the people whom psychologists study come from countries that have just 12% of the world's population, and 70% come from the United States alone (Henrich, Heine, & Norenzayan, 2010). • So why do psychologists sample nonrandomly? They have no choice. 
Even if there were a computerized list of all the world's human inhabitants from which we could randomly choose our research participants, how would we ever find the 72-year-old Bedouin woman whose family roams the desert so that we could measure the electrical activity in her brain while she watched cartoons? How would we convince the 3-week-old infant in New Delhi to complete a lengthy questionnaire about his political beliefs? Most psychology experiments are conducted by professors and graduate students at colleges and universities in the Western hemisphere, and as much as they might like to randomly sample the population of the planet, the practical truth is that they are pretty much stuck studying the folks who volunteer for their studies. • So how can we learn anything from psychology experiments? Isn't the failure to sample randomly a fatal flaw? No, it's not, and there are three reasons why. First, sometimes the similarity of a sample and a population doesn't matter. If one pig flew over the Statue of Liberty just one time, it would instantly disprove the traditional theory of porcine locomotion. It wouldn't matter if all pigs flew or if any other pigs flew. If one did, then that's enough. An experimental result can be illuminating even when the sample isn't typical of the population. • Second, when the ability to generalize an experimental result is important, psychologists perform new experiments that use the same procedures but on different samples. For example, after measuring how a nonrandomly selected group of American children behaved after playing violent video games, we might try to replicate our experiment with Japanese children, or with American teenagers, or with deaf adults. In essence, we could treat the attributes of our sample, such as culture and age and ability, as independent variables, and we could do experiments to determine whether these attributes influenced our dependent variable. 
If the results of our study were replicated in these other samples, then we would be more confident (but never completely confident) that the results describe a basic human tendency. If the results do not replicate, then we would learn something about the influence of culture or age or ability on aggressiveness. Replicating research with new samples drawn from different populations is a win-win strategy: No matter what happens, we learn something interesting. • Third, sometimes the similarity of the sample and the population is simply a reasonable starting assumption. Instead of asking, "Do I have a compelling reason to believe that my sample is representative of the population?" we could instead ask, "Do I have a compelling reason not to?" For example, few of us would be willing to take an experimental medicine if a nonrandom sample of 7 participants took it and died. Indeed, we would probably refuse the medicine even if the 7 participants were mice. Although these nonrandomly sampled participants were different from us in many ways (including tails and whiskers), most of us would be willing to generalize from their experience to ours because we know that even mice share enough of our basic biology to make it a good bet that what harms them can harm us too. By this same reasoning, if a psychology experiment demonstrated that some American children behaved violently after playing violent video games, we might ask whether there is a compelling reason to suspect that Ecuadorian college students or middle-aged Australians would behave any differently. If the answer was yes, then experiments would provide a way for us to investigate that possibility.
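The farm-stand intuition about random sampling can be made concrete with a small simulation (the population and its happiness scores are hypothetical): when every member has an equal chance of being chosen, even a small sample's average closely tracks the population's.

```python
import random

random.seed(0)

# A hypothetical population of N people, each with a happiness score.
N = 100_000
population = [random.gauss(5.0, 1.0) for _ in range(N)]

# Random sampling: random.sample gives every member an equal chance
# of inclusion, and n is far smaller than N.
n = 500
sample = random.sample(population, n)

mean = lambda xs: sum(xs) / len(xs)

# The random sample is representative: its mean generalizes to the population.
print(round(mean(population), 2), round(mean(sample), 2))
```

A handpicked sample (say, the 500 happiest people) would fail this check badly, which is exactly why the farmer's handpicked cherries deserve suspicion.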

Significance

• Random assignment is a powerful tool, but like a lot of tools, it doesn't work every time you use it. If we randomly assigned children to watch or not watch televised violence, we could expect the two groups to have roughly equal numbers of supervised and unsupervised kids, roughly equal numbers of emotionally stable and unstable kids, and so on. The key word in that sentence is roughly. When you flip a coin 100 times, you can expect it to land heads up roughly 50 times. But every once in a while, 100 coin flips will produce 80 heads, or 90 heads, or even 100 heads, by sheer chance alone. This does not happen often, of course, but it does happen. Because random assignment is achieved by using a randomizing device such as a coin, every once in a long while the coin will assign more unsupervised, emotionally unstable kids to play violent video games and more supervised, emotionally stable kids to play none. When this happens, random assignment has failed—and when random assignment fails, the third-variable problem rises up out of its grave like a guy with a hockey mask and a grudge. When random assignment fails, we cannot conclude that there is a causal relationship between the independent and dependent variables. • How can we tell when random assignment has failed? Unfortunately, we can't tell for sure. But we can calculate the odds that random assignment has failed each time we use it. It isn't important for you to know how to do this calculation, but it is important for you to understand how psychologists interpret its results. Psychologists perform this calculation every time they do an experiment, and they do not accept the results of those experiments unless the calculation tells them that if random assignment had failed, then there is less than a 5% chance that they would have seen those particular results. 
• When there is less than a 5% chance that a result would happen if random assignment had failed, then that result is said to be statistically significant. You've already learned about descriptive statistics, such as the mean, median, mode, range, and standard deviation. There is another kind of statistics—called inferential statistics—that tells scientists what kinds of conclusions or inferences they can draw from observed differences between the experimental and control groups. For example, p (for "probability") is an inferential statistic that tells psychologists the likelihood that random assignment failed in a particular experiment. When psychologists report that p < .05, they are saying that according to the inferential statistics they calculated, the odds that their results would have occurred if random assignment had failed are less than 5%, and given that those results did occur, a failure of random assignment is unlikely to have happened. Therefore, differences between the experimental and control groups were unlikely to have been caused by a third variable.
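One way to get an intuition for this calculation is a permutation test, sketched below with made-up aggression scores (this is only one of several inferential procedures psychologists use, not necessarily the one any given study reports): we ask how often a difference as large as the one we observed would arise if group labels were assigned by pure chance.

```python
import random

random.seed(1)

# Hypothetical aggression scores from the two groups of an experiment.
experimental = [7, 9, 8, 10, 9, 8, 11, 9]
control      = [6, 7, 5, 8, 6, 7, 6, 7]
observed_diff = (sum(experimental) / len(experimental)
                 - sum(control) / len(control))

# Permutation test: reshuffle the group labels many times and count how
# often chance alone produces a difference at least this large.
scores = experimental + control
extreme = 0
trials = 10_000
for _ in range(trials):
    random.shuffle(scores)
    fake_exp, fake_ctl = scores[:8], scores[8:]
    if sum(fake_exp) / 8 - sum(fake_ctl) / 8 >= observed_diff:
        extreme += 1

p = extreme / trials
print(p < 0.05)  # a small p means chance alone is an unlikely explanation
```

If p falls below .05, the result meets the conventional threshold for statistical significance described above.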

The Ethics of Science: First, Do No Harm

• Somewhere along the way, someone probably told you that it isn't nice to treat people like objects. And yet, it may seem that psychologists do just that by creating situations that cause people to feel fearful or sad, to do things that are embarrassing or immoral, and to learn things about themselves and others that they might not really want to know. Don't be fooled by appearances. The fact is that psychologists go to great lengths to protect the well-being of their research participants, and they are bound by a code of ethics that is as detailed and demanding as the professional codes that bind physicians, lawyers, and accountants. That code requires that psychologists show respect for people, for animals, and for the truth. Let's examine each of these obligations in turn.

Defining and Detecting

• The last time you said, "just give me a second," you probably didn't know you were talking about atomic physics. Every unit of time has an operational definition, which is a description of a property in concrete, measurable terms. The operational definition of a second is the duration of 9,192,631,770 cycles of microwave light absorbed or emitted by the hyperfine transition of cesium-133 atoms in their ground state undisturbed by external fields (which takes roughly 6 seconds just to say). To actually count the cycles of light absorbed or emitted during this transition requires an instrument, which is anything that can detect the condition to which an operational definition refers. An instrument known as a "cesium clock" can count cycles of light, and when it counts 9,192,631,770 of them, one second has officially passed. • The steps we take to measure a physical property are the same steps we take to measure a psychological property. For example, if we wanted to measure a person's intelligence, or shyness, or happiness, we would have to start by generating an operational definition of that property—that is, by specifying some concrete, measurable event that indicates it. For example, we might define happiness as the frequency with which a person smiles. Once we do, we just need a smile-detecting instrument, such as a computer-assisted camera or maybe just a human eye. Having an operational definition that specifies a measurable event and an instrument that measures that event are the keys to scientific measurement.
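The two ingredients of measurement can be shown in a toy illustration (the data and function names are hypothetical): an operational definition turns a property ("happiness") into a concrete, countable event (smiles), and an instrument is whatever detects that event.

```python
# One hour of (hypothetical) coded video: each entry is an observed expression.
observations = ["smile", "frown", "smile", "neutral", "smile"]

def smile_counter(events):
    """A crude 'instrument': detects and counts the concrete event
    (a smile) that our operational definition of happiness names."""
    return sum(1 for e in events if e == "smile")

# Operationalized happiness = frequency of smiling in the observation window.
happiness_score = smile_counter(observations)
print(happiness_score)  # 3
```

Swapping in a different operational definition (say, counting words spoken) would require only a different counting rule, which is the chapter's point that the definition, not the instrument, decides what is being measured.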

Experimentation

• The matched pairs and matched samples techniques eliminate a single difference between two groups: for example, the difference in adult supervision between groups of children who were and were not exposed to media violence. The problem is that they eliminate only one difference, and a humongous number remain. If we could just find a technique that eliminated all of the humongous number of differences, then we could conclude that exposure and aggression are causally related. If exposed kids were more aggressive than unexposed kids, and if the two groups didn't differ in any way except for that exposure, then we could be sure that their level of exposure had caused their level of aggression. • In fact, scientists have a technique that does exactly that. It is called an experiment, which is a technique for establishing the causal relationship between variables. The best way to understand how experiments eliminate all the differences between groups is by examining their two key features: manipulation and random assignment.

Manipulation

• The most important thing to know about experiments is that you already know the most important thing about experiments because you've been doing them all your life. Imagine that you are surfing the Web on a laptop when all of a sudden you lose your wireless connection. You suspect that another device—say, your roommate's new cell phone—has somehow bumped you off the network. What would you do to test your suspicion? Observing a natural correlation wouldn't be much help. You could carefully note when you did and didn't have a connection and when your roommate did and didn't use his cell phone, but even if you observed a correlation between these two variables you still couldn't conclude that the cell phone was causing you to lose your network connection. After all, if your roommate was afraid of loud noises and called his mommy for comfort whenever there was an electrical storm, and if that storm zapped your router and crashed your wireless network, then the storm (Z) would be the cause of both your roommate's cell phone usage (X) and your connectivity problem (Y). • So how could you test your suspicion? Well, rather than observing the correlation between cell phone usage and connectivity, you could try to create a correlation by intentionally making a call on your roommate's cell phone, hanging up, making another call, hanging up again, and observing changes in your laptop's connectivity as you did so. If you observed that "laptop connection off" only occurred in conjunction with "cell phone on" then you could conclude that your roommate's cell phone was the cause of your failed connection, and you could sell the phone on eBay and then lie about it when asked. The technique you intuitively used to solve the third-variable problem is called manipulation, which involves changing a variable in order to determine its causal power. Congratulations! You are now officially a manipulator. • Manipulation is a critical ingredient in experimentation. 
Until now, we have approached science like polite dinner guests, taking what we were offered and making the best of it. Nature offered us children who differed in how much violence they were exposed to and who differed in how aggressively they behaved, and we dutifully measured the natural patterns of variation in these two variables and computed their correlations. The problem with this approach is that when all was said and done, we still didn't know what we really wanted to know, namely, whether these variables had a causal relationship. No matter how many matched samples or matched pairs we observed, there was always another third variable that we hadn't yet dismissed. Experiments solve this problem. Rather than measuring exposure and measuring aggression and then computing the correlation between these two naturally occurring variables, experiments require that we manipulate exposure in exactly the same way that you manipulated your roommate's cell phone. In essence, we need to systematically switch exposure on and off in a group of children and then watch to see whether aggression goes on and off too. • There are many ways to do this. For example, we might ask some children to participate in an experiment, then have half of them play violent video games for an hour and make sure the other half does not. Then, at the end of the hour, we could measure the children's aggression and compare the measurements across the two groups. When we compared these measurements, we would essentially be computing the correlation between a variable that we manipulated (exposure) and a variable that we measured (aggression). Because we manipulated rather than measured exposure, we would never have to ask whether a third variable (such as lack of adult supervision) caused children to experience different levels of exposure. After all, we already know what caused that to happen. We did! 
• Experimentation involves three critical steps (and several ridiculously confusing terms): o First, we perform a manipulation. We call the variable that is manipulated the independent variable because it is under our control, and thus it is "independent" of what the participant says or does. When we manipulate an independent variable (such as exposure to media violence), we create at least two groups of participants: an experimental group, which is the group of people who are exposed to a particular manipulation, and a control group, which is the group of people who are not exposed to that particular manipulation. o Second, having manipulated one variable (exposure), we now measure another variable (aggression). We call the variable that is measured the dependent variable because its value "depends" on what the person being measured says or does. o Third and finally, we look to see whether our manipulation of the independent variable produced changes in the dependent variable. FIGURE 2.11 shows exactly how manipulation works.

Observer Bias

• The people being observed aren't the only ones who can make measurement a bit tricky. In one study, students in a psychology class were asked to measure the speed with which a rat learned to run through a maze (Rosenthal & Fode, 1963). Some students were told that their rat had been specially bred to be "maze-dull" (i.e., slow to learn a maze) and others were told that their rat had been specially bred to be "maze-bright" (i.e., quick to learn a maze). Although all the rats were actually the same breed, the students who thought they were measuring the speed of a maze-dull rat reported that their rats took longer to learn the maze than did the students who thought they were measuring the speed of a maze-bright rat. In other words, the measurements revealed precisely what the students expected them to reveal. • Why did this happen? First, expectations can influence observations. It is easy to make errors when measuring the speed of a rat, and our expectations often determine the kinds of errors we make. Does putting one paw over the finish line count as learning the maze? If the rat falls asleep, should the stopwatch be left running or should the rat be awakened and given a second chance? If a rat runs a maze in 18.5 seconds, should that number be rounded up or rounded down before it is recorded in the log book? The answers to these questions may depend on whether one thinks the rat is bright or dull. The students who timed the rats probably tried to be honest, vigilant, fair, and objective, but their expectations influenced their observations in subtle ways that they could neither detect nor control. Second, expectations can influence reality. Students who expected their rats to learn quickly may have unknowingly done things to help that learning along, for example, by muttering, "Oh no!" when the bright rat looked in the wrong direction or by petting the dull rat less affectionately.
(We'll discuss both of these phenomena more extensively in the Social Psychology chapter.) • Observers' expectations, then, can have a powerful influence on both their observations and on the behavior of those whom they observe. Psychologists use many techniques to avoid these influences, and one of the most common is the double-blind observation, which is an observation whose true purpose is hidden from both the observer and the person being observed. For example, if the students had not been told which rats were bright and which were dull, then they wouldn't have had any expectations about their rats, thus their expectations couldn't have influenced their measurements. That's why it is common practice in psychology to keep the observers as blind as the participants. For example, measurements are often made by research assistants who do not know what is being studied or why, and who therefore don't have any expectations about what the people being observed will or should do. Indeed, studies nowadays are often carried out by the world's blindest experimenter—a computer—which can present information to people and measure their responses while having no expectations at all.

Representative Variables

• The results of any experiment depend, in part, on how the independent and dependent variables are defined. For instance, we are more likely to find that exposure to media violence causes aggression when we define exposure as "watching two hours of gory axe murders" rather than "watching 10 minutes of football," or when we define aggression as "interrupting another person" rather than "smacking someone silly with a tire iron." The way we define variables can have a profound influence on what we find, so which of these is the right way? One answer is that we should define variables in an experiment as they are defined in the real world. • External validity is an attribute of an experiment in which variables have been defined in a normal, typical, or realistic way. It seems pretty clear that the kind of aggressive behavior that concerns teachers and parents lies somewhere between an interruption and an assault, and that the kind of media violence to which children are typically exposed lies somewhere between sports and torture. If the goal of an experiment is to determine whether the kinds of media violence to which children are typically exposed cause the kinds of aggression with which societies are typically concerned, then external validity is essential. When variables are defined in an experiment as they typically are in the real world, we say that the variables are representative of the real world. External validity sounds like such a good idea that you may be surprised to learn that most psychology experiments are externally invalid—and that most psychologists don't mind. The reason for this is that psychologists are rarely trying to learn about the real world by creating tiny replicas of it in their laboratories. Rather, they are usually trying to learn about the real world by using experiments to test hypotheses derived from theories, and externally invalid experiments can often do that quite nicely (Mook, 1983).
• To see how, consider an example from physics. Physicists have a theory stating that heat is the result of the rapid movement of molecules. This theory gives rise to a hypothesis, namely, that when the molecules that constitute an object are slowed, the object should become cooler. Now imagine that a physicist tested this hypothesis by performing an experiment in which a laser was used to slow the movement of the molecules in a rubber ball, whose temperature was then measured. Would you criticize this experiment by saying, "Sorry, but your experiment teaches us nothing about the real world because in the real world, no one actually uses lasers to slow the movement of the molecules in rubber balls"? Let's hope not. The physicist's theory (molecular motion causes heat) led to a hypothesis about what would happen in the laboratory (slowing the molecules in a rubber ball should cool it), and the events that the physicist manipulated and measured in the laboratory served to test the theory. Similarly, a well-thought-out theory about the causal relationship between exposure to media violence and aggression should lead to hypotheses about how children in a laboratory will behave after watching a violent movie, and thus their reaction to the movie serves to test the theory. If children who watched Iron Man 3 were more likely to push and shove each other on their way out of the laboratory, then any theory that says that media violence cannot influence aggression has just been proved wrong.

• In short, theories allow us to generate hypotheses about what can, must, or will happen under particular circumstances, and experiments are usually meant to create those circumstances, test the hypotheses, and thereby provide evidence for or against the theories that generated them. Experiments are not usually meant to be miniature versions of everyday life, and thus external invalidity is not necessarily a problem (see the Hot Science box, Do Violent Movies Make Peaceful Streets?).

