Research Methods in Psychology Chapter 11

Observer Bias

-A bias that occurs when observer expectations influence the interpretation of participant behaviors or the outcome of the study. -Although comparison groups can prevent many threats to internal validity, they do not necessarily control for observer bias. -Observer bias can threaten two kinds of validity in an experiment 1) Internal validity: b/c an alternative explanation exists for the results. -Did the therapy work, or was Dr. Yuki biased? 2) Construct validity of the DV: b/c it means the depression ratings given by Dr. Yuki do not represent the true levels of depression of her participants.

Selection Effects: Threat to Internal Validity

-A confound exists because the different IV groups have systematically different types of participants. -EX: Study of intensive therapy for autism, in which children who received the intensive treatment did improve over time. -We are not sure if their improvement was caused by the therapy or by greater overall involvement on the part of the parents who elected to be in the intensive treatment group. -Those parents' greater motivation could have been an alternative explanation for the improvement of children in the intensive treatment group.

Null Effect

-A finding that an IV did not make a difference in the DV -There is no significant covariance between the two -Sometimes when a study has a null effect, it might be that the IV really didn't affect the DV. -Other times when there's a null result, it's because the study wasn't designed or conducted properly, so the IV actually did cause the DV, but some obscuring factor got in the way of the researchers detecting the difference. There are two types of obscuring factors: 1) There might not have been enough difference between groups 2) There might have been too much variability within groups. -A different possibility is that there is a true effect, but this particular study did not detect it.

Regression to the Mean

-A phenomenon in which an extreme finding is likely to be closer to its own typical, or mean, level the next time it is measured, because the same combination of chance factors that made the finding extreme is not present the second time. -Regression works at both extremes. -An unusually good performance or outcome is likely to regress downward (toward its mean) the next time. -And an unusually bad performance or outcome is likely to regress upward (toward its mean) the next time. -Either extreme is explainable by an unusually lucky, or an unusually unlucky, combination of random events.
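
The idea that extreme scores drift back toward the mean at both ends can be illustrated with a small simulation. This is only a sketch under stated assumptions: each score is modeled as a stable true level plus chance noise that is redrawn at every measurement, and the numbers (mean 100, SD 10, 10,000 people, top and bottom 10%) are hypothetical, not taken from the chapter.

```python
import random
from statistics import mean

def simulate_regression(n=10_000, seed=0):
    """Model each score as a stable true level plus chance noise (an assumption)."""
    rng = random.Random(seed)
    true_level = [rng.gauss(100, 10) for _ in range(n)]
    # The chance factors are drawn independently at Time 1 and Time 2.
    time1 = [t + rng.gauss(0, 10) for t in true_level]
    time2 = [t + rng.gauss(0, 10) for t in true_level]

    # Rank people by Time 1 score and pick the most extreme 10% at each end.
    order = sorted(range(n), key=lambda i: time1[i])
    k = n // 10
    lowest, highest = order[:k], order[-k:]

    return {
        "highest_time1": mean(time1[i] for i in highest),
        "highest_time2": mean(time2[i] for i in highest),
        "lowest_time1": mean(time1[i] for i in lowest),
        "lowest_time2": mean(time2[i] for i in lowest),
    }

result = simulate_regression()
for name, value in result.items():
    print(f"{name}: {value:.1f}")
```

No treatment is applied between the two measurements, yet the highest scorers come out lower at Time 2 and the lowest scorers come out higher, which is the pattern the definition describes.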

Placebo Effect

-A response or effect that occurs when people receiving an experimental treatment experience a change or improvement only because they believe they are receiving a valid treatment. -Placebo effects aren't imaginary, and placebos can be strong treatments. -Can occur whenever any kind of treatment is used to control symptoms, such as an herbal remedy to enhance wellness. -Have been shown to reduce real symptoms and side effects, both psychological and physical, including depression, postoperative pain or anxiety, terminal cancer pain, and epilepsy. -They are not always beneficial or harmless; placebos can cause physical side effects, including skin rashes and headaches. -People's symptoms appear to respond not just to the active ingredients in medications or to psychotherapy, but also to their belief in what the treatment can do to alter their situation.

Testing Threat

-A specific kind of order effect, in which there is a change in participants as a result of experiencing the DV (the test) more than once. -Their scores might go up due to practice (practice effect), or their scores might go down due to fatigue (fatigue effect). -Testing threats affect internal validity because it's not clear if the treatment caused the change in the DV or whether practice or fatigue did. -People might have become more practiced at taking the test, leading to improved scores, or they may become fatigued or bored, which could lead to worse scores over time. -Therefore, testing threats include both practice effects and fatigue effects.

Double-blind Study

-A study in which neither the participants nor the researchers who evaluate them know who is in the treatment group and who is in the comparison group. -EX: Nikhil decides to test his hypothesis as a double-blind study. He could arrange to have two cabins of equally lively campers and replace the sugary snacks with good-tasting low-sugar versions for only one group. The boys would not know which kind of snacks they were eating, and the people observing their behavior would also be blind to which boys were in which group.

Selection-history threat

-A threat to internal validity in which a historical or seasonal event systematically affects only the participants in the treatment group or only those in the comparison group, not both. -An outside event or factor affects only those at one level of the IV -EX 1: The dorm that was used as a comparison group was undergoing construction, and the construction crew used electric tools that drew on only that dorm's power supply. The researcher won't be sure: Was it b/c the Go Green campaign reduced student energy usage? Or was it only b/c the comparison group dorm used so many power tools? -EX 2: Students at one university were in your treatment group and students at another university were in your control group in a study of the effects of meditation on stress. However, during the course of the study, a stressful event occurs on one of the campuses

Selection-attrition Threat

-A threat to internal validity in which participants are likely to drop out of either the treatment group or the comparison group, not both. -Only one of the experimental groups experiences attrition. -EX: Participants in one group have to travel 1 mile for the study, and participants in the other group have to travel 20 miles for the study. You might have more attrition in the 20-mile group due to the distance from the lab, so you could not be sure if differences between groups were due to the IV or to the distance and attrition.

Instrumentation Threat

-A threat to internal validity that occurs when a measuring instrument changes over time. -In observational research, the people who are coding behaviors are the measuring instrument, and over a period of time, they might change their standards for judging behavior by becoming stricter or more lenient. -EX 1: Thus, maybe Nikhil's campers did not really become less disruptive; instead, the people judging the campers' behavior became more tolerant of loud voices and rough-and-tumble play. -EX 2: observers change their observation criteria over time, or a researcher uses different forms of a test at pretest and posttest and they're not equivalent forms.

Maturation Threat

-A threat to internal validity that occurs when an observed change in an experimental group could have emerged more or less spontaneously over time. -A change in behavior that emerges more or less spontaneously over time. -People adapt to changed environments -Children become better and faster at solving addition and subtraction problems as they get older -Plants grow taller with age -EX: Depressed women may have improved because the cognitive therapy was effective, but an alternative explanation is that a systematically high portion of them simply improved on their own. Sometimes the symptoms of depression or other disorders disappear, for no known reason, with time. This phenomenon, known as spontaneous remission, is a specific type of maturation.

History Threats to Internal Validity

-A threat to internal validity that occurs when it is unclear whether a change in the treatment group is caused by the treatment itself or by an external or historical factor that affects most members of the group. -Result when some external factor or "historical" event systematically affects most members of the treatment group at the same time as the treatment itself, making it unclear whether the change is caused by the treatment received. -To be a history threat, the external factor must affect most people in the group in the same direction (systematically), not just a few people (unsystematically). -EX: You were studying the effects of meditation on stress levels among college students, and while you were conducting the study, a violent event occurred on the college campus at which you were collecting your data. -The meditation group did not show significant decreases in stress levels as expected, but was that because the treatment wasn't effective? -Perhaps it was effective but the campus violence raised people's stress levels, which made it look like it was not effective.

Regression Threats to Internal Validity

-A threat to internal validity which refers to a statistical concept called regression to the mean, a phenomenon in which any extreme finding is likely to be closer to its own typical, or mean, level the next time it is measured (with or without the experimental treatment or intervention) -When a group average (mean) is unusually extreme at Time 1, the next time that group is measured (Time 2), it is likely to be less extreme—closer to its typical or average performance. -Regression threats only occur in pretest/posttest designs, when a group is measured twice, and only when the group has an extreme score at pretest. -If the group has been selected because of its unusually high or low group mean at pretest, you can expect them to regress toward the mean somewhat when it comes time for the posttest. -Specifically, they only occur when a group has an extreme pretest score (high or low).

The Really Bad Experiment

-Also known as one-group pretest/posttest design, which means that there is one group of participants who are measured on a pretest, exposed to a treatment/intervention/change, and then measured on a posttest. -Such a design is problematic because it is vulnerable to threats to internal validity.

One-group, Pretest/posttest Design

-An experiment in which a researcher recruits one group of participants; measures them on a pretest; exposes them to a treatment, intervention, or change; and then measures them on a posttest. -This design differs from the true pretest/posttest design because it has only one group, not two. There is no comparison group.

Ceiling Effect

-An experimental design problem in which IV groups score almost the same on a DV, such that all scores fall at the high end of their possible distribution. -All the scores are squeezed together at the high end -EX: Suppose the researcher manipulated anxiety by telling the groups they were about to receive an electric shock. The low-anxiety group was told to expect a 10-volt shock, the medium-anxiety group was told to expect a 50-volt shock, and the high-anxiety group was told to expect a 100-volt shock. This manipulation would probably result in a ceiling effect because expecting any amount of shock would cause anxiety, regardless of the shock's intensity. As a result, the various levels of the independent variable would appear to make no difference.

Floor Effect

-An experimental design problem in which IV groups score almost the same on a DV, such that all scores fall at the low end of their possible distribution -All the scores cluster at the low end -EX: If a researcher really did manipulate the independent variable by giving people $0.00, $0.25, or $1.00, that would be a floor effect because these three amounts are all low; they're squeezed close to a floor of $0.00.

EX: Instrumentation Threat & Prevention

-Another case of an instrumentation threat would be when a researcher uses different forms for the pretest and posttest, but the two forms are not sufficiently equivalent. -Dr. Yuki might have used a measure of depression at pretest on which people tend to score a little higher, and another measure of depression at posttest that tends to yield lower scores. -As a result, the pattern she observed was not a sign of how good the cognitive therapy is, but merely reflected the way the alternative forms of the test are calibrated. -Prevention: To control for the problem of different forms, Dr. Yuki could also counterbalance the versions of the test, giving some participants version A at pretest and version B at posttest, and giving other participants version B, and then version A.

Preventing History Threats

-As with maturation threats, a comparison group can help control for history threats. EX: In the Go Green study, the students would need to measure the kilowatt usage in another, comparable dormitory during the same 2 months, but not give the students in the second dorm the Go Green campaign materials. (This would be a pretest/posttest design rather than a one-group, pretest/posttest design.) -If both groups decreased their electricity usage about the same over time, the decrease probably resulted from the change of seasons, not from the Go Green campaign. -If the treatment group decreased its usage more than the comparison group did, you can rule out the history threat. -Both the comparison group and the treatment group experienced the same seasonal "historical" changes, so including the comparison group controls for this threat.

Weak Manipulations

-The three examples from the discussion of null effects can each be thought of in terms of weak manipulations. -Perhaps in the study on money and mood, with the levels of the IV being no cash, 25 cents, or a dollar, the amounts just weren't large enough to affect people's mood. -The difference between the levels didn't really matter.

Order Effects: Threat to Internal Validity

-In a within-groups design, there is an alternative explanation because the outcome might be caused by the IV, but it might also be caused by the order in which the levels of the variable are presented. -When there is an order effect, we do not know whether the independent variable is really having an effect, or whether the participants are just getting tired, bored, or well-practiced.

Individual Differences

-Can be another source of within-groups variability. -They can be a problem in independent-groups designs. -EX: In the experiment on money and mood, the normal mood of the participants must have varied. -Some people are naturally more cheerful than others, and these individual differences have the effect of spreading out the scores of the students within each group -In the $1.00 condition is Candace, who is typically unhappy. -The $1.00 gift might have made her happier, but her mood would still be relatively low because of her normal level of saltiness. -Michael, a cheerful guy, was in the no-money control condition, but he still scored high on the mood measure. -Overall, students who received money were slightly more cheerful than students in the control group, but the scores in the two groups overlapped a great deal. -Thus, the individual differences within each group obscured the between-groups difference.

Design Confounds Acting in Reverse

-Confounds are usually considered to be internal validity threats, alternative explanations for some observed difference in a study. -However, they can apply to null effects, too. -A design confound can counteract, or reverse, the true effect of an IV -EX 1: In the money and happiness study, perhaps the students who received the most money happened to be given the money by a grouchy experimenter, while those who received the least money were exposed to a more cheerful person. This confound would have worked against any true effect of money on mood. -EX 2: In the GRE study, perhaps that test-prep group was also under additional pressure to perform well on the GRE. As a result, they were actually receiving test prep and pressure, while the no-test prep group didn't have test prep or pressure. The added pressure that was applied is considered a confound. However, it didn't work in favor of the test-prep group; it worked against them by lowering their scores

EX 2: Null Effect

-Do online reading games make kids better readers? -An educational psychologist recruited a sample of 5-year-olds, all of whom did not yet know how to read. -She randomly assigned the children to two groups. -One group played with a commercially available online reading game for 1 week (about 30 minutes per day), and the other group continued "treatment as usual," attending their normal kindergarten classes. -Afterward, the children were tested on their reading ability. -The reading game group's scores were a little higher than those of the kindergarten-as-usual group, but the 95% CI for the estimated difference between two groups included zero.

EX: Observer Bias

-Dr. Yuki might be a biased observer of her patients' depression: She expects to see her patients improve, whether they do or do not. -Even if Dr. Yuki used a no-therapy comparison group, observer bias could still occur: If she knew which participants were in which group, her biases could lead her to see more improvement in the therapy group than in the comparison group

EX 1: Regression to the Mean

-During an early round of the 2019 Women's World Cup, the team from Italy outscored the team from Jamaica 5-0. -That's a big score; soccer teams hardly ever score 5 goals in a game. -Without being familiar with either team, people who know about soccer would predict that in their next game, Italy would score fewer than 5 goals. -Why? Simply because most people have an intuitive understanding of regression to the mean. -Statistical Explanation: The Italian team's score was exceptionally high partly because of the team's talent, and partly because of a unique combination of random factors that happened to come out in their favor. -It was an early-round game, and the players felt confident because they were higher seeded. -The team's injury level was, just by chance, much lower than usual. -The European setting may have favored Italy. -Despite Italy's legitimate talent as a team, they also benefited from randomness, a chance combination of lucky events that would probably never happen in the same combination again, like flipping a coin and getting eight heads in a row. -Overall, the team's score in the subsequent game would almost necessarily be worse than in this game, because not all eight flips will turn out in their favor again. -Indeed, the team did regress: In their next game, they lost to Brazil, 0-1. -In other words, Italy finished closer to an average level of performance.

EX: Selection-attrition Threat

-If Dr. Yuki conducted her depression therapy experiment as a pretest/posttest design, it might be the case that the most severely depressed people dropped out, but only from the treatment group, not the control group. -The treatment might have been especially arduous for the most depressed people, so they dropped out of the study. -Because the control group was not undergoing treatment, they are not susceptible to the same level of attrition. -Therefore, selection and attrition can combine to make Dr. Yuki unsure: Did the cognitive therapy really work, compared with the control group? -Or is it just that the most severely depressed people dropped out of the treatment group?

EX 1: Attrition Threat

-If any random camper leaves midweek, it might not be a problem for Nikhil's research, but it is a problem when the most rambunctious camper leaves early. -His departure creates an alternative explanation for Nikhil's results: Was the posttest average lower because the low-sugar diet worked, or because one extreme score is gone?

Measure More Instances: Solution for Reducing Measurement Error

-If a researcher can't find a measurement tool that's reliable and valid, then the best alternative is to measure a larger sample of participants or take multiple measurements on the sample you have. -One solution to measuring badly is to take more measurements. -When a tool potentially causes a great deal of random error, the researcher can cancel out many errors simply by including more people in the sample or measuring multiple observations.

Individual Differences: Solution- Add more participants

-If within-groups or matched-groups designs are inappropriate (and sometimes they are, because of order effects, demand characteristics, or other practical concerns), another solution to individual difference variability is to measure more people. -The principle is the same as it is for measurement error: When a great deal of variability exists because of individual differences, a simple solution is to increase the sample size. -The more people you measure, the less impact any single person will have on the group's average. -Adding more participants reduces the influence of individual differences within groups, thereby enhancing the study's ability to detect differences between groups. -Another reason to use a larger sample is that it leads to a more precise estimate. Computing the 95% CI for a set of data requires three elements: 1) Variability component; based on the standard deviation 2) Sample size component; where sample size goes in the denominator 3) Constant. -The larger the sample size, the more precise our estimate is and the narrower our CI is
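
The three elements of the 95% CI listed above can be combined in a short sketch showing why a larger sample gives a narrower interval. The function name `ci95_half_width`, the SD of 10, and the constant 1.96 (a large-sample normal approximation; a course may instead use a t critical value) are assumptions for illustration.

```python
import math

def ci95_half_width(sd, n, constant=1.96):
    """Margin of error = constant * (variability / sqrt(sample size)).

    constant=1.96 assumes a normal approximation (hypothetical choice).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return constant * sd / math.sqrt(n)

sd = 10.0  # hypothetical standard deviation of the DV
for n in (25, 100, 400):
    print(f"n={n}: mean +/- {ci95_half_width(sd, n):.2f}")
```

Because sample size enters through a square root in the denominator, quadrupling the sample (for example, from 100 to 400) cuts the interval's half-width in half.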

INTERROGATING NULL EFFECTS: WHAT IF THE INDEPENDENT VARIABLE DOES NOT MAKE A DIFFERENCE?

-If you encounter a study in which the IV had no effect on the DV (a null effect), you can review the possible obscuring factors. -Obscuring factors can be sorted into two categories of problems. -One is the problem of not enough between-groups difference, which results from weak manipulations, insensitive measures, ceiling or floor effects, or a design confound acting in reverse. -The second problem is too much within-groups variability, caused by measurement error, irrelevant individual differences, or situation noise. -These problems can be counteracted by using multiple measurements, more precise measurements, within-groups designs, large samples, and very controlled experimental environments. -If you can be reasonably sure a study avoided all the obscuring factors, then the study provides valuable evidence. -You should consider it, along with other studies on the same topic, to evaluate how strong some effect is in the real world.

Attrition Threat

-In a pretest/posttest, repeated-measures, or quasi-experimental study, a threat to internal validity caused by a reduction in participant numbers that occurs when people drop out of the study before it ends. -Can happen when a pretest and posttest are administered on separate days and some participants are not available on the second day. -An attrition threat becomes a problem for internal validity when attrition is systematic; when only a certain kind of participant drops out.

Preventing Maturation Threats

-In a really bad experiment, remember that there is only one group of participants and they are getting the treatment. -But in a true experiment there is a comparison group in order to prevent maturation threats. -In the depression study graphed here, a comparison group has been added to Dr. Yuki's really bad experiment. Notice that the two groups are quite similar in their depression scores at pretest, but the depression scores at posttest differ with the therapy group having lower depression scores than the no-therapy group. -Thus, the effect of maturation could be "subtracted out" when interpreting the results of this study, and a maturation threat has been prevented.
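
The "subtracting out" logic can be shown with simple arithmetic. The depression scores below are hypothetical, chosen only so that the two groups match at pretest; they are not data from Dr. Yuki's study.

```python
def change(pre, post):
    """Pretest-to-posttest change (negative means depression went down)."""
    return post - pre

# Hypothetical group means on a depression scale.
therapy = {"pre": 30.0, "post": 18.0}
no_therapy = {"pre": 30.0, "post": 26.0}

therapy_change = change(**therapy)        # includes therapy + maturation
comparison_change = change(**no_therapy)  # maturation alone
beyond_maturation = therapy_change - comparison_change

print(therapy_change, comparison_change, beyond_maturation)
```

In this sketch the comparison group's improvement stands in for what maturation alone would produce, so only the extra improvement in the therapy group is attributed to the treatment.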

EX: Testing Threat

-In an educational setting, students might perform better on a posttest than on a pretest, but not because of any educational intervention. -Instead, perhaps they were inexperienced the first time they took the test, and they did better on the posttest simply because they had more practice the second time around.

Manipulation Check

-In an experiment, an extra DV researchers include to determine how well an experimental manipulation worked. -EX: In the anxiety study, after telling people they were going to receive a 10-volt, 50-volt, or 100-volt shock, the researchers might have asked: How anxious are you right now, on a scale of 1 to 10? -If the manipulation check showed that participants in all three groups felt nearly the same level of anxiety, you'd know the researchers did not effectively manipulate what they intended to manipulate. -If the manipulation check showed that the IV levels differed in an expected way (participants in the high-anxiety group really felt more anxious than those in the other two groups), then you'd know the researchers did effectively manipulate anxiety. -If the manipulation check worked, the researchers could look for another reason for the null effect of anxiety on logical reasoning. -Perhaps the dependent measure has a floor effect; that is, the logical reasoning test might be too difficult, so everyone scores low -Or perhaps the effect of anxiety on logical reasoning is truly negligible.

Controlling for Observer Bias and Demand Characteristics Using Double-Blind Study

-In order to control or avoid observer bias and demand characteristics, researchers must do more than add a comparison group to their studies. -The most appropriate way to avoid such problems is to conduct a double-blind study, in which neither the participants nor the researchers who evaluate them know who is in the treatment group and who is in the comparison group.

Six Potential Internal Validity Threats in One-Group, Pretest/Posttest Designs

-The six internal validity threats that apply to the really bad experiment can be prevented with a good experimental design. -These include 1) Maturation threats 2) History threats 3) Regression threats 4) Attrition threats 5) Testing threats 6) Instrumentation threats -Three further threats potentially apply to any study: 7) Observer bias 8) Demand characteristics 9) Placebo effects

EX 2: Attrition Threat

-It would not be unusual if two of 40 women in the depression therapy study dropped out over time. -However, if the two most depressed women systematically drop out, the mean for the posttest is going to be lower only because it does not include these two extreme scores (not because of the therapy). -Therefore, if the depression score goes down from pretest to posttest, you wouldn't know whether the decrease occurred because of the therapy or because of the alternative explanation, that the highest-scoring women had dropped out.
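
A quick calculation shows how losing only the two most extreme scores can lower a group mean with no real change in anyone. The scores are hypothetical: 38 women at 20 and two at 40, and the sketch assumes nobody's depression actually changes between pretest and posttest.

```python
from statistics import mean

# Hypothetical pretest depression scores for 40 women.
pretest = [20.0] * 38 + [40.0, 40.0]

# Assume no one changes, but the two highest scorers drop out before posttest.
posttest = sorted(pretest)[:-2]

pretest_mean = mean(pretest)
posttest_mean = mean(posttest)
print(pretest_mean, posttest_mean)
```

The mean falls from 21 to 20 even though every remaining participant scored exactly the same both times, which is why systematic attrition is an alternative explanation for an apparent improvement.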

EX 1: Null Effect

-Many people believe having more money will make them happy. -But will it? -A group of researchers designed an experiment in which they randomly assigned people to three groups. -They gave one group nothing, gave the second group a little money, and gave the third group a lot of money. -The next day, they asked each group to report their happiness on a mood scale. -The groups who received cash (either a little or a lot) were not significantly happier, or in a better mood, than the group who received nothing. -The 95% CIs for the groups overlapped completely.

Use Reliable, Precise Tools: Solution for Reducing Measurement Error

-Measurement errors are reduced when researchers use measurement tools that are reliable (internal, interrater, and test/retest) -When such tools also have good construct validity, there will be a lower error rate as well. -More precise and accurate measurements have less error.

EX: Placebo Effects

-Medications: one group receives a pill or an injection with the real drug, while another group receives a pill or an injection with no active ingredients, a sugar pill or a saline solution. -People can receive placebo psychotherapy, in which they simply talk to a friendly listener about their problems, but these placebo conversations have no therapeutic structure. -The inert pill, injection, or therapy is the placebo. -People who receive the placebo see their symptoms improve because they believe the treatment they are receiving is supposed to be effective.

EX: Masked Design (blind design)

-People took notes in longhand or on laptops. -The research assistants in that study were blind to the condition each participant was in when they graded their tests on the lectures. -The participants themselves were not blind to their notetaking method. -However, since the test takers participated in only one condition (an independent-groups design), they were not aware that the form of notetaking was an important feature of the experiment. -Therefore, they were blind to the reason they were taking notes in longhand or on a laptop.

Preventing Regression Threats

-Regression threats can be avoided by using comparison groups, along with a careful inspection of the pattern of results. -If the comparison group and the experimental group are equally extreme at pretest, the researchers can account for any regression effects in their results. -If the therapy group then improves more than the comparison group, regression can be ruled out, and we can conclude that the therapy worked.

Preventing Testing Threats

-Researchers can abandon a pretest altogether and use a posttest-only design. -If they do use a pretest, they might opt to use alternative forms of the test for the two measurements. -The two forms might both measure depression, for example, but use different items to do so. -A comparison group will also help. -You can rule out testing threats if both groups take the pretest and the posttest, but the treatment group exhibits a larger change than the comparison group.

EX 3: Null Effect

-Researchers have hypothesized that feeling anxious can cause people to reason less carefully and logically. -To test this hypothesis, a research team randomly assigned people to three groups: low, medium, and high anxiety. -After a few minutes of being exposed to the anxiety manipulation, the participants solved problems requiring logic, rather than emotional reasoning. -Although the researchers had predicted the anxious people would do worse on the problems, participants in the three groups scored roughly the same.

Summary

-Responsible experimenters may conduct double-blind studies, measure variables precisely, or put people in controlled environments to eliminate internal validity threats and increase a study's power to avoid false null effects

Insensitive Measures

-Sometimes a null result occurs because the researchers haven't operationalized the DV with enough sensitivity. -EX: If a medication reduces fever by a tenth of a degree, you wouldn't be able to detect it with a thermometer that was calibrated in one-degree increments—it wouldn't be sensitive enough. -EX 2: If online reading games improve reading scores by about 2 points, you wouldn't be able to detect the improvement with a simple pass/fail reading test (either passing or failing, nothing in between). -When it comes to dependent measures, it's smart to use ones that have detailed, quantitative increments—not just two or three levels.

Two Obscuring Factors of Null Effect

-Suppose you prepared two bowls of salsa: one containing two shakes of hot sauce and the other containing four shakes of hot sauce. -People might not taste any difference between the two bowls (a null effect!). 1) One reason is that four shakes is not that different from two: There's not enough between-groups difference. 2) A second reason is that each bowl contains many other ingredients (tomatoes, onions, jalapeños, cilantro, lime juice), so it's hard to detect any change in hot sauce intensity with all those other flavors getting in the way. -That's a problem of too much within-groups variability.

EX 2: Regression to the Mean

-Suppose you're normally cheerful and happy, but on any given day your usual upbeat mood can be affected by random factors, such as the weather, your friends' moods, and even parking problems. -Every once in a while, just by chance, several of these random factors will affect you negatively: It will pour rain, your friends will be grumpy, and you won't be able to find a parking space. -Your day is terrible! The good news is that tomorrow will almost certainly be better because all three of those random factors are unlikely to occur in that same, unlucky combination again. It might still be raining, but your friends won't be grumpy, and you'll quickly find a good parking space. -If even one of these factors is different, your day will go better and you will regress toward your average, happy mean.

Measurement Error

-The degree to which the recorded measure for a participant on some variable differs from the true value of the variable for that participant. -Measurement errors may be random, such that scores that are too high and too low cancel each other out; or they may be systematic, such that most scores are biased too high or too low. -One reason for high within-groups variability is measurement error, a human or instrument factor that can randomly inflate or deflate a person's true score on the DV. -EX: A person who is 160 cm tall might be measured at 160.25 cm because of the angle of vision of the person using the meter stick, or they might be recorded as 159.75 cm because they slouched a bit. -All DVs involve a certain amount of measurement error, but researchers try to keep those errors as small as possible. -EX: The reading test used as a DV in the educational psychologist's study is not perfect. -A group's score on the reading test represents the group's "true" reading ability, the actual level of the construct in a group, plus or minus some random measurement error. -Maybe one child's batch of questions happened to be more difficult than average. -Maybe one child was especially distracted during the test, and another was especially focused. -When these distortions of measurement are random, they cancel each other out across a sample of people and will not affect the group's average, or mean. -An operationalization with a lot of measurement error will result in a set of scores that are more spread out around the group mean.
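
The difference between random and systematic error can be sketched with the height example. The error sizes below (a 0.25 cm random spread and a 0.5 cm upward bias) are assumptions chosen for illustration.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is repeatable

TRUE_HEIGHT_CM = 160.0
N_MEASUREMENTS = 5_000

# Random error: each reading is off a little, in either direction.
random_error_readings = [
    TRUE_HEIGHT_CM + random.gauss(0, 0.25) for _ in range(N_MEASUREMENTS)
]

# Systematic error: the same readings, but every one is biased 0.5 cm too high
# (for example, a hypothetical mis-marked meter stick).
systematic_error_readings = [reading + 0.5 for reading in random_error_readings]

random_mean = statistics.mean(random_error_readings)
systematic_mean = statistics.mean(systematic_error_readings)
spread = statistics.stdev(random_error_readings)

print("Mean with random error only:", round(random_mean, 2))
print("Mean with systematic error:", round(systematic_mean, 2))
print("Spread (SD) around the mean:", round(spread, 2))
```

Random errors leave the mean near the true value but spread the scores out; the systematic error shifts the mean itself.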

2 Solutions for Reducing Measurement Error

1) Use reliable, precise tools 2) Measure more instances
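
Why measuring more instances helps can be shown with a short simulation. It assumes, purely for illustration, that every participant has the same true score of 50 and that each single measurement carries random error with a standard deviation of 5.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

TRUE_SCORE = 50.0      # assumed true score, identical for everyone
ERROR_SD = 5.0         # assumed random error in a single measurement
N_PARTICIPANTS = 2_000
N_REPEATS = 9          # how many times each person is measured in the second case


def one_measurement():
    """A single noisy measurement: true score plus random error."""
    return TRUE_SCORE + random.gauss(0, ERROR_SD)


# Case 1: each participant is measured once.
single = [one_measurement() for _ in range(N_PARTICIPANTS)]

# Case 2: each participant is measured several times and the results are averaged.
averaged = [
    statistics.mean([one_measurement() for _ in range(N_REPEATS)])
    for _ in range(N_PARTICIPANTS)
]

spread_single = statistics.stdev(single)
spread_averaged = statistics.stdev(averaged)

print("SD with one measurement each:", round(spread_single, 2))
print("SD with averaged measurements:", round(spread_averaged, 2))
```

Averaging lets the random errors cancel within each person, so the scores cluster more tightly around the true value.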

Power

-The likelihood that a study will show a statistically significant result when an IV really has an effect in the population. -The probability of not making a Type II error. -Studies with a lot of power are more likely to detect true differences. -Power is an aspect of statistical validity. -EX: If GRE prep courses really work to increase GRE scores, then a study with enough power is likely to detect this difference. -A within-groups design, a strong manipulation, a larger number of participants, and less situation noise are all things that can improve the precision of our estimates. Of these, the easiest way to increase precision and power is to add more participants.
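
As a rough sketch of how power grows with sample size, the function below uses a normal (z-test) approximation for a two-group comparison. The function name, the effect size of 0.5, and the sample sizes are assumptions for illustration; real power analyses often use t-based methods instead.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def approximate_power(effect_size_d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group z-test.

    effect_size_d is the true difference between group means divided by the
    within-group standard deviation.
    """
    z_crit = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected value of the z statistic when the true effect is effect_size_d.
    expected_z = effect_size_d * sqrt(n_per_group / 2)
    upper_tail = 1 - STANDARD_NORMAL.cdf(z_crit - expected_z)
    lower_tail = STANDARD_NORMAL.cdf(-z_crit - expected_z)
    return upper_tail + lower_tail


for n in (10, 20, 64, 200):
    print(n, "per group -> power about", round(approximate_power(0.5, n), 2))
```

In this approximation, a larger effect (a stronger manipulation) or less within-group variability raises `effect_size_d`, and more participants raises `n_per_group`; either change increases power.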

Measurement Error (part 2)

-The more sources of random error there are in a DV's measurement, the more variability there will be within each group in an experiment. -In contrast, the more precisely and carefully a DV is measured, the less variability there will be within each group. -And lower within-groups variability is better, making it easier to detect a difference (if one exists) between the different IV groups.

Design Confound: Threat to Internal Validity

-There is an alternative explanation because the experiment was poorly designed; another variable happened to vary systematically along with the intended independent variable. -EX: Study on pen versus laptop notetaking: If the test questions assigned to the laptop condition were more difficult than those assigned to the pen condition, that would have been a design confound. -It would not be clear whether the notetaking format or the difficulty of the questions caused the handwritten notes group to score better.

Null Effects from the 3 Examples (above)

-These three examples of null effects are all posttest-only designs. -A null effect can happen in a within-groups design or a pretest/posttest design, and in correlational studies -In all three of these cases, the IV manipulated by the experimenters did not result in a change in the DV. Why didn't these experiments show covariance between the independent and dependent variables?

Instrumentation VS. Testing Threat.

-These two threats are pretty similar, but here's the difference: In an instrumentation threat, the measuring instrument has changed from Time 1 to Time 2, -whereas in a testing threat, the participant changes over the period between Time 1 and Time 2.

Demand Characteristics

-This is bias that occurs when participants guess what the study is supposed to be about and change their behavior in the expected direction. -EX 1: Dr. Yuki's patients know they are getting therapy. If they think Dr. Yuki expects them to get better, they might change their self-reports of symptoms in the expected direction. -EX 2: Nikhil's campers, too, might realize something fishy is going on when they're not given their usual snacks. -Their awareness of a menu change could certainly change the way they behave.

Designing Studies to Rule Out the Placebo Effect: Double-blind Placebo Control Study

-To determine whether an effect is caused by a therapeutic treatment or by placebo effects, the standard approach is to include a special kind of comparison group. -As usual, one group receives the real drug or real therapy, and the second group receives the placebo drug or placebo therapy. -Neither the people treating the patients nor the patients themselves know who is in which group (the real group or the placebo group) -This experimental design is called a double-blind placebo control study -In order to determine whether there is in fact a placebo effect, the researcher might add a third comparison group that doesn't receive the true therapy and doesn't receive the placebo therapy. -They don't receive any therapy at all. If you have a placebo effect then the no treatment group should not improve as much as the placebo group.

Preventing Instrumentation Threats

-To prevent instrumentation threats, researchers can use a posttest-only design (in which behavior is measured only once). -However, if a pretest/posttest design is needed, researchers should make sure the pretest and posttest forms are equivalent. -To do so, they might collect data from each instrument to be sure the two are calibrated the same. -To avoid shifting standards among behavioral coders, researchers might retrain their coders throughout the experiment, establishing their reliability and validity at both pretest and posttest. -Using clear coding manuals would be an important part of this process. -Researchers can also counterbalance the order of the pretest and posttest forms, such that some participants get Form A at pretest and some get Form B, and then they get the other form at posttest.
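
Counterbalancing the two forms can be sketched as a simple assignment rule. The function name and participant IDs are hypothetical, and alternating is only one option; researchers might instead assign the orders at random.

```python
def counterbalance_forms(participant_ids):
    """Alternate which form comes first so both orders are used about equally.

    Returns a dict mapping each participant to (pretest form, posttest form).
    """
    schedule = {}
    for index, participant in enumerate(participant_ids):
        if index % 2 == 0:
            schedule[participant] = ("Form A", "Form B")
        else:
            schedule[participant] = ("Form B", "Form A")
    return schedule


# Hypothetical participant IDs, for illustration only.
schedule = counterbalance_forms(["P01", "P02", "P03", "P04"])
for participant, (pretest_form, posttest_form) in schedule.items():
    print(participant, "pretest:", pretest_form, "posttest:", posttest_form)
```

Because each form appears equally often at pretest and at posttest, any difference in difficulty between the forms cannot masquerade as a change over time.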

Combined Threats

-True pretest/posttest designs (those with two or more groups) normally take care of many internal validity threats. -However, in some cases, a study with a pretest/posttest design might combine selection threats with history or attrition threats.

Situation Noise

-Unrelated events or distractions in the external environment that create unsystematic variability within groups and obscure true group differences. -It can be minimized by controlling the surroundings of an experiment. -Situation noise includes smells, sights, and sounds that might distract participants; it adds unsystematic variability to each group, so researchers try to control any aspect of the surroundings that might affect the DV. -EX: Suppose the money and mood researchers had conducted their study in the middle of the student union on campus. -The sheer number of distractions in this setting would make a mess of the data. -The smell of the nearby coffee shop might make some participants feel cozy, seeing friends at the next table might make some feel extra happy, and seeing the cute person from sociology class might make some feel nervous or self-conscious. -The kind and amount of distractions in the student union would vary from participant to participant and from moment to moment. -The result would be unsystematic variability within each group. -Unsystematic variability, like that caused by random measurement error or irrelevant individual differences, will obscure true differences between groups.

Noise (error variance, unsystematic variance)

-Unsystematic variability among the members of a group in an experiment, which might be caused by situation noise, individual differences, or measurement error. -Another reason a study might return a null effect is that there is too much unsystematic variability within each group, referred to as noise -Noisy within-groups variability can get in the way of detecting a true difference between groups. -EX: In the salsa analogy, noise refers to the great number of the other flavors in the two bowls. If the two bowls of salsa contained nothing but tomatoes, the difference between two and four shakes of hot sauce would be more easily detectable because there would be fewer competing, "noisy" flavors within bowls. -The more unsystematic variability there is within each group, the more the scores in the two groups overlap with each other. -The greater the overlap, the less apparent the average difference.
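
One way to see how noise hides a difference is to compare the same mean difference against two levels of within-group spread. The ratings below are invented, and the standardized difference used here (mean difference divided by the pooled within-group SD, assuming equal group sizes) is a common effect-size measure brought in for illustration, not something the notes above define.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(group_a, group_b):
    """Mean difference divided by the pooled within-group SD (equal group sizes assumed)."""
    pooled_sd = sqrt((variance(group_a) + variance(group_b)) / 2)
    return (mean(group_b) - mean(group_a)) / pooled_sd


# Hypothetical spiciness ratings. Both pairs of bowls differ by 2 points on average.
plain_two_shakes = [4, 5, 6]      # little within-group variability
plain_four_shakes = [6, 7, 8]

busy_two_shakes = [1, 5, 9]       # lots of within-group variability (noise)
busy_four_shakes = [3, 7, 11]

low_noise_effect = standardized_difference(plain_two_shakes, plain_four_shakes)
high_noise_effect = standardized_difference(busy_two_shakes, busy_four_shakes)

print("Low-noise bowls:", low_noise_effect)
print("High-noise bowls:", high_noise_effect)
```

The raw difference is identical in both cases, but with more noise the groups overlap more, so the difference is much smaller relative to the spread and harder to detect.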

Individual Differences: Solution-Change the Design

-Use a within-groups design instead of an independent-groups design -When you do this, each person receives both levels of the IV, and individual differences are controlled for. -It's easier to see the effect of the IV when individual differences aren't obscuring between-groups differences. -A within-groups design, in which all participants are compared with themselves, controls for irrelevant individual differences. -You can use a matched-groups design -Pairs of participants are matched on an individual differences variable, and it's easier to see the effects of the IV.
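
The advantage of comparing each person with themselves can be sketched with made-up scores. Here people differ widely at baseline, and the treatment is assumed to add only a point or two to each person's score.

```python
from statistics import mean, stdev

# Hypothetical scores: large individual differences, small treatment gains.
baseline = [10, 20, 30, 40, 50]
treatment_gain = [1, 2, 3, 2, 2]
treated = [score + gain for score, gain in zip(baseline, treatment_gain)]

# Independent-groups view: the 2-point difference sits against a large spread.
between_difference = mean(treated) - mean(baseline)
within_group_sd = stdev(baseline)

# Within-groups view: each person's change is computed against their own baseline.
paired_differences = [after - before for after, before in zip(treated, baseline)]
paired_sd = stdev(paired_differences)

print("Mean difference:", between_difference)
print("Spread of scores within a group:", round(within_group_sd, 2))
print("Spread of the paired differences:", round(paired_sd, 2))
```

The mean difference is the same either way, but the paired differences vary far less than the raw scores, because the stable individual differences drop out of the comparison.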

Masked Design (blind design)

-When a double-blind study is not possible, a variation might be an acceptable alternative. -Participants know which group they are in, but the observers do not -EX: The students exposed to the Go Green campaign would certainly be aware that someone was trying to influence their behavior. However, the raters who were recording their electrical energy usage should not know which dorm was exposed to the campaign and which was not. Keeping observers unaware is even more important when they are rating behaviors that are more difficult to code, such as symptoms of depression or behavior problems at camp.

Perhaps There Is Not Enough Between-Groups Difference: Null Result

-When a study returns a null result, sometimes the culprit is not enough between-groups difference. -Weak manipulations, insensitive measures, ceiling and floor effects, and reverse design confounds might prevent study results from revealing a true difference that exists between two or more experimental groups.

THREATS TO INTERNAL VALIDITY: DID THE INDEPENDENT VARIABLE REALLY CAUSE THE DIFFERENCE?

-When an experiment finds that an independent variable affected a dependent variable, you can interrogate the study for twelve possible internal validity threats. -The first three threats to internal validity to consider are design confounds, selection effects, and order effects -Six threats to internal validity are especially relevant to the one-group, pretest/posttest design: maturation, history, regression, attrition, testing, and instrumentation threats. -All of them can usually be ruled out if an experimenter conducts the study using a comparison group (either a posttest-only design or a pretest/posttest design). -Three more internal validity threats could potentially apply to any experiment: observer bias, demand characteristics, and placebo effects. -By interrogating a study's design and results, you can decide whether the study has ruled out all twelve threats. -If it passes all your internal validity queries, you can conclude with confidence that the study was a strong one: You can trust the result and make a causal claim.

Regression and Internal Validity: Dr.Yuki

-You might suspect that the 40 depressed women Dr. Yuki studied were, as a group, quite depressed. -Their group average at pretest was partly due to their true, baseline level of depression, but it's also true that people seek treatment when they are especially low. -In this group, a proportion are feeling especially depressed partly because of random events (the winter blues, a recent illness, family or relationship problems, job loss, divorce). -At the posttest, the same unlucky combination of random effects on the group mean probably would not recur (maybe some saw their relationships get better, or the job situation improved for a few), so the posttest depression average would go down. -The change would not occur because of the treatment, but simply because of regression to the mean, so in this case there would be an internal validity threat.

Preventing Attrition Threats

-One way to prevent attrition threats is to remove the scores of participants who drop out of the study from the pretest average. -That is, researchers look only at the scores of those who completed both parts of the study. -Another approach is to look at the pretest scores of the dropouts. -If they have extreme scores on the pretest, their attrition is more likely to threaten internal validity than if their scores are moderate (closer to the group average).
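
Both approaches can be sketched with a small, invented data set in which one participant with an extreme pretest score drops out before the posttest. The record layout and all scores are assumptions for illustration.

```python
# Hypothetical records; posttest is None for a participant who dropped out.
records = [
    {"id": 1, "pretest": 30, "posttest": 29},
    {"id": 2, "pretest": 28, "posttest": 28},
    {"id": 3, "pretest": 32, "posttest": 31},
    {"id": 4, "pretest": 30, "posttest": 30},
    {"id": 5, "pretest": 50, "posttest": None},  # dropout with an extreme pretest score
]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


completers = [r for r in records if r["posttest"] is not None]
dropouts = [r for r in records if r["posttest"] is None]

pretest_mean_everyone = mean([r["pretest"] for r in records])
pretest_mean_completers = mean([r["pretest"] for r in completers])
posttest_mean_completers = mean([r["posttest"] for r in completers])

# Misleading: pretest includes the dropout, posttest cannot.
misleading_change = posttest_mean_completers - pretest_mean_everyone
# Fairer: the same people at both time points.
completers_only_change = posttest_mean_completers - pretest_mean_completers

print("Change using everyone's pretest:", misleading_change)
print("Change using completers only:", completers_only_change)
print("Dropout pretest scores:", [r["pretest"] for r in dropouts])
```

With these numbers, most of the apparent drop comes from losing the extreme scorer rather than from any change in the people who stayed.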

Individual Differences: Solutions

1) Change the design to a within-groups or matched-groups design. 2) Add more participants.

3 Possible Threats to Internal Validity

1) Design confounds 2) Selection effects 3) Order effects. -All three of these threats involve an alternative explanation for the results.

Three Potential Internal Validity Threats in Any Study

1) Observer bias 2) Demand characteristics 3) Placebo effects -These are three more potential threats to internal validity, and they can occur not only in the very bad experiment (one-group, pretest/posttest design) but also in experiments that have a comparison group.

Measurement Error Formula

A child's score on the reading measure can be represented with the following formula: child's reading score = child's true reading ability ± random error of measurement. Or, more generally: DV score = participant's true score ± random error of measurement.

The Really Bad Experiment: Dormitory

A dormitory on a university campus has started a Go Green social media campaign, focused on persuading students to turn out the lights in their rooms when they're not needed. Dorm residents receive emails and see posts on social media that encourage energy-saving behaviors. At the start of the campaign, the head resident noted how many kilowatt hours the dorm was using by checking the electric meters on the building. At the end of the 2-month campaign, the head resident checks the meters again and finds that the usage has dropped. He compares the two measures (pretest and posttest) and finds they are significantly different.

Double-blind Placebo Control Study

A study that uses a treatment group and a placebo group and in which neither the researchers nor the participants know who is in which group.

Weak, Insensitive, Ceiling, Floor

As special cases of weak manipulations and insensitive measures, ceiling and floor effects can cause independent variable groups to score almost the same on the dependent variable.
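
A ceiling effect can be sketched by capping scores at a test's maximum. The "true" ability values below are invented and deliberately sit near or above what a hypothetical easy test (maximum 100) can register.

```python
TEST_MAXIMUM = 100  # assumed top score of an easy test

# Hypothetical true ability levels for two IV groups (made-up numbers).
true_group_a = [98, 104, 110]
true_group_b = [106, 112, 118]


def observed(scores, maximum=TEST_MAXIMUM):
    """Scores as the test records them: nothing above the maximum is possible."""
    return [min(score, maximum) for score in scores]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


true_difference = mean(true_group_b) - mean(true_group_a)
observed_difference = mean(observed(true_group_b)) - mean(observed(true_group_a))

print("True difference between groups:", true_difference)
print("Difference the test can show:", round(observed_difference, 2))
```

Because nearly everyone hits the top of the scale, the groups score almost the same even though their underlying levels differ. A floor effect works the same way at the bottom of the scale.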

The Really Bad Experiment: Dr. Yuki

Dr. Yuki has recruited a sample of 40 depressed women, all of whom are interested in receiving psychotherapy to treat their depression. She measures their level of depression using a standard depression inventory at the start of therapy. For 12 weeks, all the women participate in Dr. Yuki's style of cognitive therapy. At the end of the 12-week session, she measures the women again and finds that, on the whole, their levels of depression have significantly decreased.

The Really Bad Experiment: Nikhil

Nikhil, a summer camp counselor and psychology major, has noticed that his current cabin of 15 boys is an especially rowdy bunch. He's heard a change in diet might help them calm down, so he eliminates the sugary snacks and desserts from their meals for 2 days. As he expected, the boys are much quieter and calmer by the end of the week, after refined sugar has been eliminated from their diets.

Ceilings, Floors, and Dependent Variables

Poorly designed DVs can also cause ceiling and floor effects. The accompanying graph (not reproduced here) illustrates a ceiling effect and a floor effect on the DV and how they can obscure differences between groups (in that example, gender groups).

Ceilings, Floors, and Independent Variables

Sometimes ceiling and floor effects can be the result of a problematic IV, as in the money and mood study, in which all three levels of the IV were very low amounts of money: none, 25 cents, or a dollar.

Power of Large Samples: Advantages

Studies with large samples have two major advantages. 1) Large samples make the CI narrow; they lead to a more precise estimate of any statistic, whether it's a mean, a correlation, or a difference between groups. -Large samples are more likely to lead to statistically significant results (CIs that do not include zero) when an effect is real. 2) Effects detected from small samples sometimes can't be repeated. EX: Imagine a study on online reading games that tested only 10 children. Even if reading games don't work, it's possible that just by chance, three children show a terrific improvement in reading after using them. Those children would have a disproportionate effect on the results because the sample was so small. And because the result was due primarily to three exceptional children, researchers may not be able to replicate it. Indeed, the CI for such a small sample will be very wide, reflecting how difficult it is to predict what the next study will show. -In contrast, in a larger sample (say, 100 children), three exceptional kids would have much less impact on the overall pattern. -Large samples have a better chance of estimating real effects.
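
How sample size narrows the CI can be sketched with the usual normal-approximation formula for a mean (z times SD divided by the square root of n). The SD of 10 is an assumption, and for samples as small as 10 a t-based interval would be somewhat wider than this sketch suggests.

```python
from math import sqrt
from statistics import NormalDist


def ci_half_width(sd, n, confidence=0.95):
    """Half-width of a normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sd / sqrt(n)


ASSUMED_SD = 10  # hypothetical standard deviation of reading scores

small_sample = ci_half_width(ASSUMED_SD, 10)
large_sample = ci_half_width(ASSUMED_SD, 100)

print("n = 10: mean plus or minus", round(small_sample, 2))
print("n = 100: mean plus or minus", round(large_sample, 2))
```

Going from 10 to 100 children shrinks the interval by a factor of about the square root of 10, which is why the larger study gives a more precise estimate.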

