Research Methods Final
In order for a study to have a factorial design, which of the following features must it have? Please check all that apply. a. A measure of the DV b. At least one measured IV c. Two or more IVs, crossed with each other d. At least one manipulated IV
a. A measure of the DV c. Two or more IVs, crossed with each other d. At least one manipulated IV
Which of the following is the best research design if the researcher wants to demonstrate that random assignment created groups that are starting off equivalent on the DV (before the manipulation)? a. Pretest/posttest design b. Repeated measures design c. Posttest-only design
a. Pretest/posttest design
Midway through a survey study at a university, a participant feels very uncomfortable answering some questions about sexuality. The participant gets up to leave, but the researcher explains that she needs to complete the entire survey in order to get credit for it. Which ethical principle does the researcher's behavior violate? a. Principle of respect for people's rights and dignity b. Principle of fidelity and responsibility c. Principle of justice d. Principle of integrity
a. Principle of respect for people's rights and dignity
A study is published reporting a significant positive correlation between circulating testosterone levels and self-reported behavioral dominance in men, r = .23, p < .05. Another study is published a few months later reporting no association between circulating testosterone and behavioral dominance, r = .09, p = .42. The studies used the same methods and measures; however, the first study was conducted on a sample of male undergraduate students from a large Midwestern university in the U.S., whereas the second study was conducted on a sample of male farmers in a small village in rural Poland. Which of the following is/are possible explanations for this discrepancy? a. The strength of the association between testosterone and behavioral dominance in men depends on other factors (e.g., one's lifestyle). b. The second study's finding was a false negative. c. The first study's finding was a false positive.
a. The strength of the association between testosterone and behavioral dominance in men depends on other factors (e.g., one's lifestyle). b. The second study's finding was a false negative. c. The first study's finding was a false positive.
A study reports that 16% of people who drink coffee ultimately get coronary heart disease. Based on this statistic, a popular press article claims, "Drinking coffee increases your risk of coronary heart disease!" Which of the following questions should you ask to evaluate this claim? Please choose the best answer. a. What percentage of people who do not drink coffee get coronary heart disease? b. Were there any participants in the study who drank a lot of coffee per day but did not get coronary heart disease? c. How many cups of coffee did participants drink per day?
a. What percentage of people who do not drink coffee get coronary heart disease?
You're conducting a 2 X 2 mixed factorial design. If you would like 50 participants per condition, how many participants will you need in total? a. 400 b. 100 c. 200 d. 50
b. 100
Angelica finds out that she got a very high score---91 points out of 100---on her last math test. The class average was 83, and the standard deviation was 4. Based on this information, Angelica does a quick calculation and estimates that roughly ___% of her classmates scored at or below her score. a. 84% b. 97.5% c. 34%
b. 97.5%
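The quick calculation behind this answer can be sketched as follows. This is a minimal sketch that assumes the test scores are roughly normally distributed (the question implies this but does not state it); the helper name `normal_cdf` is illustrative, not from the course materials.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Proportion of a standard normal distribution at or below z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

score, class_mean, class_sd = 91, 83, 4

# Angelica's z-score: how many SDs above the mean her score is.
z = (score - class_mean) / class_sd

# Rule-of-thumb estimate: 50% fall below the mean, plus half of the
# roughly 95% that fall within 2 SD of the mean.
rule_of_thumb = 0.50 + 0.95 / 2

# Exact normal-curve value, for comparison (assumes normality).
exact = normal_cdf(z)

print(f"z = {z}, rule of thumb = {rule_of_thumb:.3f}, exact = {exact:.3f}")
```

The rule of thumb and the exact normal-curve value differ only slightly, which is why 97.5% is the closest option.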
Claim: Experiences of sexual harassment are more common among women than among men or non-binary people. What type of claim is this? a. Frequency claim b. Association claim c. Causal claim
b. Association claim
True or False: A non-equivalent groups quasi-experiment satisfies all three criteria for establishing causation. a. True b. False
b. False
True or false: Whenever possible, physiological measures (e.g., hormones, brain activity, heart rate, etc.) should be used instead of self-report measures. Physiological measures are always more valid than self-report measures. a. True b. False
b. False
Which of the following sampling methods requires having a list or registry of all members of the population? a. Quota sampling b. Systematic sampling c. Purposive sampling
b. Systematic sampling
If a variable is normally distributed... a. approximately 68% of scores fall between 0 and 1 SD above the mean. b. approximately 34% of scores fall between 0 and 1 SD above the mean. c. approximately 95% of scores fall within 3 SD above or below the mean.
b. approximately 34% of scores fall between 0 and 1 SD above the mean.
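To check each answer option against the normal curve, the proportions can be computed directly. This is an illustrative sketch using Python's standard library; `share_between` is a hypothetical helper name.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def share_between(z_low, z_high):
    """Proportion of scores falling between two z-scores."""
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

mean_to_plus_1sd = share_between(0, 1)   # between the mean and 1 SD above it
within_1sd = share_between(-1, 1)        # within 1 SD on either side
within_2sd = share_between(-2, 2)        # within 2 SD on either side
within_3sd = share_between(-3, 3)        # within 3 SD on either side

for label, value in [
    ("mean to +1 SD", mean_to_plus_1sd),
    ("within 1 SD", within_1sd),
    ("within 2 SD", within_2sd),
    ("within 3 SD", within_3sd),
]:
    print(f"{label}: {value:.1%}")
```

Roughly 34% of scores fall between the mean and 1 SD above it; 68% is the share within 1 SD on both sides, and 95% corresponds to 2 SD (not 3 SD), which is why options a and c are wrong.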
All else equal, the less statistical power you have, the more likely you are to ________. a. commit a Type 1 error b. commit a Type 2 error c. have a confound
b. commit a Type 2 error
You're interested in whether how much time someone spends outdoors per week is associated with their happiness. You recruit a convenience sample of fellow CU students to take your online self-report survey. The survey includes a single item assessing hours spent outdoors per week and a validated scale assessing happiness. After data collection is complete, you run a test of Pearson's r correlation coefficient to test whether there is a positive association between hours outdoors per week and happiness scores. You find that r = .34, p = .04. Although you have no way of knowing this, there really is a positive association between hours outdoors and happiness in the population. This would make your finding a ______________. a. false negative b. true positive c. false positive d. true negative
b. true positive
Now, you're conducting a 2 X 2 X 2 independent-groups (between-groups) factorial design. If you would like 50 participants per condition, how many participants will you need in total? a. 50 b. 100 c. 400 d. 200
c. 400
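The arithmetic for this question, and for the earlier 2 X 2 mixed design question, can be sketched with one small function. It assumes that only between-groups factors multiply the number of participants needed, because each participant experiences every level of a within-groups factor. The function name and argument layout are hypothetical.

```python
from math import prod

def total_participants(levels, is_within, per_condition):
    """
    levels: number of levels for each IV, e.g. [2, 2, 2]
    is_within: True for each IV that is within-groups, False if between-groups
    per_condition: desired number of participants in each condition (cell)
    """
    # Only between-groups factors create separate groups of people.
    separate_groups = prod(n for n, within in zip(levels, is_within) if not within)
    return separate_groups * per_condition

# 2 X 2 mixed design: one between-groups IV, one within-groups IV.
mixed_2x2 = total_participants([2, 2], [False, True], 50)

# 2 X 2 X 2 independent-groups design: all three IVs are between-groups.
between_2x2x2 = total_participants([2, 2, 2], [False, False, False], 50)

print(mixed_2x2, between_2x2x2)  # 100 400
```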
What should a meta-analyst do to minimize possible bias due to the file drawer problem on their meta-analysis findings? a. Only include published studies in the meta-analysis b. Only include unpublished studies in the meta-analysis c. Include published and unpublished studies in the meta-analysis
c. Include published and unpublished studies in the meta-analysis
Which of the following is an advantage of using a within-group, rather than an independent-groups, experimental design? a. In within-group designs, participants are less likely to guess the hypothesis b. Within-group designs cannot have design confounds c. Within-group designs give you more power to detect a difference between conditions
c. Within-group designs give you more power to detect a difference between conditions
You overhear a classmate say that psychology isn't a "real science" because its findings don't replicate. You aren't sure whether to believe this, so you ask your psychology professor what she thinks. In this case, you are relying on _________ to evaluate your classmate's claim. a. intuition b. experience c. authority
c. authority
A researcher is interested in how attraction contributes to a wide variety of social behaviors. She conducts a study in which each participant is filmed interacting with a confederate. She measures multiple variables during this interaction. For example, to capture the independent variable (that is, the participant's attraction to the confederate), the researcher measured the participant's pupil dilation during their interaction. In this example, attraction is the _____, and pupil dilation is the _____ of attraction. a. construct; conceptual definition b. conceptual definition; operational definition c. conceptual variable; operationalization
c. conceptual variable; operationalization
A 2 X 2 factorial design shows that participating in an exercise program (vs. a control program) increases wellbeing among both adults and adolescents. How would you describe this result? a. There's a spreading interaction between participating in the exercise program and age b. There's a crossover interaction between participating in the exercise program and age c. There's a main effect of age on participating in the exercise program d. There's a main effect of participating in the exercise program on wellbeing
d. There's a main effect of participating in the exercise program on wellbeing
A study reports that higher self-confidence is associated with having a larger number of friends even after controlling for physical attractiveness (β = 0.20, p = .038). What is the proper way to interpret this regression coefficient? a. A one standard deviation increase in self-confidence is associated with a .20 standard deviation increase in number of friends, after controlling for physical attractiveness. b. An increase of 1 point in self-confidence is associated with an increase of .20 friends, after controlling for physical attractiveness. c. A one standard deviation increase in number of friends is associated with a .20 standard deviation increase in self-confidence, after controlling for physical attractiveness.
a. A one standard deviation increase in self-confidence is associated with a .20 standard deviation increase in number of friends, after controlling for physical attractiveness.
At what point can a psychological scientist say that a theory has been proven? a. A theory can never be proven because new evidence that does not support the theory could always emerge. b. The theory has been proven once each of its key hypotheses has been tested and supported. c. The theory has been proven once the weight of the evidence supports the theory (across many studies). d. As long as the study used rigorous methods, a theory has been proven once it has been tested and supported.
a. A theory can never be proven because new evidence that does not support the theory could always emerge.
An educational psychology researcher has a hypothesis that math teachers give more encouragement to male than to female students. To test this hypothesis, he conducts an observational study in a 9th grade math class. Twice a week for 5 weeks, he observes the class and codes the teacher's "encouragement behavior," including how often the teacher smiles, says "good job," and calls on male vs. female students. Unfortunately, this researcher's study is at risk of observer bias. What could he do to reduce this risk? a. Have multiple observers who are unfamiliar with the hypothesis independently code the teacher's behavior b. Observe multiple classrooms with diverse teachers c. Observe each teacher through a two-way mirror so that the teacher does not know when they're being watched
a. Have multiple observers who are unfamiliar with the hypothesis independently code the teacher's behavior
A group of students is conducting a survey for their research methods class. They distribute the survey to friends and classmates via Facebook. The survey asks for the participant's name, gender, age, and responses to a set of questions about their exercise habits. Once the group of students has received the completed surveys, they assign each participant an ID number and strip all other identifying information from their data so that all that is left in the data file is ID number, gender, age, and exercise habits (no names). The student researchers then store a list of names and ID numbers (showing which name goes with which ID number) separately, away from the data, in a locked filing cabinet that only the researchers have access to. This would be an example of a(n)... a. Confidential study b. Anonymous study c. Neither anonymous nor confidential
a. Confidential study
An online news article describes a study in which they found that only 20% of a sample of children who had been diagnosed with Autism Spectrum Disorder passed a widely used Theory of Mind task (the Sally-Anne Task). As a reader, I'm immediately curious how representative their sample was of all children on the autism spectrum. Which of the following kinds of validity am I interrogating? a. External validity b. Construct validity c. Statistical validity
a. External validity
Which of the following claims is construct validity relevant to evaluating? Please check all that apply. a. Frequency claim b. Causal claim c. Association claim
a. Frequency claim b. Causal claim c. Association claim
Match each of the following scenarios with the corresponding type of reliability. Please use each answer option only once. Internal Reliability, Inter-rater Reliability, and Test-retest Reliability a. Two raters independently code videotaped interactions between romantic partners for displays of affection. They then examine agreement between their codes. b. Participants are asked to complete a self-report measure of openness to experience (a personality trait) at "time 1" and again at "time 2" a year later to examine whether their scores correlate. c. To test her new neuroticism scale, a researcher has a sample of participants complete the scale. She then examines Cronbach's alpha to see how well the items "hang" together (correlate with each other).
a. Inter-rater reliability b. Test-retest reliability c. Internal reliability
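As a rough illustration of the internal reliability scenario, Cronbach's alpha can be computed from item-level scores using the standard formula: alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The responses below are made up purely for demonstration, and `cronbach_alpha` is a hypothetical helper name.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: one list per item, holding that item's score for every participant."""
    k = len(item_scores)
    # Each participant's total score across all items.
    totals = [sum(person) for person in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Made-up responses: 3 scale items answered by 5 participants.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.2f}")
```

Higher values indicate that the items "hang" together more closely; items that correlate perfectly with each other yield an alpha of 1.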
In a true experiment, holding everything else (aside from the manipulation) constant helps us to establish which of the following criteria for causality? Please choose the best answer. a. Internal validity b. Covariance c. Temporal precedence
a. Internal validity
Generally speaking, which of the following is/are true of a direct replication? Please choose all that apply. a. It can increase the credibility of the original findings. b. It uses the same operationalizations as the original study. c. It tests the hypothesis or theory in a new context, thereby expanding what we know.
a. It can increase the credibility of the original findings. b. It uses the same operationalizations as the original study.
Which of the following effects could an outlier have on an association between two variables? Please check all that apply. a. It could make it appear that there is no correlation, even though there really is one across the rest of the observations. b. It could make a correlation appear weaker than it really is. c. It could make a correlation appear stronger than it really is.
a. It could make it appear that there is no correlation, even though there really is one across the rest of the observations. b. It could make a correlation appear weaker than it really is. c. It could make a correlation appear stronger than it really is.
Match each piece of information to the appropriate section of an APA paper. Please select the best option, and use each option only once. (You will not use all of the options.) Abstract, Introduction, Method, Results, Discussion, References a. How each variable was measured or manipulated b. Tables or figures visualizing the key findings c. Strengths and limitations of the research d. Explanation of how past theory and research motivated the present research
a. Method b. Results c. Discussion d. Introduction
You turn on the news, and a reporter is covering a study that examined the effectiveness of SSRIs at reducing depressive symptoms. (SSRIs, or selective serotonin reuptake inhibitors, are a widely prescribed class of antidepressant drugs.) The study showed that, among participants randomly assigned to the treatment (SSRI) condition, 40% reported a reduction in their depressive symptoms over the 2-month research period. The reporter concludes that SSRIs are highly effective at treating depression. As a critical consumer of research, should you accept this conclusion? a. No, to evaluate the effectiveness of SSRIs, we would also need to know what percentage of participants in the control group experienced a reduction in depressive symptoms. b. Yes, when dealing with psychological effects, 40% is conventionally interpreted as a large effect size. c. No, research findings are probabilistic, so we cannot draw any conclusions based on a single study.
a. No, to evaluate the effectiveness of SSRIs, we would also need to know what percentage of participants in the control group experienced a reduction in depressive symptoms.
Which of the following statements or information should a consent form include? Please check all that apply. a. Participation is voluntary b. Potential risks and benefits to the participant c. Participants may withdraw at any time d. Participants may request access to the study data at any time
a. Participation is voluntary b. Potential risks and benefits to the participant c. Participants may withdraw at any time
Which measure of effect size is most common for examining the strength of the association between two quantitative variables (rather than, for example, the size of the difference between group means)? a. Pearson's r b. Hedges's g c. Cohen's d
a. Pearson's r
A public health researcher is interested in measuring students' support for mandating mask-wearing in class. Her goal is to collect data that will be simple and straightforward to interpret, easy to analyze, and that will capture the degree of students' support for this policy. Which of the following survey questions is best designed to serve this researcher's goals? a. Rate your agreement with the following statement: I support mandating mask-wearing in class. 1=Strongly disagree, 2=Disagree, 3=Neither agree nor disagree, 4=Agree, 5=Strongly agree b. How do you feel about mandating mask-wearing? Choose any number between 1-100, where 1=Masks should never be required and 100=Masks should always be required c. Do you support or oppose mandating mask-wearing? Support/Oppose
a. Rate your agreement with the following statement: I support mandating mask-wearing in class. 1=Strongly disagree, 2=Disagree, 3=Neither agree nor disagree, 4=Agree, 5=Strongly agree
Match each of the following variables with the type of variable or type of scale. Please use each option only once. Ordinal, Categorical, Ratio, Interval a. Number of stressful life events experienced last year b. Drug vs. placebo c. IQ (intelligence quotient) d. Level of expertise: beginner, intermediate, or advanced
a. Ratio b. Categorical c. Interval d. Ordinal
A coach suspects that the athletes on her track team tend to run faster in the morning than in the evening. She decides to conduct an experiment to test this hypothesis. She asks all of the athletes on her team to sign up for a time slot to have their running recorded. They can choose among early morning time slots (between 7am-10am) and late evening time slots (between 7pm-10pm). All sessions are held at the same indoor track, with everything else held constant (lighting, temperature, noise levels, etc.). The coach uses a stopwatch to record the athletes' running times at all sessions. She then computes the average running time across all of the morning time slots and the average running time across all of the evening time slots. She finds that, on average, athletes ran faster in the morning than in the evening, just as she had predicted. However, there is a serious threat to internal validity here. What is it? a. Selection effect b. Sampling bias c. Design confound
a. Selection effect
A researcher is interested in the effect of physiological arousal on the ability to delay gratification. Her hypothesis is that, when someone is experiencing high levels of physiological arousal, they will be less able to delay gratification (i.e. less able to resist the urge to take an immediate reward, rather than wait and get an even larger reward in the future). To test this idea, the researcher goes to Six Flags amusement park. She approaches people who just finished riding the fastest roller coaster at the park ("high arousal" condition), as well as people who just finished riding a slow-moving tram ("low arousal" condition). She has all of the participants complete a measure of delay of gratification, which consists of a series of hypothetical choices (e.g., "Would you rather have $10 right now or $20 tomorrow?"). She finds that participants in the "low arousal" condition show a significantly greater ability to delay gratification than do participants in the high arousal condition, p < .05. Unfortunately, there is a serious problem with her research design. Which of the following threats to internal validity should the researcher be concerned about? a. Selection effects b. Fatigue effects c. Maturation effects d. Practice effects
a. Selection effects
A group of students taking a course that involves weekly quizzes decides to conduct their own longitudinal study. They're interested in testing their hypothesis that getting more sleep in the week before their next quiz leads to higher quiz scores. Every Sunday for four weeks, each member of the group records the average number of hours of sleep they got per night that week and their score on that week's quiz. They then combine all of their data into one file so that they can examine various correlations. They find that the cross-lag correlations between average number of hours of sleep at earlier timepoints and quiz scores at later timepoints are stronger than are the cross-lag correlations between quiz scores at earlier timepoints and average number of hours of sleep at later timepoints. Although this finding cannot prove _________, it is consistent with _________. (The same word or phrase will go in both blanks.) a. Temporal precedence b. Moderation c. Internal validity
a. Temporal precedence
A researcher conducts a true experiment with random assignment to test the effectiveness of a new drug for reducing blood pressure. She finds that participants in the treatment (drug) condition displayed lower blood pressure than did participants in the placebo control condition, d = -.49. What is the proper way to interpret this effect size (d)? a. The drug reduced blood pressure by an average of nearly 1/2 of a standard deviation compared with the placebo control condition. b. The drug reduced blood pressure by an average of 49 more points than the placebo control condition. c. Blood pressure in the drug condition was roughly half that in the placebo control condition.
a. The drug reduced blood pressure by an average of nearly 1/2 of a standard deviation compared with the placebo control condition.
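To make the interpretation of d concrete, here is a minimal sketch of how Cohen's d is commonly computed: the difference between group means divided by the pooled standard deviation. The blood pressure readings are invented for illustration and are not from the study described; `cohens_d` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(treatment, control):
    """Mean difference (treatment minus control) in pooled-SD units."""
    n1, n2 = len(treatment), len(control)
    pooled_variance = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_variance)

# Invented systolic blood pressure readings for two small groups.
drug = [118, 122, 125, 130, 135]
placebo = [122, 126, 129, 134, 139]

d = cohens_d(drug, placebo)
print(f"d = {d:.2f}")  # negative: the drug group's mean is lower
```

Because d is expressed in standard deviation units, a value such as -.49 means the drug group's mean was about half a standard deviation below the placebo group's mean, not 49 points lower or half as large.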
A researcher just finished collecting data for a project examining the effectiveness of an after school program for increasing students' sense of belonging. She conducts an independent t-test to test whether mean sense of belonging was higher among students who completed the after school program than among students who experienced a control condition. She gets a p-value of 0.05. What can she conclude? a. The probability of observing a difference this large or larger if the null hypothesis is correct is 5%. b. The probability of observing a difference in means this small could only have occurred by chance. c. The probability of her hypothesis being correct, given a difference this large, is only 5%.
a. The probability of observing a difference this large or larger if the null hypothesis is correct is 5%.
You're reviewing a manuscript that has been submitted for publication at a journal. The manuscript presents a study designed to test whether extroversion is associated with the self-reported quality of one's social support network. The study has a fairly large sample, n = 1,417, and used valid measures of extroversion and social support network quality. What concerns you is this: To test the association between extroversion and social support network quality, the authors conducted a multiple regression analysis examining this association while controlling for income, age, and education level. You cannot find any explanation in the manuscript as to why it was necessary to control for these other variables. Based on the information provided here, which of the following problematic research practices should you be most concerned about? a. p-hacking b. HARKing c. data peeking
a. p-hacking
A social psychologist is interested in the effectiveness of various interventions for changing people's attitudes toward legalizing marijuana. He recruits 20 participants from a community where attitudes toward legalizing marijuana are known to vary widely. He randomly assigns 10 participants to read a 1-page article about why legalizing marijuana benefits the community and the other 10 participants to read a 1-page control article about current community programs. Afterward, he has all participants rate their attitudes toward legalizing marijuana on a scale from 1 to 3 (1 = Strongly oppose, 2 = Neutral, 3 = Strongly support). He analyzes the data by conducting an independent t-test and is disappointed to find that condition (i.e. which article they read) did not have a significant effect on participants' attitudes, p = .14. Based on the information given here, which of the following is/are plausible explanations for this null result? Please check ALL that apply. a. This is a false negative; the measure was not sensitive enough to detect the difference between groups b. This is a true negative; there really is no effect c. This is a false negative; individual differences obscured the difference between groups
a. This is a false negative; the measure was not sensitive enough to detect the difference between groups b. This is a true negative; there really is no effect c. This is a false negative; individual differences obscured the difference between groups
A researcher conducts an experiment to examine the impact of exercise on sleep quality. She randomly assigns 5 people to exercise 3 times a week for a month and another 5 people to exercise 1 time a week for a month. At the end of the month, she asks everyone to rate their average sleep quality over the past month. She conducts a statistical test and gets a null result; based on her study, there does not appear to be an effect of exercise on sleep quality. What she doesn't know is that in the broader population, exercise actually does have a positive impact on sleep quality. This means her null finding is a _______. a. Type II error b. Confound c. Type I error
a. Type II error
A researcher is training her team of research assistants to do behavioral coding for a study. She realizes that the research assistants are so familiar with the hypothesis that they already have strong expectations for how the participants will behave. The researcher is worried that the research assistants' expectations could affect how they perceive and code participants' behavior in the treatment vs. control condition. Which of the following would be most effective at reducing this threat to internal validity? a. Use a double-blind or masked design (so that research assistants don't know which condition participants are in) b. Add a placebo condition c. Use random assignment to determine which research assistant codes which participant's behavior d. Have the research assistants do the behavioral coding through a 2-way mirror
a. Use a double-blind or masked design (so that research assistants don't know which condition participants are in)
A researcher conducts a self-report survey study to examine whether jealousy and relationship satisfaction are associated. She finds that people in relationships who report less jealousy tend to report that they are more satisfied with their relationships. Which of the following results would provide the strongest support for this claim? a. r = -.42, p = .032 b. r = -.28, p = .054 c. r = .62, p = .001
a. r = -.42, p = .032
You're having a friendly debate with your friend about whether a Republican or a Democrat will win the next presidential election. She is arguing that a Republican will win, whereas you are arguing that a Democrat will win. To get more information, you do a Google search for "democrat predicted to win in 2024" and find several articles stating that, indeed, a Democrat is projected to win in 2024. Armed with this information, you feel even more confident that you are right. Unfortunately, your approach to seeking information about the 2024 election reflects a common cognitive bias known as _____________. a. confirmation bias b. the present/present bias c. the availability heuristic
a. confirmation bias
People high on neuroticism tend to be more prone to experiencing negative emotions and cognitions (e.g., fear, worry, anger, frustration, jealousy, loneliness, etc.). They also tend to be more bothered by stressors. In addition, people high on neuroticism may appear moody---experiencing sudden shifts between emotional "highs" and "lows." Based on this conceptual definition of neuroticism, researchers developed a new self-report neuroticism scale. The researchers are now trying to assess their measure's construct validity. Match each of the following methods to the type of construct validity it would assess. Please use each option only once. Content Validity, Convergent Validity, Criterion Validity, Face Validity, and Discriminant Validity a. Do the items capture negative emotionality, susceptibility to stress, and moodiness (all of these facets of neuroticism, not just one or two)? b. Do the items appear to assess neuroticism? c. Do scores on our measure correlate positively with relevant behavioral outcomes (e.g., using more negative emotional language, displaying more negative emotional facial expressions)? d. Do scores on our measure correlate positively with scores on an anxiety measure, a related psychological construct? e. Are scores on our measure uncorrelated with scores on measures of unrelated constructs (e.g., intelligence)?
a. content validity b. face validity c. criterion validity d. convergent validity e. discriminant validity
Broadly speaking, it is thought that individuals who possess genetic or other risk factors for a mental disorder are more likely to actually develop the disorder if they are also exposed to stressful conditions. Based on this general idea, a researcher conducting a survey study predicts that, among people who report having a sibling or parent with depression, those who also report having experienced one or more major life stressors in the past 5 years will be more likely to have been diagnosed with depression themselves (as compared with those who report having experienced no major life stressors in the past 5 years). In this example, the general idea that genetic risk factors combine with stressful conditions to trigger the development of mental disorders is a (BLANK), whereas the researcher's prediction regarding what she expects to find in her survey study is a (BLANK). Fill in the BLANK(s).
a. theory b. hypothesis
A developmental psychologist is interested in whether a new critical thinking program for first graders can promote their cognitive development more generally. He goes to a local public school and enrolls all first graders in the program. Before the program starts, he has each child complete a cognitive test to assess their general cognitive ability (Time 1). Then, the children all complete a 3-month critical thinking program. After the program ends, the developmental psychologist returns and has the children complete a slightly different cognitive test (to prevent testing and practice effects) to assess their general cognitive ability again (Time 2). The developmental psychologist analyzes the data and sees that, on average, children's scores on the cognitive test improved significantly from Time 1 to Time 2. When attempting to publish his findings, however, reviewers comment that maturation could provide an alternative explanation for the results. The reviewers ask him to conduct a new study to address this weakness. Which of the following changes to his research design would do the best job of ruling out this alternative explanation? a. Have the students complete the study sessions just 1 month apart, instead of 3 months apart b. Add a control condition; then, randomly assign children to the critical thinking program or to a control program so that you can compare these groups' cognitive development 3 months later c. Conduct the same study on children in multiple different grades (e.g., 1st grade, 2nd grade, 3rd grade) d. Counterbalance the order in which the children take the two different versions of the cognitive test
b. Add a control condition; then, randomly assign children to the critical thinking program or to a control program so that you can compare these groups' cognitive development 3 months later
A researcher is interested in whether people with higher incomes tend to have more sex partners. They recruit a convenience sample of 50 participants from the community. In an anonymous survey, each participant reports their annual income and lifetime number of sex partners, as well as a variety of demographic variables (e.g., gender, age, race/ethnicity, etc.). Once data collection is complete, the researcher conducts a correlational analysis and finds that the correlation between income and number of sex partners is statistically significant, r = .31, p = .04. However, a colleague suggests that the researcher should have statistically controlled for age (which could predict both income and number of sex partners). The researcher conducts a multiple regression analysis with income and age as predictors of number of sex partners. Which of the following results (below) would be consistent with age being a third variable that could explain the association between income and number of sex partners? a. The relationship between age and number of sex partners is not statistically significant. b. After statistically controlling for age, the relationship between income and number of sex partners is no longer statistically significant. c. After statistically controlling for age, the relationship between income and number of sex partners becomes stronger and remains statistically significant.
b. After statistically controlling for age, the relationship between income and number of sex partners is no longer statistically significant.
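The logic of "controlling for age" can be sketched with a first-order partial correlation. This is a simpler stand-in for the multiple regression named in the question, not the same analysis. Only r = .31 comes from the question; the two correlations involving age are hypothetical values chosen for illustration.

```python
import math

def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y, holding z constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

r_income_partners = 0.31  # reported in the question
r_income_age = 0.60       # hypothetical: older people earn more
r_partners_age = 0.50     # hypothetical: older people report more partners

r_partial = partial_correlation(r_income_partners, r_income_age, r_partners_age)
print(round(r_partial, 3))  # close to zero once age is held constant
```

With these assumed values, almost nothing is left of the income-partners association after age is partialled out, which is the pattern described in answer b.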
Previous online self-report survey studies have found evidence for an association between screen time and depressive symptoms among adolescents. You are interested in examining whether this is truly a causal effect (and not merely an association), so you conduct a true experiment with an independent-groups posttest-only design. You recruit a sample of adolescents from a local school district. You then randomly assign each participant to a "high screen time" or "low screen time" condition. Participants in the high screen time condition are asked to spend 8 hours a day or more on their screen for 1 week. Participants in the low screen time condition are asked to spend 4 hours a day or less on their screen for 1 week. After that week, you have all of the participants complete a clinical interview to assess their depressive symptoms. What type of replication is this? a. Replication-plus-extension b. Conceptual replication c. Direct replication
b. Conceptual replication
A cognitive psychologist is interested in the effects of blood glucose levels on cognitive function. They recruit a community sample of 100 participants, age 8 to 90 years old (M = 48.31, SD = 26.34). Participants come from varying educational and other backgrounds. They randomly assign each participant to drink a high-sugar or sugar-free fruit juice drink to manipulate their blood glucose levels. Then, they measure how many word puzzles they can solve in 15 minutes to assess their cognitive function. Each word puzzle has just one correct answer, and the measure has been validated for assessing cognitive function through past research. All testing sessions take place in a tightly controlled lab setting with all other sights, smells, sounds, time of day, researcher, etc., held constant. Based on the information provided, what's the most likely source of within-groups variability? a. Situation noise b. Individual differences c. Measurement error d. Selection effects
b. Individual differences
You assess participants' anxiety at a pretest session. Then, you have all of the participants take an anti-anxiety drug for 6 weeks. At the end of this 6-week period, you assess the participants' anxiety again at a posttest session. You wish to analyze your data to test whether participants' anxiety is lower at posttest than at pretest. Which statistical test should you conduct? a. Independent samples t-test b. Paired samples t-test c. One-way ANOVA with post hoc tests
b. Paired samples t-test
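A minimal sketch of what the paired samples t-test computes: it works on each participant's difference score rather than on two separate groups. The anxiety scores below are hypothetical, and the helper name `paired_t` is an illustration, not a library function.

```python
import math
import statistics

def paired_t(pretest, posttest):
    """Return (t, df) for a paired-samples t-test on matched scores."""
    if len(pretest) != len(posttest):
        raise ValueError("each participant needs both a pretest and a posttest score")
    # One difference score per participant (posttest minus pretest).
    diffs = [post - pre for pre, post in zip(pretest, posttest)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1

# Hypothetical anxiety scores for four participants.
pre = [10, 12, 14, 16]
post = [8, 11, 11, 14]
t, df = paired_t(pre, post)
print(t, df)  # negative t means anxiety was lower at posttest
```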
A relationship scientist is conducting a study to examine blame among unhappy romantic couples by observing how couples interact. Although the researcher has hypothesized that unhappy couples blame one another on a fairly frequent basis when they are alone, the couples do not blame each other in front of her. In fact, they are intentionally more diplomatic toward each other whenever she is there. What does this best illustrate? a. Observer expectancy effects b. Reactivity c. Observer bias
b. Reactivity
A team of educational psychology researchers recently conducted a study of 100 incoming high school freshmen. They randomly assigned 50 of the students to participate in a 1-week intensive math camp just before the start of 9th grade (the "intervention" condition) and the other 50 students to participate in a 1-week nature appreciation camp just before the start of 9th grade (the "control" condition). The researchers then tracked the students throughout 9th grade to monitor their performance in their math courses. At the end of that year, the researchers found that the students in the intervention condition had earned better grades in their math courses, on average, than did the students in the control condition. The researchers concluded that their intervention (math camp) is an effective intervention for promoting math performance among students transitioning to high school-level math. But wait! Which of the following new pieces of information should lead you to doubt the researchers' conclusion? a. A couple of the students who participated in the math camp ultimately failed 9th grade math. b. Students who participated in the math camp were given a free graphing calculator to take home at the end of the camp, whereas students in the control condition were not. c. The student who ended up earning the highest grade in 9th grade math had actually participated in the control condition.
b. Students who participated in the math camp were given a free graphing calculator to take home at the end of the camp, whereas students in the control condition were not.
In science, it is said that a good theory is parsimonious. What does that mean? a. The best theory is the most comprehensive theory (i.e. it explains many different phenomena). b. The best theory is the simplest theory that explains the data. c. The best theory is the simplest theory of those that have been proposed. d. The best theory is the one that is least intuitive.
b. The best theory is the simplest theory that explains the data.
Janessa is conducting an experiment for her senior honors thesis. She was leaning toward using a 2 X 2 X 2 within-groups factorial design, but now she's not sure. She realized this could be fatiguing for the participants, given that this would involve each participant experiencing ____ conditions. a. 2 b. 4 c. 8 d. 6
c. 8
What are some appropriate reasons a researcher might choose to conduct a small-N study? Please check ALL that apply. a. The researcher wants to minimize noise due to individual differences. b. The researcher is NOT interested in generalizing the study's findings to other people. c. The researcher is interested in people with a rare condition (and it won't be possible to recruit a large sample). d. The effect (association or difference between groups) is so large that it can be detected even in a small sample.
b. The researcher is NOT interested in generalizing the study's findings to other people. c. The researcher is interested in people with a rare condition (and it won't be possible to recruit a large sample). d. The effect (association or difference between groups) is so large that it can be detected even in a small sample.
You just finished conducting an online survey study about psychological wellbeing. You run a test to examine Cronbach's alpha for your psychological wellbeing scale, a set of 10 questions you designed to measure wellbeing. You find that Cronbach's alpha = .10. According to conventions in psych science, how should you interpret this? a. Your scale has excellent internal reliability; your items are "hanging together" exceptionally well. b. Your scale has very poor internal reliability; the items are not "hanging together." c. Your scale has marginally acceptable internal reliability (it is right on the edge of being acceptable).
b. Your scale has very poor internal reliability; the items are not "hanging together."
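For reference, Cronbach's alpha can be computed from the item variances and the variance of the total score. The sketch below uses the standard formula on two small hypothetical data sets (three items, five respondents), not the 10-item scale from the question.

```python
import statistics

def cronbach_alpha(responses):
    """responses: one row per respondent, one column per item."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # transpose into item columns
    sum_item_variances = sum(statistics.variance(col) for col in items)
    total_variance = statistics.variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)

# Hypothetical: every respondent answers all three items the same way.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 5, 5]]
# Hypothetical: answers to one item say little about answers to the others.
inconsistent = [[1, 2, 3], [2, 3, 1], [3, 5, 2], [5, 1, 5], [4, 4, 4]]

print(cronbach_alpha(consistent))    # items "hang together" perfectly
print(cronbach_alpha(inconsistent))  # items barely "hang together"
```

The second data set gives an alpha just under .10, the same range as the value in the question and far below the levels usually treated as acceptable.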
An article reports that a study of Canadian 7th-12th grade boys and girls found an association between screen time and symptoms of anxiety and depression. In this study, what is gender? a. a confound b. a variable c. a constant
b. a variable
A health psychologist is interested in the effect of physical exertion on wellbeing. She recruits a convenience sample of 50 participants from her university. She randomly assigns 25 participants to walk on a treadmill at 3 miles per hour for 10 minutes ("low exertion" condition) and the other 25 participants to walk on the treadmill at 3.5 miles per hour for 10 minutes ("high exertion" condition). She is extremely careful to hold everything else constant (room temperature, background noise, etc.). After walking on the treadmill, the participants complete a validated measure of wellbeing. The researcher conducts an independent t-test and is disappointed to find that the difference between her "low exertion" and "high exertion" groups on wellbeing is not statistically significant, p = .08. Although she doesn't know this, however, physical exertion actually does have a positive effect on wellbeing in the population. This would make her finding a ___________. (Please assume we are using a conventional alpha level of .05.) a. true positive b. false negative c. false positive d. true negative Part 2: This question refers to the study described above (see Question 1). Given the information provided about the study's design and methods, which of the following is the most plausible reason why the researcher failed to detect a difference between groups? a. The manipulation was too weak b. There is a design confound c. There was too much situation noise to detect an effect d. There is a selection effect
Part 1: b. false negative Part 2: a. The manipulation was too weak
In our Theory of Planned Behavior Survey, we collected data on PSYC 3111 students' attitude toward exercise and intention to exercise (as well as a variety of other variables). Let's say you analyze these data and find a positive correlation between attitude and intention. Given the research design, which of the following claims can you make? Please check all that apply. a. Attitudes toward exercise influence intention to exercise. b. More positive attitudes toward exercise lead to greater intentions to exercise. c. Attitude toward exercise and intention to exercise were positively associated.
c. Attitude toward exercise and intention to exercise were positively associated.
An educational psychology researcher wants to examine the prevalence of "math phobia" among students taking calculus 1 at his university. There are many different sections of calc 1 (more than 20 sections), so he can't include all of them in his study. Instead, he randomly selects 10 of those sections and then includes all of the students from each of the 10 sections selected for his sample. Which sampling method is this? a. Stratified sampling b. Multistage sampling c. Cluster sampling
c. Cluster sampling
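A minimal sketch of cluster sampling as described in the question: whole sections are drawn at random, and every student in a drawn section is kept. The roster (24 sections of 30 students) and the function name are assumptions made for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly pick whole clusters, then keep every member of each one."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical roster: 24 calc 1 sections with 30 students each.
sections = {
    f"section_{i:02d}": [f"s{i:02d}_{j:02d}" for j in range(30)]
    for i in range(1, 25)
}
sample = cluster_sample(sections, n_clusters=10)
print(len(sample), sum(len(students) for students in sample.values()))
```

If the researcher instead drew a random subset of students within each selected section, that second stage would make it multistage sampling.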
A researcher is interested in whether running improves joint health among older adults. He asks older adults at various retirement homes to complete a survey. The survey includes two questions: (1) "How many times do you go running in an average week?" (0 times; 1 or more times) and (2) "How would you rate your joint health?" (1 = poor, 3 = average, 5 = excellent). The researcher finds that participants who reported going on a run 1 or more times per week reported better joint health, on average, than did participants who reported going on a run 0 times per week, p = .023. Which of the following criteria for causation does this finding satisfy? Please check all that apply. a. Internal validity b. Temporal precedence c. Covariance
c. Covariance
In an experiment, participants are asked to play a computer game in which they will compete against another player. The researcher either tells them that they are playing against a student from their same university (low-competition condition) or that they are playing against a student from a rival university (high-competition condition). In fact, however, the participants are not playing against another person at all. They're actually playing against the computer. The researcher has misled the participants in order to manipulate their competitive motivation. In this example, the researcher is using _________. a. Undue influence b. Deception by omission c. Deception by commission d. Coercion
c. Deception by commission
Theories both inform and constrain the research questions we ask. Which of the following would be a reasonable research question for Harry Harlow to pursue if his objective is to test his Contact Comfort theory of mother-infant attachment? a. Do infants who prefer the wire mother mature more rapidly than infants who prefer the cloth mother? b. Do infants who spend more time with the wire mother than with the cloth mother exhibit signs of poorer immune function (e.g., getting sick more often)? c. Do infants seek out the mother only when they are hungry, or do they seek out the mother at other times as well (e.g., even when they are not hungry)? d. Does temperament (e.g., behavioral inhibition) predict which infants spend more time with the cloth vs. wire mother?
c. Do infants seek out the mother only when they are hungry, or do they seek out the mother at other times as well (e.g., even when they are not hungry)?
Now, you're the editor of a major psychology journal. A group of researchers recently attempted to replicate several studies that were published in your journal, and they were unable to replicate any of them. It has been very embarrassing for your journal; people are starting to question whether the findings reported in your journal are just a bunch of false positives. Looking back at the studies that failed to replicate, you notice that many had rather elaborate hypotheses---for example, predicting two-way or even three-way interactions without any strong theoretical reason to do so. You're concerned that the researchers who conducted those studies may have engaged in HARKing, and unfortunately, your journal's review process did not catch this. What's one action you could take to protect your journal from HARKing in the future? a. Implement a new policy that your journal will only publish studies with large sample sizes (100 participants or more). b. Implement a new policy that your journal will only publish studies if the researchers report detailed information about total sample size and how many participants were excluded for various reasons. c. Implement a new policy that your journal will only publish preregistered studies.
c. Implement a new policy that your journal will only publish preregistered studies.
A researcher just finished conducting an experiment to examine whether a body positivity intervention increases women's body self-esteem. She recruited 100 women from the local community to participate in her study. She randomly assigned fifty of the participants to complete a body positivity program for 1 week and the other fifty to complete a control program for 1 week. After that, she asked all of the participants to complete a self-report scale measuring body self-esteem. The researcher would like to examine whether body self-esteem scores were higher, on average, among participants who completed the body positivity program than among participants who completed the control program. Which statistical test should she use? a. Paired samples t-test b. Chi-square test of independence c. Independent samples t-test
c. Independent samples t-test
Some survey questions are better designed than others. What is the problem with this survey question? "Do you enjoy traveling and meeting new people? Yes/No" a. It's a leading question b. It is never appropriate to use forced-choice questions to assess attitudes c. It's double-barreled
c. It's double-barreled
A social psychologist is interested in examining the effectiveness of an intervention (vs. a control program) at reducing unconscious bias. He has already recruited a sample of 50 participants (roughly 48% women, 46% men, and 6% non-binary). He suspects that gender could make a difference to how people respond to the intervention. Therefore, he wants to make sure his intervention vs. control groups are identical in terms of the number of women, men, and non-binary people they include. Given what you know about the research and its aims, which of the following methods or designs would be most appropriate for creating equivalent groups? a. Random assignment b. Within-group design c. Matched groups
c. Matched groups
A news article reports that the pandemic has generally had a negative impact on people's memory, such that people report more experiences of "blocking" (not being able to retrieve information from our memory when we want to) now than they did before the pandemic. Interestingly, you have noticed that your own memory has improved since the pandemic. Does your own experience invalidate this research finding? a. Yes, this exception disproves the finding. b. Yes, to be valid, a research finding must describe all people's experiences. c. No, research findings are probabilistic.
c. No, research findings are probabilistic.
A consumer behavior researcher is interested in whether using blue vs. aqua packaging increases potential customers' interest in buying a product. The researcher recruits a convenience sample of 100 adults and then randomly assigns each participant to either view images of the product in blue packaging or view images of the product in aqua packaging. After viewing the product, all participants are asked to rate how likely they are to buy the product on a scale from 1 = "not at all likely" to 5 = "very likely." What kind of experimental design is this? a. Pretest/posttest design b. Concurrent measures design c. Posttest-only design
c. Posttest-only design
You're conducting a study to examine the effect of exercise on self-esteem. You're also interested in whether the effect of exercise on self-esteem depends on whether someone is a regular vs. occasional exerciser. You conduct an experiment with a 2 X 2 independent-groups factorial design. Your first IV is exercise condition. This is a manipulated variable with two levels; you randomly assign participants to either run on a treadmill for 15 min. (exercise condition) or sit quietly in a waiting room for 15 min. (no-exercise condition). Your second IV is exerciser status. It is a measured variable with two levels; you have participants self-report whether they are a regular exerciser (defined as exercising at least twice per week on average) or an occasional exerciser (defined as exercising fewer than two times per week on average). Your DV is self-esteem, assessed after the manipulation using a self-report self-esteem scale. You compute the mean self-esteem score for each of the 4 conditions. You find that, among both regular exercisers and occasional exercisers, participants in the exercise condition scored an average of 1.5 points higher on the self-esteem scale than did participants in the no-exercise condition. This effect was statistically significant for both regular exercisers and occasional exercisers. How would you describe this finding? a. Significant main effect of exerciser status on self-esteem b. Significant interaction between exercise condition and exerciser status c. Significant main effect of exercise condition on self-esteem
c. Significant main effect of exercise condition on self-esteem
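The pattern in this question can be checked with simple arithmetic on the four cell means. In the sketch below, only the 1.5-point gap comes from the question; the cell means themselves are hypothetical, chosen so that exerciser status makes no difference. The code describes the pattern only and does not test significance.

```python
# Hypothetical cell means on the self-esteem scale.
means = {
    ("regular", "exercise"): 6.5,
    ("regular", "no_exercise"): 5.0,
    ("occasional", "exercise"): 6.5,
    ("occasional", "no_exercise"): 5.0,
}

def simple_effect(status):
    """Effect of exercise condition within one level of exerciser status."""
    return means[(status, "exercise")] - means[(status, "no_exercise")]

# Main effect of exercise: the simple effects averaged across statuses.
main_effect_exercise = (simple_effect("regular") + simple_effect("occasional")) / 2
# Interaction: the difference between the two simple effects.
interaction_contrast = simple_effect("regular") - simple_effect("occasional")

print(main_effect_exercise, interaction_contrast)
```

Because the effect of exercise is the same size for both kinds of exerciser, the difference in differences is zero, so there is a main effect of exercise condition and no interaction.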
All else equal, the smaller your sample size, the less ____________ you have. a. Internal validity b. Generalizability c. Statistical power
c. Statistical power
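A rough sketch of why smaller samples mean less power, using a normal approximation to a two-group comparison rather than an exact t-test power calculation. The effect size d = 0.5 and the group sizes are hypothetical.

```python
import math
from statistics import NormalDist

def approx_power(d, n_per_group, alpha=0.05):
    """Approximate two-tailed power for comparing two equal-sized groups."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected size of the test statistic if the true effect is d.
    noncentrality = d * math.sqrt(n_per_group / 2)
    return z.cdf(noncentrality - z_crit) + z.cdf(-noncentrality - z_crit)

power_small = approx_power(d=0.5, n_per_group=20)
power_large = approx_power(d=0.5, n_per_group=64)
print(round(power_small, 2), round(power_large, 2))
```

Under these assumptions, 20 participants per group would detect the effect in roughly a third of studies, while 64 per group would detect it in roughly four out of five.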
A political pollster is trying to predict the results of an upcoming New York City mayoral election. She knows from past research that voters' candidate preferences tend to depend on whether or not they are college educated. She gets access to a registry of all residents of New York City. She then randomly selects people from the registry who are vs. are not college educated in proportion to that group's representation in the New York City population. That way, her sample will have a similar educational breakdown to the NYC population as a whole. Which sampling method is this? a. Oversampling b. Cluster sampling c. Stratified random sampling
c. Stratified random sampling
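A minimal sketch of proportionate stratified random sampling: the population is split into strata, and each stratum is sampled at random in proportion to its size. The registry (1,000 residents, 40% college educated) and the function name are assumptions made for illustration.

```python
import random

def stratified_sample(population, stratum_of, n, seed=0):
    """Draw about n people, keeping each stratum's share of the population."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():
        # Rounded quota; totals can be off by one when shares are uneven.
        quota = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical registry: 1,000 residents, 40% college educated.
registry = [{"id": i, "college": i < 400} for i in range(1000)]
sample = stratified_sample(registry, lambda p: p["college"], n=100)
print(sum(p["college"] for p in sample), len(sample))
```

Oversampling would differ only in the quota: a stratum would deliberately get more than its population share, and the results would be reweighted afterward.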
A large online survey study finds that number of hours spent on social media per day is associated with lower wellbeing among people with depression, whereas there is no relationship between number of hours spent on social media per day and wellbeing among people without depression. How would you describe this result? (Note: It may help you to draw a graph of the result.) a. There's a main effect of depression on social media consumption b. There's a crossover interaction between social media consumption and depression status c. There's a spreading interaction between social media consumption and depression status d. There's a main effect of social media consumption on wellbeing
c. There's a spreading interaction between social media consumption and depression status
I'm interested in whether frustration (frustrated vs. not frustrated) affects aggressive behavior and whether the effect of frustration on aggressive behavior depends on the temperature (hot vs. warm vs. cold). If I want to run an experiment with a factorial design to examine this question, how would I describe the design, and how many unique conditions will there be? a. 1 X 1 factorial design; 3 conditions b. 2 X 2 X 2 factorial design; 6 conditions c. 2 X 2 factorial design; 4 conditions d. 2 X 3 factorial design; 6 conditions
d. 2 X 3 factorial design; 6 conditions
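The number of conditions in a factorial design is the product of the numbers of levels, because every level of one IV is crossed with every level of the other. A short sketch that lists the cells for this question (the variable names are illustrative):

```python
from itertools import product

frustration = ["frustrated", "not frustrated"]  # IV 1: two levels
temperature = ["hot", "warm", "cold"]           # IV 2: three levels

# Crossing the IVs: every combination of levels is one condition.
conditions = list(product(frustration, temperature))
for cell in conditions:
    print(cell)
print(len(conditions))
```

The same counting applied to three IVs with two levels each gives 2 X 2 X 2 = 8 conditions.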
A researcher has just finished collecting data for a huge (and expensive) study of over 10,000 participants. They run a t-test to see if there is a difference between the control group and the treatment group on the dependent variable. The result is nonsignificant, p = .07. They can't believe it. Searching for answers, they begin going through the original data file. They notice that some of the participants did not respond to questions about demographics (race, gender, etc.), though these variables were unrelated to the present hypothesis. The researcher excludes participants with any missing demographic data and reruns the t-test. This time, they get a significant finding, p = .048. They're extremely relieved. They let their collaborators know that the result was significant and begin outlining their manuscript. Whether they intended to or not, this researcher has engaged in a highly problematic practice known as ___________. a. Data fabrication b. HARKing (hypothesizing after the results are known) c. Coercion d. Data falsification
d. Data falsification
A health psychologist is conducting a survey to test whether adolescents' attitudes toward smoking are correlated with how often they smoke. She uses multistage cluster sampling to recruit a random sample of 2,500 middle and high schoolers from the local school district. The participants are asked to complete a survey, which includes two scales. The first scale consists of 5 items, which are summed to yield a total score. The items are designed to assess attitudes toward smoking by asking the students to rate their agreement with statements such as, "I think smoking is attractive." Students respond using a 7-point Likert-type scale (1 = Strongly disagree, 4 = Neither agree nor disagree, 7 = Strongly agree). The second scale assesses smoking behavior by asking the students to write in a number to indicate the number of days in the past week they engaged in various smoking-related behaviors. For example, students are asked, "How many days in the past week did you smoke a cigarette?" After data collection is complete, the researcher runs a correlational analysis. She gets a null result, r = .14, p = .19. Looking back at her data, she notices that most of the participants wrote in "0" for their response to most or all of the smoking behavior items. How might this explain her null finding? a. Her null finding may be due to the sample not being diverse enough b. Her null finding may be a false negative due to a design confound acting in reverse c. Her null finding may be a false negative due to a lack of statistical power d. Her null finding may be a false negative due to a floor effect
d. Her null finding may be a false negative due to a floor effect
A researcher is interested in whether using hormonal contraceptives (i.e. "the birth control pill") decreases women's sexual desire. They recruit a sample of 4 women, all of whom have stated that they plan to begin using hormonal contraceptives at some point in the coming year. Participants are asked to complete a questionnaire once per month for the 12-month duration of the study. The questionnaire is designed to assess their sexual desire. All participants start out not using hormonal contraceptives. Then, at the end of Month 2, Participant 1 is asked to begin using hormonal contraceptives; at the end of Month 4, Participant 2 is asked to begin using hormonal contraceptives; at the end of Month 6, Participant 3 is asked to begin using hormonal contraceptives; and at the end of Month 8, Participant 4 is asked to begin using hormonal contraceptives. The researchers then examine changes in sexual desire from before to after each participant began using hormonal contraceptives. What type of small-N research design does this best illustrate? a. Stable-baseline design b. Reversal design c. Single-baseline design d. Multiple-baseline design
d. Multiple-baseline design
An educational psychologist is interested in the effect of an after school reading program on students' reading skills. He has access to data for all of the 5th graders from a local public school, collected over the last academic year. The data include (among other variables) the students' grades on weekly reading quizzes for the entire 28-week academic year, as well as whether they participated in an after school reading program that took place from Dec. through the end of the school year in June. He realizes he can use these data to examine his research question using a quasi-experimental design. He selects the 50 students who completed the after school program (which he calls the "after school group"), as well as another 50 students who did not complete the program (which he calls the "control group"). He examines the students' reading quiz scores. He finds no difference in average reading quiz scores comparing the after school group to the control group for the first 14 weeks of the year, before the after school reading program started. However, he finds that the students in the after school group consistently earned higher average scores on the reading quizzes than did the control group over the last 14 weeks of the year, once the after school program had started. Also, reading quiz scores steadily improved over the last 14 weeks of the year for the after school group, whereas there was little change in scores during this period for the control group. This would be an example of which of the following types of quasi-experimental research designs? a. Nonequivalent groups pretest/posttest design b. Nonequivalent groups posttest-only design c. Interrupted time series design d. Nonequivalent control group interrupted time-series design
d. Nonequivalent control group interrupted time-series design
A health psychologist is interested in the possible effect of having to wait in a long (vs. short) line on attitudes toward the COVID-19 vaccine. She conducts a survey at two local vaccination sites---one that always has long lines and another that always has short lines. In both cases, she approaches people waiting in line and asks them to fill out a brief survey assessing their attitudes toward the vaccine. Afterward, she conducts an independent t-test and finds that people in the longer line had more negative attitudes toward the vaccine than did people in the shorter line. This would be an example of which of the following types of quasi-experimental research designs? a. Interrupted time series design b. Nonequivalent control group interrupted time-series design c. Nonequivalent groups pretest/posttest design d. Nonequivalent groups posttest-only design
d. Nonequivalent groups posttest-only design
A TV show host claims that many serial killers are actually possessed by evil demons. He wants to invite a psychological scientist to come on his show and help design a study to test this hypothesis. Why might it be difficult for him to find a psychological scientist who will agree to do this? a. Psychological scientists consider it unethical to study any form of criminal behavior. b. As empiricists, psychological scientists generally avoid studying small populations (e.g., serial killers) because findings that are only relevant to small populations are seen as unimportant. c. It is generally seen as inappropriate for psychological scientists to collaborate or share their work directly with journalists/the media. d. The hypothesis that serial killers are possessed by demons is unfalsifiable; there are no valid scientific instruments for measuring whether someone is possessed by a demon.
d. The hypothesis that serial killers are possessed by demons is unfalsifiable; there are no valid scientific instruments for measuring whether someone is possessed by a demon.
Which of the following examples illustrates an empirical approach? a. Using logic to solve a word puzzle b. Using a mathematical formula to calculate the amount of force being exerted on an object c. Doing a "gut check" (checking your intuitions) before making an important moral decision d. Tracking people's heart rate to measure their stress while giving a speech
d. Tracking people's heart rate to measure their stress while giving a speech