Psychology 3010 Exam 3 (Bauer)
Temporal Precedence
-1. Manipulate IV -2. Measure DV -example - Pasta study: serving bowl size first manipulated before measuring amount of pasta taken and amount of pasta consumed.
Attrition (aka Mortality) Threat
-A differential loss of participants from the various experimental groups. -example: effects of age on eyewitness memory; what threat to internal validity do you need to worry about? a. attrition; a certain type of participant drops out (e.g., the oldest participants). -To prevent the threat: remove the dropouts' data from the pretest group average.
Testing Threat (aka Repeated Testing)
-A kind of order effect in which scores change over time just because participants have taken the test more than once. -example: retrieval cues can improve memory; can Professor Beach tell his students that this proves the effectiveness of using retrieval cues as a mnemonic strategy? Why or why not? a. Professor Beach cannot tell his students that this proves the effectiveness of using retrieval cues as a mnemonic strategy. Testing threat; students may have recalled more words after the second test of recall because of the retrieval cues or because the list was the same and they were already familiar with the list of words. Avoid the problem by using a different list of words. -To prevent the threat: a. Do not use a pretest b. If a pretest is used, use alternative forms of the test for the pretest and posttest c. Comparison group - a testing effect can be ruled out if the treatment group shows a larger change than the comparison group.
Multiple-Baseline Designs
-A multiple-baseline design staggers the introduction of the treatment across individuals, times, or situations in order to rule out alternative explanations. -Example: In a special education classroom, an overcorrection consequence was first applied when a girl touched her face. Later, the teachers used overcorrection when the girl touched her hair and, still later, when she grabbed objects in the classroom. The graphs show how the rate of each behavior changed when the overcorrection consequence was introduced.
Identifying Factorial Designs in Empirical Journal Articles and Popular Press Articles
-ANOVA (F-ratio) -2 x 2 (or 2 x 3 or 2 x 2 x 2, etc.) -"It depends" *Information about the design of a study can be found in the Method section of an empirical journal article; look for key words such as "factorial design," "analysis of variance (ANOVA)," an F-ratio, or 2 x 2 (or 2 x 3, 2 x 2 x 2, etc.), and also identify the number of IVs. The Results section contains specific statistical information such as means and p-values. *Media articles: look at how many IVs were in the study and look for "it depends" in the description of the results.
Advantages and Disadvantages of within-groups design
-Advantages: -1. Participants act as their own control, therefore the groups are equivalent -2. Researchers have more power to detect a statistically significant result -3. Requires fewer participants *An independent-groups design requires more participants to obtain the same number of data points; a within-groups design requires fewer participants because each participant receives all levels of the IV and provides more data points. -Disadvantages: -1. A within-groups design may not be practical or possible. -2. Demand characteristics (aka experimental demand) - the within-groups design may contain cues that lead participants to guess the hypothesis. -3. Order effects: a. Order effects are a threat to internal validity (a potential confound) in that performance in later levels of the IV might be caused by the order of the conditions rather than the experimental manipulation. b. Order effects: being exposed to one condition changes how people react to later conditions. 1. Practice effects (aka fatigue effects) 2. Carryover effects *Counterbalancing can help control for order effects: complete counterbalancing and partial counterbalancing (e.g., Balanced Latin Square)
Advantages and Disadvantages of Small-N designs
-Advantages: a. Focus on individual performance b. Can tailor treatment for an individual and/or a small group of individuals c. Can examine unusual and rare cases d. Can establish excellent construct validity e. Internal validity can be just as high as that of a repeated-measures experiment with a large group of participants f. Can establish external validity g. Avoid ethical problems (control, placebo groups) -Disadvantages: a. Generalizability b. Ethical issues with reversing treatment
Stable-Baseline Design (aka AB Design)
-Also known as a Simple Comparison Design or an AB design -Measure baseline (A) -Introduce treatment (B) *Observe behavior for a period of time before beginning treatment. Then, after treatment begins, behavior continues to be observed. *A long, stable baseline reduces threats to internal validity; external validity concerns are similar to those of other small-N designs. -Spelling Test Performance Example: a. Did the child's test scores go up because of help from the teacher, or could it be that a parent worked with the child at home?
Reversal Design
-Also known as an ABAB design or a Replication Design -Can test internal validity -Can make a causal statement if the results reverse when the treatment is withdrawn and then reintroduced. *Can examine the effectiveness of a treatment, though it may be unethical to withdraw treatment. *Depending on the situation, it may be unethical to remove the treatment; however, some argue it is ethical to reverse the treatment in order to determine whether the treatment is supported empirically.
Concurrent Measures Design
-An experiment using a within-groups design in which participants are exposed to all the levels of an independent variable at roughly the same time, and a single attitudinal or behavioral preference is the dependent variable.
pretest/posttest design
-An experiment using an independent-groups design in which participants are tested on the key dependent variable twice: once before and once after exposure to the independent variable. -Use if researchers want to be absolutely sure that the groups were equivalent at pretest. -Sometimes the pretest may provide a clue to the hypothesis, and participants respond accordingly.
Instrumentation Threat (aka Instrument Decay)
-Any changes that occur when a measuring instrument changes over time from having been used before (e.g., coders, interviewers). -example: Food choice; can the researchers conclude that the 20-minute interactive video improved children's food choices? a. Researchers cannot conclude that the 20-minute interactive video improved children's food choices, because the researchers changed where they were observing the food choices. The observation location changed between the two measurements, so the ability to see the children's choices differed between them. Researchers need to worry about instrumentation. -To prevent the threat: a. Train interviewers/coders b. Retrain interviewers/coders throughout the study c. Train research assistants d. Consider using a posttest-only design e. Make sure alternative tests are equivalent f. Counterbalance the tests used at pretest and posttest
Internal Validity
-Are there alternative explanations for the results? a. Control for confounds (aka confounding variables) - potential alternative explanations for a research finding; a confound varies systematically along with the IV 1. design confounds (systematic variability) 2. selection effects 3. order effects
Most Theory Testing in Psychology is Done on WEIRD People
-Arnett (2008): a. Top 6 Psychology Journals (2007) b. 68% American c. 96% North America, Europe, Australia, or Israel -WEIRD -Western -Educated -Industrialized -Rich -Democratic
Increasing the Number of IVs
-Can have more than 2 IVs -If more than 2 IVs, look for possible main effects and interactions -Example: 2 (driver age: young vs. old) x 2 (cell phone condition: hands-free vs. no phone) x 2 (traffic condition: light vs. heavy). -Look for 3 main effects - Look for 3 2-way interactions -Look for 1 3-way interaction
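*A quick way to double-check the count of effects to look for (a small Python illustration, not from the notes): with k IVs there is one main effect per IV and one m-way interaction for every combination of m IVs.
from itertools import combinations

ivs = ["driver age", "cell phone condition", "traffic condition"]   # the 3 IVs from the example
for m in range(1, len(ivs) + 1):
    effects = list(combinations(ivs, m))
    label = "main effects" if m == 1 else f"{m}-way interactions"
    print(len(effects), label)   # prints 3 main effects, 3 2-way interactions, 1 3-way interaction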
Maturation Threat
-Changes in biological and psychological conditions that occur as time passes (e.g., development, fatigue, spontaneous remission) -Example: modeling; can we conclude that modeling increased the likelihood with which infants made a neat pincer grasp? 1. We cannot conclude that modeling increased the likelihood with which infants made a neat pincer grasp. The difference could be due to maturation; the fine motor skills required for the pincer grasp likely developed over the 2-month period between the first lab session and the second lab session. We don't know whether modeling increased the likelihood of making the pincer grasp to pick up the peas or whether the child's natural development did. -To prevent the threat - include comparison group(s) a. Can examine threats to internal validity b. For the infant study, the researchers could have included a comparison group in which the researchers did not ask the parents to model the pincer grasp. The addition of a comparison group allows researchers to see whether maturation was a threat.
Nonequivalent Groups Interrupted Time-Series Design
-Combines two designs a. Nonequivalent control group design b. Interrupted time-series design *By adding a control group to the interrupted time-series design, this design can help rule out threats to internal validity such as history, maturation, and other threats.
Interactions are more important than Main effects
-When conducting an experiment using a factorial design, researchers collect data, analyze the data, and may find significant main effects and an interaction. -The most exciting and most accurate story is found by looking at the interaction.
Evaluating the 4 validites in quasi-experiments
-Construct validity - How well were the variables manipulated or measured? -Statistical validity - How large were the differences (effect size)? Were the results statistically significant (p<.05)? -External validity - can these findings generalize? -Internal validity - Are there alternative explanations for the pattern of results? Did the researchers control for confounding variables?
Interrogating causal claims with the 4 validities
-Construct validity: How well were the variables manipulated and measured? a. If needed, did the researchers use a manipulation check? A manipulation check is used to see whether the manipulation was effective. b. If needed, did the researchers conduct a pilot study? Pilot study - a study conducted to test the effectiveness of the manipulation. -External validity: To whom or to what can you generalize the causal claim? a. other people (random sampling) b. other situations *Example: The socially excluded group rated warm food (coffee) as more desirable than those in the socially included condition. *Example: Poor performance in the red color condition a. Researchers often prioritize internal validity over external validity -Statistical validity: How well do your data support your causal conclusion? a. Is the difference statistically significant (p<.05)? 1. Yes, covariance 2. No, the study does not support a causal claim b. How large is the effect? -Internal validity: Are there alternative explanations for the outcome? a. Design confounds? Did some other variable accidentally covary along with the IV? b. Independent-groups design - control for selection effects? c. Within-subjects design - control for order effects?
Evaluating the 4 validities in Small-N Designs
-Construct validity: How well were the variables measured or manipulated? Were the measures reliable and valid? -Statistical validity: Look at trends and by what margin the client's behavior improved. -External validity: Can combine results of small-N designs with other studies. Can examine whether findings generalize across people, situations, and time. A treatment may be useful even if it applies to only a few people. -Internal validity: Can eliminate alternative explanations (measure behavior repeatedly before and after the intervention)
Demand characteristics
-Cues that lead participants to guess a study's hypotheses or goals and change their behavior in the expected direction. -Example: Students' stress levels 1. Students know they are participating in a stress-reduction study and may report less anxiety.
Cultural Psychology
-Cultural psychologists are interested in how cultural settings influence the way a person thinks, feels, and behaves.
Manipulation Checks
-Detect weak manipulations -If the manipulation was effective, look for other reasons for a null effect 1. Ceiling and floor effects 2. There really is no difference -Mood and False Memory example: make sure participants in the sad condition were actually sad after watching the video, happy participants were happy, and neutral participants were neutral. 1. Use a manipulation check to see whether the manipulation was effective. *If there is a null effect in the study, you can rule out the possibility that the null effect occurred because of a weak manipulation; the manipulation check shows that the manipulation was not weak.
History Threat
-Events that occur outside the lab that affect everyone or almost everyone (e.g., significant events, seasons) -Example: Students' stress reduction at the beginning and end of the semester; can I conclude that the stress-reduction workshop decreased anxiety? 1. Due to the threat of history, we cannot conclude that the stress-reduction workshop decreased anxiety. There was another event that coincided with the study (the semester ending). It is unknown whether it was the stress-reduction workshop or the completion of the academic semester that influenced anxiety levels. -To prevent the threat - include comparison group(s) 1. Could have included a comparison group that received no workshop in the stress-reduction study. If that group's anxiety also dropped, the decrease in anxiety results from the semester ending and not from the stress-reduction workshop.
Covariance
-Experiments are able to examine covariance - Is the IV related to the DV? Are distinct levels of the IV associated with different levels of the DV? -All experiments need a comparison group a. Can ask "compared to what?" b. Control group - a "no treatment" or a "neutral" condition 1. Placebo group (placebo condition): examples - sugar pill, saline injection c. Do not have to have a control group - can compare treatment groups d. Must find a statistical difference to state covariance *Have to have at least two conditions; need a comparison group -Example (Pasta): a. Serving bowl size; two conditions: medium and large (IV) b. Amount of pasta served and amount of pasta consumed (DV) c. No control group d. Covariance: established if participants in the large-bowl condition take and consume more pasta than those in the medium-bowl condition.
To be important, must a study have external validity? NO
-Generalization mode: a. Frequency claims (sometimes association and causal) b. Goal is to make a claim about a population c. Ecological validity - want tasks and manipulations to be similar to real-world contexts. d. real world matters *External validity is essential -Theory-testing mode: a. Association and causal claims b. Goal is to test a theory c. Isolate variables d. Prioritize internal validity e. Artificial situations may be required f. Real world comes later *External Validity is not the priority
Which design is better: Posttest-only design or pretest/posttest design?
-If you want to make sure the groups are equivalent at the start of the study, then you should use a pretest/posttest design. -However, in some situations pretest may be problematic (e.g., can guess hypothesis)
complete counterbalancing
-In complete counterbalancing (aka full counterbalancing), all possible orders are presented. -If a researcher has two conditions, let's call them conditions 1 and 2 (or A and B), the researcher would have two possible orders a. 1 then 2 (or AB) b. 2 then 1 (or BA) -The researcher would then randomly assign half of the participants to one order and half of the participants to the other order (see the sketch below). -example: Chocolate task a. IV: two conditions 1. Taste chocolate alone (A) 2. Taste chocolate with a confederate (B) b. DV: Taste rating
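*A minimal Python sketch of complete counterbalancing (the participant IDs and the even split are illustrative assumptions, not from the notes): itertools.permutations lists every possible order of the conditions, and participants are divided evenly across those orders.
import itertools, random

conditions = ["A: chocolate alone", "B: chocolate with confederate"]
orders = list(itertools.permutations(conditions))   # with 2 conditions: AB and BA (3 conditions would give 6 orders)

participants = list(range(1, 13))   # 12 hypothetical participant IDs
random.shuffle(participants)        # randomize before splitting across orders
for i, p in enumerate(participants):
    print(f"Participant {p} gets order: {orders[i % len(orders)]}")   # half get AB, half get BA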
Experiment
-Manipulate at least one variable and measure another
Factorial Designs used to study manipulated or participant variables
-Manipulated variable: cell phone condition -Participant variable: age -In factorial designs, researchers refer to both manipulated and participant variables as independent variables. *Participant variables such as age, eye color, and whether a person attends preschool are not truly independent variables because they cannot be manipulated.
Potential Internal validity threats in one-group, pretest/posttest designs
-Maturation -History -Regression to the mean -Attrition -Testing -Instrumentation
Measurement Error
-Measurement error can inflate or deflate a participant's true score on a DV. a. Let's say I'm interested in examining the effects of music while studying (IV) on later test performance (DV). 1. One participant was drowsy 2. Another participant made good guesses. -DV = true score +/- random error of measurement *The more sources of random error in a dependent variable's measurement, the more variability there will be in the experiment. More precise = less variability *Example: height and weight; several sources of error can influence measurements of height and weight. a. Solution 1: -Reliable measurements (i.e., internal, interrater, test-retest) -Good construct validity -Precision b. Solution 2: -Measure more instances (see the sketch below)
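*A small simulation of the DV = true score +/- random error idea and of Solution 2 (measure more instances); the true score and error SD below are made-up numbers for illustration.
import numpy as np

rng = np.random.default_rng(0)

def measure(true_score, n_instances, error_sd=5.0):
    # each observed score = true score + random measurement error
    observed = true_score + rng.normal(0, error_sd, size=n_instances)
    return observed.mean()

print(measure(75, n_instances=1))    # a single noisy measurement can land far from 75
print(measure(75, n_instances=25))   # averaging many instances lands much closer to 75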
Advantages of Factorial Design
-More than one IV can be manipulated - enables researchers to test and establish multiple influences of behavior. -Can test limits: a. a form of external validity b. Can test for moderators -Can test theories
Nonequivalent Control Group Designs
-Most common type of quasi-experiment (posttest-only and pretest/posttest designs) -Experimental group and control group are not randomly assigned -Participants are not randomly assigned *Group differences may or may not affect the dependent variable (the groups are nonrandom and may therefore differ)
Perhaps within-groups variability obscured the group differences
-Noise (aka error variance) - unsystematic variability within a group a. Measurement error b. Individual differences c. situation variability *Within-groups variability: the variability within each treatment condition.
Potential internal validity threats in any experiment
-Observer bias -Demand characteristics -Placebo effects
Observer bias (aka Experimenter Bias)
-Observers' expectations influence their interpretation of the participants' behaviors or the outcome of a study. -Example: Sports fans tend to see the opposing team's fouls more than their own team's fouls. -Observer bias can affect: a. Internal validity - Is the IV responsible for the change in the DV, or is the change due to observer bias? b. Construct validity - Are the researchers really measuring what they say they are measuring?
Combined Threats
-Occasionally in a pretest/posttest design, selection effects can be combined with a. History threats (selection-history threat) b. Attrition threats (selection-attrition threat) c. Maturation threats (selection-maturation threat)
Simple experiments
-One IV with 2 or more levels -One DV -Example: Strayer and Drews (2004) examined the effect of driving condition (on a cell phone or not on a cell phone) on the amount of time it takes to hit the brakes. a. Drivers using a cell phone were slower to hit the brakes than those not on a cell phone.
The Replication Debate
-Open Science Collaboration (a large group of psychologists) set out to directly replicate findings from 100 studies in 3 psychology journals (Psychological Science, Journal of Personality and Social Psychology, and Journal of Experimental Psychology: Learning, Memory, and Cognition). Published in 2015. a. Findings: 39% of the studies replicated the original effects - "replication crisis" (please note other measures found a higher rate of replication). -Why might replication studies fail? -Problems with: a. Replication attempt b. Original study 1. HARKing - hypothesizing after the results are known. 2. p-hacking - adding participants after the results are analyzed, looking for outliers, running new analyses to obtain a p-value < .05. c. Publication practices -Improvements to scientific practice a. Research journals require larger sample sizes and require researchers to report all of the variables tested and analyses conducted. b. Open science - share materials and data 1. Open Science Framework (OSF): Reproducibility Project: Psychology 2. OSF: Collaborative Replications and Education Project -Pre-registration -Some journals have a section for replication studies
Another Name for these solutions: Power
-Power - the likelihood that a study will show a statistically significant result when the effect is truly present in the population a. Increase power by: 1. Larger samples 2. Accurate measures 3. Strong manipulations 4. Less situation noise *Studies with low power can find only large effects, not small ones. Studies with lots of power can find even small effect sizes. *Low power may produce a null result (see the simulation sketch below)
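*A hedged sketch of what power means, estimated by simulation rather than by any formula from the course: draw many pairs of samples in which a true group difference exists and count how often an independent-samples t test comes out significant. The means, SD, and sample sizes are arbitrary illustrations.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def estimate_power(n_per_group, true_diff, sd=10.0, alpha=0.05, sims=2000):
    hits = 0
    for _ in range(sims):
        control = rng.normal(0, sd, n_per_group)
        treatment = rng.normal(true_diff, sd, n_per_group)
        if stats.ttest_ind(treatment, control).pvalue < alpha:
            hits += 1
    return hits / sims   # proportion of significant results = estimated power

print(estimate_power(n_per_group=20, true_diff=5))   # smaller sample: lower power
print(estimate_power(n_per_group=80, true_diff=5))   # larger sample: higher power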
Balancing Priorities in Quasi-Experiments
-Pros: a. Enables researchers to take advantage of real-world opportunities b. Can enhance external validity c. Allows for applied research when true experiments are not possible (e.g., ethics) d. Whereas variables are only measured in correlational studies, quasi-experimental designs enable researchers to increase internal validity relative to correlational studies by matching participants and/or using comparison groups. -Cons: a. Tend to sacrifice some internal validity for external validity.
Null effects can be hard to find in the literature
-Publication bias: the bias for research journals and the media to publish research studies that show significant results. a. Null results are rarely published
Avoiding selection effects
-Random assignment -Matched-groups designs a. e.g., measure IQ, sort participants by IQ, then assign matched sets of participants across the alcohol conditions; repeat until all participants are assigned to conditions (see the sketch below).
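*A minimal sketch of the matched-groups procedure (the participant names, IQ scores, and three alcohol conditions are invented for illustration): sort on the matching variable, take matched sets the size of the number of conditions, and randomly assign within each set.
import random

random.seed(0)
conditions = ["no alcohol", "moderate dose", "high dose"]              # hypothetical conditions
iq_scores = {f"P{i}": random.randint(85, 130) for i in range(1, 13)}   # participant -> IQ

ranked = sorted(iq_scores, key=iq_scores.get, reverse=True)            # sort participants by IQ
assignments = {}
for i in range(0, len(ranked), len(conditions)):
    matched_set = ranked[i:i + len(conditions)]                        # participants with similar IQs
    assignments.update(zip(matched_set, random.sample(conditions, len(conditions))))

print(assignments)   # each matched set is spread randomly across all conditions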
Direct Replication
-Replicate the original study as closely as possible. a. Same variables, same operationalizations (operational definitions). *Same materials and procedures; however, the participants are different and other small variations occur. *Any flaws in the original study, such as construct validity problems and design confounds, would also be present in a direct replication.
Quasi-Experiments
-Research in which the scientist doesn't have complete control -May not be able to manipulate the IV. -Participants are often selected rather than randomly assigned. *Example: Effects of language delay on later language development a. Have to select participants who have had language delays and compare them to children who do not have language delays at 18 months. Researcher can't manipulate which children will have language delays.
Replicability, Importance, and Popular Media
-Responsible journalists a. consider replicability when presenting the latest studies b. provide information about the entire literature c. place the new study in context with the body of literature d. critically evaluate the original research -Bohannon (2015): journalists who conducted and published a flawed research study a. Randomly assigned 6 dieters to 1. a low-carb diet plus 1.5 ounces of chocolate daily 2. a low-carb diet b. Measured 18 variables (e.g., weight, mood, sleep quality, cholesterol, sodium, blood pressure) c. A few weeks later found that those who ate chocolate had lost weight 10% faster than those in the control group d. Published in International Archives of Medicine e. Problems 1. Small sample size 2. Examined 18 different variables f. Published in a for-profit online journal g. Press release was picked up by at least a dozen popular outlets. *Example: Eating chocolate to lose weight
Interrupted Time-Series Design
-Same group measured over time on some DV before, during, and after the "interruption" caused by some event (IV) O O O O O O O O X O O O O O O O O (O = observation of the DV; X = interruption/event/treatment, the IV)
Replication-plus-extension
-Same variables as the original study plus some new variables
Conceptual Replication
-Same variables, different operationalizations (operational definitions) *Conceptual variable stays the same, operational variable changes. *use different manipulations of IV, and different measures of DV.
Literature Review
-Scientific Literature -describes the research findings on a particular topic and provides a critique. *Example: Lit review on false memory would describe the most common ways to examine false memory, the variables that researchers have examined in terms of false memory, the findings and a critique of previous research and suggestions for future research.
Meta-analysis
-Scientific literature - a mathematical summary of the scientific literature on a particular topic. a. Provides one overall effect size (an average of the results from studies conducted on that topic) *Valuable; combines findings from a variety of studies, direct replications, and conceptual replications into a single average and provides an effect size (small, or weak; medium, or moderate; large, or strong) - see the sketch below -Strengths: a. Published papers are peer-reviewed b. Find an overall effect size c. Can look for moderators d. Tells us the weight of the evidence -Limitations: a. Publication bias: null effects are less likely to be published than studies finding significant effects. b. File drawer problem: a meta-analysis may overestimate the true size of an effect because null effects are less likely to be published and, therefore, less likely to be included in the meta-analysis.
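*A minimal illustration of the "one overall effect size" idea, assuming a simple fixed-effect approach: average the studies' effect sizes weighted by their precision (1/variance). The effect sizes and variances below are invented.
# Hypothetical Cohen's d values and sampling variances from five studies on one topic
effects   = [0.42, 0.55, 0.10, 0.38, 0.61]
variances = [0.04, 0.09, 0.02, 0.05, 0.12]

weights = [1 / v for v in variances]   # more precise studies get more weight
overall_d = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
print(round(overall_d, 2))             # one overall (weighted average) effect size for the literature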
Selection effects
-Selection effects occur when the kinds of participants in one level of the IV are systematically different from those at the other level(s) of the IV. -Examples: a. Allowing participants to decide which condition they want to be in. b. The first to volunteer are placed in one group and the last to volunteer are placed in a different group.
Nonequivalent Control Group Posttest only design
-Similar to the independent-groups posttest-only design discussed in Ch. 10, except participants are not randomly assigned to groups. *Participants are not randomly assigned to different groups; they are selected. *Because the groups are not randomly assigned, we use the term "nonequivalent." *Because the dependent variable is measured only once, at the end of the study, we use the term "posttest only."
Nonequivalent Control Group Pretest-Posttest Design
-Similar to the Independent-Groups Pretest/Posttest Design discussed in Ch. 10 except participants are not randomly assigned to groups.
Individual differences
-Solution 1: Change the design a. Use a within-groups design to control for individual differences. -Solution 2: Add more participants a. Reduces the impact of individual differences within groups and increases the ability to detect differences between groups.
Factorial Designs
-Study 2 or more IVs a. Enables researchers to examine how combinations of factors acting together influence behavior. 1. Factor: another term for an IV 2. Cell: a condition in the experiment *Cell phone condition (on or off the phone) and age (young drivers and old drivers) -Each IV is represented by a number that refers to the number of levels of that IV. -Different factorial designs (see the sketch below for enumerating cells) -2 x 2 a. 1st IV has 2 levels b. 2nd IV has 2 levels c. 2 x 2 = 4 cells (conditions) -3 x 2 a. 1st IV has 3 levels b. 2nd IV has 2 levels c. 3 x 2 = 6 cells (conditions) -2 x 2 x 3 a. 1st IV has 2 levels b. 2nd IV has 2 levels c. 3rd IV has 3 levels d. 2 x 2 x 3 = 12 cells (conditions) -Types of factorial designs: a. Independent-groups (aka between-subjects) factorial design: all factors (IVs) are manipulated as independent-groups. b. Within-groups (aka repeated-measures) factorial design: all factors (IVs) are manipulated as within-groups. c. Mixed factorial design: at least one factor is manipulated within-groups and at least one factor is manipulated between-groups.
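*A small sketch of the levels-multiply-to-give-cells rule, using the cell phone x age example from the notes; itertools.product enumerates every combination of levels.
from itertools import product

factors = {
    "cell phone condition": ["on phone", "no phone"],   # 2 levels
    "driver age": ["young", "old"],                     # 2 levels
}
cells = list(product(*factors.values()))
print(len(cells))   # 2 x 2 = 4 cells (conditions)
for cell in cells:
    print(cell)     # ('on phone', 'young'), ('on phone', 'old'), ...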
Small-N designs
-Study one or a few participants -Typically observe and collect baseline data -Examine the trend to determine the effect *In small-N designs, data for each individual are presented, compared with large-N designs where participants are grouped and data are reported as group averages. *Example: Ebbinghaus forgetting curve; researchers can obtain a lot of information from a small number of people. *Not all small-N designs are case study designs; others use experimental designs on a few research participants. Small-N designs can be used in a wide variety of settings.
Internal Validity in quasi-experiments
-Tend to sacrifice some internal validity for external validity -The ability to make causal claims relies on the design and the results a. Does the design rule out threats to internal validity (i.e., design confounds, order effects, selection effects, maturation, history, attrition, regression to the mean, instrumentation, testing, observer bias, experimental demand, placebo effects)? 1. Are the results consistent with a causal claim? -When a quasi-experiment includes a comparison group and the right pattern of results, researchers can often support a causal claim.
Regression to the Mean (aka Regression threat)
-Tendency for extreme scores to be less extreme on subsequent tests. -To prevent the threat - include a comparison group that was equally extreme at pretest and that did not receive the treatment *Extremely high scores at pretest become lower at posttest and/or extremely low scores at pretest become higher at posttest. -Example: childhood obesity; can the researchers conclude that the 20-minute interactive video improved the children's healthy eating behaviors? 1. The researchers cannot conclude that the 20-minute interactive video improved the children's healthy eating behaviors. Look at the poorest-performing kids' pretest scores; those scores are likely to increase at posttest due to regression to the mean.
Balanced Latin Square
-The Balanced Latin Square (aka Latin Square) is a partial counterbalancing technique in which each condition precedes and follows every other condition equally often. a. Formula for the first row: 1, 2, n, 3, n-1, 4, n-2, ... (see the sketch below) b. Even number of conditions - that number of rows 1. 4 conditions = 4 rows 2. 6 conditions = 6 rows c. Odd number of conditions - twice that number of rows 1. 5 conditions = 10 rows 2. 7 conditions = 14 rows
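*A minimal sketch of the construction described above, assuming the standard recipe for a balanced (Williams) Latin square: build the first row as 1, 2, n, 3, n-1, ..., shift it by one for each later row, and, for an odd number of conditions, add the mirror-image rows to get twice as many rows.
def balanced_latin_square(n):
    # First row: 1, 2, n, 3, n-1, 4, n-2, ...
    first, lo, hi, take_low = [1], 2, n, True
    while len(first) < n:
        first.append(lo if take_low else hi)
        lo, hi = (lo + 1, hi) if take_low else (lo, hi - 1)
        take_low = not take_low
    # Each later row shifts every condition up by one (n wraps back to 1)
    rows = [[(c - 1 + k) % n + 1 for c in first] for k in range(n)]
    if n % 2 == 1:                            # odd number of conditions: also use the reversed rows
        rows += [row[::-1] for row in rows]
    return rows

for row in balanced_latin_square(4):          # 4 conditions -> 4 rows
    print(row)                                # each condition precedes/follows every other equally often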
Interrogating Null Effects
-The IV did not have an effect on the DV. -Example: the amount of money participants receive does not affect their happiness the next day. a. The study was not designed well enough 1. Not enough variability between levels 2. Too much variability within groups b. The IV really does not have an effect on the DV.
Crossover interaction
-The lines cross over each other.
Spreading Interaction
-The lines are not parallel but do not cross; the effect of one IV is larger at one level of the other IV than at the other, so the lines spread apart.
Placebo effects
-The placebo effect is an effect that occurs when people receiving an experimental treatment experience a change only because they believe they are receiving a valid treatment -Examples: 1. Sugar pills 2. Saline injections 3. Sham surgeries 4. Placebo therapies -A double-blind placebo control study can be used to rule out a placebo effect. In this type of study, neither the person administering the conditions nor the participants know whether the participants are in the treatment group or the placebo group. a. Separates the true effects of a particular therapy from the placebo effect. -To identify a placebo effect, researchers need three groups (treatment, placebo, and no treatment); this shows whether improvement is due to the placebo effect and not to other threats to internal validity such as maturation, history, regression to the mean, testing, and instrumentation.
To Be Important, Does a Study Have to Take Place in a Real-World Setting?
-Theory-testing mode often requires artificial settings. -Some lab settings can feel emotionally real. (Experimental realism) -Field setting - study takes place in the real world (high ecological validity; aka mundane realism). a. Priority: external validity -Laboratory setting: a. Can be high in experimental realism b. Priority: internal validity
Interpreting Factorial Designs
-Two kinds of information: -Main effect: the overall effect of one IV, regardless of the other IVs. a. Calculate a main effect for each IV. b. Calculate the marginal means - the means for each level of one IV, averaging across the levels of the other IV. c. Ask: Is there an overall difference? -Interaction: whether the effect of one IV depends on the level of another IV. a. Calculate the difference in differences b. Ask: Is there a difference in differences? c. Significant interaction: "It depends..." (see the worked sketch below)
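*A worked sketch of marginal means and the difference in differences for a 2 x 2 design; the cell means below are invented numbers loosely styled after the cell phone x age example.
# Hypothetical cell means: brake reaction time (ms)
means = {
    ("on phone", "young"): 912, ("on phone", "old"): 1086,
    ("no phone", "young"): 780, ("no phone", "old"): 912,
}

# Marginal means (main effects): average each level of one IV across the levels of the other IV
phone_marginals = {p: (means[(p, "young")] + means[(p, "old")]) / 2 for p in ("on phone", "no phone")}
age_marginals = {a: (means[("on phone", a)] + means[("no phone", a)]) / 2 for a in ("young", "old")}
print(phone_marginals, age_marginals)

# Interaction: is the effect of the phone different for young vs. old drivers?
phone_effect_young = means[("on phone", "young")] - means[("no phone", "young")]   # 132 ms
phone_effect_old = means[("on phone", "old")] - means[("no phone", "old")]         # 174 ms
print(phone_effect_old - phone_effect_young)   # difference in differences = 42 ms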
Unsystematic variability is not the same as confounds
-Unsystematic variability: not a confound, because individual differences do not covary in any systematic way with group membership. -Systematic variability is a confound because levels of a second variable coincide in some predictable way with experimental group membership. a. pesto with the medium bowl, marinara with the large bowl; you can't be sure whether it was bowl size or pasta sauce that affected participants' calorie consumption.
Perhaps There really is no difference?
-Was the study adequately designed to examine the effects of the IV on the DV? -Was the manipulation strong? -Was the DV sensitive enough to detect group differences? -Were there ceiling or floor effects? -Was the DV reliable and precise? -Could individual differences be obscuring the effect of the IV? -Did the study include enough participants to counteract the effects of measurement error and individual differences? -Was the study conducted with appropriate situational controls?
Weak manipulation and Insensitive Measures of the DV
-Weak manipulation of the IV a. How did the researcher operationalize (operationally define) the IV? -Insensitive measures of the DV a. Was the DV sensitive enough to detect a difference if there was one? 1. How was happiness measured? On a scale from 1 to 3?
Why experiments support causal claims
-Well-designed experiments can establish the 3 criteria for establishing causation: A. Covariance - Is the IV related to the DV? B. Temporal Precedence - Does the causal variable come before the effect variable in time? C. Internal Validity - Does the design control for alternative explanations for the results?
Controlling for observer bias and demand characteristics
-What can be done to reduce the likelihood of observer bias and demand characteristics? The researcher(s) can conduct a double-blind study or a masked-design study. a. Double-blind study: neither the participants nor the experimenter(s) know who is receiving which condition. b. Masked design (aka blind design, blinded experiment): the observers do not know the condition to which participants have been assigned, or the participants do not know which condition they are receiving.
Design confound
-a design confound is an experimenter's mistake in the research design in that a second variable happens to vary systematically along with the IV. a. Systematic variability: levels of a variable coincide in some predictable way with experimental group membership *alternative explanations for results
Contrl variable
-a potential third variable that an experimenter holds constant.
Independent groups design
-also known as between-subjects design -different groups of participants are placed into different levels of the IV. a. Each participant receives one and ONLY one level of the IV.
within-groups design
-also known as within-subjects design -each participant is presented with all levels of the IV. *No need to worry about individual differences between conditions in a within-groups design.
Repeated-Measures Design
-an experiment using a within-groups design in which participants respond on a dependent variable more than once, after exposure to each level of the independent variable. a. example: rate the chocolate after each tasting
Posttest-only design
-an experiment using an independent-groups design in which participants are tested on the dependent variable only once. *randomly assigned to independent variable groups
Situational Noise
-situation noise - irrelevant events, sounds, or distractions in the external situation that create unsystematic variability within groups. -Solution: control the external environment (lab)
Dependent Variable (DV)
-the variable being measured to assess the effects of the independent variable.
Independent Variable (IV)
-variable "manipulated" by the experimenter. a. conditions: levels of an IV
Ceiling and Floor Effects
the IV groups score almost the same on the DV because scores are squeezed near the top of the scale (ceiling effect) or near the bottom of the scale (floor effect).