Midterm
External Validity
The extent to which the results of a study generalize to, or represent, people or contexts beyond those in the original study. Generalizability: How did the researchers choose the study's participants, and how well do those participants represent the intended population?
Convergent Validity (empirical way to assess validity)
Strong association between measures of similar constructs.
False positive/ Type 1 Error
A study mistakenly concludes from a sample that there is an association between two variables when there is really no association in the full population. Try to minimize false positives/Type 1 errors: we want to claim associations only when they exist in the population. A simulation sketch follows below.
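A minimal simulation sketch (hypothetical data, Python): when two variables are truly unrelated in the population, a significance threshold of alpha = .05 still flags roughly 5% of samples as significant; those are the false positives.

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_studies, false_positives = 1000, 0
for _ in range(n_studies):
    x = rng.normal(size=50)  # variable A: no real association with B
    y = rng.normal(size=50)  # variable B: independent of A
    _, p = pearsonr(x, y)
    if p < 0.05:             # "significant" despite no true association
        false_positives += 1
print(false_positives / n_studies)  # approximately 0.05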
4 cycles in psychological science
• Theory-data cycle • Basic-applied cycle • Peer review cycle • Journal-to-journalism cycle
Association Claims
• Argues that one level of a variable is likely to be associated with a particular level of another variable (probabilistic, i.e., will not hold in every case). • Variables that are associated may be said to correlate, or covary, such that when one changes, the other is likely to change. Could also say they are related. • An association claim must involve at least two variables, measured, not manipulated. Interrogate: Construct validity (assess the construct validity of EACH variable), external validity (ask whether the association generalizes to other populations/contexts), and statistical validity (no false positives/negatives).
Causal Claims
• Argues that one variable causes a change in another variable. • To support this, the study manipulates the independent variable in order to measure the effect it has on the dependent variable.
Types of bias
• Confirmation bias vs. cherry-picking evidence • Availability heuristic (pop-up principle) • Present/present bias • Bias blind spot
Frequency claims
• Describes a particular rate or level of something. Claims mention the percentage of a variable, the number of people who fit a description, or some group's level on a variable. • Focuses on one variable. • Always measured, never manipulated. • Can be a report of the frequencies of multiple variables, but not of the association between them; in other words, single variables are measured and their frequencies are included in the same report. • Reports the systematic measurement, number, and population percentage of a variable. To evaluate how well a study supports a frequency claim (ex: "The proportion of individuals with high levels of anxiety in 2020 was greater than in 2019"), interrogate construct validity and external validity, and possibly statistical validity.
There are many ways of knowing and sources of evidence
• Experience • Intuition • Authority • Empirical research
Cause for the Nuremberg Code (1947)
• Named for the post-WWII war crimes tribunal at Nuremberg, Germany (1946-47). • Trial of 23 German physicians and scientists "accused of inflicting a range of vile and lethal procedures on vulnerable populations and inmates of concentration camps between 1933 and 1945." Sixteen were found guilty, seven acquitted. Seven were given the death penalty, nine imprisoned.
Different types of articles
• Review • Meta-analysis • Theory • Perspectives. These address an entire topic or area of research, providing synthesis, narrative, perspectives, and recommendations. They often describe history, key findings, controversies, discrepant results, theoretical perspectives, the current state of the field, gaps in knowledge, and ideas for future directions.
Where do you find empirical evidence?
• Scientific journal articles, typically peer-reviewed • Review papers: qualitative/narrative or quantitative (meta-analysis) • Conference proceedings/abstracts - not the entire study. • Findings reported in a book chapter - depends; likely the entire study is not peer-reviewed.
Empirical research
• Uses data as evidence to evaluate questions. • Uses the scientific method to evaluate theories. • Good theories are supported by data, falsifiable, and parsimonious. • Hypotheses are accepted or rejected on the basis of data from studies of observable and measurable events or phenomena.
1975 - Department of Health, Education and Welfare (DHEW) creates regulations to implement NIH's 1966 Policies for the Protection of Human Subjects.
"The Common Rule," requires IRBs for human subjects research
Criteria for causation
1. COVARIANCE: establishes that A-->B.
2. TEMPORAL PRECEDENCE (directionality problem): Which came first, A-->B or B-->A? If we can't tell, then we can't infer causation.
3. INTERNAL VALIDITY (third variable problem): Is there a C variable that is associated with both A and B independently (C-->A & C-->B)? If there is a plausible third variable, we cannot infer causation.
Research process: theory data cycle
1. Find information 2. Identify question 3. Generate testable hypotheses 4. Construct research design 5. Conduct a study 6. Interpret results 7. Communicate results
reliability measurements
1. Test-retest reliability - consistent scores every time 2. Internal reliability - consistent scores on the items of a questionnaire 3. Interrater reliability - consistency across different behavioral coders (see the sketch below)
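A minimal sketch (hypothetical scores, Python, using scipy and scikit-learn) of two of these checks: test-retest reliability as a correlation between two measurement occasions, and interrater reliability as Cohen's kappa between two coders. Internal reliability via Cronbach's alpha is sketched under "Internal reliability" below.

import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import cohen_kappa_score

# Test-retest: the same participants measured twice; high r = consistent scores.
time1 = np.array([10, 14, 9, 20, 13, 17])
time2 = np.array([11, 15, 9, 19, 12, 18])
r, _ = pearsonr(time1, time2)
print(f"test-retest r = {r:.2f}")

# Interrater: two coders rate the same behaviors; kappa = agreement beyond chance.
coder_a = [1, 0, 1, 1, 0, 1, 0, 1]
coder_b = [1, 0, 1, 0, 0, 1, 0, 1]
print(f"interrater kappa = {cohen_kappa_score(coder_a, coder_b):.2f}")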
The Nuremberg Code (1948 international)
10 standards for experiments on human subjects; a new standard of ethical medical behavior for the post-World War II human rights era. • Voluntary informed consent of the human subject (right of the individual to control his/her own body). • Risk must be weighed against the expected benefit. • Unnecessary pain and suffering must be avoided. • Doctors should avoid actions that injure human patients. • The experiment should be such as to yield fruitful results for the good of society, unprocurable by other methods or means of study, and not random or unnecessary in nature. • The experiment should be designed and based on the results of animal experimentation and a knowledge of the natural history of the disease or other problem under study, and the anticipated results should justify the performance of the experiment.
Brief History of Human Research Protections in US
1906: Pure Food and Drug Act (US federal)
1948: Nuremberg Code (international; led to the National Research Act)
1952: APA Code of Ethics (APA)
1964: Declaration of Helsinki (World Medical Association, international)
1966: NIH Office for Protection of Research Subjects (US federal)
1974: National Research Act, led to the Belmont Report (US federal)
1979: Belmont Report (US federal)
3 claims, 4 validities approach
3 Claims: - Frequency (percentage, or number, for a variable) - Association (suggests that two variables are related) - Causal (one variable causes change in another) 4 Validities: - Construct - Statistical - Internal - External
Measuring Behavior: Ethogram
A list of behaviors and their operational definitions, often grouped into structural or functional categories.
parsimonious
All other things being equal, the simplest solution is the best
Cherry-picking evidence (distinction w Confirmatory bias)
Analyzing available evidence in a way biased toward supporting what you already think. Selective presentation of: • Data or points on a graph • Studies, or the representation of the balance of evidence • Viewpoints (includes false balance, false equivalence, or misrepresentation of expert consensus)
How is construct validity evaluated with evidence-based approaches?
Ask questions about: - Reliability (necessary, not sufficient) - Validity
Confirmatory bias
Asking biased questions ("confirmatory hypothesis testing"): asking non-neutral questions, seeking evidence that confirms, and ignoring evidence that argues against.
Major forms of measurement
Behavior: • Behavioral observation • Tasks (tests administered via computer or various mechanical means) • Artifacts of behavior
Survey: • Self-report • Others' report (clinician, parent, teacher)
Physiological/neural: • MRI • Hormones, chemistry, other physiological measures
Behavioral Observations
Behavioral observation is a broad term referring to a wide range of formal and informal techniques to document behavior. Measuring behavior - how do you know what to code? What is the hypothesis? What behavior are you interested in? Think about construct validity.
Differentiating Between Association and Causal Claims
Causal: • Uses causal language: cause, enhance, curb, promote, change, lead to, affect, increase, decrease, prevent, etc. • Causal claims that use tentative language (may, could, seems to, suggests, possible, potential) are still causal claims. Association: linked to, goes with, is tied to, at risk for, prefers, correlated with, associated with, predicts, are more or less likely to.
peer review cycle
Conduct study. Write paper. Submit to journal. The editor decides whether it should be considered and selects reviewers. Outcomes: accepted (published), revision required, or rejected. Peer review and post-publication evaluation are not the only sources of feedback and critical evaluation. Presentations of research and discussion with colleagues: • Lab meetings • Seminars • Scientific conferences. Evaluation continues after publication: • Continued peer critique • Citations (ex: look at Web of Science) • Replication • Weight of evidence.
Pure Food and Drug Act
First legislation to create regulation aimed at ensuring the safety of food and drugs. Animal rule: the FDA requires that treatments, devices, and drugs be evaluated for safety and efficacy in two animal species prior to approval for clinical trials in humans.
Surveys: Types of questions:
Forced-choice format: Pick the best of two or more options.
Operationalized variable
How the conceptual variable was actually measured. Ex: score on the Beck Depression Inventory.
Question interrogation: Specificity
Is what you are trying to understand represented in your questions? Interrogate content and construct validity.
1974 - National Research Act established the National Commission for the Protection of Human Subjects (federal)
Mandated that the PHS develop regulations to protect the rights of human research subjects. This federal law resulted in the requirement for IRBs in the US. One of the charges to the Commission was to identify the basic ethical principles for research involving human subjects.
Variables: Manipulated (vs Measured)
Manipulated: The experimenter controls the levels of the variable by assigning participants to groups or exposing them to different levels of the variable. Examples: different test conditions, stimuli, drug doses. Some variables allow for both manipulation and measurement (examples: music lessons, stress, videogame play).
Experimental / Causal Measure
Measured variable AND manipulated variable: Y (DV) DEPENDS ON X (IV). • Causal claims are strong statements requiring strong evidence, and they can have implications for policy, treatment, and intervention. • By ruling out alternative explanations, experiments can identify causal relationships with more confidence and more potential predictive value. • Well-designed experiments can satisfy all three criteria to support causal claims. (Manipulate, measure)
Descriptive / Frequency Measure
Measured Variable - 1 per claim (NOT Manipulated)
Correlational / Association Measure
Measured variables - 2 or more per claim (NOT manipulated). The variables have a correlation; this is not a causal relation.
Variables: Measured (vs Manipulated)
Measured: Observe and record values. • Can be measured using tools (ruler, scale, MRI). • If abstract, operationalized and measured through test, survey, or observation of behavior. • Some variables measured as categories (hair color, ADHD) Some variables allow for both manipulated and measured: (Example: Music lessons, stress, videogame play)
Reliability and validity
Measurement must be reliable and cannot be valid without being reliable. - Can it be repeated? - Is it precise? However, a measurement can be reliable and/or precise, but not valid.
Surveys: Types of questions: Likert scale
Present a statement; the respondent indicates degree of agreement on an anchored numeric scale. Examples: 1 = Never, 2 = Almost never, 3 = Sometimes, 4 = Often, 5 = Always; or 1 = Strongly disagree to 5 = Strongly agree.
Conceptual variable:
Often abstract. Cannot be measured directly. There may be many ways to measure it. Ex: depression.
Surveys: Types of questions
Open-ended Semantic differential Likert scale Forced choice
Empirical articles
Original report on a scientific study Includes: • Abstract • Introduction • Detailed Method • Results (tables, figures, statistics) • Discussion • References • Published in a scientific journal • Receives peer-review
Belmont Report (1979)
PHS issues Ethical Principles and Guidelines for the Protection of Human Subjects of Research. • Respect for Persons. Individuals should be treated as autonomous agents; persons with diminished autonomy are entitled to protection. • Beneficence. Do not harm, maximize possible benefits and minimize possible harms. • Justice. Fair distribution of harm and benefit
Faking-good / socially-desirable responding
Participants give inaccurate answers because they are embarrassed, shy, or worried about giving an unpopular opinion
Question interrogation (for survey/self report questions)
Poorly worded questions → Measurement Error
Declaration of Helsinki history
Post WWII "The Declaration developed the ten principles first stated in the Nuremberg Code, and tied them to the Declaration of Geneva (1948), a statement of physicians' ethical duties." Viewed as revision of Hippocratic Oath.
Different types of articles: Primary literature
Primary literature reports on a specific original experiment or study. Includes background, hypotheses, methods, results (data), interpretation, and conclusions.
APA Principles (ethics), 1952: How participants are treated and how researchers behave - includes study design, treatment of participants, data, and findings
Principle A: Beneficence and Nonmaleficence Principle B: Fidelity and Responsibility Principle C: Integrity Principle D: Justice Principle E: Respect for People's Rights and Dignity
Declaration of Helsinki (World Medical Association; led to the Belmont Report), 1964-current
Principles centered on right to self-determination, right to make informed decisions, ethical considerations take precedence over laws of any country. • risks should not exceed benefits • informed consent • recognition and protection of vulnerable participants • independent ethics committee • qualified researchers • research with humans should be based on the results from laboratory and animal experimentation
APA Principles: Principle A: Beneficence and Nonmaleficence
Psychologists strive to benefit those with whom they work and take care to do no harm. Psychologists seek to safeguard the welfare and rights of those with whom they interact. • What would you need to know to assess risk? • What would you want to know as a participant? • Must guard against harm to participants in studies. • Design studies to minimize risk. • Include procedures to assess harm and negative consequences. • Balance risk and potential benefit. A problem for beneficence and nonmaleficence: the potential harms of a treatment/study condition may be unknown.
Review article (Qualitative review)
Qualitative review of an entire area of research, synthesis and narrative describing history, key findings, controversies (discrepant results, theoretical perspectives, current state of field, gaps in knowledge, ideas for future directions).
Review article (Quantitative review)
Quantitative review (meta-analysis) of an area of research; uses statistical techniques to quantitatively evaluate the weight of evidence and data in support of a theory. Meta-analysis: Step 1: Collect all possible studies on a particular question. This can include unpublished studies in order to avoid bias. Step 2: Combine the results mathematically to study an overall trend in the data. Can use the average correlation (r) or effect size (d); see the sketch below.
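A minimal sketch (hypothetical study results, Python) of Step 2: one common way to average correlations across studies is to Fisher z-transform each r, take a sample-size-weighted average, and transform back.

import numpy as np

rs = np.array([0.30, 0.25, 0.42, 0.18])  # correlation reported by each study
ns = np.array([40, 120, 60, 200])        # sample size of each study

z = np.arctanh(rs)                        # Fisher z-transform of each r
mean_z = np.average(z, weights=ns - 3)    # n - 3 is the standard weight for Fisher z
print(f"overall r = {np.tanh(mean_z):.2f}")  # back-transform to the r scale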
Observer effect
Relates mainly to participants. Participant behavior changes because of the experimenter/observer; the effect of the observer influences the research participant. E.g., participants react (and act differently) because they are being observed; a teacher is told about students, and the teacher's behavior leads to a change in the students' behavior.
Observer bias
Relates mainly to researchers. Data are affected by experimenter bias: the experimenter's/observer's expectations influence the data they collect about the participant. Observers see what they expect to see; researchers' expectations influence results.
Surveys: Types of questions: Open-Ended
Respondents answer any way they want.
Ways of knowing and sources of evidence: Based on authority
STRENGTH: Could be based on the authority's research (empirical evidence) LIMITATIONS: • Could also be the authority's personal experience • Could also be the authority's intuition Questions to ask: • Is the authority an expert whose conclusions are based in empirical evidence? • Does the authority make claims that are supported by evidence? • What is the source of the evidence? • Does it include what is known, not known, limitations, caveats, weaknesses in the evidence?
Ways of knowing and sources of evidence: Intuition ("makes sense")
STRENGTHS: • Available • Can save time/energy LIMITATIONS: • Sources of bias that interfere with good decisions. • The "good story" - accept conclusion because it "makes sense." • Present-present bias - bias towards what is noticed, not what is absent. • Pop-up principle "availability heuristic" - things that easily come to mind guide our thinking. • Cherry-picking evidence - analyze available evidence in a way biased to supporting what you already think. • Asking biased questions "confirmatory hypothesis testing" - asking non-neutral questions, seeking evidence that confirms, ignoring evidence that argues against.
Psychological Science
The scientific study of psychological processes and behavior in human and nonhuman animals, using a variety of methods, techniques, and tools. Experimentation is common, but correlational and naturalistic observation are also common psychological research approaches.
Ways of knowing and sources of evidence: Experience
Strengths: • Available • Can save time/energy. Limitations: • Central consideration: Compared to what? What is the comparison group? • Compare what would happen with and without. • Confounds: several possible explanations for any outcome. [Remember, alternative explanations are also called confounds.] It is important to isolate variables (factors) in order to rule out alternative explanations. QUESTIONS TO ASK: What is the comparison? Is there an alternative explanation?
These ethical codes or laws address the use of nonhuman animals in research or testing
The Nuremberg Code Declaration of Helsinki APA Code of Ethics
Journal-to-journalism cycle
Three considerations, or questions to ask, about media stories: 1) What are the benefits and risks of the story? Who is affected? How? 2) Is the story important? To whom? Why? How? 3) Is the story accurate? Potential benefit: awareness. Potential risk: if the conclusions are not based on empirical evidence, the story could provide a weak foundation for good decision making.
Measurement validity of abstract constructs
Two subjective ways to assess validity: • Face validity (looks like) • Content validity (contains the parts your theory says it should). Three empirical ways to assess validity: • Criterion (correlated with a relevant behavioral outcome) • Convergent (strong association with measures of similar constructs) • Discriminant (less strong association with measures of dissimilar constructs). See the sketch below.
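A minimal sketch (simulated data, Python; the measure names are hypothetical) contrasting convergent and discriminant validity: two measures of the same construct should correlate strongly, while a measure of a dissimilar construct should not.

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(42)
true_depression = rng.normal(size=100)                         # latent construct
measure_a = true_depression + rng.normal(scale=0.5, size=100)  # depression survey 1
measure_b = true_depression + rng.normal(scale=0.5, size=100)  # depression survey 2
extraversion = rng.normal(size=100)                            # dissimilar construct

print(f"convergent r = {pearsonr(measure_a, measure_b)[0]:.2f}")      # strong
print(f"discriminant r = {pearsonr(measure_a, extraversion)[0]:.2f}")  # near zero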
How can an experiment test the causal claim and address the criteria of temporal precedence and ruling out alternative explanations?
Well-designed experiments can satisfy all three criteria to support causal claims. Manipulate the variable hypothesized to be the cause. (IV) Measure the variable hypothesized to be the effect. (DV)
Possible empirical questions relevant to psychological science:
What effect does recess have on children's behavior? Attention? Learning? Mood and mental health?
Random assignment
With random assignment, variation between participants is controlled for because each group will have similar variation. A sketch follows below.
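A minimal sketch (hypothetical participant IDs, Python): shuffle the participant list, then split it into groups, so that individual differences spread evenly across conditions.

import random

participants = [f"P{i:02d}" for i in range(1, 21)]  # 20 hypothetical participant IDs
random.seed(1)                                      # seeded only to make the sketch repeatable
random.shuffle(participants)                        # randomize order
treatment, control = participants[:10], participants[10:]  # split into two equal groups
print("treatment:", treatment)
print("control:  ", control)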
falsifiable
able to be disproven by experimental results
Present/present bias
Bias towards what is noticed, not what is absent. Example: "I have a psychic relationship with my friend; when I think of my friend, they text me." Present/present: times you thought of your friend and received a text. Ignored: Present/absent - times you thought of your friend and they didn't text. Absent/present - times your friend texted when you were not thinking of them.
Interrater reliability
Different coders give consistent ratings
Construct Validity: Details in methods section
Concerns how well the variables under study have been measured or manipulated; assess how well such measurements were conducted. How were the variables measured, i.e., how were the conceptual variables operationalized? - Frequency claim: How well have you measured the variable in question? - Association claim: How well have you measured each of the two variables? - Causal claim: How well have you measured and manipulated the variables in the study? • Relevance to hypotheses established through previous literature. • Cronbach's alpha provided for self-report/survey measures. • Interrater reliability provided for behavioral measures. • Threat to construct validity: potential observer effects. Ex: guilty behavior is operationalized with a subjective rating by owners (ask about interrater reliability).
Internal reliability
Consistent scores within a questionnaire
Content validity (subjective way to assess validity)
Contains the parts your theory says it should. Related to the number of questions a survey might include; you might want to avoid long surveys because they can lead to fatigue and fast/inaccurate responding, undermining construct validity. Does it capture all parts of a defined construct?
Criterion Validity (empirical way to assess validity)
Correlated with a relevant behavioral outcome: how well the measure predicts a future outcome, or how well the measure correlates with a current outcome.
Internal reliability
Relevant mainly for self-report measures that contain more than one question to measure a construct. Looking for consistency across items. The statistic is Cronbach's alpha (for self-report/survey measures): it calculates inter-item correlations and returns one measure. Closer to 1.0 = higher reliability = justification to combine the separate items into one measure. A sketch follows below.
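A minimal sketch (hypothetical survey responses, Python) computing Cronbach's alpha from its variance form: alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).

import numpy as np

# Rows = participants, columns = items intended to measure one construct.
items = np.array([
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
])
k = items.shape[1]                          # number of items
item_vars = items.var(axis=0, ddof=1)       # variance of each item
total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed scores
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")    # closer to 1.0 = more reliable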
Statistical Validity
The extent to which a study's statistical conclusions are accurate and reasonable; the extent to which the data support the conclusions. STRENGTH of association & STATISTICAL SIGNIFICANCE. Results include p values and betas (analysis of the strength of a relationship while controlling for other variables). The margin of error helps us describe how well our sample estimates the true percentage (see the sketch below).
Frequency: - What is the margin of error of the estimate?
Association: - If the study finds a relationship, what is the probability it is a false alarm? - If no relationship, what is the probability of missing a true relationship? - What is the effect size? How strong is the association? - Is it statistically significant?
Causal: - If the study finds a relationship, what is the probability it is a false alarm? - If no relationship, what is the probability of missing a true relationship? - What is the effect size? If there is a difference, how large? - Is the difference between groups statistically significant?
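A minimal sketch (hypothetical numbers, Python) of the 95% margin of error for a frequency claim about a proportion:

import math

p_hat = 0.38  # sample proportion reporting high anxiety (hypothetical)
n = 1000      # sample size
moe = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)  # 1.96 = z for a 95% confidence level
print(f"{p_hat:.0%} +/- {moe:.1%}")              # about 38% +/- 3.0%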
Criteria for causation: COVARIANCE
extent to which two variables are observed to go together
Criteria for causation: INTERNAL VALIDITY (third variable criterion)
An indication of a study's ability to eliminate alternative explanations for the association. Confounds are also called internal validity problems. The study's method ensures there are no plausible alternative explanations for the change in B; A is the only thing that changed.
Data falsification
Influencing a study's results by selectively deleting observations or by influencing research subjects in the hypothesized direction
1966 - National Institutes of Health (NIH) establishes an Office for Protection of Research Subjects (OPRR)
Issues the report Policies for the Protection of Human Subjects, recommending the creation of Institutional Review Boards (IRBs), ethical review panels.
Discriminant Validity (empirical way to assess validity)
Less strong association with measures of dissimilar constructs. Discriminant validity is shown when measures of theoretically very different constructs produce results that don't correlate; results shouldn't correlate if their constructs are measuring different things.
self report measure
Limitations: potential for biased responses, and potential for attrition bias due to fatigue over repeated measurements with a cell phone
face validity (subjective way to assess validity)
Looks like it measures what it is supposed to measure
False negative/ Type 2 Error
Mistakenly concluding from a sample that there is no association between two variables when there really is an association in the full population. We want to reduce the chances of a false negative/Type 2 error and of missing associations that are really there.
Internal Validity
Not relevant for association claims. The extent to which A, rather than some other variable (C), is responsible for changes in B. Control for internal validity problems with random assignment.
Criteria for causation: TEMPORAL PRECEDENCE (directionality problem)
One variable has temporal precedence when it comes first in time, before the other variable (which variable comes first, to CAUSE the other?). The study's method ensures A comes first in time, before B.
Test-retest reliability
The same participant gets consistent scores over time
Availability heuristic
Pop-up principle: things that "pop up" in our memory and thinking may lead us to overestimate (ex: shark attack headline affects vacation plans). • Singular memorable moments have an outsized influence. Easy-to-recall memories may not be representative and are often not sufficient to generate an accurate picture or accurate probabilities. • Low-quality information forms the basis of decisions.
Question interrogation: Double-barreled questions
A question that is actually asking more than one question. ❑ Look out for words like AND, OR, and BUT.
Question interrogation: Double negatives
The question prompt contains more than one negative word (e.g., oppose, not, disagree). ❑ Is it clear what agree and disagree responses indicate?
Question interrogation: Leading questions
Questions posed in a way that biases results. ❑ Minimize ambiguous words (e.g., could, should, might). ❑ Minimize strong wording (e.g., force, well, poorly).
Surveys: Types of questions: Semantic differential format
Rate an object using a numeric scale anchored with adjectives. Example: Easy 1 2 3 4 5 6 Hard.
Validity
refers to the appropriateness of a conclusion or decision, and in general, a valid claim is reasonable, accurate, and justifiable
Question interrogation: Response options
❑ Are your response options mutually exclusive? ❑ Are your response options exhaustive? ❑ Do your scale endpoints allow for a neutral position? ❑ Do you have "prefer not to respond" options? ❑ Are you collecting demographic data respectfully (e.g., only what is needed, worded appropriately)?
Question interrogation: Response order
❑ How are you combating recency and primacy effects for response options? ❑ Are your scales consistent in numbering and labels, as appropriate?
Question interrogation: Question order
❑ How might responding to questions about X influence responses to questions about Y later on? ❑ Are you ordering questions from broad to specific?