Psych 200 Research Methods Exam 1


Take home messages of ethics

- All research must be evaluated for ethical appropriateness (IRB!)
- Some inherent "gray zones" in determining what is (un)ethical in research
- Ultimately, it is the researcher's job to make ethical behavior a priority

Scale Midpoints

- A scale midpoint can carry many meanings: Neutrality? Indifference? Ignorance? Ambivalence? N/A?
- However, the scale designer can clarify this by changing the wording of the midpoint option!

Measure of happiness example

- Count # of smiles or count # of blinks
- Which is reliable and valid?

Types of Measures

1. Observational 2. Physiological 3. Self-report 4. Archival

Deception

- Acceptance/use varies by field
- Prevents participants from knowing the true study purpose
  * Use of confederates
  * False feedback
  * Presenting 2 related studies as unrelated
- Objections
  * "Lying & deceit are WRONG!"
  * Undesirable consequences

Inter-item Reliability

- A measure of whether the individual questions in a question set are consistent in their results
- Consistency among scale items
- Aka internal consistency
  * Each item should be "tapping into" and measuring the same general construct
- Most commonly used technique to quantify this: Cronbach's alpha
  * Items are all correlated and the measure is reliable if Cronbach's alpha is > 0.70
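Cronbach's alpha has a simple closed form: alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores), where k is the number of items. A minimal sketch in Python (the responses below are made up for illustration):

```python
import numpy as np

def cronbach_alpha(items):
    """items: 2-D array, rows = participants, columns = scale items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of participants' summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical responses: 5 participants x 3 items on a 1-5 scale
scores = [[4, 5, 4],
          [2, 2, 3],
          [5, 4, 5],
          [3, 3, 3],
          [1, 2, 1]]
alpha = cronbach_alpha(scores)
print(round(alpha, 2))  # well above the 0.70 rule of thumb for these toy data
```

Note that alpha rises when items covary strongly (all "tapping into" the same construct), which is exactly the intuition behind internal consistency.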

test-retest reliability

- A method for determining the reliability of a test by comparing a test taker's scores on the same test taken on separate occasions
- Measure participants on two separate occasions
  * Correlate the scores
  * Best suited to stable traits, i.e. personality traits, etc.
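Operationally, test-retest reliability is just the correlation between the two testing occasions. A small sketch (the scores are invented for illustration):

```python
import numpy as np

# Hypothetical extraversion scores for 6 people tested twice, 4 weeks apart
time1 = np.array([12, 18, 9, 15, 20, 11])
time2 = np.array([13, 17, 10, 14, 19, 12])

r = np.corrcoef(time1, time2)[0, 1]  # Pearson correlation between occasions
print(round(r, 2))  # near 1.0 here, consistent with a stable trait
```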

*** Informed consent ***

- An ethical principle that research participants be told enough to enable them to choose whether they wish to participate in a study
- VOLUNTARILY agreeing to participate in research (giving CONSENT)
- But participants can only give consent after learning enough to make a knowledgeable decision (an INFORMED decision)
- AND a debriefing must follow the experiment
Researchers MUST tell participants:
1. The RISKS
2. That they can stop or withdraw from the study at any point WITHOUT penalty
3. The researcher's contact info

Reliability

- Consistency/dependability of a measure
- "Am I measuring SOMETHING?"
- True-score variance / total variance
  * Sound familiar?
- Estimates usually obtained via CORRELATIONS between measures of the same attribute
  * "How similar are the measures' results?"
  * The correlation coefficient (r) can range from .00 to 1.00
- "Sufficient reliability" ≥ .70

Data Analysis & Presentation

- Data cleaning & deletion
  * What's OK? What isn't?
- Over-analysis & p-hacking
  * Inflated Type I error
- Selective reporting
- Post-hoc theorizing (HARKing)
  * Falsifiability issue

Confidentiality of study

- Data used only for research purposes (not divulged to others)
- Ensure anonymity
- No names/personal identifiers saved with data
- Collect data in a secure location

Face Validity

- Extent to which respondents can tell what the items are measuring
  * Does your measure look like it is testing what it is actually supposed to measure?
  * On its face, it appears to measure what it should be measuring
- How much should we care about face validity?
  * It matters if you don't want participants to know what you are measuring; you can use deception here

Scientific Misconduct

- Fabrication, Falsification, & Plagiarism
  * Serious, blatant dishonesty
  * Ex: Diederik Stapel case
- Questionable Practices
  * Authorship issues
  * Withholding data/inconsistent findings
- Unethical Behavior
  * Harassment, intimidation, discrimination

Reliability and Validity

- In order for a measure to be valid, it must be reliable
- You cannot have a measure that is valid but unreliable
- A MEASURE NEEDS TO BE RELIABLE BEFORE IT CAN BE VALID
- Ex. phrenology: head shape and intelligence
  * A reliable measure, because head shape doesn't change over time, but not a valid one, because head shape has nothing to do with intelligence

Physiological measures

- Measures of bodily responses, such as blood pressure or heart rate, used to determine changes in psychological state
- How do physiological measures relate to psychological outcomes?
- Examples:
  1. Electroencephalogram (EEG): records the electrical activity of the brain
  2. Electromyograph (EMG): records the electrical activity produced by skeletal muscles and nerves
  3. Neuroimaging, such as fMRI: detects changes in brain activity by measuring blood flow in the brain
  4. Involuntary responses
     * Heart rate, skin conductance (skin momentarily becomes a better conductor of electricity when physiologically arousing external or internal stimuli occur)
  5. Blood and saliva assays
- Ex. Melissa's stress study
  * Participants performed a "stressful public speaking" task and researchers collected saliva samples pre and post speech
  * Do cortisol levels change? Increased right before and right after the public speaking task
- Pros?
  * Less reactivity, because less subjective
  * More objective: your body's response
- Cons?
  * Expensive to measure
  * Invasive
  * Interpretation difficulties

systematic vs measurement error

- Observational error (or measurement error) is the difference between a measured value of a quantity and its true value
- Systematic errors are errors that are not determined by chance but are introduced by an inaccuracy (in the observation or measurement process) inherent to the system

Discriminant Validity

- Scores on the measure are not related to other measures that are theoretically different - No correlation between measures/ variables that are unrelated - Ex. happiness & conscientiousness

Validity

- The ability of a test to measure what it is intended to measure - Accuracy of a measure - Does observed score reflect ABC of interest? - Need consistency of a measure first before measuring validity - "Am I measuring what I think I'm measuring?"

****interrater reliability****

- The amount of agreement in the observations of different raters who witness the same behavior
- Content analysis: qualitative data
- Nominal data: kappa coefficient (chi-square)
- Ordinal data: Spearman r; Kendall's tau
- Approx. interval/ratio: Pearson r
- 3+ raters: intraclass correlation coefficient
- Scores of 0.70 or higher = "sufficient reliability"
  * Slightly lower bar (0.60) for:
    o Kappa
    o Very difficult rating scales
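For nominal codings by two raters, Cohen's kappa corrects raw agreement for the agreement expected by chance from each rater's category frequencies. A minimal sketch (the clip codings are hypothetical):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning nominal categories."""
    n = len(rater_a)
    # proportion of trials on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # chance agreement from each rater's marginal category frequencies
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Two raters code 10 video clips as "smile" or "no"
a = ["smile", "smile", "no", "smile", "no", "no", "smile", "no", "smile", "no"]
b = ["smile", "smile", "no", "no",    "no", "no", "smile", "no", "smile", "no"]
kappa = cohens_kappa(a, b)
print(round(kappa, 2))  # 0.8: above the 0.60 bar noted for kappa
```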

Construct Validity

- The extent to which variables measure what they are supposed to measure - Correlations among multiple measures - Very important for hypothetical constructs

Convergent Validity

- The measure should correlate strongly with other measures of the same construct - Correlates with expected measures - Ex. happiness and satisfaction; happiness and depression

When to waive informed consent? When is it not needed?

- When there is VERY minimal risk
- When there are no adverse effects on participants' rights and welfare
- When it is not feasible to conduct the research with a consent form
  * However, the IRB determines and approves when this is acceptable
  * Example: a group trying to look at conformity between men and women asks people to sign up for something, where one sign-up list has many names and the other has none; here it wouldn't make sense to obtain informed consent beforehand, because that would bias the results and make it impossible to measure what you want to measure

Response Formats: Rating Scales

- "To what extent do you like or dislike Emory?"
  * Strongly dislike - Moderately dislike - Neither - Moderately like - Strongly like
- Bipolar adjective scale (semantic differential)
  * Used in surveys to gauge people's feelings toward a particular subject: measures one's connotations toward something
  * Semantic differential questions simply ask where the respondent's position is on a scale between two bipolar adjectives, such as "Happy-Sad," "Creamy-Chalky," or "Bright-Dark"

Interrater Reliability (IRR)

- A measure of the extent to which two or more raters of the same behavior or event agree on what they observed
- Important in observational methods: it is difficult to replicate a study, or for a study to be accurate, if reliability is low
- How can we improve IRR?
  * Good, detailed, and precise working operational definitions of variables
  * Rater training

Observational Research

- Direct observation of a behavior
  * In person or via recordings
- 3 decisions to make here:
1. Natural vs. Contrived
  - Naturalistic: NO active intervention/setting arrangement
    * Pure participant observation: researcher does the same activities as those being observed
    * Problems with this? Limited control over confounding variables
  - Contrived: you arrange the setting/conditions in which you observe people's behaviors
    * Lab (ex. Milgram shock experiment)
    * Field (ex. Stanford prison experiment)
2. Undisguised vs. Disguised
  - Undisguised: participants know that they are being observed
    * Reactivity concerns: people know you are watching them
  - Disguised: participants are not aware that you are observing them
    * Possible ethical concerns here
  - Alternatives to disguised observation:
    A) Partial concealment: participants know they are being observed but not why
       * Lowers reactivity concerns but does not fully eliminate them
    B) Knowledgeable informants: people who know the participants well observe and rate participant behavior
    C) Unobtrusive measures: measures that can be taken without participants knowing they are being studied
       * Ex. garbage contents, wear on books, Netflix queue, etc.
3. Behavioral Recording: how to observe a participant and record their behavior
  - Narratives or field notes: qualitative, not quantitative --> content analysis
  - Checklists: ex. did the person laugh, cry; are they a man, woman, etc.
  - Rating scales: intensity or quality
  - Temporal measures

Number of Response Options

- For a rating scale, 5-9 options is the SWEET SPOT!!
- Examples:
  * "Do you like dogs?" -- _____ Yes _____ No (too few options)
  * "To what extent do you like or dislike dogs?" -- Strongly dislike, moderately dislike, neither, moderately like, strongly like (in the sweet spot)
  * "On a scale from 1-100, how much do you like dogs?" (too many options)

Experience Sampling

- Participants report their ABCs (thoughts, feelings, and behaviors) in "real time" during daily life; data collection happens in the moment
  * Instead of asking how you felt about X 2 weeks ago, etc.
- Benefits?
  * Reduces memory bias
  * Tracks info that's hard to collect in a lab / via observation
- Diary method: people write down how they feel in that specific moment
  * Ex. sleep studies: write down what you feel like when you cannot sleep, etc.
- Computerized ESM
  * Ex. Meg's food intake study
  * Could be an app on your phone that surveys you every time you do X, rating how you feel about X every hour or so

Coercion

- People cannot be pressured into participating in a study
  * This refers to real or implied pressure
- Participants can decline or quit the study at ANY time
- NO EXCESSIVE incentives
  * Excessive would be $500 per hour or something like that
- Equitable alternatives
  * For example, in Psych 110 we could write summaries of articles if we did not want to participate in psych studies

How good is our measure? Survey, questionnaire, etc.?

- The measure should assess behavioral variability accurately
- Observed Score = True Score + Measurement Error
  * True Score --> systematic variance
  * Measurement Error --> error variance
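The decomposition Observed = True + Error can be simulated: each observation adds random error to a fixed true score, and because random error averages out, the mean of many observations hugs the true score. A toy sketch (all numbers invented):

```python
import random

random.seed(1)  # reproducible run

TRUE_SCORE = 95  # the participant's fixed "true" creativity score

def observe():
    """One measurement: true score plus random measurement error."""
    error = random.gauss(0, 5)  # transient moods, room temp, scoring slips...
    return TRUE_SCORE + error

one_shot = observe()  # a single noisy observation may land well off 95
mean_of_many = sum(observe() for _ in range(10_000)) / 10_000
print(round(one_shot, 1), round(mean_of_many, 1))  # the average recovers the true score
```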

Invasion of Privacy

- This is up to the discretion of the researcher (and IRB!)
- Observation in public areas is okay
  * Ex. the eye-gazing naturalistic experiment we did; the participants didn't know we were observing them
- Observation in places where people expect privacy is NOT OKAY
  * Ex. dorms, bathrooms, etc.
- Private vs. public?
  1. Urinating in a public restroom
  2. Kissing in a park
  3. Purchasing condoms
  * These examples are tricky because the behavior might be considered "private" even though the people themselves are in public places

Triangulation/converging operations

- Triangulation: using multiple operational definitions of a variable
- Using various and diverse measures to assess one construct so researchers can more accurately assess the variable of interest
  * I.e. have multiple operational definitions of happiness and of how you can measure it; this gives broader and more accurate data on how the participant relates to the construct you are measuring
- Determining the validity of a measure based on its relation to other measures that all measure the same construct
  * For example, in a study where researchers wanted to assess the effects of writing about a traumatic event on participants' health, they used observational, self-report, and physiological measures

Archival Measures

- Using pre-existing data
  * Usually NOT collected for research purposes
  * For example, data from the census, etc.
- Useful for studying changes over time or rare events (e.g. riots or mass murders)
- Limitations:
  * Data validity/reliability concerns
  * Typically insufficient information and very specific

Temporal Measures

- When and how long a behavior occurs
- Recording how long (normally in min, sec, etc.) a certain behavior occurs in response to a stimulus or variable

What is wrong with the question exercise?

1. "Do you like food?"
  - Too general and ambiguous: what food specifically are you talking about?
2. "At what velocity do you ambulate?"
  - Needlessly complicated and confusing --> really asking "How fast do you walk?"
3. "How often do you walk your dog?"
  - Operates on the assumption that you have a dog in the first place
4. "Do you eat healthy and exercise regularly?"
  - A two-part (double-barreled) question
  - It is possible to do one without the other; I can eat healthy regularly but not exercise, and vice versa

How would you handle: in terms of informed consent and what you would/would not tell the participants

1. ...survey investigating the relation between political ideology, attitudes toward abortion rights, and SES?
2. ...experiment in which some people will be exposed to noxious odors to examine the effect of disgust on biased attitudes toward outgroups?
3. ...experiment in which participants receive false failure feedback to see impacts on self-esteem?
4. ...priming people with scary/pleasant images to investigate the impact of mood on helping behavior?
5. ...after filling out paperwork and en route to another room, a participant is bumped by a confederate to see how they react?

Ethical Perspectives

1. Deontology: the action itself is right or wrong, regardless of the consequences of the action
  - "The ethics of the action is black and white"
2. Ethical skepticism: "it depends on the action"
  - We can never know which moral claims are true, and it is hard to know what to believe in terms of ethics
3. Utilitarianism: cost-benefit analysis of ethics
- Ethical guidelines come from:
  * Professional organizations (ex. the APA)
  * Federal, state, and local laws

Response Formats: Fixed-Alternative Response

1. Dichotomous: giving the participants two choices
  - "I brushed my teeth this morning" -- true or false?
2. Multiple choice: several answers the participants can choose from
  - "How many times did you brush your teeth this morning?" -- 1, 2, 3, or 4?

Self-Report Measures

1. Questionnaires: the most commonly used psych measure
  - Pros?
    * Easy, cheap, and simple to administer to a group, so very efficient
    * Anonymity
2. Interviews
  - Many varied forms of interviews
  - "Interviewer influence"
    * The interviewer's own biases/expectations/etc. could shape the way they interact with participants and the participants' answers
  - Pros?
    * Possibility for follow-up questions
    * Can interpret body language, etc.
    * Keeps people engaged: a level of engagement and motivation that doesn't come with a questionnaire
    * Can reach special populations
      • Young kids (pre-literate population)
      • People who don't have access to a computer, etc.
      • People who are illiterate or speak another language
    * Ensure understanding of questions
    * Probe for elaboration of answers

Biases in Self-Reporting

1. Social desirability: the desire to appear socially "normal" and not give abnormal/taboo answers to questions
  - Ex. "How bad of a friend are you?" -- how would you want to respond to this question?
  - Solutions?
    • Neutral wording
    • Anonymity
2. "Satisficing": accepting an available or alternative option as satisfactory (Krosnick)
  - Respondents are not as invested as the researcher, so they give "good enough" (satisficed) responses and not necessarily their true responses
  - Ex. saying yes to everything, saying no to everything, picking the middle rank every time, just trying to get through the survey fast, etc.
  - Solutions?
    • Reverse-coding
    • A mix of positively and negatively worded questions (varying the connotation of questions)
    • Try to keep participants engaged!

Cost-benefit analysis of ethics in research

Benefits:
1. Basic knowledge
2. Research improvements
3. Human and/or animal welfare
4. Researcher benefits
5. Participant benefits
Costs:
1. Money
2. Time
3. Effort
4. Risk: for both participants and researchers
- The Institutional Review Board (IRB) is now in place to ensure all research is ethical
  * Reviews all research before it is approved
  * The IRB uses cost-benefit analysis: benefits must outweigh the costs

Additional Controls for Research

Experiments:
- Random assignment: from the sample you pick, randomly assign people to any of the various conditions you are testing
  * This minimizes the possibility that some individual difference could account for your outcomes
- Inclusion of a control group
Manipulations and original measures:
- Pilot studies (before actually running the study)
  * Friends are OK for pilot studies! 4-6 people for our studies in Psych 200
- Manipulation checks (IN the study itself)
  * Did the manipulation work? Only look at one variable at a time

"When is deception OK?"

General guidelines:
1. Deception must be justified by the study's value (cost/benefit)
2. The research cannot be conducted otherwise (use of deception is absolutely necessary)
- MUST communicate potential risks up front
  * What 2 other things (consent)? --> provide the researcher's contact info and remind participants that they can decline or quit the study at any point
Debriefing: telling participants the true nature of the study and what variables you were really measuring after the study is done; trying to get participants back to emotional baseline before they leave the study
- Dehoaxing
  * Reveal the true nature of the study
- Desensitizing
  * Remove any negative consequences
- Benefits of oral debriefing?

Measurement Error: Sources

Involving the participant:
1. Transient factors: mood, health, anxiety, fatigue
2. Stable attributes: intelligence, ability, general motivation
NOT involving the participant:
1. Situational factors: room temp, lighting, researcher attitude
2. Measure factors: ambiguous questions, too long, fear-inducing, etc.
3. Mistakes: data entry errors, computer glitches

Physical/Mental Harm

Minimal risk:
- No greater in probability/severity than that associated with daily living
  * For example, ups and downs of emotion are part of daily living, so provoking minimal stress or minimal sadness could still be considered a minimal risk
Can a study be run that's more than minimal risk?
- YES! But...
- THE BENEFITS OF THE STUDY HAVE TO OUTWEIGH THE GREATER COST: cost-benefit analysis
- Monitor for unexpected/adverse effects
  * I.e. the Stanford prison experiment was cut short because it was getting out of control (shut down after 6 days but originally planned to run two weeks)

Observational Study Example

Money drop experiment overview:
- Researchers dropped money in a mall and wanted to see whether the people who found it would return it or keep it
- Natural or contrived?
  * Contrived, because the researchers were part of the experiment: they had confederates actively dropping the money, actively creating the conditions of observation
- Disguised or undisguised?
  * Disguised, because the participants were not aware they were being recorded
- What was recorded?
  * The money being dropped and whether or not the individual passing by picked it up

Response Formats: Open-Ended Questions

Open-ended: free response questions
- "What did you do last night?"
- Benefits?
  * Elaboration, clarification, and specificity
- Challenges?
  * Respondents may misinterpret the question; it may be too vague
  * It is a lot of work/time for participants to write out these responses, so they often won't, and if they do, they may not do it well
  * Requires CONTENT ANALYSIS: need to code the data and put it into quantified terms

Single-Item v. Multi-Item Measure

Single-item measure:
- A "stand-alone" measure
  * I.e. sex, age, etc.
Multi-item measures:
- A set of items measuring the same construct
  * I.e. the NPI (Narcissistic Personality Inventory; Raskin and Terry, 1988): a series of questions all trying to account for and measure narcissism
  * Typically, as mentioned, these are questionnaires/surveys that measure a personality trait

Example of inter-item reliability

The 12-item Cognitive Flexibility Scale (CFS; Martin & Rubin, 1995) was included...to assess individual differences in willingness to consider multiple options and alternatives, as well as self-efficacy in being flexible (Appendix K)...Response options range from 1 (strongly disagree) to 6 (strongly agree), with higher values indicating greater cognitive flexibility. In Study 3, the twelve CFS items were averaged to create a composite variable, which was found to have satisfactory internal consistency (α = .77).

Example of measurement error:

True creativity score = 95
Observed creativity score = 85
Error = 10
- Possible sources of error?
  * Participant sick or stressed out
  * Room too cold
  * Odd question wording
  * Scoring error

How can we maximize reliability and validity in a study?

• Random sampling: a systematic OR standardized sampling procedure
• Using already "vetted" (pre-existing) measures
  o "Why reinvent the wheel?"
• Pilot testing
• Training "coders": maximizes inter-rater reliability
• Coding scheme: knowing up front what you are looking for
• Standardized procedure: hold other variables constant
• Minimize confounds/"noise" in the procedure
• Operational definition specificity: use "good" (i.e. precise and appropriate) measures
• Inclusion and assessment of subject variables: sex, age, etc.
  o Essentially demographics collection
  o Why might it be a good idea to assess subject variables?
    * Could be inherent in the hypothesis (ex. comparing men and women in your study)
    * Explore post-hoc hypotheses

Criterion-Related Validity

• The relation between a measure and some sort of behavioral outcome
  o AKA the correlation between a measure and a relevant behavior
  o Ex. SAT and college success: a standardized test predicting the outcome behavior of success in college students
• Concurrent vs. predictive
  o Timing differentiates the two
  o The SAT/college-success example is predictive because it PREDICTS students' future success; concurrent validity concerns a criterion measured in the present

Bottom Line about Reliability and Validity

• Your study is only as good as your measures!! • NO measure is perfect but you should make every effort to minimize error • Reliability and validity = crucial for all measures (operationalized variables) in the study

