UMN PSY 3001W Exam II
Central tendency measures
typical or representative score -mean -median -mode
P-value
-Probability of seeing our results or results more extreme, assuming the null is true -if P < 0.05 or P < 0.01 (whichever alpha was set), reject the null
Pretest-posttest design
-Use with between-subjects design -Measure DV before and after IV "exposure" -Checking for group equivalency before the experiment -Problems
What is Power?
the probability of rejecting the null when the null is in fact false -value between 0 and 1 *the higher the power, the greater the ability to detect real effects
what should alpha and beta be in an experimental design?
to maximize the probability of reaching the correct decision, minimize alpha and beta *the stricter we set alpha, the higher beta (the probability of a Type II error)
Cohen's d
used for independent groups t test: -if d = .2-.5, small effect -if d = .5-.8, medium effect -if d > .8, large effect
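A minimal sketch of computing Cohen's d for two independent groups, assuming the pooled-standard-deviation version of the formula. The data, the function names, and the handling of the exact cutoffs (.5 counted as medium, .8 counted as large) are illustrative assumptions, not something the notes specify.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Cohen's d: difference between means divided by the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

def label_effect(d):
    """Apply the small/medium/large ranges from the card (boundary handling assumed)."""
    size = abs(d)
    if size < 0.2:
        return "below the small range"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"

# Made-up scores for illustration only
treatment = [5, 6, 7, 8, 9]
control = [3, 4, 5, 6, 7]
d = cohens_d(treatment, control)
print(round(d, 2), label_effect(d))
```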
what is the Chi Square test?
used for nominal data *frequencies or category counts -single variables = goodness of fit test -two variables = test of independence
Internal validity
want to be sure the independent variable causes the change in the dependent variable (nothing else causing it)
descriptive statistics
summarize the data
what is statistical significance?
the ability to reject the null
If your IV is nominal/categorical and your DV is continuous..
and you want to compare groups -means -inferential tests to use: T tests or ANOVA
Types of research designs
between- vs. within-subjects
Interpreting results for T test independent groups
degrees of freedom (df) = N - 2 where N =sample size -means are significantly different if P < 0.05
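A rough sketch of the independent groups t statistic and its df = N - 2, assuming the pooled-variance (equal variances) form of the test. The scores and the function name are made up for illustration.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two independent groups, plus its df."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2  # N - 2, where N is the total sample size
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df

# Made-up scores for illustration only
treatment = [5, 6, 7, 8, 9]
control = [3, 4, 5, 6, 7]
t, df = independent_t(treatment, control)
print(t, df)
```

The p-value for the resulting t and df would come from a t table or statistics software; the sketch stops at the test statistic.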
failure to reject the null can result from what?
either the null actually being true or low Power
correlation method
*different than r -it is any non-experimental method -typically, when r is used, you have used a correlational method to obtain data
Interpreting results: f ratio
-At least two means are significantly different if p<.05 -conduct post hoc test to determine which two means -use means (M) and standard deviation (SD) to describe the effect
Solomon four-group design
-Combination of pretest-posttest design and posttest only design -Measure DV before and after IV exposure *For one control group and one experimental group -Measure DV only after IV exposure *For one control group and one experimental group -If the pretest does not matter, posttest scores will be the same for the two control groups (pretested vs. not pretested) and for the two experimental groups (pretested vs. not pretested)
Questionnaires: steps for constructing
-Decide what information should be sought -Decide what type of questionnaire to use -Write a draft -Reexamine and revise after review -Pretest -Edit and develop procedures for use
interpreting results: chi-square test of independence
-Degrees of freedom df = (r-1)(c-1); where r is the number of rows and c is the number of columns -the two variables are significantly related (not independent) if p < 0.05 -size of statistic can be an indicator of effect size -use output to determine the percents to describe the effect
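A minimal sketch of the chi-square test of independence computed by hand: expected frequencies come from the row and column totals, and df = (r-1)(c-1). The 2 x 2 table of counts is invented for illustration.

```python
def chi_square_independence(observed):
    """Chi-square statistic and df for a table of observed category counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / n
            chi_sq += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi_sq, df

# Invented counts: rows = group, columns = response category
observed = [[30, 10],
            [20, 40]]
chi_sq, df = chi_square_independence(observed)
print(round(chi_sq, 2), df)
```

Because each squared discrepancy is divided by its expected frequency, the statistic cannot be negative, and it equals 0 only when every observed count matches its expected count exactly.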
Participant selection factors: What should we use to measure participants?
-Depends on: · Availability · Finances · Nature of the problem · Accuracy, precision
When both your IV and DV are continuous..
-and you are interested in the relationship between the variables -correlate the individual scores -inferential test to use: Pearson product-moment correlation coefficient (r)
When both your IV and DV are nominal/categorical..
-and you want to compare groups -use % -Inferential test to use: chi-square test of independence
notes about chi squared
-cannot be negative -will rarely equal 0 -size of the discrepancy relative to the magnitude of expected frequency matters (you want to look at the observed vs. the expected) -look at residuals
alpha-level
-cut-off for retaining/rejecting null *level of significance *when Pr(results) > alpha, fail to reject Ho *when Pr(results) ≤ alpha, reject Ho -alpha is set at the beginning of the experiment, typically at 0.05 or 0.01 -should report the exact significance level if you reject the null
Uses of survey research
-describe something about a person or population -predict
ANOVA: Independent groups
-interpreting results= F ratio -two df values are reported *between groups= based on the number of IV levels (number of levels - 1) *within groups= based on the total sample size (N - number of levels)
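A small sketch of how the F ratio and its two df values fall out of the data in a one-way independent groups ANOVA. The three groups of scores and the function name are made up for illustration.

```python
import statistics

def one_way_anova(groups):
    """Return (F ratio, df between, df within) for independent groups."""
    k = len(groups)                          # number of IV levels
    n_total = sum(len(g) for g in groups)    # total sample size
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variability of the group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of scores around their own group mean
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Made-up scores for three levels of the IV
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_between, df_within = one_way_anova(groups)
print(f_ratio, df_between, df_within)
```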
Effect size
-large effect size, easier statistical significance. -if all other factors are held constant, the greater the effect of the IV, the higher the Power. *as N increases, power increases *BUT high sampling variability decreases power.. so we want to decrease variability
what are the types of correlation?
-linear vs. curvilinear -positive vs. negative -perfect vs. imperfect
How to choose proper analysis?
-need to know the scale of measurement of the IV and DV *nominal/ordinal scales - categorical data *Interval/ratio scales= continuous data
ANOVA: one-way hypotheses
-uses a non-directional hypothesis - Null: all groups are equally effective and do not differ from each other; IV has no effect - Alternative: at least two group means differ; the null is false
what are the general rules of Power?
1. as N increases, the power increases 2. For a set N, power varies with the magnitude of the real effect
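A rough sketch of both rules, under a simplifying assumption: a one-tailed, one-sample z test with known SD, where power = 1 - Φ(z_crit - d·√N) and d is the standardized effect size. The notes focus on t tests and ANOVA, so this only illustrates the direction of the relationships, not an exact power analysis for those tests.

```python
import math
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Approximate power of a one-tailed, one-sample z test (assumed setup)."""
    z_crit = NormalDist().inv_cdf(1 - alpha)  # cutoff that leaves alpha in the upper tail
    return 1 - NormalDist().cdf(z_crit - effect_size * math.sqrt(n))

# Rule 1: for a fixed effect, power rises as N increases
for n in (10, 40, 160):
    print(n, round(z_test_power(0.3, n), 3))

# Rule 2: for a fixed N, power rises with the size of the real effect
for d in (0.2, 0.5, 0.8):
    print(d, round(z_test_power(d, 20), 3))

# Beta (Type II error probability) is whatever is left over: power + beta = 1
beta = 1 - z_test_power(0.5, 20)
```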
ANOVA: Independent Groups
ANOVA= analysis of variance -used when there are 2+ levels of the IV -generalization of the t test
inferential statistics
analyses or tests, used to make decisions about if the hypothesis was supported or not
T test for independent groups
Compute the mean for each group separately and then compare to see whether chance alone is a reasonable explanation for the difference between means
when is a correlated groups ANOVA used?
If your IV has more than two categories and each category contains DV measures from the same or related people -scales of variables
when is a correlated groups T test used?
If your IV has only two categories and each category contains DV measures from the same or related people.
Descriptive statistics
Percentages -most appropriate for nominal data
sampling: basic terms
Population- complete set of whatever we are studying Sample- subset of a population
what is the equation for Power?
Power + beta = 1
how is correlation best seen?
a scatterplot
how do we control for disadvantages?
counterbalancing
· Open ended questions
o Describe your political viewpoint o Who will you vote for in the next election and why o How do you view liberals and why o How do you view conservatives and why
Type II error
failing to reject the null when it is actually false -Pr(Type II error) = beta
Correlation: Relationships
form the basis of correlation and regression -if a relationship between two variables exists, one might be the cause of the other.
Cluster sampling
o "cluster" individuals together o Sample clusters, as a whole, rather than individuals
· Partially opened ended questions
o 'other' option o Do you consider yourself a liberal, moderate, conservative, or other? If other, please explain.
Alternative hypothesis
o AH or H1 or Ha o The research hypothesis o There is an effect o Non-directional § Uses not-equal-to sign (≠) o Directional § Uses greater than or less than sign
Internet
o Advantages § Efficient § Low cost § Possibly diverse samples o Disadvantages § Unrepresentative samples § Response rate § Lack of control
personal interviews
o Advantages § Higher completion rate § Control over administration and interpretation § Can clarify questions o Disadvantages § Costly § Interviewer bias- need to be really well trained, tone of voice and things matter, interviewer could write down the answer incorrectly
Mail surveys
o Advantages § Quick and convenient § Self-administered § Best for highly personal or embarrassing topics o Disadvantages § Response rate (usually 20-30%) § Unrepresentative samples § Subjective interpretation of questions
Convenience sampling
o Aka haphazard sampling o Sample when you want, where you want, who you want o Basically volunteers
mean (M)
o Arithmetic average o If a sample, (X bar) - If a population (μ) o Use with interval and ratio data o Sensitive to exact values of all raw scores
· Confounding variables threaten internal validity
o Confounding = when the effects of the independent variable and an uncontrolled variable are intertwined so you cannot determine which of the variables is responsible for the observed effect on the dependent variable
variability
o Dispersion or spread of scores o Range (largest score - smallest score) o Variance: § Total amount of variability in the distribution § Average squared deviation from the mean § Interval and ratio data § Why square? If you subtract the mean from each value and add up the results, you always get zero, so to measure variability you square each deviation to get rid of the negative signs o Standard deviation (SD) § Average deviation from the mean § Square root of the variance
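A short worked example of the card above, using made-up scores. It assumes the population form of the variance (divide by N), which matches "average squared deviation"; the sample form divides by N - 1 instead.

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up scores
mean = sum(scores) / len(scores)

# Raw deviations from the mean always cancel out
deviations = [x - mean for x in scores]
sum_of_deviations = sum(deviations)

# Squaring removes the signs, so the average is no longer forced to zero
variance = sum(dev ** 2 for dev in deviations) / len(scores)
sd = math.sqrt(variance)             # back in the original units
score_range = max(scores) - min(scores)

print(mean, sum_of_deviations, variance, sd, score_range)
```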
Stratified random sampling
o Divide population into subgroups (strata) o Randomly select members from each subgroup o Increases likelihood of representative samples
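A minimal sketch of stratified random sampling: split the sampling frame into strata, then draw randomly within each one. The population, the stratum labels, and the choice to take the same number from every stratum are assumptions for illustration; sampling in proportion to stratum size is another common choice.

```python
import random

def stratified_sample(population, stratum_of, per_stratum, seed=None):
    """Randomly select `per_stratum` members from each subgroup."""
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for label in sorted(strata):
        sample.extend(rng.sample(strata[label], per_stratum))
    return sample

# Hypothetical frame: 60 freshmen and 40 seniors
population = [("student%d" % i, "freshman" if i < 60 else "senior") for i in range(100)]
sample = stratified_sample(population, lambda member: member[1], per_stratum=5, seed=1)
print(sample)
```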
Simple random sampling (gold standard)
o Every member of the population has equal probability of being selected - Random number table or generator - Random digit dialing - Disadvantage: you need a list to start from (sampling frame)
Stages of data analysis
o Get to know the data- actually look at the data o Summarize the data- summary stats, graphing o Confirm what the data reveal- was the hypothesis supported or not
Controlling confounding variables in general
o Goals § Eliminate by producing groups that are equivalent prior to introducing IV § Reduce effect of nuisance variables as much as possible (keep within group variability low)
Threats to internal validity
o History o Maturation- systematic change o Testing o Instrumentation (instrument decay) o Regression toward the mean o Selection- differences in groups before the experiment o Mortality- people drop out of the IV condition o Interactions with selection o Diffusion or imitation of treatment- go through a study and your friend tells you about it so you know about what happens in the study
When designing the experiment, should take precautions to protect or increase the internal validity
o However, cannot assess internal validity until after the experiment is over
How do we protect internal validity
o Implement control procedures § Balancing- equate across conditions, equalize groups before § Holding conditions constant o Use a standard experimental procedure
Must rule out alternative explanations
o Make sure extraneous variables.... § Undesired variables that might influence the results; could threaten the validity of the experiment o ... do not become confounds § Variables that co-vary (change along with) the IV
mode
o Most frequently occurring score o Can be used with any scale o Not stable from sample to sample o Might not be representative of sample
Null and alternative are...
o Mutually exclusive (if you reject one you fail to reject the other) o Exhaustive o Evaluate the null in an experiment § Null · The assumption you are beginning with · The opposite of what you are testing § Alternative · The claim you are testing
Null hypothesis
o NH or H0 o There is no effect o Non-directional § Direction of outcome not predicted § Uses = sign § Two-tailed test o Directional § Direction of outcome is predicted § Uses greater than or equal to and less than or equal to sign § One-tailed test o If you find any difference in the null there is no
· Types of manipulation
o Physiological o Experience o Stimulus or experimental o Participant characteristics (not true IVs)
Counterbalancing incomplete designs
o Presentation of different treatment sequences to different participants § Each treatment must be presented to each participant an equal number of times § Each treatment must occur an equal number of times at each testing or practice session § Each treatment must precede and follow each of the other treatments an equal number of times
Counterbalancing complete designs
o Presentation of different treatment sequences to the same participant § But, each participant is required to go through each treatment more than once o Reverse § One order, then backwards § ABBA o Block randomization § Each condition used once before any are repeated · Order randomly within blocks · Eliminates predictability of conditions
Between-Subjects - ways of assigning participants to conditions/groups
o Random assignment § Best way § Spreads subject variables evenly across groups § More participants/group= greater likelihood of equating o Block randomization § Each condition gets a participant before you repeat a condition § Ensures all participants in one condition do not end up being run at same time
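A minimal sketch of block randomization for assigning arriving participants to conditions: every condition appears once, in shuffled order, before any condition repeats. The condition labels and function name are made up for illustration.

```python
import random

def block_randomize(conditions, n_blocks, seed=None):
    """Build an assignment schedule from shuffled blocks of all conditions."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_blocks):
        block = list(conditions)
        rng.shuffle(block)      # random order within the block
        schedule.extend(block)
    return schedule

# The 1st participant gets schedule[0], the 2nd gets schedule[1], and so on
schedule = block_randomize(["A", "B", "C"], n_blocks=4, seed=42)
print(schedule)
```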
Independent groups design (between-subjects designs)
o Random groups- random assignment (hallmark of an experiment) o Natural groups- could be differences among people that exist already (left handers vs. right handers) o Matched groups?- individually differ but as a whole they're equal
Systematic sampling
o Random selection of first participant, then selecting every nth person o Simple, efficient o But what if there is a pattern in the population?
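A minimal sketch of systematic sampling, assuming the random start is chosen from the first n positions of the list. The numbered sampling frame is hypothetical.

```python
import random

def systematic_sample(frame, step, seed=None):
    """Pick a random start among the first `step` entries, then take every step-th entry."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return frame[start::step]

frame = list(range(1, 101))   # hypothetical list of 100 numbered people
sample = systematic_sample(frame, step=10, seed=3)
print(sample)
```

If the list itself repeats with the same period as the step (for example, every 10th entry is a team captain), the sample can end up all or none of that type, which is the pattern problem noted on the card.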
How to control
o Randomization- random assignment o Elimination- get rid of anything that might possibly influence your study other than the independent variable o Constancy- try to make things as equal as we can o Balancing- each group experiences the unwanted variable o Counterbalancing- ordering doesn't always stay the same
· Truthfulness of response- assume the person is being truthful
o Reactivity- act of measuring changes the measurement, if you tell the person what you are measuring they will change their response o Social desirability- sometimes people will respond how they 'should' rather than how they would actually respond o Response set- people respond with the same response or in a particular way § How to fix response set: You can reverse code some of the questions to ask the opposite o Reliability and validity
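A small sketch of the reverse-coding fix for response sets, assuming a 1-to-7 rating scale: a reversed item is scored as (max + min - response), so that a high score means the same thing on every item. The function name and example responses are made up.

```python
def reverse_code(score, scale_min=1, scale_max=7):
    """Flip a rating so the top of the scale becomes the bottom and vice versa."""
    return scale_max + scale_min - score

# A respondent who answers 6 to everything, including the reverse-worded items
raw_responses = [6, 6, 6, 6]
reversed_items = {1, 3}   # positions of the reverse-worded questions (assumed)
scored = [reverse_code(r) if i in reversed_items else r
          for i, r in enumerate(raw_responses)]
print(scored)
```

After scoring, a respondent who gave the same answer to every question shows an inconsistent pattern, which flags a possible response set.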
Measuring the DV: Issues
o Reliability and validity o Sensitivity § Can measure detect differences? § Larger range of scores generally makes measures more sensitive o Avoid § Ceiling effects: all scores at or near top of the scale § Floor effects: all scores at or near the bottom of the scale
· Correlated groups design (within-subjects designs or dependent groups)
o Repeated measures (the same participants measured in each condition) o Matched pairs (spouses, siblings, best friends, people within families)
what is used to analyze correlation methods?
t tests, ANOVAS, chi square tests
· Social desirability
o Sample items from the marlowe-crowne social desirability scale o For each item can you tell whether 'true' or 'false' represents a socially desirable response? § No matter who I'm talking to, I'm always a good listener (if answered true, they are acting socially desirable) § I like to gossip at times (if you say no you are acting socially desirable) § I'm always willing to admit it when I make a mistake (if you say yes you are acting socially desirable) § I have almost never felt the urge to tell someone off (if you say true that is the socially desirable reaction) § These are all leading questions
Quota sampling
o Sample with respect to the proportion of people in various subgroups in the population o Ex. Chocolate buyers- 40% are men, 60% are women, so you would sample 80 men and 120 women (200 people total in study)
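The arithmetic behind the chocolate-buyer example can be sketched as below. The function name is made up, and rounding each quota to a whole person is an assumption; with awkward proportions the rounded quotas may not add up exactly to the total.

```python
def quota_sizes(proportions, total):
    """Number of people to recruit from each subgroup, given population shares."""
    return {group: round(share * total) for group, share in proportions.items()}

quotas = quota_sizes({"men": 0.40, "women": 0.60}, total=200)
print(quotas)
```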
Purposive sampling
o Sampling only from those who meet some predefined criteria o Ex. A particular age (18-22), above legal drinking age, female or male, etc. o Some restrictions
median (Mdn)
o Score that splits the distribution into two equal halves (50% above; 50% below) o Use with ordinal, interval, or ratio data o Less sensitive to extreme scores o Can be used with open-ended or undeterminable score data o Treats all scores alike o Examples- housing costs, income, how many miles from home students are
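A quick illustration of why the median is preferred for skewed data such as income: one extreme score pulls the mean far more than the median. The incomes (in thousands) are invented.

```python
import statistics

incomes = [30, 32, 35, 35, 40, 300]   # invented; one extreme score

mean_income = statistics.mean(incomes)      # pulled upward by the 300
median_income = statistics.median(incomes)  # middle of the ordered scores
mode_income = statistics.mode(incomes)      # most frequently occurring score

print(round(mean_income, 1), median_income, mode_income)
```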
Measuring the DV: Common measures
o Self-report: attitudes, judgements, perceptions, emotional tests, knowledge o Behavioral: direct observations (rates, reaction times, durations) o Physiological: biological responses (EEG, MRI, fMRI)
Probability sampling techniques
o Simple random sampling o Stratified random sampling o Systematic sampling o Cluster sampling
Manipulating the IV: Straightforward/natural vs. staged
o Straightforward- instructions and/or stimuli presentations o Staged- create particular psychological state, simulate a real-world situation
Strength of manipulation
o Strong manipulation maximizes differences between groups (increases chances of finding statistically significant results, effect size) o Possible problems- external validity, ethics
Nuisance variables
o Unwanted variables that cause the variability of scores within groups to increase § Makes the effects of the IV more difficult to assess § Affects all groups in the experiment (unlike confounders) § Affect variability, but not location of the distribution
· Closed ended (restricted) questions
o Yes/no true/false agree/disagree § Do you consider yourself a liberal o Forced alternatives § Do you consider yourself a liberal or a conservative o Multiple choice § Need alternatives to be mutually exclusive and mutually exhaustive § Do you consider yourself a liberal, moderate, conservative, or other o Rating scales § Likert scales · On a scale from 1 to 7, rate your political views · 1=strongly liberal, 2=liberal, 3=weakly liberal, 4=moderate, 5=weakly conservative, 6=conservative, 7=strongly conservative § Is this truly an interval scale? We don't know if there are equal intervals between § They are probably ordinal but we treat them as interval § Odd or even number of points? (middle is moderate or neutral usually) § More points = more sensitive § Be careful of wording in questions § Nonverbal scales · Visual scale of rating (ex: rating pain)
One-Way ANOVA
one IV -testing whether groups are equal, but uses the variance
Type I error
rejecting the null when it is actually true -Pr(Type I error) = alpha
Alpha levels
shaded region in the tail(s) of the distribution (upper or lower tail); the area of the shaded region equals alpha
how is correlation measured?
with a correlation coefficient (r) which ranges from -1.0 to +1.0 *the sign of the coefficient indicates direction
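A minimal sketch of the Pearson r computation from deviation scores. The paired scores are made up, and the function name is illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two lists of paired scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and the two sums of squares
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

hours = [1, 2, 3, 4, 5]
r_perfect_positive = pearson_r(hours, [2, 4, 6, 8, 10])   # perfect, positive
r_perfect_negative = pearson_r(hours, [10, 8, 6, 4, 2])   # perfect, negative
r_imperfect = pearson_r(hours, [2, 1, 4, 3, 5])           # imperfect, positive
print(r_perfect_positive, r_perfect_negative, r_imperfect)
```

Note that r only captures linear relationships; a strong curvilinear relationship can still produce an r near 0.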
N! combinations
§ All possible orders § Selected (partial) orders · Latin square method · Each condition appears in each ordinal position once · Randomly assign to complete an ordering · Random starting order with rotation
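A short sketch of both options: listing all N! orders, and building a rotation-based Latin square in which each condition appears once in each ordinal position. The starting order is fixed here so the output is repeatable; in practice it would be shuffled first. Names and condition labels are made up.

```python
from itertools import permutations

conditions = ["A", "B", "C"]

# All possible orders: N! of them (3! = 6 here)
all_orders = list(permutations(conditions))

def rotation_latin_square(start_order):
    """Each row is the previous row rotated by one position."""
    n = len(start_order)
    return [[start_order[(row + col) % n] for col in range(n)] for row in range(n)]

square = rotation_latin_square(conditions)
for order in square:
    print(order)
```

A plain rotation balances ordinal position only. It does not balance which treatment comes immediately before which, so it does not address differential carryover.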
Extraneous variable sources: Ones related to the participants
§ Demand characteristics: (control for it by double-blind experiment, placebo, deception) § Good participant effect: § Response bias: (how to control for it- reverse code some questions, remove response cues, pilot test items) · Yea/nay saying · Response set
what to avoid when asking questions:
§ Double-barreled questions: asking two things at once (divide the question into two) § Leading questions: "most people believe...." makes people feel like they should agree with this § Loaded questions: tend to be emotion-laden, often use slurs or negative language (ex. Do you support the slaughter of innocent animals by big corporations) § Negative wording: use of not approve, agree with opposition (ex. do you not approve of Obamacare)
Extraneous variable sources: Ones related to the experimenter
§ Experimenter characteristics: (how to control for it: standardization, automated procedures) § Experimental expectancies = Rosenthal effects: AKA experimenter bias or expectancy effects (how to control for it: single-blind experiment, person doesn't know what condition they are in)
· Within subjects design: advantages
§ Fewer participants § Convenient and efficient § Eliminates variability in DV due to individual differences because each participant is his/her own control § No need to equate participants across conditions § Most sensitive to detecting the effect of the IV, even if the effect is small · Reduced error variation
Participant selection factors: how many to use?
§ Finances § Time § Availability § Within vs. between variability
Between subjects design: disadvantages
§ Need more participants § Less statistical power § Influence of extraneous variables § Ethical consideration § Dealing with mean differences might mask important individual differences within the groups
within subjects design: disadvantages
§ Not always practical § Can have order effects (effects due to the position of a treatment in a sequence, not to the specific treatment) · Practice effects, fatigue effects, contrast effects § Can have carryover effects
Participant selection factors: how choose?
§ Precedence- look at previous studies § Availability- can you get a sample for the study § Nature of the problem-
· Between subjects design: advantages
§ Sometimes it's not possible to expose some participants to all treatment conditions § No order or carryover effects § Simplicity- easy to understand what is going on
Independent groups design (between-subjects design)
· 2+ distinct or independent groups · Basic design o Have different participants in each group- each group receives a different treatment- compare treatments between groups § Post-test only design · Participants o Randomly selected from the population o Either: § Assigned to a level of the IV · Control group vs experimental group o Or: § Divided by an individual differences variable (natural groups) · Group 1 vs. group 2 · Problem: cannot make causal inference · Form of correlational research
Guidelines for writing questions
· Choose how participants will respond o Types of questions § Closed ended § Open ended · Write clear, specific questions
Nonprobability sampling techniques
· Disadvantages- not likely to have representativeness · Advantages- helps discover if there is a relationship
Carryover effects
· Effect of being exposed to a treatment program persists and influences responding to the next treatment
hypothesis testing
· Hypothesis: statement of predicted relationship o Cannot be proven absolutely true o Two types: § Null hypothesis- test the null § Alternative hypothesis- often the one we are interested in (research hypothesis)
Types of administration (survey methods)
· Mail · Interviews -In person -Telephone · Internet
Control for: Sequence or order effects
· Position of treatment in the series determines, in part, the response DV · Effect depends on where in the sequence the participant is evaluated, not on which treatment was experienced
telephone interviews
o Advantages § Efficient § Great access to population § Easier to supervise interviewers o Disadvantages § Unrepresentative sample § Response rate § Interviewer bias
Does not control for: Differential carryover
· The response to one treatment depends on which treatment was administered previously
Within-subjects- counterbalancing: Counterbalancing complete designs
· Use when participants take part in more than one experimental condition o Try to control for order effects of the condition by presenting different treatment sequences o Complete (ex. Within-subjects) design o Incomplete (ex. Within-group or partial) designs
Inferential statistics
· Used to make decisions o Parameter estimation o Hypothesis testing § Determine extent to which your sample results are representative of the population of interest § Big question: what are the chances (probability) that you would find your sample results or results more extreme if your sample was drawn from the population of interest?