Research Methods
complex designs notation
A design with multiple IVs is described by listing the number of levels of each IV: n1 IV1 (condition 1, condition 2, ... condition n1) x n2 IV2 (condition 1, ... condition n2) x ... x nk IVk (condition 1, ... condition nk). For example, a 2 x 3 design crosses one IV with two levels and a second IV with three levels, producing 6 conditions.
independent variable
A factor that researchers control or manipulate in order to determine its effect on behavior. It has a minimum of two levels: the treatment (experimental) condition and the control condition.
t-test
A t-test is used to test the difference between two sample means. Each t value has a probability associated with it. The test statistic is used to determine the relative likelihood (probability) of observing a given difference between means under circumstances in which both samples have been drawn from the same population. Between-subjects t-test: tests for mean differences between groups. Within-subjects t-test: tests for mean differences between conditions. Single-sample t-test: tests a sample mean against a population parameter. When two sample means differ, there are two possible explanations. Null hypothesis: the difference is due to sampling error (the samples come from the same population and were not treated differently). Experimental hypothesis: the samples were treated differently, and the treatment produced the difference in means. The within-subjects t-test is more sensitive to differences between conditions than the between-subjects test, so smaller differences are more likely to be significantly different when using a within-subjects test.
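A minimal Python sketch (using SciPy) of the three kinds of t-test described above; the group names and scores are made up for illustration, not taken from the course.

```python
# Hypothetical data: scores for two groups/conditions (made up for illustration)
from scipy import stats

group_a = [12, 15, 14, 10, 13, 16, 11, 14]   # e.g., treatment group
group_b = [9, 11, 10, 8, 12, 10, 9, 11]      # e.g., control group

# Between-subjects t-test: different participants in each group
t_between, p_between = stats.ttest_ind(group_a, group_b)

# Within-subjects (paired) t-test: same participants in both conditions,
# so the lists must be equal length and ordered by participant
t_within, p_within = stats.ttest_rel(group_a, group_b)

# Single-sample t-test: compare one sample mean to a population parameter
t_single, p_single = stats.ttest_1samp(group_a, popmean=12.0)

print(p_between, p_within, p_single)  # reject H0 when p < alpha (usually .05)
```

With the same mean difference, the paired test typically yields a smaller p value when scores are positively correlated across conditions, which is the sense in which the within-subjects test is more sensitive.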
ABA or Reversal design
ABA design (also called a reversal design): follow the B phase with the original (A) conditions. The goal is to know whether the IV produced the changes in behavior, and returning to the original conditions after the experimental condition tests this. If behavior changes during B and then reverts during the final A phase, the IV worked. End with a B condition (ABAB) if you want the changed behavior to remain. Sometimes you cannot undo the effect of the treatment, and if the second A phase does not resemble the first A phase it becomes unclear whether to attribute the changed behavior to the treatment or to confounds.
power
The ability to reject H0 when H0 is false, i.e., the ability to detect statistical significance if an effect exists. Affected by sample size and the magnitude of the effect. A power analysis can determine the n necessary for a study (this is where the rule of thumb of about 20 participants per condition comes from), or can determine after the fact why you were unable to detect an effect.
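A hedged sketch of both uses of power analysis with statsmodels; the effect size (d = .50), alpha, and target power are illustrative assumptions, not values from the course.

```python
# Illustrative power analysis for a between-subjects t-test (values are assumptions)
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Planning: how many participants per group are needed to detect a medium
# effect (d = .50) with alpha = .05 and power = .80?
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8,
                                    alternative='two-sided')
print(round(n_per_group))   # roughly 64 per group

# After the fact: what power did a study with 20 participants per group have
# to detect that same medium effect?
achieved_power = analysis.solve_power(effect_size=0.5, alpha=0.05, nobs1=20,
                                       alternative='two-sided')
print(round(achieved_power, 2))   # about .33, i.e., likely underpowered
```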
internal validity
An experiment has internal validity when we are able to confidently state that the independent variable caused differences between groups on the dependent variable (i.e., a causal inference). For an experiment to be internally valid, we must be able to rule out alternative explanations for the study's findings. Threats include intact groups, extraneous variables, selective subject loss, and uncontrolled demand characteristics and experimenter effects.
case study
An in-depth analysis of one particular individual. A common research procedure in clinical psychology; it allows you to use the vast amount of information available in a therapeutic context and to find which treatments are effective.
ANOVA
Analysis of variance (ANOVA) is used for null hypothesis testing: it tells us whether the main effects and interaction effect(s) in a complex design are statistically significant. The initial ANOVA is called the "omnibus" (overall) ANOVA. If an interaction effect is statistically significant, the researcher does "follow-up" or "post hoc" tests of statistical significance; comparisons of two means can be planned or unplanned.
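A sketch of an omnibus ANOVA for a hypothetical 2 x 2 complex design using statsmodels; all scores and factor names (iv1, iv2) are made up for illustration.

```python
# Hypothetical 2 x 2 complex design; all scores are made up for illustration
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.DataFrame({
    "iv1": ["A", "A", "A", "A", "B", "B", "B", "B"] * 2,
    "iv2": ["low"] * 8 + ["high"] * 8,
    "score": [5, 6, 7, 6, 9, 10, 9, 11,
              6, 5, 6, 7, 4, 5, 6, 5],
})

# Omnibus ANOVA: main effect of iv1, main effect of iv2, and the iv1 x iv2 interaction
model = smf.ols("score ~ C(iv1) * C(iv2)", data=df).fit()
print(anova_lm(model, typ=2))

# If the interaction is significant, follow up with simple main effects or
# planned/post hoc comparisons of the relevant pairs of means.
```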
relational research
Attempts to determine how two or more variables are related to each other. Analysis of the data uses central tendency, dispersion, t-tests, ANOVA (analysis of variance), chi-square (X^2), effect size, and the normal distribution.
how to calculate a contingency table
Calculate the expected frequency for each cell, put all the numbers into the formula, calculate the chi-square value, and look up significance based on the degrees of freedom. Example: X^2 = 22.99 + 1.56 + 5.97 + 13.92 + 0.94 + 3.62 = 49.00; df = (2-1)*(3-1) = 2; critical value = 5.99 (if our X^2 is greater than this, we have a significant result and reject the null).
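A sketch of the expected-frequency and chi-square computation; the observed counts below are made up (the card shows only the six chi-square components, not the raw table), and SciPy's chi2_contingency reproduces the same steps.

```python
# Illustrative 2 x 3 contingency table (observed counts are made up)
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[30, 10, 20],
                     [10, 30, 20]])

# Expected frequency for each cell: (row total * column total) / grand total
row_totals = observed.sum(axis=1)
col_totals = observed.sum(axis=0)
expected = np.outer(row_totals, col_totals) / observed.sum()

# Chi-square statistic: sum of (O - E)^2 / E over all cells
chi_square = ((observed - expected) ** 2 / expected).sum()
df = (observed.shape[0] - 1) * (observed.shape[1] - 1)   # (2-1)*(3-1) = 2
print(chi_square, df)

# SciPy performs the same computation
chi2, p, dof, exp = chi2_contingency(observed, correction=False)
print(chi2, p, dof)   # significant if chi2 exceeds the critical value (5.99 for df = 2, alpha = .05)
```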
changing criterion design
Change the amount of behavior necessary to obtain reinforcement, e.g., investigating the effects of different criteria (press a lever 5 times, then 7 times, etc., before reinforcement) on the participant's behavior. Can be (but is not always) a form of shaping.
why do you conduct research?
Contribute to the evolution of knowledge Different researchers ask different questions Different researchers approach the same questions differently
goals of research (according to the scientific method)
Describe a behavioral phenomenon, develop explanations for behavior, attempt to predict behavior or performance, and control behavior (e.g., psychotherapy or therapeutic interventions).
Correlational research
Determines the degree and direction of a relationship with a single statistic, the coefficient of correlation. Can be used to evaluate measurements and tests, or used directly as a research tool. Refers to a category of statistical tests. Coefficients vary from -1.00 through 0.00 to +1.00, and the magnitude of the correlation coefficient determines the strength of the relationship. It is just a measure of the magnitude and direction of association; there is no determination of causality (A could cause B, B could cause A, or C could cause both A and B). Problem of truncated (restricted) range: you need variability to detect differences, and if the variability is too small, important differences can be masked.
steps of research
Develop a research question. Generate a research hypothesis. Form operational definitions. Choose a research design. Evaluate the ethics of your research. Collect and analyze data; form conclusions. Report research results.
Sensitivity (d')
Difference between the mean amount of sensory activity generated on noise trials and on noise + signal trials, measured in z-score units: d' = ZHIT - ZFA. Example: look up z values based on the percentage under the curve between the mean and each z (30% and -10%), giving 0.842 and -0.253; d' = 0.842 - (-0.253) = 1.095.
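A sketch of the d' computation in Python; the 80% hit rate and 40% false-alarm rate are assumed values chosen so the z lookups match the 0.842 and -0.253 above.

```python
# Sketch of a d' (sensitivity) calculation; the hit and false-alarm rates are
# assumptions chosen to reproduce the z values in the example above
from scipy.stats import norm

hit_rate = 0.80          # proportion of signal trials correctly called "present"
false_alarm_rate = 0.40  # proportion of noise trials incorrectly called "present"

# Convert proportions to z scores via the inverse of the normal CDF
z_hit = norm.ppf(hit_rate)          # about 0.842
z_fa = norm.ppf(false_alarm_rate)   # about -0.253

d_prime = z_hit - z_fa              # d' = z(hit rate) - z(false-alarm rate)
print(round(d_prime, 3))            # about 1.095
```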
range
Difference between the highest and lowest scores in the distribution
multi-method approach
Different areas of psychology (clinical, social, industrial/organizational, developmental, counseling, physiological, cognitive, educational, personality, human factors, neuropsychology, etc.) require multiple methods. No single research method or technique can answer all of the different questions in psychology, and there is no perfect research method: each method or measure of behavior has flaws. Researchers therefore keep a "toolbox" with different tools for conducting research.
two-tailed test
Do not have a specific prediction about the direction of a relationship (non-directional test)
scientific experiment characteristics
Empirical: judgments are based on direct observation and experimentation, with a skeptical, critical attitude. Observation is systematic and controlled; control is the essential ingredient of science, and the greatest control is in an experiment. Definitions are clear and specific ("construct" is the term generally used for concepts within research). Reports are unbiased and objective, with inter-observer agreement, and instruments and measurements are accurate, precise, valid, and reliable.
directional test
Expect a difference between your two groups/conditions, and you expect that difference to be in a specific direction
non-directional test
Expect a difference between your two groups/conditions, but do not predict a direction
type 2 error
Fail to reject a null hypothesis that should have been rejected
Stages of data analysis
Getting to know the data Summarizing the data Confirming what the data reveal
one-tailed test
Have a specific prediction about the direction of the relationship (directional test) Easier to find a significant result
normal distribution/ curve
High in the middle with a symmetrical decline toward each extreme. Inflection point: where the curve starts bending outward (rather than down), located one standard deviation from the mean. Known percentages of scores fall under different parts of the curve (on each half of the mean).
applied research
Human Factors Industrial/Organizational Clinical/Counseling/Health
main effect
The overall effect of an independent variable in a complex design: the effect of the independent variable on the dependent variable as if only that variable had been manipulated in the experiment. Simple main effect: the effect of one independent variable at one level of the second independent variable. One definition of an interaction effect is that the simple main effects of one independent variable are different across the levels of the second independent variable.
covariation of events
If one event causes the other, the two events must vary together (when one changes, the other must change also). We must observe a relationship between the independent and dependent variables. For example, participants who write about emotional events have better health and academic outcomes than participants who write about superficial events. Thus, the two types of writing covary with the different outcomes.
central tendency
Indicates the center of a distribution of scores; use the mean, median, and mode to help analyze it.
discrete trials design
Individual participants receive each treatment condition dozens (or even hundreds) of times; each exposure to a treatment, or trial, produces one data point for each DV. Extraneous variables are tightly controlled, and order is randomized or counterbalanced if feasible. The behavior of participants can be compared to provide intersubject replication (e.g., in signal detection theory experiments).
intact group
Intact groups are formed prior to the start of an experiment. Individuals are not randomly assigned to intact groups. As a result, individual differences among groups threaten the validity of the experiment.
nonscientific experiment characteristics
Intuitive: judgments are based on "what feels right," and claims are accepted without evidence. Observation is casual and uncontrolled, and personal biases influence observation. Concepts are ambiguous and terms are undefined. Reports are biased, subjective personal impressions and opinions, and instruments and measurements are inaccurate, imprecise, invalid, non-reliable, and possibly inconsistent.
interrupted time series design
Involves examining the effect of a naturally occurring treatment on the behavior of a large number of participants. Interesting example: introduce fluoride and measure school achievement scores. Can follow a single class (e.g., the class of 2001) over a long period, or can follow one grade (e.g., 3rd grade) over a long period. Threats include mortality and delayed effects of treatments. Example: measure school achievement scores from 1965 to 1995; fluoride is introduced in the local water supply in 1979; after this interruption, school achievement scores increase.
large n designs
Large numbers of participants are included so that an unusual participant (outlier) does not skew the results (nomothetic)
coefficient of a correlation
Larger numbers indicate stronger relationships The sign (+ or -) indicates the direction of the relationship Pearson product-moment correlation (r)
interpretation
Main effects Interactions Sources of error How would you do the study again? Matching design and analysis to research questions
subject variables
Measurable characteristics of people (age, sex, height, weight, IQ, nACH, race, alcoholism, schizophrenia, brain damage). Researchers do not manipulate these variables and have no control over them, so they cannot necessarily make clear causal statements when the research involves subject variables. Interpretation can be thought of as similar to correlations: there is a relationship, but we cannot determine causality.
AB design
Measure initial levels, institute some treatment, then measure post-treatment levels. A: baseline condition before therapy; B: level of the behavior after therapy (e.g., IV: therapy, DV: tantrum behavior). In small-n research the AB design is used to find out whether a treatment is effective, but it is a poor design because it overlooks too many potential confounds (no control): other factors that occur between A and B could be confounding variables.
median
Middle-most score (50th Percentile)
mean
Most common and familiar measure (the average): the sum of scores divided by the number of scores (ΣX/n). Affected strongly by outliers. In an experiment, the average score on the dependent variable is computed for each group (a measure of central tendency).
Pearson product-moment correlation (r)
Most commonly used coefficient. r = ∑xy / ((∑x²)(∑y²))^.5, i.e., the sum of cross-products divided by the square root of the product of the sums of squares, where x and y are deviation scores (each X or Y score minus its mean) computed over the N pairs of scores. Pearson r is not good at detecting relationships that are not linear. Linear: straight-line relationship. Curvilinear: relationship does not fall on a straight line, but on a curve.
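A sketch of the Pearson r calculation with made-up paired scores, comparing the deviation-score formula to scipy.stats.pearsonr.

```python
# Hypothetical paired scores (made up) to illustrate the Pearson r formula
import numpy as np
from scipy.stats import pearsonr

x = np.array([2, 4, 5, 7, 9], dtype=float)    # scores on the 1st variable
y = np.array([1, 3, 6, 6, 10], dtype=float)   # scores on the 2nd variable

# Deviation-score formula: r = sum(xy) / sqrt(sum(x^2) * sum(y^2)),
# where x and y are deviations of each score from its mean
dx = x - x.mean()
dy = y - y.mean()
r_by_hand = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

r_scipy, p_value = pearsonr(x, y)
print(round(r_by_hand, 3), round(r_scipy, 3))   # the two values match
```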
mode
Most often occurring score
extraneous variables
Practical considerations when conducting an experiment may confound it; these are referred to as extraneous variables. They are controlled for via balancing and holding conditions constant.
type 1 error
Reject a null hypothesis that should not have been rejected
generalizability
Researchers are not interested in just the one sample of people or one set of circumstances tested in a research study. They wish to generalize a study's findings to other People, Settings, & Conditions
inferential stats
Researchers use inferential statistics to determine whether an independent variable had a reliable effect on a dependent variable. Inferential statistics allow us to rule out whether the findings from our experiment might be simply due to chance
alternating treatments
Reversal designs are problematic if you expect long-term effects on the DV; the alternating treatments design is an alternative that can extend the basic reversal design. A small-n experimental design in which two or more IVs alternate, e.g., A C A B C B C B (A: baseline, B: IV1, C: IV2). This design allows you to test more than one IV and can support determinations about causality. Example: test the effects of three different diet levels on hyperactivity. A: normal diet (baseline); B: introduce a cookie into the diet; C: introduce an artificially colored cookie into the diet.
control
Scientists investigate the effect of various factors one at a time in an experiment. Researchers conduct controlled experiments to identify the causes of a phenomenon. Control requires that researchers manipulate factors to determine their effect on the event of interest; these manipulated factors are independent variables. A control variable is a potential IV that is held constant during an experiment.
basic research
Social (attitudes, attraction, persuasion, conformity) Cognitive (language, memory, decision making) Developmental (age-specific function levels) Personality (traits, motivation, individual differences)
overlap of basic and applied research
Sometimes applied questions lead to basic research, and that basic research can then be applied to real problems. Research is inspired by critically evaluating phenomena in the world. Example: social loafing.
standard deviation
Square root of the variance (variance = s² = Σ(X - X̄)²/n); used more often as a descriptive statistic. The average distance of each score from the mean of a group is computed (a measure of variability).
z score
Standard score units: each increase of one z score represents a change of one standard deviation. ≈68% of scores fall within one standard deviation of the mean, ≈95% within 2 SD, and ≈99.74% within 3 SD. A great source of comparison: an IQ score of 115 is 1 SD above the mean, so the score is higher than ≈84% of people's. Z scores allow comparison across different samples, means, etc.
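A quick numeric check of the facts above using the standard normal curve; the IQ mean of 100 and SD of 15 are the conventional values for that example.

```python
# Numeric check of the z-score facts above
from scipy.stats import norm

# IQ example: mean 100, SD 15 (conventional IQ scaling)
z = (115 - 100) / 15        # z = 1.0, one SD above the mean
print(norm.cdf(z))          # about 0.84: higher than roughly 84% of people

# Proportion of scores within 1, 2, and 3 SDs of the mean
for k in (1, 2, 3):
    print(k, norm.cdf(k) - norm.cdf(-k))   # about 0.683, 0.954, 0.997
```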
Contingency tables
Tabular presentation of all combinations of the categories of the variables, which allows relationships to be examined. Uses categorical variables (each observation can be in only one cell, i.e., one category of each variable).
interaction effect
The combined effect of independent variables in a complex design. An interaction effect occurs when the effect of one independent variable differs depending on the level of the second independent variable; interaction effects represent how independent variables work together to influence behavior. An interaction effect is likely present in a complex design experiment when the lines in a graph displaying the means are not parallel, that is, the lines intersect, converge, or diverge.
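A tiny sketch showing how main effects and an interaction can be read from the cell means of a hypothetical 2 x 2 design; the means are made up.

```python
# Hypothetical 2 x 2 cell means (made up) to illustrate main effects vs. interaction
import numpy as np

# rows = levels of IV1 (A1, A2); columns = levels of IV2 (B1, B2)
cell_means = np.array([[10.0, 12.0],
                       [14.0, 22.0]])

# Main effects: compare the marginal (row and column) means
print(cell_means.mean(axis=1))   # [11. 18.] -> overall effect of IV1
print(cell_means.mean(axis=0))   # [12. 17.] -> overall effect of IV2

# Interaction: is the effect of IV2 the same at each level of IV1?
simple_effect_at_a1 = cell_means[0, 1] - cell_means[0, 0]   # 2.0
simple_effect_at_a2 = cell_means[1, 1] - cell_means[1, 0]   # 8.0
print(simple_effect_at_a1, simple_effect_at_a2)
# Unequal simple main effects (non-parallel lines in a plot of the means) suggest
# an interaction; whether it is significant is determined by the ANOVA.
```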
dependent variable
The measure of behavior that is used to assess the effect of the independent variable. In most psychology research, several dependent variables are measured to assess the effects of the independent variable. Researchers observe the effect of the independent variable by measuring dependent variables.
time order relationship
The presumed cause must occur before the presumed effect. For example, writing about emotional events (the cause) comes before the beneficial health and academic outcomes (the effect).
effect size
The strength of the relationship between the independent variable and the dependent variable. Cohen's d is a common measure; guidelines for interpreting Cohen's d are: small effect = .20, medium effect = .50, large effect = .80. Different measures of effect size exist for different tests, but all attempt to determine practical significance rather than statistical significance; basically, the magnitude of the effect.
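A sketch of Cohen's d for two hypothetical independent groups, using a pooled standard deviation; the scores are made up.

```python
# Cohen's d for two hypothetical independent groups (scores are made up)
import numpy as np

group_a = np.array([12, 15, 14, 10, 13, 16, 11, 14], dtype=float)
group_b = np.array([9, 11, 10, 8, 12, 10, 9, 11], dtype=float)

# Pooled standard deviation, weighted by each group's degrees of freedom
n1, n2 = len(group_a), len(group_b)
s1, s2 = group_a.std(ddof=1), group_b.std(ddof=1)
pooled_sd = np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

d = (group_a.mean() - group_b.mean()) / pooled_sd
print(round(d, 2))   # compare against the .20 / .50 / .80 guidelines above
```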
X^2 test for independence
There are different kinds of chi-square tests. Null hypothesis: the variables are independent (e.g., the number of golfers is independent of school; no difference associated with the variables). Reject the null: there is a relationship. Fail to reject the null: you don't accept it, you just fail to reject it. Based on observed frequencies (how many people appeared in each cell) compared to expected frequencies (how many people you would expect in each cell): X^2 = Σ(O - E)²/E. If you do not have an expectation about the expected frequencies, calculate them from the data: E = (row total x column total) / grand total.
null hypothesis testing
This statistical procedure is used to determine whether the mean difference between two conditions is greater than what might be expected due to chance or error variation. We say that the effect of an independent variable on the dependent variable is statistically significant when the probability of the results being due to error variation (chance) is low. Null hypothesis testing is most often used to decide whether the independent variables have produced an effect as measured by the dependent variables.
causal inference
Three conditions must be met before we can make a causal inference: Covariation Time-order relationship Elimination of plausible alternative causes
Sections of a research paper
Title Page, Abstract, Introduction, Method, Results, Discussion (main body of report), References, Footnotes, Appendices, Tables (if any), Figures (if any).
explanation
Using experiments, researchers can make causal inferences, i.e., statements about the cause of an event or behavior. Requires covariation of events, a time-order relationship, and elimination of plausible alternative causes.
small-n designs
Very few participants are intensely analyzed: more information about a smaller number of people (an idiographic approach), usually in applied settings. Allows you to do research when you do not have a large number of participants available, either because participants with specific conditions (e.g., autism, schizophrenia, hyperactivity) are needed, or because specific tasks allow for high levels of control (e.g., radar target discrimination). Not all small-n designs are case studies.
elimination of alternate plausible causes
We accept a causal explanation only when other possible causes of the effect have been ruled out. Using control techniques, we rule out other possible causes for the outcome. If the two groups differ in ways other than the emotional and superficial writing, these differences become alternative explanations for the study's findings.
confounding variables
When the independent variable of interest and a different, potential independent variable are allowed to covary (go together), a confounding is present. Confoundings represent alternative explanations for a study's findings. An experiment that has a confounding is not internally valid. An experiment that is free of confoundings has internal validity.
external validity
Would the same findings occur in different settings? Would the same findings occur with different conditions? Would the same findings hold true for different participants? Researchers often seek to generalize their findings beyond the scope of their specific experiment. Experiments should include characteristics of the situations, settings, and populations to which researchers wish to generalize (e.g., representative samples and situations). Partial replications are common: Research findings generalize when a similar result occurs when slightly different experimental procedures are used in a subsequent experiment.
scientific method
an abstract concept, not a particular technique or method. Two important aspects of the scientific method are: the reliance on an empirical approach, and the skeptical attitude scientists adopt toward explanations of behavior and mental processes. Way to gain knowledge about behavior and mental processes A general approach to gaining knowledge Not a particular technique or tool Compare scientific method to "everyday" ways of gaining knowledge (nonscientific)
what might a null be due to?
confounding variables, problem with selecting levels of IV
reliability
Consistency of an experiment or measure. Test-retest reliability: give the same test twice over a short period of time. Parallel forms: to avoid practice effects, employ alternate forms of the same test. Split-half reliability: correlate scores from two halves of a test. Improve reliability with more items, greater variability among individuals on the factor being measured, a testing situation free of distractions, and clear instructions.
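A sketch of split-half reliability: correlate scores from the odd and even items, then apply the Spearman-Brown correction to estimate full-test reliability. The item responses below are made up.

```python
# Split-half reliability sketch (item responses are made up for illustration)
import numpy as np
from scipy.stats import pearsonr

# Rows = participants, columns = test items (1 = correct, 0 = incorrect)
items = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0, 1],
    [1, 0, 1, 1, 1, 1, 0, 1],
    [1, 1, 1, 0, 1, 1, 1, 0],
])

# Score the odd-numbered and even-numbered items separately, then correlate
odd_half = items[:, ::2].sum(axis=1)
even_half = items[:, 1::2].sum(axis=1)
r_half, _ = pearsonr(odd_half, even_half)

# Spearman-Brown correction estimates the reliability of the full-length test
r_full = 2 * r_half / (1 + r_half)
print(round(r_half, 2), round(r_full, 2))
```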
balancing
A control technique. Some alternative explanations for a study's findings concern characteristics of the participants; for example, what if participants in the emotional writing condition are healthier or smarter than participants in the superficial writing condition, even before they write anything? The goal of balancing is to make sure that, on average, the participants in each condition are essentially the same before the experiment begins. Thus, on average, the groups in the experiment are equally healthy, smart, motivated, conscientious, etc., prior to the independent variable manipulation. Experimenter effects can also be balanced, e.g., when two experimenters each conduct every condition but are randomly assigned to administer a condition at any particular time.
Description
define, classify, catalogue, or categorize events and their relationships to describe mental processes and behavior. Most psychology research is nomothetic rather than idiographic. Most psychology research is quantitative rather than qualitative.
Interaction
How IVs interact with each other; with three IVs the possible interactions are IV1 x IV2, IV1 x IV3, IV2 x IV3, and IV1 x IV2 x IV3.
Signal detection theory
How good are you at picking up a signal in some sort of broader noise? If participants must rate the presence/absence of an item, they can say present or absent and be wrong, so we need a way to account for hits (correctly indicating presence) and misses (failing to identify presence). Signal detection theory allows you to determine the ability to detect a stimulus (signal) amid interference (noise) by calculating d' (d prime), a measure of sensitivity. Assumes that the noise is randomly distributed (see chart in ppt.).
idiographic approach vs. nomothetic approach
The idiographic approach is better at predicting one person's actions because it is tailored to that person. The problem of whether to use nomothetic or idiographic approaches is most sharply felt in the social sciences, whose subjects are unique individuals (idiographic perspective) who nevertheless have certain general properties or behave according to general rules (nomothetic perspective).
delayed effects on treatments
e.g., if you expect fluoride to reduce cavities and thereby absenteeism, you would not expect the effects to be immediate
mortality
e.g., students graduate, move away, and some die. This can be a special type of selection bias: you do not have any control over who withdraws from the sample.
measures of dispersion
indicate how spread out scores are range and standard deviation
idiographic
individual case studies
theory
integration of facts and ideas
nomothetic
large sample sizes, "average" performance of a group
alpha
level of significance, usually .05
descriptive stats
Measures of central tendency, dispersion, and effect size, e.g., mean, median, mode, range, standard deviation, effect size.
Observation-treatment-observation design
A natural-treatment design: like an ABA design, but the final A phase is not entirely equivalent to the initial A condition. You might be interested in measuring reading levels, introducing a new reading program, then measuring reading levels again. Problem: children will not be the same age at follow-up as they were at the beginning (see full example in ppt). Not a reversal design, because the treatment is not under the experimenter's control and natural treatments are likely to have long-term carryover effects.
null
No significant difference or correlation. Null results can be a product of problems with the DV: a floor effect (being at the bottom of a scale, which does not allow for variability) or a ceiling effect (being at the top of a scale, which does not allow for variability). Null results can also occur because of a failure to control important variables. Confoundings: unintended effects whose influence confounds the proper interpretation of the results.
empirical approach
Observation of behavior that can be seen directly, and experimentation in which scientists employ systematic control in the situation being observed.
Between subjects design
participant is only in one level of the IV
alternative hypothesis
The hypothesis that the observed difference or relationship is real (produced by the predicted cause) rather than due to chance; the alternative to the null hypothesis.
p-value
probability associated with an inferential statistic found in a table If the observed p value is less than .05, reject the null hypothesis of no difference. If the observed p value is greater than .05, do not reject the null hypothesis of no difference.
variance
s² = Σ(X - X̄)²/n
quantitative
statistical summaries of performance
X^2 independence test
Tests the null hypothesis that the variables are independent of each other; the outcome either rejects or fails to reject that null hypothesis. Based on observed frequencies (how many people appeared in each cell) compared to expected frequencies (how many people you would expect in each cell): X^2 = Σ(O - E)²/E
hypothesis
A testable prediction about the relationship between two or more variables. A circular hypothesis uses the event itself as an explanation for the event (and therefore explains nothing).
operational definition
the specific procedure used to produce and measure a construct. Define with specificity to allow for clear communication A definition of a concept in terms of the operations that must be performed to demonstrate the concept
what makes research good?
theoretical framework standardized procedures generalizability objective measurement
validity
Truth of an experiment. Predictive validity: can you predict an outcome based on some criterion? Construct validity: does the experiment measure what you intend it to measure? External validity: can you generalize from your sample results to the population? Internal validity: do you have causal relationships between your IVs and DVs? Establishing the construct validity of a measure depends on convergent and discriminant validity: convergent validity refers to the extent to which two measures of the same construct are correlated, and discriminant validity refers to the extent to which two measures of different constructs are not correlated.
multiple baseline design
Different behaviors (or participants) receive baseline periods of varying lengths prior to the introduction of the IV. Two ways: establish baselines for different behaviors (e.g., crying vs. fighting measured for different lengths of time), or establish baselines for the same behavior for different people (e.g., fighting measured for different lengths of time). Continuing to measure one baseline while directing a behavioral intervention at a second behavior lets you tease apart the effects of each treatment and their combination, and determine the independence of the behaviors or the effect of the IV on the two measures. Note that the second behavior could be affected unintentionally by the first treatment.
qualitative
verbal summaries of research findings
minimum number of participants for a study?
~20