Chapter 8
To establish causation
1. Covariance of cause and effect: the results must show a correlation, or association, between the cause variable and the effect variable.
Restriction of range (example)
A college tests the relationship between SAT scores and first-year college grades but uses only SAT scores of 1200-1600, when in reality SAT scores range from 400-1600. In this situation, the college is likely to underestimate the true correlation between SAT scores and first-year college grades.
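A minimal sketch of the SAT example, using made-up numbers: the grade values below are hypothetical (a rising trend plus alternating noise), and `pearson_r` is a helper written for this illustration. It compares r over the full 400-1600 range with r over the 1200-1600 slice only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: SAT scores across the full range, 400 to 1600.
sat = list(range(400, 1601, 100))
# Hypothetical first-year grades: rise with SAT, plus alternating +/-0.3 noise.
grades = [1.0 + (s - 400) / 600 + (0.3 if i % 2 == 0 else -0.3)
          for i, s in enumerate(sat)]

r_full = pearson_r(sat, grades)

# Keep only the students with SAT scores of 1200 or higher.
pairs = [(s, g) for s, g in zip(sat, grades) if s >= 1200]
r_restricted = pearson_r([s for s, _ in pairs], [g for _, g in pairs])

print(f"full range r = {r_full:.2f}")
print(f"restricted r = {r_restricted:.2f}")
```

With these made-up numbers, the restricted sample gives a noticeably smaller r than the full sample, even though the underlying trend is the same.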
T-test
A statistical test to examine differences in means between two groups. Can also use ANOVA.
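As an illustration, here is a sketch of one common version of the test, the pooled-variance (Student's) independent-samples t, which assumes the two groups have similar variances. The group scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical scores for two groups.
group_a = [5, 6, 7, 8, 9]
group_b = [3, 4, 5, 6, 7]

n_a, n_b = len(group_a), len(group_b)

# Pooled variance: weighted average of the two sample variances.
pooled_var = ((n_a - 1) * variance(group_a) +
              (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)

# t = difference in means divided by its standard error.
t_stat = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
df = n_a + n_b - 2

print(f"t({df}) = {t_stat:.2f}")
```

The resulting t is then compared with a critical value for the given degrees of freedom to decide whether the difference in means is statistically significant.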
Statistical significance and sample size
A very small effect size can be statistically significant if it is found in a very large sample (1,000 or more).
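A rough sketch of why this happens, using the standard t statistic for testing a Pearson r against zero, t = r * sqrt(n - 2) / sqrt(1 - r^2). The cutoff of 1.96 is the large-sample approximation for a two-tailed test at .05 (the exact cutoff is somewhat higher in small samples); the r value and sample sizes are hypothetical.

```python
from math import sqrt

def t_for_r(r, n):
    """t statistic for testing whether a Pearson r differs from zero."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

APPROX_CRITICAL_T = 1.96  # assumption: large-sample two-tailed cutoff at .05

r = 0.05  # hypothetical, very small effect size
t_small_sample = t_for_r(r, 30)
t_large_sample = t_for_r(r, 2000)

for n, t in ((30, t_small_sample), (2000, t_large_sample)):
    verdict = "significant" if abs(t) > APPROX_CRITICAL_T else "not significant"
    print(f"r = {r}, n = {n}: t = {t:.2f} ({verdict})")
```

The same tiny r fails to reach significance with 30 participants but passes the cutoff with 2,000, because t grows with the square root of the sample size.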
Larger effect sizes
Allow more accurate predictions and are often considered more important results.
Outlier
An extreme score, a single case or a few cases, that stands out from the rest of the pack. Outliers can exert disproportionate influence on the correlation coefficient: they can have a large impact on the direction or strength of the correlation, for example by making a medium-sized correlation appear stronger than it really is. In bivariate correlation, they are especially problematic when they involve extreme scores on both variables. Outliers matter most when the sample is small.
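A small sketch with hypothetical data: five cases with a medium-sized correlation, then the same five cases plus one case that is extreme on both variables. `pearson_r` is a helper written for this illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical small sample.
x = [1, 2, 3, 4, 5]
y = [2, 1, 3, 2, 3]
r_without = pearson_r(x, y)

# Add one case that is extreme on BOTH variables.
r_with = pearson_r(x + [15], y + [12])

print(f"r without outlier = {r_without:.2f}")
print(f"r with outlier    = {r_with:.2f}")
```

In a sample this small, the single extreme case pulls the correlation from the medium-to-large range up to nearly 1.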
Moderator
Another variable that changes the relationship between two variables: the association differs depending on the level of the moderator. Moderators can inform external validity. Ex. When an association is moderated by residential mobility, type of relationship, day of the week, or some other variable, we know it does not generalize from one of those situations to the other.
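One way to check for a moderator is to compute the correlation separately within each level of the suspected moderator. The sketch below uses hypothetical scores split by day of the week; the subgroup names and values are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical (x, y) scores for each level of the moderator.
subgroups = {
    "weekday": ([1, 2, 3, 4, 5], [1, 2, 3, 4, 6]),
    "weekend": ([1, 2, 3, 4, 5], [3, 2, 4, 2, 3]),
}

r_by_group = {name: pearson_r(xs, ys) for name, (xs, ys) in subgroups.items()}

for name, r in r_by_group.items():
    print(f"{name}: r = {r:.2f}")
```

Here the association is strong on weekdays and essentially absent on weekends, so day of the week moderates the relationship and the weekday result would not generalize to weekends.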
Usual representation of categorical variables
Bar graphs or T-test.
Bar Graph
Commonly used when researchers plot the results of an association claim with a categorical variable. In this technique, each person is not represented by one data point; instead, the graph shows the mean (arithmetic average) for each group, and you can examine the difference between the group averages to see if there is an association. If the difference is small, the association would be considered weak.
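The numbers behind such a bar graph can be sketched as follows. The group labels and scores are hypothetical; each bar's height would be one group mean.

```python
from statistics import mean

# Hypothetical scores on the quantitative variable, split by category.
scores = {
    "group_a": [4, 5, 6, 5],
    "group_b": [3, 4, 3, 4],
}

# One mean per group: these are the heights of the bars.
group_means = {name: mean(values) for name, values in scores.items()}
difference = group_means["group_a"] - group_means["group_b"]

for name, m in group_means.items():
    print(f"{name}: mean = {m}")
print(f"difference between group means = {difference}")
```

A larger gap between the bars suggests a stronger association between group membership and the measured variable; a t-test indicates whether the gap is statistically significant.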
Analyzing curvilinear associations
Compute the correlation between one variable and the square of the other.
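A sketch of this idea with hypothetical inverted-U data. One assumption is added here that the notes do not state: x is centered on its mean before squaring, so that the squared term lines up with the turning point of the curve.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: y rises with x up to a point, then falls.
x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 4, 6, 7, 6, 4, 1]

r_linear = pearson_r(x, y)

# Assumption: center x on its mean, then square it.
mean_x = sum(x) / len(x)
x_squared = [(value - mean_x) ** 2 for value in x]
r_squared_term = pearson_r(x_squared, y)

print(f"r between x and y           = {r_linear:.2f}")
print(f"r between (x - mean)^2 and y = {r_squared_term:.2f}")
```

The ordinary correlation is essentially zero, while the correlation with the squared term is strongly negative, which reveals the curved relationship that r alone misses.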
Negative correlation (bivariate)
As variable A increases, variable B tends to decrease. Ex. Increase in calories = decrease in weight loss.
Positive correlation (bivariate)
For every increase in variable A, there is an increase in variable B. Ex. Time exercising and calories burned.
Pearson's r
Ranges from -1 to +1. Sign refers to direction (+ or -).
Magnitude
Refers to strength. Larger absolute value = stronger. The closer to 1 or -1, the stronger it is.
Association Claims
Relationships between measured variables. Ex. SAT scores and college performance. Ex. Perceived social support and well being.
Coefficient of determination (R^2)
Represents the proportion of variability in one variable that can be explained (accounted for) by the other. Based on covariance.
Usual representation of Quantitative variables
Scatterplot
R of approx .10 or -.10
Small or weak correlation.
Correlation Coefficient(r)
Statistic indicating strength of a linear relationship.
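For reference, the standard textbook formula for r can be written as below. The symbols are notation introduced here, not taken from the notes: x_i and y_i are the paired scores, and x-bar and y-bar are the sample means.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator captures how the two variables vary together, and the denominator rescales it so that r always falls between -1 and +1.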
2. Temporal Precedence
The cause variable must precede the effect variable; it must come first in time.
Statistical Significance
The conclusion a researcher reaches regarding the likelihood of getting a correlation of that size just by chance, assuming there is no correlation in the real world.
Error of prediction
The difference between the predicted value and the actual value. Ex. If you predict a boy will be 172 cm when he is 18, and he is actually 170 cm when he turns 18, the error of prediction is 2 cm. Errors of prediction are larger when the association is weaker.
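A sketch of how errors of prediction relate to the strength of the association, using hypothetical data and an ordinary least-squares best-fit line (an assumption about how the predictions are made). It also shows that the share of variability left over after prediction matches 1 - r^2, which connects to the coefficient of determination.

```python
from math import sqrt

# Hypothetical paired scores.
x = [1, 2, 3, 4, 5]
y = [2, 1, 3, 2, 3]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)

r = sxy / sqrt(sxx * syy)

# Least-squares best-fit line: predicted y = intercept + slope * x.
slope = sxy / sxx
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * a for a in x]

# Error of prediction for each case: actual minus predicted.
errors = [actual - pred for actual, pred in zip(y, predicted)]

# Proportion of variability in y accounted for by x.
explained = 1 - sum(e ** 2 for e in errors) / syy

print("errors of prediction:", [round(e, 2) for e in errors])
print(f"r^2 = {r ** 2:.3f}, explained proportion = {explained:.3f}")
```

When r is closer to 1 or -1, the squared errors shrink relative to the total variability, so predictions become more accurate.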
Statistical significance and effect size
The larger the effect size (the stronger the correlation) the more likely it will be statistically significant.
P value
The probability of getting an association at least as strong as the one in the sample if it came from a population in which the association is 0.
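One way to make this concrete is an exact permutation test, sketched below with hypothetical data. This is a substitute technique chosen for illustration, not the t-based calculation that statistics software typically reports: it shuffles the y scores in every possible order (which simulates a world with no association) and counts how often the shuffled correlation is at least as strong as the observed one.

```python
from itertools import permutations
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired scores for six participants.
x = [1, 2, 3, 4, 5, 6]
y = [1, 3, 2, 5, 4, 6]

observed = pearson_r(x, y)

# Try every possible pairing of the y scores with the x scores.
at_least_as_strong = 0
total = 0
for shuffled_y in permutations(y):
    total += 1
    # Small tolerance guards against floating-point rounding in ties.
    if abs(pearson_r(x, shuffled_y)) >= abs(observed) - 1e-12:
        at_least_as_strong += 1

p_value = at_least_as_strong / total

print(f"observed r = {observed:.2f}")
print(f"p = {p_value:.3f}")
```

For these made-up scores, only a small fraction of the shuffled pairings produce a correlation as strong as the observed one, so p falls below .05.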
If the associated p is MORE than .05
The result is not rare; we cannot rule out the possibility that the result came from a population where the association is 0. Thus we conclude it is not statistically significant.
Effect Size
The strength of the relationship between two or more variables. The closer to 1 or -1 it is, the stronger the relationship. Can also indicate the importance of a result.
If all variables are measured it's an
The study is correlational, and therefore you can make an association claim. An association claim is not supported by a particular kind of statistic or graph; it is supported by a correlational study design in which all the variables are measured.
Directionality Problem
A failure to meet the temporal precedence criterion: we don't know which variable came first.
Validities in association claims
The two most important validities in an association claim are construct validity and statistical validity. You can ask about external validity, but it doesn't always matter.
3. Internal Validity
There must be no plausible alternative explanation for the relationship between the two variables.
Spurious association
When a relationship is there, but only because of some third variable and not because of a real relationship between the variables originally studied.
Restriction of range
When there is not the full range of scores on one of the variables, it can make the correlation appear smaller than it actually is. This can apply when one of the variables has very little variance. Since this situation makes the correlations appear smaller, we would ask about it when the correlation is weak.
Third-variable problem
When we can come up with an alternative explanation for the association between two variables, and that alternative is some lurking third variable. The third variable must correlate logically with both of the measured variables in the original association.
Curvilinear Association
Where the relationship between two variables is not a straight line; it might be positive up to a point and then become negative. We might get an r of only .01, because r is designed to describe the slope of the best-fit line, and the best-fit line here is a horizontal straight line with a slope of zero.
Statistical Validity of association claims
You are asking about the factors that might have affected the scatterplot, the correlation coefficient (r), bar graph, or difference score that led to your association claim. You need to consider the effect size and statistical significance of the relationship, any outliers that might have affected the overall findings, restriction of range, and whether a seemingly zero association might actually be curvilinear.
Bivariate Correlation (simple correlation)
The degree of relationship between exactly TWO variables. Relationships can be positive, negative, or zero (no relationship). Calculated from a sample that includes every subject's scores on the two measured variables.
If the associated p is LESS than .05
The result is very unlikely to have come from a zero-association population. Thus we conclude it is statistically significant.
Strength (effect size)
How "strongly associated" the two variables are, that is, how closely the points cluster around the line of best fit. Larger absolute value of r = stronger magnitude = better prediction.
R of approx .50 or -.50
Large or strong correlation.
R of approx. .30 or -.30
Medium or moderate correlation.
Third variables in bivariate correlation
Do not necessarily present an internal validity problem, but they are a reason to dig deeper and ask more questions. You can ask the researcher whether the bivariate correlation is still present within potential subgroups.