Chapter 9: Multivariate Correlational Research
Mediators vs. Moderators
-mediators ask "why?"; moderators ask "for whom?" or "when?"
-a mediation hypothesis could propose that medical compliance is the reason conscientiousness is related to better health
-a moderation hypothesis could propose that the link between conscientiousness and good health is stronger among older people
-mediate: to come in between two variables
-moderate: to change (a moderator changes the relationship between 2 variables, making it more or less intense)
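A quick way to picture moderation is to compute the same correlation separately within each level of the moderator. The sketch below is only an illustration: the records, the group labels, and the helper names (`pearson_r`, `r_by_group`) are hypothetical and do not come from any study in these notes.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical records: (age group, conscientiousness score, health score)
records = [
    ("older", 1, 2), ("older", 2, 4), ("older", 3, 5), ("older", 4, 8),
    ("younger", 1, 5), ("younger", 2, 4), ("younger", 3, 6), ("younger", 4, 5),
]

def r_by_group(rows):
    """Correlation between the two scores, computed within each group."""
    groups = {}
    for group, x, y in rows:
        xs, ys = groups.setdefault(group, ([], []))
        xs.append(x)
        ys.append(y)
    return {group: pearson_r(xs, ys) for group, (xs, ys) in groups.items()}

r_groups = r_by_group(records)
for group, r in r_groups.items():
    print(f"{group}: r = {r:.2f}")
```

If age moderates the relationship, the two printed correlations differ noticeably, as they do in this made-up data.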
Regression in Popular Press Articles
-signs that a story is describing a regression analysis: "controlled for," "taking into account," "correcting for," or "adjusting for"
-journalists rarely discuss betas, p values, or predictor variables
-they are writing for a general audience, and most readers might not be familiar with these concepts
multiple regression (Multivariate regression)
-can help rule out some third variables, thereby addressing some internal validity concerns
-statistical tool used to predict the value of a criterion variable from several predictor (independent) variables
Interpreting beta
-ex: beta = 0.25
-interpretation: even when we hold age constant statistically, the relationship between exposure to TV sex and pregnancy is still there
-the beta associated with a predictor variable represents the relationship between that predictor variable and the criterion variable when the other predictor variables in the table are controlled for
parsimony
-the degree to which a theory provides the simplest explanation of some phenomenon
-in the context of investigating a claim, the simplest explanation of a pattern of data; the best explanation that requires making the fewest exceptions or qualifications
predictor variables (independent variables)
-the variables other than the criterion variable that are measured in a regression analysis
-listed below the criterion variable in a regression table
-ex: in the sexy TV/pregnancy study, the predictors are the amount of sexual content teenagers reported viewing on TV and their age
mediator/mediating variable
-a variable that helps explain the relationship between two other variables
-a study does not have to be correlational to include one; experimental studies can test mediators too
-mediation analyses often rely on multivariate tools such as multiple regression
-ex: we know conscientious people are physically healthier than less conscientious people. Why? A mediator of this relationship might be that conscientious people are more likely to follow medical advice and instructions, and that is why they are healthier
-following medical advice is the mediator of the relationship between the trait (conscientiousness) and the outcome (better health)
Mediators vs. Third Variables
-both involve multivariate research designs, and researchers use the same statistical tool (multiple regression) to detect them
-they function differently
-third-variable explanation: the proposed third variable is external to the 2 variables in the original bivariate correlation; it might even be seen as an accident, a problematic "lurking variable" that potentially distracts from the relationship of interest
-ex: deep talk and high well-being might both be associated with educational level
-mediator: researchers are interested in isolating which aspect of the presumed causal variable is responsible for that relationship
-a mediator variable is internal to the causal variable and often of direct interest to the researchers, rather than a nuisance
-ex: researchers believe stronger social ties are the important aspect, or outcome, of deep talk responsible for increased well-being
Using statistics to control for third variables
-by conducting a multivariate design, researchers can evaluate whether a relationship between 2 key variables still holds when they control for another variable
-ex: "controlling for" age
-viewing sex on TV and getting pregnant are correlated, but sex on TV and age are also correlated with each other, and age and pregnancy are correlated too
-researchers want to know whether age, a third variable correlated with both of the original variables, can account for the relationship between sexual TV content and pregnancy rates; to find out, they must control for age
-controlling for age means talking about proportions of variability
-testing a third variable with multiple regression is similar to identifying subgroups, ex: moving from the oldest age group to the youngest and checking the relationship within each one
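To make "still holds when they control for another variable" concrete, here is a minimal sketch using the first-order partial correlation. This is a related technique swapped in for illustration, not the multiple regression the notes describe, and the correlation values are made up rather than taken from the TV/pregnancy study.

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y after statistically removing z from both.

    r_xy: correlation between the two key variables
    r_xz, r_yz: correlation of each key variable with the third variable z
    """
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical case 1: the relationship shrinks but does not vanish,
# so the third variable is not the whole explanation.
still_there = partial_r(r_xy=0.50, r_xz=0.60, r_yz=0.60)

# Hypothetical case 2: the relationship drops to about zero,
# so the third variable can account for the original correlation.
explained_away = partial_r(r_xy=0.36, r_xz=0.60, r_yz=0.60)

print(f"case 1 partial r = {still_there:.3f}")
print(f"case 2 partial r = {explained_away:.3f}")
```

In case 1 the key relationship survives controlling for the third variable; in case 2 it does not, which is the pattern a third-variable explanation predicts.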
Longitudinal designs
-can provide evidence for temporal precedence by measuring the same variable in the same people at several points in time
-longitudinal research is used in developmental psych to study changes in a trait/ability as a person grows older
-this type of design can be adapted to test causal claims
-ex: researchers conducted a study on a sample of 565 children and their mothers and fathers living in the Netherlands
-parents and children were contacted 4 times, every 6 months
-each time, children completed questionnaires in school, responding to items about narcissism (e.g. "Kids like me deserve something extra")
-parents also completed questionnaires about overpraising their children, which was referred to in the study as overevaluation (e.g. "my kid is more special than others")
3 criteria for establishing causation
-covariance, temporal precedence, internal validity
-ex: we might apply these criteria to correlational research on the association between parental praise and narcissism
Regression does not establish causation
-even though multivariate designs analyzed with regression statistics can control for third variables, they are not always able to establish temporal precedence
-even when a study takes place over time (longitudinally), another important problem is that researchers cannot control for variables they do not measure
-the unknown third-variable problem is one reason that a well-run experimental study is ultimately more convincing in establishing causation than a correlational study
-ex: an experimental study on TV would randomly assign a sample of people to watch either sexy TV shows or programs without sexual content
-a randomized experiment is the gold standard for determining causation
-multiple regression allows researchers to control for potential third variables, but only for the variables they choose to measure
Adding more predictors to a regression
-even when there are more predictor variables in the table, beta still means the same thing
-ex: more exposure to sex on TV predicts a higher chance of pregnancy
-adding several predictors to a regression analysis can help answer 2 kinds of questions
-first, it helps control for several third variables at once (cause = sexy TV, effect = pregnancy)
-second, by looking at the betas for all the predictor variables, we can get a sense of which factors most strongly predict chance of pregnancy (ex: gender)
What if beta is not significant
-ex: family meals and child academic achievement
-researchers found that children in families that eat many meals together (dinner and breakfast) tend to be more academically successful, compared to kids in families that eat only a few meals together
-this simple bivariate relationship is not enough to show causation
-there are third variables that present an internal validity concern
-a multiple-regression analysis could hold parental involvement constant and see if family meal frequency is still associated with academic success
-in one study, the researchers found that when parental involvement was held constant (along with other variables), family meal frequency was no longer a significant predictor of school success
-this suggests the only reason family meals correlated with academic success was the third variable of parental involvement
Ruling Out Third Variables with Multiple-Regression Analyses
-ex: third variables could explain the association between teens watching sexual content on TV and being more likely to get pregnant
-one is age: older teens might watch more mature TV programs, and they're more likely to be sexually active
-another is parenting: stricter parents might monitor their teens' TV use and put limits on their behaviors
-we use the statistical technique of multiple regression to find out whether one of these variables is the true explanation for the association
Cross sectional correlations
-first set of correlations
-they test whether 2 variables, measured at the same point in time, are correlated
-ex: the study reports that the correlation between mothers' overevaluation at Time 4 and children's narcissism at Time 4 was r=.099
-this is a weak correlation, but it is consistent with the hypothesis
-because both variables in a cross-sectional correlation were measured at the same time, this result alone can't establish temporal precedence
Measuring More Than Two Variables (Multivariate Correlational Study)
-in the teen pregnancy study, they measured several variables including the total amount of time teenage participants watched any kind of TV, their age, their academic grades, and whether they lived with both parents
Multivariate designs
-involve more than two measured variables
-extremely useful and widely used tools, especially when experiments are impossible to run
-ex: longitudinal designs, multiple-regression designs, and the pattern and parsimony approach
multivariate designs and the four validities
-longitudinal designs help establish temporal precedence, and multiple-regression analysis helps rule out third variables, thus providing some evidence for internal validity
-for any multivariate design, as for any bivariate design, it is appropriate to interrogate the construct validity of the variables in the study by asking how well each variable is measured
-we can also interrogate the external validity of a multivariate design
-to interrogate the statistical validity of a multivariate correlational study, we can ask about the effect size and statistical significance
taking into account
-means that researchers conducted multiple-regression analyses
-ex: even when they controlled for education and intelligence, they still found a relationship between job complexity and cognitive decline
Controlled for
-most common sign of a regression analysis
-means holding a potential third variable constant statistically (not the same as a control group, which represents "no treatment" or a neutral condition in an experiment)
-ex: the phrase appeared when journalists covered the story about family meals and academic success
power of pattern and parsimony
-moving closer to a causal claim using a wide collection of correlational studies
-ex: the case of smoking and lung cancer, articulated by psychological scientist Robert Abelson
-decades ago, it started becoming clear that smokers had higher rates of lung cancer than nonsmokers (r=.40)
-cigarette manufacturers did not want anyone to think that smoking caused cancer
-critics could argue that regression cannot control for every possible third variable
-even though an experiment would rule out third-variable explanations, a smoking experiment would not be ethical or practical
-researchers had to work with correlational data
-the solution to this problem is to specify the mechanism for the causal path (the more contact a person has with the chemicals in cigarettes, the greater the toxicity exposure)
-this simple theory leads to a set of predictions, all of which could be explained by the single, parsimonious theory that chemicals in cigarettes cause cancer
-parsimony here: the toxicity of cigarettes explains the whole pattern of results
Autocorrelations
-next step
-they determine the correlation of one variable with itself, measured on two different occasions
-ex: the results suggested that both overevaluation and narcissism are fairly consistent over time
Statistical significance of beta
-regression tables in empirical journal articles often have a column labeled sig or p, or an asterisked footnote giving the p value for each beta
-such data indicate whether each beta is statistically significantly different from zero
-when p is less than .05, the beta is considered statistically significant
-when p is greater than .05, the beta is considered not significant (we cannot conclude beta is different from zero)
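The decision rule above can be written as a small helper for reading a regression table. This is only a sketch: the function name `describe_beta`, the predictor labels, and every number in the table are hypothetical and do not come from a published study.

```python
ALPHA = 0.05  # conventional cutoff used in the notes above

def describe_beta(beta, p, alpha=ALPHA):
    """Turn one row of a regression table into a plain-language reading."""
    if p < alpha:
        direction = "positive" if beta > 0 else "negative"
        return f"significant {direction} relationship"
    # p at or above the cutoff: we cannot conclude beta differs from zero
    return "not significant"

# Hypothetical regression table rows: (predictor, beta, p value)
table = [
    ("predictor A", 0.25, 0.01),
    ("predictor B", -0.18, 0.03),
    ("predictor C", -0.04, 0.60),
]

for name, beta, p in table:
    print(f"{name}: beta = {beta:+.2f}, p = {p:.2f} -> {describe_beta(beta, p)}")
```

Each reading applies only while the other predictors in the same table are controlled for.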
Considering or adjusting for
-these phrases can also indicate that researchers used multiple regression
-ex: people who ate more chocolate had lower BMI, and researchers adjusted their results for several variables
Cross-lag correlations
-show whether the earlier measure of one variable is associated with the later measure of the other variable
-the correlations researchers are usually most interested in
-address the directionality problem and help establish temporal precedence
-ex: in the Brummelman study, the cross-lag correlations show how strongly mothers' overevaluation at Time 1 is correlated with child narcissism later on, compared to how strongly child narcissism is correlated with mothers' overevaluation later on
Why can't we just do an experiment?
-the problem is that in some cases people cannot be randomly assigned to a causal variable of interest
-ex: we cannot manipulate personality traits, such as narcissism in children
-it could be unethical to assign children to a condition in which they receive a particular kind of praise, especially over a long period of time
Criterion Variable (dependent variable)
-the variable in a multiple-regression analysis that the researchers are most interested in understanding or predicting
-almost always specified in either the top row or the title of a regression table
-ex: the Chandra team was primarily interested in predicting pregnancy, so they chose that as their criterion variable
Beta to test for third variables
-there will be one beta for each predictor variable
-similar to r
-positive beta = positive relationship between that predictor variable and the criterion variable, when the other predictors are statistically controlled for
-negative beta = negative relationship between the 2 variables (when the other predictors are controlled for)
-similar to correlations because they denote the direction and strength of a relationship
-the higher the beta, the stronger the relationship
-the smaller the beta, the weaker the relationship
-unlike r, there are no quick guidelines for beta to indicate effect sizes that are weak, moderate, or strong
-betas change depending on what other predictor variables are being used/controlled for in the regression
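One way to see why betas change with the set of predictors: with exactly two predictors, the standardized betas can be written in terms of the three pairwise correlations. The sketch below assumes standardized variables and uses made-up correlation values; the function name `two_predictor_betas` is hypothetical.

```python
def two_predictor_betas(r_y1, r_y2, r_12):
    """Standardized betas for a regression of y on predictors 1 and 2.

    r_y1, r_y2: correlation of each predictor with the criterion variable
    r_12: correlation between the two predictors
    """
    denom = 1 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2

# With predictor 1 alone, its standardized beta equals its correlation
# with the criterion variable (hypothetical value).
r_y1 = 0.40
beta_alone = r_y1

# Adding a second predictor that overlaps with the first changes the beta.
beta_1, beta_2 = two_predictor_betas(r_y1=0.40, r_y2=0.50, r_12=0.60)

print(f"predictor 1 alone:      beta = {beta_alone:.3f}")
print(f"predictor 1 controlled: beta = {beta_1:.3f}")
print(f"predictor 2 controlled: beta = {beta_2:.3f}")
```

When the two predictors are uncorrelated (`r_12 = 0`), each beta simply equals that predictor's correlation with the criterion, so controlling for the other predictor changes nothing.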
Pattern, parsimony, and the popular media
-when journalists write about science, they do not always fairly represent pattern and parsimony in research
-they may report only the results of the latest study
-ex: they may present a news story on the most recent nutritional research, without describing the other studies done in that area
-when journalists report only one study at a time, they selectively present only a part of the scientific process
-they might not describe the context of the research, such as what previous studies have revealed, or what theory the study was testing
-skeptics who read such science stories may find it easy to criticize the results of a single correlational study
-journalists should report on the entire body of evidence, and the theoretical background for a particular claim
Regression results indicate if a third variable affects the relationship
-when researchers use regression, they are testing whether some key relationship holds even when a suspected third variable is statistically controlled for
-as a consumer of information, you will work with the end result of this process when you encounter regression results in tables in empirical journal articles
-questions to ask: what do the numbers mean, and what steps did the researchers follow to come up with them?
Longitudinal studies and the three criteria for causation
1. Covariance: significant relationships in longitudinal designs help establish covariance. When 2 variables are significantly correlated, there is covariance
2. Temporal precedence: because each variable is measured at different points in time, researchers know which one came first. By comparing the relative strength of the two cross-lag correlations, researchers can see which path is stronger. If only one of them is statistically significant, the researchers move a little closer to determining which variable comes first, causing the other
3. Internal validity: when conducted by measuring only the two key variables, longitudinal studies do not help rule out third variables
Interpreting results from longitudinal designs
-since more than 2 measured variables are involved (each key variable is measured at several time points), a multivariate design gives several individual correlations, referred to as cross-sectional correlations, autocorrelations, and cross-lag correlations
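The three kinds of correlations can be sketched for a two-wave design as follows. The scores are invented for illustration (they are not the Brummelman data), and the variable and key names are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for 6 families, measured at Time 1 and Time 2
over_t1 = [1, 2, 3, 4, 5, 6]   # parent overevaluation, Time 1
over_t2 = [2, 2, 3, 5, 5, 6]   # parent overevaluation, Time 2
narc_t1 = [3, 1, 4, 2, 6, 5]   # child narcissism, Time 1
narc_t2 = [2, 3, 3, 5, 5, 7]   # child narcissism, Time 2

correlations = {
    # cross-sectional: two variables measured at the same time
    "cross_sectional_t1": pearson_r(over_t1, narc_t1),
    "cross_sectional_t2": pearson_r(over_t2, narc_t2),
    # autocorrelations: one variable with itself on two occasions
    "auto_overevaluation": pearson_r(over_t1, over_t2),
    "auto_narcissism": pearson_r(narc_t1, narc_t2),
    # cross-lag: earlier measure of one variable with later measure of the other
    "cross_lag_over1_narc2": pearson_r(over_t1, narc_t2),
    "cross_lag_narc1_over2": pearson_r(narc_t1, over_t2),
}

for name, r in correlations.items():
    print(f"{name}: r = {r:.2f}")
```

Comparing the two cross-lag values is the step that speaks to temporal precedence: in this made-up data the overevaluation-to-later-narcissism path is the stronger one.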