Multiple Regression Analysis and Beyond: Statistical Regression


Partial Correlation: What is a partial correlation?

A partial correlation is a correlation coefficient that describes the linear relationship between one variable and a PART of another. It is the relationship between a given IV and the residual variance of the DV once the other IVs have been entered; in other words, it describes the linear relationship between two variables when the effects of other variables have been statistically removed from one of them.

When is a regression model underspecified?

A regression model is underspecified if the regression equation is missing one or more important predictor variables. This situation is perhaps the worst-case scenario, because an underspecified model yields biased regression coefficients and biased predictions of the response.

In MRA what is each IV assigned?

A weight. Because the model configures the predictors to maximize prediction accuracy, the specific weight (contribution) assigned to each IV in the model is relative to the other IVs in the analysis. (In a bivariate regression, the beta weight is the Pearson r.) The weight is not a feature of the variable itself; rather, it describes the particular role the variable plays in this one analysis, combined with the other specific variables predicting this particular outcome variable.

Weight and beta

The beta weight is the Pearson r in the bivariate (one-predictor) case. In multivariate models, beta weights are partial coefficients that indicate the unique strength of the relationship between a predictor and the criterion, controlling for the presence of all other predictors. Beta weights are also the slopes of the linear regression equation when standardized scores are used. Lowercase b coefficients are for raw scores, and β (beta) coefficients are for standardized scores.

Critical values for correlation.

Critical values for the correlation coefficient r. Consult the table for the critical value at v = (n - 2) degrees of freedom, where n = number of paired observations. For example, with n = 28, v = 28 - 2 = 26, and the critical value is 0.374 at the α = .05 (two-tailed) significance level.
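If the table is not handy, the same critical value can be recovered from the t distribution. A minimal sketch, assuming SciPy is available:

```python
# A minimal sketch of recovering the tabled critical r from the t distribution
# via r_crit = t_crit / sqrt(t_crit**2 + df), with df = n - 2.
from scipy import stats

n = 28
df = n - 2                                    # v = 26
t_crit = stats.t.ppf(1 - 0.05 / 2, df)        # two-tailed critical t at alpha = .05
r_crit = t_crit / (t_crit ** 2 + df) ** 0.5
print(round(r_crit, 3))                       # ~0.374, matching the table value
```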

What does calculating each variable last do?

Each predictor's b weight is calculated as though that predictor were entered last into the model. This assesses the contribution of each variable above and beyond the others, effectively treating the other predictors as covariates.

What do you review in the correlations of the variables?

Examining the correlation matrix, we are looking for two features: 1) make sure no predictor is so highly correlated with the DV (a correlation of 0.70 or higher) that it is essentially redundant with it, and 2) make sure no two predictors are so highly correlated with each other (0.70 or higher) that they are assessing the same thing.

Goal of a regression procedure?

The goal of any regression procedure is to predict or account for as much of the DV variance as possible using the predictors. Each predictor weight is computed as though it had entered the equation last. This is to determine what predictive work it can do over and above the prediction attributable to the rest of the predictors.

What are the variables used as predictors?

The predictor variables are the IVs. IVs may also be configured after the fact in correlational designs rather than being based exclusively on manipulated conditions. In a regression design, it is usual for all the variables to be measured in a given "state of the system."

Why is it called statistical regression methods?

The reason for calling the procedures "statistical regression methods" is to emphasize that once the researchers identify the variables to be used as predictors, they relinquish all control of the analysis to the mathematical algorithms in carrying out the analysis.

What is the most widely used regression model?

The standard regression method (also called the direct model) is the most widely used where all the predictor (IV) variables are entered into the equation in a single "step."

What is the statistical goal of multiple regression?

The statistical goal of multiple regression analysis is to provide a model, in the form of a linear equation, that identifies the best weighted linear combination of the IVs in the study to predict the criterion (DV) optimally.

What are the two classes of variables in a MRA?

The variable being predicted and the variables that are used as the basis of prediction.

What is the variable being predicted?

The variable that is the focus of a Multiple regression design is the one being predicted. It is also known as the criterion variable, outcome variable, or dependent variable.

What is a variate in MRA?

The weighted linear combination is called a variate. The variate is a weighted linear composite of the measured variables in the model. A variate is also called a latent variable that is not directly observed but is rather inferred (through a mathematical model) from other variables that are observed.

Hierarchical Linear Regression

•Very similar to standard regression, except variables are entered into the model in steps (referred to as "blocks") rather than all at once •The researcher determines which variables (e.g. covariates) to enter at each step, in an order of their choosing •Purpose? To assess the amount of predictability gained (via R2 change) with each new block of variables added compared to the previous model
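A minimal sketch of the block-entry idea with made-up data, assuming statsmodels; the variable names are hypothetical, not taken from this study set:

```python
# Fit block 1 alone, then blocks 1 + 2, and compare R-square to get R-square change.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
covariate = rng.normal(size=n)                 # block 1 (e.g., a control variable)
predictor = rng.normal(size=n)                 # block 2 (predictor of interest)
y = 0.5 * predictor + rng.normal(size=n)       # DV driven mostly by block 2

X1 = sm.add_constant(covariate)
X2 = sm.add_constant(np.column_stack([covariate, predictor]))
r2_block1 = sm.OLS(y, X1).fit().rsquared
r2_block2 = sm.OLS(y, X2).fit().rsquared
print(round(r2_block1, 3), round(r2_block2, 3), round(r2_block2 - r2_block1, 3))
# The last value is the R-square change attributable to block 2.
```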

Statistical Error in Prediction: Why Bother With Regressions?

•Why use a line of best-fit when we have the actual data? •Regression is meant to be a model for how one variable in general can be used to predict another •The key is to have a model to generalize to the population •Two big perks: •We can predict how future individuals may perform when we know X •We can better understand the relationship between X and Y more generally

Regression Equations

•Ypred = a + b1X1 + b2X2 + ... + bnXn •Ypred = the predicted value on the criterion variable •X = predictor variables in the equation •b = weight or coefficient of each X variable •Also referred to as the partial regression coefficients •Reflects the relative contribution of each predictor variable when controlling for the other predictors in the model •Each b weight tells us how many units Ypred will increment for a one-unit change in the corresponding X value •a = Y-intercept of the line; the value of Y when all X values = 0.

Correlation coefficient

•correlation coefficient is an index of the degree to which two or more variables are associated with or related to each other •The squared correlation coefficient is an index of the strength of that relationship

Covariation

•the amount of change in one variable that is consistently related to the change in another variable of interest •The patterns exhibited by the two variables being evaluated •Data must be arranged in a pairwise configuration •Two pieces of data (one for each variable) must be derived from the same case and linked to each other •No variation in scores? Then no ability to measure covariation! No covariation? No correlation!

Strength of Relationship

Guidelines for Assessing Relationship Strength •Cohen (1988) suggested (in the absence of context) the following: •r > 0.5 (large) •r > 0.3 (moderate) •r > 0.1 (small) •However, there's almost always context and literature for a given set of variables •Important to consider the difference between statistical, practical, and clinical significance •What is the worth of the correlation in specific applications?
Relationship Strength is Shared Variance •To say that two variables are correlated is to say they covary •To say they covary is to say they share variance
Indexing Relationship Strength by r2 •The squared correlation value is referred to as the coefficient of determination •This can be translated into a percentage
Alternative Measures of Strength of Relationship •Ratio of explained variance to unexplained variance (Rosenthal, 1991) •r2 / (1 - r2)
Calculating the Mean Correlation •You can't just average the correlation values •Pearson's r is not an interval level of measurement; you can't find a traditional mean •Must use a Fisher z' score (which is interval-level data) •Convert r into Fisher z' using a conversion table •Example: r values of .10, .20, and .80 have Fisher z' scores of .100, .203, and 1.099, respectively •Average Fisher z' = .4673 •Convert back to r using the table; r = .44 •Quick and dirty method to estimate: •Sum up the r2 values, average them, and convert back to r •Example: r values of .10, .20, and .80 have r2 values of .01, .04, and .64, respectively •Mean r2 value is .230; convert back to r; r = .48
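A quick numeric check of the Fisher z' example above, assuming NumPy (np.arctanh is the r-to-z' transform and np.tanh converts back to r):

```python
# Average correlations via Fisher z' rather than averaging the r values directly.
import numpy as np

rs = np.array([0.10, 0.20, 0.80])
z = np.arctanh(rs)           # Fisher z' values: ~0.100, 0.203, 1.099
mean_r = np.tanh(z.mean())   # back-transform the average z'
print(round(z.mean(), 4), round(mean_r, 2))   # ~0.4672 and ~0.44, as in the example
```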

Evaluating the Overall Model

How Much Variance Is Explained by the Model? •R2 = .48, meaning these three predictors accounted for 48% of the variance in Self-Esteem •Note: R2 can sometimes be inflated by smaller sample sizes
Variables in the Model •All variables are in the model, and thus are part of the equation regardless of the size of their contribution
The Regression Equations •Raw: Ypred = a + b1X1 + b2X2 + b3X3 •Self-Esteempred = 56.66 + (2.89)(pos affect) - (2.42)(neg affect) + (.11)(openness) •Standardized: Yz pred = 𝛽1Xz1 + 𝛽2Xz2 + 𝛽3Xz3 •Self-EsteemZ pred = (.40)(pos affectZ) - (.43)(neg affectZ) + (.06)(opennessZ)
Tests •SPSS uses a t-test to test the significance of each predictor •Here, we see Positive Affect and Negative Affect are significant predictors of Self-Esteem when controlling for the other predictors •Meaning, the partial correlation is significant •Openness is not a significant predictor of Self-Esteem
Coefficients •b shows the weights assigned to each variable at the end of the model building •Ex: a 1-point increase in Positive Affect corresponds to a 2.89-point increase in Self-Esteem •A 1-point increase in Negative Affect corresponds to a 2.42-point decrease in Self-Esteem
Beta Coefficients •Beta shows the weights assigned to each variable in z-score units, so they can be compared directly •Here, we see that Positive and Negative Affect have similar Beta weights, thus predict Self-Esteem equally well and much better than Openness
Squared Semipartial Correlation •Accounts for variance uniquely predicted by each predictor in the full model •Note that the sum across all of the predictors (.14 + .16 + .00) is .30, while R2 is .48 •Indicating that 18% of the variance is accounted for by overlap of the predictors (i.e., they are correlated) •Positive and Negative Affect contribute approximately equal unique variance
The Structure Coefficients •SPSS does not generate this for you •Calculated by: r / R •Ex: Positive Affect r = .55; square root of R2 (.48) = .69; .55/.69 = .80 •So, we see that Positive Affect is highly correlated with the variate •We will use this more later in the semester...

Research problems suggesting a regression approach?

If the research problem is expressed in a way that either specifies or implies prediction, multiple regression analysis becomes a viable candidate for the design.

Multiple Regression generates how many scores?

Multiple regression analysis generates two variations of the prediction equation, one in raw score or unstandardized form and the other in a standardized form (making it easier for researchers to compare the effects of predictor variables that are assessed on different scales of measurement).

Factors Affecting the Computed Pearson r and Regression Coefficients

Outliers •Remember, regression (traditionally) uses a least squares formula to fit the line •If there's an extreme value, it will pull the line toward itself and skew the results •Deletion or transformation of the dataset can mitigate these effects
Range (Variance) Restrictions •Low variability will generate a low Pearson's r, which leads to lower predictive power in your regression model •This is referred to as variance restriction •Can be caused by poor research design •Restricted range of scores due to crummy recruitment?
Nonlinearity •Everything we have covered has assessed linear relationships •If there is a nonlinear relationship between your variables, you won't capture it with Pearson's r •For example: •Quadratic relationship: Y is associated with X2 •But r = 0, and the regression line would be flat •Or a combination of quadratic and linear
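A small demonstration of the nonlinearity point, assuming NumPy and SciPy and a made-up grid of X values: a purely quadratic relationship yields a Pearson's r of essentially zero.

```python
# Y is completely determined by X here, but not linearly, so Pearson's r misses it.
import numpy as np
from scipy import stats

x = np.linspace(-3, 3, 101)
y = x ** 2                     # perfect quadratic relationship
r, p = stats.pearsonr(x, y)
print(round(r, 3))             # ~0.0: the fitted regression line would be flat
```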

Simple Linear Regression - Scores

Raw Scores and Standard Scores •Sometimes raw scores are messy and have wildly different ranges between variables •Transforming raw data into standard scores makes them easier to handle •And easier to understand for other scientists who don't necessarily know anything about what you measured •Z-scores are the most common •Mean of 0.0, standard deviation of 1.0 •Great for assessing the magnitude of each score without thinking in the scale on which it was originally measured
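A minimal z-score sketch with made-up raw scores, assuming NumPy:

```python
# z = (score - mean) / standard deviation
import numpy as np

raw = np.array([150.0, 155, 160, 165])
z = (raw - raw.mean()) / raw.std(ddof=1)   # ddof=1 uses the sample SD
print(z.round(2))                          # mean ~0, SD ~1, scale-free magnitudes
```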

Concluding Paragraph

Reporting Standard Multiple Regression Results •EX: Negative affect, positive affect, openness to experience, extraversion, neuroticism, and trait anxiety were used in a standard regression analysis to predict self-esteem. All correlations, except between openness and extraversion, were statistically significant. The prediction model was significant, F(6, 413) = 106.4, p < .001, and accounted for approximately 60% of the variance of self-esteem (R2 = .607). Self-esteem was primarily predicted by lower levels of trait anxiety and neuroticism, and to a lesser extent by higher levels of positive affect and extraversion. Trait anxiety received the strongest weight in the model, followed by neuroticism and positive affect.
Reporting Stepwise Multiple Regression Results •Negative affect, positive affect, openness to experience, extraversion, neuroticism, and trait anxiety were used in a stepwise multiple regression analysis to predict self-esteem. All correlations, except between openness and extraversion, were statistically significant. The prediction model contained four of the six predictors and was reached in four steps with no variables removed. The model was statistically significant, F(4, 415) = 157.6, p < .001, and accounted for ~60% of the variance of self-esteem (R2 = .603). The unique variance explained by each predictor (as indexed by the squared semipartial correlation) was low: trait anxiety, positive affect, neuroticism, and extraversion uniquely accounted for ~4%, 2%, 3%, and <1% of the variance of self-esteem, respectively.
Simple Regression •"Negative affect was used to predict self-esteem using ordinary least squares regression. A statistically significant degree of prediction was obtained, F(1, 421) = 200.578, p < .001, r2 = .323. The standardized regression coefficient was -.569. Negative affect explained approximately one third of the variance of self-esteem."

What is residual variance?

Residual variance (also called unexplained variance or error variance) is the variance of any error (residual). The unexplained variance is simply what's left over when you subtract the variance due to regression from the total variance of the dependent variable.

What is r2 vs. r. What does r2 change?

Symbolized with R2. •The squared multiple correlation (R2) tells the strength of the complex relationship •When 3 or more variables are involved in the relationship, we cannot use the Pearson r; instead, we use a multiple correlation coefficient •A multiple correlation coefficient indexes the degree of linear association of one variable with a set of other variables •Its square (R2) is called the coefficient of multiple determination.

Dummy and Effect Coding

The Elements of Dummy and Effect Coding •The basics: this technique allows you to use categorical variables to predict a quantitative DV •However, there are decisions to be made about four important elements: 1. The number of codes or code sets that are required 2. The categories that are to be coded 3. The values that are to be used as the codes 4. How the categorical variable is represented in the regression analysis
The Number of Code Sets Required •The number of code sets needed is equal to the degrees of freedom of the categorical variable (k - 1) •So for our example we have 3 levels of the variable, so k = 3 •Degrees of freedom = 3 - 1 = 2
The Categories to Be Coded •Each code set represents one of the k categories in the variable •However, since we give codes to k - 1 categories, one level of the variable will not have its own code •More on this as we move through
The Values Used as Codes •Recommend always using 0 and 1 •Using any other adjacent or non-adjacent numbers makes interpretation of the beta weight and Y-intercept a lot harder in dummy coding •Effect coding uses -1, 0, and 1 •A code set is generated by creating a new column in the data file •Remember k - 1: therefore we need two new variables for our three book cover colors
Dummy Coding •The key decision to be made in dummy coding is establishing the group that will be the reference category •It is the performance of each of the other categories compared to the reference category that is the basis of the analysis •A statistically sig. finding is achieved when a non-reference group is different from the reference group •If the overall regression is significant, the individual predictors are analyzed •Choosing what group should be the reference category: •Typically comparing old to new (e.g. a standard treatment to new treatments) •In our case, there's no real basis for choosing one color over another, so use your best judgment (e.g. assume red and blue would sell best, so green is the reference category)
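A minimal dummy-coding sketch for the hypothetical book-cover example, assuming pandas; green is treated as the reference category (coded 0 on both new columns):

```python
# k - 1 = 2 code columns are created; the dropped category (green) is the reference.
import pandas as pd

covers = pd.Series(["red", "blue", "green", "red", "blue"], name="cover")
codes = pd.get_dummies(covers, prefix="cover", dtype=int).drop(columns="cover_green")
print(codes)   # two 0/1 columns, one for blue and one for red
```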

Simple Linear Regression - The Least Squares Solution

The Least Squares Solution •Goal of simple linear regression is to find the correct linear equation for the regression line •Best method is the least squares solution to fit the line to the scatter plot of data •Deals with the distance of the data points from the line •Two parts: •"Squares" because each data point's distance is squared (doesn't matter if it is above or below the line - just need that raw distance); these squared values are summed up •"Least" because the formula makes the sum of the squared distances the smallest possible
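A minimal least-squares sketch with made-up data, assuming NumPy; the slope and intercept computed below are the values that minimize the summed squared distances:

```python
# slope = cov(X, Y) / var(X); intercept = mean(Y) - slope * mean(X)
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)   # slope of the fitted line
a = y.mean() - b * x.mean()                           # Y-intercept
ss_resid = ((y - (a + b * x)) ** 2).sum()             # the quantity being minimized
print(round(a, 2), round(b, 2), round(ss_resid, 3))
```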

Regression Equation example

The Raw Score Equation •Ypred = a + b1X1 + b2X2 + ... + bnXn •Remember, all values in the raw-score formula are in the units originally measured for that variable •Ex: Using GPA and GRE score to predict success in graduate school •GRE average score range: ~150-165 •GPA average score range: ~3.0-4.0 •"Success" score (Ypred) average score range: ~50-75 •(Assume a Y-intercept of -40.50) •Ypred = -40.50 + (7)(GPA) + (0.5)(GRE) •Here, a 1-point change in GPA is associated with a 7-point increase in Success score, when controlling for GRE •Also, a 1-point change in GRE score is associated with a 0.5-point increase in Success score, when controlling for GPA
EXAMPLE •Ypred = -40.50 + (7)(GPA) + (0.5)(GRE) •Jim-Bob had a GPA of 3.7 and a GRE score of 150. •Plug his scores into the model to predict his Success score! •Ypred = -40.50 + (7)(GPAJim-Bob) + (0.5)(GREJim-Bob) •Ypred = -40.50 + (7)(3.7) + (0.5)(150) •Ypred = -40.50 + (25.9) + (75) •Ypred = 60.4 •Jim-Bob's predicted Success score is 60.4
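A small sketch reproducing the Jim-Bob arithmetic; the intercept and weights are the illustrative values from the example above, not estimates from real data:

```python
# Raw-score prediction: Ypred = a + b1*GPA + b2*GRE, using the example's weights.
def predict_success(gpa, gre, a=-40.50, b_gpa=7.0, b_gre=0.5):
    return a + b_gpa * gpa + b_gre * gre

print(round(predict_success(3.7, 150), 1))   # -40.50 + 25.9 + 75.0 = 60.4
```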

Simple Linear Regression - Regression Equations

The Regression Equations •Raw Score Model: Ypred = a ± bX •Ypred = the predicted value of the variable being predicted •X = value of the variable used as the basis of the prediction •b = coefficient of the X variable. Mathematically, it is the slope of the line •Indicates how much change in Ypred is associated with one unit of change in X •a = Y-intercept of the line. Basically, the value of Y when X = 0 •Standardized Model: Yz pred = ±𝛽Xz •Yz pred = the z-score value being predicted •Xz = the z-score value of the variable used as the basis for prediction •𝛽 = coefficient of the Xz variable. Mathematically, it is the slope based on the standardized X and Y scores •In a simple linear regression, this is Pearson's r •Notice there's no Y-intercept. That's because the standardized line always passes through (0, 0)

Pearson Correlation Using a Quantitative Variable and Dichotomous Nominal Variable

Why a Dichotomous (Two-Category) Predictor Variable Works •Dichotomous variables are nominal, so you make them usable as continuous predictors by dummy coding them •They can then be used in hierarchical regression and multiple regression •We use continuous variables because we are assessing predictive power •For example, assume X is dichotomous (two levels) and Y is continuous •X = Sex (male, female) •Y = Test performance •If we dummy code Sex (e.g. 1 = male, 2 = female), we can correlate this new "value" with test scores •This method works no matter which sex gets which code; the only thing that changes is the sign (positive/negative) of the r value
Why an X Variable With Three or More Categories Will Not Work •The dummy codes are no longer interchangeable with more than 2 categories •If you switch the codes between groups, the stats don't work out •Solution: use a one-way between-subjects ANOVA
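A small demonstration with made-up scores, assuming SciPy: swapping the codes of a dichotomous predictor only flips the sign of r, not its magnitude.

```python
# Correlate a two-group coded variable with a quantitative outcome, then swap codes.
import numpy as np
from scipy import stats

sex = np.array([1, 1, 1, 2, 2, 2])            # arbitrary codes for the two groups
score = np.array([70.0, 75, 72, 80, 85, 83])
r1, _ = stats.pearsonr(sex, score)
r2, _ = stats.pearsonr(3 - sex, score)        # swap the group codes
print(round(r1, 3), round(r2, 3))             # same magnitude, opposite sign
```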

Bivariate Correlation

an association that involves exactly two variables

Difference between R2 and adjusted R2?

· Meaning of Adjusted R2 · Both R2 and the adjusted R2 give you an idea of how well the data points fall around the regression line. However, there is one main difference between R2 and the adjusted R2. · R2 assumes that every single variable explains the variation in the dependent variable. · The adjusted R2 tells you the percentage of variation explained by only the independent variables that actually affect the dependent variable.
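One common form of the adjustment, sketched below with the R2, sample size, and predictor count that appear in the worked output elsewhere in this set:

```python
# R2_adj = 1 - (1 - R2) * (n - 1) / (n - k - 1), where k = number of predictors.
def adjusted_r2(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(round(adjusted_r2(0.607, 420, 6), 3))   # ~0.601, matching the Adjusted R Square
```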

Stepwise MR Output: Coefficients

· The Coefficients table provides the details of the results. Note that both the unstandardized and standardized regression coefficients are readjusted at each step to reflect the additional variables in the model. · Ordinarily, although it is interesting to observe the dynamic changes taking place, we are usually interested in the final model. Note also that the values of the regression coefficients are different from those associated with the same variables in the standard regression analysis.

Hierarchical Regression Analysis Output: Coefficients

· The Coefficients table tracks the changes in the model as variables were entered. · After the second block, not having participated in a multicultural workshop predicts greater awareness when social desirability and years of experience are statistically controlled, which may be of some theoretical and practical interest to the researchers. But the dynamics evolve. · After the third and final block, only the primary three variables—institutional discrimination, ethnic identity exploration, and collectivism—emerge as the statistically significant predictors in the model controlling for the other variables. · The workshop variable is no longer statistically significant at the .05 level, and social desirability is very close to statistical significance; these trends might provide the impetus to further research.

Hierarchical Regression Analysis Output: Correlations

· The correlation matrix of the variables · As we have already seen, the dependent variable is placed in the first row and first column. · The correlations of the other predictors are quite modest, with the highest being associated with institutional discrimination. As can be seen, awareness of cultural barriers is not significantly (p = .176) correlated with social desirability (r = -.049), and thus its effect as a covariate is likely to be minimal.

Stepwise MR Output: Excluded Variables

· The fate of the remaining variables. For each step, the table tells us which variables were not entered. In addition to tests of the statistical significance of each variable, we also see the partial correlations displayed. Together, this information tells us what will happen in the following step. · For example, consider Step 1, which contains the five excluded variables. Positive affect has the highest partial correlation (.294), and it is statistically significant; thus, it will be the next variable entered, on Step 2. On the second step, with four variables (of the six) being considered for inclusion, we see that neuroticism, with a statistically significant partial correlation of −.269, wins the struggle for entry next. By the time we reach the fourth step, there is no variable in the excluded set with a statistically significant partial correlation for entry at Step 5; thus, the stepwise procedure ends after completing the fourth step.

Hierarchical Regression Analysis Output: Model Summary

· The main results are contained in here. · We can see in the Model Summary that the R2 value went from .002 to .030 to .276 over the three hierarchical blocks of variables, with the R2 change being statistically significant for the second and third blocks. · The squared multiple correlations (R2) tells the strength of the complex relationship

Interpretation of Statistical Significance

•Hypothesis testing: We want to know if our correlation (r) is significantly different from zero •Null: the correlation between the two variables is zero (i.e. no relationship) •Goal: the correlation coefficient is large enough at our given alpha level, based on our sample size •The magnitude of r needed for significance depends on the size of your sample
Statistical Significance and Sample Size •Why is sample size so important in correlation? •The distribution of r changes substantially with changes in sample size •Thus, your r value needs to be larger with a smaller sample size, and vice versa •Example: at the .05 alpha level you would need... •r = .67 for n = 9 •r = .38 for n = 27 •r = .20 for n = 100 •r = .062 for n = 1,000 •The point is, significance and effect size are not the same thing! •Statistical significance ≠ strong relationship
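The thresholds listed above can be reproduced with the same critical-r identity shown earlier in this set (a sketch assuming SciPy):

```python
# Required r for significance at alpha = .05 (two-tailed) shrinks as n grows.
from scipy import stats

for n in (9, 27, 100, 1000):
    df = n - 2
    t_crit = stats.t.ppf(0.975, df)
    print(n, round(t_crit / (t_crit ** 2 + df) ** 0.5, 3))  # ~.67, .38, .20, .062
```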

Multiple Regression Output

· The middle table shows the test of significance of the model using an ANOVA. · There are 419 (N − 1) total degrees of freedom. With six predictors, the Regression effect has 6 degrees of freedom. · The Regression effect is statistically significant (p < .05), indicating that the combination of predictors explains more of the variance of the dependent variable than can be done by chance alone. · The Model Summary provides an overview of the results. · Of primary interest are the R Square and Adjusted R Square values, which are .607 and .601, respectively. · We learn from these that the weighted combination of the predictor variables explained approximately 60% of the variance of self-esteem. The loss of so little strength in computing the Adjusted R Square value is primarily due to our relatively large sample size combined with a relatively small set of predictors. · Using the standard regression procedure, where all of the predictors were entered simultaneously into the model, R Square Change went from zero before the model was fitted to the data to .607 when all of the variables were entered. · The Partial column under Correlations lists the partial correlations for each predictor as it was evaluated for its weighting in the model •Zero-order: Pearson's r correlation of the DV with each IV •Partial: correlation for each IV when controlling for the other IVs •Part: semipartial correlation for each IV when the model is finalized •Squaring these values shows the percentage of variance that each IV uniquely contributes to the prediction

Stepwise MR Output: ANOVA

· The ANOVA table displays the test of significance of the model using an ANOVA. · The four ANOVAs that are reported correspond to four models, but don't let the terminology confuse you. · The stepwise procedure adds only one variable at a time to the model as the model is "slowly" built. At the third step and beyond, it is also possible to remove a variable from the model. · The table informs us that the final model was built in four steps; each step resulted in a statistically significant model. Examining the df column shows us that one variable was added during each step (the degrees of freedom for the Regression effect track this for us, as they are counts of the number of predictors in the model). · We can also deduce that no variables were removed from the model, since the count of predictors in the model steadily increases from 1 to 4.

Hierarchical Regression Analysis Output: Anova

· The ANOVA table presents the results of the significance testing of the models. This output is structured much like the output from a stepwise analysis. · Here, each model corresponds to each block of variables that we entered in succession. · The results of the first block confirm what we saw from the correlation matrix: social desirability did not significantly increase our predictive ability. · The second block, composed of the experience variables, and the third block, composed of our predictors of interest, were each statistically significant; that is, a statistically significant amount of prediction was obtained for each of these blocks.

Stepwise MR Output: Model Summary

· The Model Summary presents the R Square and Adjusted R Square values for each step, along with the amount of R Square Change. · In the first step, as can be seen from the footnote beneath the Model Summary table, trait anxiety was entered into the model. The R Square with that predictor in the model was .525. Not coincidentally, that is the square of the correlation between trait anxiety and self-esteem ((−.724)² = .525) and is the value of R Square Change. · On the second step, positive affect was added to the model. The R Square with both predictors in the model was .566; thus, we gained .041 in the value of R Square (.566 − .525 = .041), and this is reflected in the R Square Change for that step. · By the time we arrive at the end of the fourth step, our R Square value has reached .603. Note that this value is very close to, but not identical to, the R2 value we obtained under the standard method.

Regression Method: The Forward Method

•Adds predictors one at a time, in order of which has the highest predictive power at that point (e.g. the highest correlation with the DV) •The next step is to enter the predictor with the highest partial correlation that is statistically significant •We repeat the process for each significant predictor, that is, the variable with the highest statistically significant partial correlation
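A compact forward-selection sketch with made-up data, assuming statsmodels and pandas; at each step the not-yet-entered predictor with the smallest significant coefficient p-value (equivalently, the largest significant partial contribution) is added:

```python
# Forward entry: keep adding the best remaining predictor until none is significant.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
X = pd.DataFrame(rng.normal(size=(200, 4)), columns=["x1", "x2", "x3", "x4"])
y = 0.6 * X["x1"] + 0.3 * X["x2"] + rng.normal(size=200)

selected, remaining = [], list(X.columns)
while remaining:
    pvals = {}
    for cand in remaining:
        fit = sm.OLS(y, sm.add_constant(X[selected + [cand]])).fit()
        pvals[cand] = fit.pvalues[cand]
    best = min(pvals, key=pvals.get)
    if pvals[best] >= 0.05:        # no significant candidate remains: stop
        break
    selected.append(best)
    remaining.remove(best)
print(selected)                    # most likely ["x1", "x2"]
```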

Regression Method: Backward Versus Forward Solutions

•Backward does not always produce the same model as forward •The criterion for a predictor to be entered in the forward method is more stringent than the criterion for it to be retained in the backward method •It is more difficult to get into the model than to remain in it •When determining entry into the model, alpha is at the .05 level •Removal in the backward method is based on a less stringent alpha of .10 •This is seen most notably for those variables in the border zone •The backward method would "keep" variables with p > .05 and < .10, whereas they would never make it into the forward model

Regression Method: Evaluation of the Statistical Methods

•Benefits of the Standard Method •More control based on the priority/interest of the researcher •Everything stays in the model because you want it to be there or think it's important! •Benefits of the Step Methods •Putting your faith in the math to determine what is important to keep •Criticisms of the Statistical Methods as a Set •Good predictors might get a low weight in the model because of a "masking" effect •Step methods are less popular as hierarchical techniques have become more popular •Balancing the Value of All the Statistical Methods of Building the Model •The standard method is best for hypotheses about the model as a whole •Stepwise methods are atheoretical and risk lower external validity

Collinearity

•Collinearity occurs when two predictors correlate very strongly •Predictors that are highly correlated will wash each other out in the model and receive low weights •However, R2 will still be large! •Avoid r >.7 •Common causes: •Using both the subscales of an inventory and full composite score •Using variables separately that assess the same construct •E.g. height and weight - consider making a composite variable (BMI) •Including variables that are mathematical transformations of each other (e.g. correct and incorrect responses) Solution: The potential solutions include the following: Remove some of the highly correlated independent variables. Linearly combine the independent variables, such as adding them together. Perform an analysis designed for highly correlated variables, such as principal components analysis or partial least squares regression.

Regression Method: The Stepwise Method

•Fusion of the forward and backward methods •Functions the same as the forward method until you reach the third predictor; from that point on, previously entered predictors are also re-evaluated and can be removed •Predictors are included in the model if they significantly (p < .05) add to the predicted variance of the DV

Multicollinearity

•Multicollinearity occurs when more than two predictors correlate very strongly •This only refers to correlations between the predictors, not with the DV •Predictors that are highly correlated will wash each other out in the model and receive low weights •However, R2 will still be large! •Avoid r > .7 •Common causes: •Using both the subscales of an inventory and the full composite score •Using separate variables that assess the same construct •E.g. height and weight - consider making a composite variable (BMI) •Including variables that are mathematical transformations of each other (e.g. correct and incorrect responses) •Solutions: The potential solutions include the following: remove some of the highly correlated independent variables; linearly combine the independent variables, such as adding them together; or perform an analysis designed for highly correlated variables, such as principal components analysis or partial least squares regression.
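One common diagnostic is the variance inflation factor (VIF). A sketch assuming statsmodels, with made-up predictors where x2 is nearly a copy of x1; rules of thumb often treat VIF values above roughly 5-10 as problematic:

```python
# VIF for each predictor; highly collinear predictors produce large VIFs.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.1, size=100)      # correlation with x1 well above .7
x3 = rng.normal(size=100)
X = sm.add_constant(np.column_stack([x1, x2, x3]))
vifs = [variance_inflation_factor(X, i) for i in range(1, X.shape[1])]
print([round(v, 1) for v in vifs])             # large VIFs for x1 and x2, ~1 for x3
```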

Pearson's r

•Pearson's correlation •Pearson's r statistic •Assesses the extent to which a linear relationship exists between two quantitatively measured variables •Pearson's r ranges from -1.00 to +1.00; 0 = no association

Different Types of Relationships

•Perfect Positive Relationships •Each data point in the scatterplot is a coordinate of X and Y •The covariation pattern: every Y score is 10 points higher than each corresponding X score •This is a perfect positive association •Pearson's r correlation value = +1.00 •Direct Relationship •This perfect association makes accurate predictions possible; we can predict unknown values of Y when we know X •Correlation is the foundation of prediction •Predict using regression
•Perfect Negative Relationships •The covariation pattern: every 2-point increase in X is associated with a 1-point decrease in Y •This is a perfect negative association •Pearson's r correlation value = -1.00 •Inverse Relationship •This perfect association makes accurate predictions possible; we can predict unknown values of Y when we know X •Again, this is the basis for prediction using regression
•Nonperfect Positive Relationships •The covariation pattern: higher X scores are generally associated with higher Y scores •This is a positive association •Pearson's r correlation value = +0.731 •Direct Relationship •We can plot a line of "best fit" for the trend •This association makes prediction possible, but with some margin for error; we can predict unknown values of Y when we know X, to a certain extent
•Absence of Relationship with Variance on Both Variables •The covariation pattern: scores on X are not systematically associated with scores on Y •There is no association •Pearson's r correlation value = -0.033 •No Relationship •The "best fit" trend line is flat •This association makes prediction impossible
•Absence of Relationship with No Variance on Both Variables •The covariation pattern is nonexistent •There is no variability, therefore no correlation •Pearson's r correlation value = 0.0 •No Relationship •The "best fit" line cannot be applied
•Covariation Is Not the Same Thing as Mean Difference •Take-home message: •Two variables can be highly correlated whether their means are of equal magnitudes or very different magnitudes •Correlation assesses the degree to which two variables covary but does not speak to the absolute differences in magnitude of the values taken on by the variables •Positive and negative correlations signify the direction of the relationship, while r2 reflects the strength

simple linear regression

•Regression is for prediction •Simple linear regression = use a single variable (X) to predict a single variable (Y) •"Simple" because only one IV •"Linear" because we use a straight-line function for the data and use the Pearson correlation •"Regression" because we are predicting something •Conceptualize regression as a line of best fit through a scatter plot •The goal is to use your knowledge of an X variable to predict an unknown Y variable
The Least Squares Solution •Goal of simple linear regression is to find the correct linear equation for the regression line •Best method is the least squares solution to fit the line to the scatter plot of data •Deals with the distance of the data points from the line •Two parts: •"Squares" because each data point's distance is squared (doesn't matter if it is above or below the line - just need that raw distance); these squared values are summed up •"Least" because the formula makes the sum of the squared distances the smallest possible

Regression Method: The Backward Method

•Starts with the standard method approach: add in all the predictors regardless of their worth to the model •Then, nonsignificant predictors are evaluated for removal, beginning with the one whose loss would least decrease the R2 •The process repeats until only significant predictors remain in the equation •In our example, Openness would be removed and the process would end because Positive and Negative Affect are significant

Partial Correlation: Squared Multiple Correlation.

•Symbolized with R2. •The squared multiple correlation (R2) tells the strength of the complex relationship •When 3 or more variables are involved in the relationship, we cannot use the Pearson r; instead, we use a multiple correlation coefficient •A multiple correlation coefficient indexes the degree of linear association of one variable with a set of other variables •Its square (R2) is called the coefficient of multiple determination.

Partial Correlation: Squared Semipartial Correlation

•The amount of variance uniquely (independently) explained by a predictor when combined with other predictors in the model •Describes the linear relationship between a given predictor and the variance of the DV

Structure Coefficients

•The bivariate correlation between a particular independent variable and the predicted score (Ypred) •represents the correlation between one of the predictor variables that is part of the variate (random variable) and the weighted linear combination itself •What is it good for? •Stronger correlations indicate that the predictor variable is a stronger reflection of the construct underlying the variate

Partial Correlation: Squared semipartial and squared partial correlation.

•The squared semipartial correlation: •b / (a + b + d + f) •Based on the total variance of the DV •Result: the percentage of the variance of the DV that is associated with the unique contribution of IV1 •The squared partial correlation: •b / (b + f) •Based only on the relationship between the predictor variable and the DV after the portion of variance accounted for by the other predictors is removed •Result: the percentage of the variance of the DV not predicted by the other variables that is exclusively associated with IV1 •The squared multiple correlation (R2): •(a + b + d) •Proportion of the total variance of the DV covered by all predictors, not just their unique contributions (b and d) •Generally, we use the squared multiple correlation to tell how well the model fits the data •We use the squared semipartial correlation to evaluate how well the model works on an individual-predictor level •The squared semipartial correlation represents the proportion of variance of the DV uniquely explained by an IV when the other predictors are taken into consideration. The squared partial correlation is the amount of explained variance of the DV that is incremented by including an IV in a model that already contains the other predictors.
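A small sketch putting made-up numbers to the a/b/d/f variance components described above; the specific proportions are hypothetical:

```python
# b = DV variance uniquely shared with IV1, a + d = DV variance shared with the
# other IVs, f = residual (unexplained) DV variance.
a, b, d, f = 0.20, 0.10, 0.15, 0.55            # hypothetical proportions of DV variance

sq_semipartial = b / (a + b + d + f)           # relative to ALL of the DV variance
sq_partial = b / (b + f)                       # relative to the DV variance left over
r_squared = a + b + d                          # variance explained by the full model
print(round(sq_semipartial, 3), round(sq_partial, 3), round(r_squared, 3))
```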

Standardized Equation

Yz pred = 𝛽1Xz1 + 𝛽2Xz2 + ... + 𝛽nXzn •Remember, it works the same way as simple linear regression •Yz pred = the predicted z-score on the criterion variable •Xz = z-score of each predictor variable in the equation •𝛽 = weight or coefficient of each X variable •Referred to as the beta weight, standardized regression coefficient, or beta coefficient •Also called partial regression coefficients •There is no longer a Y-intercept; when standardized, it is always zero.

