Stats 2 Exam 3

Simple Linear Regression

When one independent variable is used to predict one dependent variable, and the function predicting one variable from the other is a straight line (i.e., finding the equation for that line).

The multiple regression equation (model)

Ypred = a + b1X1 + b2X2 + ... + bnXn
-a is the Y-intercept, i.e. Ypred when all X's are 0
-bi is the b weight (coefficient) for Xi: the amount of change in Ypred for every unit change in Xi, holding all other X's constant
-The b weights are also called partial regression coefficients
The goal of regression analysis:
-A linear model consisting of the best weighted combination of the predictors to optimally predict the DV
-To find the set of b weights (regression coefficients) that brings the Ypred values as close as possible to the actual Y values
-Minimize the sum of squared deviations between predicted and actual Y values, Σ(Y - Ypred)^2, i.e. minimize prediction errors
-Optimize the correlation between the predicted and actual Y, rY,Ypred
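A minimal sketch of this idea in Python (hypothetical data; numpy assumed available). np.linalg.lstsq finds the intercept and b weights that minimize Σ(Y - Ypred)^2:

import numpy as np

# Hypothetical data: 6 cases, 2 predictors (e.g., GPA and GRE), one DV
X = np.array([[3.2, 610], [3.8, 650], [2.9, 540], [3.5, 600], [3.0, 580], [3.9, 700]])
Y = np.array([68., 74., 60., 70., 63., 78.])

# Add a column of 1s so the first coefficient is the intercept a
X_design = np.column_stack([np.ones(len(Y)), X])

# Least squares: coefficients that minimize sum((Y - Ypred)**2)
coefs, *_ = np.linalg.lstsq(X_design, Y, rcond=None)
a, b1, b2 = coefs

Ypred = X_design @ coefs
sse = np.sum((Y - Ypred) ** 2)            # the quantity OLS minimizes
r_y_ypred = np.corrcoef(Y, Ypred)[0, 1]   # correlation between actual and predicted Y (= R)
print(a, b1, b2, sse, r_y_ypred)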

The Standardized Multiple Regression Equation

Yz pred = β1Xz1 + β2Xz2 + ... + βnXzn
-All the variables have been transformed to standardized scores, i.e. z scores
-All the variables are measured on the same metric, with a mean of 0 and a standard deviation of 1
-βi is the beta weight (beta coefficient, standardized regression coefficient) for Xzi
-Beta weights are also called standardized partial regression coefficients
e.g. in standard scores: Yz pred = (.48)GPAz + (.22)GREz
Yz pred, Erin = (.48)(1.80) + (.22)(1.70)
Yz pred, Erin = 1.24
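A small worked sketch of the standardized equation (the beta weights and Erin's z scores come from the example above; the to_z helper is only an illustration):

import numpy as np

# Beta weights from the example equation: Yz_pred = .48*GPAz + .22*GREz
betas = np.array([0.48, 0.22])

# Erin's predictors already in z-score form (from the example)
erin_z = np.array([1.80, 1.70])

yz_pred_erin = betas @ erin_z   # (.48)(1.80) + (.22)(1.70)
print(round(yz_pred_erin, 2))   # 1.24

# Converting any raw variable to z scores: z = (x - mean) / sd
def to_z(x):
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)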

Covariation

how the variability (differences among the scores) in one variable corresponds with the variability (differences among scores) in the other variable

Difference between r and R? r^2 and R^2?

"Multiple" One variable vs. multiple variables as the predictor

Relationship = ?

Covariation = Shared Variance

Squared semipartial correlation (sr^2)

The unique contribution of an IV relative to the total variance of the DV (the denominator includes the variance contributed by the other variables): a/(a+b+c+d). Focus for this class.

Dummy Coding

-A categorical IV with more than two levels is recoded into a set of dichotomous variables
-The number of dummy variables = # of levels - 1
-The dummy variables are entered as a set in the regression analysis, so the variance due to the original categorical variable can be determined
-e.g. IV: Treatment with 3 levels (Med., Psychotherapy, Placebo)
-Number of dummy variables = 3 - 1 = 2 (subtract one, otherwise you get singularity)
-Med = 1, no med = 0
-Psychotherapy = 1, no psychotherapy = 0
-0, 1 = Psychotherapy; 1, 0 = Med.; 0, 0 = Placebo
-The level not assigned a dummy variable is the reference group (the placebo condition in the above e.g.)
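A minimal sketch of dummy coding in Python (hypothetical treatment data; numpy assumed):

import numpy as np

# Hypothetical categorical IV with 3 levels; Placebo is the reference group
treatment = np.array(["Med", "Psychotherapy", "Placebo", "Med", "Placebo", "Psychotherapy"])

# Number of dummy variables = levels - 1 = 2
d_med    = (treatment == "Med").astype(int)            # 1 = Med, 0 = not Med
d_psycho = (treatment == "Psychotherapy").astype(int)  # 1 = Psychotherapy, 0 = not

dummies = np.column_stack([d_med, d_psycho])
# Rows coded (1,0) = Med, (0,1) = Psychotherapy, (0,0) = Placebo (reference)
print(dummies)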

Suppressor Variable

-An IV that increases R2 by correlating with the error variance in another predictor, thus decreasing that error variance and making the other IV a better predictor of the DV
-Minimal correlation with the DV on its own
-Suppresses the error variance and makes the other variables even better predictors
-Often not correlated strongly with the DV
Indications that you may have a suppressor variable:
-The Pearson correlation between it and the criterion variable is much smaller than its beta weight
-Its Pearson correlation with the criterion variable and its beta weight have opposite signs
-It may have a near-zero correlation with the criterion variable but is a significant predictor in the model
-It may be correlated only slightly with the DV but is correlated with one or more of the IV's

The Sequential Regression Method

-IV's (or sets/blocks of IV's) enter the equation in an order determined by the researcher
-Each IV (or set of IV's) is assessed in terms of what it adds to the equation at its own point of entry, i.e. each IV is assigned the variance, unique and overlapping, that is still left to it, unclaimed by previously entered variables
-Easy/cheap variables are often entered first (claiming the shared variance) and hard/expensive variables next, to see whether they add to the prediction beyond the easy/cheap ones
-No variance is left unclaimed

Three Major Types of Multiple Regression?

-Standard (Simultaneous)
-Sequential (Hierarchical)
-Stepwise (Statistical)
All three compute the b and beta coefficients based on how much variance in the DV can be accounted for by each IV.
Problem: overlapping variance (i.e. variance in the DV that can be accounted for by 2 or more IV's).
The types of multiple regression differ in how each deals with overlapping variability due to correlation of the IV's, and in how the order of entry of the IV's into the equation is determined.

Stepwise Regression

-The order of entry, and which IV's get included in the regression model, is determined solely by statistical criteria
-A controversial procedure
-Which IV's get included and which get excluded could be based on very minor differences in statistics computed from a particular sample
-A variable does not get to stay in the equation if it does not "earn its keep": if the sample size is not big enough, or the variable does not make a unique contribution that adds significantly to R^2, it does not enter the equation; because inclusion can depend on such sample-specific errors, external validity can suffer

Interaction

-The regression coefficient of one IV (X1) varies over the range of another IV (X2)
-X2 is said to moderate the effect of X1 on the DV
-Interaction is represented by a cross-product term, X1X2, e.g. Ypred = a + b1X1 + b2X2 + b3X1X2
-While the cross-product is not linear, the equation is intrinsically linear because the cross-product is redefined as a single variable
-If the b3 weight for the cross-product term is nonsignificant -> no interaction; you can hold all other IV's constant and interpret each b weight as the increase in Y for every unit increase in that X
-When the interaction term is significant, be careful interpreting the main effects (b1 and b2), because each varies over the range of the other variable
-Instead, analyze simple effects (simple slopes): the effect of X1 at various levels of the moderator X2 (e.g. low, medium, high, or -1SD, mean, +1SD)
-The predictor and moderator need to be centered (converted to deviation scores) when interaction terms are included: find the mean, subtract it from each score, and then interpret the b weights at the average of the other variable (e.g., for an average amount of mindfulness, this is how the site variable impacts the DV)
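A minimal sketch of centering and probing simple slopes in Python (hypothetical data; numpy assumed; an illustration of the general procedure, not the class's exact steps):

import numpy as np

# Hypothetical predictor (x1), moderator (x2), and DV (y)
x1 = np.array([2., 4., 5., 7., 8., 10.])
x2 = np.array([1., 3., 2., 5., 4., 6.])
y  = np.array([5., 9., 10., 16., 15., 21.])

# Center predictor and moderator (deviation scores) before forming the cross-product
x1c, x2c = x1 - x1.mean(), x2 - x2.mean()
cross = x1c * x2c                       # interaction term X1*X2

X = np.column_stack([np.ones_like(y), x1c, x2c, cross])
b, *_ = np.linalg.lstsq(X, y, rcond=None)   # a, b1, b2, b3

# Simple slopes: effect of X1 at -1 SD, mean, +1 SD of the (centered) moderator
for level in (-x2.std(ddof=1), 0.0, x2.std(ddof=1)):
    print(level, b[1] + b[3] * level)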

Common causes of multicollinearity

-Using both subscale and full-scale scores as predictors
-Two measures that are mathematical transformations of each other
-Using variables that assess the same construct

The Point-Biserial Correlation

-A special version of the Pearson correlation
-Used to measure the relationship between a quantitative and a dichotomous variable
-The dichotomous variable is recoded into 2 consecutive numbers (e.g. 0 and 1, or 1 and 2)
-Equivalent to an independent-samples t test: interpreting the difference between two means
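A small sketch showing that the point-biserial r is simply a Pearson correlation on 0/1 coding, and its link to the t statistic (hypothetical data; numpy assumed):

import numpy as np

# Hypothetical dichotomous variable (0/1) and a quantitative variable
group = np.array([0, 0, 0, 1, 1, 1])
score = np.array([10., 12., 11., 15., 17., 16.])

# Point-biserial r is just the Pearson correlation with the 0/1 coding
r_pb = np.corrcoef(group, score)[0, 1]

# Equivalent t statistic (same test as an independent-samples t test), df = n - 2
n = len(score)
t = r_pb * np.sqrt(n - 2) / np.sqrt(1 - r_pb**2)
print(r_pb, t)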

Testing the b weight, beta, pr, sr (for Standard Regression)

-t tests
-Test each b and beta weight to see if it is significantly different from 0
-A significant b or beta weight means that the associated IV contributes significantly to the prediction of the DV
-Limitation: tests only the unique contribution of an IV

Multiple Regression

A very useful extension of simple linear regression. Predicts the value of one dependent (criterion) variable from a linear combination (variate) of several independent (predictor) variables. Also serves explanatory purposes by examining the relationships among variables (e.g. which variables in combination associate more strongly with a construct, the correlation between one variable and a set of variables, partial correlations).

Scatterplots

A way to graph and see the correlation between two variables. An oval is drawn around the data points to visualize the overall shape of the scatterplot. That the dots can be described by an oval, rather than a square/circle, indicates that there is a relationship. An oval that slants upward from left to right indicates a positive relationship. An oval that does not appear to bend indicates a linear relationship. An oval that is relatively narrow, i.e. the data points cluster close around a line, indicates a strong relationship.

Coefficient of Determination, r^2

A way to quantify the strength of a relationship: the % of variance shared by two variables, or the % of variance in one variable that is accounted for (explained) by the other variable. Example: If height (X) and weight (Y) are correlated with r = .50, then r^2 = .25. Height accounts for, or explains, 25% of the variance in weight; 25% of the variance in weight is accounted for, or explained, by the variance in height. Explaining 25% of the variance (r^2 = .25) is a large effect.

Adjusted R^2

Adjusted for inflation in R^2. An estimate of the shrinkage when the model is applied to another sample. If the sample size is large, R^2 won't change much and the model is reliable when tested on another sample.
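A small sketch of the usual adjusted R^2 formula, showing that shrinkage is largest with small samples and many predictors (values are hypothetical):

def adjusted_r2(r2, n, k):
    # n = sample size, k = number of predictors
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Shrinkage is larger with a small sample and many predictors
print(adjusted_r2(0.30, n=25,  k=5))   # noticeably smaller than .30
print(adjusted_r2(0.30, n=500, k=5))   # barely changes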

The Standard Regression Method

All IV's enter the equation in a single "step". The final regression model includes all the predictors. Each IV is assessed as if it had entered the regression after all other IV's had entered and done their predictive work, i.e. its b and beta weights are determined by how much of the residual variance in the DV this IV can account for. In other words, each IV is evaluated in terms of what it adds to the prediction of the DV above and beyond what is afforded by all the other IV's (the unique contribution each IV makes to the prediction of the DV). All variables are treated as if they entered last, and their weights are determined by their unique contributions. Shared variance does not get attributed to any IV.

The Regression Equation, Raw Scores

Also known as the regression model, i.e. a theory of how one variable can be predicted from the other.
Ypred = a + bX
Ypred: the predicted value of the dependent variable, also known as the criterion variable.
X: the independent variable or predictor.
a: the Y-intercept, i.e. Ypred when X = 0; a is a constant.
b: the coefficient of the X variable. It is the amount X is weighted by (multiplied by), AKA the b weight or b coefficient. It is also the slope of the line: how much Y changes for every unit change in X.
Once a and b are determined, the regression line can be drawn.
Raw scores are often measured on very different metrics (e.g. GPA, GRE), so it is hard to compare two variables in raw-score form.
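A minimal sketch of computing a and b from raw scores (hypothetical data; numpy assumed; uses the standard identities b = r*SDy/SDx and a = meanY - b*meanX):

import numpy as np

# Hypothetical raw-score data
x = np.array([1., 2., 3., 4., 5.])
y = np.array([2., 4., 5., 4., 6.])

r = np.corrcoef(x, y)[0, 1]
b = r * y.std(ddof=1) / x.std(ddof=1)   # slope: change in Ypred per unit change in X
a = y.mean() - b * x.mean()             # Y-intercept: Ypred when X = 0

y_pred = a + b * x
print(a, b)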

Cross-validation

Another way to estimate shrinkage:
-Randomly divide a large sample into two
-Derive the regression equation with one subsample
-Use the equation to predict the criterion scores (Ypred) for the other subsample
-Correlate the predicted scores with the actual scores; this correlation is the cross-validation coefficient
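A minimal cross-validation sketch in Python (simulated data; numpy assumed):

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample: one predictor, one criterion
x = rng.normal(size=200)
y = 0.6 * x + rng.normal(scale=0.8, size=200)

# Randomly split into two subsamples
idx = rng.permutation(200)
train, test = idx[:100], idx[100:]

# Derive the regression equation from one subsample
b = np.polyfit(x[train], y[train], 1)            # slope, intercept
y_pred_test = np.polyval(b, x[test])

# Cross-validation coefficient: correlation of predicted and actual Y in the other subsample
print(np.corrcoef(y_pred_test, y[test])[0, 1])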

Collinearity, Multicollinearity, Singularity

Can distort the interpretation of multiple regression:
-If 2 IV's are highly correlated, they are confounded with one another; we cannot say which was more relevant
-In standard regression, it is possible for multicollinear predictors to appear nonsignificant in the prediction of the DV despite each IV correlating highly with the DV
-When multicollinearity is very high, the analysis cannot proceed

When IV's are uncorrelated with each other

Compare the standardized regression coefficients (beta weights). IV's with bigger beta weights (in absolute value) are more important than those with smaller ones. However, beta weights are affected by the variances of the variables.

Double Cross-Validation

Cross-validation in both directions. Get two cross-validation coefficients, i.e. two estimates of shrinkage. The cross-validation coefficients give an idea of how generalizable the regression model is.

Order of Entry of IVs in Multiple Regression

Determines whether a variable gets credit for the variance it shares with other variables, and thus affects its b and beta weights. The variable that enters first takes the most shared variance (large b weight); the variable that enters last has the least shared variance left to take and could be nonsignificant.
Standard & Stepwise (your book calls both of these Statistical):
-The order of entry of the IV's into the equation is determined by the computer program (i.e. by statistical decision-making criteria)
-If the first variable changed, it could change the b weights
Sequential/Hierarchical:
-The order of entry of the IV's into the equation is controlled by the researcher; decision-making is based on theoretical and practical reasons

A Correlation measures three characteristics of a relationship

Direction (positive or negative), form (linear?), and degree (strength: how well the data fit a straight line). Positive correlation = direct relationship. Negative correlation = inverse relationship.

Multiple Regression Prediction Example

IV's: Average GRE score and undergraduate GPA. DV: Early graduate school success, measured on a scale from 50 to 80.
Say the regression equation found is: Ypred = 46.50 + 5(GPA) + .01(GRE)
We want to predict one student Erin's early success based on her GPA of 3.80 and average GRE score of 650:
Ypred, Erin = 46.50 + 5(3.80) + .01(650)
Ypred, Erin = 46.50 + 19 + 6.50
Ypred, Erin = 72
The magnitudes of the b weights (partial regression coefficients) are partly dependent on the different metrics of GPA, GRE, and the success scores. Thus, we cannot simply compare b weights.
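The same arithmetic as a tiny Python check (numbers taken from the example above):

# Reproducing the worked prediction for Erin
a, b_gpa, b_gre = 46.50, 5.0, 0.01
gpa, gre = 3.80, 650

y_pred_erin = a + b_gpa * gpa + b_gre * gre
print(y_pred_erin)   # 46.50 + 19 + 6.50 = 72.0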

Dealing with Multicollinearity

Identifying collinearity:
-Examine the Pearson correlations before the multiple regression
-Rule of thumb: any two IV's with an r larger than the mid .7's should not be used together as predictors
Identifying multicollinearity:
-Tolerance: the amount of a predictor's variance not accounted for by the other predictors
-i.e. Tolerance = 1 - R^2 (where R is the correlation between the predictor and a linear combination of the other predictors)
-Want large tolerance; don't want less than .10
-A related statistic: variance inflation factor (VIF) = 1/Tolerance; don't want it to be bigger than 10
What to do with multicollinear IV's:
-Use only one variable in the multicollinear set
-Create an average (composite) of the set
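A minimal sketch of computing tolerance and VIF by regressing each predictor on the others (simulated data; numpy assumed):

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical correlated predictors
x1 = rng.normal(size=100)
x2 = 0.8 * x1 + rng.normal(scale=0.5, size=100)
x3 = rng.normal(size=100)
X = np.column_stack([x1, x2, x3])

def tolerance_and_vif(X, j):
    # Regress predictor j on the remaining predictors; tolerance = 1 - R^2
    others = np.delete(X, j, axis=1)
    design = np.column_stack([np.ones(len(X)), others])
    coefs, *_ = np.linalg.lstsq(design, X[:, j], rcond=None)
    pred = design @ coefs
    r2 = np.corrcoef(pred, X[:, j])[0, 1] ** 2
    tol = 1 - r2
    return tol, 1 / tol   # want tolerance > .10, VIF < 10

for j in range(X.shape[1]):
    print(j, tolerance_and_vif(X, j))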

The Least Squares (Ordinary Least squares) Procedure

A way of fitting a straight line to an array of data. There is only one least squares regression line for a given array of data.
Step 1: find the distance between each data point and the line
Step 2: square each distance
Step 3: sum the squared distances
Step 4: fit the regression line where the sum of squared distances is the smallest possible
SPSS does this for us. Ypred = a (Y-intercept) + bX

For sequential & stepwise (SPSS) test the sr2

Look at the Change Statistics (in Model Summary) -> how R Square changes with each additional variable, to see how it adds to the prediction of the DV.
R Square Change = sr^2
F Change = F ratio for the change in R^2
Sig. F Change tells you whether the associated IV adds significantly to the prediction of the DV
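A minimal sketch of R Square Change for a variable entered at a second step (simulated data; numpy assumed; analogous to what the Model Summary's Change Statistics report):

import numpy as np

def r_squared(X, y):
    # R^2 for an OLS model with intercept
    design = np.column_stack([np.ones(len(y)), X])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    pred = design @ coefs
    return np.corrcoef(pred, y)[0, 1] ** 2

rng = np.random.default_rng(2)
x1 = rng.normal(size=120)
x2 = rng.normal(size=120)
y = 0.5 * x1 + 0.3 * x2 + rng.normal(size=120)

r2_step1 = r_squared(x1.reshape(-1, 1), y)
r2_step2 = r_squared(np.column_stack([x1, x2]), y)

# R Square Change for the variable entered at step 2 = its sr^2 at that point of entry
print(r2_step2 - r2_step1)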

R and R^2

R & R^2 index the strength of the correlation between the DV and a set of IV's, similar to how r and r^2 index the strength of the relationship between two variables.
R - the multiple correlation coefficient:
-R = rY,Ypred
-R is the correlation between the predicted Y values and the actual Y values
-The correlation between the DV and the set of IV's
-Indexes the association of one variable with a set of other variables
-Ex: how well predicted graduate GPA correlates with actual graduate GPA, i.e. the correlation between the IV's (undergrad GPA, GRE) and graduate GPA
R^2 - the coefficient of multiple determination:
-R^2 = % of variance in the DV that is accounted for/explained by the set of predictors (the regression model/equation)
-One way to evaluate a regression model

Issues to Consider with Stepwise Regression

Most useful for prediction:
-When the goal is to find a subset of IV's that predicts the DV and exclude those IV's that do not add to the prediction
Sample:
-Should be large and representative
Additional problem with R^2:
-Several IV's considered together may increase R^2 significantly, while no single one does on its own
-For the forward selection method, that means no IV enters the equation
Cross-validation:
-The sample is split into two random samples (e.g. 80% and 20%)
-After the regression equation is found with the 80% sample, it is validated with the 20% sample to see how well it predicts

R2 is inflated

Multiple regression analysis maximizes R^2: variability due to random/measurement error is treated as if it represents real, reliable differences. Overprediction is more of a problem with a small sample size and a large number of predictors (> 20 cases per predictor is more acceptable). R^2 is inflated with a small sample size because the solution capitalizes on sample-specific error rather than only the real underlying relationships; with a large sample size, R^2 is not inflated as much. R^2 will shrink when the sample size is small and the equation is applied to a different sample.

The Standardized Regression Equation

It is often desirable to convert raw scores to standardized scores (z scores). A z score = how many standard deviations a score is from the mean of the distribution. A z score distribution has a mean of 0 and a standard deviation of 1.
The standardized equation: Yz pred = βXz
Yz pred is the z score value being predicted, the value of the criterion variable in z scores.
Xz is the predictor variable converted to z scores.
β (beta) is the coefficient of the Xz variable; for simple regression, β = r.
The Y-intercept drops out because when Xz = 0, Yz pred = 0.

The Variables in Multiple Regression

One Dependent (Criterion) Variable (DV):
-a quantitative (continuous) variable
-often designated as Y or Ypred
-the variable being predicted
-a summative response scale is treated as interval
Two or More Independent (Predictor) Variables (IV's):
-can be continuous or dichotomous (not multicategorical unless dummy coded)
-designated by X's (X1, X2, X3, ...)
-used to construct a linear model to predict Y

Correlation in SPSS vs Multiple regression in SPSS?

Pairwise deletion (no case is completely deleted) vs. listwise deletion (the whole case is deleted).

Practical Issues

Ratio of Cases to IV's:
-Sample size depends on the desired power, alpha level, number of predictors, and expected effect size
-If not available in the literature, assume a medium effect size
Simple rules of thumb (assuming medium effect size, alpha = .05, power = .80):
-N > 50 + 8m (m = # of IV's) for testing R (the overall model)
-N > 104 + m for testing the individual IV's (individual beta weights)
Effect size guidelines: .25 large effect, .15 medium effect, under .10 small effect
A larger sample size is needed when:
-Stepwise (statistical) regression is used; a ratio of 40 cases to 1 variable is needed
-The effect size is small
-There is large measurement error
-The DV is skewed and a transformation was not performed
Outliers:
-Have too much impact on the regression analysis
-Should be deleted
-Can be detected via initial screening or via residual analysis
-Safer to screen and delete outliers (univariate and multivariate) before the regression analysis
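The sample-size rules of thumb above as a tiny helper (the value of m is hypothetical):

def n_for_testing_R(m):
    # Rule of thumb for testing the overall model (medium effect, alpha = .05, power = .80)
    return 50 + 8 * m

def n_for_testing_IVs(m):
    # Rule of thumb for testing individual predictors
    return 104 + m

m = 6  # hypothetical number of IV's
print(n_for_testing_R(m), n_for_testing_IVs(m))   # use the larger of the two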

Prediction Error

The regression equation is the central tendency of a set of data. Unless the relationship is perfect, i.e. r = 1, there will be prediction error. The stronger the relationship between X and Y (the larger r is), the smaller the prediction error (the narrower the error band). Standard error of prediction = the average amount of error when you use the predicted value to estimate Y.

Theoretical Limitations/Considerations

Relationships revealed in regression analyses do not imply causality.
Specification issues:
-The regression model configures the IV's together in such a way as to maximize prediction accuracy, so the specific weight assigned to each IV is relative to the other IV's in the analysis
-With a different set of IV's, a variable's b and beta weights, as well as its pr's and sr's, might be very different
-The model is incompletely specified when potentially important variables are not included; external validity may suffer
-Carefully consider which variables to include in the regression analysis

Data for Correlation

Requires two scores (X, Y) for each individual. Usually the two variables are simply observed. A scatterplot of the data displays the relationship between the two variables.

Coefficient of Nondetermination, 1- r2

Residual or unexplained variance. In the above e.g., 1 - r^2 = 1 - .25 = .75: 75% of the variance in weight is unexplained by the variance in height.

Range (Variance) Restriction

Restricted range (variance) in a variable in a sample makes the sample correlation, r, an underestimate of the population correlation, ρ (rho). Correlation is based on the variability in one set of scores being related to the variability in the other set of scores; if there is no variability, there is nothing to correlate with, and the correlation will be underestimated.
Two examples of range restriction:
-Sample Selection Problem: e.g. using only people with advanced degrees to study the correlation between education and income; r from the sample would likely be smaller than r in a population that includes education levels ranging from some high school to doctoral degrees
-Group Differentiation Problem: e.g. a study comparing two groups of Chinese immigrants, one that had lived in the U.S. for 1 year and the other for 2 years, on acculturation levels; the two groups are probably not different enough

Shared Variance

Some of the differences (variance) in a set of scores (one variable) is related to, accounted for, or due to another variable (shared variance)

Interpretation of sr^2

Standard multiple regression:
-The squared semipartial correlation, sr^2, is the amount R^2 is reduced if that IV is deleted from the regression equation
-i.e. sr^2 is the unique contribution of an IV to the % of variance explained in the DV
-If the IV's are correlated, the sum of the sr^2 values is often smaller than R^2 (because R^2 also includes the shared variance)
Sequential or stepwise regression:
-sr^2 is the amount of variance added to R^2 by each IV at the point that variable enters the equation
-sr^2 includes the shared variance, plus any unique contribution, that has not already been claimed by variables entered earlier
-The sr^2 values sum to R^2

Choosing Regression Strategies

Standard regression:
-The most popular; when the method is not specified, it's standard regression
-Atheoretical, simply assesses the relationships among the variables
-Answers the questions: what is the overall relationship/model between the DV and the set of IV's (R), and what is the unique contribution of each IV to R^2?
Sequential regression:
-Tests explicit hypotheses about variance due to some IV's after variance due to other IV's has been accounted for (covariance analysis)
-Answers the question: does an IV add significantly to the prediction of the DV above and beyond the prediction provided by the IV's already entered?
Stepwise:
-Used for selecting the most useful subset of IV's for prediction, to build a lean, mean prediction machine
-Useful for initial model building
-Answers the question: what is the best linear combination of IV's for the prediction of the DV in this sample?
-Useful for big samples with lots of variables when you are looking for a subset of variables

Explanatory example of multiple regression research?

Standard multiple regression: Is reading ability in the early elementary years (DV) related to perceptual development, parental involvement, and parent education? The most popular; shows how a set of variables in combination can predict the DV.
Sequential/hierarchical regression: e.g. Baldry (2003) studied whether violence between parents contributes significantly to bullying behavior above and beyond what is predicted by demographic variables and child abuse. Step 1 found that gender (being a boy) and age (being older) predict bullying. Step 2 found that child abuse by the father added to the prediction of bullying. Step 3 found that the mother's violence against the father contributed significantly to the prediction of bullying. The full model was significant, accounting for 14% of the variance in bullying behavior. The specific question: does the variable I am interested in add to the prediction beyond the variables I already know about? Put one variable in and let it explain the differences; then see whether the added variable can explain more of the leftover differences.

Strength of relationship?

Statistical significance. First evaluation: does r differ significantly from 0 (i.e. does a relationship exist between the two variables)? Statistical significance is affected by sample size (n). Examples, for the .05 level of significance at power = .80:
n = 13, df = 11, r needs to be about .6
n = 21, df = 19, r needs to be about .5
n = 64, df = 62, r needs to be about .3
n = 614, df = 612, r needs to be about .1
As sample size increases, the chance of statistical significance increases and a smaller r can be significant; significance just tells you whether r is or is not 0.
Magnitude of correlation. Second evaluation: is the correlation strong or weak? If there are no practical cutoffs for judging whether the effect size is big or small, use Cohen (1988): large (.5), moderate (.3), small (.1).
Practical significance: is r large enough for the application in question?
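A small sketch of the t test for r (df = n - 2), run on the example n's and r's above to show why larger samples reach significance with smaller r's (numpy assumed):

import numpy as np

def t_for_r(r, n):
    # t test for H0: the population correlation is 0, with df = n - 2
    return r * np.sqrt(n - 2) / np.sqrt(1 - r**2)

# Larger samples reach significance with much smaller r's
for n, r in [(13, .6), (21, .5), (64, .3), (614, .1)]:
    print(n, r, round(t_for_r(r, n), 2))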

Prediction example of multiple regression research?

Stepwise regression: Of various personality measures and entrance exams, which few in combination produce good prediction of job success? This lean model is then used for future predictions when hiring. Try to eliminate variables from the model; do this when you have a lot of variables.

The structure coefficient

Structure coefficient = r(IV,DV) / R
It is the bivariate correlation between the predicted Y scores (i.e. the values of the variate/latent variable) and an IV (predictor), r(Ypred, IV). It indexes the correlation between the predictor and the variate (latent variable). A larger structure coefficient indicates that the predictor is a stronger reflection of the construct underlying the variate, since it means the latent variable and the measured variable have much in common: how good an indicator is this measured variable of the latent variable (Ypred)?
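A minimal sketch computing structure coefficients two ways, showing that r(IV,DV)/R equals the correlation of the IV with the predicted Y (simulated data; numpy assumed):

import numpy as np

rng = np.random.default_rng(3)
x1 = rng.normal(size=150)
x2 = 0.5 * x1 + rng.normal(size=150)
y = 0.7 * x1 + 0.2 * x2 + rng.normal(size=150)

design = np.column_stack([np.ones(150), x1, x2])
coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
y_pred = design @ coefs                     # the variate

R = np.corrcoef(y, y_pred)[0, 1]            # multiple correlation

for name, x in (("x1", x1), ("x2", x2)):
    r_iv_dv = np.corrcoef(x, y)[0, 1]
    structure = r_iv_dv / R                 # structure coefficient
    # Same value as the direct correlation between the IV and the variate (Ypred)
    print(name, structure, np.corrcoef(x, y_pred)[0, 1])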

Testing the Multiple Correlation, R

Test the significance of the model:
-H0: in the population, R = 0, all the r's are 0, and all the b's are 0
-Does the regression equation (i.e. the linear combination of the IV's) provide better-than-chance prediction of the DV?
-Trivial when N is large
-R^2 is distributed as F, and the test is presented as an ANOVA for standard and sequential regression
-For statistical/stepwise regression, adjustments are necessary because not all IV's enter the equation and the test for R^2 is not distributed as F
The first step is to see whether the model is significant. If the multiple R is 0, then none of the predictors can predict the outcome, all the b weights are 0, and nothing further can be done. If the multiple R is not 0, at least one of the b weights is not 0 and the equation predicts the outcome better than chance.
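A small sketch of the F ratio for R^2 in standard regression (hypothetical values; df1 = k, df2 = n - k - 1):

def f_for_R2(R2, n, k):
    # F test of H0: R = 0 in the population (standard/sequential regression)
    # k = number of IV's, n = sample size
    return (R2 / k) / ((1 - R2) / (n - k - 1))

print(f_for_R2(0.14, n=120, k=4))   # compare against the critical value of F(4, 115)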

Pearson Product-Moment Coefficient, r

The most widely used bivariate correlation. Measures the degree and direction of the linear relationship between two quantitative variables. It is the ratio of how much the two variables share to how much variability each variable has:
r = degree to which X and Y vary together / degree to which X and Y vary separately
= covariability of X and Y / variability of X and Y separately

The order of entry of a variable?

The order of entry of a variable:
-Impacts how important that variable appears to the solution
-e.g. IV2 would have gotten "credit" for areas b, c, and d if it had been entered first
Some ways the order might be determined:
-Order of entry is assigned based on logical and theoretical considerations
-Researchers can give priority to variables that are manipulated, believed to come earlier in the causal chain, or theoretically more important (e.g. parent-child temperament match before age of placement in assessing attachment of adopted children, or prioritizing the psychosocial variables over the machine-related ones)
-Alternatively, researchers can enter the "nuisance" variables first and the variables of theoretical interest in the last step, to see if the variables of interest contribute to the prediction above and beyond the "nuisance" variables (e.g. age and IQ entered first in assessing the effect of a learning enrichment program on performance on a verbal test); a covariance-analysis example is entering age first when analyzing memory

Overview of the Regression Process

To find the best fitting line for the scatterplot. This line is used as the basis for predicting Y values from X values. Try to find the line with the smallest amount of prediction error: square the distances -> find the line with the least squared error -> the best predicting line.

Regression Analysis

Used for prediction

The Regression Equation as a Model

We use a representative sample to build a model, then use this model to predict the value of the dependent variable for other individuals in the population. The model also helps us understand the relationship between X and Y (here we are interested in the model itself, not just the prediction).

Variate in Multiple Regression

A weighted combination of the IV's; can be viewed as a latent variable or underlying construct. e.g. if undergraduate GPA and GRE scores are used to predict success in graduate school, the variate might be "academic aptitude", indexed by the linear combination of undergraduate GPA and GRE scores. Regression analysis is used to find the most effective "academic aptitude" variate to predict success in graduate school.

The structure coefficients for standard regression

All the sr^2 values can be very small when the IV's are moderately or highly correlated, so comparing very small sr^2 values is not very productive. b weights can be small because each IV has only a small unique contribution, yet their shared variance and correlations with the DV can make the model significant.

Interpreting a regression coefficient

b and β (beta) weights are the slope of the regression line; a larger absolute value means a greater slope, or a larger change in Y per unit change of X.
Ypred = 4 + .50X
b = .50, i.e. for every 1-unit change in X, Ypred increases by .50
a = 4, i.e. when X = 0, Ypred = 4, or the line crosses the Y-axis at 4

When IV's are correlated with each other:

Consider the type of regression; consider both the full and the unique relationship between each IV and the DV; consider the correlations with the other IV's. The full relationship (r) between each IV and the DV, and the correlations of the IV's with each other, are given by the correlation matrix. The unique contribution of an IV is generally assessed by either the partial (pr) or the semipartial (sr) correlation. Also consider the structure coefficient.

Full correlation (r)?

correlation between an IV and DV r^2: (a+b)/(a+b+c+d)

Semipartial correlation (sr)

correlation between the unique contribution of an IV and the total variance of the DV

Partial Correlation

correlation between the unique contributions of an IV and the residual variance of the DV (not other IV) r^2: a/(a+d)

Correlation

measures and describes the relationship (association,covariation) between two or more variables Simplest is bivariate (2 variables)

Multicollinearity

more than 2 IV's correlate strongly with each other

B weights

partial regression coefficients

Singularity

perfect correlation between IV's

r and r^2

r is not measured on a ratio scale, so ratios do not make sense (an r of .40 is not 2X as strong as an r of .20); intervals in r are not equal, e.g. a difference of .10 represents a different amount of change in the variance explained depending on how large the correlations are. Keep this in mind when interpreting r. Use r^2 to help compare two correlations (e.g. an r of .7 is about 2X an r of .5). r^2 is measured on a ratio scale, so ratios make sense, e.g. an r^2 of .40 represents a relationship 2X as strong as an r^2 of .20, or 40% of variance explained is twice as much as 20% of the variance explained.

Alternative Measures of Strength of Relationship

-r^2
-r
-r^2/(1 - r^2): the ratio of the amount of explained variance to the amount of unexplained variance

Collinearity

two IV's correlate very strongly with each other

