Multiple Regression
Multicollinearity
intercorrelations exist among the predictors (IVs); the problem is that if two IVs are highly correlated, they contain much the same information and are essentially measuring the same thing
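Not part of the original notes: a minimal NumPy sketch of one common check for multicollinearity, inspecting pairwise correlations among the IVs. The data and variable names are invented for illustration.

```python
import numpy as np

# Hypothetical predictor data: each column is one IV (illustrative only).
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=100)   # nearly a copy of x1
x3 = rng.normal(size=100)
X = np.column_stack([x1, x2, x3])

# Pairwise correlations among the IVs; values near +/-1 flag multicollinearity.
corr = np.corrcoef(X, rowvar=False)
print(np.round(corr, 2))
```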
multiple regression
same as simple regression, but with more than one IV
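Not part of the original notes: a minimal sketch of fitting a multiple regression (one DV, two IVs) with NumPy's least-squares solver. The data and coefficient values are made up for illustration.

```python
import numpy as np

# Illustrative data: one DV (y) predicted from two IVs (x1, x2).
rng = np.random.default_rng(1)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
y = 2.0 + 1.5 * x1 - 0.7 * x2 + rng.normal(scale=0.5, size=50)

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Ordinary least-squares fit: returns the intercept and one slope per IV.
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coefs)  # approximately [2.0, 1.5, -0.7]
```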
Multicollinearity problems
1. It severely limits the size of R, because the IVs are competing for the same variability in the DV.
2. The overlapping information makes it difficult to determine the importance of each IV.
3. It tends to increase the variances of the regression coefficients, which results in a more unstable prediction equation (see the sketch below).
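A hedged illustration of problem 3, not mentioned in the notes: a hand-rolled variance inflation factor (VIF) computation that flags IVs whose coefficient variances are inflated by overlap with the other IVs. The data and names are illustrative.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (the IVs, no intercept column).
    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j on the
    other columns; large values mean that IV's coefficient variance is inflated."""
    n, k = X.shape
    vifs = []
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / (((target - target.mean()) ** 2).sum())
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

rng = np.random.default_rng(2)
x1 = rng.normal(size=200)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=200)   # overlaps heavily with x1
x3 = rng.normal(size=200)
print(np.round(vif(np.column_stack([x1, x2, x3])), 1))  # x1 and x2 get large VIFs
```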
Multiple correlation (R)
the analogue of the Pearson correlation for multiple regression; it is the maximized correlation between the DV and the set of IVs. R-squared, the coefficient of determination (or coefficient of multiple determination in multiple regression), is a statistical measure of how close the data are to the fitted regression line. It is the percentage of the response-variable variation that is explained by the linear model: R-squared = explained variation / total variation. R-squared is always between 0% and 100%: 0% means the model explains none of the variability of the response data around its mean, and 100% means it explains all of it. In general, the higher the R-squared, the better the model fits the data, although there are important caveats to that guideline.
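Not in the original notes: a short NumPy sketch that computes R-squared as explained variation over total variation, and checks that it equals the squared multiple correlation between the DV and the fitted values. Data are invented.

```python
import numpy as np

# Made-up data with two IVs; compute R-squared = explained variation / total variation.
rng = np.random.default_rng(3)
x1 = rng.normal(size=60)
x2 = rng.normal(size=60)
y = 3.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=60)

X = np.column_stack([np.ones(60), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ coef                                   # fitted (predicted) values

ss_total = ((y - y.mean()) ** 2).sum()             # total variation around the mean
ss_explained = ((y_hat - y.mean()) ** 2).sum()     # variation captured by the model
r_squared = ss_explained / ss_total                # 0 = explains nothing, 1 = explains all

multiple_r = np.corrcoef(y, y_hat)[0, 1]           # multiple correlation R
print(round(r_squared, 3), round(multiple_r ** 2, 3))   # the two values agree
```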
Simple regression
a single DV and a single IV
F test
an overall test of significance; it determines whether the relationship between the set of IVs and the DV is strong enough to be statistically significant
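Not from the original notes: a sketch of how the overall F statistic can be computed from R-squared, using illustrative numbers (R-squared = 0.40, n = 50 cases, k = 3 IVs) and SciPy for the p-value.

```python
from scipy import stats

# Overall F test for a multiple regression, computed from R-squared.
r2, n, k = 0.40, 50, 3
f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))
p_value = stats.f.sf(f_stat, k, n - k - 1)   # upper-tail probability
print(round(f_stat, 2), round(p_value, 4))
```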
Least squares
the criterion that selects the line with the smallest total squared error
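Not in the original notes: a small NumPy sketch showing that the least-squares line has the smallest total squared error; nudging the fitted coefficients in either direction only increases it. Data are invented.

```python
import numpy as np

# Least squares picks the coefficients that minimize the sum of squared errors.
rng = np.random.default_rng(4)
x = rng.normal(size=40)
y = 1.0 + 0.8 * x + rng.normal(scale=0.3, size=40)

X = np.column_stack([np.ones_like(x), x])
best, *_ = np.linalg.lstsq(X, y, rcond=None)

def sse(coefs):
    resid = y - X @ coefs
    return resid @ resid

# Any nudge away from the least-squares solution increases the squared error.
print(sse(best) <= sse(best + np.array([0.1, 0.0])))   # True
print(sse(best) <= sse(best + np.array([0.0, -0.1])))  # True
```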
beta coefficients
the strength of the effect of each IV on the DV
regression coefficient
the slope of the regression line (when the relationship is linear): the predicted change in the DV for a one-unit change in the IV
beta weights
will equal the correlation coefficient when there is a single predictor variable
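Not part of the original notes: a quick NumPy check, with made-up data, that the standardized (beta) weight from a single-predictor regression matches the Pearson correlation.

```python
import numpy as np

# With a single predictor, the standardized (beta) weight equals Pearson's r.
rng = np.random.default_rng(5)
x = rng.normal(size=100)
y = 0.6 * x + rng.normal(size=100)

zx = (x - x.mean()) / x.std()
zy = (y - y.mean()) / y.std()

beta = np.polyfit(zx, zy, 1)[0]        # slope of the standardized regression
r = np.corrcoef(x, y)[0, 1]            # Pearson correlation
print(round(beta, 4), round(r, 4))     # the two values match
```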