Exam 2 Stat Meth Bus
Using the best-subsets approach to model building, models are considered when their
Cp <= k+1, where k is the number of independent variables in the model
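A rough sketch of the Cp screening rule above, assuming the common textbook form Cp = (1 - R2_k)(n - T) / (1 - R2_T) - (n - 2(k + 1)), where T is the number of parameters in the full model, intercept included. The function names and the sample numbers are made up for illustration.

```python
def cp_statistic(r2_k, r2_full, n, k, total_params):
    """Cp for a candidate model with k independent variables.

    r2_k         -- r^2 of the candidate model
    r2_full      -- r^2 of the full model with every candidate variable
    n            -- sample size
    total_params -- parameters in the full model, intercept included (T)
    """
    return (1 - r2_k) * (n - total_params) / (1 - r2_full) - (n - 2 * (k + 1))


def worth_considering(cp, k):
    # Best-subsets screening rule: keep the model when Cp <= k + 1
    return cp <= k + 1


# Hypothetical numbers: n = 50, full model has 4 variables (T = 5), r^2 = 0.80
cp_a = cp_statistic(0.79, 0.80, n=50, k=2, total_params=5)   # about 3.25
cp_b = cp_statistic(0.795, 0.80, n=50, k=2, total_params=5)  # about 2.125
print(worth_considering(cp_a, 2), worth_considering(cp_b, 2))
```

With this formula the full model always has Cp equal to k + 1, so the rule is mainly useful for comparing smaller models.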
Which of the following is used to find a "best" (most appropriate) model?
Cp statistic or adjusted R square
Choose variables based on VIFj values. For a dataset with 4 independent variables, VIF1=1.54, VIF2=5.54, VIF3=7.89, VIF4=6.90, we should
Delete X3 (the variable with the largest VIF)
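The deletion rule can be sketched in a few lines. The cutoff of 5 is an assumption (a common textbook convention; some sources use 10), and the function name is hypothetical.

```python
def variable_to_drop(vifs, threshold=5.0):
    """Return the variable with the largest VIF if it exceeds threshold, else None."""
    name, value = max(vifs.items(), key=lambda item: item[1])
    return name if value > threshold else None


vifs = {"X1": 1.54, "X2": 5.54, "X3": 7.89, "X4": 6.90}
print(variable_to_drop(vifs))  # X3
```

Only one variable is removed at a time; the VIFs are then recomputed on the remaining variables before deciding whether to drop another.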
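In a multiple regression model, which of the following is correct regarding the value of the adjusted r^2?
It can be negative. (It never exceeds r^2, and it falls below zero when the model explains very little relative to the number of independent variables.)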
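For reference, the adjusted r^2 can be computed from r^2, the sample size n, and the number of independent variables k. A minimal sketch; the sample numbers are made up.

```python
def adjusted_r2(r2, n, k):
    """Adjusted r^2 = 1 - (1 - r^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Strong fit, plenty of data: slightly below r^2
print(adjusted_r2(0.90, n=50, k=3))
# Weak fit, few observations: the value drops below zero
print(adjusted_r2(0.05, n=10, k=3))
```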
The least squares method minimizes which of the following?
SSE
Which of the following statements about linear regression is incorrect?
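SSR measures unexplained variation. (This is the incorrect statement: SSR measures explained variation, SSE measures unexplained variation.)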
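A small sketch of how SSR, SSE and SST relate for a simple least-squares line (SST = SSR + SSE). The data are made up, and the helper names are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for a simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / ssx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sums_of_squares(x, y):
    """Return (SSR, SSE, SST) for the fitted line."""
    b0, b1 = fit_line(x, y)
    y_bar = sum(y) / len(y)
    y_hat = [b0 + b1 * xi for xi in x]
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    return ssr, sse, sst


x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
ssr, sse, sst = sums_of_squares(x, y)
print(ssr, sse, sst)  # approximately 3.6, 2.4, 6.0
```

Least squares picks b0 and b1 so that SSE, the sum of squared residuals, is as small as possible.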
D Message 1 https://cdn.discordapp.com/attachments/883501430177148949/1043402912103153695/image.png
The predicted Asking Price of House Type ranch is 81.94 more expensive than the predicted Asking Price of House Type other holding Living Space and Fireplace constant.
D 2 https://cdn.discordapp.com/attachments/883501430177148949/1043403395328917574/image.png
There is insufficient evidence (at α = 0.05) to indicate that the relationship between weight loss (Y) and months on program (X1) varies with session time.
SEX = gender of the individual; 1 if female, 0 if male. Y_hat = ..... + 0.39 SEX. What is the correct interpretation for the estimated coefficient for SEX?
The predicted Y for females is estimated to be $0.39 higher than for males, holding the other variables constant. (The group coded 1 is higher when the coefficient is positive, lower when it is negative.)
In a multiple regression problem involving two independent variables, if b1 is computed to be +2.0, it means that
the estimated mean of Y increases by 2 units for each increase of 1 unit of X1, holding X2 constant.
An interaction term in a multiple regression model may be used when
the relationship between X1 and Y changes for differing values of X2.
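To see what an interaction term does, here is a minimal sketch of a model Y_hat = b0 + b1*X1 + b2*X2 + b3*X1*X2. All coefficient values are hypothetical.

```python
def predict(x1, x2, b0, b1, b2, b3):
    """Predicted Y for a model with an X1*X2 interaction term."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def slope_of_x1(x2, b1, b3):
    """Change in predicted Y per unit of X1, which depends on X2 when b3 != 0."""
    return b1 + b3 * x2


coef = dict(b0=10.0, b1=2.0, b2=1.0, b3=0.5)  # made-up coefficients
print(slope_of_x1(0, coef["b1"], coef["b3"]))  # 2.0
print(slope_of_x1(4, coef["b1"], coef["b3"]))  # 4.0
```

If b3 were 0, the slope of X1 would be the same at every value of X2, which is the no-interaction case.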
Assuming a linear relationship between X and Y, if the coefficient of correlation (r) equals -0.30,
the slope (b1) is negative.
A dummy variable is used as an independent variable in a regression model when
the variable involved is categorical.
The plot of residuals versus the predicted value of Y can be used to discover a possible curvilinear effect in at least one independent variable.
True
The standard error of the estimate, S_YX, measures the variability of the observed Y values from the predicted ones.
True
The Variance Inflationary Factor (VIF) measures the
correlation of the X variables with each other
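The VIF for variable j is 1 / (1 - R_j^2), where R_j^2 comes from regressing X_j on the other X variables. With only two X variables, R_j^2 is just their squared correlation, which is the case sketched here. The data are made up.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def vif(r2_j):
    """Variance inflationary factor from R_j^2."""
    return 1.0 / (1.0 - r2_j)


x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
r = correlation(x1, x2)  # 0.8
print(vif(r ** 2))       # about 2.78
```

Uncorrelated X variables give a VIF of 1; the VIF grows without bound as R_j^2 approaches 1.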
The intercept (b0) represents the
predicted value of Y when X = 0.
The logarithm transformation can be used
to change a nonlinear model into a linear model.
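As an example of linearizing with logs: an exponential model Y = a * e^(bX) becomes ln Y = ln a + bX, which is linear in X and can be fit by ordinary least squares. A minimal sketch with made-up data; the function name is hypothetical.

```python
import math


def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by least squares on (x, ln y). Requires every y > 0."""
    ln_y = [math.log(yi) for yi in y]
    n = len(x)
    x_bar = sum(x) / n
    l_bar = sum(ln_y) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (li - l_bar) for xi, li in zip(x, ln_y)) / ssx
    a = math.exp(l_bar - b * x_bar)  # back-transform the intercept
    return a, b


x = [0, 1, 2, 3]
y = [2 * math.exp(0.5 * xi) for xi in x]  # made-up data following y = 2 * e^(0.5x)
a, b = fit_exponential(x, y)
print(a, b)  # approximately 2.0 and 0.5
```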
A real estate builder wishes to determine how house size (House) is influenced by family income (Income), family size (Size), and education of the head of household (School). House size is measured in hundreds of square feet, income is measured in thousands of dollars, and education is in years. The builder randomly selected 50 families and constructed the multiple regression model. The business literature involving human capital shows that education influences an individual's annual income. Combined, these may influence family size. With this in mind, what should the real estate builder be particularly concerned with when analyzing the multiple regression model?
Collinearity
The Durbin-Watson D statistic is used to check the assumption of normality.
False (it is used to detect autocorrelation in the residuals)
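The Durbin-Watson statistic is computed from residuals in time order: D = sum of (e_t - e_(t-1))^2 divided by sum of e_t^2. A minimal sketch; the residual lists are made up.

```python
def durbin_watson(residuals):
    """Durbin-Watson D for residuals listed in time order."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


print(durbin_watson([1, -1, 1, -1]))  # 3.0, signs alternate
print(durbin_watson([1, 1, -1, -1]))  # 1.0, neighbors tend to agree
```

D ranges from 0 to 4. Values near 2 suggest no autocorrelation, values near 0 suggest positive autocorrelation, and values near 4 suggest negative autocorrelation.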
The Regression Sum of Squares (SSR) can be greater than the Total Sum of Squares (SST).
False
The confidence interval for the mean of Y does not depend on the value of independent variable Xi.
False
To properly examine the effect of a categorical independent variable in a multiple linear regression model we use an interaction term.
False (Dummy Variable)
Stepwise regression approach evaluates all possible regression models.
False. Stepwise regression does not fit all models but instead assesses the statistical significance of the variables one at a time and arrives at a single model.
Data that exhibit an autocorrelation effect violate the regression assumption of independence.
True
If the correlation coefficient (r) = 1.00, then all the data points must fall exactly on a straight line with a positive slope.
True
The confidence interval for the mean of Y is always narrower than the prediction interval for an individual response Y given the same data set, X value, and confidence level.
True
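For simple linear regression, both intervals are built from h = 1/n + (X - X_bar)^2 / SSX: the confidence interval half-width is t * S_YX * sqrt(h) and the prediction interval half-width is t * S_YX * sqrt(1 + h). A minimal sketch; the S_YX value is made up, and the t critical value (3.182 for 95% with 3 degrees of freedom) is supplied by hand rather than looked up in code.

```python
import math


def interval_half_widths(x, x_new, s_yx, t_crit):
    """Half-widths of the CI for the mean of Y and the PI for an individual Y at x_new."""
    n = len(x)
    x_bar = sum(x) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    h = 1 / n + (x_new - x_bar) ** 2 / ssx
    ci = t_crit * s_yx * math.sqrt(h)      # mean response
    pi = t_crit * s_yx * math.sqrt(1 + h)  # individual response
    return ci, pi


x = [1, 2, 3, 4, 5]
ci_mean, pi_mean = interval_half_widths(x, 3, s_yx=0.9, t_crit=3.182)  # at X_bar
ci_far, pi_far = interval_half_widths(x, 5, s_yx=0.9, t_crit=3.182)    # away from X_bar
print(ci_mean, pi_mean, ci_far, pi_far)
```

The extra 1 under the square root makes the prediction interval wider at every X, and both intervals widen as X moves away from X_bar.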
The strength of the relationship between two numerical variables can be measured using the correlation coefficient, r.
True
Which of the following is NOT an approach for handling a nonlinear relationship between Y and X? Options: Variance inflationary factor; Square-root transformation; Quadratic regression model; Logarithmic transformation
Variance inflationary factor
A regression diagnostic tool used to study the possible effects of collinearity is
the VIF.
The residuals represent
the difference between the actual Y values and the predicted Y values.