Business Statistics Chapter 12 & 13
number of independent variables
"k" represents the n_________ o___ i____________ v__________ in the regression model.
dummy variable
A categorical independent variable with two levels is a what?
overall significance, MSR/MSE
An F-Test is a test for o________ s___________ of the regression model. The test statistic (F-ratio) is F=_____/______
goodness of fit in regression
Coefficient of determination (r-squared): "how well does the data fit the regression line?" is the test for g_______ o_ f___ i__ r____________.
scatter plot
Correlation vs. Regression: A s_________ p_______ can be used to show the relationship between two variables.
strength
Correlation vs. Regression: Correlation is only concerned with the s______ of the relationship.
correlation
Correlation vs. Regression: No causal effect is implied with c__________.
correlation analysis
Correlation vs. Regression: c__________ a___________ is used to measure the strength of the association (linear relationship) between two variables.
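Worked example: a minimal sketch of measuring the strength of a linear association with the sample correlation coefficient. The cards do not give a formula, so the usual Pearson formula is assumed here, and the data are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y, and the variation of each on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for this example.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 4))
```

The result only describes how strongly the two lists move together; it says nothing about one causing the other.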
spurious correlation
Correlation vs. Regression: s_________ c__________ is when two unrelated variables correlate with each other, either by chance or by an unseen third factor. EG: shoe size and reading ability (3rd factor: age) EG2: Likelihood of dancing; likelihood of vomiting (3rd factor: alcohol consumption)
k, n-k-1
F-Stat follows an F distribution with ___ numerator and ____________ denominator degrees of freedom.
larger, more evidence
F-Test conclusion: The _________ the F-Ratio, the m_______ e___________ exists to reject Ho and conclude that the relationship between x & y is statistically significant.
MSR/MSE (F-ratio)>Fa (a stands for alpha)
F-Test: Testing for over-all significance: Reject Ho if ______________
one-tailed
F-tests are o____-t______ tests.
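Worked example: a small sketch of the F-test decision rule from the cards above (reject Ho if MSR/MSE > Fa, with k and n-k-1 degrees of freedom). The sample size, the F-ratio, and the function name are hypothetical; the critical value is assumed to have been read from an F table for alpha = 0.05.

```python
def f_test_decision(f_ratio, f_critical):
    """One-tailed rule: reject Ho only when the F-ratio exceeds F-alpha."""
    return "reject Ho" if f_ratio > f_critical else "fail to reject Ho"

# Hypothetical model size: n observations, k independent variables.
n, k = 5, 1
df_numerator = k            # numerator degrees of freedom
df_denominator = n - k - 1  # denominator degrees of freedom

# Hypothetical F-ratio (MSR/MSE) and a critical value looked up in an
# F table for alpha = 0.05 with 1 and 3 degrees of freedom.
f_ratio = 4.5
f_critical = 10.13

decision = f_test_decision(f_ratio, f_critical)
print(df_numerator, df_denominator, decision)
```

Only the upper tail is checked, which is why the F-test is described as one-tailed.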
Y, X
If r-squared = 30%: weaker linear relationship; a smaller portion of the variation in __ is explained by variation in ___.
no linear relationship, does not depend on x, no slope (none of the variation in Y is explained by variation in X)
If r-squared=0, then there is n__ l_________ r________, and the value of Y d_____ n_____ d______ o__ ___. The line has n__ s_______.
lower
If the p-value is _______ than alpha, then reject Ho.
y-intercept, x=0
In Y^ = bo+b1xi equation, "bo" is the _-________, the value of Y when ________(includes a symbol)
estimated, predicted
In Y^ = bo+b1xi, the y-hat is the _______ or p________ Y value for the observation i.
value of x, observation i
In equation Y^ = bo+b1xi, "Xi" is the v________ o_ __ for o_________ __.
slope of the line, y, x
In equation, Y^ = bo+b1xi, "b1" is the s_______ o__ t____ l_______, the change in __ divided by the change in ___
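Worked example: a minimal sketch of estimating bo and b1 for Y^ = bo+b1xi. The cards do not say how the estimates are obtained, so the ordinary least-squares formulas are assumed here; the data and the function name are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates (b0, b1) for the line y-hat = b0 + b1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx              # slope: change in y per one-unit change in x
    b0 = mean_y - b1 * mean_x   # y-intercept: predicted y when x = 0
    return b0, b1

# Hypothetical data, invented for this example.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_line(x, y)
y_hat = [b0 + b1 * xi for xi in x]  # predicted Y for each observation i
print(round(b0, 2), round(b1, 2))
```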
SSE/(n-k-1)
MSE is a ratio of what?
SSR/k
MSR is a ratio of what?
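Worked example: a short sketch of the two mean squares and the F-ratio built from them. The sums of squares, n, and k below are hypothetical inputs, and the function name is made up for illustration.

```python
def mean_squares(ssr, sse, n, k):
    """Return (MSR, MSE, F) for n observations and k independent variables."""
    msr = ssr / k              # mean square due to regression
    mse = sse / (n - k - 1)    # mean square due to error
    return msr, mse, msr / mse

# Hypothetical values: SSR = 3.6, SSE = 2.4, n = 5 observations, k = 1.
msr, mse, f_ratio = mean_squares(3.6, 2.4, n=5, k=1)
print(round(msr, 2), round(mse, 2), round(f_ratio, 2))
```

Note the parentheses in the MSE denominator: SSE is divided by the whole quantity n-k-1.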
multiple regression
M__________ r___________: regression with two or more independent variables
unexplained variation
Measures of variation: SSE = error sum of squares. This is the u__________ v_________.
explained variation
Measures of variation: SSR = regression sum of squares. This is the e__________ v__________.
total variability
Measures of variation: SST = total sum of squares. This is the t_______ v________.
error sum of squares (Error Adj SS)
Measures of variation: What does SSE represent?
regression sum of squares (Regression Adj SS)
Measures of variation: What does SSR represent?
Total sum of squares (Total Adj SS)
Measures of variation: What does SST represent?
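Worked example: a sketch of the three measures of variation for a fitted line. The data are hypothetical, and bo = 2.2 and b1 = 0.6 are assumed to be the least-squares estimates for these data; the split of SST into SSR plus SSE relies on that assumption.

```python
# Hypothetical data and assumed least-squares estimates for it.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

fitted = [b0 + b1 * xi for xi in x]
mean_y = sum(y) / len(y)

# SST: total variation of y around its mean.
sst = sum((yi - mean_y) ** 2 for yi in y)
# SSR: explained variation (fitted values around the mean of y).
ssr = sum((fi - mean_y) ** 2 for fi in fitted)
# SSE: unexplained variation (observed values around the fitted line).
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

print(round(sst, 2), round(ssr, 2), round(sse, 2))
```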
dependent variable, independent variable
Regression analysis is used to: Predict or explain the value of a d___________ v___________ based on the value of at least one i_____________ v__________.
overestimated, negative, underestimated, positive
Residual analysis: Points that have been o_____________ by the model will yield a n_________ residual value. (The estimated line is above the actual point.) Points that have been u_________ by the model yield a p__________ residual value. (The estimated line is below the actual point)
residual
Residual analysis: The r_______ is the difference between the point and the regression line.
difference between its observed and predicted value
Residual analysis: ei = yi - y^i: The residual for each observation, ei, is the d_________ b________ i___ o__________ a____ p_________ v________.
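Worked example: a minimal sketch of residuals computed as observed minus predicted, with each point labeled as over- or underestimated. The data and the coefficients bo = 2.2 and b1 = 0.6 are hypothetical.

```python
# Hypothetical data and an assumed fitted line y-hat = 2.2 + 0.6*x.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

# Residual = observed value minus predicted value.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Negative residual: the line sits above the point (overestimated).
# Positive residual: the line sits below the point (underestimated).
labels = ["overestimated" if e < 0 else "underestimated" for e in residuals]

for xi, e, label in zip(x, residuals, labels):
    print(xi, round(e, 2), label)
```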
simple linear regression
S________ l__________ r_______: one independent variable
>
Testing for significance: Reject Ho if (F-ratio) MSR/MSE __Fa
mean square due to regression, mean square due to error
The F-Ratio is a ratio of two variances: MSR represents what? MSE represents what?
x, y, statistically significant
The F-Test in regression analysis answers the question: Is the relationship between the independent variable ___ and the dependent variable ___ s_________ s________?
over-all
The F-test tests for ____-___ significance (in linear relationship)
smaller
The __________ the p-value, the more evidence you have to reject Ho.
linear, 0 and 1, stronger, negative
The coefficient of determination (r-squared) measures the strength of the l______ relationship between x and y. It's the test of goodness of fit in regression (How well does the data fit the regression line?). Its value is between _(#)_ and _(#)___. The higher the percentage, the s_________ the linear relationship between x and y. It will never be n________.
percent of the total variability of y (measured by SST) that is explained by the independent variable (measured by SSR).
The coefficient of determination is the p________ o_ t____ t_________ v__________ o__ __ t______ i__ e_________ b___ t___ i__________ v________. It is also referred to as r-squared. The ratio is SSR/SST.
SSR/SST (regression sum of squares/total sum of squares)
The r-squared ratio is: r-squared = ___________________
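Worked example: a tiny sketch of the r-squared ratio, using hypothetical sums of squares. The function name is made up for illustration.

```python
def r_squared(ssr, sst):
    """Share of the total variability of y explained by the regression."""
    return ssr / sst

# Hypothetical sums of squares.
r2 = r_squared(ssr=3.6, sst=6.0)
print(f"{r2:.0%} of the variation in y is explained by x")
```

Because SSR can never exceed SST and neither can be negative, the ratio stays between 0 and 1.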
estimate
The simple linear regression equation provides an e__________ of the population regression line. Y^ = bo+b1xi
independent variable (X)
The variable used to predict or explain the dependent variable
dependent variable (Y)
The variable we wish to predict
SST = SSR+SSE
What is the equation for total variation (made up of two parts)?
F-test
Which test is a one-tailed test?
T-test (look up half of alpha)
Which test is a two-tailed test?
T-test, p-value test
Which test(s) test for individual significance?
change in the mean value of Y as a result of a one-unit increase in X
Y^ = bo+b1xi: "b1" (slope) is the estimated c_________ i___ t_____ m________ v_________ o__ __ a__ a r_________ o__ a o______-u________ i_________ i__ __. (see slide 4)
coefficient of determination
r^2 (r-squared) is the what?
0 or 1
dummy variables (categorical independent variables with two levels like yes or no) are coded as what?
negative, positive, no difference
dummy variables: If the coefficient has a negative value, the presence of the qualitative variable has a __________ impact on the model. If the coefficient has a positive value, the presence of the qualitative variable has a __________ impact on the model. If the coefficient is 0 (or near 0), there is n___ d_________ in the presence or absence of the qualitative variables in the model.
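Worked example: a sketch of coding a yes/no variable as 1/0 and reading the sign of its coefficient. The data, the variable names, and the use of least-squares formulas are all assumptions made for illustration. With a single dummy predictor fitted this way, the coefficient equals the difference between the two group means.

```python
def fit_line(xs, ys):
    """Least-squares estimates (b0, b1) for y-hat = b0 + b1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

# Hypothetical yes/no attribute and a hypothetical response variable.
has_feature = ["no", "no", "no", "yes", "yes", "yes"]
response = [10, 12, 14, 15, 17, 19]

# Code the two levels as 0 (absent) and 1 (present).
dummy = [1 if level == "yes" else 0 for level in has_feature]

b0, b1 = fit_line(dummy, response)

# Sign of the dummy coefficient: the direction of the attribute's impact.
if b1 > 0:
    impact = "positive"
elif b1 < 0:
    impact = "negative"
else:
    impact = "no difference"

print(dummy, b0, b1, impact)
```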