Stat Exam
heteroscedasticity
the error variances are not constant
t-test
used to determine whether the coefficients of the regression model are significantly different from zero
f-test
used to determine whether the overall regression model is significant
if a regression analysis has SST = 400 and SSR = 100, r^2 = ___.
0.25
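a quick check of the arithmetic, as a minimal sketch: r^2 is SSR divided by SST, and the unexplained part is SSE = SST - SSR. the function name is only illustrative.

```python
def r_squared(sst: float, ssr: float) -> float:
    """Coefficient of determination from the sums of squares: SSR / SST."""
    if sst <= 0:
        raise ValueError("SST must be positive")
    return ssr / sst


sst, ssr = 400.0, 100.0
sse = sst - ssr            # variation left unexplained by the model
r2 = r_squared(sst, ssr)   # share of the variation in y explained by x
print(r2, sse)
```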
assumptions of simple regression analysis
1. the model is linear; 2. the error terms have constant variance; 3. the error terms are independent; 4. the error terms are normally distributed
the least squares method minimizes which of the following?
SSE
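a small sketch of what "minimizes SSE" means in simple regression, using the usual closed-form slope and intercept. the data values are made up for illustration; nudging either coefficient away from the fitted values should only raise SSE.

```python
def fit_line(xs, ys):
    """Least squares intercept b0 and slope b1 for y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sse(xs, ys, b0, b1):
    """Sum of squared residuals for the line y = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Illustrative data only (not from the source).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)
best = sse(xs, ys, b0, b1)

# Any other line gives a larger SSE.
print(best < sse(xs, ys, b0 + 0.1, b1))
print(best < sse(xs, ys, b0, b1 - 0.1))
```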
correlation
a measure of the degree of relatedness of variables
r
a measure of the linear correlation of two variables. it ranges from -1 through 0 to +1; the sign gives the direction of the relationship and the magnitude gives its strength
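a minimal sketch of computing r from its definition, with made-up data chosen to hit the three landmark values. the last example is a reminder that r = 0 rules out only a linear relationship: there y is exactly x squared.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Illustrative data only (not from the source).
r_pos = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])            # perfect positive
r_neg = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])            # perfect negative
r_zero = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])   # y = x^2, no linear trend
print(r_pos, r_neg, r_zero)
```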
confidence interval
an interval estimate of the average value of y for a given x, denoted E(Yx), the expected value of y
in simple regression analysis the error terms are
assumed to be independent and normally distributed with zero mean and constant variance
nonconstant error variance
cone shaped residual plot, associated with heteroscedasticity. the error variance changes with x, for example greater for small values of x and smaller for large values of x (or the reverse)
if the plot of the residuals is cone shaped, which assumption is violated?
constant variance/homoscedasticity
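a rough numeric stand-in for eyeballing a cone shape, offered as a heuristic sketch rather than a formal test: sort the residuals by x, split them in half, and compare the spread of the two halves. the function name and the data are hypothetical. a ratio far from 1 in either direction hints at nonconstant variance.

```python
from statistics import pvariance


def spread_ratio(xs, residuals):
    """Variance of residuals in the high-x half divided by the low-x half."""
    pairs = sorted(zip(xs, residuals))        # order residuals by x
    half = len(pairs) // 2
    low = [e for _, e in pairs[:half]]
    high = [e for _, e in pairs[half:]]
    return pvariance(high) / pvariance(low)


# Illustrative residuals that fan out as x grows (not from the source).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
residuals = [0.1, -0.1, 0.2, -0.2, 2.0, -2.0, 3.0, -3.0]

ratio = spread_ratio(xs, residuals)
print(ratio > 1)
```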
a quality manager is developing a regression model to predict the total number of defects as a function of the day of week the item is produced. production runs are done 10 hours a day, 7 days a week. the explanatory variable is ___
day of week
standard error of the estimate
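denoted Se, is the standard deviation of the errors of the regression model, computed in simple regression as the square root of SSE / (n - 2). it has more practical use than SSE because it is in the same units as y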
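a minimal sketch of the calculation for simple regression, where the divisor is n - 2 because two coefficients are estimated. the actual and predicted values are made up for illustration.

```python
import math


def standard_error_of_estimate(actual, predicted):
    """Se = sqrt(SSE / (n - 2)) for a simple (one-predictor) regression."""
    n = len(actual)
    if n <= 2:
        raise ValueError("need more than two observations")
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    return math.sqrt(sse / (n - 2))


# Illustrative values only: residuals are 1, -1, -1, 1, so SSE = 4.
actual = [3.0, 5.0, 7.0, 10.0]
predicted = [2.0, 6.0, 8.0, 9.0]

se = standard_error_of_estimate(actual, predicted)
print(se)
```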
residual
the difference between an actual y value and the predicted y value; it is the error of the regression line at a given point, y - y^
prediction interval
an interval estimate for a single value of y at a given value of x
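a sketch contrasting the two intervals, using the standard simple-regression formulas. the prediction interval carries an extra "1 +" under the square root, so it is always wider than the confidence interval for the mean of y at the same x. all inputs here are hypothetical, and t_crit is a value looked up in a t table with n - 2 degrees of freedom.

```python
import math


def interval_half_widths(t_crit, se, n, x0, x_bar, ss_xx):
    """Half-widths of the confidence and prediction intervals at x0."""
    leverage = 1 / n + (x0 - x_bar) ** 2 / ss_xx
    ci_half = t_crit * se * math.sqrt(leverage)        # mean of y at x0
    pi_half = t_crit * se * math.sqrt(1 + leverage)    # single y at x0
    return ci_half, pi_half


# Hypothetical inputs: n = 10, so df = 8 and the 95% two-tailed t is 2.306.
ci_half, pi_half = interval_half_widths(
    t_crit=2.306, se=2.0, n=10, x0=7.0, x_bar=5.0, ss_xx=82.5
)
print(pi_half > ci_half)
```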
a fruit stand owner develops a regression line, y = 30 + 3x, to predict y = sales amount per day ($100s), using x = the number of visitors per day (10s). the slope of this regression line suggests this: ____.
for every 10 additional visitors per day, on average, the sales amount is predicted to increase by 300 dollars
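a short sketch that makes the unit conversion explicit: x is in tens of visitors and y is in hundreds of dollars, so a slope of 3 means 3 x $100 per 10 visitors. the function name and the visitor counts are illustrative.

```python
def predicted_sales_dollars(visitors: float) -> float:
    """Apply y = 30 + 3x, converting units on the way in and out."""
    x = visitors / 10        # x is measured in tens of visitors
    y = 30 + 3 * x           # y is measured in hundreds of dollars
    return y * 100


# Ten more visitors is one more unit of x.
increase = predicted_sales_dollars(60) - predicted_sales_dollars(50)
print(increase)
```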
response variable
in multiple regression analysis, this is the dependent variable, y
if the correlation coefficient between two variables is 0
it means no linear relationship is present between the two variables
if the correlation coefficient between two variables is -1
it means the two variables have a perfect negative correlation
nonlinear residual plot
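the residuals follow a curved (for example parabolic) pattern instead of a random scatter, which suggests a linear model does not fit the data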
healthy residual graph
residuals form a roughly level band around zero with no pattern; the variances of the errors are about equal for each value of x
Pearson product-moment correlation coefficient, or coefficient of correlation
r
multiple regression
regression analysis with two or more independent variables or with at least one nonlinear predictor
partial regression coefficient
represents the change that will occur in the value of y from a one unit increase in that independent variable if all other variables are held constant, denoted Bi
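a tiny sketch of "held constant": with a hypothetical two-predictor model, raising x1 by one unit while x2 stays fixed changes the prediction by exactly the coefficient on x1. the coefficients are made up for illustration.

```python
# Hypothetical fitted coefficients (not from the source).
B0, B1, B2 = 10.0, 2.5, -1.0


def predict(x1: float, x2: float) -> float:
    """Multiple regression prediction y = B0 + B1*x1 + B2*x2."""
    return B0 + B1 * x1 + B2 * x2


# Raise x1 from 2 to 3 while x2 stays at 5.
change = predict(3, 5) - predict(2, 5)
print(change)
```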
nonindependent error terms residual plot
slanted plots, as the value of a residual is a function of the residual value next to it. for example, a small negative residual is next to another small negative residual
the total of the squared residuals is called the
sum of squares error
homoscedasticity
the assumption of constant error variance
independent variable
the explanatory variable, designated as x
simple regression (bivariate regression)
the most elementary regression model that involves two variables in which one variable is predicted by another variable
regression analysis
the process of constructing a mathematical model or function that can be used to predict or determine one variable by another variable or other variables
coefficient of determination
the proportion of variability of the dependent variable (y) accounted for or explained by the independent variable (x), or r^2. ranges from 0 to 1
dependent variable
the variable to be predicted, designated as y
if the correlation coefficient between two variables is +1
there is a perfect positive relationship between the two sets of numbers