Business Analytics II Chapter 13

four assumptions of regression

linearity, independence of errors, normality of errors, equal variance

standard error of estimate

measures the variability of the observed Y values around the predicted Y values; equal to sqrt(SSE/(n-2))
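The formula sqrt(SSE/(n-2)) can be sketched in a few lines of Python. The function name and the sample data are illustrative assumptions, not values from the chapter; the predicted values are taken as given.

```python
import math

def standard_error_of_estimate(y, y_hat):
    """Return sqrt(SSE / (n - 2)) for observed values y and predicted values y_hat."""
    if len(y) != len(y_hat):
        raise ValueError("y and y_hat must have the same length")
    n = len(y)
    if n <= 2:
        raise ValueError("need more than 2 observations (n - 2 degrees of freedom)")
    # SSE: sum of squared differences between observed and predicted Y
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    return math.sqrt(sse / (n - 2))

# Hypothetical data for illustration only
y_obs = [2, 4, 5, 4, 5]
y_pred = [2.8, 3.4, 4.0, 4.6, 5.2]
s_yx = standard_error_of_estimate(y_obs, y_pred)
print(round(s_yx, 4))
```

The result is in the same units as Y, which is what makes it easy to interpret as a typical distance of the observations from the prediction line.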

durbin-watson statistic

used to detect autocorrelation; measures the correlation between each residual and the residual for the previous time period; with positive autocorrelation, D approaches 0; with no autocorrelation, D is close to 2; with negative autocorrelation, D approaches 4
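A minimal sketch of the statistic, assuming the usual definition D = sum of squared differences between successive residuals, divided by the sum of squared residuals. The function name and both residual series are made up for illustration.

```python
def durbin_watson(residuals):
    """Return D = sum((e_i - e_{i-1})^2 for i = 2..n) / sum(e_i^2)."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero; D is undefined")
    # Pair each residual with the one from the previous time period
    numerator = sum(
        (curr - prev) ** 2 for prev, curr in zip(residuals, residuals[1:])
    )
    return numerator / denominator

# Hypothetical residuals: neighbours are similar (positive autocorrelation)
d_positive = durbin_watson([1, 1, 1, -1, -1, -1])
# Hypothetical residuals: neighbours alternate in sign (negative autocorrelation)
d_negative = durbin_watson([1, -1, 1, -1, 1, -1])
print(d_positive, d_negative)
```

When adjacent residuals are similar, their differences are small, so the numerator shrinks and D moves toward 0.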

residual analysis

visually evaluates the four regression assumptions and helps you determine whether the regression model that has been selected is appropriate

relevant range

when using regression models, only consider the ____________________ of the independent variable in making predictions; *includes all values from the smallest to the largest X used in developing the regression model*; you can interpolate but not extrapolate
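One way to enforce the interpolate-only rule in code is to refuse predictions outside the sampled X values. This is a sketch: the function name, the coefficients, and the data are hypothetical.

```python
def predict_in_relevant_range(x_new, x_sample, b0, b1):
    """Predict Y at x_new only if x_new lies between the smallest and largest sampled X."""
    lowest, highest = min(x_sample), max(x_sample)
    if not lowest <= x_new <= highest:
        # Outside the relevant range: this would be extrapolation
        raise ValueError(
            f"x = {x_new} is outside the relevant range [{lowest}, {highest}]"
        )
    return b0 + b1 * x_new

# Hypothetical sample X values and fitted coefficients
x_used = [1, 2, 3, 4, 5]
y_at_3 = predict_in_relevant_range(3, x_used, b0=2.2, b1=0.6)
print(y_at_3)
```

Calling the function with x_new = 6 for this sample raises ValueError, because 6 is larger than the largest X used to develop the model.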

B0

y intercept for the population, represents the mean value of Y when X=0

H1

B1 does not equal 0

H0

B1=0; if rejected, you conclude that there is evidence of a linear relationship

regression mean square (MSR)

SSR/1 = SSR in simple linear regression, because there is one independent variable (1 degree of freedom)

total sum of squares (SST)

a measure of variation of the Yi values around their mean; total variation is divided into explained variation and unexplained variation; *equal to the regression sum of squares plus the error sum of squares*
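The identity SST = SSR + SSE can be checked numerically. The sketch below uses a made-up data set and its least-squares line (b0 = 2.2, b1 = 0.6 for this data); the identity holds for a least-squares fit, not for an arbitrary line.

```python
def sums_of_squares(y, y_hat):
    """Return (SST, SSR, SSE) for observed y and predicted y_hat."""
    y_bar = sum(y) / len(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # explained variation
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # unexplained variation
    return sst, ssr, sse

# Hypothetical data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
# Predictions from the least-squares line for this particular data set
y_hat = [2.2 + 0.6 * xi for xi in x]

sst, ssr, sse = sums_of_squares(y, y_hat)
print(round(sst, 4), round(ssr, 4), round(sse, 4))
```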

independent variable

also known as the explanatory variable (X); the variable used to predict the dependent variable

dependent variable

also known as the response variable (Y); the variable you want to predict

prediction line

also known as the simple linear regression equation; the straight line ^Yi = b0 + b1Xi fitted to the sample data

f test

alternative to the t test for determining whether the slope is statistically significant; Fstat = regression mean square / mean square error = MSR/MSE

autocorrelation

a basic assumption of the regression model is independence of errors; this assumption is often violated when data are collected over sequential time periods, because a residual at any one time period tends to be similar to residuals at adjacent time periods; when this happens, the validity of the model is in doubt

mean square error (MSE)

error variance; SSE/(n-2)

residual

the estimated error value; the difference between the observed value (Yi) and the predicted value (^Yi) of the dependent variable for a given value of Xi

regression sum of squares (SSR)

explained variation, represents variation that is explained by the relationship between X & Y; *based on the difference between ^Yi (the predicted value of Y from the prediction line) and _Y (the mean value of Y)*

reject H0 in f test

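if Fstat > Falpha, where Falpha is the upper-tail critical value of the F distribution with 1 and n-2 degrees of freedom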
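The decision rule can be sketched as follows, assuming simple linear regression (1 and n-2 degrees of freedom). The function name and the sums of squares are hypothetical; the critical value is passed in as a parameter and would normally come from an F table. The value 10.13 used here is the tabled upper-tail 0.05 value for 1 and 3 degrees of freedom.

```python
def f_test_slope(ssr, sse, n, f_critical):
    """Return (Fstat, reject_h0) for H0: B1 = 0 in simple linear regression."""
    msr = ssr / 1            # regression mean square: one independent variable
    mse = sse / (n - 2)      # mean square error
    f_stat = msr / mse
    return f_stat, f_stat > f_critical

# Hypothetical sums of squares from a sample of n = 5 observations
f_stat, reject_h0 = f_test_slope(ssr=3.6, sse=2.4, n=5, f_critical=10.13)
print(round(f_stat, 4), reject_h0)
```

In this illustration Fstat is below the critical value, so H0 is not rejected and there is insufficient evidence of a linear relationship.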

point estimate

predicted value of Y for given values of the X variables

coefficient of determination

r2; equal to the regression sum of squares divided by the total sum of squares (SSR/SST); gives us the proportion of variation in Y that is explained by the variation in the independent variable X in the regression model

Ei

represents the random error in Y for each observation, i; *vertical distance of the actual value of Yi above or below the expected value of Yi on the line*

independence of errors assumption

requires that the errors be independent of one another; particularly important when data are collected over a period of time, because in a time series a residual may be related to the residual that precedes it; a cyclical pattern in the residuals plotted over time indicates a violation

normality assumption

requires that the errors be normally distributed at each value of X; violated if the residuals appear to depart substantially from a normal distribution

equal variance assumption (homoscedasticity)

requires that the variance of the errors be constant for all values of X; a funnel shape in the plot of residuals against X indicates a violation

B1

slope for the population, represents the expected change in Y per unit change in X

linearity assumption

states that the relationship between the variables is linear; if the linear model is appropriate, the plot of residuals against X shows no pattern

least-squares

this method minimizes the sum of the squared differences between the actual values (Yi) and the predicted values (^Yi) using the prediction line
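For simple linear regression, the minimizing coefficients have a closed form: b1 = SSXY/SSX and b0 = mean(Y) - b1 * mean(X). The sketch below applies those formulas; the function name and the data are illustrative assumptions.

```python
def least_squares_fit(x, y):
    """Return (b0, b1) that minimize the sum of squared errors for y = b0 + b1 * x."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # SSXY: co-variation of X and Y around their means
    ssxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # SSX: variation of X around its mean
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    if ssx == 0:
        raise ValueError("all X values are identical; the slope is undefined")
    b1 = ssxy / ssx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data for illustration only
b0, b1 = least_squares_fit([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(b0, 4), round(b1, 4))
```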

t test

to determine the existence of a significant linear relationship between the X and Y variables, you test whether B1=0
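The card does not give the test statistic. A common textbook form, assumed here, is tSTAT = b1 / Sb1 with Sb1 = S_YX / sqrt(SSX) and n-2 degrees of freedom. The sketch below computes it from raw data; the function name and data are hypothetical.

```python
import math

def t_stat_slope(x, y):
    """Return tSTAT = b1 / Sb1 for H0: B1 = 0 (assumed textbook formula)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    ssxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ssxy / ssx                      # sample slope
    b0 = y_bar - b1 * x_bar              # sample intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))      # standard error of the estimate
    s_b1 = s_yx / math.sqrt(ssx)         # standard error of the slope
    return b1 / s_b1

# Hypothetical data for illustration only
t_stat = t_stat_slope([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(t_stat, 4))
```

You would compare the result with the critical t value for n-2 degrees of freedom. In simple linear regression the square of this statistic equals Fstat, which is why the two tests lead to the same conclusion about the slope.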

error sum of squares (SSE)

unexplained variation, represents variation due to factors other than the relationship between X & Y; *based on the difference between Yi & ^Yi*

