Business Analytics II Chapter 13

four assumptions of regression

linearity, independence of errors, normality of errors, equal variance

standard error of estimate

measures the variability of the observed Y values around the predicted Y values; equal to the square root of SSE/(n - 2)
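
A minimal Python sketch of the computation, using a small hypothetical data set and NumPy's `polyfit` for the fitted line:

```python
import numpy as np

# hypothetical sample data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# fit the simple linear regression (slope b1, intercept b0)
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x                    # predicted Y values

sse = np.sum((y - y_hat) ** 2)         # error sum of squares
n = len(y)
s_yx = np.sqrt(sse / (n - 2))          # standard error of the estimate
print(round(s_yx, 4))
```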

durbin-watson statistic

used to detect autocorrelation; measures the correlation between each residual and the residual for the previous time period; with positive autocorrelation, D approaches 0 (values near 2 indicate no autocorrelation)
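
A minimal sketch of the statistic on a hypothetical series of residuals (the residual values here are made up for illustration):

```python
import numpy as np

# hypothetical time-ordered residuals from a fitted regression
e = np.array([0.5, 0.6, 0.4, -0.2, -0.5, -0.4, 0.1, 0.3])

# D = sum of squared successive differences / sum of squared residuals
d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
print(round(d, 3))   # near 0 -> positive autocorrelation; near 2 -> none
```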

residual analysis

visually evaluates the regression assumptions and helps you determine whether the selected regression model is appropriate

relevant range

when using regression models, only consider the ____________________ of the independent variable in making predictions; *includes all values from the smallest to the largest X used in developing the regression model*; you can interpolate but not extrapolate

B0

the Y intercept for the population; represents the mean value of Y when X = 0

H1

B1 does not equal 0

H0

B1=0; if rejected, you conclude that there is evidence of a linear relationship

regression mean square (MSR)

SSR divided by its degrees of freedom; in simple linear regression, MSR = SSR/1 = SSR

total sum of squares (SST)

a measure of variation of the Yi values around their mean; total variation is divided into explained variation and unexplained variation; *equal to the regression sum of squares plus the error sum of squares*
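
A small sketch verifying the decomposition on hypothetical data (the x and y values are illustrative):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

sst = np.sum((y - y.mean()) ** 2)       # total variation
ssr = np.sum((y_hat - y.mean()) ** 2)   # explained variation
sse = np.sum((y - y_hat) ** 2)          # unexplained variation
print(np.isclose(sst, ssr + sse))       # True: SST = SSR + SSE
```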

independent variable

also known as the explanatory variable; the X variable used to predict the dependent variable

dependent variable

also known as the response variable; the Y variable being predicted

prediction line

the straight line fitted by the simple linear regression: ^Yi = b0 + b1Xi

f test

an alternative to the t test for determining whether the slope is statistically significant; FSTAT = regression mean square / mean square error = MSR/MSE
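
A minimal sketch of the F test on hypothetical data, using scipy for the critical value (alpha = 0.05 is an assumed significance level):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])
n = len(y)

b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

msr = np.sum((y_hat - y.mean()) ** 2) / 1      # regression mean square (df = 1)
mse = np.sum((y - y_hat) ** 2) / (n - 2)       # mean square error
f_stat = msr / mse

f_crit = stats.f.ppf(0.95, 1, n - 2)           # F_alpha with 1 and n-2 df
print(f_stat > f_crit)                         # True -> reject H0: B1 = 0
```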

autocorrelation

the basic regression assumption of independence of errors is violated when data are collected over sequential time periods, because a residual at any one time period tends to be similar to residuals at adjacent time periods; when autocorrelation is present, the validity of the fitted model is doubtful

mean square error (MSE)

error variance; SSE/(n-2)

residual

the estimated error value; the difference between the observed value (Yi) and the predicted value (^Yi) of the dependent variable for a given value of Xi

regression sum of squares (SSR)

explained variation, represents variation that is explained by the relationship between X & Y; *based on the difference between ^Yi (the predicted value of Y from the prediction line) and _Y (the mean value of Y)*

reject H0 in f test

if Fstat > Falpha

point estimate

the predicted value of Y for a given value of X

coefficient of determination

r2; equal to the regression sum of squares divided by the total sum of squares (SSR/SST); gives us the proportion of variation in Y that is explained by the variation in the independent variable X in the regression model
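
A short sketch of the calculation on hypothetical data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

ssr = np.sum((y_hat - y.mean()) ** 2)   # explained variation
sst = np.sum((y - y.mean()) ** 2)       # total variation
r2 = ssr / sst                          # proportion of variation in Y explained by X
print(round(r2, 4))
```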

Ei

represents the random error in Y for each observation i; *the vertical distance of the actual value of Yi above or below the expected value of Yi on the line*

independence of errors assumption

requires that the errors be independent of one another; particularly important when data are collected over a period of time, because in a time series a residual may be related to the residual that precedes it, producing a cyclical pattern in the residuals that violates the assumption

normality assumption

requires that the errors be normally distributed at each value of X; if the residuals appear to depart substantially from a normal distribution, the assumption is violated

equal variance assumption (homoscedasticity)

requires that the variance of the errors be constant for all values of X; if the residual plot is funnel-shaped, the assumption is violated

B1

slope for the population, represents the expected change in Y per unit change in X

linearity assumption

states that the relationship between the variables is linear; if the model is appropriate, you will not see any pattern in the residual plot

least-squares

this method minimizes the sum of the squared differences between the actual values (Yi) and the predicted values (^Yi) using the prediction line
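
A minimal sketch of the closed-form least-squares estimates for simple linear regression, on hypothetical data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# slope and intercept that minimize the sum of squared differences Yi - ^Yi
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

sse = np.sum((y - (b0 + b1 * x)) ** 2)  # minimized sum of squared errors
print(round(b0, 3), round(b1, 3), round(sse, 4))
```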

t test

to determine the existence of a significant linear relationship between the X and Y variables, you test whether B1=0
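
A minimal sketch of the slope t test on hypothetical data (the two-tailed p-value calculation is shown for illustration):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])
n = len(y)

b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

s_yx = np.sqrt(np.sum((y - y_hat) ** 2) / (n - 2))   # standard error of the estimate
s_b1 = s_yx / np.sqrt(np.sum((x - x.mean()) ** 2))   # standard error of the slope

t_stat = b1 / s_b1                                   # tests H0: B1 = 0
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)
print(round(t_stat, 3), round(p_value, 4))
```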

error sum of squares (SSE)

unexplained variation, represents variation due to factors other than the relationship between X & Y; *based on the difference between Yi & ^Yi*

