Business Stats MGT 3310 Pearson Chapter 14 & 15 Exam
The adjusted multiple coefficient of determination is adjusted for the
The number of independent variables
The following information regarding a dependent variable y and an independent variable x is provided: SUM OF ALL x=90 SUM OF ALL y= 340 n= 4 SSR= 104 SUM OF ALL (y-y-bar)(x-x-bar)= -156 SUM OF ALL (x-x-bar)^2= 234 SUM OF ALL (y-y-bar)^2= 1974 The mean square error (MSE) is
SSE = SST - SSR = 1974 - 104 = 1870 MSE = SSE/(n-2) = 1870/(4-2) = 935
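The arithmetic for this card can be checked with a short sketch. It assumes simple linear regression, so the error degrees of freedom are n - 2; the variable names are illustrative only.

```python
# Summary values copied from the card above.
n = 4
ssr = 104.0      # regression sum of squares
sst = 1974.0     # SUM OF ALL (y - y-bar)^2
sxy = -156.0     # SUM OF ALL (y - y-bar)(x - x-bar)
sxx = 234.0      # SUM OF ALL (x - x-bar)^2

sse = sst - ssr          # SST = SSR + SSE
mse = sse / (n - 2)      # one independent variable, so n - 2 error df
slope = sxy / sxx        # b1, used only as a consistency check on SSR

print(sse, mse, round(slope, 4))
```

As a cross-check, b1^2 times SUM OF ALL (x-x-bar)^2 reproduces the SSR of 104 given in the question.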
The following information regarding a dependent variable (y) and an independent variable (x) is provided. y= 5, 6, 5, 7, 8 x= 2, 3, 4, 5, 6 SSE= 1.9 SST= 6.8 Determine the coefficient of determination (r^2)
r^2 = 1 - SSE/SST = 1 - 1.9/6.8 = 0.7206. Equivalently, r = 0.8489, so r^2 = 0.8489*0.8489 = 0.7206
The difference between the observed value of the dependent variable and the value predicted using the estimated regression equation is the
The residual, written e = y - ^y. Graphically it is the vertical distance between the data point and the regression line. For a least squares line with an intercept, the sum and the mean of the residuals are both equal to zero.
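A minimal sketch of the residual calculation, using the five-point data set that appears on other cards in this set. The fitted line ^y = 3.4 + 0.7x is taken from those cards; the variable names are illustrative.

```python
x = [2, 3, 4, 5, 6]
y = [5, 6, 5, 7, 8]
b0, b1 = 3.4, 0.7  # intercept and slope from the other cards in this set

# Predicted values and residuals e = y - y_hat for each observation.
y_hat = [b0 + b1 * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

# The squared residuals add up to SSE.
sse = sum(e * e for e in residuals)

print([round(e, 2) for e in residuals])
print(round(sse, 2))
```

The residuals sum to zero up to floating-point rounding, and their squares sum to the SSE of 1.9 quoted on the cards.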
In regression analysis, the model in the form y=B0+B1X+ e is called the a. regression equation b. correlation model c. estimated regression equation d. regression model
The equation that describes how y is related to x and an error term is called the regression model. In simple linear regression it is y=B0+B1X+e. d. regression model
The equation which has the form of E(y)= ^y=b0+b1x1+b2x2+...+bpxp is
b0, b1, b2 etc. are the estimates of B0, B1, B2 etc. Estimated Multiple Regression Equation
If only MSE is known, you can compute the a. r-square b. coefficient of correlation c. standard error (SE) d. ith residual
c. standard error
Regression analysis was applied between demand for a product (y) and the price for the product (x), and the following estimated regression equation was obtained. ^y=120-10x Based on the above estimated regression equation, if price is increased by 3 units, then demand is expected to
slope = -10, so if price is increased by 3 units, demand is expected to change by 3*(-10) = -30 units. -Decrease by 30 units
The standard error of the estimate is the a. standard deviation of t b. square root of SSE c. square root of SST d. square root of MSE
- SE= square root of MSE d. square root of MSE
A term used to describe the case when the independent variables in a multiple regression model are correlated is
-Correlation is the measure of association between any 2 variables. -Leverage is a measure of how far the independent variable values of an observation are from those of the other observations. -Multicollinearity is a phenomenon that occurs when the independent variables are highly correlated with each other. -Regression is a technique of predicting a variable under study using other independent variables. Multicollinearity
The following information regarding a dependent variable (y) and an independent variable (x) is provided. y= 5, 6, 5, 7, 8 x= 2, 3, 4, 5, 6 SSE= 1.9 SST= 6.8 The least squares estimate of the slope is
0.7
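The slope can be reproduced from the raw data with the usual least squares formulas, b1 = Sxy/Sxx and b0 = y-bar - b1*x-bar. The function name below is hypothetical and the sketch uses only the standard library.

```python
def least_squares(x, y):
    """Return (b0, b1) for the simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squared deviations in x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares([2, 3, 4, 5, 6], [5, 6, 5, 7, 8])
print(round(b0, 4), round(b1, 4))
```

Here Sxy = 7 and Sxx = 10, which gives the slope of 0.7 on this card and the intercept of 3.4 asked for on a later card.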
The following information regarding a dependent variable (y) and an independent variable (x) is provided. y= 5, 6, 5, 7, 8 x= 2, 3, 4, 5, 6 SSE= 1.9 SST= 6.8 the coefficient of correlation is
0.8489 (r)
Regression analysis was applied between sales data (y in $1000s) and advertising data (x in $100s) and the following information was obtained. ^y= 12+1.8x n=17 SSR= 225 SSE= 75 sb1= .2683 Based on the above estimated regression equation, if advertising is $3000, then the point estimate for sales (in dollars) is
Advertising of $3000 is x = 3000/100 = 30, so ^y = 12+1.8*30 = 66 (in $1000s); $66,000
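The unit conversions are the easy place to slip on this kind of question. A small sketch, with a hypothetical helper name and with the units (x in $100s, y in $1000s) passed in as defaults:

```python
def point_estimate_dollars(b0, b1, advertising_dollars, x_unit=100, y_unit=1000):
    """Convert dollars to model units, predict, and convert back to dollars."""
    x = advertising_dollars / x_unit   # e.g. $3000 -> 30 units of $100
    y_hat = b0 + b1 * x                # prediction in units of $1000
    return y_hat * y_unit


sales = point_estimate_dollars(12, 1.8, 3000)
print(round(sales))
```

The same helper applies to any card with these units, for example ^y=500+4x with advertising of $10,000.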
Below you are given a partial computer output from a multiple regression analysis based on a sample of 16 observations. The interpretation of the coefficient of x1 is that
A one unit increase in x1 will lead to a 7.682 decrease in y when all other variables are held constant.
The following estimated regression equation was developed relating yearly income (y in $1000s) of 30 individuals with their age (x1) and their gender (x2) (0 if male and 1 if female). ^y= 30+.70x1+3x2 Also provided are SST = 1200 and SSE = 384. From the above linear function for multiple regression, it can be said that The yearly income (in $) expected of a 24-year-old female individual is
Ans: Given the regression equation ^y = 30+0.7x1+3x2, with x1=24 and x2=1: ^y = 30+0.7*24+3*1 = 49.80 (in $1000s). The yearly income expected of a 24-year-old female individual is $49,800
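A short sketch of the same calculation, treating gender as a 0/1 dummy variable as the question defines it. The function name is illustrative only.

```python
def expected_income_thousands(age, female):
    """Estimated yearly income in $1000s; female is 1 for female, 0 for male."""
    return 30 + 0.70 * age + 3 * female


female_24 = expected_income_thousands(24, 1)
male_24 = expected_income_thousands(24, 0)

# The dummy coefficient is the female-minus-male gap at any fixed age.
gap_dollars = (female_24 - male_24) * 1000

print(round(female_24 * 1000), round(gap_dollars))
```

Because x2 only takes the values 0 and 1, its coefficient of 3 is the difference in expected income between a female and a male of the same age, in $1000s.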
A measure of goodness of fit for the estimated regression equation is the
The coefficient of determination R^2 is a measure of the explanatory power of a regression. It is defined as the ratio of the variation explained by the regression to the total variation; in other words, it is the proportion of variation explained by the regression. Calculated as: R^2 = Explained Sum of Squares/Total Sum of Squares. If R^2 is close to 1, most of the variation is explained by the regression and the fit is good. If R^2 is close to zero, almost no variation is explained by the regression and the fitted model should not be used for prediction. R^2 is also the square of the correlation between the observed and predicted values of the dependent variable: R^2 = (Corr(y, ^y))^2, and it lies between 0 and 1. Multiple Coefficient of Determination
Correlation analysis is used to determine
Correlation analysis is used to determine whether there is a linear relationship between an independent variable (x, the predictor) and a continuous dependent variable (y). - the strength of the relationship between the dependent and the independent variables
The mathematical equation relating the independent variable to the expected value of the dependent variable; that is E(y)=B0+B1X, is known as the a. regression equation b. correlation model c. estimated regression equation d. regression model
Each distribution of y values has its own mean or expected value. The equation that describes how the expected value of y, denoted E(y), is related to x is called the regression equation (for simple linear regression) a. regression equation E(y)=B0+B1X
SSE
Error sum of squares *Once you have calculated the error sum of squares (SSE), you can calculate the SSTR and SST. When you compute SSE, SSTR, and SST, you then find the error mean square (MSE) and treatment mean square (MSTR), from which you can then compute the test statistic.*
In a multiple regression analysis, SSR = 1000 and SSE= 200. The F stat for this model is
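F = MSR/MSE = (SSR/p)/(SSE/(n-p-1)), where p is the number of independent variables and n is the number of observations. Neither is given here, so F cannot be computed from SSR and SSE alone. The ratio SSR/SSE = 1000/200 = 5 is a ratio of sums of squares, not of mean squares, and equals F only when p = n-p-1.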
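Since F is a ratio of mean squares, it depends on the degrees of freedom as well as on SSR and SSE. The sketch below uses a hypothetical function name, and the values of n and p are made up purely to show that dependence; they are not part of the question.

```python
def f_statistic(ssr, sse, n, p):
    """Overall F statistic for a regression with p independent variables."""
    msr = ssr / p              # mean square due to regression
    mse = sse / (n - p - 1)    # mean square error
    return msr / mse


# Hypothetical sample sizes and variable counts, for illustration only.
f_a = f_statistic(1000, 200, n=25, p=4)
f_b = f_statistic(1000, 200, n=43, p=2)
print(f_a, f_b)
```

The same SSR and SSE give different F values once n and p change.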
Solving for SST then for coefficient of determination (r^2)
First calculate the overall average for the sample, known as the overall mean or grand mean. With 12 total observations, add up the 12 sample values and divide by 12. Then compute SSTR: for each column, take the squared difference between the column mean and the overall mean, multiply it by the number of elements in the column, and sum the results over the columns. E.g., with four observations in each column, an overall mean of 2.1, and column means of 2.3 for column 1, 1.85 for column 2 and 2.15 for column 3, SSTR = 4(0.2)^2 + 4(-0.25)^2 + 4(0.05)^2 = 0.42. After you compute SSE and SSTR, their sum gives the total sum of squares (SST), so in the battery example SST=SSTR+SSE. E.g. (from hw): 800+200 = 1000 (SST) R^2 (coefficient of determination) = 1-SSE/SST = 1-200/1000 = 1-0.2 = 0.8 = .800
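The two calculations on this card can be sketched as follows. The column means and the homework sums of squares are taken from the card; the variable names are illustrative, and the grand mean is computed as the average of the column means, which is valid only because every column has the same number of observations.

```python
column_means = [2.3, 1.85, 2.15]
n_per_column = 4

# Equal column sizes, so the grand mean is the mean of the column means.
grand_mean = sum(column_means) / len(column_means)

# SSTR: size-weighted squared distance of each column mean from the grand mean.
sstr = sum(n_per_column * (m - grand_mean) ** 2 for m in column_means)

# Homework numbers from the card: SST = 800 + 200, with SSE = 200.
sse_hw = 200
sst_hw = 800 + sse_hw
r2_hw = 1 - sse_hw / sst_hw

print(round(grand_mean, 2), round(sstr, 2), r2_hw)
```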
In a regression analysis between sales (y in $1000) and advertising (x in dollars) resulted in the following equation ^y= 30,000+5x the above equation implies that an a. increase of $4 in advertising is associated with an increase of $5,000 in sales b. increase of $1 in advertising is associated with an increase of $5 in sales c. increase of $1 in advertising is associated with an increase of $34,000 in sales d. increase of $1 in advertising is associated with an increase of $5,000 in sales
The slope is 5 and sales are measured in $1000s, so a $1 increase in advertising is associated with an increase of 5*$1000 = $5,000 in sales. d. increase of $1 in advertising is associated with an increase of $5,000 in sales
In regression analysis, the unbiased estimate of the variance is a. coefficient of correlation b. coefficient of determination c. mean square error d. slope of the regression equation
In regression analysis, the unbiased estimator of the variance of the error term is the error sum of squares divided by its degrees of freedom, which is the mean square error: MSE = SSE/(n-p-1), or SSE/(n-2) in simple linear regression. c. mean square error
Interpretation of coefficient
In simple linear regression, we interpret b1 as an estimate of the change in y for a one-unit change in the independent variable. In multiple regression analysis, this interpretation must be modified somewhat. That is, in multiple regression analysis, we interpret each regression coefficient as follows: bi represents an estimate of the change in y corresponding to a one-unit change in xi when all other independent variables are held constant. In the Butler Trucking example involving two independent variables, the estimated regression equation is ^y = -.869 + .06113x1 + .923x2. Here b1 = .06113, so .06113 hours is an estimate of the expected increase in travel time corresponding to an increase of one mile in the distance traveled when the number of deliveries is held constant. Similarly, because b2 = .923, an estimate of the expected increase in travel time corresponding to an increase of one delivery when the number of miles traveled is held constant is .923 hours.
Larger values of r^2 imply that the observations are more closely grouped about the
Larger values of r^2 imply that the observations are more closely grouped around the least squares regression line. - Least Squares Line
The mathematical equation which has the form of E(y)= B0+B1x1+B2x2+...+Bpxp relating the expected value of the dependent variable to the value of the independent variables is
Multiple regression equation
In order to test for the significance of a regression model involving 3 independent variables and 47 observations, the numerator and denominator degrees of freedom (respectively) for the critical value of F are
For the F test: numerator DF = number of independent variables = 3; denominator DF = number of observations - number of independent variables - 1 = 47-3-1 = 43. 3 and 43
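The degrees-of-freedom rules used on these cards are simple enough to capture in two helpers. The function names are hypothetical; n is the number of observations and p the number of independent variables.

```python
def f_test_df(n, p):
    """(numerator df, denominator df) for the overall F test."""
    return p, n - p - 1


def t_test_df(n, p):
    """Degrees of freedom for the t test on an individual coefficient."""
    return n - p - 1


print(f_test_df(47, 3))
print(t_test_df(136, 5))
```

The second call corresponds to the later card with 5 independent variables and 136 observations.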
As the value of the multiple coefficient of determination increases,
The goodness of fit for the estimated multiple regression equation increases.
SST
Total Sum of Squares The sum of squares associated with the source of variation referred to as "Total" is called the total sum of squares (SST). It is the sum of the squared deviations of all observations from the overall mean. Note that the results for the Chemitech experiment suggest that SST = SSTR + SSE, and that the degrees of freedom associated with this total sum of squares is the sum of the degrees of freedom associated with the sum of squares due to treatments and the sum of squares due to error. In other words, SST can be partitioned into two sums of squares: the sum of squares due to treatments and the sum of squares due to error. After you find the SSE, your next step is to compute the SSTR. This is a measure of how much variation there is among the means. With a low SSTR, the means of the different types are similar to each other.
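In a regression analysis, the regression equation is given by y=12-6x. If SSE=510 and SST=1000, then the coefficient of correlation is
Coefficient of determination = 1 - (SSE/SST) = 1-(510/1000) = 1-0.51 = 0.49. The coefficient of correlation is the square root of 0.49 with the sign of the slope; the slope (-6) is negative, so r = -0.7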
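In simple linear regression the correlation coefficient is the square root of r^2 carrying the sign of the slope. A minimal sketch, with a hypothetical function name:

```python
import math


def correlation_from_fit(sse, sst, slope):
    """r for simple linear regression: sqrt(r^2) with the sign of the slope."""
    r2 = 1 - sse / sst
    return math.copysign(math.sqrt(r2), slope)


r = correlation_from_fit(510, 1000, slope=-6)
print(round(r, 4))
```

Applied to the five-point data set on the other cards (SSE = 1.9, SST = 6.8, slope 0.7), the same function gives approximately 0.8489. This shortcut applies only when there is a single independent variable.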
The following estimated regression equation was developed relating yearly income (y in $1000s) of 30 individuals with their age (x1) and their gender (x2) (0 if male and 1 if female). ^y= 30+.70x1+3x2 Also provided are SST = 1200 and SSE = 384. From the above linear function for multiple regression, it can be said that the expected yearly income of
x2 is 1 for female and 0 for male, so for a female of the same age the expected yearly income is 3 units (in $1000s) higher. So, it can be said that the expected yearly income of females is $3000 more than males.
The following information regarding a dependent variable (y) and an independent variable (x) is provided. y= 5, 6, 5, 7, 8 x= 2, 3, 4, 5, 6 SSE= 1.9 SST= 6.8 The least squares estimate of the y-intercept is
b0 = y-bar - b1*x-bar = 6.2 - 0.7*4 = 3.4
In regression analysis, the error term e is a random variable with a mean or expected value of a. 0 b. 1 c. mu d. x bar
e is a random variable referred to as the error term. The error term accounts for the variability in y that cannot be explained by the linear relationship between x and y. -with a mean or expected value of a. 0
A regression model between sales (y in $1000), unit price (x1 in dollars), and television advertisement (x2 in dollars) resulted in the following function: ^y= 8-4x1+5x2 SSR= 3500 SSE= 1500 SS= 20 The coefficient of the unit price indicates that if the unit price is
increased by $1 (holding advertisement constant), sales are expected to decrease by $4000. **The coefficient of x1 is -4, i.e. for every one-unit increase in x1, y falls by 4 units. Since y is in $1000s, every one dollar increase in unit price (x1) is associated with a decrease of $4000 in sales, keeping x2 constant. The coefficient of x2 is +5, i.e. for every one-unit increase in x2, y rises by 5 units. So every one dollar increase in television advertisement (x2) is associated with an increase of $5000 in sales, keeping x1 constant.**
In a multiple regression analysis involving 15 independent variables and 200 observations, SST= 800 and SSE= 240. The multiple coefficient of determination is
k= 15 SST= 800 SSE= 240 n= 200 R^2 (coefficient of determination)= 1-SSE/SST =1- 240/800 =1- 0.3 =0.7 =0.700
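As the first card in this set notes, the adjusted multiple coefficient of determination corrects R^2 for the number of independent variables. A sketch using the standard formula, adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1), applied to this card's numbers; the function name is hypothetical and the adjusted value is not part of the original question.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n observations and p independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


r2 = 1 - 240 / 800                 # multiple coefficient of determination
adj = adjusted_r2(r2, n=200, p=15)
print(round(r2, 3), round(adj, 4))
```

With 15 independent variables the adjusted value is somewhat below 0.7, since each added variable uses up an error degree of freedom.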
A regression model involved 5 independent variables and 136 observations. The critical value of t for testing the significance of each of the independent variables coefficients will have
k=5 n=136 There are p = k+1 normal equations, one for each of the unknown regression coefficients (the intercept plus the k slopes), and their solution gives the least squares estimators. So p = k+1 = 5+1 = 6. The critical value of t for testing the significance of each independent variable's coefficient is t(alpha/2, n-p). For example, with alpha = 0.01 the two-tailed critical value is t(0.005, 130) = 2.614. DF = n-(k+1) = 136-(5+1) = 136-6 = 130
Regression analysis was applied between sales (in $1000) and advertising (in $100) and the following regression function was obtained ^y=500+4x Based on the above estimated regression line, if advertising is $10,000, then the point estimate for sales (in dollars) is
sales = response variable (in $1000); advertising = explanatory variable (in $100); regression line is ^y = 500+4x. If advertising is $10,000, then x = 10,000/100 = 100 and ^y = 500+4*100 = 900 (in $1000s), so the point estimate of sales is $900,000
If the coefficient of determination is a positive value, then the coefficient of correlation a. must also be positive b. must be zero c. can be either positive or negative d. can be larger than 1
The coefficient of determination is the square of the correlation coefficient: coefficient of determination = (correlation coefficient)^2. The correlation coefficient can take any value from -1 to +1, and its square is never negative, so a positive coefficient of determination does not reveal the sign of the correlation. c. can be either positive or negative