Data Analysis: Chapter 13

Three important assumptions that are tested for the random error ε

1. the errors are normally distributed 2. the errors have constant variance 3. the errors are independent

A measure of relative fit for a simple regression line is called the R² or ___________ __________ ___________.

coefficient of determination

Which of the following is a "goodness of fit" measure?

coefficient of determination
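
As a rough illustration of how this fit measure is computed, the sketch below uses the standard definition R² = 1 - SSE/SST. The function name `r_squared` and the data values are hypothetical, chosen only for the example.

```python
def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SSE/SST."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    return 1 - sse / sst


# Hypothetical observed and fitted values, for illustration only.
y = [3.0, 5.0, 7.0, 9.0]
y_hat = [3.5, 4.5, 7.5, 8.5]
print(r_squared(y, y_hat))
```

A value near 1 means the fitted values track the observations closely; a value near 0 means the fit does little better than predicting the mean of Y.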

A __________ interval for Y, the response variable, predicts the mean of Y whereas a _________ interval for Y predicts the individual value for Y.

confidence; prediction

A response variable, Price, is defined as the selling price of a used car. The three predictor variables are Age, the age of the car in years; Mileage, the mileage of the car in thousands of miles; and Cylinders, the number of engine cylinders. The estimated regression equation is: Price = 10,000 - 1800 Age - 50 Mileage + 1200 Cylinders. Predict the average price of a 2-year-old car with 50,000 miles and 4 cylinders.

$8,700 (note: use Mileage = 50, not 50,000, because Mileage is measured in thousands of miles)
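
The arithmetic behind this answer can be checked with a short sketch. The function name `predict_price` and its parameter names are made up for the example; the coefficients come from the card.

```python
def predict_price(age_years, mileage_thousands, cylinders):
    """Evaluate the estimated equation from the card.

    Mileage is entered in thousands of miles, so 50,000 miles is 50.
    """
    return 10_000 - 1_800 * age_years - 50 * mileage_thousands + 1_200 * cylinders


# 10,000 - 3,600 - 2,500 + 4,800
print(predict_price(age_years=2, mileage_thousands=50, cylinders=4))  # 8700
```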

A multivariate data set will have ________

1. a single column of Y values 2. n rows of observations 3. k columns of X values

Variance inflation caused by multicollinearity can result ________.

1. in untrustworthy t statistics for the coefficient estimates 2. in difficulty identifying the contribution of each predictor 3. in wider confidence intervals for the coefficients of parameters than warranted
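
One common way to quantify this inflation is the variance inflation factor, VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing predictor X_j on the other predictors. The sketch below assumes R_j² has already been obtained; the function name `vif` is hypothetical.

```python
def vif(r2_j):
    """Variance inflation factor for predictor j.

    r2_j is the R² from regressing X_j on all the other predictors
    (assumed to be computed elsewhere and to be less than 1).
    """
    return 1.0 / (1.0 - r2_j)


# A predictor unrelated to the others has VIF = 1 (no inflation);
# the VIF grows without bound as r2_j approaches 1.
for r2 in (0.0, 0.5, 0.9):
    print(r2, vif(r2))
```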

Variance inflation caused by multicollinearity can result __________.

1. in wider confidence intervals for the coefficients of parameters than warranted 2. in difficulty identifying the contribution of each predictor 3. in untrustworthy t statistics for the coefficient estimates

4 criteria for regression assessment

1. logic 2. fit 3. parsimony 4. stability

Limitations of simple regression

1. multiple relationships usually exist 2. biased estimates if relevant predictors are omitted 3. lack of fit does not show that X is unrelated to Y if the true model is multivariate

A multiple regression model is preferred over a simple regression model because ________.

1. rarely does one predictor explain the variation in Y as well as several predictors 2. it is possible that a predictor can appear unrelated to Y in a simple regression but it can show significance when combined with another predictor

If two predictor variables, X1 and X2, are suspected of having an interaction effect on the response variable Y, we can test for this by adding the term _______ to the model.

β₃x₁x₂

If a predictor variable, Xj, is suspected of having a quadratic relationship with the response variable Y, we can test for this by adding the term __________ to the model.

βⱼxⱼ² (a squared term in xⱼ with its own coefficient)
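
In practice, both tests amount to adding a new column to the data before fitting: the product of the two predictors for an interaction, or the square of a predictor for a quadratic effect. A minimal sketch, with a hypothetical helper name and made-up values:

```python
def add_interaction_and_square(x1, x2):
    """Build the extra columns used to test for interaction and curvature.

    Returns (x1 * x2 for each row, x1 squared for each row).
    """
    interaction = [a * b for a, b in zip(x1, x2)]
    x1_squared = [a ** 2 for a in x1]
    return interaction, x1_squared


# Hypothetical predictor values, for illustration only.
x1 = [1, 2, 3]
x2 = [4, 5, 6]
print(add_interaction_and_square(x1, x2))  # ([4, 10, 18], [1, 4, 9])
```

The new columns are then included as ordinary predictors, and the usual t test on each added coefficient indicates whether the effect is significant.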

The ______ statistic is defined as the ratio MSR/MSE.

F
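
As a sketch of the ratio, assuming the usual ANOVA definitions MSR = SSR/k and MSE = SSE/(n - k - 1) for n observations and k predictors (the function name and the numbers are hypothetical):

```python
def f_statistic(ssr, sse, n, k):
    """F = MSR / MSE for a regression with k predictors and n observations."""
    msr = ssr / k            # mean square due to regression
    mse = sse / (n - k - 1)  # mean square error
    return msr / mse


# Hypothetical sums of squares: MSR = 80/2 = 40, MSE = 20/10 = 2.
print(f_statistic(ssr=80, sse=20, n=13, k=2))  # 20.0
```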

The response variable (Y) is assumed to be related to the ______ predictors by a linear equation called the ______ _________ ___________.

k; population regression model

Match the regression criteria with the reasoning

Logic: Is there an expectation that the predictor will help explain variation in the response? Fit: Does the overall regression show significant predictive ability? Parsimony: Does each predictor contribute significantly to the model? Stability: Are the predictors independent enough from each other so that the model is stable?

A response variable is defined as the selling price of a used car. Three predictor variables include the age of the car, the mileage of the car, and the number of cylinders. The proper estimated regression equation would be:

Price = b0 + b1 Age + b2 Mileage + b3 Cylinders

The standard error of the regression, se, is calculated by taking the square root of ________ divided by ______.

SSE; n-k-1
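
A minimal sketch of this calculation, with a hypothetical function name and made-up inputs:

```python
import math


def standard_error(sse, n, k):
    """Standard error of the regression: sqrt(SSE / (n - k - 1))."""
    return math.sqrt(sse / (n - k - 1))


# Hypothetical values: SSE = 40 with n = 13 observations and k = 2 predictors.
print(standard_error(sse=40, n=13, k=2))  # 2.0
```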

What does SSE represent in regression analysis?

The amount of variation in Y that is left unexplained

If c binary variables are created for a categorical predictor with c categories, the regression calculations will fail because we will have ________.

a redundant predictor that causes perfect collinearity

Suppose we define a qualitative variable called payment method and the categories are credit card, personal check, or cash. The binary variables defined are CC =1 if pay by credit card (0 otherwise) and PC = 1 if pay by personal check (0 otherwise). If both CC = 0 and PC = 0 it means the payment method was by ______

cash
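
The coding scheme on this card can be sketched as follows. The function name `encode_payment` is hypothetical; the point is that three categories need only two binaries, with cash as the omitted reference category.

```python
def encode_payment(method):
    """Return (CC, PC) for a payment method; cash is the reference category."""
    if method not in {"credit card", "personal check", "cash"}:
        raise ValueError(f"unknown payment method: {method}")
    cc = 1 if method == "credit card" else 0
    pc = 1 if method == "personal check" else 0
    return cc, pc


for m in ("credit card", "personal check", "cash"):
    print(m, encode_payment(m))
```

A third binary for cash would always equal 1 - CC - PC, so it would add no information and would be perfectly collinear with the other two columns plus the intercept.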

A correlation matrix can be used to identify possible ______ between two predictor variables.

collinearity (that is, correlation between the two predictors)

A response variable, Price, is defined as the selling price of a used car. The three predictor variables are Age, the age of the car in years; Mileage, the mileage of the car in thousands of miles; and Cylinders, the number of engine cylinders. The estimated regression equation is: Price = 10,000 - 1800 Age - 50 Mileage + 1200 Cylinders. If the estimated price of a used car is $8,700 and the actual selling price is $9,000, the residual is _____.

ei = $300 (actual minus estimated: $9,000 - $8,700)
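
The residual is the observed value minus the fitted value, which a one-line sketch makes explicit (the function name is hypothetical):

```python
def residual(actual, estimated):
    """Residual e_i = observed y_i minus fitted y-hat_i."""
    return actual - estimated


print(residual(actual=9000, estimated=8700))  # 300
```

A positive residual means the model underestimated the actual price.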

Multiple regression

Extends simple regression to include several independent variables. It is required when a single-predictor model is inadequate to describe the true relationship between the dependent variable Y and its potential predictors.

True or false: A binary predictor variable is tested for significance using a different test statistic than used for a qualitative predictor variable

false

True or false: Software packages such as Excel or MINITAB routinely report left-tail p-values when testing multiple regression coefficients.

false

When the predictor variables are related to each other rather than being independent we have a condition called _______.

multicollinearity

Coefficient instability would be when X1 and X2 both show strong correlation with Y and ________.

one or both of their coefficients are not significant

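Klein's Rule

Suggests that we should worry about the stability of the regression coefficient estimates only when a pairwise predictor correlation exceeds the multiple correlation coefficient R (the square root of R²).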
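
Klein's Rule is usually stated as a comparison: be concerned about coefficient stability when the correlation between two predictors is larger in magnitude than the multiple correlation coefficient R, the square root of R². A sketch under that reading, with a hypothetical function name and made-up numbers:

```python
import math


def klein_rule_flags(pairwise_r, r_squared):
    """True if a pairwise predictor correlation exceeds R = sqrt(R²)."""
    return abs(pairwise_r) > math.sqrt(r_squared)


# Hypothetical model with R² = 0.64, so R = 0.8.
print(klein_rule_flags(pairwise_r=0.9, r_squared=0.64))  # True
print(klein_rule_flags(pairwise_r=0.5, r_squared=0.64))  # False
```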

A response variable, Price, is defined as the selling price of a used car. The three predictor variables are Age, the age of the car in years; Mileage, the mileage of the car in thousands of miles; and Cylinders, the number of engine cylinders. The estimated regression equation is: Price = 10,000 - 1800 Age - 50 Mileage + 1200 Cylinders. The coefficient -1800 means ________.

that for each one-year increase in a car's age, the price decreases by $1,800 on average, holding mileage and cylinders constant

If we fail to reject the null hypothesis that the coefficient βk = 0, then we conclude ______.

that the predictor variable Xk is not significantly associated with the response variable Y.

Identify the estimated multiple regression equation.

ŷ = b0 + b1x1 + b2x2 + ... + bkxk

Identify the population multiple regression model.

y = β0 + β1x1 + β2x2 + ... + βkxk + ε

The objective when conducting a regression analysis is _______.

to find a linear equation that minimizes the sum of the squared differences between the observed response and the estimated response
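
To make the least-squares objective concrete, the sketch below fits a line with a single predictor using the closed-form formulas and checks that nudging the fitted intercept only increases the sum of squared errors. This is a simplification of the multiple-regression case; the function names and data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sse(x, y, b0, b1):
    """Sum of squared differences between observed and estimated response."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = fit_line(x, y)
print(b0, b1, sse(x, y, b0, b1))
# Any other line has an SSE at least as large as the fitted one.
print(sse(x, y, b0, b1) <= sse(x, y, b0 + 0.1, b1))
```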

True or False: R²adj is always less than R².

true (whenever the model has at least one predictor and R² < 1)
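
The relationship follows from the standard adjustment formula R²adj = 1 - (1 - R²)(n - 1)/(n - k - 1), which penalizes each added predictor. A sketch with a hypothetical function name and made-up inputs:

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R² for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical model: R² = 0.80 with n = 21 observations and k = 4 predictors.
r2 = 0.80
r2_adj = adjusted_r_squared(r2, n=21, k=4)
print(r2_adj)
print(r2_adj < r2)  # True
```

Because (n - 1)/(n - k - 1) is greater than 1 whenever k is at least 1, the adjusted value falls below R² unless R² is exactly 1.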

Stepwise regression

Uses the power of the computer to fit the best model using 1, 2, 3, ..., k predictors.

A response variable, Price, is defined as the selling price of a used car. The three predictor variables are Age, the age of the car in years; Mileage, the mileage of the car in thousands of miles; and Cylinders, the number of engine cylinders. The estimated regression equation is: Price = 10,000 - 1800 Age - 50 Mileage + 1200 Cylinders. By how much will the price be reduced for each additional 10,000 miles on the car?

$500

"When two explanations are otherwise equivalent, we prefer the simplest explanation." This is known as the principle of ______

Occam's Razor

The R² value falls in the range __________ to ________.

0 to 1

Each category of a qualitative variable can be converted to a binary variable by assigning the value ______ or ______ to indicate the presence or absence of the condition.

0;1

The quick rule for finding 95% confidence or prediction intervals for Y substitutes the number _________ for the __________ statistic

2; t
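
A sketch of the quick rule, assuming the commonly taught approximations ŷ ± 2·se for an individual prediction and ŷ ± 2·se/√n for the mean response. The function names and the values of se and n are hypothetical, and the shortcut ignores how far the predictor values lie from their means.

```python
import math


def quick_prediction_interval(y_hat, se):
    """Approximate 95% interval for an individual Y: y_hat ± 2·se."""
    return y_hat - 2 * se, y_hat + 2 * se


def quick_confidence_interval(y_hat, se, n):
    """Approximate 95% interval for the mean of Y: y_hat ± 2·se/sqrt(n)."""
    half_width = 2 * se / math.sqrt(n)
    return y_hat - half_width, y_hat + half_width


# Hypothetical fitted value and standard error.
print(quick_prediction_interval(y_hat=8700, se=500))        # (7700, 9700)
print(quick_confidence_interval(y_hat=8700, se=500, n=25))  # (8500.0, 8900.0)
```

The prediction interval is wider because an individual value varies more than a mean does.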

Doane's Rule states that there should be at least ________ observations for each predictor variable

5
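
Stated as a check on sample size, the rule asks whether n/k is at least 5, where n is the number of observations and k the number of predictors. A minimal sketch with a hypothetical function name:

```python
def meets_doanes_rule(n, k):
    """True if there are at least 5 observations per predictor (n/k >= 5)."""
    return n / k >= 5


print(meets_doanes_rule(n=20, k=4))  # True
print(meets_doanes_rule(n=19, k=4))  # False
```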

Identify the four criteria for regression assessment

Logic Fit Parsimony Stability

Principle of Occam's Razor

When two explanations are otherwise equivalent, we prefer the simpler, more parsimonious one.

