MKT 317 Exam 1

Real estate data in a large city was analyzed using recent home sales of houses between 1000 and 3000 square feet. A model predicting the average selling price based on square footage is given below. Average price (in thousands of dollars) = -75 + 0.125*(square footage) Complete the sentence below. For every additional 100 square feet, the average home price increases by ___________.

$12,500
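
The arithmetic behind this answer can be checked with a short sketch. The function name `predicted_price_thousands` is an assumption introduced only for illustration; the coefficients are the ones given in the question.

```python
def predicted_price_thousands(square_feet):
    """Average price in thousands of dollars, using the model in the question."""
    return -75 + 0.125 * square_feet

# Compare two homes that differ by 100 square feet. Because the model is
# linear, any starting size in the 1000-3000 range gives the same difference.
increase_thousands = predicted_price_thousands(2100) - predicted_price_thousands(2000)
increase_dollars = increase_thousands * 1000
print(increase_dollars)  # 12500.0
```

The slope is in thousands of dollars per square foot, so 0.125 * 100 = 12.5 thousand dollars.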

Suppose you have the following model that predicts the unexpected costs (in dollars) of a renovation project, based on the number of days that the renovation project is scheduled to take: unexpected costs = 50 + 100(Days) What is the predicted average unexpected costs among all projects that are scheduled to last 4 days?

$450
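
A prediction from this model is a direct substitution. The sketch below uses a hypothetical helper name, `predicted_unexpected_costs`, with the intercept and slope from the question.

```python
def predicted_unexpected_costs(days):
    """Predicted average unexpected costs in dollars for a project of `days` days."""
    return 50 + 100 * days

cost_for_four_days = predicted_unexpected_costs(4)
print(cost_for_four_days)  # 450
```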

The resulting model is: Predicted Petal.Width = -0.24 + 0.52(Petal.Length) -0.21(Sepal.Length) + _______(Sepal.Width) Please round your answer to two decimal places

.22

Create a simple linear regression model using the iris data set where the Y-variable is Petal.Width and the X-variable is Petal.Length. Predicted Petal.Width = b0 + b1(Petal.Length) What is the estimate for the slope, b1? Please round to two decimal places

0.42

Suppose we would like to create the model Y = b0 + b1(X) We use the appropriate commands in R, and obtain the output in the image below. What is the value of the estimate of the slope, b1?

1.93575

Suppose we have the model Y = 10 * (X)^1.25 Whenever X increases 10%, the average value of Y increases about

12.65%
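
In a power model, a fixed percentage change in X produces the same percentage change in Y at every starting value of X. The sketch below checks that numerically; the helper names and the starting values 1, 5, and 40 are assumptions chosen for illustration.

```python
def power_model(x):
    """Y = 10 * X^1.25, the model in the question."""
    return 10 * x ** 1.25

def percent_change(old, new):
    return (new - old) / old * 100

# A 10% increase in X multiplies Y by 1.10 ** 1.25, whatever X we start from.
changes = [
    percent_change(power_model(x), power_model(1.10 * x))
    for x in (1.0, 5.0, 40.0)
]
print([round(c, 2) for c in changes])  # each entry is about 12.65
```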

Suppose a large data set includes information about the weights (measured in carats) and prices (measured in US dollars) of recent diamond sales. The data produce the linear model below, and the R-squared value for this model is 0.85. Predicted Price = -2,256 + 7,756(weight) What can we conclude from the R-squared value of 0.85?

85% of the variability in prices of recent diamond sales can be explained by the diamonds' weights.

From the model above, we can conclude that ______% of the variability in Petal.Width can be explained by Petal.Length. You may round your answer to the nearest whole number.

93

The multiple linear regression model created above is generally accurate - we know this because approximately _____ % of the variability in Petal.Width can be explained by a combination of Petal.Length, Sepal.Width, and Sepal.Length. Please round your answer to the nearest whole number.

94

Suppose we have the model Y = 10 * (1.25)^x Whenever X increases 3 units, the average value of Y increases about

95%
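
For an exponential model, a fixed absolute change in X multiplies Y by a fixed factor. A minimal sketch of the calculation, with hypothetical helper names and arbitrary starting values:

```python
def exponential_model(x):
    """Y = 10 * (1.25)^X, the model in the question."""
    return 10 * 1.25 ** x

def percent_increase_over_three_units(x):
    # Moving from x to x + 3 multiplies Y by 1.25 ** 3 = 1.953125.
    return (exponential_model(x + 3) / exponential_model(x) - 1) * 100

increase_from_2 = percent_increase_over_three_units(2)
increase_from_10 = percent_increase_over_three_units(10)
print(round(increase_from_2, 2), round(increase_from_10, 2))  # both about 95.31
```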

For this question, we will use the model ln(Y) = 2 + 7(ln(x)) What describes the relationship between X and the average value of Y?

A percentage change in X corresponds to a percentage change in the average value of Y.
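
Exponentiating both sides of the log-log model gives Y = e^2 * X^7, which is a power model. The sketch below, with hypothetical function names, shows that a 1% increase in X gives the same percentage change in Y from two different starting points.

```python
import math

def y_from_log_log_model(x):
    """Solve ln(Y) = 2 + 7*ln(X) for Y."""
    return math.exp(2 + 7 * math.log(x))

def percent_change_in_y(x, percent_change_in_x):
    new_x = x * (1 + percent_change_in_x / 100)
    return (y_from_log_log_model(new_x) / y_from_log_log_model(x) - 1) * 100

# A 1% increase in X multiplies Y by 1.01 ** 7, roughly a 7.2% increase.
change_at_3 = percent_change_in_y(3.0, 1.0)
change_at_20 = percent_change_in_y(20.0, 1.0)
print(round(change_at_3, 2), round(change_at_20, 2))
```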

For this question, we will use the model ln(Y) = ln(50) - 6(X) What describes the relationship between X and the average value of Y?

An absolute change in X corresponds to a percentage change in the average value of Y.

For this question, we will use the model ln (Y) = 2 + 4X What describes the relationship between X and the average value of Y?

An absolute change in X corresponds to a percentage change in the average value of Y.
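
Solving ln(Y) = 2 + 4X for Y gives Y = e^(2 + 4X), an exponential model. The sketch below uses a step of 0.1 in X (an arbitrary choice for illustration) and hypothetical function names to show that the same absolute step produces the same percentage change in Y wherever it starts.

```python
import math

def y_from_log_linear_model(x):
    """Solve ln(Y) = 2 + 4X for Y."""
    return math.exp(2 + 4 * x)

def percent_change_in_y(x, step):
    return (y_from_log_linear_model(x + step) / y_from_log_linear_model(x) - 1) * 100

# Adding 0.1 to X multiplies Y by e ** 0.4, regardless of the starting X.
change_at_0 = percent_change_in_y(0.0, 0.1)
change_at_5 = percent_change_in_y(5.0, 0.1)
print(round(change_at_0, 2), round(change_at_5, 2))  # both about 49.18
```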

Suppose Y = 3 + 2X Which of the following statements explains the relationship between X and Y?

An absolute change in X corresponds to an absolute change in Y.

Using the scatter plot below, estimate the correlation coefficient between variables X and Y. "Plot creates a heart of dots."

Approximately 0

Suppose we have a dependent variable, Y, and 50 independent quantitative variables: X1 to X50. Furthermore, suppose that we would like to create a simple linear regression model that predicts the average value of Y based on the value of exactly one of the independent variables. What is a reasonably accurate and efficient method to determine which of the 50 quantitative variables has the strongest linear relationship with Y?

Create a correlation coefficient matrix plot and look for the biggest dot.

Suppose that we believe that for every additional dollar spent on social media advertising, the sales will increase by a certain percentage. What type of model should we use to create the interpretation above? In this model, sales is the Y-variable and social media advertising budget is the X-variable.

Exponential regression model

Every additional two cylinders will decrease the fuel economy by about 3.6 miles per gallon.

False

From the given model output, we can conclude that the size of the team is not correlated with salary.

False

From the model output, we can conclude that for every extra year of experience, the average salary increases 1.2 thousand dollars.

False

Suppose we would like to create the model Y = b0 + b1(X) We use the appropriate commands in R, and obtain the output in the image below. True or False: The output indicates that the estimated average value of X is 1.93575

False

The p-value for graduate credits is smaller than the p-value for certificate credits, so we can conclude that the correlation between graduate credits and salary is stronger than the correlation between certificate credits and salary.

False

True or False: R-squared measures multicollinearity

False

True or False: This model reduces to Predicted mpg = 41.1 - 1.8(cyl) - 3.6(wt)

False

True or False: the model output above indicates that there is not a statistically significant correlation between disp and mpg.

False

When comparing cars of the same weight, every additional two cylinders will decrease the fuel economy by about 3.6 miles per gallon.

False

When comparing people with the same number of graduate credits and same number of certificate credits, then for every additional year of experience a manager has, the average salary increases 1.2 thousand dollars.

False

In the plot below, there is a scatter plot whose dots follow a generally straight-line pattern. This is telling us that a(n) ________ model is an appropriate option for creating a model that predicts life expectancy based on GDP per capita. (Image courtesy of gapminder.org)

Logarithmic

Suppose we have the following data about 1000 customers at a cell phone company - years: the number of years the individual has been a customer (values in the data are numbers ranging from 0 to 15) - price: the monthly price of the customer's cell phone plan (values in the data are numbers ranging from 30 to 150) - Renew: does the individual renew their contract at the end of the year (values in the data are text: either yes or no). Is it appropriate to use multiple linear regression to predict if a person will renew their contract based on the length of time they have been a customer and the current price they are paying?

No

Of the four options listed below, which do you think is the most accurate at modeling the relationship between Y and X?

Power

Suppose we would like to use the survey data from 500 randomly selected MSU students to estimate the average amount of money that a "typical" undergraduate student at MSU would be willing to spend on an environmentally-friendly t-shirt. This is an example of:

Predictive statistics

To answer the question "when we are comparing homes of the same size, can we say that newer houses are more expensive than older houses (of the same size)", we can create a linear regression model where the Y-variable is PRICE and the X-variable(s) is/are: (If the model that answers this question requires more than one X-variable, then select ALL X-variables that must be included in this model.)

SIZE AGE

To answer the question "when comparing homes of the same size, can we conclude that the average home price is higher for homes that have more special features than homes (of the same size) that have fewer special features", we can create a linear regression model where the Y-variable is PRICE and the X-variable(s) is/are: (If the model that answers this question requires more than one X-variable, then select ALL X-variables that must be included in this model.)

SIZE SPEC.FEATS

To answer the question "among homes with the same size, is there a difference in average home price based on property tax", we can create a linear regression model where the Y-variable is PRICE and the X-variable(s) is/are: (If the model that answers this question requires more than one X-variable, then select ALL X-variables that must be included in this model.)

SIZE TAX

To answer the question "can we conclude that the average home price is higher for homes that have more special features than homes that have fewer special features," we can create a linear regression model where the Y-variable is PRICE and the X-variable(s) is/are: (If the model that answers this question requires more than one X-variable, then select ALL X-variables that must be included in this model.)

SPEC.FEATS

Suppose in a multiple linear regression model, the independent variable X1 is strongly correlated with X4. The independent variables X2, X3, X5, and X6 are not related to any of the other independent variables. What can we say about the model Y-hat =b0 + b1x1 + b2x2 + b3x3 + b4x4 + b5x5 + b6x6?

Since X1 is strongly correlated with X4, then this model has high multicollinearity.

To determine if there is a statistically significant correlation between PRICE and TAX, we can create a linear regression model where the Y-variable is PRICE and the X-variable(s) is/are: (If the model that answers this question requires more than one X-variable, then select ALL X-variables that must be included in this model.)

TAX

Suppose you have variables named Supply and Demand, which are saved in an Excel data file called MyData. Suppose you would like to make the linear regression model: Demand = b0 + b1 (Supply) What are the "X" and "Y" variables?

The X-variable is Supply and the Y-variable is Demand.

Suppose the correlation coefficient between X and Y equals -0.0001 What does this tell us?

There is either a very weak correlation between X and Y or no correlation between X and Y.

Suppose we create a linear model, Y = b0 + b1(X), and the R-squared is very small. R-squared = 0.0002 What can we conclude?

There is not a strong linear relationship between X and Y, however it is possible that X and Y might have a relationship that is not linear.
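
A small made-up example shows how this can happen. In the sketch below, Y is completely determined by X (Y = X^2), yet the correlation coefficient, and therefore the R-squared of the simple linear model, is zero. The data and the helper `pearson_r` are assumptions for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfect, but non-linear, relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
r_squared = r ** 2  # in simple linear regression, R-squared equals r squared
print(r_squared)
```

A scatter plot of these points would show a clear curve, which is why a small R-squared alone does not rule out a relationship.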

[This is not a continuation of any previous question; the y-variable and x-variables in the questions are different from all previous questions.] Suppose in the multiple linear regression model Y-hat = b0 + b1x1 + b2x2 The correlation coefficient between Y and X1 is -0.004 The correlation coefficient between Y and X2 is 0.96 The correlation coefficient between X1 and X2 is 0.00003 What can we say about the model Y-hat = b0 + b1x1 + b2x2

This model has little or no multicollinearity.

[This is not a continuation of any previous question; the y-variable and x-variables in the questions are different from all previous questions.] Suppose in the multiple linear regression model Y-hat = b0 + b1x1 + b2x2 The correlation coefficient between Y and X1 is 0.98 The correlation coefficient between Y and X2 is 0.96 The correlation coefficient between X1 and X2 is 0.00003 What can we say about the model Y-hat = b0 + b1x1 + b2x2

This model has little or no multicollinearity.

[This is not a continuation of any previous question; the y-variable and x-variables in the questions are different from all previous questions.] Suppose in the multiple linear regression model Y-hat = b0 + b1x1 + b2x2 The correlation coefficient between Y and X1 is -0.004 The correlation coefficient between Y and X2 is 0.96 The correlation coefficient between X1 and X2 is 0.97 What can we say about the model Y-hat = b0 + b1x1 + b2x2

This model has very high multicollinearity.

True or False: The correlation coefficient measures the strength of a correlation between two variables

True

True or False: The multiple linear regression model whose Y-variable is mpg and X-variables are wt, cyl, and disp is about as accurate as the multiple linear regression model whose Y-variable is mpg and X-variables are wt and cyl.

True

When comparing cars with the same weight (wt) and displacement (disp), every additional two cylinders will decrease the fuel economy by about 3.6 miles per gallon.

True

When comparing cars with the same weight (wt) and same number of cylinders (cyl), then a change in disp will not have a significant linear impact on mpg.

True

When comparing managers with the same number of graduate credits, same number of certificate credits, and who manage the same number of people, then there is a statistically significant relationship between salary and years of experience.

True

True or False: VIF is one way to measure multicollinearity.

True: high VIF means strong multicollinearity.
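
For a model with exactly two predictors, the VIF of either predictor is 1 / (1 - r^2), where r is the correlation between the two predictors. The sketch below applies that formula to the two X1-X2 correlations used in the multicollinearity questions in this set (0.00003 and 0.97). The function name is hypothetical, and the two-predictor shortcut is an assumption that does not carry over to models with more predictors.

```python
def vif_two_predictors(r_between_predictors):
    """VIF when the model has exactly two predictors with correlation r."""
    return 1 / (1 - r_between_predictors ** 2)

vif_uncorrelated = vif_two_predictors(0.00003)  # essentially 1: no inflation
vif_correlated = vif_two_predictors(0.97)       # about 16.9: severe inflation
print(round(vif_uncorrelated, 4), round(vif_correlated, 2))
```

A VIF near 1 matches "little or no multicollinearity", while a value well above common rule-of-thumb cutoffs such as 5 or 10 matches "very high multicollinearity".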

Suppose we would like to create the model Y = b0 + b1(X) We use the appropriate commands in R, and obtain the output in the image below. What can we say about the relationship between X and the average value of Y?

We are over 95% confident that X is correlated with Y (i.e. there is a linear relationship between X and the average value of Y).

When do we want to avoid high multicollinearity in a multiple linear regression model?

We want to avoid multicollinearity when we wish to interpret slopes/coefficients in a model, but it is ok to have high multicollinearity in a model that is only used for prediction.

Suppose we have the model Y = 10 * (1.25)^x Which of the statements below will be true about the general relationship between X and the average value of Y?

Whenever X increases 1 unit, the average value of Y will increase 25%.

Use the correlation coefficient matrix plot below. Which variable has the strongest correlation with Y?

X1 - Look for the largest dot (can be either red or blue) that is either in the Y-variable's row or column.

Suppose we have four independent variables, X1, X2, X3, X4. The correlation coefficient between Y and X1 is 0.82 The correlation coefficient between Y and X2 is 0.002 The correlation coefficient between Y and X3 is -0.07 The correlation coefficient between Y and X4 is -0.99 Which independent variable has the strongest correlation with Y?

X4
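
Strength is judged by the absolute value of the correlation coefficient, so the sign is ignored. A minimal sketch using the four values from the question (the variable names are illustrative):

```python
correlations_with_y = {"X1": 0.82, "X2": 0.002, "X3": -0.07, "X4": -0.99}

# Pick the variable whose correlation is farthest from zero.
strongest = max(correlations_with_y, key=lambda name: abs(correlations_with_y[name]))
print(strongest)  # X4
```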

Suppose you have quantitative variables named Return and Investment, which are saved in an Excel data file called ROI. Suppose you imported the data into R and you would like to create a linear regression model that estimates the return based on the amount of the investment. Return = b0 + b1(Investment) What R command do we use to create this linear model?

lm(Return ~ Investment, data=ROI)
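
The R command above fits an ordinary least squares line. As a rough illustration of what that fit computes, here is a Python sketch of the closed-form estimates for b0 and b1. The data values are made up (they are not from any ROI file) and were chosen to lie exactly on Return = 1 + 2(Investment), so the fit should recover those coefficients.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (b0, b1) for the least squares line y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical data standing in for the Investment and Return columns.
investment = [1, 2, 3, 4]
returns = [3, 5, 7, 9]

b0, b1 = fit_simple_linear_regression(investment, returns)
print(b0, b1)  # 1.0 2.0
```

In the formula `Return ~ Investment`, the variable on the left of `~` is the Y-variable and the one on the right is the X-variable, which corresponds to `ys` and `xs` here.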

Suppose we have the model Y = 10 * (1.25)^x Whenever X increases 10%, the average value of Y increases about

none of the above
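
In an exponential model, a percentage change in X does not map to a single percentage change in Y; the result depends on where X starts. The sketch below illustrates this with two arbitrary starting values and hypothetical helper names.

```python
def exponential_model(x):
    """Y = 10 * (1.25)^X, the model in the question."""
    return 10 * 1.25 ** x

def percent_increase_when_x_grows_10_percent(x):
    return (exponential_model(1.10 * x) / exponential_model(x) - 1) * 100

# Starting at X = 1 the step is 0.1 units; starting at X = 10 it is a full unit.
increase_from_1 = percent_increase_when_x_grows_10_percent(1.0)
increase_from_10 = percent_increase_when_x_grows_10_percent(10.0)
print(round(increase_from_1, 2), round(increase_from_10, 2))
```

The two results differ (roughly 2.3% versus 25%), so no single percentage answers the question.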

Suppose data were collected about the demand of a certain product, and the analysis of this data resulted in the linear regression model below. This equation demonstrates the relationship between a product's demand, in thousands of units, and the number of A-list celebrities that tweet positive comments about the product on Twitter. The p-value for the estimate of the intercept is 0.0001 and the p-value for the estimate of the slope is 0.02. Demand, in thousands of units = 400 + 3*(number of celebrity tweets) Complete this sentence: For every additional 10 A-list celebrities who tweet about the product, _____________.

we predict the demand will increase about 30 thousand units.

In the equation Y=b0 + b1(X) What is the "independent variable"?

X

