Suppose we would like to estimate the monthly revenue for a specific small business, where the variable revenue is measured in dollars. We will use the independent variables: X (quantitative) Quarter (Q1, Q2, Q3, or Q4) Suppose we use the command in R: lm(Revenue ~ X + Quarter, data=BronlynsData) The coefficients table from the model is given below: Doc 1.20 For a month in Q4 where the variable X equals $4000, the predicted revenue equals $_______ . - $62,000 - We do not have enough information to answer this question. - $30,000 - $104,000

- $62,000

Suppose we have a variable Group, whose values are A and B. We also have a quantitative variable Y. We use the R command lm(Y ~ Group) and obtain the table below: Doc 1.27 What is the predicted value of Y in Group B? - We do not have enough information to answer this question with the model provided. - 250 - 50 - 200 - 50

- 200

Suppose we have a variable Group, whose values are A and B. We also have a quantitative variable Y. We use the R command lm(Y ~ Group) and obtain the table below: Doc 1.27 What is the predicted value of Y in Group A? - 200 - 250 - 150 - 50 - We do not have enough information to answer this question with the model provided.

- 250

Real estate data in a large city was analyzed using recent home sales of houses between 1000 and 3000 square feet. A model predicting the average selling price based on square footage is given below. Predicted average price (in thousands of dollars) = -25 + 0.13*(square footage) Complete the sentence below. For every additional 500 square feet, the predicted average home price increases ___________. - 38 thousand dollars - 40 thousand dollars - 65 thousand dollars - 78 thousand dollars - 90 thousand dollars - None of the above.

- 65 thousand dollars
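
A quick check of this arithmetic, as a small Python sketch. The function and variable names are made up for illustration; only the intercept and slope come from the model in the question.

```python
# Coefficients from the model in the question (price in thousands of dollars).
INTERCEPT = -25
SLOPE = 0.13  # thousands of dollars per additional square foot

def predicted_price(sqft):
    """Predicted average price, in thousands of dollars."""
    return INTERCEPT + SLOPE * sqft

# The change in the prediction depends only on the slope, not on the intercept
# or on the starting square footage (1500 here is an arbitrary choice).
increase = predicted_price(2000) - predicted_price(1500)
print(round(increase, 2))  # about 65 (thousand dollars)
```

Any two square footages that are 500 apart give the same difference, which is why the intercept never enters the answer.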

Suppose we have the following model that predicts the cost of a project based on the number of custom features. Predicted cost (thousand dollars) = 3 + 1.1(custom features) What is the predicted cost of a project with 4 custom features? - 12 thousand - 4.4 thousand - 7.4 thousand - 16.4 thousand

- 7.4 thousand

Suppose the R-squared for the linear regression model below equals 0.73 Y = 3 + 1.1(X) What can we conclude? - 73% of the variability of Y can be explained by X. - Since the R-squared is not less than 0.05, we conclude that there is not a statistically significant correlation between X and Y. - There is a 73% probability of obtaining the model results assuming that the null hypothesis is true. - We are 73% confident that X is correlated with Y.

- 73% of the variability of Y can be explained by X.

In the scatter plot below, we see a straight-line trend. Doc 1.1 The plot above indicates that we can quantify a general relationship between X and Y. What type of relationship can we quantify? - A percentage change in X corresponds with a percentage change in Y. - An absolute change in X corresponds with a percentage change in Y. - An absolute change in X corresponds with an absolute change in Y. - A percentage change in X corresponds with an absolute change in Y.

- An absolute change in X corresponds with a percentage change in Y.

Suppose Dr. Bronlyn is doing an analysis of customer satisfaction scores for a large organization. She has data indicating customer spending, customer age group, and customer satisfaction scores. Suppose Dr. Bronlyn uses the data to conclude "there is not an interaction between spending and age group with respect to customer satisfaction scores." What does this conclusion mean? - Changes in spending will impact the predicted customer satisfaction scores about the same for all age groups. - There are differences in average spending when comparing the age groups. - Changes in spending will impact the predicted customer satisfaction scores differently for the different age groups. - Customer satisfaction scores are different in different age groups.

- Changes in spending will impact the predicted customer satisfaction scores about the same for all age groups.

Suppose we have the linear regression model below, where discount rate is measured as a percentage, and sales and budget are both measured in thousand dollars. Predicted Sales = 45 + 1.2(Discount Rate) + 1.4(Advertising Budget) True or False: From the equation above, we have enough information to conclude that whenever the advertising budget increases 1 thousand dollars, the predicted sales are expected to increase 1.4 thousand dollars. - True - False

- False

Suppose we are interested in predicting the average value of Y based on independent quantitative variables X1, X2, X3, and X4. We obtain the model output below. Doc 1.6 True or False: Of the variables X1, X2, X3, and X4, the variable X1 is the most strongly correlated with the Y-variable. - True - False or not enough information

- False or not enough information

Suppose we create a model to predict the average revenue (millions). There are three segments: A, B, and C, and the variable Advertising budget is given in million dollars. Predicted average Revenue (millions) = 50 + 1.2*(Advertising Budget, millions) + 3.75(Segment B) - 2.8(Segment C) True or False: When we control for budget, the predicted revenue in Segment B is 3.75 million dollars higher than in any other segment. - True - False or not enough information

- False or not enough information
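
Why this is false can be seen by plugging the dummy variables into the equation. The Python sketch below assumes Segment A is the baseline (it has no dummy in the equation); the function name and the budget value of 10 are hypothetical.

```python
def predicted_revenue(budget, segment):
    """Predicted average revenue (millions); Segment A is the baseline."""
    seg_b = 1 if segment == "B" else 0
    seg_c = 1 if segment == "C" else 0
    return 50 + 1.2 * budget + 3.75 * seg_b - 2.8 * seg_c

budget = 10  # hypothetical budget (millions), held fixed across segments
b_minus_a = predicted_revenue(budget, "B") - predicted_revenue(budget, "A")
b_minus_c = predicted_revenue(budget, "B") - predicted_revenue(budget, "C")
print(round(b_minus_a, 2), round(b_minus_c, 2))  # 3.75 6.55
```

The 3.75 coefficient compares Segment B with Segment A only; against Segment C the gap is 3.75 + 2.8 = 6.55 million.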

Suppose we have a multiple linear regression model predicting Revenue (thousand dollars) using the following independent variables: RandD: Research and Development Budget (thousand dollars) Quarter (Q1, Q2, Q3, or Q4) Weather (nice, cloudy, or rainy). We use the R command lm(Revenue ~ RandD + Quarter + Weather, data=BronlynsData) and obtain the following coefficients table in the model output: Doc 1.19 Is the following statement true or false: After controlling for weather and RandD budget, the average sales in Q3 is about the same as all other Quarters. - True (we can conclude this from the model output) - False or not enough information

- False or not enough information

Suppose we have the model Y = log(2) + 1.2*log(X) What is the general format for the interpretation of the relationship between X and Y? - For every increase in X by ____ units, the average value of Y will increase ____ %. - For every ___ % increase in X, the average value of Y will increase ____ units. - For every increase in X by ____ units, the average value of Y will increase ____ units. - For every ___ % increase in X, the average value of Y will increase ____ %.

- For every ___ % increase in X, the average value of Y will increase ____ units.
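
A small Python sketch of why a model that is linear in log(X) pairs a percentage change in X with a change in units of Y. It assumes log means the natural log; the function name and the starting values 5 and 500 are made up for illustration.

```python
import math

def y_hat(x):
    """Model from the question: Y = log(2) + 1.2*log(X), natural logs assumed."""
    return math.log(2) + 1.2 * math.log(x)

# A 10% increase in X adds the same number of units to Y at any starting X.
change_low = y_hat(1.1 * 5) - y_hat(5)
change_high = y_hat(1.1 * 500) - y_hat(500)
print(round(change_low, 4), round(change_high, 4))  # both about 0.1144
```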

Below are two scatter plots. The first shows a scatter plot between X and Y where the dots appear in a straight-line pattern, and the second shows a scatter plot between log(X) and Y where the dots appear to follow a curved pattern. Doc 1.3 What information do we get about the data from these two plots? - For these variables, a linear model is more accurate than a power model. - For these variables, a power model is more accurate than a linear model. - For these variables, a logarithmic model is more accurate than a linear model. - For these variables, a linear model is more accurate than a logarithmic model.

- For these variables, a linear model is more accurate than a logarithmic model.

Suppose Dr. Bronlyn uses Tableau to create a linear regression model for the variables X and Y. Tableau provides the following output: Trend line: Y = 7 + 1.5(X) p-value: 0.9223 R-squared: 0.0003 What can we say about the correlation between X and Y? - It is a very weak correlation. - It is a very strong correlation.

- It is a very weak correlation.

Suppose Dr. Bronlyn uses Tableau to create a linear regression model for the variables X and Y. Tableau provides the following output: Trend line: Y = 7 + 1.5(X) p-value: 0.9229 R-squared: 0.0003 What can we say about the correlation between X and Y? - It is a very strong correlation. - It is a very weak correlation.

- It is a very weak correlation.

What is an R-squared used for? - It measures if a linear regression model is the best model; if the R-squared is larger than 0.95, then we are 95% confident that a linear regression model is the best type of model for the data. - It measures if there is a statistically significant correlation between the X-variable and the Y-variable. - It measures if there is multicollinearity in the model. - It measures how closely the data follow a linear trend (i.e. the strength of a correlation).

- It measures how closely the data follow a linear trend (i.e. the strength of a correlation).

Suppose an organization wishes to know if job satisfaction is related to salary. They use job satisfaction survey results and salary data to create the model below Predicted Satisfaction = b0 + b1(Salary) Which of the following is true about the linear regression model above? - Satisfaction is the dependent variable and Salary is the independent variable. - Satisfaction is the independent variable and Salary is the dependent variable. - b0 is the dependent variable and b1 is the independent variable. - b0 is the independent variable and b1 is the dependent variable.

- Satisfaction is the dependent variable and Salary is the independent variable.

What is the main idea of module 3? - Sometimes a percentage change is a more appropriate way to explain a relationship between variables than an absolute change, and it's possible to create models that give interpretations using a percentage change for one or both variables. - The best models are often the most complicated models. - All possible regression models are created using either the original variable or the log of the variable, and now that we know Module 3 we know every type of model that can possibly be made.

- Sometimes a percentage change is a more appropriate way to explain a relationship between variables than an absolute change, and it's possible to create models that give interpretations using a percentage change for one or both variables.

Suppose we are interested in three variables: quantitative variables X and Y, as well as a Group variable (whose values are A or B). Suppose we create a scatter plot where we plot the X variable on the x-axis, the Y variable on the y-axis, and the dots are color-coded by Group (A or B). Trend lines for each of the two groups are added to the plot. Suppose the two trend lines are parallel but not equal (nor approximately equal); it appears that the trend line for Group B is shifted up a large amount from Group A's trend line. Doc 1.13 Which of the following does the plot suggest? - The plot suggests that there is an interaction between X and Group with respect to Y. - The plot suggests that there is not an interaction between X and Group with respect to Y.

- The plot suggests that there is not an interaction between X and Group with respect to Y.

Suppose we create a regression model using the variables: Y, X, Segment (A or B). lm(Y ~ X*Segment ) Information about the coefficients of the model is given below. Doc 1.14 In the table above, we see the estimate of the coefficient for X:SegmentB equals 4. What does this mean? - When we control for X, the average value of Y is 4 higher for Segment B than Segment A. - The slope of the trend line for Segment B equals 4. - The slope of the trend line for Segment B is 4 more than the slope of the trend line for Segment A.

- The slope of the trend line for Segment B is 4 more than the slope of the trend line for Segment A.

Suppose we would like to create a linear regression model predicting Demand based on Supply: Predicted Demand = b0 + b1(Supply) What are the variables in this model? - The variables are Supply and Demand. - The variables are b0 and b1 - The variables are Supply, Demand, b0 and b1 - The variables are the R-squared and the p-values that we calculate using R.

- The variables are Supply and Demand.

Suppose Dr. Bronlyn creates a linear regression model in Tableau. When she places her mouse over the linear trend line, Tableau provides the following output: Y = 10 + 20*(X) p-value: 0.0001 R-squared: 0.1113 What can we conclude? - There is a statistically significant correlation between X and Y, and that correlation is very strong. - There is a statistically significant correlation between X and Y, but that correlation is not very strong. - There is not a statistically significant correlation between X and Y.

- There is a statistically significant correlation between X and Y, but that correlation is not very strong.
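
The two numbers answer different questions, which a short Python sketch makes explicit. It assumes the usual 0.05 cutoff and uses the fact that, for a model with a single X, the absolute correlation is the square root of R-squared; the variable names are made up.

```python
import math

p_value = 0.0001
r_squared = 0.1113

significant = p_value < 0.05   # p-value: is there a correlation at all?
abs_r = math.sqrt(r_squared)   # R-squared: how strong is it? (one-predictor model)
print(significant, round(abs_r, 3))  # True 0.334
```

An absolute correlation of about 0.33 is far from 1, so the relationship is detectable but weak.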

Suppose that an organization is trying to understand the relationship between supply and demand. The correlation coefficient for the variables Supply and Demand equals -0.98 What does this mean? - There is a strong correlation between supply and demand, and as supply increases, demand decreases. - This is an impossible situation, as correlation coefficients are not allowed to be negative. - There is a strong correlation between supply and demand, and as supply increases, demand also increases. - There is either no correlation (or an incredibly weak correlation) between supply and demand. - We're 98% confident that supply is correlated with demand.

- There is a strong correlation between supply and demand, and as supply increases, demand decreases.

Suppose Dr. Bronlyn creates a linear regression model in Tableau. When she places her mouse over the linear trend line, Tableau provides the following output: Y = 10 + 20*(X) p-value: 0.0001 R-squared: 0.1117 What can we conclude? - There is a statistically significant correlation between X and Y, and that correlation is very strong. - There is not a statistically significant correlation between X and Y. - There is a statistically significant correlation between X and Y, but that correlation is not very strong.

- There is a statistically significant correlation between X and Y, but that correlation is not very strong.

Suppose the correlation coefficient of X and Y is 0.0001 What can we conclude? - There is not a strong linear relationship between X and Y. - Whenever X increases 1 unit, the average value of Y increases 0.0001 units. - There is not any type of relationship between X and Y. - There is a statistically significant correlation between X and Y.

- There is not a strong linear relationship between X and Y.

True or False: We use dummy variables whenever we want to include a categorical variable as an independent variable in a model. - True - False

- True

For the linear regression model Y = b0 + b1(X): The p-value for the intercept is large: about 0.98 The p-value for the slope is very small: less than 2 times 10^(-16) What can we conclude? - Since the p-value for the slope is very small, we can conclude that there is a very strong correlation between X and Y. - Since the p-value for the intercept is large, we can conclude that there is a very strong correlation between X and Y. - Since the p-value for the slope is very small, we can conclude that there is a very weak correlation between X and Y. - We are not able to assess the strength of the correlation between X and Y with the output provided. - Since the p-value for the intercept is large, we can conclude that there is not a strong correlation between X and Y.

- We are not able to assess the strength of the correlation between X and Y with the output provided.

Suppose we create a model to predict the average revenue (millions). There are three segments: A, B, and C, and the variable Advertising budget is given in million dollars. Predicted average Revenue (millions) = 50 + 1.2*(Advertising Budget, millions) + 3.75(Segment B) - 2.8(Segment C) What is the predicted average revenue in Segment B? - 3.75 million dollars - 53.75 million dollars - 50 million dollars - We are unable to answer this question with the model provided.

- We are unable to answer this question with the model provided.

Suppose we have the following multiple linear regression equation that predicts the weekly revenue of a retail store based on its monthly advertising budget. The budget and revenue variables both use the units million dollars, and the store number is a 4-digit number that identifies the store. There are 10 stores in the data set. The R command below is used: lm( revenue ~ budget + store, data=BronlynData ) and we get the coefficients table below Doc 1.26 What is the predicted revenue for store #3124, whose advertising budget is $2 million? - 2.6 million - We are unable to answer this question with the model provided. - 198.8 million - 42.6 million

- We are unable to answer this question with the model provided.

Suppose we have the linear regression model below, where both supply and demand are given in thousand units. Predicted demand = 200 - 1.5*(supply) What can we conclude? - Whenever supply increases 1.5 thousand units, the predicted demand decreases 200 thousand units. - Whenever the supply increases 2 thousand units, the predicted demand decreases 3 thousand units. - Whenever supply increases 200 thousand units, the predicted demand decreases 1.5 thousand units. - Whenever the supply increases 2 thousand units, the predicted demand decreases 197 thousand units.

- Whenever the supply increases 2 thousand units, the predicted demand decreases 3 thousand units.

Suppose we have a large data set where: the correlation coefficient between X1 and Y equals -0.34; the correlation coefficient between X2 and Y equals -0.97; the correlation coefficient between X3 and Y equals 0.24; the correlation coefficient between X4 and Y equals 0.81. Which of the variables X1, X2, X3, and X4 is the most strongly correlated with Y? - X1 - X2 - X3 - X4 - We are not given enough information to determine which of these four variables has the strongest correlation with Y.

- X2
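
Strength is judged by the absolute value of the correlation coefficient, so the sign is ignored. A minimal Python sketch of that comparison (the dictionary name is made up):

```python
# Correlation of each X-variable with Y, from the question.
correlations = {"X1": -0.34, "X2": -0.97, "X3": 0.24, "X4": 0.81}

# Pick the variable whose correlation is farthest from zero.
strongest = max(correlations, key=lambda name: abs(correlations[name]))
print(strongest)  # X2
```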

Doc 1.8 Use the correlogram (correlation matrix plot) above. Which variable is the most strongly correlated with Y? - X1 - X2 - X3 - It is impossible to determine this without additional information.

- X3

Suppose we wish to calculate a linear regression model Y = b0 + b1(X) We load the data into R, create a linear regression model, and obtain the coefficient table below Doc 1.7 Is there a statistically significant correlation between X and Y? - Yes; there is a statistically significant correlation between X and Y. - No; there is not a statistically significant correlation between X and Y.

- Yes; there is a statistically significant correlation between X and Y.

Suppose we have the following variables for a data set about a food truck that sells cupcakes. CupcakeRevenue is measured in dollars. Temperature is measured in Fahrenheit. The weather variable has four values: nice, cloudy, rain (no thunder), or thunderstorms. We use the following model lm(CupcakeRevenue ~ Temperature + Weather, data=CupcakeFoodTruckData) The coefficients table is below Doc 1.23 Complete the following sentence: When comparing days with the same temperature, on average, the revenue on days when there is rain (no thunder) is - about the same as days that are nice or cloudy. - about the same as revenue on any other type of weather. - about 4. - about the same as days when there are thunderstorms.

- about the same as days when there are thunderstorms.

Suppose we have data about a very large retail chain. The data contains information from 11100 customers. Some of the variables in the model are: UnitPrice: the average unit price of items purchased per customer. Sales: the total sales per customer (in dollars) Segment: A, B, or C Location: North, South, East, or West We create a model lm( log(Sales) ~ log(UnitPrice) + Segment ) and obtain the following results Doc 1.18 Select the best phrase that completes the following sentence: When we control for unit price, the predicted sales in Segment A ________ . - are about the same as all other segments. - does not have an interaction with unit price. - are not statistically significant. - are about the same as in Segment C.

- are about the same as in Segment C.

Suppose we create a model predicting a project's Profit based on Research and Development spending, Administration costs, and Marketing Spending. The units for all variables are dollars. The coefficients table for the model that predicts the Profit is given below. The estimates are rounded (and based on data), so interpretations will include the word "about." Doc 1.5 When comparing projects with the same administrative costs and the same marketing spending, then whenever the research and development spending increases $1000, the predicted profit - is about $51010 - does not change - increases about $51010 - increases about $0.80 - increases about $8000 - increases about $800

- increases about $800

Suppose you have quantitative variables named Supply and Demand. We would like to create a linear regression model predicting Demand based on Supply. Doc 1.2 Suppose the data are stored in a data set named "PretendData". What R command do we use to create this linear model? - lm(Demand ~ Supply, data=PretendData) - lm(Supply ~ Demand, data=PretendData) - lm(Demand + Supply, data=PretendData) - lm(Supply + Demand, data=PretendData)

- lm(Demand ~ Supply, data=PretendData)

Suppose we have the following variables for a data set about a food truck that sells cupcakes. CupcakeRevenue is measured in dollars. Temperature is measured in Fahrenheit. The weather variable has four values: nice, cloudy, rain (no thunder), or thunderstorms. We use the following model lm(CupcakeRevenue ~ Temperature + Weather, data=CupcakeFoodTruckData) The coefficients table is below Doc 1.22 Complete the following sentence: When comparing days with the same temperature, on average, the revenue on cloudy weather days is $100 higher than the revenue _________ . - on nice or rainy (no thunderstorm) days. - on days with thunderstorms. - on nice weather days. - on any other type of day.

- on days with thunderstorms.

Suppose we have data from a small group of customers, and we want to use the data from this group of customers to help explain relationships and behavior of all potential customers (most of whom we do not have data for). When we are using data from a small sample, and making inferences about a larger population, what type of statistics are we using? - predictive statistics - descriptive statistics

- predictive statistics

Suppose Dr. Bronlyn has a data set named MKT317ExampleData. One of the variables in this data set is named "Revenue." What R command did we learn in class that would tell us the average value of Revenue in the MKT317ExampleData data set? - summary(MKT317ExampleData) - We'd create a linear model named "RevenueModel," and then use the command coef(RevenueModel) - average(Revenue, data=MKT317ExampleData) - mean(Revenue, data=MKT317ExampleData)

- summary(MKT317ExampleData)

An analyst for a very large retail chain wishes to know which variables impact daily sales. There are many potential independent variables in the data set. The analyst would like to answer this very specific question: When we control for population density, does a higher temperature increase sales? To answer this question, we can create a model where the Y-variable is Sales, and the X-variables are: - temperature (and no other variables). - every independent variable in the data set. - temperature, population density, and precipitation (and no other variables). - temperature and population density (and no other variables).

- temperature and population density (and no other variables).

Suppose we create the following model using quantitative variables budget and profit (in dollars) and categorical variable group (A, B, C) lm( profit ~ budget + Group , data=Data ) The coefficients table is below: Doc 1.25 In Group C, when the budget equals $1,200, the predicted profit equals $ ____.

11,000

Suppose we create the following model using quantitative variables budget and profit (in dollars) and categorical variable group (A, B, C). lm( profit ~ budget + Group , data=Data ) The coefficients table is below: Doc 1.24 When we control for budget, the predicted profit is $ ____ higher in Group C than Group A.

1400

Suppose we would like to estimate the monthly revenue for a specific small business. The dependent variable is Monthly Revenue, which is measured in dollars. We will use the independent variables: Temperature (measured in degrees F) Quarter (Q1, Q2, Q3, or Q4) Suppose we use the command in R: lm(Revenue ~ Temperature + Quarter, data=BronlynsData) The coefficients table from the model is given below: Doc 1.21 For a month in Q3 where the temperature is 85 degrees F, the predicted revenue equals $_______ .

15,465

Suppose we would like to estimate the monthly revenue for a specific small business. The dependent variable is Monthly Revenue, which is measured in dollars. We will use the independent variables: Temperature (measured in degrees F) Quarter (Q1, Q2, Q3, or Q4) Suppose we use the command in R: lm(Revenue ~ Temperature*Quarter, data=BronlynsData) The coefficients table from the model is given below: Doc 1.12 For a month in Q3 where the temperature is 90 degrees F, the predicted revenue equals $_______ .

22,890

Suppose we create the following model using quantitative variables budget and profit (in dollars) and categorical variable group (A, B, C) lm( profit ~ budget*Group , data=Data ) The coefficients table is below: Doc 1.11 In Group C, when the budget equals 2,600, the predicted profit equals $ ____.

23,200

Suppose we create the following model using quantitative variables X and profit (in dollars) and categorical variable group (A, B, C) lm( log(profit) ~ X + Group , data=Data ) The coefficients table is below: Doc 1.17 In Group C, when X = 2.0, the predicted profit equals $ ____.

270

Suppose we calculated the model lm(log(Y) ~ X) and obtained the output below: Doc 1.4 We can simplify this equation to Predicted Y = a*b^X, where b = _____ .

45

Suppose we create the following model using quantitative variables X and profit (in dollars) and categorical variable group (A, B, C) lm( log(profit) ~ X + Group , data=Data ) The coefficients table is below: Doc 1.16 When we control for X, the predicted profit in Group C is _____ % higher than in Group A.

49

Suppose we would like to estimate the monthly revenue for a specific small business. The variable Monthly Revenue is measured in dollars. We will use the independent variables: X (quantitative) Quarter (Q1, Q2, Q3, or Q4) Suppose we use the command in R: lm( log(Revenue) ~ X + Quarter, data=BronlynsData) The coefficients table from the model is given below: Doc 1.15 For a month in Q3 where the variable X equals 5, the predicted revenue equals $_______ .

4915

Suppose we have the equation Y = 4 * X^(2.0). When X increases 30%, the predicted value of Y increases ____ %. You may answer to the nearest whole number.

69
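
In a power model Y = a * X^b, multiplying X by a factor multiplies Y by that factor raised to the power b, and the constant a cancels. A Python sketch of the calculation (the function name is made up):

```python
def pct_change_power(exponent, pct_increase_x):
    """Percent change in Y for a power model Y = a * X**exponent."""
    ratio = (1 + pct_increase_x / 100) ** exponent  # the constant a cancels out
    return (ratio - 1) * 100

print(round(pct_change_power(2.0, 30)))  # 69
```

Here 1.30^2 = 1.69, so Y is 69% higher.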

Suppose we are interested in predicting the average value of Y based on independent quantitative variables X1, X2, X3, and X4. We obtain the model output below. Doc 1.9 True or False: Using this model, we would conclude that after controlling for X1, X2, and X4, whenever X3 increases 1 unit, the predicted value of Y increases 3 units. - True - False or not enough information

- True

Suppose you would like to create a model whose interpretation will be "As supply increases ___%, the predicted demand will decrease ___ %" What type of model would directly create this type of interpretation? - logarithmic model - polynomial model - exponential model - linear model - power law model

- power law model

Dr. Bronlyn creates a linear model in Tableau for the variables X and Y. She views details of the linear model, and Tableau provides the following information. Coefficients Table Doc 1.10 R-squared for model: 0.3421 To determine if there is a statistically significant correlation between X and Y, we would look at the number ________ that appears in the output above, and make a conclusion about if X and Y are correlated based on if that number is greater than 0.05 or less than 0.05. Please do not round your answer: answer the exact number.

0.0380

Suppose we have variables X, Y, and Group, where X and Y are quantitative, and Group can be either A, B, or C. We use the R command lm( log(Y) ~ X + Group, data=BronlynsData) and obtain the model below, where ln represents the natural log and GroupB and GroupC are dummy variables. ln(Y) = -2.8 + 1.4*X + 1.1*(GroupB) + 4.7*(GroupC) Assume all p-values are small. When we control for X, the predicted Y in Group C is ____ times higher than in Group A.

110
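
Because the model predicts ln(Y), a dummy coefficient becomes a multiplier on Y once it is exponentiated. A Python sketch using the coefficients from the equation above (variable names are made up):

```python
import math

coef_group_b = 1.1
coef_group_c = 4.7

# With X held fixed, ln(Y_C) - ln(Y_A) = 4.7, so Y_C / Y_A = e^4.7.
ratio_c_vs_a = math.exp(coef_group_c)
ratio_b_vs_a = math.exp(coef_group_b)
print(round(ratio_c_vs_a), round(ratio_b_vs_a, 2))  # 110 3.0
```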

Suppose we have the equation Y = 350 + 25*X. When X increases 50 units, the predicted value of Y increases ____ units. You may answer to the nearest whole number.

1250

Suppose you are creating a multiple linear regression model to predict the average revenue for a proposed new product. Suppose you have a large dataset with 8,790 rows of data and 3 variables. The three variables are described below: Dependent variable: Revenue generated from a product, measured in dollars. Independent variables: RandD: Research and Development budget, measured in dollars. Category (Product Category, categorical variable). Each product has exactly one product category, and there are a total of 17 different product categories in the data. Suppose you create a multiple linear regression model (with no interaction terms) using the methods that we learned in class. We use the R command Model <- lm(Revenue ~ RandD + Category) summary(Model) How many dummy variables will appear in the coefficients table in the model output?

16
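
The count follows from treatment (dummy) coding, where one level serves as the baseline and every other level gets its own 0/1 column. The Python sketch below imitates that rule; the category names are hypothetical, and taking the first level in sorted order as the baseline is an assumption that mirrors R's default for factors.

```python
def dummy_columns(levels):
    """Return the levels that get a dummy column; the first sorted level is the baseline."""
    ordered = sorted(set(levels))
    return ordered[1:]

categories = [f"Category{i:02d}" for i in range(1, 18)]  # 17 hypothetical names
print(len(dummy_columns(categories)))  # 16
```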

Suppose we have a large data set with three variables: Y, X, and Group, where Group can either equal A, B, or C. We create the following regression model, where GroupB and GroupC are dummy variables representing Group B and Group C. Predicted Y = 8 + 8*(X) + 5*(GroupB) + 7*(GroupC) + 3*(GroupB)*(X) + 6*(GroupC)*(X) Compute the predicted Y for Group B when X = 18

211
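
The calculation sets GroupB = 1 and GroupC = 0 and then evaluates every term, including the interaction term. A Python sketch of the equation above (the function name is made up):

```python
def predicted_y(x, group):
    """Model from the question; Group A is the baseline."""
    group_b = 1 if group == "B" else 0
    group_c = 1 if group == "C" else 0
    return (8 + 8 * x + 5 * group_b + 7 * group_c
            + 3 * group_b * x + 6 * group_c * x)

print(predicted_y(18, "B"))  # 211
```

For Group B the intercept becomes 8 + 5 = 13 and the slope becomes 8 + 3 = 11, so the prediction is 13 + 11*18 = 211.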

Suppose we have the equation Y = 4 * 1.21^(X). When X increases 3 units, the predicted value of Y increases ____ %. You may answer to the nearest whole number.

77
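
In an exponential model Y = a * b^X, each extra unit of X multiplies Y by b, so three units multiply it by b^3. A Python sketch of the calculation (the function name is made up):

```python
def pct_change_exponential(base, delta_x):
    """Percent change in Y for an exponential model Y = a * base**X."""
    return (base ** delta_x - 1) * 100  # the constant a cancels out

print(round(pct_change_exponential(1.21, 3)))  # 77
```

Here 1.21^3 is about 1.77, so Y is about 77% higher.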

