MKTG 434 - Adv MKTG Analytics EXAM 2

Réussis tes devoirs et examens dès maintenant avec Quizwiz!

marketing mix modeling is used to

-A marketing mix model can help you identify the most effective future marketing strategies based on the analysis of history sales and marketing costs information -evaluate different components of marketing plans such as advertising, promotion, packaging, media, sales force numbers, etc.

stepwise regression advantages:

-A researcher can manage large numbers of potential independent variables to choose the best-fitting regression models -It provides the order in which variables are removed or added.

you will run a logistic regression if you have these research questions:

-If you want to predict whether a person will subscribe to a magazine or not -If you want to predict whether a person will respond to a direct mail campaign or not -If you want to predict whether a person will purchase a new product or not -If you want to predict whether a cell phone customer will "churn" by the end of year and switch to another service or not

when to use time series technique:

-Sales quantities/revenues -Airline passenger volume -Traffic volumes -Economic metrics such as interest rates, unemployment rates

adjusted r-squared

-a modified version of R2 that has been adjusted for the number of independent variables in the regression model -increases only if a new independent variable improves the model more than would be expected by chance -decreases if the new independent variable improves the model less than would be expected by chance

actual sales vs. predicted sales from regression model

-actual value is the value that is obtained by observation or by measuring the available data. -predicted value is the value of the variable predicted based on the regression analysis.

Review 1. Look at the coefficients table and the line graph comparing actual Levi's sales and predicted Levi's sales and discuss how to improve the overlap between actual sales and predicted sales.

-create a dummy variable to capture seasonality -lagged variable to capture a carry-over effect of marketing -if you have additional list of variables and want to check the best regression model in a systematic way, then do stepwise regression

R-squared

-indicates what proportion of the variances of your dependent variable is explained by your independent variable(s) -shows the overall strength of your regression model. -The range of this R2 is between 0 and 1 -The closer R2 is to 1, the higher the proportion of the variable your dependent variable explains, so we might say the better the regression model you set up.

Review 4 (1a). What is the difference between R-squared and Adjusted R-squared?

-r-squared increases but never decreased -adjusted r-squared increases only if new IV improves the model and only decreases if new IV did not improve the model

use adjusted r-squared when

-to compensate for the addition of variables and only increases if the new predictor enhances the model above what would be obtained by probability. Conversely, it will decrease when a predictor improves the model less than what is predicted by chance. -if you are building Linear regression on multiple variable, it is always suggested that you use Adjusted R-squared to judge goodness of model.

stepwise regression disadvantages

1. Multicollinearity is usually a major issue.(VIF) 2. Some variables may be removed from the model even though they are important to include. -these variables can be manually added back in

marketing mix modeling... (4 things)

1. identifies which marketing inputs should be considered 2. identify the best weights for marketing inputs-marketing objectives 3. understand how a change in the marketing budget will affect future sales 4. determine how to optimally spend current marketing budget

interpret unstandardized coefficients of independent variables using Levi's example:

Here we see 312.393 as an unstandardized coefficient (B); this means that as you increase one billboard advertising, the Levi's sales increases on average across the three years by $312.39.

interpretation of r-squared if number is 0.601

If r-squared is 0.601 then the regression model explains 60.1% of the variances in the dependent variable.

standardized coefficient (beta)

If you have several marketing variables with different scales of units, you want to ignore the differences in scales of units and compare across different variables

Review 4 (3c). At the 95% confidence level, based on the p-values for the three independent variables, which independent variable(s) do you think significantly decrease(s) a customer's willingness to patronize an upscale restaurant?

Prefer simple décor (negative value)

when to use stepwise regression:

Stepwise regression is an appropriate analysis when you have many variables and you're interested in identifying a useful subset of the predictors. -Stepwise regression is used to see how the variance explained, r-squared, changes by adding (or removing) each predictor to the model one at a time

Review 2a. Explain the concept of the carry-over effect of marketing and which variable can be used to explain it?

The carryover effect of advertising states that time lag between the consumers being exposed to the advertisement and their response to the same advertisement -If the coefficient of the lagged sales is higher, then we can expect a longer carry-over effect of marketing - which means marketing efforts are persistent. If the coefficient of the lagged sales is smaller, then we can expect a shorter carry-over effect of marketing.

moving average forecasting

The idea of a simple moving average is that the pattern of random components will continue to happen in the future. -Assumption: future observations will be similar to the recent past.

correct mathematical model for the predicted sales of Levi's:

The predicted weekly sales of Levi's = 751.862 + 0.268(the previous-week sales of Levi's) + 9.432(news ad inches for Levi's) + 124.897(number of billboards up during that week) +79.455(news ad inches for cheaper hats) + 0.726(seconds of radio advertising) + 2813.587(the first week of school) + 5918.145(the weeks of Christmas season)

multiple regression analysis example (identify the independent and the dependent variable): Among the store interior, taste of food, variety of options and price, which variables will influence a customer's willingness to visit the restaurant?

independent: store interior, taste of food, variety of options and price dependent: a customer's willingness to visit the restaurant

Unstandardized coefficients (B)

indicate how much the value of your dependent variable increases as your independent variable increases by one unit.

interpret the results from Cox & Snell R. Square

is 0.489, which indicates that the logistic model explains 48.9% probability of customer retention.

Using age, driving license years, and annual kilometers, segment the customers into three groups. Who are they?

k-means cluster

the logistic model solves these problems:

log[p/(1-p)] = a + B1X1 + B2X2 + e -p is the probability that event Y occurs, p(Y=1) -p/(1-p) is the "odds"

Analysis: Among customer type, fuel, and driving license years, which factor influences annual kilometers the most?

multiple regression

Analysis: What model can be used to predict annual kilometers and to what extent?

multiple regression

Review 4 (2a). What is a null hypothesis for this research question.

no linear relationship between IV and DV

simple regression

one IV and one continuous DV

ANOVA

one categorical variable with three or more groups, one continuous variable

independent samples t-test

one categorical variable with two groups, one continuous variable

logistic regression

one or more than one IV, one categorical DV -Binary logistic regression if DV is a binary response (e.g., choice or no choice) -Multinomial logistic regression if DV is a range of finite options (e.g., Sprint, AT&T and T-Mobile)

k-means cluster

only continuous variables

time-series analysis: seasonality component

pattern of regular fluctuations in the data over time

prediction vs. forecast

prediction: used to uncover and understand relationships between variables (IV, DV) -example: which marketing promotion will most increase the sales of Levi's jeans? forecast: an estimation of the value of one variable in the future. -example: revenue forecasting for next year

Review 4 (3a). Based on the coefficients for preference for waterfront view, preference for unusual desserts, and preference for simple decor, which variable has the strongest effect on a customer's willingness to patronize an upscale restaurant?

prefer simple decorations (check with standardized coefficients)

triple exponential smoothing technique

provides a means for decomposing data that have both trend and seasonality. The Holt-Winters method is commonly used for triple exponential smoothing.

Be able to interpret R-squared, p-value with specific examples and cases

r-squared shows how well the regression model fits the observed data -ex: r-squared is 60% then 60% of the data fir the regression model If p-value ≤ 0.05 then it is statistically significant and rejects the null hypothesis. If p-value > 0.05 then the results are not statistically significant and supports the null hypothesis.

regression-based forecasting is

studying the relationships between data points, which can help you -Predict sales in the near and long term -Understand inventory levels -Understand supply and demand.

Review 4 (2c). What is the p-value from the ANOVA table, and how do you interpret the p-value?

table says p-values is > 0.05 so we're 95% confident there is a linear relationship between at least one IV and DV in model

Key difference between a logistic regression model and a linear regression model is

the Dependent Variable Simple & Multiple Linear regression: continuous DV (e.g., sales) Logistic regression: categorical DV -binary logistic regression if DV is binary response -Multinomial logistic regression if DV is a range of finite options (e.g., Sprint, AT&T and T-Mobile)

definition of cross-elasticity

the change of sales/the change of other department's advertising.

find forecasted dependent if the regression model is given:

the dependent is the Y variable. The regression equation is Y = bX + a where Y is the dependent variable.

basic assumption in multiple regression: independence assumption

the independent variables must be statistically independent and uncorrelated with one another

Multicollinearity

the presence of strong correlations among independent variables

significance (p-value)

the probability that the null hypothesis (the population coefficient of the variable is not significantly different from zero) is rejected although it is true

multiple regression analysis uses...

the same concepts as bivariate analysis but uses more than one independent variable

Review 4 (2b). What is the alternative hypothesis for this research question.

there is a linear relationship between IV and DV

Chi-square test

two categorical variables

correlation

two continuous variables

multiple regression

two or more than two IVs, one continuous DV

Analysis: Among customer type, fuel, driving license years, car category, which factor influences whether customers claim an insurance or not?

two-step cluster

time-series analysis: random

unpredictable residuals

Review 4 (1b). Between R-squared and Adjusted R-squared, which one should be used to interpret the model summary? Select the more appropriate value and interpret it.

use adjusted r-squared for this model. This model explains 60.1% variation of DV (likelihood visiting the restaurant)

single exponential smoothing technique

uses a data smoothing parameter referred to as α (alpha). This parameter α represents smoothing of the time series. -Effective for data with purely random component -No trend or seasonality

how to develop a mathematical equation and how to interpret the equation using a coefficient table from Levi's:

weekly sales of Levi's = constant + lagged_sallevis + newspaper ad inches for Levis + number of billboards + news ad inches for cheaper hats + seconds of radio + first week of school + weeks of Christmas season

multiple regression equation: y= a + b1x1 + b2x2 + b3x3 +... +bxxm

y = the dependent, or predicted variable xi= independent variable a = the intercept bi= the slope of independent variable m = the number of independent variables in the equation

example of interpretation of unstandardized coefficients:

you can explain how much sales will be generated by each promotion plan and how much sales are generated due to seasonality (Christmas and school seasons) or due to the carry-over effect of advertising.

Analysis: Is there a significant difference between males and females in the years of holding a driving license?

independent sample t-test

Analysis: Is there a significant relationship (=association) between driving license years and annual kilometers?

Pearson correlation analysis

How to set up the best model:

1. select a subset of marketing variables -three criteria: 2. check seasonality and create dummy variable: -how to detect seasonality: what weeks are included such as the start of school season or Christmas -how to create dummy variables: (values 0 or 1) indicates the absence or presence of some categorical effect. If the dummy variable is 0, the variable will not affect the dependent variable, and vice versa if the dummy variable is 1. [Transform] to [Compute Variable] and Type [XMAS] in the Target Variable and type "0" in Numeric Expression to Click [OK] -interpret the unstandardized coefficient of dummy variables: 3. consider the "carry-over effect of advertising" and create a lagged variable -carry-over effect of advertising: "the portion of advertising that retains its effect and affects consumers even beyond the period of its exposure". -how to create a lagged variable: Go to [Analyze] [Regression] to [Linear] and Include "lag_sallevis" into your existing regression model.

multiple regression analysis tells us... (3 things)

1. which factors predict the dependent variable 2. which way each factor influences the dependent variable 3. how much each factor influences the dependent variable

bivariate regression

=simple regression -this analysis uses only one independent variable

stepwise regression definition:

A method of fitting regression models by using an automatic procedure to choose the optimal sets of independent variables.

Is there a significant difference among different categories of car owners in the annual driving kilometers?

ANOVA

how to identify multicollinearity

One commonly used method is the variance inflation factor (VIF). The VIF is a single number, and a rule of thumb is that as long as the VIF is less than 10, multicollinearity is not a concern.

explain the significance levels for unstandardized and standardized coefficients:

Unstandardized coefficient represents the amount of change in a dependent variable Y due to a change of 1 unit of independent variable X. -are produced by the linear regression model after its training using the independent variables which are measured in their original scales i.e, in the same units in which we are taken the dataset from the source to train the model Standardized coefficient compares the strength of the effect of each individual independent variable to the dependent variable. The higher the absolute value of the beta coefficient, the stronger the effect -If your regression model contains independent variables that are statistically significant, a reasonably high R-squared value makes sense. The statistical significance indicates that changes in the independent variables correlate with shifts in the dependent variable.

basic assumption in multiple regression: variance inflation factor (VIF)

VIF can be used to assess and eliminate multicollinearity -VIF is a statistical value that identifies what independent variable(s) contribute to multicollinearity and should be removed -Any variable with VIF of greater than 10 should be removed

Mean absolute deviation (MAD) / Forecast Accuracy

Where Ar= actual value of the time series at time Fr=forecast value for time n= number of forecast values Lower mad means better forecasting model

r-squared definition

a measure of goodness of fit of the regression model

marketing mix modeling definition

a statistical analysis to: 1. estimate the impact of various marketing mix strategies on sales 2. forecast the impact of marketing mix strategies on future sales

Time-series analysis

a technique that analysts use to (a) uncover any implicit structure (patterns or trends) in the data and (b) model that structure to make forecasts. -The assumption is that the future, at least in the short term, will continue the structure of the past.

double exponential smoothing technique

adds a second parameter β (beta) to α. The β parameter is called the trend smoothing factor. -For data that exhibit trends but not seasonality

Review 4 (3b). Is there any multicollinearity problem in this regression model? Explain this.

as all VIF > 10, there is no multicollinearity issue in this model

how you got the predicted sales

ask SPSS to save the unstandardized predicted dependent variables -SPSS calculates predicted sales: go to [Analyze] to [Regression] to [Linear] and Select the same set of the independent variables in the seventh regression as in the stepwise regression procedure. Click [Save] and Check "Unstandardized" under Predicted values to Click [Continue] and Click [OK]

Analysis: Using age, car category, and annual kilometers and claims, segment the customers. How many clusters did you find? Who are they?

binary logistic regression

interpret the odds ratio

can be interpreted as the effect of one unit of change in X in the predicted odds ratio with the other independent variables in the model held constant.

two-step cluster

categorical and continuous variables/without pre-determined number of clusters

Analysis: Is there a significant relationship (=association) between gender and making an insurance claim?

chi-square test

Review 3b. explain the concept of Constant. Interpret the unstandardized coefficients of Constant.

constant = average $ earned given regression and are not explained by IV

time-series analysis: trend component

direction of the data changing over time

how to interpret the cross-elasticity of advertising in the regression model:

for the Levi's equation, the cross-elasticity is 79.743 x news ad inches for cheaper hats

Interpret the overall percentage of cases

from the classification table, the overall percentage of cases that are correctly predicted by the model. In this model, 94.1% cases are correctly predicted.(18939+4180)/(4180+857+586+18939) =0.941 94.1%


Ensembles d'études connexes

chapter 4, chapter 3, chapter 5, clin psych ch.1, 2 bnks, chapter 7, Chapter 8

View Set

Practice Test 2: PMBOK 7 and Process Group Practice Questions (251 to 500)

View Set