BDAN Final


Suppose a sample of five retail stores' monthly profits are: $4,000, $7,000, $5,000, $3,000, and $1,000. What will the sample variance of stores profits be?

$5 million
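
As a quick check, the sample variance can be computed directly from the five profit figures. This is a minimal sketch using Python's standard library; the variable names are illustrative and not part of the original question.

```python
from statistics import mean, variance

# Monthly profits (in dollars) for the five stores in the question.
profits = [4000, 7000, 5000, 3000, 1000]

sample_mean = mean(profits)

# statistics.variance is the sample variance: it divides by N - 1.
sample_var = variance(profits)

print(sample_mean, sample_var)
```

The result is in squared dollars, which is why the answer reads as "5 million."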

Suppose the regression line to describe the relationship between Y and a dichotomous treatment (X = 1 for treated, = 0 untreated) is given by Y = 4 + 3X. Suppose that one of the observations that was treated was observed to have an outcome, Y = 8. For this observation, what is the residual?

1
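
The residual is the observed outcome minus the value on the regression line. A small sketch of that arithmetic, with a hypothetical helper name `predict`:

```python
def predict(x):
    """Regression line from the question: Y = 4 + 3X."""
    return 4 + 3 * x

observed_y = 8      # outcome for the treated observation
treated = 1         # X = 1 for treated

residual = observed_y - predict(treated)
print(residual)
```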

Suppose you wanted to determine if you should reject the null hypothesis that playing fast tempo (as opposed to slow) music in your store decreases sales by $100 on average. On 65 randomly selected days with fast tempo music, average store sales are $2,345 with a sample standard deviation of 45, while on 75 randomly selected days with slow tempo music, average store sales are $2,555 with a sample standard deviation of 65. What would be the proper t-stat for this hypothesis test?

\( t = \dfrac{(2345 - 2555) + 100}{\sqrt{\dfrac{45^2}{65} + \dfrac{65^2}{75}}} \)
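
The t-stat divides the gap between the observed difference in means and the hypothesized difference (here, -100) by the standard error of the difference. A sketch of the calculation, with illustrative variable names:

```python
from math import sqrt

# Sample statistics from the question.
mean_fast, sd_fast, n_fast = 2345, 45, 65
mean_slow, sd_slow, n_slow = 2555, 65, 75

# Null hypothesis: fast tempo lowers average sales by $100.
null_diff = -100

observed_diff = mean_fast - mean_slow
std_err = sqrt(sd_fast**2 / n_fast + sd_slow**2 / n_slow)

t_stat = (observed_diff - null_diff) / std_err
print(round(t_stat, 2))
```

Subtracting a null value of -100 is the same as adding 100 in the numerator, which matches the form of the answer.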

Suppose you want to build a 90% confidence interval for the ATE when the average outcome for the treated (55 subjects) is 12, with a sample standard deviation of 4, while the average outcome for the control (65 subjects) is 10, with a sample standard deviation of 6. Which of the following would be the proper confidence interval?

\( 2 \pm 1.65 \sqrt{\dfrac{4^2}{55} + \dfrac{6^2}{65}} \)
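
The interval can be evaluated numerically as a check. This sketch assumes the rounded critical value of 1.65 used in the answer; the variable names are illustrative.

```python
from math import sqrt

# Treated and control sample statistics from the question.
mean_t, sd_t, n_t = 12, 4, 55
mean_c, sd_c, n_c = 10, 6, 65

z_90 = 1.65  # rounded 90% critical value, as in the answer

ate_hat = mean_t - mean_c
std_err = sqrt(sd_t**2 / n_t + sd_c**2 / n_c)

lower = ate_hat - z_90 * std_err
upper = ate_hat + z_90 * std_err
print(round(lower, 3), round(upper, 3))
```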

Assuming you are testing a null hypothesis (two-tailed) about the population mean and determining whether to reject the null hypothesis based on the t-stat at the 95% confidence level, which t-stat would warrant rejecting the null hypothesis?

3

If one was estimating a simple regression of Earnings (Y) on Height of individuals (X), and got a coefficient on the Height variable of 30, what would the coefficient on Height be if you added 3 inches to every individual in the sample but kept their earnings the same?

30
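
Adding a constant to every X shifts the data sideways without changing its slope; only the intercept moves. The sketch below illustrates this with made-up heights and earnings (hypothetical data, not from the course) and a hand-written `ols` helper.

```python
def ols(x, y):
    """Return (intercept, slope) for a simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical sample: heights in inches, earnings in dollars.
height = [60, 64, 66, 70, 72]
earnings = [1500, 1700, 1650, 1900, 2000]

b_orig, m_orig = ols(height, earnings)
b_shift, m_shift = ols([h + 3 for h in height], earnings)

print(m_orig, m_shift)   # slopes agree
print(b_orig, b_shift)   # intercept drops by 3 * slope
```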

If one is planning to use multiple regression to summarize how the variables X1, X2, X3 explain the variation in Y, how many parameters are involved in estimating the linear regression?

4

For a random variable that can take values from zero to 10, what would be the maximum sample variance that could be observed from a sample of two observations?

50
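
With two observations, the variance is largest when the points sit at opposite ends of the range. A brute-force sketch over integer values 0 through 10 (the integer grid is an assumption made only to keep the search finite):

```python
from itertools import combinations_with_replacement
from statistics import variance

values = range(0, 11)  # assumed integer grid on [0, 10]

# Try every pair of values and keep the one with the largest sample variance.
best_pair = max(combinations_with_replacement(values, 2), key=variance)
max_var = variance(best_pair)

print(best_pair, max_var)
```

For the pair (0, 10) the mean is 5, the squared deviations sum to 50, and dividing by N - 1 = 1 leaves 50.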

Suppose you have a random sample of 2,179 credit scores from a population of mortgage applicants with a sample mean of 620 and sample standard deviation of 70, and would like to calculate the t-stat for the null hypothesis that the population mean is 610. Which of the following is the correct construction of the t-stat?

\( t = \dfrac{620 - 610}{70 / \sqrt{2179}} \)
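
A sketch of the same calculation in code, with illustrative variable names:

```python
from math import sqrt

sample_mean = 620
sample_sd = 70
n = 2179
null_mean = 610

# Standard error of the sample mean, then the t-stat.
std_err = sample_sd / sqrt(n)
t_stat = (sample_mean - null_mean) / std_err
print(round(t_stat, 2))
```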

Suppose you have a random sample of 200 students' GMAT scores that have a sample mean of 700 and sample standard deviation of 50, and would like to calculate the 90 percent confidence interval of the population mean. Which of the following would be the correct construction?

700 ± 1.65 (50/√200)
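
Evaluating the interval numerically, assuming the rounded 90% critical value of 1.65 from the answer (variable names are illustrative):

```python
from math import sqrt

sample_mean = 700
sample_sd = 50
n = 200
z_90 = 1.65  # rounded critical value, as in the answer

margin = z_90 * sample_sd / sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin
print(round(lower, 2), round(upper, 2))
```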

The government runs an experiment where a random sample of 200 adults over 40 get a 5% tax cut and a random sample of 200 adults under 40 get no tax cut. Their results show that spending, on average, increased 8% with a 5% tax cut. They conclude that a 5% tax cut for all adult Americans will increase spending by 8%. Why is this logic flawed?

A 5% tax cut for all Americans suffers from selection bias.

Consider the following proposed determining function for the winning percentage of baseball teams: \( WinPct_i = 0.500 - 0.1\,TeamERA_i + 0.2\,TeamBA_i + U_i \), where Team ERA is the team earned run average, and Team BA is the team batting average. What is the effect (change) on winning percentage from an increase in Team ERA from 2.3 to 3.3?

A decrease of 0.1 in winning percentage

As the size of a random sample gets larger, what does the distribution of the sample mean begin to resemble?

A normal distribution

Randomizing the treatment in an experiment facilitates all of the following conclusions except for what?

ATE = 0

A confidence interval can be constructed for which of the following population parameters?

All of the answers are correct. Population variance, Population mean, Population standard deviation

In a dichotomous regression all of the following conditions must hold except for what?

An equal number of positive and negative residuals.

Which of the following limits the use of experimental data in business settings/applications?

Conducting experiments relevant for business questions is often not feasible.

In evaluating the hypothesis test on experimental data, all the following will change the resulting p-value except for what?

Confidence level

In evaluating the hypothesis test on experimental data, all the following will change the resulting test statistic except for what?

Confidence level

If we wish to use a regression line to determine the effect of multiple treatment levels (e.g., Treatment = 1, 2, 3), why can't we just plot the average outcome for each treatment level and "connect the dots?"

Connecting the dots generally will not form a line.

All of the following are conditions that will hold at the estimated coefficients for the simple linear regression line (of Y on X) except for what?

Covariance between Y and the residuals is zero.

In estimating the causal relationship between Sales and Price in the following determining function \( Sales_i = \beta_0 + \beta_1 Price_i + U_i \), what assumption in addition to \( E[U_i] = 0 \) justifies that the estimated coefficient on Price can be interpreted as an estimate of the causal effect of Price on Sales?

\( E[Price_i U_i] = 0 \)

In trying to measure the "treatment effect" of getting an MBA on career earnings, the observation that individuals with an MBA have higher salaries than individuals without them is likely to suffer from which of the following?

ETT > ATE

Why should we expect to estimate the treatment effect of price on quantity sold to be so difficult in nonexperimental data?

Firms vary their prices strategically in response to expectations of the resulting sales.

Why is Line A a better fit for the data in this graph?

For Line A, the average error (difference between the actual Profits and point on the line) is zero.

Why is Line B a better fit for the data in this graph?

For Line B, the residuals (difference between the actual Profits and point on the line) are uncorrelated with Price.

In principle, why are t-statistics and critical values not as useful in the practical construction of confidence intervals in most applications?

In large samples (i.e., large degrees of freedom) the t-distribution and standard normal distribution are very similar.

Suppose you're running a multiple regression of Home Prices (in thousands of $) on five different treatment variables including number of bedrooms, number of bathrooms, total square feet, total lot size, and garage size, where all the treatment variables have been standardized (i.e., transformed to have a mean of zero and standard deviation equal to 1). If the coefficient on number of bedrooms is estimated to be 3, how would you interpret the coefficient on the number of bedrooms?

Increasing the number of bedrooms by 1 standard deviation, holding number of bathrooms, total square feet, lot and garage size fixed increases the average home price by three thousand dollars.

In the following determining function, \( Earnings_i = \alpha_0 + \alpha_1 Education_i + U_i \), what might be a factor contained in \( U_i \)?

Innate ability, which might be correlated with education and earnings

Suppose you're running a multiple regression of Home Prices on five different treatment variables including number of bedrooms, number of bathrooms, total square feet, total lot size, and garage size, where all the treatment variables have been standardized (i.e., transformed to have a mean of zero and standard deviation equal to 1). Which condition about the multiple regression must hold?

Intercept for the multiple linear regression equals the sample average home price.

Which of the following will yield nonexperimental data?

Investment performance over the last ten years of several portfolio managers

As the size of the random sample gets larger, what happens to the standard deviation of the distribution for the sample mean?

It gets smaller.

In the scientific method why is it crucial to start with a question before conducting the empirical analysis?

It motivates the specific variation in the treatment required to test the hypothesis.

For the same sample, the 95% confidence interval will have what relation to the 99% confidence interval?

It will be smaller.

Suppose you had a random sample of 50 observations with a sample mean of 10 and sample standard deviation of 5. Suppose another observation is observed that has a value of 10. How will the sample mean change from the original sample?

It will not change.
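
Adding an observation equal to the current sample mean leaves the mean where it was. A short sketch of the arithmetic, with illustrative names:

```python
n = 50
old_mean = 10
new_obs = 10

# The old total is n * old_mean; add the new value and divide by n + 1.
new_mean = (n * old_mean + new_obs) / (n + 1)
print(new_mean)
```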

Suppose you estimate the following regression of a Movie's ticket sales and the season of year (Summer =1 if in May, June, July, August, =0 otherwise) that movie's initial release was in: Sales = 295,342 + 40.24 Summer. You are willing to believe you have a random sample of store locations, and a sufficiently large sample size. Which statements are well justified by the regression results?

Movies released in the summer tend to have higher ticket sales.

Suppose one runs the regression of Y on X1 and X2 and both coefficients on X1 and X2 are positive. All of the following correlation conditions must hold in the sample except for which one?

None of the answers is correct

Given the average outcomes for the treated and control groups, and their respective sample standard deviations, how does the number of observations impact the p-value of a hypothesis test?

None of the answers is correct.

Given the average outcomes for the treated and control groups, and their respective sample standard deviations, how does the number of observations impact the spread of a confidence interval?

None of the answers is correct.

If one is attempting to make a prediction on how much sales will increase in the event of a price discount of 10%, which step will not use deductive reasoning in conducting the prediction?

None of the answers is correct.

If you have a sufficiently large sample that was randomly drawn from your target population with a randomly assigned treatment then all of the following conditions hold except for what?

None of the answers is correct.

If one was estimating a simple regression of Earnings (Y) on Height of individuals (X), and got a coefficient on the Height variable of 30, what would the intercept be if you added 3 inches to every individual in the sample but kept their earnings the same?

None of these choices are correct.

Suppose you have a random sample of 21 credit scores from a population of mortgage applicants with a sample mean of 620 and sample standard deviation of 70, and would like to calculate the 99 percent confidence interval of the population mean credit score. Which of the following would be the correct construction?

None of these choices are correct.

As long as our sample is large enough it will be the case that the average outcome for the treated group should have what distribution?

Normal

Under which sort of prediction is R-squared particularly informative of the value of the regression results?

Passive prediction

Which of the following settings best describes a setting in which one would be making an active prediction?

Predicting the change in click-through rates following the change in banner size.

Which of the following is an example of nonrandom treatment assignment in nonexperimental data?

Price's impact on the number of products sold

Which of the following departments is likely to experience experimental data more frequently than nonexperimental data?

R&D

If the t-stat for the sample estimate of a coefficient \( M_1 \), calculated as \( t = \left| \dfrac{m_1 - 1}{S_{m_1}} \right| \), where \( m_1 \) is the estimated coefficient and \( S_{m_1} \) is the properly estimated standard error for the coefficient, comes out to be 2.7, what is the appropriate conclusion?

Reject the null hypothesis that the population coefficient M1 = 1 at a 99% confidence level.
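
One way to see this is to compare 2.7 against two-tailed critical values. The sketch below assumes a large sample, so standard normal critical values stand in for the t-distribution; the names are illustrative.

```python
from statistics import NormalDist

t_stat = 2.7
decisions = {}

for level in (0.90, 0.95, 0.99):
    # Two-tailed critical value: split the remaining probability across both tails.
    critical = NormalDist().inv_cdf(1 - (1 - level) / 2)
    decisions[level] = (round(critical, 3), t_stat > critical)

for level, (critical, reject) in decisions.items():
    print(level, critical, "reject" if reject else "fail to reject")
```

Since 2.7 exceeds even the 99% critical value (about 2.576), the null is rejected at all three levels.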

For a given set of sample statistics, changing the null hypothesized value (K) for a population mean changes everything except for what?

Sample standard deviation

When running an experiment, suppose we assume the participants in the experiment are a random sample of the population. Let \( Y_i \) be the outcome for individual i and let \( D_i \) equal one if individual i receives the treatment and zero otherwise. What does the assumption of a random treatment imply?

Selection Bias = 0

In trying to measure the "treatment effect" of taking Tylenol on reducing your "next day's temperature from today", and given the fact that you only take Tylenol when you're feeling sick, which of the following conditions are likely?

Selection Bias > 0

How can applying the insights from the experiment ideal and the scientific method approach facilitate better analysis with nonexperimental data?

Sharpen attention towards the variation in treatment that is most appropriate for measuring treatment effects

Given that nonexperimental datasets are likely to have treatments that have not been randomly assigned, what is likely to contribute to difference in average outcomes between groups with different treatment levels?

Some of those differences are driven by a selection bias

Nonexperimental data are likely to have issues with all of the following assumptions except for what?

Sufficiently large sample size to appeal to normality

Suppose you are comfortable with the assumptions required for causal analysis and you have estimated the relationship between Sales and running a promotion together with a price discount to be \( Sales_i = 140.3\,(60) + 4.3\,(0.7)\,Promo_i \), with standard errors reported in parentheses. What would you predict to occur in the event of running a discount next week?

That it will increase sales by 4.3, and you are 90% confident that Promotions have some effect on Sales.

In making an active prediction that using a large banner advertisement will increase click-through rates based on a sample of data, what is not an appropriate criticism that someone might have for your prediction?

That the underlying population distribution is not normal.

Suppose you have a random sample of employees in your company and their tenure. The sample mean of this sample is 4.2 years and the sample standard deviation is 4.5 years. How would knowing that the random sample was of size 100 instead of 60 change the 90% confidence interval for the population mean of employee tenure?

The 90% confidence interval will be smaller for 100 than 60.
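
The margin of error shrinks with the square root of the sample size. A sketch comparing the two cases, assuming a rounded 90% critical value of 1.65 (the helper name `margin` is illustrative):

```python
from math import sqrt

sample_sd = 4.5
z_90 = 1.65  # assumed rounded 90% critical value

def margin(n):
    """Half-width of the 90% confidence interval for a sample of size n."""
    return z_90 * sample_sd / sqrt(n)

margin_60 = margin(60)
margin_100 = margin(100)
print(round(margin_60, 4), round(margin_100, 4))
```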

Suppose you're running a multiple regression of Home Prices on two treatment variables, City, which is a binary variable for whether or not the home is located in a city or not, and Finished Basement, which is a binary variable for whether or not the home has a finished basement. If you solve for the multiple regression using the moment conditions all of the following conditions must hold except for what?

The correlation between the residuals and Home Prices must be equal to zero.

Suppose your lead analyst runs a simple regression of Profits (Y) on price (X). You know that the average profit in the sample was $1,000 and the average price was $25. If your analyst reports that the intercept from the simple regression is 900, what can you infer about the estimated slope?

The estimated slope is 4.
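
The simple regression line passes through the point of sample means, so average profit = intercept + slope × average price. Solving for the slope, as a short sketch with illustrative names:

```python
mean_profit = 1000
mean_price = 25
intercept = 900

# Rearranged from: mean_profit = intercept + slope * mean_price
slope = (mean_profit - intercept) / mean_price
print(slope)
```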

If you're running a multiple regression of employee Hours Worked on Tenure (in number of years) and MBA (a binary variable equal to 1 for an employee with an MBA, 0 otherwise), which moment conditions would be used?

The first two answer options are correct.

Suppose Apple is considering increasing its advertising expenditures to promote the most recent iPhone. To try to assess the effect of such a move, it looks at sales in several small markets. In some of these markets Apple increased advertising expenditures by 30% and in others there was no change. When conducting this "experiment," what is the treatment?

The increase in advertising expenditures

Why is it helpful to think about the moment conditions of a simple linear regression even if OLS yields the same estimates?

The moment conditions are used directly to produce the slope and intercept.

Why is it helpful to think about the moment conditions of a simple linear regression even if OLS yields the same estimates?

The moment conditions facilitate assessing assumptions about causality.

Which of the following are you assuming is true to calculate the p-value of a test statistic?

The null hypothesis

In the dichotomous regression if one was to replace the regression line prediction of the outcome means (for treated/untreated) with medians, which of the following conditions would hold?

The number of strictly positive residuals would equal the number of strictly negative residuals.

Which of the following is not an assumption required to build a confidence level for an ATE in an experimental design?

The outcomes for the treated group and control group are the same.

Which of the following is not a required assumption in the reasoning behind constructing a hypothesis test of a population mean?

The population distribution is normal.

After estimating a regression of your firm's store sales and the number of local competitors as follows: Sales = 321,752 + 70.35 Number of Competitors. You are willing to believe you have a random sample of store locations, and a sufficiently large sample size. What is wrong with the following logic, "We need more competitors to enter the markets we're in, so that our sales will rise?"

The positive coefficient on number of competitors is not a causal estimate.

Which of the following conditions ensures that the estimates of the coefficients for the population regression equation are distributed normally?

The sample is "large" enough.

Suppose you send out 350 surveys to random sample of all past customers (your target population) asking them to report their level of satisfaction with your product. Of the 350, you used the 112 that responded to the survey to construct a confidence interval for the population "satisfaction score." What might be a potential problem with this confidence interval?

The sample you're using is not a random sample from the target population.

All of the following are statements of the criteria used to find the line that "best" describes the data in a multiple regression except for what?

The size of the residuals is not correlated with the outcome level.

Beyond the conditions required for consistent estimation of a model for causality, in order to conduct inference what additional assumption is required?

The size of the sample is sufficiently large (e.g., 30(K + 1), where K is the number of coefficients).

Which of the following is a reason why estimating the slope and intercept of a simple linear regression line using the least absolute deviations (LAD) approach is not as common as OLS?

The solution for LAD isn't always unique.

Suppose you're running a multiple regression of Home Prices on two treatment variables, City, which is a binary variable for whether or not the home is located in a city or not, and Finished Basement, which is a binary variable for whether or not the home has a finished basement. If you solve for the multiple regression using the moment conditions which condition must hold?

The sum of the residuals for the observations with Finished Basements (=1) must be zero.

In a dichotomous regression which condition must hold?

The sum of the residuals for treated and untreated groups must be equal.

In a broad sense, the role of a confidence interval for the population mean is meant to accurately portray what?

The uncertainty involved with observing a sample and not the entire population.

If the determining function for Sales is given by \( Sales_i = \alpha_0 + \alpha_1 Price_i + U_i \), what will the correlation between Sales and Price be?

There is not enough information.

When calculating the sample variance of a random sample, you divide the sum of the squared deviations (from the sample mean) by N - 1 instead of N to ensure the estimator achieves what property?

Unbiasedness
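
Unbiasedness means the estimator equals the population variance on average across samples. The sketch below checks this exactly on a toy population of {0, 1} by listing every sample of size two drawn with replacement; the population and helper names are assumptions chosen for illustration.

```python
from itertools import product
from statistics import mean

population = [0, 1]
pop_mean = mean(population)
pop_var = sum((v - pop_mean) ** 2 for v in population) / len(population)

def var_estimate(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    m = sum(sample) / len(sample)
    return sum((v - m) ** 2 for v in sample) / divisor

# All equally likely samples of size N = 2 (drawn with replacement).
samples = list(product(population, repeat=2))

avg_n_minus_1 = mean(var_estimate(s, 1) for s in samples)  # divide by N - 1
avg_n = mean(var_estimate(s, 2) for s in samples)          # divide by N

print(pop_var, avg_n_minus_1, avg_n)
```

Dividing by N - 1 recovers the population variance on average, while dividing by N understates it.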

Which of the following objects must be the same sign as the covariance of variables X and Y?

Unconditional correlation of X and Y.

In the simple linear regression, the intercept will equal the sample average of the outcome (Y) variable if which of the following is true?

\( \bar{X} = 0 \) (the sample mean of X is zero)

Which of the following equations cannot be estimated using linear regression techniques?

Y = m1X × m2Z

After estimating a regression of each employee's number of contracts sold and their tenure (in number of years) at the company as follows: Contracts = 30.5 + 4.5 Tenure. You are willing to believe you have a random sample of employees, and a sufficiently large sample size. Given these results, is it appropriate to make the following claim, "Our more tenured employees at the company on average get awarded more contracts"?

Yes, you're making a passive prediction.

The distinction between causality and correlation is best described as:

causality implies a change in one variable creates a change in another, correlation implies variables move together.

The step that requires the use of inductive reasoning when making an active prediction from a sample of data is:

determining the population parameter from a sample.

As long as your sample is large enough, you don't have to worry about using the sample standard deviation in place of the unknown population standard deviation in constructing a confidence interval because ________.

for a large sample, the t-distribution is similar to the standard normal distribution

When implementing the scientific method, one moves from a research question to a proposed idea based on limited evidence that justifies further investigation, also known as a:

hypothesis.

While there are a few instances in the business world where one will observe experimental data, understanding the scientific method is critical because:

it's the gold standard for establishing causality.

The correlation between X and Y holding at least one other variable constant is known as:

partial correlation

The most robust (but perhaps impractical) way to estimate the price elasticity of demand for your product would be:

randomize the price over a period of time and estimate the difference in sales resulting from those changes.

The difference between the observed outcome and the corresponding point on the regression line for a given observation is a:

residual.

The R-squared of a regression is 1 - X, where X is the:

sum of squared residuals divided by the total sum of squares.
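
A sketch of this definition in code, using a hand-written simple regression and made-up data (both are assumptions for illustration only):

```python
def fit_line(x, y):
    """Return (intercept, slope) for a simple regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope

def r_squared(x, y):
    """R-squared = 1 - (sum of squared residuals / total sum of squares)."""
    intercept, slope = fit_line(x, y)
    y_bar = sum(y) / len(y)
    ssr = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ssr / tss

x = [1, 2, 3, 4, 5]
y_exact = [2 * xi + 1 for xi in x]        # lies exactly on a line
y_noisy = [2.1, 3.9, 6.2, 7.8, 10.1]      # hypothetical noisy outcomes

r2_exact = r_squared(x, y_exact)
r2_noisy = r_squared(x, y_noisy)
print(r2_exact, r2_noisy)
```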

Suppose you have a random sample of 2,179 credit scores from a population of mortgage applicants with a sample mean of 620 and sample standard deviation of 70, and would like to determine if this is sufficient enough to rule out that the population mean is not 610. Which of the following objects would you calculate to make this decision?

t-stat

The total sum of squares is given by the sum of:

the squared difference between each observation Y and the average value for Y.

When making passive predictions, it is not important to be able to conclude that:

your estimate describes a causal relationship between treatments and the outcome.

When making active predictions, it is important to be able to conclude that:

your estimate of the coefficients is a causal estimate.

To determine the intercept and slope coefficient in a simple linear regression line of Y on X, all the following conditions will be used except for what?

\( \dfrac{1}{N} \sum_{i=1}^{N} (Y_i - m X_i)\, X_i = 0 \)

Using OLS to solve for the slope and intercept of a simple linear regression will yield a regression line that satisfies which of the following conditions?

\( \dfrac{1}{N} \sum_{i=1}^{N} e_i X_i = 0 \)
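
The condition can be checked on a fitted line: the residuals average to zero and are uncorrelated with X, but nothing forces them to be uncorrelated with Y. This sketch uses made-up data and a hand-written `ols` helper (both hypothetical).

```python
def ols(x, y):
    """Return (intercept, slope) for a simple regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope

# Hypothetical sample.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]
n = len(x)

b, m = ols(x, y)
residuals = [yi - (b + m * xi) for xi, yi in zip(x, y)]

mean_resid = sum(residuals) / n                                  # first moment condition
mean_resid_x = sum(e * xi for e, xi in zip(residuals, x)) / n    # second moment condition
mean_resid_y = sum(e * yi for e, yi in zip(residuals, y)) / n    # not a moment condition

print(mean_resid, mean_resid_x, mean_resid_y)
```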

If you're running a multiple regression of employee Hours Worked on Tenure (in number of years) and MBA (a binary variable equal to 1 for an employee with an MBA, 0 otherwise), what moment conditions would not be used?

\( \dfrac{1}{N} \sum_{i=1}^{N} (Hours_i - b - m_1 Tenure_i)\, MBA_i = 0 \)

Suppose that the following regression equation best describes the co-movement between Sales, Price and Number of Competitors: \( Sales_i = B + M_1 Price_i + M_2 NumComp_i \). What moment condition would not be used to yield a consistent estimate of \( B \), \( M_1 \), \( M_2 \)?

\( \dfrac{1}{N} \sum_{i=1}^{N} (Sales_i - b - m_1 Price_i)\, Price_i = 0 \)

