ECON FINAL
Which statement about homoskedasticity is correct?
Homoskedasticity is when the standard errors do not depend on the values of the X's
Is there any evidence of a violation in model assumptions?
Yes, there is evidence of heteroskedasticity
A website advertises job openings on its website, but job seekers have to pay to access the list of job openings. The website recently completed a survey to estimate the number of days it takes to find a new job using its service. It took the last 30 customers an average of 60 days to find a job. Assume the population standard deviation is 10 days. Calculate a 95% confidence interval of the population mean number of days it takes to find a job
[56.4216, 63.5784]
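The interval above follows from the known-sigma formula, sample mean ± z × σ/√n. A minimal Python sketch, assuming the standard library's `statistics.NormalDist` supplies the critical value (the function name `z_confidence_interval` is illustrative and not part of the course material):

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for a population mean when sigma is known."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Job-search card: n = 30, sample mean = 60 days, sigma = 10 days.
low, high = z_confidence_interval(60, 10, 30)
print(round(low, 4), round(high, 4))  # approximately 56.4216 63.5784
```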
Which of the following would be the rent for a 1,000-square-foot apartment that has two bedrooms and two bathrooms?
$1,130
Target sells a coat for $200 normally. You have a $10 off coupon and they are running a 20% off sale. If the cashier takes the coupon off first before applying the sale, how much will the coat cost (pre-tax)?
$152
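The order of the coupon and the sale matters because one is a flat amount and the other is a percentage. A small sketch of both orders, with hypothetical helper names chosen only for illustration:

```python
def coupon_then_sale(price, coupon, percent_off):
    """Subtract the flat coupon first, then apply the percentage sale."""
    return (price - coupon) * (100 - percent_off) / 100

def sale_then_coupon(price, coupon, percent_off):
    """Apply the percentage sale first, then subtract the flat coupon."""
    return price * (100 - percent_off) / 100 - coupon

coupon_first = coupon_then_sale(200, 10, 20)  # 152.0, the card's answer
sale_first = sale_then_coupon(200, 10, 20)    # 150.0, the other order
print(coupon_first, sale_first)
```

Taking the coupon off first shrinks the base that the 20% applies to, so the shopper saves $2 less than in the other order.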
Find the margin of error for the confidence interval
1.45 inches
The median and the standard deviation for this sample are the closest to ________.
1.62 and 6.11
The minimum sample size n required to estimate a population mean with 95% confidence and the desired margin of error 1.5 was found to be 198. Which of the following is the approximate value of the assumed estimate of the population standard deviation?
10.7690
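The answer above comes from inverting the sample-size formula n = (z × σ / E)² to get σ = E × √n / z. A hedged sketch, assuming the exact normal critical value rather than the rounded 1.96 (so the last digits can differ slightly from a hand calculation); `implied_sigma` is an illustrative name:

```python
from math import sqrt
from statistics import NormalDist

def implied_sigma(n, margin, confidence=0.95):
    """Standard deviation implied by a sample size and a target margin of error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return margin * sqrt(n) / z

sigma = implied_sigma(198, 1.5)
print(round(sigma, 3))  # close to the card's 10.769
```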
A store wants to create a model for sales. In their model, they want to include the month. How many dummy variables does the store need to include assuming they are open year-round?
11
Suppose that we have a qualitative variable Month with categories: January, February, etc. How many dummy variables are needed to describe Month?
11
A sample regression equation is given by: Suppose that when x = 10, y is observed to be 9. What is the residual of the model prediction and does the model under or overpredict the value of y?
2, the model underpredicts y
A student receives a test score of 70%. The teacher announces to the class that each student will have their test score increased by 3%. How many percentage points will this increase the student's test score?
2.1
What is the predicted GPA for a male student who is 20 years old and averages 8 hours of sleep each night?
2.911
A person wants to buy a laptop which costs $1000. When he purchases the laptop, he first applies a 20% discount and then uses a $15 off coupon. What percent of the cost of the laptop was removed by the discount and the coupon?
21.5%
Math scores on the SAT are normally distributed with mean at 500 points and a standard deviation of 100 points. If 1000 students take the SAT how many would be expected to have a math score above 700 points?
23
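The count above is the upper-tail probability multiplied by the number of test takers. A short sketch using `statistics.NormalDist` from the Python standard library:

```python
from statistics import NormalDist

sat_math = NormalDist(mu=500, sigma=100)

# P(score > 700) is the area to the right of z = (700 - 500) / 100 = 2.
p_above_700 = 1 - sat_math.cdf(700)
expected_students = 1000 * p_above_700

print(round(p_above_700, 4), round(expected_students))  # about 0.0228 and 23
```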
For a uniform distribution from 3 to 10, what is the probability of an observation being between 6 and 8?
29%
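For a uniform distribution, probability is just interval length divided by the total width, here (8 - 6) / (10 - 3) = 2/7. A minimal sketch (the function name is illustrative):

```python
def uniform_probability(a, b, lo, hi):
    """P(lo <= X <= hi) for X uniform on [a, b], clipping to the support."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0) / (b - a)

p = uniform_probability(3, 10, 6, 8)
print(round(100 * p))  # 29 (percent)
```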
If SSE = 300 and SST = 625, compute SSR
325
The model y = β0 + β1X + β2D + β3XD + ε is an example of a ________.
linear regression model with dummy variable, quantitative variable, and interaction variable
In general, the null and alternative hypotheses are ________.
mutually exclusive
A sample of a given size is used to construct a 95% confidence interval for the population mean with a known population standard deviation. If a bigger sample had been used instead, then the 95% confidence interval would have been ________ and the probability of making an error would have been ________.
narrower; unchanged
Quantitative variables assume meaningful ________, whereas qualitative variables represent some ________.
numeric values, categories
Simple linear regression analysis differs from multiple regression analysis in that _______.
simple linear regression uses only one explanatory variable
A local courier service advertises that its average delivery time is less than 6 hours for local deliveries. When testing the two hypotheses, H0: μ ≥ 6 and HA: μ < 6, what does μ stand for?
the mean delivery time
If the null hypothesis is rejected at a 1% significance level, then ________.
the null hypothesis will be rejected at a 5% significance level
The adjusted multiple coefficient of determination (the adjusted R2) is adjusted for _____.
the number of independent variables
If the chosen significance level is α = 0.05, then ________.
there is a 5% probability of rejecting a true null hypothesis
The coefficient of determination R2 is ________.
usually higher than adjusted R2
What is the expected change in Y if x increases by 4 units?
Y decreases by 10 units
Suppose you want to test the hypothesis at the 99% level that average height is less than 70 inches. What will your conclusion be and why?
Fail to reject - 70 inches is inside of the confidence interval
On a standard normal curve, the area to the right of what Z-score is 95%?
-1.645
Imagine that the length of a newborn dragon, X, is normally distributed with a mean μ = 6 inches and standard deviation σ = 1.5 inches. What is the probability that the baby dragon will be smaller than 3 inches?
0.0228
A radar unit is used to measure speeds of cars on a highway. The speeds are normally distributed with a mean of 70 mph and a standard deviation of 5 mph. What is the probability that a car picked at random is travelling at more than 80 mph?
0.025
The average plane flies at 900 km/hour, and speeds are normally distributed with a standard deviation of 60 km/hour. What is the probability that a plane picked at random is travelling at a speed less than 810 km/hour?
0.0668
The probability P (Z < −1.28) is closest to ________.
0.10
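The normal-distribution cards above can be checked with `statistics.NormalDist`. This is a sketch, not the course's method. Note that the exact tail for the radar card is about 0.0228, and the card's 0.025 appears to come from the empirical rule (about 95% within two standard deviations, leaving 2.5% in each tail).

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

z_right_95 = Z.inv_cdf(0.05)             # z with 95% of the area to its right
p_dragon = NormalDist(6, 1.5).cdf(3)     # P(length < 3 inches)
p_radar = 1 - NormalDist(70, 5).cdf(80)  # P(speed > 80 mph), exact tail
p_plane = NormalDist(900, 60).cdf(810)   # P(speed < 810 km/hour)
p_z = Z.cdf(-1.28)                       # P(Z < -1.28)

for name, value in [("z_right_95", z_right_95), ("p_dragon", p_dragon),
                    ("p_radar", p_radar), ("p_plane", p_plane), ("p_z", p_z)]:
    print(name, round(value, 4))
```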
A multiple regression model with two independent variables is estimated using 20 observations resulting in SSE = 550 and SST = 1000. Which of the following is the correct value of R2?
0.45
If SSE = 300 and SST = 625, compute R2
0.52
Suppose you run a multiple regression model with 4 independent variables. If there are 100 observations and the SSE = 75, what is the MSE?
0.789
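The sum-of-squares cards above all rest on SST = SSR + SSE, R2 = SSR / SST, and MSE = SSE / (n - k - 1), where k is the number of independent variables. A small sketch with an illustrative function name:

```python
def regression_fit_summary(sse, sst, n=None, k=None):
    """Return (SSR, R2, MSE); MSE is None unless n and k are supplied."""
    ssr = sst - sse
    r2 = ssr / sst
    mse = sse / (n - k - 1) if n is not None and k is not None else None
    return ssr, r2, mse

ssr, r2, _ = regression_fit_summary(sse=300, sst=625)       # SSR = 325, R2 = 0.52
_, r2_two_var, _ = regression_fit_summary(sse=550, sst=1000)  # R2 = 0.45

# MSE card: 4 independent variables, 100 observations, SSE = 75.
mse = 75 / (100 - 4 - 1)
print(ssr, round(r2, 2), round(r2_two_var, 2), round(mse, 3))
```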
An economist uses regression analysis to determine the relationship between used car price (y) and the age of a car (x). The analysis resulted in the following equation: The above equation implies that an increase of:
1 year in the age of the car is associated with a decrease in $500 in the price of the car
For a regression, if SSE = 200 and SST = 1,400. What is SSR?
1,200
Suppose you construct a 95% confidence interval based on a sample of students' height and find a confidence interval of (67.3, 70.2). What is the point estimate for students' average height?
68.75
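Because a confidence interval has the form point estimate ± margin of error, the point estimate is the midpoint and the margin is half the width. A minimal sketch (the helper name is illustrative):

```python
def summarize_interval(low, high):
    """Recover the point estimate and margin of error from an interval."""
    point_estimate = (low + high) / 2
    margin_of_error = (high - low) / 2
    return point_estimate, margin_of_error

estimate, margin = summarize_interval(67.3, 70.2)
print(round(estimate, 2), round(margin, 2))  # 68.75 and 1.45
```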
For some uniform distribution, there is a 40% probability of drawing an observation between 0 and 3. Assuming the distribution begins at 0, where does the distribution end?
7.5
Assume we get the following estimated equation: Y = 70 + 8x1 - 2.5x2. What is the intercept coefficient for the regression equation?
70
The coefficient of determination indicates that ___
80.92% of the variation in Rent is explained by the sample regression equation
Use the following Excel regression output to answer questions 27 - 31. The dependent variable in the model is sales in units. Price is the price of the item being sold. TV and Internet advertising are the amounts of money spent on the respective types of advertising. Luxury is a dummy variable for whether the item is a luxury good. Y (Sales) = B0 + B1 X1 (Price) + B2 X2 (TV Adv.) + B3 X3 (Internet Adv.) + B4 X4 (Luxury) + ε
92.2 units
For any normally distributed random variable with mean μ and standard deviation σ, the percent of the observations that fall between [μ - 2σ, μ + 2σ] is the closest to ________.
95%
Which of the following is true about a mode?
A dataset can have multiple modes only if there is more than one observation
For the following regression equation, interpret the coefficient for mileage. ln(Price) = 23,000 - 0.037 Mileage
A one mile increase in mileage leads to a 3.7% decrease in price
Using Model A, which of the following is the estimated average difference between the salaries of male and female employees with the same years of education and weeks of training, when the male employee has one more month of experience?
About $618
Which is a way to increase the value of R2
Add an independent variable to the model which is useful for predicting the dependent variable
If we wanted to compare the goodness-of-fit for the model above with a simple regression only including Sleep, which of the following would we look at?
Adjusted R2
Which variables are statistically significant at the 95% level?
All of the above
Which of the following statements is true regarding the results above?
All of the variables included in the model are significant at the 10% level
When the level of confidence decreases, the margin of error:
Becomes smaller
If we dropped every independent variable except one, what would we expect to happen to the value of R2
Decrease
Consider the sample regression equation ŷ = 12 + 3X1 - 5X2 + 7X3 - 2X4. When X1 increases by 1 unit and X2 increases by 2 units, while X3 and X4 remain unchanged, what change would you expect in the predicted Y?
Decrease by 7
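The predicted change is each coefficient times the change in its variable, summed: 3(1) + (-5)(2) = -7. A short sketch using the coefficients from the card (the dictionary layout is just one convenient representation):

```python
coefficients = {"X1": 3, "X2": -5, "X3": 7, "X4": -2}

def predicted_change(coefs, deltas):
    """Change in predicted Y when only the variables in `deltas` move."""
    return sum(coefs[name] * delta for name, delta in deltas.items())

change = predicted_change(coefficients, {"X1": 1, "X2": 2})
print(change)  # -7
```

The intercept drops out because it is the same before and after the change.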
Which of the following will increase the width of a confidence interval?
Decrease the sample size
The difference between the observed value and the predicted value is called the
Error
What is the name of the variable that is used to predict another variable?
Explanatory
A survey that only asks Democrats "Who do you think will win the local election?" is an example of non-response bias
False
According to the empirical rule for normally distributed variables, 75% of the values fall within one standard deviation of the mean
False
Adding an independent variable, which has no predictive power, to a regression model will usually lower the value of R2
False
An assumption of the classic regression model is that there is no Multicollinearity
False
Another name for an explanatory variable is the dependent variable
False
As our F-statistic decreases closer to zero, we are more likely to reject the null hypothesis
False
For data with a heteroskedasticity problem, the estimated parameter coefficients will be wrong if we do not correct for the heteroskedasticity
False
For α = 0.05, we will reject the null hypothesis for a p-value that is 0.08
False
Heteroskedasticity is when your X variable is correlated with your Y variable
False
If the null hypothesis is H0: β1 > 0, then we are doing a right-tailed hypothesis test
False
In the simple regression model, X is a response or dependent variable
False
P(Z < 0.47) = 0.6808. In the formula above, 0.6808 is our test statistic
False
The R2 can range between -1 and 1
False
The interpretation of coefficients of each independent variable in a multiple linear regression model is the same as the interpretation of coefficient of the independent variable in a simple linear regression model
False
The letter Z is used to denote a random variable with any normal distribution
False
The normal distribution is always symmetric and centered around zero
False
The sample variance can be positive or negative
False
The sum of Squared Residuals (SSE) is the variation in the data that is explained by our model
False
The variance and standard deviation are the most widely used measures of central location
False
The width of the confidence interval is the margin of error
False
When we use goodness-of-fit measures like R2 and adjusted R2, we should only be motivated to make them as close to 1 as possible
False
What does the estimated intercept (b0) represent in regression analysis?
The value of the response variable when all of the explanatory variables are 0
Which of the following could be an example of autocorrelation
Financial crisis on the housing market
Many cities around the United States are installing LED streetlights, in part to combat crime by improving visibility after dusk. An urban police department claims that the proportion of crimes committed after dusk will fall below the current level of 0.84 if LED streetlights are installed. Specify the null and alternative hypotheses to test the police department's claim
H0: p ≥ 0.84 and HA: p < 0.84
Which of the following represents an appropriate set of a population hypotheses?
H0: μ = 0, HA: μ ≠ 0
Which of the following are two-tailed tests?
H0: μ = 10, HA: μ ≠ 10
A typical soft drink from the local vending machine has 12 oz. in it. If you think that the company is actually underfilling the cans, which of the following is the relevant hypothesis test?
H0: μ ≥ 12; HA: μ < 12
For a t-test or an F-test, as the test statistic becomes farther from zero, the chance of rejecting the null hypothesis...
Increases
Which of the following will decrease the width of a confidence interval?
Increase the sample size
Which of the following will increase the width of a confidence interval?
Increasing the level of confidence
Based on the regression output, which form of advertising is more effective?
Internet
Expedia would like to test if the average round-trip airfare between Philadelphia and Dublin is less than $1,200. Which of the following hypothesis tests should be performed?
Left-tailed
Suppose we run a simple linear regression with college GPA as the dependent variable and distance from campus as the independent variable. If the slope coefficient on distance from campus is negative, how could we interpret this? Y (GPA) = B0 + B1 (Distance) + ε
Living farther from campus is associated with a lower GPA
A fitted least squares regression line ________.
May be used to predict a value of y if the corresponding X value is given
Which descriptive statistic does not give information about the spread of the data?
Mean
Data on income in America tends to have a very long right-tail. Knowing this, which value will be larger, median household income or mean household income?
Mean Household Income
The Boom Company has recently decided to raise the salaries of all employees by 10%. Which of the following is (are) expected to be affected by this raise?
Mean, median, and mode
After you graph the data, you realize that there is a large left tail. Knowing this, which value will be larger: the mean or median?
Median
Using the same data set, four models are estimated using the same dependent variable, however, the number of independent variables differs. Which of the following models provides the best fit?
Model 1
The accompanying table shows the regression results when estimating Y = β0 + β1X + ε. Is X significantly related to y at the 5% significance level?
No, because the p-value of 0.0745 is greater than 0.05
Suppose we found out that there is a heteroskedasticity problem that was not corrected. Are we able to tell whether the coefficient for luxury is statistically significant without correcting the standard errors?
No, we cannot know how much the p-value will change without running the corrected regression
If the median is equal to the mean for our data, which of the following must be true?
None of the above is true
Which of the following is the necessary condition for creating confidence intervals for the population mean?
Normality of the estimator
Which of the following does not represent a continuous random variable?
Number of students in the classroom
If we run a simple linear regression and find an R2 = 0.12, we can conclude which of the following?
The variation in X explains 12% of the variation in Y
If we run a simple linear regression and find an R2 = 0.45, we can conclude which of the following?
The variation in X explains 45% of the variation in Y
Interpret the coefficient on price
On average holding all else constant, a $1 decrease in price will increase the number of units sold by 3.6 units
Which of the following is a correct conclusion for the coefficient on male?
On average, all else constant female students have higher GPAs than male students
Suppose we regress the salary of professional athletes on experience and gender and get the following estimated equation: Salary = 200,500 + 782,945 × ln(Experience) + 175,000 × Male. What is the correct interpretation of the coefficient on experience?
On average, all else constant, a 1% increase in experience is associated with an additional $7,829.45 in salary
Assume we instead get the following estimated equation: ln(Y) = 70 + 0.08x1 - 2.5x2. What is the correct interpretation of the coefficient on X1?
On average, all else constant, a one unit increase in X1 is associated with an 8 percent increase in Y
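The two cards above use the usual rules of thumb for logged variables: in a level-log model a 1% increase in x changes y by about b/100 units, and in a log-level model a one-unit increase in x changes y by about 100 × b percent. The sketch below compares those approximations with the exact values. The variable names are illustrative, and the coefficients are the ones quoted in the cards.

```python
from math import exp, log

# Level-log: Salary = ... + 782,945 * ln(Experience)
b_levellog = 782_945
approx_dollars = b_levellog / 100        # rule of thumb for a 1% increase
exact_dollars = b_levellog * log(1.01)   # exact change for a 1% increase

# Log-level: ln(Y) = ... + 0.08 * x1
b_loglevel = 0.08
approx_percent = 100 * b_loglevel              # rule of thumb: 8%
exact_percent = 100 * (exp(b_loglevel) - 1)    # exact percentage change

print(round(approx_dollars, 2), round(exact_dollars, 2))
print(round(approx_percent, 2), round(exact_percent, 2))
```

The approximations are close for small changes and small coefficients, which is why the flashcard answers quote $7,829.45 and 8 percent.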
Which of the following is the interpretation for the coefficient on age?
On average, all else constant, an additional 1 year of age is associated with an additional 0.07 points of GPA
Given the results reported above, we can say all of the following except
On average, the minimum price for a house in Lexington is $64,350
If X has a normal distribution with µ = 100 and σ = 5, then the probability P(85 ≤ X ≤ 90) can be expressed in terms of a standard normal variable Z as ________.
P (-3 ≤ Z ≤ -2)
For the model y = β0 + β1X + β2D + ε, which approach is used for testing the significance of a dummy variable D?
P value
It is known that the length of a certain product X is normally distributed with μ = 20 inches. How is the probability P(X > 16) related to P(X < 16)?
P(X > 16) is greater than P(X < 16)
High levels of multicollinearity make each of the following unreliable except:
Point Estimate
What is the most typical form of a calculated confidence interval?
Point estimate ± Margin of error
If we ran a simple linear regression with our dependent variable being ice cream sales and our independent variable being temperature, what sign would we expect the coefficient on temperature to be? Y (Sales) = B0 + B1 (Temperature) + ε
Positive
What relationship do the following data have? X=7,9,1,2,1,6,2,0 Y=4,1,2,2,0,2,6,2,9
Positive and non-linear
A variable that cannot be measured in numerical terms is called a _____.
Qualitative variable
Which of the following would most likely lower the R2 of this model?
Removing "Brick" from the model and running again
In order to test the joint significance of our model as a whole, which of the following would we do?
Run the regression and interpret the significance of the F-statistic
Using Model B, what is the regression equation for males?
Salary = 5,322.51 + 139.5366 Educ + 3.3488 Experience
Which of the following variables are significant at the 5% level?
Square Feet
The following scatterplot indicates that the relationship between the two variables x and y is ________.
Strong and negative
The variable Train is deleted from Model A which results in Model B. Which of the following justifies this choice?
The adjusted R2 of Model B is higher than the adjusted R2 of Model A, and the variable is not individually significant in Model A
For a simple linear regression of car price (Y) on mileage (X), what is the correct interpretation of β0, the intercept?
The average price of a car with zero mileage
Given that the 95% confidence interval for the coefficient on sleep is (0.091, 0.277), which of the following is true?
The coefficient on sleep is statistically different from 0
When we construct a confidence interval, what is our parameter of interest?
The mean
In the sample regression equation ŷ = b0 + b1x, what is ŷ?
The predicted value of y, given a specific x value
Below is the estimated regression equation of car price on mileage. Which is the correct interpretation of the coefficient on mileage? Price = 23,000 - 0.57 Mileage
The price of a car decreases by 57 cents for every additional mile that car is driven, on average
Which factor does not affect the size of the margin of error?
The sample mean
What does it mean when we say that the tails of the normal curve are asymptotic to the x axis?
The tails get closer and closer to the x axis but never touch it
In a simple linear regression model, we find that β1 is not significantly different from zero. Which of the following can we conclude?
There is no evidence of a linear relationship between X and Y
What is the purpose of calculating a confidence interval?
To provide a range of values that, with a certain measure of confidence, contains the population parameter of interest
A 95% confidence interval is larger (has a bigger range) than a 90% confidence interval all else constant
True
A regression's R2 can never be smaller than its adjusted R2
True
If a small segment of the population is sampled, then an estimate will be less precise than if a large segment was sampled
True
In a one-tailed test, the rejection region is located under one tail (left or right) of the corresponding probability distribution, while in a two-tailed test this region is located under both tails
True
In a simple linear regression, if we can reject the null hypothesis of an F-test, we can also be certain that our only independent variable is statistically significant
True
Increasing the confidence level will increase the width of a confidence interval, all else constant
True
Standard Deviation is a measure of the variation of the data series
True
Stratified Sampling is generally more accurate than cluster sampling and is preferred when possible
True
Taking a sample of Lexington residents will likely result in biased estimates compared to the entire population of Kentucky
True
The F-statistic from a regression output is used to test the significance of the entire model
True
The Standard Normal Distribution has a mean, median, and mode of 0
True
The adjusted R2 will increase only if adding a new variable X really contributes to explaining Y by more than by purely random factors
True
The assumed null hypothesis for regression coefficients is H0: β = 0 (the coefficient is not statistically significant). Therefore, HA: β ≠ 0 (the coefficient is statistically significant)
True
The standard normal distribution is a normal distribution with a mean equal to zero and a standard deviation equal to one
True
Which of the following statements about variance is the most accurate?
Variance is the average of the squared deviations from the mean
In a multivariable regression, if we know that one of the coefficients is statistically significant, what can we say about the conclusion of an F-test?
We will reject the null hypothesis because at least one variable is statistically different from zero
Which statement about homoskedasticity is not correct?
When heteroskedasticity is present, standard errors do not depend on any independent variable
In the model y = β0 + β1X + β2D + β3XD + ε, the dummy variable and the interaction variable cause
a change in both the intercept as well as the slope
A portfolio's annual total returns (in percent) for a five-year period are:
c
If a significant relationship exists between X and Y and the coefficient of determination shows that the fit is good, the estimated regression equation should be useful for:
estimation and prediction
The slope coefficient for Bedroom indicates that, holding other independent variables constant, ________.
for each additional bedroom the rent is predicted to increase by $226
Which of the following relationships can be described by using y as a dummy variable?
whether a student will pass or fail a test
The owner of a large car dealership believes that the financial crisis decreased the number of customers visiting her dealership. The dealership has historically had 800 customers per day. The owner takes a sample of 100 days and finds the average number of customers visiting the dealership per day was 750. Assume that the population standard deviation is 350. The value of the test statistic is
z = -1.429
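The test statistic above is z = (x̄ - μ0) / (σ / √n). A minimal sketch, assuming a left-tailed test (H0: μ ≥ 800 against HA: μ < 800, since the owner believes visits decreased). The p-value line is an addition that the card itself does not ask for.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(sample_mean, mu0, sigma, n):
    """z test statistic for a mean when the population sigma is known."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

z = z_statistic(sample_mean=750, mu0=800, sigma=350, n=100)
p_value = NormalDist().cdf(z)  # left-tailed p-value (assumed direction)

print(round(z, 3))  # -1.429
```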
Consider the following regression model: Humidity = β0 + β1Temperature + β2Spring + β3Summer + β4Fall + β5Rain + ε, where Spring, Summer, and Fall are the dummy variables, and the dummy variable Rain is defined as Rain = 1 if rainy day, Rain = 0 otherwise. Assuming the same temperature and precipitation condition, what is the difference between the predicted humidity for summer and winter days?
β3
Consider the following simple linear regression model: y = β0 + β1x + ε. The random error term is
ε