Statistics 2 (Tölfræði 2)
If in a model, all the points on a scatter diagram lie on a straight line, what is the value of the Sum of Squared Errors? ●0 ● Infinity ●1 ● cannot be determined
0
The range of the values for Durbin Watson statistic is between: ● -1 to 1 ● Unlimited ● 0 to 4 ● 0 to 2
0 to 4
Sam wants to analyse electricity prices. He has set up a multiple regression model with 3 independent variables, his dataset includes 24 monthly observations. What is the lower critical value of the one sided Durbin-Watson test at 5% significance level? ● 1.08 ● 1.10 ● 1.54 ● 1.66
1.10
Anna is analysing a multiple regression model with 2 independent variables; there are 69 observations in her dataset. She needs to check whether the errors from her model are homoskedastic. She has decided to set up the White Test with 5% significance level. What is the critical value used in this test? ● 3.84 ● 7.81 ● 5.99 ● 11.07
11.07
Arna is analysing a multiple regression model with three independent variables. She needs to check whether the errors from her model are homoskedastic. She has decided to set up the Breusch-Pagan test. How many degrees of freedom will the N*R_squared statistic have? ●2 ●9 ●3 ●8
3
If a categorical independent variable, such as education level, contains exactly 4 categories: bachelor, Master, PhD, MBA, then how many variables will be needed to represent these categories in a multiple regression? ●1 ●3 ●2 ●4
3
If an independent variable, such as seasons, contains exactly four categories, then how many dummy variable(s) will be needed to uniquely represent these categories? ●4 ●3 ●2 ●5
3
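The k-1 rule for dummy variables can be sketched in a few lines of Python. This is only an illustration: the function name `dummy_encode` and the choice of the first category as the baseline are assumptions, not part of the course material.

```python
def dummy_encode(value, categories):
    """Return k-1 dummy values for `value`; the first category is the baseline."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    # The baseline category is represented by all zeros, so only k-1 dummies are needed.
    return [1 if value == cat else 0 for cat in categories[1:]]


seasons = ["winter", "spring", "summer", "autumn"]
for season in seasons:
    print(season, dummy_encode(season, seasons))
```

Using all four dummies together with an intercept would make the columns perfectly collinear, which is why one category is dropped.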
In a one way analysis of variance with 3 groups and n=15 the value of the F statistic is equal to 3.9. What is the critical value (with alpha of 5%)? Do you reject or fail to reject H0? ● -3.885; reject ● 3.885; reject ● 3.885; fail to reject ● -3.885; fail to reject
3.885; reject
In a multiple regression analysis with 7 independent variables and a sample size of 320, the degrees of freedom for a test of the significance of an individual coefficient are equal to: ● 319 ● 313 ● 312 (n-k-1) ● impossible to determine
312
Anna is analysing a multiple regression model with two independent variables; there are 69 observations in her dataset. She needs to check whether the errors from her model are homoskedastic. She has decided to set up the White test. How many degrees of freedom will the N*R_squared statistic have? ● 66 ● 5 ● 2 ● 69 (Number of coefficients in the auxiliary equation, where p = number of predictors: (p**2 + 3*p) / 2)
5
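A small sketch of the degrees-of-freedom count for the two heteroskedasticity tests, assuming the White auxiliary regression uses the p predictors, their p squares and all p(p-1)/2 cross products. The function names are hypothetical.

```python
def white_test_df(p):
    """Degrees of freedom of N*R_squared in the White test with p predictors."""
    # p levels + p squares + p*(p-1)/2 cross products = (p**2 + 3*p) / 2
    return (p**2 + 3 * p) // 2


def breusch_pagan_df(p):
    """Degrees of freedom of N*R_squared in the Breusch-Pagan test with p predictors."""
    return p


print(white_test_df(2), breusch_pagan_df(3))
```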
What is the effect on wage for every extra year of education for a female according to the estimated model below? log(wage)=0,389-0,227female+0,082educ-0,0056female*educ ● 0.56% ● 23.84% ● 7.64% ● 8.2%
7.64%
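The 7.64% figure comes from adding the interaction coefficient to the education coefficient when female = 1. A minimal sketch, with a hypothetical function name:

```python
def education_effect(female):
    """Approximate effect of one extra year of education on log(wage)."""
    b_educ = 0.082             # coefficient on educ
    b_female_educ = -0.0056    # coefficient on female*educ
    return b_educ + b_female_educ * female


print(f"female: {100 * education_effect(1):.2f}%")
print(f"male:   {100 * education_effect(0):.2f}%")
```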
Katrin needs to verify whether there is a problem of heteroskedasticity in her model (OLS regression with 3 independent variables). She has set up a Breusch-Pagan test, the value of the statistic is 39. What is the 5% critical value and the conclusion from the test? ● 9.49 ; there is no evidence of heteroskedasticity ● 3.182 ; there is evidence of heteroskedasticity ● 7.81 ; there is evidence of heteroskedasticity ● 2.353; there is no evidence of heteroskedasticity
7.81 ; there is evidence of heteroskedasticity
Variable Master is a dummy equal to 1 if an individual has a Master degree and 0 if he does not have a Master degree. What is the correct interpretation of the following model? wage = 202.2 + 1543.2 Master, where wage is in dollars ● A person with a Master degree earns 15.43% more than a person without a Master degree. ● A person with a Master degree earns 1745 dollars and 4 cents more than a person without a Master degree. ● A person with a Master degree earns 0.1543% more than a person without a Master degree. ● A person with a Master degree earns 1543 dollars and 2 cents more than a person without a Master degree.
A person with a Master degree earns 1543 dollars and 2 cents more than a person without a Master degree.
y = production cost, x1 = labour salary, x2 = material cost; log y = 3.046 + 1.2456 log x1 + 0.5722 log x2. Interpret the estimated regression coefficient of x1. ● All else being equal, 24.56% increase in production cost will be associated with a 1% increase in labour salary. ● All else being equal, 1% increase in production cost will be associated with a 24.56% increase in labour salary. ● All else being equal, 1% increase in production cost will be associated with a 1.2456% increase in labour salary. ● All else being equal, 3.046% increase in production cost will be associated with a 1.2456% increase in labour salary.
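All else being equal, 1% increase in production cost will be associated with a 1.2456% increase in labour salary. (The options word this backwards: in a log-log model the coefficient means that a 1% increase in labour salary x1 is associated with a 1.2456% increase in production cost y.)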
What is the correct interpretation of the following model: Salary = 15 + 25 education, where Salary is measured in dollars and education is measured in years. ● An extra year of education increases salary by 250 dollars ● An extra year of education increases salary by 25 dollars ● An extra year of education increases salary by 2.5% ● An extra year of education increases female salary by 25 dollars
An extra year of education increases salary by 25 dollars
What is the correct interpretation of the following model: log_salary = 15 + 0.25 education where salary is measured in dollars and education is measured in years. ● An extra year of education increases salary by 25 dollars ● An extra year of education increases salary by 0.25 dollars ● An extra year of education increases salary by 25% ● An extra year of education increases salary by 0.25 %
An extra year of education increases salary by 25%
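The 25% reading uses the usual approximation 100*b for a log-level model. A short sketch comparing it with the exact percentage change 100*(exp(b) - 1); the function names are hypothetical.

```python
import math


def approx_pct_change(b):
    """Approximate percentage change in y for a one-unit change in x (log-level model)."""
    return 100 * b


def exact_pct_change(b):
    """Exact percentage change in y implied by the same coefficient."""
    return 100 * (math.exp(b) - 1)


b_education = 0.25
print(approx_pct_change(b_education), exact_pct_change(b_education))
```

The approximation is close for small coefficients and drifts apart as the coefficient grows, which is worth keeping in mind for a coefficient as large as 0.25.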
What are some of the effects of heteroskedasticity? ● Error term variance becomes constant ● Error term variance is not constant ● Confidence intervals and t-tests are no longer valid. ● Confidence intervals and t-tests become valid. ● Both A and D are correct ● Both B and C are correct
Both B and C are correct
Which tests can we perform to check for heteroskedasticity? ● T-test ● F-test ● Breusch-Pagan test ● White test ● Both A and C are correct ● Both C and D are correct
Both C and D are correct
After testing residuals for cointegration using the Dickey-Fuller test, what method(s) can we use to deal with nonstationary time series if the variables have unit root but are not cointegrated? ● Estimate the equation in its original form ● Change the functional form of the model to first differences and estimate the equation ● Plot a scatter graph and draw a trendline ● All of the above ● None of the above
Change the functional form of the model to first differences and estimate the equation
Arnar is a big fan of Dungeons and Dragons. He often plays the game on Friday evenings with a bunch of other people. However, after last evening he is worried that his D-20 dice (a special dice with 20 sides) has been swapped for a fake one - it still has 20 sides but they do not come up with the same probability each. He wants to check this dice - what is the best test with which to do that? ● T-test ● One way ANOVA ● Chi-square goodness of fit test (It allows you to draw conclusions about the distribution of a population based on a sample) ● Wilcoxon Signed Rank test
Chi-square goodness of fit test
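A sketch of the goodness-of-fit calculation for the D-20 case. The roll counts below are invented purely for illustration, and 30.144 is the tabulated 5% chi-square critical value for 19 degrees of freedom (20 faces minus 1).

```python
def chi_square_gof(observed):
    """Chi-square statistic against a uniform distribution over the categories."""
    total = sum(observed)
    expected = total / len(observed)  # the same expected count for every face
    return sum((o - expected) ** 2 / expected for o in observed)


CRITICAL_5PCT_DF19 = 30.144  # table value, df = 20 - 1

# Hypothetical data: 200 rolls, face 1 seen 6 times, faces 2-19 seen 8 times each,
# face 20 seen 50 times.
rolls = [6] + [8] * 18 + [50]
statistic = chi_square_gof(rolls)
print(statistic, statistic > CRITICAL_5PCT_DF19)
```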
When the upper bound of the 95% confidence interval is equal to 3.5 and the lower bound is -0.3 we can conclude that: ● Coefficient is significantly different from 0 ● We have 5% success of correctly estimating our coefficient of interest ● Confidence interval is inconclusive ● Coefficient is not significantly different from 0
Coefficient is not significantly different from 0
When the upper bound of the 95% confidence interval is equal to 3.5 and the lower bound is 0.3 we can conclude that: ● Coefficient is not significantly different from 0 ● Confidence interval is inconclusive (it cannot be inconclusive) ● We have 5% success of correctly estimating our coefficient of interest ● Coefficient is significantly different from 0
Coefficient is significantly different from 0
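The rule behind these two confidence-interval cards is simply whether the interval contains zero. A minimal sketch with a hypothetical function name:

```python
def significantly_different_from_zero(lower, upper):
    """True when a confidence interval excludes 0."""
    return not (lower <= 0 <= upper)


print(significantly_different_from_zero(-0.3, 3.5))
print(significantly_different_from_zero(0.3, 3.5))
```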
What is the correct information regarding the following model: salary = 10 + 30 education, where salary is measured in dollars and education is measured in years? The standard error of the education coefficient is 30. ● Coefficient on education is not significant ● Coefficient on education is significant ● Standard error of the slope is not significant ● Standard error of the slope is significant (t value = B / SEb = 30 / 30 = 1; t critical = max 1.282; 1 < 1.282, so we cannot reject H0 (B = 0), and therefore the education coefficient is not significant)
Coefficient on education is not significant
What is the correct information regarding the following model: salary = 10 + 30 education, where salary is measured in dollars and education is measured in years? The standard error of the education coefficient is 3. ● Coefficient on education is not significant ● Coefficient on education is significant ● Standard error of the slope is not significant ● Standard error of the slope is significant (t value = B / SEb = 30 / 3 = 10; t critical = max 3.09; 10 > 3.09, so we reject H0 (B = 0), and therefore the education coefficient is significant)
Coefficient on education is significant
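The t value used in these two cards is the coefficient divided by its standard error. A small sketch, assuming H0: B = 0 as in the notes; the function name is hypothetical.

```python
def t_statistic(coefficient, standard_error, hypothesised=0.0):
    """t value for H0: coefficient == hypothesised."""
    return (coefficient - hypothesised) / standard_error


print(t_statistic(30, 30))  # standard error 30
print(t_statistic(30, 3))   # standard error 3
```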
Katrin has set up a model without an intercept to evaluate the effect of the presidential election on the stock market. She has used daily data from the last year and set up a multiple regression model. Which test can she not use while evaluating this model? ● T-test ● Dickey Fuller test ● F-test ● Durbin Watson test (The Durbin-Watson test requires three assumptions; one of them is that the regression model includes an intercept term.)
Durbin Watson test
Katrin has set up a model evaluating hourly electricity prices using observations from the last year (price of electricity is the dependent variable). She has set up a test to verify whether data are stationary. The t-statistic for electricity is equal to -1.2. What can she conclude? ● Electricity price is significantly different from zero ● Electricity price is neither stationary nor non-stationary - test is inconclusive ● Electricity price is stationary ● Electricity price is not stationary
Electricity price is not stationary
What does it mean that the error term is homoscedastic? ● Error terms have problems with autocorrelation of order 1. ● Error terms come from at least two different distributions. ● Error terms have a constant variance. ● Error terms do not have constant variance.
Error terms have a constant variance.
What does it mean that the error terms come from the same distribution? ● Errors are autocorrelated ● Errors are heteroskedastic ● Errors are characterised by a non constant variance ● Errors are homoskedastic
Errors are homoskedastic
Lagrange Multiplier is used to test if: ● Errors are serially correlated ● Errors are normally distributed (large sample / scatter plot / histogram, small sample) ● Errors are stationary (Dickey-Fuller test) ● Errors come from different distributions (test for heteroskedasticity)
Errors are serially correlated
While setting up Dickey-Fuller test for the exchange rate over a period of 30 months, Arnar calculated t statistic to be equal to -1.32. What can he conclude? ● Exchange rate is not stationary ● Exchange rate is stationary ● Exchange rate is a time series ● Exchange rate is cointegrated
Exchange rate is not stationary
In order to test the validity of a multiple regression model involving 5 independent variables, an intercept and 50 observations, the statistic for assessing the overall significance of the model follows: ● Student's t-distribution with 44 degrees of freedom ● F-distribution with 5, 44 and 45 degrees of freedom ● F-distribution with 5 and 44 degrees of freedom ● Student's t-distribution with 45 degrees of freedom
F-distribution with 5 and 44 degrees of freedom
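The degrees of freedom quoted in the cards follow directly from n observations and k independent variables plus an intercept. A minimal sketch with hypothetical function names:

```python
def overall_f_df(n, k):
    """(numerator, denominator) degrees of freedom of the overall F-test."""
    return k, n - k - 1


def coefficient_t_df(n, k):
    """Degrees of freedom of the t-test for an individual coefficient."""
    return n - k - 1


print(overall_f_df(50, 5))       # 5 independent variables, 50 observations
print(coefficient_t_df(320, 7))  # 7 independent variables, 320 observations
```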
Arnar is estimating a multiple regression with days of the week dummies. He wants to verify whether these dummies are together important for the model. For this he needs to set up: ● T-test for each dummy ● Overall F-test ● Confidence interval ● F-test for a subsample (for subgroups)
F-test for a subsample
In order to verify whether exchange rate Granger causes balance of payments Arnar needs to set up: ● F-test for subgroup ● T-test ● Chi-squared test ● Dickey Fuller test
F-test for subgroup
While setting up Dickey-Fuller test for the exchange rate over a period of 25 months, Arnar calculated t statistic to be equal to -0.92. What can he conclude? ● He fails to reject H0 of series being stationary ● He rejects H0 of series being nonstationary ● He rejects H1 of series being stationary ● He fails to reject H0 of unit root.
He fails to reject H0 of unit root.
Dickey-Fuller test is used to check: ● If one variable Granger causes the other (F-test for subgroups) ● If time series is stationary ● If cross sectional data are stationary ● If there is cointegration between the independent variables
If time series is stationary
In regression models, multicollinearity arises when the ______. ● Dependent variables are highly correlated with one another. ● Independent variables are highly correlated with one another. ● Independent variables are highly correlated with the dependent variable. ● Error terms do not have the same variance. ● Sigmundur Davíð Gunnlaugsson becomes prime minister.
Independent variables are highly correlated with one another.
A time series is a: ● set of measurements on a variable collected at the same time or approximately the same period of time. ● Is a set of measurements, ordered over time, on a particular quantity of interest. ● Model that attempts to analyze the relationship between a dependent variable and one or more independent variables. ● Model that attempts to forecast the future value of a variable.
Is a set of measurements, ordered over time, on a particular quantity of interest.
Which of the following is true of the error term used in linear regression? ● It represents the joint influence of all the dependent variables in the regression model. ● It represents the joint influence of factors, other than the dependent and independent variables, on the regression model. ● It represents the joint influence of all the independent variables in the regression model. ● It represents the combined effect of the dependent, independent, and non- represented factors on the regression model.
It represents the joint influence of factors, other than the dependent and independent variables, on the regression model.
Which of the following is expected to occur in multiple regression analysis if an important variable is omitted from the list of independent variables? ● It will lead to multicollinearity between predictor variables. (only if 2 x variables are highly correlated) ● It will lead to biased least squares estimators. (same as B) ● It will lead to unbiased least squares estimators. ● It will lead to biased estimators of the variance. (SEb; all other deviations from the assumptions do this)
It will lead to biased least squares estimators.
Which procedure is not used when working with a regression analysis and determining the quality of the OLS model? ● White test ● Variance Inflation Factor ● Kruskal-Wallis ● Confidence interval
Kruskal-Wallis
Maria has set up a regression using monthly data. Among the independent variables there is a lagged dependent variable on the right hand side. Maria wants to test the model for autocorrelation, which test should she use? ● Durbin Watson test ● Dickey Fuller test ● Lagrange Multiplier test ● Granger test
Lagrange Multiplier test
Arna has set up a model to evaluate the effect of a volcanic eruption on the number of tourists. She has used monthly data from the last three years and set up a simple regression model. She wants to verify whether there is a second order correlation problem in the model. Which test should she use to verify that? ● Lagrange Multiplier test ● Cointegration test ● Dickey Fuller Test ● Dickey fuller test
Lagrange Multiplier test
Katrin has set up a model to evaluate the effect of presidential election on the stock market. She has used daily data from the last year and set up a multiple regression model with a lag of the dependent variable as one of the explanatory variables. Which test should she use to test for autocorrelation? ● Granger causality ● Durbin Watson ● Dickey Fuller ● Lagrange Multiplier test
Lagrange Multiplier test
What would you conclude if you fail to reject H0: b1 = b2 = ... = bk = 0 ? ● No relationship exists between the dependent variable and the independent variables ● a strong relationship exists among the independent variables ● The independent variables are good predictors of the dependent variable ● more information is needed to answer the question
No relationship exists between the dependent variable and the independent variables
What are the consequences of a not constant error variance? ● OLS estimates of standard errors of slope coefficients are not reliable ● OLS coefficients are under- or over-estimated ● OLS predicted values are random ● OLS residuals are serially correlated
OLS estimates of standard errors of slope coefficients are not reliable
An experiment evaluating in total 5000 trash bags has been conducted to compare the breaking strength of trash bags produced by 5 different companies. What statistical test should be used to verify whether the mean breaking strength of these bags is equal across the five producers? ● T-test ● One way analysis of variance ● Chi-square goodness of fit ● Wilcoxon Signed Rank Test
One way analysis of variance
_________ is the probability of obtaining a test statistic more extreme than the observed sample value, given that H0 is true. ● Type I error ● Type II error ● P-value ● Chi-square test
P-value
How can we determine if there is first-order serial correlation? ● Perform the Augmented Dickey Fuller Test (ADF Test) ● Perform an F-test ● Perform the Durbin-Watson test ● None of the above
Perform the Durbin-Watson test
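For reference, the Durbin-Watson statistic itself is the sum of squared changes in consecutive residuals divided by the sum of squared residuals. The sketch below uses made-up residuals only to show the two extremes; the function name is hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson d statistic for a sequence of regression residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Residuals that never change sign give d = 0 (strong positive autocorrelation).
print(durbin_watson([1, 1, 1, 1]))
# Residuals that flip sign every period push d towards 4 (negative autocorrelation).
print(durbin_watson([1, -1, 1, -1]))
```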
When in a scatter plot all points are on the straight line we can conclude that: ● R squared is equal to 0 ● Sum of squared errors is equal to 1 ● R squared is equal to 1 ● Total sum of errors is equal to 1
R squared is equal to 1
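A quick check of the perfect-fit case with a hand-rolled simple regression: when every point lies on a line, the sum of squared errors is 0 and R squared is 1. The data points are invented for the illustration and `fit_line` is a hypothetical helper.

```python
def fit_line(x, y):
    """Simple OLS fit; returns (slope, intercept, sse, r_squared)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation
    return slope, intercept, sse, 1 - sse / sst


# Hypothetical points lying exactly on y = 1 + 2x.
slope, intercept, sse, r_squared = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept, sse, r_squared)
```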
When analysing a linear probability model we know that: (we are skipping this chapter) ● The dependent variable has only two values: 0 or 1 ● R squared is the best measure of the overall fit of the model ● At least one of the independent variable is dummy ● The errors are homoskedastic
The dependent variable has only two values: 0 or 1 (in a linear probability model R squared is not the best measure of overall fit and the errors are heteroskedastic)
What remedies can be used to avoid heteroskedasticity? ● Redefine variables ● Perform a Breusch-Pagan test ● Perform a t-test ● Check the Variance Inflation Factors (VIF)
Redefine variables
Y=B0+B1X1+B2X2+B3X3+B4X4+E and Y=B0+B1X1+B2X2+E were run using a sample of 30 observations. The SSE (Sum of Squared Errors) for the first regression is 298.4 and 382.3 for the second regression. Test H0 : B3 = B4 = 0. ● Reject H0 at a = 0.01 ● Reject H1 at a = 0.025 ● Reject H0 at a = 0.05 ● Fail to reject H0 at a < 0.10
Reject H0 at a = 0.05
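The decision above can be reproduced with the F statistic for a subgroup of coefficients. This is a sketch: the function name is hypothetical, and the critical values are rounded table values for F with 2 and 25 degrees of freedom.

```python
def subgroup_f(sse_restricted, sse_unrestricted, q, n, k):
    """F statistic for H0: q coefficients are jointly zero.

    k is the number of independent variables in the unrestricted model.
    """
    numerator = (sse_restricted - sse_unrestricted) / q
    denominator = sse_unrestricted / (n - k - 1)
    return numerator / denominator


f_stat = subgroup_f(sse_restricted=382.3, sse_unrestricted=298.4, q=2, n=30, k=4)

# Rounded critical values of F(2, 25) taken from a standard table.
F_CRITICAL_2_25 = {0.10: 2.53, 0.05: 3.39, 0.025: 4.29, 0.01: 5.57}
rejected = {alpha: f_stat > crit for alpha, crit in F_CRITICAL_2_25.items()}
print(round(f_stat, 4), rejected)
```

The statistic falls between the 5% and 2.5% critical values, so H0 is rejected at 0.05 but not at 0.025 or 0.01.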
_____ occurs when errors are not independently distributed. ● Linearity (can only be checked with plots) ● Multicollinearity (correlation between x variables; has nothing to do with errors) ● Functional form problem (can only use plots and theoretical sources; cannot be tested) ● Serial correlation
Serial correlation
What are possible consequences of Serial Correlation? ● The regression model includes a lagged dependent variable as an independent variable ● Serial correlation causes OLS to no longer be the minimum variance estimator (of all linear estimators) ● An OLS model can be used to estimate a GLS model. ● None of the above
Serial correlation causes OLS to no longer be the minimum variance estimator (of all linear estimators)
Name two common examinations you can use to check for multicollinearity ● Simple correlation coefficient and White test ● Variance inflation factors (VIF) and Breusch Pagan test ● White test and Breusch Pagan test ● Simple correlation coefficient and Variance inflation factors (VIF) ● Dickey Fuller test and Breusch Pagan test
Simple correlation coefficient and Variance inflation factors (VIF)
Sum of squared errors is computed as: (Residual Sum of squares) ● Sum of squared distances between the predicted value of the dependent variable and its mean ● sum of squared distances between the dependent variable and its mean ● Sum of squared distances between the dependent variable and its predicted value ● Sum of squared distances between the dependent variable and the independent variable
Sum of squared distances between the dependent variable and its predicted value
N-k-1 is how the degrees of freedom are calculated in case of: ● Chi square test (no, it varies by situation) ● T-test ● White test (df = k) ● F test (df1 = k in regression, k-1 in ANOVA; df2 = n-k-1 in regression, n-k in ANOVA)
T-test
N-k-1 is how the degrees of freedom are calculated in case of: ● T-test (t-test for individual coefficient) ● Lagrange Multiplier test (chi-squared, df = the number of constraints in the H0) ● White test (the complicated formula) ● Chi-squared test
T-test
A marketing firm wants to check whether there is an association between the amount of Nutella people eat and how much time they spend driving a car. Which method could they use to test this association? ● T-test ● Wilcoxon Signed Rank test ● Contingency table based on a chi-square distribution ● Durbin-Watson test
T-test and Contingency table based on a chi-square distribution
Arna has been analysing yearly GDP for the last 38 years. She has set up a model in which she explains GDP with 2 explanatory variables. While testing for positive autocorrelation she finds out that d is equal 1.68 at 5% significance level. What is the conclusion of this test? ● Test indicates no positive autocorrelation ●Test indicates no negative autocorrelation ●Test indicates negative autocorrelation ●Test indicates positive autocorrelation
Test indicates no positive autocorrelation
Sam wants to analyse electricity prices. He has set up a dynamic model with 3 explanatory variables, his dataset includes 24 monthly observations. What is the conclusion of the one sided Durbin-Watson test at 5% significance level where d=1.89? ● Test indicates positive autocorrelation ● Test indicates no positive autocorrelation ● None of the answers is correct ● Test indicates negative autocorrelation (Calculation: N = 24, K = 3, dL (table) = 1.10, dU (table) = 1.66. d < dL: reject H0, positive autocorrelation. d > dU: do not reject H0, no autocorrelation. dL < d < dU: inconclusive, do not know.)
Test indicates no positive autocorrelation
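The one-sided Durbin-Watson decision rule can be written as a small function. The bounds dL = 1.10 and dU = 1.66 are the table values for 24 observations and 3 explanatory variables; the function name and the value d = 1.30 in the last call are hypothetical.

```python
def dw_positive_autocorrelation_test(d, d_lower, d_upper):
    """One-sided Durbin-Watson test for positive first-order autocorrelation."""
    if d < d_lower:
        return "positive autocorrelation"     # reject H0
    if d > d_upper:
        return "no positive autocorrelation"  # do not reject H0
    return "inconclusive"                     # d lies between the two bounds


print(dw_positive_autocorrelation_test(1.89, 1.10, 1.66))
print(dw_positive_autocorrelation_test(0.03, 1.10, 1.66))
print(dw_positive_autocorrelation_test(1.30, 1.10, 1.66))
```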
Arna has been analysing yearly GDP for the last 85 years. She has set up a model in which she explains GDP with 4 explanatory variables. While testing for positive autocorrelation she finds out that d is equal 1.68 at 5% significance level. What is the conclusion of this test? ●Test is inconclusive ●Test indicates negative autocorrelation ●Test indicates no autocorrelation ●Test indicates positive autocorrelation
Test is inconclusive
OLS assumption 6 says that there should be no multicollinearity problem in the model. This means: ● That errors of the model should come from the same distribution (no, multicollinearity has nothing to do with errors; this describes homoskedasticity) ● That independent variables should not be correlated with each other (there is no correlation between the x's, so multicollinearity is not a problem) ● That the variation in the independent variables should explain most of the variation in the dependent variable (= R_squared is high, but more is needed) ● That all coefficients in the model should be significant (similar to C; more is needed)
That independent variables should not be correlated with each other (there is no correlation between the x's, so multicollinearity is not a problem)
Correlation coefficient equal to -0.9 between two independent variables in a regression model indicates: ●That there is multicollinearity problem with the predicted values of the model ● That there is multicollinearity problem with the errors of the model ● That there is multicollinearity problem in the model ●That the model is fine
That there is multicollinearity problem in the model
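With only two independent variables, the R squared of the auxiliary regression of one on the other equals their squared correlation, so the variance inflation factor follows directly from the correlation coefficient. A sketch under that two-variable assumption, with hypothetical function names:

```python
def vif_from_auxiliary_r2(r_squared):
    """Variance inflation factor from the auxiliary regression's R squared."""
    return 1 / (1 - r_squared)


def vif_two_regressors(correlation):
    """VIF when the model has exactly two independent variables."""
    return vif_from_auxiliary_r2(correlation ** 2)


print(round(vif_two_regressors(-0.9), 2))
```

A correlation of -0.9 gives a VIF just above the threshold of 5 that the cards use as a warning sign.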
A value of Durbin - Watson statistic is d=0.03. This means: ● The assumption of independence of errors is violated. ● The assumption of serial correlation of errors is not violated. ● The null hypothesis of autocorrelation cannot be rejected. ● There is negative serial correlation.
The assumption of independence of errors is violated.
Anne wants to find out whether the quarterly number of sold/bought movie tickets follows a uniform distribution. She has collected quarterly data over the last 5 years and performed a chi-square goodness-of-fit test. The calculated chi-square value is 5.21. What is the correct decision for that test at 1% significance level? ● The critical value is equal to 13.277. Since it is larger than 5.21 we reject the null hypothesis; the number of sold tickets follows a uniform distribution. ● The critical value is equal to 11.345. Since it is larger than 5.21 we fail to reject the null hypothesis; the number of sold tickets follows a uniform distribution ● The critical value is equal to 11.345. Since it is larger than 5.21 we fail to reject the null hypothesis; the number of sold tickets does not follow a uniform distribution ● The critical value is equal to 13.277. Since it is larger than 5.21 we fail to reject the n
The critical value is equal to 13.277. Since it is larger than 5.21 we fail to reject the null hypothesis; the number of sold tickets follows a uniform distribution.
In testing the significance of a subgroup of coefficients in a multiple regression model, a large value of the F-test statistic indicates that: ● Most of the variation in y is unexplained by the regression model ● The constrained model provides a better fit than the unconstrained model ● The longer model (with the subgroup) is better than the constrained model ● The model provides a poor fit.
The longer model (with the subgroup) is better than the constrained model
Value of variance inflation factor above 12 does not indicate that: ● The model is correctly specified (model specification and VIF have nothing in common) ● (says Helgi Magg; I think the opposite) Some of the independent variables are highly correlated with each other ● That the R squared in the auxiliary model explaining independent variables has a high fit ● That there is multicollinearity problem in the model
The model is correctly specified (model specification and VIF have nothing in common)
An OLS model with a dummy dependent variable is characterised by: ● A high correlation of independent variables and the dependent variable ● A high quality of output ● The predicted values of the dependent variable not being limited to the 0 and 1 interval ● By having stationary results
The predicted values of the dependent variable not being limited to the 0 and 1 interval
Which of the below assumptions are one of the three assumptions required to perform the Durbin-Watson test? ● The correlation between any of the independent variables is below 0.8 ● The regression model includes a lagged dependent variable as an independent variable ● The regression model does not include a lagged dependent variable as an independent variable ● None of the above
The regression model does not include a lagged dependent variable as an independent variable
Total sum of squares is: ● The variation in Y explained by the variation in X. ● The variation of observed Y values from the regression line. ● The variation of the Y values around their mean. ● The variation in the slope of regression lines from different possible samples.
The variation of the Y values around their mean.
Value of variance inflation factor below 5 is an indication that: ● That jointly all coefficients in the model are significant (overall F test) ● There is evidence that independent variables are highly correlated with the dependent variable (then VIF would be > 5) ● There is no evidence of multicollinearity in the model ● All the OLS assumptions are met
There is no evidence of multicollinearity in the model
The range of t-statistic values is: ● Unlimited ● -1 to 1 ● Only negative ● Only positive (the range of chi-squared and F is 0 to infinity) (the range of R_squared is 0 to 1)
Unlimited
How can we correct for pure Serial Correlation? ● With a Granger causality test ● Use the Generalized least squares technique ● Add a new variable to the model ● All of the above
Use the Generalized least squares technique
A test statistic (e.g. for a t-test) is built as a ratio of: ● Variance explained by the model divided by variance not explained by the model. ● Error of the model divided by the effect that we try to estimate. ● Variance not explained by the model divided by the error of the model. ● Variance explained by the model divided by the effect we try to estimate.
Variance explained by the model divided by variance not explained by the model.
Which is the correct interpretation of the following model? log(wage)_i = 0.584 + 0.083 educ_i + 0.02 female_i, where wage is measured in dollars per week, education is measured in years, and female is a dummy variable equal to 1 for a woman and 0 for a man. ● Holding gender constant, if education increases by 1%, wage will increase by 8.3%. ● Holding education constant, a female earns 0.02% more than a male. ● When comparing males and females with the same education, males earn 2% less than females. ● For both females and males an extra year of education increases their salary by 0.083%. ● The effect of education on wages is different for both sexes.
When comparing males and females with the same education, males earn 2% less than females.
When can an F-test for a subgroup not be used? ● When there are seasonal dummies in the multiple regression model ● When there is an intercept in the multiple regression model (reg model without intercept is generally a very bad idea) ● When we are interested in an overall significance of the multiple regression model ● When we are interested in the significance of three out of seven variables in the multiple regression model
When we are interested in an overall significance of the multiple regression model
What is Perfect multicollinearity? ● Linear functional relationship between two or more independent variables so strong that it can significantly affect the estimations of coefficients. ● A violation of Classical Assumption V. ● Where the variation in one explanatory variable can be completely explained by movements in another explanatory variable. ● None of the above
Where the variation in one explanatory variable can be completely explained by movements in another explanatory variable.
Breusch Pagan test allows to check: ● Whether errors are stationary ● Whether residuals are normally distributed ● Whether independent variables are heavily correlated ● Whether errors have the same variance
Whether errors have the same variance
White test is used to check: ● Whether errors are normally distributed ● Whether coefficients are significant ● Whether errors have the same variance ● Whether errors are stationary
Whether errors have the same variance
A company producing chocolate bread spread came up with two new interesting tastes: Blueberry Nutella and Blackberry Nutella. They want to analyse which of these two flavours is more popular. For this purpose they ask a number of random people to taste and rate. Every person rates both tastes. Which statistical test should be used to evaluate their answers? ● One way ANOVA ● Wilcoxon Signed Rank Test (basically a t-test with the median rather than the mean) ● Kruskal Wallis ● T-test
Wilcoxon Signed Rank Test
If the Durbin-Watson statistic d has values between 0 and dL this indicates: ● a positive first-order autocorrelation ● a negative first-order autocorrelation ● no first-order autocorrelation at all ● an inconclusive test.
a positive first-order autocorrelation
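A minimal sketch of how the Durbin-Watson statistic is computed, assuming the standard formula d = sum((e_t - e_(t-1))^2) / sum(e_t^2). The residual lists are hypothetical and chosen only to show the two ends of the 0 to 4 range: slowly drifting residuals push d towards 0 (positive autocorrelation), and sign-flipping residuals push d towards 4 (negative autocorrelation).

```python
def durbin_watson(residuals):
    """Return d = sum((e_t - e_{t-1})^2) / sum(e_t^2) for a list of residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals that move slowly: neighbours look alike.
trending = [1.0, 0.9, 0.8, 0.7, -0.7, -0.8, -0.9, -1.0]
# Hypothetical residuals that flip sign every period.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

d_trending = durbin_watson(trending)        # small d, close to 0
d_alternating = durbin_watson(alternating)  # large d, close to 4

print(round(d_trending, 3), round(d_alternating, 3))
```

Whether a given d falls below dL still has to be read from a Durbin-Watson table for the relevant n and k; the sketch only computes the statistic.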
You are interested in modeling financial results of Google for the past 36 months. Which of the following problems would most likely affect your model? ● autocorrelation ● cointegration ● normality of residuals ● specification bias
autocorrelation
While testing the stationarity of a series we get the following result: the Dickey-Fuller statistic is equal to -0.3. This indicates that most probably the series is __ and we need to __. ● stationary; do nothing ● non-stationary; take first difference ● stationary; take first difference ● non-stationary; do nothing
non-stationary; take first difference
The range of values of the Chi-squared statistic is: (White test, Breusch-Pagan, Kruskal Wallis) ● only positive (all squared numbers are positive) ● unlimited ● from 0 to 100 ● only negative
only positive (all squared numbers are positive)
The first thing to be checked when we work with time series is: ● cointegration ● overall significance of the model ● Granger causality ● stationarity Note: (0,5) autocorrelation always first; (0,7) usually we always check overall significance. Order: 1. Stationarity 2. Cointegration 3. Granger causality. This applies only to time series.
stationarity
In testing the validity of a multiple regression model, a large value of the F-test statistic indicates that: ● Most of the variation in the independent variables is explained by the variation in y. ● the model has significant explanatory power because at least one slope coefficient is not zero. ● Most of the variation in y is unexplained by the regression equation. ● the model provides a poor fit.
the model has significant explanatory power because at least one slope coefficient is not zero.
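As a rough illustration of why a large F points to explanatory power, the overall F statistic can be written in terms of R-squared as F = (R^2 / k) / ((1 - R^2) / (n - k - 1)). The sketch below assumes that standard formula; the R-squared values, n and k are hypothetical.

```python
def overall_f(r_squared, n, k):
    """Overall F statistic for a regression with k slope coefficients and n observations."""
    explained_per_df = r_squared / k
    unexplained_per_df = (1.0 - r_squared) / (n - k - 1)
    return explained_per_df / unexplained_per_df


# Hypothetical model: 24 observations, 3 independent variables.
f_good_fit = overall_f(r_squared=0.50, n=24, k=3)   # half of the variation in y explained
f_poor_fit = overall_f(r_squared=0.05, n=24, k=3)   # almost nothing explained

print(round(f_good_fit, 3), round(f_poor_fit, 3))
```

The better-fitting model gives the larger F; the value is then compared with the F critical value for (k, n - k - 1) degrees of freedom.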
What are the consequences of multicollinearity? ● OLS estimators are not BLUE ● the variances and standard errors of the estimates will increase ● coefficients will be overestimated ● estimates will be biased
the variances and standard errors of the estimates will increase
Serial correlation does not cause: ● OLS no longer being the minimum variance method ● Biased OLS coefficients ● OLS no longer giving Best linear estimates ● Unreliable confidence intervals
Biased OLS coefficients
The best model to analyse the effect education has on becoming a vegetarian (coded as 1 for being a vegetarian and 0 for not being a vegetarian) is: ● Binomial logit ● OLS ● Linear probability model ● Time series
Binomial logit
What are some of the major consequences of multicollinearity? ● Estimates will remain unbiased, and the computed t-scores will fall ● The variances and standard errors of the estimates will decrease ● The variances and standard errors of the estimates will increase ● Estimates will become insensitive to changes in specification ● Both A and C are correct
Both A and C are correct
What is the correct interpretation of the following model? log(wage) = 0.584 + 0.083*education + 0.02*female + 0.004*(education*female) where: wage is measured in dollars per week; education is measured in years; and female is a dummy variable equal to 1 for a woman and 0 for a man. ● Holding gender constant, if education increases by 1%, wage will increase by 8.3%. ● An extra year of education is increasing female salary by 8.7%. ● Holding education constant, a female earns 0.02% more than a male. ● When comparing males and females with the same education, males are earning 2% less than females. ● For both females and males an extra year of education increases their salary by 0.083%.
When comparing males and females with the same education, males are earning 2% less than females.
One way to check whether a multicollinearity problem is present in the model is to: ● make a scatter plot of the errors of the model ● calculate confidence intervals ● set up an F-test testing the subset of the coefficients included in the model ● calculate Variance Inflation Factors
calculate Variance Inflation Factors
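A minimal sketch of a Variance Inflation Factor, assuming the usual definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. With only two predictors, R_j^2 is simply the squared correlation between them, which is the case shown here. The data and helper names are hypothetical.

```python
def pearson_r(x, y):
    """Sample Pearson correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


x1 = [1, 2, 3, 4, 5, 6]
x2_related = [1.1, 2.0, 2.9, 4.2, 5.1, 5.9]   # moves almost one-for-one with x1
x2_unrelated = [3, 1, 4, 4, 1, 3]             # uncorrelated with x1 in this sample

vif_related = vif_two_predictors(x1, x2_related)      # very large
vif_unrelated = vif_two_predictors(x1, x2_unrelated)  # equal to 1, no inflation

print(round(vif_related, 1), round(vif_unrelated, 1))
```

A VIF of 1 means the variance of the coefficient is not inflated at all; the larger the VIF, the more the standard error grows because of collinearity.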
What are the current effect of sales on y in period t and the long-term effect of a 1-unit increase in sales in period t? y_t = -43.8 + 6*sales_t + 0.25*y_(t-1) ● current effect 6; long-term effect 0.25 ● current effect 0.25; long-term effect 6 ● current effect 0.25; long-term effect 43.8 ● current effect 6; long-term effect 8 Long-term effect: B / (1 - 0.25) = 6 / 0.75 = 8, where B is the coefficient on sales.
current effect 6; long-term effect 8
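The long-term figure can be checked numerically. The sketch below assumes the model y_t = -43.8 + 6*sales_t + 0.25*y_(t-1) from the question: a 1-unit increase in sales raises y by 6 in the same period, by 6*0.25 one period later, by 6*0.25^2 the period after, and so on. Summing these effects gives 6 / (1 - 0.25). The function and variable names are illustrative only.

```python
def long_run_effect(beta, lam):
    """Closed-form long-run multiplier beta / (1 - lam), valid for |lam| < 1."""
    return beta / (1.0 - lam)


beta = 6.0   # coefficient on sales_t (current effect)
lam = 0.25   # coefficient on the lagged dependent variable y_(t-1)

# Effect of the 1-unit increase on y in period t, t+1, t+2, ...
period_effects = [beta * lam ** j for j in range(50)]

current_effect = period_effects[0]
summed_effect = sum(period_effects)          # converges to the long-run effect
closed_form = long_run_effect(beta, lam)

print(current_effect, round(summed_effect, 6), closed_form)
```

The intercept -43.8 plays no role in either effect; only the coefficient on sales and the coefficient on the lag matter.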