Quant Test #2
PARETO Chart is used...
...to display the "vital few" causes of problems. Causes are displayed in a column or bar chart sorted in order of frequency.
The standard error of the regression
is based on squared deviations from the regression line.
Which is a characteristic of the variance inflation factor
it indicates the predictor's degree of multicollinearity
In correlation analysis, neither X nor Y is designated as the independent variable
true
Which statement is correct regarding forecasting using an exponential smoothing model?
As α increases, more weight is put on recent data.
Calculate the two-tail p-value
Use Excel's function =T.DIST.2T(t,deg_freedom)
How to determine if it is an outlier.
Use empirical rule. Example in photo.
Calculate the t test statistic. (tcalc)
Use equation.
R2adj can OR cannot exceed R2 even if there are several weak predictors?
CANNOT (R2adj is smaller than R2, and a large difference suggests unnecessary predictors)
A log transformation might be appropriate to alleviate which problem(s)?
Heteroscedastic residuals
In a regression with n = 100 observations and k = 5 predictors, the criterion for high leverage is
Hi >= 0.12 Explanation: 2(k + 1)/n = 2(5 + 1)/100 = .12, so hi = .12 or more would indicate high leverage.
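The leverage criterion above can be sketched in Python. This is a minimal illustration; the function name high_leverage_cutoff is made up for these notes.

```python
def high_leverage_cutoff(n: int, k: int) -> float:
    """Return the 2(k + 1)/n threshold for a high-leverage observation."""
    if n <= 0 or k < 0:
        raise ValueError("need n > 0 observations and k >= 0 predictors")
    return 2 * (k + 1) / n

# n = 100 observations, k = 5 predictors (the card above)
print(round(high_leverage_cutoff(100, 5), 2))
# Simple regression (k = 1) with n = 25 reduces to 4/n
print(round(high_leverage_cutoff(25, 1), 2))
```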
Does the picture below show strong evidence of heteroscedasticity against the predictor Wheelbase?
No, the plot appears mostly random against wheelbase.
In the following regression, which are the two best predictors?
NumCyl, HpMax
Which is not a characteristic of the logistic regression model?
Predictions from the fitted logit model are either 0 or 1
Which type of Excel chart is considered Novelty?
Pyramid Chart (these utilize the area trick and are hard to read)
Consider the following linear trend equation of an industry's sales: yt = 120 + 12t, where t is measured in years and sales are measured in millions of dollars. Which is the most reasonable conclusion?
We would forecast that sales will increase $12 million in the next year.
Standard Residuals
ei/se
In a sample of n = 27, the critical value of Student's t for a two-tailed test of significance for a simple bivariate regression at α = .05 is
±2.060
In a sample of n = 23, the critical value of Student's t for a two-tailed test of significance for a simple bivariate regression at α = .05 is
±2.080
The critical value for a two-tailed test of H0: β1 = 0 at α = .05 in a simple regression with 22 observations is
±2.086
For the fitted time-series trend model yt = 9.23e^(−0.0867t), it is correct to say that
the time series would be declining
Prediction intervals for Y are narrowest when
the value of X is near the mean of X
If the residuals from a fitted regression violate the assumption of homoscedasticity, we know that
their variance is not constant
LINE CHARTS are LEAST useful in visualizing categorical data
they are only used to display numerical data over time
When would we use a logistic regression model?
to predict an event that occurs or does not occur
A negative value for the correlation coefficient (r) implies a negative value for the slope (b1).
true
In a simple linear regression, the p-value of the slope will always equal the p-value of the F-statistic
true
In a simple regression, the correlation coefficient r is the square root of R^2
true
A high leverage observation will have
unusual values of one or more X values.
To initialize the forecasts in an exponential smoothing process, it is acceptable to
use the average of the first six observed data values.
The correlation coefficient r measures the strength of the linear relationship between two variables
TRUE
Sturges' Rule
k = 1 + 3.3 log10(n), where n is the sample size and k is the suggested number of histogram bins
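A minimal Python sketch of this rule, assuming the argument is the sample size n and the log is base 10. The function name sturges_bins is illustrative only.

```python
import math

def sturges_bins(n: int) -> float:
    """Suggested number of histogram bins: k = 1 + 3.3 * log10(n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 + 3.3 * math.log10(n)

# Hypothetical sample of 100 observations (not from the notes)
print(round(sturges_bins(100), 1))
```

In practice the result is rounded to a convenient whole number of bins.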
In a simple regression, which would suggest a significant relationship between X and Y
Large t statistic for the slope
If the F statistic is insignificant, the t statistics for the predictors also are insignificant at the same α.
TRUE
In least squares regression, the residuals e1, e2, . . . , en will always have a zero mean.
TRUE
Moving averages are most useful for irregular data with no clear trend.
TRUE
Judgement sampling is sometimes preferred over random sampling when
"time is short and the sampling budget is limited"
A researcher's regression results are shown below using n=8 observations. Variable Coeff. Std. err Intercept 0.1667 2.8912 slope 1.8333 0.2307 Which is the 95% confidence interval for the slope?
(1.268, 2.398). d.f. = 8 − 2 = 6, so t.025 = 2.447 and the interval is 1.8333 ± (2.447)(0.2307).
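The interval can be checked with a short Python sketch. The critical value is passed in by hand (looked up in a t table), and the function name is made up for illustration.

```python
def slope_confidence_interval(b1: float, se_b1: float, t_crit: float):
    """Return (lower, upper) for b1 ± t_crit * se_b1."""
    margin = t_crit * se_b1
    return b1 - margin, b1 + margin

# Values from the card: slope 1.8333, std. err. 0.2307, t.025 = 2.447 for d.f. = 6
low, high = slope_confidence_interval(1.8333, 0.2307, 2.447)
print(f"{low:.4f} to {high:.4f}")
```

Small differences in the last digit can come from rounding the inputs.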
An estimated regression for a random sample of observations on an assembly line is Defects = 4.4 + 0.055 Speed, where Defects is the number of defects per million parts and Speed is the number of units produced per hour. The estimated standard error is se = 1.11. Suppose that 125 units per hour are produced and the actual (observed) defect rate is Defects = 4.3. (a) Calculate the predicted Defects. (b) Calculate the residual. (c) Calculate the standardized residual using se. (d) Is this observation an outlier?
(a) 4.4 + 0.055(125) = 11.28 (b) 4.3-11.28= -6.98 (c) -6.98/1.11= -6.288 (d) Yes
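The four steps can be sketched in Python. The function names are illustrative, and the outlier cutoff of 3 standard errors is the rule of thumb used elsewhere in these notes, not a universal standard.

```python
def fitted_defects(speed: float) -> float:
    """Fitted regression from the card: Defects = 4.4 + 0.055 * Speed."""
    return 4.4 + 0.055 * speed

def standardized_residual(actual: float, predicted: float, se: float) -> float:
    """Residual divided by the standard error of the regression."""
    return (actual - predicted) / se

predicted = fitted_defects(125)                  # (a) predicted defects
residual = 4.3 - predicted                       # (b) actual minus predicted
z = standardized_residual(4.3, predicted, 1.11)  # (c) standardized residual
is_outlier = abs(z) > 3                          # (d) rule-of-thumb cutoff
print(round(predicted, 3), round(residual, 3), round(z, 3), is_outlier)
```

Without intermediate rounding the standardized residual comes out near −6.28; the card's −6.288 rounds the prediction to 11.28 first. Either way the observation is an outlier.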
The fitted regression CarTheft = 1,636 − 38.6 MedianAge, where CarTheft is the number of car thefts per 100,000 people by state and MedianAge is the median age of the population. (a-1) If MedianAge = 1 year, then CarTheft = (a-2) Choose the correct statement. (b) If MedianAge = 40, then CarTheft = (c) Choose the right option.
(a-1) 1,636-38.6(1) = 1,597.4 (a-2) An increase in median age decreases car thefts. (b) 1,636-38.6(40)= 92 (c)The intercept would not be meaningful because you would not have a median age of zero for any state.
In a sample of n = 20, the critical value of the correlation coefficient for a two-tailed test at α = .05 is
±.444. Use rcrit = t.025/(t.025^2 + n − 2)^(1/2) = (2.101)/(2.101^2 + 20 − 2)^(1/2) = .4437 for d.f. = 20 − 2 = 18.
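A quick Python check of the critical-r formula, with the t critical value supplied from a table. The function name critical_r is an assumption for illustration.

```python
import math

def critical_r(t_crit: float, n: int) -> float:
    """Critical correlation: t / sqrt(t^2 + n - 2)."""
    if n <= 2:
        raise ValueError("n must exceed 2")
    return t_crit / math.sqrt(t_crit ** 2 + n - 2)

# n = 20, t.025 = 2.101 for d.f. = 18 (values from the card)
print(round(critical_r(2.101, 20), 3))
```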
The critical value for a two-tailed test of H0: β1 = 0 at α = .05 in a simple regression with 22 observations is
±2.086. d.f. = 22 − 2 = 20; look it up in the t table.
The regression equation Salary = 28,000 + 2700 YearsExperience + 1900 YearsCollege describes employee salaries at Ramjac Corporation. The standard error is 2400. Mary has 10 years' experience and 4 years of college. Her salary is $58,350. What is Mary's standardized residual (approximately)?
-1.771
What is the approximate slope of a linear trend for the value of Bob's beer can collection?
-10
In a sample of n = 36, the Student's t test statistic for a correlation of r = −.450 would be
-2.938. Use tcalc = r × sqrt[(n − 2)/(1 − r^2)].
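The test statistic for a correlation, tcalc = r × sqrt[(n − 2)/(1 − r^2)], can be sketched in Python as follows. The function name is illustrative.

```python
import math

def t_stat_for_correlation(r: float, n: int) -> float:
    """Student's t statistic for testing zero correlation, d.f. = n - 2."""
    if n <= 2 or abs(r) >= 1:
        raise ValueError("need n > 2 and -1 < r < 1")
    return r * math.sqrt((n - 2) / (1 - r ** 2))

print(round(t_stat_for_correlation(-0.450, 36), 3))  # n = 36, r = -.450
print(round(t_stat_for_correlation(0.500, 23), 3))   # n = 23, r = .500
```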
A fitted regression Profit = -570 + 30Sales was estimated from a random sample of 20 pharmacies. For a pharmacy with sales = 10, we predict that Profit will be
-270 profit = -570 + 30(10)
The compound growth rate in the fitted trend equation yt = 228e^(−.0982t) is
-9.82 percent
Simple Regression analysis means...
...that we only have one explanatory variable.
The two-tailed abortion correlation test statement that is NOT correct is...
...the first column of the table shows evidence of multicollinearity.
Find t.025 for a two-tailed test for zero correlation at α=0.05.
1. Calculate degrees of freedom: 5-2=3 2. Look it up in T-chart
Obtain the regression equation. (Example)
1. Open Excel and run Regression from Data Analysis. 2. Look at the intercept and X coefficients. 3. Form the equation (slope term first, then intercept). 4. Example: y = 4.5283x − 42,387.3735
A fitted regression for an exam in Prof. Hardtack's class showed Score = 20 + 7 Study, where Score is the student's exam score and Study is the student's study hours. The regression yielded R2 = 0.50 and SE = 8. Bob studied 9 hours. The quick 95 percent prediction interval for Bob's grade is approximately
67 to 99
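The "quick" interval is the prediction plus or minus two standard errors. A minimal Python sketch, with a made-up function name:

```python
def quick_prediction_interval(y_hat: float, se: float):
    """Approximate 95 percent prediction interval: y_hat ± 2 * se."""
    return y_hat - 2 * se, y_hat + 2 * se

score_hat = 20 + 7 * 9  # Bob studied 9 hours
low, high = quick_prediction_interval(score_hat, 8)
print(low, high)
```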
If yt = 544e^(0.07t), which forecast for period 7 is correct?
888
Part of a regression output is provided below. Some of the information has been omitted. The approximate value of F is
89.66
Chi-Square test-statistic
= (observed − expected) squared / expected + ... (summed over all cells)
Standard Deviation
= Square root of [the sum of (x minus x-bar) squared, divided by (n − 1)]
A local trucking company fitted a regression to relate the travel time (days) of its shipments as a function of the distance traveled (miles). The fitted regression is Time = −7.126 + 0.0214 Distance. If Distance increases by 50 miles, the expected Time would increase by
1.07 days
Find the slope of the simple regression of Y on X.
1.833. We do this by calculating b1 = (n·Σxy − Σx·Σy) / (n·Σx^2 − (Σx)^2).
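The sums-based slope formula, b1 = (n·Σxy − Σx·Σy) / (n·Σx^2 − (Σx)^2), can be sketched in Python. The data below are made up (the card's data are not reproduced in these notes), and the function name is an assumption.

```python
def ols_slope(x, y) -> float:
    """Least squares slope from raw sums."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("x values must not all be equal")
    return (n * sum_xy - sum_x * sum_y) / denom

# Hypothetical data lying exactly on y = 1 + 2x
x_demo = [1, 2, 3, 4]
y_demo = [3, 5, 7, 9]
print(ols_slope(x_demo, y_demo))
```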
If yt = 50e^(0.07t), which forecast for period 10 is correct?
100.7
If a fitted trend equation is yt = 227e^(−.098t), which is the forecast for period 5?
139
If a fitted trend equation is yt = 184e^(−.047t), which is the forecast for period 4?
152
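The exponential trend forecasts above all follow yt = a·e^(bt). A small Python sketch (the function name exp_trend_forecast is illustrative):

```python
import math

def exp_trend_forecast(a: float, b: float, t: float) -> float:
    """Forecast from a fitted exponential trend y_t = a * exp(b * t)."""
    return a * math.exp(b * t)

print(round(exp_trend_forecast(50, 0.07, 10), 1))   # period 10
print(round(exp_trend_forecast(227, -0.098, 5)))    # period 5
print(round(exp_trend_forecast(184, -0.047, 4)))    # period 4
```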
A fitted regression Profit = −570 + 30 Sales (all variables in thousands of dollars) was estimated from a random sample of pharmacies. From this regression, in order to break even (Profit ≥ 0), a pharmacy's Sales would have to be at least
19 (570/30 = 19, in thousands of dollars)
In a sample of n = 36, the critical value of Student's t for a two-tailed test of significance of the slope for a simple regression at α = 0.05 is
±2.032. From Appendix D, t.025 = ±2.032 for d.f. = n − 2 = 36 − 2 = 34.
In a multiple regression with 6 predictors in a sample of 67 U.S. cities, what would be the critical value for an F test of overall significance at α = 0.05?
2.25
In a sample of n = 23, the Student's t test statistic for a correlation of r = .500 would be
2.646
For these data, what is the three-period centered moving average for period 4?
33.00 or 34.00
Using exponential smoothing, if Ft = 33, yt = 41, and α = .20, what is the new forecast Ft+1?
34.6
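The smoothing update is Ft+1 = α·yt + (1 − α)·Ft. A minimal Python sketch with an illustrative function name:

```python
def next_forecast(f_t: float, y_t: float, alpha: float) -> float:
    """One step of simple exponential smoothing."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * y_t + (1 - alpha) * f_t

# Values from the card: Ft = 33, yt = 41, alpha = .20
print(round(next_forecast(33, 41, 0.20), 1))
```

A larger α moves the new forecast closer to the latest observation.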
A local trucking company fitted a regression to relate the travel time (days) of its shipments as a function of the distance traveled (miles). The fitted regression is Time = −7.126 + 0.0214 Distance, based on a sample of 20 shipments. The estimated standard error of the slope is 0.0053. Find the value of tcalc to test for zero slope.
4.04 (tcalc = slope / std. error = 0.0214/0.0053)
In a regression with 7 predictors and 62 observations, degrees of freedom for a t test for each coefficient would use how many degrees of freedom?
54
In a simple bivariate regression with 60 observations, there will be how many residuals?
60 residuals
Multiplicative models are most appropriate over longer periods of time or when the data magnitude is growing rapidly over time
TRUE
Concerning a seasonal index for monthly data, which statement is incorrect?
Additive indexes are adjusted so they always sum to 12.
Calculate the number of unusual residuals and number of outliers.
An observation with a standardized residual that is larger than 3 (in absolute value) is deemed by some to be an outlier. Find unusual residuals with your regression analysis.
A realtor is trying to predict the selling price of houses in Greenville (in thousands of dollars) as a function of Size (measured in thousands of square feet) and whether or not there is a fireplace (FP is 0 if there is no fireplace, 1 if there is a fireplace). The regression output is provided below. Some of the information has been omitted. Which of the following conclusions can be made based on the F-test?
At least one of the predictors is useful in explaining Y
Pearson's correlation coefficient (r) requires that both variables be interval or ratio data.
TRUE
Coefficient of Variation
Calculate mean and standard deviation. Then, CV=100x(standard deviation/mean)
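A short Python sketch of the coefficient of variation using the sample standard deviation (n − 1 divisor). The data are hypothetical and the function name is made up.

```python
import statistics

def coefficient_of_variation(data) -> float:
    """CV = 100 * (sample standard deviation / mean), in percent."""
    mean = statistics.mean(data)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * statistics.stdev(data) / mean

# Hypothetical sample: mean 12, sample standard deviation 2
print(round(coefficient_of_variation([10, 12, 14]), 2))
```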
If the residuals violate the assumption of normality, we expect that
Confidence intervals may be unreliable.
Which data would be measured over an interval of time as opposed to at a point in time?
Costco's sales for fiscal year 2017.
When the predictor units of measurement differ greatly in magnitude, which action might be useful?
Decimal transformations to improve data conditioning
Calculate the residual.
Residual = actual Y − predicted Y.
Calculate the standardized residual using se.
Divide the residual by the standard error: ei/se.
In a CHI-SQUARE test for independence, observed and expected frequencies must sum across to the same row totals and down to the same column totals.
Expected frequencies reallocate the row (or column) total, so they must sum to the total.
A sample correlation r = .40 indicates a stronger linear relationship than r = −.60.
FALSE
If there is a binary predictor (x=0,1) in the model, the residuals may not sum to zero.
FALSE
In the fitted regression Y = 12 + 3X1 − 5X2 + 27X3 + 2X4 the most significant predictor is X3.
FALSE
Multiplicative models are avoided in business because they are too complicated.
FALSE
The linear model's R2 may exceed the quadratic model's R2 fitted to the same data.
FALSE
The model Y = β0 + β1X + β2X^2 cannot be estimated by Excel because of the nonlinear term.
FALSE
The ordinary least squares method ensures that the residuals will be normally distributed.
FALSE
The shape of the fitted quadratic model yt = 324 − 42t − 1.3t^2 is declining at first, then rising.
FALSE
The smoothing constant α indicates the weight assigned to the most recent forecast.
FALSE
Unlike other predictors, a binary predictor has a t-value that is either 0 or 1.
FALSE
Using the least squares formulas, the regression line must pass through the origin.
FALSE
Positive autocorrelation results in too many centerline crossings in the residual plot over time.
FALSE
For a certain firm, the regression equation Bonus = 2,000 + 257 Experience + 0.046 Salary describes employee bonuses with a standard error of 125. John has 10 years' experience, earns $50,000, and earned a bonus of $7,000. John is an outlier.
FALSE. Explanation: John's standardized residual is (yactual − yestimated)/se = (7000 − 6870)/(125) = 1.04, which is not unusual.
Evans' Rule says that if n=50 you need at least 5 predictors to have a good model
FALSE (Evans rule is intended to prevent having too many predictors)
Confidence intervals for predicted Y are less precise when the residuals are very small
FALSE, Small residuals imply a small standard error and thus a narrower prediction interval.
In a chi-square goodness of fit test, a small p-value would indicate a good fit to the hypothesized distribution.
FALSE. (When the p-value is small, we are inclined to reject the hypothesized distribution)
In the model Sales = 268 + 7.37 Ads (both variables in dollars) an additional $1 spent on ads will increase sales by 7.37 percent.
FALSE. The slope coefficient is in the same units as Y (dollars, not percent)
The R2 statistic can only increase (or stay the same) when you add more predictors to a regression.
TRUE
A high variance inflation factor (VIF) indicates a significant predictor in the regression.
False
If SSE is near zero in a regression, the statistician will conclude that the proposed model probably has too poor a fit to be useful.
False
If a regression model's F test statistic is Fcalc = 43.82, we could say that the explained variance is approximately 44 percent.
False
In a regression model of student grades, we would code the nine categories of business courses taken (ACC, FIN, ECN, MGT, MKT, MIS, ORG, POM, QMM) by including nine binary (0 or 1) predictors in the regression.
False
The total sum of squares (SST) will never exceed the regression sum of squares (SSR).
False
There is one residual for each predictor in the regression model.
False
Plotting the residuals against a binary predictor (X = 0, 1) reveals nothing about heteroscedasticity.
False. Explanation: You can still spot wider or narrower spread at the two points X = 0 and X = 1.
When do you reject the null hypothesis of zero correlation?
If the absolute value of the t-value is greater than the critical value, you reject the null hypothesis. If the absolute value of the t-value is less than the critical value, you fail to reject the null hypothesis. ex. 3.419>3.182 --> reject the null hypothesis
Standard error of regression is based on squared deviations from the regression line.
In a simple regression, the standard error is the square root of the sum of the squared residuals divided by (n-2).
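A minimal Python sketch of the standard error calculation. The residuals are made up for illustration, and the k argument (number of predictors, default 1 for simple regression) is an assumption added so that the divisor is n − k − 1.

```python
import math

def regression_standard_error(residuals, k: int = 1) -> float:
    """sqrt(SSE / (n - k - 1)); with k = 1 the divisor is n - 2."""
    n = len(residuals)
    if n - k - 1 <= 0:
        raise ValueError("not enough observations")
    sse = sum(e ** 2 for e in residuals)
    return math.sqrt(sse / (n - k - 1))

# Hypothetical residuals (they sum to zero, as least squares residuals do)
print(round(regression_standard_error([1, -1, 2, -2]), 3))
```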
The coefficient of determination is the percentage of the total variation in the response variable Y that is explained by the predictor X.
TRUE
Fluctuations caused by strikes and floods are ______ fluctuations
Irregular fluctuations (unpredictable events in Y = T×C×S×I cannot be modeled as T, C, or S, so they must be I).
Which is a characteristic of an additive (as opposed to multiplicative) time-series model?
It is appropriate for short-term data with a steady dollar growth.
Characteristic of Standard Deviation
It is measured in the SAME UNITS AS THE MEAN. (Square the deviations around the mean, then take the square root to get back to the original units of X. Also, it IS affected by outliers and its interpretation may not be intuitive.)
Which of the following is not a characteristic of the F distribution?
Its degrees of freedom vary, depending on α
The regression equation Salary = 45,000 + 1500 YearsExperience + 2800 YearsCollege describes employee salaries at Terminus Fissile Labs. The standard error is 2500. Lars has 15 years' experience and 4 years of college. His salary is $70,500. If this regression is valid, we conclude that
Lars is underpaid and his salary is an outlier
Calculate R2
Look at R2 in your regression analysis on excel. or square r. Or: SSR/SST
What is the regression equation?
Look at photo.
Additive models ARE OR ARE NOT most appropriate over longer periods of time or when data magnitude grows rapidly over time.
NOT
Data IS or IS NOT omitted unless it is proven to be an error
NOT
Regression analysis can be used for forecasting monthly time-series data using a trend variable and 11 binary predictors (one for each month except omitting one month).
TRUE
The MAD measures the average absolute size of the forecast error.
TRUE
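The MAD can be sketched in Python as the mean of the absolute forecast errors. The series below are hypothetical and the function name is illustrative.

```python
def mean_absolute_deviation(actual, forecast) -> float:
    """Average absolute size of the forecast errors."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("need two non-empty series of equal length")
    errors = [abs(a - f) for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors)

# Hypothetical actual and forecast values; every error has absolute size 1
print(mean_absolute_deviation([10, 12, 14], [11, 11, 15]))
```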
If the trend model yt = a + bt + ct^2 is fitted to a time series, we would get
R2 that is at least as high as the linear model.
A researcher's results are shown below using Femlab (labor force participation rate among females) to try to predict Cancer (death rate per 100,000 population due to cancer) in the 50 U.S. states. (df: 1,48,49) (SS: 5377.836, 49367.389, 54745.225) (MS: 5377.836, 1028.487) (F: 5.228879)
R2= .0982
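R2 and F can both be recovered from the ANOVA sums of squares. A minimal Python sketch using the numbers on the card (function names are illustrative):

```python
def r_squared(ssr: float, sst: float) -> float:
    """Coefficient of determination: SSR / SST."""
    return ssr / sst

def f_statistic(ssr: float, sse: float, k: int, n: int) -> float:
    """F = MSR / MSE with d.f. = (k, n - k - 1)."""
    return (ssr / k) / (sse / (n - k - 1))

# ANOVA values from the card: SSR = 5377.836, SSE = 49367.389, SST = 54745.225
print(round(r_squared(5377.836, 54745.225), 4))
print(round(f_statistic(5377.836, 49367.389, k=1, n=50), 3))
```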
Using a sample of 63 observations, a dependent variable Y is regressed against two variables X1 and X2 to obtain the fitted regression equation Y = 76.40 − 6.388X1 + 0.870X2. The standard error of b1 is 3.453 and the standard error of b2 is 0.611. At α = .05, we could
Reject H0: β1 ≥ 0 and conclude H1: β1 < 0. [For β1 we have tcalc = (−6.388)/(3.453) = −1.850, which is less than −t.05 = −1.671 for d.f. = 60 in a left-tailed test. For β2 we have tcalc = (0.870)/(0.611) = +1.424, which does not exceed t.05 = +1.671 for d.f. = 60 in a right-tailed test. For a two-tailed test, t.025 = ±2.000, so neither coefficient would differ significantly from zero at α = .05. Evans' Rule is not violated, because n/k = 63/3 = 21.]
Which of the following measures of fit is unit free?
R^2 (Coefficient of determination)
Bivariate
Refers to the number of VARIABLES not observations
Cycles are usually ignored because there is no general theory to describe them.
TRUE
If Y1 = 216 and Y5 = 332, then the simple index number for period 5 is I5 = 153.7.
TRUE
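The simple index calculation is 100 × Yt / Y1. A minimal Python sketch with an illustrative function name:

```python
def simple_index(y_t: float, y_base: float) -> float:
    """Simple index number relative to the base period (base = 100)."""
    if y_base == 0:
        raise ValueError("base-period value must be nonzero")
    return 100 * y_t / y_base

# Y1 = 216 and Y5 = 332 from the card
print(round(simple_index(332, 216), 1))
```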
If a qualitative variable has c categories, we would only use c-1 binaries as predictors
TRUE
Which is not a correct way to find the coefficient of determination?
SSR/SSE
Suppose the estimated quadratic model yt = 500 + 20t − t^2 is the best-fitting trend of sales of XYZ Inc. using data for the past 20 years (t = 1, 2, . . . , 20). Which statement is incorrect?
Sales are increasing by about 20 units per year.
The four components of a time series are which of the following?
Seasonal, cycle, irregular, trend
Which of the following best describes the decomposition modeling approach to forecasting?
Series are separated into trend, seasonal, irregular, and cyclical components.
To find which predictors are most helpful in increasing R2, we might consider
Stepwise regression.
A binary predictor has the same t test as any other predictor.
TRUE
A firm's income statement contains data measured over a period of time, as opposed to being measured at a point in time.
TRUE
A multiple regression with 60 observations should not have 13 predictors.
TRUE
A regression with 60 observations and 5 predictors does not violate Evans' Rule.
TRUE
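A small Python check of Evans' Rule, assuming the common textbook form n/k ≥ 10 (at least ten observations per predictor). That threshold is an assumption here, since these notes never state the cutoff explicitly, and the function name is made up.

```python
def satisfies_evans_rule(n: int, k: int, min_ratio: float = 10) -> bool:
    """True if there are at least min_ratio observations per predictor."""
    if k <= 0:
        raise ValueError("need at least one predictor")
    return n / k >= min_ratio

print(satisfies_evans_rule(60, 5))   # 60/5 = 12 observations per predictor
print(satisfies_evans_rule(60, 13))  # 60/13 is under 5 observations per predictor
```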
A squared predictor is used to test for nonlinearity in the predictor's relationship to Y.
TRUE
Binary predictors shift the intercept of the fitted regression.
TRUE
The ill effects of heteroscedasticity might be mitigated by redefining totals (e.g., total number of homicides) as relative values (e.g., homicide rate per 100,000 population).
TRUE
The least squares regression line is obtained when the sum of the squared residuals is minimized.
TRUE
The quadratic model can never have more than one turning point (peaks or troughs).
TRUE
The random error term in a regression model reflects all factors omitted from the model.
TRUE
The shape of the fitted exponential model yt = 256e^(−.07t) is always declining.
TRUE
The t test shows the ratio of an estimated coefficient to its standard error.
TRUE
Using the first observed data value is a common way of initializing the forecasts in the exponential smoothing model.
TRUE
The poisson goodness of fit test is inappropriate for continuous data.
TRUE (poisson data are integers)
For a regression with 200 observations, we expect that about 10 residuals will exceed two standard errors.
TRUE (about 5 percent of the 200 residuals, i.e., 10, should fall outside ±2 standard errors)
Which statement is most nearly correct regarding time-series trend models?
The exponential model would be linear if we take the natural log of yt.
Bob thinks there is something wrong with Excel's fitted regression. What do you say?
The estimated equation is obviously incorrect.
If we fit a linear trend to data that are growing exponentially, which is least likely?
The fit will be poor to the most recent data.
If you rerun a regression, omitting a predictor X5, which would be unlikely?
The numerator Degrees of Freedom for the F test will increase
In regression, the dependent variable is referred to as the response variable.
True
The larger the absolute value of the t statistic of the slope in a simple linear regression, the stronger the linear relationship that exists between X and Y.
True
Which statement is correct for a simple index number?
The simple relative index for period t = 5 is calculated as Y5/Y1.
Calculate SSxx, SSyy, and SSxy.
The sum of all items in the column.
Which statement is most defensible regarding the time series shown below?
There appears to be strong seasonality.
Which is not an assumed property of the errors in a population regression model?
They have zero variance
Which is not a characteristic of a trend?
Trend is often due to mixing two batches of materials (mixing two processes might produce excessive variation but not steady change)
A different confidence interval exists for the mean value of Y for each different value of X.
True
A widening pattern of residuals as X increases would suggest heteroscedasticity.
True
Confidence intervals for Y may be unreliable when the residuals are not normally distributed.
True
Given that the fitted regression is Y = 76.40 − 6.388X1 + 0.870X2, the standard error of b1 is 1.453, and n = 63, at α = .05, we can conclude that X1 is a significant predictor of Y.
True
High leverage for an observation indicates that X is far from its mean.
True
If SSR is 1800 and SSE is 200, then R2 is .90.
True
If the probability plot of residuals resembles a straight line, the residuals show a fairly good fit to the normal distribution.
True
If the residuals in your regression are nonnormal, a larger sample size might help improve the reliability of confidence intervals for Y.
True
In a multiple regression with 3 predictors in a sample of 25 U.S. cities, we would use F3,21 in a test of overall significance.
True
Two-tailed t-tests are often used because any predictor that differs significantly from zero in a two-tailed test will also be significantly greater than zero or less than zero in a one-tailed test at the same α.
True
When using the least squares method, the column of residuals always sums to zero.
True
The correlation coefficient r always has the same sign as b1 in Y = b0 + b1X.
True. Explanation: The t-test for the slope in simple regression gives the same result as the t-test for r.
The fitted intercept in a regression has little meaning if no data values near x=0 have been observed
True (predicting Y for x=0 makes little sense if the observed data have no values near x=0)
Nonnormality of residuals is not usually considered a major problem unless there are outliers.
True. Explanation: Serious nonnormality can make the confidence intervals unreliable.
A regression of Y using four independent variables X1, X2, X3, X4 could also have up to four nonlinear terms (Xj^2) and six simple interaction terms (XjXk) if you have enough observations to justify them.
True. Explanation: We must count all the possible squares and two-way combinations of four predictors.
Calculate the sample correlation coefficient (r).
Use Excel's function =CORREL(array1, array2)
The fitted sales trend over the last 12 years is yt = 14.7e^(0.063t). We can say that
a continuously compounded model was used.
In a simple bivariate regression with 25 observations, which statement is most nearly correct?
a leverage statistic of 0.16 or more would indicate high leverage. Explanation: 2(k + 1)/n = 2(1 + 1)/25 = 4/25 = 0.16
Which is indicative of an inverse relationship between X and Y?
a negative correlation coefficient
A financial regression yielded a standard error of 12 dollars, so a residual of 23 dollars would be
a rather poor prediction.
A standardized residual equal to −2.205 indicates
a rather poor prediction.
independent variable
a variable (often denoted by x ) whose variation does not depend on that of another. The experimental factor that is manipulated; the variable whose effect is being studied. Ex. Median Income
dependent variable
a variable (often denoted by y ) whose value depends on that of another. The measurable effect, outcome, or response in which the research is interested. The outcome factor; the variable that may change in response to manipulations of the independent variable. Ex. Median Price
In a sample of size n = 23, a sample correlation of r = .400 provides sufficient evidence to conclude that the population correlation coefficient exceeds zero in a right-tailed test at
a=0.05 but not a=0.01
If the residuals violate the assumption of no autocorrelation, we know that they
are not independent
In a sample of size n = 36, a sample correlation of r = −.450 provides sufficient evidence to conclude that the population correlation coefficient differs significantly from zero in a two-tailed test at
both a=0.01 and a=0.05
A news network stated that a study had found a positive correlation between the number of children a worker has and his or her earnings last year. You may conclude that
causation is in serious doubt
=CORREL(Xdata, Ydata) is a measure of association that is unit-free
covariance depends on the units of measurement of x and y while the correlation coefficient is always on a scale from -1 to 1
Which trend would you choose to forecast the number of tractors sold in 2010?
either
The unexplained sum of squares measures variation in the dependent variable Y about the:
estimated Y values.
The fitted annual sales trend is Yt = 187.3e^(−.047t). On average, sales are
falling by a declining absolute amount each year.
Suppose you fit a (linear or nonlinear) trend regression to a monthly time series and discover that the R2 is only 18 percent. The best conclusion is that
fitting a seasonal component could raise the R2.
An observation with extreme values in one or more independent variables (predictors)
has increased influence on the estimated coefficients.
Multiple Regression has more than one... ?
independent variable (predictor)
Which trend would you choose to forecast the number of tractors sold in 2010?
linear
Calculate degrees of freedom
n-2
For a given set of values for x1, x2, . . . , xk the confidence interval for the conditional mean of Y is
narrower than the prediction interval for the individual Y value.
The ordinary least squares (OLS) method of estimation will minimize
neither the slope nor the intercept (it minimizes the sum of the squared residuals)
The error term εi in the population regression model yi = β0 + β1 xi + εi is assumed to be
normally distributed
Small Sample Generalization
one instance proves little
The backward elimination of stepwise regression
sometimes misses the best model for a given number of predictors
A fitted regression Profit = 262 + 1.51 Sales (all variables in thousands of dollars) was estimated from a random sample of 15 small coffee kiosks. We can say that
the intercept does not seem reasonable.
When comparing the 90 percent prediction and confidence intervals for a given regression analysis
the prediction interval is wider than the confidence interval
Multiplicative models are used when
data magnitudes are changing significantly over time
When homoscedasticity exists, we would expect that a plot of the residuals versus the fitted Y
will show no pattern at all.
interaction term
x1*x2 (the product of two predictors)
Which estimated multiple regression contains an interaction term?
y = 88 + 11x1 + 7x1x2 + 5x2
Which estimated multiple regression has nonlinearity tests?
y = −92 − 5x1 + 6x1^2 + 18x2 − 12x2^2