Applied stat exam 1
If x and y in a regression model are totally unrelated, _______. (on previous test)
the coefficient of determination would be 0
The assumptions underlying simple regression analysis include _______
the error terms are independent
The coefficient of determination is the proportion of variability of the dependent variable (y) accounted for or explained by the independent variable (x).
true
The difference between the actual y value and the predicted y value found using a regression equation is called the residual.
true
The process of constructing a mathematical model or function that can be used to predict or determine one variable by another variable is called regression analysis.
true
To determine whether the overall regression model is significant, the F-test is used.
true
According to the following graphic, X and Y have ______. (on previous test)
virtually no correlation
Annie's regression model can be written as: ____ (on previous test)
y = -.14156 + .105195x
Which of the following equations represents a linear relationship between y and x? (Note: m and b are constants.)
y = m x + b
If the standard error of the estimate for a regression model fitted to a large number of paired observations is 1.75, approximately 68% of the residuals would lie within ______
−1.75 and +1.75
If x and y in a regression model are totally unrelated, ____
the coefficient of determination would be 0
Louis Katz, a cost accountant at Papalote Plastics, Inc. (PPI), is analyzing the manufacturing costs of a molded plastic telephone handset produced by PPI. Louis's independent variable is production lot size (in 1,000's of units), and his dependent variable is the total cost of the lot (in $100's). Regression analysis of the data yielded the following tables. For a lot size of 10,000 handsets, Louis' model predicts total cost will be __
$757.60 (substitute x = 10 into ŷ = b0 + b1·x, then multiply by 100 because cost is in $100's)
One of the assumptions made in simple regression is that _________
the model is linear; the error terms have constant variance; the error terms are independent; the error terms are normally distributed
The coefficient of correlation in a simple regression analysis is r = −0.6. The coefficient of determination for this regression would be ___
.36
Louis Katz, a cost accountant at Papalote Plastics, Inc. (PPI), is analyzing the manufacturing costs of a molded plastic telephone handset produced by PPI. Louis's independent variable is production lot size (in 1,000's of units), and his dependent variable is the total cost of the lot (in $100's). Regression analysis of the data yielded the following tables. The correlation coefficient between Louis's variables is ______________
.73 (r = √R², taking the sign of the slope coefficient)
In the regression equation, y = 49.56 + 0.97x, the slope is ______
.97
Suppose you fit a least squares line to 25 data points and the calculated value of SSE is 8.56. What is the estimate of the variance (σ²) of the random error term?
0.372
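The 0.372 comes from dividing SSE by its degrees of freedom, which is n − 2 in simple regression. A minimal Python sketch of that arithmetic (the function name is illustrative, not from the course material):

```python
import math

def error_variance(sse: float, n: int) -> float:
    """Estimate sigma^2 in simple regression as SSE / (n - 2)."""
    if n <= 2:
        raise ValueError("need more than 2 observations")
    return sse / (n - 2)

s2 = error_variance(8.56, 25)   # 8.56 / 23
se = math.sqrt(s2)              # standard error of the estimate
print(round(s2, 3))             # 0.372
```

The square root of this variance estimate is the standard error of the estimate.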
In a regression analysis, if SST = 200 and SSR = 200, r² = _______
1.00
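As a quick check of the answer, r² is the share of the total sum of squares that the regression explains. A short sketch, with a hypothetical helper name:

```python
def r_squared(ssr: float, sst: float) -> float:
    """Coefficient of determination: explained variation / total variation."""
    return ssr / sst

ssr, sst = 200.0, 200.0
sse = sst - ssr                 # nothing is left unexplained
print(r_squared(ssr, sst))      # 1.0
```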
Louis Katz, a cost accountant at Papalote Plastics, Inc. (PPI), is analyzing the manufacturing costs of a molded plastic telephone handset produced by PPI. Louis's independent variable is production lot size (in 1,000's of units), and his dependent variable is the total cost of the lot (in $100's). Regression analysis of the data yielded the following tables. Louis's sample size (n) is ________________.
13 (n = total degrees of freedom in the ANOVA table + 1)
A researcher has developed a regression model from fourteen pairs of data points. He wants to test if the slope is significantly different from zero. He uses a two-tailed test and α = 0.01. The critical table t value is ____
3.055 (df = n − 2 = 12, with α/2 = 0.005 in each tail)
The following regression model was fitted to sample data with 12 observations: ŷ = 30 + 4.50x. What is the change in the predicted value of y for a unit change in the value of x?
4.50
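The slope is the change in ŷ per one-unit change in x, whatever the starting x. A small sketch using the equation from the question (the choice of x = 10 is arbitrary):

```python
def predict(x: float, b0: float = 30.0, b1: float = 4.50) -> float:
    """Predicted y from the fitted line y-hat = b0 + b1 * x."""
    return b0 + b1 * x

change = predict(11) - predict(10)   # one-unit increase in x
print(change)                        # 4.5
```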
Annie's sample size is ______. (on previous test)
6
If a B-school wants to predict or forecast the graduating GPA of entering freshmen (an unknown variable of interest), it might look at, among other things, their SAT scores (a known variable). Which of the following statements does not represent this situation?
GPA is a fixed variable and does not vary among graduating students
If a B-school wants to predict or forecast the graduating GPA of entering freshmen (an unknown variable of interest), it might look at, among other things, their SAT scores (a known variable). Which of the following statements does NOT represent this situation? (on previous test)
GPA is a fixed variable and does not vary among graduating students
If the plot of the residuals is cone shaped, which assumption is violated? (on previous test)
Homoscedasticity
Which of the following statement(s) is/are correct? i. The width of the prediction interval for the predicted value of Y is dependent on the standard error of the estimate, the value of X for which the prediction is being made, and the sample size. ii. The range of the prediction interval is always wider than the range of the confidence interval for the same X value. iii. Confidence interval is an estimate of a single value of Y for a given X. (on previous test)
i and ii
Which of the following statements are right? i. The first step in simple regression analysis usually is to construct a scatter plot. ii. In simple regression analysis, the error terms are assumed to be linear, independent, and normally distributed with zero mean and constant variance. iii. The proportion of variability of the dependent variable (y) accounted for or explained by the independent variable (x) is called the coefficient of correlation. (on previous test)
i and ii
Which of the following descriptions is WRONG? (on previous test)
If the correlation coefficient between two variables is -1, it means that the two variables are not related
Which of the following descriptions is WRONG? (on previous test)
It is best to use Pearson correlation to measure the strength of the non-linear association between two variables
What is the estimated mean change in the dependent variable if the independent variable goes up 1 unit? (on previous test)
It is greater than 0.15 but less than or equal to 0.22
Using alpha = .05, Annie should_____. (on previous test)
reject H0: β1 = 0
The least squares method minimizes which of the following? (on previous test)
SSE
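To see what "minimizes SSE" means in practice, here is a minimal sketch that fits a line with the textbook least squares formulas and then checks that nudging either coefficient makes SSE larger. The data points and function names are made up for illustration.

```python
def fit_line(xs, ys):
    """Least squares estimates: b1 = Sxy / Sxx, b0 = y_bar - b1 * x_bar."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def sse(xs, ys, b0, b1):
    """Sum of squared residuals (actual y minus predicted y)."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, not from any exam question
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)
best = sse(xs, ys, b0, b1)

# Every perturbed line should have a larger SSE than the fitted one
for db0, db1 in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05)]:
    print(sse(xs, ys, b0 + db0, b1 + db1) > best)
```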
A standard deviation of the error of the regression model is called the _____. (on previous test)
Standard error of estimate
Which of the following statements is WRONG? (on previous test)
The degrees of freedom in regression are equal to the number of observations
Which of the following is not an assumption of the regression model?
The error terms decrease as x values increase
Which of the following descriptions is WRONG? (on previous test)
The standard error of the estimate, denoted Se, is the square root of the sum of the squares of the vertical distance between the actual Y values and the predicted values of Y
Suppose we are making predictions of the dependent variable y for specific values of the independent variable x using a simple linear regression model holding the confidence level constant. Let Width (C.I) = the width of the confidence interval for the average value y for a given value of x, and Width (P.I) = the width of the prediction interval for a single value y for a given value of x. Which of the following statements is true?
Width (C.I) < Width (P.I)
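The inequality follows from the standard interval formulas: the prediction interval has an extra "1 +" under the square root for the scatter of a single observation. A hedged sketch, where the x-values, the standard error, and the critical t value (2.447, the table value for 95% with 6 degrees of freedom) are all assumptions chosen for illustration:

```python
import math

def interval_half_widths(xs, x0, se, t_crit):
    """Half-widths of the CI for the mean of y and the PI for a single y at x0."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
    ci = t_crit * se * math.sqrt(leverage)        # mean response
    pi = t_crit * se * math.sqrt(1 + leverage)    # single new observation
    return ci, pi

xs = [1, 2, 3, 4, 5, 6, 7, 8]   # hypothetical x-values, n = 8
se = 1.75                       # hypothetical standard error of the estimate
t_crit = 2.447                  # t table value, 95% two-sided, df = n - 2 = 6

ci_mid, pi_mid = interval_half_widths(xs, 4.5, se, t_crit)   # at the mean of x
ci_far, pi_far = interval_half_widths(xs, 12.0, se, t_crit)  # outside the data
print(ci_mid < pi_mid, ci_far < pi_far)
print(ci_far > ci_mid, pi_far > pi_mid)
```

Both intervals also widen as x0 moves away from the mean of x, which is why extrapolation is less reliable.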
The following residuals plot indicates_____. (on previous test)
a nonlinear relation
A graphical tool to illustrate the relationship between matched observations of two variables (e.g. y = the annual cost of operating a commercial airliner and x = the annual number of passengers served by the airline), is _____. (on previous test)
a scatter plot
The proportion of variability of the dependent variable accounted for or explained by the independent variable is called the _____
coefficient of determination
A quality manager is developing a regression model to predict the total number of defects as a function of the day of the week the item is produced. Production runs are done 10 hours a day, 7 days a week. The explanatory (independent) variable is _____. (on previous test)
day of the week
In a simple regression the coefficient of correlation is the square root of the coefficient of determination.
false, only in magnitude: r takes the sign of the slope, so it can be the negative square root
In the simple regression model, y = 21 − 5x, if the coefficient of determination is 0.81, we can say that the coefficient of correlation between y and x is 0.90.
false, the slope is negative, so r = −0.90
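The catch in this question is the sign: the magnitude of r is √r², but r carries the sign of the slope. A small sketch using the numbers from the question (the helper name is illustrative):

```python
import math

def correlation_from_r2(r2: float, slope: float) -> float:
    """Return r with magnitude sqrt(r2) and the sign of the fitted slope."""
    return math.copysign(math.sqrt(r2), slope)

r = correlation_from_r2(0.81, -5)   # y = 21 - 5x, r^2 = 0.81
print(round(r, 2))                  # -0.9
```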
The strength of a linear relationship in simple linear regression changes if the units of the data are converted, say from feet to inches.
false
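Correlation is unit-free, so rescaling x by a positive constant leaves r unchanged. A minimal sketch with made-up measurements, computing Pearson's r by hand:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: x measured in feet, then converted to inches
feet = [1.0, 2.0, 3.5, 5.0, 6.5]
ys = [2.0, 2.9, 4.8, 6.1, 8.2]
inches = [12 * f for f in feet]

r_feet = pearson_r(feet, ys)
r_inches = pearson_r(inches, ys)
print(math.isclose(r_feet, r_inches))
```

The slope does change with the units (it is divided by 12 here), but the strength of the relationship does not.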
The variability in the estimated slope is smaller when the x-values are more spread out.
true, the variance of the estimated slope is σ² / Σ(x − x̄)², which shrinks as the x-values spread out
Regression output from Excel software directly shows the regression equation. (on previous test)
false, Excel lists the coefficients in a table without writing out the equation (Minitab output possibly does)
Prediction intervals get narrower as we extrapolate outside the range of the data.
false, intervals get wider
If the correlation coefficient between two variables is -1, it means that the two variables are not related.
false, it is a perfect negative correlation
The F-value to test the overall significance of a regression model is computed by dividing the sum of squares regression (SSreg) by the sum of squares error (SSerr).
false, F is the ratio of the mean squares (MSreg / MSerr); each sum of squares is first divided by its degrees of freedom
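The distinction is between sums of squares and mean squares: each sum of squares is divided by its degrees of freedom before the ratio is taken. A hedged sketch with hypothetical ANOVA numbers (k is the number of predictors, 1 in simple regression):

```python
def f_statistic(ssr: float, sse: float, n: int, k: int = 1) -> float:
    """Overall F = MSR / MSE, with df = k for regression and n - k - 1 for error."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return msr / mse

# Hypothetical sums of squares from a sample of 12 observations
ssr, sse, n = 150.0, 50.0, 12

f = f_statistic(ssr, sse, n)   # (150 / 1) / (50 / 10)
naive = ssr / sse              # the incorrect ratio from the statement
print(f, naive)                # 30.0 3.0
```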
The proportion of variability of the dependent variable (y) accounted for or explained by the independent variable (x) is called the coefficient of correlation.
false, it is the coefficient of DETERMINATION
In regression, the predictor variable is called the dependent variable.
false, not dependent but INdependent
Data points that lie apart from the rest of the points are called deviants.
false, not deviants but Outliers
One of the assumptions of simple regression analysis is that the error terms are exponentially distributed
false, not exponentially distributed but NORMALLY distributed
In regression, the variable that is being predicted is usually referred to as the independent variable.
false, not independent but dependent
The standard error of the estimate, denoted se, is the square root of the sum of the squares of the vertical distances between the actual Y values and the predicted values of Y.
false, the sum of the squared vertical distances is the SSE; se = √(SSE / (n − 2))
The range of admissible values for the coefficient of determination is −1 to +1.
false, range is 0 to 1
The slope of the regression line, y = 21 − 5x, is 21.
false, slope is -5
The slope of the regression line, y = 21 − 5x, is 5.
false, slope is -5
A cost accountant is developing a regression model to predict the total cost of producing a batch of printed circuit boards as a linear function of batch size (the number of boards produced in one lot or batch). The intercept of this model is the ______.
fixed cost
A hospital administrator developed a regression line, y = 30 + 2x, to predict the number of full-time employees (FTE) needed using the number of beds. The slope of this regression line suggests this: ___ (on previous test)
for a unit increase in the number of beds, the number of FTEs is predicted to increase by 2
A hospital administrator developed a regression line, y = 30 + 2x, to predict y = the number of full-time employees (FTE) needed using x = the number of beds. The slope of this regression line suggests this: ________
for a unit increase in the number of beds, the number of FTEs is predicted to increase by 2
If the error variances of a regression model are not constant, it is called ____
heteroscedasticity
The assumption of constant error variance in regression analysis is called ___
homoscedasticity
Which of the following statements are wrong? i. Regression output from Excel software directly shows the ANOVA table and the regression equation. ii. In order to obtain the best model to explain the relationship between variables, we can use regression analysis, which assumes the least sum of the error. iii. The greater the SSE, the greater the standard error of the estimate is. (on previous test)
i and ii
Which of the following statements are right? i. The proportion of variability of the dependent variable (y) accounted for or explained by the independent variable (x) is called the coefficient of correlation. ii. In order to obtain the best linear unbiased estimator, the error terms are assumed to be normally distributed, independent, linear, and homoscedastic. iii. The direction of a scatter plot between two variables shows the sign of the correlation coefficient. (on previous test)
ii and iii
For a certain data set the regression equation is y = 29 - 5x. The correlation coefficient between y and x in this data set _______.
is negative
For a certain data set the regression equation is y = 29 − 5x. The correlation coefficient between y and x in this data set _____. (on previous test)
is negative
The values of b0 and b1 in a regression equation are determined using sample data through a process called ______
least squares analysis
A value of -1 for the coefficient of correlation between two variables means that the two variables are ___
perfectly related
For the following scatter plot and regression line, at x = 48 the residual is ____. (on previous test)
positive
If there is positive correlation between two sets of numbers, then _______
r > 0
If there is perfect negative correlation between two sets of numbers, then _______. (on previous test)
r = -1
Louis Katz, a cost accountant at Papalote Plastics, Inc. (PPI), is analyzing the manufacturing costs of a molded plastic telephone handset produced by PPI. Louis's independent variable is production lot size (in 1,000's of units), and his dependent variable is the total cost of the lot (in $100's). Regression analysis of the data yielded the following tables. Using α = 0.05, Louis should _____
reject H0: β1 = 0
The least squares regression line is the one that _____
results in the smallest sum of errors squared
A standard deviation of the error of the regression model is called the _____
standard error of the estimate
The total of the squared residuals is called the __
sum of squares of error
A t-test is used to determine whether the coefficients of the regression model are significantly different from zero.
true
Correlation is a measure of the degree of linear relationship between two variables.
true
For the regression line, y = 21 − 5x, 21 is the y-intercept of the line.
true
Given x, a 95% prediction interval for a single value of y is always wider than a 95% confidence interval for the average value of y.
true
In simple regression analysis the error terms are assumed to be independent and normally distributed with zero mean and constant variance.
true
One of the major uses of residual analysis is to test some of the assumptions underlying regression.
true
Regression output from Excel software includes an ANOVA table. (on previous test)
true