Ch. 12

A standardized residual that is outside the range ______ or _______ would be considered unusual or an outlier.

(-2, +2); (-3, +3)
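
As an illustration of the rule on this card, here is a minimal Python sketch that labels a standardized residual. The function name and the "typical" label are assumptions made for the example, not terms from the chapter.

```python
def classify_standardized_residual(z: float) -> str:
    """Label a standardized residual with the +/-2 and +/-3 rule of thumb."""
    if abs(z) > 3:
        return "outlier"   # outside (-3, +3)
    if abs(z) > 2:
        return "unusual"   # outside (-2, +2) but inside (-3, +3)
    return "typical"       # hypothetical label for everything else


for z in (0.4, -2.5, 3.2):
    print(z, classify_standardized_residual(z))
```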

Which of the following sample correlation coefficients shows the strongest association between X and Y?

-0.95

Given that SSxy = 492.45, SSxx = 234.5, b1 =

2.1
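
The answer follows from b1 = SSxy / SSxx. A short Python check of the arithmetic, with a hypothetical helper name:

```python
def slope_from_sums(ss_xy: float, ss_xx: float) -> float:
    """Least-squares slope: b1 = SSxy / SSxx."""
    if ss_xx == 0:
        raise ValueError("SSxx must be nonzero (the x values cannot all be equal)")
    return ss_xy / ss_xx


b1 = slope_from_sums(492.45, 234.5)
print(round(b1, 2))
```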

If the sample regression equation is found to be ŷ = 10 + 2x, what is the estimated value of Y if x = 5?

20

Given that b1 = 2.1, x̄ = 10 and ȳ = 65.8, b0 =

44.8
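
The intercept comes from b0 = ȳ − b1·x̄, and a fitted line predicts with ŷ = b0 + b1·x. The sketch below checks this card and the earlier prediction card (ŷ = 10 + 2x at x = 5); the function names are assumptions for illustration.

```python
def intercept_from_means(b1: float, x_bar: float, y_bar: float) -> float:
    """Least-squares intercept: b0 = y_bar - b1 * x_bar."""
    return y_bar - b1 * x_bar


def predict(b0: float, b1: float, x: float) -> float:
    """Fitted value from the sample regression equation: y_hat = b0 + b1 * x."""
    return b0 + b1 * x


b0 = intercept_from_means(b1=2.1, x_bar=10, y_bar=65.8)
print(round(b0, 1))          # intercept for this card
print(predict(10, 2, 5))     # the earlier card: y_hat = 10 + 2x at x = 5
```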

In a study, SST = 1,000, SSE = 200. Find the coefficient of determination.

80%

Fill in the missing symbols between the sums of squares to express the relationship: SST_____SSR_____SSE

=; +
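
Because SST = SSR + SSE, the coefficient of determination can be computed either as SSR/SST or as 1 − SSE/SST. A minimal sketch using the numbers from the card above (SST = 1,000, SSE = 200); the helper name is hypothetical.

```python
def r_squared(sst: float, sse: float) -> float:
    """Coefficient of determination: R^2 = SSR / SST, with SSR = SST - SSE."""
    if sst <= 0:
        raise ValueError("SST must be positive")
    ssr = sst - sse          # explained variation
    return ssr / sst


print(r_squared(sst=1000, sse=200))
```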

Given the following output from a regression analysis: ŷ = 125 + 7.4x, R² = .15, p-value = .453 for the zero slope test (α = .05), one would conclude

Approximately 15% of the variation in Y is explained by X. This is a poor fit because R2 is closer to zero than one. The slope is not significantly different from zero.

Which of the following is a use of the standard error of estimate in regression analysis?

As a goodness of fit measure.

True or false: If a relationship exists between a response variable Y and a predictor variable X it is appropriate to say that X causes variation in Y.

False

The Excel regression analysis is found in

Data > Data Analysis > Regression

If the trend line equation is ŷ = 15 + 5x, which of the following is the correct interpretation of 5?

For every unit increase in X, Y on the average will increase by 5 units.

In hypothesis tests about the population correlation coefficient, ρ, the null hypothesis for a two-tailed test is

H0: ρ = 0

In hypothesis tests about the population correlation coefficient, ρ, the alternative hypothesis for a two-tailed test is

H1: ρ ≠ 0

What are the steps to take to deal with unusual observations?

If a data point is an error, it may be discarded. Note any unusual observations when reporting results. Try to determine if the observation is influential.

How does the coefficient of determination help as a goodness of fit tool in regression analysis?

It gives the percentage of the variation in Y explained by the sample regression equation.

Which of the following choices describe the information we look for on a scatterplot?

Pattern of relationship; direction of relationship; presence of outliers

What type of relationship exists between X and Y if as X increases Y increases?

Positive

What type of linear relationship appears to exist between two variables if r_xy = 0.83?

Positive; strong

For which of the following scenarios would a simple regression model be appropriate?

Predicting units sold from advertising spending. Predicting salary from years of education. Predicting height from age.

To test for overall significance of a regression, we compare which two sums of squares?

SSR to SSE

The coefficient of determination or R² is calculated as

SSR/SST

What does SSR represent in regression analysis?

The amount of variation in Y that is explained.

What does SSE represent in regression analysis?

The amount of variation in Y that is left unexplained.

The three regression assumptions are

The errors are independent. The errors are normally distributed. The errors have constant variance.

If the trend line equation is ŷ = 15 + 5x, which of the following is the correct interpretation of 15?

The line crosses the y axis at y = 15

Given the following output from a regression analysis: ŷ = -54 + 2.3x, R² = .75, p-value = .003 for the zero slope test (α = .05), one would conclude

The slope is significantly different from zero. Approximately 75% of the variation in Y is explained by X. For each unit increase in X, Y increases by 2.3 units, on average.

The use of the standard error of regression, se, as a measure of goodness of fit of a model is best expressed by which of the following statements?

The smaller the value, the better the fit.
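
For simple regression the standard error is usually computed as se = sqrt(SSE / (n − 2)). The sketch below assumes that formula; the sample size n = 12 is a made-up value for illustration, since the cards do not give one.

```python
import math


def standard_error(sse: float, n: int) -> float:
    """Standard error of the regression: se = sqrt(SSE / (n - 2))."""
    if n <= 2:
        raise ValueError("need n > 2 observations for simple regression")
    return math.sqrt(sse / (n - 2))


# SSE = 200 comes from an earlier card; n = 12 is a hypothetical sample size.
se = standard_error(sse=200, n=12)
print(round(se, 3))
```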

True or false: Regression calculations are typically done on a computer because the calculations can be quite tedious and lengthy.

True

We can check for normality of errors by looking at

a histogram of residuals. a normplot of residuals.

A _____________ interval for Y, the response variable, predicts the mean of Y whereas a ____________ interval for Y predicts the individual value for Y.

confidence; prediction

High leverage residuals are of interest because they indicate a point that

could have a strong influence on the regression estimates.

If the error terms in a regression analysis are not independent we say the errors are

autocorrelated

The only data points that could possibly be discarded in a regression analysis are those that are

errors

Total variation in Y = Variation in Y ___________ by X + __________ variation

explained; unexplained

When using a simple regression equation for predicting a response variable, ____________ or predicting outside the range of observed x values, should be approached with caution.

extrapolation

In order to fit a regression line on an Excel scatterplot:

From the ribbon: click on Chart Tools > Layout > Trendline. Highlight the data points on the scatterplot, then right-click and choose Add Trendline.

If the error terms in a regression analysis do not have constant variance we say the errors are

heteroscedastic

Given the following regression equation: profit = $20,000 + $45 × advertising expenditure, we could conclude

If nothing is spent on advertising, the average profit will be $20,000. For each one-unit increase in advertising expenditure, we see a $45 increase in profit, on average.

With the population linear model: Y = β0 + β1X + ε, β0 represents the

intercept population parameter

The _________________ the value of the F statistic, the better the fit of the regression. But we still must compare F calc to a _______________ value for F in order to conclude statistical significance.

larger; critical
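
In simple regression the F statistic is the ratio of the mean squares, F = (SSR / 1) / (SSE / (n − 2)). A minimal sketch, assuming that formula and a hypothetical sample size of n = 12 with the SST = 1,000, SSE = 200 figures from an earlier card:

```python
def f_statistic(ssr: float, sse: float, n: int) -> float:
    """F for simple regression: MSR / MSE with 1 and n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need n > 2 observations")
    msr = ssr / 1            # one predictor
    mse = sse / (n - 2)
    return msr / mse


# SSR = SST - SSE = 1000 - 200; n = 12 is an assumed sample size.
f_calc = f_statistic(ssr=800, sse=200, n=12)
print(f_calc)
```

The result still has to be compared with a critical F value (1 and n − 2 degrees of freedom) from a table or software before concluding significance.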

A residual that has a high leverage statistic is a point that is far from the _____________ of the x variable.

mean
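
One common way to quantify this in simple regression is the leverage h_i = 1/n + (x_i − x̄)² / SSxx, which grows as x_i moves away from the mean of x. The sketch below assumes that formula; the x values are invented for illustration.

```python
def leverages(xs):
    """Leverage of each observation: h_i = 1/n + (x_i - x_bar)^2 / SSxx."""
    n = len(xs)
    x_bar = sum(xs) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    if ss_xx == 0:
        raise ValueError("x values must not all be equal")
    return [1.0 / n + (x - x_bar) ** 2 / ss_xx for x in xs]


xs = [1, 2, 3, 4, 20]        # hypothetical data; 20 is far from the mean
h = leverages(xs)
for x, h_i in zip(xs, h):
    print(x, round(h_i, 3))
```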

The confidence interval for Y will be ___ prediction interval for Y.

narrower than the
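
The reason is an extra "1 +" under the square root in the prediction interval. Using the usual simple-regression formulas, the half-widths are t·se·sqrt(1/n + (x − x̄)²/SSxx) for the mean of Y and t·se·sqrt(1 + 1/n + (x − x̄)²/SSxx) for an individual Y. The sketch below assumes those formulas; every input value is made up for illustration.

```python
import math


def interval_half_widths(t_crit, se, n, x, x_bar, ss_xx):
    """Return (confidence, prediction) half-widths for Y at a given x."""
    core = 1.0 / n + (x - x_bar) ** 2 / ss_xx
    ci = t_crit * se * math.sqrt(core)          # interval for the mean of Y
    pi = t_crit * se * math.sqrt(1.0 + core)    # interval for an individual Y
    return ci, pi


# Hypothetical inputs: t_crit would come from a t table with n - 2 df.
ci, pi = interval_half_widths(t_crit=2.228, se=4.5, n=12, x=8, x_bar=10, ss_xx=234.5)
print(round(ci, 2), round(pi, 2))
```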

We can check for dependent errors by

plotting the residuals in sequence and looking for a nonrandom trend.

The formula for calculating the two-tailed critical value of r, the sample correlation coefficient is:

$r_{\text{critical}} = \dfrac{t_{\alpha/2}}{\sqrt{t_{\alpha/2}^{2} + n - 2}}$
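
A minimal sketch of this formula in Python. The critical t value is passed in rather than computed, since it would normally come from a t table with n − 2 degrees of freedom; the example inputs (n = 25, t = 2.069) are assumptions for illustration.

```python
import math


def r_critical(t_crit: float, n: int) -> float:
    """Two-tailed critical r: t / sqrt(t^2 + n - 2), with t on n - 2 df."""
    if n <= 2:
        raise ValueError("need n > 2 observations")
    return t_crit / math.sqrt(t_crit ** 2 + n - 2)


# Assumed inputs: n = 25, so df = 23; 2.069 is the tabled t for alpha = .05.
print(round(r_critical(2.069, 25), 3))
```

A sample correlation whose absolute value exceeds this number would lead to rejecting H0: ρ = 0.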

We test the three regression assumptions using the

residuals.

Both the standard error of the slope estimate and the standard error of the intercept estimate are calculated using

se

In statistics, a straight-line model of the relationship between two variables, X and Y, is called a

simple regression equation

With the population linear model: Y = β0 + β1X + ε, β1 represents the

slope population parameter

Simple regression describes the relationship between two variables, X and Y, using the ___________ and _______________ form of a linear equation

slope; intercept

If the fit of the regression line is good the value of SSE will be relatively ________________ compared to SST.

smaller

We calculate a ________________ residual in order to spot unusual or outlier residual values.

standardized

Confidence intervals for the slope and intercept of a simple regression line are calculated using the

t statistic

If the error terms in a regression analysis are not normally distributed

the confidence intervals for the parameters could be untrustworthy.

Non-normality of errors is considered a mild violation unless

the data has major outliers.

Non-constant error variance is considered a serious violation because

the significance of the regression could be overstated.

The test for zero slope will give the same result as the test for zero correlation because

the tcalc value will be the same in both tests.
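
This can be checked numerically. The sketch below uses a small made-up data set and the usual formulas t = b1 / s_b1 (slope test) and t = r·sqrt(n − 2) / sqrt(1 − r²) (correlation test); the variable names are assumptions for illustration.

```python
import math

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Zero-slope test: t = b1 / s_b1, where s_b1 = se / sqrt(SSxx).
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
se = math.sqrt(sse / (n - 2))
t_slope = b1 / (se / math.sqrt(ss_xx))

# Zero-correlation test: t = r * sqrt(n - 2) / sqrt(1 - r^2).
r = ss_xy / math.sqrt(ss_xx * ss_yy)
t_corr = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(round(t_slope, 4), round(t_corr, 4))
```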

Dependent errors are often found in

time-series data.

True or false: Once your data have been plotted on an Excel scatterplot you can fit a regression line to the points using the Trendline option.

True

To calculate the critical value of the correlation coefficient one will need to find the critical t statistic using n - _____________ degrees of freedom

two

A high residual value means the observation is far from the regression line in the ____________ direction

vertical

Given that the 95% confidence interval for β1 is (-34.5, 26.8) we can say that because the interval contains _____________ it is possible that the slope is 0.

zero

The test for zero correlation is the same as the test for

zero slope

