RA 6&7

The least-squares line always passes through which point?

(x̄, ȳ)

The formula for a level 100(1 - α)% prediction interval for β0 + β1x is given by ______.

$(\hat\beta_0 + \hat\beta_1 x) \pm t_{n-2,\,\alpha/2}\; s\sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$

Six blueberry shrubs were planted in a garden and then fertilized and watered for two years. Monthly fertilizer amounts (in milliliters) and shrub heights at two years (in centimeters) were recorded as follows:
  Fertilizer (mL)   0   5  10  15  20  25
  Height (cm)      39  49  57  57  65  66
Here is the output from software used to find the least-squares line for predicting height from fertilizer:
               Coef  SE Coef       T      P
  Predictor   1.046    0.153   6.825  0.002
  Constant   42.429    2.320  18.292  0.000
Find a 99% confidence interval for the slope β1, which is the increase in height for an increase in fertilizer of 1 milliliter.

1.046 ± (4.604)(0.153)
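
As a rough check of this interval, the sketch below recomputes the slope, its standard error, and the 99% interval from the raw data. The variable names are illustrative, and the critical value 4.604 is copied from the answer above rather than computed.

```python
import math

# Blueberry data from the question above.
fertilizer = [0, 5, 10, 15, 20, 25]   # x, milliliters per month
height = [39, 49, 57, 57, 65, 66]     # y, centimeters at two years

n = len(fertilizer)
x_bar = sum(fertilizer) / n
y_bar = sum(height) / n
sxx = sum((x - x_bar) ** 2 for x in fertilizer)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(fertilizer, height))

slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual standard deviation s, using n - 2 degrees of freedom.
sse = sum((y - (intercept + slope * x)) ** 2
          for x, y in zip(fertilizer, height))
s = math.sqrt(sse / (n - 2))
se_slope = s / math.sqrt(sxx)

# Assumption: t critical value for 4 df and alpha/2 = 0.005, taken from the card.
T_CRIT_99 = 4.604
ci_low = slope - T_CRIT_99 * se_slope
ci_high = slope + T_CRIT_99 * se_slope

print(f"slope = {slope:.3f}, SE = {se_slope:.3f}")
print(f"99% CI for the slope: ({ci_low:.3f}, {ci_high:.3f})")
```

The interval comes out at roughly 0.34 to 1.75, so it excludes 0 even at the 99% level.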

Which of the following is a correct formula for r, the correlation coefficient between x and y?

$r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)$
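
A minimal sketch of this formula, applied to the blueberry data from the earlier card. The function name `correlation` is a choice made for this illustration; the sample standard deviations use the divisor n - 1, matching the 1/(n - 1) factor in the formula.

```python
import math

def correlation(xs, ys):
    """Sample correlation r as the sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

fertilizer = [0, 5, 10, 15, 20, 25]
height = [39, 49, 57, 57, 65, 66]
r = correlation(fertilizer, height)
print(f"r = {r:.3f}")
```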

Six blueberry shrubs were planted in a garden and then fertilized and watered for two years. Monthly fertilizer amounts (in milliliters) and shrub heights at two years (in centimeters) were recorded as follows:
  Fertilizer (mL)   0   5  10  15  20  25
  Height (cm)      39  49  57  57  65  66
Predicting the height from the fertilizer amount via the least-squares line would be unreliable for which fertilizer amount?

35

Six blueberry shrubs were planted in a garden and then fertilized and watered for two years. Monthly fertilizer amounts (in milliliters) and shrub heights at two years (in centimeters) were recorded as follows:
  Fertilizer (mL)   0   5  10  15  20  25
  Height (cm)      39  49  57  57  65  66
Here is the output from software used to find the least-squares line for predicting height from fertilizer:
               Coef  SE Coef       T      P
  Predictor   1.046    0.153   6.825  0.002
  Constant   42.429    2.320  18.292  0.000
Given that sŷ is 1.311, find a 95% confidence interval for the true mean response height for a monthly fertilizer amount of 12 milliliters.

42.429 + (1.046)(12) ± (2.776)(1.311)

Given that spred = 3.463, find a 95% prediction interval for the height for a new shrub whose monthly fertilizer is 12 milliliters.

42.429 + (1.046)(12) ± (2.776)(3.463)
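
The two intervals above differ only in the standard deviation they use. The sketch below, with illustrative variable names, recomputes both standard deviations from the raw data and builds the two intervals at x = 12. The critical value 2.776 is copied from the answers rather than computed.

```python
import math

fertilizer = [0, 5, 10, 15, 20, 25]
height = [39, 49, 57, 57, 65, 66]
x_new = 12
T_CRIT_95 = 2.776  # assumption: t for 4 df, alpha/2 = 0.025, from the cards

n = len(fertilizer)
x_bar = sum(fertilizer) / n
y_bar = sum(height) / n
sxx = sum((x - x_bar) ** 2 for x in fertilizer)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(fertilizer, height))
slope = sxy / sxx
intercept = y_bar - slope * x_bar
sse = sum((y - (intercept + slope * x)) ** 2
          for x, y in zip(fertilizer, height))
s = math.sqrt(sse / (n - 2))

y_hat = intercept + slope * x_new
spread_term = 1 / n + (x_new - x_bar) ** 2 / sxx
s_yhat = s * math.sqrt(spread_term)       # for the mean response
s_pred = s * math.sqrt(1 + spread_term)   # for one new observation

conf_int = (y_hat - T_CRIT_95 * s_yhat, y_hat + T_CRIT_95 * s_yhat)
pred_int = (y_hat - T_CRIT_95 * s_pred, y_hat + T_CRIT_95 * s_pred)

print(f"s_yhat = {s_yhat:.3f}, s_pred = {s_pred:.3f}")
print(f"95% CI for mean height: ({conf_int[0]:.2f}, {conf_int[1]:.2f})")
print(f"95% PI for a new shrub: ({pred_int[0]:.2f}, {pred_int[1]:.2f})")
```

The extra 1 under the square root is what makes the prediction interval wider than the confidence interval at the same x.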

Which statements about confidence bands and prediction bands are correct?

A confidence band consists of two curves, one joining the upper bounds of confidence intervals for many x, and the other joining the lower bounds. A prediction band consists of two curves, one joining the upper bounds of prediction intervals for many x, and the other joining the lower bounds.

The correlation coefficient should be used for only which one of the data sets described below?

A scatterplot of weight of cars versus gas mileage shows an approximately linear relationship.

Suppose two variables x and y are positively correlated. Which of the following statements are true?

An increase in x is associated with an increase in y. There may be a third confounding variable z that is the cause of both x and y. A decrease in x is associated with a decrease in y.

Which of the following is a numerical measure of the strength of the linear relationship between two variables?

Correlation coefficient

Select all that apply: The least-squares line should be used to summarize which kinds of data?

Data that show a strong linear trend, but are scattered about the line. Data that fit a line tightly.

Which of the following operations leave the correlation coefficient r between x and y unchanged?

Dividing each x by a positive constant. Interchanging the values of x and y. Multiplying each x by a positive constant. Adding a constant to each x.
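
These invariances can be checked numerically. The sketch below uses the blueberry data and an illustrative `correlation` helper. The constants 3, 4, and 7 are arbitrary, and the final negative multiplier is added only as a contrast case.

```python
import math

def correlation(xs, ys):
    """Sample correlation coefficient r of paired data."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [0, 5, 10, 15, 20, 25]
ys = [39, 49, 57, 57, 65, 66]
r = correlation(xs, ys)

unchanged = {
    "multiply x by 3": correlation([3 * x for x in xs], ys),
    "divide x by 4": correlation([x / 4 for x in xs], ys),
    "add 7 to x": correlation([x + 7 for x in xs], ys),
    "interchange x and y": correlation(ys, xs),
}
for label, value in unchanged.items():
    print(f"{label}: r = {value:.6f} (original {r:.6f})")

# Contrast: a negative multiplier flips the sign of r.
flipped = correlation([-2 * x for x in xs], ys)
print(f"multiply x by -2: r = {flipped:.6f}")
```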

True or false: If the correlation coefficient between x and y is 0, there is no relationship between x and y.

False

True or false: When two variables are highly correlated, a change in the value of one will cause a change in the value of another.

False

Match each pair of hypotheses about a linear model on the left with the correct test statistic and P-value on the right.

H0: β0 ≤ b vs. H1: β0 > b matches $t = (\hat\beta_0 - b)/s_{\hat\beta_0}$, P-value = $P(t_{n-2} > t)$
H0: β1 = b vs. H1: β1 ≠ b matches $t = (\hat\beta_1 - b)/s_{\hat\beta_1}$, P-value = $2P(t_{n-2} > |t|)$
H0: β0 ≥ b vs. H1: β0 < b matches $t = (\hat\beta_0 - b)/s_{\hat\beta_0}$, P-value = $P(t_{n-2} < t)$
H0: β1 ≤ b vs. H1: β1 > b matches $t = (\hat\beta_1 - b)/s_{\hat\beta_1}$, P-value = $P(t_{n-2} > t)$

A test of which hypothesis is most important when deciding whether to use a simple linear regression model to predict y from x?

H0: β1 = 0

Sketch the points (-2, 4.2), (-1, 0.9), (0, 0.1), (1, 1.2), and (2, 3.8). Is it appropriate to use a least-squares line to fit these points?

No
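
A quick numerical look at why a line is a poor fit here. The sketch below, with illustrative names, fits the least-squares line to the five points and inspects the correlation and the signs of the residuals.

```python
import math

points = [(-2, 4.2), (-1, 0.9), (0, 0.1), (1, 1.2), (2, 3.8)]
xs = [p[0] for p in points]
ys = [p[1] for p in points]

n = len(points)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)

slope = sxy / sxx
intercept = y_bar - slope * x_bar
r = sxy / math.sqrt(sxx * syy)

# Residuals: observed y minus the value on the fitted line.
residuals = [y - (intercept + slope * x) for x, y in points]
signs = ["+" if e > 0 else "-" for e in residuals]

print(f"fitted line: y = {intercept:.2f} + ({slope:.2f})x, r = {r:.3f}")
print("residual signs:", " ".join(signs))
```

The correlation is close to 0 and the residuals run positive, negative, positive from left to right, which is the pattern of a curved (roughly parabolic) relationship rather than a linear one.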

Consider a controlled experiment to determine how the life of a table saw blade depends on its rotation speed (in hertz) and on the force (in newtons) with which wood is pushed against the blade. Which set of values for the two explanatory variables is preferable?

Pressure N : 5, 10 , 5, 10

The correlation coefficient, r, is calculated between x, the winning men's shot put distance (in meters), and y, the air temperature (in degrees Celsius), over many track and field competitions. Which of the following would change r?

Replace each temperature y with the wind speed. Replace each distance x with ln(x), its natural logarithm.

Under the assumptions of simple linear regression, the quantities $(\hat\beta_0 - \beta_0)/s_{\hat\beta_0}$ and $(\hat\beta_1 - \beta_1)/s_{\hat\beta_1}$ follow ______.

Student's t distributions with n - 2 degrees of freedom

True or false: In a simple linear model, it is assumed that the errors ε1, ..., εn are random and independent, each satisfying εi ~ N(0, σ²).

True

In the context of simple linear regression, which of the following expressions are estimates?

β̂0; ei; β̂1; ŷi = β̂0 + β̂1xi

In the linear model yi = β0 + β1xi + εi, the number β0 is called ______.

a regression coefficient

In a simple linear model, it is assumed that the errors ε1, ..., εn

are random and independent; all have the same variance σ²; all have mean 0; are normally distributed

Confidence bands and prediction bands based on the least-squares line are more precise near the ______ of a scatterplot.

center

The estimated standard deviation of an estimated mean response ŷ at x, given by $s_{\hat{y}} = s\sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$, is smallest for an x near the ______ of the data set.

center
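
To see where the estimate is most precise, the sketch below evaluates the standard deviation of the estimated mean response over a grid of x values for the blueberry data. The grid spacing of 2.5 and the names `s_yhat` and `widths` are choices made for this illustration.

```python
import math

fertilizer = [0, 5, 10, 15, 20, 25]
height = [39, 49, 57, 57, 65, 66]

n = len(fertilizer)
x_bar = sum(fertilizer) / n
y_bar = sum(height) / n
sxx = sum((x - x_bar) ** 2 for x in fertilizer)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(fertilizer, height))
slope = sxy / sxx
intercept = y_bar - slope * x_bar
sse = sum((y - (intercept + slope * x)) ** 2
          for x, y in zip(fertilizer, height))
s = math.sqrt(sse / (n - 2))

def s_yhat(x):
    """Estimated standard deviation of the mean response at x."""
    return s * math.sqrt(1 / n + (x - x_bar) ** 2 / sxx)

grid = [2.5 * k for k in range(11)]          # 0.0, 2.5, ..., 25.0
widths = {x: s_yhat(x) for x in grid}
narrowest = min(widths, key=widths.get)

for x, w in widths.items():
    print(f"x = {x:5.1f}  s_yhat = {w:.3f}")
print(f"smallest at x = {narrowest} (x_bar = {x_bar})")
```

The minimum falls at x̄ = 12.5, where the (x - x̄)² term vanishes and the standard deviation reduces to s/√n.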

Let x and y be variables of interest. A variable that is correlated with both x and y is called a ______ variable.

confounding

While the factors in an observational study are likely to be correlated and therefore confounded with one another, in a ______ experiment, factor values can often be chosen so that the factors are uncorrelated.

controlled or control

A numerical measure of the strength of the linear relationship between two variables is the ______ coefficient.

correlation

The precision of the least-squares line, as measured by $s_{\hat\beta_0}$ and $s_{\hat\beta_1}$, can be improved by ______.

decreasing σ², the error variance; increasing n, the number of observations; increasing the spread of the x coordinates

In the linear model yi = β0 + β1xi + εi, yi is called the ______ variable.

dependent

In the linear model yi = β0 + β1xi + εi, εi is called the ______

error

The coefficients β̂0 and β̂1 in the least-squares line y = β̂0 + β̂1x are ______.

estimated from the data

To ______ is to use a least-squares line for prediction outside the range of the data. It is unreliable because the linear relationship may not hold outside the range of the data.

extrapolate

Six blueberry shrubs were planted in a garden and then fertilized and watered for two years. Monthly fertilizer amounts (in milliliters) and shrub heights at two years (in centimeters) were recorded as follows:
  Fertilizer (mL)   0   5  10  15  20  25
  Height (cm)      39  49  57  57  65  66
Here is the output from software used to find the least-squares line for predicting height from fertilizer:
               Coef  SE Coef       T      P
  Predictor   1.046    0.153   6.825  0.002
  Constant   42.429    2.320  18.292  0.000
Which equation gives the least-squares line?

height = 42.429 + 1.046(fertilizer)
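
The coefficients in this equation can be reproduced from the raw data with the usual slope and intercept formulas. The function name `least_squares` below is illustrative. The last lines also confirm that the fitted line passes through the point of means (x̄, ȳ).

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

fertilizer = [0, 5, 10, 15, 20, 25]
height = [39, 49, 57, 57, 65, 66]

b0, b1 = least_squares(fertilizer, height)
print(f"height = {b0:.3f} + {b1:.3f}(fertilizer)")

# The line evaluated at the mean of x returns the mean of y.
x_bar = sum(fertilizer) / len(fertilizer)
y_bar = sum(height) / len(height)
fitted_at_mean = b0 + b1 * x_bar
print(f"fitted value at x_bar = {fitted_at_mean:.3f}, y_bar = {y_bar:.3f}")
```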

In the linear model yi = β0 + β1xi + εi, xi is called the ______ variable.

independent

In the context of least-squares regression, r² ______.

is the proportion of variance in y explained by regression; is the coefficient of determination; = (regression sum of squares)/(total sum of squares); is the square of the correlation coefficient
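
These descriptions of r² agree numerically. The sketch below, using the blueberry data and illustrative variable names, computes the regression, error, and total sums of squares and compares their ratio with the square of the correlation coefficient.

```python
import math

xs = [0, 5, 10, 15, 20, 25]
ys = [39, 49, 57, 57, 65, 66]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in xs]

sst = sum((y - y_bar) ** 2 for y in ys)               # total sum of squares
ssr = sum((f - y_bar) ** 2 for f in fitted)           # regression sum of squares
sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))   # error sum of squares

r = sxy / math.sqrt(sxx * sst)
r_squared = ssr / sst

print(f"SSR/SST = {r_squared:.4f}, r^2 = {r ** 2:.4f}")
print(f"SSR + SSE = {ssr + sse:.3f}, SST = {sst:.3f}")
```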

Increasing the spread of the x coordinates improves the precision of the least-squares line by decreasing $s_{\hat\beta_0}$ and $s_{\hat\beta_1}$. But it should not be done past the range where the ______ model holds.

linear

The distributions of β̂0 and β̂1 are ______.

normal with means β0 and β1

The correlation coefficient is often misleading for a data set containing an ______.

outlier

Suppose that SAT and ACT college admissions test scores for a random sample of 400 students who took both tests had sample correlation r = 0.81. Which calculations (among others) are required to find a 95% confidence interval for the population correlation ρ?

$\sigma_W = \sqrt{\frac{1}{400 - 3}}$; $W = \frac{1}{2}\ln\frac{1 + 0.81}{1 - 0.81}$
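
A sketch of the full interval calculation built on these two quantities. It assumes the standard remaining steps: a normal interval for the transformed value, then the inverse transformation ρ = (e^{2W} - 1)/(e^{2W} + 1), which is the hyperbolic tangent. It uses `statistics.NormalDist` from the Python standard library (3.8 or later) for the normal critical value.

```python
import math
from statistics import NormalDist

r, n = 0.81, 400

# Fisher transformation of the sample correlation.
w = 0.5 * math.log((1 + r) / (1 - r))
sigma_w = math.sqrt(1 / (n - 3))

# 95% interval on the transformed scale.
z_crit = NormalDist().inv_cdf(0.975)
w_low = w - z_crit * sigma_w
w_high = w + z_crit * sigma_w

# Transform the endpoints back to the correlation scale.
rho_low = math.tanh(w_low)
rho_high = math.tanh(w_high)

print(f"W = {w:.4f}, sigma_W = {sigma_w:.4f}")
print(f"95% CI for rho: ({rho_low:.3f}, {rho_high:.3f})")
```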

Which of the following are properties of the correlation coefficient, r, calculated from a bivariate data set (x1,y1),...,(xn,yn)?

r > 0 for a least-squares line with positive slope. -1 ≤ r ≤ 1

Which of the following are properties of the correlation coefficient, r, calculated from a bivariate data set (x1,y1),...,(xn,yn)?

r does not depend on units or scale. r = 0 implies that x and y have no linear correlation.

The slope of the least-squares line, $\hat\beta_1 = r\,\frac{s_y}{s_x}$, indicates that a change of one standard deviation in x corresponds to a change of ______ in y.

r standard deviations

To see the effect of an outlier on the correlation coefficient, consider the points (1, 1), (2, 2), (3, 3), (4, 4), and (5, 5), which are on the line y = x. Find their correlation, r1. Replace (5, 5) with the outlier (5, 0) and find the new correlation, r2. (Sketch the points if you can't visualize them.) The two values are ______.

r1 = 1 and r2 = 0
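
The two correlations can be confirmed directly. The helper name `correlation` below is illustrative.

```python
import math

def correlation(points):
    """Sample correlation coefficient r for a list of (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    syy = sum((y - y_bar) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)

on_line = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
with_outlier = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 0)]

r1 = correlation(on_line)
r2 = correlation(with_outlier)
print(f"r1 = {r1:.3f}, r2 = {r2:.3f}")
```

A single point moved off the line is enough to take the correlation from 1 to 0, even though four of the five points still lie exactly on y = x.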

Match each expression on the left with its description on the right.

r² matches Coefficient of determination
$\sum_{i=1}^{n}(y_i - \bar{y})^2$ matches Total sum of squares
$\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$ matches Error sum of squares

The vertical difference ei = yi - ŷi between the observed yi and the predicted ŷi associated with the point (xi, yi) in a least-squares line is called the ______.

residual

Match each standard deviation expression on the left to its value on the right.

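$s_{\hat\beta_0}$ matches $s\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$
$s_{\hat\beta_1}$ matches $\frac{s}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$
$s$ matches $\sqrt{\frac{(1 - r^2)\sum_{i=1}^{n}(y_i - \bar{y})^2}{n - 2}}$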
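
As a check on these three expressions, the sketch below evaluates them for the blueberry data used elsewhere in this set. The results should agree with the SE Coef column of the software output (2.320 for the constant, 0.153 for the slope). Variable names are illustrative.

```python
import math

xs = [0, 5, 10, 15, 20, 25]
ys = [39, 49, 57, 57, 65, 66]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
r = sxy / math.sqrt(sxx * syy)

# Estimate of the error standard deviation, written in terms of r.
s = math.sqrt((1 - r ** 2) * syy / (n - 2))

# Standard deviations of the estimated intercept and slope.
s_b0 = s * math.sqrt(1 / n + x_bar ** 2 / sxx)
s_b1 = s / math.sqrt(sxx)

print(f"s = {s:.3f}, s_b0 = {s_b0:.3f}, s_b1 = {s_b1:.3f}")
```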

The least-squares line is chosen to minimize the sum of the _____ from the data points to the line.

squared vertical distances

The least-squares coefficients β̂0 and β̂1 are ______. (Multiple select question.)

statistics; random variables

The estimated standard deviation of an estimated mean response ŷ at x is given by sŷ = ______.

$s\sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$

Six blueberry shrubs were planted in a garden and then fertilized and watered for two years. Monthly fertilizer amounts (in milliliters) and shrub heights at two years (in centimeters) were recorded as follows:
  Fertilizer (mL)   0   5  10  15  20  25
  Height (cm)      39  49  57  57  65  66
Here is the output from software used to find the least-squares line for predicting height from fertilizer:
               Coef  SE Coef       T      P
  Predictor   1.046    0.153   6.825  0.002
  Constant   42.429    2.320  18.292  0.000
Find the test statistic for a test of H0: β1 = 0 versus H1: β1 > 0, where β1 is the slope of the line relating height to fertilizer.

t = (1.046 - 0)/0.153

Six blueberry shrubs were planted in a garden and then fertilized and watered for two years. Monthly fertilizer amounts (in milliliters) and shrub heights at two years (in centimeters) were recorded as follows:
  Fertilizer (mL)   0   5  10  15  20  25
  Height (cm)      39  49  57  57  65  66
Here is the output from software used to find the least-squares line for predicting height from fertilizer:
               Coef  SE Coef       T      P
  Predictor   1.046    0.153   6.825  0.002
  Constant   42.429    2.320  18.292  0.000
Given that sŷ is 1.311, find the test statistic for a test that the true mean response height is 54 for a monthly fertilizer amount of 12 milliliters, against the alternative that it is more than 54.

t = (42.429 + (1.046)(12) - 54)/1.311
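
Evaluating the two test statistics from the last two cards is plain arithmetic on the rounded values quoted there. The variable names in this sketch are illustrative.

```python
# Rounded values from the software output and the question text.
b0, b1 = 42.429, 1.046      # intercept and slope
se_b1 = 0.153               # standard error of the slope
s_yhat = 1.311              # standard deviation of the mean response at x = 12

# Test of H0: beta1 = 0 versus H1: beta1 > 0.
t_slope = (b1 - 0) / se_b1

# Test of H0: mean height at x = 12 is 54 versus H1: it is more than 54.
y_hat_12 = b0 + b1 * 12
t_mean = (y_hat_12 - 54) / s_yhat

print(f"t for the slope = {t_slope:.3f}")
print(f"t for the mean response at x = 12 = {t_mean:.3f}")
```

The slope statistic computed this way is about 6.84 rather than the 6.825 shown in the software output, because the output divides the unrounded coefficient by the unrounded standard error.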

Which of the following is a correct formula for r, the correlation coefficient between x and y?

the answer is all but 1/n-1 E (xi-x/sx)(yi-y/sy)

Even supposing that ice cream sales per week are correlated with the number of drowning swimmers per week, it is unreasonable to say that ice cream causes drowning. Potential confounding variables include ______.

the number of swimmers per week; the average temperature during the week

The correlation of x and y is positive when ______.

values of x greater than the mean x tend to be associated with values of y greater than the mean y. values of x less than the mean x tend to be associated with values of y less than the mean y.

Select all that apply: The correlation of x and y is negative when ______.

values of x less than the mean x tend to be associated with values of y greater than the mean y. values of x greater than the mean x tend to be associated with values of y less than the mean y.

Suppose that in the most common test for a simple linear regression slope, H0: β1 = 0, the null hypothesis is not rejected. Which statements are correct?

y and x have no linear relationship. The linear model should not be used.

Match the notation from the least-squares line on the left with its description on the right.

ŷi = β̂0 + β̂1xi matches Fitted value
ei = yi - ŷi matches Residual
β̂0 matches The estimated intercept
β̂1 matches The estimated slope

The formula for the intercept of the least-squares line is β̂0 = ______.

ȳ - β̂1x̄

Match each quantity related to simple linear regression on the left with its description on the right.

β̂0 matches The y-intercept of the line estimated from data
β̂1 matches The slope of the line estimated from data
β0 matches The (usually unknown) y-intercept of the line relating y to x in the population
β1 matches The (usually unknown) slope of the line relating y to x in the population
εi matches The (usually unknown) vertical error in measuring yi
ei matches The difference between yi and the y-coordinate of the point on the least-squares line at x = xi

Regarding the least-squares line y = β̂0 + β̂1x, which statements are true?

β̂0 is an estimate of β0, the true intercept. β̂1 is an estimate of β1, the true slope.

A level 100(1 - α)% confidence interval for β0 is given by ______.

$\hat\beta_0 \pm t_{n-2,\,\alpha/2}\, s_{\hat\beta_0}$, where $s_{\hat\beta_0} = s\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}$

Which of these quantities typically change from sample to sample?

β̂1; β̂0; ei; εi

Match each expression on the right with its distribution under the assumptions of simple linear regression.

εi matches N(0, σ²)
β̂1 matches $N(\beta_1, \sigma_{\hat\beta_1})$
β̂0 - β0 matches $N(0, \sigma_{\hat\beta_0})$
$(\hat\beta_0 - \beta_0)/s_{\hat\beta_0}$ matches Student's t with n - 2 degrees of freedom

Suppose that SAT and ACT college admissions test scores for a random sample of 400 students who took both tests had sample correlation r = 0.81. Which formulas are required as steps in finding the test statistic z for testing H0: ρ ≤ 0.75 against H1: ρ > 0.75?

$\sigma_W = \sqrt{\frac{1}{400 - 3}}$; $\mu_W = \frac{1}{2}\ln\frac{1 + 0.75}{1 - 0.75}$; $W = \frac{1}{2}\ln\frac{1 + 0.81}{1 - 0.81}$; $z = \frac{W - \mu_W}{\sigma_W}$
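
Carrying these four steps through gives the test statistic. The sketch below also reports an upper-tail P-value from the standard normal distribution, which is the natural follow-up for H1: ρ > 0.75 but is not part of the card itself. It uses `statistics.NormalDist` from the Python standard library (3.8 or later).

```python
import math
from statistics import NormalDist

r, rho_0, n = 0.81, 0.75, 400

sigma_w = math.sqrt(1 / (n - 3))
mu_w = 0.5 * math.log((1 + rho_0) / (1 - rho_0))   # transform of the null value
w = 0.5 * math.log((1 + r) / (1 - r))              # transform of the sample r

z = (w - mu_w) / sigma_w

# Upper-tail P-value for H1: rho > 0.75 (added for illustration).
p_value = 1 - NormalDist().cdf(z)

print(f"W = {w:.4f}, mu_W = {mu_w:.4f}, sigma_W = {sigma_w:.4f}")
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```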

Match each expression related to the least-squares line on the left with its equivalent on the right.

$\sum_{i=1}^{n}(x_i - \bar{x})^2$ matches $\sum_{i=1}^{n}x_i^2 - n\bar{x}^2$
$\sum_{i=1}^{n}(y_i - \bar{y})^2$ matches $\sum_{i=1}^{n}y_i^2 - n\bar{y}^2$
$\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$ matches $\sum_{i=1}^{n}x_i y_i - n\bar{x}\bar{y}$
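
The three equivalences can be spot-checked on the blueberry data from this set. This is a numerical illustration with made-up variable names, not a proof.

```python
import math

xs = [0, 5, 10, 15, 20, 25]
ys = [39, 49, 57, 57, 65, 66]
n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n

# Left-hand sides: sums of deviations from the means.
sxx_dev = sum((x - x_bar) ** 2 for x in xs)
syy_dev = sum((y - y_bar) ** 2 for y in ys)
sxy_dev = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Right-hand sides: computing formulas based on raw sums.
sxx_raw = sum(x * x for x in xs) - n * x_bar ** 2
syy_raw = sum(y * y for y in ys) - n * y_bar ** 2
sxy_raw = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar

print(f"Sxx: {sxx_dev} vs {sxx_raw}")
print(f"Syy: {syy_dev} vs {syy_raw}")
print(f"Sxy: {sxy_dev} vs {sxy_raw}")
```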

The formula for the slope of the least-squares line is β̂1 = ______.

$\frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$

The coefficients β̂0 and β̂1 in the least-squares line y = β̂0 + β̂1x are chosen to minimize ______.

$\sum_{i=1}^{n}e_i^2 = \sum_{i=1}^{n}(y_i - \hat\beta_0 - \hat\beta_1 x_i)^2$
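
To illustrate the minimization, the sketch below computes the sum of squared residuals at the least-squares coefficients for the blueberry data and at a few nearby coefficient pairs. The perturbation sizes (0.5 for the intercept, 0.05 for the slope) are arbitrary choices for this example.

```python
xs = [0, 5, 10, 15, 20, 25]
ys = [39, 49, 57, 57, 65, 66]
n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n

slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

def sum_sq_residuals(b0, b1):
    """Sum of squared vertical distances from the points to y = b0 + b1*x."""
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))

best = sum_sq_residuals(intercept, slope)

# Nudge the coefficients in every direction and recompute the sum.
nearby = [
    sum_sq_residuals(intercept + d0, slope + d1)
    for d0 in (-0.5, 0.0, 0.5)
    for d1 in (-0.05, 0.0, 0.05)
    if (d0, d1) != (0.0, 0.0)
]

print(f"sum of squared residuals at the least-squares line: {best:.3f}")
print(f"smallest value among the perturbed lines: {min(nearby):.3f}")
```

Every perturbed line has a larger sum of squared residuals than the least-squares line.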

