Chapter 10

There is a certain geyser that erupts on a regular basis. Researchers are interested in the relationship between the duration of a current eruption of the geyser​ (duration) and the time between when that eruption ends and the next eruption begins​ (interval). Review the accompanying scatterplot of 222 eruptions of the geyser. What is the correlation​ coefficient? (Try to figure out the correct answer without calculating the correlation​ coefficient.)

+0.88

The value of​ r, the linear correlation​ coefficient, that represents the strongest negative correlation between two variables is

-1

The value of​ r, the linear correlation​ coefficient, that represents no correlation between two variables is

0

The value of​ r, the linear correlation​ coefficient, that represents the strongest positive correlation between two variables is

1

In​ regression, what is the difference between an observed value of the response variable and its predicted value​ called?

A residual

What is a​ residual?

A residual is a value of y − ŷ, which is the difference between an observed value of y and a predicted value of y.
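As a minimal sketch of the definition above, the snippet below computes residuals y − ŷ for a few made-up data points. The data and the line ŷ = 1 + 2x are hypothetical, chosen only so the arithmetic is easy to follow.

```python
# Hypothetical paired data (not from the flashcards).
xs = [1, 2, 3, 4]
ys = [3.5, 4.5, 7.0, 9.5]

def predict(x):
    """Predicted value y-hat from the assumed line y-hat = 1 + 2x."""
    return 1 + 2 * x

# Residual = observed y minus predicted y, one per data point.
residuals = [y - predict(x) for x, y in zip(xs, ys)]
print(residuals)  # [0.5, -0.5, 0.0, 0.5]
```

A positive residual means the point lies above the line, a negative residual means it lies below.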

What is a scatterplot and how does it help​ us?

A scatterplot is a graph of paired​ (x, y) quantitative data. It provides a visual image of the data plotted as​ points, which helps show any patterns in the data.

Which of the following is NOT one of the three common errors involving​ correlation?

Correlation does not imply causality

Which of the following is not equivalent to the other​ three?

Dependent variable

Which of the following is NOT a requirement in determining whether there is a linear correlation between two​ variables?

If r>​1, then there is a positive linear correlation.

Which of the following is NOT true for a hypothesis test for​ correlation?

If |r|>critical value, we should fail to reject the null hypothesis and conclude that there is not sufficient evidence to support the claim of a linear correlation.

Suppose the equation of a​ least-squares regression line is y=−3.17−2.4x. What can be said about the​ y-intercept?

It is −3.17.

Suppose the equation of a​ least-squares regression line is y=−3.17−2.4x. What can be said about the correlation​ coefficient?

It is​ negative, but its exact value cannot be determined from the given information.

In​ regression, what can be said about the sum of the residuals of all the​ observations?

It will always be 0.
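The zero-sum property can be checked numerically. This is a sketch on hypothetical data, assuming a least-squares line that includes an intercept term; the helper name `least_squares` is illustrative and not taken from any library.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the least-squares line y-hat = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical sample data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 6]

b0, b1 = least_squares(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
total = sum(residuals)

# The sum is zero up to floating-point rounding.
print(abs(total) < 1e-9)  # True
```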

When analyzing two quantitative​ variables, what is the first thing that should be​ done?

Make a scatterplot.

Twenty different statistics students are randomly selected. For each of​ them, their body temperature ​(°​C) is measured and their head circumference​ (cm) is measured. If it is found that r=​0, does that indicate that there is no association between these two​ variables?

No, because while there is no linear​ correlation, there may be a relationship that is not linear.
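A small illustration of this point, using hypothetical data: y is completely determined by x through y = x², yet the linear correlation coefficient comes out as 0 because the pattern is symmetric rather than straight. The function name `pearson_r` is an illustrative helper, not a library call.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data with a perfect but non-linear (quadratic) relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(abs(r) < 1e-12)  # True: no linear correlation despite a clear association
```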

If we find that there is a linear correlation between the concentration of carbon dioxide in our atmosphere and the global​ temperature, does that indicate that changes in the concentration of carbon dioxide cause changes in the global​ temperature?

No. The presence of a linear correlation between two variables does not imply that one of the variables is the cause of the other variable.

Data were collected on many different variables of a fast food​ chain's sandwiches several years ago. Two variables were the serving size​ (in ounces) of a sandwich and the number of calories in the sandwich. A hungry customer wanted to estimate the number of calories in a sandwich based on its serving size. With this in​ mind, which variable would go on the​ y-axis in the​ scatterplot?

Number of calories goes on the​ y-axis, since it is the response variable.

There is a certain geyser that erupts on a regular basis. Researchers are interested in the relationship between the duration of a current eruption of the geyser​ (duration) and the time between when that eruption ends and the next eruption begins​ (interval). Review the accompanying scatterplot of 222 eruptions of the geyser. The​ least-squares regression equation is y=33.967+​11.358x, where y is the interval from the end of the current eruption to the beginning of the next eruption and x is the duration of current eruption. For a duration of 4​ minutes, y=75.4 minutes. What does this​ mean?

The average​ wait-time until the next eruption for all eruptions that last 4 minutes is 75.4 minutes. For eruptions that last 4​ minutes, it is estimated that a visitor will have to wait 75.4 minutes after the current eruption ends before the next eruption begins.

What is the definition of the correlation​ coefficient?

The correlation coefficient is a measure that describes the direction and strength of the linear relationship between two quantitative variables.
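One common way to compute this measure is sketched below on hypothetical data. The helper `pearson_r` is an illustrative name; it divides the sum of cross-deviations by the square root of the product of the two sums of squared deviations.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical samples.
r_scattered = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 6])  # roughly 0.85
r_perfect = pearson_r([1, 2, 3], [2, 4, 6])                # points on a rising line

print(round(r_scattered, 2), r_perfect)
```

The sign of the result gives the direction of the relationship and its distance from 0 gives the strength, with values always between −1 and 1.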

What is the difference between the following two regression equations? ŷ = b0 + b1x y=

The first equation is for sample​ data; the second equation is for a population.

How is the​ best-fitting line between the points in a scatterplot​ defined?

The line that gives the smallest sum of the squared vertical distances between each point and the line
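The sketch below illustrates this criterion on hypothetical data: it fits the least-squares line, then compares its sum of squared vertical distances with that of several nearby lines. The helper names `sse` and `least_squares` are illustrative.

```python
def sse(b0, b1, xs, ys):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

def least_squares(xs, ys):
    """Return (b0, b1) for the least-squares line y-hat = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Hypothetical sample data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 6]

b0, b1 = least_squares(xs, ys)
best = sse(b0, b1, xs, ys)

# Shift the intercept and slope slightly and recompute the sum each time.
others = [
    sse(b0 + db0, b1 + db1, xs, ys)
    for db0 in (-0.5, 0.0, 0.5)
    for db1 in (-0.2, 0.0, 0.2)
    if (db0, db1) != (0.0, 0.0)
]

print(all(best < other for other in others))  # True
```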

Which of the following is NOT a property of the linear correlation coefficient​ r?

The linear correlation coefficient r is robust. That​ is, a single outlier will not affect the value of r.
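To see why the robustness claim is false, the sketch below uses hypothetical data: five points lying exactly on a rising line, then the same points with one outlier added. The helper `pearson_r` is an illustrative name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Five hypothetical points on the line y = x.
xs = [1, 2, 3, 4, 5]
ys = [1, 2, 3, 4, 5]
r_clean = pearson_r(xs, ys)

# Add a single outlier at (10, 0).
r_outlier = pearson_r(xs + [10], ys + [0])

print(round(r_clean, 2), round(r_outlier, 2))  # 1.0 -0.25
```

In this example a single added point is enough to change r from a perfect positive correlation to a weak negative one.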

Which of the following is not a requirement for regression​ analysis?

The method for regression analysis line is not robust. It is seriously affected by a small departure from a normal distribution.

In what sense is the regression line the straight line that​ "best" fits the points in a​ scatterplot?

The regression line has the property that the sum of squares of the residuals is the lowest possible sum.

There is a certain geyser that erupts on a regular basis. Researchers are interested in the relationship between the duration of a current eruption of the geyser​ (duration) and the time between when that eruption ends and the next eruption begins​ (interval). Review the accompanying scatterplot of 222 eruptions of the geyser. The​ least-squares regression equation is y=33.967+​11.358x, where y is the interval from the end of the current eruption to the beginning of the next eruption and x is the duration of current eruption. In this​ equation, what is​ 11.358?

The slope of the​ least-squares regression line

In this section we use r to denote the value of the linear correlation coefficient. Why do we refer to this correlation coefficient as being​ linear?

The term linear refers to a straight​ line, and r measures how well a scatterplot fits a​ straight-line pattern.

What is the relationship between the linear correlation coefficient r and the slope b1 of a regression​ line?

The value of r will always have the same sign as the value of b1.
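The shared sign follows from the identity b1 = r · (s_y / s_x), where s_x and s_y are the standard deviations of x and y and are never negative. The sketch below checks this on two hypothetical data sets, one rising and one falling; the helper name `r_and_slope` is illustrative.

```python
from math import sqrt

def r_and_slope(xs, ys):
    """Return (r, b1, ratio), where ratio is s_y / s_x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    r = sxy / sqrt(sxx * syy)
    b1 = sxy / sxx
    return r, b1, sqrt(syy / sxx)

# Hypothetical data: one rising pattern, one falling pattern.
xs = [1, 2, 3, 4, 5]
r_up, b1_up, ratio_up = r_and_slope(xs, [2, 4, 5, 4, 6])
r_down, b1_down, ratio_down = r_and_slope(xs, [6, 4, 5, 4, 2])

print(r_up > 0 and b1_up > 0)      # True
print(r_down < 0 and b1_down < 0)  # True
```

This is also why a regression line with a negative slope tells you r is negative without revealing its exact value: the ratio s_y / s_x is not known from the equation alone.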

Which of the following statements best describes this scatterplot? (Scatterplot 12, not shown.)

There are two clusters of points. The relationship between X and Y for each cluster is strong and positive.

Gina calculated a correlation coefficient between hours studied and grade point average as​ +0.75. Which of the following is a correct statement based on this correlation​ coefficient?

There is a fairly strong positive relationship between hours studied and grade point​ average, indicating that grade point averages tend to be higher for students who study more.

Data were collected on many different variables of a fast food​ chain's sandwiches several years ago. Two variables were the serving size​ (in ounces) of a sandwich and the number of calories in the sandwich. Review the accompanying scatterplot of serving size versus number of calories. Which of the following best describes the relationship between these two​ variables?

There is a fairly strong positive relationship with no extreme outliers.

Which of the following statements best describes this scatterplot? (Scatterplot 10, not shown.)

There is a​ negative, moderately strong relationship between X and Y with one outlier.

Which of the following statements best describes this scatterplot? (Scatterplot 11, not shown.)

There is a​ non-linear relationship between X and Y with two outliers.

What does a correlation coefficient of 0​ indicate?

There is no linear relationship between the two quantitative variables.

In​ regression, a residual can be negative. Is this statement true or​ false?

True

When making predictions based on regression​ lines, which of the following is not listed as a​ consideration?

Use the regression line for predictions only if the data go far beyond the scope of the available sample data.

Which of the following statements about correlation is​ true?

We say that there is a positive correlation between x and y if the​ x-values increase as the corresponding​ y-values increase.

What is a variable other than x and y that simultaneously affects both variables​ called?

a lurking variable

The point circled in red corresponds to an eruption that lasted​ _______ and had a time until the next eruption began of​ _______ after the eruption ended.

about 3 minutes; about 72 minutes

A​ __________ exists between two variables when the values of one variable are somehow associated with the values of the other variable.

correlation

A high correlation coefficient indicates that the relationship between the two quantitative variables must be linear.

false

Determine if the following statement is true or false. A correlation coefficient close to 1 is evidence of a​ cause-and-effect relationship between the two variables.

false

Paired sample data may include one or more​ ___________, which are points that strongly affect the graph of the regression line.

influential points

A straight line satisfies the​ __________________ if the sum of the squares of the residuals is the smallest sum possible.

least-squares property

The​ ______________ measures the strength of the linear correlation between the paired quantitative​ x- and​ y-values in a sample.

linear correlation coefficient r

When performing a linear regression​ analysis, it is important that the relationship between the two quantitative variables be​ _______.

linear.

In working with two variables related by a regression​ equation, the​ _________________ in a variable is the amount that it changes when the other variable changes by exactly one unit.

marginal change

In a​ scatterplot, a(n)​ ______________ is a point lying far away from the other data points.

outlier

Given a collection of paired sample data, the ____________________ ŷ = b0 + b1x algebraically describes the relationship between the two variables, x and y.

regression equation

For a pair of sample​ x- and​ y-values, the​ ______________ is the difference between the observed sample value of y and the​ y-value that is predicted by using the regression equation.

residual

A ______________ is a scatterplot of the (x, y) values after each of the y-coordinate values has been replaced by the residual value y − ŷ.

residual plot

A​ _______ is a plot of paired data​ (x,y) and is helpful in determining whether there is a relationship between the two variables.

scatterplot

When determining whether there is a correlation between two​ variables, one should use a​ ____________ to explore the data visually.

scatterplot

The line that fits best between the points in a scatterplot is the line that gives the​ _______ sum of the squared​ _______ distances between each point and the line.

smallest; vertical

A correlation coefficient can be 0.

true

