Math 221 - Chapter 4

Since outliers can greatly affect the regression line, they are also called _______ points.

Since outliers can greatly affect the regression​ line, these types of observations are called influential points because their presence or absence has a big effect on conclusions.

When describing two-variable associations, a written description should always include trend, shape, strength, and which of the following? A. The number of pairs in the data set B. The context of the data C. The name of the person who gathered the data D. All of the above

The context of the data

What happens to the correlation coefficient when a constant is added to each​ number?

The correlation coefficient remains the same when a constant is added to each number.

What happens to the correlation coefficient when numbers are multiplied by a positive​ constant?

The correlation coefficient remains the same when the numbers are multiplied by a positive constant.
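
The two invariance properties above can be checked numerically. The following is a minimal sketch, not part of the original study set: the data values and the helper name `pearson_r` are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r_original = pearson_r(xs, ys)
r_shifted = pearson_r([x + 100 for x in xs], ys)  # add a constant to each x
r_scaled = pearson_r([3 * x for x in xs], ys)     # multiply each x by a positive constant

print(r_original, r_shifted, r_scaled)
```

All three printed values should agree up to floating-point rounding. Multiplying by a negative constant would flip the sign of r, which is why the card specifies a positive constant.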

Which of the following is not something that one looks for when studying scatterplots? A. Trend B. Shape C. Variation D. Strength

Variation note: Variation is something one does not look for when studying scatterplots.

When writing a regression equation, which of the following is not a name for the x-variable? A. Predictor variable B. Independent variable C. Dependent variable D. Explanatory variable

When writing a regression​ equation, the dependent variable is not another name for the​ x-variable. The dependent variable is the​ y-variable.

Statisticians often write the word​ _______ in front of the​ y-variable in the equation of the regression line.

Statisticians often write the word​ "predicted" in front of the​ y-variable in the equation of the regression line to emphasize that the line consists of predictions for the​ y-variable, not actual values.

The correlation coefficient is always a number between​ _______.

−1 and +1.

The intercept of a regression line tells a person the predicted mean​ y-value when the​ x-value is​ _______.

0.

A large amount of scatter in a scatterplot is an indication that the association between the two variables is​ _______.

A large amount of scatter in a scatterplot is an indication that the association between the two variables is weak.

What type of effect can outliers have on a regression​ line? Choose the correct answer below. A. A big effect B. Outliers are never included in a regression line. C. A small and insignificant effect D. No effect

A. A big effect note: A regression line is a line of​ means, and outliers have a big effect on the regression line.

The scatterplot shows the actual weight and desired weight change of some students.​ Thus, if they weighed 220 and wanted to weigh​ 190, the desired weight change would be negative 30. Explain what you see. In​ particular, what does it mean that the trend is​ negative? A. The more people​ weigh, the more weight they tend to want to lose. B. The more people​ weigh, the less weight they tend to want to lose. C. The less people​ weigh, the more weight they tend to want to lose.

A. The more people​ weigh, the more weight they tend to want to lose. note: Since there is a negative​ trend, it appears that the more people​ weigh, the more weight they tend to want to lose.

One important use of the regression line is to do which of the​ following? A. To determine the strength of a linear association between two variables B. To determine if a distribution is unimodal or multimodal C. To make predictions about the values of y for a given​ x-value D. Both A and B are correct

An important use of the regression line is to make predictions about the values of y for a given​ x-value.

When one has influential points in their​ data, how should regression and correlation be​ done? Choose the correct answer below. A. Always include the influential points in your data set when doing regression and correlation B. Do regression and correlation with and without these points and comment on the differences C. Remove the influential points from your data set before doing regression and correlation D. ​Don't use regression or correlation on data sets containing influential points

B. Do regression and correlation with and without these points and comment on the differences note: When one has influential points in their​ data, they should do the regression and correlation with and without these points and comment on the differences.

A doctor is studying cholesterol readings in his patients. After reviewing the cholesterol​ readings, he calls the patients with the highest cholesterol readings​ (the top​ 5% of readings in his​ office) and asks them to come back to discuss​ cholesterol-lowering methods. When he tests these patients a second​ time, the average cholesterol readings tended to have gone down somewhat. Explain what statistical phenomenon might have been partly responsible for this lowering of the readings. A. The cholesterol going down might be partly caused by​ extrapolation, since the second measurement is closer to the mean. B. The cholesterol going down might be partly caused by regression toward the​ mean, since the second measurement is farther from the mean. C. The cholesterol going down might be partly caused by​ extrapolation, since the second measurement is farther from the mean. D. The cholesterol going down might be partly caused by regression toward the​ mean, since the second measurement is closer to the mean.

D. The cholesterol going down might be partly caused by regression toward the mean, since the second measurement is closer to the mean. note: Regression toward the mean is the phenomenon in which a variable that is extreme on its first measurement tends to be closer to the average on a second measurement. Likewise, a variable that is extreme on its second measurement tends to have been closer to the average on the first measurement.
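
Regression toward the mean can be illustrated with a small simulation. This is a sketch under stated assumptions, not data from the question: each patient is given a stable "true" cholesterol level plus independent measurement noise on each visit, there is no treatment effect at all, and the numbers (mean 200, spreads 20 and 15, seed 221) are invented.

```python
import random

random.seed(221)  # fixed seed so the run is repeatable

n_patients = 10_000
# Assumed model: stable true level plus independent noise at each visit.
true_levels = [random.gauss(200, 20) for _ in range(n_patients)]
first = [t + random.gauss(0, 15) for t in true_levels]
second = [t + random.gauss(0, 15) for t in true_levels]

# Select the top 5% of patients by their FIRST reading only.
n_top = n_patients // 20
top = sorted(range(n_patients), key=lambda i: first[i], reverse=True)[:n_top]

overall_mean = sum(first) / n_patients
mean_first_top = sum(first[i] for i in top) / n_top
mean_second_top = sum(second[i] for i in top) / n_top

print("overall mean of first readings:", round(overall_mean, 1))
print("top 5%, first reading:", round(mean_first_top, 1))
print("top 5%, second reading:", round(mean_second_top, 1))
```

Even with no intervention in the model, the selected group's second readings should average lower than their first readings while staying above the overall mean, because part of what made the first reading extreme was noise.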

Which do you think has a stronger relationship with the value of the land: the number of acres of land or the number of rooms in the homes? Why? A. The number of acres of land has a stronger relationship with the value of the land, as shown by the fact that the points are more scattered in a vertical direction. B. The number of rooms in the homes has a stronger relationship with the value of the land, as shown by the fact that the points are less scattered in a vertical direction. C. The number of acres of land has a stronger relationship with the value of the land, as shown by the fact that the points are less scattered in a vertical direction. D. The number of rooms in the homes has a stronger relationship with the value of the land, as shown by the fact that the points are more scattered in a vertical direction.

answer: C. note: Weak associations result in a large amount of scatter in the scatterplot. A large amount of scatter means that points have a great deal of spread in the vertical direction. The number of acres of land has a stronger relationship with the value of the​ land, as shown by the fact that the points are less scattered in a vertical direction.

The value that measures how much variation in the response variable is explained by the explanatory variable is called the​ _______.

answer: coefficient of determination note: The coefficient of determination is the correlation coefficient squared, r^2. In fact, this statistic is often called r-squared. This value measures how much variation in the response variable is explained by the explanatory variable.
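
To see why r-squared is read as "variation explained", the sketch below computes it two ways on a small made-up data set: by squaring r, and as one minus the ratio of leftover (residual) variation to total variation in y. The data and variable names are illustrative assumptions.

```python
from math import sqrt

# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)  # total variation in y

r = sxy / sqrt(sxx * syy)

# Least squares line: predicted y = intercept + slope * x
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Variation left over after using the line to predict y.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

r_squared = r ** 2
explained_fraction = 1 - sse / syy

print(round(r_squared, 3), round(explained_fraction, 3))  # 0.6 0.6
```

For this data set, the line accounts for 60% of the variation in y.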

Attempting to use the regression equation to make predictions beyond the range of the data is called​ _______.

answer: extrapolation Extrapolation means that one uses the regression line to make predictions beyond the range of the data. This practice can be​ dangerous, because although the association may have a linear shape for the range one is​ observing, that might not be true over a larger range.
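
The danger of extrapolation can be sketched with an invented example: a relationship that looks nearly linear over the observed range but levels off outside it. The function `true_response` and all numbers below are hypothetical and are not taken from the course material.

```python
from math import exp

def true_response(x):
    """Hypothetical curved relationship that flattens for large x."""
    return 100 * (1 - exp(-x / 10))

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Observed range: x from 0 to 3, where the curve is close to a straight line.
xs = [0.5 * i for i in range(7)]
ys = [true_response(x) for x in xs]
slope, intercept = least_squares_line(xs, ys)

inside_x = 1.75   # within the observed range
outside_x = 30.0  # far beyond the observed range

inside_error = abs((intercept + slope * inside_x) - true_response(inside_x))
outside_error = abs((intercept + slope * outside_x) - true_response(outside_x))

print("error inside the range:", round(inside_error, 2))
print("error when extrapolating:", round(outside_error, 2))
```

The line predicts well inside the range it was fitted on and badly outside it, because the linear shape does not continue.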

The​ _______ is a number that measures the strength of the linear association between two numerical variables.

correlation coefficient

What is an influential​ point? A. An influential point is a point that changes the regression equation by a large amount. B. An influential point is used in the regression line to make predictions beyond the range of the data. C. An influential point is a point that measures the strength of the linear association between two numerical variables.

An influential point is a point that changes the regression equation by a large amount. When there are influential points in the​ data, it is good practice to try the regression and correlation with and without these points and to comment on the difference.
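
The "with and without" practice can be sketched as follows. The data are invented: five points lying exactly on a line, plus one far-off point added to play the role of an influential observation.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: five points on the line y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

# The same data plus one made-up point far from the rest.
xs_with = xs + [20]
ys_with = ys + [5]

slope_without, intercept_without = least_squares_line(xs, ys)
slope_with, intercept_with = least_squares_line(xs_with, ys_with)

print("without the point:", round(slope_without, 3), round(intercept_without, 3))
print("with the point:   ", round(slope_with, 3), round(intercept_with, 3))
```

A single point pulls the slope from 2 down to nearly 0, which is the kind of difference a written analysis should report.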

Another name for the regression line is the​ _______ line.

Another name for the regression line is the least squares line because it is chosen so that the sum of the squares of the differences between the observed​ y-value and the value predicted by the line is as small as possible.
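
The "least squares" property can be checked directly. In this sketch, with made-up data and hypothetical helper names, the fitted line is compared against a few slightly nudged lines; each nudged line should have a larger sum of squared differences.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def sum_squared_errors(xs, ys, slope, intercept):
    """Sum of squared differences between observed and predicted y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(xs, ys)
best_sse = sum_squared_errors(xs, ys, slope, intercept)

# Try lines with a slightly different slope or intercept.
nudged_lines = [
    (slope + 0.1, intercept),
    (slope - 0.1, intercept),
    (slope, intercept + 0.5),
    (slope, intercept - 0.5),
]
nudged_sses = [sum_squared_errors(xs, ys, s, b) for s, b in nudged_lines]

print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
print("best SSE:", round(best_sse, 3))
print("nudged SSEs:", [round(v, 3) for v in nudged_sses])
```

Here the intercept is also the predicted mean y-value at x = 0, matching the intercept card earlier in the set.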

If there is a positive correlation between number of years studying grammar and thumb length (for children), does that prove that longer thumbs cause more studying of grammar, or vice versa? Can you think of a hidden variable that might be influencing both of the other variables? A. It proves causation, because the two variables are related implying causation. Longer thumbs cause an increase in years of studying. The hidden variable is age. B. It does not prove causation, because older children have longer thumbs and have studied grammar longer. Longer thumbs do not cause an increase in years of studying. The hidden variable is age. C. It does not prove causation, because older children have longer thumbs and have studied grammar longer. Longer thumbs do not cause an increase in years of studying. There is no hidden variable. D. It proves causation, because the two variables are related implying causation. Longer thumbs cause an increase in years of studying. There is no hidden variable.

B. It does not prove​ causation, because older children have longer thumbs and have studied grammar longer. Longer thumbs do not cause an increase in years of studying. The hidden variable is age. note: Remember that correlation does not imply causation. Older children have longer thumbs and have studied grammar longer.​ However, longer thumbs do not cause an increase in years of studying. Both are affected by age.

When computing the correlation​ coefficient, what is the effect of changing the order of the variables on​ r? Choose the correct answer below. A. It has no effect on r. B. It changes both the sign and magnitude of r. C. It changes the magnitude of r. D. It changes the sign of r.

A. It has no effect on r. note: Changing the order of the variables does not change r. Note that in the equation for r, it does not matter which variable is called x and which is called y.
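
The symmetry of r can be seen from its formula, where x and y enter in the same way. A minimal check on made-up data (the helper name `pearson_r` is an assumption for illustration):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, chosen only for illustration.
xs = [3, 7, 8, 12, 15]
ys = [10, 14, 13, 20, 22]

r_xy = pearson_r(xs, ys)  # x first, y second
r_yx = pearson_r(ys, xs)  # order swapped

print(r_xy, r_yx)
```

The regression line does not share this symmetry: swapping x and y generally gives a different line, even though r is unchanged.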

It has been noted that people who go to church frequently tend to have lower blood pressure than people who​ don't go to church. Does this mean you can lower your blood pressure by going to​ church? Why or why​ not? Explain. A. Going to church may not cause lower blood pressure. Just because two variables are related does not show that one caused the other. B. Since the two variables are not​ related, going to church may not cause lower blood pressure. C. Since the two variables are​ related, going to church may not cause lower blood pressure.

Correlation does not imply causation. Going to church may not cause lower blood pressure. Just because two variables are related does not show that one caused the other. It could be that healthy people are more likely to go to​ church, or there could be other confounding factors.

Under what conditions can extrapolation be used to make predictions beyond the range of the​ data? Choose the correct answer below. A. When there is a strong positive linear association in the data. B. When the correlation coefficient is close to −1 or +1. C. When the data set contains a large number of pairs of data. D. Never

D. Never note: Extrapolation should not be used to make predictions beyond the range of the data, because the association may not keep its linear shape outside the observed range.

If you were trying to predict the value of a parcel of land in this area (on which there is a home), would you be able to make a better prediction by knowing the acreage or the number of rooms in the house? Explain. A. The number of rooms because the association is stronger between the value of land and the number of rooms than with the acreage because the vertical spread is less. B. The acreage because the association is stronger between the value of land and acreage than with the number of rooms because the vertical spread is less. C. Neither because the association is the same between the value of land and the acreage and the value of land and the number of rooms.

answer: B note: The stronger the association, the better the model is for prediction. The scatterplots show that the association is stronger between the value of land and acreage than between the value of land and the number of rooms because the vertical spread is less. Therefore, knowing the acreage is a better way to predict the value of the land than knowing the number of rooms in the house.

When can a correlation coefficient based on an observational study be used to support a claim of cause and​ effect? A. When the correlation coefficient is close to −1 or +1. B. When the scatterplot of the data has little vertical variation. C. When the correlation coefficient is equal to −1 or +1. D. Never

never note: A correlation coefficient based on an observational study can never be used to support a claim of cause and effect.

For what types of associations are regression models useful? A. Non-linear B. Linear C. Both linear and non-linear D. For all types of associations

Linear note: A linear regression model is useful only for linear associations. If the association is not linear, the model can be misleading.

Since, in​ general, the longer a car is owned the more miles it travels one can say there is a​ _______ between age of a car and mileage.

positive association note: Since the longer a car is owned the more miles it travels, there is an increasing trend, which indicates a positive association.

Fill in the blank. The​ _______ is a tool for making predictions about future observed values and is a useful way of summarizing a linear relationship.

regression equation

The correlation coefficient makes sense only if the trend is linear and the​ _______.

variables are numerical.

