Statistics Chapter 4

Statisticians often write the word _______ in front of the y-variable in the equation of the regression line.

"predicted"

The correlation coefficient is always a number between _______.

-1 and 1

When computing the correlation coefficient, what is the effect of changing the order of the variables on r?

It has no effect on r.
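
A minimal Python sketch of this symmetry. The helper name `pearson_r` and the height and arm-span numbers are assumptions invented for illustration; they do not come from the chapter.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical heights and arm spans (inches), made up for this example.
heights = [60, 62, 65, 68, 70]
arm_spans = [59, 63, 64, 69, 71]

r_xy = pearson_r(heights, arm_spans)  # height as x, arm span as y
r_yx = pearson_r(arm_spans, heights)  # variables swapped
print(r_xy, r_yx)                     # the two values agree
```

The formula treats x and y identically, which is why swapping them leaves r unchanged.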

The _______ is a number that measures the strength of the linear association between two numerical variables.

correlation coefficient

Another name for the regression line is the _______ line.

least squares. It is chosen so that the sum of the squared differences between each observed y-value and the value predicted by the line is as small as possible.
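
The least squares idea can be sketched in Python. The function names `fit_line` and `sum_squared_errors` and the data are hypothetical; the sketch fits a line with the standard slope and intercept formulas, then checks that nudging the line in any direction makes the sum of squared differences larger.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line for the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_errors(xs, ys, slope, intercept):
    """Sum of squared differences between observed and predicted y-values."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 3, 5, 4, 6]

slope, intercept = fit_line(xs, ys)
best = sum_squared_errors(xs, ys, slope, intercept)

# Try lines with a slightly different slope and/or intercept.
nudged = [
    sum_squared_errors(xs, ys, slope + ds, intercept + di)
    for ds in (-0.1, 0.0, 0.1)
    for di in (-0.1, 0.0, 0.1)
    if (ds, di) != (0.0, 0.0)
]
print(slope, intercept, best, min(nudged))
```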

When testing the IQ of a group of adults (aged 25 to 50), an investigator noticed that the correlation between IQ and age was negative. Does this show that IQ goes down as we get older? Why or why not? Explain.

No. Correlation does not imply causation; the negative correlation alone does not show that aging lowers IQ.

Some investors use a technique called the "Dogs of the Dow" to invest. They pick several stocks that are performing poorly from the Dow Jones group (which is a composite of 30 well-known stocks) and invest in these. Explain why these stocks will probably do better than they have done before.

Part of the poor historical performance could be due to chance. If so, regression toward the mean predicts that stocks turning in a lower-than-average performance should tend to perform closer to the mean in the future. In other words, their performance should improve.

The correlation between height and arm span in a sample of adult women was found to be r = 0.941. The correlation between arm span and height in a sample of adult men was found to be r = 0.864. Which association is stronger: the one between height and arm span for women, or the one for men? Explain.

The association between height and arm span for women is stronger because the value of r is farther from 0.

If the correlation between height and weight of a large group of people is 0.67, find the coefficient of determination (as a percent) and explain what it means. Assume that height is the predictor and weight is the response, and assume that the association between height and weight is linear.

The coefficient of determination is 44.89%. Therefore, 44.89% of the variation in weight can be explained by the regression line.
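
The arithmetic behind this answer, as a short Python check. The variable names are chosen here for illustration.

```python
r = 0.67                       # correlation between height and weight
r_squared = r ** 2             # coefficient of determination as a proportion
percent_explained = round(r_squared * 100, 2)
print(percent_explained)       # 44.89
```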

How is the coefficient of determination related to the correlation, and what does the coefficient of determination show?

The coefficient of determination is the square of the correlation, and it shows the proportion of the variation in the response variable that is explained by the explanatory variable.

The correlation between house price (in dollars) and area of the house (in square feet) for some houses is 0.91. If you found the correlation between house price in thousands of dollars and area in square feet for the same houses, what would the correlation be?

The new correlation would be 0.91. Changing units or multiplying the numbers for a variable by a positive constant does not change the correlation.
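
A small Python sketch of why the units do not matter. The prices and areas are invented, so their correlation is not the 0.91 from the question, and `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical houses: area in square feet and price in dollars.
areas_sqft = [1100, 1500, 1800, 2400, 3000]
prices_dollars = [150000, 200000, 250000, 310000, 420000]

# The same prices expressed in thousands of dollars.
prices_thousands = [p / 1000 for p in prices_dollars]

r_dollars = pearson_r(areas_sqft, prices_dollars)
r_thousands = pearson_r(areas_sqft, prices_thousands)
print(r_dollars, r_thousands)  # equal up to floating-point rounding
```

Dividing every price by 1000 scales the numerator and the denominator of r by the same factor, so the ratio is unchanged.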

Since, in general, the longer a car is owned the more miles it travels, one can say there is a _______ between the age of a car and its mileage.

a positive association

Attempting to use the regression equation to make predictions beyond the range of the data is called _______.

extrapolation

Since outliers can greatly affect the regression line, they are also called _______ points.

influential

For what types of associations are regression models useful?

linear

Under what conditions can extrapolation be used to make predictions beyond the range of the data?

never

When can a correlation coefficient based on an observational study be used to support a claim of cause and effect?

never

The _______ is a tool for making predictions about future observed values and is a useful way of summarizing a linear relationship.

regression equation

When describing two-variable associations, a written description should always include trend, shape, strength, and which of the following?

The context of the data

What is an influential point?

An influential point is a point that changes the regression equation by a large amount.

The value that measures how much variation in the response variable is explained by the explanatory variable is called the _______.

coefficient of determination

What is extrapolation and why is it a bad idea in regression analysis?

Extrapolation is prediction far outside the range of the data. These predictions may be incorrect if the linear trend does not continue, and so extrapolation generally should not be trusted.

It has been noted that people who go to church frequently tend to have lower blood pressure than people who don't go to church. Does this mean you can lower your blood pressure by going to church? Why or why not? Explain.

Going to church may not cause lower blood pressure. Just because two variables are related does not show that one caused the other.

Suppose that the growth rate of children looks like a straight line if the height of a child is observed at the ages of 24 months, 28 months, 32 months, and 36 months. If you use the regression obtained from these ages and predict the height of the child at 21 years, you might find that the predicted height is 20 feet. What is wrong with the prediction and the process used?

Growth rates slow as people get older. One should not extrapolate. That is, one should not predict outside the range of the data.
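
A hedged Python illustration of the same mistake. The heights below are invented, so the absurd prediction comes out near 10 feet rather than the 20 feet in the question, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line for the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical toddler data: age in months, height in inches.
ages_months = [24, 28, 32, 36]
heights_in = [34.0, 35.6, 36.9, 38.5]

slope, intercept = fit_line(ages_months, heights_in)

# Extrapolate to 21 years (252 months), far outside the 24-36 month range.
age_21_years = 21 * 12
predicted_in = intercept + slope * age_21_years
print(predicted_in, predicted_in / 12)  # about 118 inches, nearly 10 feet
```

The line fits the four observed ages well, but nothing in the data says the trend continues for another 18 years.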

If there is a positive correlation between number of years studying math and thumb length (for children), does that prove that longer thumbs cause more studying of math, or vice versa? Can you think of a hidden variable that might be influencing both of the other variables?

It does not prove causation, because older children have longer thumbs and have studied math longer. Longer thumbs do not cause an increase in years of studying. The hidden variable is age.

The correlation coefficient makes sense only if the trend is linear and the _______.

variables are numerical

A large amount of scatter in a scatterplot is an indication that the association between the two variables is _______.

weak

Suppose a doctor telephones those patients who are in the highest 10% with regard to their recently recorded blood pressure and asks them to return for a clinical review. When she retakes their blood pressures, will those new blood pressures, as a group (that is, on average), tend to be higher than, lower than, or the same as the earlier blood pressures, and why?

The new blood pressures will tend to be lower. Part of the high reading might be due to chance, and regression toward the mean predicts that a repeated measurement will be closer to the typical value.
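
Regression toward the mean can be seen in a small simulation. This is only a sketch under stated assumptions: each patient is given a stable "true" pressure plus independent chance variation on each visit, and the means, spreads, and sample size are all invented.

```python
import random

random.seed(0)   # fixed seed so the run is repeatable
n = 10_000       # hypothetical number of patients

# Assumed model: reading = stable true pressure + chance variation per visit.
true_bp = [random.gauss(120, 10) for _ in range(n)]
first = [t + random.gauss(0, 8) for t in true_bp]
second = [t + random.gauss(0, 8) for t in true_bp]

# Select patients whose first reading is in the highest 10%.
cutoff = sorted(first)[int(0.9 * n)]
top = [i for i in range(n) if first[i] >= cutoff]

mean_first = sum(first[i] for i in top) / len(top)
mean_second = sum(second[i] for i in top) / len(top)
overall_mean = sum(first) / n
print(mean_first, mean_second, overall_mean)
```

In this setup the selected group's second average falls below its first average but stays above the overall average: the readings move toward the mean without reaching it.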

One important use of the regression line is to _______.

make predictions about the values of y for a given x-value

The intercept of a regression line tells a person the predicted mean y-value when the x-value is _______.

0

What type of effect can outliers have on a regression line?

A big effect

Does a correlation of −0.4 or +0.5 give a larger coefficient of determination? We say that the linear relationship that has the larger coefficient of determination is more strongly correlated. Which of the values shows a stronger correlation?

A correlation of +0.5 gives a larger coefficient of determination and shows a stronger correlation.
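
A quick Python check of the comparison. The function name `coefficient_of_determination` is chosen here for illustration.

```python
def coefficient_of_determination(r):
    """Square of the correlation coefficient."""
    return r ** 2

r_negative = -0.4
r_positive = 0.5

cod_negative = coefficient_of_determination(r_negative)  # about 0.16
cod_positive = coefficient_of_determination(r_positive)  # 0.25
print(cod_negative, cod_positive)
```

Squaring removes the sign, so only the distance of r from 0 matters when comparing strength.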

When writing a regression equation, what is not a name for the x-variable?

Dependent variable

When a data set contains influential points, how should regression and correlation be done?

Do regression and correlation with and without these points and comment on the differences.
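
A minimal Python sketch of the with-and-without comparison. The data, the single unusual point, and the helper name `fit_line` are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line for the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up data that follow a nearly straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

# The same data plus one hypothetical point far from the rest.
xs_with = xs + [20]
ys_with = ys + [5.0]

slope_without, intercept_without = fit_line(xs, ys)
slope_with, intercept_with = fit_line(xs_with, ys_with)
print(slope_without, slope_with)  # the slope drops sharply with the extra point
```

Reporting both fits side by side lets a reader see how much the conclusion depends on that one point.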

