MATH-164 - Chapter 4

The linear correlation coefficient is always between

−1 and 1. The linear correlation coefficient is always between −1 and 1, inclusive.

In a scatter​ diagram, the _______ variable is plotted on the horizontal axis and the _______ variable is plotted on the vertical axis.

Explanatory, response

The closer r is to +​1, the _______ the evidence is of ________ association between the two variables.

Stronger, Positive

Two variables that are linearly related are negatively associated when above-average values of one variable are associated with below-average values of the other variable.

That is, two variables are negatively associated if, whenever the value of one variable increases, the value of the other variable decreases.

What does it mean to say that two variables are negatively​ associated?

There is a linear relationship between the​ variables, and whenever the value of one variable​ increases, the value of the other variable decreases.

What does it mean to say that two variables are positively​ associated?

There is a linear relationship between the​ variables, and whenever the value of one variable​ increases, the value of the other variable increases.

lurking variable

A variable other than x and y that simultaneously affects both variables, accounting for the correlation between the two. It is an explanatory variable that was not considered in the study but affects the value of the response variable. In addition, lurking variables are typically related to explanatory variables considered in the study.

positive linear correlation coefficient

means that the sum of the products of the z-scores for x and y must be positive.

response (dependent) variable

The variable of interest, which measures the outcome of a study. It is the variable whose value can be explained by the value of the explanatory (or predictor, or independent) variable.

Confounding

​ in a study occurs when the effects of two or more explanatory variables are not separated. Therefore, any relation that may exist between an explanatory and response variable may be due to some other variable or variables not accounted for in the study.

Match the linear correlation coefficient to the scatter diagram. The scales on the​ x- and​ y-axis are the same for each scatter diagram. (a) r=−1​, (b) r=−0.049​ (c) r=−0.0810​

​(a) Scatter diagram I. ​(b) Scatter diagram II. ​(c) Scatter diagram III.

Confounding Variables (CV)

Also known as extraneous variables, these cannot be controlled by the researcher and could influence any change in the dependent variable (DV). A confounding variable is a third variable that can distort the relation between the independent variable and the dependent variable, which biases the experiment. It is an explanatory variable that was considered in the study whose effect cannot be distinguished from that of a second explanatory variable in the study.

Two variables that are linearly related are positively associated when above-average values of one variable are associated with above-average values of the other variable (or below-average values of one variable are associated with below-average values of the other variable).

That is, two variables are positively associated if, whenever the value of one variable increases, the value of the other variable also increases.

A pediatrician wants to determine the relation that may exist between a​ child's height and head circumference. She randomly selects 8​ children, measures their height and head​ circumference, and obtains the data shown in the table. ​(a) If the pediatrician wants to use height to predict head​ circumference, determine which variable is the explanatory variable and which is the response variable.

(a) The explanatory variable is height and the response variable is head circumference. In the scatter diagram, height goes on the x-axis and head circumference on the y-axis.

The accompanying data represent the number of days absent, x, and the final exam score, y, for a sample of college students in a general education course at a large state university.
Number of absences, x    Final exam score, y
0                        88.6
1                        85.6
2                        82.7
3                        80.7
4                        77.5
5                        73.6
6                        63.5
7                        71.2
8                        66.1
9                        66.7
Complete parts (a) through (e) below.
(a) Find the least-squares regression line treating number of absences as the explanatory variable and the final exam score as the response variable.
(b) Interpret the slope and the y-intercept, if appropriate.
(c) Predict the final exam score for a student who misses five class periods.
(d) Draw the least-squares regression line on the scatter diagram of the data.
(e) Would it be reasonable to use the least-squares regression line to predict the final exam score for a student who has missed 15 class periods? Why or why not?

(a) y = −2.707x + 87.8
(b) For every additional absence, a student's final exam score drops 2.707 points, on average. The average final exam score of students who miss no classes is 87.8.
(c) y = −2.707(5) + 87.8 = 74.27 (rounded to two decimal places). The residual for the student in the data with 5 absences is observed − predicted = 73.6 − 74.27 = −0.67, so that student's score is below the average for 5 absences.
(e) No, because 15 absences is outside the scope of the model (the given data only go up to 9 absences).
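The answer above can be checked by hand or with a few lines of code. The following is a minimal sketch in plain Python, using the absences data from the exercise; the function name `least_squares` is just an illustrative choice, and the prediction uses the rounded coefficients the way the answer key does.

```python
# Data copied from the exercise: absences (x) and final exam scores (y).
absences = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
scores = [88.6, 85.6, 82.7, 80.7, 77.5, 73.6, 63.5, 71.2, 66.1, 66.7]


def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


slope, intercept = least_squares(absences, scores)
print(f"y = {slope:.3f}x + {intercept:.1f}")

# Part (c): predict with the rounded coefficients, as the answer key does.
predicted_5 = round(slope, 3) * 5 + round(intercept, 1)

# Residual = observed - predicted for the student with 5 absences.
residual_5 = scores[absences.index(5)] - predicted_5
print(f"predicted = {predicted_5:.3f}, residual = {residual_5:.3f}")
```

The unrounded prediction is 74.265 and the unrounded residual is −0.665, which the answer reports as 74.27 and −0.67.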

scatter diagram (scatterplot)

A plot of paired (x, y) data with a horizontal x-axis and a vertical y-axis. The data are paired so that each value from one data set is matched with a corresponding value from a second data set. It helps determine whether there is some relationship between two variables. It is a graph that shows the relationship between two quantitative variables measured on the same individual. Each individual in the data set is represented by a point in the scatter diagram. The explanatory variable is plotted on the horizontal axis (x), and the response variable is plotted on the vertical axis (y).

bivariate data

Consists of two variables, an explanatory and a response variable, usually quantitative. It is data in which two variables are measured on an individual. For example, we might want to know whether the amount of cola consumed per week is related to a person's bone density. The individuals would be the people in the study, and the two variables would be the amount of cola consumed weekly and bone density.

Determine whether the following statement is true or false. If r is close to​ 0, then little or no evidence exists of a relation between the two quantitative variables.

False. A value of r close to zero does not imply no​ relation, just no linear relation.
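A small illustration of why r near 0 only rules out a linear relation: in the sketch below, y is completely determined by x (y = x²), yet r is 0. The data are made up for illustration and are not from the chapter, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Sample linear correlation coefficient for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Illustrative data: a perfect, but non-linear, relation y = x^2.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.3f}")
```

Because the x-values are symmetric about 0, the positive and negative products cancel and r comes out as 0 even though the relation is exact.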

T/F Which of the following is true of the​ least-squares regression line y=b1x+b0​?

The sign of the linear correlation coefficient, r, and the sign of the slope of the least-squares regression line, b1, are the same. The predicted value of y, ŷ, is an estimate of the mean value of the response variable for that particular value of the explanatory variable. The least-squares regression line always contains the point (x̄, ȳ). The least-squares regression line minimizes the sum of squared residuals.

The least-squares regression line minimizes the sum of the squared errors (or residuals).

This line minimizes the sum of the squared vertical distances between the observed values of y and those predicted by the line, ŷ (read "y-hat"). We represent this as "minimize Σ residuals²".
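One way to see the "minimize" idea is to compare the least-squares line against nearby lines. The sketch below assumes the absences and exam-score data from this chapter's exercise and uses a hypothetical helper, `sum_squared_residuals`. It nudges the slope and intercept and shows that each nudged line has a larger sum of squared residuals.

```python
# Absences (x) and final exam scores (y) from the chapter's exercise.
absences = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
scores = [88.6, 85.6, 82.7, 80.7, 77.5, 73.6, 63.5, 71.2, 66.1, 66.7]


def sum_squared_residuals(slope, intercept, xs, ys):
    """Sum of (observed y - predicted y)^2 for the line y = slope*x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


n = len(absences)
x_bar = sum(absences) / n
y_bar = sum(scores) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(absences, scores))
      / sum((x - x_bar) ** 2 for x in absences))
b0 = y_bar - b1 * x_bar

best = sum_squared_residuals(b1, b0, absences, scores)
print(f"least-squares line: SSR = {best:.3f}")

# Nudge the slope or the intercept; every alternative does worse.
alternatives = [(b1 + 0.1, b0), (b1 - 0.1, b0), (b1, b0 + 0.5), (b1, b0 - 0.5)]
for slope, intercept in alternatives:
    ssr = sum_squared_residuals(slope, intercept, absences, scores)
    print(f"slope={slope:.3f}, intercept={intercept:.2f}: SSR = {ssr:.3f}")

# The least-squares line also passes through the point (x-bar, y-bar).
passes_through_means = abs((b1 * x_bar + b0) - y_bar) < 1e-9
print(passes_through_means)
```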

Remember the idea of the slope of a line from algebra?

Two variables that are positively associated can be described by a line with positive slope; two variables that are negatively associated can be described by a line with negative slope.

A student at a junior college conducted a survey of 20 randomly selected full-time students to determine the relation between the number of hours of video game playing each week, x, and grade-point average, y. She found that a linear relation exists between the two variables. The least-squares regression line that describes this relation is y = −0.0572x + 2.9205. (a) Predict the grade-point average of a student who plays video games 8 hours per week. (b) Interpret the slope. (c) If appropriate, interpret the y-intercept. (d) A student who plays video games 7 hours per week has a grade-point average of 2.65. Is the student's grade-point average above or below average among all students who play video games 7 hours per week?

(a) y = −0.0572(8) + 2.9205 = 2.46
(b) For each additional hour that a student spends playing video games in a week, the grade-point average will decrease by 0.0572 points, on average.
(c) The grade-point average of a student who does not play video games (x = 0) is 2.9205.
(d) The student's grade-point average is above average for those who play video games 7 hours per week: the predicted value is y = −0.0572(7) + 2.9205 = 2.52, and 2.65 > 2.52.
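The arithmetic in parts (a) and (d) can be reproduced with a short sketch. The coefficients come from the exercise; the function name `predicted_gpa` is only an illustrative choice.

```python
def predicted_gpa(hours_per_week):
    """Predicted GPA from the exercise's line y = -0.0572x + 2.9205."""
    return -0.0572 * hours_per_week + 2.9205


# Part (a): prediction for 8 hours per week.
gpa_8 = predicted_gpa(8)

# Part (d): compare the observed GPA of 2.65 with the prediction at 7 hours.
gpa_7 = predicted_gpa(7)
residual_7 = 2.65 - gpa_7  # observed - predicted
position = "above" if residual_7 > 0 else "below"

print(f"(a) {gpa_8:.2f}")
print(f"(d) predicted {gpa_7:.2f}, so the student is {position} average")
```

A positive residual means the observed value lies above the line, which is why the student in part (d) is above average.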

Is there a relation between the age difference between​ husband/wives and the percent of a country that is​ literate? Researchers found the​ least-squares regression between age difference​ (husband age minus wife​ age), y, and literacy rate​ (percent of the population that is​ literate), x, is y=−0.0424x+8.2. The model applied for 17≤x≤100. Complete parts​ (a) through​ (e) below. ​(a) Interpret the slope. Select the correct choice below and fill in the answer box to complete your choice. (b) Does it make sense to interpret the​ y-intercept? Explain. Choose the correct answer below. ​(c) Predict the age difference between​ husband/wife in a country where the literacy rate is 43 percent. ​(d) Would it make sense to use this model to predict the age difference between​ husband/wife in a country where the literacy rate is 11​%? ​(e) The literacy rate in a country is 98​% and the age difference between husbands and wives is 2 years. Is this age difference above or below the average age difference among all countries whose literacy rate is 98​%? Select the correct choice below and fill in the answer box to complete your choice.

(a) For every unit increase in literacy rate, the age difference falls by 0.0424 units, on average.
(b) No, it does not make sense to interpret the y-intercept because an x-value of 0 is outside the scope of the model.
(c) 6.4 years: y = −0.0424(43) + 8.2 = −1.8232 + 8.2 = 6.3768, or about 6.4.
(d) No, it does not make sense because an x-value of 11 is outside the scope of the model (the model applies only for 17 ≤ x ≤ 100).
(e) Below. The average age difference among all countries whose literacy rate is 98% is 4.0 years: y = −0.0424(98) + 8.2 = 4.0448, and the observed 2 years is less than that.
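The "scope of the model" idea in parts (b) and (d) can be made explicit in code. This is a minimal sketch, assuming the line and the range 17 ≤ x ≤ 100 stated in the exercise; the function name and the choice to raise `ValueError` for out-of-scope inputs are illustrative, not part of the exercise.

```python
def predict_age_difference(literacy_rate):
    """Predicted husband-minus-wife age difference for a given literacy rate (%).

    Refuses to extrapolate outside the range where the model applies.
    """
    if not 17 <= literacy_rate <= 100:
        raise ValueError(f"{literacy_rate} is outside the scope of the model (17 to 100)")
    return -0.0424 * literacy_rate + 8.2


# Part (c): literacy rate of 43 percent.
print(f"{predict_age_difference(43):.1f}")

# Part (e): compare an observed difference of 2 years with the prediction at 98 percent.
predicted_98 = predict_age_difference(98)
print("below" if 2 < predicted_98 else "above")

# Part (d): 11 percent is out of scope, so no prediction is made.
try:
    predict_age_difference(11)
except ValueError as error:
    print(error)
```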

A pediatrician wants to determine the relation that exists between a​ child's height,​ x, and head​ circumference, y. She randomly selects 11 children from her​ practice, measures their heights and head circumferences and obtains the accompanying data. Complete parts​ (a) through​ (e). ​(a) Find the​ least-squares regression line treating height as the explanatory variable and head circumference as the response variable. (b)Use the regression equation to predict the head circumference of a child who is 25 inches tall. ​(c​) Compute the residual based on the observed head circumference of the 25​-inch-tall child in the table. Is the head circumference of this child above average or below​ average? ​(d) Draw the​ least-squares regression line on the scatter diagram of the data and label the residual from part​ (c). Choose the correct graph below. Is the head circumference of this child above average or below​ average? ​(e) Notice that two children are 26.75 inches tall. One has a head circumference of 17.3​ inches; the other has a head circumference of 17.5 inches. How can this​ be?

(a) The least-squares regression line is y = 0.1863x + 12.3728.
(b) The predicted value of the head circumference of a child who is 25 inches tall is 17.03 inches.
(c) The residual based on the observed head circumference of the 25-inch-tall child is −0.13 inches. To calculate it, find the observed head circumference of the 25-inch-tall child in the table, then subtract the predicted 17.03.
(d) Select the graph that matches the StatCrunch output. The child's head circumference is below average, because the residual is negative.
(e) For children who are 26.75 inches tall, head circumference varies.

Lyme disease is an inflammatory disease that results in a skin rash and flulike symptoms. It is transmitted through the bite of an infected deer tick. The following data represent the number of reported cases of Lyme disease and the number of drowning deaths for a rural county. Complete parts ​(a) through ​(c) below. ​(a) Draw a scatter diagram of the data. Choose the correct graph below. ​(b) Determine the linear correlation coefficient between Lyme disease and drowning deaths. ​(c) Does a linear relation exist between the number of reported cases of Lyme disease and the number of drowning​ deaths?

(a) Use the applet to draw the scatter diagram, with Lyme disease on the x-axis and drowning deaths on the y-axis.
(b) The linear correlation coefficient between Lyme disease and drowning deaths is r = 0.957.
(c) The variables Lyme disease and drowning deaths are positively associated because r is positive and the absolute value of the correlation coefficient, 0.957, is greater than the critical value, 0.576. An increase in Lyme disease does not cause an increase in drowning deaths. The temperature and time of year are likely lurking variables.

A pediatrician wants to determine the relation that exists between a child's height, x, and head circumference, y. She randomly selects 11 children from her practice, measures their heights and head circumferences, and obtains the accompanying data.
Height (inches), x    Head circumference (inches), y
27.5                  17.8
24.5                  17.3
25.5                  17.3
26                    17.8
24.25                 17.1
28                    17.9
26.5                  17.6
27.25                 17.8
26                    17.5
26                    17.7
28                    17.8
Complete parts (a) through (g) below.
(a) Find the least-squares regression line treating height as the explanatory variable and head circumference as the response variable.
(b) Interpret the slope and y-intercept, if appropriate.
(c) Use the regression equation to predict the head circumference of a child who is 24.25 inches tall.
(d) Compute the residual based on the observed head circumference of the 24.25-inch-tall child in the table. Is the head circumference of this child above or below the value predicted by the regression model?
(e) Draw the least-squares regression line on the scatter diagram of the data and label the residual from part (d).
(f) Notice that two children are 26 inches tall. One has a head circumference of 17.5 inches; the other has a head circumference of 17.7 inches. How can this be?
(g) Would it be reasonable to use the least-squares regression line to predict the head circumference of a child who was 32 inches tall? Why?

(a) y = 0.183x + 12.8
(b) For every inch increase in height, the head circumference increases by 0.183 in., on average. It is not appropriate to interpret the y-intercept.
(c) y = 0.183(24.25) + 12.8 = 17.24 in.
(d) The residual for this observation is 17.1 − 17.24 = −0.14, meaning that the head circumference of this child is below the value predicted by the regression model.
(f) For children with a height of 26 inches, head circumferences vary.
(g) No, this height is outside the scope of the model (the heights in the data range from 24.25 to 28 inches).

Lyme disease is an inflammatory disease that results in a skin rash and flulike symptoms. It is transmitted through the bite of an infected deer tick. The following data represent the number of reported cases of Lyme disease and the number of drowning deaths for a rural county.
Month    Cases of Lyme disease    Drowning deaths
J        3                        0
F        1                        1
M        3                        2
A        4                        1
M        5                        2
J        15                       10
J        22                       16
A        13                       5
S        6                        3
O        5                        3
N        4                        1
D        1                        0
Critical Values for Correlation Coefficient
n     Critical value
3     0.997
4     0.950
5     0.878
6     0.811
7     0.754
8     0.707
9     0.666
10    0.632
11    0.602
12    0.576
13    0.553
14    0.532
15    0.514
16    0.497
17    0.482
18    0.468
19    0.456
20    0.444
21    0.433
22    0.423
23    0.413
24    0.404
25    0.396
26    0.388
27    0.381
28    0.374
29    0.367
30    0.361
Complete parts (a) through (c) below.
(a) Draw a scatter diagram of the data.
(b) Determine the linear correlation coefficient between Lyme disease and drowning deaths.
(c) Does a linear relation exist between the number of reported cases of Lyme disease and the number of drowning deaths? Do you believe that an increase of Lyme disease causes an increase in drowning deaths? What is a likely lurking variable between cases of Lyme disease and drowning deaths?

(b) The linear correlation coefficient between Lyme disease and drowning deaths is r = 0.964 (rounded to three decimal places).
(c) The variables Lyme disease and drowning deaths are positively associated because r is positive and the absolute value of the correlation coefficient, 0.964, is greater than the critical value, 0.576. Look up the critical value for the sample size of the given data (n = 12).
(d) An increase in Lyme disease does not cause an increase in drowning deaths. The temperature and time of year are likely lurking variables.
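The value of r and the comparison with the critical value can be reproduced as follows. This is a minimal sketch in plain Python using the twelve monthly pairs from the exercise and the tabled critical value for n = 12; `pearson_r` is a hypothetical helper name.

```python
import math

# Monthly data from the exercise (January through December).
lyme_cases = [3, 1, 3, 4, 5, 15, 22, 13, 6, 5, 4, 1]
drowning_deaths = [0, 1, 2, 1, 2, 10, 16, 5, 3, 3, 1, 0]

# Critical value for n = 12 from the table in the exercise.
CRITICAL_VALUE_N12 = 0.576


def pearson_r(xs, ys):
    """Sample linear correlation coefficient for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(lyme_cases, drowning_deaths)
# A linear relation is indicated when |r| exceeds the critical value.
linear_relation = abs(r) > CRITICAL_VALUE_N12

print(f"r = {r:.3f}, linear relation: {linear_relation}")
```

A large r here says nothing about causation; the code only measures linear association, so the lurking-variable reasoning in part (d) still applies.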

The linear correlation coefficient, or Pearson product moment correlation coefficient

is a measure of the strength and direction of the linear relation between two quantitative variables. The Greek letter ρ (rho) represents the population correlation coefficient, and r represents the sample correlation coefficient. We present only the formula for the sample correlation coefficient.
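One common way to write the sample formula is as the sum of the products of the z-scores for x and y, divided by n − 1. The sketch below assumes that form and applies it to the height and head-circumference data from this chapter's exercise; the helper name `z_scores` is illustrative.

```python
import math

# Heights (x) and head circumferences (y) from the chapter's exercise.
heights = [27.5, 24.5, 25.5, 26, 24.25, 28, 26.5, 27.25, 26, 26, 28]
head_circumferences = [17.8, 17.3, 17.3, 17.8, 17.1, 17.9, 17.6, 17.8, 17.5, 17.7, 17.8]


def z_scores(values):
    """Standardize values using the sample mean and sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


zx = z_scores(heights)
zy = z_scores(head_circumferences)

# r = (sum of products of z-scores) / (n - 1)
sum_of_products = sum(a * b for a, b in zip(zx, zy))
r = sum_of_products / (len(heights) - 1)

print(f"sum of z-score products = {sum_of_products:.3f}, r = {r:.3f}")
```

Since n − 1 is positive, r has the same sign as the sum of the z-score products, which is why a positive r requires that sum to be positive.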

Suppose the line y = 2.8333x − 22.4967 describes the relation between the club-head speed (in miles per hour), x, and the distance a golf ball travels (in yards), y. (a) Predict the distance a golf ball will travel if the club-head speed is 100 mph. (b) Suppose the observed distance a golf ball traveled when the club-head speed was 100 mph was 265 yards. What is the residual?

(a) The golf ball will travel 260.8 yards: y = 2.8333(100) − 22.4967 = 283.33 − 22.4967 = 260.8.
(b) The residual is 4.2: residual = observed y − predicted y = 265 − 260.8 = 4.2.
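As a quick check of the golf-ball answer, here is a minimal sketch that evaluates the line at 100 mph and computes the residual as observed minus predicted. The coefficients come from the exercise; the function name `predicted_distance` is illustrative.

```python
def predicted_distance(club_head_speed_mph):
    """Predicted distance in yards from the line y = 2.8333x - 22.4967."""
    return 2.8333 * club_head_speed_mph - 22.4967


predicted = predicted_distance(100)
residual = 265 - predicted  # observed - predicted

print(f"predicted = {predicted:.1f} yards, residual = {residual:.1f} yards")
```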

