Chapter 4 Scatterplots and Correlation
What does each dot in a scatterplot represent?
each ( X,Y ) pair
Outline:
(1) Plot the data with a scatterplot; look for the overall pattern of the relationship and for deviations from that pattern, especially outliers. (2) Compute a numerical summary, the correlation coefficient, as a measure of the strength of the linear relationship. (3) Model linear relationships with a straight-line equation called the regression equation.
What inequality correctly identifies the set of values that a correlation 𝑟 can take?
−1 ≤ r ≤ 1
Correlation is not
A complete summary of two-variable data
Values of r near 0 indicate
A very weak linear relationship
Scatterplots...
Are most useful for displaying the relationship between two quantitative variables
Variables can be:
Both quantitative, both categorical, or one of each
Sometimes explanatory variables and response variables
Cannot be designated
r has no units and does not change when we
Change the units of measurement of X, Y or both
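A minimal sketch of this property, using made-up height and weight values (the data, the conversion step, and the helper name `pearson_r` are illustrative assumptions, not from the chapter). The helper uses the shortcut form of r, which is algebraically equivalent to averaging products of standardized values.

```python
import math

def pearson_r(xs, ys):
    """Correlation via the shortcut form Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical measurements: heights in inches, weights in pounds
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [115, 120, 140, 155, 160, 180]

# Same individuals, different units: centimeters and kilograms
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.4536 for w in weights_lb]

r_original = pearson_r(heights_in, weights_lb)
r_converted = pearson_r(heights_cm, weights_kg)
print(r_original, r_converted)  # the two values agree up to rounding error
```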
What best describes correlation?
Correlation measures the strength of the linear relationship between two quantitative variables
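One common way to compute r is to standardize each variable and average the products of the standardized values with an n − 1 divisor. The sketch below assumes that definition; the study-hours data and the function name `pearson_r` are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation: sum of products of standardized values, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (n - 1 divisor)
    sx = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sy = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # Standardize each observation, multiply the pairs, and average
    products = [((x - mean_x) / sx) * ((y - mean_y) / sy) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

# Hypothetical data: hours studied and exam score for five students
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 75]

r = pearson_r(hours, scores)
print(round(r, 2))  # about 0.98: a strong positive linear relationship
```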
When examining a relationship between two quantitative variables, we look at...
Form, direction, strength, and outliers.
Two variables are not associated if
Knowing the value of one variable does not give you any information about the value of the other variable
Negative r (correlation) indicates
Negative association
Correlation does not describe curved relationships between variables
No matter how strong the relationship is
What is the first rule in analyzing bivariate data when both variables are quantitative?
Plot the data
Positive r (correlation) indicates
Positive association between the variables
The points on a scatterplot lie very close to a straight line. The correlation between 𝑥 and 𝑦 is close to
either −1 or 1; we can't say which without knowing whether the line slopes down or up
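To illustrate why the sign cannot be determined from closeness to a line alone, the sketch below builds two made-up data sets that hug straight lines with opposite slopes. The data, the small fixed wiggles, and the helper `pearson_r` are assumptions for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Correlation via the shortcut form Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = list(range(10))
# Small fixed deviations so the points are close to, but not exactly on, a line
wiggle = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05, 0.1, -0.1]

ys_rising = [2 * x + 1 + w for x, w in zip(xs, wiggle)]    # upward-sloping line
ys_falling = [-2 * x + 1 + w for x, w in zip(xs, wiggle)]  # downward-sloping line

r_rising = pearson_r(xs, ys_rising)
r_falling = pearson_r(xs, ys_falling)
print(r_rising, r_falling)  # one is close to 1, the other close to -1
```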
What type of data is graphed with a scatterplot?
Two quantitative variables measured on each individual
Changing the units will not affect the correlation because...
r is computed from standardized values, which have no units
To add a categorical variable to an existing scatterplot....
Use a different color or symbol for each category
When should you model with a straight line?
When the relationship between X and Y is linear and X and Y are quantitative
Which axis is the explanatory variable?
X-axis
Which axis is the response variable?
Y-axis
Explanatory variables
are denoted by X and explain or influence changes in a response variable; also known as independent variables
Response variables
are denoted by Y and measure the outcome of a study; also known as dependent variables
Changing units does not...
change the correlation.
Correlation makes no distinction between
explanatory and response variables
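A quick check of this symmetry, as a sketch on made-up data (the values and the helper `pearson_r` are illustrative assumptions): swapping which variable is treated as X and which as Y leaves r unchanged.

```python
import math

def pearson_r(xs, ys):
    """Correlation via the shortcut form Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements
a = [3, 5, 6, 8, 11, 12]
b = [20, 18, 17, 13, 9, 10]

r_ab = pearson_r(a, b)  # a treated as explanatory, b as response
r_ba = pearson_r(b, a)  # roles swapped
print(r_ab, r_ba)  # same value either way
```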
If the correlation between two variables is close to 0 , you can conclude that a scatterplot would show
no straight‑line pattern, but there might be a strong pattern of another form.
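A small sketch of this situation, assuming a made-up data set in which y is exactly x squared on values of x symmetric about 0 (the helper `pearson_r` is also illustrative). The relationship is perfectly predictable, yet r comes out as 0 because the pattern is curved rather than linear.

```python
import math

def pearson_r(xs, ys):
    """Correlation via the shortcut form Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]  # a perfect U-shaped (quadratic) relationship

r_curved = pearson_r(xs, ys)
print(r_curved)  # 0.0: no linear pattern, despite a perfect curved one
```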
Correlation requires that both variables be
quantitative
Correlation is not resistant,
r is strongly affected by a few outlying observations
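The sketch below shows how a single outlying point can change r, using made-up data (the values, the added outlier, and the helper `pearson_r` are assumptions for illustration). Six points lie close to a rising line; one extra point far below that line is then added.

```python
import math

def pearson_r(xs, ys):
    """Correlation via the shortcut form Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data lying close to a straight line
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1]
r_clean = pearson_r(xs, ys)

# Add one observation far from the overall pattern
r_with_outlier = pearson_r(xs + [7], ys + [0.0])

print(round(r_clean, 3), round(r_with_outlier, 3))  # r drops sharply
```

Because one point can move r this much, it helps to look at the scatterplot before trusting the number.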
positive association
values of one variable tend to be higher when values of the other variable are higher
negative association
values of one variable tend to be lower when values of the other variable are higher