STA215: Data and Correlation

categorical variable

a variable with qualitative data

Which symbol represents slope in a statistical model?

b

What kind of variables can the response and explanatory variables be?

both quantitative, both categorical, or one of each

Fill in the blank: The closer 𝑟² is to 100%, the

closer the data points are to the regression line.

Which of the following makes no distinction as to which variable is 𝑋 and which is 𝑌?

correlation

Why study bivariate data?

to determine whether a relationship exists between the explanatory variable and the response variable

Explanatory Variable

may or may not explain or influence changes in the response variable; denoted X; also called the independent variable

correlation

measures direction and strength of linear association between x and y

Response variable

measures the outcome for each individual; denoted Y; also called the dependent variable

regression

models the linear relationship between x and y and uses the model to predict the value of y for a specific value of x

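if r is negative

the negative areas (products of the x and y deviations) outweigh the positive areas

if r is positive

the positive areas (products of the x and y deviations) outweigh the negative areas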

If we change the unit of measure for 𝑋 from centimeters to feet, will the value of 𝑟 change?

no
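
A minimal sketch of why the answer is "no": 𝑟 is built from z-scores, and z-scores have no units. The measurements below are hypothetical values invented for illustration, and `pearson_r` is a helper written for this example, not part of the course material.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Correlation as the sum of z-score products divided by n - 1."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

# Hypothetical measurements, invented for illustration.
x_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
y = [52.0, 58.0, 69.0, 75.0, 88.0]

# The same X values expressed in feet (1 ft = 30.48 cm).
x_ft = [v / 30.48 for v in x_cm]

r_cm = pearson_r(x_cm, y)
r_ft = pearson_r(x_ft, y)
print(f"r with X in cm: {r_cm:.6f}")
print(f"r with X in ft: {r_ft:.6f}")
```

Both printed values should agree up to floating-point rounding, because dividing every 𝑋 by 30.48 rescales the deviations and the standard deviation by the same factor.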

If we change 𝑋 from diameter to volume and we change 𝑌 from volume to diameter, will the value of 𝑟 change?

no

bivariate data is

quantitative data that has two variables; often represented using a scatterplot

Which of the following requires an important distinction between 𝑋 and 𝑌?

regression

quantitative variable

a variable with numerical data

Fill in the blank: 𝑟² tells us the percentage of the ___ that is (are) explained by the least-squares regression line.

variation in 𝑦

With 𝑟=0.941, how should we describe strength?

very strong

When should you model with a straight line?

when the relationship between 𝑋 and 𝑌 is linear and 𝑋 and 𝑌 are quantitative

Suppose the correlation between number of beers consumed and blood alcohol content is 𝑟=0.9. What percentage of the variation in blood alcohol content can be explained by number of beers consumed?

81%
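
The arithmetic behind this answer, written out as a worked step (squaring the correlation gives the fraction of variation explained):

```latex
r^2 = (0.9)^2 = 0.81 = 81\%
```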

Which of the following values for 𝑟² indicates a perfect fit (i.e., all data points are on the regression line)?

100%

Suppose two different explanatory variables have a linear relationship with a response variable, 𝑦. The regression line using Variable #1 has an 𝑟² of 47% and the regression line using Variable #2 has an 𝑟² of 89%. Which explanatory variable explains the most variation in 𝑦?

Variable #2

What is a Statistical Model?

An equation that fits the pattern between a response variable and possible explanatory variables, accounting for deviations from the model. Simplest case: one quantitative response variable and one quantitative explanatory variable.

True or False: If 𝑟² is really close to 100%, then there is a lot of unexplained variation.

False

True or false: Multiplying the 𝑧‑score for 𝑋 by the 𝑧‑score for 𝑌 always gives positive products.

False
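
A small sketch showing that z-score products can be negative even when the overall relationship is positive. The five data points are hypothetical, chosen so that one point lies above the mean of 𝑋 but below the mean of 𝑌.

```python
from statistics import mean, stdev

# Hypothetical data, invented for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 6.0]

mx, my = mean(xs), mean(ys)
sx, sy = stdev(xs), stdev(ys)

# One z-score product per (X, Y) pair.
products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]

for x, y, p in zip(xs, ys, products):
    print(f"({x}, {y}): product = {p:+.3f}")

has_negative = any(p < 0 for p in products)
has_positive = any(p > 0 for p in products)
print("negative product present:", has_negative)
print("positive product present:", has_positive)
```

A product is negative whenever a point is above average on one variable and below average on the other; here that happens at (4, 3).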

If there is no relationship between 𝑋 and 𝑌 and 𝑟²=0, what shape should we expect the data points in the scatterplot to resemble?

a hamburger parallel to the 𝑥-axis

What does correlation give us?

a measure of direction and strength of the linear relationship between 𝑥 and 𝑦

How can we compare the strength of linear relationships more precisely than by using words such as weak and strong?

with a numerical measure that quantifies the strength, such as the correlation 𝑟

What does regression give us?

a model of the relationship between 𝑥 and 𝑦

Why is plotting the data so important before computing 𝑟?

To check whether the relationship is linear or non‑linear.
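
One way to see why the plot matters: a sketch with hypothetical data that follow an exact curve (𝑦 = 𝑥²) yet give a correlation of essentially zero. Looking only at 𝑟 would suggest no relationship, while a scatterplot would show a clear non-linear one.

```python
from math import sqrt
from statistics import mean

# Hypothetical data following an exact curve, y = x^2.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [x ** 2 for x in xs]

mx, my = mean(xs), mean(ys)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
sxx = sum((x - mx) ** 2 for x in xs)
syy = sum((y - my) ** 2 for y in ys)

# Pearson correlation from sums of deviations.
r = sxy / sqrt(sxx * syy)
print(f"r = {r:.3f}")
```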

True or false: Because the value for 𝑟 is negative, we can say that the direction of the relationship is negative

True

True or false: The closer the data points are to the line, the closer 𝑟 is to either -1.0 or +1.0.

True

strength of relationship on scatterplot

determined by how closely the points follow a clear form. Strong: the points lie close to the form. Weak: the points are scattered far from it.

What does each dot in the scatterplot represent?

each (𝑋,𝑌) pair

Which variable may explain changes in the outcome?

explanatory variable

True or False: If the prediction errors (i.e., residuals) are large, then 𝑟² is close to 100%.

false

True or false: 𝑟 is resistant to outliers.

false
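
A sketch of how strongly a single outlier can move 𝑟. The data are hypothetical: eight nearly linear points, plus one invented point placed far from the pattern.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation computed from sums of deviations."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical, nearly linear data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 6.8, 8.1]

r_without = pearson_r(xs, ys)
# Add one hypothetical outlier at (20, 0), far from the pattern.
r_with = pearson_r(xs + [20.0], ys + [0.0])

print(f"r without outlier: {r_without:.3f}")
print(f"r with outlier:    {r_with:.3f}")
```

In this example the one added point is enough to turn a very strong positive correlation into a weak negative one.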

True or False: If correlation (𝑟) is negative, then slope could be positive or negative—we cannot predict which.

false; If correlation, 𝑟, is negative, then slope will always be negative.

True or False: Knowing only the value of slope, you can determine the value of correlation.

false; Knowing the value of slope only tells us the direction of 𝑟; the value of slope tells us nothing about the value of 𝑟.
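
A sketch of two hypothetical data sets that share the same least-squares slope but have different correlations. The helper `slope_and_r` is written for this example only.

```python
from math import sqrt
from statistics import mean

def slope_and_r(xs, ys):
    """Return the least-squares slope b and the correlation r."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sxx, sxy / sqrt(sxx * syy)

# Hypothetical data, invented for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
tight = [2.0, 4.0, 6.0, 8.0, 10.0]   # points exactly on a line
loose = [3.0, 2.0, 8.0, 6.0, 11.0]   # same trend with added scatter

b_tight, r_tight = slope_and_r(xs, tight)
b_loose, r_loose = slope_and_r(xs, loose)

print(f"tight: slope = {b_tight:.3f}, r = {r_tight:.3f}")
print(f"loose: slope = {b_loose:.3f}, r = {r_loose:.3f}")
```

Both slopes come out equal, so the slope alone cannot reveal how tightly the points cluster around the line.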

True or False: 𝑋 and 𝑌 can be interchanged in both correlation and linear regression.

false; 𝑋 and 𝑌 can be interchanged in correlation, but not in the formula for a regression line 𝑦̂=𝑎+𝑏𝑥.
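
A sketch contrasting the two cases on hypothetical data: swapping 𝑋 and 𝑌 leaves 𝑟 unchanged, but the slope from regressing 𝑌 on 𝑋 is not simply the reciprocal of the slope from regressing 𝑋 on 𝑌, so the two fitted lines differ.

```python
from math import sqrt
from statistics import mean

def sums(xs, ys):
    """Return (Sxy, Sxx, Syy), the sums of products of deviations."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy, sxx, syy

# Hypothetical data, invented for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 2.0, 8.0, 6.0, 11.0]

sxy, sxx, syy = sums(xs, ys)
r_xy = sxy / sqrt(sxx * syy)

# Swap the roles of the two variables and recompute the correlation.
syx, _, _ = sums(ys, xs)
r_yx = syx / sqrt(syy * sxx)

b_y_on_x = sxy / sxx  # slope when predicting y from x
b_x_on_y = sxy / syy  # slope when predicting x from y

print(f"r(x, y) = {r_xy:.4f}, r(y, x) = {r_yx:.4f}")
print(f"slope of y on x = {b_y_on_x:.4f}")
print(f"slope of x on y = {b_x_on_y:.4f} (reciprocal of first slope would be {1 / b_y_on_x:.4f})")
```

The two slopes multiply to 𝑟² rather than to 1, which is why the lines coincide only when the fit is perfect.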

True or False: 𝑟² measures the fraction of 𝑦 values that are exactly predicted by the 𝑥 values.

false; 𝑟² is a measure of the fraction of variation in the 𝑦's that is explained by 𝑥. It does not tell us the fraction of 𝑦 values that are exactly predicted, as most are not, even when 𝑟² is close to 100%.
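
A sketch of this distinction on hypothetical data. It assumes the usual least-squares formulas (slope = Sxy / Sxx, intercept = mean of 𝑦 minus slope times mean of 𝑥) and computes 𝑟² as 1 minus the ratio of unexplained variation to total variation.

```python
from statistics import mean

# Hypothetical, nearly linear data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 6.8, 8.1]

mx, my = mean(xs), mean(ys)
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = my - b * mx

predicted = [a + b * x for x in xs]

sse = sum((y - p) ** 2 for y, p in zip(ys, predicted))  # unexplained variation
sst = sum((y - my) ** 2 for y in ys)                    # total variation in y
r_squared = 1 - sse / sst

# Count the y values that the line predicts exactly (within rounding).
exact_hits = sum(1 for y, p in zip(ys, predicted) if abs(y - p) < 1e-9)

print(f"r^2 = {r_squared:.4f}")
print(f"exactly predicted y values: {exact_hits} of {len(ys)}")
```

Here 𝑟² is above 99% even though none of the eight 𝑦 values is predicted exactly.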

Which variable is the outcome variable?

response variable

Why is it hard to distinguish between the explanatory and response variables?

sometimes they cannot be designated

what does r give

the strength of the linear relationship (its sign gives the direction)

What does the symbol 𝑦̂ represent?

the predicted 𝑦 value

What does total variation in the 𝑦's measure?

the variability of the 𝑦's about their mean 𝑦̄

purpose for r

to measure the size of the joint variation in x and y for each point (each product gives the area of a rectangle)

why do we compute deviations for x and y

to measure the variation in x's and y's

True or False: Both correlation and linear regression require a straight line relationship between 𝑋 and 𝑌.

true

True or False: If correlation (𝑟) is positive, then slope is always positive.

true

True or False: If correlation (𝑟) is zero, then slope is always zero.

true

True or False: If 𝑟² is really close to 100%, then the sum of squared residuals is very small.

true

True or False: Slope and correlation (𝑟) always have the same sign.

true

True or False: The regression line always passes through the point (𝑥̄, 𝑦̄).

true
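
A short derivation of why this holds, assuming the standard least-squares intercept a = ȳ − b x̄ (a formula not stated in this set). Substituting the mean of 𝑥 into the line 𝑦̂ = 𝑎 + 𝑏𝑥 gives:

```latex
\hat{y}(\bar{x}) = a + b\bar{x} = (\bar{y} - b\bar{x}) + b\bar{x} = \bar{y}
```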

True or False: 𝑟² is a measure of how successfully the regression line explains the variation in 𝑦.

true

True or false: A value of 𝑟=-1.5 has to be an error.

true

True or false: The formula for computing 𝑟 includes 𝑧‑scores for both 𝑋 and 𝑌.

true
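
For reference, one common way to write that formula. The n − 1 divisor with sample standard deviations is an assumption here; some textbooks use a different but equivalent convention.

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n} z_{x_i} z_{y_i},
\qquad z_{x_i} = \frac{x_i - \bar{x}}{s_x},
\quad z_{y_i} = \frac{y_i - \bar{y}}{s_y}
```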

True or false: The sign on 𝑟 always denotes the direction of the relationship.

true

True or false: Using color and symbols, we can clearly see the three linear relationships for types of hotdogs displayed in the scatterplot.

true

what is bivariate data?

two measurements (two variables) on each individual in a study - study relationship between variables

What type of data is graphed with a scatterplot?

two quantitative variables measured on each individual

