PSYC 210 - Correlation & Regression

lurking variable/confounding variable

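A third variable that is responsible for the apparent relationship between two other variables. Ex. ice cream sales and shark attacks are both driven by temperature (lurking). Ex. the link between high temperatures and violent crime may be driven by time spent indoors (lurking).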

r = -1, r = 1, r = 0: What do these mean?

r = -1: perfect negative relationship. r = 1: perfect positive relationship. r = 0: no linear relationship at all.

r squared = 0, r squared = 1: What do these mean?

r squared = 0: the model gives no useful prediction. r squared = 1: the model predicts perfectly.

Let's use our height-to-weight regression equation: y = 0.7x - 63. What is the approximate weight (in kg) of someone with a height of 150 cm?

42 kg. Explanation: All we do here is plug in 150 for x in our equation. So, y = 0.7(150) - 63 = 105 - 63, which equals 42 kg.
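
The plug-in step can also be sketched in code. This is only an illustration of the card above; the function name `predict_weight_kg` is made up for this example, and the slope and intercept are the ones given in the question.

```python
def predict_weight_kg(height_cm: float) -> float:
    """Predict weight (kg) from height (cm) using y = 0.7x - 63."""
    slope = 0.7        # b1: change in weight per extra cm of height
    intercept = -63.0  # b0: predicted weight when height is 0
    return slope * height_cm + intercept


predicted = predict_weight_kg(150)
print(round(predicted, 1))  # about 42 kg, matching the card
```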

regression equation

A formula for a line that models a linear relationship between two quantitative variables. Used to make predictions about the variables.

regression line (line of best fit)

A line, segment, or ray drawn on a scatter plot to estimate the relationship between two sets of data.

regression analysis

A method of predicting an outcome (for example, sales) based on finding a relationship between past values of that outcome and one or more independent variables, such as population or income.

R-squared value

A statistical measure of how strongly two variables are related, ranging from 0 to 1; the proportion of variance that the two variables share. How much of the variance in the outcome can be explained by our model (predictors), compared to if we had just used the mean (average value)? How well does our model (regression equation) "fit" the data? How well does our regression line do in minimizing the residuals?
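
One common way to write this idea down is r squared = 1 - (sum of squared residuals / total sum of squares), where the total sum of squares measures how far the data are from their mean. The sketch below assumes that formula; the function name `r_squared` and the example numbers are hypothetical.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Proportion of variance explained: 1 - SS_residual / SS_total."""
    y_bar = mean(observed)
    # Total sum of squares: error if we only ever predicted the mean.
    ss_total = sum((y - y_bar) ** 2 for y in observed)
    # Residual sum of squares: error left over after using the model.
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_residual / ss_total


weights = [40.0, 50.0, 60.0]                    # made-up outcome values
print(r_squared(weights, [40.0, 50.0, 60.0]))   # model predicts perfectly
print(r_squared(weights, [50.0, 50.0, 50.0]))   # model is no better than the mean
```

When the residual sum of squares equals the total sum of squares, the model does no better than the mean and r squared is 0.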

spurious correlation

A statistically significant correlation between two variables that actually have no direct relationship with each other. An example of a Type I error. Ex. ice cream sales and shark attacks.

1 - r squared

The error: the proportion of variance not accounted for by the model. It is difficult or impossible to get a perfect r squared.

I am looking at whether a record company's advertising budget relates to how many albums they sell. I find that r = -0.4. What does this correlation coefficient tell us? Select all that apply.

Companies that spend a lot on advertising tend to have lower sales (and vice versa). There is a moderate relationship between advertising and sales.

regression residual

The distance from each data point to the regression line; the difference between the real (observed) value and the predicted value.
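
As a small sketch of that definition, a residual is the observed value minus the value the line predicts. The example reuses the height-to-weight equation y = 0.7x - 63 from this set; the function name `residuals` and the two observed data points are invented for illustration.

```python
def residuals(xs, ys, intercept, slope):
    """Observed y minus predicted y for every data point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


heights_cm = [150, 160]    # hypothetical observations
weights_kg = [45.0, 47.0]  # hypothetical observations

for r in residuals(heights_cm, weights_kg, intercept=-63.0, slope=0.7):
    # Positive: the point sits above the line; negative: below it.
    print(round(r, 2))
```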

Correlation does not equal causation

Just because two variables are related does not mean that one has an impact on the other.

correlation

A measure of the linear relationship between two continuous variables. No nominal data. The relationship should be described by a line, not a curve. Ex. height and weight, high school GPA and college GPA, cardiovascular health and athletic performance, sleep amounts and grumpiness levels.

How do we find the right line?

Ordinary least squares regression: the estimation process.

Which analysis should I use in this case: I want to see whether drinking more glasses of water a day leads to lower levels of acne.

I want to use a regression model here because I have a specific predictor variable (drinking water) that I think might cause a specific outcome (lower acne levels).

correlation coefficient

A measure of the linear relationship between two continuous variables (from -1 to 1). The degree to which x and y covary relative to how much they each vary by themselves. How much are they related to each other, and how varied are they on their own?

When the sum of squared residuals equals the total sum of squares, this means

no relationship (r squared = 0)

What is the primary assumption made of correlation?

normality

What is the appropriate null and alternative hypothesis for this question: I expect advertising budget to be related to record sales.

null: r = 0, alternative: r not equal to 0. Explanation: Since I do not state a direction in my research question, the appropriate alternative hypothesis here is that there is *some* relationship, which is r not equal to 0. Then, the corresponding null hypothesis is r = 0.

direction of correlation coefficient

Positive: large values for one variable are associated with large values for the other variable, and small values for one are associated with small values for the other. Negative: large values for one variable are associated with small values for the other (and vice versa).

If the calculated r is greater (in absolute value) than the critical value of r, then...

reject the null

Why do we use z-scores in the correlation coefficient equation?

They're easy to calculate. Z-scores give us information about the variability of each variable. Standardizing scores allows us to compare across variables with different units.
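
To make the role of z-scores concrete, here is a minimal sketch of one standard form of the correlation coefficient: convert each variable to z-scores, multiply them pairwise, and average over n - 1. The course's exact formula may be written differently; the function name `pearson_r` and the data are assumptions for this example.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Correlation as the average product of paired z-scores."""
    n = len(xs)
    mean_x, mean_y = mean(xs), mean(ys)
    sd_x, sd_y = stdev(xs), stdev(ys)  # sample SDs (divide by n - 1)
    # Standardizing removes the units, so cm and kg become comparable.
    z_x = [(x - mean_x) / sd_x for x in xs]
    z_y = [(y - mean_y) / sd_y for y in ys]
    return sum(a * b for a, b in zip(z_x, z_y)) / (n - 1)


hours_sleep = [5, 6, 7, 8]     # made-up data
grumpiness = [80, 60, 40, 20]  # made-up data, falls as sleep rises
print(round(pearson_r(hours_sleep, grumpiness), 3))
```

Because the made-up points fall exactly on a downward-sloping line, the result is a perfect negative correlation.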

Which analysis should I use here: I think there might be a relationship between the number of students in the incoming freshman class and amount of financial aid available.

This might be better suited for a correlation analysis, because I don't know if more students paying for college means more money for financial aid or if more financial aid means more students can afford to come to school. Since I have no prediction about which comes first, I should just look at whether a relationship exists without trying to make a predictive model.

Finding a best fit regression line

We want to make the total size of the residuals as small as possible. Specifically, we want to minimize the sum of the squared residuals.
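
For a single predictor, the line that minimizes the sum of squared residuals has a well-known closed form: slope b1 = (sum of cross-products of deviations) / (sum of squared x deviations), and intercept b0 = mean(y) - b1 * mean(x). The sketch below assumes that textbook formula; `fit_line` and the data are hypothetical names and values.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (b0, b1)."""
    mean_x, mean_y = mean(xs), mean(ys)
    # How x and y vary together, and how x varies on its own.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b1 = s_xy / s_xx           # slope
    b0 = mean_y - b1 * mean_x  # intercept: line passes through the means
    return b0, b1


xs = [1, 2, 3, 4]  # made-up predictor values
ys = [3, 5, 7, 9]  # made-up outcomes that follow y = 2x + 1 exactly
b0, b1 = fit_line(xs, ys)
print(b0, b1)
```

With real data the points will not sit exactly on the line, and the fitted b0 and b1 are the values that make the squared residuals as small as possible.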

We can use a regression equation to calculate

an *approximate* value for our outcome variable given a specific predictor variable value. Explanation: Because our regression line is a best fit of our data, it does not exactly reproduce the specific values we might see. But it does give us an approximation of our outcome (dependent) variable!

Which value represents the slope of a regression line?

b1. Explanation: Our regression equation can be written as y = b1(x) + b0, where b1 is the slope and b0 is the y-intercept.

bad regression line

The data points are too far away from the best-fit line; the residuals are too large.

What do correlation coefficients tell us?

The direction (sign) and the magnitude (size) of the relationship

magnitude of correlation coefficient

How strong the correlation is: 0.1-0.3 weak, 0.3-0.5 moderate, 0.5-1 strong. *0.3 is pretty good for psychology!

total sum of squares

How varied our data are around the middle value (the mean). It is the baseline that r^2 compares the model against, that is, predicting with the mean alone. Formula: the sum of squared deviations from the mean.

