Week 4 - Correlation and Regression


Your boss gives you the following regression equation, where X = square feet and Y = selling price: Selling price = $5,240 + $33.80 (Number of Square Feet). Does it make sense to interpret the Y-intercept for this equation?

No. The intercept is the predicted price of a home with 0 square feet, which is not a meaningful value of X.

Suppose the correlation between X = price of a gallon of gasoline and Y = price of a gallon of milk is r = .30. Should we go on and try to predict milk prices from gasoline prices using a straight line?

No

SSE

Sum of squares for error

A researcher is trying to predict the linear relationship between January revenue and yearly revenue for her company. The correlation turns out to be .60. How does she interpret this correlation?

There is a moderate positive linear relationship between January revenue and yearly revenue.

Correlation is affected by outliers

True
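
As a rough illustration of how much a single outlier can move r, the sketch below computes the correlation for a made-up data set that lies exactly on a line, then again after adding one hypothetical outlier. The data and the helper name `correlation` are assumptions for illustration only.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r, using (n - 1) and sample standard deviations."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    cross = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cross / ((n - 1) * stdev(xs) * stdev(ys))

# Made-up data lying exactly on the line y = 2x.
x_clean = [1, 2, 3, 4, 5]
y_clean = [2, 4, 6, 8, 10]

# The same data plus one hypothetical outlier at (6, 0).
x_out = x_clean + [6]
y_out = y_clean + [0]

r_clean = correlation(x_clean, y_clean)
r_out = correlation(x_out, y_out)
print(round(r_clean, 3), round(r_out, 3))
```

With these numbers, one extra point drops r from 1 to about 0.14.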

Suppose the equation y = 3.45 - 2.58x represents a valid regression equation and X can be used to predict Y. From this information, we know that X and Y have _____________ correlation.

a negative

If a residual is negative, then that data point lies _________________ the regression line.

below

Interpreting correlation

- Exactly -1: perfect downhill linear relationship
- Close to -1: strong downhill linear relationship
- 0: no linear relationship
- Close to +1: strong uphill linear relationship
- Exactly +1: perfect uphill linear relationship

Suppose the correlation between two variables X and Y is .8. That means the correlation between Y and X is -.8.

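False. Correlation does not depend on which variable is called X and which is called Y, so it is still .8.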

Suppose the correlation between yards rushing and yards passing is .6. That means the correlation between feet rushing and feet passing is .6 x 3 (since you multiply yards by 3 to convert to feet).

False. Correlation has no units, so converting both variables to feet leaves it at .6.
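
The two properties above (switching X and Y, and changing units) can be checked numerically. The sketch below uses made-up rushing and passing yardages, which are assumptions and not from the course, and rescales every value by 3 feet per yard.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r, using (n - 1) and sample standard deviations."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    cross = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cross / ((n - 1) * stdev(xs) * stdev(ys))

# Hypothetical game totals, in yards.
rushing_yd = [100, 120, 90, 150, 110]
passing_yd = [200, 180, 220, 260, 210]

# The same totals converted to feet (3 feet per yard).
rushing_ft = [3 * v for v in rushing_yd]
passing_ft = [3 * v for v in passing_yd]

r_yards = correlation(rushing_yd, passing_yd)
r_feet = correlation(rushing_ft, passing_ft)
r_swapped = correlation(passing_yd, rushing_yd)

# All three values agree up to floating-point rounding.
print(r_yards, r_feet, r_swapped)
```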

Residual

observed y - predicted y
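
A small sketch of the residual calculation, reusing the selling-price equation from this set. The 2,000-square-foot home and its $70,000 observed price are hypothetical values chosen only to produce a negative residual.

```python
def predicted_price(square_feet):
    """Predicted selling price from the equation used in this set."""
    return 5240 + 33.80 * square_feet

def residual(observed_y, predicted_y):
    """Residual = observed y - predicted y."""
    return observed_y - predicted_y

square_feet = 2000   # hypothetical home size
observed = 70_000    # hypothetical observed selling price

pred = predicted_price(square_feet)
res = residual(observed, pred)

# A negative residual means the point lies below the regression line.
position = "below" if res < 0 else "above" if res > 0 else "on"
print(pred, res, position)
```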

b1 = slope formula

b1 = r * (sdY / sdX), where r is the correlation

What does SSE stand for?

sum of squares for error

b0 = y-intercept formula

b0 = (mean of Y) - b1 * (mean of X)

Best line formula

y = b0 + b1x

Suppose you know three things: 1. All the points on a scatterplot lie perfectly on a straight line going uphill. 2. The mean of X and the mean of Y are both 2. 3. The standard deviations of X and Y are exactly the same. Can you find the equation of the best-fitting line with this information? (Hint: Think of the '5 number' way of finding the best-fitting line.)

yes
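
One way to see why: a perfect uphill line means r = 1, and equal standard deviations make sdY/sdX = 1, so the slope and intercept formulas from this set can be applied directly. The sketch below assumes those formulas; the helper name `best_line` and the trial standard deviations are illustrative only.

```python
def best_line(r, mean_x, mean_y, sd_x, sd_y):
    """Return (b0, b1) from the five summary numbers."""
    b1 = r * sd_y / sd_x          # slope = r * sdY / sdX
    b0 = mean_y - b1 * mean_x     # intercept = mean of Y - b1 * mean of X
    return b0, b1

# The actual standard deviation is unknown; only the ratio matters,
# so any common value gives the same line.
results = []
for sd in (0.5, 1.0, 7.0):
    b0, b1 = best_line(r=1.0, mean_x=2.0, mean_y=2.0, sd_x=sd, sd_y=sd)
    results.append((b0, b1))
    print(f"sd = {sd}: y = {b0} + {b1}x")
```

Every trial gives an intercept of 0 and a slope of 1, so the best-fitting line is y = x.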

coefficient of determination

Percentage of the variability in Y explained by X (r^2)
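
For a least squares line, r^2 can also be computed as 1 - SSE/SST, where SST is the total sum of squares of Y around its mean. The sketch below checks that both routes agree on a small made-up data set; the data and variable names are assumptions.

```python
from statistics import mean

# Made-up data for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mx, my = mean(x), mean(y)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)   # total variability in Y (SST)

# Least squares slope and intercept (sxy / sxx equals r * sdY / sdX).
b1 = sxy / sxx
b0 = my - b1 * mx

# SSE: squared vertical distances from the points to the line.
sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))

r = sxy / (sxx * syy) ** 0.5
r_squared = r ** 2
explained = 1 - sse / syy

print(round(r_squared, 3), round(explained, 3))
```

Here both values come out to 0.6, so about 60% of the variability in Y is accounted for by the line.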

Criteria for best line: small SSE

- Find the line with the smallest SSE.
- In the equation, find the values of b0 and b1 that minimize SSE.
- These values of b0 and b1 give the 'least squares regression line'.
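
The criterion above can be sketched in a few lines: compute b0 and b1 with the least squares formulas, then confirm that nudging either value makes SSE larger. The data set and the function names `sse` and `least_squares` are assumptions for illustration.

```python
from statistics import mean

# Made-up data for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def sse(b0, b1):
    """Sum of squares for error for the line y = b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

def least_squares(xs, ys):
    """Return the (b0, b1) that minimize SSE."""
    mx, my = mean(xs), mean(ys)
    b1 = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
          / sum((a - mx) ** 2 for a in xs))
    b0 = my - b1 * mx
    return b0, b1

b0, b1 = least_squares(x, y)
best = sse(b0, b1)

# Any small change to b0 or b1 should give a larger SSE.
nudges = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]
nudged = [sse(b0 + d0, b1 + d1) for d0, d1 in nudges]

print(round(b0, 3), round(b1, 3), round(best, 3))
print([round(s, 3) for s in nudged])
```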

Examining residuals

- No pattern
- No systematic change as X increases
- No unusually large residuals (outliers in the Y direction)
- No influential points (outliers in the X direction)

Properties of Correlation

- Quantitative variables only
- Linear relationships only
- Has no units
- It doesn't matter if you switch X and Y
- Affected by outliers and skewness

Interpreting a scatterplot

- Simplest general pattern
- Direction (positive, negative)
- Strength (how closely the points follow a line pattern)

What should the residual plot look like if the regression line fits the data well?

- Points fall around the horizontal line Y = 0
- Random scatter with no pattern
- No fan shapes

Suppose you have 4 data sets whose scatterplots all show possible linear relationships. The four data sets have correlations of -0.10, +0.25, -0.90, and +0.80, respectively. Which of the correlations shows the strongest linear relationship?

-0.90
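
Strength depends on the absolute value of r, not its sign. A one-line check on the four correlations from the question:

```python
correlations = [-0.10, 0.25, -0.90, 0.80]

# The strongest linear relationship has the largest absolute value.
strongest = max(correlations, key=abs)
print(strongest)  # -0.9
```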

Correlation formula

r = Σ(x - x̄)(y - ȳ) / ((n - 1) * sdX * sdY), where x̄ and ȳ are the means and sdX and sdY are the sample standard deviations
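
A minimal sketch of the correlation calculation, assuming sample standard deviations (the n - 1 divisor). It standardizes each value first, which is algebraically the same as dividing the sum of cross products by (n - 1) * sdX * sdY. The data are made up.

```python
from statistics import mean, stdev

# Made-up data for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = mean(x), mean(y)
sd_x, sd_y = stdev(x), stdev(y)   # sample standard deviations

# Standardize each value: how many standard deviations from the mean.
z_x = [(a - mean_x) / sd_x for a in x]
z_y = [(b - mean_y) / sd_y for b in y]

# r is the sum of the products of z-scores divided by (n - 1).
r = sum(zx * zy for zx, zy in zip(z_x, z_y)) / (n - 1)
print(round(r, 4))
```

For this data the result is about 0.7746, a fairly strong positive linear relationship.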

The personnel department keeps records on all employees in a company. Here is the information they keep in one of their data files:
- Employee identification number
- Last name
- First name
- Middle initial
- Department
- Number of years with the company
- Salary ($)
- Education level (high school, some college, or college degree)
- Age (years)
Which of the following combinations of variables would be appropriate to examine with a scatterplot?

Age and Salary.

Your boss gives you the following regression equation. Selling price = $5,240 + $33.80 (Number of Square Feet). How do you interpret the slope for this equation?

For each additional square foot, the predicted selling price increases by $33.80.
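
The slope interpretation can be checked by predicting the price at two sizes that differ by one square foot. The sketch below uses the equation from the question; the 1,500-square-foot starting size is an arbitrary choice.

```python
INTERCEPT = 5240.0   # dollars
SLOPE = 33.80        # dollars per square foot

def predicted_price(square_feet):
    """Predicted selling price in dollars."""
    return INTERCEPT + SLOPE * square_feet

# Compare two homes that differ by exactly one square foot.
difference = predicted_price(1501) - predicted_price(1500)
print(round(difference, 2))

# At 0 square feet the prediction is just the intercept, which is
# why the intercept has no practical meaning for this equation.
print(predicted_price(0))
```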

Bob is interested in examining the relationship between the number of bedrooms in a home and its selling price. After downloading a valid data set from the internet, he calculates the correlation. The correlation value he calculates is only 0.05. What does Bob conclude?

Bob continues his research because even though there is no linear relationship here, there could be a different relationship.
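
A correlation near 0 rules out only a linear relationship. As a hedged illustration with made-up data (not Bob's), the points below follow y = x^2 exactly, yet r is 0.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r, using (n - 1) and sample standard deviations."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    cross = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return cross / ((n - 1) * stdev(xs) * stdev(ys))

# Made-up data with a perfect curved (quadratic) relationship.
x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]

r = correlation(x, y)
print(round(r, 3))
```

Y is completely determined by X here, but the uphill and downhill halves cancel, so the correlation misses the relationship entirely.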

