Correlation and regression analysis

spurious correlation

A correlation between two variables that have no causal link, e.g. the price of petrol shows a positive correlation with the divorce rate over time.

regression line

A straight-line equation used to model the relationship between the dependent and independent variables.

coefficient of determination R²

Not all points will lie on the line, but a straight line can summarise the pattern of the data. The proportion of the total variability of the dependent variable Y that is explained by the regression on X is called R². It is often quoted as a measure of the goodness of fit of the regression line to the data and, in simple linear regression, is equal to the square of the correlation coefficient r.
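
The definition can be checked numerically. The sketch below uses a small hypothetical sample (the numbers and variable names are illustrative, not from the study set), computes R² as one minus the unexplained share of the variability of Y, and compares it with r squared.

```python
import math

# Hypothetical sample, chosen only to keep the arithmetic small
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)  # total variability of Y
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Least-squares regression line y = intercept + slope * x
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Variability left unexplained by the line (sum of squared residuals)
ss_residual = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

r_squared = 1 - ss_residual / syy  # proportion of variability explained
r = sxy / math.sqrt(sxx * syy)     # Pearson correlation coefficient

print(round(r_squared, 3), round(r ** 2, 3))
```

For this sample both printed values are 0.6, i.e. the line explains 60% of the variability in Y.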

correlation analysis

Correlation measures the strength of the linear relationship between two quantitative variables.

simple linear regression

Describes quantitatively the linear relationship between a dependent variable Y and an independent variable X. Regression is used when one variable is thought to affect or predict the other: the value of X is used to predict Y. The relationship between the two quantitative variables is quantified by a regression line.

Spearman's rank correlation

Use this test if the two variables are not normally distributed, or if the data are on an ordinal scale or are ranked. Because it works on ranks rather than raw values, it is much less affected by outliers.
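
One common way to compute Spearman's coefficient is to rank each variable and then apply the Pearson formula to the ranks. The sketch below assumes that approach; the helper names and the data (which include one deliberate outlier) are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: y always rises with x, but the last point is an outlier
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 1000]

rho = spearman_rho(x, y)  # depends only on the ordering of the values
r = pearson_r(x, y)       # pulled below 1 by the outlier
print(rho, r)
```

Because the ordering of y matches the ordering of x exactly, the rank correlation is 1 even though the outlier drags Pearson's r well below 1.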

line of best fit

The line that comes closer to all the points than any other line. It is fitted by the method of least squares, which chooses the line so that the sum of the squared vertical distances between the line and the points is minimised.
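
As a minimal sketch of least squares for one predictor, the code below fits a line using the standard closed-form formulas and then shows that a slightly different line has a larger sum of squared vertical distances. The function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through the means
    return slope, intercept

def sum_squared_distances(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical sample
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_line(x, y)
best = sum_squared_distances(x, y, slope, intercept)

# Any other line, e.g. one with a slightly steeper slope, does worse
other = sum_squared_distances(x, y, slope + 0.1, intercept)
print(slope, intercept, best, other)
```

Here the fitted line is approximately y = 0.6x + 2.2, and nudging the slope increases the sum of squares.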

multiple linear regression

The method can be extended to include more than one predictor variable in the regression equation. For example, you can investigate the effect of both height and age on shoe size.
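
The sketch below illustrates the idea with two predictors. It is only one possible approach: it solves the least-squares normal equations with a small hand-written Gaussian elimination. The data are synthetic, generated from made-up coefficients purely so that the fit can be checked, and say nothing about real shoe sizes.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def fit_multiple(predictor_rows, ys):
    """Return [intercept, b1, b2, ...] minimising the sum of squared residuals."""
    design = [[1.0] + list(row) for row in predictor_rows]  # 1.0 = intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Synthetic data: shoe size built from made-up coefficients (assumption)
heights = [150, 160, 170, 180, 165, 175]  # cm
ages = [30, 12, 40, 15, 25, 18]           # years
true_intercept, true_b_height, true_b_age = -10.0, 0.1, 0.05
shoe_sizes = [true_intercept + true_b_height * h + true_b_age * a
              for h, a in zip(heights, ages)]

intercept, b_height, b_age = fit_multiple(list(zip(heights, ages)), shoe_sizes)
print(intercept, b_height, b_age)
```

Since the synthetic shoe sizes contain no noise, the fitted coefficients should match the made-up ones up to floating-point rounding. Each coefficient is read as the change in shoe size per unit change in that predictor with the other held fixed.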

Pearson correlation coefficient

r is a measure of the linear association between two continuous numerical variables that are normally distributed.
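
A minimal sketch of the usual formula for r (the sum of cross-products of deviations divided by the square root of the product of the two sums of squares). The function name and data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data lying exactly on a straight line
x = [1, 2, 3, 4, 5]
y_rising = [3, 5, 7, 9, 11]
y_falling = [11, 9, 7, 5, 3]

r_rising = pearson_r(x, y_rising)    # perfect positive linear association
r_falling = pearson_r(x, y_falling)  # perfect negative linear association
print(r_rising, r_falling)
```

Points on a rising line give r = 1 and points on a falling line give r = -1. Real data usually fall somewhere in between.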

regression assumptions

The relationship between X and Y is approximately linear; this can be checked with a scatter plot. Do not fit a straight line to a non-linear relationship. The dependent variable (strictly, its residuals about the line) is normally distributed; this can be checked with a histogram or boxplot.

when not to use correlation coefficient

When there is a non-linear relationship between the variables (correlation can miss a strong non-linear relationship), when outliers are present, or when there are distinct subgroups.
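
The first case can be illustrated with a hypothetical example in which y is completely determined by x (y = x²) and yet Pearson's r is zero, because the relationship is not linear.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfect but non-linear (quadratic) relationship
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]

r = pearson_r(x, y)
print(r)  # zero: the falling and rising halves of the curve cancel out
```

A scatter plot would show the U-shaped pattern immediately, which is why plotting the data before quoting r is worthwhile.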

equation of a straight line

y = mx + c, where y is the dependent variable, x is the independent variable, c is the intercept (the value of y when x = 0) and m is the gradient. The gradient shows the change in y for a unit change in x: it is positive when y increases as x increases and negative when y decreases as x increases. The equation of the regression line gives the value of y for different values of x.
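
A short sketch of how the equation is used, with a hypothetical gradient and intercept chosen only for illustration.

```python
def predict(x, gradient, intercept):
    """Return y = gradient * x + intercept."""
    return gradient * x + intercept

# Hypothetical line: gradient m = 2, intercept c = 1
m, c = 2.0, 1.0

y_at_zero = predict(0, m, c)                           # equals the intercept c
change_per_unit = predict(4, m, c) - predict(3, m, c)  # equals the gradient m
print(y_at_zero, change_per_unit)
```

The value at x = 0 recovers the intercept, and the difference between predictions one unit apart recovers the gradient.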

