Stats: Quiz 2: Correlation and Regressions

Goodness of Fit Parameter Interpretation (r2)

- Before interpreting the parameters, check whether it is appropriate to reduce the point cloud to a single line.
- A descriptive measure of goodness of fit in regression models is the coefficient of determination, r squared (r²).
- It tells us how much of the variability in Y is explained by the predictor(s).
- It lies between 0 (worst fit) and 1 (best fit, 100% of the variance explained).
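As a rough illustration of the coefficient of determination, the sketch below computes r squared as 1 minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the sample numbers are made up for illustration, and the predicted values are assumed to come from a regression that has already been fitted.

```python
def r_squared(y, y_hat):
    """Share of the variability in y explained by the predictions y_hat."""
    mean_y = sum(y) / len(y)
    # Variability left over after the fit (residual sum of squares).
    ss_res = sum((obs - pred) ** 2 for obs, pred in zip(y, y_hat))
    # Total variability of y around its mean.
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    return 1 - ss_res / ss_tot


# Hypothetical observed values and predictions from some fitted line.
y_observed = [1.0, 2.0, 3.0]
perfect_fit = [1.0, 2.0, 3.0]   # every point lies on the line
mean_only = [2.0, 2.0, 2.0]     # predicting the mean explains nothing

best = r_squared(y_observed, perfect_fit)
worst = r_squared(y_observed, mean_only)
```

Here `best` is the upper bound (all variance explained) and `worst` is the lower bound (none explained).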

Spearman Correlation

- Captures monotonic relationships.
- Used when certain assumptions for Pearson are not fulfilled (e.g. ordinal data, strong violation of normality).
- Uses the Pearson formula applied to the ranks (i.e. order) of the data.
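The idea of "Pearson on the ranks" can be sketched in a few lines of Python. The helper names (`pearson`, `ranks`, `spearman`) and the sample data are illustrative assumptions; tied values get the average of their ranks, which is one common convention.

```python
def pearson(x, y):
    """Pearson correlation of two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: the Pearson formula applied to the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up data: y grows monotonically with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman(x, y)   # ranks agree perfectly
r = pearson(x, y)      # below 1 because the relationship is curved
```

The example shows why Spearman suits monotonic but non-linear data: the ranks line up exactly even though the raw values do not fall on a straight line.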

Spearman Assumptions (3)

- Data at least on an ordinal scale (not too many ties → don't use it for two Likert items)
- No normal distribution required
- Linearity not absolutely necessary (Spearman can capture monotonic relationships)

Pearson Test Stat (Hypothesis, test type, and 3 assumptions)

- H0 (null): there is no correlation between X and Y.
- H1 (alternative): the correlation between X and Y is not 0.
- Under the null, the corresponding test statistic follows a t distribution with df = n-2.
- The t distribution resembles the standard normal distribution (mean = 0, sd = 1) and approaches it as df grows.
- Assumptions for the test to work:
1. X and Y are metric variables
2. Linear relationship between X and Y
3. In small samples, each variable should follow a normal distribution
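The card does not spell out the test statistic itself. The standard form is t = r · sqrt(n-2) / sqrt(1 - r²), and the sketch below computes it together with the degrees of freedom. The function name `pearson_t_statistic` and the input numbers are illustrative; turning t into a p-value needs a t-distribution table or a statistics library, which is left out here.

```python
import math


def pearson_t_statistic(r, n):
    """Return (t, df) for testing H0: no correlation, given r and sample size n."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df


# Hypothetical result: r = 0.6 observed in a sample of n = 18 pairs.
t, df = pearson_t_statistic(0.6, 18)
# t is about 3.0 with df = 16; compare |t| with the critical t value
# for the chosen significance level.
```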

Regression versus correlation

- In regression analysis we aim to model a functional (i.e. linear) relationship between X and Y. In correlation analysis we characterize the association between two variables by a single number.
- This implies that regression lets us make PREDICTIONS; we cannot do that with correlations.
- In regression it matters whether we regress Y on X or vice versa; for correlation, the order (X,Y) versus (Y,X) does not matter.
- Regression can be extended to multiple predictors (multiple regression), whereas our correlations are limited to two variables.

Error assumptions

- Residuals: actual minus predicted
- Residuals should be 0 on average
- They should have constant variance
- They should be uncorrelated
- They should be normally distributed
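A minimal sketch of the first checks, assuming the predictions are already available. The helper `spread_by_half` is a hypothetical, informal stand-in for reading a residual plot: it compares the residual variance for small and large predicted values, and it is not a formal test.

```python
from statistics import mean, pvariance


def residuals(y, y_hat):
    """Residual = actual value minus predicted value."""
    return [obs - pred for obs, pred in zip(y, y_hat)]


def spread_by_half(y_hat, res):
    """Residual variance for the lower and upper half of the predictions."""
    pairs = sorted(zip(y_hat, res))
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return pvariance(low), pvariance(high)


# Made-up observed and predicted values.
y = [1.0, 3.0, 2.0, 4.0]
y_hat = [1.5, 2.5, 2.5, 3.5]

res = residuals(y, y_hat)
average_residual = mean(res)                # should be close to 0
low_var, high_var = spread_by_half(y_hat, res)
# Very different values of low_var and high_var hint at non-constant variance.
```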

Regression Work Flow (5 steps)

1. Descriptive analysis: histogram, box plot, scatter plot, means, sd, correlations
2. Fit the model
3. Check the assumptions (normality check for small samples, residual plots); if there is no violation, continue
4. Examine goodness of fit (i.e. R squared)
5. Interpret the regression parameters (slope parameters)

Testing for Normality (3 + one comment)

1. Histogram
2. Quantile-quantile plot (Q-Q plot): slice the observed variable into quantiles and do the same for an artificial/simulated variable that reflects a normal distribution. Plot the quantiles against each other. If normality holds, the points should lie on a roughly straight line.
3. Kolmogorov-Smirnov and/or Shapiro-Wilk test: statistical tests for normality

Note: Always look at the three tools in combination! Don't decide on normality based on the statistical test alone.
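The Q-Q plot idea can be sketched without a plotting library by computing the pairs of points that would be plotted. Two assumptions to note: instead of a simulated normal variable, this sketch uses the theoretical quantiles of the standard normal distribution, and the plotting positions (i - 0.5)/n are just one common convention. The function name `qq_points` and the sample are made up.

```python
from statistics import NormalDist, mean, stdev


def qq_points(sample):
    """Pairs (theoretical normal quantile, standardized observed value)."""
    n = len(sample)
    m, s = mean(sample), stdev(sample)
    # Standardize and sort the observed values.
    observed = sorted((v - m) / s for v in sample)
    # Matching quantiles of the standard normal distribution.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))


# Hypothetical small sample.
points = qq_points([2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8])
# If normality holds, the points lie close to the line observed = theoretical.
```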

Regression Analysis Hypothesis

H0: X has no effect on Y (B1 = 0)
H1: X has an effect on Y (B1 ≠ 0)

Pearson Correlation

- A number from -1 to 1 that describes the linear relationship
- The descriptive tool used is the scatterplot
- Characterizes the dependency in a single number
- Covariance and correlation: standardizing the covariance gives the Pearson correlation
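To make "standardizing the covariance" concrete, the sketch below divides the sample covariance by the product of the two standard deviations. The function names and data are illustrative assumptions; the n-1 denominator is used for both the covariance and the standard deviations.

```python
from statistics import mean, stdev


def covariance(x, y):
    """Sample covariance with the n-1 denominator."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def pearson_from_covariance(x, y):
    """Pearson correlation = covariance divided by both standard deviations."""
    return covariance(x, y) / (stdev(x) * stdev(y))


# Made-up data: y is an exact linear function of x.
x = [1, 2, 3, 4]
y = [2, 4, 6, 8]
cov_xy = covariance(x, y)            # depends on the units of x and y
r_xy = pearson_from_covariance(x, y)  # unit-free, here at the upper bound
```

The covariance changes when the variables are rescaled, while the correlation stays the same, which is the point of standardizing.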

B0, B1

Regression parameters: intercept and slope

Regression Equation

Yi=β0+β1Xi+εi
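A minimal sketch of estimating β0 and β1 for this equation, assuming ordinary least squares (the cards do not name the estimation method, so this is an assumption). The function name `fit_line` and the data are made up.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1 * x + error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sxy / sxx        # slope
    b0 = my - b1 * mx     # intercept
    return b0, b1


# Made-up data lying exactly on the line y = 1 + 2x.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
b0, b1 = fit_line(x, y)

# Prediction for a new X value, which a correlation alone cannot provide.
y_new = b0 + b1 * 5
```

Swapping `x` and `y` in the call gives a different line, which illustrates why it matters whether Y is regressed on X or the other way around.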

Multiple Regression Equation

Yi=β0+β1Xi1+β2Xi2+···+βpXip+εi
Assumes NO MULTICOLLINEARITY among the predictors.
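The sketch below shows how the equation produces a prediction from given parameters, plus an informal look at multicollinearity via the correlation between two predictors. All names and numbers are hypothetical, the parameters are assumed to be already estimated, and a pairwise correlation is only a rough screen, not a full multicollinearity diagnostic.

```python
def predict(betas, xs):
    """Prediction b0 + b1*x1 + ... + bp*xp (error term left out).

    betas = [b0, b1, ..., bp]; xs = [x1, ..., xp] for one observation.
    """
    intercept, slopes = betas[0], betas[1:]
    return intercept + sum(b * x for b, x in zip(slopes, xs))


def pearson(a, b):
    """Pearson correlation, used here to compare two predictors."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5


# Hypothetical parameters: b0 = 1.0, b1 = 2.0, b2 = -0.5.
y_hat = predict([1.0, 2.0, -0.5], [3.0, 4.0])

# Made-up predictors: x2 is just x1 doubled, so they carry the same information.
x1 = [1, 2, 3, 4]
x2 = [2, 4, 6, 8]
predictor_correlation = pearson(x1, x2)
# A correlation near 1 or -1 between predictors signals multicollinearity.
```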

εi

random error term for the ith observation

Xi

value of predictor (independent) variable for the ith observation

Yi

value of the response (dependent) variable for the ith observation. (metric!)

