Biostat correlation & regression
True
When a correlation exists between two variables, regression predicts unknown data (ex: GPA)
Pearson Correlation
Used to quantify the association between two interval or ratio variables.
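A minimal sketch of how Pearson r can be computed by hand, assuming two equal-length lists of interval or ratio scores. The function name pearson_r and the hours/scores data are hypothetical and used only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of the deviations from each mean
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for X and for Y
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

hours = [1, 2, 3, 4, 5]          # hypothetical hours spent studying
scores = [52, 60, 61, 70, 77]    # hypothetical test scores
r = pearson_r(hours, scores)
print(round(r, 3))
```

The result always falls between -1.0 and +1.0; the sign gives the direction and the absolute value gives the strength.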
Positive correlation coefficient
Variables proceed in the same direction (ex: high X scores are associated with high Y scores, and low X scores with low Y scores).
True
Adding variables that are not correlated with the DV decreases AR2 (R2 itself does not decrease when a predictor is added)
True
Violation of homoscedasticity can underestimate the strength of a correlation
Negative correlation coefficient (inverse relationship)
Variables proceed in opposite directions (ex: low X scores are associated with high Y scores, and high X scores with low Y scores).
Residual variation (error variance)
variance in DV not related to changes in IV
Regression variation
variance in DV related to changes in IV
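As a rough illustration of how regression variation and residual variation split the total variation in the DV, the sketch below fits a least-squares line and sums the two parts. The function name variation_split and the data are hypothetical.

```python
def variation_split(xs, ys):
    """Return (regression SS, residual SS) for a simple least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Least-squares slope (b) and intercept (a)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    predicted = [a + b * x for x in xs]
    # Variation in the DV related to changes in the IV
    ss_regression = sum((p - mean_y) ** 2 for p in predicted)
    # Variation in the DV not related to changes in the IV (error)
    ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    return ss_regression, ss_residual

hours = [1, 2, 3, 4, 5]          # hypothetical IV
scores = [52, 60, 61, 70, 77]    # hypothetical DV
ss_reg, ss_res = variation_split(hours, scores)
r_squared = ss_reg / (ss_reg + ss_res)
print(ss_reg, ss_res, round(r_squared, 3))
```

The two parts add up to the total sum of squares of the DV, and the regression share of that total is R2.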
True
violating normality can distort a correlation coefficient
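R-squared (R2)
Proportion of variance in the DV explained by the model; most scientists report this score. It can be deceptively large for a small sample, where AR2 is preferred.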
Explanatory variable
(Independent variables, Predictor) explains or influences changes in a response variable ex: number of hours spent studying
The point where a regression line intercepts the y-axis; the value of y when x is zero
(Intercept, Regression)
Response Variable
(Dependent Variable) Measures an outcome of a study ex: test scores
Weak
0.1-0.3 (Absolute Coefficient)
True
A correlation of -.90 has the same strength as +.90
Correlation Coefficient
A quantitative value from (-1.0 to +1.0)
Correlation
A relationship between two variables used to measure the strength of the association
True
AR2 is always less than or equal to R2
True
Adjusted R-squared (AR2) is used for small sample sizes because a small sample can give a deceptively large R2
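A small sketch of the usual adjustment, assuming the common formula AR2 = 1 - (1 - R2)(n - 1)/(n - k - 1), where n is the sample size and k is the number of predictors. The function name adjusted_r2 and the example values are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Same R2, different sample sizes (hypothetical values)
small_sample = adjusted_r2(0.50, n=10, k=3)    # 0.25
large_sample = adjusted_r2(0.50, n=100, k=3)   # about 0.484
# A weak model in a small sample can give a negative AR2
negative_case = adjusted_r2(0.10, n=10, k=3)   # -0.35
print(small_sample, large_sample, negative_case)
```

The penalty shrinks as n grows, which is why AR2 matters most for small samples, and AR2 never exceeds R2 when R2 is at most 1.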
Linear Regression
An approach for modeling a relationship between an IV (x) and a DV (y)
Assumption of correlation
An inspection of a scatterplot can give an impression of whether two variables are related and the direction of their relationship.
False
Chi-square is a parametric test
False
Correlation alone does guarantee causality
R
Correlation coefficient; a value between -1.0 and +1.0
True
Correlation does not equal causation
1. The strength of the relationship 2. The direction of the relationship
Correlation provides two pieces of information
Correlation describes the linear relationship between two variables. Regression predicts an outcome from that relationship and can involve more than two variables.
Correlation vs. Regression
True
Covariance is the proportion of the total variance that is shared by X and Y
Normality
Data points are normally distributed ex; X(time spent on the internet) and Y(time spent watching tv) scores are normally distributed
True
Highly correlated variables increase R2 and AR2
True
If r is positive, there is a positive association between two variables, and b is positive
Linearity
If violated, misleading conclusions can occur (do not proceed with the correlation)
False
In linearity, the rate of change between two variables is not constant
An approach for modeling a relationship between an independent variable (X) and a dependent variable (y)
Linear regression
Model used to predict a binary response from one or more independent variables
Logistic Regression
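To illustrate how a logistic regression model turns an independent variable into a predicted probability for a binary response, here is a minimal sketch using the logistic function. The coefficients and the pass/fail scenario are hypothetical, not fitted from data.

```python
import math

def predicted_probability(intercept, slope, x):
    """Probability that the binary response equals 1 for a given x."""
    log_odds = intercept + slope * x
    # The logistic function maps any log-odds value into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients: passing an exam as a function of hours studied
a, b = -4.0, 1.0
p_low = predicted_probability(a, b, 2)
p_mid = predicted_probability(a, b, 4)
p_high = predicted_probability(a, b, 6)
print(round(p_low, 3), round(p_mid, 3), round(p_high, 3))
```

A probability of 0.5 or more is commonly classified as the "1" category, although that cutoff is a convention and not part of the model itself.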
Categorical
Logistic regression
0.4-0.6 (Absolute Coefficient)
Moderate
True
Negative AR2 can be interpreted as 0
Spearman correlation
Non-parametric measure of statistical dependence between two variables
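A sketch of one common way to compute the Spearman correlation: convert each variable to ranks (ties share the average rank), then compute a Pearson correlation on the ranks. The function names and data are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson correlation computed on the ranks of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sp = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    ss_x = sum((a - mx) ** 2 for a in rx)
    ss_y = sum((b - my) ** 2 for b in ry)
    return sp / math.sqrt(ss_x * ss_y)

# A curved but always-increasing relationship (hypothetical data)
rho = spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])
print(rho)
```

Because only the ranks are used, the raw means of the scores play no role, and a monotonic but non-linear relationship can still give a coefficient of 1.0.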
Simple logistic Regression
One categorical (binary, dichotomous) DV; one continuous or categorical IV
False
Pearson Correlation is a non-parametric test
False
Spearman correlation uses mean
Linearity
Straight-line relationship between the IVs and the DVs
0.7-0.9 (Absolute Coefficient)
Strong
True
The closer a set of data points falls to a regression line (straight line), the stronger the correlation (r = ±1.0)
True
The closer data points fall to a regression line, the more that the values of two factors vary together
True
The sign of the coefficient indicates only the direction of the correlation
Homoscedasticity
There is an equal variance or scatter (scedasticity) of data points dispersed along the regression line.
1. Cause must precede the effect in time. 2. Cause and effect must be correlated with each other. 3. The correlation between a cause and an effect cannot be explained by a confounding variable.
Three criteria for causation
Variance
Average of the squared deviations about the mean
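A short sketch of the variance calculation, assuming the population form divides by N and the sample form divides by n - 1 (the card does not say which one is meant). The function names and data are hypothetical; the standard library's statistics module is used only as a cross-check.

```python
import statistics

def population_variance(values):
    """Average of the squared deviations about the mean (divide by N)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def sample_variance(values):
    """Same sum of squares, divided by n - 1 to estimate a population value."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical scores, mean = 5
print(population_variance(data))  # 4.0
print(sample_variance(data), statistics.variance(data))
```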
Multivariate logistic regression
More than one categorical DV; more than one continuous or categorical IV
Multivariate Linear regression
More than one continuous DV; more than one continuous or categorical IV
0 (Absolute Coefficient)
no relationship
Multiple (Multivariable) logistic regression
One categorical DV; more than one continuous or categorical IV
Multinomial (polychotomous) logistic regression
One categorical DV with more than two levels; more than one continuous or categorical IV
Simple (bivariate) linear regression
One continuous DV (body weight); one continuous or categorical IV (sex)
Multiple (multivariable) linear regression
One continuous DV; more than one continuous or categorical IV
Standardized coefficient (beta coefficient)
Original data are converted into z-scores to standardize the coefficient; interpreted like Pearson r
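One way to see why a standardized coefficient reads like Pearson r: in a simple regression with one IV, the slope fitted on z-scores equals r. The sketch below assumes z-scores are built with the sample standard deviation; all names and data are hypothetical.

```python
import math

def z_scores(values):
    """Convert raw scores to z-scores using the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)

hours = [1, 2, 3, 4, 5]          # hypothetical IV
scores = [52, 60, 61, 70, 77]    # hypothetical DV

b_unstandardized = slope(hours, scores)                 # in original units
beta = slope(z_scores(hours), z_scores(scores))         # in z-score units
print(b_unstandardized, round(beta, 3))
```

With more than one IV the standardized coefficients are no longer equal to the simple correlations, so this equality holds only for the one-predictor case.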
1 (Absolute Coefficient)
perfect
Causation (causality)
relation between a cause and an effect
Unstandardized coefficient (slope)
Relationships are expressed in terms of the original data; used for prediction
b (slope, regression coefficient)
The amount of change in y (DV) for each one-unit change in x (IV)
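A minimal sketch of how the slope b and the intercept a can be estimated by least squares and then used for prediction with y = a + bx. The function name fit_line and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of products of deviations divided by sum of squares of X
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    # Intercept: value of y when x is zero
    a = mean_y - b * mean_x
    return a, b

hours = [1, 2, 3, 4, 5]          # hypothetical IV (x)
scores = [52, 60, 61, 70, 77]    # hypothetical DV (y)
a, b = fit_line(hours, scores)
predicted_score = a + b * 6      # predict y for an unobserved x of 6 hours
print(a, b, predicted_score)     # 46.0 6.0 82.0
```

Here each additional hour of study is associated with a 6-point increase in the predicted score.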
Covariance
the extent to which the values of two factors (X and Y) vary together
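A brief sketch of covariance and how it relates to the correlation coefficient, assuming the sample form (dividing by n - 1). Dividing the covariance by the product of the two standard deviations gives Pearson r. Names and data are hypothetical.

```python
import math

def sample_covariance(xs, ys):
    """Extent to which X and Y vary together (sample form, n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def sample_sd(values):
    """Sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

hours = [1, 2, 3, 4, 5]          # hypothetical X
scores = [52, 60, 61, 70, 77]    # hypothetical Y
cov_xy = sample_covariance(hours, scores)
r_from_cov = cov_xy / (sample_sd(hours) * sample_sd(scores))
print(cov_xy, round(r_from_cov, 3))
```

Covariance is expressed in the units of X times the units of Y, so it is the standardized version (r) that is compared across studies.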