Chapter 6 Analyzing the Relationship between Two Variables
specificity
= TN / (TN + FP) measures the proportion of subjects without the indicator who also have a negative test, that is, true negatives divided by the sum of true negatives and false positives
Sensitivity
= TP / (TP + FN) measures the proportion of subjects with the indicator present who also have a positive test, that is, true positives divided by the sum of true positives and false negatives
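As a minimal sketch of the two ratios, the counts below are hypothetical and the function names are chosen only for illustration:

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of subjects with the indicator present who test positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of subjects without the indicator who test negative."""
    return tn / (tn + fp)


# Hypothetical counts from a 2x2 table of test result vs. true status
tp, fn, tn, fp = 90, 10, 160, 40

sens = sensitivity(tp, fn)  # 90 / (90 + 10)
spec = specificity(tn, fp)  # 160 / (160 + 40)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```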
Correlation
A measure of the extent to which two factors vary together, and thus of how well either factor predicts the other.
What are the descriptive statistics used for?
Contingency tables - Used to display and analyze the relationship between two categorical variables
p-value
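The probability of obtaining results at least as extreme as those observed if the null hypothesis were true; it is compared with the alpha level to decide whether results are statistically significant (unlikely to be due to chance alone).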
simple linear regression
Used to characterize the linear relationship between a dependent variable and one independent variable; the fitted line is expressed in slope-intercept form.
coefficient of determination
measures the proportion of variance in the dependent variable that is explained by the independent variable
Pearson r correlation coefficient
measures the strength of the linear relationship between two continuous variables; ranges from -1 to +1, where a value of -1 indicates a perfect negative correlation and a value of +1 indicates a perfect positive correlation
t-test for correlation
t = r × √(n - 2) / √(1 - r^2), with n - 2 degrees of freedom. Used to test the null hypothesis that the correlation coefficient is zero
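A small sketch of how Pearson's r and the t statistic for correlation could be computed by hand. The data are hypothetical and the helper names are illustrative, not from any particular library:

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient from paired observations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_t(r, n):
    """t statistic for H0: correlation is zero (n - 2 degrees of freedom)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Hypothetical paired measurements
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
t = correlation_t(r, len(x))
print(f"r={r:.4f}, t={t:.4f}, df={len(x) - 2}")
```

The resulting t would then be compared with the critical value from a t table at the chosen alpha level.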
least squares regression line
the line that makes the sum of the squared residuals as small as possible
slope-intercept form
y=mx+b, where m is the slope and b is the y-intercept of the line.
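To illustrate how a least squares line ends up in slope-intercept form, here is a sketch using the standard closed-form formulas for m and b. The data are hypothetical:

```python
def least_squares_line(x, y):
    """Return (m, b) minimizing the sum of squared residuals for y = m*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    m = sxy / sxx             # slope
    b = mean_y - m * mean_x   # y-intercept
    return m, b


# Hypothetical paired measurements
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

m, b = least_squares_line(x, y)
print(f"y = {m:.2f}x + {b:.2f}")
```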
what are the steps in the chi-squared test of independence?
1. Determine the null and alternative hypotheses 2. Set the acceptable type I error or alpha level 3. Select the appropriate test statistic
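The sketch below follows these steps and then goes on to compute the statistic and compare it with a critical value, which the list above does not spell out. The contingency table is hypothetical, and the critical value 3.841 is the standard chi-square table entry for 1 degree of freedom at alpha = 0.05:

```python
def chi_square_statistic(observed):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Step 1: H0 = the two categorical variables are independent
# Step 2: alpha level
alpha = 0.05
# Step 3: chi-square statistic on a hypothetical 2x2 table
observed = [[20, 30],
            [30, 20]]
stat, df = chi_square_statistic(observed)

critical_value = 3.841  # chi-square table, df = 1, alpha = 0.05
reject_null = stat > critical_value
print(f"chi-square={stat:.3f}, df={df}, reject H0: {reject_null}")
```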
standardized residuals
Calculated by subtracting the average residual from each residual and dividing by the standard deviation of the residuals.
residual
Difference between the actual value of the dependent variable and the value predicted using the regression equation
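A brief sketch of residuals and standardized residuals. The data are hypothetical, the slope and intercept are the least squares fit for these particular data, and the sample standard deviation (n - 1 denominator) is an assumption, since some texts use a different denominator:

```python
import statistics

# Hypothetical paired measurements
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Least squares fit for these data: y = 0.6x + 2.2
slope, intercept = 0.6, 2.2

# Residual = actual value minus value predicted by the regression equation
residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]

# Standardized residual = (residual - mean residual) / sd of residuals
mean_res = statistics.mean(residuals)
sd_res = statistics.stdev(residuals)  # assumption: sample sd (n - 1)
standardized = [(res - mean_res) / sd_res for res in residuals]

for xi, res, z in zip(x, residuals, standardized):
    print(f"x={xi}: residual={res:+.2f}, standardized={z:+.3f}")
```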
Regression hypothesis tests must test the following assumptions regarding the residuals:
Independence - Normally distributed - Mean of zero
Spearman's rho
Measures the strength of the monotonic (rank-based) association between two ordinal variables or one ordinal and one continuous variable. A positive value means that both variables increase or decrease together (example: patient severity level and charges). A negative value means that one variable increases as the other decreases (example: grade in elementary school and time to run 100 yards).
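One common way to compute Spearman's rho is to rank each variable and take the Pearson correlation of the ranks. The sketch below assumes that approach, with tied values sharing their average rank; the severity levels and charges are hypothetical:

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical patient severity levels and charges
severity = [1, 2, 3, 4, 5]
charges = [1200, 1500, 1400, 2100, 3000]

rho = spearman_rho(severity, charges)
print(f"rho={rho:.2f}")
```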
R-squared statistic
The proportion of variation in Y that is explained by X, that is, how much of the variation in the output variable is explained by the input variable. Key word: explained.
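As a sketch of "explained" variation, R-squared can be computed as 1 minus the ratio of unexplained variation (sum of squared residuals) to total variation in Y. The data are hypothetical, and the slope and intercept are the least squares fit for these particular data:

```python
# Hypothetical paired measurements
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Least squares fit for these data: y = 0.6x + 2.2
slope, intercept = 0.6, 2.2

mean_y = sum(y) / len(y)

# Unexplained variation: squared differences between actual and predicted Y
sse = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))

# Total variation: squared differences between actual Y and its mean
sst = sum((yi - mean_y) ** 2 for yi in y)

r_squared = 1 - sse / sst
print(f"R-squared={r_squared:.2f}")
```

For simple linear regression this value equals the square of the Pearson correlation coefficient between X and Y.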
critical value
The dividing point between the region where the null hypothesis is rejected and the region where it is not rejected.
Chi-square
a common statistic used to analyze nominal and ordinal data to find differences between groups
