Test 2
r family
"variance accounted for" measures
covariance
A number (#) that reflects the extent to which two variables covary with one another, aka ŝxy
calculating n for independent samples t
n per group = (2*(noncentrality^2))/(d^2)
Calculating N for one-sample t
(noncentrality/d)^2
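A minimal Python sketch (not from the cards) combining the two sample-size formulas above with the noncentrality value of 2.8 for power = .80 from the noncentrality parameter card; the function names and the d = 0.5 example are illustrative.

```python
import math

def n_one_sample_t(d, delta=2.8):
    # Total N for a one-sample t: N = (noncentrality / d)^2
    return math.ceil((delta / d) ** 2)

def n_independent_t(d, delta=2.8):
    # n per group for an independent-samples t: n = 2 * (noncentrality / d)^2
    return math.ceil(2 * (delta / d) ** 2)

# Medium effect (d = 0.5) at power = .80 (delta = 2.8):
print(n_one_sample_t(0.5))   # 32 total
print(n_independent_t(0.5))  # 63 per group
```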
Kendall's Tau
*Dr. Woodard's preference; significance can be calculated. X is ordinal, Y is ordinal or higher.
Step 1: put the scores on X in order from lowest to highest.
P = # of consistent pairs (1st Y < 2nd Y); Q = # of inconsistent pairs (1st Y > 2nd Y)
Tau = (P - Q) / (P + Q)
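A minimal sketch of the pair-counting procedure above, assuming no tied scores; the example data is made up.

```python
def kendalls_tau(x, y):
    # Put the (X, Y) pairs in order on X, then count consistent (P) and
    # inconsistent (Q) pairs on Y. Assumes no ties.
    pairs = sorted(zip(x, y))
    p = q = 0
    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            if pairs[i][1] < pairs[j][1]:
                p += 1  # consistent: later X also has the larger Y
            else:
                q += 1  # inconsistent: later X has the smaller Y
    return (p - q) / (p + q)

print(kendalls_tau([1, 2, 3, 4], [1, 3, 2, 4]))  # 0.666...
```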
Spearman's rho
*special case of Pearson's r computed on ranks; X is ordinal, Y is ordinal or higher
ANOVA tree
CF --> SStotal (total variation, df = N-1)
---> SSbg (variation due to treatment, df = k-1)
---> SSwg (variation due to error, df = N-k)
planned comparisons
-based on theory -do not need significant omnibus F -limited to k-1 comparisons -don't need to control αFW -must be orthogonal
Orthogonal contrasts
-contrasts are NOT correlated -one contrast gives us no information about another contrast
post-hoc comparisons
-not based on theory, exploratory -need significant F -not limited to k-1 -should control αFW
cohen's d rule of thumb
0.2 = small, 0.5 = medium, 0.8 = large
Fcontrast calculation
Fcontrast = (SScontrast/dfcontrast)/MSwg, with dfcontrast = 1
SScontrast = (Sum(ci*Ti))^2 / (n*Sum(ci^2))
ci = the weight assigned to group i in the contrast (e.g. 1, -1, 0); Ti = the sum of all the x's in that group
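A sketch of the contrast formula above with hypothetical group totals; assumes equal group sizes n, as the formula does.

```python
def f_contrast(weights, totals, n, ms_wg):
    # SScontrast = (Sum(ci*Ti))^2 / (n * Sum(ci^2)); F = SScontrast / MSwg (df = 1)
    ss_contrast = (sum(c * t for c, t in zip(weights, totals)) ** 2
                   / (n * sum(c ** 2 for c in weights)))
    return ss_contrast / ms_wg

# Hypothetical: 3 groups of n = 10, contrast of group 1 vs group 2
print(f_contrast([1, -1, 0], [120, 90, 100], n=10, ms_wg=4.0))  # 11.25
```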
harmonic mean (effective sample size) for unequal sample sizes
= 2(n1)(n2)/(n1+n2)
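Worked example: with n1 = 10 and n2 = 40, effective n = 2(10)(40)/(10+40) = 800/50 = 16, much closer to the smaller group than the arithmetic mean of 25.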
eta squared
= SSbg / SStotal; the simplest effect size for ANOVA, aka the correlation ratio -positively biased -represents the proportion of variance in the DV accounted for by group membership
SStotal computation
= Sum(x^2) - CF, df = N-1
Correction Factor
= T^2/N
ANOVA
Analysis of Variance
ANOVA via multiple regression
Basic approach: through coding, each level of the IV in ANOVA becomes a predictor in MR. In this case, we call each predictor a "vector" because it is no longer a standalone variable.
Tukey's HSD
Basic rationale: obtain the difference between any pair of group means and compare it to a critical difference. -makes all pairwise comparisons -controls αFW -2nd most stringent -3rd most powerful
Dunn-Bonferroni
Chosen PW comparisons, OR PW & OW (make the choice a priori) -controls αFW -3rd most stringent -2nd most powerful
Coding: Orthogonal/contrast
Coding scheme: like orthogonal coding for planned comparisons in ANOVA. Constant: grand mean of all levels of the IV. Slopes: used to calculate the mean for each level or the differences among levels.
correlation
Correlation represents how much two variables (X&Y) covary together in a linear fashion.
Using R2 to Calculate F
For omnibus F: F(p, N-p-1) = (R^2/p) / ((1-R^2)/(N-p-1))
For each comparison: F(1, N-p-1) = r_comparison^2 / ((1-R^2)/(N-p-1))
p = k-1
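A sketch of the omnibus formula above; the R^2, k, and N values are hypothetical.

```python
def omnibus_f(r2, k, n):
    # F(p, N-p-1) = (R^2 / p) / ((1 - R^2) / (N - p - 1)), with p = k - 1
    p = k - 1
    return (r2 / p) / ((1 - r2) / (n - p - 1))

print(omnibus_f(0.20, k=3, n=30))  # F(2, 27) = 3.375
```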
One-way ANOVA
IV = 1 nominal, with 2 or more levels
DV = 1 interval/ratio
Standard Error of Estimate/prediction
On average, the predicted Y scores deviate from the actual Y scores by this many units. = SQRT(SSresidual/(N-2))
Dunnett
PW comparisons only, each against the control group
Coefficient of Determination for Regression
R^2 = SSregression/SStotal Amount of variance in Y accounted for by X
multiple regression
Regression using multiple predictors and one criterion variable. Equation: Y' = b0 + b1X1 + b2X2 + ...
-Adding more predictors increases the overall variance explained in Y (R^2)
-b0: constant
-other b/β weights (slopes): the unique contribution of each X (overlap among the Xs is not counted twice)
-If the Xs are correlated, the standardized regression weights (β) no longer equal the bivariate correlation (r) between X and Y
-Predictors/IVs: usually interval/ratio, can be dichotomous
-Criterion/DV: interval/ratio, continuous
SSwithin
= SStotal - SSbetween; df = dftotal - dfbetween
ANOVA matrix
Set up as follows: columns x1, x1^2, x2, x2^2, x3, x3^2. Sum each column separately. The sum of the non-squared column totals = T
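A sketch tying together the CF, SStotal, SSbetween, SSwithin, and F-ratio cards in this deck; the three groups of scores are hypothetical.

```python
groups = [[3, 5, 4], [6, 7, 8], [2, 3, 4]]  # k = 3 independent groups

T = sum(sum(g) for g in groups)             # sum of the non-squared totals
N = sum(len(g) for g in groups)
k = len(groups)

cf = T ** 2 / N                                         # correction factor
ss_total = sum(x ** 2 for g in groups for x in g) - cf  # df = N - 1
ss_bg = sum(sum(g) ** 2 / len(g) for g in groups) - cf  # df = k - 1
ss_wg = ss_total - ss_bg                                # df = N - k

f = (ss_bg / (k - 1)) / (ss_wg / (N - k))  # MSbg / MSwg
print(f)  # 13.0
```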
Coding: Effect
Similar to dummy coding except for the last level: the last level gets -1 on all vectors. Constant: grand mean of all levels of the IV. Slopes: difference between each level and the grand mean
SSbetween
= Sum((T per group)^2 / n) - CF, df = k-1
power
The probability of correctly rejecting a false null hypothesis
One-sample t-test for correlation
Used to test whether r differs from 0: t(N-2) = r*SQRT(N-2)/SQRT(1-r^2)
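Worked example: with r = .50 and N = 27, t(25) = .50*SQRT(25)/SQRT(1 - .25) = 2.5/.866 ≈ 2.89.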
familywise error rate
When k>2, we are making a set/family of comparisons, and the Type I error (α) accumulates/inflates across them. αFW = 1 - (1-αPC)^c, where c = # of comparisons and αPC = alpha per comparison
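Worked example: c = 10 comparisons at αPC = .05 gives αFW = 1 - (.95)^10 ≈ .40, a 40% chance of at least one Type I error in the family.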
Linear Additive Model for an Individual Score (X)
X = μ (population mean) + τ (treatment effect) + e (error)
General Linear Model
Y = b0 + bX + e ANOVA is a special case of (multiple) linear regression, which itself is a special case of the General Linear Model (GLM) GLM for MR: Y = b0 + b1X1 + b2X2 + ... + e GLM for ANOVA: Yij = μ + τj + eij
Regression equation
Y' = a + bX X: predictor (variable) Y: criterion (variable) Y': predicted value of Y a: intercept (constant) b: slope (constant)
Regression Formula with Z scores
Zy' = (rxy)(Zx); rxy = slope, intercept = 0
effect size
a simple way of quantifying the difference between two groups that has many advantages over the use of tests of statistical significance alone
biserial
artificially dichotomous x, continuous y
tetrachoric
artificially dichotomous x and y
pearson's r
both x and y are interval/ratio & normally distributed = covxy/(sx*sy)
Correlation Z-score formula
r = Sum(Zx*Zy)/(N-1) (z-scores computed with sample SDs); used for understanding how the z-scores of X & Y affect the correlation.
slope
change in predicted Y for a one-unit change in X. b = Sum((x - xbar)(y - ybar)) / Sum((x - xbar)^2) = covxy / variance of x
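A sketch tying together the regression cards (slope, intercept, SSresidual, R^2, and standard error of estimate); the x and y data are hypothetical.

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# b = Sum((x - xbar)(y - ybar)) / Sum((x - xbar)^2); a puts the line through the means
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar
y_pred = [a + b * xi for xi in x]  # Y' = a + bX

ss_residual = sum((yi - yp) ** 2 for yi, yp in zip(y, y_pred))
ss_total = sum((yi - y_bar) ** 2 for yi in y)
r2 = 1 - ss_residual / ss_total           # = SSregression / SStotal
see = (ss_residual / (n - 2)) ** 0.5      # standard error of estimate
print(b, a, round(r2, 3), round(see, 3))  # ~0.8 ~1.8 0.727 0.894
```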
fixed effects
IV levels chosen systematically; generalize only to those chosen levels -can also apply when all possible levels are used (e.g. male/female for sex)
Scheffe
compares all pairwise and otherwise (OW) comparisons -controls αFW -1st most stringent -4th most powerful
Fisher's LSD
compares all pairwise comparisons -doesn't control αFW -least stringent -most powerful
coefficient of determination for correlation
correlation coefficient squared, effect size
noncentrality parameter for one-sample t
d*SQRT(N)
noncentrality for independent samples t
d*SQRT(n/2)
SSregression/SSpredictor
degree of improvement over just using the mean score as the prediction = Sum(Ypredicted - Ymean)^2
Omega squared
effect size for ANOVA; should be 0 when there is no treatment effect and between 0 and 1 when there is -the proportional amount of population variance attributed to variance among the experimental treatments -the least biased effect size for ANOVA -not good to use when levels of the IV are extreme
estimated epsilon squared
effect size for ANOVA, a better estimate than eta squared = (SSbg - (dfbg)(MSwg))/SStotal, aka adjusted R-squared
Cohen's f
extension of Cohen's d: standard deviation among group means / standard error of estimate; a measure of effect size for ANOVA. Small: f = .1 Medium: f = .25 Large: f = .4
Coding: Dummy
For each vector, only one level gets a 1; all other levels get 0. Each level takes a turn getting the 1 until one level is left; the last level gets 0s on all vectors. Constant: mean of the level that gets all 0s. Slopes: difference between each level and the last level
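A small sketch of the dummy-coding scheme above for a 3-level IV (k - 1 = 2 vectors); the level names are hypothetical.

```python
levels = ["drug A", "drug B", "placebo"]  # the last level is the reference

def dummy_code(level):
    # One vector per non-reference level: 1 if the score belongs to that
    # level, 0 otherwise; the last level gets 0s on all vectors.
    return [1 if level == lev else 0 for lev in levels[:-1]]

for lev in levels:
    print(lev, dummy_code(lev))
# drug A [1, 0]
# drug B [0, 1]
# placebo [0, 0]  <- constant = this level's mean; slopes = differences from it
```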
noncentrality parameter
helps define your noncentral distribution and represents the degree to which the mean of the sampling distribution of the test statistic departs from the mean it would have if the null were true. Noncentrality parameter ≈ 2.8 when power = 0.8!
Assumptions for ANOVA
-independence of error scores
-normality of error scores
-homogeneity of variance
point-biserial
kind of pearson correlation x is dichotomous, y is continuous
phi
kind of pearson correlation x and y both dichotomous
Glass's Delta
measure of effect size; a modified Cohen's d that replaces the common SD with the SD of the control group (can't use if there is no control group)
SSresidual
the minimized squared error = Sum(Y - Ypredicted)^2
Kruskal-Wallis Test
non-parametric alternative to one-way ANOVA -each sample is independent -each group has at least 5 participants -uses ranks -good to use when the DV is ordinal
Trend Analysis
patterns of means across quantitative IV levels; the purpose is to establish a FUNCTIONAL relationship between IV and DV -used for non-linear trends -weights taken from a table of orthogonal polynomials -weights can be applied in the SScontrast formula: SScontrast = (Sum(ci*Ti))^2 / (n*Sum(ci^2)) -used to obtain F
random effects
randomly chosen levels; generalizable beyond the chosen IV levels
ANOVA tree for planned comparisons
see notes, too complicated for flashcard
Beta
standardized regression coefficient (aka standardized slope); the symbolic representation of the slope in standardized units. Independent of the M and SD of X & Y, so it can be compared directly to the Betas of other distributions. = rxy ONLY IN BIVARIATE REGRESSION
d family
standardized units of difference: the difference between two population means divided by the standard deviation of either population
b
symbolic representation of the slope in raw, unstandardized units; affected by the M and SD of X & Y, so not directly comparable across distributions
SStotal
total variation in the criterion =Sum(Y-Ymean)^2 OR =SSresidual + SSregression
Confidence interval for an r (to z)
transform r into a z. CI = Zr +/- Zcrit*SQRT(1/(n-3))
testing two correlations - z test
use Fisher's r-to-z transformation. Z = (Z1 - Z2)/SQRT(1/(n1-3) + 1/(n2-3))
Fisher's r-z transformation
used to see if there is a significant difference between two correlations, or to construct a confidence interval around a correlation
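A sketch of the r-to-z machinery from the cards above; the transformation itself, z = 0.5*ln((1+r)/(1-r)), is the standard formula even though the cards don't show it, and the example correlations are made up.

```python
import math

def r_to_z(r):
    # Fisher's r-to-z transformation (standard formula, not shown on the cards)
    return 0.5 * math.log((1 + r) / (1 - r))

def ci_for_r(r, n, z_crit=1.96):
    # CI = Zr +/- Zcrit * SQRT(1/(n-3)), computed (and returned) in z units
    zr, se = r_to_z(r), math.sqrt(1 / (n - 3))
    return zr - z_crit * se, zr + z_crit * se

def two_correlation_z(r1, n1, r2, n2):
    # Z = (Z1 - Z2) / SQRT(1/(n1-3) + 1/(n2-3))
    return (r_to_z(r1) - r_to_z(r2)) / math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))

print(two_correlation_z(0.60, 50, 0.30, 50))  # ~1.86, n.s. at alpha = .05
```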
F ratio
variance due to treatment/variance due to error AKA MSbetween groups/MSwithin groups AKA (SSbg/dfbg)/(SSwg/dfwg)
Modified Bonferroni Procedure
when you do more than k-1 planned comparisons, as backed by theory: αPC = αFW/c = (k-1)(α)/c
Ordinary Least Squares Regression
fits the line with the goal of minimizing the sum of the squares of the differences between the observed responses in the dataset and those predicted by the linear function
one-sample z-test for correlation
z = (Zr - Zp)/SQRT(1/(n-3)), where Zp is the transformed hypothesized population correlation (ρ)
Bonferroni correction for α per comparison
αPC = α/c
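Worked example: for c = 10 comparisons at αFW = .05, each comparison is tested at αPC = .05/10 = .005.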