Important Statistics Formulas
Degrees of Freedom
The correct formula for degrees of freedom (DF) depends on the situation (the nature of the test statistic, the number of samples, underlying assumptions, etc.).
One-sample t-test: DF = n - 1
Two-sample t-test: DF = (s1^2/n1 + s2^2/n2)^2 / { [ (s1^2 / n1)^2 / (n1 - 1) ] + [ (s2^2 / n2)^2 / (n2 - 1) ] }
Two-sample t-test, pooled standard error: DF = n1 + n2 - 2
Simple linear regression, test slope: DF = n - 2
Chi-square goodness of fit test: DF = k - 1
Chi-square test for homogeneity: DF = (r - 1) * (c - 1)
Chi-square test for independence: DF = (r - 1) * (c - 1)
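As a quick illustration, here is a minimal Python sketch of the two-sample (Welch) degrees-of-freedom formula above; the sample sizes and standard deviations are invented.

```python
def welch_df(s1, n1, s2, n2):
    """Welch degrees of freedom from sample standard deviations and sizes."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

print(welch_df(s1=2.5, n1=12, s2=4.0, n2=15))   # a non-integer DF is normal here
```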
Probability
Rule of addition: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Rule of multiplication: P(A ∩ B) = P(A) * P(B|A)
Rule of subtraction: P(A') = 1 - P(A)
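A small Python sketch that checks the rule of addition by enumerating one roll of a fair die; the events A (even roll) and B (roll above 3) are chosen arbitrarily.

```python
outcomes = set(range(1, 7))
A = {x for x in outcomes if x % 2 == 0}   # even roll
B = {x for x in outcomes if x > 3}        # roll greater than 3

def p(event):
    return len(event) / len(outcomes)

print(p(A | B))                  # 0.666...
print(p(A) + p(B) - p(A & B))    # the same value, by the rule of addition
```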
Correlation
Pearson product-moment correlation: r = Σ (xy) / sqrt[ ( Σ x^2 ) * ( Σ y^2 ) ], where x and y are deviation scores (xi - x̄ and yi - ȳ)
Linear correlation (sample data): r = [ 1 / (n - 1) ] * Σ { [ (xi - x̄) / sx ] * [ (yi - ȳ) / sy ] }
Linear correlation (population data): ρ = [ 1 / N ] * Σ { [ (Xi - μX) / σX ] * [ (Yi - μY) / σY ] }
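A short Python sketch of the sample linear correlation formula, using the standard-library statistics module; the x and y values are illustrative only.

```python
from statistics import mean, stdev

x = [1, 2, 3, 4, 5]   # illustrative data
y = [2, 1, 4, 3, 5]
n = len(x)
xbar, ybar, sx, sy = mean(x), mean(y), stdev(x), stdev(y)
r = sum(((xi - xbar) / sx) * ((yi - ybar) / sy)
        for xi, yi in zip(x, y)) / (n - 1)
print(r)
```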
Counting
n factorial: n! = n * (n - 1) * (n - 2) * ... * 3 * 2 * 1. By convention, 0! = 1.
Permutations of n things, taken r at a time: nPr = n! / (n - r)!
Combinations of n things, taken r at a time: nCr = n! / [ r! * (n - r)! ] = nPr / r!
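For reference, the standard-library math module covers these counting formulas directly (math.perm and math.comb need Python 3.8 or later); the values of n and r below are arbitrary.

```python
import math

n, r = 5, 2                 # arbitrary example values
print(math.factorial(n))    # 120, i.e. n!
print(math.perm(n, r))      # 20,  i.e. n! / (n - r)!
print(math.comb(n, r))      # 10,  i.e. n! / [ r! * (n - r)! ]
```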
Discrete Probability Distributions
Binomial formula: P(X = x) = b(x; n, P) = nCx * P^x * (1 - P)^(n - x) = nCx * P^x * Q^(n - x)
Mean of binomial distribution: μx = n * P
Variance of binomial distribution: σx^2 = n * P * (1 - P)
Negative Binomial formula: P(X = x) = b*(x; r, P) = x-1Cr-1 * P^r * (1 - P)^(x - r)
Mean of negative binomial distribution: μx = r / P
Variance of negative binomial distribution: σx^2 = r * Q / P^2
Geometric formula: P(X = x) = g(x; P) = P * Q^(x - 1)
Mean of geometric distribution: μx = 1 / P
Variance of geometric distribution: σx^2 = Q / P^2
Hypergeometric formula: P(X = x) = h(x; N, n, k) = [ kCx ] * [ N-kCn-x ] / [ NCn ]
Mean of hypergeometric distribution: μx = n * k / N
Variance of hypergeometric distribution: σx^2 = n * k * (N - k) * (N - n) / [ N^2 * (N - 1) ]
Poisson formula: P(x; μ) = (e^-μ) * (μ^x) / x!
Mean of Poisson distribution: μx = μ
Variance of Poisson distribution: σx^2 = μ
Multinomial formula: P = [ n! / ( n1! * n2! * ... * nk! ) ] * ( p1^n1 * p2^n2 * ... * pk^nk )
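A minimal Python sketch of the binomial formula and its mean and variance; n, P, and x are arbitrary example values.

```python
from math import comb

def binom_pmf(x, n, P):
    """P(X = x) for a binomial(n, P) random variable."""
    return comb(n, x) * P**x * (1 - P)**(n - x)

n, P = 10, 0.3
print(binom_pmf(3, n, P))                              # P(X = 3)
print(n * P, n * P * (1 - P))                          # mean and variance
print(sum(binom_pmf(x, n, P) for x in range(n + 1)))   # probabilities sum to 1.0
```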
Estimation
Confidence interval: Sample statistic ± (Critical value) * (Standard error of statistic)
Margin of error = (Critical value) * (Standard deviation of statistic)
Margin of error = (Critical value) * (Standard error of statistic)
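A brief Python sketch of a confidence interval built from the margin-of-error formula; the data are invented, and 1.96 is used as an approximate 95% z critical value (a t critical value would be more appropriate for a sample this small).

```python
from math import sqrt
from statistics import mean, stdev

data = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]   # invented sample
xbar = mean(data)
se = stdev(data) / sqrt(len(data))   # standard error of the mean
me = 1.96 * se                       # margin of error, z critical value assumed
print(xbar - me, xbar + me)          # lower and upper confidence limits
```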
Hypothesis Testing
Standardized test statistic = (Statistic - Parameter) / (Standard deviation of statistic)
One-sample z-test for proportions: z-score = z = (p - P0) / sqrt( p * q / n )
Two-sample z-test for proportions: z-score = z = [ (p1 - p2) - d ] / SE
One-sample t-test for means: t-score = t = (x̄ - μ) / SE
Two-sample t-test for means: t-score = t = [ (x̄1 - x̄2) - d ] / SE
Matched-sample t-test for means: t-score = t = [ (x̄1 - x̄2) - D ] / SE = (d̄ - D) / SE
Chi-square test statistic = Χ^2 = Σ [ (Observed - Expected)^2 / Expected ]
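A minimal Python sketch of the one-sample t statistic; the data and the hypothesized mean are invented.

```python
from math import sqrt
from statistics import mean, stdev

data = [12.6, 12.1, 12.9, 12.4, 12.7, 12.3]   # invented sample
mu0 = 12.0                                    # hypothesized mean
se = stdev(data) / sqrt(len(data))            # standard error of the mean
t = (mean(data) - mu0) / se
print(t)   # compare with a t critical value at DF = n - 1 = 5
```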
Linear Transformations
For the following formulas, assume that Y is a linear transformation of the random variable X, defined by the equation: Y = aX + b.
Mean of a linear transformation: E(Y) = a * E(X) + b
Variance of a linear transformation: Var(Y) = a^2 * Var(X)
Standardized score: z = (x - μx) / σx
t-score: t = (x̄ - μx) / [ s / sqrt(n) ]
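A small Python check, on an invented probability table, that the transformation Y = aX + b behaves as the two formulas above state.

```python
xs = [0, 1, 2, 3]            # values of X (invented)
ps = [0.1, 0.4, 0.3, 0.2]    # P(X = x)
a, b = 3, 5                  # Y = aX + b

ex = sum(x * p for x, p in zip(xs, ps))
varx = sum((x - ex)**2 * p for x, p in zip(xs, ps))
ey = sum((a * x + b) * p for x, p in zip(xs, ps))
vary = sum((a * x + b - ey)**2 * p for x, p in zip(xs, ps))

print(ey, a * ex + b)        # equal: E(Y) = a * E(X) + b
print(vary, a**2 * varx)     # equal: Var(Y) = a^2 * Var(X)
```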
Parameters
Population mean: μ = ( Σ Xi ) / N
Population standard deviation: σ = sqrt[ Σ ( Xi - μ )^2 / N ]
Population variance: σ^2 = Σ ( Xi - μ )^2 / N
Variance of population proportion: σP^2 = P * Q / n
Standardized score: Z = (X - μ) / σ
Population correlation coefficient: ρ = [ 1 / N ] * Σ { [ (Xi - μX) / σX ] * [ (Yi - μY) / σY ] }
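A minimal Python sketch of the population mean, variance, and standard deviation for a small finite population (note the division by N rather than n - 1); the values are illustrative.

```python
from math import sqrt

X = [4, 8, 6, 5, 3]                     # a small finite population
N = len(X)
mu = sum(X) / N                         # population mean
var = sum((x - mu)**2 for x in X) / N   # population variance (divide by N)
print(mu, var, sqrt(var))
```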
Random Variables
In the following formulas, X and Y are random variables, and a and b are constants.
Expected value of X: E(X) = μx = Σ [ xi * P(xi) ]
Variance of X: Var(X) = σ^2 = Σ [ xi - E(X) ]^2 * P(xi) = Σ [ xi - μx ]^2 * P(xi)
Normal random variable: z-score = z = (X - μ) / σ
Chi-square statistic: Χ^2 = [ ( n - 1 ) * s^2 ] / σ^2
f statistic: f = [ s1^2 / σ1^2 ] / [ s2^2 / σ2^2 ]
Expected value of sum of random variables: E(X + Y) = E(X) + E(Y)
Expected value of difference between random variables: E(X - Y) = E(X) - E(Y)
Variance of the sum of independent random variables: Var(X + Y) = Var(X) + Var(Y)
Variance of the difference between independent random variables: Var(X - Y) = Var(X) + Var(Y)
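A rough Python simulation that checks Var(X + Y) = Var(X) + Var(Y) for independent X and Y; the distributions and the seed are arbitrary, and the printed values agree only approximately because of sampling noise.

```python
import random
from statistics import pvariance

random.seed(0)
X = [random.gauss(0, 2) for _ in range(50_000)]   # Var(X) = 4
Y = [random.gauss(5, 3) for _ in range(50_000)]   # Var(Y) = 9
print(pvariance([x + y for x, y in zip(X, Y)]))   # close to 13
print(pvariance(X) + pvariance(Y))                # also close to 13
```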
Sample Size
The first two formulas find the smallest sample sizes required to achieve a fixed margin of error, using simple random sampling. The third formula assigns the sample to strata, based on a proportionate design. The fourth formula, Neyman allocation, uses stratified sampling to minimize variance, given a fixed sample size. And the last formula, optimum allocation, uses stratified sampling to minimize variance, given a fixed budget.
Mean (simple random sampling): n = { z^2 * σ^2 * [ N / (N - 1) ] } / { ME^2 + [ z^2 * σ^2 / (N - 1) ] }
Proportion (simple random sampling): n = [ ( z^2 * p * q ) + ME^2 ] / [ ME^2 + z^2 * p * q / N ]
Proportionate stratified sampling: nh = ( Nh / N ) * n
Neyman allocation (stratified sampling): nh = n * ( Nh * σh ) / [ Σ ( Ni * σi ) ]
Optimum allocation (stratified sampling): nh = n * [ ( Nh * σh ) / sqrt( ch ) ] / [ Σ ( Ni * σi ) / sqrt( ci ) ]
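A short Python sketch of the sample-size formula for a proportion under simple random sampling; z = 1.96, a planning value of p = 0.5, N = 10,000, and ME = 0.03 are assumptions chosen only for illustration.

```python
from math import ceil

z, p, N, ME = 1.96, 0.5, 10_000, 0.03   # assumed inputs
q = 1 - p
n = (z**2 * p * q + ME**2) / (ME**2 + z**2 * p * q / N)
print(ceil(n))   # round up to the next whole respondent
```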
Sampling Distributions
Mean of sampling distribution of the mean: μx̄ = μ
Mean of sampling distribution of the proportion: μp = P
Standard deviation of proportion: σp = sqrt[ P * (1 - P) / n ] = sqrt( P * Q / n )
Standard deviation of the mean: σx̄ = σ / sqrt(n)
Standard deviation of difference of sample means: σd = sqrt[ (σ1^2 / n1) + (σ2^2 / n2) ]
Standard deviation of difference of sample proportions: σd = sqrt{ [ P1 * (1 - P1) / n1 ] + [ P2 * (1 - P2) / n2 ] }
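A rough Python simulation of the standard deviation of the mean, σ / sqrt(n); the population parameters and the seed are arbitrary, and the two printed values agree only approximately.

```python
import random
from math import sqrt
from statistics import pstdev

random.seed(1)
sigma, n = 4.0, 25
means = [sum(random.gauss(10, sigma) for _ in range(n)) / n
         for _ in range(10_000)]
print(pstdev(means))     # close to the theoretical value
print(sigma / sqrt(n))   # 0.8
```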
Simple Linear Regression
Simple linear regression line: ŷ = b0 + b1 * x
Regression coefficient: b1 = Σ [ (xi - x̄) * (yi - ȳ) ] / Σ [ (xi - x̄)^2 ]
Regression slope intercept: b0 = ȳ - b1 * x̄
Regression coefficient: b1 = r * (sy / sx)
Standard error of regression slope: sb1 = sqrt[ Σ (yi - ŷi)^2 / (n - 2) ] / sqrt[ Σ (xi - x̄)^2 ]
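A minimal Python sketch of the least-squares slope and intercept formulas; the data points are invented.

```python
x = [1, 2, 3, 4, 5]                 # invented data
y = [2.1, 2.9, 3.6, 4.4, 5.1]
xbar, ybar = sum(x) / len(x), sum(y) / len(y)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
     / sum((xi - xbar)**2 for xi in x)
b0 = ybar - b1 * xbar
print(b0, b1)                       # ŷ = b0 + b1 * x
```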
Standard Error
Standard error of proportion: SEp = sp = sqrt[ p * (1 - p) / n ] = sqrt( p * q / n )
Standard error of difference for proportions: SEp = sp = sqrt{ p * ( 1 - p ) * [ (1/n1) + (1/n2) ] }
Standard error of the mean: SEx = sx = s / sqrt(n)
Standard error of difference of sample means: SEd = sd = sqrt[ (s1^2 / n1) + (s2^2 / n2) ]
Standard error of difference of paired sample means: SEd = sd = sqrt[ Σ (di - d̄)^2 / (n - 1) ] / sqrt(n)
Pooled sample standard error: spooled = sqrt{ [ (n1 - 1) * s1^2 + (n2 - 1) * s2^2 ] / (n1 + n2 - 2) }
Standard error of difference of sample proportions: sd = sqrt{ [ p1 * (1 - p1) / n1 ] + [ p2 * (1 - p2) / n2 ] }
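A very small Python sketch of the standard error of the difference between two sample means; the summary statistics are made up.

```python
from math import sqrt

s1, n1 = 2.5, 30   # made-up summary statistics
s2, n2 = 3.1, 40
print(sqrt(s1**2 / n1 + s2**2 / n2))   # SE of the difference of means
```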
Statistics
Unless otherwise noted, these formulas assume simple random sampling.
Sample mean: x̄ = ( Σ xi ) / n
Sample standard deviation: s = sqrt[ Σ ( xi - x̄ )^2 / ( n - 1 ) ]
Sample variance: s^2 = Σ ( xi - x̄ )^2 / ( n - 1 )
Variance of sample proportion: sp^2 = p * q / (n - 1)
Pooled sample proportion: p = (p1 * n1 + p2 * n2) / (n1 + n2)
Pooled sample standard deviation: sp = sqrt{ [ (n1 - 1) * s1^2 + (n2 - 1) * s2^2 ] / (n1 + n2 - 2) }
Sample correlation coefficient: r = [ 1 / (n - 1) ] * Σ { [ (xi - x̄) / sx ] * [ (yi - ȳ) / sy ] }
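For reference, the standard-library statistics module implements the sample mean, variance, and standard deviation with the n - 1 denominator shown above; the data below are illustrative.

```python
from statistics import mean, stdev, variance

x = [4, 8, 6, 5, 3, 7]                  # illustrative data
print(mean(x), variance(x), stdev(x))   # x̄, s^2, s (n - 1 denominators)
```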