STATS Symbols
GREEK with Latin: σp̂
"sigma-sub-p-hat" Spread of the Sampling Distribution of p̂ The standard deviation of the sample proportions. The standard deviation of the sampling distribution of p̂ has a special name: standard error of the proportion or SEP; its symbol is σp̂ ("sigma-sub-p-hat"). The standard error of the proportion for sample size n equals the square root of the population proportion, times 1 minus the population proportion, divided by the sample size: SEP or σp̂ = √[pq/n]. This is true regardless of the proportion in the original population and regardless of sample size.
(Ch 11) χ²
"chi-squared"
P80
80th percentile (Pk = k-th percentile)
median of a sample
The middle value when the data are arranged in order (the 50th percentile)
(CH 11) Chi-square
Involves categorical variables: compares two distributions of categorical data to see whether they differ from each other. (Brownmath CH 9 & 12)
GREEK: μ
Population - mu, pronounced "mew" = mean (average) of a population.
GREEK: ρ
Population - rho, pronounced "roe" = linear correlation coefficient of a population.
f
frequency
E
margin of error, a/k/a maximum error of the estimate
(Ch 7) confidence interval
the range of values within which a population parameter is estimated to lie
GREEK with Latin: σx̅
"sigma-sub-x-bar" The standard deviation of the sampling distribution of x̅ has a special name: standard error of the mean or SEM; its symbol is σx̅. The standard error of the mean for sample size n equals the standard deviation of the population divided by the square root of n: SEM or σx̅ = σ/√n. This is true regardless of the shape of the original population and regardless of sample size.
x̅
"x-bar" = mean of a sample
x̃
"x-tilde" = median of a sample
ŷ
"y-hat" = predicted average y value for a given x, found by using the regression equation. predicted value of y
normal distribution
(CH 1) a bell-shaped curve, describing the spread of a characteristic throughout a population
M or Med
(CH 1) median of a sample
Frequency
(CH 2) The sizes of categories can be shown as raw counts, called frequencies, or percentages, called relative frequencies. (Relative frequencies can also be shown as decimals, but I think most people respond better to "20%" than ".20".) How do you decide whether to show frequencies or relative frequencies? This is a stylistic choice, not a matter of right and wrong. Your choice depends on what's important, what point you're trying to make. If your main concern is just with the individuals in your sample, go with frequencies. But if you want to show the relationship of the parts to the whole, show relative frequencies.
Interquartile Range (IQR)
(CH 2) the difference between the third and first quartiles (Q3 − Q1)
Null Hypothesis (H0)
(CH 8) A statement of "no difference."
Alternative Hypothesis (Ha)
(CH 8) The claim about the population that we are trying to find evidence for.
hypothesis test
(CH 8) a statistical method that uses sample data to evaluate a hypothesis about a population
relative frequency
(Ch 3.1) A ratio that compares the frequency of each category to the total.
z score equation
z = (x − mean) / standard deviation
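A minimal Python sketch of the z-score equation; the data value, mean, and standard deviation are made-up for illustration.

x = 85          # data value (made-up)
mean = 70       # mean (made-up)
sd = 10         # standard deviation (made-up)

z = (x - mean) / sd
print(f"z = {z:.2f}")   # (85 - 70) / 10 = 1.50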
Quartile
A division of the ordered data into four intervals, each containing one-fourth of the data; the cut points are Q1, Q2 (the median), and Q3.
(Ch 5) Binomial Probability Distribution
A probability distribution showing the probability of x successes in n trials of a binomial experiment.
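A minimal Python sketch of the binomial probability formula P(X = x) = C(n, x) · pˣ · qⁿ⁻ˣ; the values n = 10, p = 0.4, x = 3 are made-up for illustration.

from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials, each with success probability p."""
    q = 1 - p                             # probability of failure on one trial
    return comb(n, x) * p**x * q**(n - x)

print(f"P(X = 3) = {binomial_pmf(3, 10, 0.4):.4f}")   # ≈ 0.2150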
df or ν "nu"
Degrees of Freedom (Brownmath CH 9 &12)
Greek letters are used for the population parameters.
Latin letters are used for the sample statistics.
P(B | A)
P(A and B)/P(A), the probability that event B will happen, given that event A definitely happens. It's usually read as the probability of B given A. Defined here in Brownmath Chapter 5. Caution! The order of A and B may seem backward to you at first.
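A minimal Python sketch of the conditional-probability definition; the probabilities are made-up for illustration.

p_a = 0.5           # P(A) (made-up)
p_a_and_b = 0.2     # P(A and B) (made-up)

p_b_given_a = p_a_and_b / p_a        # P(B | A) = P(A and B) / P(A)
print(f"P(B | A) = {p_b_given_a}")   # 0.2 / 0.5 = 0.4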
GREEK: ν "nu" or df
Population - ν "nu" or df number of categories minus 1
GREEK: σ
Population - LOWER CASE "sigma" = standard deviation of a population.
GREEK: α (alpha)
Population - Significance level in hypothesis test, or acceptable probability of a Type I error (probability you can live with).
GREEK: ∑
Population - UPPER CASE "sigma" means "add them up." Examples:
∑x = sum of all data points. (x means a data point; written out the long way, it would be x1 + x2 + x3 + ... + xn, where n is the size of your data set.)
∑x² = square each data point and add up the squares. (∑ is an addition operator, so powers and multiplication happen before the summation.)
∑xf = multiply each unique data point by the number of times it occurs, and add up the results. (f means frequency or repetition count.)
∑x²f = square each unique data point and multiply by the number of times it occurs, then add up the results.
∑(x − x̅)² = take each data point, subtract the average of the whole sample, square the result, and add up all the squares. (x̅ is the average of a sample. The parentheses tell you that you don't square the average; you square the differences.)
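A minimal Python sketch of these summations on a tiny made-up data set.

x = [2, 3, 3, 5]                                   # data points (made-up)
freq = {2: 1, 3: 2, 5: 1}                          # frequency f of each unique value

sum_x   = sum(x)                                   # ∑x   = 13
sum_x2  = sum(xi**2 for xi in x)                   # ∑x²  = 47
sum_xf  = sum(v * f for v, f in freq.items())      # ∑xf  = 13 (same as ∑x)
sum_x2f = sum(v**2 * f for v, f in freq.items())   # ∑x²f = 47 (same as ∑x²)

x_bar = sum_x / len(x)                             # sample mean x̅ = 3.25
sum_sq_dev = sum((xi - x_bar)**2 for xi in x)      # ∑(x − x̅)² = 4.75
print(sum_x, sum_x2, sum_xf, sum_x2f, sum_sq_dev)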
Sample Variance Formula
s² = ∑(x − x̅)² / (n − 1): the square of the sample standard deviation (not its square root).
degrees of freedom
The number of individual scores that can vary without changing the sample mean. For a single sample, statistically written as 'n − 1', where n represents the number of subjects.
Sample Mean Formula
The sample mean formula is x̄ = (Σ xi) / n, where x̄ stands for the "sample mean," Σ means "add up," xi are all of the x-values, and n is the number of items in the sample. To find the mean, sum all the numbers and then divide by the number of items in the set. For example, to find the mean of the set 21, 23, 24, 26, 28, 29, 30, 31, 33: first add them all together, 21 + 23 + 24 + 26 + 28 + 29 + 30 + 31 + 33 = 245; then divide by the number of items in the set. There are 9 numbers, so 245 / 9 ≈ 27.222.
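A minimal Python check of the worked example above.

data = [21, 23, 24, 26, 28, 29, 30, 31, 33]

x_bar = sum(data) / len(data)     # 245 / 9
print(f"x-bar = {x_bar:.3f}")     # ≈ 27.222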
Slope of a line
The slope of a line characterizes the direction of the line. To find the slope, divide the difference of the y-coordinates of 2 points on the line by the difference of the x-coordinates of those same 2 points. (The TI-83 uses a; some statistics books use b1.)
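A minimal Python sketch of the slope calculation; the two points are made-up for illustration.

x1, y1 = 1, 2    # first point (made-up)
x2, y2 = 4, 8    # second point (made-up)

m = (y2 - y1) / (x2 - x1)   # difference of y-coordinates over difference of x-coordinates
print(f"slope = {m}")       # (8 - 2) / (4 - 1) = 2.0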
(CH 6.4) Central Limit Theorem
The theorem that, as the sample size n increases, the distribution of the means of randomly selected samples of size n approaches a normal distribution.
Sample Standard Deviation Formula
The Σ means "to add up", so what you're basically doing to find the sample standard deviation is adding your numbers, squaring them and dividing.
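A minimal Python sketch of the sample standard deviation, computed by the formula and checked against the statistics module, using the made-up data set from the sample mean card.

import statistics

data = [21, 23, 24, 26, 28, 29, 30, 31, 33]

x_bar = sum(data) / len(data)
s_manual = (sum((x - x_bar)**2 for x in data) / (len(data) - 1)) ** 0.5
s_library = statistics.stdev(data)          # uses the same n - 1 formula

print(f"{s_manual:.4f} {s_library:.4f}")    # both ≈ 3.9931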
z-score
a measure of how many standard deviations you are away from the norm (average or mean)
margin of error
a measure of the accuracy of an estimate made from a sample (most familiar from public opinion polls); it is half the width of the confidence interval
coefficient of determination
a measure of the amount of variation in the dependent variable about its mean that is explained by the regression equation R² measures the quality of the regression line as a means of predicting ŷ from x: the closer R² is to 1, the better the line. Another way to look at it is that R² measures how much of the total variation in y is predicted by the line.
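A minimal Python sketch showing R² computed as the square of the correlation coefficient r; the x and y data and the use of numpy are illustrative assumptions.

import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)    # made-up x data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])       # made-up y data

r = np.corrcoef(x, y)[0, 1]    # linear correlation coefficient
r_squared = r**2               # fraction of the variation in y explained by the line
print(f"r = {r:.4f}, R² = {r_squared:.4f}")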
(CH 10.1) linear correlation coefficient
a measure of the strength and direction of the linear relation between two quantitative variables
X (capital X)
a variable (in probability contexts, a capital X usually denotes a random variable)
critical z score
a z score that separates common from rare outcomes and hence dictates whether H0 should be retained or rejected
H1 or Ha
alternative hypothesis
BD or BPD
binomial probability distribution.
CLT
central limit theorem
χ²
chi squared
R²
coefficient of determination; the fraction r² of the variation in y is explained by the least-squares regression of y on x. (The calculator displays r², but the capital letter is standard notation.)
CI
confidence interval
discrete probability distribution
consists of the values a random variable can assume and the corresponding probabilities of the values (Brownmath CH 6)
d
difference between paired data. (In Brownmath, it's chapter 11. CH is unknown in my textbook)
DPD
discrete probability distribution
Q1
first quartile (Q3 or Q3 = third quartile)
HT
hypothesis test
IQR
interquartile range (Q3-Q1)
k-th percentile
the value below which k percent of the ordered data fall (symbol Pk)
(CH 10.1) r
linear correlation coefficient of a sample
probability value
measure of the likelihood of the deviation of some observed results from the value that was expected if the null hypothesis were true. The specific meaning depends on context. In geometric and binomial probability distributions, p is the probability of "success" (Brownmath defined here in Chapter 6) on any one trial and q = (1−p) is the probability of "failure" (the only other possibility) on any one trial. In hypothesis testing, p is the calculated p-value (defined here in Brownmath Chapter 10), the probability that rejecting the null hypothesis would be a wrong decision. In tests of population proportions, p stands for the population proportion and p̂ for the sample proportion (see σp̂ above).
ND
normal distribution, whose graph is a bell-shaped curve; also "normally distributed".
H0 (null hypothesis)
null hypothesis
x (lower-case x)
one data value ("raw score"). As a column heading, x means a series of data values.
GREEK: β "beta"
population - in a hypothesis test, the acceptable probability of a Type II error; 1−β is called the power of the test.
N
population size
q
probability of failure on any one trial in binomial or geometric distribution, equal to (1−p) where p is the probability of success on any one trial.
p
probability value
f/n
relative frequency
n
sample size, number of data points.
z score equation population
z = (x − μ) / σ, where μ and σ are the population mean and standard deviation
z score equation sample
z = (x − x̅) / s, where x̅ and s are the sample mean and standard deviation
m
slope of a line.
SD (or s.d.)
standard deviation
square root of variance
standard deviation
s
standard deviation of a sample
(See above for definition) GREEK with Latin: σx̅
standard error of the mean
SEM
standard error of the mean (symbol is σx̅)
(See above for definition) GREEK with Latin: σp̂
standard error of the proportion
z
standard score or z-score
Population Standard Deviation Formula
the square root of: the sum of [ (each value minus the population mean) squared ], divided by the population size; σ = √[ Σ(x − μ)² / N ]
P(A)
the probability of event A
P(Ac) or P(not A)
the probability that A does not happen
standard deviation of a sample
the square root of the variance
y-intercept (see also b or b0 below)
the y-coordinate of a point where a graph crosses the y-axis
z(area)
This is not a multiplication! z(area), also written z_area and also known as critical z, is the z-score that divides the standard normal distribution such that the right-hand tail has the indicated area; that is, the z-score with that much of the area under the normal curve lying to its right.
b or b0
y-intercept of a line. Defined in Brownmath Chapter 4. (Some statistics books use b0.) https://brownmath.com/swt/symbol.htm
Population Mean Formula
μ = ∑X / N
population variance
σ² = Σ ( Xi - μ )² / N
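A minimal Python sketch of the population mean, variance, and standard deviation formulas; the five values below are a made-up data set treated as the entire population.

import statistics

population = [4, 8, 6, 5, 3]     # made-up data, treated as the whole population
N = len(population)

mu = sum(population) / N                               # μ = ∑X / N
sigma_sq = sum((x - mu)**2 for x in population) / N    # σ² = ∑(X − μ)² / N
sigma = sigma_sq ** 0.5                                # σ = √σ²

# statistics.pvariance / pstdev use the same divide-by-N population formulas.
assert abs(sigma_sq - statistics.pvariance(population)) < 1e-12
assert abs(sigma - statistics.pstdev(population)) < 1e-12
print(mu, sigma_sq, sigma)       # 5.2, 2.96, ≈ 1.7205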