Chapter 3: Describing Data: Numerical Measures
The mode can be determined for data measured on which scales of measurement?
nominal, ordinal, interval, or ratio
Sample Standard Deviation Formula
s = square root of s^2 (the sample variance)
Sample Variance Formula
s^2 = ∑(x - x̄)^2 / (n - 1), where s^2 is the sample variance, x is the value of each observation in the sample, x̄ is the mean of the sample, and n is the number of observations in the sample
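The sample variance and sample standard deviation formulas can be sketched in Python as follows. The function names and the data are made up for illustration and are not from the chapter.

```python
import math

def sample_variance(values):
    """Return s^2: squared deviations from the sample mean, summed and divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_std_dev(values):
    """Return s, the square root of the sample variance."""
    return math.sqrt(sample_variance(values))

# Hypothetical sample, made up for illustration.
sample = [2, 4, 4, 5, 7, 8]
s2 = sample_variance(sample)   # squared deviations sum to 24, and 24 / 5 = 4.8
s = sample_std_dev(sample)     # square root of 4.8, about 2.19
```

Note the divisor is n - 1 (here 5), not n (here 6).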
sample variance
s²
range
the difference between the highest and lowest scores in a distribution
Variance
The arithmetic mean of the squared deviations from the mean
Parameter
a characteristic of a population
Statistic
a characteristic of a sample
weighted mean
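A mean in which each value is multiplied by its weight (for example, the number of times it occurs) and the sum of those products is divided by the sum of the weights. It is a convenient way to compute the arithmetic mean when there are several observations of the same value.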
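A minimal Python sketch of a weighted mean, assuming the weights are the number of times each value occurs. The helper name and the prices are hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Hypothetical data: 3 items sold at 2 dollars, 5 at 4 dollars, 2 at 6 dollars.
prices = [2, 4, 6]
counts = [3, 5, 2]
avg_price = weighted_mean(prices, counts)   # (6 + 20 + 12) / 10 = 3.8

# Same answer as the plain arithmetic mean of all ten individual sale prices.
all_sales = [2] * 3 + [4] * 5 + [6] * 2
plain_mean = sum(all_sales) / len(all_sales)
```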
Process for computing population variance
1. Find the mean 2. Find the difference between each observation and the mean, and square that difference 3. Sum all of the squared differences 4. Divide the sum of the squared differences by the number of items in the population
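The four steps above can be sketched in Python as follows. The function name and the five-value population are made up for illustration.

```python
def population_variance(values):
    """Follow the four steps for a population variance."""
    n = len(values)
    mean = sum(values) / n                              # step 1: find the mean
    squared_diffs = [(x - mean) ** 2 for x in values]   # step 2: square each difference
    total = sum(squared_diffs)                          # step 3: sum the squared differences
    return total / n                                    # step 4: divide by the population size

# Hypothetical population of five values.
population = [1, 2, 3, 4, 5]
sigma2 = population_variance(population)   # mean is 3, squared differences sum to 10, 10 / 5 = 2.0
```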
Symmetrical Distribution Shape
1. Has a bell shape 2. All three measures of location (mean, median, and mode) are at the center of the distribution
What are the properties of the median?
1. It is not affected by extreme values 2. It can be computed for ordinal-level data or higher
Disadvantages of mode
1. It isn't always present 2. Some data sets may have more than one mode 3. It does not use all the data
measures of dispersion
Variation or the spread in the data
A reason to study dispersion is to ____
Compare the spread in two or more distributions
What is the disadvantage of using the mean?
It can be affected by extreme values
population standard deviation
the square root of the population variance
sample standard deviation
the square root of the sample variance
Bimodal
two modes
geometric mean
used when calculating investment returns over multiple periods or to measure compound growth rates over time
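As a rough illustration of the investment-return use, the sketch below converts each period's return to a growth factor (1 + return), multiplies the factors, takes the n-th root, and subtracts 1. The function name and the three returns are assumptions made for this example.

```python
import math

def geometric_mean_return(returns):
    """Average per-period return, given period returns as decimals (0.10 means 10%)."""
    if not returns:
        raise ValueError("need at least one period")
    growth = math.prod(1 + r for r in returns)   # total growth over all periods
    return growth ** (1 / len(returns)) - 1

# Hypothetical returns: +10%, +20%, then -10%.
returns = [0.10, 0.20, -0.10]
avg = geometric_mean_return(returns)   # about 0.059, or roughly 5.9% per period
```

Compounding the result over three periods, (1 + avg) ** 3, reproduces the actual total growth of 1.10 * 1.20 * 0.90.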
population variance
σ² = Σ ( Xi - μ )² / N
Properties of the Arithmetic Mean
1. To compute the mean, the data must be measured at the interval or ratio level 2. All the values are included in computing the mean 3. The mean is unique 4. The sum of the deviations of each value from the mean is zero
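Property 4 can be checked on a small made-up data set. With other data, floating-point rounding may leave a tiny non-zero remainder instead of exactly zero.

```python
# Hypothetical observations, chosen so the arithmetic is exact.
values = [3, 8, 4]
mean = sum(values) / len(values)          # 15 / 3 = 5.0
deviations = [x - mean for x in values]   # [-2.0, 3.0, -1.0]
total_deviation = sum(deviations)         # 0.0
```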
measures of location
A statistic that describes a location within a data set. Measures of central tendency describe the center of the distribution.
The sample standard deviation is used as _____
An estimator of the population standard deviation
Most used measure of location?
Arithmetic mean
Five measures of location
Arithmetic mean, median, mode, weighted mean, and geometric mean
The geometric mean will always ____
Be less than or equal to (never more than) the arithmetic mean
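A quick numeric check of this relationship, assuming all values are positive (the geometric mean is not defined otherwise in this sketch). The helper names and data are hypothetical.

```python
import math

def arithmetic_mean(values):
    return sum(values) / len(values)

def geometric_mean(values):
    """n-th root of the product of n positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    return math.prod(values) ** (1 / len(values))

spread_out = [2, 8]     # geometric mean 4.0, arithmetic mean 5.0
all_equal = [5, 5, 5]   # both means are 5, up to floating-point rounding
```

The two means coincide only when every value is the same; otherwise the geometric mean is the smaller of the two.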
We often select a sample from the population to
Estimate a specific characteristic of the population
Empirical Rule
For a symmetrical, bell-shaped frequency distribution, approximately 68% of the observations will lie within plus and minus one standard deviation of the mean; about 95% of the observations will lie within plus and minus two standard deviations of the mean; and practically all (99.7%) will lie within plus and minus three standard deviations of the mean.
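The three intervals in the Empirical Rule can be computed directly from a mean and a standard deviation. The sketch below assumes the distribution is symmetrical and bell-shaped; the function name and the numbers (mean 100, standard deviation 15) are made up for illustration.

```python
def empirical_rule_intervals(mean, std_dev):
    """Map each approximate coverage to (mean - k * sd, mean + k * sd) for k = 1, 2, 3."""
    coverages = {1: "68%", 2: "95%", 3: "99.7%"}
    return {label: (mean - k * std_dev, mean + k * std_dev)
            for k, label in coverages.items()}

# Hypothetical distribution with mean 100 and standard deviation 15.
intervals = empirical_rule_intervals(mean=100, std_dev=15)
# {"68%": (85, 115), "95%": (70, 130), "99.7%": (55, 145)}
```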
Chebyshev's Theorem
For any set of observations (sample or population), the proportion of the values that lie within k standard deviations of the mean is at least 1 - 1/k^2, where k is any value greater than 1.
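The Chebyshev bound is a one-line calculation. A minimal sketch, with a hypothetical function name:

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

within_two = chebyshev_lower_bound(2)     # 1 - 1/4 = 0.75
within_three = chebyshev_lower_bound(3)   # 1 - 1/9, about 0.889
```

Unlike the Empirical Rule, this bound makes no assumption about the shape of the distribution, which is why its guarantees are weaker (at least 75% within two standard deviations, rather than about 95%).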
The empirical rule is also called the ___
Normal rule
Mode has the advantage of
Not being affected by extreme values
How do you calculate the median when there is an even number of observations?
Organize the observations from smallest to largest, then take the mean of the two middle observations.
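A short Python sketch of this procedure, which also handles the odd case by returning the single middle observation. The function name and data are made up for illustration.

```python
def median(values):
    """Middle value of the ordered data; mean of the two middle values when n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one observation")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

even_case = median([7, 1, 5, 3])   # ordered: 1, 3, 5, 7, so (3 + 5) / 2 = 4.0
odd_case = median([7, 1, 5])       # ordered: 1, 5, 7, so the median is 5
```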
formula for sample mean
Sum of all the values in the sample divided by the number of values in the sample
Median
The midpoint of the values after they have been ordered from the minimum to the maximum values
Mode
The value that occurs most frequently in a given data set.
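A minimal sketch of finding the mode by counting, using Python's standard `collections.Counter`. The helper name `modes` and the data are hypothetical. It returns a list because a data set can have more than one mode.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest frequency, in first-seen order."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]

one_mode = modes([1, 2, 2, 3, 4])               # [2]
two_modes = modes([1, 2, 2, 3, 3, 4])           # [2, 3], a bimodal data set
nominal_mode = modes(["red", "blue", "red"])    # ["red"], nominal data works too
```

When every value occurs exactly once, this sketch returns all of the values; in that situation the data set is usually described as having no mode.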