Stats Ch. 4 Variability
4 steps to calculating the standard deviation
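1. Determine the deviation from the mean for each score 2. Square each deviation 3. Add the squared deviations and divide the sum by N (this mean of the squared deviations is the variance) 4. Take the square root of the result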
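A minimal sketch of this calculation for a population of scores. The function name and the example scores are made up for illustration; the squared deviations are summed into SS before dividing by N.

```python
import math

def population_sd(scores):
    """Population standard deviation, computed step by step."""
    n = len(scores)
    mean = sum(scores) / n
    deviations = [x - mean for x in scores]   # deviation of each score from the mean
    squared = [d ** 2 for d in deviations]    # squared deviations
    ss = sum(squared)                         # sum of squares (SS)
    variance = ss / n                         # mean squared deviation
    return math.sqrt(variance)                # standard deviation

# Hypothetical scores with mean 5 and SS = 32
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_sd(scores))  # 2.0
```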
Definitional formula for finding the sum of squares
1. Find each deviation score (X − μ) 2. Square each deviation score, (X − μ)² 3. Add the squared deviations: SS = Σ(X − μ)²
Computational formula for finding the sum of squares
1. Square each X and add the squared values together (ΣX²) 2. Add the X values, square this sum, and divide by N: (ΣX)²/N 3. Subtract the second number from the first: SS = ΣX² − (ΣX)²/N
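The definitional and computational formulas give the same SS. A short sketch comparing them, with hypothetical function names and scores chosen for illustration:

```python
def ss_definitional(scores):
    """SS as the sum of squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

def ss_computational(scores):
    """SS as (sum of X squared) minus (sum of X) squared over N."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x_squared = sum(x ** 2 for x in scores)
    return sum_x_squared - sum_x ** 2 / n

# Hypothetical scores: mean = 2, deviations = -1, -2, 4, -1
scores = [1, 0, 6, 1]
print(ss_definitional(scores))   # 22.0
print(ss_computational(scores))  # 22.0
```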
In general, a good measure of variability serves two purposes:
1. Variability describes the distribution. Usually defined in terms of DISTANCE. 2. Variability measures how well an individual score (or group of scores) represents the entire distribution. It gives you information on how much ERROR to expect if you are using a sample to represent a population.
Remember, sample variability tends to ______estimate population variability unless some correction is made.
under
A measure of ________ provides an objective description of the differences between the scores in a distribution by measuring the degree to which the scores are spread out or are clustered together.
variability
Key terms for this chapter
variability; range; deviation score; sum of squares (SS); population variance (σ²); population standard deviation (σ); sample variance (s²); sample standard deviation (s); degrees of freedom (df); unbiased statistic; biased statistic
Deviation score = ____ - ____
X − μ (the score minus the mean)
Range for discrete variables
Xmax − Xmin
Range when the scores are all whole numbers
Xmax − Xmin + 1
If there is no variability, what is the standard deviation?
zero
What is the sum of the deviation scores going to be?
zero
As a rule of thumb, roughly ____% of the scores in a distribution are located within a distance of one standard deviation from the mean, and almost all of the scores (roughly ____%) are within two standard deviations of the mean.
70%; 95%
Variability definition
A measure of the degree to which the scores in a distribution are clustered together or spread apart.
Which tends to be less variable...a sample or a population?
A sample
Explain when a sample statistic is biased or unbiased
A sample statistic is unbiased if the average value of the statistic is equal to the population parameter. (The average value of the statistic is obtained from all the possible samples for a specific sample size, n.) A sample statistic is biased if the average value of the statistic either underestimates or overestimates the corresponding population parameter.
Unbiased statistic definition
A statistic that, on average, provides an accurate estimate of the corresponding population parameter. The sample mean and sample variance are unbiased statistics.
Why is the range considered to be a crude and unreliable measure of variability?
Because the range does not consider all of the scores in the distribution, it often does not give an accurate description of the variability for the entire distribution.
Adding a Constant to Each Score Does Not Change/Does Change the Standard Deviation
DOES NOT CHANGE (the mean will change with the constant)
What are degrees of freedom?
Degrees of freedom (df = n − 1) measure the number of scores that are free to vary when computing SS for sample data. The value of df also describes how well a t statistic estimates a z-score.
What is the solution to measuring variability with deviation scores that would normally cancel each other out and always equal zero?
Get rid of the positive and negative signs. The standard procedure for accomplishing this is to square each deviation score. The mean of the squared deviations is then computed; this mean squared deviation is called the variance.
Explain the difference between a biased and an unbiased statistic.
If a statistic is biased, it means that the average value of the statistic does not accurately represent the corresponding population parameter. Instead, the average value of the statistic either overestimates or underestimates the parameter. If a statistic is unbiased, it means that the average value of the statistic is an accurate representation of the corresponding population parameter.
Standard deviation and variance are only obtained from which measurement scales?
Interval and ratio scales
M and n
M for the sample mean, n for number of scores in a sample
Why is the sample variance often called estimated population variance, and the sample standard deviation called estimated population standard deviation?
Remember that the formulas for sample variance and standard deviation were constructed so that the sample variability would provide a good estimate of population variability. When you have only a sample to work with, the variance and standard deviation for the sample provide the best possible estimates of the population variability.
In many journals, especially those following APA style, the symbol ___ is used for the sample standard deviation.
SD
Equation for population variance is:
SS/N
Equation for population standard deviation is:
Square root of SS/N
The variance is not exactly what we want. What step do we take next to get the standard deviation?
Take the square root of the variance.
What is population variance?
The average squared distance from the mean; the mean of the squared deviations. This will get rid of positive and negative signs.
Range definition
The distance from the upper real limit of the highest score to the lower real limit of the lowest score; the total distance from the absolute highest point to the lowest point in the distribution.
Standard deviation definition
The square root of the variance. It provides a measure of the standard, or average, distance from the mean.
The fact that a sample tends to be less variable than its population means that sample variability gives a ______ estimate of population variability.
biased
Sum of squares
The sum of the squared deviation scores. The first part in calculating variance.
Why has an alternate formula for SS been created?
When the mean is not a whole number, the deviations all contain decimals or fractions, and the calculations become difficult. In addition, calculations with decimal values introduce the opportunity for rounding error, which can make the result less accurate.
Explain why the formula for sample variance divides SS by n − 1 instead of dividing by n.
Without some correction, sample variability consistently underestimates the population variability. Dividing by a smaller number (n − 1 instead of n) increases the value of the sample variance and makes it an unbiased estimate of the population variance.
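The correction can be checked by enumerating every possible sample from a tiny population. The sketch below assumes a hypothetical population of three scores (0, 3, 9) and samples of n = 2 drawn with replacement; both are chosen only to keep the arithmetic small.

```python
from itertools import product

def ss(scores):
    """Sum of squared deviations from the mean of the given scores."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

population = [0, 3, 9]                            # hypothetical population, mean = 4
pop_variance = ss(population) / len(population)   # SS/N = 42/3

n = 2
samples = list(product(population, repeat=n))     # all 9 samples, with replacement

# Average of the sample variances across all possible samples
avg_dividing_by_n = sum(ss(s) / n for s in samples) / len(samples)
avg_dividing_by_df = sum(ss(s) / (n - 1) for s in samples) / len(samples)

print(pop_variance)        # 14.0
print(avg_dividing_by_n)   # 7.0  (underestimates the population variance)
print(avg_dividing_by_df)  # 14.0 (matches the population variance)
```

Dividing by n gives an average of 7, half the true value, while dividing by n − 1 gives exactly 14 on average.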
Multiplying Each Score by a Constant Causes the Standard Deviation to:
be Multiplied by the Same Constant
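Both effects of a constant (adding leaves the standard deviation unchanged, multiplying scales it) can be checked with Python's standard `statistics.pstdev`. The scores and the constants 10 and 3 are made up for illustration.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical scores, population SD = 2

sd_original = statistics.pstdev(scores)
sd_shifted = statistics.pstdev([x + 10 for x in scores])  # add 10 to every score
sd_scaled = statistics.pstdev([x * 3 for x in scores])    # multiply every score by 3

print(sd_original)  # 2.0
print(sd_shifted)   # 2.0, unchanged
print(sd_scaled)    # 6.0, three times the original
```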
A sample statistic is said to be ______ if, on average, it consistently overestimates or underestimates the corresponding population parameter.
biased
Deviation is the _____ from the ______
distance; mean. A deviation score gives the actual distance of a score from the mean.
Sample variance makes an adjustment in the formula from a population variance by:
dividing SS by n − 1 instead of n. So sample variance equals SS/(n − 1).
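A short sketch of the sample calculation, using a hypothetical sample of seven scores. The hand computation is compared with `statistics.variance` and `statistics.stdev` from the Python standard library, which also divide by n − 1.

```python
import math
import statistics

scores = [1, 6, 4, 3, 8, 7, 6]           # hypothetical sample, n = 7
n = len(scores)
m = sum(scores) / n                       # sample mean M = 5.0
ss = sum((x - m) ** 2 for x in scores)    # SS = 36.0
df = n - 1                                # degrees of freedom = 6

sample_variance = ss / df                 # 36 / 6 = 6.0
sample_sd = math.sqrt(sample_variance)

print(sample_variance)                    # 6.0
print(statistics.variance(scores))        # same value from the standard library
```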
To find the mean of the deviation scores, what do you do?
Add all the deviation scores and divide by n. The result will equal zero.
The average of the deviation scores does not work as a measure of variability because:
it is always zero
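A quick check with made-up scores shows the positive and negative deviations cancelling. With a non-integer mean, floating-point rounding can leave a tiny nonzero remainder instead of an exact 0.

```python
scores = [1, 9, 5, 8, 7]                  # hypothetical scores
mean = sum(scores) / len(scores)          # 6.0
deviations = [x - mean for x in scores]   # [-5.0, 3.0, -1.0, 2.0, 1.0]

sum_of_deviations = sum(deviations)
mean_deviation = sum_of_deviations / len(scores)

print(sum_of_deviations)  # 0.0
print(mean_deviation)     # 0.0
```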
What Greek letter is used for population standard deviation?
lowercase sigma (σ)
Variance formula
mean squared deviation, which is the sum of squared deviations divided by the number of scores (SS/N)
The purpose of variability is to:
measure and describe the degree to which the scores in a distribution are spread out or clustered together.
If the scores in a distribution are all the same, then there is _____ variability.
no
The purpose for measuring variability is to:
obtain an objective measure of how the scores are spread out in a distribution.
Which two measurement scales are not used with standard deviation and variance?
ordinal and nominal scales
The most commonly used and the most important measure of variability
standard deviation
SS stands for:
sum of squared deviations
Range for a continuous variable
the difference between the upper real limit (URL) for the largest score (Xmax) and the lower real limit (LRL) for the smallest score (Xmin).
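The three definitions of the range in these cards can be compared side by side. In this sketch the function names and scores are hypothetical, and the real limits are assumed to lie half a measurement unit above and below each score (unit = 1 for whole numbers).

```python
def range_difference(scores):
    """Simple range: Xmax - Xmin."""
    return max(scores) - min(scores)

def range_whole_numbers(scores):
    """Range for whole-number scores: Xmax - Xmin + 1."""
    return max(scores) - min(scores) + 1

def range_real_limits(scores, unit=1):
    """Range from real limits: URL for Xmax minus LRL for Xmin."""
    half = unit / 2
    return (max(scores) + half) - (min(scores) - half)

scores = [3, 7, 12, 8, 5, 10]       # hypothetical whole-number scores
print(range_difference(scores))     # 9
print(range_whole_numbers(scores))  # 10
print(range_real_limits(scores))    # 10.0 (12.5 - 2.5)
```

For whole-number scores, the real-limits definition and the Xmax − Xmin + 1 definition agree.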
The two parts to a deviation score
the sign (+ or −), which tells whether the score is above or below the mean, and the number, which gives the distance from the mean
In this chapter, we consider three different measures of variability:
the range, standard deviation, and the variance.
There are three basic measures of variability:
the range, the variance, and the standard deviation.
The standard deviation will have to be between:
the smallest distance from the mean and the largest distance from the mean
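This bound makes a useful sanity check on a computed standard deviation. A small sketch with made-up scores:

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]             # hypothetical scores, mean = 5
mean = sum(scores) / len(scores)
distances = [abs(x - mean) for x in scores]   # distance of each score from the mean

sd = statistics.pstdev(scores)                # population standard deviation
smallest, largest = min(distances), max(distances)

print(smallest, sd, largest)                  # 0.0 2.0 4.0
```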