Statistics Chapter 3 and 4 Quiz
A sample of n=8 scores has a mean of M=10. After one score is removed from the sample, the mean for the remaining scores is found to be M=11. What was the score that was removed?
X = 3
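The arithmetic behind this answer can be checked from the two sums (ΣX = n × M). A minimal Python sketch; the variable names are illustrative only:

```python
n, mean_before = 8, 10
mean_after = 11

total_before = n * mean_before        # sum of all 8 scores: 80
total_after = (n - 1) * mean_after    # sum of the 7 remaining scores: 77

# The removed score is whatever the total lost when it was taken out.
removed_score = total_before - total_after
print(removed_score)  # 3
```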
central tendency
a statistical measure to determine a single score that defines the center of a distribution. The goal of central tendency is to find the single score that is most typical or most representative of the entire group.
computation of the median requires scores that can be placed ___________ and are measured on ____________
in rank order (smallest to largest) and are measured on an ordinal, ratio, or interval scale
what scale of measurement is the interquartile range used for?
interval or ratio
how is central tendency used in inferential statistics?
it is possible to compare two (or more) sets of data by simply comparing the average score (central tendency) for one set versus the average score for another set
how is variability used in descriptive statistics?
it measures the degree to which the scores are spread out or clustered together in a distribution
how is variability used in inferential statistics?
it provides a measure of how accurately any individual score or sample represents the population
what are the advantages of using the mean?
it uses all the data points, and the sample mean is an unbiased estimator of the population mean
order of central tendency when looking at a negatively skewed distribution
mean -> median -> mode (starting at the bottom left and going up to the top)
do you use mean, median, or mode to determine the central tendency of interval or ratio scales?
mean, but only if symmetrical
do you use mean, median or mode to determine the central tendency of an ORDINAL scale?
median
do you use mean, median, or mode if the data are skewed or contain undetermined values?
median
do you use mean, median, or mode when trying to summarize the central tendency of an open-ended distribution?
median
A researcher measures eye color for a sample of n=50 people. Which measure of central tendency would be appropriate to summarize the measurements?
mode
do you use mean, median or mode to determine the central tendency of a NOMINAL scale?
mode
order of central tendency when looking at a positively skewed distribution
mode -> median -> mean (starting at the top and going down to the bottom right)
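Both skew orderings can be seen on actual numbers. The sketch below uses Python's statistics module; the two data sets are made up for illustration, and the second is simply a mirror image of the first.

```python
from statistics import mean, median, mode

# Hypothetical positively skewed scores (long tail to the right).
positive_skew = [1, 1, 1, 2, 2, 3, 4, 10]
# Mirror image: hypothetical negatively skewed scores (long tail to the left).
negative_skew = [11 - x for x in positive_skew]

# Positive skew: mode, then median, then mean, from smallest to largest.
pos = (mode(positive_skew), median(positive_skew), mean(positive_skew))
# Negative skew: mean, then median, then mode, from smallest to largest.
neg = (mean(negative_skew), median(negative_skew), mode(negative_skew))

print(pos)
print(neg)
```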
what type of scale does the mean require for computation?
scores that are numerical values measured on an interval or ratio scale
interquartile range
the distance between the first and third quartiles: Q3 - Q1
semi-interquartile range
half the distance between the first and third quartiles: (Q3 - Q1) / 2
One sample with n=4 scores has a mean of M=12, and a second sample with n=6 scores has a mean of M=8. If the two samples are combined, what is the weighted mean for the combined set of scores?
(ΣX1 + ΣX2) / (n1 + n2) = (48 + 48) / (4 + 6) = 96/10 = 9.6
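The same calculation as a small Python sketch. The function name weighted_mean and its (n, M) pair input are assumptions made for illustration; each sample's sum is recovered as n × M before the sums are pooled.

```python
def weighted_mean(samples):
    """Combine samples given as (n, M) pairs into one overall mean."""
    total_sum = sum(n * m for n, m in samples)  # pooled sum of scores
    total_n = sum(n for n, _ in samples)        # pooled number of scores
    return total_sum / total_n

combined = weighted_mean([(4, 12), (6, 8)])
print(combined)  # 9.6
```

Note that the result is not the simple average of the two means (10), because the larger sample carries more weight.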
what two purposes does variability serve?
1. variability describes the distribution (specifically, it tells whether the scores are clustered together or are spread out over a large distance) 2. variability measures how well an individual score (or group of scores) represents the entire distribution
For the population of scores shown in the frequency distribution table, the mean is _______. (X, f): (5, 2), (4, 1), (3, 3), (2, 2), (1, 2)
μ = ΣfX / Σf = 29/10 = 2.90
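For a frequency table, each score has to be counted f times, so the sum is ΣfX = 29 and the number of scores is Σf = 10, which gives a mean of 2.9. A short Python sketch of that calculation; the dictionary layout is just one convenient way to hold the table:

```python
# Frequency table from the question, stored as {score X: frequency f}.
freq_table = {5: 2, 4: 1, 3: 3, 2: 2, 1: 2}

sum_fx = sum(x * f for x, f in freq_table.items())  # 10 + 4 + 9 + 4 + 2 = 29
n_scores = sum(freq_table.values())                 # 2 + 1 + 3 + 2 + 2 = 10

table_mean = sum_fx / n_scores
print(table_mean)  # 2.9
```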
what happens to the mean and standard deviation when you add a constant to all the scores in the distribution?
the mean changes by that same constant, the standard deviation does not change
mean
the mean for a distribution is the sum of the scores divided by the number of scores. M = ΣX / n for a sample (μ = ΣX / N for a population)
what happens to the mean and standard deviation when you multiply all the scores in a distribution by a constant?
the mean is multiplied by that constant and so is the standard deviation
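The rules for adding a constant and for multiplying by a constant can both be checked on a small data set. A minimal sketch using Python's statistics module; the scores and the constants 10 and 3 are arbitrary choices for illustration.

```python
import math
from statistics import mean, pstdev

scores = [1, 3, 5, 7, 9]              # hypothetical scores
shifted = [x + 10 for x in scores]    # add a constant of 10 to every score
scaled = [x * 3 for x in scores]      # multiply every score by a constant of 3

# Adding a constant: the mean moves by 10, the standard deviation is unchanged.
print(mean(scores), mean(shifted))
print(math.isclose(pstdev(scores), pstdev(shifted)))

# Multiplying by a constant: the mean and the standard deviation are both tripled.
print(mean(scores), mean(scaled))
print(math.isclose(3 * pstdev(scores), pstdev(scaled)))
```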
what is the advantage of using the median over the mean?
the median is relatively unaffected by extreme scores when dealing with interval/ratio data
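A quick illustration of that robustness, using made-up scores: replacing one score with an extreme value pulls the mean a long way but leaves the median where it was.

```python
from statistics import mean, median

typical = [10, 11, 12, 13, 14]        # hypothetical scores, no extremes
with_extreme = [10, 11, 12, 13, 100]  # same scores, last one made extreme

print(mean(typical), median(typical))            # 12 12
print(mean(with_extreme), median(with_extreme))  # 29.2 12
```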
median
the point on the measurement scale below which 50% of the scores in the distribution are located
why would we use the interquartile/semi-interquartile range as opposed to the regular range?
the range is completely determined by two extreme values and ignores the other scores in the distribution, so it often does not give an accurate description of variability. the semi/interquartile range ignores extreme scores, and focuses on the middle 50%
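The contrast can be sketched in Python. Quartile conventions differ between textbooks and software, so treat the quartile values here as one possible convention: the sketch relies on statistics.quantiles with its default "exclusive" method, and uses the simple max minus min range. The scores are made up.

```python
from statistics import quantiles

def spread(scores):
    """Return the range, interquartile range, and semi-interquartile range."""
    q1, _, q3 = quantiles(scores, n=4)  # quartiles, default "exclusive" method
    iqr = q3 - q1
    return {"range": max(scores) - min(scores), "iqr": iqr, "semi_iqr": iqr / 2}

without_extreme = spread([1, 2, 3, 4, 5, 6, 7])
with_extreme = spread([1, 2, 3, 4, 5, 6, 70])

# One extreme score changes the range drastically but not the interquartile range.
print(without_extreme)
print(with_extreme)
```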
what three things can variability be measured with?
the range, the interquartile range / semi-interquartile range, and the standard deviation/variance
what are disadvantages of using the mode?
the sample mode does not reliably say anything about the population mode; it is possible to have more than one mode or no mode at all; and it does not lend itself to additional mathematical operations
how do you calculate the range when the scores are measurements of a continuous variable?
upper real limit for the largest score minus the lower real limit for the lowest score
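A minimal sketch of that rule, assuming scores are recorded to the nearest whole unit so that each real limit lies half a unit from the score. The function name and the unit parameter are illustrative assumptions.

```python
def continuous_range(scores, unit=1):
    """Range using real limits: URL of the largest score minus LRL of the smallest."""
    half_unit = unit / 2
    upper_real_limit = max(scores) + half_unit
    lower_real_limit = min(scores) - half_unit
    return upper_real_limit - lower_real_limit

# Hypothetical scores: largest is 12 (URL 12.5), smallest is 3 (LRL 2.5).
print(continuous_range([3, 7, 12, 8]))  # 10.0
```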
bimodal
when a distribution has exactly two modes; with more than two modes it is called multimodal (note: a distribution can only have one mean and one median)
when NOT to use the mean?
when a distribution of interval/ratio scores contains a few extreme scores (or is very skewed), the mean will be pulled towards the extremes
Given a population with a mean of μ = 60, which of the following values for the population standard deviation would cause X=68 to have the most extreme position in the distribution?
σ = 2
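The reasoning is that a fixed deviation of 8 points counts for more when σ is small. The sketch below expresses the deviation in standard deviation units for several values of σ; the original answer choices are not shown, so the candidate list is hypothetical apart from σ = 2.

```python
mu, x = 60, 68
candidate_sigmas = [2, 4, 8, 16]  # hypothetical answer choices

# Distance of X from the mean, measured in standard deviation units.
distance_in_sd = {sigma: (x - mu) / sigma for sigma in candidate_sigmas}

# The most extreme position is the largest distance in SD units.
most_extreme_sigma = max(distance_in_sd, key=lambda s: abs(distance_in_sd[s]))
print(distance_in_sd)
print(most_extreme_sigma)  # 2
```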
what are the three pieces of information you need to fully describe a set of data (ordinal, interval, or ratio)?
1. the overall shape 2. a measure of central tendency 3. a measure of variability
A sample has a mean of M=50 and a standard deviation of s=10. If a new score of X=70 is added to the sample, what will happen to the mean and the standard deviation?
Both the sample mean and standard deviation will increase
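X=70 lies above the mean, so the mean rises, and it lies two standard deviations away, which is farther than the typical distance, so the standard deviation rises too. A small check with a made-up sample that happens to have M=50 and s=10:

```python
from statistics import mean, stdev

sample = [40, 50, 60]            # hypothetical sample with M = 50 and s = 10
extended = sample + [70]         # add the new score X = 70

mean_before, sd_before = mean(sample), stdev(sample)
mean_after, sd_after = mean(extended), stdev(extended)

print(mean_before, sd_before)
print(mean_after, sd_after)      # both values are larger than before
```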
central tendency serves as a ____________ because __________
descriptive statistic because it allows researchers to describe or present a set of data in a very simplified, concise form
what are the advantages and disadvantages of variance and standard deviation?
advantages: they use all the data, and sample variance is an unbiased estimator of population variance; disadvantage: they are very sensitive to outliers and can overstate the typical variability if there are a few extreme scores
what are the advantages and disadvantages of the semi-interquartile range?
advantages: it is less likely to be influenced by extreme scores and gives a more stable measure of variability than the range; disadvantages: it *only* considers the middle 50% and completely disregards the other 50%, so it does not give a complete picture of the variability for the distribution
For a particular sample of n=10 scores, the largest distance (deviation) between a score and the mean is 11 points. The smallest distance between a score and the mean is 4 points. Therefore, the standard deviation will be ______
between 4 and 11
the weighted mean
(ΣX1 + ΣX2) / (n1 + n2)
For a negatively skewed distribution with a mean of M= 20, what is the most probable value for the median?
Greater than 20
In a population of N=6, five of the individuals all have scores that are exactly 1 point above the mean. From this information, what can you determine about the score for the sixth individual?
It is below the mean by 5 points
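This follows from the fact that deviations from the mean always sum to zero. A short sketch; the population mean of 10 is an arbitrary value chosen only to build a concrete example.

```python
# Five deviations of +1 must be balanced by the sixth deviation.
known_deviations = [1, 1, 1, 1, 1]
sixth_deviation = -sum(known_deviations)
print(sixth_deviation)  # -5

# Concrete check with a hypothetical population mean of 10.
mu = 10
population = [mu + d for d in known_deviations] + [mu + sixth_deviation]
print(sum(population) / len(population))  # 10.0
```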
Explain why we use ( n-1 ) in the denominator of the formula for sample variance but not in the denominator of the formula for population variance.
Scores in a sample tend to be less variable than scores in the population, and sample deviations are measured from the sample mean M rather than from the population mean μ. As a result, dividing SS by n would systematically underestimate the population variance. Dividing by n-1 instead corrects this bias and makes sample variance an unbiased estimator of population variance. For a population, μ is known exactly and every score is included, so no correction is needed and SS is divided by N. The correction matters most for small samples; with large samples, dividing by n or by n-1 gives nearly the same result.
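The bias can be demonstrated exactly, without simulation, by listing every possible sample from a tiny population. The sketch below is an illustration under stated assumptions: the population of three scores is made up, and samples of n = 2 are drawn with replacement. Averaged over all samples, SS / (n-1) matches the population variance, while SS / n comes out too small.

```python
from itertools import product
from statistics import pvariance, variance

population = [0, 3, 9]                   # hypothetical population, mu = 4
sigma_squared = pvariance(population)    # SS / N for the population

# Every possible sample of n = 2 scores, drawn with replacement (9 samples).
samples = list(product(population, repeat=2))

# Average of SS / n, with deviations taken from each sample's own mean.
avg_divide_by_n = sum(pvariance(s) for s in samples) / len(samples)
# Average of SS / (n - 1) across the same samples.
avg_divide_by_n_minus_1 = sum(variance(s) for s in samples) / len(samples)

print(sigma_squared)             # population variance
print(avg_divide_by_n)           # smaller than the population variance (biased)
print(avg_divide_by_n_minus_1)   # equals the population variance (unbiased)
```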
what are the advantages and disadvantages of using interquartile/semi-interquartile range?
advantage: not influenced by outliers disadvantage: doesn't reflect variability of entire distribution, only takes into account the scores in the middle
what are advantages of using the mode?
can be used with all scales of measurement, always corresponds to an actual score, and is easily determined