Stats Quiz 2
If a variable has a distribution that is bell-shaped with mean 16 and standard deviation 5, then according to the Empirical Rule, 99.7% of the data will lie between which values?
1 and 31
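The arithmetic behind this answer can be sketched as follows. This is a minimal illustration that assumes bell-shaped data; the helper name `empirical_rule_interval` is hypothetical and not from any library.

```python
def empirical_rule_interval(mean, sd, k):
    """Return (low, high) = mean minus/plus k standard deviations.

    For bell-shaped data, k = 1, 2, 3 cover roughly
    68%, 95%, and 99.7% of the observations.
    """
    return (mean - k * sd, mean + k * sd)

# 99.7% of the data lie within 3 standard deviations of the mean.
low, high = empirical_rule_interval(16, 5, 3)
print(low, high)
```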
True or False: When comparing two populations, the larger the standard deviation, the more dispersion the distribution has, provided that the variable of interest from the two populations has the same unit of measure.
True, because the standard deviation describes how far, on average, each observation is from the typical value. A larger standard deviation means that observations are more distant from the typical value, and therefore, more dispersed.
Is the trimmed mean resistant to changes in the extreme values for the given data?
Yes, because changing the extreme values does not change the trimmed mean.
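A small sketch of why the trimmed mean is resistant. The data are made up, and `trimmed_mean` is a hypothetical helper that drops the k smallest and k largest values before averaging.

```python
from statistics import mean

def trimmed_mean(data, k=1):
    """Mean after dropping the k smallest and k largest values."""
    ordered = sorted(data)
    return mean(ordered[k:len(ordered) - k])

original = [2, 4, 5, 6, 8]    # made-up data
changed = [2, 4, 5, 6, 80]    # same data with the largest value made extreme

# The trimmed mean ignores the changed extreme value; the ordinary mean does not.
print(trimmed_mean(original), trimmed_mean(changed))
print(mean(original), mean(changed))
```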
All the bars in a uniform distribution are
approximately the same height.
In a statistics class, the standard deviation of the heights of all students was 3.9 inches. The standard deviation of the heights of males was 3.3 inches and the standard deviation of females was 3.2 inches. Why is the standard deviation of the entire class more than the standard deviation of males and females considered separately?
Because the distribution of the entire class has more dispersion.
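One way to see this effect is with two groups that have similar spread but different centers. The heights below are invented for illustration and are not the class data from the question.

```python
from statistics import pstdev

# Invented heights in inches; each group has the same spread,
# but the two groups are centered at different values.
males = [68, 70, 72, 74]
females = [62, 64, 66, 68]
everyone = males + females

# Pooling the groups spreads the observations further from the overall mean.
print(pstdev(males), pstdev(females), pstdev(everyone))
```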
Classes
categories by which data are grouped
The mean measures the
center of the distribution, while the standard deviation measures the spread of the distribution.
When continuous data are organized in tables,
data are categorized, or grouped, by intervals of numbers. Each interval represents a class
Misleading graph if
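the vertical scale does not start at zero. For example, if the data are values like 7.4 and the axis starts at 7, the differences look exaggerated; the axis should start at zero.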
class width
difference between consecutive lower class limits
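As a quick sketch of this definition, using made-up lower class limits:

```python
# Made-up lower class limits for four classes.
lower_limits = [50, 60, 70, 80]

# Class width: difference between each pair of consecutive lower class limits.
widths = [b - a for a, b in zip(lower_limits, lower_limits[1:])]
print(widths)
```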
Histogram depicts the higher standard deviation if the
distribution has more dispersion.
To determine the percentage of time a particular sum was observed,
divide the frequency for that sum by the sum of all the frequencies, that is, the total number of times the dice were thrown. Express the result as a percent.
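The calculation can be sketched like this. The list of sums is made up for illustration, and `percent_observed` is a hypothetical helper name.

```python
from collections import Counter

# Made-up sums from 10 throws of two dice (illustrative only).
sums = [7, 6, 7, 8, 2, 7, 11, 6, 9, 7]

freq = Counter(sums)          # frequency of each sum
total = sum(freq.values())    # total number of throws

def percent_observed(s):
    """Relative frequency of sum s, expressed as a percent."""
    return 100 * freq[s] / total

print(percent_observed(7))
```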
ogive
a graph that represents the cumulative frequency or cumulative relative frequency for each class
x̄ > M (the mean is greater than the median) if
histogram is skewed right
When an observation that is much larger than the rest of the data is added to a data set, the value of the mean will
increase.
Upper class limit
largest value within class
The distribution is skewed left if
the left tail is longer than the right tail.
Why is the median resistant, but the mean is not?
The mean is not resistant because when data are skewed, there are extreme values in the tail, which tend to pull the mean in the direction of the tail. The median is resistant because the median of a variable is the value that lies in the middle of the data when arranged in ascending order and does not depend on the extreme values of the data.
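A short sketch of the difference, using made-up data and one added extreme value:

```python
from statistics import mean, median

data = [10, 12, 13, 15, 20]     # made-up values
with_outlier = data + [200]     # add one observation far out in the right tail

# The mean is pulled toward the tail; the median barely moves.
print(mean(data), median(data))
print(mean(with_outlier), median(with_outlier))
```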
A statistic is resistant if it is
not sensitive to extreme values.
mode
peak of the distribution.
An ogive is constructed by
plotting points whose x-coordinates are the upper class limits and whose y-coordinates are the cumulative frequencies or cumulative relative frequencies of the classes. Line segments are then drawn connecting consecutive points. An additional line segment is drawn connecting the first point to the horizontal axis at the location representing the upper limit of the class that would precede the first class (if it existed).
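The points of an ogive can be computed as in the sketch below. The classes (50-59, 60-69, 70-79) and their frequencies are made up; only the point coordinates are computed, and drawing the line segments is left to a plotting tool.

```python
from itertools import accumulate

# Hypothetical classes 50-59, 60-69, 70-79 with made-up frequencies.
upper_limits = [59, 69, 79]
frequencies = [3, 5, 2]
class_width = 10

cumulative = list(accumulate(frequencies))   # running totals of the frequencies
total = cumulative[-1]

# Start on the horizontal axis at the upper limit of the class
# that would precede the first class.
points = [(upper_limits[0] - class_width, 0.0)]
# One point per class: (upper class limit, cumulative relative frequency).
points += [(x, c / total) for x, c in zip(upper_limits, cumulative)]
print(points)
```

The last y-coordinate is 1 because every observation falls at or below the upper limit of the last class.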
When reading an illustrated bar graph, the
reader needs to compare the vertical scales of each bar to see if they accurately depict the data.
For annual household incomes in a country, state whether you would expect a histogram of the data to be bell-shaped, uniform, skewed left or skewed right.
skewed right
Lower class limit
smallest value within class
standard deviation measures the
spread of the data from the mean.
In a stem-and-leaf plot, the
stem of a data value will consist of the digits to the left of the right-most digit, and the leaf will consist of the right-most digit. The stems are written in a vertical column in increasing order, a vertical line is drawn to the right of the stems, and each leaf corresponding to the stems is written to the right of the vertical line in ascending order. Use the legend to read the stem-and-leaf plot.
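A minimal sketch of this construction for non-negative whole numbers. The data are made up, and `stem_and_leaf` is a hypothetical helper.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its leaves (last digit).

    Assumes non-negative integers. Stems and leaves are in ascending order.
    """
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)
    return dict(sorted(plot.items()))

data = [23, 31, 25, 47, 31, 38, 29]   # made-up data
for stem, leaves in stem_and_leaf(data).items():
    # The "|" plays the role of the vertical line between stems and leaves.
    print(f"{stem} | {' '.join(map(str, leaves))}")
```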
The mode of a variable is
the most frequent observation of the variable that occurs in the data set. To compute the mode, tally the number of observations that occur for each data value. The data value that occurs most often is the mode. A set of data can have no mode, one mode, or more than one mode. If no observation occurs more than once, the data have no mode.
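The tallying procedure can be sketched as follows. `modes` is a hypothetical helper written to follow the convention above, where data with no repeated observation have no mode.

```python
from collections import Counter

def modes(data):
    """Return the list of modes; empty if no value occurs more than once."""
    counts = Counter(data)            # tally the observations per data value
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:                      # nothing repeats, so there is no mode
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 2, 3]))          # no mode
print(modes([1, 2, 2, 3]))       # one mode
print(modes([1, 1, 2, 2, 3]))    # more than one mode
```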
Since extreme values will increase the standard deviation greatly,
the standard deviation cannot be a resistant measure of spread.
True or False: When plotting an ogive, the plotted points have x-coordinates that are equal to the upper limits of each class.
true
True or False: There is not one particular frequency distribution that is correct, but there are frequency distributions that are less desirable than others.
True. Any correctly constructed frequency distribution is valid. However, some choices of categories or classes give more information about the shape of the distribution.
For the population standard deviation,
divide the sum of squared deviations by N, not n - 1, and then take the square root.
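A sketch comparing the two divisors on made-up data, checked against Python's `statistics.pstdev` (population) and `statistics.stdev` (sample):

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]                   # made-up data
mean = sum(data) / len(data)
squared_devs = sum((x - mean) ** 2 for x in data)

pop_sd = math.sqrt(squared_devs / len(data))           # divide by N
sample_sd = math.sqrt(squared_devs / (len(data) - 1))  # divide by n - 1

print(pop_sd, statistics.pstdev(data))
print(sample_sd, statistics.stdev(data))
```

Dividing by the smaller number n - 1 makes the sample standard deviation slightly larger than the population version computed from the same values.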
Identify the given statement as either true or false. The standard deviation can be negative.
false
Identify the given statement as either true or false. The standard deviation is a resistant measure of spread.
false
True or False: A data set will always have exactly one mode.
false
True or False: Stem-and-leaf plots are particularly useful for large sets of data.
false
Stem-and-leaf plots
lose their usefulness when data sets are large or when they consist of a large range of values.
If the data set is not skewed, the measure of central tendency that best describes the "center" of the distribution is the
mean
For a distribution that is symmetric, which of the following is true?
mean = median
For a distribution that is skewed right, which of the following is true?
mean > median
if it's symmetric
mean is best measure of central tendency
A histogram of a set of data indicates that the distribution of the data is skewed right. Which measure of central tendency will likely be larger, the mean or the median? Why?
mean will likely be larger because the extreme values in the right tail tend to pull the mean in the direction of the tail.
if distribution is skewed
median is best measure of central tendency
A wider range means
more dispersion
median has
resistance
Example: Letter B represents the median because
roughly half of the values in the distribution are to the left of B and roughly half of the values in the distribution are to the right of B.
The mean is calculated by
summing all of the values and then dividing by the total number of values.
By adding a large value to the sum and only increasing the number of values by one,
the division will result in a larger mean.
If all observations have the same value,
then that value will also be the mean of the data. Therefore, the sum of the squared differences from the mean will be 0, and the standard deviation will be 0.
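A quick check of this reasoning on made-up data where every observation is identical:

```python
from statistics import pstdev

same = [7, 7, 7, 7]                # made-up data, all observations equal
mean = sum(same) / len(same)       # the mean equals the common value
squared_diffs = [(x - mean) ** 2 for x in same]

# Every squared difference is 0, so the standard deviation is 0.
print(sum(squared_diffs), pstdev(same))
```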
What is meant by the phrase degrees of freedom as it pertains to the computation of the sample standard deviation
There are n - 1 degrees of freedom in the computation of s because an unknown parameter, μ, is estimated by x̄. For each parameter estimated, 1 degree of freedom is lost.
Why shouldn't classes overlap when summarizing continuous data in a frequency or relative frequency distribution?
Classes shouldn't overlap so that there is no confusion as to which class an observation belongs to.
What does it mean if a statistic is resistant?
Extreme values (very large or small) relative to the data do not affect its value substantially.
Over the past 10 years, five mutual funds all had the same mean rate of return. The standard deviations of the five mutual funds are: Capital investment 8.3%, Vanity 6.2%, Global advisor 9.2%, International equities 4.6%, Nomad 7.3%. Which mutual fund was least consistent in rate of return?
Global advisor
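Since the means are equal, the least consistent fund is the one with the largest standard deviation. A small sketch using the values from the question:

```python
# Standard deviations of the rate of return, in percent, from the question.
funds = {
    "Capital investment": 8.3,
    "Vanity": 6.2,
    "Global advisor": 9.2,
    "International equities": 4.6,
    "Nomad": 7.3,
}

least_consistent = max(funds, key=funds.get)   # largest standard deviation
most_consistent = min(funds, key=funds.get)    # smallest standard deviation
print(least_consistent, most_consistent)
```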
If the variance of a variable is 121 what is the standard deviation?
11
If the standard deviation of a variable is 9 what is the variance?
81
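Both answers follow from the fact that the standard deviation is the square root of the variance. As a short sketch:

```python
import math

variance = 121
sd = math.sqrt(variance)          # standard deviation = square root of variance

other_sd = 9
other_variance = other_sd ** 2    # variance = standard deviation squared

print(sd, other_variance)
```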
If a variable has a distribution that is bell-shaped with mean 23 and standard deviation 6, then according to the Empirical Rule, what percent of the data will lie between 5 and 41?
99.7%
The cumulative relative frequency for the last class must always be 1. Why?
All the observations are less than or equal to the upper limit of the last class.
What can be said about a set of data with a standard deviation of 0?
All the observations are the same value.
What makes the range less desirable than the standard deviation as a measure of dispersion?
The range does not use all the observations.
True or False: Chebyshev's inequality applies to all distributions regardless of shape, but the empirical rule holds only for distributions that are bell shaped.
True, Chebyshev's inequality is less precise than the empirical rule, but will work for any distribution, while the empirical rule only works for bell-shaped distributions.
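To see what "less precise" means, the sketch below computes the Chebyshev bound: for any distribution, at least 1 - 1/k² of the observations lie within k standard deviations of the mean, for k > 1. The helper name `chebyshev_min_fraction` is hypothetical.

```python
def chebyshev_min_fraction(k):
    """Minimum fraction of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's inequality is informative only for k > 1")
    return 1 - 1 / k**2

# Compare with the Empirical Rule's approximate 95% (k = 2) and 99.7% (k = 3).
for k in (2, 3):
    print(k, chebyshev_min_fraction(k))
```

For k = 2 the bound is at least 75%, compared with about 95% under the Empirical Rule for bell-shaped data.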
The standard deviation is used in conjunction with the
mean to numerically describe distributions that are bell shaped
For a distribution that is skewed left, which of the following is true?
mean < median
In 1994, major league baseball players went on strike. At the time, the average salary was $1,049,589, and the median salary was $337,500. If you were representing the owners, which summary would you use to convince the public that a strike was not needed? If you were a player, which would you use? Why was there such a large discrepancy between the mean and median salaries? Explain.
If you were representing the owners, you would use the average salary to convince the public that a strike was not needed. If you were a player, you would use the median salary to convince the public that a strike was needed. The average and median salaries differ so greatly because the distribution of salaries is skewed right.
the class width on graph
the distance between consecutive numbers on the horizontal axis of the graph. For example, if the axis reads 50, 60, 70, the class width is 10.
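If 36 students are asked how many times they drank in the past week, is there a risk of bias?
Yes, because students may not give truthful answers unless the survey is anonymous. (Bias from untruthful answers is usually called response bias.)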
The sum of the deviations about the mean always equals
zero
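A quick numeric check of this property on made-up data:

```python
import math

data = [3, 7, 8, 10, 12]                 # made-up data
mean = sum(data) / len(data)
deviations = [x - mean for x in data]    # each observation minus the mean

# Positive and negative deviations cancel, so the sum is zero
# (up to floating-point rounding for other data sets).
print(sum(deviations))
```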