Chapter 3: Business Stats
True or false: Chebyshev's Theorem should only be applied to data sets that are normally distributed.
False
True or false: Mean-variance analysis suggests that investments with lower average returns are also associated with higher risks.
False
In a neighborhood there are five houses listed for sale for the following amounts: $250,000; $275,000; $280,000; $295,000; and $515,000. What is the BEST measure of central location for the price of a house in the neighborhood?
Median
In a data set, the 25th percentile, or equivalently, the
first quartile.
Quartiles divide the data into ______ equal parts.
four
The ______ is the multiplicative average of a data set.
geometric mean
The appropriate measure for evaluating investment returns over several years is the...
geometric mean.
When calculating average growth rates, we apply the formula for the...
geometric mean.
The average of the absolute differences between the values of the data set and the mean is the
mean absolute deviation.
The ________ is not considered a good measure of dispersion because it focuses solely on the extreme values and ignores every other observation in the sample or the population.
range
A positive skewness coefficient implies that extreme observations are concentrated in the ________ tail of the distribution.
right
A distribution is _______ if one side of the histogram is a mirror image of the other side.
symmetric
The empirical rule is appropriate when the distribution of a variable is ______ and ________-shaped.
symmetric; bell
The function to find the mean of a subset in R is __________
tapply
The range is the difference between
the largest and smallest values
If a variable has one mode, then we say it is
unimodal
Two widely used measures of dispersion are...
variance and standard deviation
A _________ mean is used to calculate the mean of a frequency distribution.
weighted
When a mean is calculated and some observations are given greater importance than others, we refer to this measure of central location as a ______.
weighted mean
A skewness coefficient of _______ indicates the observations are evenly distributed on both sides of the mean.
zero
The term ________ ________ relates to the way data tend to cluster around some middle or central value.
central location
The mean is usually less than the median when the data are __________ skewed
negative
The arithmetic mean is usually NOT a good measure of central location if a(n) ______ exists.
outlier
Extremely small or large values, also referred to as ________
outliers
The first step to determine the median is to
place the data in numerical order
If the covariance between two random variables x and y is positive, then x and y have a(n) _____.
positive linear relationship
The empirical rule states that approximately _______ % of observations will fall within three standard deviations of the mean.
100
The mean is usually greater than the median when the data are ________ skewed.
positve
Which of the following statements regarding the geometric mean is MOST accurate?
The geometric mean is less sensitive to extreme values than the arithmetic mean.
If the median price for a home is $200,000, then ______ of the homes cost less than $200,000.
50%
The median is also called the _____ percentile..
50th
In which of the following data sets would the arithmetic mean NOT be a good measure of central location?
7, 8, 8, 9, 25
The function to find the mean of a subset in Excel is
=AVERAGEIF
A __________ helps us identify outliers and skewness in the distribution of a variable.
boxplot
Which of the following is true when using the empirical rule for a sample of observations?
Approximately 68% of the observations are in the interval xx ± s.
The ___________ ___________ is the primary measure of central location.
Arithmetic mean
The owner of BevaMart wants to study the relationship between the outside temperature and hot chocolate sales. The owner computed the covariance between outside temperature and hot chocolate sales to be -81.46. Based on the covariance, which option best describes the linear relationship between temperature and hot chocolate?
As the temperature increases, hot chocolate sales decrease.
A ________value for the standard deviation indicates that the data points are close to the mean, while a ___________ value for the standard deviation indicates that the data are spread out.
Low; high
Which of the following statements about the mean absolute deviation (MAD) is MOST accurate?
MAD is denominated in the same units as the original data.
What is the most widely-used measure of central location?
Mean
The notation x represents the ______.
Sample mean
When there are an even number of observations, and the observations are in order from smallest to largest, the median is...
The average of the two middle observations
Which of the following values is included in a box plot?
The first and second quartile
When there are an odd number of observations, and the observations are in order from smallest to largest, the median is...
The middle observation
A box-and-whisker plot is another name for a
boxplot
What is the relationship between the variance and the standard deviation?
The standard deviation is the positive square root of the variance.
The formula for the weighted mean is xx=Σwixi. Using this formula, what is the restrictions on the weights.
They must sum to one.
Which of the following is not a measure of central location?
Variance
Is it possible for a data set to have NO mode?
Yes, if there are no observations that occur more than once.
In order to calculate the arithmetic mean, one
adds all of the data points, then divides by the number of data points.
The pth percentile divides a variable into two parts. What percentage is less than the pth percentile?
approximately p percent
The _____ and the _____ are the most extensively used measures of central location and dispersion, respectively.
mean, standard deviation
Rate of return is evaluated in terms of its reward (________) and risk (_______)
mean; variance
Generally, the __________ is the best measure of central location when outliers are present.
median
The ________ is a measure of central location that divides the observations for a variable in half.
median
The measure of central location where half the values of the data set lie above this measure and half the values of the data set lie below this measure is known as the ______.
median
The ______ is a measure of central location that is the most frequently occurring value in the data set.
mode
If a variable has two or more modes, then we say it is ___________
multimodal