Business Statistics Ch 3
outlier
An extreme value located far away from the mean.
third quartile (Q3)
Quartiles split the ranked data into 4 segments with an equal number of values per segment. Only 25% of the observations are greater than the ________ q____________.
central tendency
The extent to which the values of a numerical variable group around a typical or central value.
z-score (characteristics)
The number of standard deviations a data value is from the mean.
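A minimal sketch of the z-score definition above: subtract the mean from each value and divide by the standard deviation. The function name `z_scores` and the data are illustrative assumptions, and the sample standard deviation (n - 1 in the denominator) is assumed.

```python
from statistics import mean, stdev


def z_scores(values):
    """Return the z-score of each value: (x - mean) / standard deviation."""
    x_bar = mean(values)
    s = stdev(values)  # sample standard deviation (divides by n - 1)
    return [(x - x_bar) / s for x in values]


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values with mean 5
zs = z_scores(data)

for x, z in zip(data, zs):
    print(f"x = {x}: z = {z:+.2f}")
```

A value equal to the mean gets a z-score of 0, values below the mean get negative z-scores, and values above it get positive ones.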
shape
The pattern of the distribution of values from the lowest value to the highest value.
mean, median, mode (mean is most common)
What are the three measures of central tendency?
CV = (S / x-bar) * 100%, where S is the sample standard deviation
What is the equation for coefficient of variation (CV)?
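As a hedged illustration, the coefficient of variation divides the sample standard deviation by the sample mean and expresses the result as a percentage. The function name and both data sets below are made up for the example.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100


# Two hypothetical data sets measured in different units.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

print(f"CV of heights: {cv_heights:.1f}%")
print(f"CV of weights: {cv_weights:.1f}%")
```

Both sets have the same standard deviation, but the weights vary more relative to their mean, which is the kind of comparison across units that the CV makes possible.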
3(n+1)/4
What is the equation to determine the position of the third quartile in the ranked data? Q3=?
median
Which of the three measures of central tendency is not sensitive to extreme values and has its position in the ranked data given by (n+1)/2?
normal distribution
A symmetrical, bell-shaped distribution in which most observed values tend to cluster around the mean, which, due to the distribution's symmetrical shape, is equal to the median. It is the most common continuous distribution used in statistics.
68%
Approximately what percentage of data in a bell-shaped distribution is within 1 standard deviation of the mean, or μ ± 1σ, according to the Empirical Rule?
95%
Approximately what percentage of data in a bell-shaped distribution lies within 2 standard deviations of the mean, or μ ± 2σ, according to the Empirical Rule?
99.7%
Approximately what percentage of data in a bell-shaped distribution lies within 3 standard deviations of the mean, or μ ± 3σ, according to the Empirical Rule?
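The Empirical Rule percentages can be checked against an exact normal distribution. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is an assumption made for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Return the fraction of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")
# Prints roughly 0.6827, 0.9545, and 0.9973.
```

The rule's 68%, 95%, and 99.7% are rounded versions of these values, and they apply only to data that are approximately bell-shaped.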
does not
"Average" does or does not mean "typical"?
(1 - 1/k^2) * 100%
Chebyshev's rule: Regardless of how the data are distributed, at least __________________ of the values will fall within k standard deviations of the mean (for k > 1). (fill in the equation for the Chebyshev rule.)
right-skewed
If the mean is greater than median, the shape is _______-_________.
left-skewed
If the mean is less than the median, the shape would be _____-________.
median
In an ordered array, it is the "middle" number (50% above, 50% below). If the number of values is odd, the median is the middle number. If the number of values is even, the median is the average of the two middle numbers (add the two middle values and divide by 2). ** (n+1)/2 is NOT THE VALUE, only the POSITION OF THE _________ in the ranked data. It is often used because it is not sensitive to extreme values.
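The odd and even cases described above can be sketched in a few lines of Python. The function name `median_of` is hypothetical; the standard library's `statistics.median` gives the same result.

```python
def median_of(values):
    """Return the middle value of the ranked data (average of the two middle values if n is even)."""
    ranked = sorted(values)
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: position (n + 1) / 2 is a whole number, so take that value.
        return ranked[mid]
    # Even n: position (n + 1) / 2 ends in .5, so average the two middle values.
    return (ranked[mid - 1] + ranked[mid]) / 2


print(median_of([3, 1, 2]))            # odd number of values
print(median_of([4, 1, 3, 2]))         # even number of values
print(median_of([1, 2, 3, 4, 1000]))   # the extreme value 1000 does not move the median
```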
symmetrical, zero skewness
Mean is equal to the median. The values below the mean are distributed in exactly the same way as the values above the mean. This means the shape is s___________, having z_____ s__________.
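The three shape cards (mean greater than, less than, or equal to the median) can be summarized as a simple comparison. This is only a rule-of-thumb sketch; the function name `shape_from_mean_and_median` and the data are assumptions, and exact equality is used for the symmetrical case.

```python
from statistics import mean, median


def shape_from_mean_and_median(values):
    """Classify the shape by comparing the mean with the median."""
    m, md = mean(values), median(values)
    if m > md:
        return "right-skewed"
    if m < md:
        return "left-skewed"
    return "symmetrical"


print(shape_from_mean_and_median([1, 2, 3, 4, 100]))     # one large value pulls the mean up
print(shape_from_mean_and_median([1, 97, 98, 99, 100]))  # one small value pulls the mean down
print(shape_from_mean_and_median([1, 2, 3, 4, 5]))       # mean equals median
```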
mode
Measures of central tendency: the ______ is the value that occurs most often, is NOT affected by extreme values, is used for either numerical or categorical data, but more widely used for CATEGORICAL data. A data set may have none or it may have several.
spread, variability, dispersion
Measures of variation give information on the s______ or v_________ or d_______ of the data values. (Range, variance, standard deviation, and coefficient of variation are the four methods to measure variation)
coefficient of variation
Measures relative variation; always in percentage. Shows variation relative to mean; can be used to compare the variability of two or more sets of data measured in different units.
variation
Measures the amount of dispersion, or scattering away from a central value that the values of a numerical variable show.
variance, standard deviation
Of the four measures of variation, which two help show how consistent the data are? v__________, s________ d__________
round the, integer
Quartile Measures: Calculation rules - If the result is not a whole number or a fractional half, then r_______ t____ result to the nearest i__________ to find the ranked position.
average the two corresponding (If Q1 is 2.5, then average the values that are in the second and third position)
Quartile Measures: Calculation rules - When calculating the ranked position, if the result is a fractional half (e.g., 2.5, 7.5, 8.5), then a__________ t____ t____ c_______________ data values.
position to use
Quartile Measures: Calculation rules - When calculating the ranked position, if the result is a whole number, then it is the ranked p________ t__ u____.
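The three calculation rules (whole number, fractional half, anything else) can be sketched as follows, using the positions (n+1)/4, (n+1)/2, and 3(n+1)/4 from the other cards. The function names and data are illustrative assumptions, and other textbooks and software packages use different quartile conventions.

```python
def value_at_position(ranked, pos):
    """Return the value at a 1-based ranked position using the three rules."""
    whole = int(pos)
    if pos == whole:
        # Rule 1: a whole number is the ranked position to use.
        return ranked[whole - 1]
    if pos - whole == 0.5:
        # Rule 2: a fractional half averages the two corresponding values.
        return (ranked[whole - 1] + ranked[whole]) / 2
    # Rule 3: otherwise round the position to the nearest integer.
    return ranked[round(pos) - 1]


def quartiles(values):
    """Return (Q1, Q2, Q3) for a data set with at least two values."""
    ranked = sorted(values)
    n = len(ranked)
    if n < 2:
        raise ValueError("need at least two values")
    q1 = value_at_position(ranked, (n + 1) / 4)
    q2 = value_at_position(ranked, (n + 1) / 2)
    q3 = value_at_position(ranked, 3 * (n + 1) / 4)
    return q1, q2, q3


# n = 9: positions are 2.5, 5, and 7.5.
print(quartiles([11, 12, 13, 16, 16, 17, 18, 21, 22]))
# n = 10: positions are 2.75, 5.5, and 8.25.
print(quartiles([29, 31, 35, 39, 39, 40, 43, 44, 44, 52]))
```

With n = 9, Q1 and Q3 fall on fractional halves and are averaged; with n = 10, the positions 2.75 and 8.25 are rounded to 3 and 8.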
first quartile (Q1)
Quartiles split the ranked data into 4 segments with an equal number of values per segment. The ________ q___________ is the value for which 25% of the observations are smaller and 75% are larger.
second quartile (Q2)
Quartiles split the ranked data into 4 segments with an equal number of values per segment. The _________ q___________ is the same as the median (50% of the observations are smaller and 50% are larger)
spread, variability, dispersion
Range, variance, standard deviation, and coefficient of variation are all measures of variation that give information on the s______ or v_________ or d_________ of the data values.
Chebyshev Rule
Regardless of how the data are distributed, at least (1 - 1/k^2) * 100% of the values will fall within k standard deviations of the mean (for k > 1).
not influenced, resistant
The IQR is a measure of variability that is n____ i__________ by outliers or extreme values. Such measures are considered r_________.
variation
The amount of dispersion or scattering away from a central value that the values of a numerical variable show.
sample variance, standard deviation
The average (approximately) of squared deviations of values from the mean. A necessary measure of variation for computational purposes, but not practical to interpret since it is measured in squared units. The result, s-squared, is the s_________ v____________. Taking the square root of it gives the sample s__________ d____________. Neither result can be negative.
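A short sketch of the computation described above: sum the squared deviations from the mean, divide by n - 1, then take the square root to return to the original units. The function names and data are assumptions for illustration; `statistics.variance` and `statistics.stdev` compute the same quantities.

```python
import math


def sample_variance(values):
    """Return s-squared: the sum of squared deviations from the mean divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_std_dev(values):
    """Return s: the square root of the sample variance, in the original units."""
    return math.sqrt(sample_variance(values))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values with mean 5
s_squared = sample_variance(data)
s = sample_std_dev(data)

print(f"sample variance: {s_squared:.3f}")
print(f"sample standard deviation: {s:.3f}")
```

Here the squared deviations sum to 32, so the sample variance is 32 / 7.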
minimum (smallest x), first quartile, median (Q2), third quartile, maximum (largest x)
The five numbers that help describe the center, spread and shape of data are the m___________, f_______ q__________, m__________, t________ q_________, and the m___________.
range
The simplest measure of variation. It is the difference between the largest and the smallest values (Equation = X(largest) - X(smallest)). It measures the total spread in the set of data, but it is oversimplified: it ignores how the values between the extremes are distributed and is strongly affected by outliers.
mean
The sum of values divided by the number of values. It is the most common measure of central tendency and is affected by extreme values (outliers). It is generally used unless extreme values (outliers) exist.
variance, standard deviation
Two commonly used measures of variation that account for how all the values are distributed are ________ and _________ ____________. They measure the "average" scatter around the mean: how larger values fluctuate above it and how smaller values fluctuate below it.
interquartile range
What does IQR stand for?
(n+1)/2
What is the equation to determine the position of the second quartile in the ranked data? Q2=?
(n+1)/4
What is the equation to determine the position of the first quartile in the ranked data? Q1=?
75%
At least what percentage of values will fall within 2 standard deviations of the mean according to the Chebyshev Rule? (k=2)
88.89% (about 89%)
At least what percentage of values will fall within 3 standard deviations of the mean according to the Chebyshev Rule? (k=3)
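The Chebyshev figures for k = 2 and k = 3 follow directly from (1 - 1/k^2) * 100%. The sketch below evaluates that bound; the function name `chebyshev_lower_bound` is an assumption made for the example.

```python
def chebyshev_lower_bound(k):
    """Return the minimum percentage of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the Chebyshev rule applies only for k > 1")
    return (1 - 1 / k**2) * 100


for k in (2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.2f}%")
```

For k = 2 the bound is 1 - 1/4 = 75%, and for k = 3 it is 1 - 1/9, or about 88.89%. These are lower bounds that hold for any distribution, so they are weaker than the Empirical Rule figures for bell-shaped data.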
interquartile range, midspread
What measures the spread in the middle 50% of the data (Q3-Q1)? i___________ r_________. It is also called the m____________.
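To illustrate why the IQR is called resistant, this sketch compares it with the range before and after one value is replaced by an outlier. It uses `statistics.quantiles` from the Python standard library, which interpolates between ranked values rather than applying the rounding rules on the quartile cards, so results can differ slightly for some data sets. The helper name and data are assumptions.

```python
from statistics import quantiles


def interquartile_range(values):
    """Return Q3 - Q1, the spread of the middle 50% of the data."""
    q1, _q2, q3 = quantiles(values, n=4)  # three cut points: Q1, Q2, Q3
    return q3 - q1


data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
with_outlier = [11, 12, 13, 16, 16, 17, 18, 21, 220]  # largest value replaced by an outlier

for label, values in (("original", data), ("with outlier", with_outlier)):
    value_range = max(values) - min(values)
    print(f"{label}: range = {value_range}, IQR = {interquartile_range(values)}")
```

The range jumps from 11 to 209 when the outlier is introduced, while the IQR stays the same.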