Mathematics in the Modern World
spread/variability/dispersion
Measures of variation give information on the ____/____/____ of the data values.
Coefficient of Variation
Measures relative variation. Always expressed as a percentage (%). Shows variation relative to the mean. Can be used to compare the variability of two or more sets of data measured in different units.
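As a rough illustration of this card, the coefficient of variation can be computed as the standard deviation divided by the mean, times 100%. The sketch below assumes sample data and a nonzero mean; the helper name `coefficient_of_variation` and both data sets are made up for the example.

```python
import statistics

def coefficient_of_variation(values):
    """Return the sample standard deviation as a percentage of the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined when the mean is 0")
    return statistics.stdev(values) / mean * 100

# Illustrative, made-up data measured in different units
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)
print(f"CV of heights: {cv_height:.2f}%")
print(f"CV of weights: {cv_weight:.2f}%")
```

Both sets have the same standard deviation in their own units, yet the weights vary more relative to their mean, which is the kind of comparison the card describes.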
Skewness
Measures the amount of asymmetry in a distribution
Kurtosis
Measures the relative concentration of values in the center of a distribution as compared with the tails
coefficient of correlation
Measures the relative strength of the linear relationship between two numerical variables
Geometric mean rate of return
Measures the status of an investment over time
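One common way to write the geometric mean rate of return is [(1 + R1)(1 + R2)...(1 + Rn)]^(1/n) - 1, where each R is a per-period return. The sketch below assumes returns are given as decimals greater than -1; the function name and the two-year example are hypothetical.

```python
import math

def geometric_mean_rate_of_return(rates):
    """rates are per-period returns as decimals, e.g. 0.10 for +10%."""
    growth = math.prod(1 + r for r in rates)  # total growth factor
    return growth ** (1 / len(rates)) - 1

# Made-up example: +100% in year 1, then -50% in year 2
rates = [1.00, -0.50]
arithmetic_mean = sum(rates) / len(rates)
geometric_rate = geometric_mean_rate_of_return(rates)

print(f"arithmetic mean return: {arithmetic_mean:.2%}")
print(f"geometric mean rate of return: {geometric_rate:.2%}")
```

The investment ends where it started, so the geometric rate is zero even though the arithmetic mean of the two returns is positive.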
right-skewed
Median < Mean
Population standard deviation
Most commonly used measure of variation. Shows variation about the mean. Is the square root of the population variance. Has the same units as the original data.
Sample standard deviation
Most commonly used measure of variation. Shows variation about the mean. Is the square root of the variance. Has the same units as the original data.
T
A data value is considered an extreme outlier if its Z-score is less than -3.0 or greater than +3.0. T/F?
99.7%
Approximately _____ of the data in a bell-shaped distribution lies within three standard deviations of the mean, or µ ± 3σ
95%
Approximately _____ of the data in a bell-shaped distribution lies within two standard deviations of the mean, or µ ± 2σ
68%
Approximately ______ of the data in a bell-shaped distribution lies within one standard deviation of the mean, or µ ± 1σ
Sample variance
Average (approximately) of squared deviations of values from the mean
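The word "approximately" on this card refers to dividing by n - 1 instead of n. A minimal sketch with made-up data, cross-checked against the standard library's `statistics.variance`:

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample

mean = sum(data) / len(data)
squared_deviations = [(x - mean) ** 2 for x in data]

# Sample variance divides by n - 1, so it is only approximately an average
sample_variance = sum(squared_deviations) / (len(data) - 1)
sample_std_dev = math.sqrt(sample_variance)

# Population variance divides by n (a true average of squared deviations)
population_variance = sum(squared_deviations) / len(data)

# Cross-check the hand computation against the standard library
assert math.isclose(sample_variance, statistics.variance(data))
print(sample_variance, sample_std_dev, population_variance)
```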
Arithmetic mean, median, mode, geometric mean
Central tendency covers four measurements, including:
objective
Data analysis is ________. It should report the summary measures that best describe and communicate the important aspects of the data set.
subjective
Data interpretation is ______. It should be done in a fair, neutral, and clear manner.
T
The descriptive statistics discussed describe a sample, not the population. T/F?
Q1 = (n+1)/4
First quartile position
even
If the number of values is ______ , the median is the average of the two middle numbers
odd
If the number of values is _______ , the median is the middle number
zero
If the values are all the same (no variation), all measures of variation (range, variance, standard deviation) will be ______.
median
In an ordered array, the _______ is the "middle" number and is not affected by extreme values
The Interquartile Range
It is Q3 - Q1 and measures the spread in the middle 50% of the data. It is also called the midspread because it covers the middle 50% of the data. It is a measure of variability that is not influenced by outliers or extreme values.
left-skewed
Mean < Median
Symmetric
Mean = Median
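The three shape cards (right-skewed, left-skewed, symmetric) amount to comparing the mean with the median. The sketch below applies that rule of thumb; the function name and data are made up, and the comparison is only a quick indicator rather than a formal skewness statistic.

```python
import statistics

def describe_shape(values):
    """Classify shape by comparing mean and median (rule of thumb only)."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right-skewed"   # Median < Mean
    if mean < median:
        return "left-skewed"    # Mean < Median
    return "symmetric"          # Mean = Median

print(describe_shape([1, 2, 3, 4, 20]))     # one large value pulls the mean up
print(describe_shape([1, 17, 18, 19, 20]))  # one small value pulls the mean down
print(describe_shape([1, 2, 3, 4, 5]))
```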
F, variation
None of the measures of central tendency are ever negative. T/F? If F, what is the correct term?
Parameters
Population is for ______
Chebyshev Rule
Regardless of how the data are distributed, at least (1 - 1/k²) × 100% of the values will fall within k standard deviations of the mean (for k > 1)
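To see what the Chebyshev bound gives for a few values of k, the formula (1 - 1/k^2) × 100% can be evaluated directly. The helper name below is hypothetical.

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the rule applies only for k > 1")
    return (1 - 1 / k**2) * 100

for k in (2, 3, 4):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.2f}%")
```

For k = 2 and k = 3 this gives at least 75% and about 88.89%, which are looser than the 95% and 99.7% figures that hold only for bell-shaped distributions.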
statistics
Sample is for ______
Q2 = (n+1)/2
Second quartile position
Range
Simplest measure of variation. Difference between the largest and the smallest values.
mean
Sum of values divided by the number of values. Affected by extreme values (outliers).
shape
The ______ of distribution describes how data are distributed
Z-score
The _______ is the number of standard deviations a data value is from the mean.
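A Z-score is computed as (value - mean) / standard deviation. The sketch below uses the sample standard deviation and made-up data, and flags values whose Z-score is below -3.0 or above +3.0; the function name is an assumption for illustration.

```python
import statistics

def z_scores(values):
    """Return each value's distance from the mean in sample standard deviations."""
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)  # sample standard deviation
    return [(x - mean) / std_dev for x in values]

# Made-up data: nineteen values near 50 and one unusually large value
data = [48, 49, 50, 50, 50, 51, 51, 52, 49, 50,
        50, 51, 49, 50, 52, 48, 50, 51, 49, 95]

outliers = [x for x, z in zip(data, z_scores(data)) if abs(z) > 3.0]
print(outliers)
```

Only the value 95 is flagged here; every other value is well within one standard deviation of the mean.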
(n+1)/2
The location of the median, when the values are in numerical order (smallest to largest), can be determined with this formula.
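The (n+1)/2 position rule can be turned into a short routine. This is a sketch with a hypothetical function name; positions are 1-based, as on the card, so they are shifted by one for Python's 0-based indexing.

```python
def median_by_position(values):
    """Find the median using the (n + 1) / 2 position in the ordered array."""
    if not values:
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2  # 1-based position of the median
    if n % 2 == 1:
        # Odd count: the position is a whole number, take the middle value
        return ordered[int(position) - 1]
    # Even count: the position ends in .5, average the two middle values
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2

print(median_by_position([7, 1, 5]))     # odd number of values
print(median_by_position([7, 1, 5, 3]))  # even number of values
```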
smaller
The more the data are concentrated, the _______ the range, variance, and standard deviation.
greater
The more the data are spread out, the _____ the range, variance, and standard deviation.
F, larger
The smaller the absolute value of the Z-score, the farther the data value is from the mean. T/F?
Q3 = 3(n+1)/4
Third quartile position
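The quartile position formulas give ranked positions, not the quartile values themselves. The sketch below computes the three positions for made-up data and then reads off the quartiles with Python's `statistics.quantiles`, whose "exclusive" method is also based on n + 1 positions. When a position is fractional, that function interpolates linearly, which may differ from the rounding rule a given textbook uses.

```python
import statistics

data = [3, 5, 7, 8, 12, 13, 14]  # made-up data, already in ascending order
n = len(data)

# 1-based ranked positions from the cards
q1_pos = (n + 1) / 4
q2_pos = (n + 1) / 2
q3_pos = 3 * (n + 1) / 4
print(q1_pos, q2_pos, q3_pos)

# Quartile values; "exclusive" uses (n + 1)-based positions
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
iqr = q3 - q1  # interquartile range, Q3 - Q1
print(q1, q2, q3, iqr)
```

With seven values the positions are whole numbers (2, 4, 6), so the quartiles are simply the 2nd, 4th, and 6th ordered values.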
Geometric mean
Used to measure the rate of change of a variable over time
Mode
Value that occurs most often. Not affected by extreme values. Used for either numerical or categorical (nominal) data. There may be no mode. There may be several modes.
Why the range can be misleading
Ignores the way in which data are distributed. Sensitive to outliers.
variation
amount of dispersion or scattering of values
opposite
cov(X,Y) < 0: X and Y tend to move in ______ directions
independent
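cov(X,Y) = 0: X and Y are ___________ (in the sense of having no linear relationship; zero covariance alone does not guarantee full statistical independence)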
same
cov(X,Y) > 0: X and Y tend to move in the _____ direction
central tendency
extent to which all the data values group around a typical or central value.
mean
generally used, unless extreme values (outliers) exist
covariance
Measures the strength of the linear relationship between two numerical variables (X & Y). Only concerned with the strength of the relationship. No causal effect is implied. Major flaw: it is not possible to determine the relative strength of the relationship from the size of the covariance.
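The limitation noted on this card can be seen by rescaling one variable: the covariance changes with the units, while the coefficient of correlation does not. The sketch below uses the sample formulas (dividing by n - 1), hypothetical function names, and made-up data.

```python
import math

def sample_covariance(xs, ys):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation_coefficient(xs, ys):
    """Covariance divided by the product of the two sample standard deviations."""
    std_x = math.sqrt(sample_covariance(xs, xs))  # cov(X, X) is the variance of X
    std_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (std_x * std_y)

# Made-up data: the same scores expressed in two different units
hours = [1, 2, 3, 4, 5]
score_points = [52, 54, 56, 58, 60]
score_tenths = [s * 10 for s in score_points]

print(sample_covariance(hours, score_points), sample_covariance(hours, score_tenths))
print(correlation_coefficient(hours, score_points), correlation_coefficient(hours, score_tenths))
```

The covariance is ten times larger in the second unit even though the relationship is identical; the correlation is essentially 1 in both cases.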
arithmetic mean / mean
most common measure of central tendency
median
often used, since it is not sensitive to extreme values. For example, ____ home prices may be reported for a region; it is less sensitive to outliers.
shape
pattern of the distribution of values from the lowest value to the highest value.
population mean
sum of the values in the population divided by the population size