Chapter 3 Central Tendency, Variation, and Position


Measures of Variation

Measures of dispersion, variability, or spread: numbers that describe how spread out or tightly packed the data values are. Examples: range, variance, standard deviation.

Range

Measure of variation: the difference between the largest and the smallest values in the data set (maximum - minimum). Equals zero when all values are the same.

Standard deviation

Measure of variation that quantifies the amount of dispersion of the data values from their average. Equals zero when all values are the same; it is the square root of the variance.
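
As a rough illustration of the range, variance, and standard deviation cards, the sketch below uses Python's standard `statistics` module. The data values are made up, and the population formulas (`pvariance`, `pstdev`) are an assumption; a course may use the sample versions instead.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data values

data_range = max(data) - min(data)          # maximum - minimum
pop_variance = statistics.pvariance(data)   # population variance (assumed convention)
pop_sd = statistics.pstdev(data)            # square root of the population variance

# When every value is the same, all three measures are zero.
constant = [3, 3, 3, 3]
constant_range = max(constant) - min(constant)
constant_sd = statistics.pstdev(constant)

print(data_range, pop_variance, pop_sd)
print(constant_range, constant_sd)
```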

Mean vs Median

The median is less sensitive to outliers than the mean
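
A small sketch of this effect, using invented numbers: adding one extreme value moves the mean a lot and the median only slightly.

```python
import statistics

values = [10, 12, 13, 15, 16]        # hypothetical data values
with_outlier = values + [100]        # same data plus one extreme value

mean_shift = statistics.mean(with_outlier) - statistics.mean(values)
median_shift = statistics.median(with_outlier) - statistics.median(values)

print("mean moved by", mean_shift)
print("median moved by", median_shift)
```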

Properties of the Coefficient of Variation

-applies only to ratio-scale data -usually expressed as a % -can be less than, equal to, or more than 100% -the higher the CV, the more relative variability in the data -unit-less or dimensionless

Normal Distribution

50% of data values are less than the mean; 50% of data values are greater than the mean. Mean, median, and mode are located at the peak.

Empirical Rule

About 68% of values lie within 1 standard deviation of the mean; about 95% within 2 standard deviations; about 99.7% within 3 standard deviations. ***Applies to every normal distribution, whatever its mean and standard deviation***
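
The percentages can be checked numerically. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the function name `normal_fraction_within` and the example mean and standard deviation (100 and 15) are chosen only for illustration.

```python
from statistics import NormalDist

def normal_fraction_within(k, mu=0.0, sigma=1.0):
    """Fraction of a normal distribution within k standard deviations of its mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    # The result does not depend on which mean and SD are used.
    print(k, round(normal_fraction_within(k), 4),
          round(normal_fraction_within(k, mu=100, sigma=15), 4))
```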

Shape of a Normal Distribution

Bell-shaped. Symmetric about the mean (has an axis of symmetry). Mean, median, and mode coincide. Defined by the mean and standard deviation *Mean: location or position of the bell *SD: width or broadness of the bell

Outlier

Data value that is much larger or smaller than most values in the data set; the mean is sensitive to outliers

Four Properties of Z-scores

Dimensionless (plain numbers, no units). The mean of the z-scores of a data set is always zero. A positive z-score is above the mean. A negative z-score is below the mean.
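
These properties can be seen on a small made-up data set. The sketch assumes the population standard deviation (`statistics.pstdev`) is used to standardize.

```python
import statistics

data = [55, 60, 70, 80, 85]          # hypothetical scores; their mean is 70
mean = statistics.mean(data)
sd = statistics.pstdev(data)         # population SD (assumed convention)

# Standardize every value: subtract the mean, divide by the SD.
z_scores = [(x - mean) / sd for x in data]

print(z_scores)
print("mean of z-scores:", statistics.mean(z_scores))
```

Values below 70 get negative z-scores, values above 70 get positive ones, and the value equal to the mean gets a z-score of zero.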

Ungrouped Frequency Distribution

Frequency distribution in which the values of the variable are displayed individually

Grouped Frequency Distribution

Frequency distribution in which the values of the variable are grouped into classes
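
To contrast the ungrouped and grouped forms, here is a minimal sketch with `collections.Counter`. The data and the class width of 5 are arbitrary choices for illustration.

```python
from collections import Counter

data = [1, 2, 2, 3, 3, 3, 7, 8, 12]   # hypothetical data values

# Ungrouped: each distinct value gets its own frequency.
ungrouped = Counter(data)

# Grouped: values are collected into classes of equal width (0-4, 5-9, 10-14, ...).
class_width = 5
grouped = Counter()
for value in data:
    lower = (value // class_width) * class_width
    grouped[(lower, lower + class_width - 1)] += 1

print(dict(ungrouped))
print(dict(grouped))
```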

Boxplot (a.k.a. Box and Whisker Diagram)

Graphical representation of the five-number summary; outliers are represented by asterisks or dots beyond the whiskers

Interquartile Range

IQR = Q3 - Q1. Measure of variation: the distance from Q1 to Q3.

Chebyshev's Theorem

In any data set, the fraction of data values that lie within k standard deviations of the mean is at least 1 - 1/k^2, for k > 1. Applies to all distributions; gives a lower bound ("at least").
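
A minimal sketch of the bound, compared against the observed fraction in a deliberately skewed, made-up data set. The helper names are hypothetical, and the population standard deviation is assumed.

```python
import statistics

def chebyshev_lower_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2

def observed_fraction_within(data, k):
    """Fraction of the data actually lying within k population SDs of the mean."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)
    inside = [x for x in data if abs(x - mean) <= k * sd]
    return len(inside) / len(data)

skewed = [1, 1, 1, 2, 2, 3, 4, 5, 8, 20]   # hypothetical, strongly skewed data

print("bound for k=2:", chebyshev_lower_bound(2))
print("observed for k=2:", observed_fraction_within(skewed, 2))
```

The observed fraction can be well above the bound; the theorem only promises that it is never below it.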

Coefficient of Variation

Measure of variation that quantifies the variability in a data set relative to the mean; also called the relative standard deviation. CV = standard deviation / mean (usually given as a percentage); unit-less.
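
The unit-less property can be illustrated by computing the CV of the same heights in centimetres and in metres. This is a sketch with invented data; using the sample standard deviation (`statistics.stdev`) is an assumption.

```python
import statistics

def coefficient_of_variation(data):
    """CV as a percentage: standard deviation divided by the mean, times 100."""
    mean = statistics.mean(data)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(data) / mean * 100   # sample SD (assumed convention)

heights_cm = [150, 160, 170, 180, 190]           # hypothetical heights
heights_m = [h / 100 for h in heights_cm]        # same heights, different unit

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)
print(round(cv_cm, 2), round(cv_m, 2))           # the unit does not change the CV
```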

Central Tendency

Mean (average), median, mode

Median

Measure of central tendency defined as the value that separates the upper and lower halves of a data set -the middle value -the numbers must be put in order first -not always a value in the data set

Variance

Measure of variation that describes how far a set of numbers is spread out from their mean. Equals zero when all values are the same; it is the standard deviation squared.

Percentiles

Numbers that divide a numerically ordered data set into one hundred equal groups, each one containing a hundredth of the data values -P1, P2, P3...P99

Quartiles

Numbers that divide a numerically ordered data set into 4 equal groups, each one containing a quarter of the data values -1st or lower quartile -2nd or median -3rd or upper quartile

Fences

Numbers used to determine whether given values are outliers.
LF = Q1 - 1.5 × IQR
UF = Q3 + 1.5 × IQR
If LF < x < UF, then x is not an outlier.
If x ≤ LF or x ≥ UF, then x is an outlier.
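
The fence rule can be sketched as a pair of small functions. The function names and the quartile values (Q1 = 20, Q3 = 30) are hypothetical; the comparison follows this card's convention, in which a value exactly on a fence counts as an outlier.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) computed from the quartiles."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def is_outlier(x, q1, q3):
    """True if x is at or beyond either fence."""
    lower_fence, upper_fence = fences(q1, q3)
    return x <= lower_fence or x >= upper_fence

# With Q1 = 20 and Q3 = 30: IQR = 10, LF = 5, UF = 45.
print(fences(20, 30))
print(is_outlier(44, 20, 30), is_outlier(45, 20, 30), is_outlier(2, 20, 30))
```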

Extreme Fences

Numbers used to determine whether outliers are mild or extreme.
ELF = Q1 - 3 × IQR
EUF = Q3 + 3 × IQR
If ELF < x ≤ LF or UF ≤ x < EUF, then x is a mild outlier.
If x ≤ ELF or x ≥ EUF, then x is an extreme outlier.
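
A minimal sketch of the full classification, combining the ordinary fences (1.5 × IQR) with the extreme fences (3 × IQR). The function name `classify` and the quartile values are hypothetical.

```python
def classify(x, q1, q3):
    """Label x as 'extreme outlier', 'mild outlier', or 'not an outlier'."""
    iqr = q3 - q1
    lf, uf = q1 - 1.5 * iqr, q3 + 1.5 * iqr     # ordinary fences
    elf, euf = q1 - 3 * iqr, q3 + 3 * iqr       # extreme fences
    if x <= elf or x >= euf:
        return "extreme outlier"
    if x <= lf or x >= uf:
        return "mild outlier"
    return "not an outlier"

# With Q1 = 20 and Q3 = 30: LF = 5, UF = 45, ELF = -10, EUF = 60.
for value in (25, 50, 60, -10):
    print(value, classify(value, 20, 30))
```

The extreme check comes first because every extreme outlier is also beyond the ordinary fences.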

Relations between Quartiles and Percentiles

Q1 = P25; Q2 = P50 = median; Q3 = P75
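
These relations can be checked with `statistics.quantiles`, which returns the cut points for any number of equal groups. The data are invented, and the function's default interpolation method is an assumption: textbooks use several slightly different quartile rules, so hand-computed values may differ.

```python
import math
import statistics

data = [3, 7, 8, 5, 12, 14, 21, 13, 18, 15, 9]     # hypothetical data values

q1, q2, q3 = statistics.quantiles(data, n=4)        # 3 cut points: quartiles
percentiles = statistics.quantiles(data, n=100)     # 99 cut points: P1 ... P99
p25, p50, p75 = percentiles[24], percentiles[49], percentiles[74]

print(q1, q2, q3)
print(p25, p50, p75)
print(math.isclose(q2, statistics.median(data)))
```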

Variation

Range, variance, standard deviation

Rounding Rule for Mean

Round the answer to one more decimal place than the original data

The Five Number Summary

Set of 5 values that provides information about the structure of a data set -minimum -Q1 -Q2 (median) -Q3 -maximum
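
As a sketch, the five values can be collected with one helper. The function name and the data are hypothetical, and the quartiles come from `statistics.quantiles` with its default method, which may not match the rule used in a given textbook.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, q2, q3 = statistics.quantiles(data, n=4)
    return min(data), q1, q2, q3, max(data)

data = [3, 7, 8, 5, 12, 14, 21, 13, 18, 15, 9]     # hypothetical data values
summary = five_number_summary(data)
print(summary)
```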

The wider the bell curve, the bigger...

Standard deviation

Position

Standard scores, quartiles, percentiles

Mean (average)

Sum of all the numbers in a data set divided by the number of values

Table of Weighted Values

Table in which each value of the variable has an assigned weight

Properties of Normal Distribution

Total area under the curve of a normal distribution is equal to 1 (100%). The curve approaches but NEVER touches the x-axis. It extends indefinitely in both directions away from the mean.

Mode

Value or values that appear most often in a data set -not affected by outliers -applies to numerical and categorical data -a data set can have more than one mode -or no mode at all
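
The cases listed on this card (one mode, several modes, no mode, categorical data) can be sketched with `collections.Counter`. The function name `modes` is hypothetical, and treating "every value occurs exactly once" as "no mode" is an assumed convention.

```python
from collections import Counter

def modes(data):
    """Return the list of most frequent values, or [] when no value repeats."""
    counts = Counter(data)
    highest = max(counts.values())
    # Assumed convention: if nothing repeats, the data set has no mode.
    if highest == 1 and len(counts) > 1:
        return []
    return [value for value, count in counts.items() if count == highest]

print(modes([1, 2, 2, 3, 3]))           # two modes
print(modes(["red", "blue", "red"]))    # works for categorical data
print(modes([1, 2, 3]))                 # no mode
```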

Standard Scores (or Z-scores)

z = (data value - mean) / SD. Indicates how many standard deviations away from the mean the value is.
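
The formula translates directly into a short function. The name `z_score` and the example numbers (mean 70, SD 10) are invented for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (x - mean) / sd

print(z_score(85, mean=70, sd=10))   # 1.5 standard deviations above the mean
print(z_score(55, mean=70, sd=10))   # 1.5 standard deviations below the mean
```

Note the parentheses: the mean is subtracted first, and only then is the difference divided by the SD.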

Population z-score

z = (x - μ) / σ (Greek letters: μ is the population mean, σ is the population standard deviation)

Sample z-score

z = (x - x̄) / s (Latin letters: x̄ is the sample mean, s is the sample standard deviation)

Measures of central tendency

Numbers that describe the "center" or "middle point" of a data set (mean, median, mode)

