Chapter 3 - Numerical Descriptive Measures

sample standard deviation

- MOST COMMONLY used measure of variation - shows variation about the MEAN (the average "scatter" around the mean) - is the square root of the sample variance - has the same units as the original data

population variance

- average of squared deviations of values from the mean: sum of (x - population mean)^2 / population size (N)

coefficient of variation

- measures relative variation - always expressed as a percentage - shows the standard deviation relative to the mean - can be used to compare the variability of 2 or more sets of data

standard deviation

- most commonly used measure of variation - shows variation about the mean - is the SQUARE ROOT of the POPULATION VARIANCE

(n+1)/4

1st quartile location

(n+1)/2

2nd quartile location

3(n+1)/4

3rd quartile location
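
A minimal sketch of the three location formulas, using a small made-up data set. The helper name `quartile_locations` and the data values are assumptions for illustration; n = 7 is chosen so that every position is a whole number and no rounding rule is needed.

```python
def quartile_locations(n):
    """Return the ranked positions of Q1, Q2 (median), and Q3 for n values."""
    return (n + 1) / 4, (n + 1) / 2, 3 * (n + 1) / 4

# Made-up values, ranked from smallest to largest before locating quartiles.
data = sorted([9, 1, 6, 3, 12, 4, 7])
q1_pos, q2_pos, q3_pos = quartile_locations(len(data))

# Positions are 1-based ranks, so subtract 1 to index the Python list.
q1 = data[int(q1_pos) - 1]
q2 = data[int(q2_pos) - 1]
q3 = data[int(q3_pos) - 1]

print(q1_pos, q2_pos, q3_pos)  # ranked positions
print(q1, q2, q3)              # values found at those positions
```

When a position is not a whole number, a rounding or interpolation rule is needed; this sketch does not cover that case.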

smallest, 1st quartile, median, 3rd quartile, largest

5 number summary includes:
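
One possible way to assemble the five numbers, sketched with Python's standard `statistics` module. The data values are made up, and relying on the library's default "exclusive" quantile method (which interpolates around the (n + 1) * p ranked position) is an assumption; a course's hand-calculation rules may round differently.

```python
import statistics

data = [1, 3, 4, 6, 7, 9, 12]  # made-up values for illustration

# quantiles(..., n=4) returns the three cut points Q1, median, Q3.
q1, median, q3 = statistics.quantiles(data, n=4)

five_number_summary = (min(data), q1, median, q3, max(data))
print(five_number_summary)
```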

1 standard deviation

approximately 68% of data fall within this distance of the mean (bell-shaped distribution)

2 standard deviations

approximately 95% of data fall within this distance of the mean (bell-shaped distribution)

3 standard deviations

approximately 99.7% of data fall within this distance of the mean (bell-shaped distribution)

kurtosis

describes the peakedness of the distribution curve, i.e., how sharply the curve rises approaching the center of the distribution

empirical rule

approximates the variation of data in a bell shaped distribution
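
A short sketch of how the empirical rule turns a mean and standard deviation into intervals. The mean of 100 and standard deviation of 15 are hypothetical numbers chosen for illustration, and the rule only applies approximately, for bell-shaped data.

```python
mean, std = 100.0, 15.0  # hypothetical bell-shaped data

# Approximate percentage of values within k standard deviations of the mean.
coverage = {1: 68.0, 2: 95.0, 3: 99.7}

# Interval for each k is (mean - k * std, mean + k * std).
intervals = {k: (mean - k * std, mean + k * std) for k in coverage}

for k, (low, high) in intervals.items():
    print(f"about {coverage[k]}% of values fall between {low} and {high}")
```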

sample variance

average of squared deviations of values from the mean, using n - 1 in the denominator: sum of (value - mean)^2 / (n - 1)
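
To make the difference between the two denominators concrete, here is a minimal sketch that computes the population variance, the sample variance, and the sample standard deviation by hand. The data values and variable names are made up for illustration.

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values for illustration
n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean.
sum_sq_dev = sum((x - mean) ** 2 for x in data)

population_variance = sum_sq_dev / n          # divide by N for a whole population
sample_variance = sum_sq_dev / (n - 1)        # divide by n - 1 for a sample
sample_std_dev = math.sqrt(sample_variance)   # back in the units of the data

print(population_variance, sample_variance, sample_std_dev)
```

The sample variance is slightly larger than the population variance for the same values because it divides by n - 1 instead of N.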

(s / mean) * 100%

coefficient of variation equation (s = standard deviation)
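
As an illustration of why a percentage is useful, the sketch below compares two made-up data sets measured in different units. The function name `coefficient_of_variation` and both data sets are assumptions for the example.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100

heights_cm = [160, 170, 180]  # made-up values
weights_kg = [50, 70, 90]     # made-up values

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

print(round(cv_heights, 2), round(cv_weights, 2))
```

Because the units cancel, the two percentages can be compared directly: here the weights vary more relative to their mean than the heights do.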

sum of (xi - xmean)(yi - ymean) / (n - 1)

covariance equation
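
A minimal sketch of the sample covariance calculation, written out term by term. The function name `sample_covariance` and the data are made up for illustration.

```python
def sample_covariance(xs, ys):
    """Sum of (xi - xmean)(yi - ymean), divided by n - 1."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (n - 1)

x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 5]    # tends to rise as x rises
y_down = [5, 4, 5, 4, 2]  # tends to fall as x rises

cov_up = sample_covariance(x, y_up)
cov_down = sample_covariance(x, y_down)
print(cov_up, cov_down)
```

The sign shows the direction of the relationship; the size depends on the units of the data, so it does not show relative strength.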

3

a data value is considered an extreme value if its z-score is less than -____ or greater than +____

measures of variation

give information on the SPREAD or VARIABILITY or dispersion of the data values

zero

if the values are all the same (NO variation), range, variance, and standard deviation will be ______

Q3-Q1

interquartile range

z score

is the number of standard deviations a data value is from the MEAN - the larger the absolute value, the farther the data value is from the mean

left-skewed

mean < median

right skewed

mean > median
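
The mean-versus-median comparison can be sketched as a small check. The function name `skew_direction` and the data sets are made up, and the comparison is a rule of thumb rather than a formal skewness statistic.

```python
import statistics

def skew_direction(values):
    """Compare mean and median to suggest the direction of skew."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean < median:
        return "left-skewed"
    if mean > median:
        return "right-skewed"
    return "no skew indicated"

print(skew_direction([1, 2, 3, 4, 20]))      # a large value pulls the mean up
print(skew_direction([1, 17, 18, 19, 20]))   # a small value pulls the mean down
print(skew_direction([1, 2, 3, 4, 5]))       # mean equals median
```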

median

measure of central tendency - in an ordered array, the median is the middle value - LESS SENSITIVE TO EXTREME VALUES

mean

measure of central tendency - most common, known as the average - sum of values / number of values - AFFECTED BY EXTREME VALUES (OUTLIERS)

mode

measure of central tendency - value that occurs the most often - NOT AFFECTED BY EXTREME VALUES - used for either numerical or categorical data - there can be several modes
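
To see how differently the three measures react to an extreme value, the sketch below adds one outlier to a small made-up data set. The data and variable names are assumptions for illustration.

```python
import statistics

data = [2, 3, 3, 4, 5]        # made-up values
with_outlier = data + [100]   # same values plus one extreme value

summary = {
    "mean": (statistics.mean(data), statistics.mean(with_outlier)),
    "median": (statistics.median(data), statistics.median(with_outlier)),
    "mode": (statistics.mode(data), statistics.mode(with_outlier)),
}
for name, (before, after) in summary.items():
    print(f"{name}: {before} -> {after}")

# multimode returns every most-frequent value when there is a tie.
tied_modes = statistics.multimode([1, 1, 2, 2, 3])
print(tied_modes)
```

The mean jumps sharply, the median moves only slightly, and the mode does not change.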

range

measure of variation - difference b/t largest and smallest value - SIMPLEST measure of variation - can be MISLEADING B/C DOES NOT ACCOUNT FOR HOW DATA IS DISTRIBUTED - sensitive to outliers

negative

measures of variation will never be _____

skewness

measures the extent to which data values are NOT symmetrical

coefficient of correlation

measures the relative strength of the linear relationship b/t 2 numerical variables

interquartile range

measures the spread in the middle 50% of the data - not influenced by outliers or extreme values (resistant measure)
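
A small sketch contrasting the interquartile range with the range when one value becomes extreme. The helper names `iqr` and `value_range` and the data are made up; quartiles come from the standard library's default "exclusive" method, which is an assumption about how the quartiles are located.

```python
import statistics

def iqr(values):
    """Interquartile range: Q3 - Q1."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)

data = [1, 3, 4, 6, 7, 9, 12]           # made-up values
with_outlier = [1, 3, 4, 6, 7, 9, 120]  # largest value replaced by an extreme one

print(value_range(data), value_range(with_outlier))  # range changes a lot
print(iqr(data), iqr(with_outlier))                  # IQR stays the same
```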

covariance

measures the strength of the linear relationship b/t 2 numerical variables (X & Y) - BIGGEST FLAW: the relative strength of the relationship cannot be determined from the size of the covariance - if covariance > 0, X & Y tend to move in the same direction - if covariance < 0, X & Y tend to move in opposite directions - if covariance = 0, X & Y have no linear relationship (this alone does not prove they are independent)

(n+1)/2

median position equation -only works when values are in NUMERICAL order

population mean

numerical descriptive measure for a population - the sum of the values in the population divided by the population size N

Chebyshev rule

regardless of how the data are distributed, at least (1 - 1/k^2) * 100% of the values fall within k standard deviations of the mean (for any k > 1)
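
The bound can be evaluated directly for a few values of k. This is a minimal sketch; the function name `chebyshev_lower_bound` is made up for illustration.

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return (1 - 1 / k**2) * 100

for k in (2, 3, 4):
    print(k, round(chebyshev_lower_bound(k), 2))
```

These are guaranteed minimums for any distribution, so they are lower than the empirical-rule percentages, which apply only to bell-shaped data.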

r = cov(x,y) / (sx * sy)

sample coefficient of correlation equation - sx = sqrt of [sum of (xi - xmean)^2 / (n - 1)] - sy = sqrt of [sum of (yi - ymean)^2 / (n - 1)]
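
A minimal sketch that follows the equation step by step: covariance on top, the product of the two sample standard deviations underneath. The function name `sample_correlation` and the data are made up for illustration.

```python
import math

def sample_correlation(xs, ys):
    """r = cov(x, y) / (sx * sy), using n - 1 in every denominator."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - x_mean) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_mean) ** 2 for y in ys) / (n - 1))
    return cov / (s_x * s_y)

x = [1, 2, 3, 4, 5]
r_perfect_up = sample_correlation(x, [2, 4, 6, 8, 10])    # points on a rising line
r_perfect_down = sample_correlation(x, [10, 8, 6, 4, 2])  # points on a falling line
r_moderate = sample_correlation(x, [2, 4, 5, 4, 5])       # rising, but scattered

print(round(r_perfect_up, 4), round(r_perfect_down, 4), round(r_moderate, 4))
```

Unlike the covariance, r has no units and always lies between -1 and +1, so its size can be read as relative strength.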

square root of variance

sample standard deviation

quartiles

split ranked data into 4 segments w/ an equal number of values per segment

variation

the amount of dispersion or scattering away from a central value that the values of a numerical variable show

central tendency

the extent to which the values of a numerical variable group around a typical / central value

shape

the pattern of the distribution of values from the lowest value to the highest value

spread out

the range, variance, and standard deviation are greater when the data are more ______

concentrated

the range, variance, and standard deviation are smaller when the data are more ______

closer to 1

coefficient of correlation: the stronger the positive linear relationship

closer to -1

coefficient of correlation: the stronger the negative linear relationship

zero

the closer the coefficient of correlation is to ______, the weaker the linear relationship (the points do not follow a straight line on a scatter plot)

(x - mean) / standard deviation

z-score equation
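
A short sketch that applies the z-score equation to every value and flags those beyond 3 standard deviations from the mean. The function names, the default cutoff argument, and the data are made up for illustration; the sample standard deviation is used here, which is an assumption.

```python
import statistics

def z_scores(values):
    """Return (x - mean) / standard deviation for each value."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation
    return [(x - mean) / std for x in values]

def extreme_values(values, cutoff=3.0):
    """Values whose z-score is below -cutoff or above +cutoff."""
    return [x for x, z in zip(values, z_scores(values)) if abs(z) > cutoff]

# Made-up data: nineteen values near 10 and one far away.
data = [9, 10, 11] * 6 + [10, 100]

print(extreme_values(data))
```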

