Descriptive statistics

Population variance

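- Average of squared deviations of values from the population mean - Together with the standard deviation, the most commonly used measure of variation - Shows variation about the mean - Units are the square of the original data's units (take the square root, the standard deviation, to return to the original units) - never negative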
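
As a minimal sketch of the definition above, the calculation can be written out by hand and checked against Python's `statistics.pvariance`. The function name `population_variance` and the data values are made up for illustration.

```python
import statistics

def population_variance(values):
    """Average of squared deviations from the mean (divides by N)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative values; their mean is 5
var = population_variance(data)    # squared deviations sum to 32, so 32 / 8
matches_library = var == statistics.pvariance(data)
```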

mean

- Most common measure because most data is interval or ratio - most sensitive measure; affected by outliers - if there are extreme scores the mean can mislead, so use the median instead
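
A short sketch of that sensitivity, using invented scores: adding one extreme value moves the mean a long way while the median barely changes.

```python
import statistics

scores = [10, 12, 13, 14, 15]      # illustrative scores, no extreme values
with_outlier = scores + [100]      # same scores plus one extreme score

mean_before = statistics.mean(scores)           # 64 / 5
mean_after = statistics.mean(with_outlier)      # 164 / 6
median_before = statistics.median(scores)       # middle value
median_after = statistics.median(with_outlier)  # average of 13 and 14
```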

Standard Deviation

- Most commonly used measure of variation - Shows variation about the mean; a measure of the average scatter around the mean - Has the same units as the original data - the square root of the variance
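
To make the units point concrete, here is a small sketch with hypothetical heights in centimetres: the variance comes out in squared centimetres, and its square root is back in centimetres.

```python
import math
import statistics

heights_cm = [150, 160, 170, 180, 190]            # illustrative values, mean is 170
variance_cm2 = statistics.pvariance(heights_cm)   # squared units (cm squared)
sd_cm = math.sqrt(variance_cm2)                   # original units (cm)
same_as_library = math.isclose(sd_cm, statistics.pstdev(heights_cm))
```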

Measures of central tendency

- a descriptive statistic - measures of typicality - need more than one measure of central tendency because sensitivity to the shape of the distribution varies - the type (level) of measurement determines which measure of central tendency to use - mean, median, and mode

Interquartile range

- reduces the problem of outliers because the middle 50% of the data is stable - Eliminate the highest and lowest quarters of the observations and calculate the range of the remaining values - the range of the middle 50% (Q3 - Q1)
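
The sketch below compares the range and the interquartile range on two invented data sets that differ only in their largest value. It assumes Python 3.8+ for `statistics.quantiles` and uses that function's default method; other textbooks and tools compute quartiles slightly differently, so exact values may vary. The helper name `iqr` is hypothetical.

```python
import statistics

def iqr(values):
    """Interquartile range, Q3 - Q1, using the library's default quartile method."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

clean = [1, 3, 5, 7, 9, 11, 13, 15, 17]          # illustrative values
with_outlier = [1, 3, 5, 7, 9, 11, 13, 15, 100]  # largest value replaced by an outlier

range_clean = max(clean) - min(clean)
range_outlier = max(with_outlier) - min(with_outlier)
iqr_clean = iqr(clean)
iqr_outlier = iqr(with_outlier)
```

The range jumps when the outlier is introduced, while the IQR is unchanged because only the middle half of the data enters the calculation.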

median

- is the 50th percentile - a little more sensitive than the mode (not affected by extreme values) - can't use for categorical (nominal) data - best measure for ordinal data - If the number of values is odd, the median is the middle number of the sorted values - If the number of values is even, the median is the average of the two middle numbers
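
The odd and even cases can be sketched as follows. The function is written out for illustration; in practice `statistics.median` does the same job.

```python
def median(values):
    """Middle value of the sorted data, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the middle pair

odd_case = median([7, 1, 5])       # sorted: 1, 5, 7
even_case = median([7, 1, 5, 3])   # sorted: 1, 3, 5, 7
```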

mode

- is the least sensitive measure (not affected by extreme values) - the most frequently occurring value - problems: multiple modes, no mode - use for categorical (nominal) or numerical data
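
A brief sketch of the mode on categorical data and of the two problem cases, assuming Python 3.8+ for `statistics.multimode`. The example values are invented.

```python
import statistics

colors = ["red", "blue", "red", "green"]   # nominal data: the mode still works
single_mode = statistics.mode(colors)

# Two values tie for most frequent, so there are multiple modes.
two_modes = statistics.multimode([1, 1, 2, 2, 3])

# Every value occurs once, so all values tie and no single mode stands out.
no_clear_mode = statistics.multimode([1, 2, 3])
```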

skewed distributions and positions of measures

- mode at the peak - median between the mode and the mean - mean pulled toward the tail
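
For a right-skewed (long right tail) data set, that ordering can be checked directly. The values below are invented for illustration; for a left-skewed set the order reverses.

```python
import statistics

right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]   # a few large values form the tail

mode = statistics.mode(right_skewed)       # peak of the distribution
median = statistics.median(right_skewed)   # middle of the sorted values
mean = statistics.mean(right_skewed)       # pulled toward the tail
ordered_as_expected = mode < median < mean
```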

measures of variation

- numbers that indicate how much scores differ from each other and from the measure of central tendency - gives info on the spread or variability of the data values - range, interquartile range, variance, standard deviation, coefficient of variation - no measure of variability for nominal data

range

- simplest measure of variability - difference between the largest and smallest values - disadvantages: ignores how the data are distributed and is sensitive to outliers

quartiles

- split the ordered data into 4 segments with an equal number of values per segment - identifies relative position of scores in a distribution - Q1 = 25th percentile - Q2 = 50th percentile (median) - Q3 = 75th percentile
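
A minimal sketch of the three cut points, assuming Python 3.8+ and the default method of `statistics.quantiles`. Quartile conventions differ between textbooks and software, so other tools may give slightly different Q1 and Q3 on the same data.

```python
import statistics

data = list(range(1, 12))                       # illustrative values 1 through 11
q1, q2, q3 = statistics.quantiles(data, n=4)    # three cut points, four segments
q2_is_median = q2 == statistics.median(data)
```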

Empirical rule

- things are highly predictable: if the data distribution is bell-shaped, then - the interval mean ± 1 standard deviation contains about 68% of the values in the population or the sample - the interval mean ± 2 standard deviations contains about 95% of the values in the population or the sample - the interval mean ± 3 standard deviations contains about 99.7% of the values in the population or the sample
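
These percentages can be checked against an exact normal distribution with `statistics.NormalDist` (Python 3.8+). The mean of 100 and standard deviation of 15 are arbitrary illustrative choices; the shares are the same for any normal distribution.

```python
from statistics import NormalDist

mu, sigma = 100, 15                  # hypothetical mean and standard deviation
dist = NormalDist(mu=mu, sigma=sigma)

# Share of values within k standard deviations of the mean, for k = 1, 2, 3.
share_within = {
    k: dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)
    for k in (1, 2, 3)
}
```

Real samples that are only roughly bell-shaped will match these shares approximately, not exactly.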

Exploratory data analysis

- Box-and-whisker plot: a graphical display of the data using the 5-number summary: Minimum -- Q1 -- Median -- Q3 -- Maximum - The box and central line are centered between the endpoints if the data are symmetric around the median - allows you to see if there are outliers
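
The 5-number summary behind the plot can be sketched as below. The function name and data are made up, and the outlier check uses the common 1.5 x IQR fence convention, which is an assumption added here and is not stated in these notes. Quartiles use the default method of `statistics.quantiles` (Python 3.8+).

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(values)
    q1, med, q3 = statistics.quantiles(ordered, n=4)
    return ordered[0], q1, med, q3, ordered[-1]

data = [2, 4, 5, 6, 7, 8, 9, 10, 12, 13, 40]   # illustrative values
minimum, q1, med, q3, maximum = five_number_summary(data)

# Assumed convention: flag points more than 1.5 * IQR beyond the box.
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]
```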

advantages of SD and variance

- Each value in the data set is used in the calculation - Values far from the mean are given extra weight (because deviations from the mean are squared)

variation

- the most important concept in statistics is analyzing variation because it measures diversity and individual differences - on average, how different is a score from the mean?

normal distribution

- many naturally occurring variables are approximately normally distributed - mean, median, and mode are all at the same spot in the center

variance

- only interpretable in comparison (for example, against the variance of another data set), since its units are squared - can never be negative

parameters vs statistics

- population parameters are constant (relatively stable), but sample statistics vary from sample to sample

reason for n-1

- standard deviation is the square root of the variance - dividing by n instead of n-1 would underestimate the population variance, because a small sample cannot represent the full diversity of the population - subtracting 1 from n in the denominator bumps up the final answer - without it, the sample variance is a biased estimator of the population variance
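
The bias can be seen by brute force on a tiny made-up population: take every possible sample of size 2 (drawn with replacement), compute the variance both ways, and average the results. This is an illustrative sketch under those assumptions, not a proof.

```python
import itertools
import statistics

population = [2, 4, 6, 8]                          # illustrative population
true_variance = statistics.pvariance(population)   # divides by N

# Every ordered sample of size 2, drawn with replacement (16 samples).
samples = list(itertools.product(population, repeat=2))

# Dividing by n: on average this underestimates the population variance.
avg_divide_by_n = statistics.mean([statistics.pvariance(s) for s in samples])

# Dividing by n - 1: on average this recovers the population variance.
avg_divide_by_n_minus_1 = statistics.mean([statistics.variance(s) for s in samples])
```

Here the n version averages to half of the true variance, while the n-1 version averages to the true variance exactly.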

Insensitivity

- an insensitive statistic doesn't change when the data change - want the most sensitive measure that is appropriate for the data

