Descriptive Statistics (Exam 1)
Characteristics of Normally Distributed Data (5)
-Unimodal -Symmetrical (no skew) -Bell Shaped (no excess kurtosis) -Asymptotic (tails approach but never reach 0) -Mean, median, and mode are equal and centered *normality testing is always a precursor to any analysis
Significant Kurtosis?
if kurtosis (K) is greater than 2x the standard error of kurtosis (SEK), then kurtosis is significant and the distribution is likely not normal
Leptokurtic
positive kurtosis; peak is higher and narrower than normal (think "leap" = high)
Skew/Kurtosis
shape of the spread (variability); distribution
Use Median When...
there are extreme scores (because the mean is sensitive to outliers)
Weighted Mean
used to restore balance to over- or underrepresented segments of a sample
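As a rough illustration of a weighted mean, the sketch below multiplies each value by its weight and divides by the total weight. The group means and population shares are made-up numbers, and `weighted_mean` is a hypothetical helper name, not something from the course material.

```python
def weighted_mean(values, weights):
    """Return sum(weight * value) / sum(weight)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical example: two groups whose true population shares are
# 25% and 75%, even though the sample may have recruited them unevenly.
group_means = [70.0, 80.0]
population_shares = [0.25, 0.75]
result = weighted_mean(group_means, population_shares)
print(result)
```

The weights pull the estimate toward the group that makes up more of the population, which is how an over- or underrepresented segment gets rebalanced.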
Ways to Examine Variability
-Range -Standard Deviation -Standard Error -Variance
Mode
most frequently occurring score
Calculating Skew
skewness = 3(Mean - Median) / Standard Deviation -(-) value = (-) skew -(+) value = (+) skew -0 = no skew (symmetrical, as in a normal distribution)
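The formula above can be sketched in Python with the standard library. This is a minimal sketch that assumes the sample standard deviation (n - 1 denominator), since the card does not say which version to use; `pearson_skew` is a hypothetical helper name and the scores are invented.

```python
import statistics


def pearson_skew(data):
    """Skewness = 3 * (mean - median) / standard deviation."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    sd = statistics.stdev(data)  # assumption: sample SD (n - 1)
    return 3 * (mean - median) / sd


# Invented scores with one extreme high value, which pulls the mean
# above the median and should give a positive result.
scores = [2, 3, 3, 4, 4, 5, 20]
print(pearson_skew(scores))

# A symmetrical set has mean == median, so the result is 0.
print(pearson_skew([1, 2, 3, 4, 5]))
```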
Elements of Variability
-spread -dispersion
Variance
another common measure of spread in a data set, found in many statistical calculations *formula is s^2 (the standard deviation squared, i.e. the standard deviation before taking the square root)
Standard Deviation
-average distance of each observation from the mean (average amount of variability) -most frequently reported measure of variability
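To show how variance (s^2) and standard deviation relate, here is a small sketch that computes both by hand. It assumes the sample formulas with an n - 1 denominator; the function names and the data are illustrative only.

```python
import math


def sample_variance(data):
    """s^2: sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def sample_sd(data):
    """s: the square root of the sample variance."""
    return math.sqrt(sample_variance(data))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # invented scores, mean = 5
print(sample_variance(data))  # in squared units
print(sample_sd(data))        # in the same units as the scores
```

Taking the square root is what brings the standard deviation back into the units of the original scores.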
Central Tendency
central/typical value around which data clusters (mean, median, mode)
Standard Error (Variance)
combines information from the standard deviation and the sample size (SE = SD / sqrt(n)) to tell you how confident you can be about your population mean estimate; used to assign confidence intervals to the population mean estimate
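A minimal sketch of the standard error of the mean, computed as SD / sqrt(n), and of a rough 95% confidence interval built from it. The 1.96 multiplier is a large-sample normal approximation assumed here for illustration; the card does not specify how the interval is built, and the function names are hypothetical.

```python
import math
import statistics


def standard_error(data):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(data) / math.sqrt(len(data))


def approx_ci95(data):
    """Rough 95% interval: mean +/- 1.96 * SE (normal approximation)."""
    mean = statistics.mean(data)
    margin = 1.96 * standard_error(data)
    return (mean - margin, mean + margin)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # invented scores
print(standard_error(data))
print(approx_ci95(data))
```

Because n is in the denominator, a larger sample with the same spread gives a smaller standard error and a narrower interval.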
Use Mode When...
data are categorical
Significant Skew?
if skew is greater than 2x the standard error of skew, then skew is significant and distribution is likely not normal
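The 2x rule can be sketched as below; the same check applies to kurtosis and its standard error. The standard-error formulas are the ones commonly reported by statistics packages and depend only on the sample size n. Using them here is an assumption, as is comparing the absolute value so that negative skew is handled too. The function names are hypothetical.

```python
import math


def standard_error_of_skew(n):
    """Commonly used standard error of skewness for sample size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def standard_error_of_kurtosis(n):
    """Commonly used standard error of kurtosis for sample size n."""
    ses = standard_error_of_skew(n)
    return 2 * ses * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


def is_significant(statistic, standard_error):
    """True when the statistic is more than 2x its standard error."""
    return abs(statistic) > 2 * standard_error


n = 100
ses = standard_error_of_skew(n)
print(is_significant(0.60, ses))  # invented skew value
print(is_significant(0.30, ses))  # invented skew value
```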
Variability
measure of how much scores differ from the mean
Kurtosis
measure of how peaked or flat the distribution curve will be
Skewness
measure of whether and how much extremes are affecting the bell curve (normal distributions)
Platykurtic
negative kurtosis; peak is lower and wider than normal (think "plat" = flat)
Positive Skew
the mean is bigger than the median -"slide" runs in a positive direction & "bump" is pulled to the left of the normal curve
Negative Skew
the mean is smaller than the median -"slide" runs in a negative direction & "bump" is pulled to the right of the normal curve
Systematic Variability
type of variability whose causes have been identified
Error
type of variability that has not been or cannot be accounted for
Descriptive Statistics
-used to describe data sets -used to visualize data -1st step in any statistical analysis
1st Quartile (Q1)
value below which 25% of all observations fall
Median
value below which 50% of all scores fall and above which 50% fall
3rd Quartile (Q3)
value below which 75% of all observations fall
No Mode
when no values occur more than once
Bimodal Distribution
when two values tie for the highest frequency (the distribution has two modes)
Standard Deviation vs. Variance
-S.D. is in the same units as the data, making it ideal for describing data -variance is often a required input for statistical tests
Calculating Mode
-list all values in a distribution -tally the number of times each value occurs -value occurring the most is the mode
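The tallying steps above can be sketched with `collections.Counter`. This is only an illustration: `find_modes` is a hypothetical helper that returns every value tied for the highest count, and an empty list when no value occurs more than once.

```python
from collections import Counter


def find_modes(values):
    """Tally each value and return the most frequent one(s)."""
    counts = Counter(values)  # value -> number of occurrences
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # no value occurs more than once: no mode
    return [value for value, count in counts.items() if count == highest]


print(find_modes([1, 2, 2, 3]))        # one mode
print(find_modes([1, 1, 2, 2, 3]))     # two values tie: bimodal
print(find_modes([1, 2, 3]))           # no mode
print(find_modes(["a", "b", "b"]))     # works for categorical data too
```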
Range
-most general estimate of variability -also the coarsest estimate b/c only the 2 most extreme scores are used -conclusions should never be based on variability alone -very important in predictive statistics
Calculating the Median
-rank order scores from lowest to highest -find the "middle" score, or average the two middle scores if sample size is even
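The two steps above translate directly into a short function. This is a hand-rolled sketch for study purposes (the standard library's `statistics.median` does the same job); the name `median_by_hand` is hypothetical.

```python
def median_by_hand(values):
    """Rank order the scores, then take the middle (or average the two middles)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd n: single middle score
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the two middles


print(median_by_hand([5, 1, 3]))
print(median_by_hand([4, 1, 3, 2]))
```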
Percentiles (Uses)
-used to get a feel for how spread out the data are and where most of the observations are contained -also used as a comparison score (provides a "norm-referenced" measure to compare one value to the larger sample)
Inter-Quartile Range (IQR)
= Q3 - Q1 -the larger the IQR, the more variable the data
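A minimal sketch of the IQR using `statistics.quantiles` (Python 3.8+). Quartiles can be computed by several slightly different conventions; the `"inclusive"` method is assumed here for illustration, and other software may return somewhat different Q1 and Q3 values for the same data.

```python
import statistics


def iqr(data):
    """Inter-quartile range: Q3 - Q1."""
    # n=4 splits the data into quartiles and returns [Q1, Q2, Q3].
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return q3 - q1


narrow = [4, 5, 5, 5, 6, 6, 6, 7, 7]  # invented, tightly clustered scores
wide = [1, 2, 3, 4, 5, 6, 7, 8, 9]    # invented, more spread out scores
print(iqr(narrow))
print(iqr(wide))
```

The more spread-out set gives the larger IQR, matching the rule that a larger IQR means more variable data.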