Section 5
What is the empirical rule?
Also known as the 68-95-99.7% Rule for Bell-shaped Distributions. Approximately 68% of the data lie within 1 standard deviation of the mean. Approximately 95% of the data lie within 2 standard deviations of the mean. Approximately 99.7% of the data lie within 3 standard deviations of the mean.
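The three percentages can be checked numerically. The following is a minimal sketch that assumes an exactly normal distribution and uses Python's standard-library `statistics.NormalDist`; the helper name `area_within` is made up for this illustration.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```

The rounded values 68%, 95%, and 99.7% are approximations of these areas.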
What are the two basic properties of all normal density curves?
1. The total area under the curve equals 1. 2. The density curve always lies on or above the horizontal axis. Because of these two properties, the area under the curve over an interval can be treated as the probability that an observation falls in that interval.
What is a z-score?
The z-score for a data value tells how many standard deviations away from the mean the observation lies: z = (value - mean) / standard deviation. If the z-score is positive, the observed value lies above the mean. If the z-score is negative, the observed value lies below the mean.
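As a small worked example, the sketch below computes z-scores directly from the definition. The function name `z_score` and the numbers (mean 100, standard deviation 15) are hypothetical values chosen only for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

# Hypothetical distribution with mean 100 and standard deviation 15.
print(z_score(130, 100, 15))  # 2.0  (two standard deviations above the mean)
print(z_score(85, 100, 15))   # -1.0 (one standard deviation below the mean)
```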
Normal Density Curve
This curve is symmetric and has a bell shape.
What is a Q-Q plot and what is it used for?
A Q-Q plot is a graph that statisticians developed to help visually assess whether data are normally distributed. In a Q-Q plot, the z-scores for the observed values in the data set are plotted on the horizontal axis. The z-scores of the expected values (assuming the data are normally distributed) are plotted on the vertical axis. If the points lie essentially in a straight line, we conclude that the data are approximately normally distributed. If the data are non-normal, we expect to see a distinct curve in the Q-Q plot.
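The coordinates of a Q-Q plot can be sketched without any plotting library. The code below is an illustration under stated assumptions: the function name `qq_points` is hypothetical, the data values are made up, and the plotting position (i - 0.5) / n used for the expected z-scores is one common convention among several, not necessarily the one used by any particular software package.

```python
from statistics import NormalDist, mean, stdev

def qq_points(data):
    """Return (observed z-score, expected z-score) pairs for a Q-Q plot."""
    n = len(data)
    m, s = mean(data), stdev(data)
    # Observed z-scores of the sorted data (horizontal axis).
    observed = [(x - m) / s for x in sorted(data)]
    # Expected z-scores if the data were normal (vertical axis).
    # Assumption: plotting position (i - 0.5) / n for the i-th smallest value.
    expected = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(observed, expected))

# Made-up sample values, for illustration only.
sample = [98.6, 98.2, 97.9, 99.1, 98.4]
for obs_z, exp_z in qq_points(sample):
    print(f"observed z = {obs_z:6.2f}   expected z = {exp_z:6.2f}")
```

Plotting these pairs and checking how closely they follow a straight line is the visual assessment described above.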
What is defined as an unusual observation?
We define an unusual observation to be something that happens less than 5% of the time. For normally distributed data, we determine if an observation is unusual based on its z-score. We call an observation unusual if z<-2 or if z>2. In other words, we will call an event unusual if the absolute value of its z-score is greater than 2.
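The link between the 5% threshold and the cutoff of 2 can be checked with a short sketch. It assumes normally distributed data; the function name `is_unusual` and the example numbers are hypothetical.

```python
from statistics import NormalDist

def is_unusual(x, mean, sd, cutoff=2.0):
    """True if the z-score of x is farther than `cutoff` from 0 in either direction."""
    z = (x - mean) / sd
    return abs(z) > cutoff

# Probability that a normal observation has z < -2 or z > 2.
tail_probability = 2 * NormalDist().cdf(-2.0)
print(f"{tail_probability:.4f}")  # 0.0455, which is less than 5%

# Hypothetical distribution with mean 100 and standard deviation 15.
print(is_unusual(135, 100, 15))  # True  (z is about 2.33)
print(is_unusual(110, 100, 15))  # False (z is about 0.67)
```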
Standard Normal Distribution
A normal distribution with a mean of 0 and a standard deviation of 1.
What is granularity in a Q-Q plot?
You may notice some jagged jumps in the Q-Q plot of the body temperature data. This is called granularity. It occurs here because the body temperatures were only reported to the nearest tenth of a degree, so many observations share exactly the same value. If we had more precision in the data, this would not occur.