Statistics Chapter 6
Example 2: Find the probability less than a number
Suppose that a national testing service gives a test in which the results are normally distributed with a mean of 400 and a standard deviation of 100. Find the probability that the score of a randomly selected student will be less than the mean score. In the problem, we are given Mu=400 and stdev=100. The symbolic representation is P(X<400). The z-score is z = (X - Mu) / stdev. The z-score is used to convert the normal random variable to a standard normal random variable as follows: P(X<400) = P((X - Mu) / stdev < (400 - 400) / 100) = P(z < 0). The problem is now represented using the standard normal distribution. Since the total area under the normal curve is one and the normal distribution is symmetric about its mean, half of the probability is on one side of the mean and half is on the other; thus P(z<0) = 0.5.
Example 4: find the probability between 2 numbers
The repair time for air conditioning units is believed to have a normal distribution with a mean of 38 minutes and a standard deviation of 12 minutes. Find the probability that the repair time for an air conditioning unit will be between 29 and 44 minutes. In the problem, we are given Mu=38 minutes and stdev=12 minutes. The probability that the repair time of an AC unit is between 29 and 44 minutes is as follows: P(29 < X < 44) = P((29 - 38) / 12 < (X - Mu) / stdev < (44 - 38) / 12) = P(-0.75 < z < 0.5) = P(z < 0.5) - P(z < -0.75) = .6915 - .2266 = .4649
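The calculation above can be checked without a table. The following is a minimal sketch, assuming Python 3.8+ and its standard-library `statistics.NormalDist`; the helper name `prob_between` is made up for illustration.

```python
from statistics import NormalDist

def prob_between(low, high, mu, sigma):
    """P(low < X < high) for a normal X, as a difference of two left-hand areas."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(high) - dist.cdf(low)

# Repair-time example: Mu = 38 minutes, stdev = 12 minutes.
repair = prob_between(29, 44, mu=38, sigma=12)
print(round(repair, 3))
```

The result agrees with the table answer of .4649 to about three decimal places; the small difference in the fourth decimal comes from rounding the two table entries before subtracting.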
z-score
The z-score can transform any normal random variable into a standard normal random variable. The z-score is denoted by z and given by the formula z = (X - Mu) / stdev. Applying the z-score is rather simple. Find the probability that a normal random variable with a mean of 10 and a standard deviation of 20 will lie between 10 and 40. Applying the z-score yields: P(10 < X < 40) = P((10 - 10) / 20 < z < (40 - 10) / 20) = P(0 < z < 1.5). The problem is now represented using the standard normal distribution, so the value can be found from the z-table: P(0 < z < 1.5) = P(z < 1.5) - P(z < 0) = .9332 - .5000 = .4332
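As a rough illustration that standardizing does not change the probability, the sketch below computes the same area twice: once on the original curve and once on the standard normal curve after converting the endpoints. It assumes Python's standard-library `statistics.NormalDist`; `z_score` is an illustrative helper name.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standard score: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

mu, sigma = 10, 20
z_low = z_score(10, mu, sigma)    # 0.0
z_high = z_score(40, mu, sigma)   # 1.5

std = NormalDist()                # standard normal: mean 0, stdev 1
via_z = std.cdf(z_high) - std.cdf(z_low)

original = NormalDist(mu, sigma)
direct = original.cdf(40) - original.cdf(10)

print(round(via_z, 4), round(direct, 4))
```

Both numbers round to the table value .4332.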
Area in between
To find the area between two values of z, use the table to look up the area to the left of each z value and then subtract the smaller area from the larger area. For example, find the area under the normal curve between z= -1.5 and z=2.65. FIRST, look up the area to the LEFT of z= -1.5, which gives you 0.0668. SECOND, look up the area to the LEFT of z=2.65, which gives you 0.9960. LAST, subtract: 0.9960 - 0.0668 = 0.9292. The area between our two z values is therefore 0.9292.
Example 1: Find the area under the normal curve to the left of z=2.45
To read a normal table, we will need to split the z value in two parts: the first part will be the number out to the tenths place, and the second part will be the number in the hundredths place. For this example, the first part of the z value is 2.4. The second part is 0.05. So look across the 2.4 row and down the 0.05 column. This row and column intersect at .9929. Thus, the area under the normal curve to the LEFT of z=2.45 is .9929.
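The row-and-column split can be mimicked in code. This is only a sketch, assuming a non-negative z given to two decimal places and using Python's standard-library `statistics.NormalDist` in place of the printed table; `split_z` and `left_area` are illustrative names.

```python
from statistics import NormalDist

def split_z(z):
    """Split a non-negative z (two decimals) into its table row and column headers."""
    hundredths = round(z * 100)        # 2.45 -> 245
    row = (hundredths // 10) / 10      # 24 -> 2.4  (tenths part)
    col = (hundredths % 10) / 100      # 5  -> 0.05 (hundredths part)
    return row, col

def left_area(z):
    """Area to the left of z, rounded to four places like a table entry."""
    return round(NormalDist().cdf(z), 4)

row, col = split_z(2.45)
area = left_area(2.45)
print(row, col, area)  # 2.4 0.05 0.9929
```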
Example 1: Area to the left of z
What z value has an area of 0.4 to its left? Using the "-infinity to -z table", look for the probability closest to 0.4. Looking at the row and column, you will see that z= -0.25.
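The "look for the closest probability" step can be sketched as a scan over the same grid of z values a table uses. This assumes a table covering z from -3.99 to 0.00 in steps of 0.01 with four-place entries, and uses Python's standard-library `statistics.NormalDist`; the unrounded answer from `inv_cdf` is shown for comparison.

```python
from statistics import NormalDist

std = NormalDist()
target = 0.4

# Grid of z values as they would appear in a "-infinity to -z" table.
grid = [-k / 100 for k in range(0, 400)]

# Pick the z whose four-place table entry is closest to the target area.
z_table = min(grid, key=lambda z: abs(round(std.cdf(z), 4) - target))

# Inverse lookup without rounding to the table grid.
z_exact = std.inv_cdf(target)

print(z_table, round(z_exact, 4))
```

The scan lands on z = -0.25, whose table entry (.4013) is nearer to 0.4 than the entry for z = -0.26 (.3974).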
Inflection point
While the mean defines the location, the standard deviation determines the shape of the curve. The inflection points on either side of a normal distribution are Mu - standard dev and Mu + standard dev. An inflection point is a point on the curve where the curvature of the line changes. So, one standard deviation is the distance from the mean to one of the inflection points. The larger the standard deviation, the more area there will be in the tails of the distribution. Therefore, the curve will appear flatter. Changing the standard deviation parameter can have rather significant effects on the shape of the distribution.
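A numerical check of both claims, offered as a sketch only: the values Mu=75 and stdev=15 are illustrative, the second derivative is estimated with a simple central difference, and `statistics.NormalDist` from the Python standard library supplies the density.

```python
from statistics import NormalDist

def second_derivative(dist, x, h=0.01):
    """Central-difference estimate of the second derivative of the density at x."""
    return (dist.pdf(x + h) - 2 * dist.pdf(x) + dist.pdf(x - h)) / h**2

mu, sigma = 75, 15  # illustrative values only
curve = NormalDist(mu, sigma)

# Just inside Mu + stdev the curve bends downward (negative curvature) ...
inside = second_derivative(curve, mu + sigma - 1)
# ... and just outside it bends upward (positive curvature).
outside = second_derivative(curve, mu + sigma + 1)
print(inside < 0, outside > 0)

# A larger standard deviation lowers the peak, so the curve looks flatter.
peak_narrow = NormalDist(mu, 15).pdf(mu)
peak_wide = NormalDist(mu, 30).pdf(mu)
print(peak_narrow > peak_wide)
```

The sign of the curvature flips as x crosses Mu + stdev, which is what makes that point an inflection point; the same happens at Mu - stdev by symmetry.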
Example 2: find the area in the tails
Find the area under the normal curve to the left of z= -2.13 and to the right of z=2.13. Notice that the absolute value of each z is the same. Thus, the area in each tail of the distribution will be the same because of the symmetric property of the normal curve. So to find the area in the two tails, we only need to look up the negative tail and multiply the answer by 2. Use the normal curve table, "Standard normal table: area -infinity to -z", to find the area to the left of z= -2.13. This gives us a value of 0.0166. Multiply this area by two to get the combined area in the tails. Thus, 0.0166 * 2 = 0.0332. So the area is 0.0332.
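The doubling shortcut can be sketched as a small helper. It assumes Python's standard-library `statistics.NormalDist`; `two_tail_area` is an illustrative name.

```python
from statistics import NormalDist

def two_tail_area(z):
    """Combined area left of -|z| and right of +|z| under the standard normal curve."""
    left_tail = NormalDist().cdf(-abs(z))
    return 2 * left_tail   # the right tail has the same area by symmetry

tails = two_tail_area(2.13)
print(round(tails, 4))  # 0.0332
```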
Finding values of a normally distributed random variable
Generally, the question is to find the probability that z is greater than or less than some value. However, if the problem is reversed, you have to find the value of z that corresponds to a given tail probability, which is commonly required in estimation and hypothesis testing. The first problem type we will look at is when you are asked to find the z value for a given area to the left of z. Recall that the cumulative standard normal table gives the area to the left of z. Therefore, to find z for an area to the left, simply scan through the body of the table until you find the given area. The z value is composed of the row and column headers that correspond to the probability given.
Example 1: Indicate the X value
Given X=62, Mu=75, and stdev=15, indicate on the curve the mean, the inflection points, and the X value. Solution: Remember that the mean is located at the center of the distribution; therefore, it is found in the center of the curve. The inflection points are where the curvature of the line changes from being curved upward to being curved downward. They are also equal to Mu +/- stdev. We have 75-15=60 and 75+15=90. The X value can then be placed on the curve according to the placement and values of the mean and inflection points; here X=62 falls just to the right of the lower inflection point at 60.
Example 2: convert to a standard normal curve
Given a normal curve with Mu=38 and stdev=8, convert to a standard normal curve and indicate where a score of X=42 would be on each curve. Solution: Begin by finding the standard score for X=42 as follows. z = (X - Mu) / stdev = (42 - 38) / 8 = 0.5
Probability notation
It is important to introduce some probability notation. P(X) stands for the probability that X will occur. Thus, P(z < 1.37) is the probability that z is less than 1.37, as in the first example. Remember that probability is the same as area under the curve. Therefore, P(z > 2.36) would be the area under the curve to the right of z=2.36. Notice that the inequality sign points in the same direction as the area under the curve.
The standard normal distribution
The standard normal distribution is a special version of the normal distribution. The standard normal curve has all of the properties of a normal curve, and always has a mean of 0 and a standard deviation of 1.
Properties of a normal distribution
1. A normal curve is symmetric and bell-shaped 2. A normal curve is completely defined by its mean, Mu, and standard deviation. 3. The total area under a normal curve equals 1. 4. The x-axis is a horizontal asymptote for a normal curve.
Properties of a standard normal distribution
1. A standard normal curve is symmetric and bell-shaped. 2. A standard normal curve is completely defined by its mean, Mu=0, and stdev=1. 3. The total area under a standard normal curve equals 1. 4. The x-axis is a horizontal asymptote for a standard normal curve. To standardize a normal curve to a standard normal curve, we convert each X value to a standard score, z, using the formula z = (X - Mu) / stdev. Should we then need to convert back from a standard score to an X value, we can reverse the calculation and use the formula X = stdev * z + Mu.
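The two conversion formulas are inverses of each other, which a short round trip makes concrete. This is a sketch with illustrative helper names `to_z` and `to_x`, using Mu=38 and stdev=8 purely as sample values.

```python
def to_z(x, mu, sigma):
    """Convert an X value to a standard score."""
    return (x - mu) / sigma

def to_x(z, mu, sigma):
    """Convert a standard score back to an X value."""
    return sigma * z + mu

mu, sigma = 38, 8          # sample values for illustration
z = to_z(42, mu, sigma)    # 0.5
x = to_x(z, mu, sigma)     # 42.0, back where we started
print(z, x)
```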
Area in the tails
Find the area under the normal curve to the left of z= -1.55 and to the right of z=3.02. Solution: There are two areas we must find, and the total area we are interested in is the sum of these two areas. Let's begin by finding the area to the left of z= -1.55. Look up z= -1.55 in the normal curve table, "Standard normal table: area -infinity to -z". This area equals 0.0606. Next, we need to find the area to the right of z=3.02. By symmetry, this equals the area to the left of z= -3.02, which we look up in the same table. The area is 0.0013. Thus the sum of the two areas is 0.0606 + 0.0013 = 0.0619.
Finding area under a normal distribution
Many of the distributions in which statisticians are interested are normal distributions. One important property of a normal distribution is that the area under any part of the normal curve is equal to the probability of the random variable falling within that region. Recall from the previous section that the area to the left of a specific value, x, of the random variable is equal to P(X<x). Similarly, the area to the right of x equals P(X>x). Notice that the inequality symbol points in the direction of the area! Furthermore, we do not talk about finding the probability that X is exactly a specific value, because the normal curve is a continuous probability distribution and the area above a single point is zero. Instead, we refer to the probability that X is within a range of values. Thus, choosing to include the endpoint of our range does not change the value of the probability, so we can say that P(X<x) = P(X less than or equal to x). To demonstrate how area and probability are related, let's look at an example. Suppose the mean score of Test 1 in your statistics class was a 75 with a standard deviation of 5 and the distribution of test scores was normal. What is the probability that a randomly selected test score, such as yours, was better than an 80? The mean is 75, and because the standard deviation is 5, a score of 80 is one standard deviation above the mean. The area under the normal curve more than one standard deviation above the mean is equal to 0.1587, so the probability that you scored an 80 or better is .1587 or 15.87%. If the probability of a random variable having a value in a given range is equal to the area under the curve in that region, then we need to be able to calculate the area under the normal curve in any given region. The area under a curve can be found using integration, which is a calculus technique. However, because there are an infinite number of normal curves, it would be tedious to have to calculate the area for each unique curve.
Luckily, the area under any given part of the standard normal curve never changes, so tables have been created to allow us to look up the area under the standard normal curve without relying on calculus. There are many different normal curve tables that statisticians use, each one giving us a different method for finding the same area under the normal curve. Each table produces the same results. Since we will be using z values rounded to two decimal places, our normal curve table, "Standard normal table: area -infinity to z", reflects just that. The z value up to its first decimal place is listed down the left-hand column, with the second decimal place across the top row. Where the appropriate row and column intersect, we find the amount of area under the standard normal curve to the LEFT of that particular z value. Let's now look at an example of finding the area to the left of a particular z.
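To make the link between area, integration, and probability concrete, the test-score example (mean 75, standard deviation 5, score above 80) can be checked by adding up thin trapezoids under the curve. This is a sketch, not the method a table or library uses internally: the trapezoid rule, the step count, and the cutoff of ten standard deviations above the mean are all assumptions chosen for illustration, and `statistics.NormalDist` from the Python standard library supplies the curve.

```python
from statistics import NormalDist

def trapezoid_area(f, a, b, n=10_000):
    """Approximate the area under f between a and b with n trapezoids."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

scores = NormalDist(mu=75, sigma=5)

# Area under the curve from 80 up to a point far out in the right tail.
area = trapezoid_area(scores.pdf, 80, 75 + 10 * 5)

# The same probability from the cumulative distribution function.
exact = 1 - scores.cdf(80)

print(round(area, 4), round(exact, 4))
```

Both values round to .1587, the figure quoted for one standard deviation above the mean.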
Finding probability using a normal distribution
Normal distributions are all bell-shaped, but the bells come in various shapes and sizes. In addition, the mean, mode, and median are equal. The normal distribution has two parameters, Mu (mean) and variance. The mean defines the location and the variance determines the dispersion. Changing the variance parameter can have rather significant effects on the shape of the distribution. Although the distribution can range in value from negative infinity to positive infinity, values that are a great distance from the mean occur rarely. One of the most important properties of normal random variables is that within a fixed number of standard deviations from the mean, all normal random variables contain the same fraction of their probabilities.
Size of interval | Area under curve in interval
+/- 0.5 stdev | .3829
+/- 1.0 stdev | .6827
+/- 1.5 stdev | .8664
+/- 2.0 stdev | .9545
+/- 2.5 stdev | .9876
+/- 3.0 stdev | .9973
Note that the probability of a normal random variable being in some interval corresponds to the area under the curve.
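The interval areas can be recomputed from the standard normal curve, since the fraction within k standard deviations of the mean is the same for every normal distribution. A minimal sketch, assuming Python's standard-library `statistics.NormalDist`; `central_area` is an illustrative name.

```python
from statistics import NormalDist

std = NormalDist()  # standard normal: mean 0, stdev 1

def central_area(k):
    """Area within k standard deviations on either side of the mean."""
    return std.cdf(k) - std.cdf(-k)

for k in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
    print(f"+/- {k} stdev -> {central_area(k):.4f}")
```

The computed areas are .3829, .6827, .8664, .9545, .9876, and .9973.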
Example 2: Area to the right of z
Now let's consider a problem in which the area given is to the right of z. The method for finding the z-score is again to look up the given probability in the table and find its corresponding z-score. There are two ways to do this. 1. You can use the "z to infinity" table listed in the software and find the value. 2. You can use the "-infinity to -z" table. Since the normal curve is symmetric, the area to the right of z is equal to the area to the left of -z. What z value has an area of 0.352 to its right? Looking through the tables, you should find the value of 0.352 in the "z to infinity" table listed at z=0.38. If you are using tables from a book that may not include this table, the "-infinity to -z" table shows the area 0.352 at -0.38. Since the area is to the right of z, by the symmetry of the normal curve we can simply change the sign to obtain the correct value of z=0.38.
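Both lookups can be sketched with an inverse cumulative function in place of a printed table. This assumes Python's standard-library `statistics.NormalDist`, whose `inv_cdf` takes an area to the LEFT and returns z.

```python
from statistics import NormalDist

std = NormalDist()
right_area = 0.352

# Way 1: an area of 0.352 to the right means 1 - 0.352 to the left.
z_from_left = std.inv_cdf(1 - right_area)

# Way 2: find the z with 0.352 to its left, then flip the sign by symmetry.
z_from_symmetry = -std.inv_cdf(right_area)

print(round(z_from_left, 2), round(z_from_symmetry, 2))  # 0.38 0.38
```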
Z less than or greater than
The final problem type is to find z such that the given value is the sum of the area that is less than -z and the area that is greater than z. In other words, find z when the area in the tails (outside of -z and z) of the curve is given. Divide the given value by two and then find the resulting value in the "-infinity to -z" table. This will give you the value of -z. Then, by symmetry, you automatically have the value for z also. Find the value of z such that the area to the left of -z plus the area to the right of z is 0.2846. First, divide 0.2846 in half to obtain the value 0.1423. Now look up the probability 0.1423 in the "-infinity to -z" table. You will see that 0.1423 corresponds to -z = -1.07. Again, by symmetry, z=1.07.
Example 3: Find the probability greater than a number
The life of a jet engine can be modeled using a normal distribution with a mean of 4 million miles and a standard deviation of 1 million miles. Find the probability that a given jet engine will function for more than 6 million miles. In the problem, we are given Mu=4 and stdev=1. The probability that a jet engine will function for more than 6 million miles can be calculated as follows: P(X>6) = P((X - Mu) / stdev > (6 - 4) / 1) = P(z > 2) = P(z < -2) = 0.0228
Normal distribution
The most prevalent distribution is the normal distribution, a continuous probability distribution for a given random variable X that is completely defined by its mean and standard deviation. A graphical representation of a normal curve is a symmetric, bell-shaped curve centered above the mean of the distribution. An example of a data set that would produce a close to normal distribution is the height of 500 randomly selected men. The random variable in this example is men's heights. The heights would be approximately normally distributed with a mean close to 69.2 inches. Heights of men produce a normal distribution because most men are fairly close to the same height, give or take a few inches. Very tall and very short men are rare. Some other examples of data that are approximately normally distributed over a large randomly selected sample would be shoe size, weight, and length of pregnancy. Normal distributions are all bell-shaped, but the bells come in various shapes and sizes. In addition, the mean, mode, and median are equal. Note that the bell shape of the curve means that the majority of the data will be in the middle of the distribution and the amount of data will taper off evenly in both directions from the center. Another important property of a normal distribution is that the total area under the curve to the left of a specific value of the random variable, x, equals the probability that a randomly chosen value will be less than x, i.e., P(X<x). Therefore, the total area under the curve is equivalent to the probability of randomly choosing a value from the distribution that is less than or equal to the largest value in the distribution. This probability certainly equals 1, therefore the total area under the curve equals 1. Finally, the x-axis is a horizontal asymptote for the normal distribution. This fact is derived from the mathematical definition of the normal curve's function, which can never equal zero.
This says that the normal curve will approach the x-axis on both ends, but will never touch or cross it.
Area to the right
The normal curve table we are using, "Standard normal table: area -infinity to z", only gives the area to the left of a given z value, but we can use the table, along with the properties of the standard normal distribution, to find other areas as well. Remember that the total area under the standard normal curve is 1. So, if the table gives us the area to the left of z, then subtracting that area from 1 gives us the area to the right of z. Recall that the normal curve is symmetric about the mean. In terms of area under the curve, this means that the area to the right of z is equal to the area to the left of -z. To find the area to the right of z, instead of looking up the area to the left of z and subtracting that area from 1, you can simply look up -z. For example, find the area under the normal curve to the right of z=2.45. Method 1: We know that the area to the left of z=2.45 is 0.9929, so the area to the right of z=2.45 is 1 - 0.9929, which is 0.0071. Method 2: We can look up z= -2.45, which also gives us 0.0071.
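The two methods can be compared side by side in a short sketch, assuming Python's standard-library `statistics.NormalDist` stands in for the left-area table.

```python
from statistics import NormalDist

std = NormalDist()
z = 2.45

# Method 1: total area of 1 minus the area to the left of z.
method1 = 1 - std.cdf(z)

# Method 2: by symmetry, the area to the left of -z.
method2 = std.cdf(-z)

print(round(method1, 4), round(method2, 4))  # 0.0071 0.0071
```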
Example 3: Area in between -z and z
The third problem type is finding z such that a given area lies between -z and z. Again, several different approaches can be used to solve this problem. The first method is to divide the given probability in half and then find that value in the "0 to z" table. This will give you the value of z. Then, by symmetry, you automatically have the value for -z also. Another method is to find the area left over in the two tails by subtracting the given area from the total area of one. Next, divide the value just found by two to give the area in the left tail alone. Use this area to obtain -z from the "-infinity to -z" table. Again, by symmetry, just change the sign of -z to obtain the positive value z. Find the value of z such that 0.9464 of the area lies between -z and z. First, divide 0.9464 in half to obtain the value of 0.4732. Now use the "0 to z" table to find the positive value of z. Here z=1.93. Using the alternate method, first subtract the probability given from the total area, one: 1 - 0.9464 = 0.0536. Next, divide 0.0536 in half to obtain 0.0268. Using the "-infinity to -z" table you will find that -z = -1.93, and by symmetry, z=1.93.
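Both approaches can be sketched with an inverse cumulative function. This assumes Python's standard-library `statistics.NormalDist`; because `inv_cdf` works with the area to the LEFT of z, the "0 to z" area is shifted by the 0.5 that lies below the mean.

```python
from statistics import NormalDist

std = NormalDist()
central = 0.9464

# First method: half the central area lies between 0 and z,
# so the area to the left of z is 0.5 + central / 2.
z_half = std.inv_cdf(0.5 + central / 2)

# Alternate method: the two tails share 1 - central equally;
# find -z from the left-tail area, then flip the sign.
left_tail = (1 - central) / 2
z_tail = -std.inv_cdf(left_tail)

print(round(z_half, 2), round(z_tail, 2))  # 1.93 1.93
```

The alternate method is the same calculation used when the combined tail area is given directly: halve the tail area, look up -z, and change the sign.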