MAT-152-COS1 Chapter 5
Significantly high number of successes:
x successes among n trials is a significantly high number of successes if the probability of x or more successes is 0.05 or less. That is, x is a significantly high number of successes if P(x or more) ≤ 0.05. The value 0.05 is not absolutely rigid. Other values, such as 0.01, could be used to distinguish between results that are significant and those that are not significant.
Significantly low number of successes:
x successes among n trials is a significantly low number of successes if the probability of x or fewer successes is 0.05 or less. That is, x is a significantly low number of successes if P(x or fewer) ≤ 0.05. The value 0.05 is not absolutely rigid. Other values, such as 0.01, could be used to distinguish between results that are significant and those that are not significant.
Binomial Probability Distribution
A binomial probability distribution results from a procedure that meets these four requirements:
•The procedure has a fixed number of trials. (A trial is a single observation.)
•The trials must be independent, meaning that the outcome of any individual trial doesn't affect the probabilities in the other trials.
•Each trial must have all outcomes classified into exactly two categories, commonly referred to as success and failure.
•The probability of a success remains the same in all trials.
Probability Distribution Requirements
Every probability distribution must satisfy each of the following three requirements.
1. There is a numerical (not categorical) random variable x, and its numerical values are associated with corresponding probabilities.
2. ∑P(x) = 1 where x assumes all possible values. (The sum of all probabilities must be 1, but sums such as 0.999 or 1.001 are acceptable because they result from rounding errors.)
3. 0 ≤ P(x) ≤ 1 for every individual value of the random variable x. (That is, each probability value must be between 0 and 1 inclusive.)
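The three requirements can be checked mechanically. The following is a minimal sketch, not part of the original material: the function name `check_distribution`, the dictionary layout, and the tolerance value are assumptions chosen for illustration, and the second table is hypothetical.

```python
def check_distribution(table, tolerance=0.0015):
    """Return a list of violated requirements for a {x: P(x)} table.

    The tolerance is set slightly above 0.001 (an assumption) so that sums
    such as 0.999 or 1.001 pass despite floating-point error.
    """
    problems = []
    # Requirement 1: every value of x must be a number, not a category.
    if not all(isinstance(x, (int, float)) and not isinstance(x, bool) for x in table):
        problems.append("x is not numerical")
    # Requirement 2: the probabilities must sum to 1 (allowing for rounding).
    if abs(sum(table.values()) - 1) > tolerance:
        problems.append("probabilities do not sum to 1")
    # Requirement 3: each probability must be between 0 and 1 inclusive.
    if not all(0 <= p <= 1 for p in table.values()):
        problems.append("a probability is outside [0, 1]")
    return problems


# Number of heads when two coins are tossed.
two_coins = {0: 0.25, 1: 0.50, 2: 0.25}
print(check_distribution(two_coins))  # []

# Hypothetical table whose probabilities sum to 1.2.
bad_table = {0: 0.5, 1: 0.7}
print(check_distribution(bad_table))  # ['probabilities do not sum to 1']
```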
Treating Dependent Events as Independent
5% Guideline for Cumbersome Calculations: When sampling without replacement and the sample size is no more than 5% of the size of the population, treat the selections as being independent (even though they are actually dependent).
Continuous Random Variable
A continuous random variable has infinitely many values, and the collection of values is not countable. (That is, it is impossible to count the individual items because at least some of them are on a continuous scale, such as body temperatures.)
Discrete Random Variable
A discrete random variable has a collection of values that is finite or countable. (If there are infinitely many values, the number of values is countable if it is possible to count them individually, such as the number of tosses of a coin before getting heads.)
Example: Using Parameters to Determine Significance
A previous example involved n = 460 NFL football games decided in overtime. We get p = 0.5 and q = 0.5 by assuming that winning the overtime coin toss does not provide an advantage, so both teams have the same 0.5 chance of winning the game in overtime.
a. Find the mean and standard deviation for the number of wins in groups of 460 games.
b. Use the range rule of thumb to find the values separating the numbers of wins that are significantly low or significantly high.
c. Is the result of 252 overtime wins in 460 games significantly high?
Solution
a. With n = 460, p = 0.5, and q = 0.5, the binomial formulas for the mean and standard deviation can be applied as follows:
µ = np = (460)(0.5) = 230.0 games
σ = √(npq) = √((460)(0.5)(0.5)) = 10.7 games (rounded)
For random groups of 460 overtime games, the mean number of wins is 230.0 games, and the standard deviation is 10.7 games.
b. The values separating numbers of wins that are significantly low or significantly high are the values that are two standard deviations away from the mean. With µ = 230.0 games and σ = 10.7 games, we get
(µ − 2σ) = 230.0 − 2(10.7) = 208.6 games
(µ + 2σ) = 230.0 + 2(10.7) = 251.4 games
Significantly low numbers of wins are 208.6 games or fewer, significantly high numbers of wins are 251.4 games or greater, and values not significant are between 208.6 games and 251.4 games.
c. The result of 252 wins is significantly high because it is greater than the value of 251.4 games found in part (b).
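The arithmetic in this example can be reproduced with a short script. This is a sketch that assumes the usual binomial formulas µ = np and σ = √(npq); the variable names are illustrative, and σ is left unrounded until printing.

```python
import math

n, p = 460, 0.5          # overtime games and assumed probability of a win
q = 1 - p

mu = n * p                       # mean number of wins
sigma = math.sqrt(n * p * q)     # standard deviation of the number of wins

low = mu - 2 * sigma             # boundary for significantly low values
high = mu + 2 * sigma            # boundary for significantly high values

print(f"mu = {mu:.1f}, sigma = {sigma:.1f}")
print(f"significantly low: {low:.1f} or fewer")
print(f"significantly high: {high:.1f} or more")
print("252 wins significantly high?", 252 >= high)
```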
Probability Formula
A probability distribution could also be in the form of a formula. Consider the formula P(x) = 1 / [2(2 − x)! x!], where x can be 0, 1, or 2. We find that P(0) = 0.25, P(1) = 0.50, and P(2) = 0.25. The probabilities found using this formula are the same as those in the table for the number of heads when two coins are tossed.
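As a quick check, a formula can be evaluated at each value of x. The sketch below assumes the formula is P(x) = 1 / [2(2 − x)! x!] for x = 0, 1, 2, which reproduces the three probabilities stated above; the function name `p` is illustrative.

```python
from math import factorial


def p(x):
    """Assumed formula: P(x) = 1 / [2 (2 - x)! x!] for x = 0, 1, 2."""
    return 1 / (2 * factorial(2 - x) * factorial(x))


for x in range(3):
    print(x, p(x))

# The probabilities sum to 1, as every probability distribution requires.
print(sum(p(x) for x in range(3)))
```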
Probability Distribution
A probability distribution is a description that gives the probability for each value of the random variable. It is often expressed in the format of a table, formula, or graph.
Probability Histogram: Graph of a Probability Distribution
A probability histogram is similar to a relative frequency histogram, but the vertical scale shows probabilities instead of relative frequencies based on actual sample results.
[Figure: Probability Histogram for Number of Heads When Two Coins Are Tossed]
Random Variable
A random variable is a variable (typically represented by x) that has a single numerical value, determined by chance, for each outcome of a procedure.
Example: Devil of a Problem
Based on a Harris poll, 60% of adults believe in the devil. Assuming that we randomly select five adults, use Table A-1 to find the following:
a. The probability that exactly three of the five adults believe in the devil
b. The probability that the number of adults who believe in the devil is at least two
Solution
a. The excerpt from Table A-1 for n = 5 and p = 0.6 shows that the probability for x = 3 is given by P(3) = 0.346.
Example: Coin Toss
Let's consider tossing two coins, with the following random variable:
x = number of heads when two coins are tossed
The above x is a random variable because its numerical values depend on chance.
Expected Value
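The expected value of a discrete random variable x is denoted by E, and it is the mean value of the outcomes, so E = µ, and E can also be found by evaluating ∑[x · P(x)].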
Using Mean and Standard Deviation for Critical Thinking
For Binomial Distributions
•Mean: µ = np
•Variance: σ² = npq
•Standard deviation: σ = √(npq)
Example: Twitter
Given that there is a 0.85 probability that a randomly selected adult knows what Twitter is, use the binomial probability formula to find the probability that when five adults are randomly selected, exactly three of them know what Twitter is. That is, apply the binomial probability formula to find P(3) given that n = 5, x = 3, p = 0.85, and q = 0.15.
Solution
Using the given values of n, x, p, and q in the binomial probability formula, we get
P(3) = [5! / ((5 − 3)! 3!)] · 0.85³ · 0.15² = (10)(0.614125)(0.0225) = 0.138178 = 0.138 (rounded)
The probability of getting exactly three adults who know Twitter among five randomly selected adults is 0.138.
Example: Job Interview Mistakes
Hiring managers were asked to identify the biggest mistakes that job applicants make during an interview, and the table below is based on their responses (based on data from an Adecco survey). Does the table below describe a probability distribution?
Solution
The table violates the first requirement because x is not a numerical random variable. The "values" of x are categorical data, not numbers. The table also violates the second requirement because the sum of the probabilities is 1.57, but that sum should be 1. Because the three requirements are not all satisfied, we conclude that the table does not describe a probability distribution.
Methods for Finding Binomial Probabilities
Method 1: Binomial Probability Formula
P(x) = [n! / ((n − x)! x!)] · p^x · q^(n − x) for x = 0, 1, 2, …, n
where
n = number of trials
x = number of successes among n trials
p = probability of success in any one trial
q = probability of failure in any one trial (q = 1 − p)
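The formula translates directly into code. The following is a minimal sketch using Python's standard library, where `math.comb(n, x)` computes n! / ((n − x)! x!); the function name `binomial_pmf` is an illustrative choice, not something from the course material.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes among n trials with success probability p."""
    q = 1 - p  # probability of failure in any one trial
    return comb(n, x) * p**x * q**(n - x)


# Twitter example: n = 5, x = 3, p = 0.85
print(round(binomial_pmf(3, 5, 0.85), 3))  # 0.138
```

Because x and p must refer to the same category of success, the caller is responsible for passing a p that matches whatever x counts.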
The Rare Event Rule for Inferential Statistics
If, under a given assumption, the probability of a particular outcome is very small and the outcome occurs significantly less than or significantly greater than what we expect with that assumption, we conclude that the assumption is probably not correct.
Methods for Finding Binomial Probabilities
Method 2: Using Excel
Excel can be used to find binomial probabilities. The Excel screen displays (not reproduced here) list binomial probabilities for n = 5 and p = 0.85, as in the Twitter example. Notice that in each display, the probability distribution is given as a table.
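Where Excel is not at hand, a comparable table can be generated in a few lines of Python. This sketch is an alternative to the Excel displays, not a reproduction of them; it lists P(x) and the cumulative probability P(x or fewer) for n = 5 and p = 0.85, which are the two kinds of values Excel's BINOM.DIST function returns depending on its cumulative argument.

```python
from math import comb

n, p = 5, 0.85

rows = []          # each row: (x, P(x), P(x or fewer))
cumulative = 0.0
for x in range(n + 1):
    prob = comb(n, x) * p**x * (1 - p) ** (n - x)
    cumulative += prob
    rows.append((x, prob, cumulative))

print(" x    P(x)  P(x or fewer)")
for x, prob, cum in rows:
    print(f"{x:2d}  {prob:.4f}  {cum:.4f}")
```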
Methods for Finding Binomial Probabilities
Method 3: Using Table A-1 in Appendix A
This method can be skipped if technology is available. Table A-1 in Appendix A lists binomial probabilities for select values of n and p. It cannot be used if n > 8 or if the probability p is not one of the 13 values included in the table.
To use the table of binomial probabilities, first locate n and the desired corresponding value of x; this isolates one row of numbers. Then align that row with the desired value of p by using the column headings across the top. The number at that intersection is the desired probability. A very small probability, such as 0.000064, is indicated by 0+.
Identifying Significant Results with the Range Rule of Thumb
Range Rule of Thumb for Identifying Significant Values
•Significantly low values are (µ − 2σ) or lower.
•Significantly high values are (µ + 2σ) or higher.
•Values not significant: Between (µ − 2σ) and (µ + 2σ).
Parameters of a Probability Distribution
Remember that with a probability distribution, we have a description of a population instead of a sample, so the values of the mean, standard deviation, and variance are parameters, not statistics. The mean, variance, and standard deviation of a discrete probability distribution can be found with the following formulas:
•Mean, µ, for a probability distribution: µ = ∑[x · P(x)]
Notation for Binomial Probability Distributions
S and F (success and failure) denote the two possible categories of all outcomes.
P(S) = p (p = probability of a success)
P(F) = 1 − p = q (q = probability of a failure)
n - the fixed number of trials
x - a specific number of successes in n trials, so x can be any whole number between 0 and n, inclusive
p - probability of success in one of the n trials
q - probability of failure in one of the n trials
P(x) - probability of getting exactly x successes among the n trials
The word success as used here is arbitrary and does not necessarily represent something good. Either of the two possible categories may be called the success S as long as its probability is identified as p.
CAUTION: When using a binomial probability distribution, always be sure that x and p are consistent in the sense that they both refer to the same category being called a success.
Using Probabilities to Determine When Results Are Significantly High or Low
Significantly high number of successes: x successes among n trials is significantly high if the probability of x or more successes is 0.05 or less. That is, x is a significantly high number of successes if P(x or more) ≤ 0.05.
Significantly low number of successes: x successes among n trials is significantly low if the probability of x or fewer successes is 0.05 or less. That is, x is a significantly low number of successes if P(x or fewer) ≤ 0.05.
Range Rule of Thumb
•Significantly low values ≤ (µ − 2σ)
•Significantly high values ≥ (µ + 2σ)
•Values not significant: Between (µ − 2σ) and (µ + 2σ)
Example: Be a Better Bettor
Interpretation: The $5 bet in roulette results in an expected value of −26¢ and the $5 bet in craps results in an expected value of −7¢. Because you are better off losing 7¢ instead of losing 26¢, the craps game is better in the long run, even though the roulette game provides an opportunity for a larger payoff when playing the game once.
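The two expected values can be reproduced as ∑[x · P(x)] over the net gains of each bet. The sketch below rests on assumptions that are not stated in the text above: an American roulette wheel with 38 slots paying 35:1 on a single number, and a craps pass line bet that pays even money and wins with probability 244/495. Under those assumptions the results agree with the stated −26¢ and −7¢.

```python
from fractions import Fraction


def expected_value(outcomes):
    """Sum of x * P(x) over (net gain x in dollars, probability) pairs."""
    return sum(x * p for x, p in outcomes)


# Assumed: 38 slots, a $5 bet on one number wins $175 net or loses $5.
roulette = [(175, Fraction(1, 38)), (-5, Fraction(37, 38))]

# Assumed: the pass line wins $5 with probability 244/495, otherwise loses $5.
craps = [(5, Fraction(244, 495)), (-5, Fraction(251, 495))]

e_roulette = expected_value(roulette)
e_craps = expected_value(craps)

print(f"roulette: {float(e_roulette):.2f} dollars")  # -0.26
print(f"craps:    {float(e_craps):.2f} dollars")     # -0.07
```

Exact fractions are used so that the comparison between the two bets is not affected by floating-point rounding.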
Example: Finding the Mean, Variance, and Standard Deviation
Solution (continued)
The standard deviation is the square root of the variance, so σ = √0.50 = 0.707107 = 0.7 head (rounded).
Interpretation: When tossing two coins, the mean number of heads is 1.0 head, the variance is 0.50 heads², and the standard deviation is 0.7 head. Also, the expected value for the number of heads when two coins are tossed is 1.0 head, which is the same value as the mean. If we were to collect data on a large number of trials with two coins tossed in each trial, we expect to get a mean of 1.0 head.
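The same three parameters can be computed directly from the two-coin distribution. This is a sketch that assumes the standard definitions µ = ∑[x · P(x)] and σ² = ∑[(x − µ)² · P(x)]; the variable names are illustrative.

```python
from math import sqrt

# Number of heads when two coins are tossed: {x: P(x)}
distribution = {0: 0.25, 1: 0.50, 2: 0.25}

# Mean: sum of x * P(x)
mu = sum(x * p for x, p in distribution.items())

# Variance: sum of (x - mu)^2 * P(x)
variance = sum((x - mu) ** 2 * p for x, p in distribution.items())

# Standard deviation: square root of the variance
sigma = sqrt(variance)

print(mu, variance, round(sigma, 1))  # 1.0 0.5 0.7
```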
Example: Finding the Mean, Variance, and Standard Deviation
Solution
The two columns at the left of the table describe the probability distribution. The two columns at the right are for the purposes of the calculations required.
Example: Devil of a Problem
Solution b. The phrase "at least two" successes means that the number of successes is 2 or 3 or 4 or 5.
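Part (b) can be finished by adding the four probabilities, or by subtracting P(0) and P(1) from 1. The sketch below computes the values with the binomial probability formula instead of reading them from Table A-1, so it is a substitute for the table lookup; the helper name `pmf` is illustrative.

```python
from math import comb

n, p = 5, 0.6   # five adults, 60% believe in the devil


def pmf(x):
    """P(exactly x of the n adults believe in the devil)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# "At least two" means x = 2, 3, 4, or 5.
at_least_two = sum(pmf(x) for x in range(2, n + 1))

# Equivalent route through the complement: 1 - P(0) - P(1).
complement = 1 - (pmf(0) + pmf(1))

print(round(at_least_two, 3))  # 0.913
```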
Parameters of a Probability Distribution
Standard deviation, σ, for a probability distribution: σ = √(∑[x² · P(x)] − µ²)
Key Concept
The focus of this section is the binomial probability distribution and methods for finding probabilities. Easy methods for finding the mean and standard deviation of a binomial distribution are also presented. As in other sections, we stress the importance of interpreting probability values to determine whether events are significantly low or significantly high.
Parameters of a Probability Distribution
Variance, σ², for a probability distribution: σ² = ∑[(x − µ)² · P(x)], or equivalently (shortcut form) σ² = ∑[x² · P(x)] − µ²
Example: Finding the Mean, Variance, and Standard Deviation
The table describes the probability distribution for the number of heads when two coins are tossed. Find the mean, variance, and standard deviation for the probability distribution described.
Key Concept
This section introduces the concept of a random variable and the concept of a probability distribution. We illustrate how a probability histogram is a graph that visually depicts a probability distribution. We show how to find the important parameters of mean, standard deviation, and variance for a probability distribution. Most importantly, we describe how to determine whether outcomes are significant (significantly low or significantly high).
Example: Identifying Significant Results with the Range Rule of Thumb
We found that when tossing two coins, the mean number of heads is µ = 1.0 head and the standard deviation is σ = 0.7 head. Use those results and the range rule of thumb to determine whether 2 heads is a significantly high number of heads.
Solution
Using the range rule of thumb, the outcome of 2 heads is significantly high if it is greater than or equal to (µ + 2σ). With µ = 1.0 head and σ = 0.7 head, we get
(µ + 2σ) = 1 + 2(0.7) = 2.4 heads
Significantly high numbers of heads are 2.4 and above.
Interpretation: Based on these results, we conclude that 2 heads is not a significantly high number of heads (because 2 is not greater than or equal to 2.4).
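The range rule of thumb is simple enough to express as a small function. This is a minimal sketch; the function name `classify` and its return strings are illustrative choices.

```python
def classify(value, mu, sigma):
    """Apply the range rule of thumb to a value, given the mean and standard deviation."""
    if value <= mu - 2 * sigma:
        return "significantly low"
    if value >= mu + 2 * sigma:
        return "significantly high"
    return "not significant"


# Two coins: mu = 1.0 head, sigma = 0.7 head, so the upper boundary is 2.4.
print(classify(2, 1.0, 0.7))        # not significant

# Overtime games: mu = 230.0, sigma = 10.7, so the upper boundary is 251.4.
print(classify(252, 230.0, 10.7))   # significantly high
```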
Example: Overtime Rule in Football
We previously noted that between 1974 and 2011, there were 460 NFL football games decided in overtime, and 252 of them were won by the team that won the overtime coin toss. Is the result of 252 wins in the 460 games consistent with random chance, or is 252 wins significantly high? We can answer that question by finding the probability of 252 wins or more in 460 games, assuming that wins and losses are equally likely.
Solution
Using the notation for binomial probabilities, we have n = 460, p = 0.5, q = 0.5, and we want to find the sum of all probabilities for each value of x from 252 through 460. The formula is not practical here, because we would need to apply it 209 times. Table A-1 (Binomial Probabilities) doesn't apply because n = 460, which is far beyond the scope of that table. Instead, we use Excel.
The Excel display (not reproduced here) shows that the probability of 251 or fewer wins in 460 overtime games is 0.978 (rounded). The probability of 252 or more wins in 460 overtime games is therefore 1 − 0.978 = 0.022, which is low (less than 0.05). This shows that it is unlikely that we would get 252 or more wins by chance. If we effectively rule out chance, we are left with the more reasonable explanation that the team winning the overtime coin toss has a better chance of winning the game.
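Applying the formula 209 times is tedious by hand but trivial for a computer. The following sketch is an alternative to the Excel display, not a reproduction of it. It relies on the fact that with p = q = 0.5 every sequence of 460 outcomes has probability 0.5^460, so P(x) = C(460, x) / 2^460; the variable names are illustrative.

```python
from math import comb

n = 460            # overtime games
observed = 252     # wins by the team that won the coin toss

# P(252 or more) = sum of C(n, x) for x = 252..460, divided by 2**n.
# Exact integer arithmetic is used until the final division.
tail = sum(comb(n, x) for x in range(observed, n + 1)) / 2**n

print(round(tail, 3))
print("significantly high?", tail <= 0.05)
```

The result should agree with the 0.022 obtained from Excel, up to rounding.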
Example: Twitter
When an adult is randomly selected (with replacement), there is a 0.85 probability that this person knows what Twitter is (based on results from a Pew Research Center survey). Suppose that we want to find the probability that exactly three of five randomly selected adults know what Twitter is.
a. Does this procedure result in a binomial distribution?
b. If this procedure does result in a binomial distribution, identify the values of n, x, p, and q.
Solution
a. This procedure does satisfy the requirements for a binomial distribution, as shown below.
1. The number of trials (5) is fixed.
2. The 5 trials are independent because the probability of any adult knowing Twitter is not affected by results from other selected adults.
3. Each of the 5 trials has two categories of outcomes: The selected person knows what Twitter is or that person does not know what Twitter is.
4. For each randomly selected adult, there is a 0.85 probability that this person knows what Twitter is, and that probability remains the same for each of the five selected people.
b. Having concluded that the given procedure does result in a binomial distribution, we now proceed to identify the values of n, x, p, and q.
1. With five randomly selected adults, we have n = 5.
2. We want the probability of exactly three who know what Twitter is, so x = 3.
3. The probability of success (getting a person who knows what Twitter is) for one selection is 0.85, so p = 0.85.
4. The probability of failure (not getting someone who knows what Twitter is) is 0.15, so q = 0.15.
Again, it is very important to be sure that x and p both refer to the same concept of "success." In this example, we use x to count the number of people who know what Twitter is, so p must be the probability that the selected person knows what Twitter is. Therefore, x and p do use the same concept of success: knowing what Twitter is.
Example: Be a Better Bettor
Which of the preceding two bets is better in the sense of producing higher expected value?
Example: Coin Toss
With two coins tossed, the number of heads can be 0, 1, or 2, and the table is a probability distribution because it gives the probability for each value of the random variable x and it satisfies the three requirements listed earlier:
1. The variable x is a numerical random variable, and its values are associated with probabilities.
2. ∑P(x) = 0.25 + 0.50 + 0.25 = 1
3. Each value of P(x) is between 0 and 1.
The random variable x in the table is a discrete random variable because it has three possible values (0, 1, 2), which is a finite number of values.
Example: Be a Better Bettor
You have $5 to place on a bet in the Golden Nugget casino in Las Vegas. You have narrowed your choice to one of two bets:
•Roulette: Bet on the number 7 in roulette.
•Craps: Bet on the "pass line" in the dice game of craps.