Ch 5: Discrete Probability Distributions
Jacob Bernoulli Binomial Random Experiment
An urn contains black and white pebbles.
-Every pebble represents a subject.
-A subject can have one of two characteristics (white or black).
-The proportion of black pebbles is unknown (the urn is not transparent).
-To estimate the proportion of black pebbles, a sample of pebbles is selected.
-A binomial random experiment is like drawing pebbles from an urn.
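The urn analogy can be sketched as a small simulation. This is only an illustration under stated assumptions: the urn contents (30 black, 70 white) are hypothetical, pebbles are drawn with replacement so the chance of black stays constant from draw to draw, and the name `estimate_black_proportion` is made up for this example.

```python
import random

def estimate_black_proportion(urn, sample_size, seed=0):
    """Draw pebbles with replacement and return the sample proportion of black."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    draws = [rng.choice(urn) for _ in range(sample_size)]
    return draws.count("black") / sample_size

# Hypothetical urn: 30 black and 70 white pebbles (true proportion 0.30).
urn = ["black"] * 30 + ["white"] * 70
estimate = estimate_black_proportion(urn, sample_size=1000)
print(estimate)
```

With a large sample, the printed estimate should land close to the true proportion of the hypothetical urn.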
Continuous Random Variables
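Continuous random variables can take on any value in an interval and cannot be enumerated. For a continuous random variable, Pr(X = x) = 0 for every individual value x, so probabilities are assigned to ranges of values rather than to single points.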
Discrete Random Variables
Discrete random variables have a countable number of possible values. The number of values may be infinite, but there are gaps between them and we can enumerate them (0, 1, 2, 3, ...).
Probability Distribution
Every random variable has a corresponding probability distribution. A probability distribution applies the theory of probability to describe the behavior of a random variable. In the continuous case, it allows us to determine the probabilities associated with specified ranges of values. In the discrete case, it specifies all possible outcomes along with the probability that each will occur. The probabilities represent the relative frequency of occurrence of each outcome x in a large number of trials repeated under essentially identical conditions; they can also be thought of as the relative frequencies associated with an infinitely large sample. A discrete distribution is an exhaustive list of all possible values, hence the probabilities must sum to 1. In a probability histogram, the area of each vertical bar represents P(X = x).
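The requirement that a discrete distribution lists every outcome with probabilities summing to 1 can be checked mechanically. The sketch below is illustrative only: the function name `is_valid_distribution` and both example distributions are hypothetical, and exact fractions are used to avoid floating-point rounding in the sum.

```python
from fractions import Fraction

def is_valid_distribution(pmf):
    """Return True if every probability lies in [0, 1] and they sum to exactly 1."""
    probs = list(pmf.values())
    return all(0 <= p <= 1 for p in probs) and sum(probs) == 1

# Hypothetical examples: a fair six-sided die, and an incomplete list of outcomes.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
incomplete = {0: Fraction(1, 2), 1: Fraction(1, 3)}

print(is_valid_distribution(fair_die))    # True
print(is_valid_distribution(incomplete))  # False
```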
Binomial Distribution Continued
If X is binomially distributed, then
$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$
where
-x = number of successes of interest (x = 0, 1, 2, ..., n)
-p = P(success) on each trial
-n = number of trials
Expected value (mean): E(X) = μ = np
Variance: Var(X) = σ² = np(1-p)
A binomial random variable with n = 1 is a special case, also called a Bernoulli trial.
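As a quick numerical check of the binomial probability, mean, and variance, the following sketch evaluates $\binom{n}{x} p^x (1-p)^{n-x}$ for every x and computes the moments directly from those probabilities. The values n = 10 and p = 0.3 are hypothetical, and `binomial_pmf` is a name chosen for this example.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.3  # hypothetical values chosen for illustration
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                            # should be 1
mean = sum(x * px for x, px in enumerate(pmf))              # should match n*p
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))  # should match n*p*(1-p)

print(round(total, 10), round(mean, 10), round(variance, 10))
```

Up to floating-point rounding, the mean and variance computed from the probabilities should agree with np = 3 and np(1-p) = 2.1.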
Permutations
In most biostatistical applications, we sample without replacement, so we are concerned only with the last two sampling strategies. Here we sample without replacement and we care about order: S = { (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,3), (2,4), (2,5), (2,6), (3,1), (3,2), (3,4), (3,5), (3,6), (4,1), (4,2), (4,3), (4,5), (4,6), (5,1), (5,2), (5,3), (5,4), (5,6), (6,1), (6,2), (6,3), (6,4), (6,5) } These are permutations, or ordered arrangements. The number of permutations of size r that can be made from N objects is given by $_NP_r = \frac{N!}{(N-r)!}$. For the sample space above, N = 6 and r = 2, so there are 6!/4! = 30 ordered pairs.
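The ordered sample space for choosing 2 of 6 objects can be enumerated and compared with the count N!/(N-r)!. This is a minimal sketch using Python's standard library; the variable names are chosen for this example.

```python
from itertools import permutations
from math import factorial, perm

N, r = 6, 2

# Every ordered selection of r distinct objects from 1..N (no replacement).
ordered_pairs = list(permutations(range(1, N + 1), r))

count_by_formula = factorial(N) // factorial(N - r)

print(len(ordered_pairs), count_by_formula, perm(N, r))  # 30 30 30
```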
Combinations
In most biostatistical applications, we sample without replacement, so we are concerned only with the last two sampling strategies. Here we sample without replacement and we don't care about order: S = { (1,2), (1,3), (1,4), (1,5), (1,6), (2,3), (2,4), (2,5), (2,6), (3,4), (3,5), (3,6), (4,5), (4,6), (5,6) } These are combinations, or unordered arrangements. The number of combinations of size r that can be made from N objects is given by $_NC_r = \binom{N}{r} = \frac{N!}{r!\,(N-r)!}$. For the sample space above, N = 6 and r = 2, so there are 6!/(2! 4!) = 15 unordered pairs.
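The unordered sample space for choosing 2 of 6 objects can likewise be enumerated and compared with N!/(r!(N-r)!). This is a minimal sketch using Python's standard library; the variable names are chosen for this example.

```python
from itertools import combinations
from math import comb, factorial

N, r = 6, 2

# Every unordered selection of r distinct objects from 1..N (no replacement).
unordered_pairs = list(combinations(range(1, N + 1), r))

count_by_formula = factorial(N) // (factorial(r) * factorial(N - r))

print(len(unordered_pairs), count_by_formula, comb(N, r))  # 15 15 15
```

Each unordered pair corresponds to r! = 2 ordered pairs, which is why the count is half the number of permutations of the same size.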
Mean and SD of Discrete RV
Multiply each x value by its probability and add the results to get the mean: μ_x = Σ x·P(X = x) = 1.41. Variance: σ_x² = Σ (x − μ_x)²·P(X = x) = 1.4019. Standard deviation: σ_x = √1.4019 ≈ 1.184.
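The same three-step calculation (mean, variance, standard deviation) can be sketched in code. The probability table behind the numbers above is not reproduced in these notes, so the distribution below is hypothetical and is not expected to give the same values; `mean_var_sd` is a name made up for this example.

```python
from math import sqrt

def mean_var_sd(pmf):
    """Return (mean, variance, standard deviation) of a discrete distribution.

    pmf maps each possible value x to its probability P(X = x).
    """
    mean = sum(x * p for x, p in pmf.items())
    variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return mean, variance, sqrt(variance)

# Hypothetical distribution, NOT the table from the notes.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}
mean, variance, sd = mean_var_sd(pmf)
print(round(mean, 4), round(variance, 4), round(sd, 4))
```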
Summary
Probability distributions show the probability associated with each possible outcome. The binomial distribution is used to find the probability of x successes in n trials when the trials are independent and each trial has the same probability of success.
Cumulative Distribution Function
The CDF of a random variable X is $F(x) = P(X \le x)$.
For a discrete random variable:
1. The CDF has a "jump" at each possible value, equal in size to the probability of that value.
2. The graph of the CDF is a step function.
3. The graph increases from a minimum of 0 to a maximum of 1.
Continuous random variables have continuous CDFs with no jumps.
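The step-function behavior of a discrete CDF can be seen by summing probabilities up to a given point. The sketch below assumes a hypothetical distribution on the values 0, 1, 2; the function name `cdf` is chosen for this example.

```python
def cdf(pmf, x):
    """F(x) = P(X <= x): add up the probabilities of all values not exceeding x."""
    return sum(p for value, p in pmf.items() if value <= x)

# Hypothetical distribution on 0, 1, 2.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

# The CDF is flat between possible values and jumps at each one.
for x in (-1, 0, 0.5, 1, 1.5, 2, 3):
    print(x, cdf(pmf, x))
```

The size of each jump (for example, from 0.25 to 0.75 at x = 1) equals the probability of the value where the jump occurs.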
Binomial Distribution
The binomial distribution is a probability model that applies when the following are true:
1. Each performance of an experiment results in 1 of only 2 possible outcomes, usually denoted Success or Failure.
2. The probability that the outcome is a success is constant from experiment to experiment.
3. The experiments are independent.
Examples in which X follows a binomial distribution:
-Flip a coin n times. Let X = {number of heads}.
-Provide a medical treatment to n subjects. Record whether each subject survives. Let X = {number of survivors}.
-Ask survey respondents: "Do you believe in capital punishment?" Let X = {number who answer "Yes"}.
Combinations and Permutations
What if we randomly select 2 people instead of 1?
-What is P(at least one is a man)?
-All selections are still equally likely, so P(at least one M) = (# ways to get at least 1 M)/(# ways to pick 2)
But these numbers depend on how we select the 2 people:
-Can the same person be selected twice? (No = sampling without replacement; Yes = sampling with replacement)
-Do we care who was selected first and who second?
R code
factorial: factorial(N)
permutation: factorial(N)/factorial(N - r)
combination: choose(N, r)
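The effect of the sampling strategy on P(at least one M) can be sketched by enumerating every equally likely selection. The group below (2 men, 3 women) is hypothetical, as is the helper name `prob_at_least_one_man`; exact fractions are used so the results are not affected by rounding.

```python
from fractions import Fraction
from itertools import combinations, permutations, product

# Hypothetical group: names starting with "M" are men, "W" are women.
people = ["M1", "M2", "W1", "W2", "W3"]

def prob_at_least_one_man(selections):
    """(# equally likely selections with at least 1 man) / (# selections)."""
    selections = list(selections)
    favorable = sum(
        1 for s in selections if any(name.startswith("M") for name in s)
    )
    return Fraction(favorable, len(selections))

# Without replacement, order matters (permutations).
p_ordered = prob_at_least_one_man(permutations(people, 2))
# Without replacement, order ignored (combinations).
p_unordered = prob_at_least_one_man(combinations(people, 2))
# With replacement, order matters (same person may be picked twice).
p_with_replacement = prob_at_least_one_man(product(people, repeat=2))

print(p_ordered, p_unordered, p_with_replacement)  # 7/10 7/10 16/25
```

For this hypothetical group, counting ordered or unordered selections gives the same probability without replacement, while allowing the same person to be picked twice changes the answer.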
Binomial Distribution Right Skew
When p<0.5
Symmetrical Binomial Distribution
When p = 0.5. For a fixed n, the variance np(1-p) is largest at p = 0.5.
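The claim about the variance can be checked numerically by evaluating np(1-p) over a grid of p values. The choice n = 20 and the grid spacing of 0.01 are hypothetical values picked for illustration.

```python
n = 20  # hypothetical number of trials

# Evaluate the variance n*p*(1-p) at p = 0.00, 0.01, ..., 1.00.
grid = [k / 100 for k in range(101)]
variances = {p: n * p * (1 - p) for p in grid}

best_p = max(variances, key=variances.get)
print(best_p, variances[best_p])  # 0.5 5.0
```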
Binomial Distribution Left Skew
When p>0.5
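The direction of skew for different values of p can be checked by computing the skewness, E[(X - μ)³]/σ³, directly from the binomial probabilities. In this sketch n = 10 is a hypothetical choice, `binomial_skewness` is a name made up for the example, and the closed form (1 - 2p)/√(np(1-p)) is the standard expression for binomial skewness, printed alongside for comparison.

```python
from math import comb, sqrt

def binomial_skewness(n, p):
    """Skewness computed from the probabilities: E[(X - mu)^3] / sigma^3."""
    pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
    mu = sum(x * px for x, px in enumerate(pmf))
    var = sum((x - mu) ** 2 * px for x, px in enumerate(pmf))
    third = sum((x - mu) ** 3 * px for x, px in enumerate(pmf))
    return third / var**1.5

n = 10  # hypothetical number of trials
for p in (0.2, 0.5, 0.8):
    closed_form = (1 - 2 * p) / sqrt(n * p * (1 - p))
    print(p, round(binomial_skewness(n, p), 4), round(closed_form, 4))
```

A positive value indicates a right skew (p < 0.5), a negative value a left skew (p > 0.5), and a value of essentially zero a symmetric distribution (p = 0.5).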
Pascal's Triangle
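The values nCr are also called binomial coefficients. We can use a device called Pascal's triangle to find them: row n of the triangle lists nC0, nC1, ..., nCn, and each interior entry is the sum of the two entries directly above it.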
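Pascal's triangle can be built row by row using only addition. The sketch below is one simple way to do it; the function name `pascal_rows` is chosen for this example, it assumes at least one row is requested, and the result is compared against `math.comb`.

```python
from math import comb

def pascal_rows(n_rows):
    """Return the first n_rows rows of Pascal's triangle (n_rows >= 1)."""
    rows = [[1]]
    for _ in range(n_rows - 1):
        prev = rows[-1]
        # Each interior entry is the sum of the two entries above it.
        rows.append([1] + [a + b for a, b in zip(prev, prev[1:])] + [1])
    return rows

rows = pascal_rows(6)
for row in rows:
    print(row)

# Row n, position r holds the binomial coefficient nCr.
matches = all(rows[n][r] == comb(n, r) for n in range(6) for r in range(n + 1))
print(matches)  # True
```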