Ch. 3
Two types of random variables
Discrete random variables: often take only integer values. Ex: a year, number of people, etc. Continuous random variables: take real (decimal) values. Ex: cost of books, difference in cost of books, etc.
Table proportions
Summarize joint probabilities
Variance equation for variability in linear combinations of random variables
Var(aX + bY) = a^2 × Var(X) + b^2 × Var(Y). This assumes the random variables X and Y are independent; if they are not, the equation must be modified.
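A minimal sketch that checks the variance equation by brute force. The two distributions and the coefficients a and b are hypothetical values chosen only for illustration; the helper names `mean` and `variance` are not from the notes.

```python
from itertools import product

def mean(dist):
    """Expected value of a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Variance of a discrete distribution given as {value: probability}."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# Hypothetical distributions and coefficients (assumptions for illustration).
X = {0: 0.5, 1: 0.5}
Y = {1: 0.2, 2: 0.3, 3: 0.5}
a, b = 3, -2

# Build the distribution of aX + bY. Independence is what lets us
# multiply the two probabilities to get each joint probability.
combo = {}
for (x, px), (y, py) in product(X.items(), Y.items()):
    value = a * x + b * y
    combo[value] = combo.get(value, 0.0) + px * py

direct = variance(combo)                                    # computed from aX + bY itself
formula = a ** 2 * variance(X) + b ** 2 * variance(Y)       # computed from the equation
print(direct, formula)
```

The two numbers should agree up to floating-point rounding. If X and Y were dependent, the joint probabilities could not be formed by multiplication and the equation would not apply as written.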
Complement
The complement of event A is denoted A^c, and A^c represents all outcomes not in A. A and A^c are mathematically related: P(A) + P(A^c) = 1, i.e. P(A) = 1 − P(A^c).
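A small sketch of the complement relationship, using the fair six-sided die and the event A = {1, 2} that appear elsewhere in these notes. The helper name `prob` is an assumption for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # faces of a fair die

def prob(event):
    """Probability of an event (a set of faces), assuming each face is equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {1, 2}
A_c = sample_space - A  # complement: every outcome not in A

print(A_c, prob(A), prob(A_c), prob(A) + prob(A_c))
```

Using `Fraction` keeps the arithmetic exact, so the sum is exactly 1 rather than a floating-point approximation.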
When bins are extremely thin
The histogram begins to resemble a smooth curve. This works best with continuous numerical variables, whose values can fill extremely slim bins.
marginal probabilities
the probabilities based on a single variable without regard to any other variables
Sample space
the set of all possible outcomes
If two events are independent,
then knowing the outcome of one should provide no information about the other. We can show this is mathematically true using conditional probabilities.
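One way to see this with conditional probabilities is to enumerate two fair dice, X and Y, and compare P(Y = 1 | X = 1) with P(Y = 1). This is only a sketch; the helper name `prob` is an assumption for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (x, y) results of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability that a roll (x, y) satisfies the given condition."""
    favorable = [o for o in outcomes if event(o)]
    return Fraction(len(favorable), len(outcomes))

p_x1 = prob(lambda o: o[0] == 1)
p_y1 = prob(lambda o: o[1] == 1)
p_both = prob(lambda o: o[0] == 1 and o[1] == 1)

# Conditional probability: P(Y = 1 | X = 1) = P(Y = 1 and X = 1) / P(X = 1)
p_y1_given_x1 = p_both / p_x1

print(p_y1_given_x1, p_y1)
```

Conditioning on X = 1 leaves the probability of Y = 1 unchanged, which is what independence means.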
Tree Diagrams
A diagram that is shaped like a tree and is used to show the outcomes of a situation or experiment.
2 important concepts about combinations of random variables:
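(1) A final value can sometimes be described as the sum of its parts in an equation. (2) Plugging the average of each part into that equation gives the average of the total; this is guaranteed to be true for linear combinations of random variables (for example, aX + bY, a weighted sum of two random variables).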
sets or collections
A group of data
General Multiplication Rule
For events A and B that might be dependent, the probability that both occur is: P(A and B) = P(A|B) × P(B)
Variance of random variables
The summation limits i = 1 and R (sometimes written K) mean: add up the terms for every possible value, from i = 1 through i = R.
(a) Verify the probability of event A, P(A), is 1/3 using the Addition Rule. (b) Do the same for event B. On a dice: A: 1,2 B: 4,6
(a) P(A) = P(1 or 2) = P(1) + P(2) = 1/6 + 1/6 = 2/6 = 1/3. (b) Similarly, P(B) = 1/3.
Suppose 5 people are selected at random. (a) What is the probability that all are right-handed? (b) What is the probability that all are left-handed? (c) What is the probability that not all of the people are right-handed?
(a) The abbreviations RH and LH are used for right-handed and left-handed, respectively. Since the people are independent, we apply the Multiplication Rule for independent processes: P(all five are RH) = P(first = RH, second = RH, ..., fifth = RH) = P(first = RH) × P(second = RH) × ··· × P(fifth = RH) = 0.91 × 0.91 × 0.91 × 0.91 × 0.91 = 0.624. (b) Using the same reasoning as in (a), 0.09 × 0.09 × 0.09 × 0.09 × 0.09 = 0.0000059. (c) Use the complement, P(all five are RH), to answer this question: P(not all RH) = 1 − P(all RH) = 1 − 0.624 = 0.376.
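The three answers can be reproduced in a few lines, using the 9% left-handed figure stated in these notes and treating the five people as independent. The variable names are assumptions for illustration.

```python
p_lh = 0.09        # P(left-handed), from the notes
p_rh = 1 - p_lh    # P(right-handed), assuming nobody is ambidextrous
n = 5              # number of people selected

p_all_rh = p_rh ** n           # (a) Multiplication Rule for independent processes
p_all_lh = p_lh ** n           # (b) same reasoning
p_not_all_rh = 1 - p_all_rh    # (c) complement of "all five are RH"

print(round(p_all_rh, 3), p_all_lh, round(p_not_all_rh, 3))
```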
If we sample from a small population without replacement
we no longer have independence between our observations
We are interested in the probability of rolling a 1, 4, or 5. (a) Explain why the outcomes 1, 4, and 5 are disjoint. (b) Apply the Addition Rule for disjoint outcomes to determine P(1 or 4 or 5).
(a) The random process is a die roll, and at most one of these outcomes can come up. This means they are disjoint outcomes. (b) P(1 or 4 or 5) = P(1) + P(4) + P(5) = 1/6 + 1/6 + 1/6 = 3/6 = 1/2.
Events A = {1, 2} and B = {4, 6} are shown in Figure 3.2 on page 84. (a) Write out what A^c and B^c represent. (b) Compute P(A^c) and P(B^c). (c) Compute P(A) + P(A^c) and P(B) + P(B^c).
(a) A^c = {3, 4, 5, 6} and B^c = {1, 2, 3, 5}. (b) Noting that each outcome is disjoint, add the individual outcome probabilities to get P(A^c) = 2/3 and P(B^c) = 2/3. (c) A and A^c are disjoint, and the same is true of B and B^c. Therefore, P(A) + P(A^c) = 1 and P(B) + P(B^c) = 1.
(a) Using Figure 3.2 as a reference, what outcomes are represented by event D? (b) Are events B and D disjoint? (c) Are events A and D disjoint? A: 1,2 B: 4,6 D: 2,3
(a) Outcomes 2 and 3. (b) Yes, events B and D are disjoint because they share no outcomes. (c) The events A and D share an outcome in common, 2, and so are not disjoint.
(a) Compute P(D^c) = P(rolling a 1, 4, 5, or 6). (b) What is P(D) + P(D^c)?
(a) The outcomes are disjoint and each has probability 1/6, so the total probability is 4/6 = 2/3. (b) We can also see that P(D) = 1/6 + 1/6 = 1/3. Since D and D^c are disjoint, P(D) + P(D^c) = 1.
About 9% of people are left-handed. Suppose 2 people are selected at random from the U.S. population. Because the sample size of 2 is very small relative to the population, it is reasonable to assume these two people are independent. (a) What is the probability that both are left-handed? (b) What is the probability that both are right-handed?
(a) The probability the first person is left-handed is 0.09, which is the same for the second person. We apply the Multiplication Rule for independent processes to determine the probability that both will be left-handed: 0.09×0.09 = 0.0081. (b) It is reasonable to assume the proportion of people who are ambidextrous (both right- and left-handed) is nearly 0, which results in P(right-handed) = 1 − 0.09 = 0.91. Using the same reasoning as in part (a), the probability that both will be right-handed is 0.91 × 0.91 = 0.8281.
In the loans data set in Chapter 2, the homeownership variable described whether the borrower rents, has a mortgage, or owns her property. Of the 10,000 borrowers, 3858 rented, 4789 had a mortgage, and 1353 owned their home. (a) Are the outcomes rent, mortgage, and own disjoint? (b) Determine the proportion of loans with value mortgage and own separately. (c) Use the Addition Rule for disjoint outcomes to compute the probability a randomly selected loan from the data set is for someone who has a mortgage or owns her home.
(a) Yes. Each loan is categorized in only one level of homeownership. (b) Mortgage: 4789/10000 = 0.479. Own: 1353/10000 = 0.135. (c) P(mortgage or own) = P(mortgage) + P(own) = 0.479 + 0.135 = 0.614.
Rules for probability distributions
1. The outcomes listed must be disjoint. 2. Each probability must be between 0 and 1. 3. The probabilities must total 1.
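A rough sketch of how rules 2 and 3 could be checked in code. The function name `is_valid_distribution` is hypothetical, and rule 1 (disjoint outcomes) is assumed to hold because each dictionary key stands for a distinct, non-overlapping outcome. The example uses the homeownership counts from these notes.

```python
import math

def is_valid_distribution(probs):
    """Check rules 2 and 3 for a {outcome: probability} mapping.

    Rule 1 (disjoint outcomes) is assumed: each key is a separate outcome.
    """
    in_range = all(0 <= p <= 1 for p in probs.values())   # rule 2
    sums_to_one = math.isclose(sum(probs.values()), 1.0)  # rule 3
    return in_range and sums_to_one

# Homeownership proportions out of 10,000 loans (counts from the notes).
homeownership = {"rent": 3858 / 10000, "mortgage": 4789 / 10000, "own": 1353 / 10000}

print(is_valid_distribution(homeownership))
print(is_valid_distribution({"a": 0.5, "b": 0.6}))  # total exceeds 1
```

`math.isclose` is used for the total because sums of decimal fractions are rarely exactly 1.0 in floating point.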
Law of large numbers
As more observations are collected, the proportion p̂_n of occurrences with a particular outcome converges to the probability p of that outcome.
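A simulation sketch of this idea: roll a fair die many times and track the running proportion of 1s, which should settle near 1/6. The function name, the seed, and the number of rolls are all assumptions chosen for illustration.

```python
import random

def running_proportion(n_rolls, target=1, seed=0):
    """Return the running proportion of `target` after each of n_rolls fair die rolls."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    proportions = []
    for n in range(1, n_rolls + 1):
        if rng.randint(1, 6) == target:
            hits += 1
        proportions.append(hits / n)  # this is p-hat after n rolls
    return proportions

props = running_proportion(100_000)

# Early proportions bounce around; later ones stay close to 1/6.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, props[n - 1])
```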
Probability of rolling a 1 the first and second time.
Because the rolls are independent, the probabilities of the corresponding outcomes can be multiplied to get the final answer: (1/6)×(1/6) = 1/36. This can be generalized to many independent processes
In the loans data set describing 10,000 loans, 1495 loans were from joint applications (e.g. a couple applied together), 4789 applicants had a mortgage, and 950 had both of these characteristics. Create a Venn diagram for this setup
Both the counts and corresponding probabilities (e.g. 3839/10000 = 0.384) are shown. Notice that the number of loans represented in the left circle corresponds to 3839 + 950 = 4789, and the number represented in the right circle is 950 + 545 = 1495.
"or"
Can mean and/or; A or B means A, B, or both
disjoint (mutually exclusive)
Two events are disjoint if they share no outcomes in common. If A and B are disjoint, then knowing that A occurs tells us that B cannot occur. Disjoint events are also called "mutually exclusive."
probability density function
A smooth curve (a density) that describes the distribution of a continuous variable; special property: total area under the curve = 1
General addition rule
If A and B are any two events, disjoint or not, then the probability that at least one of them will occur is: P(A or B) = P(A) + P(B) − P(A and B)
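As a worked check, the rule can be applied to the loans counts given in these notes (1495 joint applications, 4789 mortgages, 950 with both, out of 10,000 loans). The variable names are assumptions for illustration.

```python
from fractions import Fraction

total = 10_000
p_joint = Fraction(1495, total)     # P(joint application)
p_mortgage = Fraction(4789, total)  # P(mortgage)
p_both = Fraction(950, total)       # P(joint application and mortgage)

# General Addition Rule: subtract the overlap so it is not counted twice.
p_either = p_joint + p_mortgage - p_both

# Cross-check by adding the three non-overlapping Venn diagram regions.
venn_check = Fraction(3839 + 950 + 545, total)

print(float(p_either), float(venn_check))
```

Both routes give the same value, since the Venn regions 3839, 950, and 545 are exactly the pieces that the rule combines.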
(a) If A and B are disjoint, describe why this implies P(A and B) = 0. (b) Using part (a), verify that the General Addition Rule simplifies to the simpler Addition Rule for disjoint events if A and B are disjoint.
(a) If A and B are disjoint, A and B can never occur simultaneously. (b) If A and B are disjoint, then the last term, P(A and B), in the General Addition Rule formula is 0 (see part (a)), and we are left with the Addition Rule for disjoint events.
Multiplication Rule for independent processes
If A and B represent events from two different and independent processes, then the probability that both A and B occur can be calculated as the product of their separate probabilities: P(A and B) = P(A) × P(B)
Addition rule of disjoint outcomes
If A1 and A2 represent two disjoint outcomes, then the probability that one of them occurs is given by: P(A1 or A2) = P(A1) + P(A2)
Marginal and joint probabilities
If a probability is based on a single variable, it is a marginal probability. The probability of outcomes for two or more variables or processes is called a joint probability.
EXAMPLE: A "die", the singular of dice, is a cube with six faces numbered 1, 2, 3, 4, 5, and 6. What is the chance of getting 1 when rolling a die?
If the die is fair, then the chance of a 1 is as good as the chance of any other number. Since there are six outcomes, the chance must be 1-in-6 or, equivalently, 1/6.
Sum of conditional probabilities
Let A1, ..., Ak represent all the disjoint outcomes for a variable or process. Then if B is an event, possibly for another variable or process, we have: P(A1|B) + ··· + P(Ak|B) = 1
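The rule says that the conditional probabilities of all the disjoint outcomes A1, ..., Ak, given the same event B, add up to 1. A sketch with two fair dice: the outcomes A1, ..., A6 are the six possible values of the first die, and B is a hypothetical event ("the two dice sum to 10 or more") chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first, second) results of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

# Hypothetical conditioning event B: the sum is at least 10.
b_outcomes = [o for o in outcomes if o[0] + o[1] >= 10]

# P(first die = k | B) for each k: count within B, divide by the size of B.
conditional = {
    k: Fraction(sum(1 for o in b_outcomes if o[0] == k), len(b_outcomes))
    for k in range(1, 7)
}

total = sum(conditional.values())
print(conditional, total)
```

Some of the conditional probabilities are 0 (a first die of 1, 2, or 3 cannot reach a sum of 10), but the six values still total 1.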
Probability distributions
Listings of possible outcomes or events with a probability (chance of occurrence) assigned to each outcome
Standard Deviation of Random Variables
i                            1          2         3         Total
xi                           $0         $137      $170
P(X=xi)                      0.20       0.55      0.25
xi*P(X=xi)                   0          75.35     42.50     117.85
xi-E(X)                      -117.85    19.15     52.15
(xi-E(X))^2                  13888.62   366.72    2719.62
(xi-E(X))^2 * P(X=xi)        2777.7     201.7     679.9     3659.3
* We calculate the variance first, then take the square root of the total (3659.3) to get the standard deviation.
P(X=xi): probability that the value occurs; xi*P(X=xi): observed value times its probability; xi-E(X): observed value minus the expected value; (xi-E(X))^2: the same difference, squared; (xi-E(X))^2 * P(X=xi): squared difference times the probability of the value occurring.
In Guided Practice 3.10, you confirmed B and D from Figure 3.2 are disjoint. Compute the probability that event B or event D occurs. A: 1,2 B: 4,6 D: 2,3
Since B and D are disjoint events, use the Addition Rule: P(B or D) = P(B) + P(D) = 1/3 + 1/3 = 2/3.
Bayes Theorem
The probability of an event occurring based upon other event probabilities. (1) First identify the marginal probabilities of each possible outcome of the first variable: P(A1), P(A2), ..., P(Ak). (2) Then identify the probability of the outcome B, conditioned on each possible scenario for the first variable: P(B|A1), P(B|A2), ..., P(B|Ak). Once each of these probabilities is identified, it can be applied directly within the formula. Bayes' Theorem tends to be a good option when there are so many scenarios that drawing a tree diagram would be complex.
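A minimal sketch of the two steps, using the standard form of Bayes' Theorem: P(A1|B) = P(B|A1) × P(A1) divided by the sum of P(B|Ai) × P(Ai) over all scenarios. The function name `bayes` and all of the numbers are hypothetical and chosen only for illustration.

```python
def bayes(priors, likelihoods, target):
    """Return P(target | B) from marginals P(Ai) and conditionals P(B | Ai).

    priors:      {scenario: P(Ai)}
    likelihoods: {scenario: P(B | Ai)}
    """
    numerator = likelihoods[target] * priors[target]
    denominator = sum(likelihoods[k] * priors[k] for k in priors)  # this is P(B)
    return numerator / denominator

# Step 1: marginal probabilities of each scenario (hypothetical values).
priors = {"A1": 0.2, "A2": 0.8}
# Step 2: probability of B under each scenario (hypothetical values).
likelihoods = {"A1": 0.9, "A2": 0.1}

posterior_a1 = bayes(priors, likelihoods, "A1")
posterior_a2 = bayes(priors, likelihoods, "A2")
print(posterior_a1, posterior_a2)
```

Because the dictionaries can hold any number of scenarios, the same function covers cases where a tree diagram would have too many branches to draw comfortably.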
Let X and Y represent the outcomes of rolling two dice. (a) What is the probability that the first die, X, is 1? (b) What is the probability that both X and Y are 1? (c) Use the formula for conditional probability to compute P(Y = 1 | X = 1). (d) What is P(Y = 1)? Is this different from the answer from part (c)? Explain. We can show in Guided Practice 3.38(c) that the conditioning information has no influence by using the Multiplication Rule for independent processes: P(Y = 1 | X = 1) = P(Y = 1 and X = 1) / P(X = 1) = [P(Y = 1) × P(X = 1)] / P(X = 1) = P(Y = 1)
The samples are large relative to the difference in death rates for the "inoculated" and "not inoculated" groups, so it seems there is an association between inoculated and outcome. However, as noted in the solution to Guided Practice 3.33, this is an observational study and we cannot be sure if there is a causal connection. (Further research has shown that inoculation is effective at reducing death rates.)
Conditional probability
The probability that an outcome of interest occurs given that a condition is known to hold: P(A|B) = P(A and B) / P(B)
Variance of random variables
i                            1          2         3         Total
xi                           $0         $137      $170
P(X=xi)                      0.20       0.55      0.25
xi*P(X=xi)                   0          75.35     42.50     117.85
xi-E(X)                      -117.85    19.15     52.15
(xi-E(X))^2                  13888.62   366.72    2719.62
(xi-E(X))^2 * P(X=xi)        2777.7     201.7     679.9     3659.3
i: observation number; xi: value observed; P(X=xi): probability that the value occurs; xi*P(X=xi): observed value times its probability; xi-E(X): observed value minus the expected value; (xi-E(X))^2: the same difference, squared; (xi-E(X))^2 * P(X=xi): squared difference times the probability of the value occurring.
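The table's totals can be reproduced step by step from the three values ($0, $137, $170) and their probabilities (0.20, 0.55, 0.25) given in these notes. The variable names are assumptions for illustration.

```python
import math

values = [0, 137, 170]      # xi, in dollars
probs = [0.20, 0.55, 0.25]  # P(X = xi)

# Expected value: weighted average of the values.
expected = sum(x * p for x, p in zip(values, probs))

# Variance: weighted average of the squared deviations from the expected value.
variance = sum((x - expected) ** 2 * p for x, p in zip(values, probs))

# Standard deviation: square root of the variance.
sd = math.sqrt(variance)

print(round(expected, 2), round(variance, 1), round(sd, 2))
```

The small difference between the unrounded variance and a total built from rounded table entries comes only from rounding the intermediate rows.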
Expected outcome (or the mean) table:
i                  1        2         3         Total
xi                 $0       $137      $170
P(X=xi)            0.20     0.55      0.25
xi*P(X=xi)         0        75.35     42.50     117.85
i: observation number; xi: value observed; P(X=xi): probability that the value occurs; xi*P(X=xi): observed value times its probability.
Independence
Two processes are independent if knowing the outcome of one provides no useful information about the outcome of the other.
If we sample with replacement
the observations are independent
Expected value of a Discrete Random Variable
is computed as a weighted average of the possible outcomes of that random variable, where the weights are the probabilities of those outcomes.
When considering the average of a linear combo of random variables,
it is safe to plug in the mean of each random variable and then compute the final result
Random variable
A variable or process with a numerical outcome. We represent a random variable with a capital letter such as X, Y, or Z. P(X = x) is the probability that the random variable X equals a particular value x.
Bin size of histograms
Larger bins = less detail; smaller bins = more detail
Probability
proportion of times the outcome would occur if we observed the random process an infinite number of times.
Bayesian statistics
statistics that involve a formula for calculating the likelihood of a hypothesis being true and meaningful, taking into account relevant prior knowledge