Chapter 3 - Probability (3)
Expected value (definition)
Expected value of a discrete random variable. If X takes outcomes x1, x2, ..., xn with probabilities p1, p2, ..., pn, the expected value of X is the sum of each outcome multiplied by its corresponding probability: E(X) = x1 × p1 + x2 × p2 + ··· + xn × pn
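As a rough illustration of the weighted sum described above, here is a minimal Python sketch. The function name expected_value and the fair-die example are assumptions chosen for illustration, not part of the original card.

```python
from fractions import Fraction


def expected_value(outcomes, probabilities):
    """Return the sum of each outcome times its probability."""
    if len(outcomes) != len(probabilities):
        raise ValueError("outcomes and probabilities must have the same length")
    return sum(x * p for x, p in zip(outcomes, probabilities))


# Hypothetical example: one roll of a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6
print(expected_value(faces, probs))  # 7/2
```

Fractions are used here only so the result is exact; ordinary floats work the same way.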
Addition Rule of Disjoint Outcomes
If A1 and A2 represent two disjoint outcomes, then the probability that one of them occurs is given by P(A1 or A2) = P(A1) + P(A2). (Mnemonic: AND means multiply, OR means add.)
Standard deviation of the sum and difference of random variables
If X and Y are independent random variables: SD(X + Y) = SD(X − Y) = √(SD(X)² + SD(Y)²)
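One way to convince yourself of this rule is to build the exact distribution of X + Y and X − Y for two independent variables and compare against the square-root formula. The sketch below does that for two fair dice; the helper names variance and combine are hypothetical.

```python
import math
from itertools import product


def variance(dist):
    """Variance of a distribution given as {outcome: probability}."""
    mean = sum(x * p for x, p in dist.items())
    return sum((x - mean) ** 2 * p for x, p in dist.items())


def combine(dist_x, dist_y, op):
    """Distribution of op(X, Y), assuming X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        key = op(x, y)
        out[key] = out.get(key, 0) + px * py
    return out


# Hypothetical example: two independent fair dice.
die = {face: 1 / 6 for face in range(1, 7)}
sd_sum = math.sqrt(variance(combine(die, die, lambda x, y: x + y)))
sd_diff = math.sqrt(variance(combine(die, die, lambda x, y: x - y)))
sd_formula = math.sqrt(variance(die) + variance(die))

# All three values should agree up to floating-point rounding.
print(sd_sum, sd_diff, sd_formula)
```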
Binomial formula
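Suppose the probability of a single trial being a success is p. Then the probability of observing exactly k successes in n independent trials is given by P(exactly k successes) = C(n, k) × p^k × (1 − p)^(n − k), where C(n, k) = n! / (k! × (n − k)!) is the number of ways to arrange k successes among n trials.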
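A minimal sketch of the binomial formula in Python, using math.comb (available in Python 3.8 and later) for the number of arrangements. The function name binomial_probability and the example values are assumptions for illustration.

```python
import math


def binomial_probability(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


# Hypothetical example: exactly 2 successes in 4 trials with p = 0.5.
print(binomial_probability(2, 4, 0.5))  # 0.375
```

Summing binomial_probability(k, n, p) over k = 0, ..., n should give 1, which is a quick sanity check on any choice of n and p.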
Complement (A^c)
The complement of event A is denoted A^c, and A^c represents all outcomes not in A. A and A^c are mathematically related: P(A) + P(A^c) = 1, i.e. P(A) = 1 − P(A^c).
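The complement is most useful for "at least one" questions, where the opposite event ("none") is easier to compute. The sketch below assumes independent trials with the same success probability; the function name prob_at_least_one is hypothetical.

```python
from fractions import Fraction


def prob_at_least_one(p_single, n_trials):
    """P(at least one success) = 1 - P(no successes), assuming independent trials."""
    p_none = (1 - p_single) ** n_trials
    return 1 - p_none


# Hypothetical example: at least one six in 4 rolls of a fair die.
print(prob_at_least_one(Fraction(1, 6), 4))  # 671/1296
```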
Conditional probability (concept)
The conditional probability of the outcome of interest A given condition B is computed as follows: P(A|B) = P(A and B) / P(B), provided P(B) > 0.
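A small sketch of the conditional probability calculation. The function name and the die example are assumptions chosen for illustration.

```python
from fractions import Fraction


def conditional_probability(p_a_and_b, p_b):
    """Return P(A|B) = P(A and B) / P(B); undefined when P(B) is 0."""
    if p_b == 0:
        raise ValueError("P(B) must be greater than 0")
    return p_a_and_b / p_b


# Hypothetical example with one fair die:
# A = "the roll is a 2", B = "the roll is even".
p_b = Fraction(3, 6)        # outcomes 2, 4, 6
p_a_and_b = Fraction(1, 6)  # only the outcome 2 is in both A and B
print(conditional_probability(p_a_and_b, p_b))  # 1/3
```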
Computing binomial probabilities
The first step in using the binomial model is to check that the model is appropriate. If it is, the next step is to identify n, p, and k. The final step is to apply the formulas and interpret the results.
Symbolic notation for "and" + "or"
The symbol ∩ means intersection and is equivalent to "and". The symbol ∪ means union and is equivalent to "or". It is common to see the General Addition Rule written as P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
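Python's set operators line up with this notation: & plays the role of ∩ and | plays the role of ∪. The sketch below checks the General Addition Rule by counting cards in a standard 52-card deck; the choice of events (diamonds and face cards) and the helper prob are assumptions for illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(ranks, suits))  # 52 equally likely (rank, suit) pairs

diamonds = {card for card in deck if card[1] == "diamonds"}
faces = {card for card in deck if card[0] in {"J", "Q", "K"}}


def prob(event):
    """Probability of drawing a card in the event from a shuffled deck."""
    return Fraction(len(event), len(deck))


lhs = prob(diamonds | faces)                                 # P(A ∪ B)
rhs = prob(diamonds) + prob(faces) - prob(diamonds & faces)  # P(A) + P(B) − P(A ∩ B)
print(lhs, rhs)  # 11/26 11/26
```

Subtracting P(A ∩ B) removes the three diamond face cards that would otherwise be counted twice.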
Binomial conditions (BINS)
(1) The trials are independent. (2) The number of trials, n, is fixed. (3) Each trial outcome can be classified as a success or failure. (4) The probability of a success, p, is the same for each trial.
Rules for probability distributions
1. The outcomes listed must be disjoint. 2. Each probability must be between 0 and 1. 3. The probabilities must total 1.
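Rules 2 and 3 can be checked mechanically; rule 1 (disjoint outcomes) depends on how the outcomes are defined and cannot be read off the numbers alone. A minimal sketch, with a hypothetical function name and a small tolerance to absorb floating-point rounding:

```python
import math


def is_valid_distribution(probabilities, tol=1e-9):
    """Check that each probability is in [0, 1] and that they total 1."""
    in_range = all(0 <= p <= 1 for p in probabilities)
    totals_one = math.isclose(sum(probabilities), 1.0, abs_tol=tol)
    return in_range and totals_one


print(is_valid_distribution([0.2, 0.3, 0.5]))   # True
print(is_valid_distribution([0.5, 0.6]))        # False: total exceeds 1
print(is_valid_distribution([-0.1, 1.1]))       # False: values outside [0, 1]
```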
Random variable (definition)
A random process or variable with a numerical outcome.
Probability distribution
A table of all disjoint outcomes and their associated probabilities.
Law of Large Numbers
As more observations are collected, the observed proportion p̂n of occurrences with a particular outcome after n trials converges to the true probability p of that outcome.
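A simulation makes the convergence visible. The sketch below tracks the running proportion of successes for a process with true probability p; the function name, the default p = 1/6 (rolling a particular face of a fair die), and the fixed seed are all assumptions for illustration.

```python
import random


def running_proportion(n_trials, p=1 / 6, seed=0):
    """Return the observed proportion of successes after each of n_trials trials."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    successes = 0
    proportions = []
    for n in range(1, n_trials + 1):
        if rng.random() < p:
            successes += 1
        proportions.append(successes / n)
    return proportions


props = running_proportion(100_000)
for n in (10, 1_000, 100_000):
    # Early proportions can be far from p; later ones tend to settle near it.
    print(n, props[n - 1])
```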
General Addition Rule
If A and B are any two events, disjoint or not, then the probability that A or B will occur is P(A or B) = P(A) + P(B) − P(A and B), where P(A and B) is the probability that both events occur.
Multiplication rule for independent processes
If A and B represent events from two different and independent processes, then the probability that both A and B occur can be calculated as the product of their separate probabilities: P(A and B) = P(A) × P(B)
General Multiplication Rule
If A and B represent two outcomes or events, then P(A and B) = P(A|B) × P(B). For the term P(A|B), it is useful to think of A as the outcome of interest and B as the condition.
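The general rule matters when the events are not independent, for example when drawing without replacement. A minimal sketch; the function name and the two-aces example are assumptions for illustration.

```python
from fractions import Fraction


def prob_a_and_b(p_a_given_b, p_b):
    """Return P(A and B) = P(A|B) * P(B)."""
    return p_a_given_b * p_b


# Hypothetical example: draw two cards from a 52-card deck without replacement.
# B = "first card is an ace", A = "second card is an ace".
p_b = Fraction(4, 52)          # 4 aces among 52 cards
p_a_given_b = Fraction(3, 51)  # 3 aces left among the remaining 51 cards
print(prob_a_and_b(p_a_given_b, p_b))  # 1/221
```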
Variance of a discrete random variable
If X takes outcomes x1, x2, ..., xn with probabilities p1, p2, ..., pn and expected value μx = E(X), then to find the standard deviation of X, we first find the variance and then take its square root. Var(X) = σx² = (x1 − μx)² × p1 + (x2 − μx)² × p2 + ··· + (xn − μx)² × pn, and the standard deviation is σx = √Var(X).
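The two-step procedure (variance first, then square root) can be sketched as follows. The function name variance and the fair-die example are assumptions for illustration.

```python
import math
from fractions import Fraction


def variance(outcomes, probabilities):
    """Return the probability-weighted sum of squared deviations from the mean."""
    mu = sum(x * p for x, p in zip(outcomes, probabilities))
    return sum((x - mu) ** 2 * p for x, p in zip(outcomes, probabilities))


# Hypothetical example: one roll of a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6
var = variance(faces, probs)
sd = math.sqrt(var)  # standard deviation is the square root of the variance
print(var)  # 35/12
print(sd)
```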
Condition
In P(A|B) = P(A and B) / P(B), the condition is event B.
Outcome of interest
In P(A|B) = P(A and B) / P(B), the outcome of interest is event A.
Marginal probability
Probabilities based on a single variable, without conditioning on any other variables.
Notation P(A)
Probability of outcome A.
Sample space (S)
The set of all possible outcomes. We often use the sample space to examine the scenario where an event does not occur.
Standard deviation of the linear combinations of random variables
To find the standard deviation of a linear combination of random variables, we first consider aX and bY separately. We find the standard deviation of each, and then we apply the equation for the standard deviation of the sum of two variables: SD(aX + bY) = √((a × SD(X))² + (b × SD(Y))²). This equation is valid as long as the random variables X and Y are independent of each other.
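A minimal sketch of this formula as a function. The name sd_linear_combination and the example values are hypothetical, and the result is only meaningful when X and Y are independent.

```python
import math


def sd_linear_combination(a, sd_x, b, sd_y):
    """Standard deviation of aX + bY, assuming X and Y are independent."""
    return math.sqrt((a * sd_x) ** 2 + (b * sd_y) ** 2)


# Hypothetical example: SD(X) = SD(Y) = 1, combination 3X + 4Y.
print(sd_linear_combination(3, 1.0, 4, 1.0))   # 5.0
# The sign of a coefficient does not matter because each term is squared.
print(sd_linear_combination(3, 1.0, -4, 1.0))  # 5.0
```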
Venn diagram
Useful when outcomes can be categorized as "in" or "out" for two or three variables, attributes, or random processes.
Disjoint (mutually exclusive)
Two outcomes are disjoint when they cannot both happen in the same trial. Example: rolling both a 1 and a 2 in the same roll of one die.
"or" is inclusive
When we write "or" in statistics, we mean "and/or" unless we explicitly state otherwise. Thus, "A or B occurs" means A, B, or both A and B occur. This is equivalent to at least one of A or B occurring.