Probability
Probability: Joint, Marginal and Conditional Probabilities: Bayes' Theorem Example Let's assume we know that 1% of women over the age of 40 have breast cancer [p(cancer)=0.01]. Let's assume that 90% of women who have breast cancer will test positive for breast cancer in a mammogram [p(positive test | cancer)=0.9]. Eight percent of women who do NOT have cancer will also test positive [p(positive test | no cancer)=0.08]. What is the probability that a woman has cancer if she tests positive [p(cancer | positive test)]?
Given: p(cancer) = 0.01, p(positive test | cancer) = 0.9 and p(positive test | no cancer) = 0.08. We will call cancer event A and a positive test event B, so p(cancer) = P(A) and p(positive test) = P(B). We want to know P(A|B), the probability of having cancer given a positive test. Bayes' theorem gives:
$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B|A)\,P(A) + P(B|A^c)\,P(A^c)} = \frac{0.9 \times 0.01}{0.9 \times 0.01 + 0.08 \times 0.99} = \frac{0.009}{0.0882} \approx 0.10$$
The probability that a woman has breast cancer, given a positive test, is therefore approximately 0.10. This makes intuitive sense: the result is greater than 1% (the rate of breast cancer in the general population), yet far below 90%, because false positives from the much larger group without cancer (0.08 * 0.99 = 0.0792) outnumber true positives (0.9 * 0.01 = 0.009).
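The arithmetic in this card can be checked with a few lines of Python. This is a minimal sketch; the variable names are illustrative and only the three input probabilities come from the card.

```python
# Inputs taken from the card; variable names are illustrative.
p_cancer = 0.01             # p(cancer)
p_pos_given_cancer = 0.90   # p(positive test | cancer)
p_pos_given_healthy = 0.08  # p(positive test | no cancer)

p_healthy = 1 - p_cancer    # p(no cancer) = 0.99

# Denominator of Bayes' theorem: total probability of a positive test.
p_pos = p_pos_given_cancer * p_cancer + p_pos_given_healthy * p_healthy

# Bayes' theorem: p(cancer | positive test).
p_cancer_given_pos = p_pos_given_cancer * p_cancer / p_pos

print(round(p_cancer_given_pos, 4))
```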
Probability: Probability Axioms/Rules
1. A probability may range from zero (0) to one (1), inclusive. 2. The probabilities of all possible outcomes must sum to one. This axiom can be written as $\sum_{i=0}^{n} p(A_i) = 1$, which is shorthand for 'the sum (the sigma sign) of the probabilities (p) of all events (A_i) from i=0 to i=n equals one'.
Probability Distributions: Learning Objectives
1. Understand the difference between a discrete and continuous probability distribution. 2. Understand a discrete distribution. 3. Understand the binomial distribution (discrete) and calculate probabilities of discrete outcomes. 4. Understand and calculate probabilities of the Poisson (discrete) distribution. 5. Understand the standard normal probability distribution (mean of zero, sd of 1). 6. Be able to apply the three sigma rule (68-95-99.7 rule). 7. Understand and be able to calculate z-scores. 8. Be able to find probabilities greater than, less than or within some range of z-scores from a normal probability table. 9. Understand the student's t-distribution and be able to calculate probabilities from t-scores.
Probability Distribution: The Normal Probability Distribution
A probability distribution is formed from all possible outcomes of a random process (for a random variable X) and the probability associated with each outcome. Probability distributions may be either discrete (distinct/separate outcomes, such as number of children) or continuous (a continuum of outcomes, such as height). A probability density function is defined such that the probability of a value of X between a and b equals the integral (area under the curve) between a and b. This probability is never negative. Further, we know that the area under the curve from negative infinity to positive infinity is one. The normal probability distribution, one of the fundamental continuous distributions of statistics, is actually a family of distributions (an infinite number of distributions with differing means (μ) and standard deviations (σ)). Because the normal distribution is a continuous distribution, the probability of any single exact outcome is zero, so we instead calculate a probability for a range of outcomes (for example, the probability that a random variable X is greater than 10). The normal distribution is symmetric and centered on the mean (which equals the median and mode). While the x-axis ranges from negative infinity to positive infinity, nearly all of the X values fall within +/- three standard deviations of the mean (99.7% of values), while ~68% are within +/- one standard deviation and ~95% are within +/- two standard deviations. This is often called the three sigma rule or the 68-95-99.7 rule. The normal density function is (this formula won't be on the diagnostic!):
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
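The 68-95-99.7 rule can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module (Python 3.8+); the helper name `prob_within` is an illustrative choice, not part of the card.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

def prob_within(k):
    """Probability that a normal value lies within +/- k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within +/- {k} sd: {prob_within(k):.4f}")
```

Because every normal distribution can be rescaled to the standard normal, the same three proportions hold for any mean and standard deviation.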
Probability Distribution: The Normal Probability Distribution - continued Normal standard distribution in table diagram
The standard normal probability distribution has a mean of zero and a standard deviation of one. The x values of the standard normal distribution are often called z-scores. We can calculate probabilities using a normal distribution table (z-table). It is important to note that in these tables, the probabilities are the area to the LEFT of the z-score. If you need to find the area to the right of a z-score (Z greater than some value), subtract the value in the table from one.
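As a small illustration of left versus right areas, the cumulative distribution function plays the role of the z-table. The z-score of 1.0 below is an arbitrary example value, not one taken from the card.

```python
from statistics import NormalDist

z = 1.0  # arbitrary example z-score

# The z-table value is the area to the LEFT of z.
left_area = NormalDist().cdf(z)

# The area to the RIGHT of z is one minus the table value.
right_area = 1 - left_area

print(f"P(Z < {z}) = {left_area:.4f}")
print(f"P(Z > {z}) = {right_area:.4f}")
```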
Probability: Joint, Marginal and Conditional Probabilities: Bayes' Theorem
Bayes' theorem: an equation that allows us to manipulate conditional probabilities. For two events, A and B, Bayes' theorem lets us go from p(B|A) to p(A|B) if we know the marginal probabilities of the outcomes of A and the probability of B, given the outcomes of A.
Probability: Probability Axioms/Rules: Practice Problem It is estimated that 40% of Durham residents visit Falls Lake in a given year. Three Durham residents are selected at random. What is the likelihood that all three residents visited Falls Lake last year?
Because the population of Durham is so large (approx. 230,000), we don't have to worry about the issue of replacement vs. non-replacement in our sampling. We can use the multiplication rule in the form: P(A and B and C) = P(A) * P(B) * P(C) = 0.4 * 0.4 * 0.4 = 0.064.
Discrete Probability Distributions: Binomial Distribution
Binomial Distribution: The binomial distribution is a probability distribution designed to calculate the probability of outcomes of a set of binary independent events (often called trials) (e.g., success/failure; yes/no; presence/absence; correct/incorrect, etc.). Each trial must be INDEPENDENT (the result of one trial must not affect the likelihood of the result of another trial). The probability of success must remain constant across all trials. The binomial formula for k successes in n trials with success probability p is $P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$. The first part of the formula is the binomial coefficient, often stated as 'n choose k'.
Probability Introduction: Sample Space, Independent and Dependent Events: COMPLEMENT
Complement (often denoted as A^c or A with a bar on top): the complement of A is the set of all outcomes in the sample space that are not in A, so its probability is the sum of the probabilities of those outcomes, p(A^c) = 1 - p(A). For example, the complement of rolling a five on a die is {1,2,3,4,6}, and its probability equals 5/6.
Probability: Joint, Marginal and Conditional Probabilities: Conditional Probability
Conditional probability: p(A|B) is the probability of event A occurring, given that event B occurs. Example: given that you drew a red card, what's the probability that it's a four? p(four|red) = 2/26 = 1/13. Out of the 26 red cards (given a red card), there are two fours, so 2/26 = 1/13.
Probability Distribution: Continuous Probability Distribution
Continuous probability distribution: A probability distribution in which the random variable X can take on any value (is continuous). Because there are infinitely many values that X could assume, the probability of X taking on any one specific value is zero. Therefore we often speak in ranges of values (e.g., p(X>0) = .50). The normal distribution is one example of a continuous distribution. The probability that X falls between two values (a and b) equals the integral (area under the curve) of the density function f(x) from a to b: $P(a \le X \le b) = \int_a^b f(x)\,dx$
Probability Introduction: Continuous random variable
Continuous random variable: outcomes and related probabilities are not defined at specific values, but rather over an interval of values. An example: A random variable, X, the weight of an adult blue crab caught from North Creek, may range from 0.5 to 3.0 kg. The probability that an adult's weight falls within some range of these values is the area under the curve of the probability density function over that range. Note: a continuous variable can take any value in an interval, so there is no way to count its values (between any two values, such as .01 and .02, there are always more values). Examples: Y = exact mass of a random animal selected at the zoo; y = exact winning time for the men's 100m dash.
Probability Introduction: Sample Space, Independent and Dependent Events: DEPENDENT EVENTS
Dependent events: if an event occurring (A) changes the probability of another event occurring (B), we can say that the probability of event B is dependent on event A. For example, the probability of elevated ground-level ozone concentrations is dependent on the occurrence of a large traffic jam. https://www.khanacademy.org/math/ap-statistics/probability-ap/probability-multiplication-rule/v/introduction-to-dependent-probability
Probability Distributions: Discrete probability distribution
Discrete probability distribution: describes a probability distribution of a random variable X, in which X can only take on the values of discrete integers. Example: Number of earthquakes (X) in the US that are 7.5 (Richter Scale) or higher in a given year.
Probability Introduction: Discrete random variable
Discrete random variable: the random outcomes are countable and values between these counts cannot occur. An example: A random variable, X, takes on the value of one if a coin shows heads, and zero if tails. The expected value or mean (μ) of a discrete random variable is Σxp(x). Note: a discrete variable takes distinct, separate values. There is usually a finite number of values, but there does not have to be; as long as the values can be listed (1st value for x, 2nd value for x, ...), the variable is discrete. Examples: x = heads or tails; y = year that a random student was born; z = number of ants born tomorrow in the universe.
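The expected value formula Σxp(x) can be sketched in Python as follows. The coin variable matches the card's example (one for heads, zero for tails, with a fair coin assumed); the fair die is an added illustration, and the function name `expected_value` is hypothetical.

```python
from fractions import Fraction

def expected_value(pmf):
    """Mean of a discrete random variable: the sum of x * p(x) over all values x."""
    return sum(x * p for x, p in pmf.items())

# X = 1 if a fair coin shows heads, 0 if tails.
coin_pmf = {1: Fraction(1, 2), 0: Fraction(1, 2)}

# Added illustration: the face shown by one fair six-sided die.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

print(expected_value(coin_pmf))  # 1/2
print(expected_value(die_pmf))   # 7/2
```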
Probability Distribution: Example Normal Problem We want to determine the probability that a randomly selected blue crab has a weight greater than 1 kg. Based on previous research we assume that the distribution of weights (kg) of adult blue crabs is normally distributed with a population mean (μ) of 0.8 kg and a standard deviation (σ) of 0.3 kg. How do we determine this probability?
First, we calculate the z-score, z = (X - μ)/σ, by replacing X with 1, the mean (μ) with 0.8 and the standard deviation (σ) with 0.3. We calculate our z-score to be (1-0.8)/0.3=0.6667. We can then look in our z-table to determine that p(z>0.6667) is roughly 1-0.748 (pulled from the chart, somewhere between 0.7454 and 0.7486) = 0.252. Therefore, based on our normality assumption, we conclude that the probability that a randomly selected adult blue crab weighs more than one kilogram is roughly 25.2% (the area shaded in blue).
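The blue crab calculation can be reproduced without a printed table. This sketch uses `statistics.NormalDist` in place of the z-table; the variable names are illustrative, and the mean and standard deviation are the ones assumed in the problem.

```python
from statistics import NormalDist

mu, sigma = 0.8, 0.3   # assumed population mean and sd of crab weight (kg)
x = 1.0                # weight threshold (kg)

# Step 1: convert the weight to a z-score.
z = (x - mu) / sigma

# Step 2: area to the right of z on the standard normal curve.
p_from_z = 1 - NormalDist().cdf(z)

# Cross-check: work directly with the normal distribution of weights.
p_direct = 1 - NormalDist(mu, sigma).cdf(x)

print(f"z = {z:.4f}")
print(f"P(weight > 1 kg) = {p_from_z:.4f}")
```

The result agrees with the table-based answer of roughly 0.252, with the small difference coming from interpolating between table rows.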
Probability Distributions: Binomial Distribution - Sample Question 2 You take a 10 question multiple-choice test. Each question has four choices and you guess on each question, what is the probability of getting exactly 7 questions correct?
Fortunately, you may look up these probabilities in a binomial table. If you go to page T-9 and go to n=10, k=7 then you can scroll across to p=0.25. You will see that the probability p(X=7) = 0.0031, the same probability we calculated. Extending the example problem above, you might want to know the probability of getting a 7 OR LESS on the quiz. In this case you would want to add p(X=0) + p(X=1) + p(X=2) + p(X=3) + p(X=4) + p(X=5) + p(X=6) + p(X=7). We can do this fairly easily using the binomial table. Looking at the binomial table, we would want to add 0.0563 + 0.1877 + 0.2816 + 0.2503 + 0.1460 + 0.0584 + 0.0162 + 0.0031 = 0.9996 or, even more easily, we could take 1-[p(X=10) + p(X=9) + p(X=8)] = 1-[0 + 0 + 0.0004] = 0.9996.
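The table lookups can be cross-checked by computing the binomial probabilities directly. This is a minimal sketch using `math.comb` from the standard library; the helper name `binom_pmf` is hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.25  # 10 questions, 1-in-4 chance of guessing each correctly

p_exactly_7 = binom_pmf(7, n, p)

# "7 or less" by adding p(X=0) through p(X=7) ...
p_at_most_7 = sum(binom_pmf(k, n, p) for k in range(8))

# ... or by subtracting p(X=8), p(X=9), p(X=10) from one.
p_via_complement = 1 - sum(binom_pmf(k, n, p) for k in (8, 9, 10))

print(f"P(X = 7)  = {p_exactly_7:.4f}")
print(f"P(X <= 7) = {p_at_most_7:.4f}")
```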
Probability: Probability Axioms/Rules: MULTIPLICATION RULE (INDEPENDENT EVENTS) SAMPLE QUESTION: if we roll the die twice and want to know the probability of getting a one on both rolls:
If two events are INDEPENDENT (the occurrence of one event does not affect the probability of another event occurring), then P(A and B) = P(A) * P(B). Here: P(one (roll1) and one (roll2)) = P(one (roll1)) * P(one (roll2)) = 1/6 * 1/6 = 1/36.
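One way to see why the multiplication rule works here is to list all 36 equally likely outcomes of two rolls and count. The sketch below assumes a fair six-sided die; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes of two rolls of a fair die: (roll1, roll2).
rolls = list(product(range(1, 7), repeat=2))

# Count the outcomes where both rolls show a one.
both_ones = [r for r in rolls if r == (1, 1)]
p_by_counting = Fraction(len(both_ones), len(rolls))

# Multiplication rule for independent events.
p_by_rule = Fraction(1, 6) * Fraction(1, 6)

print(p_by_counting, p_by_rule)  # 1/36 1/36
```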
Probability Introduction: Sample Space, Independent and Dependent Events: INDEPENDENT EVENTS
Independent events: If an event occurring does not alter the probability of another event occurring, we say that these events are independent. For example, if we roll a die twice, getting a three on the first roll does not affect the probability of getting a three on the second roll. Therefore, we can say that these two events are independent.
Probability: Joint, Marginal and Conditional Probabilities: Joint Probability
Joint probability: p(A and B). The probability of event A and event B occurring. It is the probability of the intersection of two or more events. The probability of the intersection of A and B may be written p(A ∩ B). Example: the probability that a card is a four and red =p(four and red) = 2/52=1/26. (There are two red fours in a deck of 52, the 4 of hearts and the 4 of diamonds).
Probability: Joint, Marginal and Conditional Probabilities Marginal Probability
Marginal probability: the probability of an event occurring (p(A)). It may be thought of as an unconditional probability: it is not conditioned on another event. Example: the probability that a card drawn is red (p(red) = 0.5). Another example: the probability that a card drawn is a 4 (p(four)=1/13).
Probability Introduction: Sample Space, Independent and Dependent Events: MUTUALLY EXCLUSIVE EVENTS
Mutually exclusive events: Two (or more) events that cannot occur at the same time, so p(A and B) = 0. Example: The Duke women's basketball team cannot both win (event A) and lose (event B) a game, therefore p(win and lose) or p(A and B) = 0. We also call these events disjoint.
Probability Distributions: Binomial Distribution - Question 1 If you flip a coin four times, what's the probability of getting three heads?
One way to solve the problem is to think about all of the possible combinations of heads (H) and tails (T) on four flips. There are 2^4 = 16 combinations, and we know that each combination is equally likely (assuming the probabilities of getting heads and tails are the same). We could then count how many of these 16 combinations contain exactly three heads. From this we count four possible outcomes, so there is a 4/16 or 1/4 chance of getting exactly three heads. So this math isn't too bad if the number of trials is small enough. But what if it is not? We can use the binomial distribution to solve the problem.
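The counting approach described in this card can be written out in a few lines. This sketch assumes a fair coin and uses `itertools.product` to list every sequence of four flips; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Every possible sequence of four flips, e.g. ('H', 'H', 'T', 'H').
outcomes = list(product("HT", repeat=4))

# Keep the sequences with exactly three heads.
three_heads = [flips for flips in outcomes if flips.count("H") == 3]

p_three_heads = Fraction(len(three_heads), len(outcomes))

print(len(outcomes), len(three_heads), p_three_heads)  # 16 4 1/4
```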
Probability: Probability Axioms/Rules: ADDITION RULE Sample Question - in a deck of cards: We want to know the probability that a drawn card is either a red card (P(A)) OR a seven (P(B)).
P(A or B) = P(A) + P(B) - P(BOTH A and B). These two events are NOT mutually exclusive (i.e., a card can be red AND a seven). P(red or seven) = p(red) + p(seven) - p(red and seven) = 26/52 + 4/52 - 2/52 = 28/52 = 7/13.
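The addition rule can be checked by building a standard 52-card deck and counting. This is a sketch under that assumption; the helper `prob` and the predicate names are hypothetical.

```python
from fractions import Fraction

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]  # 52 cards

def prob(event):
    """Probability that one randomly drawn card satisfies the event."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_red(card):
    return card[1] in ("hearts", "diamonds")

def is_seven(card):
    return card[0] == "7"

# Addition rule: subtract the overlap so red sevens are not counted twice.
p_rule = prob(is_red) + prob(is_seven) - prob(lambda c: is_red(c) and is_seven(c))

# Direct count of cards that are red OR a seven.
p_count = prob(lambda c: is_red(c) or is_seven(c))

print(p_rule, p_count)  # 7/13 7/13
```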
Probability Introduction: Probability (frequentist)
Probability (frequentist): over the long run, the proportion or percentage of time that an event will occur out of all observations. For example, I rolled one die a hundred times. Seventeen times out of a hundred the die showed a value of one. The estimated probability of getting a one is therefore 0.17.
Probability Introduction: Probability (subjective)
Probability (subjective): a measure of strength of belief. For example, the likelihood that it will storm this evening is 0.7 [p(storm)=0.7].
Probability Introduction: PROBABILITY - defined
Probability - From the ideas behind simple random sampling, to interpretation of p-values and confidence intervals, probability and probability manipulations provide the building blocks of much of data analysis. Probabilities may be either marginal, joint or conditional. For the probability section of the exam, you should be able to manipulate probabilities based on the probability axioms (rules) and understand the difference among types of probabilities (e.g., joint, marginal, conditional). You should be able to make calculations of probability based on the normal distribution.
Probability Introduction Probability: DEFINED
Probability: the likelihood that an event will occur. For example, there is a 50% probability that a fair coin will come up heads on any given flip. Probabilities can be expressed as percents (30%), in decimal form (0.3) or as fractions (3/10). In statistics we most often deal with probabilities as decimals.
Probability Introduction: Random variable
Random variable: a variable (often denoted X) whose value is a function of a random process. In other words, the value is determined by chance or a stochastic process (can also think of it as an experiment or data generating process). Note: a random variable maps (quantifies) the outcomes of a random process; as soon as you quantify outcomes you can start to do more math related to those outcomes. Examples: X = 0 if heads, 1 if tails; Y = sum of 7 dice rolled, which lets us ask for P(Y ≤ 30).
Probability Distribution: The Student t Probability Distribution
Similar to the normal distribution, the t-distribution is a family of distributions that varies based on the degrees of freedom. A unimodal, continuous distribution, the student's t distribution has thicker tails than the normal distribution, particularly when the number of degrees of freedom is small. We use the student's t distribution when comparing means when we do not know the standard deviation of the population and must estimate it from the sample. Above you will find the probability density function of the t-distribution with varying degrees of freedom.
Probability: Joint, Marginal and Conditional Probabilities: How to Manipulate among Joint, Conditional and Marginal Probabilities?
The equation below is a means to manipulate among joint, conditional and marginal probabilities: $P(A|B) = \frac{P(A \cap B)}{P(B)}$. As you can see in the equation, the conditional probability of A given B is equal to the joint probability of A and B divided by the marginal probability of B. Card example: We know that the conditional probability of a four, given a red card, equals 2/26 or 1/13. This should be equivalent to the joint probability of a red and four (2/52 or 1/26) divided by the marginal P(red) = 1/2. ANSWER: 1/13 = 1/26 divided by 1/2. For the diagnostic exam, you should be able to manipulate among joint, marginal and conditional probabilities.
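The card example can be verified by counting in a standard 52-card deck. The sketch below assumes that deck and computes the conditional probability two ways: from the joint and marginal probabilities, and by counting fours among the red cards only. All names are illustrative.

```python
from fractions import Fraction

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]  # 52 cards

red_cards = [c for c in deck if c[1] in ("hearts", "diamonds")]
red_fours = [c for c in red_cards if c[0] == "4"]

p_red = Fraction(len(red_cards), len(deck))           # marginal: 1/2
p_four_and_red = Fraction(len(red_fours), len(deck))  # joint: 1/26

# Conditional = joint / marginal.
p_four_given_red = p_four_and_red / p_red

# Same answer by restricting attention to the 26 red cards.
p_by_counting = Fraction(len(red_fours), len(red_cards))

print(p_four_given_red, p_by_counting)  # 1/13 1/13
```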
Probability: Probability Axioms/Rules: The law of large numbers
The law of large numbers (sometimes called the Law of Averages) states that as the number of trials of a random experiment increases, the empirical probability of an outcome will get closer and closer to its true probability. Another way of thinking about it: as the number of random trials increases, the average of the observed values will approach the true population mean (the expected value).
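A short simulation can illustrate the law of large numbers. This sketch assumes a fair six-sided die and tracks how often a one appears; the seed, the sample sizes and the function name are arbitrary choices made for the illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

def proportion_of_ones(n_rolls):
    """Roll a fair die n_rolls times and return the share of rolls showing a one."""
    hits = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 1)
    return hits / n_rolls

true_probability = 1 / 6

for n in (100, 10_000, 100_000):
    estimate = proportion_of_ones(n)
    print(f"n = {n:>7}: estimate = {estimate:.4f}, "
          f"error = {abs(estimate - true_probability):.4f}")
```

The error does not shrink on every single run, but it tends to get smaller as the number of rolls grows.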
Probability Distribution: The Normal Probability Distribution - continued How to convert any and all normal distributions to the standard normal distribution
We can convert any normal distribution to the standard normal distribution using the z-score equation: $z = \frac{X - \mu}{\sigma}$. The z-score equals X minus the population mean (μ), all divided by the standard deviation (σ).
Probability Distributions: Binomial Distribution - Continued BINOMIAL COEFFICIENT EQUATION
Binomial coefficient: often stated as 'n choose k'. This coefficient calculates the number (integer) of possible ways of getting the result of interest out of all possible combinations: $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$. So if we flipped a coin four times (n=4), in how many ways can we get three heads (k=3)? This is calculated as 4!/(3!1!), which simplifies to 4, the same count as above. Because we know that the probabilities of heads and tails are the same (so each of the 16 outcomes is equally likely), we can divide this count by 16 rather than use the second half of the binomial formula with the probabilities. Now let's move on to something a little more difficult.
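The coefficient for the coin example can be computed both from the factorial formula and with the standard library's `math.comb`. This is a minimal sketch with illustrative variable names, assuming a fair coin for the final probability.

```python
from math import comb, factorial

n, k = 4, 3  # four flips, three heads

# n choose k from the factorial formula: n! / (k! * (n - k)!)
ways_from_factorials = factorial(n) // (factorial(k) * factorial(n - k))

# The same value from the built-in helper.
ways_from_comb = comb(n, k)

# Full binomial formula with p = 0.5 for a fair coin.
p_three_heads = ways_from_comb * 0.5**k * 0.5 ** (n - k)

print(ways_from_factorials, ways_from_comb, p_three_heads)  # 4 4 0.25
```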
Probability Introduction: Sample Space, Independent and Dependent Events: SAMPLE SPACE (S)
Sample space (S): the collection of all possible outcomes. The sample space for the roll of a die is: S={1, 2, 3, 4, 5, 6}