Ch. 3: Probability

Tree diagrams

help organize outcomes and probabilities based on the structure of the data. They are especially useful when the data can be put into some kind of sequential structure. The first branch, for inoculation, is called the primary branch. All other branches, in this case for result, are secondary. The probabilities for the primary branch are marginal. The probabilities for the secondary branches are conditional. Joint probabilities are shown to the right of each secondary branch. These are computed using the General Multiplication Rule, P(A and B) = P(A|B) × P(B), where the primary branch represents event B and the secondary branch represents event A.
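As a rough sketch of how the joint probabilities at the leaves of a tree are built, the snippet below multiplies each primary (marginal) probability by each secondary (conditional) probability. The numbers are hypothetical and chosen only for illustration; they are not the inoculation data the card refers to, and the name `joint_probabilities` is likewise an assumption.

```python
from fractions import Fraction

# Hypothetical numbers, for illustration only (not the source's data).
# Primary branch: marginal probabilities P(B).
p_primary = {"inoculated": Fraction(1, 4), "not inoculated": Fraction(3, 4)}
# Secondary branches: conditional probabilities P(A | B).
p_secondary = {
    "inoculated": {"lived": Fraction(9, 10), "died": Fraction(1, 10)},
    "not inoculated": {"lived": Fraction(4, 5), "died": Fraction(1, 5)},
}


def joint_probabilities(primary, secondary):
    """Return P(A and B) = P(A | B) * P(B) for every leaf of the tree."""
    return {
        (b, a): p_a_given_b * p_b
        for b, p_b in primary.items()
        for a, p_a_given_b in secondary[b].items()
    }


joint = joint_probabilities(p_primary, p_secondary)
for (b, a), p in joint.items():
    print(f"P({a} and {b}) = {p}")
```

The four joint probabilities add up to 1, which is a quick check that the tree was filled in consistently.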

Example: Find P(sum at least 4)

We want to know whether the sum of two dice is at least 4, that is, greater than or equal to 4. Let B = {4, 5, ..., 12} be the event that the sum is at least 4. Then Bc = {2, 3} is the event that the sum is less than 4. P(B) = 1 − P(Bc) = 1 − P({2, 3}) = 1 − [P(3) + P(2)] = 1 − (2/36 + 1/36) = 1 − (3/36) = 11/12.
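The complement shortcut can be checked by listing all 36 equally likely rolls of two dice. This is a minimal sketch assuming two fair six-sided dice; the helper name `prob_two_dice` is made up for this example.

```python
from fractions import Fraction
from itertools import product


def prob_two_dice(event):
    """Probability that the sum of two fair d6 satisfies `event`."""
    outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely rolls
    favorable = sum(1 for a, b in outcomes if event(a + b))
    return Fraction(favorable, len(outcomes))


p_less_than_4 = prob_two_dice(lambda s: s < 4)   # P(Bc)
p_at_least_4 = 1 - p_less_than_4                 # P(B) = 1 - P(Bc)
print(p_less_than_4, p_at_least_4)
```

Counting the event directly, with `prob_two_dice(lambda s: s >= 4)`, gives the same answer as the complement route.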

Sampling From a Small Population

Usually we sample only a very small fraction of the population. However, we may occasionally sample more than 10% of the population without replacement. Without replacement means we do not have a chance of sampling the same case twice. Think back to the raffle drawing: without replacement is when we pull 10 raffle tickets without putting any of those tickets back. Example: Suppose there are 12 students. Your discussion TA asks 3 questions and calls on people at random to answer them. Assume that he will not call on the same person twice. What is the probability that you will not be selected? P(Q1 = not selected and Q2 = not selected and Q3 = not selected) = 11/12 × 10/11 × 9/10 = 9/12 = 0.75.

with or without replacement

If we sample from a small population without replacement, we no longer have independence between our observations. If we sample from a small population with replacement, we have independent observations.
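To make the contrast concrete, here is a small sketch of the "TA calls on students" calculation for a class of 12, the class size implied by the arithmetic 11/12 × 10/11 × 9/10. The function name and its parameters are illustrative assumptions.

```python
from fractions import Fraction


def prob_never_selected(n_students, n_questions, replacement):
    """Probability that one particular student is never called on."""
    p = Fraction(1)
    for k in range(n_questions):
        # Without replacement, the pool shrinks by one after each question.
        pool = n_students if replacement else n_students - k
        p *= Fraction(pool - 1, pool)
    return p


without = prob_never_selected(12, 3, replacement=False)
with_repl = prob_never_selected(12, 3, replacement=True)
print(without, float(without))
print(with_repl, float(with_repl))
```

Without replacement each factor depends on the earlier draws (11/12, then 10/11, then 9/10), while with replacement every factor is the same 11/12.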

events

It is common to work with sets of outcomes instead of individual outcomes. We call these sets ____. For example, let A be the event that rolling a d6 results in a 1 or a 2. Let B be the event that rolling a d6 results in a 4 or a 6. We write these out as A = {1, 2} and B = {4, 6}. Since events A and B have no elements (outcomes in a set) in common, they are disjoint.

Multiplication Rule for Independent Processes

Let A and B be events from two different and independent processes. Then the probability that both A and B occur can be calculated as the product of their separate probabilities: P(A and B) = P(A) × P(B). Similarly, if there are k events A1, ..., Ak from k independent processes, then the probability they all occur is P(A1) × P(A2) × ··· × P(Ak).

general conditional probability formula

Let A and B be outcomes. The conditional probability of outcome A occurring given the condition that B has occurred is P(A|B) = P(A and B) / P(B), provided P(B) > 0.

Sum of Conditional Probabilities

Let A1, ..., Ak represent all the disjoint outcomes for a variable or process. Then if B is some event, P(A1|B) + ··· + P(Ak|B) = 1. The rule for complements also holds when an event and its complement are conditioned on the same information: P(A|B) = 1 − P(Ac|B).

Example: Complement of an Event

Let D = {2, 3} be the event that a single roll of our d6 is a 2 or a 3. Find P(D or Dc). First, note that an event and its complement are always disjoint! So P(D or Dc) = P(D) + P(Dc) = (1/3) + (2/3) = 1.

Linear Combinations With Negatives

Note that if we have a linear combination such as aX + bY = 2X − 3Y, then a = 2 and b = −3. The expected value will be 2E(X) − 3E(Y), so the negative will impact the expectation. However, the variance will be 2^2 Var(X) + (−3)^2 Var(Y) = 4Var(X) + 9Var(Y), so the negative will not impact the variance or standard deviation.
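One way to see that the negative coefficient changes the expectation but not the variance is to enumerate 2X − 3Y directly. The sketch below assumes X and Y are two independent fair d6 rolls, which is an illustrative choice and not something the card specifies; the helper names are also assumptions.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)


def mean(dist):
    """E(X) for a distribution given as {outcome: probability}."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Var(X) = sum of (x - mu)^2 * P(X = x)."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


# Distribution of a single fair d6.
die = {x: Fraction(1, 6) for x in FACES}

# Distribution of 2X - 3Y for two independent fair dice, by enumeration.
combo = {}
for x, y in product(FACES, repeat=2):
    value = 2 * x - 3 * y
    combo[value] = combo.get(value, Fraction(0)) + Fraction(1, 36)

formula_mean = 2 * mean(die) - 3 * mean(die)
formula_var = 2**2 * variance(die) + (-3) ** 2 * variance(die)

print(mean(combo), formula_mean)
print(variance(combo), formula_var)
```

The enumerated mean and variance agree with the formulas; the agreement for the variance relies on the independence assumption.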

Bayes' Theorem

lets us find P(statement about variable 1 | statement about variable 2) when we have information about P(statement about variable 2 | statement about variable 1). Consider the following conditional probability for variable 1 and variable 2: P(outcome A1 of variable 1 | outcome B of variable 2). Bayes' Theorem states that this conditional probability can be identified as the following fraction: P(A1|B) = P(B|A1)P(A1) / [P(B|A1)P(A1) + P(B|A2)P(A2) + ··· + P(B|Ak)P(Ak)], where A1, ..., Ak are all the possible outcomes of variable 1. The numerator identifies the probability of getting both A1 and B. The denominator is the marginal probability of getting B. This bottom component of the fraction looks complicated since we have to add up probabilities from all of the different ways to get B. To apply Bayes' Theorem, follow 2 steps: (1) Identify the marginal probabilities of each possible outcome of the first variable: P(A1), P(A2), ..., P(Ak). (2) Identify the probability of the outcome B, conditioned on each possible scenario for the first variable: P(B|A1), P(B|A2), ..., P(B|Ak). When each of these has been identified, they can be plugged into Bayes' Theorem.
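The two steps can be sketched as a short function: collect the marginals P(Ai), collect the conditionals P(B|Ai), then form the fraction. The function name `bayes` and the numbers are hypothetical, chosen only so the arithmetic is easy to follow.

```python
from fractions import Fraction


def bayes(priors, likelihoods, target):
    """Return P(A_target | B) given P(A_i) and P(B | A_i) for every i."""
    # Denominator: marginal probability of B, summed over every way to get B.
    marginal_b = sum(likelihoods[a] * priors[a] for a in priors)
    # Numerator: probability of getting both A_target and B.
    return likelihoods[target] * priors[target] / marginal_b


# Step 1: marginal probabilities of the first variable (hypothetical values).
priors = {"A1": Fraction(1, 4), "A2": Fraction(3, 4)}
# Step 2: probability of B under each scenario (hypothetical values).
likelihoods = {"A1": Fraction(1, 2), "A2": Fraction(1, 10)}

posterior_a1 = bayes(priors, likelihoods, "A1")
posterior_a2 = bayes(priors, likelihoods, "A2")
print(posterior_a1, posterior_a2)
```

Because A1 and A2 are the only possible outcomes in this example, the two results add up to 1, in line with the sum of conditional probabilities rule.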

sample space (S)

Rolling our six-sided die results in some outcome in the set S = {1, 2, 3, 4, 5, 6}. We call this set our ____. The ____ is defined as the set of all possible outcomes.

disjoint or mutually exclusive

If 2 outcomes are ____, they cannot both happen. If we roll our d6 only one time, we cannot roll a 1 and a 2. On any single roll, the outcomes "rolling a 1" and "rolling a 2" are disjoint. To calculate: P(rolling a 1 and rolling a 2) = P(1 and 2) = 0. We can roll either a 1 or a 2, but not both (on the same roll). P(1 or 2) = P(1) + P(2) = 1/6 + 1/6 = 1/3. If we want to roll a 1 or 2, we have a 2-out-of-6 or 2/6 = 1/3 chance.

If two events are disjoint, are they independent?

No. Independent events have no relationship with one another. This means that if we know something about event A, we don't get any information about event B. For disjoint events, if event A occurs, we can be totally certain that event B did not occur. Therefore disjoint events (each with nonzero probability) are dependent. Independent: can happen at the same time. Disjoint: cannot occur at the same time.

Venn Diagram

is a good way to visualize the relationship between events.

marginal probability

is a probability based on a single variable. Think of the margins as the edges of a contingency table where we have the information for each variable individually. It is based on a single variable without regard to any other variables.

joint probability

is a probability for two or more variables together. Think of this as a probability that two or more variables occur jointly (together).

The general multiplication rule

is for all events, whether or not they are independent: P(A and B) = P(A|B) × P(B). Notice that this is not new information! This is just a rearrangement of the formula for conditional probability.

Complement of an Event

Let D = {2, 3} be the event that a single roll of our d6 is a 2 or a 3. The complement of D is the set of outcomes in the sample space that are not in D. We denote the complement by Dc. Then Dc = {1, 4, 5, 6}.

expected value

of a random variable X, denoted E(X), is the average outcome. It is computed by adding each outcome weighted by its probability. If X takes outcomes x1, ..., xk with probabilities P(X = x1), ..., P(X = xk), the expected value of X is the sum of each outcome multiplied by its corresponding probability: E(X) = x1 × P(X = x1) + ··· + xk × P(X = xk).

About 9% of people are left-handed. Suppose 2 people are selected at random from the U.S. population. Because the sample size of 2 is very small relative to the population, it is reasonable to assume these two people are independent. 1 What is the probability that both are left-handed? 2 What is the probability that both are right-handed?

1. Let L1 be the event that the first person is left-handed and L2 the event that the second person is left-handed. We are told that 9% of people are left-handed, so P(L1) = P(L2) = 0.09. We are assuming that these people are independent, so we can use the multiplication rule: P(L1 and L2) = P(L1) × P(L2) = (0.09) × (0.09) = 0.0081. 2. First, assume that everyone is either right- or left-handed. Then Lc1 is the event that the first person is right-handed and Lc2 is the event that the second person is right-handed. From part 1, P(L1) = P(L2) = 0.09, so P(Lc1) = 1 − P(L1) = 1 − 0.09 = 0.91 and P(Lc2) = 0.91. Then P(Lc1 and Lc2) = P(Lc1) × P(Lc2) = (0.91) × (0.91) = 0.8281.
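The same calculation can be written as a product over any number of independent events. This is a minimal sketch; `prob_all_occur` is a made-up helper name, and exact fractions are used so that 0.09 × 0.09 is not affected by floating-point rounding.

```python
from fractions import Fraction
from math import prod


def prob_all_occur(probabilities):
    """Multiplication rule for independent events: multiply the probabilities."""
    return prod(probabilities, start=Fraction(1))


p_left = Fraction(9, 100)   # P(L1) = P(L2) = 0.09
p_right = 1 - p_left        # complement, assuming everyone is left- or right-handed

both_left = prob_all_occur([p_left, p_left])
both_right = prob_all_occur([p_right, p_right])
print(float(both_left), float(both_right))
```

Passing a longer list, for example five copies of `p_left`, applies the rule for k independent events in the same way.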

A linear combination

A ____ of two random variables X and Y describes any situation where we can write our relationship out as aX + bY, where a and b are some fixed, known numbers. For my commute time, there were five random variables (one for each day I come to campus). Each random variable could be written as having a fixed coefficient of 1: W = 1X1 + 1X2 + 1X3 + 1X4 + 1X5. If X and Y are random variables and a and b are some fixed numbers, then E(aX + bY) = a × E(X) + b × E(Y). To compute the expected value of a linear combination of random variables, we plug in the average of each individual random variable, multiply by the constants, and compute the result.

random processes

Examples: flipping a coin; wait time (in minutes) at the DMV; how many hours of sleep you get each night. Some of these aren't completely random (the DMV is probably less crowded on, say, Tuesday mornings), but we may still want to model them based on random processes.

Law of Total Probability

For an event B and a variable or process with outcomes A1, ..., Ak, the ______ states P(B) = P(B|A1)P(A1) + P(B|A2)P(A2) + ··· + P(B|Ak)P(Ak), where A1, ..., Ak are the k disjoint possible outcomes of that variable or process.

Independence Considerations

For two independent events, knowing the outcome of one should give us no information about the probability of the other. Consider X and Y, the outcomes for rolling two six-sided dice. (1) Find P(X = 1). (2) Find P(X = 1 and Y = 1). (3) Find P(Y = 1 | X = 1). Knowing the outcome of X doesn't give us any additional information about Y. We can use the Multiplication Rule to show that the conditioning information has no influence for independent processes: P(Y = 1 | X = 1) = P(Y = 1 and X = 1) / P(X = 1) = [P(Y = 1) × P(X = 1)] / P(X = 1) = P(Y = 1).
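The three quantities can be found by enumerating the 36 equally likely (X, Y) pairs. This is a sketch assuming two fair dice; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All (X, Y) pairs for two fair six-sided dice, each equally likely.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Fraction of the 36 outcomes for which `event` is true."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


p_x1 = prob(lambda o: o[0] == 1)            # (1) P(X = 1)
p_x1_and_y1 = prob(lambda o: o == (1, 1))   # (2) P(X = 1 and Y = 1)
p_y1_given_x1 = p_x1_and_y1 / p_x1          # (3) P(Y = 1 | X = 1)
p_y1 = prob(lambda o: o[1] == 1)            # unconditional P(Y = 1)

print(p_x1, p_x1_and_y1, p_y1_given_x1, p_y1)
```

The conditional probability in (3) comes out equal to the unconditional P(Y = 1), which is what independence predicts.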

General Addition Rule

General Addition Rule For any two events A and B, the probability that at least one of them will occur is P (A or B) = P (A) + P (B) − P (A and B) where P(A and B) is the probability that both events occur. Note: In statistics, whenever we say "or" we mean "and/or". If we say that "A or B occurs", that means A, B, or both A and B occur. for disjoint events, P(A and B) = 0 (they can never occur simultaneously), so the general addition rule will work for both disjoint and non-disjoint events.

Variance of Linear Combinations of Random Variables

Given independent random variables X and Y and known constant numbers a and b, the variance of the linear combination aX + bY is Var(aX + bY) = a^2 Var(X) + b^2 Var(Y). Essentially, we plug in the variances for each individual variable and square the coefficients. If X and Y are dependent, we must modify this equation to account for their covariance.

Variance Formula

If X takes outcomes x1, ..., xk with probabilities P(X = x1), ..., P(X = xk) and expected value μ = E(X), then the variance of X, denoted by Var(X) or σ^2, is Var(X) = (x1 − μ)^2 × P(X = x1) + ··· + (xk − μ)^2 × P(X = xk). The standard deviation of X, labeled sd(X) or σ, is the square root of the variance.
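As a worked instance of the formula, the sketch below computes the expected value, variance, and standard deviation of a single fair d6. The die is an illustrative choice and the function names are assumptions.

```python
from fractions import Fraction
from math import sqrt


def expected_value(dist):
    """E(X) = sum of x * P(X = x) over all outcomes."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Var(X) = sum of (x - mu)^2 * P(X = x) over all outcomes."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


# Fair six-sided die: each face has probability 1/6.
d6 = {face: Fraction(1, 6) for face in range(1, 7)}

mu = expected_value(d6)
var = variance(d6)
sd = sqrt(var)  # the standard deviation is the square root of the variance

print(mu, var, round(sd, 3))
```

For the fair die this gives μ = 7/2 and Var(X) = 35/12, so the standard deviation is about 1.708.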

addition rule for disjoint outcomes

Suppose A1 and A2 are two disjoint outcomes. Then P(A1 or A2) = P(A1) + P(A2). This can be extended to many disjoint outcomes A1, ..., Ak, where the probability that at least one of these outcomes will occur is P(A1) + P(A2) + ··· + P(Ak).

The addition rule of sets

The addition rule applies to sets in the same way that it applies to outcomes. Keep A = {1,2} and B = {4,6}. For our die, P(A) = 1/3 and P(B) = 1/3, so P (A or B) = P (A) + P (B) = (1/3) + (1/3) = 2/3

Law of Large Numbers

The tendency for p̂n to converge to the true value as n gets large. We can illustrate probability by thinking about rolling a d6 and estimating the probability that we roll a 1. We estimate this probability by counting up the number of times we roll a 1 and dividing by the number of times we rolled the d6. Each time we roll, we recalculate and our estimate will change a little bit. We denote this estimate p̂n, where n is the number of rolls. We denote the true probability of rolling a 1 as p = 1/6. As the number of rolls, n, increases, p̂n will get closer and closer to the true value of 1/6, or 16.7%. We say that p̂n converges to the true probability. This is another case of more data = better information!
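A quick simulation can illustrate this convergence. The sketch below uses Python's standard pseudo-random generator with a fixed seed (an arbitrary choice made so the run is repeatable) and records the running estimate after every roll; `running_estimates` is a hypothetical helper name.

```python
import random


def running_estimates(n_rolls, seed=0):
    """Return the estimate of P(roll a 1) after each of n_rolls rolls of a d6."""
    rng = random.Random(seed)  # fixed seed so results are reproducible
    ones = 0
    estimates = []
    for n in range(1, n_rolls + 1):
        if rng.randint(1, 6) == 1:
            ones += 1
        estimates.append(ones / n)  # proportion of 1s seen so far
    return estimates


estimates = running_estimates(60_000)
for n in (10, 100, 1_000, 10_000, 60_000):
    print(n, round(estimates[n - 1], 4))
```

Early estimates can be far from 1/6, while the later ones settle close to 0.167. The exact values depend on the seed, so none are quoted here.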

Example: Suppose there are 12 students. Your discussion TA picks 3 students to answer his questions without regard to whom he has already picked. What is the probability that you will not be selected?

Because the TA samples with replacement, the three picks are independent, so by the Multiplication Rule P(Q1 = not selected and Q2 = not selected and Q3 = not selected) = 11/12 × 11/12 × 11/12 = 1331/1728 ≈ 0.770.

random variable

We call a variable or process with a numerical outcome a ____. We usually represent these with capital letters such as X, Y, or Z.

Example: Find P(sum not 6)

We could add all of the probabilities for the sums that are not 6... or we could use the complement! Let A = {not 6} be the event that the sum is not 6. Then Ac = {6} is the event that the sum is 6. Recall P(A) = 1 − P(Ac) = 1 − P(6) = 1 − 5/36 = 31/36.

independence

We say that two random processes are _____ if knowing the outcome of one provides no useful information about the outcome of the other. Independence of random processes is similar to independence of variables and observations. Consider our discussion on rolling 2 six-sided dice. The roll of the first die has no effect on the roll of the second die. Thus our two dice rolls are independent of one another.

Conditional Probability Notation

We separate our outcome of interest from our condition in our probability notation with a vertical bar: P(truth is fashion given classifier is pred fashion) becomes P(truth is fashion | classifier is pred fashion) = 197/219. We read the vertical bar as the word "given".

Example: rolling 2 dice

What if we have 2d6? What is the chance that we roll two 1s? We know that there is a 1/6 chance that the first die is a 1. Then, of those 1/6 times, there is a 1/6 chance that the second die is a 1. Then the chance that both dice roll a 1 is (1/6) × (1/6) = 1/36 or 2.78%. We can also picture this in a table.

Probability for Non-Disjoint Events

What if we want to know the probability that a randomly selected card is a diamond or a face card? We start by adding up the probabilities: P(♦) + P(face card) = 13/52 + 12/52. But this double counts the 3 cards in the overlap! We need to correct for this double count: P(♦ or face) = P(♦) + P(face) − P(♦ and face) = 13/52 + 12/52 − 3/52 = 22/52.
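The double-counting correction can be verified by building a standard 52-card deck and counting. This is a sketch; the rank and suit labels and the helper names are illustrative choices.

```python
from fractions import Fraction

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
deck = [(rank, suit) for rank in RANKS for suit in SUITS]  # 52 cards


def prob(event):
    """Probability that one randomly drawn card satisfies `event`."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))


def is_diamond(card):
    return card[1] == "diamonds"


def is_face(card):
    return card[0] in {"J", "Q", "K"}


p_diamond = prob(is_diamond)
p_face = prob(is_face)
p_both = prob(lambda c: is_diamond(c) and is_face(c))
p_either = prob(lambda c: is_diamond(c) or is_face(c))

# General addition rule: subtract the overlap that was counted twice.
print(p_either, p_diamond + p_face - p_both)
```

Both expressions reduce to 11/26, which is 22/52 in lowest terms.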

conditional probability

When we are given some useful information that allows us to restrict our attention, we call these probabilities conditional probabilities. We can say that we condition based on some given information, or that we computed the probability under the condition that the classifier is pred fashion. (1) The outcome of interest is whatever we want to know about. (2) The condition is information we know to be true, a known outcome or event. P(truth is fashion given classifier is pred fashion) = # cases (truth is fashion and classifier is pred fashion) / # cases (classifier is pred fashion). Suppose we took a sample of 1000 photos. We could multiply each probability by 1000 to get an estimate of how many would fall into each place in our contingency table.

Probability Notation

shorthand notation for talking about probabilities. We denote "the probability of rolling a 1" as P(rolling a 1). As we get more comfortable with our notation, (assuming it's clear that we're talking about rolling a die) we may shorten this further to P(1). So we can write P(rolling a 1) = P(1) = 1/6.

probability distribution

is a list of all possible outcomes with their corresponding probabilities that satisfies three rules: (1) The outcomes listed must be disjoint. (2) Each probability must be between 0 and 1. (3) The probabilities must sum to 1. We can use these rules to check whether something is a valid probability distribution.
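Rules 2 and 3 are easy to check mechanically. The sketch below assumes the distribution is given as a dictionary from outcome to probability; using distinct dictionary keys stands in for rule 1, although whether the outcomes are truly disjoint still has to be judged from what they mean. The function name is an assumption.

```python
from math import isclose


def is_valid_distribution(dist):
    """Check that each probability is in [0, 1] and that they sum to 1."""
    values = list(dist.values())
    in_range = all(0 <= p <= 1 for p in values)
    # isclose tolerates tiny floating-point error in the sum.
    sums_to_one = isclose(float(sum(values)), 1.0, abs_tol=1e-9)
    return in_range and sums_to_one


fair_die = {face: 1 / 6 for face in range(1, 7)}
too_much = {"heads": 0.5, "tails": 0.6}       # sums to 1.1
out_of_range = {"a": 1.5, "b": -0.5}          # sums to 1 but breaks rule 2

print(is_valid_distribution(fair_die))
print(is_valid_distribution(too_much))
print(is_valid_distribution(out_of_range))
```

The third example shows why both rules are needed: its probabilities add up to 1, yet neither value is a legitimate probability.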

probability

the chance of something happening. We use ____ to describe and understand random processes and their outcomes. In the examples, the random process is rolling a die and the outcome is the number rolled. The ____ of an outcome is the proportion of times the outcome would occur if we were able to observe the random process an infinite number of times. It is defined as a proportion and always takes values between 0 and 1. As a percentage, it takes values between 0% and 100%. A probability of 0 (0%) means the outcome is impossible. A probability of 1 (100%) means that the outcome has to happen (all other outcomes are impossible).

