Exam 2
basic probability relationships
-Complement of an Event -Union of two events -Intersection of two events -Mutually Exclusive Events
hypergeometric probability distribution
-Consider a hypergeometric distribution with n trials and let p = (r/N) denote the probability of a success on the first trial. -If the population size is large, the term (N - n)/(N - 1) approaches 1. -The expected value and variance can be written E(x) = np and Var(x) = np(1 - p). -Note that these are the expressions for the expected value and variance of a binomial distribution. -When the population size is large, a hypergeometric distribution can be approximated by a binomial distribution with n trials and a probability of success p = (r/N).
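The approximation described above can be checked numerically. A minimal sketch in Python; the values of N, r, and n are illustrative, not from the card:

```python
from math import comb

def hypergeom_pmf(x, N, r, n):
    # P(x successes in n draws without replacement from a population
    # of N items containing r successes)
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    # binomial probability of x successes in n independent trials
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Large population: the binomial with p = r/N closely approximates it
N, r, n = 1000, 400, 10
p = r / N  # 0.4
exact = hypergeom_pmf(4, N, r, n)
approx = binom_pmf(4, n, p)
print(round(exact, 4), round(approx, 4))  # the two values nearly agree
```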
Relationship between two variables can be determined by
-Covariance -Correlation Coefficient
Joint Probability table
-Joint probabilities appear in the body of the table -Marginal probabilities appear in the margins of the table
Chebyshev's Theorem
-Pafnuty Lvovich Chebyshev -born in Russia -At least (1 - 1/z²) of the data values must be within z standard deviations of the mean, where z is any value greater than 1. -Chebyshev's theorem requires z > 1, but z need not be an integer. -At least 75% of the data values must be within z = 2 standard deviations of the mean. -At least 89% of the data values must be within z = 3 standard deviations of the mean. -At least 94% of the data values must be within z = 4 standard deviations of the mean.
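The bound 1 - 1/z² is easy to compute directly; a short sketch reproducing the percentages listed above:

```python
def chebyshev_lower_bound(z):
    # At least 1 - 1/z**2 of the data lie within z std devs of the mean
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z**2

for z in (2, 3, 4):
    print(z, round(100 * chebyshev_lower_bound(z), 1))  # 75.0, 88.9, 93.8
```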
Data dashboards
-not limited to graphical displays -Adding numerical measures, such as the mean and standard deviation of KPIs, to a data dashboard is often critical. -Dashboards are often interactive. -Drilling down
Mutually Exclusive Events
-Two events are said to be mutually exclusive if the events have no sample points in common -Two events are mutually exclusive if, when one event occurs, the other cannot occur.
Mutual Exclusiveness vs. Independence
-Two events with nonzero probabilities cannot be both mutually exclusive and independent. -If one mutually exclusive event is known to occur, the other cannot occur; thus, the probability of the other event occurring is reduced to zero *(and they are therefore dependent).* -Two events that are not mutually exclusive might or might not be independent.
Four Properties of a Binomial Experiment
1. The experiment consists of a sequence of n identical trials. 2. Two outcomes, success and failure, are possible on each trial. 3. The probability of a success, denoted by p, does not change from trial to trial. (This is referred to as the stationarity assumption.) 4. The trials are independent.
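Under these four properties, the probability of exactly x successes in n trials is given by the binomial formula f(x) = C(n, x) pˣ(1 - p)ⁿ⁻ˣ. A minimal sketch; the 2-of-4 example values are illustrative:

```python
from math import comb

def binomial_pmf(x, n, p):
    # probability of exactly x successes in n independent trials, each
    # with the same success probability p (the stationarity assumption)
    return comb(n, x) * p**x * (1 - p)**(n - x)

# e.g. probability of exactly 2 successes in 4 trials with p = 0.3
print(round(binomial_pmf(2, 4, 0.3), 4))  # 0.2646
```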
Basic Requirements for Assigning Probabilities
1. The probability assigned to each experimental outcome must be between 0 and 1, inclusively. 2. The sum of the probabilities for all experimental outcomes must equal 1.
Two Properties of a Poisson Experiment
1. The probability of an occurrence is the same for any two intervals of equal length. 2. The occurrence or nonoccurrence in any interval is independent of the occurrence or nonoccurrence in any other interval. -*the mean and variance are equal.*
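The equal-mean-and-variance property can be verified numerically from the Poisson formula f(x) = μˣ e⁻ᵘ / x!; a sketch with an illustrative μ = 3:

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    # probability of x occurrences in an interval with mean mu
    return mu**x * exp(-mu) / factorial(x)

mu = 3.0  # illustrative mean number of occurrences per interval
# sum over enough values of x that the remaining tail is negligible
mean = sum(x * poisson_pmf(x, mu) for x in range(50))
var = sum((x - mean) ** 2 * poisson_pmf(x, mu) for x in range(50))
print(round(mean, 4), round(var, 4))  # mean and variance both equal mu
```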
5 number summary
1. smallest value 2. first quartile 3. median 4. third quartile 5. largest value
Types of discrete probability distributions
1. uses the rules of assigning probabilities to experimental outcomes to determine probabilities for each value of the random variable. 2. uses a special mathematical formula to compute the probabilities for each value of the random variable
Tree Diagram
A helpful graphical representation of a multiple-step experiment
Combinations
A second useful counting rule enables us to count the number of experimental outcomes when n objects are to be selected from a set of N objects. -order is not important (yields fewer outcomes than permutations) - Ex- lottery
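The counting rule is C(N, n) = N! / (n!(N - n)!), which Python's math.comb computes directly. The 6-of-53 lottery numbers below are illustrative:

```python
from math import comb

# number of ways to choose n = 6 numbers from N = 53, order not important
print(comb(53, 6))  # 22957480 possible lottery selections
```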
Permutations
A third useful counting rule enables us to count the number of experimental outcomes when n objects are to be selected from a set of N objects, where the order of selection is important. -Order matters (at least as many outcomes as combinations)
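The rule is P(N, n) = N! / (N - n)!; a quick comparison with the corresponding combination count, using illustrative 3-of-5 values:

```python
from math import comb, perm

# selecting n = 3 objects from N = 5 where order matters
print(perm(5, 3))  # 60 permutations
print(comb(5, 3))  # versus only 10 combinations of the same objects
```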
Values of random variables
Not necessarily equally likely (they are equally likely only in special cases, such as the discrete uniform distribution)
Sample point
An experimental outcome
Skewness
An important numerical measure of the shape of a distribution
Relative Frequency Method
Assigning probabilities based on *experimentation or historical data* -Example-Lucas tool rental
Subjective Method
Assigning probabilities based on *judgment* -Example- Bradley Investment -We can use any data available as well as our experience and intuition, but ultimately a probability value should express our *degree of belief* that the experimental outcome will occur. -The best probability estimates often are obtained by combining the estimates from the classical or relative frequency approach with the subjective estimate.
Classical Method
Assigning probabilities based on the assumption of *equally likely outcomes* -Example- Rolling the die
Correlation Coefficient
Correlation is a measure of linear association and not necessarily causation -Just because two variables are highly correlated, it does not mean that one variable is the cause of the other. -The coefficient can take on values between -1 and +1. -Values near -1 indicate a *strong negative linear relationship* -Values near +1 indicate a *strong positive linear relationship* -The closer the correlation is to zero, the weaker the relationship.
Independent
If the probability of event A is not changed by the existence of event B, we would say that events A and B are independent -can use multiplication law
Z-score
It denotes the number of standard deviations a data value xi is from the mean -often called the standardized value -an observation's z-score is a measure of the relative location of the observation in a data set. -data < sample mean: z-score < 0 -data > sample mean: z-score > 0 -data = sample mean: z-score = 0
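A minimal sketch of computing z-scores for a small data set (the values are illustrative):

```python
from statistics import mean, stdev

data = [46, 54, 42, 46, 32]       # illustrative sample
m, s = mean(data), stdev(data)    # sample mean 44, sample std dev 8
z_scores = [round((x - m) / s, 2) for x in data]
print(z_scores)  # [0.25, 1.25, -0.25, 0.25, -1.5]
```

Note that values above the sample mean get positive z-scores, values below it negative ones.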
Prior probabilities
Often we begin probability analysis with initial, or prior, probability estimates.
Statistical experiments
Random experiments -chance determines which outcome occurs -different outcomes may occur on repeated trials
Complement of an event
The complement of event A is defined to be the event consisting of all sample points that are not in A. -A^c
Intersection of two events
The intersection of events A and B is the set of all sample points that are in both A and B. -The intersection of events A and B is denoted by A ∩ B
stationarity assumption
The probability of a success, denoted by p, does not change from trial to trial.
Conditional Probability
The probability of an event given that another event has occurred
Union of two events
The union of events A and B is the event containing all sample points that are in A or B or both -The union of events A and B is denoted by A U B
Empirical discrete distribution
The use of the relative frequency method to develop discrete probability distributions
Posterior Probabilities
Then, from a sample, special report, or a product test we obtain some additional information. -Given this information, we calculate revised probabilities
Event
a collection of sample points
Box plot
a graphical display of data that is based on a five-number summary. -requires the median and the first and third quartiles (Q1, Q3) -another way to identify outliers -Limits are located (not drawn) using the interquartile range (IQR).
Covariance
a measure of the linear association between two variables -pos values= pos relation -neg values= neg relation
Random Variable
a numerical description of the outcome of an experiment.
Probability
a numerical measure of the likelihood that an event will occur. -scale 0 to 1 -near 0= unlikely -near 1 = almost certain
Random experiment
a process that generates well-defined experimental outcomes.
Outlier
an unusually small or unusually large value in a data set. -A data value with a z-score less than -3 or greater than +3 might be considered an outlier. -It might be: 1. an incorrectly recorded data value 2. a data value that was incorrectly included in the data set 3. a correctly recorded data value that belongs in the data set
Empirical Rule
can be used to determine the percentage of data values that must be within a specified number of standard deviations of the mean. -MUST BE BELL SHAPED CURVE, NORMAL DISTRIBUTION -Approximately 68% of the data values will be within one standard deviation of the mean. -Approximately 95% of the data values will be within two standard deviations of the mean. -Almost all of the data values will be within three standard deviations of the mean.
hypergeometric distribution
closely related to the binomial distribution. -the trials are not independent -the probability of success changes from trial to trial.
discrete probability functions specified by formulas
discrete-uniform, binomial, Poisson, and hypergeometric distributions
discrete uniform probability function
f(x) = 1/n, where n = the number of values the random variable may assume
Probability Distribution
for a random variable describes how probabilities are distributed over the values of the random variable -can describe a discrete probability distribution with a table, graph, or formula.
Probability of any event
is equal to the sum of the probabilities of the sample points in the event.
Poisson distribution probability
is often useful in estimating the number of occurrences over a *specified interval of time or space* -infinite sequence of values
Sample space for experiment
is the set of all experimental outcomes
Continuous random variable
may assume any numerical value in an interval or collection of intervals.
Discrete random variable
may assume either a finite number of values or an infinite sequence of values.
Expected value
or mean, of a random variable is a measure of its central location. -a weighted average of the values the random variable may assume. The weights are the probabilities. -The expected value does not have to be a value the random variable can assume.
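A minimal sketch of the weighted-average computation (the distribution is illustrative):

```python
# illustrative discrete distribution: values and their probabilities
values = [0, 1, 2, 3]
probs = [0.1, 0.3, 0.4, 0.2]

# expected value: weighted average, the weights being the probabilities
expected = sum(x * p for x, p in zip(values, probs))
print(round(expected, 2))  # 1.7, which is not itself an attainable value
```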
Probability distributions are defined by
probability function -denoted by f(x), that provides the probability for each value of the random variable. -each f(x) must be greater than or equal to 0, and the probabilities must sum to 1
The Addition Law
provides a way to compute the probability of event A, or B, or both A and B occurring.
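A sketch of the addition law with illustrative probabilities:

```python
# Addition law: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
p_a, p_b, p_a_and_b = 0.5, 0.4, 0.2  # illustrative probabilities
p_a_or_b = p_a + p_b - p_a_and_b
print(round(p_a_or_b, 2))  # 0.7
```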
The Multiplication Law
provides a way to compute the probability of the intersection of two events. -can also be used to test independent events
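A sketch of the multiplication law and the independence test (probabilities are illustrative):

```python
# Multiplication law: P(A ∩ B) = P(A) * P(B | A)
p_a, p_b_given_a = 0.6, 0.5  # illustrative probabilities
p_a_and_b = p_a * p_b_given_a
print(round(p_a_and_b, 2))  # 0.3

# Independence test: A and B are independent iff P(A ∩ B) = P(A) * P(B)
p_b = 0.5  # here P(B) equals P(B | A), so the events are independent
print(abs(p_a_and_b - p_a * p_b) < 1e-12)  # True
```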
Bayes' Theorem
provides the means for revising the prior probabilities -Thomas Bayes -born in London, England -Duke investments example -Bayes' theorem is applicable when the events for which we want to compute posterior probabilities are mutually exclusive and their union is the entire sample space.
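A minimal sketch of the revision step, with two mutually exclusive events whose priors sum to 1 (the numbers are illustrative, not the Duke investments figures):

```python
priors = {"A1": 0.65, "A2": 0.35}       # P(Ai), summing to 1
likelihoods = {"A1": 0.02, "A2": 0.05}  # P(B | Ai) from new information

# total probability of the new information B
p_b = sum(priors[a] * likelihoods[a] for a in priors)

# posterior P(Ai | B) = P(Ai) * P(B | Ai) / P(B)
posteriors = {a: priors[a] * likelihoods[a] / p_b for a in priors}
print({a: round(p, 4) for a, p in posteriors.items()})
```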
Drilling down
refers to functionality in interactive dashboards that allows the user to access information and analyses at an increasingly detailed level
Variance
summarizes the variability in the values of a random variable -a weighted average of the squared deviations of a random variable from its mean. The weights are the probabilities.
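A sketch of the variance (and standard deviation) computation for an illustrative discrete distribution:

```python
# illustrative discrete distribution: values and their probabilities
values = [0, 1, 2, 3]
probs = [0.1, 0.3, 0.4, 0.2]

mu = sum(x * p for x, p in zip(values, probs))  # expected value
# variance: probability-weighted average of squared deviations from mu
var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
sd = var ** 0.5  # standard deviation is the square root of the variance
print(round(var, 2), round(sd, 4))
```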
Binomial distribution probability
the number of successes over a specific number of trials
discrete uniform probability distribution
the simplest example of a discrete probability distribution given by a formula.
Standard Deviation
the square root of the variance
tabular representation
using past data in a table