Statistics Chapter 6
Geometric Probability
If Y has the geometric distribution with probability p of success on each trial, the possible values of Y are 1, 2, 3, .... If k is any one of these values, then P(Y=k) = (1-p)^(k-1)·p.
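As a quick numerical check, here is a minimal Python sketch; the value p = 0.2 and the use of scipy are illustrative assumptions, not part of the original card.

```python
# Check the geometric formula against scipy, assuming a hypothetical p = 0.2.
from scipy.stats import geom

p = 0.2
for k in range(1, 5):
    direct = (1 - p) ** (k - 1) * p   # P(Y = k) from the formula
    library = geom.pmf(k, p)          # scipy's geometric pmf (support starts at 1)
    print(k, round(direct, 4), round(library, 4))
```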
Effect on a Random Variable of Multiplying (or Dividing) by a Constant
Multiplying (or dividing) each value of a random variable by a number b:
-Multiplies (divides) measures of center and location (mean, median, quartiles, percentiles) by b.
-Multiplies (divides) measures of spread (range, IQR, standard deviation) by |b|.
-Does not change the shape of the distribution.
Note: Multiplying a random variable by a constant b multiplies the variance by b².
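For illustration only, a small simulation sketch of these scaling rules; numpy, the starting mean 10, SD 2, and b = -3 are all assumptions.

```python
# Simulated check of the scaling rules, with hypothetical values of X and b.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=10, scale=2, size=100_000)  # X with mean ~10, SD ~2
b = -3
y = b * x                                      # multiply every value by b

print(y.mean())   # close to b * 10 = -30 (mean is multiplied by b)
print(y.std())    # close to |b| * 2 = 6  (SD is multiplied by |b|)
print(y.var())    # close to b**2 * 4 = 36 (variance is multiplied by b^2)
```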
Mean of the Sum of Random Variables
For any two random variables X and Y, if T=X+Y, then the expected value of T is E(T) = µ_X + µ_Y. In general, the mean of the sum of several random variables is the sum of their means.
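A minimal simulation check of this rule; the distributions chosen for X and Y below are hypothetical.

```python
# Means add: a quick numerical check with made-up distributions for X and Y.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(5, 1, 100_000)    # X with mean 5
y = rng.uniform(0, 4, 100_000)   # Y with mean 2
t = x + y

print(x.mean() + y.mean())       # ~7
print(t.mean())                  # ~7 as well: E(T) = E(X) + E(Y)
```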
Range of the Sum of Random Variables
For any two independent random variables X and Y, if T=X+Y, then the range of T is the sum of the range of X and the range of Y. That is, there's more variability in the values of T than in the values of X or Y alone. This makes sense, because the variation in X and the variation in Y both contribute to the variation in T.
Binomial Probability
If X has the binomial distribution with n trials and probability p of success on each trial, the possible values of X are 0, 1, 2, ..., n. If k is any one of these values, then P(X=k) = (n choose k)·p^k·(1-p)^(n-k), where (n choose k) is the binomial coefficient. There are two handy commands in the graphing calculator for finding binomial probabilities:
1. binompdf(n,p,k) computes P(X=k)
2. binomcdf(n,p,k) computes P(X≤k)
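If a graphing calculator isn't handy, scipy has analogous functions; the values n = 10, p = 0.3, k = 4 below are only illustrative.

```python
# Python analogs of binompdf and binomcdf, with hypothetical n, p, k.
from scipy.stats import binom

n, p, k = 10, 0.3, 4
print(binom.pmf(k, n, p))   # P(X = 4), like binompdf(10, 0.3, 4)
print(binom.cdf(k, n, p))   # P(X <= 4), like binomcdf(10, 0.3, 4)
```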
Mean (Expected Value) of a Geometric Random Variable
If Y is a geometric random variable with probability of success p on each trial, then its mean (expected value) is E(Y)=1/p. That is, the expected number of trials required to get the first success is 1/p.
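A rough simulation check of this formula, assuming a hypothetical p = 0.25 (so 1/p = 4).

```python
# Average number of trials until the first success, compared with 1/p.
import numpy as np

rng = np.random.default_rng(2)
p = 0.25
trials_needed = rng.geometric(p, size=100_000)  # trials until first success (1, 2, 3, ...)
print(trials_needed.mean())                     # close to 1/p = 4
```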
Effects of a Linear Transformation on the Mean and Standard Deviation
If Y=a+bX is a linear transformation of the random variable X, then:
-The probability distribution of Y has the same shape as the probability distribution of X.
-µ_Y = a + bµ_X.
-σ_Y = |b|σ_X (since b could be a negative number).
Whether we're dealing with data or random variables, the effects of a linear transformation are the same. Note that these results apply to both discrete and continuous random variables.
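A small sketch of these rules with hypothetical choices a = 5, b = -2, and an X with mean 20 and SD 5.

```python
# Linear transformation Y = a + bX with a negative b, using made-up values.
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(20, 5, 100_000)   # X with mean ~20, SD ~5
a, b = 5, -2
y = a + b * x

print(y.mean())   # close to a + b * 20 = -35
print(y.std())    # close to |b| * 5 = 10 (the sign of b does not affect the SD)
```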
Mean and Standard Deviation of a Binomial Random Variable
If a count X has the binomial distribution with number of trials n and probability of success p, the mean and standard deviation of X are µ_X = np and σ_X = √(np(1-p)). Remember that these formulas work only for binomial distributions and can't be used for other distributions.
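A quick check of the shortcut formulas against scipy's built-in moments, using hypothetical n = 20 and p = 0.4.

```python
# np and sqrt(np(1-p)) agree with scipy's mean and variance for the binomial.
from math import sqrt
from scipy.stats import binom

n, p = 20, 0.4
mean, var = binom.stats(n, p, moments='mv')
print(n * p, mean)                        # both 8.0
print(sqrt(n * p * (1 - p)), sqrt(var))   # both ~2.19
```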
Independent Random Variables
If knowing whether any event involving X alone has occurred tells us nothing about the occurrence of any event involving Y alone, and vice versa, then X and Y are independent random variables.
Mean (Expected Value of a Discrete Random Variable)
Suppose that X is a discrete random variable. The expected value (mean) of X is the sum of each possible value multiplied by its probability: µ_X = E(X) = x₁p₁ + x₂p₂ + x₃p₃ + ... = Σx_i·p_i. This weighted average takes into account the fact that not all outcomes may be equally likely. NOTE: If the mean of a random variable has a non-integer value, but you report it as an integer, your answer will be marked incorrect.
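A minimal worked example; the values and probabilities below are made up for illustration.

```python
# Expected value as a weighted average for a hypothetical discrete distribution.
values = [0, 1, 2, 3]
probs  = [0.1, 0.2, 0.4, 0.3]   # probabilities must sum to 1

mean = sum(x * p for x, p in zip(values, probs))
print(mean)   # 0*0.1 + 1*0.2 + 2*0.4 + 3*0.3 = 1.9
```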
Variance and Standard Deviation of a Discrete Random Variable
Suppose that X is a discrete random variable with mean µ_X. The variance of X is Var(X) = σ²_X = (x₁-µ_X)²p₁ + (x₂-µ_X)²p₂ + ... = Σ(x_i-µ_X)²p_i, and the standard deviation σ_X is the square root of the variance. The standard deviation of a random variable X is a measure of how much the values of the variable tend to vary, on average, from the mean.
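Continuing the hypothetical distribution from the previous sketch, the variance and standard deviation can be computed the same way.

```python
# Variance and SD of the same hypothetical discrete distribution.
from math import sqrt

values = [0, 1, 2, 3]
probs  = [0.1, 0.2, 0.4, 0.3]

mean = sum(x * p for x, p in zip(values, probs))              # 1.9
var  = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
print(var, sqrt(var))   # variance 0.89, standard deviation ~0.94
```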
Normal Distribution
The sum or difference of independent Normal random variables follows a Normal distribution.
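Combined with the rules for means and variances of sums, this lets us compute probabilities for a sum; the distributions N(100, 15) and N(60, 8) below are hypothetical.

```python
# Probability for a sum of independent Normals, using made-up means and SDs.
from math import sqrt
from scipy.stats import norm

# X ~ N(100, 15) and Y ~ N(60, 8), assumed independent
mu_t = 100 + 60
sigma_t = sqrt(15**2 + 8**2)          # variances add; SDs do not
print(norm.cdf(180, mu_t, sigma_t))   # P(X + Y <= 180) under T ~ N(160, 17)
```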
Binomial Setting
A binomial setting arises when we perform several independent trials of the same chance process and record the number of times that a particular outcome occurs. The four conditions for a binomial setting are:
-Binary? The possible outcomes of each trial can be classified as "success" or "failure" (only two outcomes).
-Independent? Trials must be independent; that is, knowing the result of one trial must not have any effect on the result of any other trial.
-Numbers? The number of trials n of the chance process must be fixed in advance.
-Success? On each trial, the probability p of success must be the same.
Acronym: BINS.
Continuous Random Variable
A continuous random variable X takes on all values in an interval of numbers. The probability distribution of X is described by a density curve. The probability of any event is the area under the density curve and above the values of X that make up the event. All continuous probability models assign probability 0 to every individual outcome. In many cases, discrete random variables arise from counting something. Continuous random variables often arise from measuring something.
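A small illustration using a uniform density curve on the interval [0, 1]; this particular model is an arbitrary choice for the sketch.

```python
# Probabilities as areas under a density curve, for a uniform model on [0, 1].
from scipy.stats import uniform

X = uniform(0, 1)                # density curve of height 1 on [0, 1]
print(X.cdf(0.7) - X.cdf(0.3))   # P(0.3 <= X <= 0.7) = area under the curve = 0.4
print(X.cdf(0.5) - X.cdf(0.5))   # P(X = 0.5) = 0: individual outcomes get probability 0
```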
Discrete Random Variables and Their Probability Distributions
A discrete random variable X takes a fixed set of possible values with gaps between. The probability distribution of a discrete random variable X lists the values x₁, x₂, ... and their probabilities p₁, p₂, .... The probabilities p₁, p₂, ... must satisfy two requirements:
1. Every probability is a number between 0 and 1.
2. The sum of the probabilities is 1.
To find the probability of any event, add the probabilities of the particular values that make up the event.
Geometric Setting
A geometric setting arises when we perform independent trials of the same chance process and record the number of trials until a particular outcome occurs. The four conditions for a geometric setting are:
-Binary? The possible outcomes of each trial can be classified as "success" or "failure."
-Independent? Trials must be independent; that is, knowing the result of one trial must not have any effect on the result of any other trial.
-Trials? The goal is to count the number of trials until the first success occurs.
-Successes? On each trial, the probability p of success must be the same.
Acronym: BITS.
Random Variable
A random variable takes numerical values that describe the outcomes of some chance process. There are two types of random variables: discrete and continuous.
Effects on a Random Variable of Adding (or Subtracting) a Constant
Adding the same number a (which could be negative) to each value of a random variable:
-Adds a to measures of center and location (mean, median, quartiles, percentiles).
-Does not change shape or measures of spread (range, IQR, standard deviation).
Variance of the Sum of Independent Random Variables
For any two independent random variables X and Y, if T=X+Y, then the variance of T is the sum of the variances of X and Y. In general, the variance of the sum of several independent random variables is the sum of their variances. Tidbit: Just remember that you can add variances only if the two random variables are independent, and that you can never add standard deviations.
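A short worked check with hypothetical standard deviations of 3 and 4.

```python
# Variances add for independent X and Y; standard deviations do not.
from math import sqrt

var_x, var_y = 3**2, 4**2
var_t = var_x + var_y   # 25: variances add when X and Y are independent
print(sqrt(var_t))      # SD of T is 5, not 3 + 4 = 7 (never add standard deviations)
```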
Mean of the Difference of Random Variables
For any two random variables X and Y, if D=X-Y, then the expected value of D is the difference of the means: E(D) = µ_X - µ_Y. In general, the mean of the difference of several random variables is the difference of their means. Tidbit: The order of subtraction is important.
Variance of the Difference of Random Variables
For any two independent random variables X and Y, if D=X-Y, then the variance of D is the sum of the variances of X and Y: σ²_D = σ²_X + σ²_Y. The range of D is the sum of the ranges of X and Y.
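A simulation sketch with hypothetical Normal distributions for X and Y, showing that means subtract but variances still add.

```python
# For independent X and Y, Var(X - Y) = Var(X) + Var(Y).
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(50, 3, 100_000)   # hypothetical X: mean 50, SD 3
y = rng.normal(20, 4, 100_000)   # hypothetical Y: mean 20, SD 4
d = x - y

print(d.mean())   # close to 50 - 20 = 30 (means subtract)
print(d.var())    # close to 3**2 + 4**2 = 25 (variances still add)
print(d.std())    # close to 5, not 3 - 4 or 3 + 4
```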
Normal Approximation for Binomial Distributions
Suppose that a count X has the binomial distribution with n trials and success probability p. When n is large, the distribution of X is approximately Normal with mean µ=np and standard deviation σ=√(np(1-p)). As a rule of thumb, we will use the Normal approximation when n is so large that np≥10 and n(1-p)≥10. That is, the expected numbers of successes and failures are both at least 10. The accuracy of the Normal approximation improves as the sample size n increases. It is most accurate for any fixed n when p is close to 0.5 and least accurate when p is near 0 or 1.
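A quick comparison of the exact binomial probability with its Normal approximation, using hypothetical n = 100 and p = 0.3 (which satisfy both rule-of-thumb conditions).

```python
# Exact binomial probability vs. the Normal approximation.
from math import sqrt
from scipy.stats import binom, norm

n, p = 100, 0.3                  # np = 30 >= 10 and n(1-p) = 70 >= 10
mu, sigma = n * p, sqrt(n * p * (1 - p))

print(binom.cdf(35, n, p))       # exact P(X <= 35)
print(norm.cdf(35, mu, sigma))   # Normal approximation: close, but not identical
```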
Binomial Random Variable and Binomial Distribution
The count X of successes in a binomial setting is a binomial random variable. The probability distribution of X is a binomial distribution with parameters n and p, where n is the number of trials of the chance process and p is the probability of a success on any one trial. The possible values of X are the whole numbers from 0 to n. In a binomial setting, we can define a random variable as the number of successes in n independent trials. The binomial distribution is important in statistics when we wish to make inferences about the proportion p of successes in a population.
Geometric Random Variable and Geometric Distribution
The number of trials Y that it takes to get a success in a geometric setting is a geometric random variable. The probability distribution of Y is a geometric distribution with parameter p, the probability of a success on any trial. The possible values of Y are 1, 2, 3, .... The shape of any geometric distribution is heavily right-skewed. That's because the most likely value of a geometric random variable is 1. The probability of each successive value decreases by a factor of (1-p).
Binomial Coefficient
The number of ways of arranging k successes among n observations is given by the binomial coefficient (n choose k) = n!/(k!(n-k)!) for k = 0, 1, 2, ..., n.
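In Python, math.comb gives this count directly; n = 5 and k = 2 below are just an example.

```python
# Binomial coefficient: built-in function vs. the factorial formula.
from math import comb, factorial

n, k = 5, 2
print(comb(n, k))                                         # 10 ways to arrange 2 successes among 5 trials
print(factorial(n) // (factorial(k) * factorial(n - k)))  # same value from n!/(k!(n-k)!)
```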
Probability Distribution
The probability distribution of a random variable gives its possible values and their probabilities.
Sampling Without Replacement Condition
When taking an SRS of size n from a population of size N, we can use a binomial distribution to model the count of successes in the sample as long as n≤0.1N (the sample is no more than 10% of the population).