ECN 221 Foster Exam #2
Counting rules for combinations/permutations:
(2) For a combination: the number of combinations of N objects taken n at a time. A second useful counting rule enables us to count the number of experimental outcomes when n objects are to be selected from a set of N objects. See slides for formula. (3) For a permutation: the number of permutations of N objects taken n at a time. A third useful counting rule enables us to count the number of experimental outcomes when n objects are to be selected from a set of N objects, where the order of selection is important. See slides for formula.
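As a quick check on the two counting rules, here is a minimal sketch using the standard formulas N!/(n!(N-n)!) for combinations and N!/(N-n)! for permutations. The values N = 5 and n = 2 are hypothetical and chosen only for illustration; the slides may present the formulas in a different notation.

```python
import math

N, n = 5, 2  # hypothetical example: select 2 objects from a set of 5

# Combinations: order of selection does not matter, N! / (n! (N - n)!)
combinations = math.comb(N, n)

# Permutations: order of selection matters, N! / (N - n)!
permutations = math.perm(N, n)

print(combinations, permutations)  # 10 20
```

Each combination of n objects can be ordered in n! ways, which is why the number of permutations is never smaller than the number of combinations.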
The standard normal probability distribution has a mean of ______ and a standard deviation of ______.
0, 1
What critical value of n/N is used to determine whether or not a finite population should be treated as infinite?
0.05
The sum of the probabilities for all experimental outcomes must equal:
1
Empirical Rule
68.3% of values of a normal random variable are within +/-1 standard deviation of its mean. 95.4% of values of a normal random variable are within +/-2 standard deviations of its mean. 99.7% of values of a normal random variable are within +/-3 standard deviations of its mean.
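The three percentages can be reproduced numerically. This sketch assumes the standard identity P(-k <= z <= k) = erf(k / sqrt(2)) for a standard normal z; the helper name `prob_within` is made up for this example.

```python
import math

def prob_within(k):
    """P(-k <= z <= k) for a standard normal random variable z."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Prints the share of values within k standard deviations of the mean
    print(k, round(prob_within(k) * 100, 1))
```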
Poisson Probability Distribution
A Poisson distributed random variable is often useful in estimating the number of occurrences over a specified interval of time or space. It is a discrete random variable that may assume an infinite sequence of values (x = 0, 1, 2, ...).
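A minimal sketch of the Poisson probability function, assuming the usual form f(x) = µ^x e^(-µ) / x!, where µ is the expected number of occurrences in the interval. The value µ = 3 is hypothetical.

```python
import math

def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when the mean is mu."""
    return mu**x * math.exp(-mu) / math.factorial(x)

mu = 3  # hypothetical mean number of arrivals per interval
for x in range(5):
    print(x, round(poisson_pmf(x, mu), 4))
```

Because x has no upper limit, the probabilities only sum to 1 when all of x = 0, 1, 2, ... are included.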
Point Estimation
A form of statistical inference. We use data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter.
Bivariate Distributions
A probability distribution involving two random variables is called a bivariate probability distribution. Each outcome of a bivariate experiment consists of two values, one for each random variable. Example: Rolling a pair of dice. When dealing with bivariate probability distributions, we are often interested in the relationship between the random variables.
Standard Normal Probability Distributions
A random variable having a normal distribution with a mean of 0 and a standard deviation of 1 is said to have a standard normal probability distribution. To convert to standard normal: z = (x - µ) / σ
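The conversion can be sketched with Python's standard library. The mean, standard deviation, and x value below are hypothetical; `statistics.NormalDist()` with no arguments is the standard normal distribution.

```python
from statistics import NormalDist

mu, sigma = 100, 15   # hypothetical mean and standard deviation
x = 130               # hypothetical observed value

# Convert x to a z-score: the number of standard deviations x lies from the mean
z = (x - mu) / sigma

# P(X <= x) equals P(Z <= z) under the standard normal distribution
p = NormalDist().cdf(z)

print(z)  # 2.0
print(round(p, 4))
```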
Uniform Probability Distributions
A random variable is uniformly distributed whenever the probability is proportional to the interval's length. The uniform probability density function is: f(x)=1/(b-a) for a <= x <= b and =0 elsewhere where a = smallest value the variable can assume and b= largest value the variable can assume.
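Because the density is constant at 1/(b - a), the probability of an interval is just its length divided by (b - a). A minimal sketch, with a hypothetical variable that is uniform between 120 and 140:

```python
def uniform_prob(lo, hi, a, b):
    """P(lo <= x <= hi) for x uniformly distributed on [a, b]."""
    # Clip the requested interval to [a, b], where the density is nonzero
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0) / (b - a)

a, b = 120, 140  # hypothetical smallest and largest values
print(uniform_prob(120, 130, a, b))  # 0.5
```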
Sample Points
An experimental outcome is also called a sample point
Relative Frequency Method
Assigning probabilities based on experimentation or historical data. Example: Waiting time in the X-ray department for a local hospital.
Subjective Method:
Assigning probabilities based on judgement. Best probability estimates often are obtained by combining the estimates from the classical or relative frequency approach with the subjective estimate. Ultimately, a probability value should express our degree of belief that the experimental outcome will occur. It could be inappropriate to assign probabilities based solely on historical data.
Classical Method
Assigning probabilities based on the assumption of equally likely outcomes. Ex: Rolling a die. If an experiment has n possible outcomes, the classical method would assign a probability of 1/n to each outcome. So, experiment: rolling a die. Sample Space: S= {1,2,3,4,5,6}. Probabilities: Each sample point has a 1/6 chance of occurring.
Continuous random variables
Can assume any value in an interval on the real line or in a collection of intervals. It is not possible to talk about the probability of the random variable assuming a particular value. Instead, we talk about the probability of the random variable assuming a value within a given interval.
Elements, Population, Sample, Target and Sampled populations, and Frames:
Element: the entity on which data are collected. Population: a collection of all the elements of interest. Sample: a subset of the population. Target population: the population we want to make inferences about. Sampled population: the population from which the sample is drawn. Frame: a list of all the elements that the sample will be selected from.
Covariance and correlation between two variables are reported in the same units.
False
It is possible to talk about the probability of a continuous random variable assuming a particular value.
False
Systematic sampling can never lead to a biased sample.
False
The expected value of a discrete probability distribution must be a value the random variable can assume.
False
The probability of any given outcome in a discrete uniform probability distribution does not depend on the number of values the random variable may assume.
False
To compute the intersection of two events simply add their probabilities.
False
When dealing with the sampling distribution of the proportion, z-scores cannot be used to find the probability that a sample meets certain criteria.
False
Finite vs. Infinite Populations
Finite populations are often defined by lists such as: Organization membership rosters, Credit card account numbers, Inventory product numbers. A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected. With an infinite population, we want to select a sample but find it is not possible to obtain a list of all elements in the population. As a result, we cannot construct a frame for the population. Hence, we cannot use the random number selection procedure. There is no upper limit on the number of elements. Some examples are: parts being manufactured on a production line, transactions occurring at a bank, telephone calls arriving at a technical help desk, customers entering a store.
Systematic Sampling
If a sample of size n is desired from a population containing N elements, we might sample one element for every N/n elements in the population. We randomly select one of the first N/n elements from the population list. We then select every (N/n)th element that follows in the population list. This method has the properties of a simple random sample, especially if the list of the population elements is a random ordering. Advantage: The sample usually will be easier to identify than it would be if simple random sampling were used. Example: selecting every 100th listing in a telephone book after the first randomly selected listing.
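The procedure can be sketched as follows. The function name `systematic_sample` and the population of 5,000 numbered elements are hypothetical; the sketch assumes the population is available as an ordered list.

```python
import random

def systematic_sample(population, n, seed=None):
    """Select n elements by taking every k-th element after a random start."""
    rng = random.Random(seed)
    k = len(population) // n      # sampling interval, N/n rounded down
    start = rng.randrange(k)      # random position among the first k elements
    return population[start::k][:n]

population = list(range(5000))    # hypothetical list of N = 5000 elements
sample = systematic_sample(population, 50, seed=1)
print(len(sample))                # 50
```

With N = 5000 and n = 50, the interval is 100, so the sample is every 100th element after the random start.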
Sampling with and without replacement
If we replace each sampled element before selecting subsequent elements, we are sampling with replacement. Sampling without replacement, however, is the procedure used most often.
Convenience sampling
It is a nonprobability sampling technique. Items are included in the sample without known probabilities of being selected. The sample is identified primarily by convenience. Advantage: Sample selection and data collection are relatively easy. Disadvantage: It is impossible to determine how representative of the population the sample is.
Discrete Random Variables
May assume either a finite number of values or an infinite sequence of values. Example: An accountant taking a CPA exam. The exam has four parts; let the random variable x = the number of parts of the CPA examination passed. x may assume the finite number of values 0, 1, 2, 3, 4. Example with an infinite sequence of values: Cars arriving at a toll booth. Let x = number of cars arriving in one day, where x can take on the values 0, 1, 2, 3, .... We can count the cars arriving, but there is no finite upper limit on the number that might arrive.
Expected Value
Or the mean, of a random variable is a measure of its central location. E(x) = µ =∑xf(x). The expected value is a weighted average of the values the random variable may assume. The weights are the probabilities. The expected value does not have to be a value the random variable can assume.
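A minimal sketch of the formula E(x) = ∑xf(x), using one roll of a fair die as the example. It also shows that the expected value need not be a value the random variable can assume.

```python
from fractions import Fraction

# Probability function for one roll of a fair die: f(x) = 1/6 for x = 1..6
f = {x: Fraction(1, 6) for x in range(1, 7)}

# E(x) = sum of x * f(x): a probability-weighted average of the values
expected_value = sum(x * p for x, p in f.items())

print(expected_value)       # 7/2
print(expected_value in f)  # False: 3.5 is not a face of the die
```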
Stratified Random Sampling
The population is first divided into groups of elements called strata. Each element in the population belongs to one and only one stratum. Best results are obtained when the elements within each stratum are as much alike as possible (ex: a homogeneous group). An advantage is that if strata are homogeneous, this method is as "precise" as simple random sampling but with a smaller total sample size.
The Normal Random Variable
Probabilities for the normal random variable are given by areas under the curve. The total area under the curve is 1 (.5 to the left of the mean and .5 to the right)
Selecting a sample
We select a sample to collect data to answer a research question about a population. The sample results provide only estimates of the population characteristics, simply because the sample contains only a portion of the population.
Experiments
Statistical experiments: Probability determines outcomes. Even though the experiment is repeated in exactly the same way, an entirely different outcome may occur. For this reason, statistical experiments are sometimes called random experiments. Overall, an experiment is any process that generates well-defined outcomes
Variance
Summarizes the variability in the values of a random variable. Var(x) = σ^2 = Σ(x - µ)^2 f(x). The variance is a weighted average of the squared deviations of a random variable from its mean. The weights are the probabilities.
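A short sketch of the variance formula, again using one roll of a fair die as an illustrative distribution. The standard deviation is taken as the positive square root of the variance.

```python
import math
from fractions import Fraction

# Probability function for one roll of a fair die
f = {x: Fraction(1, 6) for x in range(1, 7)}

# Mean: probability-weighted average of the values
mu = sum(x * p for x, p in f.items())

# Variance: probability-weighted average of the squared deviations from the mean
variance = sum((x - mu) ** 2 * p for x, p in f.items())

# Standard deviation: positive square root of the variance
std_dev = math.sqrt(variance)

print(variance)           # 35/12
print(round(std_dev, 3))
```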
Complement of event
The complement of event A is defined to be the event consisting of all sample points that are not in A.
Big Data and Errors in Sampling
The difference between the value of a sample statistic and the corresponding value of the population parameter is called the sampling error. Deviations of the sample from the population that occur for reasons other than random sampling are referred to as nonsampling errors. Nonsampling error can occur in a sample or a census.
Intersection of two events
The intersection of events A and B is the set of all sample points that are in both A and B.
Normal Probability Distributions
The most important distribution for describing a continuous random variable. It is widely used in statistical inference. It has been used in a wide variety of applications including heights of people, test scores, rainfall amounts, and scientific measurements.
Multiplication Law
The multiplication law provides a way to compute the probability of the intersection of two events. Look on slides for formula.
Judgement sampling
The person most knowledgeable on the subject of the study selects elements of the population that he or she feels are most representative of the population. It is a nonprobability sampling technique. Advantage: It is a relatively easy way of selecting a sample. Disadvantage: The quality of the sample results depends on the judgement of the person selecting the sample.
Cluster Sampling
The population is first divided into separate groups of elements called clusters. Ideally, each cluster is a representative small-scale version of the population (ex: heterogeneous group). A simple random sample of the clusters is then taken. All elements within each sampled (chosen) cluster form the sample. Advantage: The close proximity of elements can be cost effective (ex: many sample observations can be obtained in a short time). Disadvantage: This method generally requires a larger total sample size than simple or stratified random sampling.
discrete probability distribution
The probability distribution for a random variable describes how probabilities are distributed over the values of the random variable. We can describe a discrete probability distribution with a table, graph, or formula. Two types of discrete probability distributions: First type: uses the rules of assigning probabilities to experimental outcomes to determine probabilities for each value of the random variable. Second type: uses a special mathematical formula to compute the probabilities for each value of the random variable. The required conditions for a discrete probability function are: f(x) >= 0 and Σf(x) = 1
Conditional Probability
The probability of an event given that another event has occurred is called conditional probability. The conditional probability of A given B is denoted by P(A|B).
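A minimal sketch of conditional probability, assuming the standard definition P(A|B) = P(A ∩ B) / P(B). The events are hypothetical: one roll of a fair die, with A = "the roll is even" and B = "the roll is greater than 3".

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
A = {x for x in sample_space if x % 2 == 0}  # even roll: {2, 4, 6}
B = {x for x in sample_space if x > 3}       # roll greater than 3: {4, 5, 6}

def prob(event):
    """Classical method: equally likely sample points."""
    return Fraction(len(event), len(sample_space))

# P(A|B) = P(A and B) / P(B)
p_a_given_b = prob(A & B) / prob(B)

print(p_a_given_b)  # 2/3
```

Rearranging the same definition gives P(A ∩ B) = P(B)P(A|B), which is the usual statement of the multiplication law.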
Bayes' Theorem
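A method for revising the probability of an event once new information is available: prior probabilities are combined with conditional probabilities to obtain updated (posterior) probabilities.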
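A sketch of the calculation for mutually exclusive events A1 and A2 and new information B, assuming the standard form P(Ai|B) = P(Ai)P(B|Ai) / ΣP(Aj)P(B|Aj). All numbers below are hypothetical, and `bayes_posteriors` is a name made up for this example.

```python
def bayes_posteriors(priors, likelihoods):
    """Return P(Ai | B) for each i, given P(Ai) and P(B | Ai)."""
    # Joint probabilities P(Ai and B) = P(Ai) * P(B | Ai)
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # P(B) is the sum of the joint probabilities
    total = sum(joint)
    return [j / total for j in joint]

priors = [0.65, 0.35]       # hypothetical P(A1), P(A2)
likelihoods = [0.02, 0.05]  # hypothetical P(B | A1), P(B | A2)
posteriors = bayes_posteriors(priors, likelihoods)

print([round(p, 4) for p in posteriors])
```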
Sample Spaces
The sample space for an experiment is the set of all experimental outcomes
Conjunction Fallacy
The tendency to assume that specific conditions are more probable than general ones
Availability Bias
The tendency to overestimate the likelihood of events with greater "availability" in memory.
Gambler's Fallacy
The tendency to think future probabilities are altered by past events, when in reality they are unchanged.
Union of two events
The union of events A and B is the event containing all sample points that are in A or B or both.
A continuous random variable can assume any value in an interval on the real line or in a collection of intervals.
True
According to the Central Limit Theorem the sampling distribution of the sample mean can be approximated by a normal distribution as the sample size becomes large.
True
Any normal probability distribution can be converted to a standard normal probability distribution through the use of z-scores.
True
As long as np > 5 AND n(1 - p) > 5 the probability distribution of the sample proportion can be approximated as a normal distribution.
True
If each individual variance of two variables is known and the variance of their sum is also known, it is easy to calculate the covariance between the two variables.
True
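The calculation behind this card can be sketched by rearranging the standard identity Var(x + y) = Var(x) + Var(y) + 2Cov(x, y). The variances below are hypothetical.

```python
def covariance_from_variances(var_x, var_y, var_sum):
    """Solve Var(x + y) = Var(x) + Var(y) + 2 Cov(x, y) for Cov(x, y)."""
    return (var_sum - var_x - var_y) / 2

# Hypothetical values: Var(x) = 4, Var(y) = 9, Var(x + y) = 19
print(covariance_from_variances(4, 9, 19))  # 3.0
```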
It is possible for a discrete random variable to assume either a finite number of values or an infinite sequence of values.
True
Probability values are always assigned on a scale from 0 to 1.
True
The hypergeometric probability distribution is closely related to the binomial distribution except that the trials are not independent and the probability of success changes from trial to trial.
True
The probability for a given range of a continuous random variable can be calculated by measuring the area underneath the density function in that range.
True
The stationarity assumption states that the probability of success in a given binomial distribution does not change from trial to trial.
True
Mutually exclusive events
Two events are said to be mutually exclusive if the events have no sample points in common. Two events are mutually exclusive if, when one event occurs, the other cannot occur. The probability of the intersection is 0, so the probability of the union is just P(A) + P(B).
Exponential Probability Distribution
Useful in describing the time it takes to complete a task. The exponential random variables can be used to describe: Time between vehicle arrivals at a toll booth. Time required to complete a questionnaire. Distance between major defects in a highway. In waiting line applications, the exponential distribution is often used for service times. A property of the exponential distribution is that the mean and standard deviation are equal. The exponential distribution is skewed to the right. Its skewness measure is 2.
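A minimal sketch of the exponential density and cumulative probability, assuming the usual forms f(x) = (1/µ)e^(-x/µ) and P(x <= x0) = 1 - e^(-x0/µ) for x >= 0, where µ is the mean. The value µ = 2 is used only as an example.

```python
import math

def exp_pdf(x, mu):
    """Exponential density with mean mu."""
    return math.exp(-x / mu) / mu if x >= 0 else 0.0

def exp_cdf(x0, mu):
    """P(x <= x0) for an exponential random variable with mean mu."""
    return 1 - math.exp(-x0 / mu) if x0 >= 0 else 0.0

mu = 2  # example mean (also the standard deviation)
print(exp_pdf(0, mu))            # 0.5
print(round(exp_cdf(2, mu), 4))
```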
uncertainties
What are the chances that sales will decrease if we increase prices? What is the likelihood a new assembly method will increase productivity? What are the odds that a new investment will be profitable?
Which distribution involves two random variables that may or may not have a numerical relationship with each other?
a bivariate discrete distribution
The __________ of event A is defined to be the event consisting of all sample points that are not in A.
complement
An exponential probability function with an expected population value of two will have a density function of:
f(x) = 1/2 e^(-x/2)
Point estimators: x-bar estimates μ, s estimates σ, p-bar estimates p
flip
If the probability of event A is not changed by the existence of event B, we would say that events A and B are:
independent
The __________ of events A and B is the set of all sample points that are in both A and B.
intersection
For cases of N objects taken n at a time the number of possible combinations is _____________ the number of possible permutations.
less than or equal to
Standard deviation
or, σ, is defined as the positive square root of the variance.
A random variable with which probability distribution is often useful in estimating the number of occurrences over a specified interval of time or space?
the Poisson distribution
Which distribution is used to calculate the probability of a given number of successes for a set number of trials where the only two options are success and failure?
the binomial distribution
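A sketch of the binomial probability function, assuming the standard form f(x) = C(n, x) p^x (1 - p)^(n - x). The values n = 3 and p = 0.3 are hypothetical.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 3, 0.3  # hypothetical number of trials and probability of success
for x in range(n + 1):
    print(x, round(binomial_pmf(x, n, p), 3))
```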
Which method assigns probabilities based on the assumption of equally likely outcomes?
the classical method
Which is the tendency to assume that specific conditions are more probable than general ones?
the conjunction fallacy
Which is the tendency to think future probabilities are altered by past events, when in reality they are unchanged?
the gambler's fallacy
For which type of continuous distribution are the mean and the median always the same?
uniform and normal distributions only
Standard error of the mean
Whenever the sample size is increased, the standard error of the mean decreases.
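A minimal sketch of why this holds, assuming the usual formula σ / sqrt(n) for the standard error of the mean. The optional finite population correction factor sqrt((N - n) / (N - 1)) is an assumption added here, applied only when n/N exceeds the 0.05 threshold mentioned earlier in these notes. The function name and numbers are hypothetical.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the sample mean for sample size n."""
    se = sigma / math.sqrt(n)
    # Assumption: apply the finite population correction only when n/N > 0.05
    if N is not None and n / N > 0.05:
        se *= math.sqrt((N - n) / (N - 1))
    return se

print(standard_error(10, 100))  # 1.0
print(standard_error(10, 400))  # 0.5
```

Quadrupling the sample size halves the standard error, because n enters the formula under a square root.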
The skewness of a normal distribution is:
zero