econ terms 2


The mean of a discrete probability distribution is computed by the formula:

μ = Σ[x P(x)], where P(x) is the probability of a particular value x. In other words, multiply each x value by its probability of occurrence, and then add these products.
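As a rough illustration, the weighted sum described above can be sketched in Python. The distribution (number of cars sold on a Saturday) and the name `distribution_mean` are made up for this example and do not come from the source.

```python
def distribution_mean(dist):
    """Mean (expected value) of a discrete probability distribution.

    `dist` maps each value x to its probability P(x).
    """
    # Multiply each x by its probability, then add the products.
    return sum(x * p for x, p in dist.items())

# Hypothetical distribution: number of cars sold on a Saturday.
cars_sold = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.3, 4: 0.1}
mean_sold = distribution_mean(cars_sold)
print(round(mean_sold, 2))
```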

Standard normal value(z value)

z = (x − μ)/σ, where x is the value of any particular observation or measurement, μ is the mean of the distribution, and σ is the standard deviation of the distribution.

The Empirical Rule graph

picture

To summarize, there are four situations for finding the area under the standard normal probability distribution.

picture

Difference between the t distribution and z distribution graph

picture Note particularly that the t distribution is flatter, more spread out, than the standard normal distribution. This is because the standard deviation of the t distribution is larger than that of the standard normal distribution.

Characteristics of a normal distribution

picture of graph

Values of z and t for the 95% Level of Confidence

pictures

A few examples will further illustrate what is meant by a random variable.

* The number of employees absent from the day shift on Monday. The number might be 0, 1, 2, 3, . . . The number absent is the random variable.
* The hourly wage of a sample of 50 plumbers in Jacksonville, FL. The hourly wage is the random variable.
* The number of defective lightbulbs produced in an hour at the Cleveland Electric Company, Inc.
* The grade level (Freshman, Sophomore, Junior, or Senior) of the members of the St. James High School Varsity girls' basketball team. The grade level is the random variable, and notice that it is a qualitative variable.
* The number of participants in the 2016 New York City Marathon.
* The daily number of drivers charged with driving under the influence of alcohol in Brazoria County, Texas, last month.

To compute a confidence interval for a population mean, we will consider two situations:

* We use sample data to estimate μ with x bar, and the population standard deviation (σ) is known.
* We use sample data to estimate μ with x bar, and the population standard deviation is unknown. In this case, we substitute the sample standard deviation (s) for the population standard deviation (σ).
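A minimal Python sketch of the first situation (σ known), assuming the usual interval x bar ± z·σ/√n. The function name `z_interval` and the sample figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, level=0.95):
    """Confidence interval for a population mean when sigma is known."""
    # z leaves (1 - level)/2 in each tail of the standard normal curve.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample: n = 49, x bar = 100, known sigma = 14.
low, high = z_interval(100, 14, 49)
print(round(low, 2), round(high, 2))
```

When σ is unknown, the same structure applies with s in place of σ and a t value in place of z.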

BINOMIAL PROBABILITY EXPERIMENT

1. An outcome on each trial of an experiment is classified into one of two mutually exclusive categories: a success or a failure.
2. The random variable is the number of successes in a fixed number of trials.
3. The probability of success is the same for each trial.
4. The trials are independent, meaning that the outcome of one trial does not affect the outcome of any other trial.

To determine a confidence interval for the population mean with an unknown population standard deviation, we:

1. Assume the sampled population is either normal or approximately normal. This assumption may be questionable for small sample sizes, and becomes more valid with larger sample sizes.
2. Estimate the population standard deviation (σ) with the sample standard deviation (s).
3. Use the t distribution rather than the z distribution.
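The three steps above can be sketched in Python as follows. This is only an illustration: the sample values are invented, and the t value is passed in by hand because it would normally be read from a t table (2.776 is the 95% value for 4 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(sample, t_value):
    """Confidence interval for a mean when sigma is unknown.

    `t_value` is looked up in a t table for n - 1 degrees of freedom.
    """
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation, estimates sigma
    margin = t_value * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample of 5 observations, so df = 4 and t = 2.776 at 95%.
sample = [10, 12, 14, 16, 18]
low, high = t_interval(sample, t_value=2.776)
print(round(low, 3), round(high, 3))
```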

How to find the z value for a confidence interval

1. First, we divide the confidence level in half, so .9500/2 = .4750.
2. Next, we find the value .4750 in the body of Table 9-1. Note that .4750 is located in the table at the intersection of a row and a column.
3. Locate the corresponding row value in the left margin, which is 1.9, and the column value in the top margin, which is .06. Adding the row and column values gives us a z value of 1.96.
4. Thus, the probability of finding a z value between 0 and 1.96 is .4750.
5. Likewise, because the normal distribution is symmetric, the probability of finding a z value between −1.96 and 0 is also .4750.
6. When we add these two probabilities, the probability that a z value is between −1.96 and 1.96 is .9500.
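The table lookup can be mirrored in Python with the standard library's `statistics.NormalDist`. This sketch is not part of the source; the helper name `z_for_confidence` is an assumption made for the example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def z_for_confidence(level):
    """z value such that the area between -z and z equals `level`."""
    half = level / 2  # area between 0 and z, e.g. .4750 for 95%
    return std_normal.inv_cdf(0.5 + half)

z95 = z_for_confidence(0.95)
area_0_to_z = std_normal.cdf(z95) - 0.5  # should recover .4750
print(round(z95, 2), round(area_0_to_z, 4))
```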

The results of the central limit theorem allow us to make the following general confidence interval statements using z-statistics:

1. Ninety-five percent of all confidence intervals computed from random samples selected from a population will contain the population mean. These intervals are computed using a z-statistic equal to 1.96.
2. Ninety percent of all confidence intervals computed from random samples selected from a population will contain the population mean. These confidence intervals are computed using a z-statistic equal to 1.65.

To develop a confidence interval for a proportion, we need to meet two requirements:

1. The binomial conditions, discussed in Chapter 6, have been met. These conditions are:
   a. The sample data are the number of successes in n trials.
   b. There are only two possible outcomes. (We usually label one of the outcomes a "success" and the other a "failure.")
   c. The probability of a success remains the same from one trial to the next.
   d. The trials are independent. This means the outcome on one trial does not affect the outcome on another.
2. The values nπ and n(1 − π) should both be greater than or equal to 5. This allows us to invoke the central limit theorem and employ the standard normal distribution, that is, z, to complete a confidence interval.
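A hedged Python sketch of a confidence interval for a proportion, assuming the common form p ± z·√(p(1 − p)/n). Because π is unknown in practice, the sketch checks the second requirement with the observed counts of successes and failures, which is an assumption of this example. The 92-out-of-100 figures are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, level=0.95):
    """Confidence interval for a population proportion from x successes in n trials."""
    # Assumption: observed successes and failures stand in for n*pi and n*(1 - pi).
    if x < 5 or n - x < 5:
        raise ValueError("need at least 5 successes and 5 failures")
    p = x / n  # sample proportion
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

low, high = proportion_interval(92, 100)
print(round(low, 4), round(high, 4))
```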

In summary, we took all possible random samples from a population and for each sample calculated a sample statistic (the mean amount earned). This example illustrates important relationships between the population distribution and the sampling distribution of the sample mean:

1. The mean of the sample means is exactly equal to the population mean.
2. The dispersion of the sampling distribution of the sample mean is narrower than the population distribution.
3. The sampling distribution of the sample mean tends to become bell-shaped and to approximate the normal probability distribution.
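The first two relationships can be checked on a tiny made-up population by enumerating every possible sample. The earnings figures below are hypothetical, and samples are drawn without replacement.

```python
from itertools import combinations
from statistics import mean, pstdev

# Hypothetical population: hourly earnings of five employees.
population = [7, 7, 8, 8, 9]

# Every possible sample of size 2, and the mean of each one.
sample_means = [mean(s) for s in combinations(population, 2)]

pop_mean = mean(population)
mean_of_means = mean(sample_means)
pop_spread = pstdev(population)
spread_of_means = pstdev(sample_means)

print(pop_mean, mean_of_means)
print(round(pop_spread, 3), round(spread_of_means, 3))
```

The two means agree, and the sample means are less dispersed than the population values.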

CHARACTERISTICS OF A PROBABILITY DISTRIBUTION

1. The probability of a particular outcome is between 0 and 1 inclusive.
2. The outcomes are mutually exclusive.
3. The list of outcomes is exhaustive. So the sum of the probabilities of the outcomes is equal to 1.

POISSON PROBABILITY EXPERIMENT

1. The random variable is the number of times some event occurs during a defined interval.
2. The probability of the event is proportional to the size of the interval.
3. The intervals do not overlap and are independent.

When studying characteristics of a population, there are many practical reasons why we prefer to select portions or samples of a population to observe and measure. Here are some of the reasons for sampling:

1. To contact the whole population would be time-consuming. A candidate for a national office may wish to determine her chances for election. A sample poll using the regular staff and field interviews of a professional polling firm would take only 1 or 2 days. Using the same staff and interviewers and working 7 days a week, it would take nearly 200 years to contact all the voting population! Even if a large staff of interviewers could be assembled, the benefit of contacting all of the voters would probably not be worth the time.
2. The cost of studying all the items in a population may be prohibitive. Public opinion polls and consumer testing organizations, such as Harris Interactive Inc., CBS News Polls, and Zogby Analytics, usually contact fewer than 2,000 of the nearly 60 million families in the United States. One consumer panel-type organization charges $40,000 to mail samples and tabulate responses to test a product (such as breakfast cereal, cat food, or perfume). The same product test using all 60 million families would be too expensive to be worthwhile.
3. The physical impossibility of checking all items in the population. Some populations are infinite. It would be impossible to check all the water in Lake Erie for bacterial levels, so we select samples at various locations. The populations of fish, birds, snakes, deer, and the like are large and are constantly moving, being born, and dying. Instead of even attempting to count all the ducks in Canada or all the fish in Lake Pontchartrain, we make estimates using various techniques, such as counting all the ducks on a pond selected at random, tracking fish catches, or netting fish at predetermined places in the lake.
4. The destructive nature of some tests. If the wine tasters at the Sutter Home Winery in California drank all the wine to evaluate the vintage, they would consume the entire crop, and none would be available for sale. In the area of industrial production, steel plates, wires, and similar products must have a certain minimum tensile strength. To ensure that the product meets the minimum standard, the Quality Assurance Department selects a sample from the current production. Each piece is stretched until it breaks and the breaking point (usually measured in pounds per square inch) recorded. Obviously, if all the wire or all the plates were tested for tensile strength, none would be available for sale or use. For the same reason, only a few seeds are tested for germination by Burpee Seeds Inc. prior to the planting season.
5. The sample results are adequate. Even if funds were available, it is doubtful the additional accuracy of a 100% sample (that is, studying the entire population) is essential in most problems. For example, the federal government uses a sample of grocery stores scattered throughout the United States to determine the monthly index of food prices. The prices of bread, beans, milk, and other major food items are included in the index. It is unlikely that the inclusion of all grocery stores in the United States would significantly affect the index because the prices of milk, bread, and other major foods usually do not vary by more than a few cents from one chain store to another.

The following examples illustrate point estimates of population means.

1. Tourism is a major source of income for many Caribbean countries, such as Barbados. Suppose the Bureau of Tourism for Barbados wants an estimate of the mean amount spent by tourists visiting the country. It would not be feasible to contact each tourist. Therefore, 500 tourists are randomly selected as they depart the country and asked in detail about their spending while visiting Barbados. The mean amount spent by the sample of 500 tourists is an estimate of the unknown population parameter. That is, we let the sample mean serve as a point estimate of the population mean.
2. Litchfield Home Builders Inc. builds homes in the southeastern region of the United States. One of the major concerns of new buyers is the date when the home will be completed. Recently, Litchfield has been telling customers, "Your home will be completed 45 working days from the date we begin installing drywall." The customer relations department at Litchfield wishes to compare this pledge with recent experience. A sample of 50 homes completed this year revealed that the point estimate of the population mean is 46.7 working days from the start of drywall to the completion of the home. Is it reasonable to conclude that the population mean is still 45 days and that the difference between the sample mean (46.7 days) and the proposed population mean (45 days) is sampling error? In other words, is the sample mean significantly different from the population mean?
3. Recent medical studies indicate that exercise is an important part of a person's overall health. The director of human resources at OCF, a large glass manufacturer, wants an estimate of the number of hours per week employees spend exercising. A sample of 70 employees reveals the mean number of hours of exercise last week is 3.3. This value is a point estimate of the unknown population mean.

Probability distribution

A listing of all the outcomes of an experiment and the probability associated with each outcome. It describes the likelihoods for a range of possible future outcomes.

CLUSTER SAMPLING

A population is divided into clusters using naturally occurring geographic or other boundaries. Then, clusters are randomly selected and a sample is collected by randomly selecting from each cluster. Another common type of sampling is cluster sampling. It is often employed to reduce the cost of sampling a population scattered over a large geographic area.

stratified random sampling

A population is divided into subgroups, called strata, and a sample is randomly selected from each stratum. When a population can be clearly divided into groups based on some characteristic, we may use stratified random sampling. It guarantees each group is represented in the sample. A random sample from each stratum is taken in a number proportional to the stratum's size when compared to the population. Once the strata are defined, we apply simple random sampling within each group or stratum to collect the sample.

SAMPLING DISTRIBUTION OF THE SAMPLE MEAN

A probability distribution of all possible sample means of a given sample size.

SYSTEMATIC RANDOM SAMPLE

A random starting point is selected, and then every kth member of the population is selected.
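A small Python sketch of the idea, assuming the random start is drawn from the first k members. The invoice numbers and the function name `systematic_sample` are hypothetical.

```python
import random

def systematic_sample(population, k, rng=random):
    """Pick a random start among the first k members, then take every kth member."""
    start = rng.randrange(k)      # random starting index, 0 to k - 1
    return population[start::k]   # every kth member from that point on

# Hypothetical population: invoices numbered 1 to 100, sampling every 10th.
invoices = list(range(1, 101))
sample = systematic_sample(invoices, k=10, rng=random.Random(42))
print(sample)
```

A seeded `random.Random` is passed in only so that the example is repeatable.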

DISCRETE RANDOM VARIABLE

A random variable that can assume only certain clearly separated values. A discrete random variable can assume only a certain number of separated values. For example, the Bank of the Carolinas counts the number of credit cards carried for a group of customers. The data are summarized with the following relative frequency table. A discrete random variable can, in some cases, assume fractional or decimal values. To be a discrete random variable, these values must be separated, that is, have distance between them.

CONFIDENCE INTERVAL

A range of values constructed from sample data so that the population parameter is likely to occur within that range at a specified probability. The specified probability is called the level of confidence.

SIMPLE RANDOM SAMPLE

A sample selected so that each item or person in the population has the same chance of being included.

RANDOM VARIABLE

A variable measured or observed as the result of an experiment. By chance, the variable can have different values.

probability distribution

As with discrete random variables, the likelihood of a continuous random variable can be summarized with a probability distribution. For example, with a probability distribution for the flight time between Atlanta and Los Angeles, we could say that there is a probability of 0.90 that the flight will be less than 4.5 hours

Something interesting to know about the t and z table

Before doing the confidence interval exercises, we would like to point out a useful characteristic of the t distribution that will allow us to use the t table to quickly find both z and t values. Earlier in this section, we detailed the characteristics of the t distribution. The last point indicated that as we increase the sample size the t distribution approaches the z distribution. In fact, when we reach an infinitely large sample, the t distribution is exactly equal to the z distribution. To explain, Table 9-3 is a portion of Appendix B.5, with the degrees of freedom between 4 and 99 omitted. To find the appropriate z value for a 95% confidence interval, we begin by going to the confidence interval section and selecting the column headed "95%." Move down that column to the last row, which is labeled "∞," or infinite degrees of freedom. The value reported is 1.960, the same value that we found using the standard normal distribution in Appendix B.3. This confirms the convergence of the t distribution to the z distribution. What does this mean for us? Instead of searching in the body of the z table, we can go to the last row of the t table and find the appropriate value to build a confidence interval. An additional benefit is that the values have three decimal places. So, using this table for a 90% confidence interval, go down the column headed "90%" and see the value 1.645, which is a more precise z value that can be used for the 90% confidence level.

Before using systematic random sampling, we should carefully observe the physical order of the population.

Before using systematic random sampling, we should carefully observe the physical order of the population. When the physical order is related to the population characteristic, then systematic random sampling should not be used because the sample could be biased. For example, if we wanted to audit the invoices in a file drawer that were ordered in increasing dollar amounts, systematic random sampling would not guarantee an unbiased random sample. Other sampling methods should be used.

POINT ESTIMATE examples

Buy Inc. wants to estimate the mean age of people who purchase LCD HDTV televisions. They select a random sample of 75 recent purchases, determine the age of each buyer, and compute the mean age of the buyers in the sample. The mean of this sample is a point estimate of the population mean.

Binomial probability

P(x) = nCx · π^x · (1 − π)^(n − x), where C denotes a combination, n is the number of trials, x is the random variable defined as the number of successes, and π is the probability of a success on each trial.
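For illustration, the standard binomial formula P(x) = nCx · π^x · (1 − π)^(n − x) can be computed in Python with `math.comb`. The coin-flip numbers are an invented example.

```python
from math import comb

def binomial_probability(n, x, pi):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)

# Hypothetical example: exactly 2 heads in 5 flips of a fair coin.
p_two_heads = binomial_probability(5, 2, 0.5)
print(p_two_heads)

# The probabilities over all possible x values should add up to 1.
total = sum(binomial_probability(5, x, 0.5) for x in range(6))
```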

examples of stratified random sampling

For example, college students can be grouped as full time or part time; as male or female; or as freshman, sophomore, junior, or senior. Usually the strata are formed based on members' shared attributes or characteristics.

Samples are used to estimate population characteristics.

For example, the mean of a sample is used to estimate the population mean. However, since the sample is a part or portion of the population, it is unlikely that the sample mean would be exactly equal to the population mean. We can therefore expect a difference between a sample statistic and its corresponding population parameter. This difference is called sampling error.

Important to know

Fortunately we can use the sample standard deviation to estimate the population standard deviation. That is, we use s, the sample standard deviation, to estimate σ, the population standard deviation. But in doing so, we cannot use formula (9-1). Because we do not know σ, we cannot use the z distribution. However, there is a remedy. We use the sample standard deviation and replace the z distribution with the t distribution.

CENTRAL LIMIT THEOREM

If all samples of a particular size are selected from any population, the sampling distribution of the sample mean is approximately a normal distribution. This approximation improves with larger samples. The central limit theorem states that, for large random samples, the shape of the sampling distribution of the sample mean is close to the normal probability distribution. The approximation is more accurate for large samples than for small samples. We can reason about the distribution of the sample mean with absolutely no information about the shape of the population distribution from which the sample is taken. In other words, the central limit theorem is true for all population distributions.

sample proportion

If we let p represent the sample proportion, x the number of "successes," and n the number of items sampled, we can determine a sample proportion as p = x/n. The population proportion is identified by π. Therefore, π refers to the percent of successes in the population.

How do π and n affect the shape of the distribution

If π, the probability of success, remains the same but n becomes larger, the shape of the binomial distribution becomes more symmetrical. If n remains the same but π increases from .05 to .95, the shape of the distribution changes. The distribution for a π of .05 is positively skewed. As π approaches .50, it becomes symmetrical. As π goes beyond .50 and moves toward .95, it becomes negatively skewed.

random variable.

In any experiment of chance, the outcomes occur randomly. So it is often called a random variable.

The normal probability distribution has the following characteristics

It is bell-shaped and has a single peak at the center of the distribution. The arithmetic mean, median, and mode are equal and located in the center of the distribution. The total area under the curve is 1.00. Half the area under the normal curve is to the right of this center point and the other half, to the left of it. It is symmetrical about the mean. If we cut the normal curve vertically at the center value, the shapes of the curves will be mirror images. Also, the area of each half is 0.5. It falls off smoothly in either direction from the central value. That is, the distribution is asymptotic: The curve gets closer and closer to the X-axis but never actually touches it. To put it another way, the tails of the curve extend indefinitely in both directions. The location of a normal distribution is determined by the mean, μ. The dispersion or spread of the distribution is determined by the standard deviation, σ.

standard normal probability distribution

It is called the standard normal probability distribution, and it is unique because it has a mean of 0 and a standard deviation of 1. Any normal probability distribution can be converted into a standard normal probability distribution by subtracting the mean from each observation and dividing this difference by the standard deviation. The results are called z values or z scores.

The following characteristics of the t distribution are based on the assumption that the population of interest is normal, or nearly normal

It is, like the z distribution, a continuous distribution. It is, like the z distribution, bell-shaped and symmetrical. There is not one t distribution, but rather a family of t distributions. All t distributions have a mean of 0, but their standard deviations differ according to the sample size, n. There is a t distribution for a sample size of 20, another for a sample size of 22, and so on. The standard deviation for a t distribution with 5 observations is larger than for a t distribution with 20 observations. The t distribution is more spread out and flatter at the center than the standard normal distribution (see Chart 9-1). As the sample size increases, however, the t distribution approaches the standard normal distribution because the errors in using s to estimate σ decrease with larger samples.

The Empirical Rule

It states that if a random variable is normally distributed, then:
1. Approximately 68% of the observations will lie within plus and minus one standard deviation of the mean.
2. About 95% of the observations will lie within plus and minus two standard deviations of the mean.
3. Practically all, or 99.7% of the observations, will lie within plus and minus three standard deviations of the mean.
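The three percentages can be checked against the standard normal curve with Python's `statistics.NormalDist`. This is a sketch, and `share_within` is a name chosen for the example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

The exact shares are close to, but not identical with, the rounded 68%, 95%, and 99.7% in the rule.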

table of random numbers

Of course, the process of writing all the players' names on a slip of paper is very time-consuming. A more convenient method of selecting a random sample is to use a table of random numbers

How to Apply the Correction Factor

Only four cases may arise. These cases are:
1. For the probability that at least x occur, use the area above (x − .5).
2. For the probability that more than x occur, use the area above (x + .5).
3. For the probability that x or fewer occur, use the area below (x + .5).
4. For the probability that fewer than x occur, use the area below (x − .5).
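The four cases can be sketched in Python as below. The sketch assumes the usual normal approximation to the binomial, with mean nπ and standard deviation √(nπ(1 − π)); the function names and the values n = 80, π = .70 are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_normal(n, pi):
    """Normal curve used to approximate a binomial with parameters n and pi."""
    return NormalDist(mu=n * pi, sigma=sqrt(n * pi * (1 - pi)))

def at_least(x, n, pi):    # case 1: area above x - .5
    return 1 - binomial_normal(n, pi).cdf(x - 0.5)

def more_than(x, n, pi):   # case 2: area above x + .5
    return 1 - binomial_normal(n, pi).cdf(x + 0.5)

def at_most(x, n, pi):     # case 3: area below x + .5
    return binomial_normal(n, pi).cdf(x + 0.5)

def fewer_than(x, n, pi):  # case 4: area below x - .5
    return binomial_normal(n, pi).cdf(x - 0.5)

n, pi = 80, 0.70
approx = at_least(60, n, pi)
# Exact binomial probability of 60 or more successes, for comparison.
exact = sum(comb(n, k) * pi**k * (1 - pi) ** (n - k) for k in range(60, n + 1))
print(round(approx, 4), round(exact, 4))
```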

Also recall that this value is called the standard error.

Recall that the sampling distribution of the sample mean is the distribution of all sample means, x bar, of sample size n from a population. The population standard deviation, σ, is known. From this information, and the central limit theorem, we know that the sampling distribution follows the normal probability distribution with a mean of μ and a standard deviation σ/√n. Also recall that this value is called the standard error.

Advantages of stratified sampling

Stratified sampling has the advantage, in some cases, of more accurately reflecting the characteristics of the population than does simple random or systematic random sampling.

Examples of cluster sampling

Suppose you want to determine the views of residents in the greater Chicago, Illinois, metropolitan area about state and federal environmental protection policies. Selecting a random sample of residents in this region and personally contacting each one would be time-consuming and very expensive. Instead, you could employ cluster sampling by subdividing the region into small units, perhaps by counties. These are often called primary units. There are 12 counties in the greater Chicago metropolitan area. Suppose you randomly select 3 counties. The 3 chosen are La Porte, Cook, and Kenosha (see Chart 8-1 below). Next, you select a random sample of the residents in each of these counties and interview them. This is also referred to as sampling through an intermediate unit. In this case, the intermediate unit is the county. (Note that this is a combination of cluster sampling and simple random sampling.)

Here are some examples where we wish to estimate the population means and it is unlikely we would know the population standard deviations.

The Dean of the Business College wants to estimate the mean number of hours full-time students work at paying jobs each week. He selects a sample of 30 students, contacts each student, and asks them how many hours they worked last week. From the sample information, he can calculate the sample mean, but it is not likely he would know or be able to find the population standard deviation (σ) required in formula (9-1). The Dean of Students wants to estimate the distance the typical commuter student travels to class. She selects a sample of 40 commuter students, contacts each, and determines the one-way distance from each student's home to the center of campus. From the sample data, she calculates the mean travel distance, that is, x bar. It is unlikely the standard deviation of the population would be known or available, again making formula (9-1) unusable. The Director of Student Loans wants to estimate the mean amount owed on student loans at the time of his/her graduation. The director selects a sample of 20 graduating students and contacts each to find the information. From the sample information, the director can estimate the mean amount. However, to develop a confidence interval using formula (9-1), the population standard deviation is necessary. It is not likely this information is available.

Poisson Probability Distribution

The Poisson probability distribution describes the number of times some event occurs during a specified interval. Examples of an interval may be time, distance, area, or volume.
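As a hedged illustration, the standard Poisson formula P(x) = μ^x · e^(−μ) / x! (not shown in the text above) can be written in Python. The mean of 0.3 lost bags per flight is an invented figure.

```python
from math import exp, factorial

def poisson_probability(x, mu):
    """Probability of exactly x occurrences when the mean per interval is mu."""
    return mu**x * exp(-mu) / factorial(x)

# Hypothetical example: a mean of 0.3 lost bags per flight.
mu = 0.3
p_none = poisson_probability(0, mu)
p_one = poisson_probability(1, mu)
print(round(p_none, 4), round(p_one, 4))
```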

Binomial probability distribution

The binomial probability distribution is a widely occurring discrete probability distribution.

The central limit theorem indicates that

The central limit theorem indicates that, regardless of the shape of the population distribution, the sampling distribution of the sample mean will move toward the normal probability distribution. The larger the number of observations sampled or selected, the stronger the convergence. The central limit theorem, defined on page 265, does not say anything about the dispersion of the sampling distribution of the sample mean or about the comparison of the mean of the sampling distribution of the sample mean to the mean of the population.

The formula for the variance of a probability distribution (and the standard deviation)

The variance is σ² = Σ[(x − μ)² P(x)]. The computational steps are:
1. Subtract the mean from each value of the random variable, and square this difference.
2. Multiply each squared difference by its probability.
3. Sum the resulting products to arrive at the variance.
The standard deviation, σ, is found by taking the positive square root of the variance; that is, σ = √σ².
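The computational steps can be sketched in Python as follows. The distribution is a made-up example, and the function names are assumptions for illustration.

```python
from math import sqrt

def distribution_variance(dist):
    """Variance of a discrete distribution given as {x: P(x)}."""
    mu = sum(x * p for x, p in dist.items())
    # Square each deviation from the mean, weight it by P(x), and sum.
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def distribution_std_dev(dist):
    """Standard deviation: positive square root of the variance."""
    return sqrt(distribution_variance(dist))

# Hypothetical distribution: number of cars sold on a Saturday.
cars_sold = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.3, 4: 0.1}
variance = distribution_variance(cars_sold)
std_dev = distribution_std_dev(cars_sold)
print(round(variance, 2), round(std_dev, 3))
```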

SAMPLING ERROR

The difference between a sample statistic and its corresponding population parameter

What two assumptions is Poisson Probability Distribution based upon?

The distribution is based on two assumptions. The first is that the probability is proportional to the length of the interval. The second assumption is that the intervals are independent. Poisson probability distribution is a discrete probability distribution because it is formed by counting.

To describe experimental outcomes with a binomial distribution, there are four requirements.

The first requirement is that there are only two possible outcomes on a particular experimental trial. For example, on a test, a true/false question is either answered correctly or incorrectly. In a resort, a housekeeping supervisor reviews an employee's work and evaluates it as acceptable or unacceptable. A key characteristic of the two outcomes is that they must be mutually exclusive. This means that the answer to a true/false question must be either correct or incorrect but cannot be both correct and incorrect at the same time. Another example is the outcome of a sales call. Either a customer purchases or does not purchase the product, but the sale cannot result in both outcomes. Frequently, we refer to the two possible outcomes of a binomial experiment as a "success" and a "failure." However, this distinction does not imply that one outcome is good and the other is bad, only that there are two mutually exclusive outcomes.
The second binomial requirement is that the random variable is the number of successes for a fixed and known number of trials. For example, we flip a coin five times and count the number of times a head appears in the five flips, we randomly select 10 employees and count the number who are older than 50 years of age, or we randomly select 20 boxes of Kellogg's Raisin Bran and count the number that weigh more than the amount indicated on the package. In each example, we count the number of successes from the fixed number of trials.
The third requirement is that we know the probability of a success and it is the same for each trial. Two examples are: For a test with 10 true/false questions, we know there are 10 trials and the probability of correctly guessing the answer for any of the 10 trials is 0.5. Or, for a test with 20 multiple-choice questions with four options and only one correct answer, we know that there are 20 trials and the probability of randomly guessing the correct answer for each of the 20 trials is 0.25.
The final requirement of a binomial probability distribution is that each trial is independent of any other trial. Independent means there is no pattern to the trials. The outcome of a particular trial does not affect the outcome of any other trial.

PROPORTION definition and example

The fraction, ratio, or percent indicating the part of the sample or the population having a particular trait of interest. As an example of a proportion, a recent survey indicated that 92 out of 100 people surveyed favored the continued use of daylight saving time in the summer. The sample proportion is 92/100, or .92, or 92%. If we let p represent the sample proportion, x the number of "successes," and n the number of items sampled, we can determine a sample proportion as p = x/n.

exponential distribution formula

The graph of the exponential distribution starts at the value of λ when the random variable's (x) value is 0. The distribution declines steadily as we move to the right with increasing values of x. Formula (7-6), P(x) = λe^(−λx), describes the exponential probability distribution with λ as the rate parameter. It is a pleasant surprise that both the mean and the standard deviation of the exponential probability distribution are equal to 1/λ.
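A short Python sketch of working with the exponential distribution, assuming the standard cumulative form P(arrival time < x) = 1 − e^(−λx), which is not stated in the text above. The rate of one website hit every 20 seconds is a hypothetical value.

```python
from math import exp

def exponential_density(x, lam):
    """Height of the exponential curve at x; equals lam when x is 0."""
    return lam * exp(-lam * x)

def exponential_cdf(x, lam):
    """Probability that the waiting time is less than x."""
    return 1 - exp(-lam * x)

lam = 1 / 20              # hypothetical rate: one hit every 20 seconds on average
mean_wait = 1 / lam       # mean and standard deviation are both 1/lam
p_within_10 = exponential_cdf(10, lam)
print(round(mean_wait, 1), round(p_within_10, 4))
```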

The mean

The mean is a typical value used to represent the central location of a probability distribution. It also is the long-run average value of the random variable. The mean of a probability distribution is also referred to as its expected value. It is a weighted average where the possible values of a random variable are weighted by their corresponding probabilities of occurrence.

an important conclusion

1. The mean of the distribution of sample means will be exactly equal to the population mean if we are able to select all possible samples of the same size from a given population. That is, μ = μ_x̄ (the x bar is a subscript marking the mean of the sample means; it is not μ times x). Even if we do not select all samples, we can expect the mean of the distribution of sample means to be close to the population mean.
2. There will be less dispersion in the sampling distribution of the sample mean than in the population. If the standard deviation of the population is σ, the standard deviation of the distribution of sample means is σ/√n. Note that when we increase the size of the sample, the standard error of the mean decreases.

When can you use the normal approximation to the binomial?

The normal probability distribution is a good approximation to the binomial probability distribution when nπ and n(1 − π) are both at least 5. However, before we apply the normal approximation, we must make sure that our distribution of interest is in fact a binomial distribution. Recall from Chapter 6 that four criteria must be met: 1. There are only two mutually exclusive outcomes to an experiment: a "success" and a "failure." 2. The distribution results from counting the number of successes in a fixed number of trials. 3. The probability of a success, π, remains the same from trial to trial. 4. Each trial is independent.

Something to know:

The sample mean, x bar, is not the only point estimate of a population parameter. For example, p, a sample proportion, is a point estimate of π, the population proportion; and s, the sample standard deviation, is a point estimate of σ, the population standard deviation.

The exponential distribution usually describes situations such as:

The service time for customers at the information desk of the Dallas Public Library. The time between "hits" on a website. The lifetime of a kitchen appliance. The time until the next phone call arrives in a customer service center. The exponential probability distribution is positively skewed. Another feature of the exponential distribution is its close relationship to the Poisson distribution.

Z VALUE

The signed distance between a selected value, designated x, and the mean, μ, divided by the standard deviation, σ. So, a z value is the distance from the mean, measured in units of the standard deviation. The formula for this conversion is: z = (x − μ)/σ. Therefore, the z distribution has all the characteristics of any normal probability distribution.
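A short sketch of the conversion, using invented numbers (x = 1,100, μ = 1,000, σ = 100), with the standard normal area below z looked up through Python's `statistics.NormalDist`:

```python
from statistics import NormalDist

def z_value(x, mu, sigma):
    """Signed distance from the mean in units of the standard deviation."""
    return (x - mu) / sigma

z = z_value(1100, 1000, 100)  # hypothetical observation, mean, and sigma
print(z)
print(NormalDist().cdf(z))    # area under the standard normal curve below z
```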

What two values affect the standard error?

The size of the standard error is affected by two values. The first is the standard deviation of the population. The larger the population standard deviation, σ, the larger σ/√n. If the population is homogeneous, resulting in a small population standard deviation, the standard error will also be small. However, the standard error is also affected by the number of observations in the sample. A large number of observations in the sample will result in a small standard error of estimate, indicating that there is less variability in the sample means.

POINT ESTIMATE

The statistic, computed from sample information, that estimates a population parameter. A point estimate is a single statistic used to estimate a population parameter.

continuous random variable examples

The times of commercial flights between Atlanta and Los Angeles are 4.67 hours, 5.13 hours, and so on. The random variable is the time in hours and is measured on a continuous scale of time. The annual snowfall in Minneapolis, Minnesota. The random variable is the amount of snow, measured on a continuous scale.

The uniform probability distribution

The uniform probability distribution is the simplest distribution for a continuous random variable. This distribution is rectangular in shape and is completely defined by its minimum and maximum values. Here are some examples that follow a uniform distribution.

CONTINUITY CORRECTION FACTOR

The value .5 subtracted or added, depending on the question, to a selected value when a discrete probability distribution is approximated by a continuous probability distribution. Because we use the normal distribution to determine the binomial probability of 60 or more successes, we must subtract, in this case, .5 from 60. The value .5 is called the continuity correction factor. This small adjustment is made because a continuous distribution (the normal distribution) is being used to approximate a discrete distribution (the binomial distribution).
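To see the correction at work, the sketch below compares the exact binomial probability of 60 or more successes with the normal approximation that starts at 59.5. The values n = 80 and π = .70 are assumptions chosen for illustration; the card itself states only the "60 or more" part.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_at_least(k, n, pi):
    """Exact P(X >= k) for n trials with success probability pi."""
    return sum(comb(n, x) * pi**x * (1 - pi)**(n - x) for x in range(k, n + 1))

def normal_approx_at_least(k, n, pi):
    """Normal approximation to P(X >= k) with the .5 continuity correction."""
    if n * pi < 5 or n * (1 - pi) < 5:
        raise ValueError("n*pi and n*(1 - pi) should both be at least 5")
    mu = n * pi                      # binomial mean
    sigma = sqrt(n * pi * (1 - pi))  # binomial standard deviation
    # "k or more" begins at k - .5 on the continuous scale
    return 1 - NormalDist(mu, sigma).cdf(k - 0.5)

n, pi = 80, 0.70  # hypothetical number of trials and success probability
print(binomial_at_least(60, n, pi))
print(normal_approx_at_least(60, n, pi))
```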

variance of a Poisson distribution

The variance of a Poisson distribution is equal to its mean.

Applications of the Poisson Probability Distribution

This probability distribution has many applications. It is used as a model to describe the distribution of errors in data entry, the number of scratches and other imperfections in newly painted car panels, the number of defective parts in outgoing shipments, the number of customers waiting to be served at a restaurant or waiting to get into an attraction at Disney World, and the number of accidents on I-75 during a three-month period.

further illustrating the central limit theorem

To further illustrate the central limit theorem, if the population follows a normal probability distribution, then for any sample size the sampling distribution of the sample mean will also be normal. If the population distribution is symmetrical (but not normal), you will see the normal shape of the distribution of the sample mean emerge with samples as small as 10. On the other hand, if you start with a distribution that is skewed or has thick tails, it may require samples of 30 or more to observe the normality feature. This concept is summarized in Chart 8-3 for various population shapes. Observe the convergence to a normal distribution regardless of the shape of the population distribution.

how to know when to use z or t

We base the decision on whether to use t or z on whether or not we know σ, the population standard deviation. If we know the population standard deviation, then we use z. If we do not know the population standard deviation, then we must use the t distribution rather than the z distribution.

something to know

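We can use the normal distribution (a continuous distribution) as a substitute for a binomial distribution (a discrete distribution) for large values of n because, as n increases, a binomial distribution gets closer and closer to a normal distribution. The normal probability distribution is a good approximation to the binomial probability distribution when nπ and n(1 − π) are both at least 5. However, before we apply the normal approximation, we must make sure that our distribution of interest is in fact a binomial distribution. Recall from Chapter 6 that four criteria must be met: 1. There are only two mutually exclusive outcomes to an experiment: a "success" and a "failure." 2. The distribution results from counting the number of successes in a fixed number of trials. 3. The probability of a success, π, remains the same from trial to trial. 4. Each trial is independent.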

Choosing an appropriate sample size

When working with confidence intervals, one important variable is sample size. However, in practice, sample size is not a variable. It is a decision we make so that our estimate of a population parameter is a good one. Our decision is based on three variables: 1. The margin of error the researcher will tolerate. 2. The level of confidence desired, for example, 95%. 3. The variation or dispersion of the population being studied.
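The three variables combine in a standard textbook formula for estimating a mean, n = (zσ/E)², which is not written out in these cards and is assumed here. A minimal sketch with invented inputs (margin of error 100, 95% confidence, σ estimated at 1,000):

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(margin_of_error, confidence, sigma):
    """Sample size from n = (z * sigma / E)^2, rounded up to a whole number."""
    # Two-sided z value for the chosen level of confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin_of_error) ** 2)

# Hypothetical inputs: E = 100, 95% confidence, sigma = 1000
print(sample_size_for_mean(100, 0.95, 1000))
```

A smaller margin of error, a higher level of confidence, or a more dispersed population each pushes the required sample size up.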

things to know about the Poisson distribution and chapter 6 stuff

You can estimate a binomial probability using a Poisson distribution. The Poisson can be thought of as an approximation for the binomial when n, the number of trials, is large and π, the probability of success, is small. As n gets larger and π smaller, the difference between the two distributions gets smaller. The Poisson probability distribution is always positively skewed, and the random variable has no specific upper limit. As the mean increases, it becomes more symmetrical.
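A quick way to see the approximation is to print both probabilities side by side. The sketch below uses the Poisson formula P(x) = μ^x e^(−μ)/x! with μ = nπ; the values n = 1,000 and π = .002 are made up for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, pi):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

def poisson_pmf(x, mu):
    """Poisson probability of x occurrences when the mean is mu."""
    return mu**x * exp(-mu) / factorial(x)

n, pi = 1000, 0.002  # hypothetical: many trials, small success probability
mu = n * pi          # Poisson mean used for the approximation

for x in range(5):
    print(x, round(binomial_pmf(x, n, pi), 4), round(poisson_pmf(x, mu), 4))
```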

continuous random variable

can assume an infinite number of values within a given range. It is measured on a continuous interval or ratio scale.

There are two types of random variables

discrete or continuous

Normal probability distributions with equal means but different standard deviations

picture

examples of the uniform probability distribution

ex) The sales of gasoline at the Kwik Fill in Medina, New York, follow a uniform distribution that varies between 2,000 and 5,000 gallons per day. The random variable is the number of gallons sold per day and is continuous within the interval between 2,000 gallons and 5,000 gallons.
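For the gasoline example, a minimal sketch of the usual uniform-distribution calculations: mean (a + b)/2, standard deviation √((b − a)²/12), and probability as height times base. The function names and the 3,000 to 4,000 gallon question are illustrative assumptions.

```python
from math import sqrt

def uniform_mean(a, b):
    """Mean of a uniform distribution on [a, b]."""
    return (a + b) / 2

def uniform_std_dev(a, b):
    """Standard deviation of a uniform distribution on [a, b]."""
    return sqrt((b - a) ** 2 / 12)

def uniform_probability(low, high, a, b):
    """P(low <= X <= high): rectangle area with height 1 / (b - a)."""
    low, high = max(low, a), min(high, b)  # clip to the distribution's range
    if high <= low:
        return 0.0
    return (high - low) / (b - a)

a, b = 2000, 5000  # minimum and maximum gallons sold per day
print(uniform_mean(a, b))
print(uniform_std_dev(a, b))
print(uniform_probability(3000, 4000, a, b))  # hypothetical question
```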

Area for uniform distribution

Area = (height)(base) = [1/(b − a)] × (b − a) = 1.00

Confidence Intervals for a population with a σ known

x̄ ± z(σ/√n)

Standard error of the mean

σx̄ = σ/√n

The equation for the uniform probability distribution is a

P(x) = 1/(b − a) if a ≤ x ≤ b, and 0 elsewhere

The standard deviation describes the dispersion of a distribution. In the uniform distribution, the standard deviation is also related to the interval between the maximum and minimum values

σ = √((b − a)²/12)

To develop a confidence interval for the population mean using the t distribution, we adjust formula (9-1) as follows.

x̄ ± t(s/√n)

confidence interval for a population proportion

p ± z√(p(1 − p)/n)
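A minimal sketch of this interval, assuming the usual form p ± z√(p(1 − p)/n) and using a survey in which 92 of 100 people share the trait of interest. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(x, n, confidence=0.95):
    """Confidence interval for a population proportion from x successes in n."""
    p = x / n                                           # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided z value
    margin = z * sqrt(p * (1 - p) / n)                  # z times standard error
    return p - margin, p + margin

low, high = proportion_confidence_interval(92, 100)
print(round(low, 4), round(high, 4))
```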

finding a probability using the exponential distribution

P(arrival time < x) = 1 − e^(−λx)

mean of a uniform distribution

μ = (a + b)/2

finding the z value of the x bar when the standard deviation is known

When we sample from populations, we are interested in the distribution of x̄, the sample mean, instead of x, the value of one observation. That is the first change we make in formula (7-5). The second is that we use the standard error of the mean of n observations instead of the population standard deviation. That is, we use σ/√n in the denominator rather than σ. Therefore, to find the likelihood of a sample mean within a specified range, we first use the following formula to find the corresponding z value: z = (x̄ − μ)/(σ/√n). Then we use Appendix B.3 or statistical software to determine the probability.
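The two changes can be sketched as follows, with `statistics.NormalDist` standing in for the table lookup. The numbers (μ = 31.2, σ = 0.4, n = 16, x̄ = 31.38) are assumptions made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_for_sample_mean(x_bar, mu, sigma, n):
    """z value of a sample mean, using the standard error sigma / sqrt(n)."""
    standard_error = sigma / sqrt(n)
    return (x_bar - mu) / standard_error

# Hypothetical values: population mean 31.2, sigma 0.4, sample of 16
z = z_for_sample_mean(31.38, 31.2, 0.4, 16)
probability_above = 1 - NormalDist().cdf(z)  # P(sample mean > 31.38)
print(round(z, 2), round(probability_above, 4))
```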

Normal probability distributions

P(x) = [1/(σ√(2π))] e^(−(x − μ)²/(2σ²)), where π (pi) and e are constants in this formula.

How do we determine a 95% confidence interval?

x̄ ± 1.96(σ/√n). The width of the interval is determined by two factors: (1) the level of confidence, as described in the previous section, and (2) the size of the standard error of the mean. To find the standard error of the mean, recall from the previous chapter [see formula (8-1)] that the standard error of the mean reports the variation in the distribution of sample means. It is really the standard deviation of the distribution of sample means. The formula is repeated below: σx̄ = σ/√n. σx̄ is the symbol for the standard error of the mean. We use a Greek letter because it is a population value, and the subscript x̄ reminds us that it refers to a sampling distribution of the sample means. σ is the population standard deviation. n is the number of observations in the sample.
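A minimal sketch of the σ-known interval, x̄ ± z(σ/√n), showing how the level of confidence and the standard error set the width. The sample values (x̄ = 50, σ = 10, n = 25) are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval_sigma_known(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the population mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    standard_error = sigma / sqrt(n)                    # sigma of sample means
    margin = z * standard_error
    return x_bar - margin, x_bar + margin

# Hypothetical sample: mean 50, known sigma 10, 25 observations
low, high = confidence_interval_sigma_known(50, 10, 25)
print(round(low, 2), round(high, 2))
```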

The t distribution is a continuous probability distribution, with many similar characteristics to the z distribution.

t = (x̄ − μ)/(s/√n), where s is an estimate of σ.

Mean of a Poisson distribution

μ = nπ, where n is the total number of trials and π the probability of success.

Determining When to Use the z Distribution or the t Distribution

picture

Poisson Distribution

P(x) = μ^x e^(−μ)/x!, where μ (mu) is the mean number of occurrences (successes) in a particular interval. e is the constant 2.71828 (base of the Napierian logarithmic system). x is the number of occurrences (successes). P(x) is the probability for a specified value of x.

Variance of a binomial distribution

σ² = nπ(1 − π), where n is the number of trials and π is the probability of a success on each trial.

mean of a binomial distribution

μ = nπ, where n is the number of trials and π is the probability of a success on each trial.
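The shortcut formulas μ = nπ and σ² = nπ(1 − π) can be checked against the long way of computing a mean and variance from the full distribution. The sketch below assumes n = 6 and π = .3 purely for illustration.

```python
from math import comb

def binomial_distribution(n, pi):
    """Full binomial distribution as {number of successes: probability}."""
    return {x: comb(n, x) * pi**x * (1 - pi)**(n - x) for x in range(n + 1)}

n, pi = 6, 0.3  # hypothetical number of trials and success probability
dist = binomial_distribution(n, pi)

# Long way: weight each outcome by its probability
mean = sum(x * p for x, p in dist.items())
variance = sum((x - mean) ** 2 * p for x, p in dist.items())

print(mean, n * pi)                 # compare with the shortcut n * pi
print(variance, n * pi * (1 - pi))  # compare with n * pi * (1 - pi)
```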

