Statistics Chapter 8

Important Reminders

1. Our method of calculation assumes that the data comes from an SRS of size n from the population of interest. 2. The margin of error in a confidence interval covers only chance variation due to random sampling or random assignment.

Confidence Intervals: A Four-Step Process

State: What parameter do you want to estimate, and at what confidence level? Plan: Identify the appropriate inference method. Check the conditions. Do: If the conditions are met, perform calculations. Conclude: Interpret your interval in the context of the problem. Tidbit: Remember that the margin of error in a confidence interval includes only sampling variability. There are other sources of error that are not taken into account, such as non-response and response bias.

Standard Error of the Sample Mean

The standard error of the sample mean is SE = s/√n, where s is the sample standard deviation. It describes how far the sample mean will typically be from the population mean in repeated SRSs of size n.
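
A minimal Python sketch of this quantity, assuming the usual formula SE = s/√n with s the sample standard deviation. The function name and the data values are invented for illustration.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Estimate the standard error of the sample mean as s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return s / math.sqrt(n)

# Hypothetical data, invented for illustration only.
sample = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 10.0, 13.0]
se = standard_error_of_mean(sample)
print(round(se, 3))  # prints 0.707
```

Here the sample mean is 13 and s = 2, so SE = 2/√8.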

One-Sample z Interval for a Population Mean

Draw an SRS of size n from a population having unknown mean µ and known standard deviation σ. As long as the Normal and Independent conditions are met, a level C confidence interval for µ is x-bar ± z*·σ/√n, where x-bar is the sample mean. The critical value z* is found from the standard Normal distribution. The margin of error of this interval, z*·σ/√n, can also be used to find the sample size needed to achieve a specified margin of error.
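
The calculation can be sketched in Python, assuming the interval has the standard form x-bar ± z*·σ/√n. The function name and the numbers (x-bar = 50, σ = 10, n = 25) are hypothetical.

```python
import math
from statistics import NormalDist

def z_interval_mean(x_bar, sigma, n, confidence=0.95):
    """One-sample z interval for a mean when sigma is known."""
    # z* leaves area C between -z* and z* under the standard Normal curve.
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_star * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical summary values, invented for illustration only.
low, high = z_interval_mean(x_bar=50.0, sigma=10.0, n=25, confidence=0.95)
print(round(low, 2), round(high, 2))  # prints 46.08 53.92
```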

Formula for Confidence Interval

statistic ± (critical value) · (standard deviation of statistic), where the product (critical value) · (standard deviation of statistic) is the margin of error. The critical value depends on both the confidence level C and the sampling distribution of the statistic. The statistic we use is the point estimator for the parameter.

Margin of Error

We would like a high confidence and a small margin of error. The margin of error gets smaller when 1. The confidence level decreases. 2. The sample size n increases.
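
Both effects can be seen numerically. The sketch below assumes the z-based margin of error z*·σ/√n with a known σ; the value σ = 10 and the function name are hypothetical.

```python
import math
from statistics import NormalDist

def margin_of_error(sigma, n, confidence):
    """Margin of error z* * sigma / sqrt(n) for a mean with known sigma."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    return z_star * sigma / math.sqrt(n)

sigma = 10.0  # hypothetical population standard deviation
for confidence in (0.90, 0.95, 0.99):
    for n in (25, 100, 400):
        print(confidence, n, round(margin_of_error(sigma, n, confidence), 2))
```

Reading down the output, the margin shrinks as n grows within each confidence level and grows as the confidence level rises for a fixed n. Because n sits under a square root, quadrupling the sample size halves the margin.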

Standard Error (SE)

When the standard deviation of a statistic is estimated from data, the result is called the standard error of the statistic.

Point Estimator and Point Estimate

A point estimator is a statistic that provides an estimate of a population parameter. The value of that statistic from a sample is called a point estimate. Ideally, a point estimate is our "best guess" at the value of an unknown parameter.

Conditions for Inference about a Population Mean

1. Random: The data come from a random sample of size n from the population of interest or from a randomized experiment. This condition is very important. 2. Normal: The population has a Normal distribution OR the sample size is large (n ≥ 30). 3. Independent: The methods for calculating a confidence interval assume that individual observations are independent. To keep the calculations reasonably accurate when we sample without replacement from a finite population, we should check the 10% condition: verify that the sample size is no more than 1/10 of the population size. Tidbit: When the actual df does not appear in the t table, use the greatest df available that is less than your desired df.
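
The numeric parts of these conditions can be written as a quick check. This is only a sketch: whether the data really come from a random sample or experiment, and whether the population is Normal, are judgments that must be supplied by the user, and the function and argument names are hypothetical.

```python
def conditions_for_mean(n, population_size, population_is_normal=False,
                        random_design=True):
    """Report which conditions for inference about a mean appear to hold."""
    return {
        # Random: cannot be computed; the caller states how the data were produced.
        "random": random_design,
        # Normal: population Normal OR large sample (n >= 30).
        "normal": population_is_normal or n >= 30,
        # Independent: 10% condition for sampling without replacement.
        "independent": n <= population_size / 10,
    }

# Hypothetical study: 40 observations sampled from a population of 1000.
checks = conditions_for_mean(n=40, population_size=1000)
print(checks)
```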

Conditions for Constructing a Confidence Interval

1. Random: The data come from a well-designed random sample or randomized experiment. Random sampling allows us to generalize the results to a larger population; random assignment allows us to draw cause-and-effect conclusions. 2. Normal: The sampling distribution of the statistic is approximately Normal, which is required to compute the confidence interval. 3. Independent: Individual observations are independent. When sampling without replacement, the sample size n should be no more than 10% of the population size N (the 10% condition) to use our formula for the standard deviation of the statistic.

Using One-Sample t Procedures: The Normal Condition

1. Sample size less than 15. Use t procedures if the data appear close to Normal (roughly symmetric, single peak, no outliers). If the data are clearly skewed or if outliers are present, do not use t. 2. Sample size at least 15. The t procedures can be used except in the presence of outliers or strong skewness. 3. Large samples: The t procedures can be used even for clearly skewed distributions when the sample is large, roughly n≥30. If your sample data would give a biased estimate for some reason, then you shouldn't bother computing a t interval. Or if the data you have are the entire population of interest, then there's no need to perform inference (because you would know the true parameter value).

Confidence Interval, Margin of Error, Confidence Level

A confidence interval (also referred to as an interval estimate) for a parameter has two parts: 1. An interval calculated from the data, which has the form estimate ± margin of error. The margin of error tells how close the estimate tends to be to the unknown parameter in repeated random sampling. 2. A confidence level C, which gives the overall success rate of the method for calculating the confidence interval. That is, in C% of all possible samples, the method would yield an interval that captures the true parameter value.

Robust Procedures

An inference procedure is called robust if the probability calculations involved in that procedure remain fairly accurate when a condition for using the procedure is violated. The stated confidence level of a one-sample t interval for the population mean is exactly correct when the population distribution is exactly Normal. Therefore, the usefulness of the t procedures in practice depends on how strongly they are affected by lack of Normality. The t procedures are NOT robust against outliers, because the sample mean and standard deviation are not resistant to outliers. Fortunately, the t procedures are quite robust against non-Normality of the population except when outliers or strong skewness are present. Larger samples improve the accuracy of critical values from the t distributions when the population is not Normal. This is true for two reasons: 1. The sampling distribution of the sample mean from a large sample is close to Normal (central limit theorem). Normality of the individual observations is of little concern when the sample size is large. 2. As the sample size n grows, the sample standard deviation will be an accurate estimate of the population standard deviation whether or not the population has a Normal distribution. Tidbit: The condition that the data come from a random sample or randomized experiment is more important than the condition that the population distribution is Normal.

One-Sample z-interval for a Population Proportion

Choose an SRS of size n from a large population that contains an unknown proportion p of successes. An approximate level C confidence interval for p is p-hat ± z*·√(p-hat(1 − p-hat)/n), where p-hat is the sample proportion. Because we do not know p, we replace it with p-hat, a value that should be close to p. The notation "z*" is the critical value for the standard Normal curve with area C between -z* and z*. Use this interval only when the numbers of successes and failures in the sample are both at least 10 and the population is at least 10 times as large as the sample.
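
A minimal Python sketch, assuming the standard form p-hat ± z*·√(p-hat(1 − p-hat)/n) and the "at least 10 successes and 10 failures" rule stated above. The function name and the counts (120 successes in 200 trials) are hypothetical, and the 10% condition is left to the user because it depends on the population size.

```python
import math
from statistics import NormalDist

def z_interval_proportion(successes, n, confidence=0.95):
    """Approximate one-sample z interval for a population proportion."""
    failures = n - successes
    if successes < 10 or failures < 10:
        raise ValueError("need at least 10 successes and 10 failures")
    p_hat = successes / n
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical sample, invented for illustration only.
low, high = z_interval_proportion(successes=120, n=200)
print(round(low, 3), round(high, 3))  # prints 0.532 0.668
```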

The One-Sample t Interval for a Population Mean

Choose an SRS of size n from a population having unknown mean µ. A level C confidence interval for µ is x-bar ± t*·s/√n, where x-bar is the sample mean, s is the sample standard deviation, and t* is the critical value for the t distribution with n − 1 degrees of freedom. Use this interval only when 1) the population distribution is Normal OR the sample size is large (n ≥ 30), and 2) the population is at least 10 times as large as the sample.
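
A short Python sketch of the calculation, assuming the standard form x-bar ± t*·s/√n. Python's standard library has no t distribution, so the critical value is passed in as read from a t table; the function name and the data are hypothetical.

```python
import math
import statistics

def t_interval_mean(sample, t_star):
    """One-sample t interval; t_star is looked up for df = n - 1."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation
    margin = t_star * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical data, invented for illustration only (n = 8, so df = 7).
sample = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 10.0, 13.0]
T_STAR_95_DF7 = 2.365  # t table value for 95% confidence, df = 7
low, high = t_interval_mean(sample, T_STAR_95_DF7)
print(round(low, 2), round(high, 2))  # prints 11.33 14.67
```

With only 8 observations, the interval should be trusted only if the data look roughly symmetric with no outliers.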

Interpreting Confidence Levels and Confidence Intervals

Confidence level: To say that we are 95% confident is shorthand for "95% of all possible samples of a given size from this population will result in an interval that captures the unknown parameter." 1. The confidence level tells us how likely it is that the method we are using will produce an interval that captures the population parameter if we use it many times. 2. The confidence level does NOT tell us the chance that a particular confidence interval captures the population parameter. Confidence interval: To interpret a C% confidence interval for any unknown parameter, say, "We are C% confident that the interval from ____ to ____ captures the actual value of the [population parameter in context]." 1. Confidence intervals are statements about parameters, not the sample statistic. Tidbit: Plausible does not mean the same thing as possible. Some would argue that just about any value of a parameter is possible. A plausible value of a parameter is a reasonable or believable value based on the data.
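
The "95% of all possible samples" idea can be illustrated with a small simulation. This sketch assumes a Normal population with known σ and uses the z interval for a mean; the population values (µ = 100, σ = 15), the sample size, and the function name are all hypothetical.

```python
import math
import random
from statistics import NormalDist

def coverage_rate(mu, sigma, n, confidence, reps, seed=0):
    """Fraction of simulated z intervals that capture the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_star * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / reps

rate = coverage_rate(mu=100.0, sigma=15.0, n=20, confidence=0.95, reps=2000)
print(rate)
```

The printed rate should land near 0.95. Any single interval in the simulation either captures µ or does not, which is why the confidence level describes the method rather than one particular interval.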

The t Distributions; Degrees of Freedom

Draw an SRS of size n from a large population that has a Normal distribution with mean µ and standard deviation σ. The statistic t = (x-bar − µ)/(s/√n), where x-bar is the sample mean and s is the sample standard deviation, has the t distribution with degrees of freedom df = n − 1. This statistic will have approximately a t distribution with n − 1 degrees of freedom as long as the sampling distribution of the sample mean is close to Normal. 1. The density curves of the t distributions are similar in shape to the standard Normal curve. They are symmetric about 0, single-peaked, and bell-shaped. 2. The spread of the t distributions is a bit greater than that of the standard Normal distribution. The t distributions have more probability in the tails and less in the center than does the standard Normal. This is true because substituting the sample standard deviation s (an estimate) for the fixed parameter σ introduces more variation into the statistic. 3. As the degrees of freedom increase, the t density curve approaches the standard Normal curve ever more closely. This happens because the sample standard deviation estimates the population standard deviation more accurately as the sample size increases. So using the sample standard deviation in place of the population standard deviation causes little extra variation when the sample is large.
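
The heavier tails can be checked by simulation. The sketch below assumes the statistic t = (x-bar − µ)/(s/√n) computed from samples of a standard Normal population, and counts how often |t| exceeds 1.96, the cutoff that leaves 5% in the tails of the standard Normal curve. The function name, sample sizes, and number of repetitions are hypothetical choices.

```python
import math
import random

def tail_fraction(n, cutoff, reps, seed=0):
    """Fraction of simulated t statistics with |t| greater than cutoff."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    extreme = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true mean is 0
        x_bar = sum(sample) / n
        s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
        t = (x_bar - 0.0) / (s / math.sqrt(n))
        if abs(t) > cutoff:
            extreme += 1
    return extreme / reps

small = tail_fraction(n=5, cutoff=1.96, reps=5000)    # df = 4
large = tail_fraction(n=100, cutoff=1.96, reps=5000)  # df = 99
print(small, large)
```

With df = 4 the tail fraction should come out well above 0.05, while with df = 99 it should sit close to 0.05, consistent with the t curve approaching the standard Normal curve as the degrees of freedom increase.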

Choosing Sample Size for a Desired Margin of Error When Estimating a Population Mean

To determine the sample size n that will yield a level C confidence interval for a population mean with a specified margin of error ME: 1. Get a reasonable value for the population standard deviation σ from an earlier or pilot study. 2. Find the critical value z* from a standard Normal curve for confidence level C. 3. Set the expression for the margin of error, z*·σ/√n, to be less than or equal to ME and solve for n, rounding up to the next whole number. Notice that it is the size of the sample that determines the margin of error. The size of the population does not influence the sample size we need. This is true as long as the population is much larger than the sample.
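
The three steps can be sketched in Python. Solving z*·σ/√n ≤ ME for n gives n ≥ (z*·σ/ME)², rounded up. The function name and the inputs (a guessed σ of 15 and a desired margin of 2) are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma_guess, margin, confidence=0.95):
    """Smallest n with z* * sigma / sqrt(n) <= margin."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    return math.ceil((z_star * sigma_guess / margin) ** 2)

# Hypothetical planning values, invented for illustration only.
n = sample_size_for_mean(sigma_guess=15.0, margin=2.0, confidence=0.95)
print(n)  # prints 217
```

Rounding up rather than to the nearest integer matters: with n = 216 the margin of error would be slightly larger than 2.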

Sample Size for Desired Margin of Error

To determine the sample size n that will yield a level C confidence interval for a population proportion p with a maximum margin of error ME, solve the inequality z*·√(p-hat(1 − p-hat)/n) ≤ ME for n, where p-hat is a guessed value for the sample proportion. The margin of error will always be less than or equal to ME if you take the guess p-hat to be 0.5.
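
A minimal Python sketch, assuming the inequality has the standard form z*·√(p-hat(1 − p-hat)/n) ≤ ME, which rearranges to n ≥ (z*/ME)²·p-hat(1 − p-hat). The function name and the 3-percentage-point margin are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence=0.95, p_guess=0.5):
    """Smallest n with z* * sqrt(p(1 - p) / n) <= margin for the guessed p."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    return math.ceil((z_star / margin) ** 2 * p_guess * (1 - p_guess))

# Conservative guess p-hat = 0.5, margin of error 0.03, 95% confidence.
n = sample_size_for_proportion(margin=0.03)
print(n)  # prints 1068
```

The guess 0.5 is conservative because p-hat(1 − p-hat) is largest at 0.5, so any other guess gives a smaller required n.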

