Chapter 4


Hypothesis testing using the normal model

1. First write the hypotheses in plain language, then set them up in mathematical notation.
2. Identify an appropriate point estimate of the parameter of interest.
3. Verify conditions to ensure the standard error estimate is reasonable and the point estimate is nearly normal and unbiased.
4. Compute the standard error. Draw a picture depicting the distribution of the estimate under the idea that H0 is true. Shade areas representing the p-value.
5. Using the picture and normal model, compute the test statistic (Z-score) and identify the p-value to evaluate the hypotheses. Write a conclusion in plain language.
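Steps 2-5 can be sketched as a short calculation. This is a minimal sketch of a one-sample Z-test; the sample values (x_bar, s, n) below are illustrative assumptions, not data from the text:

```python
from math import sqrt
from statistics import NormalDist

null_value = 3.09   # hypothesized mean under H0
x_bar = 2.78        # sample mean (point estimate; assumed for illustration)
s = 2.56            # sample standard deviation (assumed)
n = 100             # sample size (assumed)

se = s / sqrt(n)                          # step 4: standard error
z = (x_bar - null_value) / se             # step 5: test statistic (Z-score)
p_value = 2 * NormalDist().cdf(-abs(z))   # two-sided p-value

print(round(z, 2), round(p_value, 4))
```

With these assumed numbers the p-value exceeds 0.05, so we would fail to reject H0.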

When to retreat

1. The individual observations must be independent.
2. Other conditions focus on sample size and skew.

Confidence intervals for nearly normal point estimates

A confidence interval based on an unbiased and nearly normal point estimate is point estimate ± z*SE (4.43), where z* is selected to correspond to the confidence level, and SE represents the standard error. The value z*SE is called the margin of error.
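As a sketch, the interval arithmetic for a 95% confidence level (z* ≈ 1.96), using an assumed point estimate and standard error rather than values derived in the text:

```python
# point estimate ± z* x SE, for a 95% confidence level.
z_star = 1.96          # z* for 95% confidence
point_estimate = 2.78  # assumed for illustration
se = 0.26              # assumed standard error

margin_of_error = z_star * se
ci = (point_estimate - margin_of_error, point_estimate + margin_of_error)
print(ci)  # roughly (2.27, 3.29)
```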


confidence interval

A plausible range of values for the population parameter. If we report a point estimate, we probably will not hit the exact population parameter. On the other hand, if we report a range of plausible values - a confidence interval - we have a good shot at capturing the parameter.

The individual observations must be independent.

A random sample from less than 10% of the population ensures the observations are independent. In experiments, we generally require that subjects are randomized into groups. If independence fails, then advanced techniques must be used, and in some such cases, inference may not be possible.

Test statistic

A test statistic is a summary statistic that is particularly useful for evaluating a hypothesis test or identifying the p-value. When a point estimate is nearly normal, we use the Z-score of the point estimate as the test statistic. In later chapters we encounter situations where other test statistics are helpful.

Caution: One-sided hypotheses are allowed only before seeing data

After observing data, it is tempting to turn a two-sided test into a one-sided test. Avoid this temptation. Hypotheses must be set up before observing the data. If they are not, the test should be two-sided.

significance level

As a general rule of thumb, for those cases where the null hypothesis is actually true, we do not want to incorrectly reject H0 more than 5% of the time. This corresponds to a significance level of 0.05.

TIP: With larger n, the sampling distribution of x̄ becomes more normal

As the sample size increases, the normal model for x ̄ becomes more reasonable. We can also relax our condition on skew when the sample size is very large.

Interpreting confidence intervals

Correct interpretation: We are XX% confident that the population parameter is between...

Incorrect language might try to describe the confidence interval as capturing the population parameter with a certain probability. This is a common error: while it might be useful to think of it as a probability, the confidence level only quantifies how plausible it is that the parameter is in the interval.

Another important consideration of confidence intervals is that they only try to capture the population parameter. A confidence interval says nothing about the confidence of capturing individual observations, a proportion of the observations, or about capturing point estimates. Confidence intervals only attempt to capture population parameters.

Point estimates are not exact

Estimates are usually not exactly equal to the truth, but they get better as more data become available.

sampling variation

Estimates generally vary from one sample to another, and this sampling variation suggests our estimate may be close, but it will not be exactly equal to the parameter.

Other conditions focus on sample size and skew.

For example, if the sample size is too small, the skew too strong, or extreme outliers are present, then the normal model for the sample mean will fail.

Computing SE for the sample mean

Given n independent observations from a population with standard deviation σ, the standard error of the sample mean is equal to SE = σ/√n. A reliable method to ensure sample observations are independent is to conduct a simple random sample consisting of less than 10% of the population.
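A quick illustration of SE = σ/√n: because of the square root, quadrupling the sample size only halves the standard error.

```python
from math import sqrt

sigma = 1.0  # population standard deviation (illustrative)
# SE shrinks with the square root of n: 4x the data gives half the SE.
errors = {n: sigma / sqrt(n) for n in (25, 100, 400)}
print(errors)  # {25: 0.2, 100: 0.1, 400: 0.05}
```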

Central Limit Theorem, informal description

If a sample consists of at least 30 independent observations and the data are not strongly skewed, then the distribution of the sample mean is well approximated by a normal model.
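The informal CLT can be checked by simulation. Here samples of size 30 are drawn from a skewed exponential population (an illustrative choice, not from the text), and their means still center on the population mean:

```python
import random

random.seed(1)
pop_mean = 1.0  # mean of an Exponential(1) population
# Draw 2000 samples of size 30 and record each sample mean.
sample_means = [sum(random.expovariate(1.0) for _ in range(30)) / 30
                for _ in range(2000)]
avg_of_means = sum(sample_means) / len(sample_means)
print(round(avg_of_means, 2))  # close to 1.0
```

Plotting `sample_means` as a histogram would show a roughly symmetric, bell-shaped distribution even though the population itself is strongly right-skewed.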

Significance levels should reflect consequences of errors

If making a Type 1 Error is dangerous or especially costly, we should choose a small significance level (e.g. 0.01). Under this scenario we want to be very cautious about rejecting the null hypothesis, so we demand very strong evidence favoring HA before we would reject H0. If a Type 2 Error is relatively more dangerous or much more costly than a Type 1 Error, then we should choose a higher significance level (e.g. 0.10). Here we want to be cautious about failing to reject H0 when the null is actually false. The significance level selected for a test should reflect the consequences associated with Type 1 and Type 2 Errors.

How to verify sample observations are independent

If the observations are from a simple random sample and consist of fewer than 10% of the population, then they are independent. Subjects in an experiment are considered independent if they undergo random assignment to the treatment groups. If a sample is from a seemingly random process, e.g. the lifetimes of wrenches used in a particular manufacturing process, checking independence is more difficult. In this case, use your best judgement.

Confidence interval for any confidence level

If the point estimate follows the normal model with standard error SE, then a confidence interval for the population parameter is point estimate ± z*SE, where z* corresponds to the confidence level selected.
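z* for an arbitrary confidence level comes from the inverse normal CDF: for confidence level C, z* is the quantile that leaves (1 − C)/2 in each tail. A sketch using Python's standard library:

```python
from statistics import NormalDist

def z_star(confidence):
    # Quantile leaving (1 - confidence)/2 in each tail of the standard normal.
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

print(round(z_star(0.95), 2))  # 1.96
print(round(z_star(0.99), 2))  # 2.58
```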

Conditions for x̄ being nearly normal and SE being accurate

Important conditions to help ensure the sampling distribution of x̄ is nearly normal and the estimate of SE sufficiently accurate:
• The sample observations are independent.
• The sample size is large: n ≥ 30 is a good rule of thumb.
• The population distribution is not strongly skewed. This condition can be difficult to evaluate, so just use your best judgement.
Additionally, the larger the sample size, the more lenient we can be with the sample's skew.

Margin of error

In a confidence interval, z* × SE is called the margin of error.

TIP: It is useful to first draw a picture to find the p-value

It is useful to draw a picture of the distribution of x̄ as though H0 were true (i.e. μ equals the null value), and shade the region (or regions) of sample means that are at least as favorable to the alternative hypothesis. These shaded regions represent the p-value.

Caution: Examine data structure when considering independence

Some data sets are collected in such a way that they have a natural underlying structure between observations, e.g. when observations occur consecutively. Be especially cautious about independence assumptions regarding such data sets.

Caution: Watch out for strong skew and outliers

Strong skew is often identified by the presence of clear outliers. If a data set has prominent outliers, or such observations are somewhat common for the type of data under study, then it is useful to collect a sample with many more than 30 observations if the normal model will be used for x̄.

Central Limit Theorem, informal definition

The distribution of x̄ is approximately normal. The approximation can be poor if the sample size is small, but it improves with larger sample sizes.

Testing hypotheses using confidence intervals

The difference between x̄13 and 3.09 could be due to sampling variation, i.e. the variability associated with the point estimate when we take a random sample. Entering the sample mean, z*, and the standard error into the confidence interval formula results in (2.27, 3.29). We are 95% confident that the average number of days per week that all students from the 2013 YRBSS survey lifted weights was between 2.27 and 3.29 days. Because the average of all students from the 2011 YRBSS survey is 3.09, which falls within the range of plausible values from the confidence interval, we cannot say the null hypothesis is implausible. That is, we fail to reject the null hypothesis, H0.
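The decision rule in this example reduces to checking whether the null value lies inside the interval. A one-line sketch using the interval from the text:

```python
ci = (2.27, 3.29)   # 95% confidence interval from the example
null_value = 3.09   # 2011 YRBSS average (null value)

# Reject H0 only when the null value falls outside the interval.
reject_h0 = not (ci[0] <= null_value <= ci[1])
print(reject_h0)  # False -> fail to reject H0
```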

Null and alternative hypotheses

The null hypothesis (H0) often represents either a skeptical perspective or a claim to be tested. The alternative hypothesis (HA) represents an alternative claim under consideration and is often represented by a range of possible parameter values. The null hypothesis often represents a skeptical position or a perspective of no difference. The alternative hypothesis often represents a new perspective, such as the possibility that there has been a change.

evaluating hypothesis tests with p-values:

The null hypothesis represents a skeptic's position or a position of no difference. We reject this position only if the evidence strongly favors HA.
• A small p-value means that if the null hypothesis is true, there is a low probability of seeing a point estimate at least as extreme as the one we saw. We interpret this as strong evidence in favor of the alternative.
• We reject the null hypothesis if the p-value is smaller than the significance level, α, which is usually 0.05. Otherwise, we fail to reject H0.
• We should always state the conclusion of the hypothesis test in plain language so non-statisticians can also understand the results.
This method ensures that the Type 1 Error rate does not exceed the significance level standard.

p-value

The p-value is a way of quantifying the strength of the evidence against the null hypothesis and in favor of the alternative. Formally the p-value is a conditional probability. The p-value is the probability of observing data at least as favorable to the alternative hypothesis as our current data set, if the null hypothesis is true. We typically use a summary statistic of the data, in this chapter the sample mean, to help compute the p-value and evaluate the hypotheses.
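As a sketch, a Z-score converts to one- and two-sided p-values via the standard normal CDF (the Z value below is illustrative):

```python
from statistics import NormalDist

z = 1.88  # illustrative test statistic

# One-sided p-value: upper-tail area beyond z.
one_sided = 1 - NormalDist().cdf(z)
# Two-sided p-value: both tails, by symmetry of the normal model.
two_sided = 2 * one_sided
print(round(one_sided, 3), round(two_sided, 3))
```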

Point Estimate

The sample mean x̄ = 1.697 meters (5 feet, 6.8 inches) is called a point estimate of the population mean: if we can only choose one value to estimate the population mean, this is our best guess.

Sampling distribution

The sampling distribution represents the distribution of the point estimates based on samples of a fixed size from a certain population. It is useful to think of a particular point estimate as being drawn from such a distribution. Understanding the concept of a sampling distribution is central to understanding statistical inference. The sample means should tend to "fall around" the population mean.

p-value as a tool in hypothesis testing

The smaller the p-value, the stronger the data favor HA over H0. A small p-value (usually < 0.05) corresponds to sufficient evidence to reject H0 in favor of HA.

Standard error of an estimate

The standard deviation associated with an estimate is called the standard error. It describes the typical error or uncertainty associated with the estimate.

unbiased

We make another important assumption about each point estimate encountered in this section: the estimate is unbiased. A point estimate is unbiased if the sampling distribution of the estimate is centered at the parameter it estimates. That is, an unbiased estimate does not naturally over- or underestimate the parameter. Rather, it tends to provide a "good" estimate. The sample mean is an example of an unbiased point estimate, as are each of the examples we introduce in this section.

TIP: Always write the null hypothesis as an equality

We will find it most useful if we always list the null hypothesis as an equality (e.g. μ = 7) while the alternative always uses an inequality (e.g. μ ≠ 7, μ > 7, or μ < 7).

Checking for strong skew usually means checking for obvious outliers

When there are prominent outliers present, the sample should contain at least 100 observations, and in some cases, much more. This is a first course in statistics, so you won't have perfect judgement on assessing skew. That's okay. If you're in a bind, either consult a statistician or learn about the studentized bootstrap (bootstrap-t) method.

TIP: One-sided and two-sided tests

When you are interested in checking for an increase or a decrease, but not both, use a one-sided test. When you are interested in any difference from the null value - an increase or decrease - then the test should be two-sided.

Purpose of Statistical Inference

Statistical inference is concerned primarily with understanding the quality of parameter estimates. For example, a classic inferential question is, "How sure are we that the estimated mean, x̄, is near the true population mean, μ?"

How to Estimate population mean based on sample

To estimate the population mean based on the sample, the most intuitive approach is to simply take the sample mean.

A Type 2 Error

is failing to reject the null hypothesis when the alternative is actually true.

Type 1 Error

is rejecting the null hypothesis when H0 is actually true.

Basic properties of point estimates

Point estimates from a sample may be used to estimate population parameters. We also determined that these point estimates are not exact: they vary from one sample to another. Lastly, we quantified the uncertainty of the sample mean using what we call the standard error, mathematically represented in Equation

null value

represents the value of the parameter if the null hypothesis is true. H0: μ13 = 3.09, HA: μ13 ≠ 3.09, where 3.09 is the average number of days per week that students from the 2011 YRBSS lifted weights.

running mean

A running mean is a sequence of means, where each mean uses one more observation in its calculation than the mean directly before it in the sequence. For example, the second mean in the sequence is the average of the first two observations and the third in the sequence is the average of the first three.
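For example, a running mean can be computed from cumulative sums; the observations below are made up:

```python
from itertools import accumulate

obs = [4, 2, 6, 8]  # illustrative observations
# k-th running mean = (sum of the first k observations) / k
running_means = [total / k for k, total in enumerate(accumulate(obs), start=1)]
print(running_means)  # [4.0, 3.0, 4.0, 5.0]
```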

population parameters

e.g. the population median or population standard deviation

Statistical Inference

the theory, methods, and practice of forming judgments about the parameters of a population and the reliability of statistical relationships, typically on the basis of random sampling.

