Statistics Ch. 16


if a study fails to reject Ho, either:

Ho was true: no error was made
Ho is false: a Type II error was made

we have used critical values of 1, 2, and 3 when we talked about the 68-95-99.7 Rule

In testing hypotheses, people are more likely to choose alpha levels like 0.1, 0.05, 0.01, or 0.001, with corresponding two-sided z* critical values of 1.645, 1.96, 2.576, and 3.29. For a one-sided alternative, divide the alpha level in half when looking up z* (so z* = 1.645, listed for two-sided α = 0.1, is the one-sided critical value for α = 0.05).
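
As a quick check, these z* values can be reproduced from the inverse Normal CDF; a minimal sketch in Python, assuming scipy is installed:

    # Reproduce the two-sided z* critical values quoted above.
    from scipy.stats import norm

    for alpha in (0.10, 0.05, 0.01, 0.001):
        z_star = norm.ppf(1 - alpha / 2)  # two-sided: alpha/2 in each tail
        print(f"alpha = {alpha}: z* = {z_star:.3f}")
    # prints 1.645, 1.960, 2.576, 3.291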

How often will a type I error occur?

It happens when the null hypothesis is true but we've had the bad luck to draw an unusual sample. To reject Ho, the P-value must fall below α, and when Ho is true, that happens exactly with probability α. So when you choose level α, you're setting the probability of a Type I error to α: P(Type I error) = α. What if Ho is not true? Then we can't possibly make a Type I error; you can't get a false positive from a sick person. A Type I error can happen only when Ho is true.
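
A small simulation makes this concrete: when Ho is true, a level-α test rejects about α of the time. This is only an illustrative sketch; the null p = 0.5, n = 100, and the one-sided z-test are invented for the example:

    # Simulate many samples drawn with Ho true and count false rejections.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)
    p0, n, alpha, trials = 0.5, 100, 0.05, 100_000

    p_hat = rng.binomial(n, p0, size=trials) / n    # Ho really is true here
    z = (p_hat - p0) / np.sqrt(p0 * (1 - p0) / n)   # one-proportion z statistic
    p_values = 1 - norm.cdf(z)                      # one-sided Ha: p > p0
    print((p_values < alpha).mean())                # near 0.05 (counts are
                                                    # discrete, so not exact)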

Statistical Significance:

P-value < α

Alpha Level

The threshold P-value that determines when we reject a null hypothesis. It is labeled with the Greek letter α; α = 0.05 is commonly used. You have to select the alpha level before you look at the data.

Significance Level

alpha level is also called this. When we reject the null hypothesis, we say that the test is "significant at that level"; for example, we reject the null hypothesis "at the 5% level of significance."

we could reduce β for all alternative parameter values by increasing α:

by making it easier to reject the null, we'd be more likely to reject it whether it's true or not. So we'd reduce β, the chance that we fail to reject a false null, but we'd make more Type I errors.

If the sample is not large enough...

even a large, financially or scientifically important effect may not be statistically significant

Large Samples

even a small, unimportant effect size can be statistically significant

Critical Value

for every null model, an α-level specifies an area under the curve. The cut-point for that area is called this. For the Normal and t-models, the critical values are z* and t*.

knowing the effect size and the sample size...

helps determine the power. But when we design a study, we won't know the effect size, so we can only imagine possible effect sizes and look at their consequences.

the smaller the P-value is, the more confident we can be in rejecting the null hypothesis

however, it does not make the null hypothesis any more false

Lower Figure shows the true model

if the true value of p is greater than po, then we are more likely to observe a value that exceeds the critical value, p*, and make the correct decision to reject the null hypothesis

the automatic nature of the reject/fail-to-reject decision when we use the alpha level may make you uncomfortable:

if your P-value falls just slightly above your alpha level, you are not allowed to reject the null, yet a P-value barely below the alpha level leads to rejection. It is better to report the P-value rather than to base a decision on an arbitrary alpha level.

Lower Model Example: suppose the null hypothesis is not true

in reality, the null hypothesis is rarely exactly true. The lower model supposes that the true value is p, not po; the lower figure shows the distribution of possible observed p-hat values around this true value. Sample proportions will vary from sample to sample, and because of this sampling variability, sometimes p-hat will be less than p* and we fail to reject the (false) null hypothesis. A low p-hat (near po) results in a Type II error.

Visualize these concepts using a simple one-sided hypothesis test about a proportion for illustration:

in testing Ho: p = po against the one-sided alternative Ha: p > po, we'll reject the null if the observed proportion, p-hat, is big enough. By "big enough" we mean p-hat greater than p* for some critical value p* (shown as the red region in the right tail of the upper curve).
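
For a concrete sketch of where p* comes from (the numbers po = 0.30, n = 400, and α = 0.05 here are invented, not from the text):

    # Critical value p* for Ho: p = po vs. Ha: p > po at level alpha.
    import math
    from scipy.stats import norm

    p0, n, alpha = 0.30, 400, 0.05
    sd0 = math.sqrt(p0 * (1 - p0) / n)       # SD of p-hat under the null model
    p_star = p0 + norm.ppf(1 - alpha) * sd0  # cut-point with area alpha above it
    print(round(p_star, 4))                  # reject Ho whenever p-hat > p*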

P-value > 0.5

indicates that the sample was on the wrong side of the inequality: the observed statistic fell on the opposite side of the null value from the one-sided alternative

P-Value

is a conditional probability. It tells us the probability of getting results at least as unusual as the observed statistic, given that the null hypothesis is true. It's a probability about the data.

the power of the test, the probability that we make the right decision...

is shown as the region to the right of p* (in the lower model example); it is 1 - β

the only way to reduce both types of errors:

is to collect more evidence, that is, to collect more data: increase the sample size

the effect size is central to how we think about the power of a hypothesis test:

it is easier to detect larger effects, so for the same sample size, a larger effect size results in higher power

Large P-Values

just mean that what we've observed isn't surprising given the null model. A large P-value does not prove that the null hypothesis is true, but it offers no evidence that it's not true. All we can say is that we do not reject the null hypothesis.

making the SD smaller...

makes it possible to increase the power while reducing the Type I error rate as well

we calculate p* based on the upper model because p* depends only on the null model and the alpha level:

no matter what the true proportion is, p* does not change; we always reject Ho when p-hat is greater than p*

the power of a test is the probability that it rejects a false null hypothesis:

obtaining a larger sample size decreases the probability of a Type II error, so it increases the power. Also, the more willing we are to accept a Type I error, the less likely we will be to make a Type II error.

Specificity

of a test measures its ability to correctly identify the healthy (the ones who should test negative): # of true negatives / (# of true negatives + # of false positives). Specificity = 1 - α; it gives you the probability that the test will correctly identify you as healthy when you are healthy.

when designing a study it is important to know what effect size you would consider meaningful and how variable your data are likely to be:

once you know both of these things, a bit of algebra gives an estimate of n, the sample size you need, so you can be sure to gather a large enough sample. To know how variable the data might be, you may need a small pilot study, or you can base the calculations on previous similar studies.
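
As one possible version of that "bit of algebra" for a one-proportion, one-sided test (the effect size, level, and target power below are hypothetical choices):

    # Rough sample-size estimate from a chosen effect size and target power.
    import math
    from scipy.stats import norm

    p0, p1 = 0.30, 0.35          # null value and smallest effect worth detecting
    alpha, power = 0.05, 0.80    # one-sided alpha level and desired power
    z_a, z_b = norm.ppf(1 - alpha), norm.ppf(power)

    n = ((z_a * math.sqrt(p0 * (1 - p0)) + z_b * math.sqrt(p1 * (1 - p1)))
         / (p1 - p0)) ** 2
    print(math.ceil(n))          # estimated n needed to reach the target power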

how small the P-value needs to be in order to reject the null

one strategy is to set an arbitrary threshold before collecting the data

How big a difference would matter?

one way to think about the effect size. The answer to this question depends on who is asking it and why.

Small P-Values

provide stronger evidence against Ho, but this does not mean that Ho is less true. How small the P-value has to be for you to reject the null hypothesis depends on a lot of things, not all of which can be precisely quantified. The P-value should serve as a measure of the strength of the evidence against the null hypothesis, but should never serve as a hard-and-fast rule for decisions.

how do we make both curves narrower?

reduce the SD, which we can do by increasing the sample size. The means haven't moved, but if we keep α the same size, the critical value p* moves closer to po and farther from p. That means the larger sample size has increased the power of the test, as you can see from the smaller β.
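
A sketch of that effect, with invented numbers (po = 0.30, true p = 0.40, α = 0.05): as n grows, p* slides toward po and the power climbs.

    # Larger n -> smaller SDs -> p* closer to p0 -> higher power.
    import math
    from scipy.stats import norm

    p0, p, alpha = 0.30, 0.40, 0.05
    for n in (50, 100, 200):
        sd0 = math.sqrt(p0 * (1 - p0) / n)        # null-model SD shrinks with n
        p_star = p0 + norm.ppf(1 - alpha) * sd0   # critical value moves toward p0
        sd1 = math.sqrt(p * (1 - p) / n)          # SD around the true p
        power = 1 - norm.cdf((p_star - p) / sd1)  # P(p-hat > p* | true p)
        print(f"n = {n}: p* = {p_star:.3f}, power = {power:.2f}")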

decreasing α...

results in an increase of β

Upper Model Example: suppose the null hypothesis is true:

this is the sampling distribution model for the proportion if the null hypothesis were true. We'd make a Type I error whenever the sample gave us a p-hat greater than p*, because we would reject the (true) null hypothesis. Unusual samples like that would happen only with probability α.

Statistically Significant

set an arbitrary threshold before collecting the data; then, if the P-value falls below that "bright line," reject the null hypothesis and call the result __________ ___________

report the P-value as an indication of the strength of the evidence:

sometimes it's best to report that the conclusion is not yet clear and to suggest that more data be gathered. The P-value is the best summary we have of what the data say, or fail to say, about the null hypothesis.

Significance

when a statistical result is called "significant," all that means is that the test statistic had a P-value lower than the specified alpha level. If you were not told the alpha level, it is hard to know exactly what "significant" means. Sometimes the term is used to suggest that the result is meaningful or important. A test with a small P-value may be surprising, but it says nothing about the size of the effect, and that is what determines whether the result actually makes a difference. Don't be lulled into thinking that statistical significance carries with it any sense of practical importance or impact.

Sensitivity and Specificity:

terms used in medical studies, where the null hypothesis is that the person is healthy and the alternative is that the person is sick

report a confidence interval for the parameter along with the P-value to indicate the range of plausible values for the parameter

the CI is centered on the observed effect and puts bounds on how big or small the effect size may actually be

Sensitivity

the ability to detect the disease: # of true positives / (# of true positives + # of false negatives). Sensitivity = 1 - β = the power of the test; it gives you the probability that the test will correctly identify you as sick when you are sick.
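
Both formulas are simple ratios of counts; a minimal sketch with made-up screening results:

    # Sensitivity and specificity from hypothetical test counts.
    tp, fn = 90, 10    # among the sick: true positives, false negatives
    tn, fp = 950, 50   # among the healthy: true negatives, false positives

    sensitivity = tp / (tp + fn)  # P(test + | sick)    = 1 - beta = power
    specificity = tn / (tn + fp)  # P(test - | healthy) = 1 - alpha
    print(sensitivity, specificity)   # 0.9 and 0.95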

Type II Error

the null hypothesis is false, but we fail to reject it. This is a false negative: the patient is told he is disease-free when in fact he has the disease.

Type I Error

the null hypothesis is true, but we mistakenly reject it. This is the first kind of error we could make, a false positive: the patient thinks he has the disease when he does not.

Bottom Model: the area under the curve to the left of p* represents how often we make a Type II error:

the probability is beta (β). In this picture, β is less than half, so most of the time we do make the right decision.

the P-value is NOT...

the probability that the null hypothesis is true, nor the conditional probability that the null hypothesis is true given the data

the larger the real difference between the hypothesized value, po, and the true population value, p...

the smaller the chance of making a Type II error and the greater the power of the test. If the two proportions are very far apart, the two models will barely overlap, and we will not be likely to make any Type II errors.

if we can make both curves narrower...

then both the probability of Type I errors and the probability of Type II errors will decrease, and the power of the test will increase

small effects are more difficult to detect:

they result in more Type II errors (a higher probability of a Type II error) and therefore lower power

P(Type I error) = alpha

this represents the probability that we will reject Ho given that Ho is true

the small P-value by itself says nothing about how much it changed, or in what direction...

to answer that question, construct a confidence interval; the confidence interval provides the additional information
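
For instance, a one-proportion z-interval would report the effect with bounds; the observed p-hat = 0.56, n = 500, and null po = 0.50 below are invented for the sketch:

    # Report the effect size with a 95% confidence interval, not just a P-value.
    import math
    from scipy.stats import norm

    p_hat, n, p0 = 0.56, 500, 0.50
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of p-hat
    z_star = norm.ppf(0.975)                  # 95% confidence
    lo, hi = p_hat - z_star * se, p_hat + z_star * se
    print(f"effect: {p_hat - p0:+.3f}, 95% CI for p: ({lo:.3f}, {hi:.3f})")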

if we reduce the chance of a Type I error...

we automatically increase the chance of a Type II error

P(Type II error) = β

we cannot calculate β in general, because saying Ho is false does not tell us what the parameter value actually is

What is the value of β?

we don't know the true value of the parameter, so it's harder to assess. When Ho is true, it specifies a single parameter value; but when Ho is false, we don't know the parameter value, and there are many possible values. We can compute the probability β for any parameter value in Ha, but which one we choose depends on the situation.
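
For one chosen alternative, β is just the area of the lower model below p*; a sketch with invented numbers (po = 0.30, alternative p = 0.40, n = 100, α = 0.05):

    # Beta and power for one specific alternative value of p.
    import math
    from scipy.stats import norm

    p0, p, n, alpha = 0.30, 0.40, 100, 0.05
    sd0 = math.sqrt(p0 * (1 - p0) / n)        # null-model SD (upper curve)
    p_star = p0 + norm.ppf(1 - alpha) * sd0   # critical value from the null model
    sd1 = math.sqrt(p * (1 - p) / n)          # SD around the true p (lower curve)
    beta = norm.cdf((p_star - p) / sd1)       # P(p-hat < p* | true p) = Type II
    print(f"beta = {beta:.3f}, power = {1 - beta:.3f}")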

The Power of the Test

what we really want to do is to detect a false null hypothesis. When Ho is false and we reject it, we have done the right thing. A test's ability to detect a false null hypothesis is called...

when Ho is false but we fail to reject it, we have made a Type II error

we assign beta (β) to the probability of this mistake

Power of the Test

when the null hypothesis is actually false, we hope our test is strong enough to reject it. The probability of making that correct decision is called the power of the test: that's the probability that it succeeds in rejecting the false null hypothesis, 1 - β.

Effect Size

when we reject a null hypothesis, what we care about is the actual change or difference in the data. The difference between the value you see in your data and the null value is this.

reducing alpha to lower the chance of a Type I error...

will move the critical value p* to the right (in this example). This has the effect of increasing β, the probability of a Type II error, and reducing the power of the test.

whenever you fail to reject your null hypothesis...

you should think about whether your test had sufficient power

