Business Statistics Ch 17 & 18 (Week 8)


Cautions about confidence intervals

**A sampling distribution shows how a statistic varies in repeated random sampling. **This variation causes random sampling error, because the statistic misses the true parameter by a random amount. **The margin of error in a confidence interval ignores everything except the sample-to-sample variation due to choosing the sample randomly. THE MARGIN OF ERROR DOESN'T COVER ALL ERRORS **The margin of error in a confidence interval covers only random sampling errors. **Practical difficulties (such as undercoverage and nonresponse) are often more serious than random sampling error. The margin of error does not take such difficulties into account.

Stating hypotheses

**A significance test starts with a careful statement of the claims we want to compare. **The claim tested by a statistical test is called the null hypothesis (H0). The test is designed to assess the strength of the evidence against the null hypothesis. Often the null hypothesis is a statement of "no difference." **The claim about the population that we are trying to find evidence for is the alternative hypothesis (Ha). The alternative is one-sided if it states that a parameter is larger or smaller than the null hypothesis value. It is two-sided if it states that the parameter is different from the null value (it could be either smaller or larger). **In the sweetness example, our hypotheses are: H0: μ = 0, Ha: μ > 0 **The alternative hypothesis is one-sided because we are interested only in whether the cola lost sweetness.

Planning studies: sample size for confidence intervals

**A wise user of statistics never plans a sample or an experiment without also planning the inference. The number of observations is a critical part of planning the study. **The margin of error m of the confidence interval for the population mean μ is: m = z*σ/√n **To obtain a desired margin of error m, put in the value of z* for your desired confidence level, and solve for the sample size n. SAMPLE SIZE FOR DESIRED MARGIN OF ERROR **The z confidence interval for the mean of a Normal population will have a specified margin of error m when the sample size is: n = (z*σ/m)², rounded up to the next whole number.
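
As a rough illustration of the sample-size formula above, the following Python sketch computes n for a desired margin of error. The function name `sample_size_for_margin` and the example values (σ = 1, m = 0.5, 95% confidence) are assumptions chosen for illustration; they do not come from the study set.

```python
import math
from statistics import NormalDist

def sample_size_for_margin(sigma, margin, confidence=0.95):
    """Smallest n for which z* * sigma / sqrt(n) is at most `margin`."""
    # z* is the critical value that leaves (1 - confidence)/2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = (z* sigma / m)^2, rounded up because n must be a whole number.
    return math.ceil((z_star * sigma / margin) ** 2)

# Hypothetical inputs: sigma = 1, desired margin of error 0.5, 95% confidence.
n_needed = sample_size_for_margin(sigma=1.0, margin=0.5, confidence=0.95)
print(n_needed)
```

Rounding up rather than to the nearest integer keeps the achieved margin of error at or below the target.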

Sweetening colas—one-sided P-value

Cola sweeteners STATE: In a matched-pairs study of 10 trained tasters, the researchers found that x̄ = 0.3, with values of x̄ > 0 favoring Ha over H0. Is this strong enough evidence to decide that the sweetener is losing sweetness? PLAN: Take μ to be the mean sweetness loss rating difference. We want to test the hypotheses: H0: μ = 0, Ha: μ > 0 SOLVE: The P-value is P(x̄ ≥ 0.3) = 0.1714, computed assuming μ = 0; that is, the probability of a value of the test statistic at least as extreme as the one observed, if the null hypothesis were true. CONCLUDE: More than 17% of the time, an SRS of size 10 of trained tasters would have a mean sweetness difference at least as great as that observed. The observed x̄ = 0.3 is therefore not good evidence that this cola experienced loss of sweetness.
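
The P-value quoted above can be checked with a short calculation. This is a minimal sketch, assuming σ = 1 as stated for the sweetness-loss scores elsewhere in this set; the variable names are illustrative.

```python
import math
from statistics import NormalDist

x_bar, mu_0, sigma, n = 0.3, 0.0, 1.0, 10

# One-sample z statistic: how many standard errors x_bar lies above mu_0.
z = (x_bar - mu_0) / (sigma / math.sqrt(n))

# One-sided (upper-tail) P-value for Ha: mu > 0.
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.3f}, P = {p_value:.4f}")
```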

Stating hypotheses: two-tailed example (part 1)

Does the job satisfaction of assembly workers differ when their work is machine-paced rather than self-paced? Assign workers either to an assembly line moving at a fixed pace or to a self-paced setting. All subjects work in both settings, in random order. After two weeks in each work setting, the workers take a test of job satisfaction. **This is a matched pairs design. **The response variable is the difference in satisfaction scores: self-paced minus machine-paced. **The parameter of interest is the mean μ of the differences in scores in the population of all assembly workers.

Planning studies: the power of a statistical test (part 2)

Example (using an app): If we have 10 tasters for this study, how likely are we to detect a 0.8 point change? Significance level. Requiring significance at the 5% level is enough protection against declaring there is a loss in sweetness when, in fact, there is no change, if we could look at the entire population. Effect size. A mean sweetness loss of 0.8 point on the 10-point scale will be noticed by consumers and so is important in practice. Power. We would be 81.2% confident that our test will detect a mean loss of 0.8 point in the population of all tasters.
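
The power figure above can be approximated by hand for a one-sided z test. The sketch below assumes σ = 1 and H0: μ = 0 as in the sweetness example; the function name `power_one_sided_z` is a hypothetical helper, not the app the card refers to.

```python
import math
from statistics import NormalDist

def power_one_sided_z(mu_alt, sigma, n, alpha=0.05, mu_0=0.0):
    """Probability of rejecting H0: mu = mu_0 in favor of mu > mu_0 when mu = mu_alt."""
    se = sigma / math.sqrt(n)
    # Reject H0 when the sample mean is at least this large.
    cutoff = mu_0 + NormalDist().inv_cdf(1 - alpha) * se
    # Under the alternative, x_bar is Normal with mean mu_alt and the same standard error.
    return 1 - NormalDist(mu_alt, se).cdf(cutoff)

power = power_one_sided_z(mu_alt=0.8, sigma=1.0, n=10, alpha=0.05)
print(f"power = {power:.3f}")
```

The calculation has two steps: find the rejection cutoff under H0, then ask how often x̄ exceeds that cutoff when the true mean is 0.8.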

Planning studies: the power of a statistical test (part 4)

Example (using software): Now we ask software: if there were a 0.8 point change, how many tasters would be required to be 90% confident of detecting that change? Significance level. Requiring significance at the 5% level is enough protection against declaring there is a loss in sweetness when, in fact, there is no change, if we could look at the entire population. Effect size. A mean sweetness loss of 0.8 point on the 10-point scale will be noticed by consumers and so is important in practice. Power. The output at left gives the various sample sizes required to be 90% confident that our test will detect a mean loss of 0.8 point in the population of all tasters.
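
For a one-sided z test with known σ, the required sample size can be sketched with the usual Normal-theory formula n = ((z_α + z_β)σ/effect)². This formula and the helper name `sample_size_for_power` are assumptions for illustration; the software output the card refers to is not reproduced here and may use a different method (for example, a t test), so its numbers can differ.

```python
import math
from statistics import NormalDist

def sample_size_for_power(effect, sigma, power=0.90, alpha=0.05):
    """Smallest n giving at least `power` for a one-sided z test at level alpha."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # critical value of the test
    z_beta = NormalDist().inv_cdf(power)       # quantile matching the desired power
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)

# Effect of 0.8 point, sigma = 1 (from the sweetness example), 90% power, 5% level.
n_tasters = sample_size_for_power(effect=0.8, sigma=1.0, power=0.90, alpha=0.05)
print(n_tasters)
```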

alternative hypothesis (Ha)

The claim about the population that we are trying to find evidence for is the alternative hypothesis (Ha). The alternative is one-sided if it states that a parameter is larger or smaller than the null hypothesis value. It is two-sided if it states that the parameter is different from the null value (it could be either smaller or larger).

null hypothesis (H0)

The claim tested by a statistical test is called the null hypothesis (H0). The test is designed to assess the strength of the evidence against the null hypothesis. Often the null hypothesis is a statement of "no difference."

Conditions for inference in practice (part 2)

What is the shape of the population distribution? **Many of the basic methods of inference are designed for Normal populations. **Fortunately, this condition is less essential than where the data come from. **Any inference procedure based on sample statistics (like the sample mean, x ̅) that are not resistant to outliers can be strongly influenced by a few extreme observations.

The Survey of Study Habits and Attitudes (SSHA) is a psychological test that measures the motivation, attitudes, and study habits of college students. Scores range from 0 to 200 and follow (approximately) a Normal distribution, with mean of 110 and standard deviation σ = 20. You suspect that incoming freshmen have a mean µ that is different from 110, because they are often excited yet anxious about entering college. To verify your suspicion, you test the hypotheses H0: µ = 110, Ha: µ ≠ 110. You give the SSHA to 50 students who are incoming freshmen and find their mean score. If you observe a sample mean of x̄ = 115.35, what is the corresponding P-value? a) 0.058 b) 0.029 c) 0.787 d) None of the answer options is correct.

a) 0.058
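
A quick check of this answer, as a sketch using only the numbers given in the question (σ = 20, n = 50, observed mean 115.35, two-sided alternative); the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu_0, sigma, n, observed_mean = 110, 20, 50, 115.35

z_ssha = (observed_mean - mu_0) / (sigma / math.sqrt(n))

# Two-sided P-value: probability of a z at least this far from 0 in either direction.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z_ssha)))

print(f"z = {z_ssha:.2f}, two-sided P = {p_two_sided:.4f}")
```

The result is a little under 0.06, which is closest to option a); half of it, the one-sided P-value, corresponds to option b).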

A "false positive," in which an effect is detected when, in reality, there is none, is an example of: a) a Type I error. b) a Type II error. c) the power of the test. d) None of the answer options is correct.

a) a Type I error.

In formulating hypotheses for a statistical test of significance, the null hypothesis is often: a) a statement of "no effect" or "no difference." b) the probability of observing the data you actually obtained. c) a statement that the data are all 0. d) 0.05.

a) a statement of "no effect" or "no difference."

In a test of hypothesis, a small P-value provides evidence: a) against the null hypothesis in favor of the alternative hypothesis. b) against the alternative hypothesis in favor of the null hypothesis. c) against the null hypothesis and the alternative hypothesis. d) for the null hypothesis and the alternative hypothesis.

a) against the null hypothesis in favor of the alternative hypothesis.

The P-value measures the strength of evidence: a) against the null hypothesis. b) against the alternative hypothesis. c) sometimes against the null hypothesis and sometimes against the alternative hypothesis. d) The interpretation depends on whether we reject the null hypothesis.

a) against the null hypothesis.

The margin of error in a confidence interval refers to: a) sampling error. b) nonresponse error. c) measurement error. d) All of the answer options are correct.

a) sampling error.

Which of the following quantities must be known before calculating the margin of error? a) the population variance b) the population mean c) the population size d) All of the answer options are correct.

a) the population variance

The significance level is defined as: a) the probability of a Type I error. b) the probability of a Type II error. c) the probability that the power of the test is at least 0.9. d) None of the answer options is correct.

a) the probability of a Type I error.

A marketing consultant is hired by a major restaurant chain wishing to investigate the preferences and spending patterns of lunch customers. The CEO of the chain hypothesized that the average customer spends at least $13.50 on lunch. A survey of 25 customers sampled at one of the restaurants found the average lunch bill per customer to be x̄ = $14.50. Based on previous surveys, the restaurant informs the marketing manager that the standard deviation is σ = $3.50. Lunch bills are Normally distributed. To address the CEO's conjecture, the marketing manager carried out a hypothesis test of H0: µ = 13.50 vs. Ha: µ > 13.50. Using the z test statistic, the P-value for the hypothesis test equals: a) 0.025 b) 0.077 c) 0.144 d) 0.923

b) 0.077

Is the mean age at which American children first read now under four years? If the population of all American children has a mean age of µ years until they begin to read, which of the following null and alternative hypotheses would be tested to answer this question? a) H0: µ = 4 vs. Ha: µ > 4 b) H0: µ = 4 vs. Ha: µ < 4 c) H0: µ = 4 vs. Ha: µ≠4 d) H0: µ = 4 vs. Ha: µ = 4 ±, assuming our sample size is n

b) H0: µ = 4 vs. Ha: µ < 4

In their advertisements, the manufacturers of a certain brand of breakfast cereal would like to claim that eating their oatmeal for breakfast daily will produce a mean decrease in cholesterol of more than 10 points in one month for people with cholesterol levels over 200. To determine if this is a valid claim, they hire an independent testing agency, which then selects 25 people with cholesterol levels over 200 to eat the manufacturer's cereal for breakfast daily for a month. The agency should be testing the null hypothesis H0: µ = 10 and the alternative hypothesis: a) Ha: µ < 10. b) Ha: µ > 10. c) Ha: µ ≠ 10. d) Ha: µ 10 ±

b) Ha: µ > 10.

A "false negative," in which no effect is detected when, in reality, one exists, is an example of: a) a Type I error. b) a Type II error. c) the power of the test. d) None of the answer options is correct.

b) a Type II error.

A marketing consultant is hired by a major restaurant chain wishing to investigate the preferences and spending patterns of lunch customers. The CEO of the chain hypothesized that the average customer spends at least $13.50 on lunch. A survey of 25 customers sampled at one of the restaurants found the average lunch bill per customer to be x̄ = $14.50. Based on previous surveys, the restaurant informs the marketing manager that the standard deviation is σ = $3.50. To address the CEO's conjecture, the marketing manager carried out a hypothesis test of H0: µ = 13.50 vs. Ha: µ > 13.50 and obtained a P-value = 0.077. At level of significance α = 0.05, the null hypothesis is not rejected. However, the marketing director later finds that, in fact, the average lunch price is above $13.50. The failure of the sample of 25 lunch prices to detect this fact was: a) a Type I error. b) a Type II error. c) an error of the third kind. d) just bad data.

b) a Type II error.

Which of the following will reduce the value of the power in a statistical test of hypotheses? a) decreasing the Type II error probability b) decreasing the sample size c) rejecting the null hypothesis only if the P-value is smaller than the level of significance d) All of the answer options are correct.

b) decreasing the sample size

A certain make of automobile has an average highway gas mileage of 30 miles per gallon (mpg). An engineer designs an improved engine, which has an average highway gas mileage of 30.2 mpg, based on a sample of 3600 cars with the new engine. Although the difference is quite small, the effect is statistically significant because: a) new designs typically have less variability than standard designs, so small differences can appear to be statistically significant. b) the sample size is very large. c) the mean of 30.2 is large compared to the gas mileage of most cars. d) All of the answer options are correct.

b) the sample size is very large.

Tests for a population mean

z TEST FOR A POPULATION MEAN **Draw an SRS of size n from a Normal population that has unknown mean μ and known standard deviation σ. To test the null hypothesis that μ has a specified value: H0: μ = μ0 **Compute the one-sample z test statistic: z = (x̄ − μ0)/(σ/√n) **In terms of a variable Z having the standard Normal distribution, the approximate P-value for a test of H0 against
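
A minimal sketch of the one-sample z test in Python. It assumes the standard tail rules: the upper tail P(Z ≥ z) for Ha: μ > μ0, the lower tail P(Z ≤ z) for Ha: μ < μ0, and twice the tail beyond |z| for Ha: μ ≠ μ0. The function name `z_test` and its `alternative` argument are illustrative choices, not part of the study set.

```python
import math
from statistics import NormalDist

def z_test(x_bar, mu_0, sigma, n, alternative="two-sided"):
    """Return (z, P-value) for a one-sample z test with known sigma."""
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))
    upper_tail = 1 - NormalDist().cdf(z)
    if alternative == "greater":      # Ha: mu > mu_0
        p = upper_tail
    elif alternative == "less":       # Ha: mu < mu_0
        p = 1 - upper_tail
    elif alternative == "two-sided":  # Ha: mu != mu_0
        p = 2 * min(upper_tail, 1 - upper_tail)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p

# Restaurant lunch-bill question from this set: x_bar = 14.50, mu_0 = 13.50,
# sigma = 3.50, n = 25, Ha: mu > 13.50.
z_lunch, p_lunch = z_test(14.50, 13.50, 3.50, 25, alternative="greater")
print(f"z = {z_lunch:.2f}, P = {p_lunch:.3f}")
```

Applied to the lunch-bill numbers, this gives a P-value that rounds to 0.077, matching the answer recorded for that question.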

significance level

If the P-value is smaller than alpha, we say that the data are statistically significant at level α. The quantity α is called the significance level or the level of significance.

The reasoning of tests of significance (part 1)

**Artificial sweeteners in colas gradually lose their sweetness over time. Manufacturers test for loss of sweetness on a scale of -10 to 10, with negative scores corresponding to a gain in sweetness, positive to a loss of sweetness. **Suppose we know that for any cola, the sweetness loss scores vary from taster to taster according to a Normal distribution with standard deviation σ = 1. The mean μ for all tasters measures loss of sweetness and is different for different colas. **Here are the sweetness losses for a cola currently on the market, as measured by 10 trained tasters: 1.6 0.4 0.5 -2.0 1.5 -1.1 1.3 -0.1 -0.3 1.2 **The average sweetness loss is given by the sample mean x̄ = 0.3. Most scores were positive. That is, most tasters found a loss of sweetness. But the losses are small, and four tasters (the negative scores) thought the cola gained sweetness. Are these data good evidence that the cola lost sweetness in storage?

Conditions for inference in practice (part 1)

**Caution! Any confidence interval or significance test can be trusted only under specific conditions. Simple Conditions for Inference about a Mean **We have a simple random sample (SRS) from the population of interest. There is no nonresponse or other practical difficulty. The population is large compared to the size of the sample. **The variable we measure has an exactly Normal distribution N(μ,σ) in the population. **We don't know the population mean μ. But we do know the population standard deviation σ. WHERE THE DATA COME FROM MATTERS **When you use statistical inference, you are acting as if your data are a random sample or come from a randomized comparative experiment.

Statistical inference

**Confidence intervals are one of the two most common types of statistical inference. Use a confidence interval when your goal is to estimate a population parameter. **The second common type of inference, called tests of significance, has a different goal: to assess the evidence provided by data about some claim concerning a population. o A test of significance is a formal procedure for comparing observed data with a claim (also called a hypothesis) whose truth we want to assess. o Significance tests use an elaborate vocabulary, but the basic idea is simple: an outcome that would rarely happen if a claim were true is good evidence that the claim is not true.

Stating hypotheses: two-tailed example (part 2)

**Here, our response variable is the difference in satisfaction scores; the parameter of interest is the mean μ of the differences in scores in the population of all assembly workers. **The null hypothesis says that there is no difference between self-paced work and machine-paced work: H0: μ = 0 **The authors of the study wanted to know if the two work conditions have different levels of job satisfaction. They did not specify the direction of the difference. The alternative hypothesis is therefore two-sided: Ha: μ ≠ 0 **Caution! The hypotheses should express the hopes or suspicions we have before we see the data. It is cheating to first look at the data and then frame hypotheses to fit what the data show.

Planning studies: the power of a statistical test (part 5)

**How large a sample should we take when we plan to carry out a significance test? The answer depends on what alternative values of the parameter are important to detect. **Here is an overview of influences on "How many observations do I need?" **If you insist on a smaller significance level (such as 1% rather than 5%), you have to take a larger sample. A smaller significance level requires stronger evidence to reject the null hypothesis. **If you insist on higher power (such as 99% rather than 90%), you will need a larger sample. Higher power gives a better chance of detecting a difference when it is really there. **At any significance level and desired power, a two-sided alternative requires a larger sample than a one-sided alternative. **At any significance level and desired power, detecting a small effect requires a larger sample than detecting a large effect.

Cautions about significance tests (part 2)

**Significance depends on the alternative hypothesis. **The P-value for a one-sided test is one-half the P-value for the two-sided test of the same null hypothesis based on the same data. ~~The evidence against the null hypothesis is stronger when the alternative is one-sided, because it is based on the data plus information about the direction of possible deviations from the null. ~~If you lack this added information, always use a two-sided alternative hypothesis.

z procedures

**So far, we have met two procedures for statistical inference. When the simple conditions are true (the data are an SRS, the population has a Normal distribution, and we know the standard deviation σ of the population), a confidence interval for the mean μ is: x̄ ± z*σ/√n **To test a hypothesis H0: μ = μ0, we use the one-sample z-statistic: z = (x̄ − μ0)/(σ/√n) **These are called z procedures because they both involve a one-sample z-statistic and use the standard Normal distribution.
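
The confidence-interval half of the z procedures can be sketched as follows. The function name `z_confidence_interval` is hypothetical, and the 95% confidence level is an assumption for the example; the inputs x̄ = 0.3, σ = 1, n = 10 are taken from the sweetness example in this set.

```python
import math
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Interval x_bar +/- z* sigma / sqrt(n) for a Normal population with known sigma."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

low, high = z_confidence_interval(x_bar=0.3, sigma=1.0, n=10, confidence=0.95)
print(f"95% CI for mu: ({low:.3f}, {high:.3f})")
```

The resulting interval contains 0, which agrees with the one-sided test in the cola example not finding good evidence of a sweetness loss.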

Significance from a table

**Statistics in practice uses technology to get P-values quickly and accurately. In the absence of suitable technology, you can get approximate P-values by comparing your test statistic with critical values from a table. SIGNIFICANCE FROM A TABLE OF CRITICAL VALUES **To find the approximate P-value for any z statistic, compare z (ignoring its sign) with the critical values z* at the bottom of Table C. If z falls between two values of z*, the P-value falls between the two corresponding values of P in the "One-sided P" or the "Two-sided P" row of Table C.

P-value and statistical significance (part 2)

**Tests of significance assess the evidence against H0. If the evidence is strong, we can confidently reject H0 in favor of the alternative. **Our conclusion in a significance test comes down to: P-value small → reject H0 → conclude Ha (in context) P-value large → fail to reject H0 → cannot conclude Ha (in context) **There is no rule for how small a P-value we should require in order to reject H0; it's a matter of judgment and depends on the specific circumstances. But we can compare the P-value with a fixed value that we regard as decisive, called the significance level. We write it as α, the Greek letter alpha. When our P-value is less than the chosen α, we say that the result is statistically significant. **If the P-value is smaller than alpha, we say that the data are statistically significant at level α. The quantity α is called the significance level or the level of significance.

P-value and statistical significance (part 1)

**The null hypothesis H0 states the claim that we are seeking evidence against. The probability that measures the strength of the evidence against a null hypothesis is called a P-value. **A test statistic calculated from the sample data measures how far the data diverge from what we would expect if the null hypothesis H0 were true. Large values of the statistic show that the data are not consistent with H0. **The probability, computed assuming H0 is true, that the statistic would take a value as extreme as or more extreme than the one actually observed is called the P-value of the test. The smaller the P-value, the stronger the evidence against H0 provided by the data. **Small P-values are evidence against H0, because they say that the observed result is unlikely to occur when H0 is true. **Large P-values fail to give convincing evidence against H0, because they say that the observed result could have occurred by chance if H0 were true.

Planning studies: the power of a statistical test (part 3)

**The power of a test against a specific alternative is the probability that the test will reject H0 at a chosen significance level α when the specified alternative value of the parameter is true. **Example: a matched-pairs experiment to test for loss of sweetness (the bigger the difference, the bigger the loss of sweetness). We know that the sweetness loss scores vary from taster to taster according to a Normal distribution with standard deviation about σ = 1. To see if there is a loss in sweetness, we test: H0: μ = 0, Ha: μ > 0

The reasoning of tests of significance (part 2)

**We make a claim and ask if the data give evidence against it. **We seek evidence that there is a sweetness loss, so the claim we test is that there is not a loss. In that case, the mean loss for the population of all trained tasters would be μ = 0. **If the claim that μ = 0 is true, the sampling distribution of x̄ from 10 tasters is Normal, with mean μ = 0 and standard deviation σ/√n = 1/√10 = 0.316.
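
The two numbers used in this reasoning, the sample mean and the standard deviation of its sampling distribution, can be recomputed from the ten taster scores listed in part 1. A minimal sketch; the variable names are illustrative.

```python
import math
from statistics import mean

# Sweetness losses reported by the 10 trained tasters.
losses = [1.6, 0.4, 0.5, -2.0, 1.5, -1.1, 1.3, -0.1, -0.3, 1.2]
sigma = 1.0  # known population standard deviation from the example

sample_mean = mean(losses)
standard_error = sigma / math.sqrt(len(losses))  # sd of x_bar under H0

print(f"x_bar = {sample_mean:.1f}, sigma/sqrt(n) = {standard_error:.3f}")
```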

statistically significant at level α

If the P-value is smaller than alpha, we say that the data are statistically significant at level α. The quantity α is called the significance level or the level of significance.

level of significance.

If the P-value is smaller than alpha, we say that the data are statistically significant at level α. The quantity α is called the significance level or the level of significance.

Planning studies: the power of a statistical test (part 1)

How large a sample should we take when we plan to carry out a test of significance? Here are the questions we must answer to decide how many observations we need: *Significance level. How much protection do we want against getting a significant result from our sample when there really is no effect in the population? *Effect size. How large an effect in the population is important in practice? *Power. How confident do we want to be that our study will detect an effect of the size we think is important?

Job satisfaction—two-sided P-value

Job satisfaction STATE: In a matched-pairs study of 18 workers, the analysis found x̄ = 17, with x̄ > 0 indicating that workers preferred the self-paced over the machine-paced environment. PLAN: Take μ to be the mean difference for all workers; because the researchers didn't know which environment the workers would prefer, we have to test the hypotheses: H0: μ = 0, Ha: μ ≠ 0 SOLVE: The P-value is P(|x̄| ≥ 17) = 0.2293, computed assuming μ = 0; that is, the probability of a value of the test statistic at least as extreme as the one observed, in either direction, if the null hypothesis were true. CONCLUDE: Almost 23% of the time, an SRS of 18 workers would have a mean difference in satisfaction scores at least as far from 0 as that observed. The observed x̄ = 17 is therefore not good evidence that job satisfaction differs between the self-paced and machine-paced environments.

Cautions about significance tests (part 3)

Sample Size Affects Statistical Significance **Because large random samples have small chance variation, very small population effects can be highly significant if the sample is large. **Because small random samples have a lot of chance variation, even large population effects can fail to be significant if the sample is small. **Statistical significance does not tell us whether an effect is large enough to be important. That is, statistical significance is not the same as practical significance. Beware of Multiple Analyses **The reasoning of statistical significance works well if you decide what effect you are seeking, design a study to search for it, and use a test of significance to weigh the evidence you get.
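
To see how sample size alone changes significance, the sketch below holds the observed mean fixed at x̄ = 0.3 with σ = 1 (the sweetness example) and compares n = 10 with a hypothetical n = 100. The larger sample is an assumption made only for illustration.

```python
import math
from statistics import NormalDist

def one_sided_p(x_bar, mu_0, sigma, n):
    """Upper-tail P-value of the one-sample z test for Ha: mu > mu_0."""
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))
    return 1 - NormalDist().cdf(z)

p_small_sample = one_sided_p(0.3, 0.0, 1.0, n=10)    # the study's sample size
p_large_sample = one_sided_p(0.3, 0.0, 1.0, n=100)   # hypothetical larger study

print(f"n = 10:  P = {p_small_sample:.4f}")
print(f"n = 100: P = {p_large_sample:.4f}")
```

The same observed effect is far from significant at n = 10 but highly significant at n = 100, even though its practical size has not changed.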

Cautions about significance tests (part 1)

Significance tests are widely used in most areas of statistical work. Some points to keep in mind when you use or interpret significance tests are: **How small a P is convincing? **The purpose of a test of significance is to describe the degree of evidence provided by the sample against the null hypothesis. How small a P-value is convincing evidence against the null hypothesis depends mainly on two circumstances: ~~If H0 represents an assumption that has been believed for years, strong evidence (a small P) will be needed. ~~If rejecting H0 means making a costly changeover, you need strong evidence.

Tests of significance

THE FOUR-STEP PROCESS **STATE: What is the practical question that requires a statistical test? **PLAN: Identify the parameter, state null and alternative hypotheses, and choose the type of test that fits your situation. **SOLVE: Carry out the test in three phases: 1. Check the conditions for the test you plan to use. 2. Calculate the test statistic. 3. Find the P-value. **CONCLUDE: Return to the practical question to describe your results in this setting.

The reasoning of tests of significance (illustrated)

This is like calculations we did in Chapter 15; we can locate our x̄ of 0.3 in this distribution and comment on whether it is surprising.

A statistics teacher taught a large introductory statistics class, with 500 students having enrolled over many years. The mean score over all those students on the first midterm was µ = 68 with standard deviation σ = 20. One year, the teacher taught a much smaller class of only 25 students. The teacher wanted to know if teaching a smaller class affected scores in any way. We can consider the small class as an SRS of the students who took the large class over the years. The average midterm score was x̄ = 78. The hypothesis the teacher tested was H0: µ = 68 vs. Ha: µ ≠ 68. A power calculation done prior to collecting the sample showed the probability of Type II error to be 0.1. If the null hypothesis is not rejected, the teacher can conclude that: a) the null hypothesis is true, no doubt about it. b) there is no evidence against the null or in favor of the alternative, and the study supports the null hypothesis. c) the alternative hypothesis is proved wrong. d) the null hypothesis is proved wrong.

b) there is no evidence against the null or in favor of the alternative, and the study supports the null hypothesis.

Which of the following statements can be used to draw conclusions from a test of significance? a) If p is less than the significance level α, then reject the null hypothesis. b) If p is greater than the significance level α, then reject the null hypothesis. c) If the P-value is less than the significance level α, then reject the null hypothesis. d) If the P-value is greater than the significance level α, then reject the null hypothesis.

c) If the P-value is less than the significance level α, then reject the null hypothesis.

If you are willing to reject the null hypothesis one time out of every 20 times that you conduct the same test of significance, which of the following is always true? a) The power of the test is too low. b) The effect size must be small for the power of the test to be high. c) The level of significance that you have set is α = 0.05. d) The level of significance that you have set is α = 0.025.

c) The level of significance that you have set is α = 0.05.

In a statistical test of hypotheses, we say the data are statistically significant at level α if: a) α = 0.05. b) α is small. c) the P-value is less than α. d) the P-value is larger than α.

c) the P-value is less than α.

A marketing consultant is hired by a major restaurant chain wishing to investigate the preferences and spending patterns of lunch customers. The CEO of the chain hypothesized that the average customer spends at least $13.50 on lunch. A survey of 25 customers sampled at one of the restaurants found the average lunch bill per customer to be x̄ = $14.50. Based on previous surveys, the restaurant informs the marketing manager that the standard deviation is σ = $3.50. To address the CEO's conjecture, the marketing manager carried out a hypothesis test of H0: µ = 13.50 vs. Ha: µ > 13.50 and obtained a P-value = 0.077. At a significance level of α = 0.05, this result: a) proves without a doubt that the average lunch bill exceeds $13.50. b) proves without a doubt that the average lunch bill does not exceed $13.50. c) provides evidence against the alternative hypothesis in favor of the null hypothesis. d) does not provide evidence against the null hypothesis in favor of the alternative hypothesis.

d) does not provide evidence against the null hypothesis in favor of the alternative hypothesis.

