Session 9 Green Belt

A nutritionist conducted an experiment to test whether there is any difference between a high-protein based diet and a complex-carbohydrate based diet with respect to average weight loss. The two-sided 95% confidence interval for the difference in average weight loss from the two diets was (-2.69, 15.20). What is the correct conclusion from this study? A) There is no evidence of a significant difference in the mean weight loss between the two diets at the 5% level. B) There is a significant difference in the mean weight loss between the two diets at the 5% level. C) The high-protein based diet is better than the complex-carbohydrate based diet at the 95% level. D) The complex-carbohydrate based diet is better than the high-protein based diet at the 5% level.

...

Parameter

A measurable characteristic of the population. There are usually two types of parameters - 'location' parameters (e.g. Population mean) and 'shape' parameters (e.g. Population variance or standard deviation)

Statistic

A measurable characteristic of the sample, used as an estimate of the corresponding parameter (e.g. Sample mean is an estimate of the population mean)

Hypothesis Test

A statistical technique used to help disprove or reject a particular conjecture about a process or population based on evidence from the sample data. This conjecture or hypothesis usually takes the position of the status quo or "no difference" and is called the null hypothesis. The statement held to be true if the null is false is called the alternative hypothesis. Rejecting the null hypothesis means concluding in favor of the alternative hypothesis. The goal of hypothesis testing is to see if there is sufficient statistical evidence to reject a presumed null hypothesis in favor of an alternative hypothesis (the theory or claim). A hypothesis test is a statistical decision that we make based on sample data: the decision to either reject the null hypothesis in favor of the alternative, or to fail to reject the null hypothesis.

Sample

A subset of the population, used for the purpose of testing.

The probability of obtaining a test statistic value at least as extreme as the observed test statistic (computed from the sample) under the assumption that the null hypothesis is true, is called the: A) P-value B) Level of significance C) Type I error D) Level of confidence E) Type II error

Answer A

What is the rejection region for the test? A) Reject H0 if p-value < 0.05 B) Reject H0 if p-value > 0.05 C) Reject H0 if p-value < 0.95 D) Reject H0 if p-value > 0.95 E) Reject H0 if p-value ≠ 0.95

Answer A

Which test should be used to compare two means when the population variances are unknown but assumed equal? A) Pooled variances t-test B) Unequal variances t-test C) Paired samples t-test D) Pooled variances z-test

Answer A

To determine whether or not a fertilizer is beneficial for plant growth, a gardener randomly selects 20 seeds to be planted under identical conditions. Ten seeds will be given no fertilizer (sample 2) and ten seeds will be given fertilizer (sample 1) according to the manufacturer's suggestion for use. At the end of a six-week period, the heights of all the plants will be measured and compared at a 5% significance level to see if the fertilized plants are taller. It is expected that the fertilized plants will grow taller than the unfertilized plants. The data are summarized in this table. Assume normality is a reasonable assumption. What is the null hypothesis for this situation (Fertilizer = Group 1, No Fertilizer = Group 2)? A) H0: µ1 - µ2 < 0 B) H0: µ1 - µ2 = 0 C) H0: µ1 - µ2 > 0 D) H0: µ1 - µ2 ≠ 0

Answer B

What is the false rejection of the null hypothesis called? A) Level of significance B) Type I error C) Level of confidence D) Confidence interval E) Type II error

Answer B

What is the p-value and the appropriate conclusion for this situation? A) p-value = 0.1156; Plan 1 is preferred. B) p-value = 0.2312; No difference in plans. C) p-value = 0.1156; Plan 2 is preferred. D) p-value = 0.1156; No difference in plans. E) p-value = 0.2312; Plan 2 is preferred.

Answer B. Practice - 2 Proportions Test - Solution. Null Hypothesis H0: p1 = p2. Alternative Hypothesis H1: p1 ≠ p2. P-value = 0.2312. The p-value is greater than our significance level of 0.05, so we fail to reject H0. Conclusion: There is not enough evidence to reject the null hypothesis. If the consumer were using this information to decide which provider to get his cellular service from, the data show no significant difference in quality of service as he defines it. He should probably look to the pricing of the plans instead.

What is the alternative hypothesis for this situation (Fertilizer = Group 1, No Fertilizer = Group 2)? A) H1: µ1 - µ2 ≥ 0 B) H1: µ1 - µ2 ≤ 0 C) H1: µ1 - µ2 > 0 D) H1: µ1 - µ2 = 0 E) H1: µ1 - µ2 < 0

Answer C

What is the sampling distribution of the test statistic called, assuming the null hypothesis to be true? A) Normal distribution B) Test distribution C) Null distribution D) True distribution

Answer C

Why is the Normal distribution appropriate when comparing two population proportion defectives (with sample sizes greater than 30 and p close to 0.5, for both populations)? A) The data are attribute (discrete). B) p1 and p2 are small. C) It provides a good approximation to the binomial distribution when np and n(1-p) are greater than 5 for each population. D) It is the easiest distribution to work with. E) All of the above.
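A minimal sketch of the normal approximation at work in a two-proportion test. The function names and the defect counts are hypothetical (they are not the data from the practice problem), and only the Python standard library is used.

```python
from math import sqrt
from statistics import NormalDist


def normal_approx_ok(n, p, threshold=5):
    """Rule of thumb from the card above: np and n(1-p) both exceed 5."""
    return n * p > threshold and n * (1 - p) > threshold


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 = p2, using the pooled proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: defective locations out of locations sampled.
x1, n1 = 12, 100
x2, n2 = 18, 100

approx_ok = normal_approx_ok(n1, x1 / n1) and normal_approx_ok(n2, x2 / n2)
z, p_value = two_proportion_z_test(x1, n1, x2, n2)
reject_h0 = p_value < 0.05
print(approx_ok, round(z, 3), round(p_value, 4), reject_h0)
```

With these made-up counts the p-value is well above 0.05, so H0: p1 = p2 is not rejected.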

Answer C

A researcher believes that Drug A is more effective (has a larger mean) than Drug B. What is an appropriate alternative hypothesis for any test he may conduct (note: µ = mean)? A) µ A > µ B B) µ A - µ B > 0 C) µ B < µ A D) All of the above E) None of the above

Answer D

What is the alternate hypothesis for this situation? Let p1 = proportion of defective locations for provider 1 and p2 = proportion of defective locations for provider 2. A) H1: p1 ≤ p2 B) H1: p1 > p2 C) H1: p1 < p2 D) H1: p1 ≠ p2 E) H1: p1 = p2

Answer D

What is the null hypothesis for this situation? Let p1 = proportion of defective locations for provider 1 and p2 = proportion of defective locations for provider 2. A) H0: p1 > p2 B) H0: p1 < p2 C) H0: p1 ≠ p2 D) H0: p1 = p2

Answer D

Which set-up is an example of paired data? A) Two volleyball teams participate in a study to determine if more practice is related to getting more serves in play. One team practices two hours a day and the other practices one hour. After two weeks, each player gets 20 serves. The mean number of serves for each team is calculated and the two numbers are compared. B) To find out if there is a difference between housing prices in two neighborhoods, a realtor randomly chooses 10 houses in each of two neighborhoods. She calculates the mean price of houses in each neighborhood and compares the two values. C) To test the effect of a diet pill, a researcher randomly selects 20 subjects of various weights. Subjects are split into 2 groups with the groups being very similar as to the weight of the subjects. The researcher gives all subjects in one group the diet pill and calculates their mean weight loss after 2 weeks. The subjects in the second group receive a placebo and the researcher calculates their mean weight loss after 2 weeks. The researcher then uses the appropriate technique to compare the diet pill with the placebo. D) A company has 2 sets of scales and believes they give different weight measurements. A Black Belt uses standard weights of 5, 10, 15, 20, 25, 30, 35, 40 , 45, and 50 pounds to get a read-out for each set of scales. She then calculates the difference in the read-outs for each weight and uses the appropriate technique to see if the scales differ from one another as suspected.

Answer D

What is the p-value and the appropriate conclusion for this test? A) p-value = 0.011 Do not reject H0. The fertilizer didn't work. B) p-value = 0.006 Do not reject H1. The fertilizer worked. C) p-value = 0.994 Reject H1. The fertilizer worked. D) p-value = 0.006 Reject H0. The fertilizer worked. E) p-value = 0.011 Reject H0. The fertilizer didn't work.

Answer D. Practice - 2 Sample t-test - Solution. Null Hypothesis H0: µ1 - µ2 = 0 (δ0 = 0). Alternative Hypothesis H1: µ1 - µ2 > 0 (δ0 > 0). Degrees of Freedom = 13.00. P-value = 0.0057. The p-value is smaller than the significance level (α = 0.05), so we reject H0. Conclusion: There is enough evidence to reject the null hypothesis. Also, because this was a controlled and randomized experiment and not an observational study, the gardener may conclude that the fertilizer did cause the plants to be taller under the conditions of his experiment.
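A sketch of how a one-sided two-sample t-test like this could be computed by hand. The plant heights below are invented, since the original data table is not reproduced here. The unequal-variances (Welch) form is an assumption, prompted by the reported 13 degrees of freedom being lower than the 18 a pooled test on two samples of ten would give. The helper names are hypothetical, and the t tail probability is obtained by numerical integration so that only the standard library is needed.

```python
from math import gamma, pi, sqrt
from statistics import mean, variance


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for Student's t, via Simpson's rule on the density."""
    def pdf(x):
        c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    if upper == 0:
        return 0.5
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # integral of the density from 0 to |t|
    return 0.5 - area if t > 0 else 0.5 + area


def welch_t_test_greater(sample1, sample2):
    """One-sided test of H0: mu1 - mu2 = 0 against H1: mu1 - mu2 > 0."""
    n1, n2 = len(sample1), len(sample2)
    v1, v2 = variance(sample1) / n1, variance(sample2) / n2
    t_stat = (mean(sample1) - mean(sample2)) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df, t_upper_tail(t_stat, df)


# Hypothetical heights (same units), ten plants per group.
fertilized = [22.1, 24.3, 23.8, 25.0, 21.9, 24.7, 23.2, 26.1, 22.8, 24.0]
unfertilized = [20.5, 21.8, 19.9, 22.4, 20.1, 23.0, 21.2, 19.5, 22.0, 20.8]

t_stat, df, p_value = welch_t_test_greater(fertilized, unfertilized)
reject_h0 = p_value < 0.05
print(round(t_stat, 2), round(df, 1), reject_h0)
```

In practice a statistics package would supply the t distribution directly; the integration here only keeps the sketch self-contained.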

What is the decision criterion for the test? A) Reject H0 if p-value ≠ significance level (α = 0.95) B) Reject H0 if p-value > significance level (α = 0.95) C) Reject H0 if p-value > significance level (α = 0.05) D) Reject H0 if p-value < significance level (α = 0.95) E) Reject H0 if p-value < significance level (α = 0.05)

Answer E

Binary Data

Data in which each individual trial can result in one of two outcomes only. Examples: A single unit of production is either defective or non-defective. A student taking a test will either pass it or not pass it. A pizza delivery is either on-time, or late.

Statistical Significance

If the sample size is small, a difference that is large enough to warrant attention may not be statistically significant. Hence, sample size and significance level should both be picked with care. Also keep in mind that hypothesis tests should not be applied in all situations. Sometimes, enough information is gained from the statistics themselves for the intended purpose.

Hypothesis Testing

Is a scientific method to test the validity of a claim made about a population under study, based on observed data. This claim, or theory, is called a hypothesis, and the population under study usually represents the input or output of a process. The hypothesis statement specifies certain characteristics or 'parameters' of the process, such as the location (e.g. mean) or spread (e.g. variance), or both. The null hypothesis states what we presume is currently true, or the 'status quo' situation, and is denoted by H0. The alternative hypothesis is the claim or theory that we believe or wish to prove about the process, denoted by H1.

Type of Inference can be used to answer the following questions

Is there a difference in the observed number or proportion of defects from two different machines or processes producing the same product? Is there a difference in the observed proportion of accepted data from two different populations?

Practical Significance

Sometimes, when the sample size is large enough, a test may find that there is a statistically significant difference even though the actual difference is too small to have any practical significance. In that case, the selected significance level may have been too high. And even though you can make your sample size large enough to reject the null hypothesis for small deviations, it is not always wise to do this.

Hypothesis Test

Step 1 - State the null and alternative hypotheses: Generate a hypothesis of interest that can be tested against an alternative hypothesis.
Step 2 - State the decision criteria: This step is where we make the choice of alpha (α), or the significance level.
Step 3 - Collect the data and calculate the test statistic: Data is collected through a sample and the relevant test statistic is calculated using the sample data.
Step 4 - State a conclusion: The appropriate test statistic is compared to its corresponding reference distribution (null distribution), which shows how the test statistic would be distributed if the null hypothesis were true.

Hypothesis Testing - Step 2 - State the decision criteria

The level of significance or significance level of a test, denoted as α (Greek - alpha), is the probability of a Type I error (rejecting the null hypothesis when it is true). It denotes the risk we are prepared to take of falsely rejecting the null (understandably, this value is small, usually 1% or 5%). It is expressed as a value between 0 and 1. So, if alpha = 0.05, we are saying that if the null hypothesis were true and we repeatedly drew samples from the process and ran the same test on each of them, we would falsely reject the null hypothesis (commit a Type I error) about 5% of the time. In other words, when the null hypothesis is true, we would make the correct decision about 95 percent of the time. This is known as the confidence level of the test, denoted as (1 - alpha). The lower the significance level, the higher the confidence level of the test. The convention is to use a 1%, 5% or, rarely, a 10% significance level. The alpha we choose for our test determines the critical region, or the rejection region of H0, i.e., the set of values of the test statistic that will lead to rejection of the null hypothesis. The cut-off values for the rejection region come from the null distribution of the test statistic, i.e., the distribution of the test statistic under the assumption that the null hypothesis is true. In each of the hypothesis testing sections, the null distribution will be given, along with a statement of how to find the appropriate cut-off values.
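The long-run reading of alpha can be checked with a small simulation. This is a sketch under stated assumptions: the data are standard normal with a known standard deviation of 1, the null hypothesis (mean = 0) is true in every trial, and a two-sided z-test is used. The function name, sample size, trial count, and seed are all hypothetical choices.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_type_i_rate(alpha=0.05, n=30, trials=10000, seed=1):
    """Fraction of trials in which a true H0 (mean = 0) is rejected."""
    rng = random.Random(seed)
    # Two-sided cut-off from the null (standard normal) distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n - 0.0) / (1.0 / sqrt(n))
        if abs(z) > z_crit:  # test statistic falls in the rejection region
            rejections += 1
    return rejections / trials


z_crit_05 = NormalDist().inv_cdf(1 - 0.05 / 2)
rate = simulate_type_i_rate()
print(round(z_crit_05, 3), rate)
```

The observed rejection rate should land close to the chosen alpha, which is what "falsely rejecting about 5% of the time" means in practice.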

P-Value

The probability of obtaining by random chance a value of the test statistic as extreme as or more extreme than that calculated from the sample, assuming the null hypothesis is true. In other words, it measures how surprising the observed data would be if the null hypothesis were true: the smaller the p-value, the stronger the evidence against the null. It should not be confused with the significance level (alpha), which is the pre-set probability of rejecting the null hypothesis when it is true. We reject the null hypothesis in favor of the alternative when the p-value of our test is smaller than the significance (alpha) level.
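For a test statistic whose null distribution is standard normal, the definition translates into a tail area. The sketch below assumes a z statistic; the function name and the `alternative` labels are hypothetical.

```python
from statistics import NormalDist


def z_p_value(z, alternative="two-sided"):
    """Tail probability of a z statistic under a standard normal null."""
    cdf = NormalDist().cdf
    if alternative == "greater":    # H1: parameter > null value
        return 1 - cdf(z)
    if alternative == "less":       # H1: parameter < null value
        return cdf(z)
    if alternative == "two-sided":  # H1: parameter != null value
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


alpha = 0.05
p_two_sided = z_p_value(2.4)
reject_h0 = p_two_sided < alpha
print(round(p_two_sided, 4), reject_h0)
```

"More extreme" depends on the alternative: both tails count for a two-sided test, and only one tail for a one-sided test.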

Population

The set of all individuals/items about which we wish to make an inference.

Significance Level or α (Alpha)

The significance level of a test is defined as the probability of rejecting the null hypothesis when the null hypothesis is true. It is denoted by the Greek letter alpha. We choose the alpha level to be a suitable small value, generally 0.05, to evaluate our test. This is done before conducting the test. The null hypothesis is rejected if the p-value of the test is smaller than the pre-specified alpha.

Hypothesis Testing - Step 3 - Collect the data and calculate the test statistic

The test statistic is what we will actually use to test the null hypothesis. It is a measure that condenses all the information from the sample into a single value, relative to the value stated under the null. There are different test statistics for different hypothesis testing situations, as outlined in the following lesson. Most test statistics are in the form of the difference between the center of the sample data and the center of the null distribution, divided by the standard deviation of the null distribution, or its estimate. Or, they can be seen as simply a measure of location (center) divided by a measure of spread (standard deviation).
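As a concrete instance of that general form, the one-sample z statistic for a mean can be written as follows. This is a standard textbook formula rather than one quoted from the lesson; the symbols x̄ (sample mean), σ (known population standard deviation), and n (sample size) are introduced here for illustration, and µ0 is the value stated under the null.

```latex
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
```

The numerator is the distance between the sample center and the null center, and the denominator is the standard deviation of the sample mean under the null. When σ is unknown and replaced by the sample standard deviation s, the same ratio is compared against a t distribution with n - 1 degrees of freedom.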

Hypothesis Testing - Step 4 - State a conclusion

This can be done in three different ways: 1) by comparing the statistic to the cut-off values which were determined in Step 2, 2) by calculating the p-value* and comparing it to the significance level, or 3) by computing confidence intervals. * The p-value is the probability of observing a value of the test statistic as extreme or more extreme than the calculated value, purely by random chance, given that the null hypothesis is true. Larger absolute values of the test statistic usually correspond to smaller p-values, providing stronger evidence that the null hypothesis is not true. Note: Finally, it is important to state the results of the hypothesis test in terms of the problem. You either reject or fail to reject the null hypothesis, but it should be stated how the results of the test affect decisions that will be made pertaining to the process.
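The three routes to a conclusion can be compared side by side. The sketch below assumes a two-sided, one-sample z-test with a known standard deviation; the function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_decisions(x_bar, mu0, sigma, n, alpha=0.05):
    """Return the reject/fail-to-reject decision reached three ways."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)
    z = (x_bar - mu0) / se

    # 1) Compare the statistic to the cut-off values.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    by_cutoff = abs(z) > z_crit

    # 2) Compare the p-value to the significance level.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    by_p_value = p_value < alpha

    # 3) Check whether the (1 - alpha) confidence interval excludes mu0.
    low, high = x_bar - z_crit * se, x_bar + z_crit * se
    by_interval = not (low <= mu0 <= high)

    return by_cutoff, by_p_value, by_interval


# Hypothetical samples of 36 observations, null value 20, sigma = 3.
decisions_far = one_sample_z_decisions(x_bar=21.2, mu0=20, sigma=3, n=36)
decisions_near = one_sample_z_decisions(x_bar=20.5, mu0=20, sigma=3, n=36)
print(decisions_far, decisions_near)
```

For a two-sided test at the same alpha, the three checks agree; they are different views of the same comparison against the null distribution.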

Critical Questions that should be answered after Analyze Phase

What are the significant inputs (Xs) affecting the output of concern (also known as Ys or CTQCs)? What are the target levels of those inputs (Xs) that optimize the output of concern? Are the input processes stable and capable? What are the underlying sources of process variability? Have alternate methods been validated as effective? Are the interactions between inputs identified, understood, and optimized?

Hypothesis Testing Step 1 - State the Null and Alternative hypotheses

The hypothesis to be tested is called the null hypothesis, denoted as H0, while the alternative to the null is called the alternative hypothesis, denoted as H1 or HA. Together, the null and alternative hypotheses account for the entire parameter space. For example, if the null states that the average time taken to fill out an order form is 20 minutes, then the alternative will state that the average time taken to fill out an order form is not equal to 20 minutes. If the null states that the average time is less than or equal to 20 minutes, then the alternative will state that it is (strictly) greater than 20 minutes. A typical formal hypothesis statement follows: H0: µ = µ0, H1: µ ≠ µ0. In plain English, the null hypothesis (H0) states that the population average, µ (Greek - mu), equals a stated value, µ0 (for example, the industry standard). The alternative (H1) is a two-sided hypothesis that the population mean does not equal the stated value; thus, this is a test for the population mean being either greater than or less than the stated value (hence, two-sided or two-tailed). The other choices for the alternative hypothesis are one-sided alternatives for when it is believed before sampling that the population mean is either strictly greater than (H1: µ > µ0) or strictly less than (H1: µ < µ0) the stated value. The formal hypothesis statements may vary slightly with each hypothesis testing situation. In this course, we follow the convention of always stating the null as a strict equality and gauging the context from the alternative. This is because for continuous variables, the test procedure remains the same whether the null uses 'strictly equal to' or 'less-than-or-equal-to', as long as the alternative says 'greater than'. The goal of the hypothesis test is to establish strong enough evidence to reject the null hypothesis in favor of the stated alternative hypothesis.

Statistical dependence

Industrial or business process outputs are often statistically dependent in that one datum may influence the next - if a process drifts upward, values will tend to be successively higher. This is called serial dependence. Many statistical tests assume independence, an assumption that often does not hold in production settings. Experiments can often be randomized to offset this tendency toward serial dependence. One method is to collect samples at wider intervals rather than sequentially.
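One simple way to look for serial dependence is the lag-1 autocorrelation, the correlation between each value and the one that follows it. This is a sketch only: the function name and both data series are hypothetical, and a value near zero does not by itself prove independence.

```python
def lag1_autocorrelation(values):
    """Correlation between consecutive observations in a sequence."""
    n = len(values)
    center = sum(values) / n
    # Sum of products of neighbouring deviations from the mean.
    numerator = sum(
        (values[i] - center) * (values[i + 1] - center) for i in range(n - 1)
    )
    denominator = sum((v - center) ** 2 for v in values)
    return numerator / denominator


# Hypothetical process readings that drift upward over time.
drifting = [10.0, 10.2, 10.1, 10.4, 10.6, 10.5, 10.8, 11.0, 10.9, 11.2]
# Hypothetical readings that alternate between high and low.
alternating = [10.0, 11.0] * 5

r_drift = lag1_autocorrelation(drifting)
r_alternating = lag1_autocorrelation(alternating)
print(round(r_drift, 2), round(r_alternating, 2))
```

A clearly positive value, as with the drifting series, is the pattern described above: successive values tend to move together.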

