OSC Midterm


Variance

(n.) The square of the standard deviation. Informally, a difference between what is expected and what actually occurs.

If we are testing for the difference between the means of two independent populations with samples of n1 = 20 and n2 = 20, the number of degrees of freedom is equal to:

20+20-2=38 degrees of freedom.

left-tailed test

A one-tailed test in which the sample outcome is hypothesized to be at the left tail of the sampling distribution.

Left-tailed vs right tailed

A right-tailed test is used when we want to check if something is bigger than a certain value, and a left-tailed test is used when we want to check if something is smaller than a certain value.

One-Way ANOVA

A statistical test used to analyze data from an experimental design with one independent variable that has three or more groups (levels).

A hospital emergency room has collected a sample of n = 40 to estimate the mean number of visits per day. It has found the standard deviation is 32. Using a 90 percent confidence level, what is its margin of error?

E = 1.645 * std dev / square root of n = 1.645 * 32 / square root of 40 ≈ 8.3231
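The arithmetic on this card can be checked with a short Python sketch. The helper name `margin_of_error` is made up for illustration, and the z value is taken from the standard library's `statistics.NormalDist` instead of a table, so the result differs slightly from the one obtained with the rounded 1.645.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(std_dev, n, confidence):
    """Margin of error for a mean using a z critical value (hypothetical helper)."""
    # Two-sided interval: (1 - confidence) / 2 goes in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * std_dev / sqrt(n)

e = margin_of_error(std_dev=32, n=40, confidence=0.90)
print(round(e, 2))  # about 8.32
```

The same helper covers the other z-based margin-of-error cards by changing the three arguments.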

If you reject the null hypothesis in ANOVA, you do not need to conduct any further analysis? T/F?

False

Suppose that two population proportions are being compared to test whether there is any difference between them. Assume that the test statistic has been calculated to be z = 2.21. Find the p-value for this situation.

This is a two-tailed test with z = 2.21. P(Z > 2.21) = 1 - P(Z < 2.21) = 1 - 0.9864 = 0.0136, so the p-value = 2 * 0.0136 = 0.0272.
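A minimal Python sketch of the same calculation, assuming the standard normal distribution from the standard library stands in for the z table. The function name `two_tailed_p` is illustrative only, and the unrounded result is about 0.0271 instead of the table-based 0.0272.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Two-tailed p-value for a z test statistic (hypothetical helper)."""
    # Area beyond |z| in one tail, doubled for the two-tailed test.
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * upper_tail

p = two_tailed_p(2.21)
print(round(p, 4))
```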

In testing for differences between the means of two paired populations, an appropriate null hypothesis would be:

H0 : μD = 0

Ho

Null hypothesis

The total sum of squares (SST) is comprised of what two components?

SSB + SSW

Ratio Data

Same as interval, but the zero is meaningful, e.g., age, income, speed.

In conducting a hypothesis test for the difference between two population means where the standard deviations are known and the null hypothesis is: H0 : μA - μB ≥ 0. What is the p-value assuming that the test statistic has been found to be z = 2.52?

The p-value = P(Z > 2.52) = 0.0059

All other things held constant, increasing the level of confidence for a confidence interval estimate for the difference between two population means will result in a wider confidence interval estimate. T/F?

True

What is the null hypothesis in the ANOVA setting?

All population means are equal.

Ordinal Data

Data exist in categories that are ordered, but differences between values cannot be determined or are meaningless. (Example: 1st, 2nd, 3rd)

Nominal Data

Data consist of categories only and cannot be arranged in an ordering scheme. (Example: gender, race, religion)

A researcher is using a chi-square test to determine whether there are any preferences among 4 brands of orange juice. With alpha = 0.05 and n = 30, the critical region for the hypothesis test would have a boundary of:

From the chi-square table with 4 - 1 = 3 degrees of freedom at the 0.05 level, the critical value is 7.81.

µ

population mean

If a contingency analysis test is performed with a 4 × 6 design, and if alpha = .05, the critical value from the chi-square distribution is 24.9958 T/F?

True. The contingency table has 4 rows and 6 columns, so df = (r - 1) * (c - 1) = (4 - 1) * (6 - 1) = 15. The critical value for alpha = 0.05 and df = 15 is 24.996, obtained from a chi-square table or from Excel with =CHIINV(0.05,15).

Type 2 Error

Also known as a false negative; its probability is beta (β). 1 − β is the power of a test.

Type 1 Error

Also known as a false positive; its probability is alpha (α), which is chosen by the researcher. Occurs if an investigator rejects a null hypothesis that is actually true in the population.

Construct a 95 percent confidence interval for the difference between two population means using the following sample data that have been selected from normally distributed populations with different population variances: Sample 1: 473, 386, 406, 379, 346, 438, 391, 328, 388, 388, 456, 429 Sample 2: 349, 359, 346, 395, 398, 401, 411, 384, 363, 437, 388, 273

Formula 2 (-10.4, 61.07)

Hypothesis Testing

Hypothesis testing is used to determine whether a statement about the value of a population parameter should or should not be rejected. To determine whether we believe the statement or should reject it, we compare the statistics we get from a sample with the parameters that would be measured IF the hypothesized values were true.

What if the interval contains 0?

If the interval contains 0, we say that the difference is not statistically significant. The actual difference between the two populations could be anywhere in the interval: positive, negative, or 0. In such a case, the data suggest that there may be no difference between the two population parameters in question.

When the variables of interest are both categorical and the decision maker is interested in determining whether a relationship exists between the two, a statistical technique known as contingency analysis is useful. T/F?

True. Contingency tables are used in statistics to summarize the relationship between several categorical variables. A contingency table is a special type of frequency distribution table.

If SST = 4000 and SSB = 2000, what is SSW?

4000 - 2000 = 2000

A cell phone service provider has selected a random sample of 20 of its customers in an effort to estimate the mean number of minutes used per day. The results of the sample included a sample mean of 34.5 minutes and a sample standard deviation equal to 11.5 minutes. Based on this information, and using a 95 percent confidence level: what is the critical value?

Degrees of freedom = n - 1 = 20 - 1 = 19. Using Excel, the critical value is t = 2.093.

The U.S. Post Office is interested in estimating the mean weight of packages shipped using the overnight service. They plan to sample 300 packages. A pilot sample taken last year showed that the standard deviation in weight was about 0.15 pound. If they are interested in an estimate that has 95 percent confidence, what margin of error can they expect?

E = (1.96 * 0.15)/square root of 300 = 0.017

Which test statistic is used for testing hypothesis in ANOVA?

F-statistic

Construct a 90% confidence interval estimate for the difference between two population means given the following sample data selected from two normally distributed populations with equal variances: Sample 1: 29, 25, 31, 35, 35, 37, 21, 29, 34 Sample 2: 42, 39, 38, 42, 40, 43, 46, 39, 35

Formula 4 (-13.34, -6.2)

Given the following null and alternative hypotheses, conduct a hypothesis test using an alpha equal to 0.05. (Note: The population standard deviations are assumed to be known.) H0: u1 ≤ u2 HA: u1 > u2 x1 = 144 x2 = 129 S1 = 11 S2 = 16 n1 = 40 n2 = 50

Formula 5: z = 5.26. Since z = 5.26 > 1.645, we reject the null hypothesis. Based on the sample data, we conclude that the mean for population 1 exceeds the mean for population 2.
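The card cites "Formula 5" from an answer sheet that is not reproduced here. Assuming it is the usual two-sample z statistic with known standard deviations, the result can be sketched in Python as follows; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(x1, x2, sd1, sd2, n1, n2, hypothesized_diff=0.0):
    """z statistic for the difference of two means with known standard deviations."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (x1 - x2 - hypothesized_diff) / standard_error

z = two_sample_z(x1=144, x2=129, sd1=11, sd2=16, n1=40, n2=50)
z_crit = NormalDist().inv_cdf(0.95)  # right-tailed test at alpha = 0.05, about 1.645
reject = z > z_crit
print(round(z, 2), reject)
```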

The Post Office in Mobile, Alabama wants to see if its operational efficiency has improved. Specifically they want to know if the average time to serve customers has dropped below 540 seconds (9 minutes). They select a sample of 16 customer visits and find that x̄ = 510 and s = 45. Use α = 0.01.

H0 : μ ≥ 540, HA : μ < 540.
t = (510 - 540) / (45 / square root of 16) ≈ −2.667.
Find the critical value in Excel: T.INV(0.01, 15) ≈ −2.602.
Because the test statistic is more extreme than the critical value (−2.667 < −2.602), we reject the null hypothesis.
If the negative numbers are confusing, just compare absolute values.
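A small Python sketch of the test statistic above. The standard library has no t distribution, so the critical value −2.602 is copied from the card and not computed; the function name is illustrative.

```python
from math import sqrt

def one_sample_t(x_bar, mu0, s, n):
    """t statistic for a one-sample test of the mean (hypothetical helper)."""
    return (x_bar - mu0) / (s / sqrt(n))

t = one_sample_t(x_bar=510, mu0=540, s=45, n=16)
t_crit = -2.602  # taken from the card (Excel T.INV(0.01, 15)), not computed here
reject = t < t_crit  # left-tailed test: reject when t falls below the critical value
print(round(t, 3), reject)
```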

Hypothesis Table

Test | Null | Alternative
Left-Tailed | μ1 ≥ μ2 | μ1 < μ2
Right-Tailed | μ1 ≤ μ2 | μ1 > μ2
Two-Tailed | μ1 = μ2 | μ1 ≠ μ2

The goodness-of-fit test is always a one-tail test with the rejection region in the upper tail. T/F?

True

A pet food producer manufactures and then fills 25-pound bags of dog food on two different production lines located in separate cities. In an effort to determine whether differences exist between the average fill rates for the two lines, a random sample of 19 bags from line 1 and a random sample of 23 bags from line 2 were recently selected. Each bag's weight was measured and the following summary measures from the samples were reported: n1 = 19 n2 = 23 x1 = 24.96 x2 = 25.01 S1 = 0.07 S2 = 0.08 Management believes that the fill rates of the two lines are normally distributed with equal variances. a) Calculate the point estimate for the difference between the population means of the two lines. b) Develop a 95% confidence interval estimate of the true mean difference between the two lines. c) Based on the 95% confidence interval estimate calculated in part b, what can the managers of the production lines conclude about the differences between the average fill rates for the two lines?

a) 24.96 - 25.01 = -0.05
b) Formula 4: (-0.0974, -0.0026)
c) Since the interval does not contain zero, the managers can conclude the two lines do not fill bags with equal average amounts. However, the difference is at most about 0.1 lb.
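"Formula 4" is not shown in this set. Assuming it is the pooled-variance interval for two independent samples with equal variances, part b can be reproduced with the sketch below. The function name is hypothetical, and the critical value t = 2.0211 (40 degrees of freedom, 95 percent confidence) is supplied from a t table, not computed.

```python
from math import sqrt

def pooled_ci(x1, x2, s1, s2, n1, n2, t_crit):
    """Confidence interval for mu1 - mu2 assuming equal population variances."""
    # Pooled standard deviation weights each sample variance by its degrees of freedom.
    sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    margin = t_crit * sp * sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin

low, high = pooled_ci(24.96, 25.01, 0.07, 0.08, 19, 23, t_crit=2.0211)
print(round(low, 4), round(high, 4))
```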

You are given the following null and alternative hypotheses: H0: u1 - u2 = 0 HA: u1 - u2 ≠ 0 n1 = 125 n2 = 120 S1 = 31 S2 = 38 x1 = 130 x2 = 105 a) Develop the appropriate decision rule, assuming a significance level of 0.05 is to be used. b) Test the null hypothesis and indicate whether the sample information leads you to reject or fail to reject the null hypothesis. Use the test statistic approach.

a) Decision rule: reject H0 if t > 1.9698 or t < -1.9698.
b) Formula 6: t = 5.652. Since 5.652 > 1.9698, reject H0.

A walk-in medical clinic believes that arrivals are uniformly distributed over weekdays (Monday through Friday). It has collected the following data based on a random sample of 100 days.
Day | Frequency
Mon | 25
Tue | 22
Wed | 19
Thu | 18
Fri | 16
Total | 100
Based on this information, how many degrees of freedom are involved in this goodness-of-fit test?

degrees of freedom = 5 - 1 = 4

The Conrad Real Estate Company recently conducted a statistical test to determine whether the number of days that homes are on the market prior to selling is normally distributed with a mean equal to 50 days and a standard deviation equal to 10 days. The sample of 200 homes was divided into 8 groups to form a grouped data frequency distribution. If a chi-square goodness-of-fit test is to be conducted using an α = 0.05, the critical value is 14.0671. T/F?

True. With k = 8 groups, df = k - 1 = 8 - 1 = 7. The critical value of the chi-square test from the Excel function =CHIINV(0.05,7) is 14.0671.

Paired Samples

In hypothesis testing, the observations are paired so that the two sets of observations relate to the same respondents, e.g., pre-test and post-test scores. Paired samples are used to control for outside sources of variation.

Always assume α = ____ unless told otherwise

0.05

Hypothesis Testing - p-value Method

1. Same first four steps as the critical value method
2. Instead of calculating a critical value, find the p-value
3. The p-value tells us the probability of observing a test statistic at least as extreme as ours, given the assumption that the null hypothesis is true
4. Compare the p-value to α
5. If the p-value is smaller than α, reject the null hypothesis
6. If p is low, H0 must go
7. The critical value and p-value methods always result in the same conclusion

Construct a 98% confidence interval estimate for the population mean given the following values: x = 120, std dev = 20, n = 50

120 ± (2.33 * 20 / square root of 50) = 120 ± 6.59 = (113.41, 126.59)

If we are testing for the difference between the means of two paired populations with samples of n1 = 20 and n2 = 20, the number of degrees of freedom is equal to:

Because the samples are paired, only the number of pairs is used to calculate degrees of freedom: 20 - 1 = 19 degrees of freedom. (The formula 20 + 20 - 2 = 38 applies to independent samples, not paired ones.)

A manager is interested in testing whether three populations have equal population means. Simple random samples of size 10 were selected from each population. The following ANOVA table and related statistics were computed:
Anova: Single Factor
Groups | Count | Sum | Average | Variance
Sample 1 | 10 | 507.18 | 50.72 | 35.06
Sample 2 | 10 | 405.79 | 40.58 | 30.08
Sample 3 | 10 | 487.64 | 48.76 | 23.13
ANOVA
Source | SS | df | MS | F | p-value | F crit
Between groups | 578.78 | 2 | 289.39 | 9.84 | 0.0006 | 3.354
Within groups | 794.36 | 27 | 29.42 | -- | -- | --
Total | 1,373.14 | 29 | -- | -- | -- | --
a) State the appropriate null and alternative hypotheses. b) Based on your answer to part a, what conclusions can you reach about the null and alternative hypotheses? Use a 0.05 level of significance. c) If warranted, use the Tukey-Kramer procedure for multiple comparisons to determine which populations have different means. (Assume alpha = 0.05) 12-3

...

A paired sample study has been conducted to determine whether two populations have equal means. Twenty paired samples were obtained with the following sample results: d-bar = 12.45 Sd = 11 Based on these sample data and a significance level of 0.05, what conclusion should be made about the population means? 10-40

...

A student club wishes to determine whether there are differences in new textbook prices at on-campus bookstores, off-campus bookstores, and Internet bookstores. To control for differences in textbook prices that might exist across disciplines, suppose the students randomly selected 12 textbooks and recorded the price of each of the 12 books at each of the three retailers. You are given that normality and equal-variance assumptions have been met. The partially completed ANOVA table based on the study's findings is shown here:
Source of Variation | SS | df | MS | F
Textbooks | 16624 | -- | -- | --
Retailer | 2.4 | -- | -- | --
Error | -- | -- | -- | --
Total | 17477.6 | -- | -- | --
a) Complete the ANOVA table by filling in the missing sums of squares, the degrees of freedom for each source, the mean square, and the calculated F-test statistic for each possible hypothesis test. b) Based on the study's findings, was it correct to block for differences in textbooks? Conduct the appropriate test at the alpha = 0.10 level of significance. c) Based on the study's findings, can you conclude that there is a difference in the average price of textbooks across the three retail outlets? Conduct the appropriate hypothesis test at the alpha = 0.10 level of significance. 12-18

...

An article in The American Statistician (M. L. R. Ernst, et al., "Scatterplots for unordered pairs," 50 (1996), pp. 260-265) reports on the difference in the measurements by two evaluators of the cardiac output of 23 patients using Doppler echocardiography. Both observers took measurements from the same patients. The measured outcomes were as follows: Conduct a hypothesis test to determine if the average cardiac outputs measured by the two evaluators differ. Use a significance level of 0.02. Assume the population variances to be equal. 10-47

...

One of the advances that helped to diminish carpal tunnel syndrome is ergonomic keyboards. The ergonomic keyboards may also increase typing speed. Ten administrative assistants were chosen to type on both standard and ergonomic keyboards. The resulting word-per-minute typing speeds follow: a) Were the two samples obtained independently? Support your assertion. b) Conduct a hypothesis test to determine if the ergonomic keyboards increase the average words per minute attained while typing. Use a p-value approach with a significance level of 0.01. Assume equal population variances. 10-43

...

Respond to each of the following questions using this partially completed one-way ANOVA table:
Source | SS | df | MS | F-ratio
Between Samples | -- | 3 | -- | --
Within Samples | 405 | -- | -- | --
Total | 888 | 31 | -- | --
a) How many different populations are being considered in this analysis? b) Fill in the ANOVA table with the missing values. c) State the appropriate null and alternative hypotheses. d) Based on the analysis of variance F-test, what conclusion should you reach regarding the null hypothesis? Test using alpha = 0.05. 12-5

...

Respond to each of the following questions using this partially completed one-way ANOVA table:
Source | SS | df | MS | F-ratio
Between Samples | 1,745 | -- | -- | --
Within Samples | -- | 240 | -- | --
Total | 6,504 | 246 | -- | --
a) How many different populations are being considered in this analysis? b) Fill in the ANOVA table with the missing values. c) State the appropriate null and alternative hypotheses. d) Based on the analysis of variance F-test, what conclusion should you reach regarding the null hypothesis? Test using a significance level of 0.01. 12-4

...

Suppose a project was undertaken by the Department of Labor in a southern state to determine whether the most recent economic recession affected men and women differently. Assume that random samples of 485 adult males and 242 adult females were selected. The subjects were asked whether they had been involuntarily unemployed for four or more consecutive weeks at any time during the past 8 years. Fifty of the men responded "Yes" to the question, and 36 women responded "Yes." Based on these sample data, what should the analysis by the Department of Labor conclude about whether there is a statistical difference in the proportions of men and women who were unemployed? Test at an alpha = 0.05 level. 10-59

...

Suppose, as part of a national study of economic competitiveness, a marketing research firm randomly sampled 200 adults between the ages of 27 and 35 living in metropolitan Seattle and 180 adults between the ages of 27 and 35 living in metropolitan Minneapolis. Each adult selected in the sample was asked, among other things, whether he or she had a college degree. From the Seattle sample, 66 adults answered yes, and from the Minneapolis sample, 63 adults answered yes. Based on the sample data, can we conclude that there is a difference between the population proportions of adults between the ages of 27 and 35 in the two cities with college degrees? Use a level of significance of 0.01 to conduct the appropriate hypothesis test. 10-57

...

The United Way raises money for community charity activities. In one community, the fundraising committee was concerned about whether there is a difference in the proportion of employees who give to United Way depending on whether the employer is a private business or a government agency. A random sample of people who had been contacted about contributing last year was selected. Of those contacted, 70 worked for a private business and 50 worked for a government agency. For the 70 private-sector employees, the mean contribution was $230.25 with a standard deviation equal to $55.52. For the 50 government employees in the sample, the mean and standard deviation were $309.45 and $61.75, respectively. Assume equal population variances. a) Based on these sample data and α = 0.05, what should the committee conclude? Be sure to show the decision rule. b) Construct a 95% confidence interval for the difference between the mean contributions of private business and government agency employees who contribute to United Way. Do the hypothesis test and the confidence interval produce compatible results? Explain and give reasons for your answer. 10-45

...

The following data were collected for a randomized block analysis of variance design with four populations and eight blocks:
Block | Group 1 | Group 2 | Group 3 | Group 4
Block 1 | 56 | 44 | 57 | 84
Block 2 | 34 | 30 | 38 | 50
Block 3 | 50 | 41 | 48 | 52
Block 4 | 19 | 17 | 21 | 30
Block 5 | 33 | 30 | 35 | 38
Block 6 | 74 | 72 | 78 | 79
Block 7 | 33 | 24 | 27 | 33
Block 8 | 56 | 44 | 56 | 71
a) State the appropriate null and alternative hypotheses for the treatments and determine whether blocking is necessary. b) Construct the appropriate ANOVA table. c) Using a significance level equal to 0.05, can you conclude that blocking was necessary in this case? Use a test-statistic approach. d) Based on the data and a significance level equal to 0.05, is there a difference in population means for the four groups? Use a p-value approach. e) If you found that a difference exists in part d, use the LSD approach to determine which populations have different means. 12-19

...

The following samples are observations taken from the same elements at two different times: a) Assume that the populations are normally distributed and construct a 90% confidence interval for the difference in the means of the distribution at the times in which the samples were taken. b) Perform a test of hypothesis to determine if the difference in the means of the distribution at the first time period is 10 units larger than at the second time period. Use a level of significance equal to 0.10. 10-41

...

You are given the following sample data:
Item | Group 1 | Group 2 | Group 3 | Group 4
1 | 20.9 | 28.2 | 17.8 | 21.2
2 | 27.2 | 26.2 | 15.9 | 23.9
3 | 26.6 | 21.6 | 18.4 | 19.5
4 | 22.1 | 29.7 | 20.2 | 17.4
5 | 25.3 | 30.3 | 14.1 | --
6 | 30.1 | 25.9 | -- | --
7 | 23.8 | -- | -- | --
a) Based on the computations for the within- and between-sample variation, develop the ANOVA table and test the appropriate null hypothesis using alpha = 0.05. Use the p-value approach. b) If warranted, use the Tukey-Kramer procedure to determine which populations have different means. Use alpha = 0.05. 12-6

...

Critical Value Method

1. Formulate the null and alternative hypotheses and determine if the test is 1- or 2-sided
2. Set the sample size, collect the data, and compute the appropriate test statistic
3. Determine the level of error that can be tolerated (α)
4. Draw a picture
5. Find the critical value(s) and label the rejection zone
6. Compare the critical value with the test statistic. If the test statistic is more extreme than the critical value, we reject the null hypothesis

Finding the Confidence Interval for Paired Samples

1. Paired difference: d = x1 − x2
2. Point estimate for the population mean paired difference: d-bar = Σdi / n
3. Standard deviation of the paired differences: Sd = square root of [ Σ(di − d-bar)^2 / (n − 1) ]
4. Confidence interval for the mean paired difference: d-bar ± t * (Sd / square root of n)
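The four steps can be sketched in Python as below. The data are invented purely for illustration (they do not come from any card), the function name is hypothetical, and the critical value t = 2.776 (4 degrees of freedom, 95 percent confidence) is supplied from a t table because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

def paired_ci(x1, x2, t_crit):
    """Confidence interval for the mean paired difference (hypothetical helper)."""
    d = [a - b for a, b in zip(x1, x2)]   # step 1: paired differences
    d_bar = mean(d)                       # step 2: point estimate
    s_d = stdev(d)                        # step 3: sample std dev (divides by n - 1)
    margin = t_crit * s_d / sqrt(len(d))  # step 4: margin of error
    return d_bar, (d_bar - margin, d_bar + margin)

# Made-up pre-test and post-test scores for five respondents.
before = [10, 12, 9, 14, 11]
after = [12, 15, 9, 17, 13]
d_bar, (low, high) = paired_ci(before, after, t_crit=2.776)
print(d_bar, round(low, 3), round(high, 3))
```

Because this made-up interval lies entirely below zero, it would indicate a statistically significant difference at the 0.05 level.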

Discount Sounds has 260 retail outlets throughout the United States. The firm is evaluating a potential location for a new outlet, based in part on the mean annual income of the individuals in the marketing area of the new location. A sample of size n = 36 was taken; the sample mean income is $41,100. The population standard deviation is estimated to be $4,500, and the confidence level to be used in the interval estimate is .95. Determine the confidence interval.

41,100 ± 1.96 * (4,500 / square root of 36) = 41,100 ± 1,470, so 39,630 ≤ μ ≤ 42,570. This means that we are 95% confident that the actual population mean falls between 39,630 and 42,570.

A chi-square test for goodness-of-fit is used to test whether or not there are any preferences among 3 brands of peas. If the study uses a sample of n = 60 subjects, then the expected frequency for each category would be:

60/3 = 20

A walk-in medical clinic believes that arrivals are uniformly distributed over weekdays (Monday through Friday). It has collected the following data based on a random sample of 100 days.
Day | Frequency
Mon | 25
Tue | 22
Wed | 19
Thu | 18
Fri | 16
Total | 100
Assuming that a goodness-of-fit test is to be conducted using a 0.10 level of significance, the critical value is:

7.7794. There are 5 categories, so df = 5 - 1 = 4; the chi-square critical value for alpha = 0.10 and df = 4 is 7.7794, from a chi-square table or Excel's =CHIINV(0.10,4).

Consider a goodness-of-fit test with a computed value of chi-square = 1.273 and a critical value = 13.388, the appropriate conclusion would be to:

Reject H0 only if the computed chi-square value is greater than the critical value. Here the computed value 1.273 is less than the critical value 13.388, so we fail to reject H0.

Two-Tailed Test

A hypothesis test in which rejection of the null hypothesis occurs for values of the test statistic in either tail of its sampling distribution.

One-Tailed Test

A hypothesis test in which rejection of the null hypothesis occurs for values of the test statistic in one tail of its sampling distribution.

right-tailed test

A one-tailed test in which the sample outcome is hypothesized to be at the right tail of the sampling distribution.

A walk-in medical clinic believes that arrivals are uniformly distributed over weekdays (Monday through Friday). It has collected the following data based on a random sample of 100 days.
Day | Frequency
Mon | 25
Tue | 22
Wed | 19
Thu | 18
Fri | 16
Total | 100
Based on these data, conduct a goodness-of-fit test using a 0.10 level of significance. Which conclusion is correct?

Arrivals are uniformly distributed over the weekdays because the test statistic is less than the critical value (2.5 < 7.7794), so we fail to reject the null hypothesis.
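The test statistic for the clinic data can be computed with a few lines of Python. This is a sketch with an illustrative function name; the critical value 7.7794 is taken from the related card, not computed.

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [25, 22, 19, 18, 16]  # Mon through Fri
# Uniform arrivals imply the same expected count for every weekday: 100 / 5 = 20.
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = chi_square_gof(observed, expected)
crit = 7.7794  # alpha = 0.10, df = 4, from the related card
reject = chi2 > crit  # the rejection region is always the upper tail
print(round(chi2, 2), reject)
```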

Hypothesis Testing vs. Confidence Intervals

Confidence intervals and 2-tailed hypothesis tests are directly comparable. Example: A market research consultant hired by the Pepsi-Cola Co. is interested in knowing if the proportion of consumers who favor Pepsi-Cola over Coke Classic is different than 50%. A random sample of 250 consumers from the market under investigation provided a sample proportion of .464. Can you create a confidence interval based on the sample data using 90% confidence?

Interval Data

Differences between values can be found, but there is no absolute zero. (Example: temperature and clock time)

The categories of a factor in ANOVA are referred to as?

Levels

A survey was recently conducted in which males and females were asked whether they owned a laptop personal computer. The following data were observed:
Ownership | Males | Females
Have Laptop | 120 | 70
No Laptop | 50 | 60
Given this information, if an alpha level of .05 is used, the test statistic for determining whether having a laptop is independent of gender is approximately 14.23. T/F?

False
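The card gives only the verdict. As a check, the chi-square statistic for the laptop table can be computed directly; the sketch below uses an illustrative function name and no continuity correction, and it gives roughly 8.89, not 14.23.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / grand total.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Rows: have laptop, no laptop. Columns: males, females.
chi2, df = chi_square_independence([[120, 70], [50, 60]])
print(round(chi2, 2), df)
```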

Choosing an alpha of 0.01 will cause beta to equal 0.99. T/F?

False

In conducting a test of independence for a contingency table that has 4 rows and 3 columns, the number of degrees of freedom is 11. T/F?

False

In estimating the difference between two population means, if a 95 percent confidence interval includes zero, then we can conclude that there is a 95 percent chance that the difference between the two population means is zero. T/F?

False

Recently the managers for a large retail department store stated that a study has revealed that female shoppers spend on average 23.5 minutes longer in the store per visit than do male shoppers. Based on this information, the managers can be confident that female shoppers, as a population, do spend longer times in the store than do male shoppers, as a population. T/F?

False

The NCAA is interested in estimating the difference in mean number of daily training hours for men and women athletes on college campuses. They want 95 percent confidence and will select a sample of 10 men and 10 women for the study. If the NCAA assumes that the population standard deviations are known, the critical value for the confidence interval is t = 2.1009. T/F?

False

The Tukey-Kramer method for multiple comparisons can only be used when the analysis of variance design is balanced. T/F?

False

In each of the following cases, determine if the sample sizes are large enough so that the sampling distribution of the differences in the sample proportions can be approximated with a normal distribution: a) n1 = 15, n2 = 20, x1 = 6, & x2 = 16 b) n1 = 10, n2 = 30, p1 = 0.6, x2 = 19 c) n1 = 25, n2 = 16, x1 = 6, p2 = 0.4 d) n1 = 100, n2 = 75, p1 = 0.05, p2 = 0.05

For all parts of this problem, let q = 1 - p.
a) n1p1 = x1 = 6 > 5, n1q1 = n1 - x1 = 15 - 6 = 9 > 5; n2p2 = x2 = 16 > 5, n2q2 = n2 - x2 = 20 - 16 = 4 < 5. The last test failed. Therefore, the sampling distribution cannot be approximated with a normal distribution.
b) n1p1 = x1 = 10(0.60) = 6 > 5, n1q1 = 10(0.4) = 4 < 5; n2p2 = x2 = 19 > 5, n2q2 = n2 - x2 = 30 - 19 = 11 > 5. The second test failed. Therefore, the sampling distribution cannot be approximated with a normal distribution.
c) n1p1 = x1 = 6 > 5, n1q1 = n1 - x1 = 25 - 6 = 19 > 5; n2p2 = x2 = 16(0.4) = 6.4 > 5, n2q2 = n2 - x2 = 16 - 6.4 = 9.6 > 5. There were no tests that failed. Therefore, the sampling distribution can be approximated with a normal distribution.
d) n1p1 = 100(0.05) = 5 ≥ 5, n1q1 = 100(0.95) = 95 > 5; n2p2 = x2 = 75(0.05) = 3.75 < 5, n2q2 = 75(0.95) = 71.25 > 5. The third test failed. Therefore, the sampling distribution cannot be approximated with a normal distribution.
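The four checks can be written as one small Python function. This sketch assumes the rule "every one of n1p1, n1q1, n2p2, n2q2 must be at least 5"; the function name and the `threshold` parameter are illustrative, and proportions are converted to counts before the call.

```python
def normal_approx_ok(n1, x1, n2, x2, threshold=5):
    """True when all four success/failure counts meet the threshold."""
    counts = [x1, n1 - x1, x2, n2 - x2]  # n1*p1, n1*q1, n2*p2, n2*q2
    return all(count >= threshold for count in counts)

cases = {
    "a": normal_approx_ok(15, 6, 20, 16),
    "b": normal_approx_ok(10, 10 * 0.6, 30, 19),
    "c": normal_approx_ok(25, 6, 16, 16 * 0.4),
    "d": normal_approx_ok(100, 100 * 0.05, 75, 75 * 0.05),
}
print(cases)  # only case c passes
```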

Barton Industries operates two manufacturing facilities that specialize in doing custom manufacturing work for the semiconductor industry. The facility in Denton, Texas, is highly automated, whereas the facility in Lincoln, Nebraska, has more manual functions. For the past few months, both facilities have been working on a large order for a specialized product. The vice president of operations is interested in estimating the difference in mean times it takes to complete a part on the two lines. To do this, he has requested that a random sample of 15 parts at each facility be tracked from start to finish and the times required be recorded. The following sample data were recorded: Denton, Texas: x1 = 56.7 hours, S1 = 7.1 hours. Lincoln, Nebraska: x2 = 70.4 hours, S2 = 8.3 hours. Assuming that the populations are normally distributed with equal population variances, construct and interpret a 95% confidence interval estimate.

Formula 4 (-19.47, -7.93)

The following null and alternative hypotheses have been stated: H0: u1 - u2 = 0 HA: u1 - u2 ≠ 0 To test the null hypothesis, random samples have been selected from the two normally distributed populations with equal variances. The following sample data were observed: Sample 1 (Population): 33, 29, 35, 39, 39, 41, 25, 33, 38 Sample 2 (Population): 46, 43, 42, 46, 44, 47, 50, 43, 39 Test the null hypothesis using an alpha level equal to 0.05.

Formula 6: t = -4.80. Since t = -4.80 < -2.1199, we reject the null hypothesis. Based on the sample data, we conclude that the mean for population 1 is not equal to the mean for population 2.

High Mountain produces a variety of climbing and mountaineering equipment. One of its products is a traditional three-strand climbing rope. An important characteristic of any climbing rope is its tensile strength. High Mountain produces the three-strand rope on two separate production lines: one in Bozeman and the other in Challis. The Bozeman line has recently installed new production equipment. High Mountain regularly tests the tensile strength of its ropes by randomly selecting ropes from production and subjecting them to various tests. The most recent random sample of ropes, taken after the new equipment was installed at the Bozeman plant, revealed the following: x1 = 7200 lb x2 = 7087 lb S1 = 425 lb S2 = 415 lb n1 = 25 n2 = 20 High Mountain's production managers are willing to assume that the population of tensile strengths for each plant is approximately normally distributed with equal variances. Based on the sample results, can High Mountain's managers conclude that there is a difference between the mean tensile strengths of ropes produced in Bozeman and Challis? Conduct the appropriate hypothesis test at the 0.05 level of significance.

Formula 6: t = 0.896. Because the calculated value of t = 0.896 is neither less than the lower-tail critical value of t = -2.0167 nor greater than the upper-tail critical value of t = 2.0167, do not reject the null hypothesis. Based on these sample data, at the α = 0.05 level of significance there is not sufficient evidence to conclude that the average tensile strength of ropes produced at the two plants is different.

A store believes that its biggest spenders per transaction has switched from young shoppers (under 30) to middle aged shoppers (30-50). To test this, they pulled the last 400 purchases to determine how much money was spent on an average purchase by each group The sample of 311 young shopper purchases had a mean of$42 with a standard deviation of $18. The 89 middle aged shoppers spent on average $47 with a standard deviation of$19.5 With α = 0.05, determine if there is conclusive evidence that middle aged shoppers spend more on average per purchase than young shoppers do?

H0: μY ≥ μMA, HA: μY < μMA, left-tailed test. xY = 42, sY = 18, nY = 311; xMA = 47, sMA = 19.5, nMA = 89. Sp = square root of [ ((311 - 1) * 18^2 + (89 - 1) * 19.5^2) / (311 + 89 − 2) ] ≈ 18.342. Square root of [ (1/311) + (1/89) ] ≈ 0.120. Calculate the test statistic: t = [(42 - 47) - 0] / (18.342 * 0.120) ≈ −2.268. Find the critical value: t.inv(.05, 398) ≈ −1.649. Find the p-value: t.dist(−2.268, 398, TRUE) ≈ 0.012. Because −2.268 < −1.649 (equivalently, 0.012 < 0.05), reject the null hypothesis: there is evidence that middle-aged shoppers spend more on average per purchase than young shoppers.

An analyst is interested in testing whether four populations have equal means. The following sample data have been collected from populations that are assumed to be normally distributed with equal variances: Sample 1: 9, 6, 11, 14, 14 Sample 2: 12, 16, 16, 12, 9 Sample 3: 8, 8, 12, 7, 10 Sample 4: 17, 15, 17, 16, 13 Conduct the appropriate hypothesis test using a significance level equal to 0.05.

H0: μ1 = μ2 = μ3 = μ4. HA: not all μ are equal. x1 = 10.8, x2 = 13, x3 = 9, x4 = 15.6. The grand mean is x = 12.1. The F critical value from the F-distribution for alpha = 0.05 with D1 = 3 and D2 = 16 degrees of freedom is 3.239. Thus, the decision rule is: if the test statistic F > 3.239, reject the null hypothesis; otherwise do not reject. From the ANOVA table (solve in Excel): SSB = 121.8, SSW = 110, MSB = 40.6, MSW = 6.875, so F = MSB/MSW = 5.905. Since the test statistic F = 5.905 > 3.239, reject the null hypothesis. Also, using the p-value approach, because p-value = 0.0065 < 0.05, we reject the null hypothesis.
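As a cross-check on the Excel output, here is a minimal sketch of the one-way ANOVA sums of squares in plain Python. The critical value 3.239 is taken from the answer above, not computed.

```python
from statistics import mean

samples = [
    [9, 6, 11, 14, 14],
    [12, 16, 16, 12, 9],
    [8, 8, 12, 7, 10],
    [17, 15, 17, 16, 13],
]

k = len(samples)                          # number of groups
n_total = sum(len(s) for s in samples)    # total observations
grand_mean = mean(x for s in samples for x in s)

# Between-group and within-group sums of squares
ssb = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
ssw = sum((x - mean(s)) ** 2 for s in samples for x in s)

msb = ssb / (k - 1)
msw = ssw / (n_total - k)
f_stat = msb / msw

f_crit = 3.239  # F critical value for alpha = 0.05, df = (3, 16), quoted in the answer
print(round(ssb, 1), round(ssw, 1), round(f_stat, 3), f_stat > f_crit)
```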

We are interested in determining whether the opinions of individuals on gun control (Yes, No, and No Opinion) are uniformly distributed. A sample of 150 was taken and the following data were obtained.
Do you support gun control? | # of responses
Yes | 40
No | 60
No Opinion | 50
The conclusion of the test with alpha = 0.05 is that the views of people on gun control are:

If the views are uniformly distributed, the expected frequency for each category is 150/3 = 50. Test statistic = (40 - 50)^2 / 50 + (60 - 50)^2 / 50 + (50 - 50)^2 / 50 = 4. The critical value is the chi-square value with k - 1 = 3 - 1 = 2 degrees of freedom at α = 0.05, which is 5.9915 (in Excel, =CHISQ.INV.RT(0.05, 2)). Since the test statistic is less than the critical value, we fail to reject H0 at α = 0.05: there is not sufficient evidence to conclude that the views are not uniformly distributed.
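The goodness-of-fit arithmetic can be sketched in a few lines. One assumption worth flagging: the closed form used for the critical value only works because there are exactly 2 degrees of freedom, where the chi-square right-tail probability is exp(-x/2); other degrees of freedom need a table or Excel.

```python
import math

observed = [40, 60, 50]                        # Yes, No, No Opinion
n = sum(observed)
expected = [n / len(observed)] * len(observed) # uniform: 50 each

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

alpha = 0.05
# Special case df = 2: P(X > x) = exp(-x / 2), so the critical value is -2 * ln(alpha).
chi_crit = -2 * math.log(alpha)

reject = chi_sq > chi_crit
print(chi_sq, round(chi_crit, 4), reject)
```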

The produce manager for a large retail food chain is interested in estimating the percentage of potatoes that arrive on a shipment with bruises. A random sample of 150 potatoes showed 14 with bruises. Based on this information, what is the margin of error for a 95 percent confidence interval estimate?

The confidence level is 0.95, so the remaining 5% is split into two tails of 2.5% each. The cumulative probability up to the upper critical value is 1 - 0.025 = 0.975, and normsinv(0.975) = 1.96. Sample proportion: p = 14/150 ≈ 0.0933. Standard error = square root of [ p (1 - p) / n ] with n = 150, which is ≈ 0.0238. Margin of error = 1.96 * 0.0238 ≈ 0.0466.
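A small sketch of the same calculation, using Python's standard-library normal distribution in place of normsinv. The function name `proportion_margin` is illustrative only.

```python
import math
from statistics import NormalDist

def proportion_margin(successes, n, confidence):
    """Margin of error for a confidence interval on a single proportion."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # e.g. 1.96 for 95%
    return z * math.sqrt(p * (1 - p) / n)

moe = proportion_margin(14, 150, 0.95)
print(round(moe, 4))
```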

A reporter for a student newspaper is writing an article on the cost of off-campus housing. A sample of 16 efficiency apartments within a half-mile of campus resulted in a sample mean of $750 per month and a sample standard deviation of $55. Let us provide a 95% confidence interval estimate of the mean rent per month for the population of efficiency apartments within a half-mile of campus. We will assume this population to be normally distributed.

Need to find the t-value. Look it up in a t-table or use Excel: = t.inv.2t(α, d.f.), where α = 1 - 0.95 = 0.05 and d.f. = 16 - 1 = 15. In this case = t.inv.2t(.05, 15) = 2.131. 750 ± 2.131 * (55/square root of 16) = 750 ± 29.3, so 720.7 ≤ μ ≤ 779.3.

There are a number of highly touted search engines for finding things of interest on the Internet. Recently a consumer rating system ranked two search engines ahead of the others. Now, a computer user's magazine wishes to make the final determination regarding which one is actually better at finding particular information. To do this, each search engine was used in an attempt to locate specific information using specified keywords. Both search engines were subjected to 100 queries. Search engine 1 successfully located the information 88 times and search engine 2 located the information 80 times. Using a significance level equal to 0.05, what is the null hypothesis to be tested?

H0: p1 - p2 = 0 (HA: p1 - p2 ≠ 0). p1 = 88/100 = 0.88, p2 = 80/100 = 0.80. SE1 = sqrt(0.88*(1-0.88)/100) = 0.0325, SE2 = sqrt(0.8*(1-0.8)/100) = 0.04, SE = sqrt(SE1^2 + SE2^2) = sqrt(0.0325^2 + 0.04^2) = 0.05154. Z = (p1 - p2)/SE = (0.88 - 0.80)/0.05154 ≈ 1.55 (using the pooled proportion 168/200 = 0.84 in the standard error instead gives z ≈ 1.54). At the 0.05 significance level the critical values are z = ±1.96. Since the test statistic lies between -1.96 and 1.96, it does not fall in the rejection region, and there is not sufficient evidence to conclude that a difference exists between the proportions of successful searches.

A decision maker wishes to test the following null and alternative hypotheses using an alpha level equal to 0.05: H0: μ1 - μ2 = 0, HA: μ1 - μ2 ≠ 0. The population standard deviations are assumed to be known. After collecting the sample data, the test statistic is computed to be z = 1.78. Using the p-value approach, what decision should be reached about the null hypothesis?

Since p-value = 0.0375 > α/2 = 0.025, do not reject the null hypothesis.

The following information is based on independent random samples taken from two normally distributed populations having equal variances: n1 = 15 n2 = 13 x1 = 50 x2 = 53 S1 = 5 S2 = 6 Based on the sample information, determine the 90% confidence interval estimate for the difference between the two population means.

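Sp = sqrt[ ((n1 - 1)S1^2 + (n2 - 1)S2^2) / (n1 + n2 - 2) ] = sqrt[ (14 * 25 + 12 * 36) / 26 ] ≈ 5.484. Interval estimate = (x1 - x2) ± t * Sp * sqrt(1/n1 + 1/n2), with t = 1.7056 for 90% confidence and 26 degrees of freedom: (50 - 53) ± 1.7056 * 5.484 * sqrt(1/15 + 1/13) ≈ -3 ± 3.54, i.e. approximately (-6.54, 0.54).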

What does it mean to reject the null hypothesis?

The sample data provide statistically significant evidence against H0, for example a significant difference between the control group and the intervention group(s).

The following paired sample data have been obtained from normally distributed populations. Construct a 90% confidence interval estimate for the mean paired difference between the two population means.
Sample # | Pop 1 | Pop 2
1 | 3693 | 4635
2 | 3679 | 4262
3 | 3921 | 4293
4 | 4106 | 4197
5 | 3808 | 4536
6 | 4394 | 4494
7 | 3878 | 4094

The confidence interval estimate can be developed using the following steps: 1. Define the population value of interest. The samples are paired, so the population value of interest is μd, the mean paired difference between the two populations. 2. Specify the desired confidence level (90 percent). 3. Collect the sample data and compute the point estimate and the standard deviation of the paired differences. First compute the paired differences: d1 = -942, d2 = -583, d3 = -372, d4 = -91, d5 = -728, d6 = -100, d7 = -216. The point estimate is their mean, about -433.1. 4. Calculate the standard deviation of the differences (Formula 9), about 328.4. 5. Determine the critical value, t, from the t-distribution table. The confidence level is 90 percent, so the critical value is a t-value with 7 - 1 = 6 degrees of freedom: t = 1.9432. 6. Compute the confidence interval estimate (Formula 10): approximately (-674, -191).
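The steps above can be sketched as follows. The t value 1.9432 is taken from the card rather than computed, and the variable names are illustrative.

```python
import math
from statistics import mean, stdev

pop1 = [3693, 3679, 3921, 4106, 3808, 4394, 3878]
pop2 = [4635, 4262, 4293, 4197, 4536, 4494, 4094]

diffs = [a - b for a, b in zip(pop1, pop2)]  # paired differences, Pop 1 minus Pop 2
d_bar = mean(diffs)                          # point estimate of the mean paired difference
s_d = stdev(diffs)                           # sample standard deviation of the differences

t_crit = 1.9432  # t for 90% confidence, df = 6 (from the t-table, as on the card)
margin = t_crit * s_d / math.sqrt(len(diffs))
lower, upper = d_bar - margin, d_bar + margin
print(round(d_bar, 1), round(s_d, 1), round(lower, 1), round(upper, 1))
```

Rounding rather than truncating the endpoints gives about (-674.4, -191.9), which agrees with the interval on the card to the nearest unit or so.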

Which of the following statements is true? a) The alternative hypothesis should contain the equality. b) Alpha represents the probability of making a Type II error. c) Alpha and beta are directly related such that when one is increased the other will increase also. d) The decision maker controls the probability of making a Type I statistical error.

The decision maker controls the probability of making a Type I statistical error.

A decision maker wishes to test the following null and alternative hypotheses using an alpha level equal to 0.05: H0: μ1 - μ2 = 0, HA: μ1 - μ2 ≠ 0. The population standard deviations are assumed to be known. After collecting the sample data, the test statistic is computed to be z = 1.78. Using the test statistic approach, what conclusion should be reached about the null hypothesis?

For a two-tailed test at α = 0.05 the critical values are z = ±1.96. Because z = 1.78 < 1.96, do not reject the null hypothesis. The p-value approach agrees: the one-tail area is P(Z > 1.78) = 0.0375 (from the standard normal table), and since 0.0375 > α/2 = 0.025, do not reject the null hypothesis.

The logic behind the chi-square goodness-of-fit test is based on determining how far the actual observed frequencies are from the expected frequencies. T/F?

True

Given the following null and alternative hypotheses and level of significance. H0: p1 = p2 HA: p1 ≠ p2 Alpha = 0.1 together with the sample information n1 = 120 n2 = 150 x1 = 42 x2 = 57 conduct the appropriate hypothesis test using the p-value approach. What conclusion should be reached concerning the null hypothesis?

The sample proportions are p1 = 42/120 = 0.35 and p2 = 57/150 = 0.38. Using Formula 14 with the pooled proportion (42 + 57)/(120 + 150) ≈ 0.367, the test statistic is z ≈ -0.51. The probability of finding a z-value this small or smaller when the null hypothesis is true is approximately 0.5 - 0.1950 = 0.3050. Because this is a two-tailed test, the p-value is twice this amount: 2 * 0.3050 = 0.61. There is evidence to reject the null hypothesis only when the p-value is smaller than α. Here, because the p-value is greater than α = 0.1, we do not reject the null hypothesis: there is not sufficient evidence of a difference between the two population proportions.
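A minimal sketch of the pooled two-proportion z test, assuming "Formula 14" is the usual pooled form (the card does not spell it out). The function name is hypothetical.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic and two-tailed p-value for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = two_proportion_z(42, 120, 57, 150)
alpha = 0.10
print(round(z, 2), round(p_value, 2), p_value < alpha)
```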

To calculate beta requires making a "what if" assumption about the true population parameter, where the "what-if" value is one that would cause the null hypothesis to be false. T/F?

True

The cost of a college education has increased at a much faster rate than costs in general over the past twenty years. In order to compensate for this, many students work part- or full-time in addition to attending classes. At one university, it is believed that the average hours students work per week exceeds 20. To test this at a significance level of 0.05, a random sample of n = 20 students was selected and the following values were observed:
26 | 15 | 10 | 40
10 | 20 | 30 | 36
40 | 0 | 5 | 10
20 | 32 | 16 | 12
40 | 36 | 10 | 0
Based on these sample data, what is the critical value expressed in hours?

This is a right-tailed test. t = invt(.95, df = 19) = 1.72913, s = 13.609. Critical value = 20 + 1.72913 * 13.609 / sqrt(20) = 25.26 hours.
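The critical value in hours can be reproduced with a short sketch. The t value is copied from the card; the sample standard deviation is computed from the data.

```python
import math
from statistics import stdev

hours = [26, 15, 10, 40, 10, 20, 30, 36, 40, 0,
         5, 10, 20, 32, 16, 12, 40, 36, 10, 0]

mu0 = 20          # hypothesized mean hours worked per week
t_crit = 1.72913  # right-tail t for alpha = 0.05, df = 19 (from the card)

s = stdev(hours)
x_crit = mu0 + t_crit * s / math.sqrt(len(hours))  # reject H0 if the sample mean exceeds this
print(round(s, 3), round(x_crit, 2))
```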

Trading off Precision and Confidence

To be more confident, we need a wider interval (less precise) so that it covers more possibilities. If we want a narrower interval (more precision), we cannot be as confident. So, if we move from a 95% confidence level to a 90% confidence level, the interval gets narrower. To become more precise without losing confidence, we can increase the sample size n.
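To make the trade-off concrete, here is a small sketch comparing interval widths. The values σ = 10 and n = 25 are made-up numbers for illustration, not taken from any card.

```python
import math
from statistics import NormalDist

def z_interval_width(sigma, n, confidence):
    """Full width of a z confidence interval for a mean (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / math.sqrt(n)

w95 = z_interval_width(10, 25, 0.95)       # hypothetical sigma = 10, n = 25
w90 = z_interval_width(10, 25, 0.90)       # lower confidence: narrower interval
w95_big = z_interval_width(10, 100, 0.95)  # same confidence, 4x the sample: half the width
print(round(w95, 2), round(w90, 2), round(w95_big, 2))
```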

A survey was recently conducted in which males and females were asked whether they owned a laptop personal computer. The following data were observed:
 | Males | Females
Have Laptop | 120 | 70
No Laptop | 50 | 60
Given this information, if an alpha level of .05 is used, the critical value for testing whether the two variables are independent is χ2 = 3.8415. T/F?

True
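A sketch of why this holds: a 2 x 2 table has (2 - 1)(2 - 1) = 1 degree of freedom. The closed form below for the critical value is specific to 1 degree of freedom, where a chi-square variable is the square of a standard normal. The expected counts are computed along the way.

```python
from statistics import NormalDist

observed = [[120, 70],   # Have Laptop: males, females
            [50, 60]]    # No Laptop:   males, females

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / grand total
expected = [[r * c / n for c in col_totals] for r in row_totals]

df = (len(observed) - 1) * (len(observed[0]) - 1)

alpha = 0.05
# Special case df = 1: critical chi-square = (z at 1 - alpha/2) squared
chi_crit = NormalDist().inv_cdf(1 - alpha / 2) ** 2
print(df, round(chi_crit, 4), [[round(e, 2) for e in row] for row in expected])
```

The expected counts always sum to the same total as the observed counts, since they are built from the same row and column totals.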

A survey was recently conducted in which males and females were asked whether they owned a laptop personal computer. The following data were observed:
 | Males | Females
Have Laptop | 120 | 70
No Laptop | 50 | 60
Given this information, if an alpha level of .05 is used, the sum of the expected cell frequencies will be equal to the sum of the observed cell frequencies. T/F?

True

If the test statistic for a chi-square goodness-of-fit test is larger than the critical value, the null hypothesis should be rejected. T/F?

True

A major retail clothing store is interested in estimating the difference in mean monthly purchases by customers who use the store's in-house credit card versus using a Visa, Mastercard, or one of the other major credit cards. To do this, it has randomly selected a sample of customers who have made one or more purchases with each of the types of credit cards. The following represents the results of the sampling:
 | In-House Credit Card | National Credit Card
Sample Size | 86 | 113
Mean Monthly Purchases | 45.67 | 39.87
Std Dev | 10.90 | 12.47
Suppose that the managers wished to test whether there is a statistical difference in the mean monthly purchases by customers using the two types of credit cards. Using a significance level of .05, what is the value of the test statistic assuming the standard deviations are known?

Z=(xbar1-xbar2)/sqrt(s1^2/n1+s2^2/n2) =(45.67-39.87)/sqrt(10.9^2/86+12.47^2/113) =3.49

Standard Deviation

a computed measure of how much scores vary around the mean score

Null Hypothesis

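a statement about a population parameter that is assumed to be true unless the sample data provide sufficient evidence to reject it; it is the claim that can be falsified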

Independent random samples of size 50 and 75 are selected. The sampling results in 35 and 35 successes, respectively. Test the following hypotheses: a) H0: p1 - p2 = 0 vs HA: p1 - p2 ≠ 0 b) H0: p1 - p2 ≥ 0 vs HA: p1 - p2 < 0 c) H0: p1 - p2 ≤ 0 vs HA: p1 - p2 > 0 d) H0: p1 - p2 = 0.05 vs HA: p1 - p2 ≠ 0.05 Use alpha = 0.02

a) Using Formula 14, z = 2.538. Since z = 2.538 > 1.96, reject H0. b) Using the steps found in the text: (1) p1 - p2, (2) H0: p1 - p2 ≥ 0 vs. HA: p1 - p2 < 0, (3) α = 0.05, (4) the critical value is -1.645; reject H0 if z < -1.645, (5) z = 2.538, (6) since z = 2.538 > -1.645, fail to reject H0, (7) there is not sufficient evidence to conclude that p1 - p2 < 0. c) Using the steps found in the text: (1) p1 - p2, (2) H0: p1 - p2 ≤ 0 vs. HA: p1 - p2 > 0, (3) α = 0.025, (4) the critical value is 1.96; reject H0 if z > 1.96, (5) z = 2.538, (6) since z = 2.538 > 1.96, reject H0, (7) there is sufficient evidence to conclude that p1 - p2 > 0. d) Using the steps found in the text: (1) p1 - p2, (2) H0: p1 - p2 = 0.05 vs. HA: p1 - p2 ≠ 0.05, (3) α = 0.02, (4) the critical values are ±2.33; reject H0 if z < -2.33 or z > 2.33, (5) using Formula 14, z = 1.987, (6) since z = 1.987 < 2.33, fail to reject H0, (7) there is not sufficient evidence to conclude that p1 - p2 ≠ 0.05.

One of the key factors in concrete work is the time it takes for the concrete to "set up." Suppose a concrete supplier is considering a new additive that can be put in the concrete mix to help reduce the setup time. Before going ahead with the additive, the company plans to test it against the current additive. To do this, 14 batches of concrete are mixed using each of the additives. The following results are observed: x1 = 17.2 x2 = 15.9 S1 = 2.5 S2 = 1.8 a) Use these sample data to construct a 90% confidence interval estimate for the difference in mean setup times for the two concrete additives. On the basis of the confidence interval produced, do you agree that the new additive helps reduce the setup time for cement? (Assume the populations are normally distributed with equal variances.) Explain your answer. b) Assuming that the new additive is slightly more expensive than the old additive, do the data support switching to the new additive if the managers of the company are primarily interested in reducing the average setup time?

a) Using Formula 4, the 90% confidence interval is (-0.1043, 2.7043). Because the interval contains the value 0, you cannot say that there is a difference in setup time for the two additives. b) No, because again you cannot say that there is a difference in the setup time for the two additives.

A grocery store manager believes that the mean amount spent by customers on dairy products per visit is higher in stores in which the dairy section is in the central part of the store compared with stores that have the dairy section at the rear of the store. To consider relocating the dairy products, the manager feels that the increase in the mean amount spent by customers must be at least 25 cents. To determine whether relocation is justified, her staff selected a random sample of 25 customers at stores in which the dairy section is central in the store. She selected a second sample of 25 customers in stores with the dairy section at the rear of the store. The following sample results were observed: x1 = 3.74 x2 = 3.26 S1 = 0.87 S2 = 0.79 a) Conduct a hypothesis test with a significance level of 0.05 to determine if the manager should relocate the dairy products in those stores that have their dairy products in the rear of the store. Assume equal population variances. b) If a statistical error associated with hypothesis testing was made in this hypothesis test, what error could it have been? Explain.

a) H0: μC - μR ≥ 0.25, HA: μC - μR < 0.25. Using Formula 6, t = 0.9785. Since 0.9785 > -1.677, do not reject H0: the data do not contradict the claim that the difference is at least $0.25. b) Because the null hypothesis was not rejected, the only error that could have been made is a Type II error, that is, failing to reject a false null hypothesis.

The following sample data have been collected from a paired sample from two populations. The claim is that the first population mean will be at least as large as the mean of the second population. This claim will be assumed to be true unless the data strongly suggest otherwise.
Sample 1: 4.4, 2.7, 1.0, 3.5, 2.8, 2.6, 2.4, 2.0, 2.8
Sample 2: 3.7, 3.5, 4.0, 4.9, 3.1, 4.2, 5.2, 4.4, 4.3
a) State the appropriate null and alternative hypotheses. b) Based on the sample data, what should you conclude about the null hypothesis? Test using α = 0.10. c) Calculate a 90% confidence interval for the difference in the population means. Are the results from the confidence interval consistent with the outcome of your hypothesis test? Explain why or why not.

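a) If the difference is Sample 1 - Sample 2, the hypotheses are H0: μd ≥ 0 and HA: μd < 0. b) Find the nine paired differences: 0.7, -0.8, -3.0, -1.4, -0.3, -1.6, -2.8, -2.4, -1.5. Their average is -1.456 and their standard deviation is about 1.200. Solve for t (Formula 11): t = -1.456 / (1.200 / sqrt 9) = -3.64. Since -3.64 < -1.3968, reject H0. c) With t = 1.8595 for 90% confidence and 8 degrees of freedom, the interval is -1.456 ± 1.8595 * (1.200 / sqrt 9) ≈ (-2.20, -0.71). The interval lies entirely below 0, which is consistent with rejecting H0 in part b.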

A decision maker wishes to test the following null and alternative hypotheses using an alpha level equal to 0.05: H0: u1 - u2 = 0 HA: u1 - u2 ≠ 0 The population standard deviations are assumed to be known. After the sample data are collected, the test statistic is computed to be z = 1.78 a) Using the test statistic approach, what conclusion should be reached about the null hypothesis? b) Using the p-value approach, what decision should be reached about the null hypothesis? c) Will the two approaches (test statistic and p-value) ever provide different conclusions based on the same sample data? Explain.

a) Since the null hypothesis is formed as an equality, the test will be a two-tailed test. That means that the alpha level will be split into the two tails of the sampling distribution. Also, because the test statistic is a z-value, the critical value will come from the standard normal distribution. For a one-tail area of 0.025, the critical z = 1.96. Then the decision rule is: if the test statistic z > 1.96, reject the null hypothesis; if the test statistic z < -1.96, reject the null hypothesis; otherwise, do not reject. Because z = 1.78 < 1.96, we do not reject the null hypothesis. b) Using the p-value approach, we set up the following decision rule for a two-tailed test when alpha = 0.05 is used: if the one-tail p-value < α/2 = 0.025, reject the null hypothesis. To calculate the p-value, we use the test statistic, z = 1.78, and go to the standard normal table. The probability associated with z = 1.78 is 0.4625. Thus, p-value = 0.5000 - 0.4625 = 0.0375. Then, since the p-value = 0.0375 > α/2 = 0.025, do not reject the null hypothesis (equivalently, the two-tailed p-value 2 * 0.0375 = 0.075 exceeds α = 0.05). c) The test statistic and p-value approaches will always provide the same conclusion when using the same sample data to test the null hypothesis because they are equivalent methods.

A startup cell phone applications company is interested in determining whether household incomes are different for subscribers to three different service providers. A random sample of 25 subscribers to each of the three service providers was taken, and the annual household income for each subscriber was recorded. The partially completed ANOVA table for the analysis is shown here:
Source of Variation | SS | df | MS | F
Between Groups | 2,949,085,157 | -- | -- | --
Within Groups | -- | -- | -- | --
Total | 9,271,678,090 | -- | -- | --
a) Complete the ANOVA table by filling in the missing sums of squares, the degrees of freedom for each source, the mean square, and the calculated F-test statistic. b) Based on the sample results, can the startup firm conclude that there is a difference in household incomes for subscribers to the three service providers? You may assume normal distributions and equal variances. Conduct your test at the alpha = 0.10 level of significance. Be sure to state a critical F-statistic, a decision rule, and a conclusion.

a) The calculations for the completed ANOVA table below are: Between groups df = k - 1, where k is the number of service providers = 3 - 1 = 2. Within groups df = nT - k, where nT = 25 subscribers * 3 providers = 75; 75 - 3 = 72. SSW = SST - SSB = 9,271,678,090 - 2,949,085,157 = 6,322,592,933. MSB = 2,949,085,157/2 = 1,474,542,579. MSW = 6,322,592,933/72 = 87,813,791. F = 1,474,542,579/87,813,791 = 16.79.
Source of Variation | SS | df | MS | F
Between Groups | 2,949,085,157 | 2 | 1,474,542,579 | 16.79
Within Groups | 6,322,592,933 | 72 | 87,813,791 | --
Total | 9,271,678,090 | 74 | -- | --
b) H0: μ1 = μ2 = μ3. HA: Not all populations have the same mean. Decision rule: reject H0 if F > Fα = 2.3778. F = MSB/MSW = 1,474,542,579/87,813,791 = 16.79. Because the F test statistic = 16.79 > Fα = 2.3778, we reject the null hypothesis based on these sample data at the α = 0.10 level and conclude that mean household income differs for at least one service provider.
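Filling in the ANOVA table is plain arithmetic, which the following sketch reproduces. The critical value is copied from the answer above rather than computed.

```python
sst = 9_271_678_090   # total sum of squares (given)
ssb = 2_949_085_157   # between-groups sum of squares (given)
k = 3                 # service providers
n_per_group = 25

n_total = k * n_per_group
df_between = k - 1
df_within = n_total - k
df_total = n_total - 1

ssw = sst - ssb                # within-groups sum of squares
msb = ssb / df_between
msw = ssw / df_within
f_stat = msb / msw

f_crit = 2.3778  # F critical value for alpha = 0.10, df = (2, 72), quoted in the answer
print(ssw, df_between, df_within, df_total, round(f_stat, 2), f_stat > f_crit)
```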

HA

alternative hypothesis

Ha

alternative hypothesis

ANOVA

Analysis of variance; allows us to compare means across multiple populations. Example (market/customer research): in researching e-business customer profiles, which of several types of customers has the best 'e-business potential'?

When someone is on trial for suspicion of committing a crime, the hypotheses are: H0: innocent, HA: guilty. Which of the following is correct? a) Type I error is acquitting a guilty person. b) Type I error is convicting an innocent person. c) Type II error is convicting an innocent person. d) Type II error is acquitting an innocent person.

b) Type I error is convicting an innocent person

Which of the following is NOT an assumption of ANOVA models? a) Populations are normally distributed b) Population variances are equal c) Balanced design d) Data are interval or ratio

c) Balanced design

You are given the following results of a paired-difference test: d = -4.6 sd = 0.25 n = 16 Construct a 90% confidence interval estimate for the paired difference in mean values.

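d ± t * (sd / sqrt n), with t = 1.7531 for 90% confidence and 16 - 1 = 15 degrees of freedom: -4.6 ± 1.7531 * (0.25 / sqrt 16) = -4.6 ± 0.11 = (-4.71, -4.49)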

A walk-in medical clinic believes that arrivals are uniformly distributed over weekdays (Monday through Friday). It has collected the following data based on a random sample of 100 days. Frequency: Mon | 25 Tue | 22 Wed | 19 Thu | 18 Fri | 16 Total | 100 To conduct a goodness-of-fit test, what is the expected value for Friday?

expected value = 0.2*100 = 20

A hotel chain has four hotels in Oregon. The general manager is interested in determining whether the mean length of stay is the same or different for the four hotels. She selects a random sample of n = 20 guests at each hotel and determines the number of nights they stayed. Assuming that she plans to test this using an alpha level equal to 0.05, which of the following is the correct critical value?

k = 4. If the manager selects 20 guests from each hotel, n = 4 x 20 = 80. Degrees of freedom are (4 - 1, 80 - 4) = (3, 76). Alpha = 0.05, so the critical value is =F.INV.RT(0.05, 3, 76), which is approximately 2.72.

According to USA Today, customers are not settling for automobiles straight off the production lines. As an example, those who purchase a $355,000 Rolls-Royce typically add $25,000 in accessories. One of the affordable automobiles to receive additions is BMW's Mini Cooper. A sample of 179 recent Mini purchasers yielded a sample mean of $5,000 above the $20,200 base sticker price. Suppose the cost of accessories purchased for all Mini Coopers has a standard deviation of $1,500. Calculate a 95% confidence interval for the average cost of accessories on Mini Coopers.

n = 179, x = 5000, population std dev = 1500. Confidence level = 1 - α = 0.95, so the level of significance is α = 0.05. From the z-table, the z value is 1.96. x ± z * (std dev / square root of n) = 5000 ± 1.96 * (1500/square root of 179) = 5000 ± 219.7459, which gives (4780.254, 5219.746).

H0

null hypothesis

If you have 6 levels in your ANOVA, how many comparisons do you need to do for the Tukey-Kramer procedure?

Number of comparisons = number of ways of choosing 2 from 6 = 6C2 = 6!/(2! * 4!) = 15
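The count of pairwise comparisons is a combination, which a one-line sketch confirms.

```python
import math

levels = 6
pairwise_comparisons = math.comb(levels, 2)  # every unordered pair of groups is compared once
print(pairwise_comparisons)
```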

A sample of 250 people resulted in a confidence interval estimate for the proportion of people who believe that the federal government's proposed tax increase is justified is between 0.14 and 0.20. Based on this information, what was the confidence level used in this estimation?

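The point estimate is the midpoint of the interval: p = (0.14 + 0.20)/2 = 0.17. Margin of error = upper bound - p = 0.20 - 0.17 = 0.03. Since E = z * sqrt(p * (1 - p) / n), we have 0.03 = z * sqrt(.17 * .83 / 250), so z = 0.03/sqrt(.17 * .83 / 250) = 1.2628. The confidence level is the area under the standard normal curve between -z and +z: (1 - 2 * (1 - NORMSDIST(1.2628))) * 100%, which comes out to 79.33%. In other words, each tail beyond ±1.2628 holds about 10.3% of the area, and what remains in the middle is the confidence level.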
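Working backwards from an interval to its confidence level can be sketched like this, using the standard-library normal distribution in place of the Excel function.

```python
import math
from statistics import NormalDist

lower, upper, n = 0.14, 0.20, 250

p_hat = (lower + upper) / 2        # the interval is centered on the sample proportion
margin = upper - p_hat
se = math.sqrt(p_hat * (1 - p_hat) / n)
z = margin / se

# Confidence level = central area between -z and +z under the standard normal curve
confidence = 2 * NormalDist().cdf(z) - 1
print(round(z, 4), round(confidence * 100, 2))
```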

A market research consultant hired by the Pepsi-Cola Co. is interested in knowing if the proportion of consumers who favor Pepsi-Cola over Coke Classic is different than 50%. A random sample of 250 consumers from the market under investigation provided a sample proportion of .464. Using a .10 level of significance, test the appropriate hypothesis to help answer his question.

p = .464, p0 = 0.5, n = 250, α = 0.1. H0: p = 0.5, HA: p ≠ 0.5, two-tailed test. z = (0.464 - 0.5)/square root of [0.5 * (1 - 0.5)/250] ≈ −1.14. Critical value method: norm.s.inv(.95) gives critical values of ≈ ±1.645. p-value method: 2 * norm.s.dist(−1.14, TRUE) ≈ .255. Either way, we fail to reject the null.

Confidence Interval for a Proportion Formula

p ± z * square root of [p * (1 - p)/n]

A study was recently conducted at a major university to determine whether there is a difference in the proportion of business school graduates who go on to graduate school within five years after graduation and the proportion of non-business school graduates who attend graduate school. A random sample of 400 business school graduates showed that 75 had gone to graduate school while in a random sample of 500 non-business graduates, 137 had gone on to graduate school. Based on these sample data, and testing at the 0.10 level of significance, what is the value of the test statistic?

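p1 = 75/400 = 0.1875, p2 = 137/500 = 0.274. Under H0: p1 = p2 the pooled proportion is (75 + 137)/(400 + 500) ≈ 0.2356. The test statistic is Z = (p1 - p2)/sqrt( pooled * (1 - pooled) * (1/n1 + 1/n2) ) = (0.1875 - 0.274)/sqrt(0.2356 * (1 - 0.2356) * (1/400 + 1/500)) ≈ -3.04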

Formula for comparing 2 populations

point estimate = x̄1 − x̄2

Calculate the Test Statistic (σ Unknown)

t = [ (x̄1 − x̄2) − (μ1 − μ2) ] / ( Sp * square root of [ (1/n1) + (1/n2) ] ), where Sp = square root of [ ((n1 - 1) * S1^2 + (n2 - 1) * S2^2) / (n1 + n2 - 2) ]

Test Statistic Formula (Mean, σ Unknown)

t = (x̄ - μ0)/(s/square root of n)

L.L. Manufacturing produces battery-operated watches. They have used two battery suppliers in the past, but Company A will now offer a bulk ordering discount if they get all their batteries from Company A. The production manager would like to determine if Company A's batteries are performing as well as Company B's before he makes a decision. He has obtained a sample of 41 batteries from Company A and 41 from Company B. The results are listed below. Assume the population variances are equal. Construct a 90% confidence interval to determine which company's batteries last longer.
Company A | Company B
x = 17001 | x = 18559
s = 3400 | s = 4400
n = 41 | n = 41

t = t.inv.2t(1 − 0.9, 41 + 41 − 2) ≈ 1.664. Sp = square root of [ ((41 - 1) * 3400^2 + (41 - 1) * 4400^2) / (41 + 41 - 2) ] ≈ 3931.921. Square root of [ (1 / 41) + (1 / 41) ] ≈ 0.221. (17001 - 18559) ± 1.664 * 3931.921 * 0.221 gives (−3003.15, −112.85). Because the entire interval is below zero, the data suggest that Company B's batteries last longer on average.

A walk-in medical clinic believes that arrivals are uniformly distributed over weekdays (Monday through Friday). It has collected the following data based on a random sample of 100 days. Frequency: Mon | 25 Tue | 22 Wed | 19 Thu | 18 Fri | 16 Total | 100 What is the value of the test statistic needed to conduct a goodness-of-fit test?

Test statistic = 2.5. First, the expected frequency for each day is 100/5 = 20. The observed frequencies are 25, 22, 19, 18, and 16. For each day compute (Oi - Ei)^2/Ei, where Oi = observed frequency and Ei = expected frequency. The daily values are 1.25, 0.2, 0.05, 0.2, and 0.8, which add up to 2.5.

Median

the middle score in a distribution; half the scores are above it and half are below it

Mode

the most frequently occurring score(s) in a distribution

A hypothesis test for the difference between two means is considered a two-tailed test when:

the null hypothesis states that the population means are equal.

A hypothesis test is to be conducted using an alpha = .05 level. This means:

there is a maximum 5 percent chance that a true null hypothesis will be rejected

According to statistics reported on CNBC, a surprising number of motor vehicles are not covered by insurance. Sample results showed 46 of 200 vehicles were not covered by insurance. Develop a 95% confidence interval for the population proportion and explain what your findings mean.

x = 46, n = 200, z = 1.96. p = 46/200 = 0.23. 0.23 ± 1.96 * square root of [0.23 * (1 - 0.23)/200] = 0.23 ± 0.058 = (0.172, 0.288). We are 95% confident that between 17.2% and 28.8% of all motor vehicles are not covered by insurance.

Confidence Interval Formula (Mean, Population Standard Deviation Known)

x ± z * (σ/square root of n)

Two Samples Confidence Interval (σ Unknown)

(x̄1 - x̄2) ± t * Sp * square root of [ (1 / n1) + (1 / n2) ], where Sp = square root of [ ((n1 - 1) * S1^2 + (n2 - 1) * S2^2) / (n1 + n2 - 2) ]

A chain of hair salons hires stylists from two different types of programs: 2-year associate degree programs and tech programs. They want to estimate the difference in the average time that it takes each group to cut a client's hair. Previous studies have indicated that the standard deviation is 6 minutes for stylists with an associate's degree and 9 minutes for stylists from a tech program. Develop a 95% confidence interval estimate for the difference in mean times if a sample of 65 stylists with associate's degrees produced a mean time of 19 minutes and a sample of 90 tech stylists resulted in a mean time of 23 minutes.

x1 = 19, σ1 = 6, n1 = 65; x2 = 23, σ2 = 9, n2 = 90. (19 - 23) ± 1.96 * square root of [ (6^2 / 65) + (9^2 / 90) ] = −4 ± 2.36, which gives (−6.36, −1.64). Conclusion: because the whole interval is below zero, we are 95% confident that stylists with an associate's degree take less time on average than stylists with a tech degree.
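The interval for the stylists can be reproduced with a brief sketch. The function name `two_sample_z_interval` is illustrative, and the z value is computed rather than rounded to 1.96, so the endpoints may differ slightly in the third decimal.

```python
import math
from statistics import NormalDist

def two_sample_z_interval(x1, sigma1, n1, x2, sigma2, n2, confidence):
    """Confidence interval for mu1 - mu2 when both population sigmas are known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = x1 - x2
    return diff - z * se, diff + z * se

lower, upper = two_sample_z_interval(19, 6, 65, 23, 9, 90, 0.95)
print(round(lower, 2), round(upper, 2))
```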

Collect Data and Calculate the Test Statistic: Population σ Known

z = [ (x̄1 − x̄2) − (μ1 − μ2) ] / square root of [ (σ1^2 / n1) + (σ2^2 / n2) ]

Test Statistic Formula (Proportion)

z = (p - p0)/square root of [ p0 * (1 - p0) / n ]

Test Statistic Formula (Mean, σ Known)

z = (x̄ - μ0)/(σ/square root of n)

Two Samples Confidence Interval (σ Known)

(x̄1 − x̄2) ± z * square root of [ (σ1^2 / n1) + (σ2^2 / n2) ]

A health care provider claims that the mean overpayment on claims from an insurance company was 0. The insurance company is skeptical, believing that they were overcharged. They want to use sample data to prove their belief. A sample of 16 claims was collected, and a sample mean overpayment of $4 was calculated with a sample standard deviation of $12. Suppose that we can tolerate a 5% chance of rejecting the claim when it is true. Can we conclude (prove) that the insurance company was overcharged?

H0: μ ≤ 0, HA: μ > 0 (right-tailed test). μ0 = 0, n = 16, x̄ = 4, s = 12. Calculate the test statistic: t = (4 - 0)/(12/square root of 16) = 4/3 ≈ 1.333. Find the critical value using Excel: t.inv(.95, 15) ≈ 1.753. Because 1.333 < 1.753, we fail to reject the null hypothesis; the sample does not prove that the insurance company was overcharged.
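A minimal sketch of the one-sample t statistic for this card. The critical value is the Excel result quoted above; the helper name is made up.

```python
import math

def one_sample_t(x_bar, mu0, s, n):
    """t statistic for a single mean when the population sigma is unknown."""
    return (x_bar - mu0) / (s / math.sqrt(n))

t_stat = one_sample_t(4, 0, 12, 16)
t_crit = 1.753  # right-tail critical value for alpha = 0.05, df = 15 (from the card)
reject = t_stat > t_crit
print(round(t_stat, 3), reject)
```

Calling `one_sample_t(510, 540, 45, 16)` gives about -2.667, the statistic used in the Post Office card.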

Hypothesized Difference in Means

μ1 −μ2

To find a confidence interval for the difference between the means of independent samples, when the variances are unknown but assumed equal, the sample sizes of the two groups must be the same. T/F?

False

A health care provider claims that the mean overpayment on claims from an insurance company was 0. The insurance company is skeptical, believing that they were overcharged.They want to use sample data to prove their belief. A sample of 16 claims was collected, and a sample mean over payment of $4 was calculated with a sample standard deviation of $12. Suppose that we can tolerate a chance of5% of rejecting the claim when it is true. Can we conclude(prove) that the insurance company was overcharged? Now do this with p-value method.

H0: μ ≤ 0, HA: μ > 0, so it is a right-tailed test.
t = 1.333
p-value in Excel: =t.dist.rt(1.333,15) ≈ 0.101
Because 0.101 > 0.05, we fail to reject the null hypothesis.

The Post Office in Mobile, Alabama wants to see if its operational efficiency has improved. Specifically, they want to know if the average time to serve customers has dropped below 540 seconds (9 minutes). They select a sample of 16 customer visits and find that x̄ = 510 and s = 45. Use α = 0.01. Now use the p-value method.

H0: μ ≥ 540, HA: μ < 540, so it is a left-tailed test.
t = (510 − 540) / (45 / sqrt(16)) ≈ −2.667
p-value in Excel: =t.dist(−2.667,15,TRUE) ≈ 0.009
Because 0.009 < 0.01, we reject the null hypothesis.

Errors Table

If the null is true and you don't reject H0 --> Correct
If the null is false and you don't reject H0 --> Type II error (β)
If the null is true and you reject H0 --> Type I error (α)
If the null is false and you reject H0 --> Correct

Suppose as part of a national study of economic competitiveness a marketing research firm randomly sampled 200 adults between the ages of 27 and 35 living in metropolitan Seattle and 180 adults between the ages of 27 and 35 living in metropolitan Minneapolis. Each adult selected in the sample was asked, among other things, whether they had a college degree. From the Seattle sample 66 adults answered yes and from the Minneapolis sample 63 adults answered yes when asked if they had a college degree. Based on the sample data, can we conclude that there is a difference between the population proportions of adults between the ages of 27 and 35 in the two cities with college degrees? Use a level of significance of 0.10 to conduct the appropriate hypothesis test.

Seattle: p1 = 66/200 = 0.33; Minneapolis: p2 = 63/180 = 0.35
H0: p1 = p2, HA: p1 ≠ p2
Test statistic: z = (p1 − p2) / sqrt(p1*(1 − p1)/n1 + p2*(1 − p2)/n2) = (0.33 − 0.35) / sqrt(0.33*(1 − 0.33)/200 + 0.35*(1 − 0.35)/180) ≈ −0.41
With α = 0.10, the two-tailed critical values are z(0.05) = ±1.645 (from the standard normal table).
Since z = −0.41 lies between −1.645 and 1.645, we do not reject H0.
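
The calculation on this card can be reproduced with a short Python sketch using only the standard library. The function name `two_proportion_z` is hypothetical. It uses the same unpooled standard error as the card (many textbooks pool the two proportions under H0, which gives nearly the same value here).

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """z statistic and two-tailed p-value for H0: p1 = p2 (unpooled SE, as on the card)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = two_proportion_z(66, 200, 63, 180)
alpha = 0.10
print(round(z, 2), p_value > alpha)  # a p-value above alpha means H0 is not rejected
```

The p-value route and the critical-value route lead to the same decision: the statistic is well inside ±1.645, so H0 is not rejected at the 0.10 level.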

When the Hypothesized Difference ≠ 0

Sometimes, instead of just asking whether population A is different from population B, we might want to know if they differ by a specific amount. For example, we might want to test whether Brand A's golf balls go at least 10 yards further than Brand B's balls on average. Or we might want to know if college-aged students' credit card debt is $1000 more than middle-aged people's credit card debt on average. We would substitute μ1 − μ2 = 10 or μ1 − μ2 = 1000 for 0 in these examples.

Statistically significant

The difference between what is hypothesized and what is observed in the data is unlikely to have arisen by chance.

Confidence Interval for Proportion

To estimate the population proportion p in a binomial distribution, we assume that the sample size is large enough for the central limit theorem to apply (this is why we can use z). Make sure that np(1 − p) ≥ 5. Keep in mind that p can never be less than 0 or greater than 1, so the upper limit is bounded by 1 and the lower limit is bounded by 0.

Confidence Intervals in Excel

To find a z value given a probability: =norm.s.inv(probability)
To find a one-tailed t value: =t.inv(1 − α, d.f.)
To find a two-tailed t value: =t.inv.2t(α, d.f.)
To find the margin of error (means only): =confidence.norm(α, σ, n) or =confidence.t(α, s, n)

Finding the p-value in Excel

To find the p-value in Excel, use the formulas:
- =t.dist(test statistic, d.f., TRUE) for a left-tailed test
- =t.dist.rt(test statistic, d.f.) for a right-tailed test
- =t.dist.2t(test statistic, d.f.) for a two-tailed test (use the absolute value of the test statistic)
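
For readers without Excel, the same three p-values can be approximated in Python. The standard library has no t distribution, so this sketch integrates the t density numerically with Simpson's rule. The helper names `t_pdf` and `t_cdf` are hypothetical, and the results are approximations rather than Excel's exact output.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t), integrating the density over [0, |t|] with Simpson's rule (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |t|)
    return 0.5 + area if t >= 0 else 0.5 - area

left_tailed = t_cdf(-2.667, 15)            # like =t.dist(-2.667,15,TRUE)
right_tailed = 1 - t_cdf(1.333, 15)        # like =t.dist.rt(1.333,15)
two_tailed = 2 * (1 - t_cdf(1.638, 9))     # like =t.dist.2t(1.638,9)
print(round(left_tailed, 3), round(right_tailed, 3), round(two_tailed, 3))
```

The three inputs are the test statistics from the post office, insurance, and paired-mileage cards in this set, so the results can be compared against the p-values quoted there.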

The Cranston Hardware Company is interested in estimating the difference in the mean purchase for men customers versus women customers. It wishes to estimate this difference using a 95 percent confidence level. If the sample size is n = 10 from each population, the samples are independent, and sample standard deviations are used, and the variances are assumed equal, then the critical value will be t = 2.1009. T/F?

True

Binomial Distribution

a frequency distribution of the possible number of successful outcomes in a given number of trials in each of which there is the same probability of success.

Confidence Interval

a range of values so defined that there is a specified probability that the value of a parameter lies within it.

Test Statistic

a statistic whose value helps determine whether a null hypothesis should be rejected

If the p value is less than α in a two-tailed test, a) the null hypothesis should be rejected. b) a one-tailed test should be used. c) More information is needed to reach a conclusion about the null hypothesis. d) the null hypothesis should not be rejected.

a) the null hypothesis should be rejected.

Mean

average

An oil company wants to compare the average mileage of cars using gasoline versus cars using a gas-ethanol blend. They use a paired-sample approach to control for variation in mileage due to differences in cars and drivers. A random sample of 10 motorists and their cars was selected. The drivers drove 200 miles on a selected course using gasoline for fuel, their average mileage was then calculated. This process was repeated using an ethanol-gas mixture. Do you reject or fail to reject the null hypothesis?

d̄ = 2.274, sd = 4.382, n = 10
H0: μd = 0, HA: μd ≠ 0
t = (2.27 − 0) / (4.382 / sqrt(10)) ≈ 1.638 (using d̄ rounded to 2.27)
Critical value: =t.inv.2t(.05,9) ≈ 2.262
p-value: =t.dist.2t(1.638,9) ≈ 0.136
Because |t| = 1.638 < 2.262 (equivalently, p ≈ 0.136 > 0.05), we fail to reject the null hypothesis.

A paired sample study has been conducted to determine whether two populations have equal means. Twenty paired samples were obtained with the following sample results: d-bar = 12.45 Sd = 11 Based on these sample data and a significance level of 0.05, what conclusion should be made about the population means?

n = 20
t = (12.45 − 0) / (11 / sqrt(20)) ≈ 5.06
With α = 0.05 (two-tailed) and 20 − 1 = 19 degrees of freedom, t(0.025, 19) = 2.0930
Because t = 5.06 > 2.0930, reject the null hypothesis: the population means are not equal.

Even before the record gas prices during the summer of 2008, an article written by Will Lester of the Associated Press reported on a poll in which 80% of those surveyed say that Americans who currently own a SUV (sport utility vehicle) should switch to a more fuel-efficient vehicle to ease America's dependency on foreign oil. This study was conducted by the Pew Research Center for the People & the Press. As a follow-up to this report, a consumer group conducted a study of SUV owners to estimate the mean mileage for their vehicles. A simple random sample of 91 SUV owners was selected, and the owners were asked to report their highway mileage. The following results were summarized from the sample data: x = 18.2 mpg s = 6.3 mpg Based on these sample data, compute and interpret a 90% confidence interval estimate for the mean highway mileage for SUVs.

n = 91, x̄ = 18.2, s = 6.3, α = 1 − 0.90 = 0.10, z(0.05) = 1.645
Confidence interval = x̄ ± z * (s / sqrt(n)) = 18.2 ± 1.0864 = (17.1136, 19.2864)
Interpretation: we are 90% confident that the mean highway mileage for SUVs is between about 17.1 and 19.3 mpg.
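
The interval on this card can be checked with a small Python sketch. The function name `mean_z_interval` is hypothetical. Like the card, it uses a z critical value with the sample standard deviation, which is an approximation that relies on the large sample size (n = 91).

```python
from math import sqrt
from statistics import NormalDist

def mean_z_interval(xbar, s, n, confidence):
    """z-based confidence interval for a mean: xbar +/- z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / sqrt(n)
    return xbar - margin, xbar + margin

low, high = mean_z_interval(xbar=18.2, s=6.3, n=91, confidence=0.90)
print(round(low, 2), round(high, 2))
```

The computed limits agree with the card's interval to about three decimal places; the tiny difference comes from using the unrounded z value instead of 1.645.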

The management of a department store is interested in estimating the difference in the amount of money spent by female and male shoppers. You are given the following information.
Statistic | Female Shoppers | Male Shoppers
Sample Size | 64 | 49
Sample Mean | 140 | 125
Pop Std Dev | 10 | 8
A 95 percent confidence interval estimate for the difference between the average purchases of female and male shoppers is:

n1 = 64, n2 = 49, x̄1 = 140, x̄2 = 125, σ1 = 10, σ2 = 8, α = 0.05, z = 1.96
(x̄1 − x̄2) ± z * sqrt(σ1^2/n1 + σ2^2/n2) = 15 ± 3.32 = 11.68 to 18.32
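
As a sanity check on the two-sample interval with known population standard deviations, here is a minimal Python sketch. The function name `two_sample_z_interval` is hypothetical; the inputs are the department store numbers from this card.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_interval(x1, x2, sigma1, sigma2, n1, n2, confidence=0.95):
    """Confidence interval for mu1 - mu2 when both population sigmas are known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin

low, high = two_sample_z_interval(140, 125, 10, 8, 64, 49)
print(round(low, 2), round(high, 2))
```

The same function applies to any other two-sample, σ-known card in this set by swapping in that card's means, standard deviations, and sample sizes.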

A company that makes shampoo wants to test whether the average amount of shampoo per bottle is 16 ounces. The standard deviation is known to be 0.20 ounces. Assuming that the hypothesis test is to be performed using 0.10 level of significance and a random sample of n = 64 bottles, how large could the sample mean be before they would reject the null hypothesis?

n = 64 is large enough to use a z-test. For a two-tailed test at α = 0.10 (5% in each tail), H0 is not rejected while the sample mean stays within μ0 ± 1.645 * (σ / sqrt(n)) = 16 ± 1.645 * 0.2/8 = [15.959, 16.041] ounces. So 16.041 is the largest sample mean before they would reject the null hypothesis.
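
The cutoffs on this card can be computed directly. This is a minimal Python sketch; the function name `two_tailed_acceptance_region` is hypothetical, and it assumes a two-tailed z-test with known σ, as the card does.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_acceptance_region(mu0, sigma, n, alpha):
    """Range of sample means for which a two-tailed z-test does not reject H0: mu = mu0."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    return mu0 - margin, mu0 + margin

lower, upper = two_tailed_acceptance_region(mu0=16, sigma=0.20, n=64, alpha=0.10)
print(round(lower, 3), round(upper, 3))
```

Any sample mean above the upper bound (or below the lower bound) falls in the rejection region.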

Confidence Interval Formula

point estimate ± (z or t) × standard error
margin of error = (z or t) × standard error
point estimate = sample mean or sample proportion

The NCAA is interested in estimating the difference in mean number of daily training hours for men and women athletes on college campuses. It wants 95 percent confidence and will select a sample of 10 men and 10 women for the study. The variances are assumed equal and the populations normally distributed. The sample results are:
Men | Women
n1 = 10 students | n2 = 10 students
x̄1 = 2.7 hours | x̄2 = 2.4 hours
s1 = 0.3 hours | s2 = 0.4 hours
Based on these sample data, the critical value for developing the confidence interval is z = 1.96. True or false?

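False. Sample standard deviations are used with small samples, so the critical value comes from the t distribution: t(α/2, n1 + n2 − 2) = t(0.025, 18) = 2.101, not z = 1.96.
Standard error: SE = sqrt(s1^2/n1 + s2^2/n2) = 0.15811 (with n1 = n2 this equals the pooled standard error)
(x̄1 − x̄2) ± t * SE = 0.30 ± 0.3322 = (−0.0322, 0.6322)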
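
Because the variances are assumed equal, the interval can also be built from the pooled variance. This is a minimal Python sketch under that assumption; the function name `pooled_t_interval` is hypothetical, and the critical value t(0.025, 18) = 2.101 is copied from the card rather than computed.

```python
from math import sqrt

def pooled_t_interval(x1, x2, s1, s2, n1, n2, t_crit):
    """Equal-variance (pooled) t interval for mu1 - mu2 with df = n1 + n2 - 2."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    margin = t_crit * se
    diff = x1 - x2
    return diff - margin, diff + margin, se

low, high, se = pooled_t_interval(2.7, 2.4, 0.3, 0.4, 10, 10, t_crit=2.101)
print(round(se, 5), round(low, 4), round(high, 4))
```

With equal sample sizes the pooled standard error matches the unpooled value quoted on the card, so both routes give the same interval.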

Confidence Interval Formula (Mean, Population Standard Deviation Unknown)

x̄ ± t(α/2, n − 1) * (s / sqrt(n))

In an article entitled "Childhood Pastimes Are Increasingly Moving Indoors," Dennis Cauchon asserts that there have been huge declines in spontaneous outdoor activities such as bike riding, swimming, and touch football. In the article, he cites separate studies by the national Sporting Goods Association and American Sports Data that indicate bike riding alone is down 31% from 1995 to 2004. According to the surveys, 68% of 7- to 11-year-olds rode a bike at least six times in 1995 and only 47% did in 2004. Assume the sample sizes were 1,500 and 2,000, respectively. Calculate a 95% confidence interval to estimate the proportion of 7- to 11-year-olds who rode their bike at least six times in 2004.

z = 1.96 for 95% confidence
0.47 ± 1.96 * sqrt(0.47 * (1 − 0.47) / 2000) = 0.47 ± 0.0219 = (0.448, 0.492)
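
The proportion interval can be verified with a few lines of Python. The function name `proportion_interval` is hypothetical. It uses the usual large-sample formula p̂ ± z * sqrt(p̂(1 − p̂)/n) and clips the limits to [0, 1], since a proportion cannot fall outside that range.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence=0.95):
    """Large-sample z interval for a population proportion, clipped to [0, 1]."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)

low, high = proportion_interval(p_hat=0.47, n=2000)
print(round(low, 3), round(high, 3))
```

Only the 2004 sample (47% of 2,000 children) enters this calculation; the 1995 figures are not needed for a one-sample interval.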

