Stats Final

For many people, breakfast cereal is an important source of fiber in their diets. Cereals also contain potassium, a mineral shown to be associated with maintaining a healthy blood pressure. If your cereal provides 9 grams of fiber per serving, how much potassium does the model estimate you will get?

281 milligrams

The correlation between a cereal's fiber and potassium contents is r= 0.903. What fraction of the variability in potassium is accounted for by the amount of fiber servings contain?

81.5%
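
The 81.5% figure is the squared correlation. A minimal sketch of the arithmetic, using only the r value stated in the question:

```python
# Correlation between fiber and potassium, as given in the question.
r = 0.903

# In simple linear regression, R^2 is the square of the correlation.
r_squared = r ** 2

print(f"R^2 = {r_squared:.3f} ({r_squared:.1%} of the variability)")
```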

Naturally you would like to know what you are going to earn in the next few years. Explain why a regression model such as the ones we have found won't do a very good job of such a prediction. (Sorry.)

A prediction interval for an individual value will be too wide to be of much use.

Look again at Ex.27's regression output for the calorie and sodium content of hot dogs. A.) The output reports s=59.66. Explain what that means in this context. B.) What's the value of the SE of the slope of the regression line? C.) Explain what that means in this context.

A.) Among all-beef hot dogs with the same number of calories, the sodium content varies, with an SD of about 60 mg. B.) 0.561 mg/cal C.) If we tested many other samples of all-beef hot dogs, the slopes of the resulting regression lines would be expected to vary, with an SD of about 0.56 mg of sodium per calorie.

Continuing with the data from Ex.1, here's a regression with the percent of students who receive merit-based financial aid included in the model. A.) Write the regression model. B.) What is the interpretation of the coefficient of SAT in this model? How does it differ from the interpretation in Ex.1?

A.) Earn= 23,974+23.188(SAT) - 8501(%need) B.) On average, Earn increases by about $23/year per SAT point after allowing for the effects of %need

Climate scientists have been observing the extent of sea ice in the northern Arctic using satellite observations. Many have expressed concern because in recent decades the extent of sea ice has declined precipitously, possibly due to global climate change. A.) Explain in context what the regression says. B.) Check the assumptions and conditions for regression inference. C.) The output reports s=0.68295. Explain what that means in this context. D.) What's the value of the SE of the slope of the regression line? E.) Explain what that means in this context. F.) Does this analysis prove that global temperature changes are causing sea ice to melt? Explain.

A.) Extent=73.793-4.421x; starting from an extent of 73.8 km2 in 1979, the model predicts a decrease in extent of 4.4 km2 per degree Celsius increase in mean global temperature. B.) The scatter plot shows a moderate linear relationship. The residual plot shows a possible bend and slightly greater variation on the left than on the right. There are no striking outliers. The normal probability plot looks reasonably straight. We should proceed with caution because the conditions are almost satisfied. C.) s is the SD of the residuals. D.) 0.58 E.) The SE is the estimated SD of the sampling distribution of the slope coefficient. Over many random samples from this population (or with a bootstrap), we'd expect to see slopes of the samples varying by this much. F.) No. We can see an association, but we cannot establish causation from this study.

A second predictor in Ex.13 improved the regression model of Ex.1, so let's try a third. Here's a model with the average ACT score of the entering class included. A.) The coefficient of SAT in this model is quite different from the SAT coefficient in the original model of Ex.1 or the multiple regression model of Ex.13. Why the change? B.) Find a 95% CI for the coefficient of SAT. How does it compare with the one you found in Ex.13? C.) The t-ratio associated with the SAT coefficient is now much smaller and the corresponding p-value much larger. Explain why this has happened.

A.) ACT and SAT are highly correlated with each other. After all, they are very similar measures. Thus, SAT, after allowing for the effect of ACT, is not really a measure of test performance but rather a measure of how students who take the SAT may differ from those who take the ACT at the colleges in question. B.) (1.56, 18.66); the interval now includes much smaller values among the plausible ones. C.) We are less confident that this coefficient is different from zero. The collinearity with ACT has inflated the variance of the coefficient of SAT, leading to a smaller t-ratio and larger p-value.

Healthy eating probably doesn't include hot dogs, but if you are going to have one, you'd probably hope it's low in both calories and sodium. Recently, Consumer Reports listed the number of calories and sodium content (in milligrams) for 13 brands of all-beef hot dogs it tested. Examine the association, assuming that the data satisfy the conditions for inference. A.) State the appropriate hypotheses about the slope. B.) Test your hypotheses and state your conclusion in the proper context.

A.) Ho: B1=0; there's no association between calories and sodium content in all-beef hot dogs. Ha: B1 does not = 0; there is an association. B.) Based on the low P-value (0.0018), I reject the null. There is evidence of an association between the number of calories in all-beef hot dogs and their sodium contents.

Pew research, in 2015, polled a random sample of 1060 US teens (ages 13-17) about internet use. 56% of those teens reported going online several times a day- a fact of great interest to advertisers. A.) Explain the meaning of ^p=0.56 in the context of this situation. B.)Calculate the standard error of ^p C.)Explain what this standard error means in the context of this situation.

A.) This means that 56% of the 1060 teens in the sample said they go online several times a day. This is our best estimate of p, the proportion of all US teens who would say they do so. B.) SE(^p) = square root of (0.56)(0.44)/1060 = 0.0152 C.) Because we don't know p, we use ^p to estimate the SD of the sampling distribution. The SE is our estimate of the amount of variation in the sample proportion we expect to see from sample to sample when we ask 1060 teens.
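
The standard error in part B can be reproduced with a short sketch. The variable names are illustrative; the numbers are the ones given in the question.

```python
import math

p_hat = 0.56   # sample proportion reported by the poll
n = 1060       # number of teens sampled

# Standard error of a sample proportion: sqrt(p_hat * q_hat / n).
se = math.sqrt(p_hat * (1 - p_hat) / n)

print(f"SE(p-hat) = {se:.4f}")
```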

Each year, the students at Gossett High School take a physical fitness test during their gym classes. Create a 90% confidence interval for how many more push-ups boys can do than girls, on average, at that high school.

Based on these data, we are 90% confident that boys, on average, can do between 1.6 and 13.0 more push-ups than girls.

According to a Pew Research survey, 27% of American adults are pessimistic about the future of marriage and the family. That is based on a random sample of about 1500 people. Is it reasonable for Pew to use a Normal model for the sampling distribution of the sample proportion? Why or why not?

Conditions: the data come from a random sample, so the randomization condition is met. We don't know the exact value of p, but if the sample is anywhere near correct, np=1500x0.27=405 and nq=1500x0.73=1095. So there are well over 10 successes and failures, meeting the S/F condition. Since there are more than 10x1500=15000 adults in the US, the 10% condition is met. A Normal model is appropriate for the sampling distribution of the sample proportion.

A larger firm is considering acquiring the bookstore of Ex.3. An analyst for the firm, noting the relationship seen in Ex.3, suggests that when they acquire the store they should hire more people because that will drive higher sales. Is his conclusion justified? What alternative explanations can you offer?

Correlation does not demonstrate causation. The analyst's argument is that sales staff cause sales. However, the data may reflect the store hiring more people as sales increase, so any causation would run the other way.

Fast food is often considered unhealthy because much of it is high in both fat and sodium. But are the two related? Analyze the association between fat content and sodium using correlation and scatter plots.

Correlation is 0.199. There does not appear to be a relationship between sodium and fat content in burgers, especially without the low-fat, low-sodium item. Without the outlier, the correlation is -0.325, but the relationship is still weak.

Does attending college pay back the investment? What factors predict higher earnings for graduates? Money magazine surveyed graduates, asking about their point of view of the colleges they had attended. Here are the regression model and associated plots. Write the regression model and explain what the slope coefficient means in this context.

Earn=14,469+27.264(SAT); graduates earn, on average $27.26/year per point of SAT score.

A company with a fleet of 150 cars found that the emission systems of 7 out of 22 they tested failed to meet pollution control guidelines. Is this strong evidence that more than 20% of the fleet might be out of compliance? Test an appropriate hypothesis and state your conclusion.

Ho: p=0.20; Ha: p>0.20. SRS (not clear from information provided); 22 is more than 10% of the population of 150; (0.20)(22)<10. Do not proceed with a test.
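
A quick sketch of the two condition checks that stop this test. It uses the counts from the question and the hypothesized proportion of 0.20; the variable names are illustrative.

```python
n = 22        # cars tested
fleet = 150   # cars in the whole fleet
p0 = 0.20     # hypothesized proportion under Ho

# 10% condition: the sample should be no more than 10% of the population.
sample_fraction = n / fleet
ten_percent_ok = sample_fraction <= 0.10

# Success/failure condition: expect at least 10 of each under Ho.
expected_failures = n * p0
expected_passes = n * (1 - p0)
success_failure_ok = min(expected_failures, expected_passes) >= 10

print(f"sample is {sample_fraction:.1%} of the fleet -> ok: {ten_percent_ok}")
print(f"expected failures under Ho: {expected_failures:.1f} -> ok: {success_failure_ok}")
```

Both checks fail, which is why the answer stops before computing a test statistic.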

Based on the regression output seen in Ex.27, create a 95% CI for the slope of the regression line and interpret your interval in context.

I'm 95% confident that for every additional calorie, all-beef hot dogs have, on average, between 1.07 and 3.53 mg more sodium
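
The interval has the form slope ± t* × SE(slope). The sketch below checks that the reported endpoints are consistent with the SE of 0.561 given earlier for these 13 brands. Two things are assumptions rather than values from the output: the slope is backed out as the midpoint of the reported interval, and t* = 2.201 is the standard table value for 11 df, hardcoded here.

```python
se_slope = 0.561          # SE of the slope, from the earlier hot dog answer
n = 13                    # brands tested
df = n - 2                # degrees of freedom for simple regression
t_star = 2.201            # assumed: two-sided 95% t critical value for 11 df

# Assumed: slope recovered as the midpoint of the reported interval.
reported_lower, reported_upper = 1.07, 3.53
slope = (reported_lower + reported_upper) / 2

margin = t_star * se_slope
lower, upper = slope - margin, slope + margin

print(f"slope ~ {slope:.2f}, ME = {margin:.3f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f}) mg of sodium per calorie")
```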

For the interval given in Ex3, explain what "95% confidence" means

If we were to take repeated samples of these sizes of Canadians & Americans, & compute two-proportion CI, we would expect 95% of the intervals to contain the true difference in the proportions of foreign-born citizens.

A company's old antacid formula provided relief for 70% of the people who used it. The company tests a new formula to see if it is better and gets a p-value of 0.27. Is it reasonable to conclude that the new formula and the old one are equally effective? Explain.

No. We can say only that, if the new formula were no better than the old one, there would be a 27% chance of seeing results at least this good just from natural sampling variation. There is insufficient evidence that the new formula is more effective, but we can't conclude that they are equally effective.

Exercise 15 describes a regression model that estimates a cereal's potassium content from the amount of fiber it contains. In this context, what does it mean to say that a cereal has a negative residual?

The potassium content is actually lower than the model predicts for a cereal with that much fiber.

Discuss the assumptions and conditions necessary for proceeding with the regression analysis in Ex.1. Do you think the conditions are satisfied?

The residual plot has no structure, and there are not any striking outliers. The histogram of residuals is symmetric and bell-shaped. All conditions are satisfied to proceed with the regression analysis.

A candidate for office claims that "there is a correlation between TV watching and crime." Criticize this statement on statistical grounds.

There may be an association, but not a correlation unless the variables are quantitative. There could be a correlation between the average number of hours of TV watched per week per person and the number of crimes committed per year. Even if there is a relationship, it doesn't mean one causes the other.

Ex.23 shows computer output examining the association between Arctic sea ice extent and global mean temperature. Find a 95% CI for the slope and interpret in this context.

We are 95% confident that the slope is between -5.6 and -3.24.

Using the summary statistics provided in Exercise 11, researchers calculated a 95% confidence interval for the mean difference between Walmart and Target purchase amounts. The interval was (-$14.15, -$1.85). Explain in context what this interval means.

We are 95% confident that the mean purchase amount at Walmart is between $1.85 and $14.15 less than the mean purchase amount at Target.

Construct a 95% CI for the slope of the regression line in Ex.1. Interpret the meaning of the interval. Be sure to state it in the context of the data and the question about the data.

We are 95% confident that the true slope relating Earn and SAT is between 24.23 and 30.3 $/year per SAT point.

The information in Ex.1 was used to create a 95% two-proportion confidence interval for the difference between Canadians and U.S. citizens who were born in foreign countries. Interpret this interval with a sentence in context. The 95% confidence interval for Pcanadians - Pamericans is (3.24%, 9.56%).

We are 95% confident, based on these data, that the proportion of foreign-born Canadians is between 3.24% and 9.56% higher than the proportion of foreign-born Americans.

Use the info in Ex.1 to test the hypothesis Ho: B1=0 vs. Ha: B1 does not = 0. What do you conclude about the relationship between earnings and SAT scores?

We can reject the null hypothesis and conclude that the slope of the relationship between Earn and SAT is not zero. It seems that those who score higher on their SAT tend to earn more.

In the July 2007 issue, Consumer Reports examined the calorie content of two kinds of hot dogs: meat (usually a mixture of pork, turkey, chicken) and all beef. Would a 95% CI for Mmeat-Mbeef include 0? Explain.

Yes, the high P-value means that we lack evidence of a difference, so 0 is a plausible value for Mmeat-Mbeef.

Yron Hopps ran an experiment to determine optimum power and time settings for microwave popcorn. His goal was to find a combination of power and time that would deliver high-quality popcorn with less than 10% of the kernels left unpopped, on average.

Yes. The 95% confidence interval is (3.73%, 9.82%). This is below the desired 10% goal.

In Chapter 14, Ex.57, we saw that Yron Hopps ran an experiment to determine optimum power and time settings for microwave popcorn. Use a test of hypothesis to decide if Yron has met his goal.

Yes. These are a random sample of bags and the nearly normal condition is met. t=-2.51 with 7 df, for a one-sided p-value of 0.0203.

A new vaccine was recently tested to see if it could prevent the painful & recurrent ear infections that many infants suffer from. a)Are the conditions for inference satisfied? b)Find a 95% CI for the difference in rates of ear infection c)Use your CI to explain whether you think the vaccine is effective.

a) Yes. Subjects were randomly divided into independent groups, and more than 10 successes and failures were observed in each group. b) (4.7%, 8.9%) c) We're 95% confident that the rate of infection is 5-9 percentage points lower. That's a meaningful reduction, considering the 20% infection rate among the unvaccinated kids.

A clean air standard requires that vehicle exhaust emissions not exceed specified limits for various pollutants. a) In this context, what is a type 1 error? b) In this context, what is a type 2 error? c) Which type of error would the shop's owner consider more serious? d) Which type of error might environmentalists consider more serious?

a) it is decided that the shop is not meeting standards when it is. b) the shop is certified as meeting standards when it is not. c) type 1 d) type 2

As in Ex.29, state regulators are checking up on repair shops to see if they are certifying vehicles that do not meet pollution standards. a) what is meant by the power of the test the regulators are conducting? b) will the power be greater if they test 20 or 40 cars? why? c) will the power be greater if they use 5% or 10% level of significance? why? d) will the power be greater if the repair shop's inspectors are only a little out of?

a) The probability of detecting a shop that is not meeting standards. b) 40 cars. Larger n. c) 10%. There is a greater chance to reject Ho. d) A lot. A larger effect size is easier to detect.

Before lending someone money, banks must decide whether they believe the applicant will repay the loan. a) when a person defaults on a loan, which type b) which kind of error is it when the bank misses an opportunity to make a loan to someone who c) suppose the bank decides to lower the cutoff score from 250 pts to 200. d) what impact does this change in the cutoff value have on the chance of each type of error?

a) type 2 error b) type 1 error c) by making it easier to get the loan d) the risk of a type 1 error is decreased and the risk of a type 2 error is increased.

The Centers for Disease Control and Prevention reported a survey of randomly selected Americans aged 65 and older, which found that 411 of 1012 men and 535 of 1062 women suffered from some form of arthritis. a) Are the assumptions and conditions necessary for inference satisfied? Explain. b) Create a 95% CI for the difference in the proportions of senior men and women who have this disease. c) Interpret the interval in this context. d) Does the CI suggest that arthritis is more likely in women than in men?

a) Yes. Random samples; less than 10% of the population; samples independent; more than 10 successes and failures in each sample. b) (0.055, 0.140) c) We are 95% confident, based on these samples, that the proportion of American women aged 65 and older who suffer from arthritis is between 5.5% and 14.0% higher than the proportion of American men of the same age who suffer from arthritis. d) Yes, the entire interval lies above 0.
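
The interval in part b can be reproduced from the counts in the question. This is a minimal sketch of the usual two-proportion z-interval, assuming the critical value z* = 1.96 for 95% confidence; the variable names are illustrative.

```python
import math

men_yes, men_n = 411, 1012
women_yes, women_n = 535, 1062

p_men = men_yes / men_n
p_women = women_yes / women_n
diff = p_women - p_men            # women minus men

# Standard error of the difference between two independent proportions.
se = math.sqrt(p_men * (1 - p_men) / men_n
               + p_women * (1 - p_women) / women_n)

z_star = 1.96                     # assumed 95% critical value
lower, upper = diff - z_star * se, diff + z_star * se

print(f"difference = {diff:.3f}, SE = {se:.4f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```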

A researcher developing scanners to search for hidden weapons at airports has concluded that a new device is significantly better than the current scanner

a=0.10 yes. p-value is less than 0.05

Consider the following data from a small bookstore: A.) prepare a scatter plot of sales against number of sales people working B.) What can you say about the direction of the association? C.) what can you say about the form of the relationship? D.) What can you say about the strength of the relationship? E.) Does the scatter plot show any outliers?

A.) graph B.) positive C.) linear D.) strong E.) no

If false, explain briefly A.) We choose the linear model that passes through the most data points on the scatter plot. B.) The residuals are the observed y-values minus the y-values predicted by the linear model. C.) Least squares means that the square of the largest residual is as small as it could possibly be.

A.) False. The line usually touches none of the points. We minimize the sum of the squared errors. B.) True. C.) False. It is the sum of the squares of all the residuals that is minimized.

For the births in Exercise 1 A.) If there is no seasonal effect, about how big, on average, would you expect the χ² statistic to be? B.) Does the statistic you computed in Ex.1 seem large in comparison to this mean? C.) What does that say about the null hypothesis? D.) Find the alpha=0.05 critical value for the χ² distribution with the appropriate number of df. E.) Using the critical value, what do you conclude about the null hypothesis at alpha=0.05?

A.) 3 B.) No. It's smaller than the mean. C.) It would say that there is not enough evidence to reject the null hypothesis. D.) 7.815 E.) Do not reject the null hypothesis.

Two different professors teach an intro stats course. The table shows the distribution of final grades they reported. A.) Will you test goodness-of-fit, homogeneity, or independence? B.) Write appropriate hypotheses. C.) Find the expected counts for each cell and explain why the chi-square procedures are not appropriate.

A.) Homogeneity B.) Ho: the grade distribution is the same for both professors. Ha: the grade distributions are different. C.) Three cells have expected frequencies less than 5.

For the data in Exercise 1 A.) Compute the standardized residual for each season. B.) Are any of these particularly large? C.) Why should you have anticipated the answer to part B?

A.) (-0.913, 0.913, 0.365, -0.365) B.) No. They are quite small for z-values. C.) Because we did not reject the null hypothesis, we shouldn't expect any of the standardized residuals to be large.
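
The standardized residuals tie directly to the χ² statistic: their squares sum to it. The sketch below shows that link using the residuals above and the expected count of 30 per season from the related births question. The observed counts are not given in these notes, so the ones here are back-calculated from the residuals and should be read as an inference.

```python
import math

expected = 30                                  # expected births per season
std_residuals = [-0.913, 0.913, 0.365, -0.365]

# Each standardized residual is (observed - expected) / sqrt(expected),
# so the chi-square statistic is the sum of their squares.
chi_square_from_residuals = sum(z ** 2 for z in std_residuals)

# Inferred (not in the source): observed counts implied by the residuals.
observed = [round(expected + z * math.sqrt(expected)) for z in std_residuals]
chi_square_from_counts = sum((o - expected) ** 2 / expected for o in observed)

print(f"chi-square from residuals: {chi_square_from_residuals:.2f}")
print(f"implied observed counts:   {observed}")
print(f"chi-square from counts:    {chi_square_from_counts:.3f}")
```

The small gap between the two totals comes from the residuals being rounded to three decimals.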

An insurance company checks police records on 582 accidents selected at random and notes that teenagers were at the wheel in 91 of them. A.) Create a 95% CI for the % of all auto accidents that involve teenage drivers. B.) Explain what your interval means C.) Explain what "95% confidence" means D.) Does your confidence interval support or contradict this statement? Explain.

A.) (12.7%, 18.6%) B.) We are 95% confident, based on this sample, that the proportion of all auto accidents that involve teenage drivers is between 12.7% and 18.6%. C.) About 95% of all random samples of this size will produce CIs that contain the true population proportion. D.) Contradicts. The interval is completely below 20%.
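
A minimal sketch of the one-proportion z-interval behind part A, using the counts from the question and assuming z* = 1.96 for 95% confidence:

```python
import math

teen_accidents = 91
n = 582

p_hat = teen_accidents / n
se = math.sqrt(p_hat * (1 - p_hat) / n)

z_star = 1.96                      # assumed 95% critical value
margin = z_star * se
lower, upper = p_hat - margin, p_hat + margin

print(f"p-hat = {p_hat:.3f}, ME = {margin:.3f}")
print(f"95% CI: ({lower:.1%}, {upper:.1%})")
```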

Vitamin D, whether ingested as a dietary supplement or produced naturally when sunlight falls on the skin, is essential for strong, healthy bones. A.) Find a 98% CI. B.) Explain carefully what your interval means. C.) Explain what "98% confidence" means.

A.) (18.2%, 21.8%) B.) We are 98% confident, based on the sample, that between 18.2% and 21.8% of all English children are deficient in vitamin D. C.) About 98% of all random samples of this size will produce a CI that contains the true proportion of English children deficient in vitamin D.

If there is no seasonal effect on human births, we would expect equal numbers of children to be born in each season. A.) What is the expected number of births in each season if there is no "seasonal effect" on births? B.) Compute the χ² statistic. C.) How many degrees of freedom does the χ² statistic have?

A.) (30,30,30,30), 30 for each season B.) 1.933 C.) 3

After getting trounced by your little brother in a children's game, you suspect the die he gave you to roll may be unfair. A.) If the die is fair, how many times would you expect each face to show? B.) To see if these results are unusual, will you test goodness-of-fit, homogeneity, or independence? C.) State your hypotheses. D.) Check the conditions. E.) How many df's are there? F.) Find χ² and the p-value. G.) State your conclusion.

A.) 10 B.) Goodness-of-fit C.) Ho: the die is fair. Ha: the die is not fair. D.) Count data; rolls are random and independent. E.) 5 F.) χ²=5.600, p-value=0.3471 G.) Because the p-value is high, do not reject Ho.
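
The p-value in part F is the upper-tail area of a chi-square distribution with 5 df. The sketch below computes it with the standard library only, using a closed form that holds for odd degrees of freedom. The helper name `chi2_sf_odd_df` is hypothetical, not a library function; a statistics package would normally be used instead.

```python
import math

def chi2_sf_odd_df(x, df):
    """Upper-tail probability P(X > x) for chi-square with odd df.

    Uses the closed form erfc(sqrt(x/2)) plus a finite sum of terms
    sqrt(2/pi) * exp(-x/2) * x**(k + 1/2) / (1 * 3 * ... * (2k + 1)).
    """
    if df < 1 or df % 2 != 1:
        raise ValueError("this closed form only covers odd df")
    tail = math.erfc(math.sqrt(x / 2))
    term = math.sqrt(2 * x / math.pi) * math.exp(-x / 2)   # k = 0 term
    for k in range((df - 1) // 2):
        tail += term
        term *= x / (2 * k + 3)                            # next term in the sum
    return tail

chi_square = 5.600
df = 5            # six faces minus one
p_value = chi2_sf_odd_df(chi_square, df)

print(f"p-value = {p_value:.4f}")
```

The result should agree with the reported 0.3471 to about four decimal places.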

The Core Plus Mathematics Project (CPMP) is an innovative approach to teaching math that engages students in group investigations and math modeling. A.) What's the ME for this CI? B.) If we had created a 98% CI, would the ME be larger or smaller? C.) Explain what the calculated interval means in this context. D.) Does this result suggest that students who learn math with CPMP will have significantly higher mean scores in algebra than those in traditional programs?

A.) 2.927 B.) Larger C.) Based on this sample, we are 95% confident that students who learn math using the CPMP method will score, on average, between 5.57 and 11.43 points higher on a test solving applied algebra problems with a calculator than students who learn by traditional methods. D.) Yes. 0 is not in the interval.
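
The margin of error is half the width of the interval. A short sketch using the endpoints quoted in part C:

```python
lower, upper = 5.57, 11.43   # interval endpoints from part C

center = (lower + upper) / 2          # estimated difference in means
margin_of_error = (upper - lower) / 2

print(f"estimate = {center:.2f}, ME = {margin_of_error:.2f}")
```

This gives about 2.93, which matches the reported 2.927 up to the rounding of the endpoints.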

Livestock are given a special feed supplement to see if it will promote weight gain. A.) 95% between 45-67 lbs B.) 95% sure the cow fed this supplement will gain 45-67 lbs C.) 95% sure that average weight was 45-67 lbs D.) The average weight gain of cows fed this supplement will be between 45-67 lbs 95% E.) If this supplement is tested on other cows there is 95%

A.) Confidence interval for the population mean B.) Confidence interval is not for individual C.) We know the average gain was 56 lbs. D.) Average weight does not vary E.) NO.

Several factors are involved in the creation of a CI. Which statements are true? A.) For a given sample size, higher confidence means a smaller ME. B.) For a specified confidence level, larger samples provide smaller margins of error. C.) For a fixed ME, larger samples provide greater confidence. D.) For a given confidence level, halving the ME requires a sample twice as large.

A.) False B.) True C.) True D.) False

In Exercise 53, we saw a 90% CI of (-6.5, -1.4) grams for Mmeat-Mbeef, the difference in mean fat content for meat vs. all-beef hot dogs. True or False? A.) If I eat a meat hot dog instead of a beef dog, there's a 90% chance I'll consume less fat. B.) 90% of meat hot dogs C.) I'm 90% confident that meat hot dogs D.) If I were to get more samples of both kinds E.) If I tested more samples, I'd expect about 90%

A.) False. The CI is about means, not about individual hot dogs. B.) False. The CI is about means, not about individual hot dogs. C.) True D.) False. CIs based on other samples will also try to estimate the true difference in population means; there's no reason to expect other samples to conform to this result. E.) True

Which of the following are true? If false, explain briefly. A.) A p-value of 0.01 means the null hypothesis is false. B.) A p-value of 0.01 means the null hypothesis has a 0.01 chance of being true. C.) A p-value of 0.01 is evidence against the null hypothesis. D.) A p-value of 0.01 means we should definitely reject the null hypothesis.

A.) False. provides evidence against it B.) False. not the probability that the null hypothesis C.) true D.) true

Which of the following are true? If false explain briefly. A.) A very high p-value is strong evidence that the null hypothesis is false. B.) A very low p-value proves that the null hypothesis is false. C.) A high p-value shows that the null hypothesis is true D.) A p-value below 0.05 is always considered sufficient evidence to reject a null hypothesis

A.) False. provides no evidence for rejecting the null hypothesis B.) False. rejecting null hypothesis C.) False. High p-value shows that the data are consistent with the null hypothesis but does not prove. D.) False. Whether p-value provides enough evidence to reject the null hypothesis depends on whether.

The National Center for Education Statistics monitors many aspects of elementary and secondary education nationwide. A.) Write appropriate hypotheses. B.) Check the assumptions and conditions. C.) Perform the test and find the p-value. D.) State your conclusion. E.) Do you think this difference is meaningful? Explain.

A.) Ho: P2000=0.34; Ha: P2000 does not equal 0.34 B.) Students were randomly sampled and should be independent. 34% and 66% of 8302 are greater than 10. 8302 students is less than 10% of the entire student population. C.) P=0.054, or 0.055 using tables D.) The p-value provides weak evidence against Ho. E.) No. A difference this small, even if statistically significant, is probably not meaningful. We might look at new data in a few years.

Write the null and alternative hypothesis you would use to test each of the following: A.) A governor is concerned about his "negatives" - the percentage of state residents who express disapproval of his job performance. B.) Is the coin fair? C.) only about 20% of people who try to quit smoking succeed.

A.) Ho: p=0.30; Ha: p<0.30 B.) Ho: p= 0.50; Ha: p does not equal 0.50 C.) Ho: p= 0.20; Ha: p>0.20

According to the 2010 Census, 16% of the people in the US are of Hispanic or Latino origin. A.) State the hypotheses. B.) Name the model and check appropriate conditions for a hypothesis test. C.) Draw and label a sketch, and then calculate the test statistic and p-value. D.) State your conclusion.

A.) Ho: the proportion, p, of people in the county that are of Hispanic or Latino origin is p=0.16. Ha: p does not equal 0.16 B.) This is a one-proportion z-test. The 437 residents were a random sample from the county of interest. 437 is almost certainly less than 10% of the population of a county. C.) ^p=44/437=0.101; SD(^p) = square root of (0.16)(0.84)/437 = 0.0175 and z=-3.37; p-value = 2×P(z<-3.37) < 0.001 D.) Because the p-value is so low, there is evidence that the Hispanic/Latino population in this county
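
The test statistic in part C can be reproduced as follows. This is a sketch of a two-sided one-proportion z-test using the counts from the answer; the Normal CDF is built from `math.erfc` to keep the example free of third-party libraries, and the helper name `normal_cdf` is illustrative.

```python
import math

def normal_cdf(z):
    """Standard Normal cumulative probability P(Z < z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

p0 = 0.16          # national proportion under Ho
x, n = 44, 437     # Hispanic/Latino residents in the sample, sample size

p_hat = x / n
sd = math.sqrt(p0 * (1 - p0) / n)   # uses p0 because Ho is assumed true
z = (p_hat - p0) / sd
p_value = 2 * normal_cdf(-abs(z))   # two-sided alternative

print(f"p-hat = {p_hat:.3f}, SD = {sd:.4f}, z = {z:.2f}")
print(f"two-sided p-value = {p_value:.5f}")
```

Working from unrounded values gives z of about -3.38; the -3.37 in the answer comes from rounding ^p and SD first.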

The correlation between age and income as measured on 100 people is r=0.75 A.) when age increases, income increases as well B.) The form of the relationship between age and income is straight. C.) There are no outliers in the scatter plots of income vs age. D.) Whether we measure age in years or months, the correlation will still be 0.75.

A.) No. We don't know this from the correlation alone. There may be a nonlinear relationship or outliers. B.) No. We can't tell from the correlation what the form of the relationship is. C.) No. We don't know from the correlation. D.) Yes. The correlation doesn't depend on the units used to measure the variables.

After surveying students at Dartmouth, a campus org. calculated that a 95% confidence interval for the mean cost of food for 1 term is (1372, 1562). A.) 95% pay between 1372 and 1562 B.) 95% of the sampled students paid between 1372 and 1562 C.) 95% sure that students in the sample averaged between 1372 and 1562 D.) 95% of all samples will average between 1372 and 1562 E.) 95% sure that all students pay between 1372 and 1562

A.) No. not about individuals in population B.) No. not about individuals in sample C.) No. We know the mean cost was 1467 D.) No. not about other sample means E.) Yes. Estimates a population parameter.

A catalog sales company promises to deliver orders placed on the internet within 3 days. A.) Between 82% and 94% of all orders arrive on time. B.) Ninety-five percent of all random samples of customers will show that 88% of orders arrive on time. C.) Between 82% to 94% of orders arrive on time D.) We are 95% sure that between 92%-94% of the orders placed by the sampled customers arrived on time. E.) On 95% of the days between 92%-94% of the orders placed by the sampled customers arrived on time

A.) Not correct. This implies certainty. B.) Not correct. Different samples will give different results. C.) Not correct. The interval is about the population proportion, not the sample proportion in different samples. D.) Not correct. In this sample, we know 88% arrived on time. E.) Not correct. The interval is about the parameter, not the days.

The Consumer Reports article described in Ex.51 also listed the fat content (in grams) for samples of beef and meat hot dogs. The resulting 90% CI for Mmeat-Mbeef is (-6.5, -1.4). A.) The endpoints of this CI are negative numbers. What does this indicate? B.) What does the fact that the CI does not contain 0 indicate? C.) If we use this CI to test the hypothesis that Mmeat-Mbeef=0, what's the corresponding alpha level?

A.) Plausible values of Mmeat-Mbeef are all negative, so the mean fat content is probably higher for beef hot dogs. B.) The difference in sample means is significant C.)0.10

A biology student who created a regression model to use a bird's Height when perched for predicting its wingspan made these two statements. A.) My R2 of 93% shows that this linear model is appropriate. B.)A bird 10 inches tall will have a wingspan of 17 inches.

A.) R2 does not tell whether the model is appropriate. A high R2 could also be due to an outlier. B.) Predictions based on a regression line are for average values of y for a given x. The actual wingspan will vary around the prediction.

A specialty food company sells whole King Salmon to various customers. The mean weight of these salmon is 35 lbs. with an SD of 2 lbs. A.) Find the SD of the mean weight of the salmon in each type of shipment. B.) The distribution of the salmon weights turns out to be skewed to the high end. Would it be normal for boxes or pallets?

A.) SD's are 1 lb, 0.5 lb, 0.2 lb B.) The distribution of pallets. The CLT tells us that the normal model is approached in the limit regardless of the underlying distribution.
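
The SD of a sample mean is σ/√n. The shipment sizes are not listed in these notes, so the sketch below backs them out from the reported SDs; the resulting sizes are an inference, not values taken from the exercise.

```python
import math

sigma = 2.0                      # SD of individual salmon weights, in lbs
reported_sds = [1.0, 0.5, 0.2]   # SDs of the mean weight, from part A

# Inferred (not in the source): n = (sigma / SD of the mean) ** 2.
implied_n = [round((sigma / sd) ** 2) for sd in reported_sds]

# Forward check: SD of the mean for each implied shipment size.
recomputed = [sigma / math.sqrt(n) for n in implied_n]

for n, sd in zip(implied_n, recomputed):
    print(f"n = {n:>3}: SD of mean weight = {sd:.1f} lb")
```

The largest implied shipment has the smallest SD, which is also why the Normal model fits pallets best in part B.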

An Ipsos/Reuters poll of 2214 US adult voters in April and May 2017 asked a standard polling question of whether the US was headed in the "Right Direction" or was on the "Wrong Track". A.) Calculate the ME for the proportion of all US adults who think things are on the wrong track for 90% confidence. B.) Explain in a simple sentence what your ME means.

A.) SE(^p)= Square root(0.54)(0.46)/2214= 0.0106 ME= 1.645x0.0106=0.017 B.)We are 90% confident that the observed proportion responding "Wrong Track" is within 0.017 of the population proportion.
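
A short sketch of the margin-of-error arithmetic in part A, using the same inputs as the answer (sample proportion 0.54, n = 2214, and z* = 1.645 for 90% confidence):

```python
import math

p_hat = 0.54     # proportion answering "Wrong Track"
n = 2214         # adults polled
z_star = 1.645   # 90% critical value, as used in the answer

se = math.sqrt(p_hat * (1 - p_hat) / n)
margin_of_error = z_star * se

print(f"SE = {se:.4f}, ME = {margin_of_error:.3f}")
```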

In 2015, the US Census Bureau reported that 62.2% of American families owned their homes, the lowest rate in 20 years. A.) In words, what will their hypotheses be? B.) What would a type 1 error be? C.) What would a type 2 error be? D.) For each type of error, tell who would be harmed. E.) What would the power of the test represent in this context?

A.) The null is that level of home ownership remains the same. B.) The city concludes that home ownership is on the rise. C.) The city abandons the tax breaks D.) Type 1 error causes the city to forgo tax E.) the power of the test is the city's ability

An investment company is planning to upgrade the mobile access to their website, but they'd like to know the proportion of their customers who access it from their smartphones. A.) what would you expect the shape of the sampling distribution for the sample proportion be? B.) What would be the mean of this sampling distribution? C.) What would be the SE of the sampling distribution?

A.) Unimodal and symmetric (roughly normal) B.)0.36 C.)0.034

The 95% CI for the number of teens in Ex.5 who reported that they went online several times a day is from 53%-59%. A.) Interpret the interval in this context. B.) Explain the meaning of "95% confidence" in this context.

A.) We are 95% confident that, if we were to ask all US teens whether they go online several times a day, between 53% and 59% of them would say they do. B.) If we were to collect many random samples of 1060 teens, about 95% of the CIs we construct would contain the proportion of all US teens who say they go online several times a day.

Consider the poll for Ex.15 A.) Are the assumptions and conditions met? B.) Would the ME be larger or smaller for 95% confidence? Explain

A.) Yes. Random sample and sufficiently large sample B.) Larger. Higher confidence requires a wider confidence interval.

In some situations where the expected cell counts are too small, as in the case of the grades given by professors Alpha and Beta in Exercise 43, we can complete an analysis anyway. A.) Find the expected counts for each cell in this new table, and explain why a chi-square. B.) With this change in the table, what has happened to the number of df's? C.) Test your hypothesis about the two professors and state an appropriate conclusion.

A.) All expected frequencies are now larger than 5. B.) Decreased from 4 to 3. C.) x2=9.306, p-value=0.0255. Because the p-value is so low, we reject Ho at alpha=0.05. The grade distributions for the two professors are different.
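The p-value in part C is the upper-tail area of a chi-square distribution with 3 df. A minimal sketch using only the standard library; the helper name chi2_sf_df3 is hypothetical, and the closed form below holds only for df = 3:

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df."""
    # For df = 3 the tail area has a closed form in terms of erfc.
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi2_sf_df3(9.306)  # statistic from the card
print(f"p-value = {p_value:.4f}")
print("reject Ho at alpha = 0.05" if p_value < 0.05 else "fail to reject Ho")
```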

Your economics instructor assigns your class to investigate factors associated with the gross domestic product of nations. A.) "My very low correlation of -0.772 shows that there is almost no association between GDP and infant mortality rate." B.) "There was a correlation of 0.44 between GDP and continent."

A.) Assuming the relation is linear, a correlation of -0.772 shows a strong relation in a negative direction. B.) Continent is a categorical variable; correlation does not apply.

A medical researcher measured the pulse rates of a sample of randomly selected adults and found the following Student's t-based confidence interval: A.) Explain carefully what the software output means. B.) What's the ME for this interval? C.) If the researcher had calculated a 99% confidence interval, would the ME be larger or smaller?

A.) based on this sample, we can say, with 95% confidence that the mean pulse rate of adults is between 70.9-74.5 bpm B.) 1.8 Bpm C.) larger
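The margin of error in part B is half the width of the interval. A short sketch using the endpoints quoted in the answer:

```python
lower, upper = 70.9, 74.5  # 95% CI endpoints in bpm (from the card)

# A t-interval is symmetric: estimate +/- ME.
center = (lower + upper) / 2
margin_of_error = (upper - lower) / 2

print(f"center = {center:.1f} bpm, ME = {margin_of_error:.1f} bpm")
```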

A tire manufacturer tested the braking performance of one of its tire models on a test rack. A.) Write a 95% confidence interval for the mean dry pavement stopping distance B.) Write a 95% confidence interval for the mean increase in stopping distance on wet pavement.

A.) cars were probably not a simple random sample, but may be representative in terms of stopping distance B.) Data are paired by car; cars were probably not randomly chosen, but representative boxplot shows an outlier with a difference of 12.

For each of the following situations, state whether you'd use a chi-square goodness-of-fit test, a chi-square test of homogeneity, A.) A brokerage firm wants to see whether the type of account a customer has B.) That brokerage firm also wants to know if the type of account affects the size C.) The academic research office at a large community college wants to see whether the distribution of course chose.

A.) Chi-square test of independence. We have one sample and two variables. B.) Other test. Account size is quantitative, not counts. C.) Chi-square test of homogeneity. We want to see if the distribution of one variable, courses, is the same for two groups.

It's common folk wisdom that drinking cranberry juice can help prevent urinary tract infections in women. A.) is this a survey, retro study, pro study, exp? B.) Will you test goodness, homo, or independent? C.) state the hypothesis D.) Check the conditions E.) How many df's are there? F.) Find x2 and the p-value G.)State your conclusion H.) If you concluded that the groups are not

A.) Experiment: treatments are actively imposed. B.) Homogeneity C.) Ho: the rate of urinary tract infection is the same. Ha: the rate of urinary tract infection is different D.) Count data; random assignment to treatments E.) 2 F.) x2=7.776, p-value=0.020 G.) With a p-value this low, we reject Ho H.) the standardized residuals are
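With 2 df, the chi-square p-value in part F has a simple closed form: P(X > x) = exp(-x/2). A minimal sketch; the helper name chi2_sf_df2 is hypothetical, and the formula applies only when df = 2:

```python
import math

def chi2_sf_df2(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 2 df."""
    return math.exp(-x / 2)

p_value = chi2_sf_df2(7.776)  # statistic from the card
print(f"p-value = {p_value:.3f}")
```

The same function reproduces the other df = 2 results in this set: 4.851 gives about 0.0884 and 9.778 gives about 0.0075.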

What are the chances your flight will leave on time? The Bureau of Transportation Stats of the US Department of Transportation publishes info about airline performance. A.) Check the assumptions and conditions for inference. B.) Find a 90% confidence interval for the true % of flights that depart on time. C.) Interpret this interval for a traveler planning to fly.

A.) given no time trend, the monthly on-time departure rates should be independent. B.) 77.60%<M< 78.60% C.) We can be 90% confident that the interval from 77.60% to 78.60%

Homer's Iliad is an epic poem, compiled around 800 BCE, that describes several weeks of the last year of the 10 year siege of Troy by the Achaeans. A.) Under the null hypothesis, what are the expected values? B.) Compute the x2 statistic C.) How many df's does it have? D.) find the p-value E.) what do you conclude.

A.) graph B.) 52.65 C.) 2 D.) P<0.001 E.) reject Ho. Site and whether or not it was lethal are not independent.

A poll conducted by the University of Montana classified respondents by whether they were male or female and political party. A.) Is this a test of homogeneity or independence? B.) Write an appropriate hypothesis. C.) Are the conditions for inference satisfied? D.) Find the p-value for your test. E.) State a complete conclusion.

A.) Independence B.) Ho: political affiliation is independent of sex Ha: there is a relationship between political C.) Counted data; probably a random sample, but can't extend results to other states. D.) x2=4.851, df=2, p-value=0.0884 E.) Because of the high p-value, we do not reject Ho at alpha=0.05. These data do not provide evidence of a relationship between political affiliation.

Values for the labor force participation rate of women are published by the US Bureau of Labor Statistics. A.) Which of these tests is appropriate for these data? B.) Using the test you selected, state your conclusion.

A.) Matched pairs: same cities in different periods B.) There is a significant difference in the labor force participation rate for women in these cities; women's participation seems to have increased between 1968 and 1972.

Consider again the stats about human body temp. in exercise 29 A.) Would 90% confidence interval be wider or narrower than the 98% confidence interval B.) What are the advantages and disadvantages of the 98% confidence interval C.) If we conduct further research, this time using a sample of 500 adults, how would you expect 98% confidence interval to change?

A.) Narrower. Lower confidence gives a smaller margin of error. B.) Advantage: more chance of including the true value. Disadvantage: wider interval. C.) Narrower, due to the larger sample.

Which of these scatter plots show A.) little or no association? B.) A negative association? C.) a linear association? D.) a moderately strong association? E.) a very strong association?

A.) none B.) 3,4 C.)2,3,4 D.)2 E.) 3,1

Which of the following scenarios should be analyzed as paired data? A.) Students take an MCAT prep course. Their before and after scores are compared. B.) 20 male and 20 female students in class take a midterm. We compare their scores. C.) A group of college freshmen are asked about the quality of the university cafeteria.

A.) paired B.) not paired C.) paired

A company institutes an exercise break for its workers to see if it will improve job satisfaction, as measured by a questionnaire that assesses workers' satisfaction. A.) Identify the procedure you would use to assess the effectiveness of the exercise program. B.) Test an appropriate hypothesis and state your conclusion. C.) If your conclusion turns out to be incorrect

A.) Paired sample test. Data are before/after for the same workers. B.) Ho: Md=0 vs Ha: Md>0. t=3.60, p-value=0.0029 C.) Type 1

Example 23 describes the loan score method a bank uses to decide which applicants it will lend money. A.) In this context, what is meant by the power of the test? B.) What could the bank do to increase the power? C.) What's the disadvantage of doing that?

A.) power is the probability that the bank denies a loan that would not have been repaid B.) raise the cutoff score. C.) A larger number of trustworthy people would be denied credit, and the bank would miss the opportunity to collect interest on those loans.

In 1998 as an advertising campaign, the Nabisco company announced a "1000 Chips Challenge" claiming that every 18 oz bag of their Chips Ahoy! cookies contained at least 1000 chocolate chips. Dedicated stat students at the Air Force Academy purchased some randomly selected bags of cookies and counted the chocolate chips.

A.) Random sample; the nearly normal condition seems reasonable from a normal probability plot. B.) (1187.9, 1288.4) chips C.) Based on this sample, the mean number of chips in an 18 oz bag is between 1187.9 and 1288.4, with 95% confidence.

An analyst at a local bank wonders if the age distribution of customers coming for service at his branch in town is the same as at the branch located near the mall. A.) what is the null hypothesis? B.) What type of test is this? C.) What are the expected numbers for each cell if the null hypothesis is true? D.) find the x2 statistic E.) How many df's does it have? F.) find the p-value G.) what do you conclude?

A.) the age distributions of customers at the two branches are the same. B.) Chi-square test of homogeneity C.) graph D.) 9.778 E.) 2 F.) 0.0075 G.) reject the Ho and conclude that the age distribution.

Here are the residuals for a regression of Sales on Number of Sales People Working for the bookstore in Ex.5 A.) What are the units of the residuals? B.) Which residual contributes the most to the sum that was minimized according to the least squares criterion to find this regression? C.) Which residual contributes least to that sum?

A.) thousands of dollars B.) 2.77 C.)0.07

If we assume that the conditions for correlation are met, which of the following are true? A.) a correlation of -0.98 indicates a strong, negative association. B.) multiplying every value of x by 2 will double the correlation. C.) the units of the correlation are the same as the units o 4.

A.) true B.) false C.) false

Which of the following are true? If false, explain briefly. A.) Using an alpha level of 0.05, a p-value of 0.04 results in rejecting the null hypothesis. B.) The alpha level depends on the sample size. C.) With an alpha level of 0.01, a p-value of 0.10 results in rejecting the null hypothesis. D.) Using an alpha level of 0.05, a p-value of 0.06 means the null hypothesis is true.

A.) True. B.) False. The alpha level is set independently and does not depend on sample size. C.) False. The p-value would have to be less than 0.01 to reject the null hypothesis. D.) False. It simply means we do not have enough evidence at that alpha level to reject the null hypothesis.

For each of the following situations, state whether a Type 1, Type 2, or neither error has been made. A.) A bank wants to know if the enrollment on their website is above 30% based on a small sample. B.) A student tests 100 students to determine whether or not students on her campus prefer Coke. C.) A human resource analyst wants to know if the applicants this year score on average higher. D.) A pharmaceutical company tests whether a drug lifts the headache relief rate.

A.) type 1 error. not greater than 0.3 B.) no error. the actual value is 0.50 C.) type 2 error. the actual value was 55.3 D.) type 2 error. The null was not rejected.

It's believed that as many as 25% of adults over 50 never graduated from high school. A.) How many of this younger age group must we survey in order to estimate the proportion of non-grads to within 6% with 90% confidence? B.) Suppose we want to cut the ME to 4%. What's the necessary sample size? C.) What sample size would produce a ME of 3%.

A.)141 using ^p=0.25 B.) 318 C.) 564
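These sample sizes come from solving ME = z* sqrt(pq/n) for n and rounding up. A minimal sketch, with a hypothetical helper name, using p = 0.25 and z* = 1.645 for 90% confidence:

```python
import math

def sample_size(p, margin_of_error, z_star):
    """Smallest n for which z* x sqrt(p(1-p)/n) is at most the target ME."""
    return math.ceil(z_star**2 * p * (1 - p) / margin_of_error**2)

z_star = 1.645  # critical value for 90% confidence
sizes = {me: sample_size(0.25, me, z_star) for me in (0.06, 0.04, 0.03)}

for me, n in sizes.items():
    print(f"ME = {me:.0%}: n = {n}")
```

Halving the ME from 6% to 3% quadruples the required sample, since n scales with 1/ME squared.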

Recall the data we saw in Chapter 6, Exercise 3 for a bookstore. The manager wants to predict Sales from Number of Sales People Working. A.) Find the slope estimate, b1. B.) What does it mean, in this context? C.) Find the intercept, b0. D.) What does it mean, in this context? Is it meaningful? E.) Write down the equation that predicts Sales from Number of Sales People Working. F.) If 18 people are working, what Sales do you predict? G.) If sales are actually $25,000, what is the value of the residual? H.) Have we overestimated or underestimated the Sales?

A.) b1=0.914 B.) An additional 0.914 ($1000), or $914, of sales is associated with each additional salesperson working. C.) b0=8.10 D.) It would mean that, on average, we expect sales of 8.10, or $8100, with 0 salespeople working. That doesn't really make sense in this context. E.) Predicted Sales = 8.10 + 0.914x, where x is Number of Sales People Working F.) 24.55, or $24,550 G.) 0.45, or $450 H.) Underestimated
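Parts F through H can be checked directly from the fitted line. A short sketch with illustrative names, using the coefficients from the card (Sales in thousands of dollars):

```python
b0, b1 = 8.10, 0.914  # intercept and slope from the card

def predicted_sales(people_working):
    """Predicted Sales (in $1000) for a given number of salespeople working."""
    return b0 + b1 * people_working

prediction = predicted_sales(18)
actual = 25.0  # actual sales of $25,000, in $1000

# Residual = observed - predicted; a positive residual means the model underestimated.
residual = actual - prediction

print(f"predicted = {prediction:.2f}, residual = {residual:.2f}")
print("underestimated" if residual > 0 else "overestimated")
```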

As we learned in Chapter 14, Ex. 59, in 1998, as an advertising campaign, the Nabisco Company announced a "1000 Chips Challenge", claiming that every 18-ounce bag of their Chips Ahoy! cookies contained at least 1000 chocolate chips. A.) Check the assumptions and conditions for inference. Comment on any concerns you have. B.) Test their claim by performing an appropriate hypothesis test.

A.) Random sample; the nearly normal condition seems reasonable from a normal probability plot. The histogram is roughly unimodal and symmetric with no outliers. This is definitely less than 10% of all bags of Chips Ahoy! B.) Ho: M=1000, Ha: M>1000, where M is the mean number of chips per bag; y bar=1238.2, s=94.3, t=10.1, df=15, p-value<0.0001. Because the p-value is so low, we reject Ho. We have convincing evidence that the mean number of chips per bag is greater than 1000. However, their statement isn't about the mean. They claim that all bags have at least 1000 chips. The test doesn't really answer the question.
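The t statistic in part B can be reproduced from the summary statistics. A minimal sketch, taking n = 16 from the stated df = 15 (n = df + 1):

```python
import math

y_bar = 1238.2  # sample mean chips per bag (from the card)
s = 94.3        # sample SD (from the card)
mu_0 = 1000     # hypothesized mean under Ho
n = 15 + 1      # sample size implied by df = 15

se = s / math.sqrt(n)         # standard error of the mean
t_stat = (y_bar - mu_0) / se  # one-sample t statistic

print(f"SE = {se:.3f}, t = {t_stat:.1f}")
```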

A national vital statistics report indicated that about 3% of all births produced twins. Is the rate of twin births the same among very young mothers?

Ho: p=0.03; Ha: p does not equal 0.03. One mother having twins will not affect another, so observations are independent; not an SRS; sample is less than 10% of all births. However, the mothers at this hospital may not be representative of all teenagers; (0.03)(469)=14.07>10; (0.97)(469)>10. z=-1.92; p-value=0.055. These data show some (although weak) evidence that the rate of twins born to teenage girls at this hospital may be less than the national rate of 3%. It is not clear whether this can be generalized to all teenagers.

A company is criticized because only 13 of 43 people in executive-level positions are women. The company explains that although this proportion is lower than it might wish, it's not a surprising value given that only 40% of all its employees are women.

Ho: p=0.40; Ha: p< 0.40. Data are for all executives in this company and may not be able to be generalized to all companies; (0.40)(43)>10;(0.60)(43)>10. z=-1.31; p-value=0.096. Because the p-value is high, we fail to reject Ho. These data do not show that the proportion of women executives is less than 40% of women in the company in general
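The z statistic and one-sided p-value can be reproduced with the standard library. A minimal sketch; normal_cdf is a hypothetical helper built on math.erf, and the unrounded p-value may differ from the card's 0.096 in the last digit:

```python
import math

def normal_cdf(z):
    """Standard Normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

p0 = 0.40      # hypothesized proportion of women under Ho
n = 43         # number of executives
p_hat = 13 / n # observed proportion of women executives

se = math.sqrt(p0 * (1 - p0) / n)  # SD of p-hat under Ho
z = (p_hat - p0) / se
p_value = normal_cdf(z)            # lower tail, since Ha: p < 0.40

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```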

Census data for NYC indicate that 29.2% of the under-age-18 population is White, 28.2% Black, 31.5% Latino, 9.1% Asian, and 2% other. Do the police officers reflect the ethnic composition of the city's youth? Test an appropriate hypothesis and state your conclusion.

Ho: the police force represents the population. Ha: the police force is not representative of the population. x2=16,512.7, df=4, p-value<0.0001. We reject Ho.

Is there a significant difference in calories between servings of strawberry and vanilla yogurt? Based on the data shown in the table, test an appropriate hypothesis and state your conclusion.

Ho: M=0, Ha: M does not =0. Data are paired by brand; brands are independent of each other; fewer than 10% of all yogurts; boxplot of differences shows an outlier.

A researcher tests whether the mean cholesterol level among those who eat frozen pizza exceeds the value considered to indicate a health risk. Explain in this context what the "7%" represents.

If, in fact, the mean cholesterol level of pizza eaters does not indicate a health risk, then only 7 of every 100 samples, on average, would have a mean cholesterol level as high as (or higher than) was observed in this sample.

Looking back at Ex.11, instead of comparing two very similar stores, suppose the researchers had compared purchases at two car dealerships: one that specializes in new Italian sports cars & another that carries used domestic vehicles.

No. the two-sample test is almost always the safer choice & here the variances are likely to be quite different. The purchase prices of Italian sports cars are much higher & may be more variable than the domestic prices. They should use the two sample t-test.

What fraction of cars are made in Japan? z-interval for proportion w/ 90.00% confidence 0.29938661<P(Japan)<0.46984416

On the basis of the sample we are 90% confident that the proportion of Japanese cars is between 29.9% and 47.0%

For the regression model for the bookstore of Ex. 5, what is the value of R2 and what does it mean?

R2=93.2% about 93% of the variance of Sales can be accounted for by the regression of Sales on Number of Sales Workers.

After the political ad campaign described in Ex. 15, part a, pollsters check the governor's negatives. A.) There's a 22% chance that the ads worked. B.) There's a 78% chance that the ads worked. C.) There's a 22% chance that their poll is correct. D.) There's a 22% chance that natural sampling variation could produce poll results like these if there's really no change in public opinion.

Statement d is correct. It talks about the probability of seeing the data, not the probability of the hypothesis.

This ch. ex. 14 and 14.2 looked at mirex contamination in farmed salmon. We first found a 95% confidence interval for the mean concentration to be 0.0834 to 0.0992 parts per million.

The 95% confidence interval lies entirely above the 0.08 ppm limit, evidence that mirex contamination is too high and consistent with rejecting the null.

Continuing with the regression of Ex.1, write a sentence that explains the meaning of the SE of the slope of the regression line, SE(b1)=0.0125, and the corresponding p-value.

The SE of the slope is the estimated SD of the sampling distribution for the slope. It tells us how much the slope of the regression equation would vary from sample to sample. The p-value is essentially zero, which confirms that the slope is statistically significantly different than zero.

The researchers in Ex. 1 decide to test the hypothesis that the means are equal. The df formula gives 162.75 df. Test the null hypothesis at alpha=0.05.

The difference is -$8 w/ an SE of 3.115, so the t-stat is -2.569. With 162.75 (or 161) df, the p-value is 0.011, which is less than 0.05. Reject the null hypothesis that the means are equal.

In Ex. 15, the regression model Potassium= 38+ 27 Fiber relates fiber(in grams) and potassium content (in milligrams) in servings of breakfast cereals. Explain what the slope means.

The model predicts that cereals will have approximately 27 more milligrams of potassium, on average, for every additional gram of fiber.
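The slope interpretation can be seen by evaluating the model at fiber values one gram apart. A short sketch with a hypothetical function name, using the model Potassium = 38 + 27 Fiber from the card:

```python
def predicted_potassium(fiber_g):
    """Predicted potassium (mg) per serving from fiber (g): 38 + 27 * Fiber."""
    return 38 + 27 * fiber_g

# Each extra gram of fiber raises the prediction by the slope, 27 mg.
one_gram_change = predicted_potassium(5) - predicted_potassium(4)

# A serving with 9 g of fiber, as in the first card of this set.
at_nine_grams = predicted_potassium(9)

print(one_gram_change, at_nine_grams)
```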

Repeat the test you did in ex.15, but assume that the variances of purchase amounts are the same at Target & Walmart. Did your conclusion change? Why do you think that is?

The t-statistic is still -2.561 using the pooled estimate of the SD. There are 163 df, so the p-value is still 0.011. Same conclusion as before. Because the sample SDs are nearly the same & the sample sizes are large, the pooled test is essentially the same as the two-sample t-test.

For Ex. 15's regression model predicting potassium content (in milligrams) from the amount of fiber (in grams) in breakfast cereals, se=30.77. Explain in this context what that means.

The true potassium contents of cereals vary from the predicted amounts with a SD of 30.77 milligrams.

Medical researchers followed 1435 middle-aged men for a period of 5 years, measuring the amount of baldness present and the presence of heart disease. They found a correlation of 0.089 between the two variables. Comment on their conclusion that this shows that baldness is not a possible cause of heart disease.

These are categorical data even though they are represented by numbers. The correlation is meaningless.

How much extra is having a waterfront property worth? A student took a random sample of 170 recently sold properties in Upstate New York to examine the question. Construct and interpret a 95% CI for the mean additional amount that a waterfront property is worth.

These were random samples, both less than 10% of properties sold. Prices of houses should be independent, and random sampling makes the two groups independent. The boxplots make the price distributions appear to be reasonably symmetric, and with the large sample sizes the few outliers don't affect the means much. We are 95% confident that, in NY, having a waterfront is worth, on average, about $59,121 to $140,898 more in sale price.

If the info in Ex.1 is to be used to make inferences about the proportions of all Canadians & all U.S. citizens born in other countries, what conditions must be met before proceeding? Are they met? Explain.

We must assume the data were collected randomly & that the Americans selected are independent of the Canadians selected. Both assumptions should be met. Also, for both groups, we have at least 10 native-born & foreign-born citizens & the sample sizes are less than 10% of the population sizes. All the conditions for inference are met.

Do consumers spend more on a trip to Walmart or Target? To perform inference on these two samples, what conditions must be met? Are they? Explain.

We must assume the samples were random or otherwise independent of each other. We also assume that the distributions are roughly Normal, so it would be a good idea to check a histogram to make sure there isn't strong skewness or outliers.

Researchers at the National Cancer Institute released the results of a study that investigated the effect of weed-killing herbicides on house pets. a) What's the standard error of the difference in the two proportions? b) Construct a 95% CI for this difference. c) State an appropriate conclusion.

a) 0.035 b)(0.356,0.495) c)we are 95% confident, based on these data, that the proportion of pets w/ a malignant lymphoma in homes where herbicides are used is between 35.6% & 49.5% higher than the proportion of pets w/lymphoma in homes where no pesticides are used.

The researchers from Ex. 1 want to test if the proportions of foreign born are the same in the U.S. & Canada. a) What is the difference in the proportions of foreign-born residents in both countries? b) What is the value of the z-statistic? c) What do you conclude at alpha=0.05?

a) 0.064 b) 3.964 c) P-value is <0.001. very strong evidence, so reject the null hypothesis that the proportions are the same in the two countries

Using the regression output in Ex.1 , identify the residual SD and explain what it means in the context of the problem

s=$5603. This is the SD of the residuals and thus indicates how much the data points vary about the linear regression model.

