Statistics

A new drug is being tested to see whether it can increase the chance of a quick recovery in people who have come down with the flu in the past week. The rate of quick recovery in the population of concern is 0.87. The null hypothesis is that p​ (the population proportion using the new drug that have a quick recovery​) is 0.87. What is the correct alternative​ hypothesis?

p>0.87

If a researcher wants to claim causation between an explanatory variable and a response​ variable, which of the following should they​ use?

Designed experiment

Review the accompanying scatterplots. Which of the four scatterplots corresponds to the highest R-squared value?

III

When analyzing two quantitative​ variables, what is the first thing that should be​ done?

Make a scatterplot.

What purpose does randomization serve in an​ experiment?

Randomization ensures that the effect of factors whose levels cannot be controlled is minimized.

If the standard deviation for a data set is​ zero, what can you conclude about the​ data?

The data values must all be equal.

With a​ two-tailed test, if the test statistic​ (such as​ z) is far from​ 0, will the​ p-value be large​ (closer to​ 1) or small​ (closer to​ 0)?

The​ p-value will be small because a test statistic far from 0 is indicative of a very unlikely event.

Eric randomly surveyed 150 adults from a certain city and asked which team in a contest they were rooting​ for, either North High School or South High School. From the results of his​ survey, Eric obtained a​ 95% confidence interval of​ (0.52,0.68) for the proportion of all adults in the city rooting for North High. What proportion of the 150 adults in the survey said they were rooting for North High​ School?

0.60 (the sample proportion is the midpoint of the confidence interval)
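A confidence interval for a proportion is built as estimate ± margin of error, so the sample proportion can be read off as the midpoint. A minimal Python sketch of that arithmetic, using the interval from the question (the variable names are illustrative):

```python
# Endpoints of the 95% confidence interval reported in the question.
lower, upper = 0.52, 0.68

# The interval is (p_hat - margin, p_hat + margin), so p_hat is the
# midpoint and the margin of error is half the width.
p_hat = (lower + upper) / 2
margin = (upper - lower) / 2

print(round(p_hat, 2), round(margin, 2))
```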

The sample space given here shows all possible sequences for a family with 4​ children, where B stands for boy and G stands for girl. If all 16 outcomes are equally​ likely, find the probability that there​ are: a. exactly 0 girls. b. exactly 1 boy. c. exactly 2 girls. d. exactly 3 girls. e. exactly 4 boys.

a. 1/16 b. 1/4 c. 3/8 d. 1/4 e. 1/16
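These answers can be checked by listing the 16 outcomes and counting. The sketch below is one way to do that in Python; the helper name `prob` is made up for illustration.

```python
from fractions import Fraction
from itertools import product

# All 2**4 = 16 equally likely birth orders, e.g. ('B', 'G', 'G', 'B').
outcomes = list(product("BG", repeat=4))

def prob(event):
    """Share of the equally likely outcomes that satisfy the event."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))

answers = {
    "a. exactly 0 girls": prob(lambda o: o.count("G") == 0),
    "b. exactly 1 boy": prob(lambda o: o.count("B") == 1),
    "c. exactly 2 girls": prob(lambda o: o.count("G") == 2),
    "d. exactly 3 girls": prob(lambda o: o.count("G") == 3),
    "e. exactly 4 boys": prob(lambda o: o.count("B") == 4),
}

for part, value in answers.items():
    print(part, value)
```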

A single die is rolled. Find the probability of rolling an even number or a number less than 3.

2/3. The favorable rolls are 1, 2, 4, and 6, which is 4 of the 6 equally likely faces.
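The "or" here is a union of two events that overlap (the roll 2 is both even and less than 3), so it must be counted only once. A short Python sketch using sets, which handle the overlap automatically:

```python
from fractions import Fraction

faces = set(range(1, 7))                    # a fair six-sided die
even = {f for f in faces if f % 2 == 0}     # {2, 4, 6}
less_than_3 = {f for f in faces if f < 3}   # {1, 2}

# Set union counts the shared outcome (2) only once.
favorable = even | less_than_3
p_even_or_small = Fraction(len(favorable), len(faces))

print(sorted(favorable), p_even_or_small)
```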

In a certain​ state, about 85​% of drivers who are arrested for driving while intoxicated​ (DWI) are convicted. a. If 15 independently selected drivers were arrested for​ DWI, how many of them would you expect to be​ convicted? b. Find the probability that exactly 13 out of 15 independently selected drivers are convicted. c. Find the probability that 13 or fewer drivers are convicted.

a. 15(0.85) = 12.75, so about 13 drivers. b. This is the binomial probability b(15, 0.85, 13); the calculator at http://stattrek.com/online-calculator/binomial.aspx gives 0.286. c. From the same calculation, read the cumulative probability: P(X ≤ 13) = 0.681.
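The same binomial quantities can be computed without an online calculator. This is a minimal sketch using only the Python standard library; `binom_pmf` is a helper name chosen for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial with n independent trials and success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 15, 0.85

expected = n * p                                             # part a
p_exactly_13 = binom_pmf(13, n, p)                           # part b
p_at_most_13 = sum(binom_pmf(k, n, p) for k in range(14))    # part c: k = 0..13

print(expected, round(p_exactly_13, 3), round(p_at_most_13, 3))
```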

Days before a presidential​ election, a nationwide random sample of registered voters was taken. Based on this random​ sample, it was reported that​ "52% of registered voters plan on voting for Robert Smith with a margin of error of ±​3%." The margin of error was based on a​ 95% confidence level. Can we say with​ 95% confidence that Robert Smith will win the election if he needs a simple majority of votes to​ win?

No, because​ 50% is within the bounds of the confidence interval.
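The reasoning is a simple bounds check: build the interval from the reported estimate and margin of error, then see whether its lower end clears 50%. A small Python sketch with illustrative variable names:

```python
estimate, margin = 0.52, 0.03

# 95% confidence interval implied by the report.
lower, upper = estimate - margin, estimate + margin

# A win is only supported at this confidence level if the whole
# interval lies above 50%.
majority_supported = lower > 0.50

print(round(lower, 2), round(upper, 2), majority_supported)
```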

April calculated a correlation coefficient between sex and GPA as −0.25. She said there is a weak correlation between a person's sex and their GPA. Which of the following is an appropriate comment about April's statement?

The correlation coefficient does not make sense to describe the relationship between a categorical and quantitative variable.

The mean age of all 606 used cars for sale in a newspaper one Saturday last month was 7.1 years, with a standard deviation of 6.8 years. The distribution of ages is right-skewed. For a study to determine the reliability of classified ads, a reporter randomly selects 60 of these used cars and plans to visit each owner to inspect the cars. He finds that the mean age of the 60 cars he samples is 7.6 years and the standard deviation of those 60 cars is 4.9 years. a. From the problem statement, which of the values 7.1, 6.8, 7.6, and 4.9 are parameters and which are statistics? Recall that a parameter is a numerical value that characterizes some aspect of the population, and a statistic is a numerical characteristic of a sample of data taken from the population. In this case, the population is all 606 used cars that were for sale in the newspaper, and the sample of data is the 60 cars the reporter randomly selected for his study. b. Find μ, σ, s, and x̄. c. Are the conditions for using the CLT (Central Limit Theorem) fulfilled? d. What would be the shape of the approximate sampling distribution of a large number of means, each from a sample of 60 cars?

a. The mean age of all 606 cars in the newspaper (7.1 years) and their standard deviation (6.8 years) are characteristics of the population, so 7.1 and 6.8 are population parameters. The mean age of the 60 cars the reporter randomly selected (7.6 years) and their standard deviation (4.9 years) are characteristics of the sample, so 7.6 and 4.9 are sample statistics. b. The population mean μ is 7.1. The population standard deviation σ is 6.8. The sample standard deviation s is 4.9. The sample mean x̄ is 7.6. c. First check the random sample/independence condition: each observation must be collected randomly from the population, and observations must be independent of each other. The reporter randomly selects 60 used cars from the population of 606 used cars in the newspaper. Since finding the age of any one car does not give any information about the ages of the other cars in the sample, it is safe to assume that the observations are independent of each other. Now check the Normal condition, which is fulfilled if either the population distribution is Normal or the sample size is large. The population distribution is right-skewed, so check the sample size. For most applications, a sample size of 25 is large enough, and 60 ≥ 25. Yes, all the conditions for using the CLT are fulfilled. d. The CLT states that no matter what the shape of the population distribution, if a sample is selected such that the relevant conditions are met, then the distribution of sample means follows an approximately Normal distribution. The shape is approximately Normal.
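A quick simulation can make the CLT claim concrete. The sketch below is an illustration under stated assumptions: the actual car ages are not available, so a hypothetical right-skewed population (exponential with mean 7.1) stands in for the 606 cars, and the seed is fixed only to make the run repeatable.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical right-skewed stand-in for the 606 car ages; this is an
# assumption, not the newspaper data.
population = [random.expovariate(1 / 7.1) for _ in range(606)]

# Draw many random samples of 60 cars (without replacement) and record
# the mean age of each sample.
sample_means = [mean(random.sample(population, 60)) for _ in range(2000)]

print("population mean:     ", round(mean(population), 2))
print("mean of sample means:", round(mean(sample_means), 2))
print("population SD:       ", round(stdev(population), 2))
print("SD of sample means:  ", round(stdev(sample_means), 2))
```

The sample means cluster around the population mean with a much smaller spread than the individual ages, and a histogram of `sample_means` would look roughly bell-shaped even though the population is skewed.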

The bar chart shows market share for search engines used in some country. a. Which of the following best describes the changes from 2009 to​ 2010? b. In which time​ period, 2009 or​ 2010, was there more variability in the market share for search​ engines?

a. In year 2010​, search engine C had a much larger percentage of the market share than search engines A and B. In year 2009​, even though search engine C had the larger percentage of market​ share, the percentages were spread more evenly among the three search engines. b. 2009

Use the histograms to compare the times spent commuting for community college students who drive to school in a car with the times spent by those who take the bus. a. Which group typically has the longer commute time? b. Which group has the more variable commute time?

a. Students who take the bus b. Students who take the bus

A national survey of 1,979 residents over the age of 18 reported the number of years of formal education of the respondents. The linked histogram shows the distribution of data. a. Describe and interpret the distribution of years of formal education. Mention any unusual features. b. Calculate the number of people in this sample who completed 18 years or more of formal education. c. The sample includes 1,979 people. What percentage of people in this sample have 18 years or more of formal​ education?

a. The distribution is multimodal and best described as left-skewed. b. There were 200 people who completed 18 years or more of formal education. c. The percentage of people that have 18 years or more of formal education is 10 percent.

A married couple plans to have four​ children, and they are wondering how many boys they should expect to have. Assume none of the children will be twins or other multiple births. Also assume the probability that a child will be a boy is 0.50. Explain why this is a binomial experiment. Check all four required conditions. a. Do the possible trial outcomes meet the binomial​ conditions? b. Does the number of trials meet the binomial​ conditions? c. Does the probability of success meet the binomial​ conditions? d. Do the trials meet the binomial​ conditions?

a. Yes, there are two complementary​ outcomes, either a boy or a girl b. Yes, there are 4 fixed​ trials, since there are 4 children. c. Yes, the probability of having a boy is 0.50 for each child. d. Yes, because it is assumed there are no​ twins, the gender of one child does not affect the gender of another.

In a television​ advertisement, a company called​ "Waist Away" claimed the workout program on their set of DVDs would help people lose weight more than any other DVD workout program. To test this​ claim, an independent​ company, called​ "Slim Down," selected one other DVD program. They then randomly assigned half the volunteers to the Waist Away program and the other half to the Slim Down program. Each participant was weighed before they started the program and then regularly participated in their assigned program for one month. After one​ month, each participant was weighed again. The percent of weight lost was recorded for each​ person, where negative values indicated a weight gain. What type of study was​ performed?

experiment

Match each description with the correct histogram of the data. 1. Heights of students in a large statistics class that contains about equal numbers of men and women. 2. Numbers of hours of sleep the previous night in the same large statistics class. 3. Numbers of driving accidents for students in a large university in the U.S.

1→​B, 2→​A, 3→C

Match each description with the correct histogram of the data. 1. The age of death of a sample of 19 typical women in the U.S. 2. The yearly tuition for 142​ colleges, 85 of which are private and 57 of which are​ state-supported. 3. The outcomes of rolling a fair die​ (with six​ sides) 5000 times.

1→​B, 2→​C, 3→A

A company advertises a mean lifespan of 1000 hours for a particular type of light bulb. If you were in charge of quality control at the​ factory, would you prefer that the standard deviation of the lifespans for the light bulbs be 5 hours or 50​ hours? Why?

5 hours would be preferable since a smaller standard deviation indicates more consistency.

The distribution of the scores on a certain exam is N(40, 10), which means that the exam scores are Normally distributed with a mean of 40 and standard deviation of 10. a. Sketch the curve and label, on the x-axis, the position of the mean, the mean plus or minus one standard deviation, the mean plus or minus two standard deviations, and the mean plus or minus three standard deviations. b. Find the probability that a randomly selected score will be less than 30. Shade the region under the Normal curve whose area corresponds to this probability.

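a. See picture. b. Using the Empirical Rule, the probability that a randomly selected score will be less than 30 is about 16%. (A score of 30 is one standard deviation below the mean; about 68% of scores fall within one standard deviation, leaving about 32% in the two tails, or 16% in each.)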
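The Empirical Rule figure can be compared with the exact Normal area. A minimal sketch using `statistics.NormalDist` from the Python standard library:

```python
from statistics import NormalDist

scores = NormalDist(mu=40, sigma=10)

# Exact area to the left of 30, i.e. one standard deviation below the mean.
p_exact = scores.cdf(30)

# Empirical Rule: about 68% within one SD, so half of the remaining 32%
# lies in the lower tail.
p_empirical = (1 - 0.68) / 2

print(round(p_exact, 4), round(p_empirical, 2))
```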

The percentage of​ left-handed people in a certain country is estimated to be 99​%. Women are about six times as likely to be​ left-handed as men. Are gender and handedness independent or​ associated? Explain.

Gender and handedness are associated because women are more likely to be​ left-handed than men.

Historically, the percentage of residents of a certain country who support stricter gun control laws has been 54%. A recent poll of 1053 people showed 528 in favor of stricter gun control laws. Assume the poll was given to a random sample of people. Test the claim that the proportion of those favoring stricter gun control has changed. Perform a hypothesis test, using a significance level of 0.05. State the null and alternative hypotheses. Compute the sample proportion p̂, the standard error SE, the z-test statistic, and the p-value.

H0: The population proportion that supports stricter gun control is 0.54 (p = 0.54). Ha: p ≠ 0.54. p̂ = 528/1053 = 0.5014. SE = 0.01536, using https://www.easycalculation.com/statistics/standard-error-sample-proportion.php. z = −2.51, using https://www.easycalculation.com/statistics/test-for-one-proportion.php. p-value = 0.012, using http://www.socscistatistics.com/pvalues/normaldistribution.aspx. Since 0.012 < 0.05, reject H0; the percentage is significantly different from 54%.
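The whole one-proportion z-test can also be reproduced in a few lines. This is a sketch using only the Python standard library; the variable names are illustrative, and the standard error uses the null value p0, as is usual for this test.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.54                  # proportion claimed under H0
successes, n = 528, 1053   # poll results

p_hat = successes / n
se = sqrt(p0 * (1 - p0) / n)      # standard error under H0
z = (p_hat - p0) / se

# Two-tailed p-value: area in both tails beyond |z|.
p_value = 2 * NormalDist().cdf(-abs(z))

alpha = 0.05
reject_h0 = p_value < alpha

print(round(p_hat, 4), round(se, 5), round(z, 2), round(p_value, 3), reject_h0)
```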

Refer to Histograms​ A, B, and​ C, which show the relative frequencies from experiments in which a fair​ six-sided die was rolled. One histogram shows the results from 40 ​rolls, one the results for 100 ​rolls, and another the results for 20,000 rolls. Which histogram do you think was for 20,000 rolls and​ why?

Histogram A was for 20,000 rolls because the relative frequency of all the outcomes is close to the predicted probability.

Predict the shape of the distribution of the salaries of 25 chief executive officers​ (CEOs). A typical value is about 50 million per​ year, but there is an outlier at about 200 million.

It should be​ right-skewed.

Suppose the list below shows how many text messages Elyse sent each day for the last 10 days. If Elyse wants to know how many text messages she typically sends each​ day, which measure of central tendency better describes the typical number of text messages per​ day?

Median; The median of 27.5 is a better representative of the center since it is resistant to the one extreme value. The mean of 33.3 is not representative of the typical number of texts since only one number is larger than the mean.

According to the ancient Roman architect​ Vitruvius, a​ person's armspan​ (the distance from fingertip to fingertip with the arms stretched​ wide) is approximately equal to his or her height. For​ example, people 5 feet tall tend to have an armspan of 5 feet.​ Explain, then, why the distribution of armspans for a class containing roughly equal numbers of men and women might be bimodal.

Men and women tend to have different heights and therefore different armspans

If 23% of American households own one or more dogs and 42% own one or more cats, then from this information, is it possible to find the percentage of households that own a cat OR a dog? Why or why not?

No, because the event of owning a dog and the event of owning a cat are not mutually exclusive.​ Therefore, to find the percentage of people that own a cat or a​ dog, it is necessary to know the percentage of people that own a cat and a dog.

In a study conducted to examine the quality of fish after 7 days in ice​ storage, ten raw fish of the same kind and approximately the same size were caught and prepared for ice storage. The fish were placed in ice storage at different times after being caught. A measure of fish quality was given to each fish after 7 days in ice storage. Review the accompanying sample​ data, scatterplot, and Minitab output from a simple linear regression​ analysis, where​ "Time" is the number of hours after being caught that the fish was placed in ice storage and​ "Fish Quality" is the measure given to each fish after 7 days in ice storage​ (higher numbers mean better​ quality). Is it appropriate to use this regression model to predict​ "Fish Quality" for a fish put in storage 20 hours after it was​ caught?

No; the linear pattern that exists in the model from the sample data may not continue in the same fashion outside the range of the​ x-values from the sample data.

According to a study published in a reputable science magazine, about 7 women in 100,000 have cervical cancer (C), so P(C) = 0.00007. Suppose the chance that a Pap smear will detect cervical cancer when it is present is 0.84. Therefore, P(test pos|C) = 0.84. What is the probability that a randomly chosen woman who has this test will both have cervical cancer AND test positive for it?

P(have C AND test positive) = P(C) × P(test pos|C) = 0.00007 × 0.84 ≈ 0.000059

For each​ graph, indicate whether the shaded area could represent a​ p-value. Explain why or why not. If​ yes, state whether the area could represent the​ p-value for a​ one-tailed or a​ two-tailed alternative hypothesis.

The shaded area could be a​ p-value for a test with a​ two-tailed alternative hypothesis since both tails are of equal size. The shaded area could not be a​ p-value because it does not include tail areas.

A survey asked people how many years of education they had and how many years their mothers had.​ (A high school graduate with no further education would report 12 years of​ education, a​ bachelor's degree 16​ years.) 2018 people answered the question about their own​ education, and 1780 answered the question about their mothers. Use the distributions linked below to compare the years of education of the respondents with the years of education of their mothers.

Typically, respondents in this sample had more years of formal education than did their mothers.

Identify when the interquartile range is better than the standard deviation as a measure of dispersion and explain its advantage.

When the distribution is skewed left or right or contains some extreme​ observations, then the interquartile range is preferred since it is resistant.

The number of married people in a country​ (in millions) and the total number of adults in the country​ (in millions) are provided in the accompanying table for several years. Find the percentage of people married in each of the given​ years, and describe the trend over time. a. The percentage of the adult population that was married in 1990 was b. The percentage of the adult population that was married in 1997 was c. The percentage of the adult population that was married in 2000 was d. The percentage of the adult population that was married in 2007 was Which of the following best describes the trend over​ time?

a. 58.1% b. 56.2% c. 54.7% d. 53.3% The percentage of married people is decreasing over time​ (at least with these​ dates).

Some people were asked about their happiness and were also asked whether they agreed with the following​ statement: "In a​ marriage, the husband should work and the wife should take care of the​ home." The table below summarizes the data collected. Complete parts a and b below. a. Add the marginal totals and the grand total to the table. b. Is being happy independent of agreeing with the​ statement?

a. See picture. b. No

Both figures below concern the assessed value of land that includes homes, and both use the same dataset. a. Which do you think has a stronger relationship with value of the land: the number of acres of land or the number of rooms in the homes? Why? b. If you were trying to predict the value of a parcel of land in this area (on which there is a home), would you be able to make a better prediction by knowing the acreage or the number of rooms in the house? Explain.

a. The number of acres of land has a stronger relationship with the value of the land, as shown by the fact that the points are less scattered in a vertical direction. b. The acreage, because the association between the value of the land and acreage is stronger than the association with the number of rooms (the vertical spread is smaller).

Determine whether each of the following variables would best be modeled as continuous or discrete. a. The number of light bulbs that burn out in the next week in a room with 16 bulbs. b. The height of a randomly selected giraffe. c. The number of statistics students now reading a book. d. The number of textbook authors now sitting at a computer. e. The distance a baseball travels in the air after being hit.

a. discrete b. continuous c. discrete d. discrete e. continuous

A friend claims he can predict the suit of a card drawn from a special deck of 114 cards. There are six suits and equal numbers of cards in each suit. The parameter, p, is the probability of success, and the null hypothesis is that the friend is just guessing. a. What is the correct null hypothesis? b. What hypothesis best fits the friend's claim? (This is the alternative hypothesis.)

a. p=1/6 b. p>1/6

Three scatterplots are shown below. The calculated correlations are 0.626​, −0.896, and −0.016. Determine which correlation goes with which scatterplot.

​(a) −0.016 ​(b) 0.626 ​(c) -0.896

a. In your own​ words, describe to someone who knows only a little statistics how to recognize when an observation is an outlier. What​ action(s) should be taken with an​ outlier? b. Which measure of the center​ (mean or​ median) is more resistant to​ outliers, and what does​ "resistant to​ outliers" mean?

a. Outliers are observed values far from the main group of data. In a histogram they are separated from the others by space. Outliers must be looked at in closer context to know how to treat them. If they are​ mistakes, they might be removed or corrected. If they are not​ mistakes, you might do the analysis​ twice, once with and once without the outliers. b. The median is more​ resistant, which indicates that it usually changes less than the mean when comparing data with and without outliers.

Suppose a poll of 808 adults asked how to deal with an energy​ shortage: use more​ production, more​ conservation, or both. If one person is selected randomly from the 808 adults​ polled, what is the probability of each of the following​ outcomes? a. The person responded ​"More production​". b. The person responded ​"Both​". c. The person responded ​"More production​" or ​"Both​".

a. The probability that the person responded "More production" is 0.478. b. The probability that the person responded "Both" is 0.069. c. The probability that the person responded "More production" or "Both" is 0.547.

According to a candy​ company, packages of a certain candy contain 30% orange candies. Suppose we examine 750 random candies. a. What value should we expect for our sample percentage of orange​ candies? b. What is the standard​ error? c. Use your answers to fill in the blanks below. We expect​ ____% orange​ candies, give or take​ _____%.

a. We should expect 30% of the candies in the sample to be orange. b. SE = √(0.30 × 0.70 / 750) ≈ 0.017, or 1.7%. c. We expect 30% orange candies, give or take 1.7%.
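The standard error of a sample proportion is √(p(1 − p)/n). A short Python sketch of the calculation for this candy example (variable names are illustrative):

```python
from math import sqrt

p = 0.30   # claimed proportion of orange candies
n = 750    # number of candies examined

# Standard error of the sample proportion.
se = sqrt(p * (1 - p) / n)

print(f"expect {p:.0%} orange, give or take {se:.1%}")
```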

A college student asked students who had full-time jobs and students who had part-time jobs how many times they went out to eat in the last month. Write a brief comparison of the distributions of the two groups.

Full-time is a bit left-skewed, and part-time is a bit right-skewed. Those with full-time jobs typically go out to eat more than those with part-time jobs. The full-time workers have a distribution with more variability. There are no outliers in the distributions. To check, open the data in StatCrunch and create separate histograms for full-time and part-time (graph > histogram); then select stats > summary stats > column to compare the averages for each group.

A commuter has a choice of two routes for his morning drive to work. In an effort to determine the best​ route, he collects data on his drive time for each route. If he is interested in a route with a predictable drive​ time, which route should he choose and​ why?

He should choose the Rural Roads route. The smaller IQR indicates a smaller spread and more consistent times for that route.

Shortly after Country C invaded Country A, a website posed a question to readers of their magazine on the Internet: "Who really poses the greatest danger to world peace? Country A, Country B, or Country C?" The site received 706,987 responses: 6.6% said Country A, 6.4% said Country B, and 86.9% said Country C. Identify the population, and explain why the results might not reflect the true opinions of the population.

The population is all readers of the magazine on the Internet. Because it was a voluntary-response sample, it could be that only people who were really angry with Country C took the time to respond.

Look at the histograms of real estate data and decide whether you think the standard deviation of home prices in City A was larger or smaller than the standard deviation of home prices in City B.

The prices of the houses in City B have a larger standard deviation than the prices in City​ A, because the data from City A show a lot of prices near the center of the graph and the prices in City B show a lot of prices far from the center of the graph.

The histogram shows the ages of 25 CEOs listed on a certain website. Based on the​ distribution, what is the approximate mean age of the CEOs in this data​ set? Write a sentence in context​ (using words in the​ question) interpreting the estimated mean.

The typical CEO is between 56 and 60 years old.

Days before a presidential​ election, an article based on a nationwide random sample of registered voters reported the following​ statistic, "52% ​(±​3%) of registered voters will vote for Robert​ Smith." What is the ​"±​3%" ​called?

The ​"±​3%" is called the margin of error.

Which of the following statements best describes this​ scatterplot?

There is a positive relationship between X and Y with 3 outliers that have unusual​ X-values. The relationship seems to weaken as X increases.

In a standard Normal​ distribution, if the area to the left of a​ z-score is about 0.4000​, what is the approximate​ z-score? Draw a sketch of the Normal​ curve, showing the area and​ z-score.

Use the second line (Calculate z from cumulative probability p) on http://sampson.byu.edu/courses/z2p2z-calculator.html. Entering 0.4000 gives z = −0.25.

Suppose the heights of women at a college are approximately Normally distributed with a mean of 64 inches and a population standard deviation of 2.0 inches. What height is at the 45th ​percentile?

Use the second line (Calculate z from cumulative probability p) on http://sampson.byu.edu/courses/z2p2z-calculator.html. Enter 0.45 to get z = −0.13. Then 64 − 0.13(2.0) = 63.7. Therefore, the 45th percentile is 63.7 inches.
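The same percentile can be found with an inverse Normal calculation instead of the online calculator. A minimal sketch using `statistics.NormalDist` from the Python standard library, showing both the z-score route and the direct route:

```python
from statistics import NormalDist

mu, sigma = 64, 2.0

# Route 1: find the z-score with 45% of the area to its left,
# then convert it to inches with x = mu + z * sigma.
z = NormalDist().inv_cdf(0.45)
height_from_z = mu + z * sigma

# Route 2: ask the height distribution for its 45th percentile directly.
height_direct = NormalDist(mu, sigma).inv_cdf(0.45)

print(round(z, 2), round(height_from_z, 1), round(height_direct, 1))
```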

Have you ever wondered whether you could afford to move to another​ city? The dotplot shows rental prices​ (dollars per​ month) in three cities for units with at least one bedroom and one bathroom. Complete a through c below. a. Which city typically has the highest​ rents? b. Which​ city's distribution has the smallest​ variation? c. Which​ city's distribution is the least​ skewed?

a. City 1 b. City 3 c. City 3

State whether each of the following changes would make a confidence interval wider or narrower.​ (Assume that nothing else​ changes.) a. Changing from a 95​% confidence level to a 99​% confidence level. b. Changing from a sample size of 300 to a sample size of 25. c. Changing from a standard deviation of 25 pounds to a standard deviation of 15 pounds.

a. The interval will become wider. b. The interval will become wider. c. The interval will become narrower.
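Each answer follows from the margin of error formula z* × σ / √n for a z-interval. The sketch below compares the three changes against a baseline; combining a 95% level, a standard deviation of 25, and a sample size of 300 into one baseline is an assumption made for illustration, and `margin_of_error` is a made-up helper name.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(confidence, sigma, n):
    """Half-width of a z-interval for a mean: z* times sigma / sqrt(n)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sigma / sqrt(n)

baseline = margin_of_error(0.95, sigma=25, n=300)

higher_confidence = margin_of_error(0.99, sigma=25, n=300)  # part a
smaller_sample = margin_of_error(0.95, sigma=25, n=25)      # part b
smaller_sd = margin_of_error(0.95, sigma=15, n=300)         # part c

for label, value in [("a", higher_confidence), ("b", smaller_sample), ("c", smaller_sd)]:
    change = "wider" if value > baseline else "narrower"
    print(label, change)
```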

In a study conducted to examine the quality of fish after 7 days in ice​ storage, ten raw fish of the same kind and approximately the same size were caught and prepared for ice storage. The fish were placed in ice storage at different times after being caught. A measure of fish quality was given to each fish after 7 days in ice storage. The sample data are shown​ below, where​ "Time" is the number of hours after being caught that the fish was placed in ice storage and​ "Fish Quality" is the measure given to each fish after 7 days in ice storage​ (higher numbers mean better​ quality). Which variable would be considered the response​ variable?

Fish Quality

The data in the given table gives the automobile fatality rates​ (deaths per​ 100,000 residents) in 2005 for the 25 states with the highest death rates. The​ TI-83/84 output to the right shows some descriptive statistics for this data set. Make a boxplot of the data. If you find potential​ outlier(s), identify the​ state(s). The​ state(s) that​ is/are potential​ outlier(s) is/are

See picture letters o,p.

A community college faculty is negotiating a new contract with the school board. The distribution of faculty salaries is skewed right by several faculty members who make over​ $100,000 per year. If the faculty want to give the community the impression that they deserve higher​ salaries, should they advertise the mean or median of their current​ salaries?

The faculty should use the median to make their argument. The median will be lower than the mean since the mean is influenced by the few extremely high salaries.

The figure shows the total assessed value of some properties​ (the total value of the home and the​ land) and the acreage that comes with the homes. Describe what you see. Is the trend positive or​ negative? What does that​ mean?

The trend is​ positive, but the trend tends to level off at higher acreages.​ Generally, homes on larger acreages have larger assessed values.

Assume a standard Normal distribution. Draw a​ well-labeled Normal curve for each part. a. Find the​ z-score that gives a left area of 0.6768. b. Find the​ z-score that gives a left area of 0.1332.

Use the second line (Calculate z from cumulative probability p) on http://sampson.byu.edu/courses/z2p2z-calculator.html. a. z = 0.46 b. z = −1.11
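Finding a z-score from a left-tail area is an inverse CDF lookup on the standard Normal distribution. A short Python sketch with the standard library; `z_for_left_area` is a hypothetical helper name:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def z_for_left_area(area):
    """z-score with the given area to its left under the standard Normal curve."""
    return standard_normal.inv_cdf(area)

z_a = z_for_left_area(0.6768)
z_b = z_for_left_area(0.1332)

print(round(z_a, 2), round(z_b, 2))
```

An area above 0.5 gives a positive z-score and an area below 0.5 gives a negative one, which is a quick sanity check on the signs.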

For each of the​ following, state whether a​ one-proportion z-test or a​ two-proportion z-test would be​ appropriate, and name the populations. a. A polling agency takes a random sample to determine the proportion of people in a state who support a certain proposition to determine if it will pass. b. A student asks men and women whether they support capital punishment for some murderers. She wants to find if the proportion of women who support capital punishment is less than the proportion of men who support capital punishment.

a. A one-proportion z-test would be appropriate. The population is people from the state. b. A two-proportion z-test would be appropriate. The populations are men and women.

A group of 50​ students, 25 men and 25​ women, reported the number of hours per week they spent studying statistics. a. Refer to the histograms. Which measure of center should be​ compared: the means or the​ medians? Why? b. Compare the distributions in context using appropriate measures.​ (Don't forget to mention ​ outliers, if​ appropriate). Refer to the table for the summary statistics.

a. Both data sets are​ right-skewed and have outliers that represent large numbers of hours​ studying; so the medians and interquartile ranges should be compared. b. The women tended to study more as measured by the​ median, and had the same variation as measured by the interquartile range.

Two surfers and statistics students collected data on the number of days on which surfers surfed in the last month for 30 longboard (L) users and 30 shortboard (S) users. Treat these data as though they were from two independent random samples. Test the hypothesis that the mean days surfed for all longboarders is larger than the mean days surfed for all shortboarders (because longboards can go out in many different surfing conditions). Use a level of significance of 0.05. a. Determine the hypotheses for this test. Choose the correct answer below. b. Find the test statistic for this test. Find the p-value for this test. What is the conclusion for this test?

a. H0: μL=μS Ha: μL>μS b. Open stat crunch (stat, t-stat, two sample, with data, perform hypothesis test: H0: μ1 - μ2 = 0 and HA: μ1 - μ2 > 0, compute). t=0.92 p-value=0.180 Do not reject H0. The mean days surfed for all longboarders is not significantly larger than the mean days surfed for all shortboarders.

Measurements were made for a sample of adult men. A regression line was fit to predict the​ men's armspan from their height. The output from several statistical technologies is provided in the accompanying charts. The scatterplot confirms that the association between armspan and height is linear. a. Report the equation for predicting armspan from height. Use words such as​ armspan, not just x and y. b. Explain how to find the slope and intercept for each of the provided outputs. c. Explain how to find the slope and intercept for the displayed output. Choose the correct answer below d. Explain how to find the slope and intercept for the displayed output. Choose the correct answer below. e. Explain how to find the slope and intercept for the displayed output. Choose the correct answer below.

a. Predicted Armspan=22.2+2.265 Height b. The slope is the number being multiplied by​ Height, and the intercept is the constant. c. In the fourth​ line, the slope is multiplied by the​ Height, and the intercept is the constant. d. The slope is the X Variable​ value, and the intercept is the Intercept value. e. The​ a-value is the​ intercept, and the​ b-value is the slope.

The equation for the regression line relating the salary and the year first employed is given above the figure. a. Report the slope and explain what it means. b. Either interpret the​ y-intercept of​ 4,255,424 or explain why it is not appropriate to interpret the​ y-intercept.

a. The average salary is​ $2099 less for each year later that the person was hired or an average of​ $2099 more for each year earlier. b. The​ y-intercept of​ $4,255,424 would be the salary for a person who started in the year​ 0, which is not appropriate to interpret.

Data for the ages of grooms and their brides for a random sample of 31 couples in a certain county are provided. Complete parts a through c below. a. Find and compare the sample means. The sample mean for the​ grooms, _______​, is ______ than the sample mean for their​ brides, ______. b. Test the hypothesis that there is a significant difference in mean ages of brides and​ grooms, using a significance level of 0.05. What is the conclusion for this​ test? c. If the test had been done to determine whether the mean for the grooms was significantly larger than the mean for the​ brides, how would that change the alternative hypothesis and the​ p-value? What would the alternative hypothesis​ become? How would the​ p-value change?

Open in stat crunch (stats, summary stats, columns [do bride and groom separately]). a. 27.710, greater, 25.871 b. In stat crunch complete a paired t-stat (stat, t-stat, paired). t=3.68 p=0.000 Reject H0, the mean ages are significantly different. c. ​Ha : μdifference>0 The new​ p-value would be half of the​ p-value from the test performed above.

Have you ever wondered whether you could afford to move to another​ city? The dotplot shows rental prices​ (dollars per​ month) in three cities for units with at least one bedroom and one bathroom. Complete a through c below. a. Which city typically has the highest​ rents? b. Which​ city's distribution has the smallest​ variation? c. Which​ city's distribution is the least​ skewed?

a. City 2 b. City 1 c. City 1

The table shows prices of 50 college textbooks in a community college​ bookstore, rounded to the nearest dollar. Make an appropriate graph of the distribution of the​ data, and describe the distribution. Choose the correct histogram below. Describe the distribution. Choose the correct answer below.

a. Open in stat crunch, then see picture. b. The histogram is bimodal with modes at about​ $30 and about​ $90

Two symbols are used for the​ mean: μ and overbar x. a. Which represents a parameter and which a​ statistic? b. In determining the mean age of all students at your​ school, you survey 30 students and find the mean of their ages. Is this mean overbar x or μ​?

a. The symbol μ represents a parameter and overbar x represents a statistic. b. The mean is overbar x.

The table gives information on majors at a certain college. Sketch an appropriate graph of the​ distribution, and comment on its important features. One type of graph appropriate for showing this distribution is a bar chart. Which of the following bar charts correctly shows the given​ distribution? Which of the following would also be appropriate graphs for showing the given​ distribution? Which of the following is the best description of the​ distribution's important​ features?

a. see picture b. pie chart and Pareto chart c. The mode is Math and Science (MS)

Suppose two classes take the same exam and produce the results shown in the following histograms. Which class had a higher standard deviation on the​ exam?

Class 2 had a higher standard deviation on the exam since there are more scores on the low end and more scores on the high end of the range.

The margin of error is​ _____________ the width of the confidence interval.

The margin of error is half the width of the confidence interval.

A university conducted a survey of 275 ​sophomore, junior, and senior undergraduate students regarding satisfaction with student government. Results of the survey are shown in the table by class rank. A survey participant is selected at random. What is the probability that he or she is a sophomore AND satisfied​? In other​ words, find​ P(person is a sophomore AND satisfied).

The probability that a student is a sophomore and is satisfied is .178 (49/275)

The histogram shows frequencies for the ages of 25 randomly selected CEOs. Approximately what is a typical age of a CEO in this​ sample?

The typical age of a CEO in this sample is between 56 and 60 years old.

It is recommended that adults get 8 hours of sleep each night. A researcher hypothesized college students got less than the recommended number of hours of sleep each​ night, on average. The researcher randomly sampled 20 college students and found no evidence to reject the null hypothesis at the​ 5% significance level. What is true regarding the​ p-value from this hypothesis​ test?

The​ p-value must have been greater than 0.05.

A vaccine to prevent a severe virus was given to children within the first year of life as part of a drug study. The study reported that of the 3028 children randomly assigned the​ vaccine, 55 got the virus. Of the 1718 children randomly assigned the​ placebo, 44 got the virus. a. Find the sample percentage of children who caught the virus in each group. Is the sample percentage lower for the vaccine​ group, as investigators​ hoped? b. Determine whether the vaccine is effective in reducing the chance of catching the​ virus, using a significance level of 0.01. The first few steps of the​ hypothesis-testing procedure are given. Complete the procedure. c. What is the conclusion for this​ test?

a. 1.82%, 2.56%, yes b. z=-1.73 p-value is 0.04182 http://www.socscistatistics.com/tests/ztest/Default2.aspx c. Do not reject H0. There is not sufficient evidence to conclude that the vaccine is effective in reducing the chance of catching the virus at the significance level of 0.01.
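
For anyone who wants to check the calculator, here is a rough Python sketch of a pooled two-proportion z-test using the counts from the question. The function name `two_proportion_z` is invented for this example, and the Normal approximation is assumed to be adequate for these sample sizes.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """Pooled two-proportion z statistic for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion assumed under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Group 1 = vaccine, group 2 = placebo.
z = two_proportion_z(55, 3028, 44, 1718)

# Ha: p_vaccine < p_placebo, so the p-value is the left-tail area.
p_value = NormalDist().cdf(z)

alpha = 0.01
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```

The p-value from the unrounded z differs slightly from the website's figure, which appears to be based on z rounded to two decimals. Either way it is above 0.01.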

A study of a population showed that​ males' body temperatures are approximately Normally distributed with a mean of 98.1°F and a population standard deviation of 0.30°F. What body temperature does a male have if he is at the 70th ​percentile? Draw a​ well-labeled sketch to support your answer.

a. 98.3 degrees F Use the second line (Calculate z from cumulative probability p) on http://sampson.byu.edu/courses/z2p2z-calculator.html Enter .70. z=.52 98.1+0.52(0.30)=98.3 ​Therefore, a male at the 70th percentile has a body temperature of 98.3°F.
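
The same percentile can be found without the website. A minimal sketch, assuming Python 3.8+ and the mean and standard deviation given in the question:

```python
from statistics import NormalDist

# Male body temperature model from the question: N(98.1, 0.30).
body_temp = NormalDist(mu=98.1, sigma=0.30)

# Step 1: z-score with 70% of the area to its left.
z_70 = NormalDist().inv_cdf(0.70)

# Step 2: convert the z-score back to degrees, x = mean + z * sd.
temp_70th = body_temp.mean + z_70 * body_temp.stdev

print(round(z_70, 2), round(temp_70th, 1))
```

Calling `body_temp.inv_cdf(0.70)` directly gives the same temperature in one step.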

Twenty girls​ (ages 9-10) competed in the​ 50-meter freestyle event at a local swim meet. The mean time was 43.70 seconds with a standard deviation of 8.07 seconds. The median time was 40.15 with an IQR of 4.98 seconds. Without looking at a graphical​ display, what shape would you expect the distribution of swim times to​ have?

The distribution is likely to be skewed to the right since the mean is greater than the median. The presence of outliers on the high end could explain the higher standard deviation and the greater mean.

Suppose a daily high temperature for a city is accidentally recorded as 700 instead of 70 degrees Fahrenheit. How would this affect the weekly mean high temperature compared to the monthly mean high​ temperature? Explain.

The monthly mean will be less affected than the weekly mean since the larger number of observations lessens the impact of that individual data value.

A researcher is testing someone who claims to have ESP by having that person predict whether a coin will come up heads or tails. The null hypothesis is that the person is guessing and does not have​ ESP, and the population proportion of success is 0.50. The researcher tests the claim with a hypothesis​ test, using a significance level of 0.05. Fill in the blanks below with an accurate statement about the potential conclusion of this test.

The probability of concluding that the person has ESP when in fact she or he does not have ESP is 0.05.

A fair coin is flipped 75 times. a. Find the expected number of heads. b. Find the standard deviation for the number of heads. c. Determine how many heads you should​ expect, give or take how many. Give the range of the number of heads based on these numbers.

a. 75(.5)=37.5 round to 38 b. Use http://vassarstats.net/binomialX.html Enter n=75 ,k=38, p=.5 answer=4.33, round to 4 c. 34-42 out of 75. Subtract 1 SD from 38 and get 34. Add one SD to 38 and get 42.
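
The binomial formulas behind these numbers are mean = np and SD = sqrt(np(1-p)). A small Python sketch of that arithmetic (the variable names are just for illustration):

```python
from math import sqrt

n = 75    # number of flips
p = 0.5   # probability of heads on each flip

expected_heads = n * p           # binomial mean, n * p
sd_heads = sqrt(n * p * (1 - p))  # binomial standard deviation

# "Expected, give or take one SD", before any rounding.
low = expected_heads - sd_heads
high = expected_heads + sd_heads

print(expected_heads, round(sd_heads, 2), round(low, 1), round(high, 1))
```

The answer above rounds the mean to 38 and the SD to 4 before adding and subtracting, which is how it arrives at 34 to 42.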

The dotplot shows the body mass index​ (BMI) for 131 randomly surveyed people from a certain country. a. A BMI of more than 40 is considered morbidly obese. Report the number of morbidly obese shown in the dotplot. b. Report the percentage of people who are morbidly obese. Compare this with a recent estimate that​ 3% of people from the country were morbidly obese. a. From the people​ surveyed, how many people would be considered morbidly​ obese? The number of people considered morbidly obese is b. Find the percentage of people from the survey who are morbidly obese. The percentage of people considered morbidly obese is about c. Compare the survey value with the estimate given. The percentage of morbidly obese people in the survey is _____ the estimated percentage of people that are considered morbidly obese.

a. 9 (count all dots past the 40 mark, but not the ones above the 40 mark). b. 7% [(9*100)/131] c. greater than

A group of 50​ students, 25 men and 25​ women, reported the number of hours per week they spent studying statistics. Use the table and histograms below to complete parts a. and b. a. Refer to the histograms. Which measure of center should be​ compared: the means or the​ medians? Why? b. Compare the distributions in context using appropriate measures.​ (Don't forget to mention ​ outliers, if​ appropriate). Refer to the table for the summary statistics.

a. Both data sets are​ right-skewed and have outliers that represent large numbers of hours​ studying; so the medians and interquartile ranges should be compared. b. The women tended to study more as measured by the​ median, and had more variation as measured by the interquartile range.

College students and surfers Rex Robinson and Sandy Hudson collected data on the​ self-reported number of days surfed in a month for 30 longboard surfers and 30 shortboard surfers. a. Compute the means for both longboards and shortboards. b. Compute the standard deviation of both longboards and shortboards.

a. The mean for longboards is 13.4 days. The mean for shortboards is 11.1 days. So longboarders tend to go surfing more. b. The standard deviation for longboards is 5.3 days. The standard deviation for shortboards is 4.9 days. So longboarders tend to have more varied days surfing.

Find the sample size required for a margin of error of 2 percentage points, and then find one for a margin of error of 1 percentage point; for both, use 95% confidence. Find the ratio of the larger sample size to the smaller sample size. To reduce the error by half, what do you need to multiply the sample size by? a. Use the shortcut formula n=1/m^2, where n represents the sample size and m represents the margin of error in decimal form, to find the necessary sample size required for a margin of error of 2 percentage points. b. Find the necessary sample size required for a margin of error of 1 percentage point. c. Use the sample sizes obtained above to find the ratio of the larger sample size to the smaller sample size. d. To reduce the error by half, what do you need to multiply the sample size by?

a. n=2500 b. n=10000 c. 4 d. 4
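
The shortcut n = 1/m^2 can be checked with a few lines of Python. In this sketch the margin is passed in percentage points (so m = points/100), which keeps the arithmetic exact; the function name is hypothetical.

```python
def shortcut_sample_size(margin_points: float) -> float:
    """Shortcut n = 1 / m**2 at 95% confidence, with m = margin_points / 100."""
    return (100 / margin_points) ** 2


n_two_points = shortcut_sample_size(2)  # margin of error of 2 percentage points
n_one_point = shortcut_sample_size(1)   # margin of error of 1 percentage point
ratio = n_one_point / n_two_points

print(n_two_points, n_one_point, ratio)
```

Because m is squared in the denominator, halving the margin of error always multiplies the required sample size by 4.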

The mean total cholesterol for a population of men between the ages of 20 and 29 is 180 micrograms per deciliter with a standard deviation of 33.9. A healthy total cholesterol level is less than 200, 200-240 is​ borderline, and above 240 is dangerous. Assume the distribution is approximately Normal. a. For a randomly selected man from this​ group, what is the probability that his total cholesterol level is 200 or​ more? b. For a randomly selected man from this​ group, what is the probability that his total cholesterol level is 240 or​ more? c. If two randomly selected men are chosen from this​ group, what is the probability that both will have a total cholesterol level of 200 or​ more? Assume independence. d. If 800 randomly selected men are chosen from this​ group, how many​ (the count, not the​ percentage) would you expect to have a total cholesterol level of 200 or​ more?

(Use z=(x-mu)/std dev) z=(200-180)/33.9=.59 Can use http://ncalculators.com/statistics/z-score-calculator.htm a. The probability is .2776 that his total cholesterol level is 200 or more. (Use "Area Under the Normal Distribution Chart" to find area to left of .59, then subtract that number from 1 [1-0.7224=0.2776]. Or use http://www.mathportal.org/calculators/statistics-calculator/z-score-calculator.php) b. The probability is .0384 that his total cholesterol level is 240 or more. (z=(240-180)/33.9=1.77. Find on chart and subtract from 1) c. The probability is .0771 that both men will have a total cholesterol level of 200 or more. (Use probability from part a and use equation [(0.2776)(0.2776)=0.0771]) d. You would expect 222 men to have a total cholesterol level of 200 or more. (0.2776•800=222)
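
All four parts can be reproduced in one short Python sketch. It assumes Python 3.8+ and uses unrounded z-scores, so the last digit may differ slightly from values read off a printed table.

```python
from statistics import NormalDist

# Total cholesterol model from the question: Normal, mean 180, SD 33.9.
cholesterol = NormalDist(mu=180, sigma=33.9)

# a. P(X >= 200) is 1 minus the area to the left of 200.
p_200_or_more = 1 - cholesterol.cdf(200)

# b. P(X >= 240), same idea.
p_240_or_more = 1 - cholesterol.cdf(240)

# c. Two independent men both at 200 or more: multiply the probabilities.
p_both_200_or_more = p_200_or_more ** 2

# d. Expected count out of 800 men: n times the probability from part a.
expected_count = 800 * p_200_or_more

print(round(p_200_or_more, 4), round(p_240_or_more, 4),
      round(p_both_200_or_more, 4), round(expected_count))
```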

A random and independently chosen sample of four bags of horse​ carrots, each bag labeled 20​ pounds, had weights of 20.3​, 19.8​, 20.8​, and 20.0 pounds. Assume that the distribution of weights in the population is Normal. a. Test the hypothesis that the population mean weight is not 20 pounds. Which of the following correctly states H0 and Ha​? Find the test statistic. Find the​ p-value. Reject or do not reject H0. Choose the correct answer below. b. Test the hypothesis that the population mean is less than 20 pounds. Which of the following correctly states H0 and Ha​? Find the test statistic. Find the​ p-value. Reject or do not reject H0. Choose the correct answer below. c. Test the hypothesis that the population mean is more than 20 pounds. Which of the following correctly states H0 and Ha​?

(To find the test statistic and p-value, open in stat crunch, enter values in var1, then complete a "One sample T hypothesis test" [stat, t-stat, one sample with data, compute].) a. H0: μ=20 Ha: μ≠20 t=1.03 p=0.377 Do not reject H0. There is no reason to believe that the population mean is not 20 pounds on the basis of these data at a significance level of 0.05. b. H0: μ=20 Ha: μ<20 t=1.03 p=.812 Do not reject H0. There is no reason to believe that the population mean is less than 20 pounds on the basis of these data at a significance level of 0.05. c. H0: μ=20 Ha: μ>20 t=1.03 p=.19 Do not reject H0. There is no reason to believe that the population mean is more than 20 pounds on the basis of these data at a significance level of 0.05.
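
The t statistic and the three p-values can also be worked out by hand in Python. This is a sketch, not a general-purpose routine: the helper `t_cdf_df3` (a name made up here) uses the closed-form CDF of the t-distribution that holds only for 3 degrees of freedom, which is what four bags give.

```python
from math import atan, pi, sqrt
from statistics import mean, stdev

weights = [20.3, 19.8, 20.8, 20.0]  # pounds, from the question
mu0 = 20.0                          # hypothesized population mean
n = len(weights)

# One-sample t statistic: (sample mean - mu0) / (s / sqrt(n)).
t_stat = (mean(weights) - mu0) / (stdev(weights) / sqrt(n))


def t_cdf_df3(t: float) -> float:
    """CDF of Student's t with exactly 3 degrees of freedom (closed form)."""
    u = t / sqrt(3)
    return 0.5 + (u / (1 + u * u) + atan(u)) / pi


p_less = t_cdf_df3(t_stat)              # Ha: mu < 20 (left tail)
p_greater = 1 - p_less                  # Ha: mu > 20 (right tail)
p_two_sided = 2 * min(p_less, p_greater)  # Ha: mu != 20 (both tails)

print(round(t_stat, 2), round(p_two_sided, 3),
      round(p_less, 3), round(p_greater, 3))
```

The test statistic is the same in all three parts; only the tail used for the p-value changes with the alternative hypothesis.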

What is a placebo and what purpose does it serve in an​ experiment?

A placebo is a fake treatment that looks like the treatment being tested in the experiment. Placebos blind subjects so they do not know whether or not they are receiving the treatment.

A pair of college students have decided to test vitamin C to see whether it prevents colds. They recruit 500 students with a​ sign-up sheet, containing a numbered list. The first half of those on the sheet​ (Numbers 1-250) are asked to take 500 mg of vitamin C per​ day, and the second half are told not to use vitamin C. At the end of the school​ year, participants are asked how many colds they had. How would you improve this​ study, and​ why?

All are correct. The students should not know if they are taking vitamin C or a placebo. Half of the students should have been given a placebo. The students should be randomly assigned to the treatment. The researchers should not know who is taking vitamin C or the placebo.

Juries should have the same racial distribution as the surrounding communities. About 15​% of residents in a certain region are a specific race. Suppose a local court randomly selects 100 adult citizens of the region to participate in the jury pool. Use the Central Limit Theorem​ (and the Empirical​ Rule) to find the approximate probability that the proportion of available jurors of the above specific race is more than three standard errors from the population value of 0.15. The conditions for using the Central Limit Theorem are satisfied because the sample is​ random; the population is more than 10 times 100​; n times p is 15​, and n times​ (1 minus​ p) is 85​, and both are more than 10.

Because the sampling distribution for the sample proportion is approximately​ normal, it is known that the probability of falling within three standard errors is about 0.997. Therefore, the probability of falling more than three standard errors away from the mean is about 0.003.
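
To put numbers on this, the standard error and the "more than three standard errors" probability can be sketched in Python. This assumes the Normal approximation from the Central Limit Theorem, as the question does.

```python
from math import sqrt
from statistics import NormalDist

p = 0.15  # population proportion
n = 100   # size of the jury pool

# Standard error of the sample proportion.
se = sqrt(p * (1 - p) / n)

# Sample proportions three standard errors below and above 0.15.
lower = p - 3 * se
upper = p + 3 * se

# Area outside +/- 3 standard errors under the Normal model (both tails).
prob_beyond_3se = 2 * (1 - NormalDist().cdf(3))

print(round(se, 4), round(lower, 3), round(upper, 3), round(prob_beyond_3se, 4))
```

The Empirical Rule's 0.003 is a rounded version of this two-tail area.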

A group of educators want to determine how effective tutoring is in raising​ students' grades in a math​ class, so they arrange free tutoring for those who want it. Then they compare final exam grades for the group that took advantage of the tutoring and the group that did not. Suppose the group participating in the tutoring tended to receive higher grades on the exam. Does that show that the tutoring​ worked? If​ not, explain why not and suggest a confounding variable.

Because this was an observational​ study, it only shows an​ association; it does not show that the tutoring worked. It could be that more motivated students attended the tutoring and that was what caused the higher grades.

Match each boxplot with the corresponding histogram. Explain your reasoning.

Boxplot A corresponds to histogram 1 because both show a left-skewed distribution. Boxplot B corresponds to histogram 2 because both show a symmetric distribution. Boxplot C corresponds to histogram 3 because both show a right-skewed distribution.

In the mid-1800s, a doctor decided to make the doctors wash their hands with a strong disinfectant between patients at a clinic with a death rate of 10.1%. The doctor wanted to test the hypothesis that the death rate would go down after the new hand-washing procedure was used. What null and alternative hypotheses should he have used? Explain, using both words and symbols. Explain the meaning of any symbols you use. State the null and alternative hypotheses using words. State the null and alternative hypotheses using symbols. Choose the correct answer below.

H0​: The death rate has remained the same at 10.1​% after starting​ hand-washing. Ha​: The death rate has decreased to a value less than 10.1​%. H0​: p=0.101​, Ha​: p<0.101 ​(p is the proportion of deaths at the​ clinic)

A manufacturer withdrew Drug V from the market after a study revealed that its use was associated with an increase in the risk of heart attack. The experiment was​ placebo-controlled, randomized, and​ double-blind. Out of 1290 people taking Drug V there were 48 heart​ attacks, and out of 1275 people taking the​ placebo, there were 23 heart attacks. Perform a hypothesis test to test whether those who take Drug V have a greater rate of heart attack than those who take a placebo. Use a level of significance of 0.10. Can we conclude that Drug V causes an increased risk of heart​ attack? a. Find the test statistic for this test. b. Find the​ p-value for this test. c. What is the conclusion for this​ test? d. Can we conclude that Drug V causes an increased risk of heart​ attack?

H0: p1=p2 Ha: p1>p2 a. Use http://www.socscistatistics.com/tests/ztest/Default2.aspx z=2.959 b. The calculator lists a two-tailed p-value of 0.00308 after z; because Ha is one-sided, the p-value for this test is half of that, about 0.0015. c. Reject H0 d. Yes, because the experiment satisfies the cause-and-effect relationship conditions and because H0 was rejected.

Some experts believe that 16​% of all freshwater fish in a country have such high levels of mercury that they are dangerous to eat. Suppose a fish market has 200 fish​ tested, and 41 of them have dangerous levels of mercury. Test the hypothesis that this sample is not from a population with 16​% dangerous​ fish, assuming that this is a random sample. Use a significance level of 0.05. Comment on your conclusion. State the null and alternative hypotheses. Determine the​ z-test statistic. Find the​ p-value. Choose the correct conclusion. Comment on your conclusion. Are you saying that the percentage of dangerous fish is definitely 16​%? Explain.

H0​: p=0.16 Ha​: p≠0.16 z=1.736 https://www.easycalculation.com/statistics/test-for-one-proportion.php P-Value is 0.081859 (two tailed) Do not reject H0. The population proportion is not significantly different from 0.16 Since the null hypothesis was not​ rejected, the percentage of dangerous fish could be 16​%, but the actual population percentage is still unknown.
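
Here is a rough Python version of the one-proportion z-test, using the numbers from the question. It assumes the Normal approximation is reasonable (both n·p0 and n·(1-p0) are at least 10 here).

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.16  # hypothesized proportion of dangerous fish
n = 200    # fish tested
x = 41     # fish with dangerous mercury levels

p_hat = x / n                # sample proportion
se = sqrt(p0 * (1 - p0) / n)  # standard error computed under H0
z = (p_hat - p0) / se

# Two-sided alternative (p != 0.16): double the upper-tail area beyond |z|.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```

The p-value computed from the unrounded z comes out slightly different from the website's figure, which appears to use z rounded to two decimals; the conclusion is the same.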

A biologist is studying the effects that applying insecticide to a fruit farm has on the local bat population. She collects 23 bats and finds the mean weight of this sample to be 503.4 grams. Assuming the selected bats are a random​ sample, she concludes that because the sample mean is an unbiased estimator of the population​ mean, the mean weight of bats in the population is also 503.4 grams. Explain why this is an incorrect interpretation of an unbiased estimator.

Having an​ "unbiased" estimator means that the mean of the means of all possible samples of the same size would be the same as the population mean.

An association of Realtors reports​ state-by-state median​ existing-home prices for each quarter. Why do you suppose they use the median instead of the​ mean? What might be the disadvantage of reporting the​ mean?

Home prices are probably skewed to the right and not symmetric. This makes the median a better representation of the center than the mean which would be influenced by the extremely high priced homes. Reporting the mean would give the impression that the​ "typical" home price is higher than it is.

Eric randomly surveyed 150 adults from a certain city and asked which team in a contest they were rooting​ for, either North High School or South High School. Of the surveyed​ adults, 96 said they were rooting for North High while the rest said they were rooting for South High. Eric wants to determine if this is evidence that more than half the adults in this city will root for North High School. Suppose a​ p-value from the correct hypothesis test was 0.0030. Which of the following is a correct interpretation of this​ p-value?

If half of all adults in this city root for North​ High, 3 out of every 1000 random samples of the same size from this population would produce the same result observed in this study or a result more unusual.

Suppose the equation of a least-squares regression line is ŷ = −3.17 − 2.4x. What can be said about the correlation coefficient?

It is​ negative, but its exact value cannot be determined from the given information.

According to the ancient Roman architect​ Vitruvius, a​ person's armspan​ (the distance from fingertip to fingertip with the arms stretched​ wide) is approximately equal to his or her height. For​ example, people 5 feet tall tend to have an armspan of 5 feet.​ Explain, then, why the distribution of armspans for a class containing roughly equal numbers of men and women might be bimodal.

Men and women tend to have different heights and therefore different armspans.

The accompanying table shows the​ round-trip fare for flights from one city to 10 other cities on major airlines. These were the lowest prices found online in early 2010. The airlines​ varied, but the travel dates were all the same. How much would it​ cost, on​ average, to fly 500​ miles? To answer this​ question, perform a complete regression​ analysis, including a scatterplot with a regression line. a. Draw a scatterplot for the​ round-trip flight data. Be sure that miles is the​ x-variable and cost is the​ y-variable. Choose the correct scatterplot below. b. Is the linear model​ appropriate? c. Determine the regression line for the​ round-trip flight data. d. Add the regression line onto the scatterplot. Choose the correct graph below. e. Interpret the slope and the intercept in the context of the problem. f. Using the regression​ line, how much would it​ cost, on​ average, to fly 500​ miles?

Open in stat crunch and create a scatterplot, then create a simple linear regression. a. See picture b. The linear model is appropriate because there is a linear trend in the data. c. Predicted Cost=165.84+.08 Miles d. See picture e. For every additional mile, on average, the price goes up by .08 dollars. A trip of zero miles would cost about 165.84 dollars. However, a trip would never be exactly zero miles, so this cost does not make sense in the context of the problem. f. It would cost, on average, about $205.84 to fly 500 miles.
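
Part f is just the regression equation evaluated at 500 miles. A tiny Python sketch, taking the intercept and slope from part c as given (the function name is made up):

```python
INTERCEPT = 165.84  # dollars, predicted cost at zero miles (from part c)
SLOPE = 0.08        # dollars per additional mile (from part c)


def predicted_cost(miles: float) -> float:
    """Predicted round-trip cost from the fitted line: intercept + slope * miles."""
    return INTERCEPT + SLOPE * miles


cost_500 = predicted_cost(500)
print(round(cost_500, 2))
```

The line should only be used for distances within the range of the data in the table.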

The accompanying table shows the number of millionaires​ (in thousands) and the population​ (in hundreds of​ thousands) for 10 states. a. Without doing any​ calculations, predict whether the correlation and slope will be positive or negative. Explain your prediction. b. Make a scatterplot with the population​ (in hundreds of​ thousands) on the​ x-axis and the number of millionaires​ (in thousands) on the​ y-axis. Choose the correct scatterplot below. c. Find the numerical value for the correlation. d. Find the value of the slope and explain what it means in context. Be careful with the units. e. Explain why interpreting the value for the intercept does not make sense in this situation.

Open in stat crunch and perform a simple linear regression. a. The correlation and slope will be positive because states with a larger population will most likely have a larger number of millionaires. b. See the picture c. The correlation is .993 d. The slope is 1.9148. This slope indicates that for every increase of one hundred thousand in population, the number of millionaires will increase, on average, by 1.9148 thousand. e. In this situation, the intercept would be the number of millionaires that live in a state with a population of zero. Since there would be no population in the state, then any number of millionaires besides zero would not make sense.

The table shows a list of the weights and prices of some turkeys at different supermarkets. a. Make a scatterplot with weight on the​ x-axis and cost on the​ y-axis. Choose the correct graph below. b. Find the numerical value for the correlation between weight and price. Explain what the positive value of the correlation means. c. Report the equation of the best straight​ line, using weight as the predictor​ (x) and cost as the response​ (y). d. Insert the line on the scatterplot. e. Report the slope and intercept of the regression line and explain what they show. If the intercept is not appropriate to​ report, explain why. Report the slope of the regression line and explain what it shows. Select the correct choice below and fill in the answer box within your choice. f. Add a new point to your​ data, a​ 30-pound turkey that is free. Give the new value for r and the new regression equation. Explain what the negative correlation implies. What​ happened? State the new value for r. Determine the new regression equation What does the negative correlation​ imply? What happened when the new data point was​ added?

Open in stat crunch and perform a simple linear regression. a. See picture b. r=.935 A positive correlation suggests that larger turkeys tend to have a higher price. c. Predicted Price=-5.861+(1.637)Weight d. see picture e. see picture f. Add another turkey and complete another simple linear regression. See picture for parts one and two. A negative correlation suggests that larger turkeys tend to have a lower price. The 30-pound free turkey was an influential point, which really changed the results.

A pair of surfers collected data on the​ self-reported numbers of days surfed in a month for 30 longboard surfers and 30 shortboard surfers. a. Compare the typical number of days surfing for these two groups. b. Compare the interquartile ranges.

Open in StatCrunch and perform stat>summary stat>column for each, then use data to answer. The median for the longboards was 12 days, and the median for the shortboards was 10.5 days, showing that those with longboards typically surfed more days in this month. The interquartile range for the longboards was 11 days, and the interquartile range for the shortboards was 9 days, showing more variation in the days surfed this month for the longboards.

a. If a​ rifleman's gunsight is adjusted​ incorrectly, he might shoot bullets consistently close to 2 feet left of the​ bull's-eye target. Draw a sketch of the target with the bullet holes. Does this show lack of precision or​ bias? b. Draw a second sketch of the target if the shots are both unbiased and precise​ (have little​ variation). The​ rifleman's aim is not​ perfect, so your sketches should show more than one bullethole.

Picture one shows bias

According to a candy​ company, packages of a certain candy contain 30​% orange candies. Find the approximate probability that the random sample of 50 will contain 38​% or more orange candies.

See picture for math. Therefore, the probability that the random sample of 50 candies will contain 38​% or more orange candies is approximately 0.109.
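
The picture with the math is not reproduced here. As a rough sketch of the usual normal-approximation calculation for a sample proportion (the helper name `upper_tail` is only illustrative, and the approach assumes the Central Limit Theorem conditions hold for n = 50):

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard Normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

p = 0.30        # claimed population proportion of orange candies
n = 50          # sample size
p_hat = 0.38    # sample proportion of interest

se = math.sqrt(p * (1 - p) / n)   # standard error of the sample proportion
z = (p_hat - p) / se              # z-score of 0.38
prob = upper_tail(z)

print(f"SE = {se:.4f}, z = {z:.2f}, P(p_hat >= 0.38) = {prob:.3f}")
```

The result should agree with the 0.109 quoted above to about three decimals.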

A sample of thirty users of a popular social networking site yielded the histogram on the right for the number of friends. What is the relationship between the mean and the median for this​ data?

Since the distribution is skewed to the​ right, the mean will be pulled to the right by the tail. This will make the mean larger than the median. The mean will be substantially larger than the median since the distribution is skewed right.

A researcher wants to assess the effects of taking prenatal vitamins on the health of​ newborns, using the newborn weight as the response variable. Explain why it might be inappropriate to use a designed experiment to address this research objective.

Since there is a perceived benefit to taking prenatal​ vitamins, there would be ethical issues in intentionally denying them to some pregnant women.

Assume that half of all children born are male and half are female. In all of the following​ cases, we will assume that there are no twins​ (or triplets or​ more) and that the conditions of the binomial model are satisfied. a. If a woman plans to have two​ children, what is the probability that both will be boy​s? b. If a woman plans to have eight ​children, determine the probability that all will be boys. c. If a woman plans to have eight ​children, determine the probability that she will have at least one girl. ​("At least one girl​" is the complement of​ "all boy​s.") d. Determine if this means that the more children a woman​ has, the more likely she will be to have at least one girl. Explain.

Since there will be 2​ children, n=2. Since half of all children are born​ male, p=0.5. The desired number of boys is​ two, so x=2. ​Thus, ​b(n,p,x)=​b(2​,0.5,2​). Use http://stattrek.com/online-calculator/binomial.aspx Enter .5,2,2 in the first three blanks. The fourth blank reads .25 a. 0.25 b. b(8,.5,8) = 0.0039 c. 1-0.0039 = 0.9961 d. Yes, the more children she​ has, the more likely she will be to have at least one girl.
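
If the online calculator is not available, the same binomial probabilities can be computed directly. This is a minimal sketch; the function name `binom_pmf` is an assumption chosen to mirror the b(n,p,x) notation used above.

```python
from math import comb

def binom_pmf(n, p, x):
    """b(n, p, x): probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

both_boys = binom_pmf(2, 0.5, 2)         # part a
all_eight_boys = binom_pmf(8, 0.5, 8)    # part b
at_least_one_girl = 1 - all_eight_boys   # part c, by the complement rule

print(both_boys, round(all_eight_boys, 4), round(at_least_one_girl, 4))
```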

Identify when the interquartile range is better than the standard deviation as a measure of dispersion and explain its advantage.

The IQR is resistant to extreme values in the​ data, making it a better choice for a skewed distribution. When the distribution is skewed left or right or contains some extreme​ observations, then the interquartile range is preferred since it is resistant.

A sociologist​ says, "Typically, men in a certain country still earn more than​ women." What does this statement​ mean?

The center of the distribution of salaries for men in the country is greater than the center for women.

If the correlation between height and weight of a large group of people is 0.62​, find the coefficient of determination​ (as a​ percent) and explain what it means. Assume that height is the predictor and weight is the​ response, and assume that the association between height and weight is linear.

The coefficient of determination is 38.44​%. ​Therefore, 38.44​% of the variation in weight can be explained by the regression line.

A dieter recorded the number of calories he consumed at lunch for one week. As you can​ see, a mistake was made on one entry. The calories are listed in increasing order below. 349​, 371​, 386​, 398​, 412​, 4190 When the error is corrected by removing the extra​ 0, will the mean​ change? Will the​ median? Explain without doing any calculation.

The corrected value will give a different mean but not a different median. Medians are resistant to outliers and not as affected by extreme​ values, but the more extreme a value​ is, the more the mean is affected by it.
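
The question asks for an explanation without calculation, but a quick check can confirm the reasoning. A small sketch using the calorie values from the question, with 4190 corrected to 419:

```python
from statistics import mean, median

recorded = [349, 371, 386, 398, 412, 4190]   # contains the mistaken entry 4190
corrected = [349, 371, 386, 398, 412, 419]   # extra 0 removed

# The median depends only on the two middle values, so the fix leaves it alone.
print("recorded: ", round(mean(recorded), 2), median(recorded))
print("corrected:", round(mean(corrected), 2), median(corrected))
```

The mean drops sharply once the extreme value is corrected, while the median is the same for both lists.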

A collection of data on class sizes at a community college produces the​ five-number summary below. Comment on the shape of the distribution of class sizes. Min=12 Q1=22 Q2=35 Q3=38 Max=40

The distribution appears to be skewed left since the median is further from the first quartile than the third quartile.​ Also, the left whisker would be longer than the right whisker in a boxplot for the data.

One histogram shows the distribution of the length of a certain person's cell phone calls for one month, and the other shows many sample means, in which each is the mean length for 20 randomly selected calls from the same person during that month. Which is which? Explain.

The distribution of the length of this​ person's cell phone calls for the month is shown in Histogram B, because this distribution has a larger standard​ deviation, and is skewed in one direction. The distribution of many sample means for randomly selected calls from the same person during the month is shown in Histogram A, because this distribution has a smaller standard​ deviation, and is approximately Normal.

Exam 1 scores have a mean of 530 and a standard deviation of 100​, while exam 2 scores have a mean of 23 and a standard deviation of 4. Assuming both types of scores have distributions that are unimodal and​ symmetric, which is more​ unusual: an exam 1 score of 750 or an exam 2 score of 29​?

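The exam 1 score is more unusual. Its z-score is (750-530)/100 = 2.2, while the exam 2 score has a z-score of (29-23)/4 = 1.5, so the exam 1 score lies farther from its mean in standard-deviation units.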

The data were collected from a statistics class. The column head gives the​ variable, and each of the rows represents a student in the class. Find the​ frequency, proportion, and percentage of women.

The frequency of women in the class is 4. The proportion of women in the class is 4/11. The percentage of women in the class is 36.4​%.

A sample of thirty users of a popular social networking site yielded the histogram on the right for the number of friends. Which measure of central tendency better describes the​ "center" of the​ distribution?

The median is a better measure of the center of the data since the distribution is skewed to the right.

A study was done to see whether a smaller dose of flu vaccine could be used successfully. In this​ study, the usual amount of vaccine was injected into half the​ patients, and the other half of the patients had only a small amount of vaccine injected. The response was measured by looking at the production of antibodies. In the​ end, the lower dose of vaccine was just as effective as a higher dose for those under 65 years old. What more do we need to know to be able to conclude that the lower dose of vaccine was equally effective at preventing the flu for those under​ 65?

The patients need to be randomly assigned the full or lower dose. Without randomization there could be bias; with randomization, we can infer causation.

Two sections of statistics are​ offered, the first at 8 a.m. and the second at 10 a.m. The 8 a.m. section has 25​ women, and the 10 a.m. section has 15 women. A student claims this is evidence that women prefer earlier statistics classes than men do. What information is missing that might contradict this​ claim?

The percentage of female students in the two classes is unknown. There may be more females in the 8 a.m. because there are more students in the 8 a.m. class than the 10 a.m. class. This claim could be true only if the classes were the same size.

A university conducted a survey of 274 ​Sophomore, Junior, and Senior undergraduate students regarding satisfaction with student government. Results of the survey are shown in the table by class rank. A survey participant is selected at random. What is the probability that he or she is a Senior or neutral​?

The probability that a randomly chosen participant is a Senior or neutral is .464 {[(98+41)-12]/274}
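
The expression in braces is the addition rule, P(A or B) = P(A) + P(B) - P(A and B). A short sketch follows; the survey table is not shown here, so reading 98 as the Senior count, 41 as the neutral count, and 12 as the overlap is an assumption based on that expression.

```python
total = 274
seniors = 98              # assumed: number of Seniors surveyed
neutral = 41              # assumed: number of neutral respondents
senior_and_neutral = 12   # assumed: Seniors who were neutral (counted in both groups)

# Subtract the overlap once so those respondents are not double-counted.
p_senior_or_neutral = (seniors + neutral - senior_and_neutral) / total
print(round(p_senior_or_neutral, 3))
```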

A university conducted a survey of 281 Sophomore, Junior, and Senior undergraduate students regarding satisfaction with student government. Results of the survey are shown in the table by class rank. A survey participant is selected at random. What is the probability that he or she is a Sophomore or Junior?

The probability that a student is a Sophomore or a Junior is .648 [(81+101)/281]

A university conducted a survey of 273 ​Sophomore, Junior, and Senior undergraduate students regarding satisfaction with student government. Results of the survey are shown in the table by class rank. a. If one survey participant is randomly​ selected, what is the probability that he or she is a Senior​? b. If one survey participant is randomly​ selected, what is the probability that he or she is satisfied​?

a. The probability that one randomly selected survey participant is a Senior is .352 (96/273). b. The probability that one randomly selected survey participant is satisfied is .637 (174/273).

In the fall of​ 2008, a country has a total of 175 government officials. Of the total number of government officials 17 were female. For the year​ 2008, find a​ 95% confidence interval for the percentage of government officials who were female or explain why you should not find a confidence interval for the percentage of government officials who were female in 2008.

The proportion of 17/175 is the population​ proportion, not a sample proportion. You should not find a confidence interval unless you have a sample and are making statements about the population from which the sample has been drawn.

Judging on the basis of​ experience, a politician claims that 52​% of voters in a certain area have voted for an independent candidate in past elections. Suppose you surveyed 25 randomly selected people in that​ area, and 18 of them reported having voted for an independent candidate. The null hypothesis is that the overall proportion of voters in the area that have voted for an independent candidate is 52​%. What value of the test statistic should you​ report?

The test statistic is z=2.
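
The card gives only the final value. A minimal sketch of the one-proportion z statistic behind it, using the standard error computed under the null hypothesis (variable names are illustrative):

```python
import math

p0 = 0.52        # null-hypothesis proportion
n = 25           # number of people surveyed
p_hat = 18 / n   # sample proportion, 0.72

se = math.sqrt(p0 * (1 - p0) / n)   # standard error under the null
z = (p_hat - p0) / se
print(round(z, 2))
```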

For the given pair of events, classify the two events as independent or associated. Event 1: combing your hair and dressing nice. Event 2: success at a job interview.

The two events are associated because the occurrence of one affects the probability of the occurrence of the other.

A teacher at a community college sent out questionnaires to evaluate how well the administrators were doing their jobs. All teachers received​ questionnaires, but only​ 10% returned them. Most of the returned questionnaires contained negative comments about the administrators. Explain how an administrator could dismiss the negative findings of the report.

There is nonresponse bias. The results could be biased because the small percentage who chose to return the survey might be very different from the majority who did not return the survey.

The mother of a teenager has heard a claim that 24​% of teenagers who drive and use a cell phone reported texting while driving. She thinks that this rate is too high and wants to test the hypothesis that fewer than 24​% of these drivers have texted while driving. Her alternative hypothesis is that the percentage of teenagers who have texted when driving is less than 24​%. H0: p=0.24 Ha: p<0.24 She polls 40 randomly selected​ teenagers, and 3 of them report having texted while​ driving, a proportion of 0.075. The​ p-value is 0.007. Explain the meaning of the​ p-value in the context of this question.

The​ p-value says that if the true proportion of teenagers who text while driving is 0.24, then there is only a 0.007 probability that one would get a sample proportion of 0.075 or smaller with a sample size of 40.

Tyler is interested in whether Proposition P will be passed in the next election. He goes to the university library and takes a poll of 100 students. Since 59​% favor Proposition​ P, Tyler believes it will pass. Explain what is wrong with his approach.

Tyler took a convenience sample. The students may not be representative of the voting​ population, so the proposition may not pass.

A random sample of students at a college reported what they believed to be their heights in inches. Then the students measured each others' heights in centimeters, without shoes. The data provided are for the men, with their believed heights converted from inches to centimeters. Assume that conditions for t-tests hold. b. Perform a t-test to test the hypothesis that the means are not the same. Use a significance level of 0.05. Determine the hypotheses for this test. Let μdifference be the population mean difference between measured and believed height, in centimeters. Find the test statistic for this test.

Use StatCrunch (Stat > T Stats > Paired, with measured and believed heights as the two samples; select Confidence interval for μD = μ1 - μ2, enter 0.95, and compute). Use L. Limit and U. Limit as the answer. The interval does include 0, so a hypothesis that the means are equal cannot be rejected. H0: μdifference=0 Ha: μdifference≠0 t=0.00 p=1 Do not reject H0. The means of measured and believed heights are not significantly different.

A random sample of 21 independent female​ college-aged dancers showed a sample mean height of 64.8 inches and a sample standard deviation of 1.7 inches. Assume that this distribution of heights is Normal. a. Use technology to find a​ 95% confidence interval for the population mean height of dancers and interpret the interval. b. Use technology to find a​ 99% confidence interval for the population mean height and interpret the interval. c. Which interval is wider and​ why?

Use http://www.sample-size.net/confidence-interval-mean/ a. We are​ 95% confident that the population mean height of female​ college-aged dancers is between 64.0 and 65.6 b. We are​ 99% confident that the population mean height of female​ college-aged dancers is between 63.7 and 65.9. c. The​ 99% interval is wider because it has a greater confidence​ level, and therefore we use a bigger value of​ t*, which creates a wider interval
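
For readers without the online calculator, the intervals can be approximated by hand as mean ± t* × s/√n. The sketch below hard-codes t* for 20 degrees of freedom from a standard t-table; that choice is an assumption standing in for whatever the calculator computes internally.

```python
import math

n, xbar, s = 21, 64.8, 1.7
se = s / math.sqrt(n)   # standard error of the mean

# Two-sided critical values for df = n - 1 = 20, read from a t-table.
t_star = {0.95: 2.086, 0.99: 2.845}

intervals = {}
for level, t in t_star.items():
    margin = t * se
    intervals[level] = (xbar - margin, xbar + margin)
    print(f"{level:.0%} CI: ({xbar - margin:.1f}, {xbar + margin:.1f})")
```

The larger t* for 99% is what makes that interval wider, as part c states.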

A true/false test has 40 questions. Suppose a passing grade is 30 or more correct answers. Test the claim that a student knows more than half of the answers and is not just guessing. Assume the student gets 30 answers correct out of 40. Use a significance level of 0.05. Steps 1 and 2 of a hypothesis test procedure are given below. Show step 3, finding the test statistic and the p-value and step 4, interpreting the results.

Use an online one-proportion z-test calculator to get z=3.162. The p-value is 0.000783. The result is significant at the 0.05 level. Reject H0. The probability of doing this well by chance alone is so small that it can be concluded that the student is not guessing.

Suppose you are taking an exam with 10 questions and you are required to get 5 or more right answers to pass. a. With a 10-question true/false test, find the probability of getting at least 5 answers correct by guessing. b. With a 10-question multiple-choice test where there are four possible choices for each question, find the probability of getting at least 5 answers correct by guessing. Only one of the choices is correct for each question. c. With a 10-question multiple-choice test where there are five possible choices for each question, find the probability of getting at least 5 answers correct by guessing. Only one of the choices is correct for each question. d. Determine which test (of those described in parts a, b, and c) would be easiest to pass by guessing, which would be hardest, and why.

Use the table to find b(10,.5,5), then b(10,.5,6), then b(10,.5,7)...b(10,.5,10). Add all values together. a. 0.623 b. Use the same equation as part a, but replace .5 with .25 The answer is 0.077 c. Use the same equation as above, but use .20 The answer is 0.033 d. The​ true/false test would be easiest to pass because of the​ 50% chance of guessing right on each question. The​ multiple-choice test with five choices per question would be hardest to pass because of the low​ (20%) chance of guessing right on each question.
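
The "add all values together" step can also be done without a table. A sketch of the exact sum follows; the function name `prob_at_least` is illustrative.

```python
from math import comb

def prob_at_least(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p), summing b(n, p, x) for x = k..n."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

results = {}
for label, p in [("true/false", 0.5), ("four choices", 0.25), ("five choices", 0.20)]:
    results[label] = prob_at_least(10, p, 5)
    print(f"{label}: {results[label]:.4f}")
```

The exact sums come out to about 0.623, 0.078, and 0.033. The 0.077 quoted for part b presumably reflects adding table entries that were each rounded to three decimals.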

Days before a presidential​ election, a nationwide random sample of registered voters was taken. Based on this random​ sample, it was reported that​ "52% of registered voters plan on voting for Robert Smith with a margin of error of ±​3%." The margin of error was based on a​ 95% confidence level. Fill in the blanks to obtain a correct interpretation of this confidence interval. We are​ ___________ confident that the​ ___________ of registered voters​ ___________ planning on voting for Robert Smith is between​ ___________ and​ ___________.

We are 95% confident that the percentage of registered voters in the nation planning on voting for Robert Smith is between 49% and 55%.

When comparing two sample proportions with a​ two-tailed alternative​ hypothesis, all other factors being​ equal, will you get a smaller​ p-value if the sample proportions are close together or if they are far​ apart? Explain.

You will get a smaller​ p-value if the sample proportions are far apart. Assuming the standard errors are the​ same, the farther apart the two proportions​ are, the larger the absolute value of the numerator of​ z, and therefore the larger the absolute value of z and the smaller the​ p-value.

Suppose that 50 statistics students each took a random sample​ (with replacement) of 50 students at their college and recorded the ages of the students in their sample. Then each student used his or her data to calculate a 90​% confidence interval for the mean age of all students at the college. How many of the 50 intervals would you expect to capture the true population mean​ age, and how many would you expect not to capture the true population​ mean? Explain by showing your calculation. a. The number of intervals expected to capture the true population mean is b. The number of intervals expected to not capture the true population mean is The expression _____ can be used to find the number of intervals expected to capture the true population mean.

a. (.90)50=45 b. 50-45 or (.10)50=5 c. 0.90(50)

A​ true/false test has 200 questions. A passing grade is 55​% or more correct answers. a. What is the probability that a person will guess correctly on one​ true/false question? b. What is the probability that a person will guess incorrectly on one​ question? c. Find the approximate probability that a person who is just guessing will pass the test. d. If a similar test were given with​ multiple-choice questions with four choices for each​ question, would the approximate probability of passing the test by guessing be higher or lower than the approximate probability of passing the​ true/false test?​ Why?

a. 0.5 b. 0.5 c. See picture. ​Therefore, the probability of a person who is just guessing passing the test is 0.0764. d. ​Lower, because the probability of guessing correctly on each question is lower when there are four options.

There are four suits: clubs, diamonds, hearts, and spades, and the following cards appear in each suit: Ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, King. The Jack, Queen, and King are called face cards because they have a drawing of a face on them. Diamonds and hearts are red, and clubs and spades are black. If you draw 1 card randomly from a standard 52-card playing deck, what is the probability that it will be: a. A 4? b. A red card? c. A spade? d. A face card?

a. 1/13 b. ½ c. ¼ d. 3/13

A poll asked for​ people's opinion on whether closing local newspapers would hurt civic​ life; 437 of 1008 respondents said it would hurt civic life a lot. a. Find the proportion of the respondents who said that closing local papers would hurt civic life a lot. b. Find a​ 95% confidence interval for the population proportion who believed closing newspapers would hurt civic life a lot. Assume the poll used a simple random sample​ (SRS). (In​ fact, it used random​ sampling, but a more complex method than​ SRS.) c. Find an​ 80% confidence interval for the population proportion who believed closing newspapers would hurt civic life a lot using the same solution process as finding the​ 95% confidence interval. d. Which interval is wider and​ why?

a. 437/1008=0.434 b. See picture. SE=0.0156 0.0156*1.96=0.0306 (use 1.96 because it corresponds with 95%) 0.434-0.0306 = 0.403 0.434+0.0306 = 0.465 Thus, the​ 95% confidence interval is (0.403,0.465) c. Same as part b, but sub 1.28 for 1.96 (because the confidence interval 80% corresponds with 1.28) The​ 80% confidence interval is ​(0.414​,0.454​). d. The​ 95% interval is wider. To get a higher degree of​ certainty, the interval needs to be widened.
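
Since the picture is not shown, here is a hedged sketch of the same calculation, p̂ ± z* × SE. It looks up z* with Python's `statistics.NormalDist` rather than using the rounded 1.96 and 1.28, and it carries p̂ unrounded, so the last digit can differ slightly from the hand calculation above.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, level):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    z_star = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    margin = z_star * se
    return p_hat - margin, p_hat + margin

ci95 = proportion_ci(437, 1008, 0.95)
ci80 = proportion_ci(437, 1008, 0.80)
print([round(v, 3) for v in ci95], [round(v, 3) for v in ci80])
```

The 95% interval is wider than the 80% interval because z* grows with the confidence level.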

The quantitative scores on a test are approximately Normally distributed with a mean of 500 and a standard deviation of 100. On the horizontal axis of the​ graph, indicate the test scores that correspond with the provided​ z-scores. Answer the questions using only your knowledge of the Empirical rule and symmetry. Indicate the test scores that correspond with the provided​ z-scores. a. Roughly what percentage of students earn quantitative test scores more than​ 500? b. Roughly what percentage of students earn quantitative test scores between 400 and​ 600? c. Roughly what percentage of students earn quantitative test scores more than​ 800? d. Roughly what percentage of students earn quantitative test scores less than​ 200? e. Roughly what percentage of students earn quantitative test scores between 300 and​ 700? f. Roughly what percentage of students earn quantitative test scores between 700 and​ 800?

a. 50% b. 68% c. About 0% d. About 0% e. 95% f. 2.5%

A large collection of​ one-digit random numbers should have about​ 50% odd and​ 50% even digits because five of the ten digits are odd​ (1, 3,​ 5, 7, and​ 9) and five are even​ (0, 2,​ 4, 6, and​ 8). a. Find the proportion of​ odd-numbered digits in the following lines from a random number table. Count carefully. b. Does the proportion found in part​ (a) represent ^p ​(the sample​ proportion) or p​ (the population​ proportion)? c. Find the error in this​ estimate, the difference between ^p and p​ (or ^p−​p).

a. 53.33% (the actual percentage of odd digits shown) b. ^p (the sample proportion) c. 3.33% (53.33%-50%)

The bar chart shows the projected percentage of residents of a certain country in different age categories by year. a. Comment on the predicted changes from 2010 to 2030. Which age groups are predicted to become at least several percentage points​ larger? Select all that apply. b. Which age groups are predicted to become at least several percentage points​ smaller? Select all that apply. c. Which age groups are predicted to stay roughly the same​ size? Select all that apply. d. Comment on the effect this might have on a government program that collects money from those currently working and gives it to retired people. Choose the correct answer below.

a. 65 and older b. 25-64 c. 24 and below d. The money for the program normally comes from those in the 25-64 range, a group that is decreasing in percentage. The group receiving the money, those that are 65 and over, is increasing in percentage. This suggests that, in the future, the government program will not be able to provide the retirees enough money.

The table below summarizes results from a survey that asked about political party affiliation and​ self-described political orientation.​ (Dem means​ Democrat, and Rep means​ Republican.) a. Find the probability that a randomly chosen respondent is a Republican given that he or she is liberal. In other​ words, what percentage of the liberals are Republican? b. Find the probability that a randomly chosen respondent is a Republican given he or she is conservative. In other​ words, what percentage of the conservatives are Republican? c. Which respondents are more likely to be Republican, the liberal or the conservative​ respondents?

a. 7.9% b. 52.8% c. The conservative respondents are more likely to be Republican.

In a 2008​ survey, people were asked their opinions on astrology​ - whether it was very​ scientific, somewhat​ scientific, or not at all scientific. Of 1446 who​ responded, 78 said astrology was very scientific. a. Find the proportion of people in the survey who believe astrology is very scientific. b. Find a​ 95% confidence interval for the population proportion with this belief. c. Suppose a TV news anchor said that​ 5% of people in the general population think astrology is very scientific. Would you say that is​ plausible? Explain your answer.

a. 78/1446=0.0539 b. See picture. SE=0.0059 0.0059*1.96=0.0116 0.0539 +/- 0.0116 ​Thus, the​ 95% confidence interval is (0.042,0.066). c. The value​ 5% is within the interval because 0.05 is greater than 0.042 and less than 0.066. This is plausible because​ 5% is inside the interval.

The boxplot shows the number of millionaires by region per 1000 residents for a certain country. Assume all distributions are unimodal. a. List the regions from lowest to highest in terms of the median rate of millionaires in that region. b. Which region has the largest interquartile range? c. Which region has the smallest interquartile range? d. Which region has potential outliers? e. Why is a rate of about 24 millionaires per 1000 people in Region A a potential outlier, while the rate of about 25.5 in Region C is not a potential outlier?

a. A, D, B, C b. Region C has the largest interquartile range. c. Region B has the smallest interquartile range. d. Region A has potential outliers. e. Potential outliers are observations that are a distance of more than 1.5 interquartile ranges below the first quartile or above the third quartile. The rate of 24 in Region A satisfies this criterion, while the rate of 25.5 in Region C does not.

The ten​ top-grossing movies of a certain film studio are​ shown, in millions of dollars. a. Arrange the gross income from smallest​ (on the​ left) to largest​ (showing the​ arrangement), and find the median by averaging the two middle numbers. Interpret the median in context. b. Using the sorted​ data, find Q1 and Q3. Then find the interquartile range and interpret it in context. c. Interpret the interquartile range in context.

a. Arrange the gross incomes from smallest to largest: 113, 123, 130, 148, 167, 187, 208, 214, 227, 292. Find the median by averaging the two middle numbers: (167+187)/2 = 177. Interpret the median in context: 177 million dollars is the typical income for the studio's top 10 grossing movies. b. Find Q1 and Q3: Q1=130, Q3=214. The interquartile range is IQR = 214-130 = 84. c. The interquartile range is the range of the middle 50% of the sorted incomes of the top ten grossing movies.
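
The quartiles here are the medians of the lower and upper halves of the sorted data. A small sketch of that method follows; note that other software may use a different quartile convention and report slightly different values.

```python
from statistics import median

gross = sorted([113, 123, 130, 148, 167, 187, 208, 214, 227, 292])

half = len(gross) // 2
med = median(gross)          # average of the two middle values
q1 = median(gross[:half])    # median of the lower half
q3 = median(gross[-half:])   # median of the upper half
iqr = q3 - q1

print(med, q1, q3, iqr)
```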

The boxplot shows the number of millionaires by region per 1000 residents for a certain country. Assume all distributions are unimodal. a. List the regions from lowest to highest in terms of the median rate of millionaires in that region. b. Which region has the largest interquartile range? c. Which region has the smallest interquartile range? d. Which region has potential outliers? e. Why is a rate of about 24 millionaires per 1000 people in Region B a potential outlier, while the rate of about 24.5 in Region C is not a potential outlier?

a. B, D, A, C b. Region D has the largest interquartile range. c. Region B has the smallest interquartile range. d. Region B has potential outliers. e. Potential outliers are observations that are a distance of more than 1.5 interquartile ranges below the first quartile or above the third quartile. The rate of 24 in Region B satisfies this criterion, while the rate of 24.5 in Region C does not.

The prices of a sample of books at University A were obtained by two statistics students. Then the cost of books for the same subjects (at the same level) were obtained for University B. Assume that the distribution of differences is Normal enough to proceed, and assume that the sampling was random. a. First find both sample means and compare them. Find the sample mean for University A. Find the sample mean for University B. Compare the sample means. b. Test the hypothesis that the population means are different, using a significance level of 0.05. Determine the hypotheses for this test. Let μdifference be the mean of differences between the paired observations. c. Find the test statistic for this test. Find the p-value for this test. d. What is the conclusion for this test?

a. Compute the means on a calculator: $72.67 at University A and $77.87 at University B. The sample mean is larger at University B. b. (Use StatCrunch: Stat > T Stats > Paired, with the data.) H0: μdifference=0 Ha: μdifference≠0 c. t=-3.64, p-value=0.003 d. Reject H0. There is evidence that the mean of the differences is not 0.

Explain the difference between sampling with replacement and sampling without replacement. Suppose you had the names of 10​ students, each written on a 3 by 5​ notecard, and want to select two names. Describe both procedures. a. Describe sampling with replacement. b. Describe sampling without replacement.

a. Draw a​ notecard, note the​ name, replace the notecard and draw again. It is possible the same student could be picked twice. b. Draw a​ notecard, note the​ name, do not replace the notecard and draw again. It is not possible the same student could be picked twice.

You have sent out 3000 invitations to hear a speaker, and you must rent chairs for the people who come. In the past, usually about 21% of the people invited have come to hear the speaker. a. On average, what proportion of those invited should we expect to attend? b. Suppose you assume that 21.5% of those invited will attend, and so you rent 645 chairs (because 0.215 times 3000 is 645). What is the approximate probability that more than 21.5% of those invited will show up and you will not have enough chairs? Refer to the TI-83/84 output given. Recall that this gives the Normal cumulative probability in the following format: Normalcdf(left boundary, right boundary, mean, standard deviation) c. What is the approximate probability that more than 23% of the 3000 invited will show up? d. How many chairs would you have to rent if exactly 23% of those invited attended? e. Why is your answer to part c smaller than your answer to part b?

a. Expect 0.21 of those invited to attend. b. The mean is 0.21. The standard deviation is about 0.007. Notice that the calculator output gives the area to the right of the first value since the upper bound is 1, or 100% attendance. P(x>0.215)=0.2507 [see picture]. On a well-labeled sketch of the Normal curve, the mean 0.21 is at the peak and the shaded area to the right of 0.215 is approximately 0.2507 [see picture]. c. z=(0.23-0.21)/0.007=2.86. Recall that the standard normal distribution table gives the area to the left of z. From the table, P(z<2.86)≈0.9979, and 1-0.9979=0.0021. Therefore, the probability that more than 23% of those invited actually attend is 0.0021. d. Calculate 23% of 3000, assuming that each person attending will need a chair. Number of chairs needed = 0.23•3000 = 690. e. The answer to part c is smaller because 23% is farther out in the right tail than 21.5%, and it is the tail area that gives the probability of interest.
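
As a check on parts b and c, here is a sketch that plays the role of Normalcdf using only Python's standard library (the helper `upper_tail` is an illustrative name). It also shows how much rounding the standard deviation to 0.007 matters in part c.

```python
import math

def upper_tail(x, mean, sd):
    """P(X > x) for a Normal(mean, sd) variable."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2)))

p, n = 0.21, 3000
sd = math.sqrt(p * (1 - p) / n)   # about 0.0074 before rounding

part_b = upper_tail(0.215, p, sd)             # unrounded standard deviation
part_c_rounded = upper_tail(0.23, p, 0.007)   # sd rounded to 0.007, as in part c
part_c_exact = upper_tail(0.23, p, sd)        # unrounded standard deviation

print(f"{part_b:.4f} {part_c_rounded:.4f} {part_c_exact:.4f}")
```

With the rounded standard deviation, part c comes out near 0.0021; with the unrounded value it is closer to 0.0036. Either way it is far smaller than the part b probability.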

A​ true/false test has 110 questions. Suppose a passing grade is 60 or more correct answers. Test the claim that a student knows more than half of the answers and is not just guessing. Assume the student gets 60 answers correct out of 110. Use a significance level of 0.05. Steps 1 and 2 of a hypothesis test procedure are given. Show steps 3 and​ 4, and be sure to write a clear conclusion. Step 1: Hypothesis Step​ 2: Choose the​ one-proportion z-test. Step​ 3: Compute the​ z-test statistic, and the​ p-value. Step​ 4: Do you reject or not reject the null​ hypothesis? What does this mean in the context of the​ data?

a. H0: p=0.50 Ha: p>0.50 b. Sample size is large enough, because np0 = 110(0.5) = 55 and n(1−p0) = 110(0.5) = 55, and both are more than 10. Assume the sample is random. c. Use https://www.easycalculation.com/statistics/standard-error-sample-proportion.php for SE, then https://www.easycalculation.com/statistics/test-for-one-proportion.php for z, then http://www.socscistatistics.com/pvalues/normaldistribution.aspx for p-value. SE = sqrt(0.5(0.5)/110) ≈ 0.0477, z ≈ 0.95, p-value ≈ 0.17. d. The p-value is greater than 0.05, so the result is not significant. Do not reject the null hypothesis. There is not enough evidence that the student knows more than half of the answers.
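
As a cross-check on the online calculators, here is a minimal Python sketch of a one-proportion z-test with an upper-tailed alternative. The function name `one_prop_z_test` is an assumption made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(successes, n, p0):
    """Return (z, one-sided p-value) for Ha: p > p0."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)       # standard error under the null
    z = (p_hat - p0) / se
    p_value = 1 - NormalDist().cdf(z)  # area to the right of z
    return z, p_value

z, p = one_prop_z_test(60, 110, 0.50)
print(f"z = {z:.2f}, p-value = {p:.3f}")
```

The same function can be reused for other one-sided proportion tests, for example `one_prop_z_test(265, 500, 0.50)` for a poll in which 265 of 500 adults favor a ban.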

In a poll of 500 adults in July​ 2010, 265 of those polled said that schools should ban sugary snacks and soft drinks. Complete parts a and b below. a. Do a majority of adults​ (more than​ 50%) support a ban on sugary snacks and soft​ drinks? Perform a hypothesis test using a significance level of 0.05. State the null and alternative hypotheses. Note that p is defined as the population proportion of people who believe that schools should ban sugary foods. Find the value of the corresponding​ p-value for this test statistic z. Do you reject or not reject the null​ hypothesis? b. Choose the best interpretation of the results you obtained in part a.

a. H0: p=0.50 Ha: p>0.50 b. z=1.342 https://www.easycalculation.com/statistics/test-for-one-proportion.php P-Value is 0.089798 http://www.socscistatistics.com/pvalues/normaldistribution.aspx Do not reject H0. The percentage of all adults who favor banning is not significantly more than 50%.

In a study, 120 rental properties that had already been the target of drug law enforcement were randomly divided into two groups. In the experimental group, the tenants received a letter from the police describing the enforcement tactics in place. For the control group, there was no letter. The table gives summary statistics for the number of crimes reported over a 30-month interval. Determine whether the letter from the police was effective in reducing the number of crimes at the 0.05 level. Although the distribution of number of crimes is not Normal, assume that the sample size is large enough for the Central Limit Theorem to apply. a. Determine the hypotheses for this test. Let μL be the population mean number of crimes reported over a 30-month interval for properties that received a letter from the police, and let μN be the population mean number of crimes reported over a 30-month interval for properties that did not receive a letter. b. Find the test statistic for this test. Find the p-value for this test. c. What is the conclusion for this test?

a. H0: μL=μN Ha: μL<μN b. Use http://www.quantitativeskills.com/sisa/statistics/t-test.htm under "difference between means". Its output (t-difference: -2.216, df-t: 53.8, p = 0.0157) gives t=−2.22 and p=0.016. c. Reject H0. The letter from the police was significantly effective in reducing the number of crimes.

A random sample of likely voters showed that 64​% planned to vote for Candidate​ X, with a margin of error of 3 percentage points and with​ 95% confidence. a. Use a carefully worded sentence to report the​ 95% confidence interval for the percentage of voters who plan to vote for Candidate X. b. Is there evidence that Candidate X could​ lose? c. Suppose the survey was taken on the streets of a particular city and the candidate was running for president of the country that city is in. Explain how that would affect your conclusion.

a. I am​ 95% confident that the population percentage of voters supporting Candidate X is between 61​% and 67​%. b. There is no evidence that the candidate could lose. The reason there is no evidence is because the interval is entirely above​ 50%. c. A sample from this particular city would not be representative of the entire country and would be worthless in this context.

A study looked at the effects of light on female mice. Fifty mice were randomly assigned to a regimen of 12 hours of light and 12 hours of dark​ (LD), while another fifty mice were assigned to 24 hours of light​ (LL). Researchers observed the mice for two years. Three of the LD mice and 14 of the LL mice developed tumors. The accompanying table summarizes the data. a. Determine the percentage of mice that developed tumors from each group​ (LL and​ LD). Compare them and comment. b. Was this a controlled experiment or an observational​ study? How do you​ know? c. Can we conclude that light for 24 hours a day causes an increase in tumors in​ mice? Why or why​ not?

a. In the LD​ mice, 6​% developed tumors. In the LL​ mice, 28​% developed tumors. The LD mice developed tumors at a lower rate than the LL mice. b. This was a controlled experiment because there were two groups that were assigned by the researchers. c. Because it was a controlled​ experiment, it can be concluded that light for 24 hours a day causes an increase in tumors in mice.

Triglycerides are a form of fat found in the body. A recent study looked at whether men have higher triglyceride levels than women. After the data were​ collected, Minitab was used to find a​ 95% confidence interval for the difference between the mean triglyceride levels for men and women. The Minitab output is provided. a. Report and interpret the​ 95% confidence interval for the difference in mean triglycerides between men and women​ (refer to the Minitab output​ provided). Select the correct choice below​ and, if​ necessary, fill in any answer boxes to complete your choice. b. Does this support the hypothesis that men and women differ in mean triglyceride​ levels? Explain.

a. It can be stated with​ 95% confidence that the difference in mean triglycerides between men and women is between -76.1 and −33.9. b. Because the interval does not contain​ zero, the possibility that the mean difference in the population is 0 can be ruled out. Since this​ difference, μfemale−μmale​, is negative, the​ men's mean triglyceride level is significantly higher than the​ women's mean triglyceride level.

The data were collected from a statistics class. The column head gives the​ variable, and each of the other rows represents a student in the class. Explain why the variable​ Male, is​ categorical, even though its values are numbers.​ Often, it does not make​ sense, or is not even possible to add the values of a categorical variable. Does it make sense for​ Male? If​ so, what does the sum​ represent? a. Explain why the variable Male is​ categorical, even though its values are numbers b. Does it make sense to add the values of the categorical variable​ Male? If​ so, what does the sum​ represent?

a. Male is categorical with two categories. Its variable coding rule is that​ 1's represent males and​ 0's represent females b. It makes sense. If the numbers are​ added, the sum is the number of males.

Criminal cases are assigned to judges randomly. A list of criminal judges for a particular state is given in the table. Assume that only Miranda Jonswold is a woman and the rest are men. Suppose a clerk pulls a name out at random. a. Suppose the event of​ interest, event​ A, is that a judge is a woman. List the outcomes that make up event A. b. A criminal case is to be assigned to a judge. What is the probability that one case will be assigned to a female​ judge? c. List the outcomes that are in the complement of event A

a. Miranda Jonswold b. The probability that one case will be assigned to a female judge is 1/9 c. ​, Frank Hale​, Evan Meier​, Keith Montford​, Rob Welch, Elliott Smith, Ryan OConnell, Eric Larkin

The accompanying data table contains two body​ measurements, in​ centimeters, for some college women. Hand width is the width of the hand with the fingers spread wide. Armspan is the span of the arms with the arms spread wide. Use hand width as the predictor and armspan as the response. a. Make a scatterplot of the data. Choose the correct scatterplot below. b. Explain why linear regression is probably not appropriate for these variables.

a. Open in StatCrunch and create a scatterplot. Make sure that the labels are on the correct axes. See picture. b. Linear regression is probably not appropriate for these variables because the trend is not linear.

Use a computer or statistical calculator to calculate the correlation coefficient in parts a through c below. a. The table shows the approximate distance between selected cities and the approximate cost of flights between those cities. Calculate the correlation coefficient between cost and miles. b. This table shows the same information, except that the distance was converted to kilometers by multiplying the numbers of miles by 1.609 and rounding to the nearest kilometer. What happens to the correlation coefficient when numbers are multiplied by a positive constant? c. Suppose a tax is added to each flight. Fifty dollars is added to every flight, no matter how long it is. The table shows the new data. What happens to the correlation coefficient when a constant is added to each number?

a. Open in StatCrunch: Stat > Regression > Simple Linear. Look for R (correlation coefficient) in the output: r=.987 b. Use the same steps as above. The correlation is .987. The correlation coefficient remains the same when the numbers are multiplied by a positive constant. c. Use the same steps as above. The correlation is .987. The correlation coefficient remains the same when a constant is added to each number.
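
The invariance described in parts b and c can be illustrated with a small Python sketch. The flight table is not reproduced on this card, so the `miles` and `cost` values below are hypothetical and will not give r = .987; the helper `pearson_r` is likewise an illustrative name. The sketch also skips the rounding to the nearest kilometer, which in practice can shift r very slightly.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical distances (miles) and fares (dollars), for illustration only.
miles = [300, 750, 1200, 1800, 2500]
cost = [120, 210, 260, 380, 470]

r_miles = pearson_r(miles, cost)
r_km = pearson_r([m * 1.609 for m in miles], cost)  # multiply by a positive constant
r_tax = pearson_r(miles, [c + 50 for c in cost])    # add a constant
print(abs(r_miles - r_km) < 1e-9, abs(r_miles - r_tax) < 1e-9)
```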

a. In your own​ words, describe to someone who knows only a little statistics how to recognize when an observation is an outlier. What​ action(s) should be taken with an​ outlier? b. Which measure of the center​ (mean or​ median) is more resistant to​ outliers, and what does​ "resistant to​ outliers" mean?

a. Outliers are observed values far from the main group of data. In a histogram they are separated from the others by space. Outliers must be looked at in closer context to know how to treat them. If they are​ mistakes, they might be removed or corrected. If they are not​ mistakes, you might do the analysis​ twice, once with and once without the outliers b. The median is more​ resistant, which indicates that it usually changes less than the mean when comparing data with and without outliers.

In a simple random sample of 1200 young people, 93% had earned a high school diploma. a. What is the standard error for this estimate of the percentage of all young people who earned a high school diploma? b. Find the margin of error, using a 95% confidence level, for estimating the percentage of all young people who earned a high school diploma. c. Report the 95% confidence interval for the percentage of all young people who earned a high school diploma. d. Suppose that in the past, 80% of all young people earned high school diplomas. Does the confidence interval you found in part c support or refute the claim that the percentage of young people who earn high school diplomas has increased? Explain.

a. See picture. 0.0074 b. 0.0074*1.96=0.015, or 1.5% (use 1.96 because it corresponds with a 95% confidence interval. See next slide for table) c. 93-1.5=91.5 93+1.5=94.5 Thus, the 95% confidence interval is (91.5%, 94.5%) d. The interval supports this claim. This is because 80% is not in the interval, and all values are above 80%.
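
A minimal Python sketch of the standard error, margin of error, and interval, assuming the usual large-sample formula for a proportion. The function name `proportion_ci` is illustrative. Because the sketch does not round the standard error before multiplying by 1.96, its margin is a little under the 1.5% shown above.

```python
from math import sqrt

def proportion_ci(p_hat, n, z_star=1.96):
    """Return (standard error, margin of error, (low, high))."""
    se = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * se
    return se, margin, (p_hat - margin, p_hat + margin)

se, margin, (low, high) = proportion_ci(0.93, 1200)
print(f"SE = {se:.4f}, margin = {margin:.4f}, CI = ({low:.3f}, {high:.3f})")
```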

The table gives the number of people living with AIDS in 2007 in 6 regions and the population of that region. Use the table (left side of picture) to answer the following questions. a. Find the number of people living with AIDS per thousand residents in each​ region, and rank the six regions from highest rate​ (rank 1) to lowest rate​ (rank 6). b. Are the ranks for the rates the same as the ranks for the number of​ cases? If​ not, describe at least one difference. c. If you moved to one of these regions and met 50 random​ people, the region with the highest rate of AIDS is where you would most likely meet at least one person living with AIDS and the region with the lowest rate of AIDS is where you would least likely meet at least one person living with AIDS. In which region would you be most likely to meet at least one person living with​ AIDS? In which region would you be least likely to meet at least one person living with​ AIDS?

a. See right side of picture b. The ranks for the rates are different from the ranks for the number of cases. Region F had the least number of cases but had the highest rate of cases. c. You are most likely to meet at least one person living with AIDS in region F. You are least likely to meet at least one person living with AIDS in region E.

The data show the ages of students in a statistics​ class, as well as that of the​ professor, who is 71 years old. The figure to the right shows a histogram of the data. a. Describe the distribution of ages by giving the​ shape, the numerical value for an appropriate measure of the​ center, and the numerical value for an appropriate measure of the​ spread, as well as mentioning any outliers. What is the shape of the​ distribution? What is the value of the appropriate measure of the​ center? What is the value of the appropriate measure of the​ spread? Does this distribution contain any​ outliers? b. Find the mean and the median of the distribution. State why they are not the same. What is the mean of the​ distribution? What is the median of the​ distribution? Why is the mean different from the​ median?

a. Shape: skewed right. Center: 25 (median). Spread: 14 (IQR, Q3-Q1=36-22). Outliers: yes. b. The mean is 29.9 and the median is 25. The mean is different from the median because the median is resistant to outliers, so the median will be a more accurate estimate of the center of a distribution that includes outliers.

Use technology to find the indicated area under the standard Normal curve. Include an appropriately labeled sketch of the Normal curve and shade the appropriate region. a. Find the area in a standard Normal curve to the left of 2.02 b. Find the area in a standard Normal curve to the right of 2.02 Remember that the total area under the curve is 1.

a. The area in a standard Normal curve to the left of 2.02 is 0.9783 (use a standard statistics table for area under a normal curve- see next card for table) b. The area in a standard Normal curve to the right of 2.02 is .0217 (1-.9783)

The equation for the regression line relating the salary and the year first employed is given above the figure. a. Report the slope and explain what it means. b. Either interpret the​ y-intercept of​ 4,255,424 or explain why it is not appropriate to interpret the​ y-intercept.

a. The average salary is​ $2099 less for each year later that the person was hired or an average of​ $2099 more for each year earlier. b. The​ y-intercept of​ $4,255,424 would be the salary for a person who started in the year​ 0, which is not appropriate to interpret.

The accompanying table shows the number of millionaires​ (in thousands) and the population​ (in hundreds of​ thousands) for 10 states. a. Without doing any​ calculations, predict whether the correlation and slope will be positive or negative. Explain your prediction. b. Make a scatterplot with the population​ (in hundreds of​ thousands) on the​ x-axis and the number of millionaires​ (in thousands) on the​ y-axis. c. Find the numerical value for the correlation. d. Find the value of the slope and explain what it means in context. Be careful with the units. e. Explain why interpreting the value for the intercept does not make sense in this situation.

a. The correlation and slope will be positive because states with a larger population will most likely have a larger number of millionaires. b. Load in stat crunch and do a scatter plot. See picture c. The correlation is 0.991 d. The slope is 1.9002. This slope indicates that for every increase of one hundred thousand in​ population, the number of millionaires will​ increase, on​ average, by 1.9002 thousand. e. In this​ situation, the intercept would be the number of millionaires that live in a state with a population of zero. Since there would be no population in the​ state, then any number of millionaires besides zero would not make sense.

Grades on a political science test and the number of hours of paid work in the week before the test were studied. The instructor was trying to predict the grade on a test from the hours of work. The figure shows a scatterplot and the regression line for these data. a. By looking at the plot and the line​ (without doing any​ calculations), state whether the correlation is positive or negative and explain your prediction. b. Interpret the slope. c. Interpret the intercept.

a. The correlation is negative because the graph shows a decreasing trend. b. For each additional hour of​ work, the score tended to go down by 0.4817 point. c. A student who did not work would expect to get about 87 on average.

The weight of a group of women has a population mean, μ, of 127 pounds and a population standard deviation, σ, of 20 pounds. The distribution is right-skewed. Suppose a random sample is taken of 100 of these women's weights. a. What value should we expect for the mean weight of this sample of 100 women? Why? b. The actual sample mean will not be exactly the value determined in part a. The amount it typically differs from this value is given by the standard error. What is the standard error for a sample mean taken from this population?

a. The expected mean weight is about 127 pounds, because the sample mean is an unbiased estimator of the population mean. b. The standard error of the sample mean is 2 pounds (20/sqrt(100)).
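
The standard error in part b follows from σ/√n. A short Python sketch, with an illustrative function name:

```python
from math import sqrt

def standard_error_of_mean(sigma, n):
    """Standard error of a sample mean: population SD over sqrt(sample size)."""
    return sigma / sqrt(n)

print(standard_error_of_mean(20, 100))  # 2.0
```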

Researchers conducted a study on brain size as measured by pixels in a magnetic resonance imagery​ (MRI) scan. The numbers are in hundreds of thousands of pixels. The data table provides the sizes of the brains and the gender. a. Is the format of the data set stacked or​ unstacked? b. Explain the coding. What do 1 and 0​ represent? c. If you answered​ "stacked" in part​ a, then unstack the data into two columns labeled Male and Female. If you answered​ "unstacked," then stack the data into one column with an appropriate name for the stacked variable.

a. The format of the data is stacked. b. The data value 1 represents Male and the data value 0 represents Female. c. see picture

The figure shows data on the percentage of students who take a certain standardized test and the average score on the writing section for 30 school districts. a. Does the graph show an increasing or decreasing​ trend? Explain what that means in the context of these data. b. Do the points seem to follow a straight line or a​ curve? c. Is it appropriate to find the correlation for these​ data? Why or why​ not?

a. The graph shows a decreasing trend. School districts with higher percentages of students taking the standardized test tend to have lower mean writing scores. b. Straight line c. Yes, because the trend is linear.

The figure shows a scatterplot with the regression line. The equation of the regression line is shown below. The data are for the 50 regions in Country A. The predictor is the percentage of​ smoke-free homes. The response is the percentage of high school students who smoke. Complete parts a and b. Predicted Pct. Smokers=58.199−0.500(Pct. Smoke-free) a. Explain what the trend shows. Choose the correct answer below. b. Use the regression equation to predict the percentage of students in high school who​ smoke, assuming that there are​ 70% smoke-free homes in the region. Use 70 not 0.70.

a. The higher the percentage of​ smoke-free homes in a​ region, the lower the percentage of high school students who smoke tends to be. b. 23%
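
The prediction in part b is a direct substitution into the regression equation. A small Python sketch (the function name is an assumption for illustration):

```python
def predicted_pct_smokers(pct_smoke_free):
    """Regression line from the card: 58.199 - 0.500 * (Pct. Smoke-free)."""
    return 58.199 - 0.500 * pct_smoke_free

# Use 70, not 0.70, as the card instructs.
print(round(predicted_pct_smokers(70), 1))
```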

The Minitab output gives some numerical summaries for the areas of 51 regions​ (in square​ miles). The regions are categorized as those east of a certain river​ (e) and those west​ (w) of the river. a. Compare the mean areas by completing this sentence. b. Compare the standard deviations​ (StDev) of the areas by completing this sentence. c. Does the​ TI-83/84 output given represent the eastern or western​ regions?

a. The mean of the eastern regions is 35514 square miles and the mean of the western regions is 110750 square​ miles, showing that the eastern regions tend to have smaller areas. b. The standard deviation of the eastern regions is 24726 square miles and the standard deviation of the western regions is 124887 square​ miles, showing that the areas of the western regions tend to have more variation. c. The output represents the eastern regions.

A pair of surfers collected data on the​ self-reported numbers of days surfed in a month for 30 longboard surfers and 30 shortboard surfers. a. Compare the typical number of days surfing for these two groups. b. Compare the interquartile ranges.

a. The median for the longboards was 14.5 days, and the median for the shortboards was 16.5 ​days, showing that those with shortboards typically surfed more days in this month. b. The interquartile range for the longboards was 15 ​days, and the interquartile range for the shortboards was 11 ​days, showing more variation in the days surfed this month for the longboards.

Assume the only grades possible in a history course are​ A, B,​ C, or lower than C. The probability that a randomly selected student will get an A in the course is 0.21​, the probability that a student will get a B in the course is 0.25​, and the probability that a student will get a C in the course is 0.25. a. What is the probability that a student will get an A OR a​ B? b. What is the probability that a student will get an A OR a B OR a​ C? c. What is the probability that a student will get a grade lower than a​ C?

a. The probability that a student will get an A OR a B is .46 (.21+.25). b. The probability that a student will get an A OR a B OR a C is .71 (.21+.25+.25). c. The probability that a student will get a grade lower than C is .29 (1-.71).

A poll organization frequently conducts polls asking the question "In general, do you feel that the laws covering the sale of firearms should be made more strict, less strict, or kept as they are now?" At one point in time, 58% of those surveyed said "more strict." Shortly after a gun-related tragedy, 64% of those surveyed said "more strict." a. Assume that both polls used samples of 560 people. Determine the number of people in the sample that said "more strict" in the first survey and in the second survey. b. Do a test to see whether the proportion that said "more strict" is statistically significantly different in the two different surveys using a significance level of 0.01. What are the correct hypotheses? Assume the necessary conditions are satisfied. Calculate z. Calculate the corresponding p-value. Decide whether the null hypothesis should be rejected or not rejected, using a significance level of 0.01. Choose the correct answer below.

a. There were 325 people who said​ "more strict" in the first survey. There were 358 people who said​ "more strict" in the second survey. b. H0​: p1=p2 Ha​: p1≠p2 http://www.socscistatistics.com/tests/ztest/Default2.aspx z-Score is -2.0215 p-value is 0.04338. Do not reject H0​, meaning there is not a statistically significant difference in the proportion of people that favor stricter gun laws after the​ gun-related tragedy.
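
The z-score and p-value from the online calculator can be reproduced with a minimal Python sketch of a pooled two-proportion z-test. The function name `two_prop_z_test` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2):
    """Return (z, two-sided p-value) for H0: p1 = p2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * NormalDist().cdf(-abs(z))  # both tails
    return z, p_value

z, p = two_prop_z_test(325, 560, 358, 560)
print(f"z = {z:.2f}, p-value = {p:.3f}, reject at 0.01: {p < 0.01}")
```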

The histogram shows the distribution of the number of televisions in the homes of 89 community college students. a. According to the histogram, how many homes do not have a television? b. How many televisions are in the homes that have the most televisions? c. How many homes have three televisions? d. How many homes have six or more televisions? e. What proportion of homes have six or more televisions?

a. There are 2 homes that do not have a television. b. The homes that have the most televisions have 9 televisions. c. There are 26 homes that have three televisions. d. There are 7 homes that have six or more televisions. e. The proportion of homes that have six or more televisions is 7/89.

A study looked at people treated for heart disease and reported lower death rates for those who received a coronary bypass (CABG) compared to those who received a stent. The adverse outcomes were greater with stenting. a. Was this an observational study or a controlled experiment? How do you know? b. Can we say that the use of CABG causes a better success rate? Why or why not?

a. This was an observational study because there was no random assignment and the study looked at records. b. Because this was an observational​ study, it only shows an​ association; it does not show that the use of CABG causes a better success rate.

A random sample of 40 students taking statistics at a community college was asked their GPA. The sample mean was 3.36​, and the margin of error for a​ 95% confidence interval was 0.22. a. Choose the correct interpretation of the confidence interval below and fill in the answer boxes to complete your choice. Assume that the conditions for constructing a confidence interval are met. b. What does the interval tell us about the population mean GPA for statistics students at the​ school? Can you reject 3.94 as the mean for statistics​ students? Explain.

a. We are 95% confident that the population mean GPA of statistics students at the school is between 3.14 (3.36-0.22) and 3.58 (3.36+0.22). b. It tells us a range of plausible values for the population mean GPA, where the population is all statistics students at the school. Yes, reject 3.94, because it is not in the interval. We are confident that the statistics students have a lower population mean than 3.94.

A statistics instructor randomly selected four bags of​ oranges, each bag labeled 10​ pounds, and weighed the bags. They weighed 9.6​, 9.8​, 9.3​, and 9.7 pounds. Assume that the distribution of weights is Normal. Find a​ 95% confidence interval for the mean weight of all bags of oranges. Use technology for your calculations. a. Choose the correct interpretation of the confidence interval below​ and, if​ necessary, fill in the answer boxes to complete your choice. b. Does the interval capture 10​ pounds? Is there enough evidence to reject the null hypothesis that the population mean weight is 10​ pounds? Explain your answer.

a. We are​ 95% confident the population mean is between 9.256 and 9.944. b. No, it does not capture 10. Reject the claim of 10 pounds because 10 is not in the interval.
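
The interval in part a can be reproduced by hand in Python. This is a sketch that assumes a one-sample t-interval; the critical value 3.182 (95% confidence, 3 degrees of freedom) is taken from a standard t table and hardcoded, because Python's standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

weights = [9.6, 9.8, 9.3, 9.7]  # the four bag weights from the card
T_STAR = 3.182                  # assumed table value: t* for 95% confidence, df = 3

n = len(weights)
x_bar = mean(weights)
margin = T_STAR * stdev(weights) / sqrt(n)  # t* times the standard error
low, high = x_bar - margin, x_bar + margin
print(f"({low:.3f}, {high:.3f})")
```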

For each​ situation, identify the sample size​ n, the probability of success​ p, and the number of successes x. Give the answer in the form​ b(n,p,x). Do not go on to find the probability. Assume the four conditions for a binomial experiment are satisfied. a. In the 2008 presidential​ election, 53​% of the voters voted for a certain candidate. What is the probability that 77 out of 100 independently chosen voters voted for this​ candidate? b. The manufacturer of a certain vehicle recovery system claims that the probability that a stolen vehicle using its product will be recovered is 85​%. What is the probability that exactly 8 out of 10 independently stolen vehicles with this product will be​ recovered? c. A student is taking a 10​-question ​multiple-choice test. Each questions has four​ options, a,​ b, c, and d. One of these four options is correct and three of them are incorrect. What is the probability that the student correctly answers exactly 6 of the 10 questions on the test by​ guessing?

a. b(100​,.53​,77​) b. b(10​,.85​,8​) c. ​b(10​,.25​,6​)

Babies born weighing 2500 grams​ (about 5.5​ pounds) or less are called​ low-birth-weight babies, and this condition sometimes indicates health problems for the infant. The mean birth weight for children born in a certain country is about 3422 grams​ (about 7.5 ​pounds). The mean birth weight for babies born one month early is 2680 grams. Suppose both standard deviations are 470 grams. Also assume that the distribution of birth weights is roughly unimodal and symmetric. a. Find the standardized score​ (z-score), relative to all births in the​ country, for a baby with a birth weight of 2500 grams. b. Find the standardized score for a birth weight of 2500 grams for a child born one month​ early, using 2680 as the mean. c. For which group is a birth weight of 2500 grams more​ common? Explain what that implies. Unusual​ z-scores are far from 0. Choose the correct answer below.

a. z-score (for 2500 g birth weight relative to all births) z1=(2500-3422)/470=-1.96 b. z-score (for child born 1 month early) z2=(2500-2680)/470=-0.38 c. A birth weight of 2500 grams is more common for babies born one month early. This makes sense because babies gain weight during gestation, and babies born one month early had less time to gain weight.

Babies born weighing 2500 grams​ (about 5.5​ pounds) or less are called​ low-birth-weight babies, and this condition sometimes indicates health problems for the infant. The mean birth weight for children born in a certain country is about 3489 grams​ (about 7.7 ​pounds). The mean birth weight for babies born one month early is 2654 grams. Suppose both standard deviations are 490 grams. Also assume that the distribution of birth weights is roughly unimodal and symmetric. a. Find the standardized score​ (z-score), relative to all births in the​ country, for a baby with a birth weight of 2500 grams. b. Find the standardized score for a birth weight of 2500 grams for a child born one month​ early, using 2654 as the mean. c. For which group is a birth weight of 2500 grams more​ common? Explain what that implies. Unusual​ z-scores are far from 0. Choose the correct answer below.

a. z=−2.02 b. z=-.31 c. A birth weight of 2500 grams is more common for babies born one month early. This makes sense because babies gain weight during​ gestation, and babies born one month early had less time to gain weight.
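
Both versions of the birth-weight card use the same formula, z = (x − mean)/SD. A short Python sketch (the function name is illustrative):

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# First card: means 3422 and 2680 grams, SD 470 grams.
print(round(z_score(2500, 3422, 470), 2), round(z_score(2500, 2680, 470), 2))
# Second card: means 3489 and 2654 grams, SD 490 grams.
print(round(z_score(2500, 3489, 490), 2), round(z_score(2500, 2654, 490), 2))
```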

In 2007, the mean rate of violent crime (per 100,000 people) for 24 states west of a certain river was 403. The standard deviation was 155. Assume that the distribution of violent crime rates is unimodal and symmetric. Complete parts a through d below. a. Using the Empirical Rule, between what two values would you expect to find about 95% of the rates? b. Using the Empirical Rule, between what two values would you expect to find about 68% of the rates? c. If a western state had a violent crime rate of 319 crimes per 100,000 people, would this be considered unusual? Explain. d. If a western state had a violent crime rate of 80 crimes per 100,000 people, would this be considered unusual? Explain.

a. ≅ 95% of the observations fall within the interval (93,713). b. ≅ 68% of the observations fall within the interval (248,558). c. No, 319 would not be considered unusual because it falls within one standard deviation of the mean. d. Yes, 80 would be considered unusual because it is more than two standard deviations below the mean.
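
The Empirical Rule intervals are the mean plus or minus one or two standard deviations. A minimal Python sketch, with an illustrative function name:

```python
def empirical_interval(mean, sd, k):
    """Interval within k standard deviations of the mean."""
    return mean - k * sd, mean + k * sd

print(empirical_interval(403, 155, 2))  # about 95%: (93, 713)
print(empirical_interval(403, 155, 1))  # about 68%: (248, 558)
```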

According to a traffic safety administration organization, the rate of seat belt use in a certain country for 2009 was 84%. Suppose that you looked at two people, selected randomly and independently from the population, to see whether each had his or her seat belt fastened in 2009. a. What is the probability that neither person had her or his belt fastened? b. What is the probability that at least one person had her or his belt fastened? (Hint: "At least one had the belt fastened" is the complement of "neither one had the belt fastened.")

a. P(Neither fastened)=0.0256 (Round to four decimal places as needed.) b. P(At least one fastened)=0.9744 (Round to four decimal places as needed.)
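
Both answers follow from independence and the complement rule: P(neither) = (1 − 0.84)² and P(at least one) = 1 − P(neither). A short Python sketch with illustrative variable names:

```python
p_belt = 0.84                       # rate of seat belt use from the card
p_neither = (1 - p_belt) ** 2       # two independent people, both unbelted
p_at_least_one = 1 - p_neither      # complement of "neither"
print(round(p_neither, 4), round(p_at_least_one, 4))
```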

Thirty GPAs from a randomly selected sample of statistics students at a college are linked below. Assume that the population distribution is approximately Normal. The technician in charge of records claimed that the population mean GPA for the whole college is 2.86. a. What is the sample​ mean? Is it higher or lower than the population mean of 2.86​? b. The chair of the mathematics department claims that statistics students typically have higher GPAs than the typical college student. Use the​ four-step procedure and the data provided to test this claim. Use a significance level of 0.05. Which of the following correctly states H0 and Ha​?

Click the table icon, then the window icon, then Command-C. Paste into: http://www.calculatorsoup.com/calculators/statistics/mean-median-mode.php a. The sample mean is 3.10. It is higher than the population mean of 2.86. b. H0: μ=2.86 Ha: μ>2.86 (To find the test statistic and p-value, open in StatCrunch [press table icon, window icon, open in StatCrunch], then complete a "One sample T hypothesis test" [stat, t-stat, one sample with data, compute].) t=3.90 p=0.000 Reject H0. The mean GPA for statistics students is significantly higher than 2.86.

Roll a fair​ six-sided die five times and record the number of spots on top. Which of the following sequences is more​ likely? Explain. Sequence​ A: 44444 Sequence​ B: 51312

The two sequences are equally likely, because each has a probability of 1/7776, that is, (1/6)^5, since 6*6*6*6*6 = 7776.
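
Any specific sequence of five rolls has probability (1/6)^5, whatever its digits. A minimal Python sketch using exact fractions (the function name is illustrative):

```python
from fractions import Fraction

def sequence_probability(sequence, sides=6):
    """Probability of one specific sequence of independent fair-die rolls."""
    return Fraction(1, sides) ** len(sequence)

print(sequence_probability("44444"), sequence_probability("51312"))  # 1/7776 1/7776
```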

Three scatterplots are shown below. The calculated correlations are 0.667, −0.924, and 0.010. Determine which correlation goes with which scatterplot. Match each scatterplot with its corresponding correlation.

​(a) 0.010 ​(b) 0.667 ​(c) - 0.924

A researcher carried out a hypothesis test using a​ two-tailed alternative hypothesis. Which of the following​ z-scores is associated with the smallest​ p-value? Explain. i. z=−0.57 ii. z=0.93 iii. z=−2.32 iv. z=−3.12 Which​ z-score has the smallest​ p-value? Explain.

−3.12 The​ z-score farthest from 0 has the smallest tail area and thus has the smallest​ p-value.

