STATS section 7

A professor divided the students in her business class into three groups: those who have never taken a statistics class, those who have taken only one semester of a statistics class, and those who have taken two or more semesters of statistics. The professor randomly assigns students to groups of three to work on a project for the course. If 10% of the students have never taken a statistics class, 35% have taken only one semester of a statistics class, and the rest have taken two or more semesters of statistics, what is the probability that both of the first two groupmates you meet have studied at least one semester of statistics?

(0.90)^2 = 0.81

A university's administrator proposes to do an analysis of the proportion of graduates who have not found employment in their major field one year after graduation. In previous years, the percentage averaged 15%. He wants the margin of error to be within 5% at a 99% confidence level. What sample size will suffice? Round to the nearest integer.

n = (2.576/0.05)^2 × (0.15)(1 - 0.15) = 338.4, which rounds up to 339
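
The sample-size arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `sample_size_for_proportion` is made up for this example, and rounding up (rather than to the nearest integer) is the usual convention assumed here so that the margin of error is not exceeded.

```python
import math

def sample_size_for_proportion(z_star, margin, p_guess):
    """Smallest n for which z* * sqrt(p(1-p)/n) is at most `margin`."""
    n_exact = (z_star / margin) ** 2 * p_guess * (1 - p_guess)
    return math.ceil(n_exact)  # round up so the margin is not exceeded

# Graduates card: 99% confidence (z* = 2.576), margin 0.05, prior estimate 0.15
n_graduates = sample_size_for_proportion(2.576, 0.05, 0.15)
print(n_graduates)  # 339
```

The same helper reproduces the later card with z* = 1.960, margin 0.03 and p = 0.24, which gives 779.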

Correlation Properties

- if close to -1 or 1: strong
- if close to 0: weak
- no units
- changing the units of x or y does not affect r
- sensitive to outliers
- does not demonstrate causation
conditions:
- quantitative variables
- straight enough
- no outliers

A professor divided the students in her business class into three groups: those who have never taken a statistics class, those who have taken only one semester of a statistics class, and those who have taken two or more semesters of statistics. The professor randomly assigns students to groups of three to work on a project for the course. If 35% of the students have never taken a statistics class, 25% have taken only one semester of a statistics class, and the rest have taken two or more semesters of statistics, what is the probability that neither of the first two groupmates you meet has studied any statistics?

0.123

Suppose that 19% of people have a dog, 29% of people have a cat, and 7% of people own both. What is the probability that someone owns a dog or a cat?

0.19+0.29-0.07 = 0.41

In a Stats class, 57% of students eat breakfast in the morning and 80% of students floss their teeth. Forty-six percent of students eat breakfast and also floss their teeth. What is the probability that a student from this class eats breakfast but does NOT floss?

11%

A company's manufacturing process uses 500 gallons of water at a time. A "scrubbing" machine then removes most of a chemical pollutant before pumping the water into a nearby lake. To meet federal regulations the treated water must not contain more than 80 parts per million (ppm) of the chemical. Because a fine is charged if regulations are not met, the company sets the machine to attain an average of 75 ppm in the treated water. The machine's output can be described by a normal model with standard deviation 4.2 ppm. What percent of the batches of water discharged exceed the 80 ppm standard?

11.7%
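
One way to check this tail area is with Python's standard-library `statistics.NormalDist`. The sketch below assumes only the Normal model stated in the question (mean 75 ppm, standard deviation 4.2 ppm); the variable names are illustrative.

```python
from statistics import NormalDist

# Normal model for the treated water, as given in the question
treated_water = NormalDist(mu=75, sigma=4.2)

# P(batch exceeds 80 ppm) is the area to the right of 80
p_exceeds = 1 - treated_water.cdf(80)
print(round(p_exceeds, 3))  # about 0.117, i.e. roughly 11.7% of batches
```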

n=10, x=13.0, s=4.5. Find a 95% confidence interval for the mean. Round to two decimal places as needed.

13.0 ± 2.262 × (4.5/√10) = (9.78, 16.22), using t* = 2.262 for 9 degrees of freedom
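
A minimal sketch of the one-sample t-interval follows. The critical value t* = 2.262 (95% confidence, 9 degrees of freedom) is taken from a t-table and passed in by hand, because the Python standard library has no t-distribution; the function name `t_interval` is hypothetical.

```python
import math

def t_interval(mean, sd, n, t_star):
    """Return (low, high) for mean +/- t* * sd / sqrt(n)."""
    margin = t_star * sd / math.sqrt(n)
    return mean - margin, mean + margin

# n = 10, sample mean 13.0, s = 4.5, t* = 2.262 (looked up, df = 9)
low, high = t_interval(13.0, 4.5, 10, 2.262)
print(round(low, 2), round(high, 2))  # 9.78 16.22
```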

People with z-scores of 2.25 or above on a certain aptitude test are sometimes classified as geniuses. If aptitude test scores have a mean of 100 and a standard deviation of 16 points, what is the minimum aptitude test score needed to be considered a genius?

136

Assume that 15% of students at a university wear contact lenses. We randomly pick 200 students. What is the mean of the proportion of students in this group who may wear contact lenses?

15%

Assume that 10% of students at a university wear contact lenses. We randomly pick 200 students. What is the standard deviation of the proportion of students in this group who may wear contact lenses? Round to two decimal places.

2.12%
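
The standard deviation of a sample proportion is sqrt(p(1 - p)/n). A short sketch, with an illustrative function name, assuming independent draws as in the question:

```python
import math

def sd_of_sample_proportion(p, n):
    """Standard deviation of the sampling distribution of p-hat."""
    return math.sqrt(p * (1 - p) / n)

# Contact-lens card: p = 0.10, n = 200
sd_contacts = sd_of_sample_proportion(0.10, 200)
print(round(sd_contacts * 100, 2))  # 2.12 (percent)
```

The same call with p = 0.48, n = 500 or p = 0.31, n = 451 gives about 0.022, matching the other sampling-distribution cards in this set.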

Based on a sample of size 49, a 95% confidence interval for the mean score of all students, μ, on an aptitude test is from 59.2 to 64.8. Find the margin of error.

2.8

We have calculated a confidence interval based on a sample of size n=100. Now we want to get a better estimate with a margin of error that is only one-fourth as large. How large does our new sample need to be?

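1600 (the margin of error is proportional to 1/√n, so cutting it to one-fourth requires 4^2 = 16 times the sample size)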
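
The scaling can be written out as a short derivation. It assumes the confidence level and the spread of the data stay the same between the two samples, so only n changes:

```latex
ME = z^* \frac{s}{\sqrt{n}}
\quad\Longrightarrow\quad
\frac{ME_{\text{new}}}{ME_{\text{old}}}
  = \sqrt{\frac{n_{\text{old}}}{n_{\text{new}}}} = \frac{1}{4}
\quad\Longrightarrow\quad
n_{\text{new}} = 16\, n_{\text{old}} = 16 \times 100 = 1600
```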

What is the probability that a person likes to watch football, given that she also likes to watch basketball?
                 Football   No football
Basketball          27           8
No basketball       39          26

27/35 = 0.771
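
A conditional probability from a two-way table is the joint count divided by the total of the conditioning row. In the sketch below the column labels (football / no football) are an assumption inferred from the answer 27/35; the card itself only gives the four counts.

```python
# Counts keyed by (basketball status, football status); column labels assumed
counts = {
    ("basketball", "football"): 27,
    ("basketball", "no_football"): 8,
    ("no_basketball", "football"): 39,
    ("no_basketball", "no_football"): 26,
}

# Restrict to people who like basketball, then take the football share
basketball_total = sum(v for (b, _), v in counts.items() if b == "basketball")
p_football_given_basketball = counts[("basketball", "football")] / basketball_total
print(round(p_football_given_basketball, 3))  # 0.771
```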

Political analysts estimate the probability that Candidate A will run for president in 2016 is 45%, and the probability that Candidate B will run is 20%. If their political decisions are independent, then what is the probability that only Candidate A runs for president?

36%

Anytime a survey is conducted, care must be taken to avoid undercoverage. Suppose a firm selects 500 names from a city phone book, calls their homes between noon and 4 p.m., and interviews whoever answers, anticipating contacts with at least 200 people.

A simple random sample is difficult in this case because there is a problem with undercoverage. People with unlisted phone numbers, people without phones, and people who are at work or away from home are not in the sampling frame. The phone numbers could be randomly generated, and the calls could be at random times of the day. That way people with unlisted numbers and people away from their homes from noon to 4 p.m. could be included. Under the original plan, those families in which one person stays home are more likely to be included. Under the second plan, many more are included. People without phones are still excluded. Follow-up of this type greatly improves the chance that a selected household is included, increasing the reliability of the survey. Random dialers allow people with unlisted phone numbers to be selected. Time of day will still be an issue, as will people without phones.

Shortly after the Sandy Hook Elementary School shooting in December 2012, a local television news program asked viewers to call in with their opinion about gun control. These results were likely biased because of what reason?

A voluntary response sample

Some people have been complaining that the children's playground at a municipal park is too small and is in need of repair. Managers of the park decide to survey city residents to see if they believe the playground should be rebuilt. They hand out questionnaires to parents who bring children to the park. Describe some possible biases in this sample.

A. The sampling scheme suffers from voluntary response bias. Only parents who choose to visit the park and feel strongly about the issue are likely to respond.
B. The sampling scheme suffers from undercoverage, because only people who come to the park to use the playground will respond. Parents who are dissatisfied with the playground will not come.

You recently began an internship at your local chapter of savethepigeons.com. Concerned about a city ballot initiative dealing with the environment, you conduct a telephone survey of local residents. What are some possible sources of bias in your results?

A. non-response bias
B. response bias
C. undercoverage of the population

A researcher investigating the association between two variables collected some data and was surprised when he calculated the correlation. He had expected to find a fairly strong association, yet the correlation was near 0. Discouraged, he didn't bother making a scatterplot. Explain to him how the scatterplot could still reveal the strong association he anticipated.

Although there is no strong linear association between the variables, the scatterplot could reveal a strong nonlinear relationship.

A data set on roller coasters listed the Duration of the ride in seconds in addition to the Drop height in feet. One coaster was unusual for having a large drop but a short ride. After setting it aside, a regression to predict Duration from Drop for the remaining 85 coasters had R^2 = 32.0%. Write a sentence (in context) summarizing what R^2 says about this regression.

Approximately 32.0% of the variability in Duration can be accounted for by the least squares linear regression on Drop.

A polling company conducts an annual poll of adults about political opinions. The survey asked a random sample of 2390 adults whether they think things in the country are going in the right direction or in the wrong direction. 70% said that things were going in the wrong direction. Complete parts a and b below.

Calculate the margin of error for the proportion of all adults who think things are going in the wrong direction for 90% confidence. z* = 1.645 for 90% confidence (0.05 in each tail). ME = 1.645 × √(0.7 × 0.3/2390) = 0.015. Explain what this margin of error means. We are 90% confident that the observed proportion of adults that responded "wrong track" is within 0.015 of the population proportion.

Funding for many schools comes from taxes based on assessed values of local properties. People's homes are assessed higher if they have extra features such as garages and hot tubs. Assessment records in a certain school district indicate that 30% of the homes have garages and 5% have hot tubs. The Addition Rule might suggest, then, that 35% of residences have a garage or a hot tub. What is wrong with that reasoning?

A home may have a garage and a hot tub. The events are not disjoint, so the Addition Rule does not apply.

A polling company conducts an annual poll of adults about political opinions. The survey asked a random sample of 386 adults whether they think things in the country are going in the right direction or in the wrong direction. 58% said that things were going in the wrong direction. Complete parts a and b below.

Are the assumptions and conditions required to apply a confidence interval met? Yes, all assumptions and conditions are met. The margin of error for the proportion of all adults who think things are on the wrong track for 90% confidence is 0.041. Would the margin of error be larger or smaller for 80% confidence? To be less confident, the interval needs to contain the true proportion less often, so the margin of error would be smaller.

What is the advantage of making a stem-and-leaf display instead of a dotplot?

A stem-and-leaf display preserves the individual data values.

Data in 1980 showed that about 40% of the adult population had never smoked cigarettes. In 2004, a national health survey interviewed a random sample of 2000 adults and found that 50% had never been smokers. Create a 95% confidence interval for the proportion of adults (in 2004) who had never been smokers. Round to the nearest tenth.

Based on the data, we are 95% confident the proportion of adults in 2004 who had never smoked cigarettes is between 47.8% and 52.2%.

In a research study on trends in marriage and family, 5% of randomly selected parents said that they never spank their children. The 95% confidence interval is from 3.8% to 6.2% (n=1207).

One is 95% confident that, if one were to ask every parent, between 3.8% and 6.2% of them would say they never spank their children. If one were to collect many random samples of 1207 parents, about 95% of the confidence intervals one constructs would contain the true proportion of all parents who would say that they never spank their children.

In a blood testing procedure, blood samples from 5 people are combined into one mixture. The mixture will only test negative if all the individual samples are negative. If the probability that an individual sample tests positive is 0.12, what is the probability that the mixture will test positive? Round to three decimal places.

P(mixture tests positive) = 1 - P(everyone tests negative) = 1 - (0.88)^5 = 0.472
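
The "at least one" pattern (one minus the probability of none) recurs in several cards in this set. A small sketch, assuming independent trials with the same probability each time; the function name is illustrative:

```python
def prob_at_least_one(p_single, n):
    """P(at least one success in n independent trials) = 1 - (1 - p)^n."""
    return 1 - (1 - p_single) ** n

# Pooled blood test: 5 samples, each positive with probability 0.12
p_mixture_positive = prob_at_least_one(0.12, 5)
print(round(p_mixture_positive, 3))  # 0.472
```

With p = 0.11 and n = 5 (light bulbs) this gives 0.442, and with p = 0.21 and n = 6 (soda caps) it gives 0.757.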

The errors in predicting hurricane tracks are given in nautical miles. A statute mile is 0.86898 nautical mile. Most people living on the Gulf Coast of the United States would prefer to know the prediction errors in statute miles rather than nautical miles. Explain why converting the errors to miles would not change the correlation between Prediction Error and Year.

The correlation will not change because it is based on standardized values (z-scores). The z-scores of the prediction errors are the same regardless of the units.

A study finds that during blizzards, online sales are highly associated with the number of snow plows on the road; the more plows, the more online purchases. The director of an association of online merchants suggests that the organization should encourage municipalities to send out more plows whenever it snows because, he says, that will increase business. Comment on the director's conclusion.

The director is wrong. The lurking variable here is the severity of the blizzard. A more severe blizzard calls for more plows and keeps people at home, where they are more likely to make online purchases.

Data were collected on the hourly wage ($) for two types of marketing managers: (1) advertising / promotion managers and (2) sales managers. The results were used to create the following histograms.

The distribution of hourly wages for sales managers is unimodal and skewed left.

A study of body fat on 250 men collected measurements of 12 body parts as well as the percentage of body fat that the men carried. The first accompanying display is a dotplot of their bicep circumferences (in centimeters). The second accompanying display was formed by dividing each measurement by 2.54 to convert it to inches. Do the two dot plots look different? What might account for that?

The dotplots look different. The plot based on inches has fewer values on the horizontal axis, so it shows less detail.

Prior to a mayoral election, a newspaper conducted a poll. The paper surveyed a random sample of registered voters stratified by political party, age, sex, and area of residence. This poll predicted that Candidate A would win the election with 52% of the vote. The newspaper was wrong: Candidate A lost, getting only 46% of the vote. Do you think the newspaper's faulty prediction is more likely to be a result of bias or sampling error? Explain.

The faulty prediction was more likely to be the result of sampling error. While the sampling method suggests that the sample obtained would be representative of the voting population, random chance in selecting the individuals who were polled means that sample statistics could vary from the population parameter.

A regression model uses a car's engine displacement to estimate its fuel economy. In this context, what does it mean to say that a certain car has a positive residual?

The gas mileage was better than the model predicts for a car with that size engine.

On a final project in an introductory statistics class, a student reports a 95% confidence interval for the average cost of a haircut to be ($5.50,$65.00). What is the correct interpretation of this confidence interval?

There is 95% confidence that the population mean is between these two numbers.

Meteorologists utilize sophisticated models to predict the weather up to ten days in advance. Give an example of how they might assess their models.

They can use the models to predict the average temperature ten days in advance and compare their predictions to the actual temperatures.

The managers of a large company wished to know the percentage of employees who feel "extremely satisfied" to work there. The company has roughly 28,000 employees. They contacted a random sample of employees and asked them about their job satisfaction, obtaining 491 completed responses. How does their study deal with the three Big Ideas of sampling?

They gathered data from only a part of the large population of employees, they selected that part at random, and a sample size of several hundred is reasonable.

A local TV station conducted a "Pulse-Poll" about the upcoming mayoral election. Evening news viewers were invited to text in their votes, with the results being announced on the late-night news. Based on the texts, the station predicted that the current mayor would win the election with 52% of the vote. They were wrong and the mayor lost, getting only 46% of the vote. Do you think the station's faulty prediction is more likely to be a result of bias or sampling error? Explain.

The station's faulty prediction is a result of bias. Only people watching the news will respond, and their preference may differ from that of other voters. The sampling method may systematically produce samples that do not represent the population of interest.

A philanthropic organisation sent free mailing labels and greeting cards to a random sample of 100,000 potential donors on their mailing list and received 5105 donations.

What is the 99% confidence interval? z* = 2.576 for 99% confidence, n = 100000, x = 5105, p^ = x/n = 5105/100000 = 0.05105. E = 2.576 × √(0.05105(1 - 0.05105)/100000) = 0.00179. Boundaries: p^ - E = 4.93%, p^ + E = 5.28%. Given the confidence interval you found, is the proposed true rate, 5.5%, plausible? No, because 5.5% lies above the upper boundary.
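
A one-proportion z-interval can be sketched as follows. The critical value 2.576 is passed in by hand, and the function name `proportion_interval` is made up for this example.

```python
import math

def proportion_interval(successes, n, z_star):
    """Return (low, high) for p-hat +/- z* * sqrt(p-hat(1 - p-hat)/n)."""
    p_hat = successes / n
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Donor mailing: 5105 donations out of 100,000, 99% confidence
low, high = proportion_interval(5105, 100_000, 2.576)
print(round(low * 100, 2), round(high * 100, 2))  # 4.93 5.28 (percent)
print(low <= 0.055 <= high)  # False: 5.5% falls outside the interval
```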

Of 535 samples of seafood purchased from various kinds of food stores in different regions of a country and genetically compared to standard gene fragments that can identify the species, 29% were mislabeled.

What is the 99% confidence interval? z* = 2.576 for 99% confidence. E = 2.576 × √(0.29(1 - 0.29)/535) = 0.0505. Boundaries: p^ - E = 0.29 - 0.0505 = 0.2395, p^ + E = 0.29 + 0.0505 = 0.3405, so the interval is roughly 24% to 34%. What does the confidence interval say about seafood sold in the country? We are 99% confident that the interval captures the true proportion of all seafood sold in the country that is mislabeled. Is the government spokesperson's criticism valid? No, as long as the necessary assumptions and conditions were met, the results can be generalized.

The proportion of adult women in a certain geographical region is approximately 48%. A marketing survey telephones 500 people at random.

What proportion of the sample of 500 would you expect to be women? 0.48. SD = √(0.48 × 0.52/500) = 0.022

According to a research survey, 28% of adults are pessimistic about the future of marriage and the family. That is based on a random sample of about 1600 people from a much larger body of adults. Is it reasonable for the research team to use a Normal model for the sampling distribution of the sample proportion? Why or why not?

Yes. The data are from a random sample, meeting the Randomization Condition. The data have at least 10 successes and 10 failures, meeting the Success/Failure Condition. The population is much larger than the sample, meeting the 10% Condition.

For each of the following, list the sample space and tell whether you think the events are equally likely: a) Roll two dice; record the sum of the numbers b) A family has 3 children; record each child's sex in order of birth c) Toss four coins; record the number of tails d) Toss a coin 10 times; record the length of the longest run of heads

a) S = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}; the events are not equally likely.
b) S = {BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG}; the events are equally likely.
c) S = {0, 1, 2, 3, 4}; the events are not equally likely.
d) S = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; the events are not equally likely.

A certain bowler can bowl a strike 70% of the time. What is the probability that she a) goes three consecutive frames without a strike? b) makes her first strike in the third frame? c) has at least one strike in the first three frames? d) bowls a perfect game (12 consecutive strikes)?

a) (0.30)^3 = 0.027
b) 0.30 × 0.30 × 0.70 = 0.063
c) 1 - 0.027 = 0.973
d) (0.70)^12 = 0.014

The prerequisite for a required course is that students must have taken either course A or course B. By the time they are juniors, 52% of the students have taken course A, 23% have had course B, and 12% have done both. a) What percent of the juniors are ineligible for the course? b) What's the probability that a junior who has taken course A has also taken course B? c) Are taking these two courses disjoint events? Explain. d) Are taking these two courses independent events? Explain.

a) P(A or B) = 0.52 + 0.23 - 0.12 = 0.63, so 1 - 0.63 = 0.37 are ineligible.
b) 0.12/0.52 = 0.231
c) No, because there are outcomes that are common between them.
d) No, because the outcome of one influences the probability of the other.
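
Parts a) and b) rest on the General Addition Rule and the definition of conditional probability. A brief sketch with illustrative variable names:

```python
p_a, p_b, p_both = 0.52, 0.23, 0.12  # course A, course B, both

# General Addition Rule: P(A or B) = P(A) + P(B) - P(A and B)
p_either = p_a + p_b - p_both
p_ineligible = 1 - p_either

# Conditional probability: P(B | A) = P(A and B) / P(A)
p_b_given_a = p_both / p_a

print(round(p_ineligible, 2))  # 0.37
print(round(p_b_given_a, 3))   # 0.231
```

For part d), independence would require P(B | A) to equal P(B). Here 0.231 is compared with 0.23, so the two are not exactly equal, although with these particular numbers they are very close.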

It's believed that as many as 24% of adults over 50 never graduated from high school. We wish to see if this percentage is the same among the 25 to 30 age group. What sample size would allow us to increase our confidence level to 95% while reducing the margin of error to only 3%?

p^ = 0.24, z* = 1.960 for 95% confidence, E = 0.03. n = (1.960/0.03)^2 (0.24)(1 - 0.24) = 778.6, which rounds up to 779

definition chap 10

population of interest: the population or group about which a researcher tries to draw conclusions
representative sampling: biased samples tend not to be representative; they generate statistics that are much higher or much lower than the parameters
sampling frame: list of all individuals from which the sample is drawn
systematic sample: cheaper alternative to an SRS, e.g. survey every 10th person
stratified sample: dividing the population into homogeneous groups called strata and sampling a proportionate amount from each
cluster sample: dividing the population into groups called clusters and selecting an SRS of the clusters
random: every individual has an equal chance of being selected
parameter: a number that summarizes data for an entire population
bias: systematic failure to obtain a sample that is representative of the population
convenience sampling: sampling individuals who are easy to reach
undercoverage: some portion of the population is not sampled or has a smaller representation
non-response bias: no survey succeeds in getting responses from everyone
response bias: anything in the survey design that influences the responses
voluntary response sample: people who have chosen to include themselves
multistage sampling: combination of two or more sampling methods
census: surveys each and every member of the population

You are doing a study for a non-profit group helping at-risk children in your city. Suppose you know that 14.2% of the children in your city live in poverty. This percentage is an example of a

population parameter

A company packaging potato chips maintains quality control by randomly selecting 20 cases from each day's production and weighing the bags. Then they open four random bags from each case and inspect the contents.

population: bags of potato chips produced by the company
population parameter of interest: unclear
sampling frame: all bags produced by the company each day
sample for this study: the 80 total bags that are opened from the selected cases
sampling method: multistage sampling
is randomization employed? Yes
is any group left out? No group was left out of the study
type of bias evident: none

You are trying to study the amount of financial aid students at your University receive. You sample 50 students and find out the average size of their financial aid packages. The average of your sample is a

sample statistic

Which matters more about a sample you draw from a population?

size of the sample

The side-by-side boxplots show the cumulative college GPAs for sophomores, juniors, and seniors taking an intro stats course in Autumn 2003.

sophomore

A friend of yours in your intro stats class obtains permission to randomly sample the University student body to conduct a satisfaction survey on some recent changes to the enrollment process. She randomly samples 50 freshmen, 50 sophomores, 50 juniors, and 50 seniors. This is an example of a

stratified sample

Molly's Reach, a regional restaurant and gift shop, has recently launched an increased sales campaign, particularly focused on their gift shop. To determine if their proposal is of interest, they plan to survey a random sample of their regular customers. Suppose that Molly's Reach has an alphabetized list of regular customers who belong to their rewards program. After randomly selecting a customer on the list, every 25th customer from that point on is chosen to be in the sample. What is this sampling plan called?

systematic

Should companies that promote teen smoking be liable to help pay for the costs of cancer institutions?

The question is biased toward yes because of the wording "promote teen smoking". A better question might be: "Should companies be responsible to help pay for the costs of cancer institutions?"

Your neighbor has bought a lottery ticket once a week for the last 10 years. He has not yet won, but feels his time is due. His chance of winning on the next ticket he buys is __________ the first one he ever bought.

the same as

Even though it is represented by numbers, this is categorical data and not suitable for correlation.

true

For a fixed margin of error, larger samples provide greater confidence.

true

For a specified confidence level, larger samples provide smaller margins of error.

true

For symmetric distributions, the median is in the middle.

true

We write y^ to denote the predicted values and y to denote the observed values.

true

An investment website can tell what devices are used to access their site. The site managers wonder whether they should enhance the facilities for trading via smartphones so they want to estimate the proportion of users who access the site that way. They draw a random sample of 451 investors from their customers. Suppose that the true proportion of smartphone users is 31%

unimodal and symmetric; mean = p = 0.31; n = 451, q = 1 - p = 1 - 0.31 = 0.69; SD = √(pq/n) = √(0.31 × 0.69/451) = 0.022

If you create an online survey, individuals can choose on their own whether to participate in the sample. This causes a form of bias called

voluntary response

Data were collected on monthly sales revenues (in $1,000s) and monthly advertising expenditures ($100s) for a sample of drug stores. The regression line relating revenues (Y) to advertising expenditure (X) is estimated to be y=−48.3+9.00x. What is the predicted sales revenue for a month in which $1,000 was spent?

$41,700
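
The only trap in this prediction is the units: x is in hundreds of dollars and y is in thousands of dollars. A sketch that makes both conversions explicit (the function name is hypothetical):

```python
def predicted_revenue_dollars(ad_spend_dollars):
    """Apply y-hat = -48.3 + 9.00x with x in $100s and y-hat in $1,000s."""
    x = ad_spend_dollars / 100       # convert dollars to hundreds of dollars
    y_hat_thousands = -48.3 + 9.00 * x
    return y_hat_thousands * 1000    # convert back to dollars

revenue = predicted_revenue_dollars(1000)  # $1,000 spent means x = 10
print(round(revenue))  # 41700
```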

A sociologist develops a test to measure attitudes about public transportation, and 27 randomly selected subjects are given the test. Their mean score is 76.2 and their standard deviation is 21.4. Construct the 95% confidence interval for the mean score of all such subjects. Round to the nearest two decimal places as needed.

(67.73, 84.67)

Does the presence of any outliers affect your overall conclusions about the prices in the four markets?

No, the presence of outliers does not affect the overall conclusions.

If the sex of a child is independent of all other births, is the probability of a woman giving birth to a girl after having four boys greater than it was on her first birth? Explain.

No, the probability after having four boys is equal to the probability on her first birth. If sex is independent of previous births, then the probability of a girl given she has had four boys must equal the probability of a girl.

The pie chart shows the ratings assigned to 839 first-run movies released in a recent year.

Yes, because each movie falls into only one category and no categories overlap.

The pie chart summarizes the genres of 110 first-run movies released one year.

Yes, because each movie falls into only one category and no categories overlap.

The Centers for Disease Control lists causes of death in the United States during 2013. (Each person is assigned only one cause of death.)

Yes, because there is no possibility for overlap.

An internet company conducts a global consumer survey to help multinational companies understand different consumer attitudes throughout the world. Within 30 countries, the researchers interview 1000 people aged 13-65. Their samples are designed so that they get 500 males and 500 females in each country.

a) Are they using a simple random sample? Explain. No. It would be nearly impossible to get exactly 500 males and 500 females from every country by random chance. b) What kind of design do you think they are using? A stratified sample, stratified by whether the respondent is male or female.

A student figures that he has a 32% chance of being let out of class late. If he leaves class late, there is a 60% chance that he will miss his train. What is the probability that he gets out of class late and misses the train?

0.32 x 0.60 = 0.192

A manufacturing process has a 70% yield, meaning that 70% of the products are acceptable and 30% are defective. If three of the products are randomly selected, find the probability that all of them are acceptable.

0.343

A survey of senior citizens at a doctor's office shows that 40% take blood pressure-lowering medication, 47% take cholesterol-lowering medication, and 13% take both medications. What is the probability that a senior citizen takes either blood pressure-lowering or cholesterol-lowering medication?

0.40 + 0.47 - 0.13 = 0.74

A professor divided the students in her business class into three groups: those who have never taken a statistics class, those who have taken only one semester of a statistics class, and those who have taken two or more semesters of statistics. The professor randomly assigns students to groups of three to work on a project for the course. If 55% of the students have never taken a statistics class, 25% have taken only one semester of a statistics class, and the rest have taken two or more semesters of statistics, what is the probability that the first groupmate you meet has studied some statistics?

0.45 because 0.55 have never taken the class

In one city, 47.5% of adults are female, 10.2% of adults are left-handed, and 4.8% are females who are left handed. For an adult selected at random from the city, let F=event the person is female L=event the person is left-handed. Find P(F or L). Round to three decimal places.

0.475 + 0.102 - 0.048 = 0.529

The number of hours per week that high school seniors spend on homework is normally distributed, with a mean of 10 hours and a standard deviation of 3 hours. 60 students are chosen at random. Let x̄ represent the mean number of hours spent on homework for this group. Find the probability that x̄ is between 9.8 and 10.4. Round to three decimal places.

0.547
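
The sample mean of 60 students follows a Normal model with the same mean and standard deviation 3/√60. A sketch using the standard-library `statistics.NormalDist`, with illustrative variable names:

```python
import math
from statistics import NormalDist

# Sampling distribution of the mean: N(10, 3 / sqrt(60))
standard_error = 3 / math.sqrt(60)
sample_mean = NormalDist(mu=10, sigma=standard_error)

p_between = sample_mean.cdf(10.4) - sample_mean.cdf(9.8)
print(round(p_between, 3))
```

Computed without rounding, the result is about 0.546. The card's 0.547 is what a z-table gives once the z-scores are rounded to -0.52 and 1.03.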

A nervous kicker usually makes 82% of his first field goal attempts. If he makes his first attempt, his success rate rises to 89%. What is the probability that he makes his first two kicks?

0.82 x 0.89 = 0.73

The auto insurance industry crashed some test vehicles into a cement barrier at speeds of 5 to 25 mph to investigate the amount of damage to the cars. They found a correlation of r=0.60 between speed (MPH) and damage ($). If the speed at which a car hit the barrier is 1.5 standard deviations above the mean speed, what is the damage expected to be?

0.60 × 1.5 = 0.90 standard deviations above the mean damage
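
In standardized form, the least squares line predicts a z-score for y equal to r times the z-score for x. Written out for this card, with the subscripts chosen here for illustration:

```latex
\hat{z}_{\text{damage}} = r \, z_{\text{speed}} = 0.60 \times 1.5 = 0.90
```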

You purchased a five-pack of new light bulbs that were recalled because 11% of the lights did not work. What is the probability that at least one of your lights is defective?

1 - 0.11 = 0.89; (0.89)^5 = 0.558; 1 - 0.558 = 0.442

For a sales promotion, the manufacturer places winning symbols under the caps of 21% of all its soda bottles. If you buy a six-pack of soda, what is the probability that you win something?

1-0.21 = 0.79 (0.79)^6 = 0.243 1-0.243 = 0.757

Corey has 4929 songs in his computer's music library. The songs have a mean duration of 244.7 seconds with a standard deviation of 110.31 seconds. One of the songs is 382 seconds long. What is its z-score?

1.24

Based on a sample of 30 randomly selected years, a 90% confidence interval for the mean annual precipitation in one city is from 48.7 inches to 51.3 inches. Find the margin of error.

1.3

Based on past experience, a bank believes that 4% of the people who receive loans will not make payments on time. The bank has recently approved 300 loans. What is the mean of the proportion of clients in this group who may not make timely payments?

4%

What is the percentage of consumers who are male and prefer Fujifilm?

4.1%

In a survey of 300 T.V. viewers, 40% said they watch network news programs. Find the margin of error for this survey if we want 95% confidence in our estimate of the percent of T.V. viewers who watch network news programs. Round to two decimal places.

z* = 1.960 for 95% confidence; ME = 1.960 × sqrt(0.40(1 - 0.40)/300) = 0.0554, or 5.54%

Based on the Normal model for yearly snowfall in cm in a certain town N(57, 8), how many cm's of snow would represent the 80th percentile approximately? Round to the nearest tenth as needed.

63.7 cm
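
One way to check this percentile is with Python's standard library. This is a sketch that reads N(57, 8) as mean 57 cm and standard deviation 8 cm.

```python
from statistics import NormalDist

snowfall = NormalDist(mu=57, sigma=8)  # N(57, 8) read as mean 57, SD 8
p80 = snowfall.inv_cdf(0.80)           # value with 80% of the area below it
print(round(p80, 1))  # 63.7
```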

Finding Appropriate z*-Values for Given Confidence Levels

80%: 1.28, 90%: 1.645, 95%: 1.960, 98%: 2.326, 99%: 2.576
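
These critical values can be recovered from the standard Normal model: a confidence level c leaves (1 - c)/2 in each tail. A small sketch using the standard library (the helper name z_star is illustrative):

```python
from statistics import NormalDist

def z_star(confidence):
    """Two-sided critical value that leaves (1 - confidence)/2 in each tail."""
    return NormalDist().inv_cdf((1 + confidence) / 2)

for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: z* = {z_star(level):.3f}")
```

To three decimals the 80% value is 1.282; the card rounds it to 1.28.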

A survey found that 79% of a random sample of 1024 American adults approved of cloning endangered animals. Find the margin of error for this survey if we want 90% confidence in our estimate of the percent of American adults who approve of cloning endangered animals.

z* = 1.645 for 90% confidence; ME = 1.645 × sqrt(0.79(1 - 0.79)/1024) = 0.0209, or 2.09%
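
The margin of error for a sample proportion is ME = z* × sqrt(p^(1 - p^)/n). The sketch below applies it to the network news survey (n = 300) and the cloning survey (n = 1024); the function name is an assumption made for illustration.

```python
import math

def margin_of_error(p_hat, n, z_star):
    """Margin of error for a one-proportion z-interval."""
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)

news = margin_of_error(0.40, 300, 1.960)      # 95% confidence
cloning = margin_of_error(0.79, 1024, 1.645)  # 90% confidence
print(f"{news:.2%}")     # 5.54%
print(f"{cloning:.2%}")  # 2.09%
```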

91%

Based on data collected from its production processes, Crosstiles Inc. determines that the breaking strength of its most popular porcelain tile is normally distributed with a mean of 400 pounds per square inch and a standard deviation of 12.5 pounds per square inch. Based on the 68-95-99.7 Rule, about what percent of its popular porcelain tile will have breaking strengths between 375 and 425 pounds per square inch?

About 95%: 375 and 425 are each 2 standard deviations (2 × 12.5 = 25) from the mean of 400.

When 346 college students are randomly selected and surveyed, it is found that 121 own a car. Construct a 99% confidence interval for the percentage of all college students who own a car. Round to one decimal place.

z* = 2.576 for 99% confidence; p^ = 121/346 = 0.35; E = 2.576 × sqrt(0.35(1 - 0.35)/346) = 0.066054; interval: 0.35 ± E = (28.4%, 41.6%)

Suppose that 36% of families living in a certain country own a desktop computer and 21% own a laptop. The Addition Rule might suggest, then, that 57% of families own either a desktop computer or a laptop. What's wrong with that reasoning?

A family may own both a desktop computer and a laptop. The events are not disjoint, so the Addition Rule does not apply.

A friend says "I flipped five heads in a row! The next one has to be tails!" Explain why this thinking is incorrect.

Coin flips are independent, so the coin has no memory of earlier outcomes. The probability of tails on the next flip is still 50%.

An insurance company checks police records on 584 accidents selected at random and notes that teenagers were at the wheel in 81 of them.

Construct the 95% confidence interval for the percentage of all auto accidents that involve teenage drivers. n = 584, x = 81, p^ = x/n = 81/584 = 0.139, E = 1.960 × sqrt(0.139(1 - 0.139)/584) = 0.0281. Boundaries: p^ - E = 11.1%, p^ + E = 16.7%.
Explain what your interval means. We are 95% confident that the true percentage of accidents involving teenagers falls inside the confidence interval limits.
Explain what "95% confidence" means. About 95% of random samples of size 584 will produce confidence intervals that contain the true proportion of accidents involving teenagers.
A politician urging tighter restrictions on drivers' licenses issued to teens says, "In one of every five auto accidents, a teenager is behind the wheel." Does the confidence interval contradict this statement? The confidence interval contradicts the assertion of the politician. The figure quoted by the politician (one in five, or 20%) is outside the interval.
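
The one-proportion z-intervals for the car-ownership data (121 of 346, 99% confidence) and the teen-driver data (81 of 584, 95% confidence) can be reproduced with a short sketch. The function name is illustrative, and the critical value is computed from the standard Normal model rather than read from a table, so the last digit of intermediate values may differ slightly from hand calculations.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence):
    """One-proportion z-interval, returned as (lower, upper)."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

cars = proportion_ci(121, 346, 0.99)
teens = proportion_ci(81, 584, 0.95)
print([round(100 * b, 1) for b in cars])   # [28.4, 41.6]
print([round(100 * b, 1) for b in teens])  # [11.1, 16.7]
```

The politician's 20% figure lies above the upper bound of the teen-driver interval.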

A college's data about the incoming freshmen indicates that the mean of their high school GPAs was 3.3, with a standard deviation of 0.35; the distribution was roughly mound-shaped and only slightly skewed. The students are randomly assigned to freshman writing seminars in groups of 25.

Describe the appropriate sampling distribution model, including shape, center, and spread. N(3.3, 0.07), since the standard deviation of the sample mean is 0.35/sqrt(25) = 0.07.
What assumptions and conditions must be satisfied for the sampling distribution model to be appropriate? Select all that apply. - The distribution of GPAs is roughly unimodal and symmetric, so the sample is large enough. - Individuals' GPAs are independent. - The students represent less than 10% of all possible students.
Make a sketch using the 68-95-99.7 Rule. Choice C in the original exercise (figure not reproduced here): a Normal curve centered at 3.3 with tick marks every 0.07, from 3.09 to 3.51.
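
A brief sketch of where N(3.3, 0.07) and the 68-95-99.7 marks come from: the standard deviation of the sample mean is sigma/sqrt(n).

```python
import math

mu, sigma, n = 3.3, 0.35, 25
sd_mean = sigma / math.sqrt(n)  # SD of the sampling distribution of the mean

print(round(sd_mean, 2))  # 0.07
for k in (1, 2, 3):
    # Intervals holding about 68%, 95%, and 99.7% of sample means
    print(f"within {k} SD: {mu - k * sd_mean:.2f} to {mu + k * sd_mean:.2f}")
```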

Medical researchers followed 1435 middle-aged men for a period of 5 years, measuring the amount of Baldness present (none=1, little=2, some=3, much=4, extreme=5) and presence of Heart Disease (No=0, Yes=1). They found a correlation of 0.089 between the two variables. Comment on their conclusion that this shows that baldness is not a possible cause of heart disease.

Even though it is represented by numbers, this is categorical data and not suitable for correlation.

A casino claims that its roulette wheel is truly random. What should that claim mean?

Every number is equally likely to occur.

A larger firm is considering acquiring a small bookstore. An analyst for the firm, noting that there is a strong, positive relationship between the number of sales people working and the amount of sales, suggests that when they acquire the store they should hire more people because that will drive higher sales. Is his conclusion justified? What alternative explanations can you offer? Use appropriate statistics terminology.

His conclusion is not justified. Correlation does not demonstrate causation. When there is a strong correlation, it is not justifiable to conclude that the predictor variable caused the change. The analyst's argument is that sales staff causes sales. However, it may be the reverse, that more people were hired as sales increased.

It's believed that as many as 21% of adults over 50 never graduated from high school. We wish to see if this percentage is the same among the 25 to 30 age group.

How many of this younger age group must we survey in order to estimate the proportion of non-grads to within 4% with 90% confidence? p^ = 0.21, z* = 1.645 (90% confidence), E = 0.04: n = (1.645/0.04)^2 (0.21)(1 - 0.21) = 281 (rounded up).
Suppose we want to cut the margin of error to 3%. What is the necessary sample size? E = 0.03: n = (1.645/0.03)^2 (0.21)(1 - 0.21) = 499.
What sample size would produce a margin of error of 2%? E = 0.02: n = (1.645/0.02)^2 (0.21)(1 - 0.21) = 1123.

In preparing a report on the economy, we need to estimate the percentage of businesses that plan to hire additional employees in the next 60 days.

How many randomly selected employers must we contact in order to create an estimate in which we are 95% confident with a margin of error of 8%? 0.08 = 1.96 × sqrt((0.50)(0.50)/n), so n = (1.96)^2 (0.50)(0.50)/(0.08)^2 = 151.
Suppose we want to reduce the margin of error to 5%. What sample size will suffice? 385.
Why might it not be worth the effort to try to get an interval with a margin of error of 1%? The required sample size becomes very large (about 9,604), and it is probably not worth the effort.
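
The sample size answers in the high school graduation and hiring cards all come from solving ME = z* × sqrt(p(1 - p)/n) for n and rounding up. A sketch, with an illustrative function name and p = 0.5 as the conservative default when no prior estimate is available:

```python
import math

def sample_size(z_star, margin, p=0.5):
    """Smallest n whose margin of error is at most `margin`."""
    return math.ceil((z_star / margin) ** 2 * p * (1 - p))

# Non-graduates: prior estimate 21%, 90% confidence
print([sample_size(1.645, e, 0.21) for e in (0.04, 0.03, 0.02)])  # [281, 499, 1123]
# Hiring plans: no prior estimate, 95% confidence
print([sample_size(1.96, e) for e in (0.08, 0.05)])               # [151, 385]
```

Halving the margin of error roughly quadruples the required sample size, which is why very small margins become impractical.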

A regression analysis of students' college grade point averages (GPAs) and their high school GPAs found R2=0.311. Which of these is TRUE?

III only: 31.1% of the variance in college GPA can be accounted for by the model.

Concerned about reports of discolored scales on fish caught downstream from a newly sited chemical plant, scientists set up a field station in a shoreline public park. For one week they asked fishermen there to bring any fish they caught to the field station for a brief inspection. At the end of the week, the scientists said that 22% of the 338 fish that were submitted for inspection displayed the discoloration. From this information, can the researchers estimate what proportion of fish in the river have discolored scales? Explain.

Not reliably. If discolored fish are not as likely to be caught as normal fish, or if fishermen are more disposed to bring in discolored fish than normal fish, then the sample will be biased and so will the resulting estimate.

Flipping a fair coin is said to randomly generate heads and tails with equal probability. Explain what random means in this context.

In the long run, a fair coin will generate 50% heads and 50% tails, approximately. But for each flip, the outcome cannot be predicted.

Satellites send back nearly continuous data on the Earth's land masses, oceans, and atmosphere from space. How might researchers use this information in both the short and long term to help study changes in the Earth's climate?

In the short term, researchers can more accurately report weather patterns, including hurricanes and tsunamis. In the long term, data on the rise and fall of temperatures and water levels can help in planning for future problems and guide public policy to protect our safety.

The Environmental Protection Agency provides fuel economy and pollution information on over 2000 car models. Here is a boxplot of combined fuel economy (using an average of driving conditions) in miles per gallon by vehicle type (midsize car, standard pickup truck, or SUV) for 2012 model vehicles. Summarize the fuel economies of the three vehicle types.

In general, fuel economy is higher in cars than in either SUVs or pickup trucks. The top 50% of cars get higher fuel economy than 75% of SUVs and all pickups. The distribution for pickups shows less spread.

A town's January high temperatures average 37°F with a standard deviation of 10°, while in July the mean high temperature is 72° and the standard deviation is 8°. In which month is it more unusual to have a day with a high temperature of 53°? Explain.

It is more unusual to have a day with a high temperature of 53° in July. A high temperature of 53° in July is 2.375 standard deviations below the mean, while a high temperature of 53° in January is only 1.600 standard deviations above the mean.
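
The comparison rests on z-scores, z = (value - mean)/SD; the value farther from 0 is the more unusual one. A minimal sketch:

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

z_january = z_score(53, 37, 10)
z_july = z_score(53, 72, 8)
print(z_january, z_july)              # 1.6 -2.375
print(abs(z_july) > abs(z_january))   # True: 53° is more unusual in July
```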

A study of traffic delays in 68 cities found the relationship shown in the scatterplot to the right between Total Delay (in total hours lost) and Mean Highway Speed. Is it appropriate to summarize the strength of association with a correlation? Explain.

It is not appropriate to calculate the correlation, because there is at least one outlier and the trend is not linear.

Suppose that a Normal model describes the acidity (pH) of rainwater, and that water tested after last week's storm had a z-score of 1.8. What does the z-score imply about the acidity of the rain?

It means that the acidity of the rain had a pH 1.8 standard deviations higher than that of average rainwater.

Consider drawing a random sample only from landline phone exchanges. Discuss the advantages and disadvantages of such a sampling method compared with surveying randomly generated telephone numbers from non-landline (cell phone) exchanges. Do you think these advantages and disadvantages have changed over time? How do you expect they'll change in the future?

Landline phones are more likely to be used by households, older individuals, and businesses. This will cause an undercoverage bias. These disadvantages will increase. As cell phones grow in use, and landline phones become less common, this problem will become greater.

A poll taken this year asked 1019 adults whether they were fans of a particular sport and 38% said they were. Last year, 43% of a similar-size sample had reported being fans of the sport. Complete parts a through e below.

At 90% confidence: ME = 1.645 × sqrt(0.38(1 - 0.38)/1019) = 0.025.
Explain what that margin of error means. One is 90% confident that this sample proportion is within ±ME of the true proportion of adults who are fans of the sport.
If one wanted to be 80% confident instead of 90% confident, would the margin of error be larger or smaller? To be less confident, the interval needs to contain the true proportion less often, so the margin of error would be smaller.
Find the margin of error for the poll taken this year if one wants 80% confidence in the estimate of the percent of adults who are fans of the sport. ME = 1.28 × sqrt(0.38(1 - 0.38)/1019) = 0.019.
In general, if all other aspects of the situation remain the same, will smaller margins of error produce greater or less confidence in the interval? Less confidence.

In an effort to increase the sales of their more expensive larger-sized pizzas, a pizzeria analyzed how its coupons were used by customers in regard to what size pizza they chose. The table below shows the percentage of coupons used for each size of pizza, and the percentage of each type of pizza ordered during a four-month period. Compare the two distributions.

Medium-size pizzas were most likely to be ordered during the four-month period; this size pizza composed a relatively large proportion of the coupons that were used during the four-month period. Most coupons during the four-month period were used to order small pizzas; this pizza size composed a relatively small proportion of all the pizzas that were ordered.

For her final project, Stacy plans on surveying a random sample of 40 students on whether they plan to go to Florida for Spring Break. From past years, she guesses that about 12% of the class goes. Is it reasonable for her to use a Normal model for the sampling distribution of the sample proportion? Why or why not?

No, because the data do not meet the Success/Failure Condition: she expects only 40(0.12) = 4.8 successes, which is fewer than 10.
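
The Success/Failure Condition asks for at least 10 expected successes and at least 10 expected failures. A small sketch of the check (the function name and the default threshold argument are illustrative):

```python
def success_failure_ok(n, p, minimum=10):
    """True when both n*p and n*(1 - p) reach the minimum expected count."""
    return n * p >= minimum and n * (1 - p) >= minimum

print(40 * 0.12)                      # about 4.8 expected successes
print(success_failure_ok(40, 0.12))   # False
```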

A survey of the world's nations in 2014 shows a strong positive correlation between percentage of the country using smart phones and life expectancy in years at birth.

No. It just means that in countries where smart phone use is high, the life expectancy tends to be high as well. General economic conditions of the country could affect both smart phone use and life expectancy. This is a lurking variable.

The distribution of scores on a test for a particular class is skewed to the left. The professor wants to predict the maximum score and understand the distribution of the sample maximum. She simulates the distribution of the maximum of the test for 34 different tests (with n = 5). The histogram to the right shows a simulated sampling distribution of the sample maximum from these tests. Complete parts a) and b) below.

No. The sampling distribution of the maximum is skewed to the left, so a Normal model would not be useful for this sampling distribution. No. The 68-95-99.7 Rule is based on the Normal distribution.

On a certain ship that sank, the probability of survival was 0.394. Among first class passengers, it was 0.794. Were survival and ticket class independent? Explain.

No, because the probability of survival and the probability of survival given a first class passenger are not the same.

The weather reporter on TV makes predictions such as a 25% chance of rain. What is the meaning of such a phrase?

On days with conditions such as these, rain occurs 25% of the time.

A research company polled a random sample of 799 U.S. teens about Internet use. 49% of those teens reported going online several times a day—a fact of great interest to advertisers. The 95% confidence interval for this number is from 45.6% to 52.5%. Complete parts a and b below.

One is 95% confident that, if one were to ask all teens whether they go online several times a day, between 45.6% and 52.5% of them would say they do. If one were to collect many random samples of 799 teens, about 95% of the confidence intervals one constructs would contain the true proportion of all teens who admit to going online several times a day.

For each situation described below, identify the population and the sample, explain what p and p^ represent, and tell whether it is appropriate to create a one-proportion z-interval.

Police set up an auto checkpoint at which drivers are stopped and their cars inspected for safety problems. They find that 10 of the 143 cars stopped have at least one safety violation. They want to estimate the percentage of all cars that may be unsafe. Population: all cars. Sample: the cars actually stopped at the checkpoint. p: the proportion of all cars with safety problems. p^: the proportion of cars actually seen with safety problems. Is it appropriate to create a one-proportion z-interval? Only if the sample (a cluster sample) is representative of all cars and there are at least 10 successes and 10 failures. With 10 violations and 133 cars without one, the Success/Failure Condition is met only at its minimum, so the interval should be treated with caution.
An online blog puts a poll on its website asking readers to rate the quality of its articles. Of the 591 people who voted, 450 rated the articles very highly. The blog is interested in gauging its users' satisfaction. Population: the blog's readers. Sample: those who voted. p: the proportion of all the blog's readers who rate its articles very highly. p^: the proportion of those who voted in the poll who rated the articles very highly. Is it appropriate to create a one-proportion z-interval? It is not appropriate because the sample is biased and nonrandom.
A college admits 1501 freshmen one year, and four years later, 1180 of them graduate on time. The college wants to estimate the percentage of all their freshman enrollees who graduate on time. Population: all students at the college. Sample: the 1501 freshmen admitted that year. p: the proportion of all students at the college who will graduate on time. p^: the proportion of that year's students who graduate on time. Is it appropriate to create a one-proportion z-interval? If that year's students (a cluster sample) are viewed as a representative sample of all possible students at the school, then it is appropriate.

Some food retailers propose subjecting food to a low level of radiation in order to improve safety, but sale of such "irradiated" food is opposed by many people. Suppose a grocer wants to find out what his customers think. He has cashiers distribute surveys at checkout and ask customers to fill them out and drop them in a box near the front door. He gets responses from 110 customers, of whom 71 oppose the radiation treatments. What can the grocer conclude about the opinions of all his customers?

Probably nothing, because those who bothered to fill out the survey may be a biased sample.

When you sample so that every combination of individuals in your population has an equal chance of being chosen you are taking a

SRS

What problems do you see with asking the following question of students? "Are you the first member of your family to seek higher education?"

Several terms are poorly defined. The survey needs to specify the meaning of "family" for this purpose and the meaning of "higher education." The term "seek" is also poorly defined as it does not specify what qualifies as seeking more education.

Describe how the shape, center, and spread of t-models change as the number of degrees of freedom increases.

Shape becomes closer to Normal, center does not change, spread becomes narrower.

Is there an association between time of year and the nighttime temperature in North Dakota? A researcher assigned the numbers 1-365 to the days January 1-December 31 and recorded the temperature at 2:00 a.m. for each. What might you expect the correlation between DayNumber and Temperature to be? Explain.

Temperatures should be low in January, increase through spring and into the summer months, and then decrease again in the fall and winter. Since the relationship is not linear, the correlation should be near 0.

What summary statistic would be chosen to summarize the spread of this distribution? Why?

The IQR would be the most appropriate measure of spread because of the slight skew and the extreme outliers.

Administrators at a university were interested in estimating the percentage of students who plan on going abroad during college. The university's student body has about 44,000 members. How might the administrators answer their question by applying the three Big Ideas?

The administrators should take a survey. They should sample a part of the student body, selecting respondents with a randomization method. They should be sure to draw a sufficiently large sample.

Two members of the PTA committee have proposed the accompanying questions to ask in seeking parent's opinions. Question 1 is "Should elementary school-age children have to pass high-stakes tests in order to remain with their classmates?" and Question 2 is "Should schools and students be held accountable for meeting yearly learning goals by testing students before they advance to the next grade?" Complete parts a and b below.

The answers for these two questions will definitely differ. Question 1 will probably get many "No" answers, while Question 2 will get many "Yes" answers. This is an example of response bias. A more neutral question might be: "Do you think standardized tests are appropriate for deciding whether a student should be promoted to the next grade?"

Which of the following is the Area Principle for displaying data?

The area occupied by a part of the graph should correspond to the magnitude of the value it represents.

The pie chart given to the right and bar chart given below summarize the movie genres of all the films shown in a suburban theatre over the course of one year. Complete parts a and b.

The bar chart, because it is easier to tell the size differences of the bars in the bar chart. The slices of the pie chart are too close in size.

A survey finds that a 95% confidence interval for the mean salary of a police patrol officer in a certain city in a recent year is $52,516 to $53,509. A student is surprised that so few police officers make more than $53,509. Explain what is wrong with the student's interpretation.

The confidence interval only estimates the population mean salary. The interval does not say anything about individual salaries.

A survey of 1021 school-age children was conducted by randomly selecting children from several large urban elementary schools. Two of the questions concerned eye and hair color. In the survey, the accompanying codes were used. The statistics students analyzing the data were asked to study the relationship between eye and hair color. They provided the accompanying plot. Is their graph appropriate? If so, summarize the findings. If not, explain why not.

The graph is not appropriate. Despite having numerical codes, hair color and eye color are categorical variables. Boxplots are only appropriate for quantitative data.

A university teacher saved every e-mail from students in a large introductory statistics class during an entire term. He then counted, for each student who had sent him at least one e-mail, how many e-mails each student had sent. What does the accompanying histogram say about the distribution of e-mails sent by students?

The histogram is skewed right. It was most common for students to send just one or two e-mail messages, and most sent five messages or fewer. There was one outlier that sent 21 e-mails.

The accompanying histogram shows the total number of adoptions in each of 43 regions. Determine whether the mean number of adoptions or the median number of adoptions is higher. Why?

The mean is higher because the distribution is skewed to the high end, so the mean is pulled toward the higher values.

A research company polled a random sample of 930 teens about Internet use. 56% of those teens reported going online several times a day—a fact of great interest to advertisers. Complete parts a through c below.

The meaning of p^ = 0.56 is that 56% of the 930 teens in the sample said they go online several times a day. This is the researchers' best estimate of p, the proportion of all U.S. teens who would say they do so. SE = 0.016. Explain what this standard error means in the context of this situation. The standard error is the best estimate of the standard deviation of the sampling distribution of the proportions, which measures the amount of variation in the sample proportion expected to be seen from sample to sample when 930 teens are asked the polling question.
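
The standard error quoted here follows from SE = sqrt(p^(1 - p^)/n). A quick sketch to confirm the value (the function name is illustrative):

```python
import math

def standard_error(p_hat, n):
    """Estimated SD of the sampling distribution of a sample proportion."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

se = standard_error(0.56, 930)
print(round(se, 3))  # 0.016
```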

What summary statistic would be chosen to summarize the center of this distribution? Why?

The median would be the most appropriate measure of center because of the slight skew and the extreme outliers.

An analysis of the amount of fiber (in grams) and the potassium content (in milligrams) in servings of 77 breakfast cereals produced the regression model Potassium=35+27Fiber. Explain what the slope means.

The model predicts that cereals will have approximately 27 more milligrams of potassium for every additional gram of fiber.

Sensors in parking lots are able to detect and communicate when spaces are filled in a large covered parking garage next to an urban shopping mall. How might the owners of the parking garage use this information both to attract customers and to help the store owners in the mall make business plans?

The owners of the parking garage can advertise about the availability of parking. They can also communicate with businesses about hours when more spots are available and when they should encourage more business.

An organization awards prizes in six categories to people each year. Their website allows you to look up all the prizes awarded in any year. The data are not listed in a table. Rather you drag a slider to the year and see a list of the awardees for that year. Describe the "who" in this scenario.

The people who have been awarded a prize from the organization

The company plans to have the manager of each corporate division hold a meeting of their employees to ask whether they are unhappy on their jobs. They will ask people to raise their hands to indicate whether they are unhappy. What problems do you see with this plan?

The plan is likely to have biased results because employees won't want to express unhappiness in front of their supervisors or their coworkers.

Pollsters are interested in predicting the outcome of elections. Give an example of how they might model whether someone is likely to vote.

The pollsters might consider whether a person voted previously or whether he or she could name the candidates, which indicates a greater interest in the election.

For many people, breakfast cereal is an important source of fiber in their diets. Cereals also contain potassium, a mineral shown to be associated with maintaining a healthy blood pressure. An analysis of the amount of fiber (in grams) and the potassium content (in milligrams) in servings of 77 breakfast cereals produced the regression model Potassium=38+27Fiber. From this model you can estimate a cereal's potassium content from the amount of fiber it contains. In this context, what does it mean to say that a cereal has a negative residual?

The potassium content is actually lower than the model predicts for a cereal with that much fiber.
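
A residual is the observed value minus the predicted value. The sketch below uses the model Potassium = 38 + 27 Fiber from this card; the cereal's fiber and potassium figures are hypothetical, chosen only to illustrate a negative residual.

```python
def predicted_potassium(fiber_g):
    """Predicted potassium (mg) from the card's model: 38 + 27 * fiber."""
    return 38 + 27 * fiber_g

fiber_g = 3        # hypothetical cereal: 3 g of fiber
observed_mg = 100  # hypothetical observed potassium, in mg

residual = observed_mg - predicted_potassium(fiber_g)
print(predicted_potassium(fiber_g))  # 119
print(residual)                      # -19: less potassium than predicted
```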

Traffic checks on a certain section of highway suggest that 70% of drivers are speeding there. Since 0.7×0.7=0.49, the multiplication rule might suggest that there is a 49% chance that two vehicles in a row are both speeding. What's wrong with that reasoning?

There are cases when the speed of one car is not independent of the speed of another car, so the multiplication rule does not apply.

Suppose we want to estimate the proportion of defective items produced by a manufacturing process. Could we use the methods of this chapter to answer this question?

The president of the university plans a speech to an alumni group. He plans to talk about the proportion of students who responded in the survey that they are the first in their family to attend college, but the first draft of his speech treats that proportion as the actual proportion of current students who are the first in their families to attend college. Explain to the president the difference between the proportion of respondents who are first attenders and the proportion of the entire student body that are first attenders. Use appropriate statistics terminology.

The proportion of students who responded in the survey that are the first in their family to attend college is a statistic. The proportion of all students that are the first in their family to attend college is a parameter. The statistic estimates the parameter, but is not likely to be exactly the same.

Given that 16-year-olds are old enough to drive, is it fair to set the gambling age at

The question is biased toward "no" because of the preamble "16-year-olds are old enough to drive." A better question may be "Do you think the gambling age should be set at 18?"

A recent public survey asked the following question. "Many people believe this playground is too small and in need of repair. Do you think the playground should be repaired and expanded even if that means raising the entrance fee to the park?"

The question mentions higher fees, which could make people reject improvements to the playground. The statement points out problems the respondent may not have noticed, and might lead them to feel they should agree.

After an unusually dry autumn, a radio announcer is heard to say, "Watch out! We'll pay for these sunny days later on this winter." Explain what he's trying to say, and comment on the validity of his reasoning.

The radio announcer is trying to use the "Law of Averages." His reasoning is invalid; if rain in the fall and winter are independent of each other, a nice fall will have no bearing on winter rains.

Is it reasonable to conclude that 5.05% of all U.S. adults think that the higher education system provides an excellent value? Why or why not?

The sample is likely representative of the population of U.S. adults, so the true value may be close to 5.05%, but not exactly 5.05%, because this is only an estimate.

In a study of streams in the Adirondack Mountains, the following relationship was found between the water's pH and its hardness (measured in grains). Is it appropriate to summarize the strength of association with a correlation?

The scatterplot is not linear; correlation is not appropriate.

Analysis of the relationship between the fuel economy (mpg) and engine size (in liters) for 35 models of cars produces the regression model mpg=36.55−3.843•Engine size. Explain what the slope means.

The slope represents the decrease in mpg per 1 liter increase in engine size.

The boxplot shows the fuel economy ratings for 67 subcompact cars with the same model year. Some summary statistics are also provided. The extreme outlier is an electric car whose electricity usage is equivalent to 112 miles per gallon. If that electric car is removed from the data set, how will the standard deviation be affected? The IQR?

The standard deviation will be much lower. Since the standard deviation is calculated by summing the squared differences between the data values and the mean, removing the electric car will drastically lower this sum. The IQR will not change very much, if at all. All that removing the electric car can do is possibly change the location of each quartile to be the preceding data value, which will not have a huge impact on the IQR.
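
The contrast between the two measures can be seen on a small made-up data set. The mpg values below are hypothetical, not the 67 cars from the exercise; only the 112 mpg outlier mirrors the card.

```python
from statistics import quantiles, stdev

def iqr(values):
    """Interquartile range, using the statistics module's default quartile method."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1

without_outlier = [28, 29, 30, 31, 32, 33, 34, 35]  # hypothetical mpg ratings
with_outlier = without_outlier + [112]              # add the electric car

sd_with, sd_without = stdev(with_outlier), stdev(without_outlier)
iqr_with, iqr_without = iqr(with_outlier), iqr(without_outlier)

print(f"SD:  {sd_with:.1f} with outlier, {sd_without:.1f} without")
print(f"IQR: {iqr_with:.2f} with outlier, {iqr_without:.2f} without")
```

In this toy data set, removing the outlier shrinks the standard deviation by a large amount while the IQR barely moves.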

Least squares means that some of the squares of the residuals are minimized

The statement is false. Least squares means the sum of the squared residuals is minimized.

The company's annual report states, "Our survey shows that 84.28% of our employees are 'very happy' working here." Comment on that claim. Use appropriate statistics terminology.

The survey result is a statistic. It estimates the true proportion of satisfied workers in the population.

An analysis of the amount of fiber (in grams) and the potassium content (in milligrams) in servings of 77 breakfast cereals produced the regression model Potassium=39+29Fiber and se=30.84. Explain in this context what se=30.84 means.

The true potassium contents of cereals vary from the predicted amounts with a standard deviation of 30.84 milligrams.

A medical researcher estimates the percentage of children exposed to lead-based paint, adding that he believes his estimate has a margin of error of about 10%. Explain what the margin of error means.

The true proportion is within 10% of his estimate, with some degree of confidence.

A TV newscaster reports the results of a poll of voters, and then says, "The margin of error is plus or minus 5%." Explain carefully what that means.

The true proportion is within 5% of her estimate, with some degree of confidence.

A least squares regression line was calculated to relate the length (cm) of newborn boys to their weight in kg. The line is weight=−5.95+0.1769 length. Explain in words what this model means. Should new parents (who tend to worry) be concerned if their newborn's length and weight don't fit this equation?

The weight of a newborn boy can be predicted as −5.95 kg plus 0.1769 kg per cm of length. Should new parents (who tend to worry) be concerned if their newborn's length and weight don't fit this equation? No, because this is a model fit to data. No particular baby should be expected to fit this model exactly.

A professor teaching a large lecture class of 450 students wants to sample her class. To do this, she rolls a die to determine the first row to hand out a survey to. Then, she also hands out the survey to every sixth row after this first row. She says that this is a Simple Random Sample because everyone had an equal opportunity to sit in any seat and because she randomized the choice of rows. What do you think? Be specific.

This is not an SRS. Although each student may have an equal chance to be in the survey, groups of friends who choose to sit together will either all be in or out of the sample, so the selection is not independent.

A study of 939 decisions (to grant parole or not) made by a parole board produced the provided computer output. Assuming these cases are representative of all cases that may come before the board, what can be concluded?

We are 99% confident that the proportion of paroles granted by the board is between 57.4% and 61.0%.

A student is considering publishing a new magazine aimed directly at owners of Japanese automobiles. He wants to estimate the fraction of cars in the United States that are made in Japan. The computer output to the right summarizes the results of a random sample of 50 autos. Explain carefully what it tells you.

We are 99% confident that between 33.0% and 48.6% of cars in the United States are made in Japan.

A credit union took a random sample of 40 accounts and yielded the following 90% confidence interval for the mean checking account balance at the institution: $2183 < μ(balance)< $3828. What is the correct interpretation of this confidence interval?

We are 90% confident that the mean checking account balance at this credit union is between $2183 and $3828.

Analysis of a random sample of 250 Illinois nurses produced a 95% confidence interval for the mean annual salary of $42,846 < μ(Nurse Salary) < $49,686. What is the correct interpretation for this confidence interval?

We are 95% confident that the interval from $42,846 to $49,686 contains the true mean annual salary of all Illinois nurses.

A random sample of clients at a weight loss center were given a dietary supplement to see if it would promote weight loss. The center reported that the 100 clients lost an average of 48 pounds, and that a 95% confidence interval for the mean weight loss this supplement produced has a margin of error of ±7 pounds. What is the correct interpretation of these findings?

We are 95% confident that the mean weight loss produced by the supplement in weight loss center clients is between 41 and 55 pounds.

Suppose data was collected for each pair of variables below to make a scatterplot. Which variable would be used as the explanatory variable and which as the response variable? Why? What is expected in the scatterplot? Discuss the likely direction, form, and strength for parts a through d below.

a) Altitude and temperature when climbing mountains: Altitude would best be used as the explanatory variable and temperature as the response variable, to predict the temperature based on altitude. The scatterplot should show a negative, possibly straight, weak to moderate association. b) Ice cream cone sales and air-conditioner sales for each week: Either ice cream cone sales or air-conditioner sales can be the explanatory variable, with the other as the response variable, because each can be predicted from the other. The scatterplot should show a positive, straight (linear), moderate association. c) Age and grip strength for people: Age would best be used as the explanatory variable and grip strength as the response variable, to predict grip strength based on age. The scatterplot should show a curved (nonlinear), moderate association. d) Blood alcohol level and reaction time (in milliseconds) for drivers: Blood alcohol level would best be used as the explanatory variable and reaction time as the response variable, to predict reaction time based on blood alcohol level. The scatterplot should show a positive, nonlinear, moderately strong association.

A clerk entering salary data into a company spreadsheet accidentally put an extra "0" in the boss's salary, listing it as $2,400,000 instead of $240,000. Explain how this error will affect these summary statistics for the company payroll. a) measures of center (median and mean) b) measures of spread (range, IQR, and standard deviation)

a) Assuming the boss's true salary is above the median, the median will be the same. The mean will be too large. b) The range will likely be too large. The IQR will likely be the same. The standard deviation will be too large.

A food company sells salmon to various customers. The mean weight of the salmon is 27 lb with a standard deviation of 2 lbs. The company ships them to restaurants in boxes of 4 salmon, to grocery stores in cartons of 49 salmon, and to discount outlet stores in pallets of 81 salmon. To forecast costs, the shipping department needs to estimate the standard deviation of the mean weight of the salmon in each type of shipment.

a) Find the standard deviation of the mean weight of the salmon in the boxes sold to restaurants. With mean = 27 and sd = 2, restaurant boxes have n = 4: SD = 2/√4 = 1 b) Find the standard deviation of the mean weight of the salmon in the cartons sold to grocery stores. Grocery cartons have n = 49: SD = 2/√49 ≈ 0.29 c) Find the standard deviation of the mean weight of the salmon in the pallets sold to outlet stores. Outlet pallets have n = 81: SD = 2/√81 ≈ 0.22 d) The distribution of the salmon weights turns out to be skewed to the high end. Would the distribution of shipping weights be better characterized by a Normal model for the boxes or pallets? The pallets, because, regardless of the underlying distribution, the sampling distribution of the mean approaches the Normal model as the sample size increases.
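A minimal sketch of the calculation in parts a through c, assuming the salmon in a shipment are independent draws so that the standard deviation of the mean is σ/√n. The function name `sd_of_mean` and the dictionary keys are illustrative choices, not part of the original problem.

```python
import math

def sd_of_mean(sigma, n):
    """Standard deviation of the sample mean of n independent items."""
    return sigma / math.sqrt(n)

sigma = 2.0  # lb, population standard deviation from the problem
shipments = {"box": 4, "carton": 49, "pallet": 81}

# One value per shipment type: sigma / sqrt(n)
sd_by_shipment = {name: sd_of_mean(sigma, n) for name, n in shipments.items()}

for name, sd in sd_by_shipment.items():
    print(f"{name}: {sd:.2f} lb")
```

Larger shipments give a smaller standard deviation for the mean weight, which is the same reason the pallets are better described by a Normal model in part d.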

A regression analysis of 117 homes for sale produced the following model, where price is in thousands of dollars and size is in square feet. Price=47.82+0.068(Size) a) Explain what the slope of the line says about housing prices and house size. b) What price would you predict for a 3000-square-foot house in this market? c) A real estate agent shows a potential buyer a 1100-square-foot house, saying that the asking price is $6000 less than what one would expect to pay for a house of this size. What is the asking price, and what is the $6000 called?

a) For every additional square foot of area of a house, the price is predicted to increase by $68. b) $251,820 c) The asking price is $116,620. The $6000 is called the residual.
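The arithmetic behind parts b and c can be sketched as follows. The function name `predicted_price` is an illustrative choice; the coefficients come from the model in the question, and the residual is negative because the asking price is below the prediction.

```python
def predicted_price(size_sqft):
    """Predicted price in thousands of dollars from the fitted line."""
    return 47.82 + 0.068 * size_sqft

# Part b: prediction for a 3000-square-foot house
price_3000 = predicted_price(3000)

# Part c: residual = actual - predicted, so actual = predicted + residual
predicted_1100 = predicted_price(1100)
residual = -6.0  # thousands of dollars; asking price is $6000 below the line
asking_1100 = predicted_1100 + residual

print(f"3000 sq ft prediction: ${price_3000 * 1000:,.0f}")
print(f"1100 sq ft asking price: ${asking_1100 * 1000:,.0f}")
```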

You were randomly assigned to be part of a group of three students from an Intro Stats class in which 55% of the students had never taken a Calculus course, 32% of students had taken only one semester of Calculus, and the rest had taken two or more semesters of Calculus. The Multiplication Rule was used to calculate the probability that neither of your other two groupmates had studied Calculus.

a) What must be true about the groups in order to make that approach valid? The Calculus backgrounds of the students must be independent. b) Do you think this assumption is reasonable? Explain. Yes, because the teacher assigned the students randomly.

residual

actual - predicted

At its website, a polling company publishes results of a new survey each day. Scroll down to the end of the published results and you'll find a statement that includes words as shown below. Results are based on telephone interviews with 1,008 national adults, aged 18 and older, conducted on April 2-5, 2007 ... In addition to sampling error, question wording and practical difficulties in conducting surveys can introduce error or bias into the findings of public opinion polls.

a) For this survey, identify the population of interest. Everyone in the nation who is 18 or older. b) The company performs its surveys by phoning numbers generated at random by a computer program. What is the sampling frame? Everyone with a phone. c) What problems, if any, would you be concerned about in matching the sampling frame with the population? Some people do not have telephones.

Occasionally, when Josh fills his car with gas, he figures out how many miles per gallon his car got. He wrote down those results after five fill-ups in the past few months. Overall, it appears that his car gets 21.5 miles per gallon.

a) Josh has calculated the sample mean of the gas mileage for the last five fill-ups of his car. b) Josh is trying to estimate the population mean of the gas mileage for his car. c) His results may be biased because recent driving conditions may not be typical. d) The EPA would be trying to estimate the population mean of the gas mileage for all cars of that make and model.

In a large city school system with 49 elementary schools, the school board is considering the adoption of a new policy that would require elementary students to pass a test in order to be promoted to the next grade. The PTA wants to find out whether parents agree with this plan. Listed below are some of the ideas proposed for gathering data. For each, indicate what kind of sampling strategy is involved and what (if any) biases might result. Assume the schools are homogeneous and differ from each other.

a) Randomly select 100 parents and send them a survey. Follow up with a home visit if they do not return the survey within a week. Sampling strategy: simple random sample. Most likely bias: bias could result only if the sampling strategy is not followed as described. b) List the names of all the students alphabetically and contact the parent of every 20th student. Sampling strategy: systematic sampling. Most likely bias: bias could result only if the sampling strategy is not followed as described. c) Put a big ad in the newspaper asking people to log their opinions on the PTA Web site. Sampling strategy: voluntary response sampling. Most likely bias: voluntary response bias will result, since only parents who feel strongly about the issue will respond. d) Randomly select one of the elementary schools and contact the parents of every student. Sampling strategy: cluster sampling. Most likely bias: undercoverage bias could result since the parents in the sample may not be

Administrators at a university were interested in estimating the percentage of students who are the first in their family to go to college. The university student body has about 48,000 members.

a) Select several dormitories at random and contact everyone living in the selected dorms. Cluster sample. b) Using a computer-based list of registered students, contact 300 freshmen, 300 sophomores, 300 juniors, and 300 seniors selected at random from each class. Stratified sample. c) Using a computer-based alphabetical list of registered students, select one of the first 20 names on the list at random, and then contact the student whose name is 40 names later, and then every 40th name after that. Systematic sample.

A study measured the waist size of 975 men, finding a mean of 36.36 inches and a standard deviation of 3.95 inches. A histogram of these measurements is shown to the right.

appear, approach the Normal shape, decreases, are fairly Normal

Editors preparing a report on the economy are trying to estimate the percentage of businesses that plan to hire additional employees in the next 60 days. They are willing to accept a margin of error of 4% but want 95% confidence. How many randomly selected employers will they need to contact?

n = (1.960)^2(0.25)/(0.04)^2 ≈ 600.25, so about 601 employers after rounding up
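The sample-size formula used here, n = z*² p(1 − p) / ME², can be sketched in a few lines. This assumes the conservative guess p = 0.5 when no prior estimate is given and rounds up to the next whole respondent; the function name `sample_size_for_proportion` is an illustrative choice.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence, p=0.5):
    """Smallest n whose margin of error for a proportion is at most `margin`."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

# Hiring survey: 4% margin of error, 95% confidence, no prior estimate of p
n_employers = sample_size_for_proportion(0.04, 0.95)

# Same formula with a prior estimate: 5% margin, 99% confidence, p = 0.15
n_graduates = sample_size_for_proportion(0.05, 0.99, p=0.15)

print(n_employers, n_graduates)
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target.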

Use the Normal model N(1134,79) for the weights of steers. a) What weight represents the 66th percentile? b) What weight represents the 93rd percentile? c) What's the IQR of the weights of these steers?

a) T-shirts at a store: price each, number sold. Price each would be the explanatory variable and number sold would be the response variable. In application, price would most likely be used to predict number sold. What would you expect to see in the scatterplot? A moderate to strong negative linear association. b) Scuba diving: depth, water pressure. Depth would be the explanatory variable and water pressure would be the response variable. In application, depth would most likely be used to predict water pressure. What would you expect to see in the scatterplot? A strong positive linear association. c) Scuba diving: depth, visibility. Depth would be the response variable and visibility would be the explanatory variable. In application, depth would most likely be predicted by using visibility. What would you expect to see in the scatterplot? A weak to moderate negative, possibly linear association. d) All elementary school students: age, score on a reading test. Age would be the explanatory variable and score on a reading test would be the response variable. In application, age would most likely be used to predict score on a reading test. What would you expect to see in the scatterplot? A moderate positive, possibly linear association.

The managers of a large company wished to know the percentage of employees who feel "extremely satisfied" to work there. The company has roughly 40,000 employees. Three scenarios are given in parts a through c below. For each scenario, determine the sampling method used by the managers.

a) The managers select a single branch at random and survey every employee of that branch. cluster sample b) The managers use the company e-mail directory to contact every 150th employee on the list. systematic c) The managers split the company into divisions, with each division containing employees that have similar jobs. Within each division, they select a SRS of employees to contact. stratified

The accompanying histogram shows the life expectancies at birth for 190 countries as collected by an international health agency. a) Which would you expect to be larger: the median or the mean? Explain briefly. b) Which would you report: the median or the mean? Explain briefly.

a) The median will be larger, because the distribution is skewed left. b) The median should be used, because the distribution is skewed.

A company selling clothing on the Internet reports that the packages it ships have a median weight of 59 ounces and an IQR of 24 ounces. a) The company plans to include a sales flyer weighing 5 ounces in each package. What will the new median and IQR be? b) If the company recorded the shipping weights of the packages with the sales flyers included in pounds instead of ounces, what would the median and IQR be?

a) The new median will be 64 ounces. The new IQR will be 24 ounces. b) The new median would be 4 pounds (64/16). The new IQR would be 1.5 pounds (24/16).
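A short sketch of the two rules at work, assuming 16 ounces per pound: adding a constant shifts the median but leaves the IQR alone, while changing units rescales both. The variable names are illustrative.

```python
OUNCES_PER_POUND = 16

median_oz, iqr_oz = 59, 24  # reported summary statistics
flyer_oz = 5                # weight added to every package

# Part a: shifting every value by a constant moves the center, not the spread
new_median_oz = median_oz + flyer_oz
new_iqr_oz = iqr_oz

# Part b: rescaling every value divides both center and spread by 16
new_median_lb = new_median_oz / OUNCES_PER_POUND
new_iqr_lb = new_iqr_oz / OUNCES_PER_POUND

print(new_median_oz, new_iqr_oz, new_median_lb, new_iqr_lb)
```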

In a large introductory statistics lecture hall, the professor reports that 60% of the students enrolled have never taken a calculus course, 30% have taken only one semester of calculus, and the rest have taken two or more semesters of calculus. The professor randomly assigns students to groups of three to work on a project for the course. You are assigned to be part of a group. What is the probability that of your other two groupmates

a) The probability that neither of your other two groupmates has studied calculus is (0.60)^2 = 0.36 b) The probability that both of your other two groupmates have studied at least one semester of calculus is (0.40)^2 = 0.16 c) The probability that at least one of your other two groupmates has had more than one semester of calculus is 1 − (0.90)^2 = 0.19
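The three answers follow from the Multiplication Rule and the Complement Rule, assuming the two groupmates' calculus backgrounds are independent (reasonable when groups are assigned at random from a large class). A minimal sketch, with illustrative variable names:

```python
p_none = 0.60                      # never took calculus
p_one = 0.30                       # exactly one semester
p_two_plus = 1 - p_none - p_one    # two or more semesters

# a) Neither groupmate has studied calculus
p_neither = p_none ** 2

# b) Both groupmates have studied at least one semester
p_both_some = (1 - p_none) ** 2

# c) At least one groupmate has two or more semesters (complement of "none do")
p_at_least_one_two_plus = 1 - (1 - p_two_plus) ** 2

print(round(p_neither, 2), round(p_both_some, 2), round(p_at_least_one_two_plus, 2))
```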

A survey of 299 undergraduate students asked about respondents' diet preference (Carnivore, Omnivore, Vegetarian) and political alignment (Liberal, Moderate, Conservative). A stacked bar chart of the 285 responses is given. a) Describe what this plot shows using the concept of a conditional distribution. b) Do you think the differences here are real? Explain.

a) Vegetarians tend to be more liberal and carnivores tend to be more conservative. b) The differences are probably real. Diet preference and political alignment appear to be strongly associated.

For the following report about a statistical study, identify (if possible) a) the population; b) the population parameter of interest; c) the sampling frame; d) the sample; e) the sampling method, including whether or not randomization was employed; f) who (if anyone) was left out of the study; and g) any potential sources of bias you can detect and any problems you see in generalizing to the population of interest. A US consumer magazine asked all adult subscribers whether they had used experimental medical treatments and, if so, whether they had benefited from them. For almost all of the treatments, approximately 17% of those responding reported cures or substantial improvement in their condition.

a) What is the population? Unclear, but probably all adults in the country. b) What is the population parameter of interest? The proportion who have used and benefited from experimental medical treatments. c) What is the sampling frame? All subscribers to the consumer magazine. d) What is the sample for this study? All subscribers who responded to the survey. e) What sampling method was used? Voluntary response sampling was used. Is the sampling method used random? No. f) Who (if anyone) was left out of this study? Those who are not subscribers to the consumer magazine. g) What type of bias is evident in this sample? I. Voluntary response bias; the subscribers could choose to participate. II. Undercoverage; non-subscribers are not included in the survey. III. Response bias; the questionnaire led people to answer that they had benefited. IV. Nonresponse; a large number of subscribers did not respond to the survey. Answer: I and II.

Fuel economy estimates for automobiles built one year predicted a mean of 27.2 mpg and a standard deviation of 5.8 mpg for highway driving. Assume that a Normal model can be applied. Use the 68−95−99.7 Rule to complete parts a) through e).

b) In what interval would you expect the central 68% of autos to be found? Using the 68-95-99.7 rule, the central 68% of autos can be expected to be found in the interval from 21.4 to 33 mpg. c) About what percent of autos should get more than 33 mpg? Using the 68-95-99.7 rule, about 16% of autos should get more than 33 mpg d) About what percent of autos should get between 33 and 38.8 mpg? Using the 68-95-99.7 rule, about 13.5% of autos should get between 33 and 38.8 mpg. e) Describe the gas mileage of the best 2.5% of cars. They get more than 38.8 mpg.
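The 68-95-99.7 Rule answers above can be reproduced with simple arithmetic on the stated percentages. This sketch uses the rule's rounded figures rather than exact Normal probabilities, and the variable names are illustrative.

```python
mean, sd = 27.2, 5.8  # mpg, from the problem

# b) Central 68% lies within one standard deviation of the mean
central_68 = (mean - sd, mean + sd)

# c) Half of the remaining 32% is above mean + 1 sd
pct_above_1sd = (100 - 68) / 2

# d) Between +1 sd and +2 sd: half of the gap between 95% and 68%
pct_between_1sd_and_2sd = (95 - 68) / 2

# e) The best 2.5% lie above mean + 2 sd
pct_above_2sd = (100 - 95) / 2
best_cutoff = mean + 2 * sd

print(central_68, pct_above_1sd, pct_between_1sd_and_2sd, pct_above_2sd, best_cutoff)
```

The same arithmetic applies to any Normal model, for example N(100, 19) for IQ scores or N(750, 50) for rivet strength.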

A particular IQ test is standardized to a Normal model, with a mean of 100 and a standard deviation of 19.

b) In what interval would you expect the central 68% of the IQ scores to be found? Using the 68-95-99.7 rule, the central 68% of the IQ scores are between 81 and 119. c) About what percent of people should have IQ scores above 138? Using the 68-95-99.7 rule, about 2.5% of people should have IQ scores above 138. d) About what percent of people should have IQ scores between 43 and 62? Using the 68-95-99.7 rule, about 2.35% of people should have IQ scores between 43 and 62. e) About what percent of people should have IQ scores above 119? Using the 68-95-99.7 rule, about 16% of people should have IQ scores above 119.

A company that manufactures rivets believes the shear strength (in pounds) is modeled by N(750,50). Use the 68-95-99.7 Rule to complete parts a) through c) below.

b) Would it be safe to use these rivets in a situation requiring a shear strength of 700 pounds? Explain. No, because about 16% of all of this company's rivets have a shear strength of less than 700 pounds. c) About what percent of these rivets can be expected to fall below 850 pounds? 97.5%

The amounts (in ounces) of juice in eight randomly selected juice bottles are 15.0, 15.9, 15.8, 15.7, 15.4, 15.2, 15.2, and 15.3. Construct a 98% confidence interval for the mean amount of juice in all such bottles. Round to two decimal places as needed.

(15.09, 15.78), from ȳ ± t* × s/√n = 15.44 ± 3.00 × (0.325/√8), with t* taken from a table or calculator (df = 7)
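A sketch of the one-sample t-interval for these eight bottles. The critical value `t_star = 2.998` is hard-coded from a t-table for 98% confidence with 7 degrees of freedom, since the Python standard library has no t-distribution; that constant is an assumption of this sketch.

```python
import math
import statistics

ounces = [15.0, 15.9, 15.8, 15.7, 15.4, 15.2, 15.2, 15.3]

n = len(ounces)
mean = statistics.mean(ounces)
s = statistics.stdev(ounces)   # sample standard deviation (divides by n - 1)

t_star = 2.998                 # table value: 98% confidence, df = n - 1 = 7
margin = t_star * s / math.sqrt(n)
interval = (mean - margin, mean + margin)

print(f"mean = {mean:.2f}, s = {s:.3f}, interval = ({interval[0]:.2f}, {interval[1]:.2f})")
```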

An ice hockey organization tests players to see whether they are using performance-enhancing drugs. Officials select a team at random, and a drug-testing crew shows up unannounced to test all 20 players on the team. Each testing day can be considered a study of drug use.

cluster sample Is that choice appropriate? Yes. An entire team can be sampled at once relatively easily, but all players couldn't efficiently be sampled on the same day.

Some friends of yours in a political science class are angry about a new town ordinance restricting off-campus parties. They make an online survey asking students' opinions. This type of sampling might be classified as a

convenience sample

For a given confidence level, halving the margin of error requires a sample size twice as large.

false

For a given sample size, higher confidence means a smaller margin of error.

false

The Interquartile Range (IQR) is sensitive to outliers.

false

When analysing data, one should disregard outliers.

false

The property which states that for independent trials, as the number of trials increases, the long run relative frequency of an outcome gets closer to the true probability is called the __________.

law of large numbers

Will your flight get you to your destination on time? To the right are a histogram and summary statistics for the percentage of delayed arrivals each month from 2001 thru 2006. Consider these data to be a representative sample of all months. There is no evidence of a time trend. (The correlation of Flights Delayed % with time is r=−0.004.)

a) Check the assumptions and conditions for inference about the mean. Select all that apply. All of the assumptions and conditions for inference about the mean are met. c) Interpret this interval for a traveler planning to fly. Choose the correct answer below. We can be 99% confident that the interval contains the true mean monthly percentage of delayed flights.

What are the chances your flight will leave on time? To the right are a histogram and summary statistics for the percentage of flights departing on time each month from 2001 thru 2006.

a) Check the assumptions and conditions for inference. Is the independence assumption met? Yes. Is the randomization condition met or is the sample suitably representative? Yes. Is the 10% condition met? Yes. Is the nearly normal condition met? Yes. b) Find a 90% confidence interval for the true percentage of flights that depart on time. Critical value for 90% ≈ 1.645; formula: ȳ ± critical value × SE(ȳ) c) Interpret this interval for a traveler planning to fly. We can be 90% confident that the interval contains the true mean monthly percentage of on-time departures.

A waitress believes the distribution of her tips has a model that is slightly skewed to the right, with a mean of $9.60 and a standard deviation of $5.40. She usually waits on about 40 parties over a weekend of work.

a) Estimate the probability that she will earn at least $500. P(tips from 40 parties > $500): SD of the mean = 5.40/√40 = 0.8538; required average = 500/40 = 12.5; z = (12.5 − 9.60)/0.8538 = 3.40; from the table, 1 − 0.99966 = 0.0003. b) How much does she earn on the best 10% of such weekends? (1.28 × 0.8538 + 9.60) × 40 ≈ 427.72
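Both parts can be checked with a Normal model for the mean tip over 40 parties, assuming the Central Limit Theorem applies at this sample size. The variable names are illustrative, and `inv_cdf(0.90)` uses z ≈ 1.2816 rather than the table value 1.28, so the weekend total comes out a few cents away from a hand calculation.

```python
import math
from statistics import NormalDist

mean_tip, sd_tip, parties = 9.60, 5.40, 40

# Sampling distribution of the mean tip per party
mean_tip_dist = NormalDist(mean_tip, sd_tip / math.sqrt(parties))

# a) Earning at least $500 means averaging at least 500/40 = $12.50 per party
p_at_least_500 = 1 - mean_tip_dist.cdf(500 / parties)

# b) The best 10% of weekends start at the 90th percentile of the mean tip
best_10_total = mean_tip_dist.inv_cdf(0.90) * parties

print(f"P(total >= 500) = {p_at_least_500:.4f}")
print(f"best 10% of weekends: at least ${best_10_total:.2f}")
```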

A grocery store's receipts show that Sunday customer purchases have a skewed distribution with a mean of $28 and a standard deviation of $15. Suppose the store had 296 customers this Sunday.

a) Estimate the probability that the store's revenues were at least $8400. Required mean per customer = 8400/296; z = (required mean − 28)/SD of the mean. b) If, on a typical Sunday, the store serves 296 customers, how much does the store take in on the worst 10% of such days? SD of the mean = 15/√296 = 0.8719; 1.28 × 0.8719 = 1.1160; worst 10% cutoff per customer = 28 − 1.1160 = 26.88; total ≈ 26.88 × 296 ≈ 7957.67
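The method outlined for part a stops before the final number, so here is a sketch that carries it through, again assuming a Normal model for the mean purchase of 296 customers. Variable names are illustrative, and the exact 10th-percentile z-value is used instead of the rounded 1.28.

```python
import math
from statistics import NormalDist

mean_purchase, sd_purchase, customers = 28, 15, 296

# Sampling distribution of the mean purchase per customer
mean_purchase_dist = NormalDist(mean_purchase, sd_purchase / math.sqrt(customers))

# a) Revenue of at least $8400 means a mean purchase of at least 8400/296
p_revenue_8400 = 1 - mean_purchase_dist.cdf(8400 / customers)

# b) The worst 10% of days fall at or below the 10th percentile of the mean
worst_10_total = mean_purchase_dist.inv_cdf(0.10) * customers

print(f"P(revenue >= 8400) = {p_revenue_8400:.3f}")
print(f"worst 10% of days: at most ${worst_10_total:.2f}")
```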

A medical researcher measured the pulse rates (beats per minute) of a sample of randomly selected adults and found the following Student's t-based confidence interval: With 95.00% Confidence, 66.372986 < μ(Pulse) < 70.105349

a) Explain carefully what the software output means. We are 95% confident the interval 66.4 to 70.1 beats per minute contains the true mean heart rate. b) What's the margin of error for this interval? ME = (70.105349 − 66.372986)/2 ≈ 1.9 c) If the researcher had calculated a 99% confidence interval, would the margin of error be larger or smaller? The margin of error would have been larger.

Recently, a casino issued a press release announcing that a cocktail waitress won the world's largest slot jackpot—over $30,000,000. She said she had played less than $50 in the machine when the jackpot hit. The top jackpot for this type of slot machine builds from a base amount of $7 million and can be won with a 3-coin ($3) bet.

a) How can the casino afford to give away millions of dollars? The casino earns more than the value of the jackpot from people who bet but do not win. b) Why did the casino issue a press release rather than keep the loss a secret? The press release generates publicity, which entices more people to come and gamble. The amount the casino earns from this more than makes up for the jackpot.

Between quarterly audits, a company likes to check on its accounting procedures to address any problems before they become serious. The accounting staff processes payments on about 120 orders each day. The next day, the supervisor rechecks 10 of the transactions to be sure they were processed properly. Complete parts a and b below.

a) Propose a sampling strategy for the supervisor. Choose the correct answer below. Assign numbers 001 to 120 to each order. Use random numbers to select 10 transactions to examine. b) How would you modify that strategy if the company makes both wholesale and retail sales, requiring different bookkeeping procedures? Sample proportionately within each type before the results are combined. This is a stratified random sample.
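Both strategies can be sketched with Python's `random` module. The split of the 120 orders into 48 wholesale and 72 retail is a hypothetical example (the problem gives no breakdown), and the seed is fixed only so the sketch is repeatable.

```python
import random

rng = random.Random(2024)          # fixed seed for repeatability only
order_ids = list(range(1, 121))    # orders numbered 001 to 120

# a) Simple random sample: every set of 10 orders is equally likely
srs = rng.sample(order_ids, 10)

# b) Stratified sample with proportional allocation.
# Hypothetical strata: first 48 orders wholesale, remaining 72 retail.
strata = {"wholesale": order_ids[:48], "retail": order_ids[48:]}
stratified = {
    name: rng.sample(members, round(10 * len(members) / len(order_ids)))
    for name, members in strata.items()
}

print(sorted(srs))
print({name: sorted(ids) for name, ids in stratified.items()})
```

With this split, proportional allocation rechecks 4 wholesale and 6 retail transactions. Other splits may need a rounding adjustment so the stratum sizes still add up to 10.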

Administrators at a university were interested in estimating the percentage of students who are the first in their family to go to college. The university student body has about 49,000 members. The university administration is considering a variety of ways to sample students for a survey. For each of these proposed survey designs, identify the problem.

a) Publish an advertisement inviting students to visit a website and answer questions. voluntary response bias b) Set up a table in the student union and ask students to stop and answer a survey. convenience sample

A consumer organization estimates that over a 1-year period 17% of cars will need to be repaired once, 5% will need repairs twice, and 1% will require three or more repairs

a) The probability that a car will require no repairs is 1 − (0.17 + 0.05 + 0.01) = 0.77 b) The probability that a car will require no more than one repair is 0.77 + 0.17 = 0.94 c) The probability that a car will require some repairs is 0.17 + 0.05 + 0.01 = 0.23
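These answers use the Complement Rule and the Addition Rule for disjoint outcomes. A minimal sketch, with an illustrative dictionary keyed by number of repairs:

```python
# Probabilities from the problem, keyed by number of repairs in a year
p_repairs = {"one": 0.17, "two": 0.05, "three_or_more": 0.01}

# a) No repairs: complement of "at least one repair"
p_no_repairs = 1 - sum(p_repairs.values())

# b) No more than one repair: disjoint outcomes "none" and "one" add
p_at_most_one = p_no_repairs + p_repairs["one"]

# c) Some repairs: complement of "no repairs"
p_some_repairs = 1 - p_no_repairs

print(round(p_no_repairs, 2), round(p_at_most_one, 2), round(p_some_repairs, 2))
```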

A national survey found that 50% of adults ages 25-29 had only a cell phone and no landline. Suppose that five 25-29-year-olds are randomly selected. Complete parts a through c below.

a) What is the probability that all of these adults have only a cell phone and no landline? (0.5)^5 = 0.0313 b) What is the probability that none of these adults have only a cell phone and no landline? Since 50% have only a cell phone and no landline, the other 50% do not. (0.5)^5 = 0.0313 c) What is the probability that at least one of these adults has only a cell phone and no landline? 1 − (0.5)^5 = 0.9688
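A brief sketch of the same calculation, assuming the five adults are selected independently. The variable names are illustrative.

```python
p_cell_only = 0.50
n_adults = 5

# a) All five have only a cell phone
p_all = p_cell_only ** n_adults

# b) None of the five do (each "does not" with probability 1 - 0.5)
p_none = (1 - p_cell_only) ** n_adults

# c) At least one does: complement of "none"
p_at_least_one = 1 - p_none

print(p_all, p_none, p_at_least_one)
```

The exact values are 0.03125, 0.03125, and 0.96875, which round to the figures quoted in the answers.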

An environmental agency took soil samples at 14 locations near a former industrial waste dump and checked each for evidence of toxic chemicals. They found no elevated levels of any harmful substances.

a) What is the population? The soil around a former waste dump. b) What is the population parameter of interest? The levels of toxic chemicals. c) What is the sampling frame? The accessible soil around the dump. d) What is the sample? 14 soil samples. e) What sampling method was used? Not clear. Is the sampling method used random? It is not known if the sampling was random. f) What (if anything) was left out of this study? Unclear, but probably nothing was left out. g) What type of bias is evident in this sample? None. What problems can you see in generalizing to the population of interest? None.

For your political science class, you'd like to take a survey from a sample of all the Orthodox Church members in your region. A list of places of worship shows 21 Orthodox churches in the area. Rather than try to obtain a list of all members of all these churches, you decide to pick 3 churches at random. For those churches, you will ask to get a list of all current members and contact 100 members at random

a) What kind of design have you used? This is a multistage design with a cluster sample for the first stage and a simple random sample for each cluster b) What could go wrong with the design that you have proposed? One of the churches you picked may not be representative of all churches
