Unit I - Review Homework & Quizzes


A particular IQ test is standardized to a Normal​ model, with a mean of 100 & a standard deviation of 13 - Choose the model for these IQ scores that correctly shows what the​ 68-95-99.7 rule predicts about the scores - In what interval would you expect the central 68​% of the IQ scores to be​ found? - About what percent of people should have IQ scores above 139​? - About what percent of people should have IQ scores between 74 & 87​? - About what percent of people should have IQ scores above 113​?

- Choose the model for these IQ scores that correctly shows what the 68-95-99.7 rule predicts about the scores - In what interval would you expect the central 68% of the IQ scores to be found? Using the 68-95-99.7 rule, the central 68% of the IQ scores are between 87 & 113 - About what percent of people should have IQ scores above 139? Using the 68-95-99.7 rule, about 0.15% of people should have IQ scores above 139 - About what percent of people should have IQ scores between 74 & 87? Using the 68-95-99.7 rule, about 13.5% of people should have IQ scores between 74 & 87 - About what percent of people should have IQ scores above 113? Using the 68-95-99.7 rule, about 16% of people should have IQ scores above 113.
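The percentages on this card all come from splitting the 68-95-99.7 bands in half. A minimal sketch of that arithmetic, using the card's mean of 100 and standard deviation of 13 (the helper name `band` is only illustrative):

```python
# 68-95-99.7 rule for the IQ model N(100, 13) described on the card.
MEAN, SD = 100, 13

def band(k):
    """Interval within k standard deviations of the mean."""
    return (MEAN - k * SD, MEAN + k * SD)

central_68 = band(1)              # central 68% of scores

# 139 is 3 SDs above the mean: half of what lies outside the 99.7% band.
above_139 = (100 - 99.7) / 2
# 74 to 87 runs from 2 SDs to 1 SD below the mean: half of (95% - 68%).
between_74_87 = (95 - 68) / 2
# 113 is 1 SD above the mean: half of what lies outside the 68% band.
above_113 = (100 - 68) / 2

print(central_68)                 # (87, 113)
print(round(above_139, 2), between_74_87, above_113)
```

The same halving logic applies to any Normal model once the cut points are expressed in standard deviations from the mean.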

A National Vital Statistics Report provides information on deaths by​ age, sex, & race. Below is a link to the displays of the distributions of ages at death for White & Black males. - Describe the overall shapes of these distributions A) The distribution for White males is right-skewed &​ unimodal, while the distribution for Black males is symmetric & unimodal B) The distribution for White males is left-skewed &​ unimodal, while the distribution for Black males is symmetric & unimodal C) Both distributions are right-skewed & unimodal D) Both distributions are left-skewed & unimodal - How do the distributions​ differ? A) The center for the distribution of White males is less than the center of the distribution of Black males B) The center for the distribution of Black males is less than the center of the distribution of White males C) The range for the distribution of White males is less than the range of the distribution of Black males D) The range of the distribution of Black males is less than the range of the distribution of White males - Look carefully at the bar definitions. Where do these plots violate the rules for statistical​ graphs? Select all that apply. A) The bins do not cover all possible​ ages; there are gaps between them B) The widths of the far left & right bins differ from the widths of the middle bins C) The vertical axes do not have the same maximum D) The vertical axes do not start at zero

- Describe the overall shapes of these distributions D) Both distributions are left-skewed & unimodal - How do the distributions​ differ? B) The center for the distribution of Black males is less than the center of the distribution of White males - Look carefully at the bar definitions. Where do these plots violate the rules for statistical​ graphs? Select all that apply. B) The widths of the far left & right bins differ from the widths of the middle bins C) The vertical axes do not have the same maximum

People with spinal cord injuries may lose function in some, but not all, of their muscles. The ability to push oneself up is particularly important for shifting position when seated & for transferring into & out of wheelchairs. Surgeons compared two operations to restore the ability to push up in children. The histogram shows scores rating pushing strength two years after surgery, & the boxplots compare results for the two surgical methods - Describe the shape of the strength distribution - What is the range of the strength scores? - What fact about the results of the two procedures is hidden in the histogram? A) The symmetry of the distribution is hidden in the histogram B) The difference in the average values of the two procedures is hidden in the histogram C) The variation in all of the data is hidden in the histogram D) The total number of trials is hidden in the histogram - Which method had the higher (better) median score? - Was the method with the higher median always better? A) The method with the higher median was always better because the minimum of that method is greater than the maximum of the other method B) The method with the higher median was not always better because the maximum of that method is less than the minimum of the other method C) The method with the higher median was always better because the maximum of that method is greater than the minimum of the other method D) The method with the higher median was not always better because the minimum of that method is less than the maximum of the other method - Which method produced more consistent results? Explain. A) The biceps procedure had more consistent results because the data values were on average greater than the results from the deltoid procedure B) It cannot be determined which method produced more consistent results because the number of trials of each procedure is not known C) The two procedures produced equally consistent results because the ranges of the two data sets are approximately equal D) The deltoid procedure had more consistent results because all but two of the data values for that procedure lie in a very small range

- Describe the shape of the strength distribution The distribution is unimodal, symmetric, & does not contain outliers - What is the range of the strength scores? The range is 3 - What fact about the results of the two procedures is hidden in the histogram? B) The difference in the average values of the two procedures is hidden in the histogram - Which method had the higher (better) median score? The median of the biceps procedure, which is not visible on the boxplot, is higher than the median of the deltoid procedure, approximately 2 - Was the method with the higher median always better? D) The method with the higher median was not always better because the minimum of that method is less than the maximum of the other method - Which method produced more consistent results? Explain. D) The deltoid procedure had more consistent results because all but two of the data values for that procedure lie in a very small range

Here are the summary statistics for the weekly payroll of a small​ company: lowest salary=​$300​, mean salary=​$900​, median=​$700​, range=​$1400​, IQR=​$700​, first quartile=​$350​, standard deviation=​$400. - Do you think the distribution of salaries is​ symmetric, skewed to the​ left, or skewed to the​ right? Explain why. A) The distribution is skewed to the right because the mean is greater than the median. B) The distribution is symmetric because the mean is greater than the median. C) The distribution is skewed to the left because the mean is greater than the median. D) There is not enough information to estimate the shape of the distribution - Between what two values are the middle​ 50% of the salaries​ found? - Suppose business has been good & the company gives each employee a $100 raise. Tell the new value of each of the summary statistics. - Instead, suppose the company gives each employee a 10​% raise. Tell the new value of each of the summary statistics

- Do you think the distribution of salaries is​ symmetric, skewed to the​ left, or skewed to the​ right? Explain why. A) The distribution is skewed to the right because the mean is greater than the median. - Between what two values are the middle​ 50% of the salaries​ found? The middle​ 50% of the salaries are found between $350 & $1050 - Suppose business has been good & the company gives each employee a $100 raise. Tell the new value of each of the summary statistics. - After everyone receives a $100 ​raise, the new lowest salary is $400 - After everyone receives a $100 raise, the new mean salary is $1000 - After everyone receives a $100 raise, the new median is $800 - After everyone receives a $100 raise, the new range is $1400 - After everyone receives a $100 raise, the new IQR is $700 - After everyone receives a $100 ​raise, the new first quartile is $450 - After everyone receives a $100 ​raise, the new standard deviation is $400 - Instead, suppose the company gives each employee a 10% raise. Tell the new value of each of the summary statistics - After everyone receives a 10​% ​raise, the new lowest salary is ​$330 - After everyone receives a 10​% ​raise, the new mean salary is ​$990 - After everyone receives a 10​% ​raise, the new median is ​$770 - After everyone receives a 10​% ​raise, the new range is ​$1540 - After everyone receives a 10​% ​raise, the new IQR is ​$770 - After everyone receives a 10​% ​raise, the new first quartile is ​$385 - After everyone receives a 10​% ​raise, the new standard deviation is ​$440
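The two raises behave differently: adding a constant moves measures of position but leaves measures of spread alone, while multiplying by a constant rescales everything. A minimal sketch using the card's payroll figures (the names `shift` and `rescale` are illustrative, not from the source):

```python
# Summary statistics from the payroll card, in dollars.
stats = {"lowest": 300, "mean": 900, "median": 700, "range": 1400,
         "IQR": 700, "Q1": 350, "SD": 400}

# Measures of position move with a shift; range, IQR, and SD do not.
POSITION = {"lowest", "mean", "median", "Q1"}

def shift(s, c):
    """Add c to every salary: only measures of position change."""
    return {k: v + c if k in POSITION else v for k, v in s.items()}

def rescale(s, factor):
    """Multiply every salary by factor: every statistic is multiplied."""
    return {k: round(v * factor, 2) for k, v in s.items()}

after_flat_raise = shift(stats, 100)      # $100 raise
after_pct_raise = rescale(stats, 1.10)    # 10% raise

print(after_flat_raise)
print(after_pct_raise)
```

The results match the answers on the card: the flat raise leaves the range, IQR, and standard deviation at $1400, $700, and $400, while the 10% raise increases all seven values by 10%.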

Two standardized​ tests, A and​ B, use very different scales of scores. The formula A=50×B+150 approximates the relationship between scores on the two tests. Use the summary statistics for a sample of students who took test B to determine the summary statistics for equivalent scores on test A. - Find the summary statistics for equivalent scores on test A

- Lowest score = 1050 - Mean = 1550 - Standard deviation = 100 - Q3 = 1750 - Median = 1600 - IQR = 350

Fuel economy estimates for automobiles built one year predicted a mean of 27.2 mpg & a standard deviation of 6.8 mpg for highway driving. Assume that a Normal model can be applied. Use the 68−95−99.7 Rule - Draw the model for auto fuel economy - In what interval would you expect the central 99.7​% of autos to be​ found? - About what percent of autos should get more than 34 mpg? - About what percent of autos should get between 34 & 40.8 mpg? - Describe the gas mileage of the best ​2.5% of cars A) They get more than 47.6 mpg. B) They get more than 40.8 mpg. C) They get more than 34 mpg. D) They get less than 20.4 mpg.

- Draw the model for auto fuel economy - In what interval would you expect the central 99.7​% of autos to be​ found? Using the​ 68-95-99.7 rule, the central 99.7% of autos can be expected to be found in the interval from 6.8 to 47.6 mpg - About what percent of autos should get more than 34 mpg? Using the​ 68-95-99.7 rule, about 16​% of autos should get more than 34 mpg. - About what percent of autos should get between 34 & 40.8 mpg? Using the​ 68-95-99.7 rule, about 13.5% of autos should get between 34 & 40.8 mpg - Describe the gas mileage of the best ​2.5% of cars. B) They get more than 40.8 mpg.

The accompanying histogram shows the distribution of mean mathematics scores for a​ state's public schools. The vertical lines show the mean and one standard deviation above and below the mean. Approximately​ 78.8% of the data points are between the two outer lines. - Give two reasons that a Normal model is not appropriate for the data in the histogram. Select all that apply. A) If the Normal model was​ appropriate, approximately​ 68% of the data should lie between the two outer lines. B) The data are skewed left. C) If the Normal model was​ appropriate, approximately​ 95% of the data should lie between the two outer lines. D) The data are skewed right. - The accompanying Normal probability plot on the left shows the distribution of all the​ state's scores. The accompanying Normal probability plot on the right shows the same data with one​ region's schools​ (mostly in the low​ mode) removed. What do these plots tell you about the shape of the​ distribution? A) Both distributions are approximately the same. B) The shape of the distribution with the​ region's schools removed is more approximately Normal than the distribution with all the schools. C) The shape of the distribution with the​ region's schools removed is uniform. D) The shape of the distribution with the​ region's schools removed is more skewed than the distribution with all the schools.

- Give two reasons that a Normal model is not appropriate for the data in the histogram. Select all that apply A) If the Normal model was​ appropriate, approximately​ 68% of the data should lie between the two outer lines. B) The data are skewed left. - What do these plots tell you about the shape of the​ distribution? B) The shape of the distribution with the​ region's schools removed is more approximately Normal than the distribution with all the schools.

The mean household income in a country in a recent year was about ​$77,662 & the standard deviation was about ​$83,000. ​(The median income was ​$45,096​.) - If a Normal model was used for these​ incomes, what would be the household income of the top 5​%? - How confident should one be in the answer in part​ a? - Why might the Normal model not be a good one for​ incomes?

- If a Normal model was used for these​ incomes, what would be the household income of the top 5​%? - How confident should one be in the answer in part​ a? - Why might the Normal model not be a good one for​ income?
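The answer side of this card is blank in the source. As a rough sketch only (not the source's answer), part (a) could be approached with Python's standard-library `statistics.NormalDist`, using the mean and standard deviation stated in the question. The variable names are illustrative.

```python
from statistics import NormalDist

# Normal model built from the figures in the question (dollars).
incomes = NormalDist(mu=77662, sigma=83000)

# Income that separates the top 5% from the rest under this model.
cutoff_top_5 = incomes.inv_cdf(0.95)

# Share of households the model would place below $0, a sanity check
# on whether a Normal model is reasonable here.
share_negative = incomes.cdf(0)

print(round(cutoff_top_5), round(share_negative, 3))
```

Under these assumptions the model puts a sizeable fraction of households at negative incomes, and the mean sits well above the stated median of $45,096. Both point to a right-skewed distribution, which is one way to reason about parts (b) and (c).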

From a previous study of 576 cities around the​ world, it appeared that the mean cost of a cappuccino was slightly higher than the mean cost of a dozen​ eggs; however, each of the corresponding boxplots showed significant variation. Given the variation among the​ prices, could that difference be due just to​ chance? To examine that​ further, we took 1000 random samples of 100 cities & computed the difference between the mean price of a cappuccino & a dozen eggs. The histogram to the right shows the 1000 mean differences. Use this display to complete parts a & b below. - If there were no real difference between the mean​ prices, where would you expect the center of the histogram to​ be? A) One would expect the histogram to be centered around the difference in medians between the two data sets which would indicate that large spread is causing the apparent difference B) One would expect the histogram to be centered around 0 which would indicate that the mean prices were equal C) One would expect the histogram to be centered around 0 which would indicate that the median prices were equal D) One would expect the histogram to be centered around the difference in medians between the two data sets which would indicate that the skewness is causing the apparent difference - Given the​ histogram, what do you​ conclude? Explain briefly. A) The histogram shows that it is unlikely that the mean prices are​ equal, but this would almost always be the case with a sample size this large. B) The histogram shows that it is likely that the mean prices are equal since almost all samples show a greater difference in means. C) The histogram shows that it is likely that the mean prices are equal because the histogram only shows small differences. D) The histogram shows that it is unlikely that the mean prices are equal since 0 is not even shown on the horizontal axis.

- If there were no real difference between the mean​ prices, where would you expect the center of the histogram to​ be? B) One would expect the histogram to be centered around 0 which would indicate that the mean prices were equal - Given the​ histogram, what do you​ conclude? Explain briefly. D) The histogram shows that it is unlikely that the mean prices are equal since 0 is not even shown on the horizontal axis.

Ozone levels​ (in parts per​ billion, ppb) were recorded at sites in New Jersey monthly between 1926 and 1971. Here are boxplots of the data for each month​ (over the 46​ years), lined up in order ​(January=​1). - In what month was the highest ozone level ever​ recorded? - Which month has the largest​ IQR? - Which month has the smallest​ range? - Write a brief comparison of the ozone levels in January & June. Choose the correct answer below A) The median ozone level in June was slightly lower than the median ozone level in​ January, but the IQR was significantly larger B) The median ozone level in June was slightly lower than the median ozone level in January & the IQR was significantly smaller C) The median ozone level in June was slightly higher than the median ozone level in January & the IQR was significantly larger D) The median ozone level in June was slightly higher than the median ozone level in​ January, but the IQR was significantly smaller - Write a report on the annual patterns you see in the ozone levels. Choose the correct answer below A) The ozone levels appear to be roughly consistent over the course of the year. Any pattern in the boxplots is only visible because the vertical axis does not start at 0 B) The ozone levels appear to follow a sinusoidal pattern over the​ year, reaching a maximum in spring & a minimum in fall C) The ozone levels appear to follow a sinusoidal pattern over the​ year, reaching a minimum in spring & a maximum in fall D) The ozone levels appear to drop over the course of the​ year, implying that ozone levels will approach 0 in years to come

- In what month was the highest ozone level ever​ recorded? The highest ozone level ever was recorded in April - Which month has the largest​ IQR? The month of February has the largest IQR - Which month has the smallest​ range? The month of August has the smallest range - Write a brief comparison of the ozone levels in January & June. Choose the correct answer below. D) The median ozone level in June was slightly higher than the median ozone level in​ January, but the IQR was significantly smaller - Write a report on the annual patterns you see in the ozone levels. Choose the correct answer below. B) The ozone levels appear to follow a sinusoidal pattern over the​ year, reaching a maximum in spring & a minimum in fall

Corey has 4929 songs in his​ computer's music library. The lengths of the songs have a mean of 242.4 seconds and standard deviation of 114.51 seconds with the accompanying Normal probability plot of song lengths. - Is this distribution​ Normal? Explain. A) Yes, because the plot is roughly a diagonal straight line. B) No, because there is no pattern to the plot. C) Yes, because there is a curved pattern to the plot. D) No, because the plot is not roughly a diagonal straight line. - If it​ isn't Normal, how does it differ from a Normal​ model? A) The distribution is skewed to the left. B) The distribution is uniform. C) The distribution is skewed to the right. D) The plot is Normal.

- Is this distribution​ Normal? Explain. D) No, because the plot is not roughly a diagonal straight line - If it​ isn't Normal, how does it differ from a Normal​ model? C) The distribution is skewed to the right.

Here are boxplots of the points scored during the first 10 games of the season for both Fran & Kelly. - Summarize the similarities & differences in their performance so far A) Both girls have the same approximate​ median, but Fran has a larger IQR B) Both girls have the same approximate​ mean, but Kelly has a larger IQR C) Both players are about the​ same, except Fran can score more points D) Both girls have the same approximate median &​ IQR, but Fran has a larger range - The coach can take only one player to the state championship. Which one should she​ take? Why? A) She should take Fran​, because she has the ability to score a higher point total B) She should take Kelly​, because she is a more consistent performer C) It does not matter which player she​ takes, because they both score the same average number of points D) A & B are both​ possible, depending on the​ coach's preference

- Summarize the similarities & differences in their performance so far A) Both girls have the same approximate​ median, but Fran has a larger IQR - The coach can take only one player to the state championship. Which one should she​ take? Why? D) A & B are both​ possible, depending on the​ coach's preference

A company selling clothing on the Internet reports that the packages it ships have a median weight of 37 ounces and an IQR of 24 ounces. - The company plans to include a sales flyer weighing 3 ounces in each package. What will the new median & IQR​ be? - If the company recorded the shipping weights of the packages with the sales flyers included in pounds instead of​ ounces, what would the median & IQR​ be?

- The company plans to include a sales flyer weighing 3 ounces in each package. What will the new median & IQR​ be? The new median will be 40 ounces. The new IQR will be 24 ounces. - If the company recorded the shipping weights of the packages with the sales flyers included in pounds instead of​ ounces, what would the median & IQR​ be? The new median would be 2.5 pounds. The new IQR would be 1.5 pounds.

A health statistics center compiles data on the length of stay by patients in​ short-term hospitals & publishes its findings in a health statistics journal. Data from a sample of 39 male patients and 35 female patients on length of stay in days are displayed in the accompanying histograms. - What could be changed about these histograms to make them easier to​ compare? A) The histograms could be put on the same horizontal​ scale, 0.0 to 20.0 B) The vertical scales of both histograms could be reduced to 12 C) The histograms could be combined so the data for men & women are mixed together D) The center could obtain data from four more women to match the sample sizes & add those data to the​ "Women" histogram - Describe these distributions by writing a few sentences comparing the duration of hospitalization for men & women A) Lengths of​ men's stays appear to vary more than for women. Women have a mode near 5​ days, with a sharp drop afterward. Men have a mode at about 12​ days, with roughly equal amounts of times on either side of that mode. B) Lengths of​ women's stays appear to vary more than for men. Men have a mode at 1 day & then taper off from there. The shapes of the distributions are roughly equal. C) Lengths of​ men's stays appear to vary more than for women. Men have a mode at 1 day & then taper off from there. Women have a mode near 5​ days, with a sharp drop afterward. D) Lengths of​ women's stays appear to vary more than for men.​ However, men appear to have a larger range of stay lengths than women. The distribution of​ women's stays appears to follow a bell curve - Suggest a reason for the peak in​ women's length of stay

- What could be changed about these histograms to make them easier to​ compare? A) The histograms could be put on the same horizontal​ scale, 0.0 to 20.0 - Describe these distributions by writing a few sentences comparing the duration of hospitalization for men & women. C) Lengths of​ men's stays appear to vary more than for women. Men have a mode at 1 day & then taper off from there. Women have a mode near 5​ days, with a sharp drop afterward. - Suggest a reason for the peak in​ women's length of stay A possible reason is childbirth

A study was conducted on shoe sizes of​ students, reported in European sizes. For the​ men, the mean size was 44.95 with a standard deviation of 2.02. To convert European shoe sizes to U.S. sizes for​ men, use the equation shown below. US size = EuroSize ×0.7931−23.6 - What is the mean​ men's shoe size for these responses in U.S.​ units? - What is the standard deviation in U.S.​ units?

- What is the mean​ men's shoe size for these responses in U.S.​ units? The mean​ men's shoe size in U.S. units is 12.05 - What is the standard deviation in U.S.​ units? The standard deviation in U.S. units is 1.60

A class of fourth-graders takes a diagnostic reading​ test, & the scores are reported by reading grade level. The​ 5-number summaries for 13 boys & 10 girls are shown below - Which group had the highest​ score? - Which group had the greatest​ range? - Which group had the greatest interquartile​ range? - Which group generally did better on the​ test? Explain. A) The girls did better on the reading​ test, because the mean for the girls was higher than the mean for the boys B) The girls did better on the reading​ test, because the median for the girls was higher than the median for the boys C) The boys did better on the reading​ test, because the mean for the boys was higher than the mean for the girls D) The boys did better on the reading​ test, because the upper quartile for the boys was higher than the upper quartile for the girls E) The boys did better on the reading​ test, because the median for the boys was higher than the median for the girls - If the mean reading level for boys is 4.7 and for girls is 4.4​, what is the overall mean for the​ class?

- Which group had the highest​ score? The girls had the highest score of 5.9 - Which group had the greatest​ range? The boys had the greatest range of 3.1 - Which group had the greatest interquartile​ range? The boys had the greatest interquartile range of 1.4 - Which group generally did better on the​ test? Explain. E) The boys did better on the reading​ test, because the median for the boys was higher than the median for the girls - If the mean reading level for boys is 4.7 and for girls is 4.4​, what is the overall mean for the​ class? The overall mean for the class is 4.6
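The overall class mean is a weighted average, not the midpoint of 4.7 and 4.4, because the groups have different sizes. A minimal sketch with the card's numbers:

```python
# Group sizes and means from the card.
n_boys, mean_boys = 13, 4.7
n_girls, mean_girls = 10, 4.4

# Weight each group mean by its group size, then divide by the class size.
overall_mean = (n_boys * mean_boys + n_girls * mean_girls) / (n_boys + n_girls)

print(round(overall_mean, 1))  # 4.6
```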

In an experiment to determine whether seeding clouds with silver iodide increases​ rainfall, 50 clouds were randomly assigned to be seeded or not. The amount of rain they​ generated, as shown​ below, was then measured​ (in acre-feet) - Which of the summary statistics are most appropriate for describing these​ distributions? Why? - Is there any evidence that seeding clouds may be​ effective? Explain. A) Although the seeded clouds appear to produce more​ rainfall, the spread of the data is too large to know for sure. B) There is evidence that seeding clouds is effective because all of the statistics for the seeded clouds are higher than those for the unseeded clouds. C) There is no evidence that seeding clouds is effective because all of the statistics for the seeded clouds are higher than those for the unseeded clouds.

- Which of the summary statistics are most appropriate for describing these​ distributions? Why? Because the distributions are skewed, the median should be used to describe the center & the quartiles (& IQR) should be used to describe the spread. - Is there any evidence that seeding clouds may be​ effective? Explain. A) Although the seeded clouds appear to produce more​ rainfall, the spread of the data is too large to know for sure.

A certain poll question​ asked, "In politics, as of today, do you consider yourself moderate, liberal, or conservative?" The possible responses were: - "Moderate" - "Liberal" - "Conservative" - "Other" - "No Response" What kind of variable is the​ response? A) Categorical variable B) Quantitative variable

A) Categorical variable

Shown to the right are the histogram and summary statistics for the number of camp sites at public parks in a particular state - Which statistics would you use to identify the center & spread of this​ distribution? Why? - How many parks would you classify as​ outliers? Explain. A) There are two outliers because two data values are more than 3 standard deviations from the mean B) There are three outliers because three data values are more than 3 standard deviations from the mean C) There are three outliers because three data values are either more than 1.5 IQRs below Q1 or more than 1.5 IQRs above Q3 D) There are no parks that would be classified as outliers E) There are two outliers because two data values are either more than 1.5 IQRs below Q1 or more than 1.5 IQRs above Q3 - Describe the distribution.

- Which statistics would you use to identify the center & spread of this​ distribution? Why? Because the distribution is skewed, the median should be used to describe the center & the quartiles (& IQR) should be used to describe the spread - How many parks would you classify as​ outliers? Explain. C) There are three outliers because three data values are either more than 1.5 IQRs below Q1 or more than 1.5 IQRs above Q3 - Describe the distribution. The distribution is unimodal & skewed to the right.
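The outlier answer relies on the 1.5 IQR rule. The card's histogram and summary table are not reproduced here, so the sketch below uses hypothetical quartiles and hypothetical site counts purely to illustrate the rule. None of these numbers come from the source.

```python
def outlier_fences(q1, q3):
    """Lower and upper fences under the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# HYPOTHETICAL quartiles and data, for illustration only.
low, high = outlier_fences(q1=28, q3=78)
site_counts = [12, 40, 62, 75, 150, 160, 210, 250]

# A value is flagged if it falls outside either fence.
outliers = [x for x in site_counts if x < low or x > high]

print((low, high), outliers)   # (-47.0, 153.0) [160, 210, 250]
```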

A livestock company reports that the mean weight of a group of young steers is 1132 pounds with a standard deviation of 79 pounds. Based on the model ​N(1132​,79​) for the weights of​ steers, what percent of steers weigh - over 1050 ​pounds? - under 1200 ​pounds? - between 1000 & 1300 ​pounds?

- over 1050 ​pounds? 85.0​% of steers have weights above 1050 pounds - under 1200 ​pounds? 80.5​% of steers have weights below 1200 pounds - between 1000 & 1300 ​pounds? ​93.6% of the steers weigh between 1000 & 1300 pounds
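These percentages need the Normal cumulative distribution rather than the 68-95-99.7 rule, because the cut points are not whole numbers of standard deviations from the mean. A minimal sketch using the standard-library `statistics.NormalDist` (variable names are illustrative):

```python
from statistics import NormalDist

# Model N(1132, 79) for steer weights in pounds, as stated on the card.
steers = NormalDist(mu=1132, sigma=79)

over_1050 = 1 - steers.cdf(1050)                  # upper tail
under_1200 = steers.cdf(1200)                     # lower tail
between = steers.cdf(1300) - steers.cdf(1000)     # area between two cut points

for p in (over_1050, under_1200, between):
    print(f"{100 * p:.1f}%")
```

The printed values should agree with the card's answers to one decimal place.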

A​ company's customer service hotline handles many calls relating to​ orders, refunds, & other issues. The​ company's records indicate that the median length of calls to the hotline is 4.6 minutes with an IQR of 2.7 minutes. ​- If the company were to describe the duration of these calls in seconds instead of​ minutes, what would the median & IQR​ be? ​- In an effort to speed up the customer service​ process, the company decides to streamline the series of pushbutton menus customers must​ navigate, cutting the time by 30 seconds. What will the median & IQR of the length of hotline calls​ become?

- If the company were to describe the duration of these calls in seconds instead of minutes, what would the median & IQR be? The new median would be 276 seconds. The new IQR would be 162 seconds. - What will the median & IQR of the length of hotline calls become? The new median will be 246 seconds. The new IQR will be 162 seconds.

The average score on the Stats midterm was 75 points with a standard deviation of 7 points, & Jeremiah​'s z-score was −2. How many points did he​ score?

61 points
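A z-score counts standard deviations from the mean, so the raw score is recovered by reversing the standardization. A minimal sketch with the card's values:

```python
# Midterm figures from the card.
mean, sd, z = 75, 7, -2

# z = (score - mean) / sd, so score = mean + z * sd.
score = mean + z * sd

print(score)  # 61
```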

Satellites send back nearly continuous data on the​ Earth's landmasses, oceans, & atmosphere from space. How might researchers use this information in both the short & long term to help study changes in the​ Earth's climate? A) In the short​ term, researchers can more accurately report weather​ patterns, including hurricanes & tsunamis. In the long​ term, this rise & fall of temperature & water levels can help in planning for future problems & guide public policy to protect our safety B) In the short​ term, researchers can add more features to the satellite's data collection process that will benefit other fields of study. In the long​ term, the data can be combined into a thorough & complete history of the​ Earth's climate C) In the short​ term, researchers can accurately predict natural disasters. In the long​ term, the function of these satellites can be expanded to include measurements regarding the climate on the Moon & Mars D) In the short​ term, researchers can manipulate the landmasses, oceans, & atmosphere to test whether their satellites are measuring the data correctly. In the long​ term, researchers can change the climate to be more beneficial for the Earth

A) In the short​ term, researchers can more accurately report weather​ patterns, including hurricanes & tsunamis. In the long​ term, this rise & fall of temperature & water levels can help in planning for future problems & guide public policy to protect our safety

A study reports load factors​ (passenger-miles as a percentage of available​ seat-miles) for commercial airlines for every month from 2000 through 2013. The accompanying display shows the domestic load factors by year. Discuss the patterns in this display. A) Load factors were generally lower & more variable in the early​ 2000s, but became higher & less variable each year B) Load factors were generally lower & less variable in the early​ 2000s, but became higher & more variable each year C) Load factors were generally higher & more variable in the early​ 2000s, but became lower & less variable each year D) Load factors were generally lower in the early​ 2000s, but became higher each year. The variability of the load factors did not increase or decrease over time

A) Load factors were generally lower & more variable in the early​ 2000s, but became higher & less variable each year

Many online retailers keep data on products that customers​ buy, & even products they look at. What do online retailers hope to gain from such​ information? Select all that apply. A) Revenue from selling the data to marketing companies B) Guidance in designs & placement of links to personalize the Web site experience C) Guidance in pricing & offers to help maximize profit D) Revenue from selling a​ customer's data back to the customer

A) Revenue from selling the data to marketing companies B) Guidance in designs & placement of links to personalize the Web site experience C) Guidance in pricing & offers to help maximize profit

A study of body fat on 250 men collected measurements of 12 body parts as well as the percentage of body fat that the men carried. The first accompanying display is a dotplot of their bicep circumferences​ (in centimeters). The second accompanying display was formed by dividing each measurement by 2.54 to convert it to inches. Do the two dot plots look​ different? What might account for​ that? Select all that apply A) The dotplots look different. The plot based on inches has fewer values on the horizontal​ axis, so it shows less detail. B) The dotplots do not look different. C) The dotplots look different. Distributions in centimeters tend to have more than one​ mode; converting to inches consolidates data values and results in a unimodal distribution. D) The dotplots look different. Distributions in centimeters are naturally skewed to the​ right; converting to inches removes this skew and results in a symmetric distribution.

A) The dotplots look different. The plot based on inches has fewer values on the horizontal​ axis, so it shows less detail.

In​ 1975, did men and women marry at the same​ age? Here are boxplots of the age at first marriage for a sample of U.S. citizens then. Write a brief report discussing what these data show. A) Women appear to marry about 3 years younger than​ men, but the two distributions are very similar in shape & spread B) Both women & men appear to marry in their late​ twenties, & the two distributions are very similar in shape & spread C) Both women & men appear to marry in their early​ twenties, & the two distributions are very similar in shape & spread D) Men appear to marry about 3 years younger than​ women, but the two distributions are very similar in shape & spread

A) Women appear to marry about 3 years younger than​ men, but the two distributions are very similar in shape & spread

Would you expect the distribution of each variable in parts a through d below to be uniform, unimodal, or bimodal? Symmetric or skewed? Explain why. a) Ages of people at a Little League game. The distribution would probably be? b) The number of siblings of people in your class. The distribution would probably be? c) Pulse rates of college-age males. The distribution would probably be? d) The number of times each face of a die shows in 100 tosses. The distribution would probably be?

a) Ages of people at a Little League game. The distribution would probably be bimodal & skewed because children & parents will have very different average ages & some parent ages can be well above the mean. b) The number of siblings of people in your class. The distribution would probably be unimodal & skewed because most students have 0, 1, or 2 siblings, & few have more than 2. c) Pulse rates of college-age males. The distribution would probably be unimodal & symmetric because most men will have a pulse near the average, with a few faster than average & a few slower than average. d) The number of times each face of a die shows in 100 tosses. The distribution would probably be uniform & symmetric because each face of the die has the same chance of coming up.

A poll asked people in a certain country, "Thinking about household pets today, would you say that you would rather have a dog or a cat in your household?" The choices were: - Dog or - Cat. What kind of variable is the response? A) Quantitative variable B) Categorical variable C) The response is not a variable.

B) Categorical variable

The Environmental Protection Agency provides fuel economy & pollution information on over 2000 car models. Here is a boxplot of combined fuel economy (using an average of driving conditions) in miles per gallon by vehicle type (midsize car, standard pickup truck, or SUV) for 2012 model vehicles. Summarize the fuel economies of the three vehicle types. A) In general, fuel economy is higher in SUVs than in either cars or pickup trucks. The top 50% of SUVs get higher fuel economy than 75% of cars & all pickups. The distribution for SUVs shows less spread B) In general, fuel economy is higher in cars than in either SUVs or pickup trucks. The top 50% of cars get higher fuel economy than 75% of SUVs & all pickups. The distribution for pickups shows less spread C) In general, fuel economy is higher in SUVs than in either cars or pickup trucks. The top 75% of SUVs get higher fuel economy than 50% of cars & 75% of pickups. The distribution for SUVs shows less spread D) In general, fuel economy is higher in cars than in either SUVs or pickup trucks. The top 75% of cars get higher fuel economy than 50% of SUVs & 75% of pickups. The distribution for pickups shows less spread

B) In​ general, fuel economy is higher in cars than in either SUVs or pickup trucks. The top​ 50% of cars get higher fuel economy than​ 75% of SUVs & all pickups. The distribution for pickups shows less spread

A pharmaceutical company conducts an experiment in which a subject takes 100 mg of a substance orally. The researchers measure how many minutes it takes for one-quarter of the substance to exit the bloodstream. What kind of variable is the company​ studying? A) Categorical variable B) Quantitative variable

B) Quantitative variable

Sports announcers love to quote statistics. During football championship​ games, they particularly love to announce when a record has been broken. They might have a list of all football championship​ games, along with the scores of each​ team, total scores for the two​ teams, the margin of​ victory, passing yards for the​ quarterbacks, & many more bits of information. Identify the who in this list A) The two teams B) The individual games C) The quarterbacks D) The sports announcers

B) The individual games

The histogram to the right shows the distribution of the prices of plain pizza slices​ (in $) for 306 weeks in a large city. Which summary statistics would you choose to summarize the center and spread in these​ data? Why? A) The median & standard deviation should be reported because the distribution is unimodal & symmetric B) The mean & standard deviation should be reported because the distribution is unimodal and symmetric C) The median & IQR should be reported because the distribution is strongly skewed D) The mean & IQR should be reported because the distribution is strongly skewed

B) The mean & standard deviation should be reported because the distribution is unimodal and symmetric

Pollsters are interested in predicting the outcome of elections. Give an example of how they might model whether someone is likely to vote. A) The pollsters might consider whether a person lives in a​ city, which indicates easier access to polling stations B) The pollsters might consider whether a person voted previously or whether he or she could name the​ candidates, which indicates a greater interest in the election C) The pollsters might consider whether a person is legally allowed to vote. If they​ are, then the likelihood that person will vote is very high D) The pollsters might consider whether a person could name the current​ president, which indicates an interest in politics

B) The pollsters might consider whether a person voted previously or whether he or she could name the​ candidates, which indicates a greater interest in the election

A research agency analyzes load factors​ (passenger-miles as a percentage of available​ seat-miles) for commercial airlines. The accompanying boxplots summarize the data from 2000 to 2011 for international flights. Look at the accompanying boxplots by month. Two outliers are​ evident; one in September and one in October. Both of these are for the year 2001 and reflect the terrorist attacks of​ 9/11. Should the data for these months be set aside in any overall analysis of load factor patterns by​ month? Explain A) Air travel immediately after the events of​ 9/11 was typical of air travel in general. In order to analyze the monthly​ pattern, it might be best to leave these months in the data B) Air travel immediately after the events of​ 9/11 was not typical of air travel in general. In order to analyze the monthly​ pattern, it might be best to leave these months in the data C) Air travel immediately after the events of​ 9/11 was not typical of air travel in general. In order to analyze the monthly​ pattern, it might be best to set these months aside D) Air travel immediately after the events of​ 9/11 was typical of air travel in general. In order to analyze the monthly​ pattern, it might be best to set these months aside

C) Air travel immediately after the events of​ 9/11 was not typical of air travel in general. In order to analyze the monthly​ pattern, it might be best to set these months aside

A study reports load factors​ (passenger-miles as a percentage of available​ seat-miles) for commercial airlines for every month from 2000 through 2013. The accompanying boxplots display the domestic load factors by​ year, and all of the domestic load factors. The boxplot for all of the load factors shows a single outlier. The boxplots for each year show the same​ data but have no outliers. Why do you think this​ is? A) Outliers are context-dependent. Data values that seem ordinary when placed against one​ year's data may stand out when compared against the data for another year B) Outliers are context-dependent. Data values that seem ordinary when placed against the entire data set may stand out when compared against the data for one year C) Outliers are context-dependent. Data values that seem ordinary when placed against one​ year's data may stand out when compared against the entire data set D) The data value for this outlier is an error

C) Outliers are context-dependent. Data values that seem ordinary when placed against one​ year's data may stand out when compared against the entire data set
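The idea that an outlier depends on the comparison group can be checked with a small sketch. The load factors below are invented for illustration (they are not the study's data), and the 1.5 × IQR fence with Python's "inclusive" quartile method is an assumed boxplot rule; other software may compute quartiles slightly differently.

```python
from statistics import quantiles

def tukey_outliers(values):
    """Return values beyond the 1.5 * IQR fences (a common boxplot rule)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical load factors (%), invented for illustration only.
variable_year = [60, 65, 66, 69, 71, 73, 74, 75, 76, 77, 78, 80]
other_years = [72, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84] * 3

# Judged against its own (widely spread) year, the value 60 is inside the fences.
within_year = tukey_outliers(variable_year)
# Judged against all years pooled together, the same value falls below the lower fence.
overall = tukey_outliers(variable_year + other_years)
print(within_year, overall)
```

The low value is unremarkable inside a year whose box is wide, but the pooled data have a narrower box, so the same value is flagged.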

The toy​ hoop, a popular​ children's toy in the​ 1950s, has gained popularity as an exercise in recent years. But does it​ work? To answer the​ question, an exercise council conducted a study to evaluate the cardio &​ calorie-burning benefits of using this​ children's toy. Researchers recorded the heart rate & oxygen consumption of adult​ participants, as well as their individual ratings of perceived​ exertion, at regular intervals during a​ 30-minute workout. Identify Who & What was investigated & the Population of interest A) The Who is children & the What is the number​ of children's toys sold. The population of interest is all children. B) The Who is all adults & the What is the amount of time spent working out. The population of interest is the participants in the study. C) The Who is the participants in the study & the What is the recorded heart​ rate, oxygen​ consumption, & perceived exertion. The population of interest is all adults. D) The Who is the exercise council & the What is the recorded heart​ rate, oxygen​ consumption, & perceived exertion. The population of interest is all adults.

C) The Who is the participants in the study & the What is the recorded heart​ rate, oxygen​ consumption, & perceived exertion. The population of interest is all adults.

Describe what these boxplots tell you about the relationship between the number of cylinders a​ car's engine has and the​ car's fuel economy​ (mpg). Select all that apply A) These boxplots show that as the number of cylinders​ increases, fuel economy tends to increase B) These boxplots show that as the number of cylinders​ increases, fuel economy tends to become more variable C) These boxplots show that as the number of cylinders​ increases, fuel economy tends to become less variable D) These boxplots show that as the number of cylinders​ increases, fuel economy tends to decrease

C) These boxplots show that as the number of cylinders​ increases, fuel economy tends to become less variable D) These boxplots show that as the number of cylinders​ increases, fuel economy tends to decrease

Meteorologists utilize sophisticated models to predict the weather up to ten days in advance. Give an example of how they might assess their models. A) They can use the models to predict the average temperature ten days in advance & compare their predictions to the previous​ year's data B) They can create another model to predict the weather & compare the predictions of each model to determine the accuracy C) They can use the models to predict the average temperature ten days in advance & compare their predictions to the actual temperatures D) They can use the models to predict the average​ two-day temperature for the next ten days & compare the individual predictions to the predicted average temperature for the entire ten-day period

C) They can use the models to predict the average temperature ten days in advance & compare their predictions to the actual temperatures

Here is a display of the international load factors by month for the period from 2000 to 2011. Discuss the patterns in this display. A) Load factors are generally highest & least variable in the spring months (March-May). They are lower & more variable in the summer & winter B) Load factors are generally highest & least variable in the summer months (June-August). They are lower & more variable in the winter & spring C) Load factors are generally highest & least variable in the winter months (December-February). They are lower & more variable in the summer & spring. D) Load factors are generally highest & least variable in the fall months (September-October). They are lower & more variable in the summer & spring

B) Load factors are generally highest & least variable in the summer months (June-August). They are lower & more variable in the winter & spring

A study reports load factors (passenger miles as a percentage of available seat miles) for commercial airlines for every month from 2000 through 2011. Here are histograms comparing the domestic & international load factors for this time period. Compare & contrast the distributions. A) Both distributions are unimodal & skewed to the left. The lowest international value may be an outlier. The medians are very similar. The IQRs show that the international load factors vary a bit more. B) Both distributions are uniform. The lowest international value may be an outlier. The medians are very similar. The IQRs show that the domestic load factors vary a bit more. C) Both distributions are unimodal & skewed to the right. The lowest international value may be an outlier. The medians are very similar. The IQRs show that the domestic load factors vary a bit more. D) Both distributions are unimodal & skewed to the left. The lowest international value may be an outlier. The medians are very similar. The IQRs show that the domestic load factors vary a bit more.

D) Both distributions are unimodal & skewed to the left. The lowest international value may be an outlier. The medians are very similar. The IQRs show that the domestic load factors vary a bit more.

A study reports load factors (passenger miles as a percentage of available seat miles) for commercial airlines for every month from 2000 through 2013. The accompanying histograms compare load factors for September through March versus those for April through August. Compare and contrast the distributions. A) Both distributions are unimodal & skewed to the left. The lowest fall/winter value may be an outlier. The spring/summer load factors appear to vary more than the fall/winter load factors. B) Both distributions are unimodal & skewed to the left. The lowest spring/summer value may be an outlier. The fall/winter load factors appear to vary more than the spring/summer load factors. C) Both distributions are uniform. The lowest spring/summer value may be an outlier. The fall/winter load factors appear to vary more than the spring/summer load factors. D) Both distributions are unimodal & skewed to the right. The lowest spring/summer value may be an outlier. The fall/winter load factors appear to vary more than the spring/summer load factors.

B) Both distributions are unimodal & skewed to the left. The lowest spring/summer value may be an outlier. The fall/winter load factors appear to vary more than the spring/summer load factors.

A social media website uploads more than 350 million photos every day onto its servers. For this​ collection, describe the Who & the What A) The Who is the​ servers; the What is the number of photos on each server. B) The Who is the social media​ website; the What is the photos. C) The Who is the user of the social media​ website; the What is the total number of users D) The Who is the​ photos; the What could be the facts about the photos

D) The Who is the​ photos; the What could be the facts about the photos

The accompanying histogram shows the total number of adoptions in each of 45 regions. Determine whether the mean number of adoptions or the median number of adoptions is higher.​ Why? A) The mean is higher because the distribution is skewed to the low end, so the mean is pulled toward the higher values B) The median is higher because the distribution is nearly symmetric, so the mean is pulled toward the lower values C) The median & mean are about the same because the distribution is nearly symmetric​, so the mean is pulled equally by the higher & lower values D) The mean is higher because the distribution is skewed to the high end​, so the mean is pulled toward the higher values E) The median & mean are about the same because the distribution is skewed to the high end, so the mean is pulled equally by the higher & lower values F) The median is higher because the distribution is skewed to the low end​, so the mean is pulled toward the lower values

D) The mean is higher because the distribution is skewed to the high end​, so the mean is pulled toward the higher values
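A quick numerical check of why skew separates the mean from the median. The values below are hypothetical (the histogram's actual counts are not given in the text); the second list is simply a mirror image of the first, so it is skewed the other way.

```python
from statistics import mean, median

# Hypothetical values with a long tail toward the high end (right-skewed).
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20, 45]

# Mirror image of the same values: the long tail now points toward the low end.
left_skewed = [50 - x for x in right_skewed]

for name, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    # The mean is dragged toward the tail; the median stays with the bulk of the data.
    print(name, "mean =", round(mean(data), 2), "median =", median(data))
```

In the right-skewed list the two large values pull the mean above the median; in the mirrored list the two small values pull it below.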

A company that sells frozen pizza to stores in four markets (cities A, B, C, and D) wants to examine the prices that the stores charge for pizza slices. To the right are boxplots comparing data from a sample of stores in each market. Do prices appear to be the same in the four markets? Explain. A) No. Prices appear to be both lower on average & more variable in city C B) No. Prices appear to be both higher on average & more variable in city D C) Yes. Prices appear to be the same in the four markets D) No. Prices appear to be both lower on average & less variable in city A. Does the presence of any outliers affect your overall conclusions about the prices in the four markets?

Do prices appear to be the same in the four markets? Explain. A) No. Prices appear to be both lower on average & more variable in city C. Does the presence of any outliers affect your overall conclusions about the prices in the four markets? No, the presence of outliers does not affect the overall conclusions.

An organization awards prizes in six categories to people each year. Their website allows you to look up all the prizes awarded in any year. The data are not listed in a table. Rather you drag a slider to the year & see a list of the awardees for that year. Describe the​ "who" in this scenario. A) The names of people who have been awarded a prize from the organization B) The categories in which the prizes are awarded C) The organization that awards the prizes D) The years in which the prizes are awarded E) The people who have been awarded a prize from the organization

E) The people who have been awarded a prize from the organization

In a way, boxplots are the opposite of histograms. A histogram divides the number line into equal intervals and displays the number of data values in each interval. A boxplot divides the data into equal parts & displays the portion of the number line each part covers. These two plots display the number of incarcerated prisoners in each state at the end of a certain year. Explain how you could tell, by looking at a boxplot, where the tallest bars on the histogram would be located. A) The areas in the boxplot where two vertical dividers are closest together will likely contain the tallest bars B) The areas in the boxplot where two vertical dividers are farthest apart will likely contain the tallest bars C) The tallest bars are going to typically occur above the outliers D) The tallest bars are going to occur close to the median. Explain how both the boxplot & the histogram can indicate a skewed distribution. A) The boxplot indicates a skewed distribution because the median is not centered & the whiskers are not roughly the same length. The histogram indicates a skewed distribution since the right tail stretches out much further than the left tail B) The boxplot indicates a skewed distribution because the median is not centered & the whiskers are not roughly the same length. The histogram indicates a skewed distribution since the right tail & left tail stretch out equally in length C) The boxplot indicates a skewed distribution because the median is centered & the whiskers are roughly the same length. The histogram indicates a skewed distribution since the right tail stretches out much further than the left tail D) The boxplot indicates a skewed distribution because the median is centered & the whiskers are roughly the same length. The histogram indicates a skewed distribution since the right tail & left tail stretch out equally in length. Identify one feature of the distribution that the histogram shows but the boxplot does not. A) The histogram shows the presence of outliers while the boxplot does not B) The histogram shows the location of the median while the boxplot does not C) The histogram shows that the distribution is roughly uniform on the left side while the boxplot does not D) The histogram shows that a very large number of values occur in the smallest bin while the boxplot does not. Identify one feature of the distribution that the boxplot shows but the histogram does not. A) The boxplot shows the mode while the histogram does not B) The boxplot shows the presence of outliers while the histogram does not C) The boxplot shows the location of the median while the histogram does not D) The boxplot shows that the distribution is roughly uniform on the left side while the histogram does not

Explain how you could​ tell, by looking at a​ boxplot, where the tallest bars on the histogram would be located A) The areas in the boxplot where two vertical dividers are closest together will likely contain the tallest bars Explain how both the boxplot & the histogram can indicate a skewed distribution A) The boxplot indicates a skewed distribution because the median is not centered & the whiskers are not roughly the same length. The histogram indicates a skewed distribution since the right tail stretches out much further than the left tail Identify one feature of the distribution that the histogram shows but the boxplot does not D) The histogram shows that a very large number of values occur in the smallest bin while the boxplot does not Identify one feature of the distribution that the boxplot shows but the histogram does not C) The boxplot shows the location of the median while the histogram does not

The accompanying histogram shows the total number of adoptions in each of the 43 regions. Determine whether the mean number of adoptions or the median number of adoptions is higher. Why? A) The median and mean are about the same because the distribution is skewed to the high end, so the mean is pulled equally by the higher & lower values B) The median is higher because the distribution is nearly symmetric, so the mean is pulled toward the lower values C) The mean is higher because the distribution is skewed to the low end, so the mean is pulled toward the higher values D) The mean is higher because the distribution is skewed to the high end, so the mean is pulled toward the higher values E) The median and mean are about the same because the distribution is nearly symmetric, so the mean is pulled equally by the higher & lower values F) The median is higher because the distribution is skewed to the low end, so the mean is pulled toward the lower values

F) The median is higher because the distribution is skewed to the low end​, so the mean is pulled toward the lower values

An article reported on a school district's magnet school programs. Of the 1640 qualified applicants, 852 were accepted, 291 were waitlisted, & 497 were turned away for lack of space. Find the relative frequency for each decision made, & write a sentence summarizing the results. Find the relative frequency of qualified students being accepted. Find the relative frequency of qualified students being waitlisted. Find the relative frequency of qualified students being turned away for lack of space. Select the statement that best describes the relative frequencies. A) Of the qualified students, 52% were waitlisted. B) Of the qualified students, 52% were accepted, 17.7% were waitlisted, & 30.3% were turned away due to lack of space. C) Of the qualified students, 30.3% were accepted. D) Of the qualified students, 17.7% were turned away for lack of space.

Find the relative frequency of qualified students being accepted: 52.0%. Find the relative frequency of qualified students being waitlisted: 17.7%. Find the relative frequency of qualified students being turned away for lack of space: 30.3%. Select the statement that best describes the relative frequencies. B) Of the qualified students, 52% were accepted, 17.7% were waitlisted, & 30.3% were turned away due to lack of space.
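The percentages above are each count divided by the 1640 qualified applicants. A minimal sketch of that arithmetic, using only the counts stated in the question (the dictionary and variable names are illustrative):

```python
# Counts taken from the question; the names are illustrative.
counts = {"accepted": 852, "waitlisted": 291, "turned away": 497}

total = sum(counts.values())  # should match the 1640 qualified applicants

# Relative frequency = count / total, shown here as a percentage to one decimal.
relative = {decision: round(100 * n / total, 1) for decision, n in counts.items()}

for decision, pct in relative.items():
    print(f"{decision}: {pct}%")
```

Because the three categories cover every qualified applicant, the relative frequencies add up to 100%.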

The accompanying histogram shows the lengths of hospital stays​ (in days) for all female patients admitted to a certain hospital during one year with a primary diagnosis of acute myocardial infarction​ (heart attack). From the​ histogram, determine whether the mean or median is larger. Explain. A) Because the distribution is nearly symmetric​, the mean is expected to be larger than the median B) Because the distribution is skewed to the high end​, the mean is expected to be larger than the median C) Because the distribution is nearly symmetric​, the mean is expected to be about the same as the median D) Because the distribution is skewed to the high end​, the mean is expected to be smaller than the median E) Because the distribution is skewed to the low end​, the mean is expected to be larger than the median F) Because the distribution is skewed to the low end​, the mean is expected to be smaller than the median Is the histogram​ uniform, unimodal,​ bimodal, or​ multimodal? What is the center of the​ histogram? A) The center of the histogram is day 14 B) The center of the histogram is day 8 C) The center of the histogram is day 5 D) The center of the histogram is day 1 What is the spread of the distribution as defined by the​ range? Describe any unusual features A) Some of the data values are significantly different from the other data B) There is a peak at day​ 1, which may represent patients who did not survive C) There are many outliers D) There are no unusual features for this distribution Which summary statistics should be chosen to summarize the center and spread these​ data? Why?

From the histogram, determine whether the mean or median is larger. Explain. B) Because the distribution is skewed to the high end, the mean is expected to be larger than the median. Is the histogram uniform, unimodal, bimodal, or multimodal? The histogram is bimodal. What is the center of the histogram? C) The center of the histogram is day 5. What is the spread of the distribution as defined by the range? The data range from 1 to 14 days. Describe any unusual features. B) There is a peak at day 1, which may represent patients who did not survive. Which summary statistics should be chosen to summarize the center and spread of these data? Why? The median & IQR should be chosen because the distribution is skewed.

A company that sells frozen pizza to stores in four markets in the United States (Denver, Baltimore, Dallas, and Chicago) wants to examine the prices that the stores charge for pizza slices. Boxplots are given comparing the data from a sample of stores in each market. The mean price of pizza in Baltimore was $2.85, $0.23 higher than the mean price of $2.62 in Dallas. To see if that difference was real, or due to chance, we took the 156 prices from each of Baltimore and Dallas and mixed those 312 prices together. Then we randomly chose 2 groups of 156 prices 10,000 times, and computed the difference in mean price each time. The histogram shows the distribution of those 10,000 differences. Given this histogram, what do you conclude about the actual difference of $0.23 between the mean prices of Baltimore & Dallas? A) Since the resampling process never generated a difference in sample means close to $0.23, it appears that the observed difference of $0.23 may have occurred by chance B) Since the resampling process generated many differences in sample means around $0.23, it appears that the observed difference of $0.23 did not occur by chance C) Since the resampling process never generated a difference in sample means close to $0.23, it appears that the observed difference of $0.23 did not occur by chance D) Since the resampling process generated many differences in sample means around $0.23, it appears that the observed difference of $0.23 may have occurred by chance. Do you think the presence of the outliers in the accompanying boxplots affects your conclusion? A) No, the outliers were ignored in the analysis so they do not affect the conclusion B) Yes, the outliers balance out the difference in sample means & possibly hide an underlying difference C) Yes, the outliers are going to be the most significant contribution to the difference in sample means D) No, the outliers lie fairly close to the minimum & maximum values, & only account for a small proportion of the observations. Consider a similar analysis using shuffling to compare prices in Chicago & Denver. Do you think that the actual difference in mean prices would be different from what you might expect by chance? A) Yes, by generating such a large number of differences in sample means, any observed result will look like it did not occur by chance B) No, since the boxplots for Chicago & Denver show a much larger range than the boxplots for Baltimore & Dallas, it is highly likely that any difference in sample means could occur by chance. C) No, since the majority of data in the boxplots for Chicago & Denver are close together, the difference in sample means would likely be a value observed often by randomly shuffling the data D) Yes, the purpose of shuffling the data is to produce a difference in sample means of 0, while the difference in sample means is clearly not 0 from the boxplots

Given this​ histogram, what do you conclude about the actual difference of​ $0.23 between the mean prices of Baltimore &​ Dallas? C) Since the resampling process never generated a difference in sample means close to​ $0.23, it appears that the observed difference of​ $0.23 did not occur by chance Do you think the presence of the outliers in the accompanying boxplots affects your​ conclusion? D) No, the outliers lie fairly close to the minimum & maximum​ values, & only account for a small proportion of the observations Consider a similar analysis using shuffling to compare prices in Chicago & Denver. Do you think that the actual difference in mean prices would be different from what you might expect by​ chance? C) No, since the majority of data in the boxplots for Chicago & Denver are close​ together, the difference in sample means would likely be a value observed often by randomly shuffling the data
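The shuffling procedure described in the question (pool the 312 prices, split them at random into two groups of 156, record the difference in means, repeat 10,000 times) can be sketched as follows. The prices here are simulated stand-ins, since the real data are not in the text: the normal shape, the 0.15 spread, the seeds, and all function names are assumptions made only for illustration.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def shuffled_differences(group_a, group_b, reps=10_000, seed=0):
    """Difference in means (first group minus second) after each random reshuffle."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    diffs = []
    for _ in range(reps):
        rng.shuffle(pooled)  # mix all prices together, ignoring city labels
        diffs.append(mean(pooled[:n_a]) - mean(pooled[n_a:]))
    return diffs

# Simulated stand-in prices (NOT the study's data): 156 per city,
# centered on the means quoted in the question, with an assumed spread.
data_rng = random.Random(1)
baltimore = [data_rng.gauss(2.85, 0.15) for _ in range(156)]
dallas = [data_rng.gauss(2.62, 0.15) for _ in range(156)]

observed = mean(baltimore) - mean(dallas)
diffs = shuffled_differences(baltimore, dallas)

# Share of reshuffles that produce a difference at least as large as the observed one.
share_as_extreme = sum(abs(d) >= abs(observed) for d in diffs) / len(diffs)
print(round(observed, 2), share_as_extreme)
```

If city labels did not matter, reshuffled differences as large as the observed one would turn up regularly. When almost none do, as in answer C, chance alone is an unlikely explanation for the gap.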

After examining a child at his​ 2-year checkup, the​ boy's pediatrician said that the​ z-score for his height relative to other​ 2-year-olds in the country was −0.51. Explain to the parents what that means

He is 0.51 standard deviations below the mean in height for​ 2-year-olds in the country

A​ town's January high temperatures average 39°F with a standard deviation of 8°​, while in July the mean high temperature is 74° & the standard deviation is 6°. In which month is it more unusual to have a day with a high temperature of 53°​?

It is more unusual to have a day with a high temperature of 53° in July. A high temperature of 53° in July is 3.5 standard deviations below the mean & a high temperature of 53° in January is only 1.75 standard deviations above the mean.
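
The comparison above is a z-score calculation, z = (value - mean) / standard deviation. A short sketch using the numbers from the question (the helper name `z_score` is just for this example):

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

jan_z = z_score(53, mean=39, sd=8)   # 1.75 (above the January mean)
jul_z = z_score(53, mean=74, sd=6)   # -3.5 (below the July mean)

# The month with the larger |z| is the one where 53 degrees is more unusual
more_unusual = "July" if abs(jul_z) > abs(jan_z) else "January"
```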

The boxplot shows the fuel economy ratings for 67 subcompact cars with the same model year. Some summary statistics are also provided. The extreme outlier is an electric car whose electricity usage is equivalent to 112 miles per gallon. If that electric car is removed from the data​ set, how will the standard deviation be​ affected? The​ IQR? How will removing the electric car affect the standard​ deviation? A) The standard deviation will not change much. Removing the term for the electric car in the standard deviation calculation will be balanced out by removing the term of the electric car in the mean calculation B) The standard deviation will be much greater. Since the standard deviation is calculated by summing the differences between the data values & the​ mean, removing the electric car will drastically lower the mean & thus increase this sum C) The standard deviation will be much lower. Since the standard deviation is calculated by summing the data​ values, removing the electric car will drastically lower this sum D) The standard deviation will be much lower. Since the standard deviation is calculated by summing the squared differences between the data values & the​ mean, removing the electric car will drastically lower this sum E) The standard deviation will not change much. Since the standard deviation is calculated using the​ median, it is not drastically affected by the addition or removal of outliers How will removing the electric car affect the​ IQR? A) The IQR will be much lower. Removing the electric car from the data set will drastically increase the value of the first quartile & thus decrease the IQR by an equal amount. B) The IQR will be much greater. Removing the electric car from the data set will allow the boxplot to zoom in on the relevant data values. C) The IQR will be much lower. Removing the electric car from the data set will drastically decrease the value of the third quartile & thus decrease the IQR by an equal amount. 
D) The IQR will not change very​ much, if at all. All that removing the electric car can do is possibly change the location of each quartile to be the preceding data​ value, which will not have a huge impact on the IQR.

How will removing the electric car affect the standard​ deviation? D) The standard deviation will be much lower. Since the standard deviation is calculated by summing the squared differences between the data values & the​ mean, removing the electric car will drastically lower this sum How will removing the electric car affect the​ IQR? D) The IQR will not change very​ much, if at all. All that removing the electric car can do is possibly change the location of each quartile to be the preceding data​ value, which will not have a huge impact on the IQR.
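
A small sketch of why the standard deviation reacts to the outlier while the IQR barely moves. The mpg values below are hypothetical (the real 67-car data set is not given here); only the 112 mpg outlier comes from the question.

```python
from statistics import stdev, quantiles

def iqr(data):
    """Interquartile range, using the statistics module's default quartile method."""
    q1, _, q3 = quantiles(data, n=4)
    return q3 - q1

# Hypothetical fuel economy ratings (mpg), plus the 112 mpg electric car
without_ev = [22, 24, 25, 26, 27, 28, 29, 30, 31, 33]
with_ev = without_ev + [112]

sd_with, sd_without = stdev(with_ev), stdev(without_ev)
iqr_with, iqr_without = iqr(with_ev), iqr(without_ev)
```

With these made-up values the standard deviation drops sharply once the outlier is removed, while the IQR shifts by only a fraction of a unit.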

In 1995, a magazine collected data and published an article evaluating freezers. It listed 39 models, giving the brand, cost (dollars), size (cu ft), type, estimated annual energy cost (dollars), an overall rating (good, excellent, etc.), and repair history for that brand (percentage requiring repairs over the past 5 years). Identify the Who A) 39 models of freezers B) Brands of freezers C) Size of freezers D) Freezers Identify the What A) Brand, type, overall rating B) Cost, size, estimated annual energy cost, repair history C) Brand, cost, size, type D) Brand, cost, size, type, estimated annual energy cost, overall rating, repair history Identify the When A) The data were recorded in 1995. B) The information is not provided. Identify the Where A) The 39 models of freezers B) The population of freezers C) The information is not provided. Identify the Why A) To write an article about freezers B) To compare the size of the freezers C) To provide information to the magazine's readers D) To collect information about Identify the hoW A) The different freezers belonged to the magazine's employees. B) The information is not provided. C) The freezers were chosen at random. Identify each variable as either categorical or quantitative. Brand Estimated annual energy cost Cost Overall rating Size Repair history Type

Identify the Who A) 39 models of freezers Identify the What D) Brand, cost, size, type, estimated annual energy cost, overall rating, repair history Identify the When A) The data were recorded in 1995. Identify the Where C) The information is not provided Identify the Why C) To provide information to the magazine's readers Identify the hoW B) The information is not provided. Identify each variable as either categorical or quantitative Brand - Categorical Estimated annual energy cost - Quantitative Cost - Quantitative Overall rating - Categorical Size - Quantitative Repair history - Quantitative Type - Categorical

The Salem horse race has been run every year since 1874 in Salem, New Hampshire. The data for the first few & a few recent races follow. For the above description of data, identify the W's, name the variables, specify for each variable whether its use indicates it should be treated as categorical or quantitative, & for any quantitative variable, identify the units in which it was measured (or note that they were not provided). Identify the Who A) The jockeys B) The Salem horse races C) Horse races D) The horses that competed in the Salem horse races Identify the What A) Date, margin, winner's payoff, duration B) Date, winner, margin, jockey, winner's payoff, duration, track condition C) Date, winner, jockey D) Winner, jockey, track condition Identify the When A) 1874 to 2001 B) Not specified C) 1874, 1875, 2000, 2001 D) May Identify the Where A) New Hampshire B) the United States C) Salem, New Hampshire D) Not specified Identify the Why A) To see if the same horse won in multiple years B) To compare the times of the winners from year to year C) Not specified D) To maintain a list of the winners Identify the hoW A) A random sample of races was taken B) Official statistics were collected at the time of the race. C) A chronic bettor recorded the data. Identify each variable as either categorical or quantitative. - Winner - Winner's payoff - Margin - Duration - Jockey - Track condition

Identify the Who B) The Salem horse races Identify the What B) Date, winner,​ margin, jockey,​ winner's payoff,​ duration, track condition Identify the When A) 1874 to 2001 Identify the Where C) Salem​, New Hampshire Identify the Why C) Not specified Identify the hoW B) Official statistics were collected at the time of the race. Identify each variable as either categorical or quantitative. - Winner categorical ​- Winner's payoff quantitative - Margin quantitative - Duration quantitative - Jockey categorical - Track condition categorical

A local casting plant is a​ large, highly automated producer of gray & nodular iron automotive castings for a certain car company. The company is interested in keeping the pouring temperature of the molten iron​ (in degrees​ Fahrenheit) close to the specified value of 2550 degrees. The casting plant measured the pouring temperature for 10 randomly selected crankshafts. Identify the Who A) The casting plant B) All of the crankshafts at the casting plant C) The 10 crankshafts at the casting plant D) The pouring temperature of molten iron Identify the What A) The pouring temperature of molten iron B) The casting plant C) The 10 crankshafts at the casting plant D) All of the crankshafts at the casting plant

Identify the Who C) The 10 crankshafts at the casting plant Identify the What A) The pouring temperature of molten iron

A certain​ 2.5-mile motor speedway has been home to a race on Memorial Day nearly every year since 1915. The data for the first few races & a few recent​ races, which were collected after the races by the​ races' organizers, follow. Identify the​ W's, name the​ variables, specify for each variable whether its use indicates it should be treated as categorical or​ quantitative, & for any quantitative​ variable, identify the units in which it was measured​ (or note that they were not​ provided). Identify the Who A) Car races B) The drivers C) The car races at this motor speedway D) The cars that competed in the races E) The Who is not specified. Identify the What A) Year, driver,​ time, speed B) Year, car,​ driver, distance,​ time, speed C) Year, driver,​ car, prize money D) Distance, rate, time E) The What is not specified Identify the When A) Memorial Day B) 1915​, 1916​, 1917​, 2001​, 2002​, 2003 C) 1915 to 2003 D) The When is not specified Identify the Where A) The certain motor speedway B) United States C) New Hampshire D) The Where is not specified Identify the Why A) To maintain a list of the winners B) To compare the speeds of the drivers from year to year C) To see if the same driver won in multiple years D) To compare the times of the winners from year to year E) The Why is not specified Identify the hoW A) A fan attending each race recorded the data. B) A random sample of races was taken. C) The​ race's organizers recorded the data after the end of each race. D) The How is not specified Identify each variable as either categorical or quantitative.

Identify the Who C) The car races at this motor speedway Identify the What A) Year, driver,​ time, speed Identify the When C) 1915 to 2003 Identify the Where A) The certain motor speedway Identify the Why E) The Why is not specified Identify the hoW C) The​ race's organizers recorded the data after the end of each race. Identify each variable as either categorical or quantitative. - One​ variable, Driver, is an identifier & has no units. - Another​ variable, Time, is quantitative & measured in hours, minutes, & seconds. - Another​ variable, Speed, is quantitative & measured in mph. - Another​ variable, Year, is quantitative or an identifier & measured in years.

A study found that during pregnancy​, a woman can tell whether a man is fertile or sterile by looking at his face. The study involved 40 undergraduate women, some of whom were pregnant, who were asked to guess the likely fertility of 80 men based on photos of their faces. Half of the men were fertile​, & the other half were sterile. All held similar expressions in the photos. None of the women were in their third trimester at the time of the test. The result was that the closer a woman was to her third trimester​, the more accurate her guess. Identify the Who for this study A) The 80 men whose faces were used in the study B) The 40 fertile men C) The researchers in the study D) The 40 sterile men E) The 40 undergraduate women F) The Who is not specified. Identify the What for this study A) The ability to differentiate fertile men from sterile men B) How close a woman is to her third trimester C) How expressive a man is D) How stoic a man is E) The What is not specified Identify the Population of interest for this study A) All people B) All sterile men C) All women D) All pregnant women E) All fertile men F) The Population of interest cannot be determined from the given information.

Identify the Who for this study E) The 40 undergraduate women Identify the What for this study A) The ability to differentiate fertile men from sterile men Identify the Population of interest for this study C) All women

A study begun in 2011 examines the use of stem cells in treating two forms of anemia. Each of the 30 patients entered one of two separate trials in which embryonic stem cells were to be used to treat the condition. Identify the Who in this study A) All anemic people B) The researchers conducting the study C) The 30 anemic patients D) The two forms of anemia E) The Who is not specified Identify the What in this study A) The effects the treatments have on anemia B) How long after 2011 before anemia is cured C) The difference in the severities of the two forms of anemia D) Which of the two trials does each patient enter E) The What is not specified Identify the Population of interest in this study A) All anemic people B) The 30 anemic patients C) All people D) All people with these two forms of anemia E) All forms of anemia F) All researchers studying anemia G) The Population of interest cannot be determined from the given information.

Identify the Who in this study C) The 30 anemic patients Identify the What in this study A) The effects the treatments have on anemia Identify the Population of interest in this study D) All people with these two forms of anemia

An education department requires local school districts to keep records on; -​ students' age​ (in years) - race or​ ethnicity - days​ absent - current grade​ level - standardized test scores in reading &​ mathematics - any disabilities or special educational needs Identify the​ W's A) The Who is the​ students; the What is the​ age, race or​ ethnicity, number of​ absences, grade​ level, reading​ score, math​ score, &​ disabilities/special needs; the When is present-day; the Where is not​ specified; the Why is that keeping this information is required by the education​ department; the How is the information is collected & stored as part of the school records B) The Who is the​ teachers; the What is the tenure​ length, race or​ ethnicity, number of​ classes, grade​ level, & education​ background; the When is present-day; the Where is at​ school; the Why is that keeping this information is required by the education​ department; the How is the information is collected & stored as part of the school records C) The Who is the​ students; the What is the number of​ students, their height and​ weight, and economic​ background; the When is the past​ year; the Where is not​ specified; the Why is that keeping this information is required by the education​ department; the How is the information is surveyed in homeroom D) The Who is the education​ department; the What is the number of student​ records; the When is present-day; the Where is the education​ department; the Why is that keeping this information is required by the education​ department; the How is the information is collected & stored as part of the school records Name the variables. Select all that apply A) Race or Ethnicity B) Disabilities or special education needs C) Height & weight D) Standardized reading score E) Number of classes F) Current grade level G) Age H) Standardized mathematics score I) School district J) Days Absent

Identify the​ W's A) The Who is the​ students; the What is the​ age, race or​ ethnicity, number of​ absences, grade​ level, reading​ score, math​ score, &​ disabilities/special needs; the When is present-day; the Where is not​ specified; the Why is that keeping this information is required by the education​ department; the How is the information is collected & stored as part of the school records Name the variables. Select all that apply. A) Race or Ethnicity B) Disabilities or special education needs D) Standardized reading score F) Current grade level G) Age H) Standardized mathematics score J) Days Absent

A polling company conducted a representative telephone survey of 1180 of a​ country's voters during the first quarter of 2007. Among the reported results were the​ voter's region​ (Northeast, South,​ etc.), age​ (in years), party​ affiliation, and whether or not the person had voted in the 2006 midterm congressional election. Identify the​ W's. A) The Who is the polling​ company; the What is the survey and the​ responses; the When is the year​ 2006; the Where is the​ country; the Why is not​ specified; the How is a phone survey B) The Who is all​ citizens; the What is the​ voter's voting​ history; the When is the year​ 2007; the Where is the​ country; the Why is not​ specified; the How is an​ in-person survey C) The Who is the​ voters; the What is the​ voter's current​ congressman; the When is the next​ midterm; the Where is the Northeast or​ South; the Why is the government is interested in voting patterns for certain​ regions; the How is a phone survey D) The Who is the 1180​ voters; the What is the​ region, age, political​ affiliation, and whether or not the person voted in the 2006 midterm congressional​ election; the When is the first quarter of​ 2007; the Where is the​ country; the Why is not​ specified; the How is a phone survey Name the variables. Select all that apply A) Age B) Ethnicity C) Party affiliation D) Whether or not they voted in the previous midterm E) Type of telephone F) Region

Identify the​ W's D) The Who is the 1180​ voters; the What is the​ region, age, political​ affiliation, and whether or not the person voted in the 2006 midterm congressional​ election; the When is the first quarter of​ 2007; the Where is the​ country; the Why is not​ specified; the How is a phone survey Name the variables. Select all that apply A) Age C) Party affiliation D) Whether or not they voted in the previous midterm F) Region

The Postal Service uses​ five-digit ZIP codes to identify locations to assist in delivering mail. In what sense are ZIP codes​ categorical? A) There are exactly as many values of ZIP codes as there are cases B) The values of ZIP codes are text & those values tell what category the particular case falls into C) ZIP codes can be used to measure distances between locations D) Each ZIP code corresponds to a geographical region Is there any ordinal sense to ZIP​ codes? In other​ words, does a larger ZIP code tell you anything about a location compared to a smaller ZIP​ code? A) Yes, since the first digit of a ZIP code corresponds to a set geographical region B) Yes, since ZIP codes increase from east to west C) Yes since smaller ZIP codes correspond to more​ population-dense regions D) No, there is no ordinal sense to ZIP codes E) Yes since ZIP codes increase from west to east

In what sense are ZIP codes​ categorical? D) Each ZIP code corresponds to a geographical region Is there any ordinal sense to ZIP​ codes? In other​ words, does a larger ZIP code tell you anything about a location compared to a smaller ZIP​ code? B) Yes, since ZIP codes increase from east to west

To help travelers know what to​ expect, researchers collected the prices of commodities in 16 cities throughout the world. Here are boxplots comparing the average prices of a bottle of​ water, a dozen​ eggs, & a cappuccino in the 16 cities​ (prices are all in​ US$) In​ general, which commodity is the most​ expensive? A) On​ average, a cappuccino is the most expensive of the three. The first quartile of cappuccino prices is higher than all the water prices. The middle​ 50% of cappuccino prices is less variable than egg​ prices, & the top​ 25% is higher B) On​ average, eggs are the most expensive of the three. The first quartile of egg prices is higher than most of the water prices. The third quartile for egg prices is higher than most of cappuccino prices C) On​ average, eggs are the most expensive of the three. The third quartile of egg prices is roughly equal to the third quartile of cappuccino​ prices, but the middle​ 50% of egg prices is much more variable D) On​ average, a cappuccino is the most expensive of the three. The maximum cappuccino price is higher than any of the other prices These are prices for the same 16 cities. But can you tell whether a cappuccino is more expensive than eggs in all​ cities? Explain A) Since all three quartiles for cappuccino prices are above the corresponding quartile for egg​ prices, cappuccinos are more expensive in all cities B) The minimum price of a cappuccino is lower than the minimum price of​ eggs, so in at least one city eggs are more expensive C) The maximum price of a cappuccino is greater than the maximum price of​ eggs, so in all cities, cappuccinos are more expensive D) It is not possible to tell whether a cappuccino is more expensive than eggs in all​ cities, it is possible to pair up the data so that this is true or untrue

In​ general, which commodity is the most​ expensive? A) On​ average, a cappuccino is the most expensive of the three. The first quartile of cappuccino prices is higher than all the water prices. The middle​ 50% of cappuccino prices is less variable than egg​ prices, & the top​ 25% is higher These are prices for the same 16 cities. But can you tell whether a cappuccino is more expensive than eggs in all​ cities? Explain B) The minimum price of a cappuccino is lower than the minimum price of​ eggs, so in at least one city eggs are more expensive

The Centers for Disease Control lists causes of death in the United States during 2013. ​(Each person is assigned only one cause of​ death.) Is it reasonable to conclude that heart or respiratory diseases were the cause of approximately 29​% of U.S. deaths in 2013​? A) No, because there is no possibility for overlap B) No, because there is the possibility of overlap C) Yes, because there is the possibility of overlap D) Yes, because there is no possibility for overlap What percent of deaths were from causes not listed​ here? Select the diagram that represents these data

Is it reasonable to conclude that heart or respiratory diseases were the cause of approximately 29​% of U.S. deaths in 2013? D) Yes, because there is no possibility for overlap What percent of deaths were from causes not listed​ here? 38.3% Select the diagram that represents these data diagram with the other category

The histogram to the right shows the distribution of the prices of plain pizza slices​ (in $) for 306 weeks in a large city. ​Is the mean closer to $3.00​, $3.20​, or $3.40​? ​Why? Is the standard deviation closer to $0.15, $0.50, or $1.00​? Explain.

Is the mean closer to $3.00​, $3.20​, or $3.40? ​Why? The mean is closest to $3.20 because that is the balancing point of the histogram Is the standard deviation closer to $0.15, $0.50, or $1.00​? Explain. The standard deviation is closest to $0.15 since that is a typical distance from the mean.

The histogram to the right shows the neck sizes​ (in inches) of the 269 men recruited for a health study. Is the mean closer to 14​, 15​, or 17 inches​? ​Why? Is the standard deviation closer to 1 inch, 3 inches, or 5 inches​? Explain.

Is the mean closer to 14​, 15​, or 17 inches​? ​Why? The mean is closest to 15 inches because that is the balancing point of the histogram Is the standard deviation closer to 1 inch, 3 inches, or 5 inches​? Explain. The standard deviation is closest to 1 inch since that is a typical distance from the mean.

The pie chart summarizes the genres of 125 first-run movies released one year. Is this an appropriate display for the genres? Why / why not? A) Yes, because each movie falls into only one category & no categories overlap B) No, because there are too many movies to be displayed in a pie chart C) Yes because we are given a frequency table D) No, because each movie might fall into more than one category and the categories always overlap Which genre was least common?

Is this an appropriate display for the​ genres? Why​ / why​ not? A) Yes, because each movie falls into only one category & no categories overlap Which genre was least​ common? The drama genre was the least common genre

An article reported on a school​ district's magnet school programs. Of the 1661 qualified​ applicants, 536 were black or​ Hispanic, 279 Asian, & 846 were white. Summarize the relative frequency distribution of ethnicity with a sentence or two​ (in the proper​ context, of​ course).

Of the qualified​ applicants, 32.3% were black or​ Hispanic, 16.8​% were​ Asian, & 50.9% were white
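
The percentages come from dividing each count by the total number of qualified applicants. A quick sketch with the counts from the question (the dictionary labels are just for this example):

```python
# Counts of qualified applicants by group, from the question
counts = {"Black or Hispanic": 536, "Asian": 279, "White": 846}

total = sum(counts.values())  # 1661 qualified applicants

# Relative frequency of each group as a percentage, rounded to one decimal
rel_freq = {group: round(100 * n / total, 1) for group, n in counts.items()}
```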

A climate change program surveyed 1263 adults in a recent month and asked them about their attitudes on global climate change. A display of the percentages of respondents choosing each of the major alternatives offered is provided. List the errors in this display. Select all that apply. A) There is no title B) The percentages do not sum to 100% C) Showing the pie chart on a slant violates the area principle D) The units for each group are different E) The responses are not categorical variables

List the errors in this display. Select all that apply. A) There is no title B) The percentages do not sum to 100% C) Showing the pie chart on a slant violates the area principle

An incoming MBA student took placement exams in economics & mathematics. In​ economics, she scored 81 & in math 88. The overall results on the economics exam had a mean of 75 & a standard deviation of 9​, while the mean math score was 65​, with a standard deviation of 14. On which exam did she do better compared with the other​ students?

Since she scored 0.67 standard deviations above the mean in economics & 1.64 standard deviations above the mean in mathematics, she did better on the mathematics exam.

The table to the right gives the number of passenger car occupants killed in accidents in a certain year by car type. Convert the table to a relative frequency table Subcompact & mini - x ​% Compact ​- x ​% Intermediate - x ​% Full - x ​% Unknown - x ​%

Subcompact & mini - 11.40 ​% Compact ​- 35.48 ​% Intermediate - 34.06 ​% Full - 17.05 ​% Unknown - 2.01 ​%

The data to the right are the annual numbers of deaths from floods in a certain country over a span of 21 years. Write a short report describing the distribution of the number of deaths in this time period

The distribution of deaths from floods is slightly skewed to the right with modes at about 40 & 80. There is one extreme value at 180 deaths.

The data to the right are the annual numbers of deaths from tornadoes in a certain country over a span of 21 years. Write a short report describing the distribution of the number of deaths in this time period.

The distribution of deaths from tornadoes is slightly skewed to the right with one extreme outlier at 553. The median is 54 deaths & the IQR is 38.5 deaths

People with​ z-scores of 2.5 or above on a certain aptitude test are sometimes classified as geniuses. If aptitude test scores have a mean of 100 & a standard deviation of 22 points, what is the minimum aptitude test score needed to be considered a​ genius?

The minimum aptitude test score needed to be considered a genius is 155 points.
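
This answer reverses the z-score formula: score = mean + z × standard deviation. A brief sketch with the numbers from the question (the function name `score_from_z` is made up for this example):

```python
def score_from_z(z, mean, sd):
    """Raw score that sits z standard deviations from the mean."""
    return mean + z * sd

# Cutoff for "genius": z = 2.5 on a test with mean 100 and SD 22
genius_cutoff = score_from_z(2.5, mean=100, sd=22)  # 100 + 55 = 155.0
```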

Would you expect the distribution of the variables in parts a through d below to be​ uniform, unimodal, or​ bimodal? Symmetric or​ skewed? Explain why The number of speeding tickets each student in the senior class of a college has ever had The distribution would probably be? Players' scores​ (number of​ strokes) at a major golf tournament in a given year. The distribution would probably be? Weights of female babies born in a particular hospital over the course of a year. The distribution would probably be? The​ price, in​ dollars, of drinks purchased at a coffee shop that serves regular & premium drinks. The distribution would probably be?

The number of speeding tickets each student in the senior class of a college has ever had The distribution would probably be unimodal & skewed because most data values will be either 0 or 1, with a few values much larger than that Players' scores​ (number of​ strokes) at a major golf tournament in a given year. The distribution would probably be unimodal & skewed because more golfers will score above the mean than below the mean. Weights of female babies born in a particular hospital over the course of a year. The distribution would probably be unimodal & symmetric because most babies will have an average weight, with a few lighter than average & a few heavier than average The​ price, in​ dollars, of drinks purchased at a coffee shop that serves regular & premium drinks. The distribution would probably be bimodal & skewed because the regular drinks are less expensive than the premium drinks, & the prices can be much higher but not negative.

A university teacher saved every​ e-mail from students in a large introductory statistics class during an entire term. He then​ counted, for each student who had sent him at least one​ e-mail, how many​ e-mails each student had sent. The accompanying histogram shows the distribution of​ e-mails sent by students. Describe the shape of the distribution.

The shape is unimodal & skewed right.

Twenty-six countries won medals in a competition. The accompanying table lists the total number of medals each country won. Try to make a display of these data. What problems do you​ encounter? A) The sum of the numbers of medals per capita is not​ 100% B) The categories overlap C) There are too many categories D) The data violate the area principle How could the data be better organized to make a​ display? A) Limit the number of categories by combining some of the countries B) Use a multidimensional display for the data C) Include an​ "Other" category D) Convert the values to percentages

Try to make a display of these data. What problems do you​ encounter? C) There are too many categories How could the data be better organized to make a​ display? A) Limit the number of categories by combining some of the countries

The pie chart given to the right and the bar chart given below summarize the movie genres of all the films shown in a suburban theatre over the course of one year. Complete parts a & b. Were Action/Adventure or SciFi/Fantasy films more common? Is it easier to see that in the pie chart or the bar chart? A) The bar chart because pie charts are not always accurate. B) The pie chart, because it is easier to tell the size differences of the slices in the pie chart rather than the bar heights of the bar chart. C) The pie chart because bar charts are not always accurate. D) The bar chart, because it is easier to tell the size differences of the bars in the bar chart. The slices of the pie chart are too close in size.

Were Action/Adventure or SciFi/Fantasy films more​ common? Action/Adventure films were more common. Is it easier to see that in the pie chart or the bar​ chart? D) The bar​ chart, because it is easier to tell the size differences of the bars in the bar chart. The slices of the pie chart are too close in size.

Many grocery store chains offer customers a card they can scan when they check out & offer discounts to people who do so. To get the​ card, customers must give​ information, including a mailing address &​ e-mail address. The actual purpose is not to reward loyal customers but to gather data. What data do these cards allow stores to​ gather, & why would they want that​ data? What data do these cards allow stores to​ gather? Select all that apply. A) The geographical location of a​ customer's home B) Items a customer purchases at the grocery store C) Frequency with which a customer makes purchases at the grocery store D) Time a customer spends at the grocery store E) Identities of other stores at which a customer shops F) Amount of money a customer spends at the grocery store What are some reasons why grocery store chains might want this​ data? Select all that apply. A) To determine how best to market and advertise to different customers B) To help determine the most profitable locations for new grocery stores C) To analyze the effects of price changes on individual​ shoppers' purchases D) To find evidence of particular customers shoplifting

What data do these cards allow stores to​ gather? A) The geographical location of a​ customer's home B) Items a customer purchases at the grocery store C) Frequency with which a customer makes purchases at the grocery store F) Amount of money a customer spends at the grocery store What are some reasons why grocery store chains might want this​ data? Select all that apply. A) To determine how best to market and advertise to different customers B) To help determine the most profitable locations for new grocery stores C) To analyze the effects of price changes on individual​ shoppers' purchases

Crowd Management Strategies monitors accidents at rock concerts. In their database, they list the names and other variables of victims whose deaths were attributed to "crowd crush" at rock concerts. Here are the histogram and boxplot of the victims' ages for data from a recent one-year period. What features of the distribution are seen in both the histogram & the boxplot? A) Essentially symmetric, very slightly skewed to the right with two high outliers at 36 & 48. Most victims are between the ages of 24 & 30 B) Essentially symmetric, very slightly skewed to the right with two high outliers at 36 & 48. Most victims are between the ages of 16 & 24 C) Essentially symmetric, very slightly skewed to the right with two high outliers at 18 & 28. Most victims are between the ages of 32 & 42 D) Essentially symmetric, very slightly skewed to the left with two high outliers at 36 & 48. Most victims are between the ages of 24 & 30 What features of the distribution can be seen in the histogram that cannot be seen in the boxplot? A) The slight increase between ages 22 and 24 is apparent in the histogram but not in the boxplot. It may be a second mode B) The IQR can be seen in the histogram, but cannot be seen in the boxplot C) The median can be seen in the histogram, but cannot be seen in the boxplot D) The outliers can be seen in the histogram, but cannot be seen in the boxplot What summary statistic would be chosen to summarize the center of this distribution? Why? A) The mean would be the most appropriate measure of center because of the symmetrical shape of most of the histogram B) The median would be the most appropriate measure of center because of the slight skew & the extreme outliers C) The mean would be the most appropriate measure of center because of the slight skew & the extreme outliers D) The median would be the most appropriate measure of center because of the symmetrical shape of most of the histogram What summary statistic would be chosen to summarize the spread of this distribution? Why? A) The standard deviation would be the most appropriate measure of spread because of the symmetrical shape of most of the histogram B) The IQR would be the most appropriate measure of spread because of the symmetrical shape of most of the histogram C) The standard deviation would be the most appropriate measure of spread because of the slight skew & the extreme outliers D) The IQR would be the most appropriate measure of spread because of the slight skew & the extreme outliers

What features of the distribution are seen in both the histogram & the​ boxplot? B) Essentially​ symmetric, very slightly skewed to the right with two high outliers at 36 & 48. Most victims are between the ages of 16 & 24 What features of the distribution can be seen in the histogram that cannot be seen in the​ boxplot? A) The slight increase between ages 22 and 24 is apparent in the histogram but not in the boxplot. It may be a second mode What summary statistic would be chosen to summarize the center of this​ distribution? Why? B) The median would be the most appropriate measure of center because of the slight skew & the extreme outliers What summary statistic would be chosen to summarize the spread of this​ distribution? Why? D) The IQR would be the most appropriate measure of spread because of the slight skew & the extreme outliers

The​ Men's Giant Slalom skiing event consists of two runs whose times are added together for a final score. Two displays of the giant slalom times in a recent competition are shown in the accompanying histogram & boxplot What features of the distribution can you see in both the histogram & the​ boxplot? A) Highly skewed to the right with two high outliers at 200 & 207. Most skiers have times between 155 & 165 seconds B) Highly skewed to the right with two high outliers at 207 & 215. Most skiers have times between 160 & 180 seconds C) Highly skewed to the left with two low outliers at 157 & 160. Most skiers have times between 180 & 200 seconds D) Essentially​ symmetric, very slightly skewed to the right with two high outliers at 207 & 215. Most skiers have times between 160 & 180 seconds What summary statistic would be chosen to summarize the center of this​ distribution? Why? A) The mean would be the most appropriate measure of center because of the major skew & the extreme outliers B) The mean would be the most appropriate measure of center because of the symmetrical shape of most of the histogram C) The median would be the most appropriate measure of center because of the symmetrical shape of most of the histogram D) The median would be the most appropriate measure of center because of the major skew & the extreme outliers What summary statistic would be chosen to summarize the spread of this​ distribution? Why? A) The standard deviation would be the most appropriate measure of spread because of the major skew & the extreme outliers B) The IQR would be the most appropriate measure of spread because of the major skew & the extreme outliers C) The standard deviation would be the most appropriate measure of spread because of the symmetrical shape of most of the histogram D) The IQR would be the most appropriate measure of spread because of the symmetrical shape of most of the histogram

What features of the distribution can you see in both the histogram & the​ boxplot? B) Highly skewed to the right with two high outliers at 207 & 215. Most skiers have times between 160 & 180 seconds What summary statistic would be chosen to summarize the center of this​ distribution? Why? D) The median would be the most appropriate measure of center because of the major skew & the extreme outliers What summary statistic would be chosen to summarize the spread of this​ distribution? Why? B) The IQR would be the most appropriate measure of spread because of the major skew & the extreme outliers

A researcher wondered whether drivers treat bicycle riders differently when they wear helmets. He rigged his bicycle with an ultrasonic sensor that could measure how close each car was that passed him. He then rode on alternating days with & without a helmet. Of the 1000 times that a car passed​ him, he found that when he wore his​ helmet, motorists passed 3.09 inches closer to​ him, on average, than when his head was bare. What is the Who in this​ study? A) Bike helmets B) The cars C) The bike riders D) Each instance of a car passing a rider Identify the What for this study A) The number of cars that pass the bicycle rider & whether or not the rider was wearing a helmet B) The number of days which the researcher wore a helmet C) The types of cars that pass the bicycle rider D) The distance at which cars pass the bicycle rider & whether or not the rider was wearing a helmet Identify the larger population A) All cars which drive on the road B) All the cars which pass the researcher on his bicycle C) All bicyclists who ride on the road with cars D) All cars which pass bicyclists

What is the Who in this​ study? D) Each instance of a car passing a rider Identify the What for this study D) The distance at which cars pass the bicycle rider & whether or not the rider was wearing a helmet Identify the larger population D) All cars which pass bicyclists

Sugar is a major ingredient in many breakfast cereals. The histogram displays the sugar content as a percentage of weight for 48 brands of cereal. The boxplot compares sugar content for adult cereals​ (A) and​ children's cereals​ (C). - What is the range of the sugar contents of these​ cereals? - Describe the shape of the distribution - What aspect of breakfast cereals might account for this​ shape? A) Most cereals have similar sugar contents B) Sugar content varies greatly among different cereals C) Cereals tend to be either very sugary or healthy​ low-sugar cereals - Are all​ children's cereals higher in sugar than adult​ cereals? - Which group of cereals varies more in sugar​ content? Explain. A) Although the ranges appear to be comparable for both​ groups, the IQR is larger for the​ children's cereals, indicating that​ there's more variability in the sugar content of the middle​ 50% of​ children's cereals B) Although the ranges appear to be comparable for both​ groups, the IQR is larger for the adult​ cereals, indicating that​ there's more variability in the sugar content of the middle​ 50% of adult cereals C) The larger range of the adult cereals indicates that there is more variability in the adult cereals D) The larger range of the​ children's cereals indicates that there is more variability in the​ children's cereals

What is the range of the sugar contents of these cereals? The range of sugar content for all cereals tested is 63%. Describe the shape of the distribution. The shape of the distribution is bimodal. What aspect of breakfast cereals might account for this shape? C) Cereals tend to be either very sugary or healthy low-sugar cereals. Are all children's cereals higher in sugar than adult cereals? Yes. Which group of cereals varies more in sugar content? Explain. B) Although the ranges appear to be comparable for both groups, the IQR is larger for the adult cereals, indicating that there's more variability in the sugar content of the middle 50% of adult cereals

The accompanying bar chart summarizes movie genres from 891 movies released in a certain region in recent years. The genres were originally listed as shown in the accompanying frequency table. Complete parts a & b below What problem would one encounter in trying to make a display of these​ data? A) The frequency for​ "Drama" is at least twice as large as every other frequency B) There are too many categories to make a meaningful bar chart or pie chart by genre C) There are too many categories that are combinations of subcategories D) There are three different​ "Comedy" categories and only one category for all other genres How did the creators of the bar chart solve this​ problem? A) They combined several smaller categories into the category​ "Other" and combined three comedy categories into a single category​ "Comedy" in the bar chart B) They chose a reasonable vertical axis scale. C) They only included the category​ "Drama" in the bar chart. D) They only included the category​ "Thriller/Suspense" in the bar chart.

What problem would one encounter in trying to make a display of these​ data? B) There are too many categories to make a meaningful bar chart or pie chart by genre How did the creators of the bar chart solve this​ problem? A) They combined several smaller categories into the category​ "Other" and combined three comedy categories into a single category​ "Comedy" in the bar chart

A survey of athletic trainers asked what modalities (treatment methods such as ice, whirlpool, ultrasound, or exercise) they commonly use to treat injuries. Respondents were asked to list 3 modalities. The survey was published in an article which included the accompanying figure reporting the modalities used. What problems can be seen with the graph? Select all that apply A) The values are given as percentages instead of counts B) From a simple design standpoint, running the labels on the bars one way & the vertical axis label the other way is awkward C) The percentages do not sum to 100% D) The bars have false depth, which can be misleading even if the depth is kept uniform E) This is a bar chart, so the bars should have space between them Consider the percentages for the named modalities. Is there anything odd about them? A) The percentages do not sum to 100%. Several of the values must have been rounded up B) The percentages sum to 100%. This is unlikely if the respondents were asked to name 3 methods each C) The percentages are rounded to the nearest tenth of a percent. This is unconventional & misrepresents the data D) None of the percentages are greater than 50%. This is unusual for a distribution with many categorical variables

What problems can be seen with the​ graph? Select all that apply B) From a simple design​ standpoint, running the labels on the bars one way & the vertical axis label the other way is awkward D) The bars have false​ depth, which can be misleading even if the depth is kept uniform E) This is a bar​ chart, so the bars should have space between them Consider the percentages for the named modalities. Is there anything odd about​ them? B) The percentages sum to​ 100%. This is unlikely if the respondents were asked to name 3 methods each
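
The reasoning behind answer B can be checked with a small sketch. The responses below are hypothetical (the source gives only the published figure). If every respondent lists 3 modalities and each percentage is the share of respondents naming that modality, the percentages should total about 300%. A total of 100% suggests the figure reports shares of mentions instead.

```python
from collections import Counter

# Hypothetical survey data: each trainer lists exactly 3 modalities.
responses = [
    ("ice", "exercise", "ultrasound"),
    ("ice", "whirlpool", "exercise"),
    ("ice", "exercise", "ultrasound"),
    ("whirlpool", "ice", "ultrasound"),
]

# Count how often each modality is mentioned across all trainers.
counts = Counter(m for trainer in responses for m in trainer)

# Percentage of respondents who named each modality.
pct_of_respondents = {m: 100 * c / len(responses) for m, c in counts.items()}

# Percentage of all mentions; this version is forced to total 100%.
total_mentions = sum(counts.values())
pct_of_mentions = {m: 100 * c / total_mentions for m, c in counts.items()}

print(sum(pct_of_respondents.values()))  # about 300
print(sum(pct_of_mentions.values()))     # about 100
```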

The accompanying graphs are a bar chart and a pie chart summarizing the movie ratings from 891 movies. Complete parts a & b below. Which was the least common​ rating? A) R B) PG-13 C) NC-17 D) G E) PG Is it easier to answer the question from the bar chart or from the pie​ chart? Explain. Choose the correct answer below. A) It is easy to tell from either chart because the​ rating's count is so low compared to the counts for the other ratings. B) It is easier to tell from the pie chart because the bars in the bar chart look too similar in height. C) It is easy to tell from either chart because bar charts and pie charts are always equally easy to read. D) It is easier to tell from the bar chart because the slices of the pie chart look too similar in size

Which was the least common​ rating? B) PG-13 Is it easier to answer the question from the bar chart or from the pie​ chart? Explain. Choose the correct answer below. A) It is easy to tell from either chart because the​ rating's count is so low compared to the counts for the other ratings.

The accompanying histogram shows the life expectancies at birth for 190 countries as collected by an international health agency. Which would you expect to be​ larger: the median or the​ mean? Explain briefly Which would you​ report: the median or the​ mean? Explain briefly A) The mean should be​ used because the distribution is skewed B) The mean should be​ used because the distribution is symmetric C) The mean should be​ used because the distribution is uniform D) The median should be​ used because the distribution is uniform E) Neither the mean nor the median is​ reliable because the distribution is skewed F) The median should be​ used because the distribution is skewed

Which would you expect to be larger: the median or the mean? Explain briefly. The median will be larger because the distribution is skewed left. Which would you report: the median or the mean? Explain briefly. F) The median should be used because the distribution is skewed
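
A quick way to see why a left skew pulls the mean below the median is a minimal sketch with made-up numbers. The values below are hypothetical, not the agency's data: most sit in the high 70s and low 80s, with a few trailing far below.

```python
import statistics

# Hypothetical life expectancies in years: a long tail on the low side (left skew).
life_expectancy = [48, 55, 62, 70, 73, 75, 76, 78, 79, 80, 81, 82]

mean = statistics.mean(life_expectancy)
median = statistics.median(life_expectancy)

# The low tail drags the mean down; the median only depends on the middle values.
print(f"mean = {mean:.2f}, median = {median}")
```

Because the median ignores how extreme the low values are, it is the more representative center here.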

Because of the difficulty of weighing a tiger in the jungle, researchers caught & measured 31 tigers, recording their weight, whisker length, body length, & sex. They hoped to find a way to estimate weight from the other, more easily determined quantities. Who was measured? A) European tigers B) 31 tigers C) 31 assorted animals in the jungle D) This information is not given What was measured? A) The difficulty of taking certain measurements of tigers B) Weight, whisker length, body length, & sex C) Weight, number of offspring per season, body length, & sex D) This information is not given When were the measurements taken? A) In the morning, when tigers are easiest to weigh B) During the tiger mating season C) During a three-week expedition in the jungle D) This information is not given Where were the measurements taken? A) In a zoo B) In the jungle C) In a wild animal rehabilitation center D) This information is not given Why were the measurements taken? A) To find the whisker length of 31 tigers B) To find the weights of 31 tigers C) To find an easier way to estimate the weight of a tiger D) This information is not given How did the researchers obtain the measurements? A) Researchers baited tigers with food & measured while the tigers ate. B) Researchers made estimates based on more easily observable features. C) Researchers collected data on the 31 tigers they were able to catch. D) This information is not given Specify whether the variables are categorical or quantitative, and, for any quantitative variable, identify the units in which it was measured (or note that they were not provided) The variable weight is - quantitative & measured in inches - quantitative & measured in liters - quantitative & units were not provided - categorical - quantitative & measured in pounds The variable whisker length is - quantitative & measured in inches - quantitative & measured in liters - quantitative & units were not provided - categorical - quantitative & measured in pounds The variable body length is - quantitative & measured in inches - quantitative & measured in liters - quantitative & units were not provided - categorical - quantitative & measured in pounds The variable sex is - quantitative & measured in inches - quantitative & measured in liters - quantitative & units were not provided - categorical - quantitative & measured in pounds

Who was​ measured? B) 31 tigers What was​ measured? B) Weight, whisker length​, body​ length, & sex When were the measurements​ taken? D) This information is not given Where were the measurements​ taken? D) This information is not given Why were the measurements​ taken? C) To find an easier way to estimate the weight of a tiger How did the researchers obtain the​ measurements? C) Researchers collected data on the 31 tigers they were able to catch. The variable weight is - quantitative & units were not provided The variable whisker length is - quantitative & units were not provided The variable body length is - quantitative & units were not provided The variable sex is - categorical

Load the accompanying data about a particular car race into your preferred statistics package & answer the questions a through c below. ​a) What was the average speed of the winner in 1951​? A) 78.719 miles/hour B) 124.022 miles/hour C) 126.244 miles/hour D) 128.922 miles/hour b) How many times did Arie Luyendyk win the race in the 1990s? ​c) How many races took place during the 1910​s?

a) What was the average speed of the winner in 1951? C) 126.244 miles/hour b) How many times did Arie Luyendyk win the race in the 1990s? 2 times c) How many races took place during the 1910s? 7 races

Load the accompanying data about the Kentucky Derby into your preferred statistics package & answer the questions a through d below. a) What was the name of the winning horse in 1886? A) Montrose B) Joe Cotton C) Sunny's Halo D) Ben Ali b) When did the length of the race change? A) 1895 B) 1897 C) 1896 D) 1996 c) What was the winning time in 1970? Select all that apply. A) 3.4 seconds B) 123.4 seconds C) 2 minutes & 36.4667747 seconds D) 2 minutes E) 120 seconds F) 36.4667747 seconds G) 2 minutes & 3.4 seconds d) Only two horses have run the race in less than 2 minutes. Which horses & in what years?

a) What was the name of the winning horse in 1886? D) Ben Ali b) When did the length of the race change? C) 1896 c) What was the winning time in 1970? Select all that apply. B) 123.4 seconds G) 2 minutes & 3.4 seconds d) Only two horses have run the race in less than 2 minutes. Which horses & in what years? Secretariat in 1973 & Monarchos in 2001
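
Answers B and G in part c describe the same time in two formats. A minimal sketch of the conversion, where the helper name `to_minutes_seconds` is an illustrative choice and not part of any statistics package:

```python
def to_minutes_seconds(total_seconds):
    """Split a time in seconds into whole minutes and leftover seconds."""
    minutes, seconds = divmod(total_seconds, 60)
    # Round the remainder to one decimal to hide floating-point noise.
    return int(minutes), round(seconds, 1)

print(to_minutes_seconds(123.4))  # (2, 3.4)
```

The same comparison against 120 seconds answers part d: a winning time is under 2 minutes exactly when its value in seconds is below 120.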

The number of annual deaths from tornadoes in a certain region is shown below. Find the a) mean b) median & quartiles c) range & IQR

a) mean The mean is 114.82 b) median & quartiles - The median is 62 - The lower quartile is 42 - The upper quartile is 133 c) range & IQR - The range is 126 - The IQR is 91

The data to the right are the annual numbers of deaths from floods in a certain country over a span of 21 years. Find the​ a) mean b) median &​ quartiles c) range & IQR

a) mean The mean is 81.57 b) median &​ quartiles - The median is 81 - The first quartile is Q1=48 - The third quartile is Q3=109 c) range & IQR - The range is 145 - The interquartile range is IQR=61
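
The reported values are internally consistent: IQR = Q3 - Q1 = 109 - 48 = 61. The underlying 21 values are not reproduced here, so the sketch below uses a small hypothetical sample. It also assumes one particular quartile convention (the median of each half, excluding the overall median when the count is odd); statistics packages differ on this, so quartiles from other software may not match exactly.

```python
import statistics

def summarize(values):
    """Mean, median, quartiles, range, and IQR for a list of numbers."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # Assumed convention: skip the middle value when the count is odd.
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    q1, q3 = statistics.median(lower), statistics.median(upper)
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "q1": q1,
        "q3": q3,
        "range": data[-1] - data[0],
        "iqr": q3 - q1,
    }

# Hypothetical sample, not the flood data from the card.
sample = [3, 7, 8, 12, 13, 14, 21]
summary = summarize(sample)
print(summary)
```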

A clerk entering salary data into a company spreadsheet accidentally put an extra ​"0​" in the​ boss's salary, listing it as ​$2,900,000 instead of ​$290,000. Explain how this error will affect these summary statistics for the company payroll. measures of center​ (median and​ mean) A) The median and the mean will be too large B) Assuming the​ boss's true salary is above the​ median, the median will be the same. The mean will be too large C) The median will be too small and the mean will be too large D) Assuming the​ boss's true salary is above the​ median, the median will be the same. The mean will be too small measures of spread​ (range, IQR, and standard​ deviation) A) The range will likely be the same. The IQR will likely be too large. The standard deviation will be too large B) The range will likely be the same. The IQR will likely be too large. The standard deviation will be too small C) The range will likely be too large. The IQR will likely be too small. The standard deviation will be too small D) The range will likely be too large. The IQR will likely be the same. The standard deviation will be too large

measures of center​ (median and​ mean) B) Assuming the​ boss's true salary is above the​ median, the median will be the same. The mean will be too large measures of spread​ (range, IQR, and standard​ deviation) D) The range will likely be too large. The IQR will likely be the same. The standard deviation will be too large
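
The effect of the typo can be illustrated with a hypothetical payroll. Only the boss's two salary figures come from the question; the other salaries are invented for the sketch. Quartiles come from Python's `statistics.quantiles` with its default method.

```python
import statistics

def payroll_summary(salaries):
    """Center and spread statistics for a list of salaries."""
    q1, median, q3 = statistics.quantiles(salaries, n=4)
    return {
        "mean": statistics.mean(salaries),
        "median": median,
        "range": max(salaries) - min(salaries),
        "iqr": q3 - q1,
        "stdev": statistics.stdev(salaries),
    }

# Hypothetical staff salaries; the boss is the top earner in both versions.
staff = [30_000, 35_000, 40_000, 45_000, 50_000, 60_000, 75_000, 90_000]
correct = payroll_summary(staff + [290_000])
typo = payroll_summary(staff + [2_900_000])

for name in correct:
    print(f"{name:>6}: correct = {correct[name]:,.0f}   with typo = {typo[name]:,.0f}")
```

The median and IQR depend only on the middle of the ordered data, so they are unchanged; the mean, range, and standard deviation all use the inflated value and grow.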

A study of body fat on 250 men collected measurements of 12 body parts as well as the percentage of body fat that the men carried. The accompanying display is a dot-plot of their bicep circumferences (in inches). Describe the shape of the distribution of bicep circumferences.

The distribution is unimodal & roughly symmetric

Corey has 4929 songs in his​ computer's music library. The songs have a mean duration of 240.9 seconds with a standard deviation of 111.11 seconds. One of the songs is 371 seconds long. What is its​ z-score?

z = 1.17
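
The answer follows from z = (value - mean) / standard deviation, as this short check shows:

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / sd

# Values from the question: a 371-second song, mean 240.9 s, SD 111.11 s.
z = z_score(371, 240.9, 111.11)
print(round(z, 2))  # 1.17
```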

Suppose your statistics professor reports test grades as​ z-scores, and you got a score of 2.45 on an exam. ​- Write a sentence explaining what that means A) The score was 2.45 standard deviations lower than the mean score in the class B) The score was 2.45 points lower than the mean score in the class C) The score was 2.45 points higher than the mean score in the class D) The score was 2.45 standard deviations higher than the mean score in the class - Your friend got a​ z-score of −2. If the grades satisfy the Nearly Normal​ Condition, about what percent of the class scored lower than your​ friend?

​- Write a sentence explaining what that means D) The score was 2.45 standard deviations higher than the mean score in the class - Your friend got a​ z-score of −2. If the grades satisfy the Nearly Normal​ Condition, about what percent of the class scored lower than your​ friend? About 2.5​% of the class scored lower than your friend
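
The 2.5% figure comes from the 68-95-99.7 rule: about 95% of values lie within 2 standard deviations of the mean, leaving 5% split evenly between the two tails. As a sketch, the rule-of-thumb value can be compared with the exact Normal-model probability from Python's `statistics.NormalDist`:

```python
from statistics import NormalDist

# Rule of thumb: 95% within 2 SDs, so half of the remaining 5% is below z = -2.
rule_estimate = (1 - 0.95) / 2

# Exact area below z = -2 under a standard Normal model.
exact = NormalDist().cdf(-2)

print(f"rule of thumb: {rule_estimate:.1%}, exact: {exact:.2%}")
```

The exact value is slightly smaller than 2.5% because the 68-95-99.7 rule rounds the true coverage.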

The pie chart shows the ratings assigned to 873 ​first-run movies released in a recent year ​Is this an appropriate display for these​ data? Explain. A) No, because there are too many movies to be displayed in a pie chart. B) Yes, because each movie falls into only one category and no categories overlap. C) Yes, because we are given a frequency table. D) No, because each movie might fall into more than one category and the categories always overlap. Which was the most common​ rating? - G - PG - Not Rated - R - PG-13 - NC-17

​Is this an appropriate display for these​ data? Explain B) Yes, because each movie falls into only one category and no categories overlap Which was the most common​ rating? - R

A government bureau keeps track of the number of adoptions in each region. The accompanying histograms show the distribution of adoptions & the population of each region. ​What do the histograms say about the​ distributions? A) The distribution of adoptions is skewed​ right, but the distribution of populations is skewed left. Most regions have larger populations & fewer​ adoptions, but some small regions have substantially more adoptions. B) Both distributions are symmetric. Most regions have a moderate population and a moderate amount of​ adoptions, with some big regions having substantially more of each & some small regions having substantially fewer of each. C) Both distributions are skewed left. Most regions have larger populations & more​ adoptions, but some small regions have substantially fewer of each. D) Both distributions are skewed right. Most regions have smaller populations & fewer​ adoptions, but some big regions have substantially more of each. Why do the histograms look​ similar? A) Regions with larger populations are likely to have more adoptions than regions with smaller populations. B) The number of adoptions and the population of each region are unrelated. C) Regions with smaller populations are likely to have more adoptions than regions with larger populations. D) Regions with smaller populations are more likely to have an unusually small or an unusually large number of adoptions. What might be a better way to express the number of​ adoptions? Select all that apply A) Report the first 100 adoptions in each region. B) Report the number of adoptions per​ 100,000 people. C) Report the average number of adoptions per city in each region. D) Report the number of adoptions in the largest city in each region.

​What do the histograms say about the​ distributions? D) Both distributions are skewed right. Most regions have smaller populations & fewer​ adoptions, but some big regions have substantially more of each. Why do the histograms look​ similar? A) Regions with larger populations are likely to have more adoptions than regions with smaller populations What might be a better way to express the number of​ adoptions? Select all that apply B) Report the number of adoptions per​ 100,000 people.
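
Reporting adoptions per 100,000 people removes the effect of population size. A minimal sketch with two hypothetical regions (both the names and the numbers are invented for illustration):

```python
def per_100k(count, population):
    """Convert a raw count into a rate per 100,000 people."""
    return count / population * 100_000

# Hypothetical (adoptions, population) pairs.
regions = {
    "Region A": (1_200, 8_000_000),
    "Region B": (150, 600_000),
}

rates = {name: per_100k(adoptions, pop) for name, (adoptions, pop) in regions.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} adoptions per 100,000 people")
```

Here the larger region has more adoptions in total, but the smaller region has the higher rate, which the raw counts alone would hide.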

