AP Stats Final Year Review
e
"Least-squares" in the term "least-squares regression line" refers to a. Minimizing the sum of the squares of all values of the explanatory variable. b. Minimizing the sum of the squares of all values of the response variable. c. Minimizing the products of each value of the response variable and the predicted value based on the regression equation. d. Minimizing the squares of the differences between each value of the response variable and each value of the explanatory variable. e. Minimizing the sum of the squares of the residuals.
d
A 1992 Roper poll found that 22% of Americans say that the Holocaust may not have happened. The actual question asked in the poll was "Does it seem possible or impossible to you that the Nazi extermination of the Jews never happened?" and 22% responded possible. The results of this poll cannot be trusted because a. nonresponse is present. Many people will refuse to participate, and those who do will be biased in their opinions. b. the question is clearly biased in the direction of a "possible" answer. c. we do not know who conducted the poll or who paid for the results. d. the question is worded in a confusing manner. e. undercoverage is present. Obviously, those people who did not survive the Holocaust could not be in the poll.
c
A company produces packets of soap powder labeled "Giant Size 32 Ounces." The actual weight of soap powder in a box has a Normal distribution with a mean of 33 oz. and a standard deviation of 0.8 oz. What proportion of packets are underweight (i.e., weigh less than 32 oz.)? a. 0.159. b. 0.212. c. 0.106. d. 0.841. e. 0.115.
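A quick numeric check of this Normal calculation, as a sketch using Python's standard-library `statistics.NormalDist` (the variable names are illustrative and not part of the question):

```python
from statistics import NormalDist

# Weight of soap powder per packet: Normal with mean 33 oz and SD 0.8 oz
weights = NormalDist(mu=33, sigma=0.8)

# Proportion of packets weighing less than the labeled 32 oz
p_underweight = weights.cdf(32)
print(round(p_underweight, 3))  # about 0.106, i.e. z = -1.25
```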
c
A consumer group surveyed the prices for a certain item in five different stores, and reported the average price as $15. We visited four of the five stores, and found the prices to be $10, $15, $15, and $25. Assuming that the consumer group is correct, what is the price of the item at the store that we did not visit? a. $20 b. $5 c. $10 d. $15 e. $25
d
A lobster fisherman is keeping track of the productivity of a set of traps he has placed in a favorite location. Below are the numbers of lobsters in these traps over the course of 12 different hauls. 0 3 3 3 4 5 5 6 7 7 12 14 According to the 1.5 x IQR rule, which values in the above distribution are outliers? a. 0 only b. 12 and 14 c. 0 and 14 d. 14 only e. 0, 12, and 14
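The 1.5 x IQR rule can be checked with a short sketch. It assumes the common textbook convention that Q1 and Q3 are the medians of the lower and upper halves of the sorted data; other quartile conventions can give slightly different fences. The helper name `quartiles` is made up for this illustration.

```python
from statistics import median

hauls = [0, 3, 3, 3, 4, 5, 5, 6, 7, 7, 12, 14]

def quartiles(data):
    """Return (Q1, Q3) as medians of the lower and upper halves (assumed convention)."""
    xs = sorted(data)
    half = len(xs) // 2          # for odd n this leaves the middle value out of both halves
    return median(xs[:half]), median(xs[-half:])

q1, q3 = quartiles(hauls)
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

# Any value outside the fences is flagged as an outlier
outliers = [x for x in hauls if x < low_fence or x > high_fence]
print(q1, q3, low_fence, high_fence, outliers)
```

With Q1 = 3 and Q3 = 7 the fences are -3 and 13, so only 14 is flagged.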
b
A market research company wishes to find out whether the population of students at a university prefers brand A or brand B of instant coffee. A random sample of students is selected, and each one is asked to try brand A first and then brand B (or vice versa, with the order determined at random). They then indicate which brand they prefer. The response variable is a. whether brand A or B is tried first. b. which brand they prefer. c. coffee. d. the identity of the student. e. none of these.
d
A policeman records the speeds of cars on a certain section of roadway with a radar gun. The histogram below shows the distribution of speeds for 251 cars. Which of the following measures of center and spread would be the best ones to use when summarizing these data? a. Median and interquartile range. b. Median and standard deviation. c. Median and range. d. Mean and standard deviation. e. Mean and interquartile range.
b
A poll conducted by the student newspaper asked, "Who do you believe will win the Ohio State Undergraduate Student Government elections?" In order to vote, one had to access the student newspaper's Web site and record one's vote. The results of the poll were summarized in a graphic similar to the following. Which of the following statements is true about these results? a. There must be an error. These percentages aren't possible. b. The results of the survey are unreliable because response to the survey was voluntary. c. Patel and Patel have such a large majority that, even though there are flaws in the poll, they are still almost certain to win. d. This is not an appropriate way of presenting the results—a bar graph should have been used instead. e. The sample is large enough to eliminate potential sources of bias in the design of the poll.
e
A random sample of 100 students in grades 10 through 12 were sampled and asked their year in school and whether they were involved in interscholastic sports, intramural sports, or no sports. The results are summarized in the segmented bar graph below. Based on this graph, which of the following statements is true? a. More seniors are involved in interscholastic sports than sophomores. b. Less than half the seniors are involved in either interscholastic or intramural sports. c. There is no association between year in school and whether students are involved in sports. d. There were more seniors in the sample than juniors. e. Juniors have the highest percentage participation in intramurals.
c
A researcher wishes to determine whether the rate of water flow (in liters per second) over an experimental soil bed can be used to predict the amount of soil washed away (in kilograms). In this study, the explanatory variable is a. liters/second. b. depth of soil bed. c. rate of water flow. d. size of soil bed. e. amount of eroded soil.
b
A score's percentile is a measure of a. spread b. relative location c. center d. relative frequency e. skew
a
A set of data has a mean that is much larger than the median. Which of the following statements is most consistent with this information? a. The distribution is skewed right. b. The distribution is bimodal. c. The data set probably has a few low outliers. d. The distribution is skewed left. e. The distribution is symmetric.
d
A standard score describes a. How much spread there is in a distribution b. How far a particular score is from the median. c. How far apart the mean and median of a distribution are. d. How far a particular score is from the mean. e. How much skew there is in a distribution
e
A statistics teacher asks the 29 students in his statistics class how many minutes they spent on one homework assignment. The distribution of the variable "time on homework" is a. the number of students who were asked the questions—that is, 29. b. the average time the students spent on the assignment. c. the difference between the longest time and the shortest time among the students' responses. d. the average distance between each value of the variable. e. a description of what values the variable takes and how often it takes them.
b
A stratified random sample is appropriate when a. You want to avoid undercoverage of certain groups. b. The population can be easily subdivided into groups according to some categorical variable, and the variable you are measuring is very similar within the groups but quite different between groups. c. It is impractical to take a simple random sample because the population is too large. d. The population can be easily subdivided into groups according to some categorical variable, and the variable you are measuring is quite different within the groups but very similar between groups. e. You intend to take a sample of more than 100 individuals.
c
A study is conducted to determine if one can predict the yield of a crop based on the amount of fertilizer applied to the soil. The response variable in this study is a. the soil. b. amount of fertilizer applied to the soil. c. yield of the crop. d. amount of rainfall. e. the experimenter.
e
A study of child development measures the age (in months) at which a child begins to talk and also the child's score on an ability test given several years later. The study asks whether the age at which a child talks helps predict the later test score. The least-squares regression line of test score y on age x is y = 110 - 1.3x. According to this regression line, what happens (on the average) to children who talk one month later than other children? a. Their predicted test scores go up 1.3 points. b. Their predicted test scores go up 110 points. c. Their predicted test scores are 108.7. d. Their predicted test scores go down 110 points. e. Their predicted test scores go down 1.3 points.
d
A study of elementary school children, ages 6 to 11, finds a high positive correlation between shoe size x and score y on a test of reading comprehension. The observed correlation is most likely due to a. several outliers in the data set. b. a mistake, since the correlation must be negative. c. "reverse" cause and effect (higher reading comprehension causes larger shoe size). d. the effect of another variable, such as age. e. cause and effect (larger shoe size causes higher reading comprehension).
e
A study of the effects of television measured how many hours of television each of 125 grade school children watched per week during a school year and their reading scores. The study found that children who watch more television tend to have lower reading scores than children who watch fewer hours of television. The study report says that "Hours of television watched explained 9% of the observed variation in the reading scores of the 125 subjects." The correlation between hours of TV and reading score must be a. r = 0.3. b. r = 0.09. c. r = -0.09. d. Can't tell from the information given. e. r = -0.3.
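The step from "explained 9% of the variation" to the correlation is a square root plus a sign taken from the direction of the association. A minimal sketch (variable names are illustrative):

```python
import math

r_squared = 0.09  # fraction of variation in reading scores explained by TV hours

# More TV goes with lower reading scores, so the association (and r) is negative
r = -math.sqrt(r_squared)
print(round(r, 2))  # -0.3
```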
c
A study of the effects of television on child development measured how many hours of television each of 125 grade school children watched per week during a school year and each child's reading score. Which variable would you put on the horizontal axis of a scatterplot of the data? a. Reading score, because it is the response variable. b. Reading score, because it is the explanatory variable. c. Hours of television, because it is the explanatory variable. d. It makes no difference, because there is no explanatory-response distinction in this study. e. Hours of television, because it is the response variable.
a
A survey typically records many variables of interest to the researchers involved. Below are some of the variables from a survey conducted by the U.S. Postal Service. Which of the variables is categorical? a. County of residence b. Age of respondent c. Number of people, both adults and children, living in the household d. Number of rooms in the dwelling e. Total household income, before taxes, in 1993
d
A television station is interested in predicting whether voters in its viewing area are in favor of offshore drilling. It asks its viewers to phone in and indicate whether they support/are in favor of or are opposed to this practice. Of the 2241 viewers who phoned in, 1574 (70%) were opposed to offshore drilling. The viewers who phoned in are a. a convenience sample. b. a simple random sample. c. a population. d. a voluntary response sample. e. a probability sample.
e
All of the following distributions have a mean of 10. Which has the largest standard deviation? a. 10, 10, 10, 10, 10, 10, 10, 10 b. 5, 5, 5, 10, 10, 10, 15, 15, 15 c. 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 d. 5, 8, 10, 12, 15 e. 5, 5, 5, 15, 15, 15
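One way to confirm the comparison is to compute the standard deviation of each list directly. This sketch uses the sample standard deviation from the standard library; the dictionary keys simply mirror the answer letters.

```python
from statistics import stdev

distributions = {
    "a": [10, 10, 10, 10, 10, 10, 10, 10],
    "b": [5, 5, 5, 10, 10, 10, 15, 15, 15],
    "c": [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
    "d": [5, 8, 10, 12, 15],
    "e": [5, 5, 5, 15, 15, 15],
}

# Sample standard deviation of each list; values far from the mean inflate it
spreads = {label: stdev(values) for label, values in distributions.items()}
largest = max(spreads, key=spreads.get)
print(largest, {k: round(v, 2) for k, v in spreads.items()})
```

List e has every value 5 units from the mean, with nothing at the center to pull the spread down.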
c
An ecologist studying starfish populations collected starfish of the species Pisaster and was interested in the distribution of sizes of starfish on a certain shoreline. One measure of size is "arm length." Below is a cumulative relative frequency distribution for the arm length of 102 Pisaster individuals. The median and interquartile range of this distribution are approximately: a. Median is 13; Interquartile range is 3.7 b. Median is 13; Interquartile range is 3.1 c. Median is 15.2; Interquartile range is 3.7 d. Median is 13; Interquartile range is 13 to 16.1 e. Median is 15.2; Interquartile range is 12.5 to 16.2
b
An experiment compares the taste of a new spaghetti sauce with the taste of a commercially successful sauce readily available in grocery stores. Each of a number of tasters tastes both sauces (in random order) and says which tastes better. This is called a a. double-blind design. b. matched pairs design. c. simple random sample. d. completely randomized design. e. stratified random sample.
c
An experiment was conducted by some students to explore the nature of the relationship between a person's heart rate (measured in beats per minute) and the frequency at which that person stepped up and down on steps of various heights. Three rates of stepping and two different step heights were used. A subject performed the activity (stepping at one of the three stepping rates at one of the two possible heights) for three minutes. Heart rate was then measured at the end of this period. The variables "stepping rate" and "step height" are the a. levels. b. units. c. factors. d. response variables. e. controls.
a
Birthweights at a local hospital have a Normal distribution with a mean of 110 oz. and a standard deviation of 15 oz. Which of the following is the proportion of infants with birthweights between 125 oz. and 140 oz.? a. 0.135 b. 0.475 c. 0.815 d. 0.270 e. 0.680
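The interval runs from 1 to 2 standard deviations above the mean, so the 68-95-99.7 rule gives roughly (95% - 68%) / 2 = 13.5%. A sketch of the exact calculation with `statistics.NormalDist` (names are illustrative):

```python
from statistics import NormalDist

birthweights = NormalDist(mu=110, sigma=15)

# Area between 125 oz (z = 1) and 140 oz (z = 2)
p_between = birthweights.cdf(140) - birthweights.cdf(125)
print(round(p_between, 3))  # about 0.136
```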
e
Consider the following cumulative relative frequency graph of the scores of students in an introductory statistics course: A grade of C or C+ is assigned to a student who scores between 55 and 70. The percentage of students who obtained a grade of C or C+ is a. 50% b. 15% c. 30% d. 25% e. 20%
e
Consider the following scatter plot of two variables, X and Y. We may conclude that the correlation between X and Y a. must be close to -1, since the relationship between X and Y is clearly non-linear. b. must be close to 0, since the relationship between X and Y is clearly non-linear. c. is greater than 1, since the relationship is non-linear. d. may be exactly 1, since all the points lie on the same curve. e. is close to 1, even though the relationship is not linear.
b
Different writers have different styles. One way to quantify this difference is to compare the distribution of word lengths in their work. Below are parallel boxplots describing the distributions of word lengths for the first 60 words in Henry James's The Turn of the Screw, J.K. Rowling's Harry Potter and the Chamber of Secrets, and Chapter 1 of your statistics textbook (labeled "Starnes" below). Based on the graphs, which one of the following statements must be true? a. The range of Rowling's word lengths is smaller than the interquartile range of Starnes's word lengths. b. The median word length for Rowling is longer than for either Starnes or James. c. The longest word in the distribution of Rowling's word lengths is shorter than 25% of the words in the "James" distribution. d. 75% of the words in Rowling's distribution are longer than the median word length in Starnes's distribution. e. Dot plots of the distributions of James's word lengths and Starnes's word lengths are identical.
c
Entomologist Heinz Kaefer has a colony of bongo spiders in his lab. There are 1000 adult spiders in the colony, and their weights are Normally distributed with mean 11 grams and standard deviation 2 grams. About how many spiders are there in the colony which weigh more than 12 grams? a. 160 b. 117 c. 310 d. 690 e. 840
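A sketch of the calculation, assuming the stated Normal model and using `statistics.NormalDist` (variable names are illustrative):

```python
from statistics import NormalDist

spider_weights = NormalDist(mu=11, sigma=2)
colony_size = 1000

# Proportion heavier than 12 g (z = 0.5), scaled up to the colony
p_heavier = 1 - spider_weights.cdf(12)
expected_count = colony_size * p_heavier
print(round(expected_count))  # roughly 310 spiders
```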
d
For the density curve below, which of the following is true? a. The density curve is Normal. b. The density curve is symmetric. c. The density curve is skewed right. d. The median is larger than 0.5. e. The median is 0.5.
c
Here is a list of exam scores for the 14 students in Mr. Williams's calculus class: 60 61 61 65 72 75 75 78 81 81 85 89 91 98 What is the percentile of the person whose score was 85? a. 15% b. 29% c. 71% d. 85% e. 21%
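A sketch of the percentile calculation, assuming the definition "percent of scores strictly below the given score" (some texts use "at or below" instead):

```python
scores = [60, 61, 61, 65, 72, 75, 75, 78, 81, 81, 85, 89, 91, 98]

# Count the scores strictly below 85 and express that as a percent of the class
below = sum(1 for s in scores if s < 85)
percentile = 100 * below / len(scores)
print(below, round(percentile))  # 10 of 14 scores, about 71%
```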
e
If 30 is added to every number on a list, which one of the following does not change? a. the median. b. the 75th percentile. c. the mean. d. the mode. e. the standard deviation.
d
If changes in a response variable are due to the effects of the explanatory variable as well as the effects of another variable, and we cannot distinguish between these effects, we are said to have a. correlation. b. a cause-and-effect relation between the explanatory and response variable. c. extrapolated. d. confounding. e. a placebo effect.
e
If removing an observation from a data set would have a marked change on the equation of the least-squares regression line, the point is called a. a residual. b. a response. c. an outlier. d. resistant. e. influential.
d
If the heights of 99.7% of American men are between 5' 0" and 7' 0", what is your estimate of the standard deviation of the height of American men? a. 3" b. 12" c. 6" d. 4" e. 1"
c
If the values in a data set are in feet, then what are the units for standard scores? a. Feet b. 1/feet c. Standard scores do not have units. d. Square feet e. Square root of feet
a
In comparative trials in medicine, the placebo effect and subconscious bias on the part of the physicians evaluating treatment outcomes can be avoided by using a. the double-blind technique. b. randomized complete block designs. c. response variables. d. stratified random samples. e. all of the above.
a
In order to assess the opinion of students at the University of Minnesota on campus snow removal, a reporter for the student newspaper interviews the first 12 students he meets who are willing to express their opinion. The method of sampling used is a. a convenience sample b. a voluntary response sample c. a census d. a simple random sample e. a cluster sample
b
Just before the presidential election of 1936, the magazine Literary Digest predicted—incorrectly, as it turned out—that Alf Landon would defeat Franklin Delano Roosevelt. Landon lost in a landslide. It turned out that the magazine had only polled its own subscribers, plus others from a list of automobile owners and a list of people who had telephone service. All three groups had higher than typical incomes during the Great Depression. This is an example of a. voluntary response bias. b. undercoverage. c. response bias. d. nonresponse. e. bias resulting from question wording.
a
One hundred volunteers who suffer from severe depression are available for a study. Fifty are selected at random and are given a new drug that is thought to be particularly effective in treating severe depression. The other fifty are given an existing drug for treating severe depression. A psychiatrist evaluates the symptoms of all volunteers after four weeks in order to determine if there has been substantial improvement in the severity of the depression. The factor in this study is a. which treatment the volunteers receive. b. the extent to which the depression was reduced. c. the use of randomization and the fact that this was a comparative study. d. the symptoms observed by the psychiatrist. e. the use of a psychiatrist to evaluate the severity of depression.
c
Scores on the 1995 SAT verbal aptitude test x among Kentucky high school seniors were Normally distributed with mean 420 and standard deviation 80. Scores on the 1995 SAT quantitative aptitude test y among Kentucky high school seniors were Normally distributed with mean 440 and standard deviation 60. The least-squares regression line has the equation y = 188 + 0.6x. The correlation between verbal scores and math scores is a. cannot be determined from the information given b. -0.8 c. 0.8 d. 0.45 e. 0
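The correlation can be recovered from the slope because the least-squares slope satisfies b = r * (s_y / s_x). A sketch with the numbers from the question, treating the slope as 0.6 and the intercept as 188 (variable names are illustrative):

```python
mean_x, sd_x = 420, 80   # verbal scores
mean_y, sd_y = 440, 60   # quantitative scores
slope = 0.6

# Rearranging b = r * sd_y / sd_x gives r = b * sd_x / sd_y
r = slope * sd_x / sd_y

# The line passes through (mean_x, mean_y), which pins down the intercept
intercept = mean_y - slope * mean_x
print(round(r, 2), round(intercept, 1))  # 0.8 and 188.0
```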
a
Suppose a straight line is fit to data having response variable y and explanatory variable x. Predicting values of y for values of x outside the range of the observed data is called a. extrapolation. b. correlation. c. interpolation. d. causation. e. contingency.
c
Suppose that scores on a certain IQ test are Normally distributed with mean 110 and standard deviation 15. Then about 40% of the scores are between a. the 25th and 75th percentiles. b. 106 and 110. c. 102 and 118. d. 65 and 155. e. 80 and 140.
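Each candidate interval can be checked by computing its area under the Normal curve. A sketch using `statistics.NormalDist`; the dictionary keys mirror the answer letters and are otherwise arbitrary.

```python
from statistics import NormalDist

iq = NormalDist(mu=110, sigma=15)

intervals = {
    "a": (iq.inv_cdf(0.25), iq.inv_cdf(0.75)),  # middle 50% by definition
    "b": (106, 110),
    "c": (102, 118),
    "d": (65, 155),
    "e": (80, 140),
}

# Area under the curve for each interval, then the one nearest 40%
areas = {k: iq.cdf(hi) - iq.cdf(lo) for k, (lo, hi) in intervals.items()}
closest = min(areas, key=lambda k: abs(areas[k] - 0.40))
print(closest, {k: round(v, 3) for k, v in areas.items()})
```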
e
Suppose we fit the least-squares regression line to a set of data. If a plot of the residuals shows a curved pattern, a. r² = 0. b. the correlation must be positive. c. outliers must be present. d. the correlation must be 0. e. a straight line is not a good summary for the data.
a
The Normal curve below describes the death rates from heart disease per 100,000 people in developed countries in the 1990s. The mean and standard deviation of this distribution are approximately a. Mean ≈ 190; Standard Deviation ≈ 65 b. Mean ≈ 200; Standard Deviation ≈ 130 c. Mean ≈ 190; Standard Deviation ≈ 100 d. Mean ≈ 100; Standard Deviation ≈ 65 e. Mean ≈ 100; Standard Deviation ≈ 100
a
The State of Rhode Island has only five counties. The relative abundance of people in each county, according to the 2010 census, is given in the pie chart below. Which of the following frequency distributions could describe the same population? (Frequencies have been rounded to the nearest thousand.) a. County Population Bristol 50,000 Kent 166,000 Newport 83,000 Providence 627,000 Washington 127,000 b. County Population Bristol 83,000 Kent 50,000 Newport 166,000 Providence 627,000 Washington 127,000 c. County Population Bristol 50,000 Kent 166,000 Newport 166,000 Providence 627,000 Washington 50,000 d. County Population Bristol 50,000 Kent 166,000 Newport 627,000 Providence 83,000 Washington 127,000 e. County Population Bristol 50,000 Kent 166,000 Newport 50,000 Providence 627,000 Washington 166,000
b
The ages of people in a college class are as follows: Age: 18, 19, 20, 21, 22, 23, 24, 25, 32. Number of students: 14, 120, 200, 200, 90, 30, 10, 2, 1. What is true about the median age? a. It must be 20.5. b. It must be 20. c. It must be over 21. d. It must be 21. e. It could be any number between 19 and 21.
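With 667 students the median is the 334th age in sorted order. A sketch that expands the frequency table and lets the standard library find it (variable names are illustrative):

```python
from statistics import median

# Age -> number of students, taken from the table in the question
age_counts = {18: 14, 19: 120, 20: 200, 21: 200, 22: 90, 23: 30, 24: 10, 25: 2, 32: 1}

# Expand the table into one entry per student, already in sorted order
ages = [age for age, count in age_counts.items() for _ in range(count)]
median_age = median(ages)
print(len(ages), median_age)  # 667 students, median age 20
```

The cumulative count reaches exactly 334 at age 20, so the middle value is 20 and not 20.5.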
c
The bar graph below summarizes responses of dog owners to the question, "Where in the car do you let your dog ride?" Which of the following statements is true? a. These data could also be presented in a pie chart. b. Roughly twice as many pets are allowed to sit in the front passenger seat as in the passenger's lap. c. The vertical scale of this graph exaggerates the difference between the percentage who let their dogs ride in the driver's lap versus a passenger's lap. d. Each owner gave only one answer to the question. e. A majority of owners do not allow their pets to ride in the front passenger seat.
b
The correlation coefficient measures a. whether a cause and effect relation exists between two variables. b. the strength of the linear relationship between two quantitative variables. c. whether there is a relationship between two variables. d. the strength of the relationship between two quantitative variables. e. whether or not a scatterplot shows an interesting pattern.
b
The density curve below shows the distribution of a variable that takes values from 0 to 1. What percent of the observations lie between 0 and 0.5? a. 80% b. 25% c. 75% d. 50% e. 20%
c
The distribution of household incomes in a small town is strongly skewed to the right. The mean income is $42,000 and the standard deviation is $24,000. The Ames family's household income is $60,000. The z-score for the Ames family's income is a. -0.75 b. 0.3 c. 0.75 d. 0.86 e. None of these, because z-score cannot be used unless the distribution is Normal.
b
The essential difference between an experiment and an observational study is that a. observational studies are always biased. b. an experiment imposes treatments on the subjects, but an observational study does not. c. observational studies may have confounded variables, but experiments never do. d. observational studies cannot have response variables. e. in an experiment, people must give their informed consent before being allowed to participate.
b
The five-number summary of the distribution of 316 scores on a statistics exam is 0 26 31 36 50. The scores are approximately Normal. The standard deviation of test scores must be about a. 0.67. b. 7.5. c. 10. d. 5.0. e. 55.
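For a Normal distribution the quartiles sit about 0.67 standard deviations on either side of the mean, so the IQR is about 1.35 standard deviations. A sketch of the resulting estimate, assuming the Normal approximation stated in the question:

```python
from statistics import NormalDist

q1, q3 = 26, 36
iqr = q3 - q1

# z-score of the 75th percentile of a standard Normal (about 0.674)
z_q3 = NormalDist().inv_cdf(0.75)

# IQR = 2 * z_q3 * sigma, so sigma is the IQR divided by about 1.35
sigma_estimate = iqr / (2 * z_q3)
print(round(sigma_estimate, 1))  # about 7.4, closest to 7.5
```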
a
The five-number summary of the distribution of scores on the final exam in Psych 001 last semester was: 18 39 62 76 100. Which of the following best describes the location of the 80th percentile? a. between 76 and 100. b. between 18 and 39 c. between 62 and 76 d. 76 e. probably between 39 and 76, since most of the class scored between these two numbers.
d
The following histogram represents the distribution of acceptance rates (percent accepted) among 25 business schools in 1997. What percent of the schools have an acceptance rate of under 20%? a. 3% b. 24% c. 4% d. 16% e. 12%
c
The fraction of the variation in the values of a response y that is explained by the least-squares regression of y on x is the a. sum of the squared residuals. b. correlation coefficient. c. square of the correlation coefficient. d. slope of the least-squares regression line. e. intercept of the least-squares regression line.
d
The histogram below shows the length (in minutes) of 140 songs recorded by the band Wilco. Which of the following descriptions best fits this distribution? a. Skewed left, centered at about 8, with several high outliers. b. Skewed left, centered at about 4.5, with several high outliers. c. Skewed right, centered at about 8, with several high outliers. d. Skewed right, centered at about 4.5, with several high outliers. e. Skewed left, centered at about 3.5, with several high outliers.
b
The least-squares regression line is fit to a set of data. If one of the data points has a positive residual, then a. the correlation between the values of the response and explanatory variables must be positive. b. the point must lie above the least-squares regression line. c. the point must lie near the right edge of the scatterplot. d. the point is probably an influential point. e. all of the above.
a
The mean age of four people in a room is 30 years. A new person whose age is 55 years enters the room. The mean age of the five people now in the room is a. 35. b. 37.5. c. 40. d. 30. e. Cannot be determined from the information given.
b
The mean number of days that the midge Chaoborus spends in its larval stage is 14.1 days, with a standard deviation of 2.2 days. This distribution is skewed toward higher values. What is the z-score for an individual midge that spends 12.7 days in its larval stage? a. -1.11 b. -0.64 c. 0.64 d. 0.94 e. None of these, because z-score cannot be used unless the distribution is Normal.
d
The median age of five elephants at a certain zoo is 30 years. One of the elephants, whose age is 50 years, is transferred to a different zoo. The median age of the remaining four elephants is a. 30 years. b. 25 years. c. less than 30 years. d. Cannot be determined from the information given. e. 40 years.
e
The plot shown below is a Normal probability plot for the total annual cost (tuition plus room and board) to attend 126 of the top colleges in the country in 2005. Which statement is true for these data? a. The data are clearly Normally distributed. b. There is insufficient information to determine the shape of the distribution. c. The data are clearly skewed to the right. d. The data are approximately Normally distributed. e. The data are clearly skewed to the left.
b
The principal reason for replication in designing experiments is that it a. allows double-blinding. b. reduces sampling variability. c. eliminates the placebo effect. d. distinguishes a treatment effect from the effects of other, possibly confounding variables. e. creates approximately equal groups for comparison.
e
The scatter plot below describes the relationship between heights of 36 students and the number of words they spelled correctly in a spelling bee. The closed circles represent first graders and the open circles represent fifth graders. Which of the following statements is supported by the information in the scatter plot? a. The tallest first grader is taller than six of the third graders. b. The tallest first grader spelled more words correctly than five of the fifth graders. c. All of the fifth graders spelled more words correctly than any of the first graders. d. Within each of the two grades, there is a strong negative relationship between height and how many words were spelled correctly. e. When the data for first and fifth grades is combined, there is a moderately strong positive relationship between height and how many words were spelled correctly.
a
The scatterplot below summarizes the relationship between chocolate consumption and the number of Nobel laureates per million people in 22 developed nations. The black squares represent European countries; the open circles represent non-European countries. Which of the following statements is an appropriate conclusion to draw from these data? a. There is a stronger correlation between chocolate consumption and Nobel laureates in non-European countries than in European countries. b. The correlation is stronger for European countries than for non-European countries. c. The two countries with the highest number of Nobel laureates (per million) only consume 4 - 5 kilograms of chocolate per year per person. d. People in non-European countries consume more chocolate than people in European countries. e. Consuming more chocolate has a positive impact on cognitive function and increases the likelihood of winning a Nobel prize.
a
The scores on a university examination are Normally distributed with a mean of 62 and a standard deviation of 11. If the bottom 5% of students will fail the course, what is the lowest mark that a student can have and still be awarded a passing grade? a. 44 b. 57 c. 62 d. 40 e. 43
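This is an inverse Normal calculation: find the score with 5% of the distribution below it. A sketch using `statistics.NormalDist.inv_cdf`, assuming marks are whole numbers so the cutoff is rounded up:

```python
import math
from statistics import NormalDist

exam = NormalDist(mu=62, sigma=11)

# Score at the 5th percentile (z is about -1.645)
cutoff = exam.inv_cdf(0.05)

# Lowest whole-number mark at or above the cutoff
lowest_passing = math.ceil(cutoff)
print(round(cutoff, 1), lowest_passing)  # about 43.9, so 44
```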
e
The standard deviation of 16 people's weights (in pounds) is computed to be 5.4. The units for the variance of these measurements are a. percentiles. b. square root pounds. c. pounds. d. no units. Variance never has units. e. pounds squared.
c
The standard deviation of 16 people's weights (in pounds) is computed to be 5.4. The variance of these measurements is a. 52.34. b. 256. c. 29.16. d. 21.6. e. 2.24.
e
The stemplot below shows the number of home runs hit in 2008 by members of the Philadelphia Phillies, who won Major League Baseball's World Series that year. (Each of the 13 players who appeared in at least half the Phillies' games that year is included.) Note that 4 | 8 represents 48 home runs. The five-number summary for these data is: a. 0, 9, 11, 33, 48 b. 0, 9, 1, 3, 8 c. 0, 6.5, 11, 28.5, 33 d. 0, 4, 11, 24, 48 e. 0, 6.5, 11, 28.5, 48
a
The table below shows the results of the New Hampshire Republican Presidential Primary on January 10, 2012. Candidate (percentage of votes): Mitt Romney 39, Ron Paul 23, Jon Huntsman 17, Rick Santorum 9, Newt Gingrich 9, Other 3. Which of the following lists of graphs are all appropriate ways of presenting these data? a. Bar graph, Pie Chart b. Bar Graph only c. Pie Chart only d. Bar graph, Box plot e. Bar graph, Pie Chart, Box plot
b
The time to complete a standardized exam is approximately Normal with a mean of 70 minutes and a standard deviation of 10 minutes. How much time should be given to complete the exam so that 80% of the students will complete the exam in the time given? a. 84 minutes b. 78.4 minutes c. 61.6 minutes d. 92.8 minutes e. 79.8 minutes
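The time limit is the 80th percentile of the completion-time distribution. A sketch using `statistics.NormalDist.inv_cdf` (variable names are illustrative):

```python
from statistics import NormalDist

completion_time = NormalDist(mu=70, sigma=10)

# Time by which 80% of students finish (z is about 0.84)
time_limit = completion_time.inv_cdf(0.80)
print(round(time_limit, 1))  # about 78.4 minutes
```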
b
There are three children in a room, ages three, four, and five. If another four-year-old child enters the room the a. mean age and variance will increase. b. mean age will stay the same but the variance will decrease. c. mean age and variance will decrease. d. mean age and variance will stay the same. e. mean age will stay the same but the variance will increase.
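Adding a value equal to the mean leaves the mean alone and adds nothing to the sum of squared deviations, while the divisor grows. A sketch comparing before and after, using both the population and sample variance from the standard library:

```python
from statistics import mean, pvariance, variance

before = [3, 4, 5]
after = [3, 4, 4, 5]   # another four-year-old enters the room

print(mean(before), mean(after))            # the mean is 4 both times
print(pvariance(before), pvariance(after))  # population variance shrinks
print(variance(before), variance(after))    # sample variance shrinks too
```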
a
There is a positive correlation between the size of a hospital (measured by number of beds) and the median number of days that patients remain in the hospital. Does this mean that you can shorten a hospital stay by choosing to go to a small hospital? a. No - the positive correlation is probably explained by the fact that seriously ill people go to large hospitals b. No - a negative correlation would allow that conclusion, but this correlation is positive. c. Yes - the correlation establishes that hospital size is the reason for shorter stays. d. Yes - but only if r is greater than 0.5. e. Yes - but only if r is very close to 1.
e
Using the standard Normal distribution tables, the area under the standard Normal curve corresponding to -0.5 < Z < 1.2 is a. 0.8849. b. 0.2815. c. 0.3085. d. 0.3661. e. 0.5764.
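The table lookup amounts to subtracting two cumulative areas. A sketch that reproduces it with `statistics.NormalDist` in place of a printed table:

```python
from statistics import NormalDist

z = NormalDist()  # standard Normal: mean 0, SD 1

# Area to the left of 1.2 minus area to the left of -0.5
area = z.cdf(1.2) - z.cdf(-0.5)
print(round(z.cdf(1.2), 4), round(z.cdf(-0.5), 4), round(area, 4))
```

The two table entries are about 0.8849 and 0.3085, and their difference is about 0.5764.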
e
Using the standard Normal distribution tables, the area under the standard Normal curve corresponding to Z < 1.1 is a. 0.8413. b. 0.8438. c. 0.1357. d. 0.2704. e. 0.8643.
e
Using the standard Normal distribution tables, the area under the standard Normal curve corresponding to Z > -1.22 is a. 0.4129. b. 0.1151. c. 0.8849. d. 0.1112. e. 0.8888.
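The three table lookups above can be cross-checked with the standard Normal cumulative distribution function. A short sketch, assuming Python's standard-library `statistics.NormalDist` in place of the printed table (variable names are illustrative):

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal: mean 0, sd 1

# Area between two z-scores: difference of the two cumulative areas.
area_between = Z.cdf(1.2) - Z.cdf(-0.5)

# Area to the left of a z-score: the cumulative area itself.
area_left = Z.cdf(1.1)

# Area to the right of a z-score: one minus the cumulative area.
area_right = 1 - Z.cdf(-1.22)

print(f"P(-0.5 < Z < 1.2) = {area_between:.4f}")
print(f"P(Z < 1.1)        = {area_left:.4f}")
print(f"P(Z > -1.22)      = {area_right:.4f}")
```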
a
Which of the following are most likely to be negatively correlated? a. The prices and the weights of all racing bicycles sold last year in Chicago. b. Gender and yearly earnings among 35-year-old U.S. adults. c. The heights and yearly earnings of 35-year-old U.S. adults. d. The total floor space and the price of an apartment in New York. e. The percentage of body fat and the time it takes to run a mile for male college students.
c
Which of the following best describes the correlation r? a. The average of the squared products of the standardized scores of X and Y for each point. b. The average perpendicular distance between each data point and the least-squares regression line. c. The average of the products of the standardized scores of X and Y for each point. d. The average of the differences between each X value and each Y value. e. The average of the products of each of the X and Y values for each point
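The description of r as an average of products of standardized scores can be written out directly. This sketch uses a small made-up data set (an assumption for illustration only) and the usual textbook convention of dividing by n - 1:

```python
from statistics import mean, stdev

# Hypothetical paired data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)   # sample standard deviations

# Standardize each value, multiply the pairs, and average with n - 1.
z_products = [((xi - x_bar) / s_x) * ((yi - y_bar) / s_y) for xi, yi in zip(x, y)]
r = sum(z_products) / (n - 1)

print(f"r = {r:.4f}")
```

Because each value is standardized before multiplying, r has no units and does not change if x and y are swapped.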
c
Which of the following dot plots would best be approximated by a Normal distribution? a. C b. D c. E d. A e. B
a
Which of the following graphs can be used to summarize the data in a two-way table? a. Segmented bar graph b. Stem plot c. Dot plot d. Histogram e. Box plot
a
Which of the following is correct? a. The mean of the residuals from least-squares regression is 0. b. The sum of the squared residuals from the least-squares line is 0. c. The square of the correlation is the proportion of the data lying on the least-squares regression line. d. The square of the correlation is the slope of the least-squares regression line. e. The correlation r is the slope of the least-squares regression line.
a
Which of the following statements about influential points and outliers are true? I. An influential point always has a high residual. II. Outliers are always influential points. III. Removing an influential point always causes a marked change in either the correlation, the regression equation, or both. a. III only. b. II and III only. c. I, II, and III are all true. d. I only. e. II only
e
Which of the following statements are true about the least-squares regression line? I. The slope is the predicted change in the response variable associated with a unit increase in the explanatory variable. II. The line always passes through the point (x̄, ȳ), the means of the explanatory and response variables, respectively. III. It is the line that minimizes the sum of the squared residuals. a. I only. b. III only. c. II only. d. I and III only. e. I, II, and III are all true.
e
Which of the following statements concerning residuals is true? a. The sum of the residuals is always 0. b. A plot of the residuals is useful for assessing the fit of the least-squares regression line. c. The value of a residual is the observed value of the response minus the value of the response that one would predict from the least-squares regression line. d. An influential point on a scatterplot is not necessarily the point with the largest residual. e. All of the above.
c
Which of the following statements describes what the standard deviation of residuals for a regression equation can be used for? I. It describes the typical vertical distance between an observed data point and the regression line. II. It evaluates whether a linear model is appropriate for a set of data. III. It measures the overall precision of predictions made using the regression equation. a. III only b. Both I and II c. Both I and III d. II only e. I only
d
Which one of the following statements is correct? a. A researcher finds the correlation between the shoe size of children and their score on a reading test to be 0.22. The researcher must have made a mistake since these two variables are clearly unrelated and must have correlation 0. b. Women tend to be, on average, about 3.5 inches shorter than the men they marry, so the correlation between the heights of spouses must be negative. c. Small dogs bark more often than big dogs, so the correlation between dog size and barking frequency is positive. d. If people with larger heads tend to be more intelligent, then we would expect the correlation between head size and intelligence to be positive. e. The correlation r equals the proportion of times that two variables lie on a straight line.
a
X and Y are two categorical variables. The best way to determine if there is a relation between them is to a. make a two-way table of the X and Y values. b. compare medians and interquartile ranges of the X and Y values. c. construct parallel box plots of the X and Y values. d. draw dot plots of the X and Y values. e. compare means and standard deviations of the X and Y values.
e
You are examining the relationship between x = the height of red oak trees and y = the number of acorns produced in a five year period. You calculate a correlation coefficient and a least-squares regression line of y on x. If you switched the variables (that is, let x = number of acorns and y = height of trees), which of the following would be true? a. Neither the correlation coefficient nor the regression equation would change. b. Only the y-intercept of the regression line would change, the slope of the line and the correlation coefficient would not change. c. Both the correlation coefficient and the regression line would be changed. d. The correlation coefficient would change, but the regression line would not change. e. The correlation coefficient would not change, but the regression line would change.
e
You are told that your score on an exam is at the 85th percentile of the distribution of scores. Which of the following is a correct interpretation of this information? a. 85% of the people who took this test earned the same score you did. b. If you took this test (or one like it) again, you would score as well as you did this time 85% of the time. c. You answered 85% of the questions correctly. d. Your score was lower than approximately 85% of the people who took this exam. e. Your score was higher than approximately 85% of the people who took this exam.
e
You can roughly locate the mean of a density curve by eye because it is a. the point at which the curve reaches its peak. b. the point at which the height of the graph is equal to 1. c. the point that divides the area under the curve into two equal parts. d. the point where the curvature changes direction. e. the point at which the curve would balance if made of solid material.
a
You can roughly locate the median of a density curve by eye because it is a. the point that divides the area under the curve into two equal parts. b. the point at which the curve would balance if made of solid material. c. the point at which the curve reaches its peak. d. the point at which the height of the graph is equal to 1. e. the point where the curvature changes direction.
e
You catch 10 cockroaches in your bedroom and measure their lengths in centimeters. Which of these sets of numerical descriptions are all measured in centimeters? a. median length, variance of lengths, largest length b. median length, first and third quartiles of lengths c. mean length, standard deviation of lengths, median length d. mean length, median length, variance of lengths. e. both (B) and (C)
e
You measure the age (years), weight (pounds), and marital status (single, married, divorced, or widowed) of 1400 women. How many variables did you measure? a. 1 b. 2 c. 1403 d. 1400 e. 3
e
You measure the age, marital status and earned income of an SRS of 1463 women. The number and type of variables you have measured is a. four; one categorical and two quantitative, and one individual. b. four; one categorical and three quantitative. c. four; two categorical and two quantitative. d. three; two categorical and one quantitative. e. three; one categorical and two quantitative.
c
You want to use numerical summaries to describe a distribution that is strongly skewed to the left. Which combination of measure of center and spread would be the best ones to use? a. Median and standard deviation. b. Mean and standard deviation. c. Median and interquartile range. d. Mean and interquartile range. e. Median and range.
e
You would draw a scatterplot to a. show the five-number summary for the heights of female students. b. show the relationship between gender and having a driver's license. c. show the distribution of heights of students in this course. d. compare the distributions of heights for male and female students in this course. e. show the relationship between the height of female students and the heights of their mothers.
d
A data set is Normally distributed with a mean of 25 and a standard deviation of 8. If you calculate the standard score of every observation in this data set, what will the mean and standard deviation of the resulting scores be? a. Mean = 100; Standard deviation = 10 b. Mean = 25; Standard deviation = 1 c. Mean = 25; Standard deviation = 10 d. Mean = 0; Standard deviation = 1 e. Mean = 1; Standard deviation = 1
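The standardization question above does not depend on the particular mean and standard deviation. This sketch standardizes a small hypothetical data set (made up for illustration, not taken from the question) using its own mean and standard deviation, then summarizes the resulting z-scores:

```python
from statistics import mean, pstdev

# Hypothetical observations, for illustration only.
scores = [13, 19, 25, 31, 37]

mu = mean(scores)
sigma = pstdev(scores)  # population sd of this data set

# Standard score: how many standard deviations each value lies from the mean.
z = [(x - mu) / sigma for x in scores]

z_mean = mean(z)
z_sd = pstdev(z)

print(f"mean of z-scores = {z_mean:.4f}")
print(f"sd of z-scores   = {z_sd:.4f}")
```

Subtracting the mean recenters the data and dividing by the standard deviation rescales it, whatever the original center and spread were.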
a
In a study of the link between high blood pressure and cardiovascular disease, a group of white males aged 35 to 64 was followed for 5 years. At the beginning of the study, each man had his blood pressure measured and it was classified as either "low" systolic blood pressure (less than 140 mm Hg) or "high" blood pressure (140 mm Hg or higher). The following table gives the number of men in each blood pressure category and the number of deaths from cardiovascular disease during the 5-year period. Blood pressure (deaths, total): Low (10, 2000); High (50, 3500). Based on these data, which of the following statements is correct? a. These data are consistent with the idea that there is a link between high blood pressure and death from cardiovascular disease. b. The mortality rate (proportion of deaths) for men with high blood pressure is 5 times that of men with low blood pressure. c. These data probably understate the link between high blood pressure and death from cardiovascular disease, because men will tend to understate their true blood pressure. d. Although there were more deaths in the high blood pressure group, this is expected, because there were 1500 more men in that group. e. All of the above.
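Because the two blood pressure groups in the question above differ in size, the fair comparison is between mortality rates rather than raw death counts. A minimal sketch using the counts from the table (variable names are illustrative):

```python
# Counts from the table: deaths and group totals.
low_deaths, low_total = 10, 2000
high_deaths, high_total = 50, 3500

# Mortality rate = proportion of deaths within each group.
low_rate = low_deaths / low_total
high_rate = high_deaths / high_total

# How many times larger the high-pressure rate is than the low-pressure rate.
rate_ratio = high_rate / low_rate

print(f"low: {low_rate:.4f}, high: {high_rate:.4f}, ratio: {rate_ratio:.2f}")
```

The ratio of death counts is 5, but the ratio of rates is smaller because the high blood pressure group has more men.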