Statistics chapters 12-15
The Consumer Price Index (1982-84 = 100) in mid-2008 was about 218.8. The CPI in 1930 (same base) was 16.7. The New York Yankees paid Babe Ruth $80,000 in 1930, an enormous salary for an athlete in those days. The buying power of the Babe's salary in 2008 dollars is about
$1,048,144
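The arithmetic behind this answer multiplies the original amount by the ratio of the two CPI values. A minimal Python sketch; the function name `convert_dollars` is an illustrative choice, not something from the text:

```python
def convert_dollars(amount, cpi_from, cpi_to):
    """Express `amount` from the period with CPI `cpi_from`
    in dollars of the period with CPI `cpi_to`."""
    return amount * cpi_to / cpi_from

# Babe Ruth's 1930 salary in mid-2008 dollars (CPI values from the question).
ruth_2008 = convert_dollars(80_000, cpi_from=16.7, cpi_to=218.8)
print(f"${ruth_2008:,.0f}")  # $1,048,144
```

Swapping the two CPI arguments converts in the other direction, from later dollars back to earlier dollars.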
A survey of people who invest in mutual funds found that "average" amounts invested were $18,200 and $43,500. One of these numbers is the mean amount invested and one is the median.
$43,500 is the mean, because a few people with very large amounts invested pull up the mean but not the median.
A gallon of unleaded gasoline cost $1.19 in 1980 and $4.05 in July 2008. The gasoline price index number (1980 = 100) for July 2008 is
(4.05/1.19) × 100 = 340.3.
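An index number is the value divided by the base-period value, times 100. A small Python sketch of that formula; `index_number` is a hypothetical helper name:

```python
def index_number(value, base_value):
    """Index number = (value / base-period value) * 100."""
    return value / base_value * 100

# Gasoline price index for July 2008 with 1980 = 100.
gas_2008 = index_number(4.05, 1.19)
print(round(gas_2008, 1))  # 340.3
```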
Scores on the 2007 SAT writing exam were normally distributed, with mean 495 and standard deviation about 110. Reference: Ref 13-10 What percent of all students scored above 605?
16%
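The 16% comes from the 68-95-99.7 rule: 605 is one standard deviation above the mean, so half of the remaining 32% lies above it. As a rough check, Python's standard-library `statistics.NormalDist` gives the exact normal-curve area, which is close to the rule-of-thumb answer:

```python
from statistics import NormalDist

# SAT writing scores: mean 495, standard deviation 110 (from the question).
sat_writing = NormalDist(mu=495, sigma=110)

# Proportion of scores above 605, i.e. more than one standard deviation above the mean.
p_above_605 = 1 - sat_writing.cdf(605)
print(f"{p_above_605:.1%}")  # 15.9%
```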
The Consumer Price Index (1982-84 = 100) was about 207 in 2007. In 1989 the CPI was 124. Tuition for in-state students at one Big Ten university was $2,032 in 1989. In 2007 dollars, this tuition is equivalent to
2,032 × (207/124) = $3,392.
For a normal distribution with mean 20 and standard deviation 5, approximately what percent of the observations will be less than 10?
2.5%
The distribution of heights of adult men is approximately normal with mean 69 inches and standard deviation 2.5 inches. About what percent of men are taller than 74 inches?
2.5%
The five-number summary of the distribution of scores on a statistics exam taken by 316 students is 0 26 31 36 50. About how many students had scores above 26?
237
A pair of soccer shoes cost $50.00 in 1998; a pair of the same type of shoes costs $120.00 in 2008. Using 1998 as the base year, what is the soccer shoe index number for 2008?
240
The correlation between two variables x and y is 0.5. If we used a regression line to predict y using x, what percent of the variation in y would be explained?
25%
Lean body mass (your weight leaving out fat) helps predict metabolic rate (how many calories of energy you burn in an hour). The relationship is roughly a straight line. The least-squares regression line for predicting metabolic rate (y in calories) from lean body mass (x in kilograms) is y = 113.2 + 26.9x. Reference: Ref 15-7 The slope of the regression line is
26.9—that is, when lean body mass goes up by 1 kg, metabolic rate goes up by 26.9 calories.
If the least-squares regression line for predicting y from x is y = 500 - 20x, what is the predicted value of y when x = 10?
300
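A prediction from a regression line is just the intercept plus the slope times x. A minimal sketch in Python; `predict` is an illustrative name:

```python
def predict(intercept, slope, x):
    """Predicted y from the line y = intercept + slope * x."""
    return intercept + slope * x

# The line y = 500 - 20x evaluated at x = 10.
print(predict(500, -20, 10))  # 300
```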
In the good old days (1986) the U.S. dollar was worth 1.85 Swiss Francs. Over two decades later in 2008, the dollar was worth 1.16 Swiss Francs. The value of the dollar in Swiss Francs went down by about
37%
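The percent change is measured relative to the starting value, here the 1986 exchange rate. A short Python sketch; `percent_change` is a hypothetical helper:

```python
def percent_change(old, new):
    """Percent change from `old` to `new`, relative to `old`."""
    return (new - old) / old * 100

# U.S. dollar in Swiss francs: 1.85 in 1986, 1.16 in 2008.
change = percent_change(1.85, 1.16)
print(round(change, 1))  # -37.3
```

The negative sign indicates a decrease, which matches the answer of about 37%.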
Here are the number of hours that each of a group of students studied for this exam: 1 1 2 2 4 4 4 5 5 22 Reference: Ref 12-4 What is the median number of study hours?
4
In any normal distribution, the percent of observations falling between standard score z = 0 and standard score z = 2 is about
47.5%
Here are the number of hours that each of a group of students studied for this exam: 1 1 2 2 4 4 4 5 5 22 Reference: Ref 12-4 What is the mean number of study hours?
5.0
The length of pregnancy isn't always the same. In pigs, the length of pregnancies varies according to a normal distribution with mean 114 days and standard deviation five days. Reference: Ref 13-2 What percent of pig pregnancies are longer than 114 days?
50%
Here are the number of hours that each of a group of students studied for this exam: 1 1 2 2 4 4 4 5 5 22 Reference: Ref 12-4 What is the standard deviation of the number of study hours?
6.2
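The median, mean, and standard deviation of the study-hours data can be checked with Python's standard `statistics` module. This sketch assumes the standard deviation is the sample version (dividing by n - 1), which is what `statistics.stdev` computes:

```python
import statistics

# Study hours for the ten students (from the question).
hours = [1, 1, 2, 2, 4, 4, 4, 5, 5, 22]

mean_hours = statistics.mean(hours)      # 50 / 10 = 5
median_hours = statistics.median(hours)  # average of the 5th and 6th values = 4
sd_hours = statistics.stdev(hours)       # sqrt(342 / 9), about 6.2

print(mean_hours, median_hours, round(sd_hours, 1))
```

The single large value, 22, pulls the mean above the median and accounts for most of the standard deviation.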
Scores of adults aged 60 to 64 on a common IQ test are approximately normally distributed with mean 90 and standard deviation 15. Reference: Ref 13-9 What range of IQ scores contains the central 95% of the population of adults aged 60 to 64?
60 to 120
The heights of American men aged 18 to 24 are normally distributed with mean 68 inches and standard deviation 2.5 inches. So half of all young men are taller than
68 inches
Scores on the 2007 SAT writing exam were normally distributed, with mean 495 and standard deviation about 110. Reference: Ref 13-10 What percent of all students scored between 385 and 605?
68%
To locate the median of 139 observations, you would count up to what position after arranging the data in order from smallest to largest?
70th in the ordered list
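The position of the median in an ordered list of n observations is (n + 1)/2. A one-line Python sketch; `median_position` is an illustrative name:

```python
def median_position(n):
    """Location of the median in an ordered list of n observations.
    A value ending in .5 means: average the two neighbouring observations."""
    return (n + 1) / 2

print(median_position(139))  # 70.0
```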
The following is a stemplot of 12 exam scores. (The stem is tens and the leaf is units.): 6 | 8 7 | 66 8 | 0488 9 | 22666 Reference: Ref 12-5 The first and third quartiles are, respectively,
78 and 94
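These quartiles follow the textbook convention: the first quartile is the median of the observations below the overall median's position, and the third quartile is the median of those above it. A Python sketch of that convention; the function names are illustrative, and other software may use slightly different quartile rules:

```python
def median(sorted_values):
    """Median of an already sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def quartiles(values):
    """Return (Q1, median, Q3); the overall median is left out of both halves when n is odd."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 == 1 else data[half:]
    return median(lower), median(data), median(upper)

# The 12 exam scores read off the stemplot.
scores = [68, 76, 76, 80, 84, 88, 88, 92, 92, 96, 96, 96]
q1, med, q3 = quartiles(scores)
print(q1, med, q3)  # 78.0 88.0 94.0
```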
Scores on the SAT exams have approximately a normal distribution with mean 500 and standard deviation 100. Julie scores 400 on the Math SAT. What percent of scores are higher than Julie's?
84%
The length of pregnancy isn't always the same. In pigs, the length of pregnancies varies according to a normal distribution with mean 114 days and standard deviation five days. Reference: Ref 13-2 What percent of pig pregnancies are longer than 109 days?
84%
The following is a stemplot of 12 exam scores. (The stem is tens and the leaf is units.): 6 | 8 7 | 66 8 | 0488 9 | 22666 Reference: Ref 12-5 The median is
88
If the least-squares regression line for predicting y from x is y = 40 + 10x, what is the predicted value of y when x = 5?
90
A study gathers data on the outside temperature during the winter, in degrees Fahrenheit, and the amount of natural gas a household consumes, in cubic feet per day. Call the temperature x and gas consumption y. The house is heated with gas, so x helps explain y. The least-squares regression line for predicting y from x is y = 1360 - 20x. Reference: Ref 15-5 On a day when the temperature is 20°F, the regression line predicts that gas used will be about
960 cubic feet
For a normal distribution with mean 20 and standard deviation 5, approximately what percent of the observations will be between 5 and 35?
99.7%
When Julie entered college in 2003, she dreamed of making $50,000 when she graduated. The CPI in 2003 was 184. Julie graduated in 2008. (Thinking about money rather than studies slowed her progress a bit.) The June 2008 CPI was 218.8. What must Julie earn in order to have the same buying power that $50,000 had in 2003?
about $59,500
Athletes make more now, but prices are also higher than in the past. In 2003, the basketball player LeBron James signed a contract for $90 million with Nike. How much is this in 1975 dollars? (The CPI was 53.8 in 1975 and was 184 in 2003.)
about $26 million
Scores of adults aged 60 to 64 on a common IQ test are approximately normally distributed with mean 90 and standard deviation 15. Reference: Ref 13-9 The third quartile of the distribution of IQ scores of adults aged 60 to 64 is
between 90 and 105
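The third quartile is the 75th percentile, so it lies above the mean (the 50th percentile) but below one standard deviation above the mean (roughly the 84th percentile). As a check, `statistics.NormalDist.inv_cdf` from the Python standard library gives the value directly:

```python
from statistics import NormalDist

# IQ scores of adults aged 60 to 64: mean 90, standard deviation 15 (from the question).
iq = NormalDist(mu=90, sigma=15)

# Third quartile = the score with 75% of the distribution below it.
q3 = iq.inv_cdf(0.75)
print(round(q3, 1))  # 100.1
```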
Which of these statements is true of the correlation r?
both B and C
The main advantage of boxplots over stemplots and histograms is
boxplots make it easy to compare several distributions, as in this example
The numerical value of a correlation coefficient
can be any number between -1 and 1.
A high correlation between two variables does not always mean that changes in one cause changes in the other. The best way to get good evidence that cause-and-effect is present is to
carry out a randomized comparative experiment
A study of new cars finds that the correlation between the weight of cars (pounds) and their city gas mileage (miles per gallon) is r = -0.4. This tells us that
heavier cars tend to get fewer miles per gallon
The Consumer Price Index (CPI) somewhat overstates the rise in prices over time. One reason for this is
many products improve in quality over time, so higher prices are partly paying for better quality.
An outlier will usually have a large effect on the
mean
Fifty percent of the observations will be between
the quartiles.
The box in each boxplot marks
the range covered by the middle half of the data.
The box in the center of a boxplot marks
the range covered by the middle half of the data.
The standard deviation is a measure of
the spread or variability of a distribution
The correlation between two variables is -0.8. We can conclude
there is a strong negative association between the two variables.
The correlation between the heights of fathers and the heights of their (adult) sons is r = 0.52. Reference: Ref 14-4 If fathers' heights were measured in feet (one foot equals 12 inches), and sons' heights were measured in furlongs (one furlong equals 7,920 inches), the correlation between heights of fathers and heights of sons would be
unchanged: equal to 0.52.
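Correlation is computed from standardized values, so rescaling either variable by a positive constant leaves it unchanged. The Python sketch below illustrates this with made-up heights; the data are hypothetical and will not reproduce r = 0.52, only the invariance:

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation r between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical heights in inches, invented for illustration only.
fathers_in = [65, 67, 68, 70, 72, 74]
sons_in = [68, 66, 70, 69, 73, 71]
r_inches = correlation(fathers_in, sons_in)

# Same heights in feet (fathers) and furlongs (sons).
fathers_ft = [h / 12 for h in fathers_in]
sons_furlongs = [h / 7920 for h in sons_in]
r_rescaled = correlation(fathers_ft, sons_furlongs)

print(abs(r_inches - r_rescaled) < 1e-9)  # True
```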
Having taken statistics, you know that a graph that shows how the cost of attending your school has increased since 1980 should show the cost in real terms. To do this, you will
use the CPI to adjust each year's cost for changes in the buying power of a dollar.
You are told that your score on an exam is at the 85th percentile of the distribution of scores. This means that
your score was equal to or higher than the scores of approximately 85% of the people who took this exam
Until the scale was changed in 1995, SAT scores were based on a scale set many years ago. For math scores, the mean under the old scale in the 1990s was about 470 and the standard deviation was about 110. What is the standard score of someone who scored 500 on the old SAT?
z=0.27
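A standard score is the observation minus the mean, divided by the standard deviation. A minimal Python sketch; `standard_score` is an illustrative name:

```python
def standard_score(x, mean, sd):
    """z = (x - mean) / sd: how many standard deviations x lies from the mean."""
    return (x - mean) / sd

# A score of 500 on the old SAT math scale (mean 470, standard deviation 110).
z = standard_score(500, mean=470, sd=110)
print(round(z, 2))  # 0.27
```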
Scores on the Scholastic Assessment Test are reported on a scale that yields a normal distribution with mean 500 and standard deviation 100. Julie scores 600 on the SAT. Her standard score is
z = (600 - 500)/100 = 1
Tall men tend to marry women who are taller than average, but the degree of association between the height of a husband and the height of his wife isn't very big. The correlation between heights of husbands and wives that best describes this situation is
0.3
We want to use scores on Exam 1 to predict final total score in a course. Last semester, students with higher Exam 1 scores did tend to get higher total scores. But regressing total score on Exam 1 score explained only 36% of the variation in total score. What is the correlation between Exam 1 scores and total scores?
0.60
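The percent explained by the regression is the square of the correlation, so r is the square root of 0.36, and its sign is positive because higher Exam 1 scores go with higher totals. A short Python sketch; the function and its `positive_association` flag are hypothetical:

```python
from math import sqrt

def correlation_from_r_squared(r_squared, positive_association=True):
    """Recover r from r^2; the sign has to come from the direction of the association."""
    r = sqrt(r_squared)
    return r if positive_association else -r

# 36% explained, positive association between Exam 1 and total score.
r = correlation_from_r_squared(0.36)
print(round(r, 2))  # 0.6
```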
Which correlation indicates a strong positive straight-line relationship?
0.99
A correlation cannot have the value
1.5
The calorie counts for the 17 poultry brands are: 86 87 94 99 102 102 106 113 129 132 135 142 143 144 146 152 170 The first quartile of the 17 poultry hot dog calorie counts is
100.5
The length of pregnancy isn't always the same. In pigs, the length of pregnancies varies according to a normal distribution with mean 114 days and standard deviation five days. Reference: Ref 13-2 What range covers the middle 95% of pig pregnancies?
104 to 124 days
The calorie counts for the 17 poultry brands are: 86 87 94 99 102 102 106 113 129 132 135 142 143 144 146 152 170 The median of these values is
129
If there were something genetic which made people simultaneously more susceptible to both smoking and lung cancer, that would be an instance of
Common response
Which of the following is least likely to have a nearly normal distribution?
Family incomes of all students taking STAT 001 at State Tech.
A study gathers data on the outside temperature during the winter, in degrees Fahrenheit, and the amount of natural gas a household consumes, in cubic feet per day. Call the temperature x and gas consumption y. The house is heated with gas, so x helps explain y. The least-squares regression line for predicting y from x is y = 1360 - 20x. Reference: Ref 15-5 When the temperature goes up 1°, what happens to the gas usage predicted by the regression line?
It goes down 20 cubic feet.
The federal minimum wage was $6.55 an hour after it was increased in 2008. In 1980, the minimum wage was $3.25 an hour. The CPI (1982-84 = 100) was 82.4 in 1980 and was 218.8 in mid-2008. Which of these is true?
The 1980 minimum wage is about $8.63 in 2008 dollars, so the minimum wage has gone down in real terms
A study of the effects of television measured how many hours of television each of 125 grade school children watched per week during a school year and their reading scores. Which variable would you put on the horizontal axis of a scatterplot of the data?
Hours of television, because it is the explanatory variable.
A normal distribution is always
symmetric
The correlation between height (in inches) and weight (in pounds) among first-grade students at Happy Hollow Elementary School is exactly 0.57. If heights are converted to centimeters and weights are converted to kilograms, what happens to the correlation between height and weight among the first-graders? (1 inch = 2.54 cm; 1 pound = 0.454 kg.)
The correlation is still 0.57.
The most recent value of the Consumer Price Index (1982-84 = 100) is 218.8. This means that
a market basket of goods and services that cost $100 in 1982 to 1984 now costs $218.80
If you know the mean and standard deviation of a distribution, do you know the complete shape of the distribution?
Yes if the distribution is normal, but not in general.
A study found correlation r = -0.43 between how many cigarettes a person smokes and how overweight the person is. You conclude that
people who smoke more tend to be less overweight.
If the Consumer Price Index (1982-84 = 100) is 218.8, this means that
prices have increased 118.8% since the base period, so that it now costs $218.80 to buy goods and services that cost $100 in 1982-84.
What can we say about the relationship between a correlation r and the slope b of the least-squares line for the same set of data?
r and b always have the same sign (+ or -).
Below is a graph of the percent of adults in each state who were obese in 1991 and the percent who were obese in 1998: Reference: Ref 14-8 Which of these is a reasonable value of the correlation r for the data in this graph?
r=0.7
When dealing with financial data (such as salaries or lawsuit settlements), we often find that the shape of the distribution is _____________. When the distribution has this shape, the _____________ is pulled toward the long tail of the distribution, but the ____________ is less affected. The sequence of words to correctly complete this statement is:
right skewed; mean; median
The possible values of the standard deviation s of a set of observations are
s can be 0 or positive, but not negative.
Below is a graph of the percent of adults in each state who were obese in 1991 and the percent who were obese in 1998: Reference: Ref 14-8 This type of graph is called a
scatterplot
A study found that SAT Verbal scores were positively associated with first-year grade point averages for liberal arts majors. We can conclude from this that
students who scored high on the SAT Verbal test tended to get higher GPAs than those who scored lower on the SAT Verbal test.
The correlation between the heights of fathers and the heights of their (adult) sons is r = 0.52. Reference: Ref 14-4 This tells us that
taller than average fathers tend to have taller than average sons.
The standard deviation should not be used to measure spread when
the distribution is skewed.
In a scatterplot we can see
the form, direction, and strength of a relationship between two quantitative variables.
For a distribution that is skewed to the left, usually
the median will be larger than the mean.
The five numbers in the five-number summary are
the median, the quartiles, the minimum, and the maximum.