Econ 261 Chapters 1-2
Which type of plot is most commonly used to display the distribution of a single quantitative variable?
Histogram
When should you consider rounding the data before making a stemplot?
If the stemplot would have very many stems, but most of the stems would have no leaves, or just one leaf.
The histograms give the distribution of age for a sample of married and unmarried college students.
The age distribution of the married students has a larger/older center.
A researcher is looking at trends in smoking over time. She has a data set showing the smoking rates for British males each year over a 25-year period in England. The histogram gives the distribution of the percent of British males who smoked for each year in the data set. For example, the first bar in the histogram goes from 26 to 27 and has a height of 1, indicating that there was 1 year in which the percent of males who smoked was between 26% and 27%. The bar for 28 to 29 has a height of 2, indicating there were two years in which the percent of males who smoked was between 28% and 29%. Why is this particular display not a good choice? Select the BEST reason.
We have too many classes, making it difficult to determine the shape of the distribution.
Airport administrators take a sample of airline baggage and record the weight of each bag in order to estimate the number of bags that weigh more than 75 pounds. What is the variable of interest?
Weight of the bags
The histogram gives the distribution of height for a sample of college students. Does this distribution contain a clear outlier?
Yes, the data value falling between 85 and 90 inches is a clear outlier.
A professor collects information on students in her statistics course at Whatsamatta U. She records the students' majors, the number of books they read last semester, and whether or not they owned a pet. In this context, "student's major" is _______________.
a categorical variable
Which graphs are appropriate for displaying the distribution of a categorical variable?
a pie chart or a bar graph
Compare female and male thigh girths displayed in the following side-by-side boxplots. Typically female thigh girths are _______________ male thigh girths.
about the same as
The standard deviation ranges from ______ to +∞.
0
The following pie chart shows the result of a survey of students at a large state university, who were asked if they preferred pizza (red) or tacos (blue) for a snack. The pie chart shows the percentages of students who answered that they preferred pizza or preferred tacos. According to the pie chart, what is a plausible count of students who answered in each category?
1,000 total students, where 800 students preferred tacos and 200 students preferred pizza
A shopper at a local supermarket spent the following amounts in his last eight trips to the store: $30.80, $27.34, $28.34, $75.58, $36.33, $33.51, $27.41, $22.94. The value of the median is _________________. Give your answer to two decimal places as $XX.XX.
$29.57. In sorted order, the two middle values are $28.34 and $30.80, and their average is $29.57.
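As a quick check of the median card above, here is a minimal Python sketch using the standard library's statistics.median. The variable names are illustrative and not from the course materials.

```python
from statistics import median

# Amounts from the shopper's last eight trips, in dollars.
amounts = [30.80, 27.34, 28.34, 75.58, 36.33, 33.51, 27.41, 22.94]

ordered = sorted(amounts)

# With an even count (n = 8), the median is the average of the
# 4th and 5th ordered values (indexes 3 and 4).
middle_pair = (ordered[3], ordered[4])
shopper_median = median(amounts)

print(middle_pair)
print(round(shopper_median, 2))
```

Sorting first makes it easy to see which two values are averaged, and that the $75.58 trip does not pull the median upward.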
A statistics class has 245 students. To find the median score on the first midterm, you should first order the exam scores from smallest to largest and then find the score in the ___rd position. Give your answer as a whole number.
123
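The position rule used in this card, (n + 1)/2 for an odd number of ordered values, can be sketched in Python as follows. The function name median_position is a hypothetical helper, not something from the course.

```python
def median_position(n: int) -> int:
    """Return the 1-based position of the median among n ordered values (odd n only)."""
    if n < 1:
        raise ValueError("n must be positive")
    if n % 2 == 0:
        # For even n the median is the average of positions n/2 and n/2 + 1.
        raise ValueError("even n has no single middle position")
    return (n + 1) // 2


print(median_position(245))  # 123
```

The same rule gives the positions asked for in the other "which position" cards in this set, for example n = 87 and n = 37.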
The time (in minutes) that students in an introductory statistics class spent taking their final is summarized as follows (note: this final had no time limit): Min = 61, Q1 = 86, Median = 101, Q3 = 128.5, Max = 210. According to the IQR rule of thumb for detecting outliers, the upper bound is computed as (fill in the missing blank): 128.5 + 1.5 × (_____ - 86)
128.5
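To make the 1.5 × IQR rule concrete, the sketch below plugs in the five-number summary from this card. The variable names are illustrative. The lower bound is computed by the same rule even though the card only asks about the upper bound.

```python
# Five-number summary from the card (minutes spent on the final).
minimum, q1, med, q3, maximum = 61.0, 86.0, 101.0, 128.5, 210.0

iqr = q3 - q1                    # 128.5 - 86 = 42.5
upper_bound = q3 + 1.5 * iqr     # 128.5 + 63.75 = 192.25
lower_bound = q1 - 1.5 * iqr     # 86 - 63.75 = 22.25

# Values beyond the bounds are flagged as suspected outliers.
max_is_suspected_outlier = maximum > upper_bound
min_is_suspected_outlier = minimum < lower_bound

print(upper_bound, lower_bound)
print(max_is_suspected_outlier, min_is_suspected_outlier)  # True False
```

Under this rule the maximum of 210 minutes lies above the upper bound, so it would be flagged, while the minimum of 61 minutes would not.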
What is the range of the distribution shown by the stemplot?
18 to 112. The smallest value in this data set is 18 and the largest value is 112. We can report these two values to summarize the spread of the data set.
You are constructing a stemplot for a data set. For this data set, you do not need to round the data, or split the stems in the plot. One of the data values is 185. What numbers should you use for the stem and leaf of this data value?
18 would be the stem, 5 would be the leaf.
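When no rounding or stem splitting is needed, the leaf is the final digit and the stem is everything in front of it. A minimal Python sketch, assuming non-negative whole-number data (the helper name stem_and_leaf is hypothetical):

```python
def stem_and_leaf(value: int) -> tuple[int, int]:
    """Split a non-negative whole number into (stem, leaf), where the leaf is the last digit."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    return divmod(value, 10)


print(stem_and_leaf(185))  # (18, 5)
```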
___________% of the data are less than the first quartile.
25
The histogram gives the distribution of the number of children that students in an introductory statistics class had. There were 56 students in the class. The number of children are the values on the x-axis.
25%
The given histogram gives the distribution of money spent on textbooks by students in an introductory statistics course. There are 56 students in the class.
33/56
A college basketball player scored the following points in the 37 games he played his senior year: Stem-and-leaf of points, n = 37, leaf unit = 1.0: 1 | 33; 1 | 66; 2 | 1233444; 2 | 5555666789; 3 | 0222233444; 3 | 89; 4 | 23; 4 | 7; 5 | 2. The median of his point totals is found in the ____________ place.
38/2 = 19th. With n = 37 games, the median is in position (n + 1)/2 = 38/2 = 19 of the ordered list.
A professional basketball player played in 87 games last season, including the playoffs. For each game, the number of points scored was recorded. To find his median score for last season, you should list the scores in order from smallest to largest. The median score will be the score in the __th position. Note: Your answer should be a number.
44
If a boxplot has a wide central box, ________________% of the data falls between the quartiles Q1 and Q3.
50. The central box spans the middle 50% of the data regardless of how wide the box is.
Scores on the first midterm in a statistics course for 38 students are summarized numerically as follows: Approximately what percentage of the data are between 83.7% and 95.6%?
50%
The third quartile is larger than _____% of the data.
75
The value of the interquartile range is _________________. Give your answer to one decimal place as X.X.
9.5. The interquartile range is Q3 - Q1.
Approximately twenty-five percent of the data are greater than _____%.
95.6
Students in an introductory statistics course were asked what county they were living in. The following responses were given:
Add the percentages for the first four counties and then subtract from 100 percent to get the percentage of students who live in Other counties. Then draw a circle with slices whose sizes in the circle correspond to the percentages in the table.
Which numerical summaries should be used to describe the data from a distribution that is strongly skewed to the left?
Five-number summary. The five-number summary is better than the mean and standard deviation for describing a skewed distribution. The mean and standard deviation work best when the data are symmetric, with no outliers.
The following pie chart shows the result of a survey of students at a large state university, who were asked if they preferred pizza (red) or tacos (blue) for a snack. According to the pie chart, the percentage of students who said they liked both pizza and tacos equally is about __________.
Cannot be determined by this pie chart.
The range is a measure obtained by subtracting the minimum value in the data set from the maximum value. The following histogram gives the distribution of salaries of National League baseball players in $1000s from the year 2001. What is the range for this set of data?
Cannot be determined.
What does the mean measure?
Center
Which of the following stemplots is/are an accurate way to display the shape for the following data?
D. Both A and B are equally accurate.
Which data set has the largest standard deviation?
Dataset B. Because Dataset B has more data in both tails, it has the largest standard deviation.
Airport administrators take a sample of airline baggage and record the number of bags that weigh more than 75 pounds. What is the individual?
Each piece of baggage
Which numerical summaries should be used to describe the data in the following histogram?
Five-number summary
Which numerical summaries should be used to describe the data in the following histogram?
Five-number summary. The distribution is skewed to the right. The five-number summary is better than the mean and standard deviation for describing a skewed distribution.
Which of the following numerical summaries will be most affected by an outlier?
Mean
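One way to see why the mean is the summary most affected by an outlier is to compare it with the median with and without an extreme value. The sketch below reuses the supermarket amounts from the median card in this set and treats the $75.58 trip as the outlier. That choice of data is an illustration, not part of this card.

```python
from statistics import mean, median

# Supermarket amounts from the earlier median card; 75.58 is the unusually large trip.
with_outlier = [30.80, 27.34, 28.34, 75.58, 36.33, 33.51, 27.41, 22.94]
without_outlier = [x for x in with_outlier if x != 75.58]

# How far each summary moves when the outlier is included.
mean_shift = mean(with_outlier) - mean(without_outlier)
median_shift = median(with_outlier) - median(without_outlier)

print(round(mean_shift, 2), round(median_shift, 2))
```

With these values the single large trip moves the mean several dollars but moves the median by only a little over one dollar, which is what "resistant" means for the median.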
Scores on the first midterm in a statistics course for 38 students are displayed in the following boxplot:
Min = 43, Q1 = 112, Median = 118, Q3 = 129, Max = 137
Would the variable "amount of rainfall in Michigan," measured in inches, be considered a categorical or quantitative variable?
Quantitative
The following data give weights in pounds for a random sample of automobiles: 4000, 3230, 3160, 3890, 3731, 2680, 3781, 2860, 2980, 2979. What, if anything, should we do to this data before making a stemplot, if we want the stemplot to have more than 5 stems, but fewer than 100 stems?
Round to the nearest 10 pounds.
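The choice of rounding can be checked by counting how many stems each option would produce. The sketch below is one way to do that, assuming the leaf is always a single digit and that empty stems between the smallest and largest stem are still drawn. The helper stem_count is hypothetical.

```python
weights = [4000, 3230, 3160, 3890, 3731, 2680, 3781, 2860, 2980, 2979]


def stem_count(values, leaf_unit):
    """Count stems from the smallest to the largest (empty ones included)
    when the leaf is the digit in the leaf_unit place."""
    stems = [v // (leaf_unit * 10) for v in values]
    return max(stems) - min(stems) + 1


# No rounding: the leaf is the ones digit, the stem is everything else.
no_rounding = stem_count(weights, 1)
# Round to the nearest 10: the leaf is the tens digit.
# (Python's round() breaks ties to even, but this data has no ties.)
nearest_10 = stem_count([round(w, -1) for w in weights], 10)
# Round to the nearest 100: the leaf is the hundreds digit.
nearest_100 = stem_count([round(w, -2) for w in weights], 100)

print(no_rounding, nearest_10, nearest_100)  # 133 15 3
```

Only rounding to the nearest 10 pounds lands between 5 and 100 stems, which matches the answer on this card.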
Scores on the first midterm in a statistics course for 38 students are given in the following boxplot: What is the shape of the distribution of midterm scores?
Skewed to the left. The longer whisker is on the side with the lower numbers.
The time plot gives the average salary (in $1000s) for Major League Baseball players from 1985 to 2010. Interpret the plot.
The average baseball player's salary shows an upward trend over the 25-year period.
There are eight boys in a pre-school class. Their mean height is 33 inches and their median height is 33 inches. The tallest boy whose height is 38 inches moves away and is replaced by a boy whose height is 39 inches. How does this affect the mean?
The mean will increase. The tallest boy is replaced by a boy who is one inch taller, so the total of the eight heights rises by 1 inch and the mean rises by 1/8 inch.
There are eight boys in a pre-school class. Their mean height is 33 inches and their median height is 33 inches. The tallest boy whose height is 38 inches moves away and is replaced by a boy whose height is 39 inches. How does this affect the median?
The median will not change. Only the largest value changed, so the middle values of the ordered list stay the same.
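The two pre-school cards above can be checked with a small example. The individual heights below are hypothetical, chosen only so that the eight values have mean 33, median 33, and a tallest boy at 38 inches as the cards state.

```python
from statistics import mean, median

# Hypothetical heights (inches): mean 33, median 33, tallest boy 38.
before = [29, 30, 32, 33, 33, 34, 35, 38]
# The 38-inch boy moves away and a 39-inch boy joins the class.
after = before[:-1] + [39]

mean_before, median_before = mean(before), median(before)
mean_after, median_after = mean(after), median(after)

print(mean_before, median_before)
print(mean_after, median_after)
```

The mean rises by 1/8 inch because the total rises by one inch across eight boys, while the two middle values of the ordered list, and so the median, are untouched.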
The time plot shows the percent of British males who smoked for the years 1974 - 1998. Which of the statements is a correct interpretation of the plot?
The percent of men who smoke shows a downward trend, with a lower percentage of males smoking in the late 1990s than in the 1970s.
The horizontal axis of a histogram can display individual numbers or the data may be grouped in _________, or intervals in which data values occur.
classes
A chart or graph that shows the values of a variable and how often it takes these values is displaying the variable's ________.
distribution
If you were interested in studying the selling prices of homes in Westchester County, NY, where most homes are between $500,000-$800,000 and there are few that are above $1,000,000, you should use the _____ to describe the center. Hint: Consider whether or not there would be any outliers.
median. Because the median is resistant to the outliers created by a few very expensive homes, it is a better choice than the mean.
The distributions of the points scored per game for two professional basketball players during the 2010-2011 season are displayed in the following side-by-side boxplots. Typically, LeBron James scored ________ Dirk Nowitzki.
more than. The boxplot of the points scored by LeBron James is generally higher than the boxplot of points scored by Dirk Nowitzki.
You collect data on the number of children for each woman in a random sample of 100 adult women in the United States. Think about the number of children most women typically have (is the average more likely 2-3 children or 8-9 children?). Picture or sketch out the distribution of children. The distribution will most likely have what shape?
skewed to the right
The mean and median are equal when the distribution is ______.
symmetric
In a _____ plot, connecting the data points helps to emphasize any changes in the data over time.
time
The following graph is a:
time plot.
The mean is a measure of center, whereas the standard deviation measures the ____________ of data about the mean. Define sample variance/standard deviation.
variability
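For the "define sample variance/standard deviation" part of this card, the usual definitions are: the sample variance is the sum of squared deviations from the mean divided by n - 1, and the sample standard deviation is its square root. The sketch below computes both by hand and compares them with Python's statistics module. The data values are hypothetical.

```python
from math import sqrt
from statistics import stdev, variance

# Hypothetical data, used only to illustrate the formulas.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
x_bar = sum(data) / n                                  # sample mean
squared_deviations = [(x - x_bar) ** 2 for x in data]  # (x_i - x_bar)^2

# Sample variance divides by n - 1, not n.
sample_variance = sum(squared_deviations) / (n - 1)
# Sample standard deviation is the square root of the sample variance.
sample_sd = sqrt(sample_variance)

print(sample_variance, variance(data))
print(sample_sd, stdev(data))
```

Because every squared deviation is non-negative, the standard deviation can never be below 0, and it equals 0 only when all the values are identical.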
The standard deviation measures the ________ of a distribution.
variability