STATS Chapter 2
What are Pareto charts?
A Pareto chart is a bar chart that is sorted from most frequent to least frequent.
A ____________________ is a bar graph in which the bars are drawn in decreasing order of frequency or relative frequency.
A Pareto chart is a bar graph that has all the bars in order of magnitude, starting with the largest.
Explain the difference between a bar graph and a Pareto chart.
A Pareto chart is a particular type of bar graph: its bars are drawn in decreasing order of frequency (and therefore of height) to emphasize the most important categories. An ordinary bar graph may list its categories in any order.
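A minimal sketch of the ordering that separates a Pareto chart from an ordinary bar graph. The complaint categories below are made-up illustrative data, not part of the chapter; the only point is that the bars are sorted by count before they are drawn.

```python
from collections import Counter

# Hypothetical categorical sample (illustrative data only).
responses = ["late", "damaged", "late", "wrong item", "late", "damaged"]

# Counter.most_common() returns (category, count) pairs sorted from
# most frequent to least frequent, which is the left-to-right bar
# order of a Pareto chart.
pareto_order = Counter(responses).most_common()

# Crude text rendering of the bars, tallest first.
for category, count in pareto_order:
    print(f"{category:<12}{'#' * count} ({count})")
```

An ordinary bar graph of the same data could list the categories alphabetically or in any other order; only the sort step changes.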
A distribution of a variable in which most of the values are relatively small but that also has a few very large values is called ________.
Right-skewed. A right-skewed distribution has mostly values that are relatively small but also has a few very large values. As a result, the graph of the distribution appears to have a tail that extends to the right.
Changing the width of bins in a histogram _______.
Changing the width of the bins in a histogram changes the shape of the histogram. Narrow bins will result in a spiky histogram that shows a lot of detail, while wider bins will hide more detail.
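The effect of bin width can be checked by hand with a small sketch. The data values and the helper name `bin_counts` are hypothetical, chosen only to show how the same numbers give a detailed shape with narrow bins and a blunt one with wide bins.

```python
def bin_counts(values, width):
    """Count how many values fall in each bin [start, start + width)."""
    counts = {}
    for v in values:
        start = (v // width) * width  # left edge of the bin containing v
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))

# Hypothetical numerical sample (illustrative data only).
data = [1, 2, 2, 3, 7, 8, 8, 9, 12, 13]

narrow = bin_counts(data, 2)   # many bins: more detail, spikier shape
wide = bin_counts(data, 10)    # few bins: detail is hidden

print(narrow)
print(wide)
```

With width 2 the counts rise and fall several times; with width 10 everything collapses into two bars.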
When thinking about the variability of a categorical distribution, it is sometimes useful to think of the word _______.
Diversity. If the distribution of a categorical variable has a lot of diversity, or many observations in many different categories, then the variability is high. If the distribution has only a little diversity, or many of the observations fall into the same category, then the variability is low.
What would be a better type of graph for displaying these data? Select all that apply. Why is this pie chart hard to interpret?
Histogram and dot plot. There are so many possible numerical values that the pie chart has too many "slices", which makes it difficult to tell which is which.
When examining the shape of a distribution of numerical data, which of the following is not one of the three basic characteristics of a distribution's shape?
How many numbers are in the data set.
Suppose you have the following bar graph showing foreclosure rates for a few select states. If you want to make the point that the rates are all very similar, how would you change the graph?
If you show a wider range on the y-axis, it will make the differences in the bar heights seem less noticeable. To deemphasize the differences in foreclosure rates, change the scale on the y-axis to cover a larger range, 0-20 percent, for example.
Suppose you want to know if more technical service calls are made to homes with cable television or with satellite dish television. Should you use frequencies or relative frequencies to make the comparison? Why?
Relative frequencies should be used since there is likely a difference in the number of users of cable and satellite television. If you make comparisons using frequencies, the results can be very misleading for different population sizes. Remember that proportions (or percents) are better for comparing populations of different sizes.
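A small worked example of why raw frequencies can mislead here. All four numbers are invented for illustration; the chapter gives no actual counts.

```python
# Hypothetical counts (illustrative only, not from the chapter).
cable_calls, cable_homes = 300, 6000
satellite_calls, satellite_homes = 200, 2000

# Relative frequency: service calls per home in each group.
cable_rate = cable_calls / cable_homes
satellite_rate = satellite_calls / satellite_homes

print(f"cable:     {cable_calls} calls, rate {cable_rate:.2%}")
print(f"satellite: {satellite_calls} calls, rate {satellite_rate:.2%}")
```

Under these assumed numbers, cable homes generate more calls in total, yet satellite homes call at twice the rate. The frequency comparison and the relative-frequency comparison point in opposite directions.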
The existence of multiple mounds in a distribution is sometimes a sign of which of the following?
Sometimes, the presence of multiple mounds can indicate that there have been multiple groups combined into a single sample. For example, if a group that has a right-skewed distribution were combined with a group that has a left-skewed distribution, the result would be a sample that has two mounds.
Suppose you construct a graph to compare the student populations of the five largest high schools in your city and choose to depict the populations with school buildings of various sizes. If the school buildings are drawn so that the length and the width are each in proportion to the population of the corresponding schools, is the resulting graph misleading? Why or why not?
Yes, the graph is misleading. If one school's population is twice that of another school, the school building representing it will be twice as long and twice as wide, so it will cover four times the area. This will give a misleading impression to the viewer.
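The arithmetic behind the "four times" claim, as a short sketch. The function name `drawn_area` and the unit dimensions are assumptions made for illustration.

```python
def drawn_area(length, width):
    """Area of the rectangle the viewer actually sees on the page."""
    return length * width

# Baseline school drawn as a 1 x 1 picture (arbitrary units).
small = drawn_area(1.0, 1.0)

# A school with twice the population: length AND width are both doubled.
large = drawn_area(2.0, 2.0)

ratio = large / small
print(ratio)
```

Scaling both dimensions by a factor k multiplies the area by k squared, so the picture overstates the difference between the two populations.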
A teacher asks 90 students who drive how many speeding tickets they received in the last year. Predict the shape of the distribution and explain.
The distribution will be right-skewed. Most people will have no tickets, but there will be a few people with 1, 2, 3, or more tickets.
What is the first step in almost every investigation of data?
The first step in almost every investigation of data is to make an appropriate graph.
Which of the statements below is true concerning bar graphs?
The height of each bar represents the category's frequency or relative frequency. Bar graphs are a visual representation of either a frequency or relative frequency distribution.
What is the most common trick to mislead readers of bar graphs?
By changing the vertical axis so that it does not start at 0, minor differences in the heights of the bars can be exaggerated to look very significant.
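A rough way to quantify the exaggeration is to compare the drawn bar heights with and without a truncated axis. The values 52 and 50, the axis start of 48, and the helper `visual_ratio` are all hypothetical.

```python
def visual_ratio(a, b, axis_start):
    """Ratio of the drawn heights of two bars when the vertical
    axis begins at axis_start instead of 0."""
    return (a - axis_start) / (b - axis_start)

# Two hypothetical values that differ by only 4 percent.
honest = visual_ratio(52, 50, 0)      # axis starts at 0
truncated = visual_ratio(52, 50, 48)  # axis starts at 48

print(f"axis from 0:  first bar is {honest:.2f}x as tall")
print(f"axis from 48: first bar is {truncated:.2f}x as tall")
```

With the axis starting at 0 the bars look nearly equal; starting it at 48 makes one bar appear twice as tall as the other, even though the data have not changed.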
All methods used for visualizing distributions are based on which of the following?
When visualizing a distribution, the idea is to make some sort of mark that indicates how many times each value occurs in the data set.
What are two commonly used graphs to display the distribution of a sample of categorical data?
Two commonly used graphs to display the distribution of a sample of categorical data are bar charts and pie charts.
Suppose that you have data which indicates that 90% of adults in a nearby town have cell phones. Of those who have cell phones, 30% use Carrier A, 30% use Carrier B, 10% use Carrier C, 20% use Carrier D, 5% use Carrier E, and 5% use other carriers. Would a bar graph or pie chart be better if the goal is to compare Carrier B and Carrier C? Explain.
A bar graph would be better since you are trying to compare two parts, not a part to the whole. The angles might be difficult to judge on a pie chart, making it hard to directly compare two sectors. Recall that pie charts are good for comparing parts to the whole, while bar graphs are good for comparing two specific parts.
A categorical variable is only called bimodal under what circumstances?
A categorical variable is only called bimodal if two categories are nearly tied for the most frequent outcome.
"Relative frequency" is the same as which of the following?
Proportion: A relative frequency is the proportion of the observations that exhibit the relevant characteristic.
After constructing any relative frequency distribution, what should be the sum of the relative frequencies?
Relative frequencies give the percentage or proportion of data values in each class. If one looks at the total of all the relative frequencies, it should be 1 (or 100%) since the classes together cover all of the data.
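A quick check of this property on a made-up sample. The category labels are placeholders; `math.isclose` is used because floating-point division may not sum to exactly 1.

```python
from collections import Counter
import math

# Hypothetical categorical sample (illustrative data only).
sample = ["A", "B", "A", "C", "A", "B"]

counts = Counter(sample)   # frequency of each category
n = len(sample)            # total number of observations

# Relative frequency = frequency / total number of observations.
relative = {category: count / n for category, count in counts.items()}
total = sum(relative.values())

print(relative)
print(math.isclose(total, 1.0))
```

Because every observation lands in exactly one category, the frequencies add up to n, and dividing each by n makes the relative frequencies add up to 1.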
What are the 3 basic characteristics of a distribution's shape?
The three basic characteristics of a distribution's shape are whether the distribution is symmetric or skewed, the number of mounds that appear in the distribution, and whether any unusually large or small values are present.
The histogram shows frequencies for the ages of 25 randomly selected CEOs. Approximately what is a typical age of a CEO in this sample?
The typical age of a CEO in this sample is between 56 and 60 years old.
True or false? A histogram and a relative frequency histogram, constructed from the same data, always have the same basic shape.
True. A relative frequency histogram will have a different scale on the y-axis but the same shape as a regular histogram. Since a relative frequency histogram uses relative frequencies on the y-axis, the scale on the y-axis will be labeled in percents (or proportions) instead of simple frequencies. This does not affect the shape of the graph though.
When summarizing graphs of categorical data, report the _______ and describe the _______.
Mode(s); variability. When describing a distribution of categorical data, the mode should be mentioned and something should be said about the variability.