Chapter 2
Modes
A peak, or high point, of a histogram is referred to as a *mode*.
Frequency Distribution
A table that presents the frequency for each category.
Relative Frequency Distribution
A table that presents the relative frequency of each category. Often the frequency is presented as well.
Histogram
-Constructed by drawing a rectangle for each class. -The heights of the rectangles are equal to the frequencies or the relative frequencies, and the widths are equal to the class width
Requirements for Choosing Classes
-Every observation must fall into one of the classes. -The classes must not overlap. -The classes must be of equal width. -There must be no gaps between classes. Even if there are no observations in a class, it must be included in the frequency distribution.
Bimodal Histograms
A histogram is *bimodal* if it has two clearly distinct modes.
Approximately Symmetric Histograms
A histogram is *symmetric* if its right half is a mirror image of its left half.
Unimodal Histograms
A histogram is *unimodal* if it has only one mode
Skewed Right Histograms
A histogram with a long right-hand tail is said to be *skewed to the right*, or *positively skewed*.
Skewed Left Histograms
A histogram with a long left-hand tail is said to be *skewed to the left*, or *negatively skewed*
Small Data Sets
For *small data sets*, it is sometimes useful to have a summary that is more detailed than a histogram, such as a stem-and-leaf plot or a dotplot.
Classes
Intervals of equal width that cover all values that are observed in the data set.
Split Stem-and-Leaf Plot
Sometimes one or two stems contain most of the leaves. When this happens, we often use two or more lines for each stem.
Class Width
The *difference between consecutive lower class limits*. -The class width is the difference between a class's lower limit and the lower limit of the next class, not the difference between the class's own lower and upper limits.
Relative Frequency of a Category
The *frequency of the category divided by the sum of all the frequencies*. -The proportion of observations in a category
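The definition above can be sketched in a few lines of Python. This is only an illustration: the function name `relative_frequencies` and the `colors` data are made up for the example and do not come from the chapter.

```python
from collections import Counter

def relative_frequencies(observations):
    """Return {category: frequency / sum of all frequencies}."""
    counts = Counter(observations)   # frequency of each category
    total = sum(counts.values())     # sum of all the frequencies
    return {category: count / total for category, count in counts.items()}

# Hypothetical data, invented for illustration only.
colors = ["red", "blue", "red", "green", "red", "blue", "red", "green"]
rel = relative_frequencies(colors)
```

Because each relative frequency is a proportion of the whole, the values in `rel` should add up to 1.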
Upper Class Limit
The largest value that can appear in that class. -Right-most number of a class
Frequency of a Category
The number of times the category occurs in the data set.
Lower Class Limit
The smallest value that can appear in that class. -Left-most number of a class
Side-By-Side Bar Graphs
When comparing two bar graphs that have the same categories, both bar graphs are constructed on the same axes, putting bars that correspond to the same category next to each other
Histograms for Discrete Data
When data are discrete, we can construct a frequency distribution in which each possible value of the variable forms a class.
Computing the class width for a given number of classes
*Class width = (Largest Data Value - Lowest Data Value) / Number of Classes*
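As a rough sketch, the formula above translates directly into Python. The data values and the helper name `raw_class_width` are hypothetical. Rounding the result up to a whole number is an assumption added here for convenience, not a step stated in the formula.

```python
import math

def raw_class_width(values, num_classes):
    """(largest data value - lowest data value) / number of classes."""
    return (max(values) - min(values)) / num_classes

# Hypothetical data, invented for illustration only.
data = [12, 18, 23, 27, 31, 36, 40, 45]
raw = raw_class_width(data, 5)   # (45 - 12) / 5

# Assumption: round up to a convenient whole-number width.
rounded = math.ceil(raw)
```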
Pareto Charts
-A bar graph in which the categories are presented in order of frequency or relative frequency, with the largest frequency or relative frequency on the left and the smallest one on the right. -Pareto charts are useful when it is important to see clearly which are the most frequently occurring categories.
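The ordering a Pareto chart relies on can be sketched as follows. Only the sorting step is shown, not the drawing of the bars. The name `pareto_order` and the `defects` data are hypothetical.

```python
from collections import Counter

def pareto_order(observations):
    """Return (category, frequency) pairs, largest frequency first."""
    # Counter.most_common() sorts categories by descending count.
    return Counter(observations).most_common()

# Hypothetical data, invented for illustration only.
defects = ["scratch", "dent", "scratch", "chip", "scratch", "dent"]
ordered = pareto_order(defects)
```

Drawing the bars in the order given by `ordered`, from left to right, puts the most frequently occurring category first.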
Dotplots
-A graph that can be used to give a rough impression of the shape of a data set. -It is useful when the size of the data set is not too large, and when there are some repeated values. -In a dotplot, for each value in the data set a vertical column of dots is drawn, with the number of dots in the column equal to the number of times the value appears in the data set.
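A plain-text version of a dotplot can be sketched like this. It is a minimal illustration that assumes the data are single-digit, non-negative integers so that the columns line up. The function name `text_dotplot` and the `scores` data are hypothetical.

```python
from collections import Counter

def text_dotplot(values):
    """Build a text dotplot: one column of '*' per value, one '*' per occurrence."""
    counts = Counter(values)
    axis = list(range(min(values), max(values) + 1))  # every integer in the range
    tallest = max(counts.values())
    rows = []
    # Draw from the top row down, so the columns grow upward from the axis.
    for level in range(tallest, 0, -1):
        rows.append(" ".join("*" if counts[v] >= level else " " for v in axis))
    rows.append(" ".join(str(v) for v in axis))       # axis labels
    return "\n".join(rows)

# Hypothetical data, invented for illustration only.
scores = [1, 2, 2, 3, 3, 3, 5]
plot = text_dotplot(scores)
print(plot)
```

The height of each column equals the number of times that value appears, so repeated values stand out at a glance.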
Bar Graph
-A graphical representation of a frequency distribution. -A bar graph consists of rectangles of equal width, with one rectangle for each category. -The heights of the rectangles represent the frequencies or relative frequencies of the categories
Pie Charts
-An alternative to the bar graph for displaying relative frequency information. -A pie chart is a circle. -The circle is divided into sectors, one for each category. -The relative sizes of the sectors match the relative frequencies of the categories. -It is customary to label each sector with its relative frequency, expressed as a percentage.
Frequency Distributions for Quantitative Data
-Frequency distributions for quantitative data are just like those for qualitative data, except that the *data are divided into classes rather than categories*. -The classes are *intervals of equal width* that cover all the values that are observed. -For example, we could choose the classes to be 0.00-0.99, 1.00-1.99, and so forth. -We then count the number of observations that fall into each class to obtain the class frequencies.
Stem-and-Leaf Plots
-Illustrate the shape of the data set, while allowing every value in the data set to be seen. -Are a simple way to display small data sets. -In a stem-and-leaf plot, the rightmost digit is the leaf, and the remaining digits form the stem.
Open-Ended Classes
-It is sometimes necessary for the first class to have no lower limit or for the last class to have no upper limit. -Such a class is called *open-ended*. -When a frequency distribution contains an open-ended class, a histogram cannot be drawn.
Time-Series Plots
-May be used when the data consist of values of a variable measured at different points in time. -In a time-series plot, the horizontal axis represents time, and the vertical axis represents the value of the variable we are measuring. -The values of the variable are plotted at each of the times, then the points are connected with straight lines.
Procedure for Constructing a Frequency Distribution for Quantitative Data
-Step 1: Choose a class width. -Step 2: Choose a lower class limit for the first class. This should be a convenient number that is slightly less than the minimum data value. -Step 3: Compute the lower limit for the second class by adding the class width to the lower limit for the first class: *Lower limit for second class = Lower limit for first class + Class width* -Step 4: Compute the lower limits for each of the remaining classes, by adding the class width to the lower limit of the preceding class. Stop when the largest data value is included in a class. -Step 5: Count the number of observations in each class, and construct the frequency distribution
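The five steps above can be sketched in Python as follows. This is one possible reading of the procedure, not a definitive implementation. The function name, its parameters, and the `measurements` data are hypothetical, and the sketch assumes each class runs from its lower limit up to (but not including) the next lower limit.

```python
def frequency_distribution(values, first_lower_limit, class_width):
    """Return a list of (lower class limit, frequency) pairs."""
    if first_lower_limit > min(values):
        raise ValueError("first lower limit must not exceed the minimum value")

    # Steps 3 and 4: keep adding the class width until the largest
    # data value falls inside the last class.
    lower_limits = [first_lower_limit]
    while lower_limits[-1] + class_width <= max(values):
        lower_limits.append(lower_limits[-1] + class_width)

    # Step 5: count the observations in each class.
    counts = [0] * len(lower_limits)
    for v in values:
        index = int((v - first_lower_limit) // class_width)
        counts[index] += 1
    return list(zip(lower_limits, counts))

# Hypothetical data, invented for illustration only.
measurements = [2, 5, 8, 11, 14, 14, 17, 23, 26, 38]
# Steps 1 and 2: class width 10, first lower limit 0.
table = frequency_distribution(measurements, first_lower_limit=0, class_width=10)
```

A class with no observations still appears in the result with a frequency of 0, which matches the requirement that there be no gaps between classes.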
Steps for Constructing a Stem-and-Leaf Plot
-Step 1: Make a vertical list of all the stems, in increasing order, and draw a vertical line to the right of this list. -Step 2: For each value in the data set, write the leaf next to its stem. -Step 3: For each stem, arrange its leaves in increasing order.
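The three steps above might look like this in Python. The sketch assumes non-negative integers, takes the ones digit as the leaf, and lists stems that have no leaves as well. Those choices, the function name `stem_and_leaf`, and the `ages` data are assumptions made for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf plot for non-negative integers."""
    leaves_by_stem = defaultdict(list)
    # Step 2: write each leaf next to its stem.
    for v in values:
        stem, leaf = divmod(v, 10)   # rightmost digit is the leaf
        leaves_by_stem[stem].append(leaf)

    lines = []
    # Step 1: list the stems in increasing order.
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        # Step 3: arrange the leaves for this stem in increasing order.
        leaves = sorted(leaves_by_stem.get(stem, []))
        lines.append(f"{stem} | " + "".join(str(leaf) for leaf in leaves))
    return lines

# Hypothetical data, invented for illustration only.
ages = [12, 15, 21, 23, 23, 38, 41, 47, 10]
plot_lines = stem_and_leaf(ages)
print("\n".join(plot_lines))
```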
Difference between Frequency and Relative Frequency
-The *frequency of a category* is the *number of items in the category*. -The *relative frequency* is the *proportion of items in the category*
Choosing the number of classes
-There is no single right way to choose classes for a histogram. -Use your best judgment to construct a histogram with an appropriate amount of detail. -There are two principles that can guide the choice: 1. Too few classes produce a histogram lacking in detail. 2. Too many classes produce a histogram with too much detail, so that the main features of the data are obscured.
Back-to-Back Stem-and-Leaf Plots
The stems go down the middle. The leaves for one of the data sets go off to the right, and the leaves for the other go to the left
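A back-to-back layout can be sketched in plain text as follows. The sketch assumes non-negative integers with the ones digit as the leaf, and it writes the left-hand leaves so that they increase outward from the stem, which is a common convention rather than something stated above. The function name `back_to_back` and both data sets are hypothetical.

```python
def back_to_back(left_values, right_values):
    """Return the lines of a back-to-back stem-and-leaf plot."""
    def group(values):
        grouped = {}
        for v in values:
            stem, leaf = divmod(v, 10)
            grouped.setdefault(stem, []).append(leaf)
        return grouped

    left, right = group(left_values), group(right_values)
    first = min(min(left), min(right))
    last = max(max(left), max(right))
    width = max(len(leaves) for leaves in left.values())  # pad the left side

    lines = []
    for stem in range(first, last + 1):
        # Left leaves read outward from the stem, so print them in reverse.
        left_text = "".join(str(d) for d in sorted(left.get(stem, []), reverse=True))
        right_text = "".join(str(d) for d in sorted(right.get(stem, [])))
        lines.append(f"{left_text:>{width}} | {stem} | {right_text}")
    return lines

# Hypothetical data, invented for illustration only.
group_a = [12, 15, 23, 27, 29, 31]
group_b = [18, 21, 24, 35, 36]
b2b = back_to_back(group_a, group_b)
print("\n".join(b2b))
```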