Statistics Ch. 2 - Organizing and Summarizing Data
relative frequency equation
relative frequency = frequency / sum of all frequencies
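As a minimal sketch of the formula above, the following Python snippet divides each frequency by the sum of all frequencies. The function name `relative_frequencies` and the sample counts are assumptions invented for illustration.

```python
def relative_frequencies(frequencies):
    """Divide each category's frequency by the sum of all frequencies."""
    total = sum(frequencies.values())
    if total == 0:
        raise ValueError("at least one observation is required")
    return {category: count / total for category, count in frequencies.items()}


# Hypothetical counts, invented for this example.
counts = {"red": 5, "blue": 3, "green": 2}
rel = relative_frequencies(counts)
print(rel)
```

Here the total is 10, so "red" has a relative frequency of 5/10 = 0.5.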
uniform distribution
the frequencies are evenly spread out across the values of the variable
bell-shaped distribution
the highest frequency occurs in the middle and frequencies tail off to the left and right of the middle
upper class limit
the largest value within a class
relative frequency
the proportion (or percent) of observations within a category
lower class limit
the smallest value within a class
skewed left
the tail to the left of the peak is longer than the tail to the right of the peak
skewed right
the tail to the right of the peak is longer than the tail to the left of the peak
relative frequency distribution
Lists each category of data together with its relative frequency. The relative frequencies should sum to 1.
time-series data
data collected on the same element for the same variable at different points in time or for different time periods
raw data
data obtained from either observational studies or designed experiments, before it is organized into a meaningful form.
time-series plot
Obtained by plotting the time in which a variable is measured on the horizontal axis and the corresponding value of the variable on the vertical axis. Line segments are then drawn connecting the points.
guidelines for constructing good graphics
1. Title and label the graphic axes clearly. Include units of measurement and a data source. 2. Avoid distortion. Never lie about the data. 3. Minimize the white space. Clearly indicate truncated scales. 4. Avoid clutter. 5. Avoid three dimensional graphs. 6. Do not use more than one design in the same graphic. 7. Avoid relative graphs that are devoid of data or scales.
guidelines for determining the lower class limit of the first class and class width
1. choose the lower class limit of the first class by choosing the smallest observation in the data set or a number slightly lower than the smallest observation in the data set 2. determine the class width by deciding on the number of classes, then compute and round up: class width = (largest data value - smallest data value)/number of classes
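The two guidelines above can be sketched in Python as follows. The data set and the choice of 5 classes are hypothetical, and rounding up to the next whole number with `math.ceil` is an assumption; when the quotient is already a whole number, `math.ceil` leaves it unchanged, and a slightly larger width may be preferable.

```python
import math


def class_width(data, num_classes):
    """(largest value - smallest value) / number of classes, rounded up."""
    raw = (max(data) - min(data)) / num_classes
    return math.ceil(raw)


# Hypothetical observations, invented for this example.
data = [12, 15, 21, 27, 33, 38, 44, 49]
num_classes = 5

# Guideline 1: start the first class at the smallest observation.
first_lower_limit = min(data)

# Guideline 2: (49 - 12) / 5 = 7.4, which rounds up to 8.
width = class_width(data, num_classes)

# Consecutive lower class limits differ by the class width.
lower_limits = [first_lower_limit + i * width for i in range(num_classes)]
print(width, lower_limits)
```

With these values the classes start at 12, 20, 28, 36, and 44, and the largest observation (49) falls in the last class.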
graphical misrepresentations of data
1. misrepresentation of data 2. misrepresentation of data by manipulating the vertical scale 3. misleading graphs
distribution shapes
1. uniform distribution 2. bell-shaped distribution 3. skewed right 4. skewed left
dot plot
graph drawn by placing each observation horizontally in increasing order and placing a dot above the observation each time it is observed
side-by-side bar graph
Compares two sets of data by aligning the bars for one data set with the bars for the other data set, by class. Use relative frequencies for the comparison so that differences in the sizes of the two data sets do not distort it.
histogram
Constructed by drawing rectangles for each class of data. The height of each rectangle is the frequency or relative frequency of the class. The width of each rectangle is the same and the rectangles touch each other.
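The rectangle heights in a histogram come from tallying the observations in each class. The sketch below assumes equal-width classes and uses hypothetical data, a hypothetical first lower class limit of 12, and a class width of 8; it prints a rough text version of the bars rather than drawing a real graph.

```python
def class_frequencies(data, first_lower_limit, width, num_classes):
    """Count how many observations fall in each equal-width class."""
    counts = [0] * num_classes
    for x in data:
        index = int((x - first_lower_limit) // width)
        if 0 <= index < num_classes:  # ignore values outside all classes
            counts[index] += 1
    return counts


# Hypothetical observations and class settings, invented for this example.
data = [12, 15, 21, 27, 33, 38, 44, 49]
heights = class_frequencies(data, first_lower_limit=12, width=8, num_classes=5)

# One row per class: the lower class limit, then one '#' per observation.
for i, h in enumerate(heights):
    print(f"{12 + i * 8:>3} | {'#' * h}")
```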
bar graph
Constructed by labeling each category of data on either the horizontal or vertical axis and the frequency or relative frequency of the category on the other axis. Rectangles of equal width are drawn for each category. The height of each rectangle represents the category's frequency or relative frequency.
pie chart
A circle divided into sectors. Each sector represents a category of data. The area of each sector is proportional to the frequency of the category.
Pareto chart
a bar graph whose bars are drawn in decreasing order of frequency or relative frequency
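The ordering that defines a Pareto chart can be sketched by sorting the categories by frequency, largest first. The category names and counts below are hypothetical.

```python
# Hypothetical defect counts, invented for this example.
counts = {"scratch": 4, "dent": 9, "crack": 2, "stain": 6}

# Sort (category, frequency) pairs in decreasing order of frequency.
ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)

for category, frequency in ordered:
    print(f"{category:<8} {'#' * frequency}")
```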
open ended
a frequency table whose first class has no lower class limit, or whose last class has no upper class limit
deceptive graph
a graph that purposely attempts to create an incorrect impression
misleading graph
a graph that unintentionally creates an incorrect impression
stem-and-leaf plot
a method of representing quantitative data graphically by using the digits to the left of the rightmost digit to form the stem, and the rightmost digit to form the leaf
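A minimal sketch of splitting values into stems and leaves, assuming nonnegative whole numbers. The function name `stem_and_leaf` and the sample values are hypothetical. Stems with no observations are omitted here, although a hand-drawn plot would normally list them.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted leaves (last digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 108 -> stem 10, leaf 8
        plot[stem].append(leaf)
    return dict(plot)


# Hypothetical observations, invented for this example.
values = [35, 23, 42, 31, 27, 35, 108]
plot = stem_and_leaf(values)

for stem, leaves in plot.items():
    print(f"{stem:>3} | {' '.join(str(leaf) for leaf in leaves)}")
```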
frequency distribution
lists each category of data and the number of occurrences for each category of data
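A frequency distribution can be tallied directly from raw data. The sketch below uses `collections.Counter` from the Python standard library; the survey responses are hypothetical.

```python
from collections import Counter

# Hypothetical raw data, invented for this example.
responses = ["yes", "no", "yes", "undecided", "yes", "no"]

# Counter maps each category to its number of occurrences.
freq = Counter(responses)

for category, count in freq.most_common():
    print(f"{category:<10} {count}")
```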
class width
the difference between consecutive lower class limits