Chapter 2
frequency
The number of times each category occurs in the data (the simplest way to summarize it)
The numbers used to separate the classes of a frequency distribution, but without the gaps created by class limits, are called ____________________.
class boundaries
stem and leaf display
splits the data values into stems and leaves. By listing all the leaves to the right of each stem, we graphically describe how the data are distributed.
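A minimal sketch of the idea, assuming two-digit whole numbers where the tens digit is the stem and the ones digit is the leaf. The function name `stem_and_leaf` and the scores are made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: [leaves]} for non-negative integers, using the ones digit as the leaf."""
    display = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # e.g. 85 -> stem 8, leaf 5
        display[stem].append(leaf)
    return dict(display)

# Hypothetical exam scores (not from the chapter).
scores = [72, 85, 91, 78, 85, 64, 79, 93, 88, 70]

# Print each stem with all of its leaves listed to the right.
for stem, leaves in stem_and_leaf(scores).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```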
Pie Charts
excellent tool for comparing proportions for categorical data. Each category occupies a segment of the pie that represents the relative frequency of that category
discrete data
- values based on observations that can be counted, typically represented by whole numbers. - often involve counting observations - because they can be counted, they have a finite number of values within an interval
Continuous data
- values that can take on any real number, including numbers that contain decimal points. - often the result of measuring observations - have an infinite number of possible values within an interval
After constructing any relative frequency distribution, what should be the sum of the relative frequencies?
1 or 100%
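A quick way to check this, sketched with Python's `collections.Counter`. The car colors are a hypothetical sample, not data from the chapter.

```python
from collections import Counter
import math

# Hypothetical categorical sample (made up for illustration).
colors = ["red", "blue", "red", "white", "red", "blue", "red", "white"]

frequency = Counter(colors)          # number of occurrences of each category
total = sum(frequency.values())      # total number of observations
relative_frequency = {color: count / total for color, count in frequency.items()}

for color, share in relative_frequency.items():
    print(f"{color}: {frequency[color]} observations, relative frequency {share:.2f}")

# The relative frequencies should add up to 1 (that is, 100%).
print(math.isclose(sum(relative_frequency.values()), 1.0))
```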
What is wrong with the following class limits for organizing weight data for a sample of 200 adult men in the United States? 140-150 pounds, 150-160 pounds, 160-170 pounds, 170-180 pounds, 180-190 pounds, 190-200 pounds, 200-210 pounds, 210-220 pounds, 220-230 pounds
the classes are overlapping (for example, a weight of 150 pounds falls into both 140-150 and 150-160)
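The overlap can be checked mechanically. This is a sketch that treats both class limits as inclusive. The helper `find_overlaps` and the repaired limits (140-149, 150-159, ...) are assumptions for illustration, and the repair only works if weights are recorded to the nearest pound.

```python
def find_overlaps(classes):
    """Return consecutive class pairs that share a value (both limits treated as inclusive)."""
    ordered = sorted(classes)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b[0] <= a[1]]

# The nine classes from the question: 140-150, 150-160, ..., 220-230.
original = [(low, low + 10) for low in range(140, 230, 10)]

# One possible repair (an assumption, not the textbook's answer): 140-149, 150-159, ...
repaired = [(low, low + 9) for low in range(140, 230, 10)]

print(len(find_overlaps(original)), "overlapping pairs in the original limits")
print(len(find_overlaps(repaired)), "overlapping pairs in the repaired limits")
```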
range
the range of spreadsheet cells in which you are counting (as in the first argument of Excel's COUNTIF function)
criteria
the value (or condition) that a cell must match in order to be counted
cumulative relative frequency distribution
totals the proportion of observations that are less than or equal to the upper limit of the class you are looking at. (Add the relative frequencies class by class; the last class reaches 1.0.)
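A minimal sketch of the running total, using `itertools.accumulate`. The class frequencies are hypothetical. The counts are accumulated first and divided by the total afterwards, which avoids small floating-point drift from adding decimals.

```python
from itertools import accumulate

# Hypothetical frequencies for five classes, listed from lowest to highest class.
frequencies = [4, 10, 16, 6, 4]
total = sum(frequencies)

relative = [f / total for f in frequencies]
# Running total of the counts, converted to a proportion of all observations.
cumulative_relative = [running / total for running in accumulate(frequencies)]

for rel, cum in zip(relative, cumulative_relative):
    print(f"relative {rel:.2f}   cumulative {cum:.2f}")
```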
2^k ≥ n
k = number of classes; n = number of data points. The trick is to find the lowest value of k that satisfies the rule. For example, with n = 50: 2^5 = 32 < 50 (k = 5 is too small); 2^6 = 64 ≥ 50 (k = 6 is a good choice).
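The search for the lowest k can be sketched as a short loop. The function name `number_of_classes` is made up for illustration, and the search starts at k = 1 on the assumption that at least one class is needed.

```python
def number_of_classes(n):
    """Return the smallest k (starting from 1) such that 2**k >= n."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k

# The chapter's example: n = 50 data points.
k = number_of_classes(50)
print(k, 2 ** k)
```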
dependent variable
placed on the vertical axis of the scatter plot; it is influenced by changes in the independent variable, which is placed on the horizontal axis
contingency tables
provide a format to display the frequencies of two qualitative variables - help us identify relationships between two or more variables
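A small sketch of how such a table can be tallied, counting each pair of categories with `collections.Counter`. The two variables and every observation are hypothetical.

```python
from collections import Counter

# Hypothetical paired observations (made up for illustration):
# (level of education, preferred payment method)
observations = [
    ("high school", "cash"), ("college", "card"), ("college", "card"),
    ("high school", "card"), ("college", "cash"), ("high school", "cash"),
    ("college", "card"),
]

cell_counts = Counter(observations)  # frequency of each (row, column) pair
row_labels = sorted({row for row, _ in observations})
col_labels = sorted({col for _, col in observations})

# Print the table: one row per education level, one column per payment method.
print(f"{'':<12}" + "".join(f"{col:>6}" for col in col_labels))
for row in row_labels:
    print(f"{row:<12}" + "".join(f"{cell_counts[(row, col)]:>6}" for col in col_labels))
```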
scatter plot
provides a picture of the relationship between two quantitative variables that are paired together.
chapter two review page
66
Suppose you want to know if more technical service calls are made to homes with cable television or with satellite dish television. Should you use frequencies or relative frequencies to make the comparison? Why?
Relative frequencies should be used since there is likely a difference in the number of users of cable and satellite television. If you make comparisons using frequencies, the results can be very misleading for different population sizes.
Which of the statements below is true concerning bar graphs?
The height of each bar represents the category's frequency or relative frequency.
bar charts
a good tool for displaying qualitative data that have been organized in categories and can be arranged in a vertical or horizontal orientation
histogram
a graph showing the number of observations in each class of a frequency distribution (similar to a bar graph). - when displaying discrete data, there are gaps between the bars that indicate the frequency of each class - histograms of continuous data do not have gaps between bars
line chart
a special type of scatter plot in which the data points are connected with a line
A frequency distribution lists the _________ of occurrences of each category of data, while a relative frequency distribution lists the _________ of occurrences of each category of data.
number; proportion
The ____________________ is the difference between two consecutive lower class limits or two consecutive upper class limits.
class width
relative frequency distributions
display the proportion of observations in each class relative to the total number of observations (expressed as a decimal or a percentage)
Pareto charts
essentially bar charts that show the frequency of the categories that cause quality control problems. The charts show the categories in decreasing order of frequency.
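The decreasing order is the defining step, and it can be sketched with `Counter.most_common()`. The defect causes and counts are hypothetical, and the text bars only stand in for a real chart.

```python
from collections import Counter

# Hypothetical counts of quality problems by cause (made up for illustration).
defects = Counter({"scratch": 12, "dent": 30, "misaligned label": 5, "wrong color": 18})

# most_common() returns (category, frequency) pairs from highest to lowest frequency.
pareto_order = defects.most_common()

for cause, count in pareto_order:
    print(f"{cause:<18}{'#' * count} ({count})")
```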
class width formula
estimated class width = (maximum data value - minimum data value) / k. Example: (17.4 - 0.6) / 6 = 2.8 (class width)
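The same arithmetic as a short sketch. Only the minimum (0.6), the maximum (17.4), and k = 6 come from the card; the other data values and the function name `estimated_class_width` are made up for illustration.

```python
def estimated_class_width(values, k):
    """Estimated class width = (maximum data value - minimum data value) / k."""
    return (max(values) - min(values)) / k

# Hypothetical data set whose extremes match the card's example (0.6 and 17.4).
data = [0.6, 3.1, 5.2, 9.9, 12.4, 17.4]

width = estimated_class_width(data, 6)
print(round(width, 1))  # rounded because floating-point subtraction is not exact
```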
percentage polygon
graphs the midpoint of each class as a point on a line rather than as a column. The height of each point represents the relative frequency of the corresponding class. - more appropriate than histograms when comparing the shape of two or more distributions on one graph
stacked bar chart
groups several values in a single column within the same category, stacked in a vertical direction
clustered bar charts
groups several values side by side within the same category in a vertical direction
A(n) ____________________ is a bar graph in which the height of each rectangle is the frequency or relative frequency of the class. The width of each rectangle is the same, and the rectangles touch each other.
histogram
symmetrical distribution
is one in which the right side of the distribution is the mirror image of the left side of the distribution.
qualitative data
values that are categorical (nominal or ordinal measurement levels) that describe a characteristic, such as gender or level of education.