MATH-164 - Chapter 2
In a relative frequency distribution, what should the relative frequencies add up to?
1
A frequency distribution lists the ________ of occurrences of each category of data, while a relative frequency distribution lists the ________ of occurrences of each category of data.
1. number 2. proportion
Pareto chart
A Pareto chart is a bar graph whose bars are drawn in decreasing order of frequency or relative frequency.
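A minimal sketch of the ordering step behind a Pareto chart. The flavor counts are illustrative and the helper name pareto_order is hypothetical; a real chart would pass the sorted pairs to a plotting library.

```python
# Illustrative category counts (assumed data for this sketch).
counts = {"Vanilla": 23, "Chocolate": 45, "Mocha": 19, "Strawberry": 42}

def pareto_order(freqs):
    """Return (category, frequency) pairs sorted from most to least frequent."""
    return sorted(freqs.items(), key=lambda item: item[1], reverse=True)

for category, freq in pareto_order(counts):
    # Crude text bar: one '#' per 5 observations.
    print(f"{category:<12}{freq:>4} {'#' * (freq // 5)}")
```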
Statistics
Collection of methods for planning experiments, obtaining data, organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on data. The only science that enables different experts using the same figures to draw different conclusions.
In a stem-and-leaf plot (or stem plot), use the digits to the left of the rightmost digit to form the stem.
Each rightmost digit forms a leaf.
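A small sketch of the stem/leaf split described above, assuming non-negative whole-number data. The function name stem_and_leaf and the data are hypothetical. Stems with no leaves are simply omitted here, whereas a hand-drawn plot would usually still list them.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted leaves (last digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 23 -> stem 2, leaf 3
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical data.
data = [12, 15, 21, 23, 23, 30, 34, 47]
for stem, leaves in stem_and_leaf(data).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```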
A random sample of 100 adults aged 18 years or older was given a list of ice cream flavors and asked which flavors they liked. The responses are given below. Chocolate: 45; Strawberry: 42; Vanilla: 23; Mocha: 19. Which of the following graphs would be most appropriate for visually displaying the results?
Frequency Bar Graph
A newspaper article claimed that the afternoon hours were the worst in terms of robberies and provided the graph to the right in support of this claim. Explain how this graph is misleading.
Not all of the time intervals are the same size. Redrawing the graph with time intervals of equal size may lead to a different shape.
As you use the applet, remember:
The goal is to design a distribution that is best for revealing the patterns within the data.
skewed right distribution
The peak of the data is to the left side of the graph. There are only a few data points to the right side of the graph.
Skewed Left Distribution
The peak of the data is to the right side of the graph. There are only a few data points to the left side of the graph.
Is the statement below true or false? There is not one particular frequency distribution that is correct, but there are frequency distributions that are less desirable than others.
True. Any correctly constructed frequency distribution is valid. However, some choices for the categories or classes give more information about the shape of the distribution.
The following graph represents the results of a survey, in which a random sample of adults in a certain country was asked if a certain action was morally wrong in general. Complete parts (a) through (c) below. (a) What percent of the respondents believe the action is morally acceptable? (b) If there are 275 million adults in the country, how many believe that the action is morally wrong? (c) If a polling organization claimed that the results of the survey indicate that 10% of adults in the country believe that the action is acceptable in certain situations, would you say this statement is descriptive or inferential? Why?
a. About 70% of the respondents. b. About 52 million adults (multiply the population by the percentage: 275 million × 19% ≈ 52 million). c. The statement is inferential because it makes a prediction about all adults in the country based on the sample.
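The arithmetic in part (b) can be checked with a short sketch. The 19% share is the value used in the answer above; the variable names are hypothetical.

```python
population_millions = 275
share_wrong = 0.19  # 19%, the share used in the answer to part (b)

# Scale the sample proportion up to the whole adult population.
estimate_millions = population_millions * share_wrong
print(f"about {estimate_millions:.0f} million adults")
```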
Classes
are the categories by which data are grouped.
symmetric
because if we split the histogram down the middle, the right and left sides are mirror images
uniform distribution
because the frequency of each value of the variable is evenly spread across the values of the variable
The _________________ is the difference between consecutive lower class limits.
class width
cumulative frequency distribution
displays the aggregate frequency of the category. In other words, it displays the total number of observations less than or equal to the upper class limit of the class. A cumulative relative frequency distribution displays the proportion (or percentage) of observations less than or equal to the upper class limit of the class.
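A minimal sketch of building cumulative and cumulative relative frequencies as running totals. The classes and frequencies are hypothetical.

```python
from itertools import accumulate

# Hypothetical classes and their frequencies.
classes = ["60-69", "70-79", "80-89", "90-99"]
freqs = [1, 5, 15, 41]

cumulative = list(accumulate(freqs))  # running totals up to each upper class limit
total = cumulative[-1]                # the last running total is the sample size
cumulative_relative = [c / total for c in cumulative]

for label, c, cr in zip(classes, cumulative, cumulative_relative):
    print(f"{label}: {c:>3}  {cr:.3f}")
```

The last cumulative relative frequency is always 1, since every observation is less than or equal to the final upper class limit.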
Pie Chart/Graph
is a circle divided into sectors. Each sector represents a category of data. The area of each sector is proportional to the frequency of the category. Pie charts are typically used to present the relative frequency of qualitative data. In most cases, the data are nominal, but ordinal data can also be displayed in a pie chart.
relative frequency
is the proportion (or percent) of observations within a category and is found using the formula relative frequency = frequency/sum of all frequencies
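The formula can be sketched as follows. The counts and the function name relative_frequencies are hypothetical.

```python
def relative_frequencies(freqs):
    """Divide each category's frequency by the sum of all frequencies."""
    total = sum(freqs.values())
    return {category: f / total for category, f in freqs.items()}

# Hypothetical category counts.
counts = {"A": 10, "B": 25, "C": 15}
rel = relative_frequencies(counts)
print(rel)
```

Because every frequency is divided by the same total, the relative frequencies add up to 1.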
frequency distribution
lists each category of data and the number of occurrences for each category of data.
Use the applet available below to complete parts (a) through (c). (a) How many classes are in the histogram for the Five-Year Rate of Return data when the "Starting point" is 8 and the "Bin width" is 2? (b) How many classes are in the histogram for the Five-Year Rate of Return data when the "Starting point" is 8 and the "Bin width" is 4? (c) How many classes are in the histogram for the Five-Year Rate of Return data when the "Starting point" is 8 and the "Bin width" is 1?
(a) There are 6 classes. (b) There are 3 classes. (c) There are 12 classes (remember to count the empty spaces between the blue columns).
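A rough sketch of why the class counts change with the bin width. It assumes the largest observation is 19.43, the value used in the class width example elsewhere in these notes; whether that is the same data set as the applet's is an assumption, and class_count is a hypothetical helper.

```python
import math

def class_count(start, largest, bin_width):
    """Number of equal-width bins, beginning at `start`, needed to reach `largest`."""
    # The bin index of the largest value, plus one because indices start at 0.
    return math.floor((largest - start) / bin_width) + 1

for width in (2, 4, 1):
    print(width, class_count(8, 19.43, width))
```

Empty bins between occupied ones are included in this count, which matches the hint in part (c).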
ogive (read as "oh jive")
is a graph that represents the cumulative frequency or cumulative relative frequency for the class. It is constructed by plotting points whose x-coordinates are the upper class limits and whose y-coordinates are the cumulative frequencies or cumulative relative frequencies of the class. Then line segments are drawn connecting consecutive points. An additional line segment is drawn connecting the first point to the horizontal axis at a location representing the upper limit of the class that would precede the first class (if it existed).
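The construction above can be sketched by computing the points of the ogive; the plotting itself is left out. The class limits and frequencies are hypothetical, as is the name ogive_points, and equal class widths are assumed.

```python
from itertools import accumulate

def ogive_points(lower_limits, upper_limits, freqs):
    """Return (x, y) points: upper class limits against cumulative frequencies."""
    width = lower_limits[1] - lower_limits[0]
    # Extra starting point: upper limit of the class that would precede the first.
    start = (upper_limits[0] - width, 0)
    return [start] + list(zip(upper_limits, accumulate(freqs)))

points = ogive_points([60, 70, 80], [69, 79, 89], [1, 5, 15])
print(points)
```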
class width formula
(largest data value − smallest data value)/number of classes. Example: (19.43 − 8.28)/10 = 1.115, which is rounded to 1 here. Rounding up may result in fewer classes than were originally intended, while rounding down may result in more classes than originally intended.
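A short sketch of the class width computation using the example's numbers. The subtraction must happen before the division, which the parentheses make explicit; the function name class_width is hypothetical.

```python
def class_width(largest, smallest, n_classes):
    """Range of the data divided by the desired number of classes."""
    return (largest - smallest) / n_classes

raw = class_width(19.43, 8.28, 10)
print(round(raw, 3))

# Without parentheses, division binds first and gives a different number.
wrong = 19.43 - 8.28 / 10
print(round(wrong, 3))
```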
relative frequency distribution
lists each category of data together with the relative frequency.
side-by-side bar graph
A bar graph that compares different groups, in one category, to one another. When comparing data sets, it is best to use relative frequencies because different sample or population sizes make comparisons using frequencies difficult or misleading.
class width
the difference between consecutive lower class limits. Generally, there should be between 5 and 20 classes. The smaller the data set, the fewer classes you should have. class width = (largest data value − smallest data value)/number of classes
open-ended
the first class has no lower class limit or the last class has no upper class limit.
bell-shaped distribution
the highest frequency occurs in the middle and frequencies tail off to the left and right of the middle.
upper class limit
the largest value within the class
lower class limit
the smallest value within the class
The data to the right represent the top speed (in kilometers per hour) of all the players (except goaltenders) in a certain soccer league. Find (a) the number of classes, (b) the class limits for the fourth class, and (c) the class width
(b) The lower class limit for the fourth class is 28. The upper class limit for the fourth class is 33.9. (c) The class width is 6.
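A sketch of how class limits follow from the class width. The first lower class limit of 10 is an assumption, inferred by stepping back three widths from 28, and the data are assumed to be recorded to one decimal place; class_limits is a hypothetical helper.

```python
def class_limits(first_lower, width, n_classes):
    """List (lower, upper) limits for classes of equal width, data to 0.1."""
    limits = []
    for i in range(n_classes):
        lower = first_lower + i * width
        # The upper limit sits one recording unit (0.1) below the next lower limit.
        upper = round(lower + width - 0.1, 1)
        limits.append((lower, upper))
    return limits

fourth = class_limits(10, 6, 4)[3]
print(fourth)
```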
Histogram
A graph of vertical bars representing the frequency distribution of a set of data. It is constructed by drawing a rectangle for each class of data. The height of each rectangle is the frequency or relative frequency of the class. The width of each rectangle is the same, and the rectangles touch each other.
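Before a histogram can be drawn, the data have to be tallied into classes. A minimal sketch of that step, with hypothetical data and a hypothetical helper name class_frequencies:

```python
from collections import Counter

def class_frequencies(data, start, width):
    """Count observations per class, keyed by each class's lower limit."""
    counts = Counter(start + ((x - start) // width) * width for x in data)
    return dict(sorted(counts.items()))

# Hypothetical whole-number data, classes 60-69, 70-79, ...
data = [61, 72, 75, 88, 90, 94, 99]
print(class_frequencies(data, start=60, width=10))
```

Each resulting frequency would become the height of one rectangle.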
The following graph shows the median earnings for females from 2005 to 2009 in constant 2009 dollars. Complete parts (a) and (b) below. (a) How is the bar graph misleading? What does the graph seem to convey? (b) Redraw the graph so that it is not misleading. Choose the correct graph below. What does the new graph seem to convey?
(a) The vertical axis starts at 34,500 instead of 0. This tends to indicate that the median earnings for females changed at a faster rate than they actually did. (b) In the redrawn graph (median earnings for females in constant 2009 dollars), the vertical axis starts at 0. This graph indicates that median earnings for females have remained fairly constant over the given time period.
Dot Plot/Line Plot
We draw a dot plot by placing each observation horizontally in increasing order and placing a dot above the observation each time it is observed.
bar graph
A graph that uses vertical or horizontal bars to show comparisons among two or more items. It is constructed by labeling each category of data on either the horizontal or vertical axis and the frequency or relative frequency of the category on the other axis. Rectangles of equal width are drawn for each category. The height of each rectangle represents the category's frequency or relative frequency.
The following frequency histogram represents the IQ scores of a random sample of seventh-grade students. IQs are measured to the nearest whole number. The frequency of each class is labeled above each rectangle. Use the histogram to answer parts (a) through (g). (a) How many students were sampled? (b) Determine the class width. (c) Identify the classes and their frequencies. (d) Which class has the highest frequency? (e) Which class has the lowest frequency? (f) What percent of students had an IQ of at least 120? (g) Did any students have an IQ of 169?
a. 200 (the sum of all class frequencies). b. The class width is 10. (The width of each class is equal to the width of each bar on the histogram. For example, the width of the first bar is 70 − 60 = 10.) c. 60-69, 1; 70-79, 5; 80-89, 15; 90-99, 41; 100-109, 46; 110-119, 40; 120-129, 33; 130-139, 11; 140-149, 6; 150-159, 2. (Determine the range of IQ scores that make up each class. All classes should have IQ scores that are unique to only that class. Remember that the frequency of each class is shown above its corresponding bar in the histogram.) d. 100-109 is the class with the highest frequency. (Use the histogram to find the tallest bar, then determine the class associated with that bar.) e. 60-69 is the class with the lowest frequency. f. 26%. (Sum the frequencies of the classes with IQs of at least 120 and divide by the sum of all frequencies in the histogram: 52/200 × 100 = 26%.) g. No.
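The answers to parts (a), (d), (e), and (f) can be checked with a short sketch. The frequency of 2 for the 150-159 class is an inference from the stated total of 200 and the 52 students at or above 120; the variable names are hypothetical.

```python
# Frequencies keyed by lower class limit; the 150-159 value (2) is inferred.
freqs = {60: 1, 70: 5, 80: 15, 90: 41, 100: 46,
         110: 40, 120: 33, 130: 11, 140: 6, 150: 2}

total = sum(freqs.values())                                  # part (a)
highest = max(freqs, key=freqs.get)                          # part (d)
lowest = min(freqs, key=freqs.get)                           # part (e)
at_least_120 = sum(f for lower, f in freqs.items() if lower >= 120)
percent_at_least_120 = 100 * at_least_120 / total            # part (f)

print(total, highest, lowest, percent_at_least_120)
```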
Researchers conducted a poll of the adults in a nation and asked, "When there is a voiceover in a commercial, which type of voice is more likely to sell you a car?" Results of the survey are in the bar graph. Complete parts (a) through (c) below. (a) How many participants were in the survey? (b) What is the relative frequency of the respondents who indicated that it made no difference which voice they hear? (c) Redraw the graph as a Pareto chart.
a. There were 2216 participants (the sum of all frequencies). b. The relative frequency is 0.6534 (divide the number of "no difference" respondents by the total number of participants: 1448/2216 = 0.6534). c. Select the chart that arranges the bars from largest to smallest, from left to right.
The side-by-side bar graph available below shows the approximate average grade point average for the years 1991-1992, 1996-1997, 2001-2002, and 2006-2007 for colleges and universities. Complete parts (a) through (c) below. (a) Does the graph suggest that grade inflation is a problem in colleges? (b) In public schools, the average GPA was 2.86 in 1991-1992 and 3.02 in 2006-2007. In private schools, the average GPA was 3.09 in 1991-1992 and 3.30 in 2006-2007. Determine the percentage increase in GPAs for public schools from 1991 to 2006. Determine the percentage increase in GPAs for private schools from 1991 to 2006. Which type of institution appears to have the higher inflation? (c) Do you believe the graph is misleading?
a. Yes, because the GPAs increased over time for all schools. b. The increase is about 5.6% for public schools ((3.02 − 2.86)/2.86) and about 6.8% for private schools ((3.30 − 3.09)/3.09). So, private schools appear to have higher inflation. c. Yes, because the vertical axis does not start at 0.
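Using the GPAs stated in the question, the percentage increases in part (b) can be checked with a sketch like the one below. The function name percent_increase is hypothetical.

```python
def percent_increase(old, new):
    """Change relative to the starting value, expressed as a percentage."""
    return (new - old) / old * 100

public = percent_increase(2.86, 3.02)   # public schools, 1991-1992 to 2006-2007
private = percent_increase(3.09, 3.30)  # private schools, same years

print(f"public: {public:.1f}%  private: {private:.1f}%")
```

With these inputs the private increase is the larger of the two, which agrees with the conclusion that private schools appear to have higher inflation.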
Qualitative, or categorical, variables
allow for the classification of individuals based on some attribute or characteristic. Qualitative data are observations corresponding to a qualitative variable. When qualitative data are collected, we often first determine the number of occurrences within each category.
Select the correct choices that complete the sentence below. The ______ class limit is the smallest value within the class and the ______ class limit is the largest value within the class.
1. lower 2. upper