Chapter 2
Lower Boundary
The smallest value in each interval of a frequency distribution. Example: Table 2.2 (p. 34). Intervals and f(x): 162-179 (1), 144-161 (0), 126-143 (1), 108-125 (3), 90-107 (4), 72-89 (3), 54-71 (7), 36-53 (8), 18-35 (12), 0-17 (11)*. *The lower boundary is 0, and the upper boundary is 17 for this interval. The interval width is 18 for all intervals.
Interval Boundaries
The upper and lower limits for each interval in a grouped frequency distribution. Example: Table 2.2 (p. 34). Intervals and f(x): 162-179 (1), 144-161 (0), 126-143 (1), 108-125 (3), 90-107 (4), 72-89 (3), 54-71 (7), 36-53 (8), 18-35 (12), 0-17 (11)*. *The lower boundary is 0, and the upper boundary is 17 for this interval. The interval width is 18 for all intervals.
Relative Frequency Distribution
A proportion from 0 to 1.0 that describes the portion of data in each interval. It is often easier to list the relative frequency of scores because a list with very large frequencies in each interval can be more confusing to read. RF = observed frequency / total frequency count (N). Example: Table 2.9 (p. 39): 2/45 = 0.04, 5/45 = 0.11, 11/45 = 0.24. The sum of the relative frequencies across all intervals should add up to 1.00.
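As a rough illustration of the RF formula, the Python sketch below divides each frequency by the total count. The frequencies are the eight values from the cumulative frequency example in these notes (N = 45); the function name `relative_frequencies` is only illustrative.

```python
# Frequencies listed from the top interval down (N = 45).
frequencies = [2, 5, 11, 9, 7, 5, 3, 3]

def relative_frequencies(freqs):
    """Return each observed frequency divided by the total count N."""
    total = sum(freqs)
    return [f / total for f in freqs]

rel = relative_frequencies(frequencies)
for f, rf in zip(frequencies, rel):
    # Two decimal places, as in the worked example
    print(f"{f}/{sum(frequencies)} = {rf:.2f}")

# The relative frequencies should sum to 1.00 (give or take rounding)
print("sum =", round(sum(rel), 2))
```

The first three printed lines correspond to 2/45, 5/45, and 11/45 from the example.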
Grouped Data
A set of scores distributed into intervals, where each score is counted in the one interval that contains it.
Real Range
One more than the difference between the largest and smallest number in a list of data. Example: in Table 2.1 (p. 32), the smallest value is 0 and the largest value is 175. 175 - 0 = 175, and 175 + 1 = 176. Therefore, 176 is the real range.
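The same arithmetic can be sketched in Python. The list of scores here is hypothetical; only the smallest (0) and largest (175) values come from the example.

```python
def real_range(scores):
    """Real range: (largest score - smallest score) + 1."""
    return max(scores) - min(scores) + 1

# Hypothetical scores; only the extremes match the Table 2.1 example.
scores = [0, 12, 47, 88, 175]
print(real_range(scores))  # 176
```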
Rules for Creating a Simple Frequency Distribution
1. Each interval is defined (it has a lower and upper boundary). Open-ended intervals such as "or more" or "less than" should not be used. 2. Each interval is equidistant (the interval width is the same for each interval). 3. No interval overlaps (the same score cannot occur in more than one interval). 4. All values are rounded to the same degree of accuracy measured in the original data (to the ones place for the data listed in Table 2.1 (p. 32)).
Relative Percent Distribution
A common way to summarize relative frequency is to convert it to a relative percent because it tends to be easier to read at a glance. RP = (observed frequency / total frequency) × 100.
Interval
A discrete range of values within which the frequency of a subset of scores is contained.
Ogive
A dot-and-line graph used to summarize the cumulative percent of continuous data at the upper boundary of each interval.
Frequency Polygon
A dot-and-line graph used to summarize the frequency of continuous data at the midpoint of each interval.
Pie Chart
A graphical display in the shape of a circle that is used to summarize relative percent of discrete and categorical data into sectors. You can think of pie charts as slices or pieces of data. To construct, we simply distribute data as relative percents.
Histogram
A graphical display used to summarize the frequency of continuous data that are distributed in numeric intervals. It is similar to a bar chart, but a histogram groups numbers into ranges. And you decide what ranges to use! To construct a histogram, follow three rules: Rule 1: A vertical rectangle represents each interval, and the height of the rectangle equals the frequency recorded for each interval. Rule 2: The base of each rectangle begins at the lower boundary and ends at the upper boundary of each interval. Rule 3: Each rectangle touches adjacent rectangles at the boundaries of each interval.
Proportion
A part or portion of all measured data. The sum of all proportions for a distribution of scores is 1.0.
Stem-and-leaf Plot
A special table where each data value is split into a "stem" (the first digit or digits) and a "leaf" (usually the last digit). A graphical display where each individual score from an original set of data is listed. The data are organized such that the common digits shared by all scores are listed to the left (in the stem), with the remaining digits for each score listed to the right (in the leaf).
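A minimal Python sketch of the stem/leaf split follows. It assumes non-negative whole-number scores and treats the last digit as the leaf; the scores and the function name `stem_and_leaf` are made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Map each stem (all digits but the last) to its leaves (the last digit).

    Assumes non-negative whole-number scores.
    """
    plot = defaultdict(list)
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)  # e.g. 23 -> stem 2, leaf 3
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical scores, made up for illustration.
scores = [23, 12, 30, 15, 21, 23]
for stem, leaves in stem_and_leaf(scores).items():
    # Stem on the left, leaves listed to the right
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Every individual score stays recoverable from the display: stem 2 with leaves 1, 3, 3 stands for the scores 21, 23, and 23.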
Simple Frequency Distribution
A summary display for: (1) The frequency of each individual score or category (ungrouped data) in a distribution (2) The frequency of the scores falling within defined groups (grouped data) in a distribution. It summarizes how often scores occur. With large data sets, the frequency of scores contained in discrete intervals is summarized (grouped data). To construct a simple frequency distribution for grouped data, follow three steps: 1. Find the real range 2. Find the interval width 3. Construct the frequency distribution
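The three steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the scores are hypothetical whole numbers (only the extremes 0 and 175 match the textbook example), the interval width is rounded up to a whole number, and the first interval starts at the smallest score.

```python
import math

def grouped_frequency_distribution(scores, num_intervals):
    """Return a list of (lower, upper, frequency) tuples, lowest interval first."""
    lowest = min(scores)
    real_range = max(scores) - lowest + 1           # Step 1: real range
    width = math.ceil(real_range / num_intervals)   # Step 2: interval width, rounded up
    table = []
    for i in range(num_intervals):                  # Step 3: build and fill the intervals
        lower = lowest + i * width
        upper = lower + width - 1
        frequency = sum(1 for s in scores if lower <= s <= upper)
        table.append((lower, upper, frequency))
    return table

# Hypothetical scores; only the extremes (0 and 175) match the example.
scores = [0, 3, 17, 18, 40, 60, 95, 130, 175]
table = grouped_frequency_distribution(scores, num_intervals=10)
for lower, upper, frequency in reversed(table):     # print the top interval first
    print(f"{lower}-{upper}: {frequency}")
```

With 10 intervals this reproduces the boundaries 0-17 through 162-179, because 176 / 10 = 17.6 rounds up to a width of 18.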
Cumulative Relative Frequency Distribution
A summary display that distributes the sum of relative frequencies across a series of intervals. It is useful to summarize relative frequencies and percents cumulatively for the same reason described for cumulative frequencies. To distribute, add the relative frequencies cumulatively, starting from the top or the bottom of the table. The total cumulative relative frequency is equal to 1.00 (give or take rounding errors).
Cumulative Percent Distribution
A summary display that distributes the sum of relative percents across a series of intervals. To distribute, we can sum the relative percent in each interval, following the same procedures for adding as we did for cumulative relative frequencies. It identifies percentiles, which are measures of the relative position of individuals or scores within a larger distribution.
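As a hedged sketch, the running sum of relative percents can be computed with `itertools.accumulate`. The frequencies are the eight values from the cumulative frequency example in these notes (N = 45). Rounding each relative percent to a whole number before summing is an assumption; it is chosen here because it reproduces the 61% and 85% figures used in the percentile point example.

```python
from itertools import accumulate

# Intervals and frequencies listed from the bottom interval up (N = 45).
intervals = ["45-53", "54-62", "63-71", "72-80", "81-89", "90-98", "99-107", "108-116"]
frequencies = [3, 3, 5, 7, 9, 11, 5, 2]

def cumulative_percents(freqs):
    """Round each relative percent to a whole number, then keep a running sum."""
    total = sum(freqs)
    relative_percents = [round(f / total * 100) for f in freqs]
    return list(accumulate(relative_percents))

cum_pct = cumulative_percents(frequencies)
for interval, cp in zip(intervals, cum_pct):
    # Each value is the percent of scores at or below the top of that interval
    print(f"{interval}: {cp}%")
```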
Pictogram
A summary display that uses symbols or illustrations to represent a concept, object, place, or event.
Outliers
Extreme scores that fall substantially above or below most of the scores in a particular data set. Example: Intervals and f(x): 162-179 (1), 144-161 (0), 126-143 (1), 108-125 (3), 90-107 (4), 72-89 (3), 54-71 (7), 36-53 (8), 18-35 (12), 0-17 (11). 175 is an outlier in the data because this value falls substantially above most of the other values recorded. An open interval would make it less obvious that an outlier exists because the upper boundary of the interval would not be given; instead, the upper boundary would be left open. Beware that saying "126 and above" is not very informative.
Open Interval
Also called an open class, this is an interval with no defined upper or lower boundary. Example: in Table 2.2 (p. 34), only two values fall at or above 126. Intervals and f(x): 162-179 (1), 144-161 (0), 126-143 (1), 108-125 (3), 90-107 (4), 72-89 (3), 54-71 (7), 36-53 (8), 18-35 (12), 0-17 (11). It may be tempting here to combine the top 3 intervals into one open interval. We could list the interval as "126 and above" because only 2 values were counted.
Constructing the Frequency Distribution
Step 3: To do this, we distribute the same number of intervals we chose in step 2. Example: We chose 10 intervals. Table 2.2 (p. 34) shows that each interval has a width of 18. Intervals and f(x): 162-179 (1), 144-161 (0), 126-143 (1), 108-125 (3), 90-107 (4), 72-89 (3), 54-71 (7), 36-53 (8), 18-35 (12), 0-17 (11). While it may seem like the lowest interval (0-17) doesn't have a width of 18, it includes (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17), which is 18 different numbers.
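To check the width claim, the short Python sketch below generates the 10 interval boundaries and counts the whole numbers each one contains. The helper name `make_intervals` is hypothetical.

```python
def make_intervals(lowest, width, count):
    """Return (lower, upper) boundaries for `count` equal-width intervals."""
    return [(lowest + i * width, lowest + (i + 1) * width - 1) for i in range(count)]

intervals = make_intervals(lowest=0, width=18, count=10)
for lower, upper in reversed(intervals):
    # upper - lower + 1 counts every whole number in the interval, ends included
    print(f"{lower}-{upper}: {upper - lower + 1} values")
```

Counting both end points is what makes 0-17 contain 18 values rather than 17.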
Frequency Distributions
Summarize how often (or frequently) scores occur in a data set. To find the frequency of values, you must count the number of times that the scores occur. They are most often published when researchers measure counts of behavior.
Upper Boundary
The largest value in each interval of a frequency distribution. Example: Table 2.2 (p. 34). Intervals and f(x): 162-179 (1), 144-161 (0), 126-143 (1), 108-125 (3), 90-107 (4), 72-89 (3), 54-71 (7), 36-53 (8), 18-35 (12), 0-17 (11)*. *The lower boundary is 0, and the upper boundary is 17 for this interval. The interval width is 18 for all intervals.
Frequency
The number of times or how often a category, score, or range of scores occurs. It can make the presentation and interpretation of a distribution of data clearer. Example: Exam scores and frequency: 90-99 (2), 80-89 (5), 70-79 (6), 60-69 (4)*. *Students can compare their scores to the rest of the class at a glance.
Sector
The particular portion of a pie chart that represents the relative percent of a particular class or category.
Percentile Rank
The percentile rank of a score is the percentage of scores with values that fall below that score in a distribution.
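A minimal Python sketch of this definition, using a hypothetical list of scores (the function name `percentile_rank` is illustrative):

```python
def percentile_rank(scores, score):
    """Percentage of scores in the distribution that fall below `score`."""
    below = sum(1 for s in scores if s < score)
    return below / len(scores) * 100

# Hypothetical scores, made up for illustration.
scores = [55, 60, 70, 70, 85, 90, 92, 98]
print(percentile_rank(scores, 85))  # 50.0
```

Four of the eight scores fall below 85, so 85 has a percentile rank of 50.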
Interval Width
The range of values contained in each interval of a grouped frequency distribution. To find this, we divide the real range by the number of intervals chosen. (The recommended number of intervals is between 5 and 20. Anything less provides too little of a summary; anything more is too confusing.) Example: in Table 2.1 (p. 32), we decided the data should be split into 10 intervals (176 is the real range). 176/10 = 17.6. Since the data in the table are recorded to the ones place (whole numbers), the interval width should be rounded to match. 17.6 is rounded up to the nearest whole number, so the interval width = 18.
Percentile Point
The value of a score on a measurement scale below which a specified percentage of scores in a distribution fall. The corresponding percentile of a percentile point is the percentile rank of that score. Example: The 75th percentile point is the value (the percentile point) below which 75 percent of scores in a distribution fall (the percentile rank).
Bar Graph
Bar graphs are much like histograms, except that the bars are separated from one another, whereas the vertical rectangles on histograms touch. The separation between bars reflects the separation or "break" between the whole numbers or categories being summarized. Thus, they are appropriate for summarizing the frequency of discrete and categorical data.
Finding Percentile Point
To find the percentile point in a cumulative percent distribution, follow four basic steps (based on Table 2.9 (p. 42)): 1. Identify the interval within which a specified percentile point falls. Example: We want to identify the 75th percentile point, which falls in the interval of 90-98. Note that each percentile given in the table is the top percentage in each interval. 2. Identify the real range of the interval identified. Example: The interval that contains the 75th percentile point is the interval of 90-98 (this is the observed range). The real limits for this interval are 0.5 less than the lower limit and 0.5 more than the upper limit. Thus, the real limits are 89.5-98.5. The width of the real range is therefore 9 points, or one point greater than the observed range. For percentages, the range width is 24 percentage points (from 61% to 85%). 3. Find the position of the percentile point within the interval. First, find the distance of the 75th percentile from the top of the interval (85 - 75 = 10). The 75th percentile is 10 points from the top of the interval. Next, divide 10 by the total width of the percentage range (24). Hence, 75% is 10 out of 24, or 10/24, of the total interval. Multiply the fraction by the width of the real range (9 points): 10/24 × 9 = 3.75 points from the top of the interval. 4. Identify the percentile point. The top of the interval is 98.5. We subtract 3.75 from that value to identify the percentile point at the 75th percentile: 98.5 - 3.75 = 94.75. Thus, the percentile point at the 75th percentile is 94.75.
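The four steps can be sketched as a small Python function. This is only an illustration of the interpolation described above; the function and parameter names are hypothetical, and the 0.5 adjustment assumes scores recorded as whole numbers.

```python
def percentile_point(percentile, lower_limit, upper_limit, pct_below, pct_top):
    """Interpolate a percentile point within one interval.

    lower_limit, upper_limit: observed limits of the interval (e.g. 90 and 98)
    pct_below: cumulative percent at the top of the interval just below (e.g. 61)
    pct_top:   cumulative percent at the top of this interval (e.g. 85)
    """
    # Step 2: real limits and the two widths
    real_lower = lower_limit - 0.5
    real_upper = upper_limit + 0.5
    real_width = real_upper - real_lower     # 9 points in the example
    pct_width = pct_top - pct_below          # 24 percentage points in the example
    # Step 3: distance from the top of the interval, scaled to score units
    fraction_from_top = (pct_top - percentile) / pct_width
    distance_from_top = fraction_from_top * real_width
    # Step 4: subtract that distance from the real upper limit
    return real_upper - distance_from_top

point = percentile_point(75, lower_limit=90, upper_limit=98, pct_below=61, pct_top=85)
print(round(point, 2))  # 94.75
```

Step 1 (choosing the interval that contains the percentile) is done by the caller, who passes in the limits and cumulative percents of that interval.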
Cumulative Frequency Distribution
When researchers want to describe frequencies above or below a certain value, they often report a cumulative frequency distribution. It distributes the sum of frequencies across a series of intervals. You can add from the top or the bottom; it depends on how you want to discuss the data. Example: Intervals, f(x), calculation, and cumulative frequency (CF): 108-116, f = 2, 3+3+5+7+9+11+5+2 = 45; 99-107, f = 5, 3+3+5+7+9+11+5 = 43; 90-98, f = 11, 3+3+5+7+9+11 = 38; 81-89, f = 9, 3+3+5+7+9 = 27; 72-80, f = 7, 3+3+5+7 = 18; 63-71, f = 5, 3+3+5 = 11; 54-62, f = 3, 3+3 = 6; 45-53, f = 3, 3 = 3. The top cumulative frequency is equal to the total number of measures recorded (the total was 45 businesses in this example). This type of summary is "bottom up": it describes frequencies "at or below" a certain value, or "at most" that value. For example, if safe businesses are those that report fewer than 72 complaints, then in the CF column we find that 11 businesses are categorized as safe.
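A short Python sketch of the bottom-up summation, using the frequencies from the example above. The helper `count_below` is hypothetical, and it only counts intervals that lie entirely below the cutoff.

```python
from itertools import accumulate

# Intervals and frequencies listed from the bottom interval up.
intervals = [(45, 53), (54, 62), (63, 71), (72, 80),
             (81, 89), (90, 98), (99, 107), (108, 116)]
frequencies = [3, 3, 5, 7, 9, 11, 5, 2]

# Running "at or below" totals, one per interval
cumulative = list(accumulate(frequencies))

def count_below(value):
    """Cumulative frequency of all intervals that lie entirely below `value`."""
    total = 0
    for (lower, upper), cf in zip(intervals, cumulative):
        if upper < value:
            total = cf
    return total

print(cumulative)       # [3, 6, 11, 18, 27, 38, 43, 45]
print(count_below(72))  # 11
```

The last cumulative value equals the total count (45), and `count_below(72)` gives the 11 businesses with fewer than 72 complaints.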