Statistics Chapter 2: Frequency Distributions
Guidelines for Grouped Frequency Distribution Tables
1. The grouped frequency distribution table should have about 10 class intervals. 2. The width of each interval should be a relatively simple number. 3. The bottom score in each class interval should be a multiple of the width. 4. All intervals should be the same width. They should cover the range of scores completely with no gaps and no overlaps, so that any particular score belongs in exactly one interval.
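As a rough illustration of these guidelines, the sketch below builds a grouped table from whole-number scores. The function name `grouped_frequency_table`, the sample scores, and the chosen width of 5 are assumptions made for this example only.

```python
from collections import Counter

def grouped_frequency_table(scores, width):
    """Return (bottom, top, f) rows, highest interval first.

    Assumes whole-number scores, so the apparent upper limit of an
    interval is bottom + width - 1.
    """
    # Guideline 3: every interval starts at a multiple of the width.
    counts = Counter((x // width) * width for x in scores)
    lowest_bottom = (min(scores) // width) * width
    bottom = (max(scores) // width) * width
    rows = []
    # Guideline 4: equal-width intervals with no gaps and no overlaps.
    while bottom >= lowest_bottom:
        rows.append((bottom, bottom + width - 1, counts.get(bottom, 0)))
        bottom -= width
    return rows

# Illustrative data only (N = 25, scores from 53 to 94).
scores = [53, 58, 61, 64, 69, 70, 72, 73, 75, 75, 76, 78, 80,
          82, 84, 84, 84, 87, 87, 88, 89, 91, 93, 94, 60]
rows = grouped_frequency_table(scores, width=5)
for bottom, top, f in rows:
    print(f"{bottom}-{top}\t{f}")
```

With a width of 5 these scores fall into 9 intervals, which is close to the suggested 10, and every score lands in exactly one interval.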
Apparent Limits
Values of an interval that form the upper and lower boundaries for the class interval.
Relative Frequencies
Although you cannot usually find the absolute frequency for each score in a population, you very often can obtain relative frequencies. You can represent relative frequencies in bar graphs.
Symmetrical Distribution
In a symmetrical distribution, it is possible to draw a vertical line through the middle so that one side of the distribution is a mirror image of the other.
Proportions
A proportion measures the fraction of the total group that is associated with each score. Proportions describe the frequency (f) in relation to the total number (N), so p = f/N, and are often called *relative frequencies*. They can be expressed as fractions or decimals. A column of proportions, headed with *p*, can be added to the basic frequency distribution table.
Smooth Curves
When a population consists of numerical scores from an interval or a ratio scale, it is customary to draw the distribution with a smooth curve instead of the jagged, step-wise shapes that occur in histograms and polygons. The smooth curve indicates that you are not connecting a series of dots (real frequencies) but instead are showing the relative changes that occur from one score to the next.
Polygons
The second option for graphing a distribution of numerical scores from an interval or ratio scale measurement. A polygon can be used with data that have been grouped into class intervals.
Elements of Frequency Distribution
A frequency distribution can be structured either as a table or a graph and always presents the same two elements: 1. The set of categories that make up the original measurement scale. 2. A record of the frequency, or number of individuals in each category. Thus, it presents a picture of how the individual scores are distributed on the measurement scale.
Frequency Distribution Graphs
A picture of the information available in a frequency distribution table. All have two perpendicular lines called *axes*. The horizontal line is the X-axis (abscissa) and the vertical line is the Y-axis (ordinate). The measurement scale (set of X values) is listed along the X-axis with values increasing from left to right. The frequencies are listed on the Y-axis with values increasing from bottom to top. A general rule is that the graph should be constructed so that its height (Y-axis) is approximately two-thirds to three-quarters of its length (X-axis).
Frequency Distribution
An organized tabulation of the number of individuals located in each category on the scale of measurement. It takes a disorganized set of scores and places them in order from highest to lowest, grouping together individuals who all have the same score. It shows whether the scores are generally high or low, whether they are concentrated in one area or spread out across the entire scale, and generally provides an organized picture of the data. It also allows you to see the location of any individual score relative to all the other scores in the set.
Bar Graph
Essentially the same as a histogram, except that spaces are left between adjacent bars. For a *nominal scale*, the space between the bars emphasizes that the scale consists of separate, distinct categories. For *ordinal scales*, separate bars are used because you cannot assume that the categories are all the same size. To construct a bar graph, list the categories of measurement along the X-axis and then draw a bar above each category so that the height of the bar equals the frequency for the category.
Skewed Distribution
In a skewed distribution, the scores tend to pile up toward one end of the scale and taper off gradually at the other end. The section where the scores taper off is called the *tail* of the distribution. The skewed distribution with the tail on the right-hand side is *positively skewed* because the tail points toward the positive (above-zero) end of the X-axis. If the tail points to the left, the distribution is *negatively skewed*.
Proportions and Percentages
In addition to the two basic columns of a frequency distribution table, there are other measures that describe the distribution of scores and can be incorporated into the table. The two most common are proportion and percentage.
Percentages
In addition to using frequencies (f) and proportions (p), researchers often describe a distribution of scores with percentages. To compute the percentage associated with each score, you first find the proportion (p) and then multiply by 100. Percentages can be added in a frequency distribution table by adding a column headed with %.
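The proportion and percentage columns can be sketched as follows, using p = f/N and % = p × 100. The function name `proportion_table` and the sample scores are assumptions made for illustration.

```python
from collections import Counter

def proportion_table(scores):
    """Return (X, f, p, percent) rows, highest X first."""
    n = len(scores)                      # N, the total number of scores
    counts = Counter(scores)
    rows = []
    for x in sorted(counts, reverse=True):
        f = counts[x]
        p = f / n                        # proportion: p = f / N
        rows.append((x, f, p, p * 100))  # percentage: p * 100
    return rows

# Illustrative data only (N = 10).
table = proportion_table([5, 4, 4, 3, 3, 3, 2, 2, 1, 1])
print("X\tf\tp\t%")
for x, f, p, pct in table:
    print(f"{x}\t{f}\t{p:.2f}\t{pct:.0f}%")
```

The proportions in the *p* column sum to 1, and the percentages sum to 100.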
The Shapes of a Frequency Distribution
Rather than drawing a complete frequency distribution graph, researchers often simply describe a distribution by listing its characteristics. There are three characteristics that completely describe any distribution: shape, central tendency, and variability. In simple terms, central tendency measures where the centre of the distribution is located. Variability tells us whether the scores are spread over a wide range or are clustered together. Technically, the shape of a distribution is defined by an equation that prescribes the exact relationship between each X and Y value on the graph. Nearly all distributions can be classified as either *symmetrical* or *skewed*.
Frequency Distribution Tables
The simplest frequency distribution table presents the measurement scale by listing the different measurement categories (X values) in a column from highest to lowest. Beside each X value, we indicate the frequency, or the number of times that particular measurement occurred in the data. Note that the X values in a frequency distribution table represent the scale of measurement, not the actual set of scores. Also, the frequencies can be used to find the total number of scores in the distribution. By adding up the frequencies, you obtain the total number of individuals.
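A minimal sketch of such a table, assuming whole-number scores so that every value on the scale between the lowest and highest score can be listed. The function name `frequency_table` and the sample data are hypothetical.

```python
from collections import Counter

def frequency_table(scores):
    """Return (X, f) rows for every X from the highest score down to the lowest."""
    counts = Counter(scores)
    # The X column represents the measurement scale, so values that never
    # occurred are still listed, with f = 0.
    return [(x, counts.get(x, 0))
            for x in range(max(scores), min(scores) - 1, -1)]

# Illustrative data only (N = 20).
scores = [8, 9, 8, 7, 10, 9, 6, 4, 9, 8, 7, 8, 10, 9, 8, 6, 9, 7, 8, 8]
table = frequency_table(scores)
for x, f in table:
    print(f"{x}\t{f}")

# Adding up the frequencies recovers the total number of scores, N.
total = sum(f for _, f in table)
```

Here the score 5 never occurs, so it appears in the table with a frequency of 0, and the frequencies still add up to N = 20.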
Histograms
To construct a histogram, you first list the numerical scores (the categories of measurement) along the X-axis. Then you draw a bar above each X value so that: a.) The height of the bar corresponds to the frequency for that category. b.) For continuous variables, the width of the bar extends to the real limits of the category. For discrete variables, each bar extends exactly half the distance to the adjacent category on each side. For both continuous and discrete variables, each bar in a histogram extends to the midpoint between adjacent categories. As a result, adjacent bars touch and there are no spaces or gaps between bars.
Real Limits and Frequency Distributions
When a continuous variable is measured, the resulting measurements correspond to intervals on the number line rather than single points. The concept of real limits also applies to class intervals of a grouped frequency distribution table.
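As a small illustration, the sketch below computes real limits under the assumption that they lie half a measurement unit below and above the apparent limits (0.5 for whole-number scores). The function name `real_limits` and the `unit` parameter are introduced here for the example only.

```python
def real_limits(lower, upper, unit=1):
    """Return (lower real limit, upper real limit) for a score or class interval.

    Assumes scores are recorded to the nearest `unit`, so the real limits
    sit half a unit beyond the apparent limits.
    """
    half = unit / 2
    return lower - half, upper + half

single_score = real_limits(8, 8)       # a single score of X = 8
class_interval = real_limits(40, 49)   # the class interval 40-49
print(single_score, class_interval)
```

Under this assumption the interval 40-49 has an apparent width of 10 scores and real limits that are exactly 10 points apart.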
Grouped Frequency Distribution Tables
When a set of data covers a wide range of values, it is unreasonable to list all of the individual scores in a frequency distribution table. Instead, we can group the scores into intervals and then list the intervals in the table. With this type of table we present groups of scores rather than individual scores. The groups, or intervals, are called *class intervals*. When the scores are whole numbers, the total number of rows for a regular table can be obtained by finding the difference between the highest and lowest scores and adding 1: rows = highest - lowest + 1
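The row count can be checked with a short sketch. The helper names and the example range of 41 to 94 are assumptions for illustration, and the interval count assumes each class interval starts at a multiple of its width.

```python
def regular_row_count(lowest, highest):
    """Rows needed to list every whole-number score: highest - lowest + 1."""
    return highest - lowest + 1

def interval_count(lowest, highest, width):
    """Class intervals needed when each interval starts at a multiple of width."""
    return highest // width - lowest // width + 1

lowest, highest = 41, 94  # illustrative range only
rows_needed = regular_row_count(lowest, highest)
print(f"regular table: {rows_needed} rows")

# Compare a few simple widths to see which gives a manageable table.
options = {w: interval_count(lowest, highest, w) for w in (2, 5, 10)}
for width, n_intervals in options.items():
    print(f"width {width}: {n_intervals} intervals")
```

A regular table for this range would need 54 rows, while a width of 5 reduces it to 11 class intervals.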
Graphs for Interval or Ratio Data
When data consist of numerical scores that have been measured on an interval or ratio scale, there are two options for constructing a frequency distribution graph. The two types of graphs are called *histograms* and *polygons*.
Graphs for Nominal or Ordinal Data
When scores are measured on a nominal or ordinal scale (usually non-numerical values), the frequency distribution can be displayed as a *bar graph*.
Distribution of Scores
Whenever the word distribution appears, you should conjure up an image of a frequency distribution graph. The graph provides a picture showing exactly where the individual scores are located. To make this more concrete, you might find it useful to think of the graph as showing a pile of individuals.
Graphs for Population Distributions
When you can obtain the exact frequency for each score in a population, you can construct frequency distribution graphs that are essentially the same as the histograms, polygons, and bar graphs that are typically used for samples. Although it is still possible to construct graphs showing frequency distributions for extremely large populations, the graphs usually involve two special features: relative frequencies and smooth curves. We can describe a normal distribution as being symmetrical, with the greatest frequency in the middle and relatively smaller frequencies as you move toward either extreme.
