Module 6: Statistics - Measures of Central Tendency and/or Location
Measures of Central Tendency
1. provide information about the center of a distribution 2. mean, median, mode 3. three different approaches 4. no information about variance
Measures of Location
1. Based on same concept as median 2. quartiles 3. percentiles 4. no information about variance
Using Means
1. define for your audience 2. know definition 3. at least interval level data 4. influenced by outliers 5. most common
Percentile Bars
A tool to help visualize the distribution of data. Based upon P10, P25, P50, P75, P90 and the mean. Can indicate the shape of the distribution. The "bullseye" represents the mean. These relationships among the various measures of central tendency or location will be helpful when examining the symmetry or skewness of a distribution, as will be discussed in Module 8.
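The values behind one percentile bar can be sketched in a few lines. This is a minimal illustration, not the course's own method: the salary figures are hypothetical, and it assumes Python's `statistics.quantiles` with `method="inclusive"` is an acceptable way to interpolate between ranked values.

```python
from statistics import mean, quantiles

# Hypothetical market salaries in thousands, for illustration only.
market = [30, 32, 35, 38, 40, 42, 45, 50, 58, 70, 95]

# quantiles(..., n=100) returns 99 cut points; cuts[k - 1] is the k-th percentile.
# method="inclusive" interpolates linearly between ranked values.
cuts = quantiles(market, n=100, method="inclusive")

# The five percentiles and the mean that make up one percentile bar.
bar = {f"P{k}": cuts[k - 1] for k in (10, 25, 50, 75, 90)}
bar["mean"] = mean(market)

for name, value in bar.items():
    print(f"{name}: {value:.1f}")

# A mean well above P50 hints that the distribution is skewed to the right.
print("mean above median:", bar["mean"] > bar["P50"])
```

Here the mean (the "bullseye") sits above P50, which is the kind of relationship the bar is meant to make visible.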
Mean
Arithmetic average: the mean is the sum of all values divided by the number of values (mean = sum of all values / count). x bar represents the mean: x bar = (sum of x) / N, where x is each individual value and N is the number of values. See pgs 147-148.
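A minimal sketch of the calculation, using hypothetical salary figures (the function name `arithmetic_mean` is only illustrative):

```python
# Hypothetical salary figures, for illustration only.
salaries = [40_000, 45_000, 50_000, 55_000, 60_000]

def arithmetic_mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)

x_bar = arithmetic_mean(salaries)
print(x_bar)  # 50000.0
```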
At least interval level data
As a general rule, you should have at least interval data before using the mean
Mean, median, mode
Common-sense measures of central tendency include the mean, median and mode
Using percentile bars
Here we show the percentile bars (market salary data) alongside internal company data, known as I-bars. This is the combination of market salary data and KLS salary structure data. With the whole company's pay structure on one chart, it is helpful to summarize all the data. The percentile bars represent the external market. The I-bars represent the internal structure. Compa-ratios and market indices can be easily determined. See example on pg 171.
Quartiles
In HR management, we often use quartiles. P25, P50 and P75 are the upper limits of the 25th, 50th and 75th percentiles. Q1, Q2 and Q3 are the upper limits of the 1st, 2nd and 3rd quartiles. Terciles, quintiles and deciles are also commonly used.
Provide information about the center of a distribution
Measures of central tendency are specific measures of location that provide information about the center of a distribution or about the most typical, representative value
No information about variance
Measures of central tendency do not give any information about how varied or dispersed data are
Based on the same concept as median
Measures of location are based on the same concept used to find the median: the median provides information about the central tendency, while other measures of location represent other positions in the distribution
Measures of Location (cont.)
Measures of location are very useful in setting pay philosophy. A measure of location is a value below which a given percent of the data falls. See example on pg 164.
No information about variance (measures of location)
Measures of location offer no information about how varied or dispersed the data are
Percentiles
Percentile bars, based on the 10th, 25th, 50th, 75th and 90th percentiles and the mean, provide very useful visualizations of a distribution
Comparing Various Measures of Central Tendency
See example on pg 161-162
Visualizing Percentiles for Madrid Data
See example on pg 165
Weighted and Unweighted Calculation Example
See pgs 149-153
Measures of Location - Finding Percentiles
Step 1: Locate the percentile's position in the rank order (right column).
Formula: rank order value = [(x/100) × n] + [1 - (x/100)]
Where: x = the desired percentile (25th, 50th, etc.); n = the number of cases in the list.
Step 2: Locate the value in the raw bonus points data (left column) that corresponds to the rank-ordered position. Use the rank order value from Step 1 to interpolate the percentile in the raw data. Interpolation gives an exact value when the rank order value has a decimal remainder.
Two-step formula:
a. Decimal remainder from Step 1 = percentile rank order - lower rank order
b. Percentile of raw data = lower value + [(higher value - lower value) × decimal remainder]
See example on pgs 167-169.
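The two steps can be sketched as follows. This is an illustration under stated assumptions, not the worked example from the text: the bonus point values are hypothetical, and the rank order formula is taken to be (x/100) × n + (1 - x/100), a common convention for interpolated percentiles.

```python
import math

def percentile(values, x):
    """Interpolated x-th percentile using the two-step approach."""
    ordered = sorted(values)          # place the data into rank order
    n = len(ordered)

    # Step 1: rank order value (1-based position; may have a decimal part).
    rank = (x / 100) * n + (1 - x / 100)

    # Step 2a: decimal remainder = rank order value - lower rank order.
    lower_rank = math.floor(rank)
    remainder = rank - lower_rank

    # Step 2b: lower value + (higher - lower) * decimal remainder.
    lower = ordered[lower_rank - 1]
    higher = ordered[min(lower_rank, n - 1)]  # stay in range when x = 100
    return lower + (higher - lower) * remainder

# Hypothetical bonus points, for illustration only.
bonus_points = [10, 20, 30, 40, 50, 60, 70, 80]

print(percentile(bonus_points, 25))  # rank 2.75 -> 27.5
print(percentile(bonus_points, 50))  # rank 4.5  -> 45.0
print(percentile(bonus_points, 75))  # rank 6.25 -> 62.5
```

For P25 with n = 8, the rank order value is 2.75, so the result lies three quarters of the way from the 2nd value (20) to the 3rd value (30).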
Most common
The mean is the most commonly reported measure of central tendency
Three different approaches
The measures are based upon three different approaches that help identify typical values for a distribution
Median Calculation Example
The median corresponds to a point along an ordered distribution such that the same number of observations fall above and below this point. The median measures only the middle value and ignores the actual values of the other observations; it is determined solely by which value sits in the middle when all of the values are placed in rank order. *The median is a robust measure: it is unaffected by changes in actual values and depends only on their order.* Steps: place the data into rank order, then determine the median. See pg 155. For an even number of data points, the median is the average of the two middle-most observations. See pgs 156-157.
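A minimal sketch of the steps, covering both the odd and the even case (the data values are hypothetical and the function name `median_value` is only illustrative):

```python
def median_value(values):
    """Middle value of the rank-ordered data."""
    ordered = sorted(values)              # step 1: place data into rank order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]               # odd count: the single middle value
    # even count: average of the two middle-most observations
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_value([7, 3, 9, 1, 5]))      # 5
print(median_value([7, 3, 9, 1]))         # 5.0

# Changing the largest value does not move the median.
print(median_value([7, 3, 900, 1, 5]))    # 5
```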
Mode
The mode is the value that occurs most often in a data set. Steps: place the data into rank order, then group the data into intervals or common values (a spreadsheet does not require you to rank the values). See pgs 158-160.
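Counting common values can be sketched as below. The data are hypothetical, and returning every tied value when more than one value shares the highest count is a choice made for this illustration, not something the text specifies.

```python
from collections import Counter

def modes(values):
    """Value(s) that occur most often; ties are all returned."""
    counts = Counter(values)              # group the data by common value
    top = max(counts.values())            # highest frequency observed
    return sorted(v for v, c in counts.items() if c == top)

print(modes([3, 5, 5, 7, 7, 7, 9]))       # [7]
print(modes([1, 1, 2, 2, 3]))             # [1, 2]
```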
Weighted vs Unweighted Mean
The weighted mean gives more weight to companies that have more people. The unweighted mean treats all three banks equally and disregards the number of people. The example shown is therefore the unweighted mean, since we use the aggregated company data and have no data on individual employee salaries in any of the three companies. See pg 149. The distinction between weighted and unweighted means then becomes important.
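The difference between the two calculations can be sketched with summary data of the kind a survey might report. The three banks, their incumbent counts and their average salaries below are hypothetical and are not the figures from the text.

```python
# Hypothetical survey summary: (company, incumbents, average salary).
survey = [
    ("Bank A", 10, 50_000),
    ("Bank B", 30, 60_000),
    ("Bank C", 60, 70_000),
]

# Unweighted mean: every company's average counts equally.
unweighted = sum(avg for _, _, avg in survey) / len(survey)

# Weighted mean: each company's average is weighted by its incumbents.
total_incumbents = sum(n for _, n, _ in survey)
weighted = sum(n * avg for _, n, avg in survey) / total_incumbents

print(unweighted)  # 60000.0
print(weighted)    # 65000.0
```

The weighted result is pulled toward Bank C because it has the most incumbents.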
Measures of central tendency are measures of the location of the center of a distribution, or, in other words, the most typical, representative value. Measures of location provide information about other points in a data set.
Together, these tools help make sense of raw data so the compensation professional can interpret and present the results of his or her research. In this module, we will review measures of central tendency and location, and discuss percentile bars.
Know definition
When you read a mean, know its definition. Ask questions if you are not sure what kind of mean you are reading.
Define for your audience
When you report a mean, define it for your audience. Explain clearly whether you are calculating a weighted or an unweighted mean, and why.
Influenced by outliers
Means are influenced by outliers, while medians and modes are not
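A small illustration with hypothetical values: replacing the largest value with an extreme one shifts the mean substantially, while the median and mode of this data set stay where they were.

```python
from statistics import mean, median, mode

# Hypothetical values, for illustration only.
base = [50, 52, 52, 54, 56]
with_outlier = [50, 52, 52, 54, 500]   # largest value replaced by an outlier

for label, data in (("base", base), ("with outlier", with_outlier)):
    print(label, "mean:", mean(data), "median:", median(data), "mode:", mode(data))
```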
Weighted mean
obtained by weighting each value by the number of times that value has occurred in the set of data, and then averaging
Unweighted mean
obtained without weighting each value by the number of times that value has occurred in the set of data. In some surveys, raw data are not disclosed; instead, data are presented in summary form. Example: incumbents and average salaries given by a company.