Statistics - 3.1 Measures of Central Tendency
Resistant
A numerical summary of data is said to be resistant if extreme values (very large or small) relative to the data do not affect its value substantially.
A measure of central tendency numerically describes the average or typical data value. Three measures of central tendency are the mean, the median, and the mode.
The mean and median are usually used to measure the central tendency of a numerical data set. When the data set is skewed the median is the preferred measure of central tendency. Notice that the data set is almost symmetrical. Therefore, the data set is not skewed. Since the data set is not skewed, the central tendency that best describes the "center" of the distribution is the mean.
Mean
The mean of a variable is computed by determining the sum of all the values of the variable in the data set and dividing by the number of observations. If x 1, x 2,..., xn are n observations of a variable from a sample, then the sample mean, x over bar, is found using the formula below. x overbar=x1+x2+......+xn/n
Arithmetic mean
The arithmetic mean of a variable is computed by adding all the values of the variable in the data set and dividing by the number of observations.
A histogram of a set of data indicates that the distribution of the data is skewed right. Which measure of central tendency will likely be larger, the mean or the median? Why?
When data are either skewed left or skewed right, there are extreme values in the tail, which tend to pull the mean in the direction of the tail. If the distribution of the data is skewed right, there are large observations in the right tail. These observations tend to increase the value of the mean, while having little effect on the median.
Mode
The mode of a variable is the most frequent observation of the variable that occurs in the data set.
Population Arithmetic Mean
The population arithmetic mean (pronounced "mew") is computed using all the individuals in a population. The population mean is a parameter.
What does it mean if a statistic is resistant?
Extreme values (very large or small) relative to the data do not affect its value substantially. A statistic is resistant if it is not sensitive to extreme values.
True or False: A data set will always have exactly one mode.
False The mode of a variable is the most frequent observation of the variable that occurs in the data set. To compute the mode, tally the number of observations that occur for each data value. The data value that occurs most often is the mode. A set of data can have no mode, one mode, or more than one mode. If no observation occurs more than once, the data have no mode.
Sample Arithmetic Mean
The sample arithmetic mean (pronounced "x-bar"), is computed using sample data. The sample mean is a statistic.
The U.S. Department of Housing and Urban Development (HUD) uses the median to report the average price of a home in the United States. Why do you think HUD uses the median?
HUD uses the median because the data are skewed to the right, and the median is better for skewed data.
Median
The median of a variable is the value that lies in the middle of the data when arranged in ascending order. We use "M" to represent the median.
The acidity or alkalinity of a solution is measured using pH. A pH less than 7 is acidic; a pH greater than 7 is alkaline. The accompanying data represent the pH in samples of bottled water and tap water.
Find the mean pH for tap water, rounding to three decimal places. Note that the number of observations is n=12.. x overbar= 7.66 + 7.45 + 7.45 + 7.48 + 7.68 + 7.83 + 7.45 + 7.20 + 7.56 + 7.47 + 7.52 + 7.47 / 12 = 7.518