Ch. 3 Central Tendency


To find the median

First list the scores from lowest to highest. When the number of scores in the list is odd, the median is the middle score on that list. When the number of scores is even, the median is the mean of the two middle scores. For continuous data, remember that a score stands for an interval (for ex., a score of 4 covers 3.5 to 4.5), so the median could lie anywhere within that interval. For discrete data, 4 could be the median, but for continuous data, the median may not be exactly 4.0.
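The odd/even rule above can be sketched in a few lines of Python. This is a minimal illustration, not a standard routine: the function name `median` and the sample scores are made up for the example, and it covers only the simple rule, not the interpolated median used for continuous data.

```python
def median(scores):
    """Return the median of a non-empty list of numeric scores."""
    if not scores:
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(scores)      # list the scores from lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of scores: the median is the middle score.
        return ordered[mid]
    # Even number of scores: the mean of the two middle scores.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_result = median([10, 3, 8, 5, 11])      # ordered: 3 5 8 10 11 -> 8
even_result = median([1, 1, 4, 5, 7, 8])    # (4 + 5) / 2 -> 4.5
```

Note that the even case can return a value (4.5) that is not an actual score on the list.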

When to use the median

There are four reasons to use the median instead of the mean. The first three involve numerical data (interval or ratio scales of measurement), where the mean is usually preferred. The fourth involves ordinal data. In all four cases, either the mean cannot be calculated, or calculating it produces a value that is not central or not representative.
1. The distribution contains outliers or is skewed. The median is usually not affected by extreme scores. Ex: average income. Extremely wealthy people's incomes are outliers and do not represent the majority, so the median is preferred for income averages.
2. The distribution contains an undetermined score. Ex: a participant never completed the assigned experimental task. This participant still represents the population, and their score should still be included, but a mean cannot be calculated with an unknown score.
3. The distribution is open-ended, meaning a category has no upper or lower limit. Ex: if a study examined how many pizzas each student in a sample could eat, and one of the categories was "5 or more," that category does not give a measurable score.
4. The data are on an ordinal scale. The mean is based on distance (remember the see-saw, where the scores on each side of the fulcrum add up to the same total DISTANCE from it). Ordinal scales tell you only order, not distance, so the mean doesn't work.
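The income example from reason 1 can be checked with a small sketch. The income figures are invented for illustration; `mean` and `median` come from Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical yearly incomes, in thousands of dollars.
incomes = [30, 35, 40, 45, 50]
with_outlier = incomes + [1000]   # add one extremely wealthy person

mean_before = mean(incomes)            # 40
median_before = median(incomes)        # 40
mean_after = mean(with_outlier)        # 200
median_after = median(with_outlier)    # 42.5
```

One extreme score pulls the mean from 40 to 200, a value larger than five of the six incomes, while the median barely moves.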

Characteristics of the mean average

If you change the value of even one score, the mean changes, because the sum of X changes. If you add or remove a score, the mean usually changes, but not always. The exception is when the added or removed score is exactly equal to the mean before the change. Think of this score as sitting right on top of the fulcrum under the see-saw: it won't change the balance. If you add or subtract the same constant from every score, the mean increases or decreases by that same amount. Ex: If a treatment changes each participant's pretreatment score by 2 points, the mean changes by 2 points in the same direction. Likewise, if you multiply or divide every score by a constant, the mean is multiplied or divided by that same constant. Ex: If you measured in inches and want to convert the scores to feet, you divide each score by 12, and the mean is divided by 12 as well. Just remember that the measurement itself is still the same, whether you report it in inches, feet, cm, etc.
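Each of these characteristics can be verified numerically. The sketch below uses made-up scores and variable names; it is only a quick check of the rules, assuming Python's standard `statistics.mean`.

```python
from statistics import mean

scores = [2, 4, 6, 8]                      # hypothetical scores
base = mean(scores)                        # 5

# Changing one score changes the sum of X, so the mean changes.
one_changed = mean([2, 4, 6, 12])          # 6

# Adding a score equal to the current mean leaves the mean alone...
added_at_mean = mean(scores + [base])      # 5
# ...but adding any other score moves it.
added_elsewhere = mean(scores + [10])      # 6

# Adding 2 points to every score adds 2 points to the mean.
shifted = mean([x + 2 for x in scores])    # 7

# Converting inches to feet divides every score, and the mean, by 12.
inches = [24, 36, 48, 60]
feet = [x / 12 for x in inches]
mean_inches = mean(inches)                 # 42
mean_feet = mean(feet)                     # 3.5, which is 42 / 12
```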

Mean, Median, Mode

There is not always agreement on what the center of a distribution is, or on which value best represents a population. Statisticians have developed three measures of central tendency to be used in different situations.

Mean

Known as the arithmetic average. Computed by adding all the scores and dividing by the number of scores. The mean for a population is identified by the Greek letter mu, μ (pronounced "mew"). The mean for a sample is identified by the letter M or by an x with a bar over it, called "x-bar." In publications, however, M is used for a sample mean, not x-bar. In general, Greek letters are used to describe population parameters and English letters are used to describe sample statistics. The formula is M = ΣX / n for a sample and μ = ΣX / N for a population. Another way of thinking of the mean is as the amount each participant receives when the total is divided equally. A third way is to think of the mean as the balance point underneath a seesaw. The mean balances the seesaw because the total distance of the scores below the mean equals the total distance of the scores above the mean. This helps you see how the mean would have to change if a score were added, removed, or changed: the seesaw would tip unless you computed a new mean. The mean is almost always used in journal writing. It's usually the most stable average, and it is usually preferred when dealing with numerical scores. The mean also has the advantage of being closely related to variance and standard deviation, the most common measures of variability. The disadvantage of the mean is that it can be misleading when the scores are skewed or contain outliers. Sometimes a mean cannot be calculated, or it is not representative of the distribution.
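The seesaw idea can be made concrete with a short sketch. The scores are hypothetical; the code computes the mean as the sum of the scores over the number of scores, then adds up the distances on each side of it.

```python
scores = [1, 2, 6, 6, 10]                  # hypothetical sample

# Mean = sum of X divided by n.
m = sum(scores) / len(scores)              # 25 / 5 = 5.0

# Total distance from the mean on each side of the balance point.
distance_below = sum(m - x for x in scores if x < m)   # 4 + 3 = 7.0
distance_above = sum(x - m for x in scores if x > m)   # 1 + 1 + 5 = 7.0
```

Two scores sit below the mean and three sit above it, yet the seesaw balances, because the mean equalizes total distance rather than the number of scores on each side.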

Weighted mean

Sometimes you need to find the overall mean of two separate sets of scores. First find the "sum of X" for each set and add them together, which gives the overall sum of scores for both sets combined. Then divide by the total number of scores in both sets combined. This is written as overall mean = (ΣX1 + ΣX2) / (n1 + n2), where the 1 and 2 are subscripts identifying the two sets. When calculating the sum of X from a frequency distribution table, remember to count every score as many times as it occurs (its frequency), not just once per row of the table. For ex., there may be three 10s, not just one, and so on. If fX (frequency multiplied by X) is already calculated in the table, you can just add all the fX values together to get the sum of X. You can't simply take the midpoint of the two sample means, because one sample may be larger, or more "weighted," than the other. That's why you use this formula.
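Here is a minimal sketch of the weighted-mean procedure. The function names `combined_mean` and `sum_from_frequency_table`, and all the scores, are invented for illustration.

```python
def combined_mean(sample_1, sample_2):
    """Overall mean of two sets: (sum X1 + sum X2) / (n1 + n2)."""
    total = sum(sample_1) + sum(sample_2)
    count = len(sample_1) + len(sample_2)
    return total / count


def sum_from_frequency_table(table):
    """Given a dict {score X: frequency f}, return (sum of fX, n)."""
    sum_x = sum(x * f for x, f in table.items())
    n = sum(table.values())
    return sum_x, n


sample_1 = [10, 10, 10, 8]    # sum X = 38, n = 4, mean = 9.5
sample_2 = [4, 6]             # sum X = 10, n = 2, mean = 5.0

overall = combined_mean(sample_1, sample_2)    # 48 / 6 = 8.0
midpoint_of_means = (9.5 + 5.0) / 2            # 7.25, not the overall mean

# The same sum X for sample_1, read from a frequency table:
# three 10s and one 8.
sum_x, n = sum_from_frequency_table({10: 3, 8: 1})   # (38, 4)
```

The overall mean (8.0) sits closer to the mean of the larger sample than the simple midpoint (7.25) does.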

Median

The midpoint of a list of scores ordered from smallest to largest, i.e. the point below which 50% of the scores are located. Finding it means dividing the scores into two equal-sized groups (the same number of scores in each group). The median is the preferred average when the data are highly skewed. DISTANCE VS. # OF SCORES: The mean and the median use different definitions of the "middle" of the distribution. For the MEAN, the middle is defined by DISTANCE: the total distance of the scores below the mean equals the total distance of the scores above it. The MEDIAN defines the middle in terms of the number of SCORES, not distance, so half the scores are on each side of the median. The median can be equal to a score on the list or fall between two scores. There is no single algebraic equation for the median like the one for the mean; it is found by ordering and counting the scores, and different methods (e.g. with or without interpolation for continuous data) can give slightly different values.

Mode

The score or category that has the greatest frequency. In common usage, the word mode means "a popular style," and in statistics it is the most common score in a distribution. The definition of the mode is the same for a sample as it is for a population. There is no symbol or notation for the mode, and nothing to differentiate the sample mode from the population mode. The mode can be used with any scale of measurement, including nominal scales (e.g. favorite ice-cream flavor, even if the flavors are coded with numbers). The mode is the only measure of central tendency usable with nominal data, and it is also useful with discrete data. The mode is always an actual score in the distribution because it's the most frequent score, whereas the mean and median are often calculated values that are not actual scores in the distribution. Remember that the mode may not always be at the center of the distribution. It's just what is most frequent. Three main reasons to use the mode:
1. Nominal data. These data don't tell you anything about value or distance, just name differentiation, so you can't use a mean or median.
2. Discrete variables. You can calculate a mean when a discrete variable produces numerical scores, but the mode is often the more sensible measure. For ex., you can calculate the mean number of children in an American household as 2.4, but no household has 2.4 children, and people would rather hear a whole number for a discrete variable. So the mode would be better.
3. Describing shape. The mode is often included in text to show the overall shape of the distribution. The mode shows the peak(s) in the distribution, so it helps give a visual of the data.
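A short sketch of finding the mode by counting frequencies, using `collections.Counter` from the Python standard library. The flavor and household data are made up for illustration.

```python
from collections import Counter

# Nominal data: no mean or median is possible, but the mode works.
flavors = ["vanilla", "chocolate", "vanilla", "mint", "chocolate", "vanilla"]
flavor_counts = Counter(flavors)
mode_value, mode_freq = flavor_counts.most_common(1)[0]   # ("vanilla", 3)

# Discrete data: number of children in eight hypothetical households.
children = [0, 1, 2, 2, 2, 3, 3, 4]
mean_children = sum(children) / len(children)              # 17 / 8 = 2.125
mode_children = Counter(children).most_common(1)[0][0]     # 2
```

The mean (2.125 children) is not a value any household can have, while the mode (2) is an actual score in the data.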

Central Tendency

A measure of central tendency gives an average value for a distribution. The goal in measuring central tendency is to describe a distribution of scores by determining a single value that identifies the center of the distribution and shows what is most typical within it. Ideally this value is the best representative of the whole distribution. Not all averages represent a distribution equally well. If this is done well, the average should be able to represent an entire population. Central tendency is also very useful for comparing two or more very large populations. Ex: avg. rainfall in Seattle compared to avg. rainfall in Arizona. "Number crunching" is a term that is often used for this activity. There is no single, standard procedure for measuring central tendency, i.e. no single measure produces the MOST representative value in every situation.

Unimodal, Bimodal, and Multimodal Distributions

Usually a distribution has only one score that is the most frequent; this is a unimodal distribution. If two scores are tied for most frequent, the distribution is bimodal. If three or more are tied, it is a multimodal distribution. If there are a high number of modes (many scores tied for the highest frequency), the distribution is said to have no mode. A bimodal distribution usually indicates two distinct groups within a sample or population. For example, a measure of height will likely produce a bimodal distribution because of the difference between average male and average female height. Sometimes mode is used more casually to identify the peaks in a distribution, even if one or more of them doesn't technically have the highest frequency. In that case, the terms major mode and minor mode refer to the taller and shorter of two peaks in a bimodal distribution whose frequencies are close but not exactly the same.
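The strict definitions above (scores exactly tied for the highest frequency) can be sketched with `statistics.multimode`, available in Python 3.8 and later. The helper name `modality` and the sample scores are hypothetical, and the sketch does not capture the looser "peaks" usage with major and minor modes.

```python
from statistics import multimode


def modality(scores):
    """Return (list of modes, label) using exact ties for top frequency."""
    if not scores:
        raise ValueError("an empty list has no mode")
    modes = multimode(scores)   # every score tied for the highest frequency
    if len(modes) == 1:
        label = "unimodal"
    elif len(modes) == 2:
        label = "bimodal"
    else:
        label = "multimodal"
    return modes, label


one_peak = modality([1, 2, 2, 3])            # ([2], "unimodal")
two_peaks = modality([1, 1, 2, 3, 3])        # ([1, 3], "bimodal")
three_peaks = modality([1, 1, 2, 2, 3, 3])   # ([1, 2, 3], "multimodal")
```

In the last example every score is tied, which is the kind of distribution the text describes as having no mode; the label here only counts the ties.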

The center can be elusive.

When the distribution does not show a traditional bell curve, or close to it, it can be difficult to know what the center really is.

