Week 13 COMPLETE


The standard deviation is known as the

"mean of the mean," and it can often help you find the true meaning behind the data. The standard deviation describes how scores vary or deviate from their mean. To understand this concept, it helps to recall the normal distribution of data. As you know, a normal distribution of data means that most of the examples in a set of data are close to the "average," while relatively few examples tend to one extreme or the other. If we report the mean and standard deviation of a set of scores, we are usually able to convey a good picture of a distribution.

The mode is rarely used as a sole descriptive measure of central tendency because

it may not be unique: there may be two or more modes. These are called bimodal (two modes) or multimodal (several modes).

∑

is the symbol for summation, or sum.

advantage of using range

it is easy to calculate

the median and mode can be unaffected by extreme values

, while the mean can be distorted as a measure of central tendency by outliers.

Should one patient or a few patients in a population affect the average to this degree? If they do, is the statistical computation meaningful for decision-making purposes? In this situation, the hospital has two (2) options:

1) A notation can be made on the report that either the ALOS of 9.7 includes one patient who stayed 365 days, or the ALOS of 7.0 excludes one patient who stayed 365 days. Both calculations can be made. Appropriate notes should be attached to the report to indicate the difference.
2) The computation can use the median rather than the mean.

Obviously, manual computation of the median is much more time-consuming than computation of the arithmetic mean.

Also, it would be impractical with a large number of discharged patients. If the statistical computation is manual, it would be better to use the first option above. However, if the statistical computation is computerized, it would be better to use the second option.

The measure of central tendency that best represents various kinds of data is:

Skewed distribution: median
Nominal scores: mode
Ordinal scores: median
Interval scores: mean

For example, if each subject in a population had a score of 20, then the mean of the scores would be 20. When calculating the variance

The mean of 20 is subtracted from each score of 20, resulting in deviations that are all zero. When each deviation is squared, the result is still zero. When all squared deviations are summed, the total is zero. When zero is divided by N - 1 (19 if there are 20 subjects), the quotient is zero. This is as it should be; there is no variation when each observation is the same. When zero is entered into the numerator of the standard deviation formula, the solution will be zero. Again, when there is no variation, the standard deviation will equal zero, indicating that there is no variation in the observations. Don't forget that the standard deviation is a first cousin of the mean. Thus, when researchers report the mean (the most popular measure of central tendency), they usually also report the standard deviation.
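The steps above can be sketched in Python. This is only an illustration: it assumes a hypothetical group of 20 subjects (so N - 1 = 19), and the variable names are made up for the example.

```python
import statistics

# Assumption: 20 subjects, each with a score of 20.
scores = [20] * 20

mean = statistics.mean(scores)

# Step 1: subtract the mean from each score.
deviations = [score - mean for score in scores]

# Step 2: square each deviation.
squared_deviations = [d ** 2 for d in deviations]

# Step 3: sum the squared deviations.
sum_of_squares = sum(squared_deviations)

# Step 4: divide by N - 1 to get the variance.
variance = sum_of_squares / (len(scores) - 1)

# Step 5: the standard deviation is the square root of the variance.
std_dev = variance ** 0.5

print(mean, variance, std_dev)
```

Every intermediate value is zero, so both the variance and the standard deviation come out as zero.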

Advantage of the mode

It can be calculated for both quantitative and qualitative data. The mean and median are only appropriately used with quantitative data.

Weaknesses of the mode

It may not be descriptive of the distribution
It may not be unique
The entire distribution is not represented

∑ scores / N = X with a line over it

Total sum of all the values / Number of the values involved = Mean
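As a small illustration of this formula, the sketch below computes a mean in Python. The five scores are hypothetical and are not taken from the study set.

```python
# Assumption: five made-up scores used only to illustrate the formula.
scores = [70, 80, 90, 60, 100]

total = sum(scores)   # sum of all the values
n = len(scores)       # number of values involved
mean = total / n      # sum of scores / N

print(total, n, mean)
```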

Suppose for another group, the mean of their normal distribution also equals 70, but their standard deviation equals 5. Then, 68% of the cases lie within 5 points of the mean, that is, between 65 and 75.

Regardless of the value of the standard deviation, 68% of the cases lie within one standard deviation in a normal curve. This is a property of the normal curve. When you are calculating the standard deviation, you are actually calculating the number of points that one must go out from the mean to capture the middle 68% of the cases. However, this two-thirds rule does not strictly apply if the distribution is not normal. The less normal (or skewed) a distribution is, the less accurate the rule is.
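One way to check this property is with `NormalDist` from Python's standard `statistics` module. This is a sketch, not part of the original material: the helper name `share_within` is hypothetical, and the standard deviation of 10 for the first group is an assumption made only for comparison.

```python
from statistics import NormalDist

def share_within(mean, sd, k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)

# Same mean of 70, two different standard deviations (10 is an assumed value).
share_sd10 = share_within(70, 10, 1)
share_sd5 = share_within(70, 5, 1)

# Both are about 0.68, regardless of the standard deviation.
print(round(share_sd10, 2), round(share_sd5, 2))
```

The fraction is the same for both groups; only the width of the interval around the mean changes (60 to 80 versus 65 to 75).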

When you are presented with a standard deviation, keep the following multipliers in mind: Example: Standard Deviation Multipliers

Rule    Multiplier
68%     1
95%     2
99%     3
Multiply the standard deviation for a set of scores by the appropriate multiplier before adding and subtracting from the mean.

If the frequency distribution is a bilaterally symmetrical, unimodal distribution,

then all three measures of central tendency will be equal.

An advantage of using the median is that

it is relatively easy to calculate, takes the entire distribution into consideration, and is not influenced by outliers.

No measure of central tendency is universally preferred. Each is better under certain situations.

The most commonly used measure is the mean, followed by the median. The mean uses each value of the data set. The median is a better representation when outliers are present.

The larger the deviation from the mean, the larger the standard deviation.

The smaller the deviations from the mean, the smaller the standard deviation.

The 68% Rule

The standard deviation takes on a special meaning in relation to the normal bell curve. This is because it was designed expressly to describe this distribution. Here's a simple rule to remember: about two-thirds of the cases (or observations) lie within one standard deviation of the mean in a normal distribution. (Note that "within one standard deviation" means one standard deviation on both sides of the mean.)

When we look at the data for a population, often the first thing we do is look at the mean. But even if we know that the distribution is perfectly normal, the mean isn't enough to tell us what we need to know about the population. We also need to know something about how the data is spread out around the mean,

that is, how wide the bell curve is around the mean.

In other words, the median is the point

above and below which 50% of all values fall.

To arrive at the median in an even-numbered distribution,

add the two middle values together and divide by 2. When the two middle values are the same, the median is that value.

Bilaterally Symmetrical Curves

are curves that, when folded vertically down the center, are identical on both sides of the fold. All three measures of central tendency (mean, median, mode) will be identical. These three measures will lie at the center of the distribution and thus are called measures of central tendency. There are other curves which resemble the bell-shaped curve and are symmetrical, but they vary in their ratios of height to width. Some are referred to as "peaked" curves and others as "flat" curves.

The mean is the

arithmetic average. It is common to use the term average to designate mean. It is computed by dividing the sum of all observations (scores) by the total number of observations (scores). It is the most commonly used measure of central tendency.

The median value is obtained by

arranging the numerical observations in ascending or descending order and then determining the value in the middle of the array. This may be the middle observation (if there is an odd number of values) or a point halfway between the two middle numbers (if there is an even number of values). It is the midpoint of a distribution in which 50% of scores lie above the middle score and 50% lie below the middle score.
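The procedure described above can be sketched as a short Python function. The function name and the sample values are illustrative assumptions; the standard library's `statistics.median` does the same job.

```python
def median(values):
    """Return the middle value of a non-empty list of numbers."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)          # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the middle observation
    # even count: halfway between the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([3, 1, 2]))
print(median([4, 1, 3, 2]))
print(median([1, 5, 5, 9]))
```

In the last call the two middle values are the same, so the median is that value.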

Negative skewness

curve skewed to the left
mean is shifted to the left
mode shifts to the right
The mode will shift to the high point of the curve or away from the tail. The median will lie between the mean and mode. In summary, skewness affects measures of central tendency.

Positive skewness

curve skewed to the right
mean is shifted to the right
mode shifts to the left

the mean is sensitive to

extreme values, called outliers, which may distort its representation of the typical value of a set of numbers. The effect of outliers causes skewed curves, rather than bell-shaped curves. Therefore, the mean is not always the best measure of central tendency, because of the effect of outliers on its value.
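The effect of a single outlier can be seen in a short Python sketch. The lengths of stay below are hypothetical numbers chosen for illustration, not data from any report.

```python
import statistics

# Assumption: made-up lengths of stay in days.
stays = [3, 4, 5, 6, 7]
stays_with_outlier = stays + [365]   # add one extreme value

mean_before = statistics.mean(stays)
median_before = statistics.median(stays)

mean_after = statistics.mean(stays_with_outlier)
median_after = statistics.median(stays_with_outlier)

print(mean_before, median_before)
print(mean_after, median_after)
```

With the outlier added, the mean jumps from 5 to 65 while the median moves only from 5 to 5.5.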

Skewed curves

have scores concentrated at either the high or low end of the curve. They are not symmetrical in shape. The direction of skewness refers to the location of the tail, rather than the end where the height of the curve (where scores tend to be concentrated) occurs. A curve skewed to the right is one with the tail to the right end; a curve skewed to the left has a tail extending to the left. Skewness affects the location of the measures of central tendency. They will not be identical. Their location will shift.

The mode score is located at the

high point of a curve. In a truly normal distribution, the mode score value is located in the center of that curve. The mean, median, and mode will all be equal in a normal, unimodal (single mode), symmetrical, bell-shaped curve.

Normal Bell-Shaped Curve

is also referred to as a normal curve. It represents a "normal" distribution of data within certain parameters. It is considered the mathematical ideal.

As stated earlier, the median

is not as sensitive to outliers as the mean is.

X with a line over it

is the symbol for mean

N

is the symbol for the number of values (scores)

The mode is simple to locate but is impractical for most situations. The choice of a measure of central tendency depends on the

number of values and the nature of their distribution. Sometimes the mean, median, and mode will be identical. For statistical analysis, however, the mean is preferable, whenever possible, because it includes information from all observations. However, if the series of values contains a few that are unusually high or low, the median may represent the series better than the mean. The mode is often used in samples where the most typical value is preferred. The mode does not have to be numerical. If you ask every person in this class what his or her favorite food is and tally the answers, you stand a pretty good chance of finding a mode.

The mode is the value that

occurs most frequently and, in this sense, the value that is most typical. It is also the simplest of the measures of central tendency because it does not require any calculations. In the case of a small number of values, each value will likely occur only once and there will be no mode.
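Finding the mode amounts to tallying how often each value occurs. The sketch below is one way to do that in Python; the helper name `modes` and the sample data are assumptions, and it follows the convention above that there is no mode when every value occurs only once.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); empty list if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []   # each value occurs only once: no mode
    return [value for value, count in counts.items() if count == top]

# Works for qualitative data as well as numbers (hypothetical survey answers).
foods = ["pizza", "tacos", "pizza", "sushi", "tacos", "pizza"]

print(modes(foods))
print(modes([1, 2, 2, 3, 3]))
print(modes([4, 7, 9]))
```

The second call returns two values, which is the bimodal case; the third returns an empty list because no value repeats.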

These are measures of dispersion

range
variance
standard deviation

What term is used to describe the plotted graph that results when values are concentrated at either the high end or low end of the distribution, not around the mean?

skewed

The approximate 95% rule says

that if you go out 2 standard deviations on both sides of the mean in a normal distribution, you will find approximately 95% of the cases.
Example 1: The mean for a group equals 35 and the standard deviation equals 6. Two standard deviations equals 12 points (2 x 6 = 12). Thus, if you (a) go up 12 points from the mean (35 + 12 = 47) and (b) go down 12 points from the mean (35 - 12 = 23), you have identified the scores (47 and 23) between which approximately 95% of the cases lie.
The 99% rule says that if we go up and down 3 standard deviations from the mean, we find approximately 99% of the cases. For the information in Example 1, multiply 3 times the standard deviation (3 x 6 = 18). Going up and down 18 points from the mean yields these scores: 53 and 17.
Example 2: If the mean = 35 and the standard deviation = 6, then approximately:
68% of the cases lie between 29 and 41;
95% of the cases lie between 23 and 47; and
99% of the cases lie between 17 and 53.
You can see that almost all cases (99%) in a normal distribution lie within 3 standard deviations of the mean. Thus, for practical purposes we can say that a normal distribution has only six standard deviations: three above the mean and three below the mean.
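The arithmetic in Examples 1 and 2 can be reproduced with a few lines of Python. The function name `interval` and the `MULTIPLIERS` table are illustrative names, not part of the original material.

```python
# Rule (percent of cases) mapped to its standard deviation multiplier.
MULTIPLIERS = {68: 1, 95: 2, 99: 3}

def interval(mean, sd, multiplier):
    """Return (low, high): the mean minus and plus multiplier * sd."""
    spread = multiplier * sd
    return (mean - spread, mean + spread)

mean, sd = 35, 6
for percent, multiplier in MULTIPLIERS.items():
    low, high = interval(mean, sd, multiplier)
    print(f"about {percent}% of cases lie between {low} and {high}")
```

For a mean of 35 and a standard deviation of 6 this gives 29 to 41, 23 to 47, and 17 to 53, matching Example 2.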

The range is the difference between

the highest and lowest value in a data set. It is the easiest measure of variation to compute, but only the high and low scores are used in the computation. As sample size increases, the range also tends to increase. A major disadvantage of the range is that it does not include all scores (only the largest and smallest) in the distribution in its calculation. The range, like the mean, has the disadvantage of being influenced by outliers. The range is not a good measure of dispersion when there are outliers and is not regarded as a satisfactory measure of dispersion.
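A brief Python sketch shows both how simple the range is to compute and how strongly one outlier affects it. The scores are hypothetical and the function name `value_range` is an assumption.

```python
def value_range(values):
    """Difference between the highest and lowest value."""
    return max(values) - min(values)

# Assumption: made-up scores for illustration.
scores = [12, 15, 11, 18, 14]
scores_with_outlier = scores + [95]

range_before = value_range(scores)
range_after = value_range(scores_with_outlier)

print(range_before, range_after)
```

Only the largest and smallest scores enter the calculation, so adding a single extreme score changes the range from 7 to 84.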

The most commonly reported measures of central tendency include

the mean, median and mode.

the median is

the midpoint (center) of the distribution of values.

At the extreme, when all the scores are the same,

the standard deviation equals zero.

What is used to describe the average squared deviation from the mean of scores in a set of data?

variance

Statistically speaking, the standard deviation is the square root of the average squared deviation from the mean. The average squared deviation from the mean is called the

variance or measure of variation.
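The relationship between the variance and the standard deviation can be sketched in Python as follows. The scores are hypothetical. The sketch shows both the division by N (a literal "average" squared deviation) and the division by N - 1 used in the worked example elsewhere in this set; which divisor applies depends on whether the data are treated as a population or a sample.

```python
import math
import statistics

# Assumption: made-up scores chosen so the arithmetic is easy to follow.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(scores)
mean = sum(scores) / n
sum_of_squares = sum((x - mean) ** 2 for x in scores)

population_variance = sum_of_squares / n        # divide by N
sample_variance = sum_of_squares / (n - 1)      # divide by N - 1

# The standard deviation is the square root of the variance.
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

print(population_variance, population_sd)
```

For these scores the mean is 5, the population variance is 4, and the population standard deviation is 2.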

