Unit 6: Describing Data, Statistics: Mean, Median, Mode, Range, Interquartile Range, Standard Deviation, and Mean Absolute Deviation


3.6

What is the mean absolute deviation of these numbers? 14, 7, 6, 5, 13

3.1

What is the mean absolute deviation of these numbers? 20, 21, 30, 17, 14, 18, 20

2.5

What is the mean absolute deviation of these numbers? 22, 20, 28, 24

1.25

What is the mean absolute deviation of these numbers? 5, 7, 4, 6, 3, 5, 3, 7

2.0

What is the mean absolute deviation of these numbers? 51, 57, 52, 55, 53, 50

2.9

What is the mean absolute deviation of these numbers? 90, 80, 92, 91, 88, 89

28

What is the median for this data set: 19, 27, 14, 28, 30, 51, 28 ?

20

What is the median for this data set: 20, 21, 18, 17, 20, 15, 56?

58

What is the median for this data set: 52, 72, 96, 21, 58, 40, 75?

bimodal distribution

"Bi" means two, so a bimodal distribution has two local maxima (peaks). A bimodal distribution can be symmetrical if the two peaks are mirror images of each other. ("Mode" here refers to the number of peaks.)

steps to find the MAD

1) find the mean of the data set; 2) subtract to find the distance of each value from the mean (ignore the sign, so every distance is positive); 3) find the mean of the distances (divide the sum of the distances by the number of values)
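The three steps above can be sketched in Python. This is a minimal illustration, not part of the original card set; the function name `mean_absolute_deviation` is made up for the example.

```python
def mean_absolute_deviation(values):
    """Average distance of each value from the mean of the data set."""
    mean = sum(values) / len(values)              # step 1: mean of the data
    distances = [abs(v - mean) for v in values]   # step 2: distance of each value from the mean
    return sum(distances) / len(distances)        # step 3: mean of the distances

print(mean_absolute_deviation([14, 7, 6, 5, 13]))  # 3.6
```

For 14, 7, 6, 5, 13 the mean is 9, the distances are 5, 2, 3, 4, 4, and their mean is 18 / 5 = 3.6.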

10

Find the mean of the data set: 5, 8, 17, 20, 0

44

Find the mode(s): 44, 29, 13, 44, 15

75

What is the Q3 for this data set: 52, 72, 96, 21, 58, 40, 75?

Statistics

Statistics is a branch of mathematics dealing with data collection, organization, analysis, interpretation and presentation.

statistics

The science of collecting, organizing, analyzing, and interpreting data.

11

What is the IQR for this data set: 19, 27, 14, 28, 30, 51, 28 ?

4

What is the IQR for this data set: 20, 21, 18, 17, 20, 15, 56?

Symbol for Mean

X Bar

95%

_____________ of the data points SHOULD fall within +/- 2 SD (standard deviations) of the mean

sigma notation

a form of notation using the symbol sigma (Σ) to express a sum of terms; for the MAD, it means adding up the distances left after subtracting the mean from each value.

Standard Deviation

equation for __________________

Large

poor measuring, poor precision, or contamination is the cause of a _______ "s" (standard deviation)

Range

standard deviation and variance are better ways to measure variation than ______________

Precise

standard deviation is a good measurement to find out how _________ data is

Standard Deviation

the square root of the variance and is a measure of how closely the individual points cluster around the mean; to find the ______________________, you MUST first find the variance

Variance

mathematically defined as the average squared distance from the mean; square of standard deviation
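The relationship between variance and standard deviation can be sketched as follows. This assumes the "average squared distance" definition on the card, which divides by the number of values (the population form); many courses use the sample form instead, which divides by one less than the number of values. The function names are made up for the example.

```python
import math

def population_variance(values):
    """Average squared distance of each value from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def population_std_dev(values):
    """Square root of the variance, so the variance must be found first."""
    return math.sqrt(population_variance(values))

data = [22, 20, 28, 24]
print(population_variance(data))  # 8.75
print(population_std_dev(data))   # square root of 8.75, a little under 3
```

Python's standard `statistics` module offers `pvariance` and `pstdev` for the population form and `variance` and `stdev` for the sample form.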

mean absolute deviation

one measure of variability; the average of how much the individual scores of a data set differ from the mean of the set. - abbreviation: MAD

two types of data

quantitative - numerical information (quantity); qualitative - descriptive information (quality)

Numbers

standard deviation takes into account all of the ____________ and indicates how similar or different the numbers are in the data set

deviation

the DIFFERENCE between a value in a data set and some fixed value, usually the mean of the set.

steps to find the IQR

1) order the data; 2) find the median of the data; 3) find the median of the lower half of the data (Q1); 4) find the median of the upper half of the data (Q3); 5) subtract Q1 from Q3 (IQR = Q3 - Q1)
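The steps above can be sketched in Python. This sketch assumes the convention the cards in this set appear to follow: when the number of values is odd, the middle value is left out of both halves. Other conventions (and many software packages) can give slightly different quartiles. The function names are made up for the example.

```python
def median_of_sorted(sorted_values):
    """Median of a list that is already in order."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    ordered = sorted(values)               # step 1: order the data
    n = len(ordered)
    lower = ordered[: n // 2]              # values below the median position
    upper = ordered[(n + 1) // 2 :]        # values above the median position
    return (median_of_sorted(lower),       # step 3: Q1
            median_of_sorted(ordered),     # step 2: median
            median_of_sorted(upper))       # step 4: Q3

q1, q2, q3 = quartiles([19, 27, 14, 28, 30, 51, 28])
print(q1, q2, q3, q3 - q1)  # 19 28 30 11
```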

Second Quartile (Q2)

50th percentile (median) - middle of the data

asymmetrical distribution

A distribution that is not symmetric about the mean: it is either skewed right (also known as positively skewed) or skewed left (also known as negatively skewed). For a right-skewed distribution, the mean is typically greater than the median, the tail on the right-hand (positive) side is longer than the tail on the left-hand side, and in a box-and-whisker diagram the median is closer to the first quartile than to the third quartile. A distribution that is skewed left has exactly the opposite characteristics: the mean is typically less than the median; the tail is longer on the left-hand side than on the right-hand side; and the median is closer to the third quartile than to the first quartile.

measure of spread and variation

A measure that describes the distribution (the spread) of a data set. Examples: range, IQR, mean absolute deviation.

measure of center

A measure that describes the typical value of a data set (such as mean, median, mode)

statistical question

A question for which you do not expect a single answer, because the answers will vary from one data point to the next.

Unimodal Distribution

A unimodal distribution is a distribution with one clear peak or most frequent value. The values increase at first, rise to a single peak, and then decrease. The "mode" in "unimodal" refers to a local maximum in a chart rather than to the most frequent number in a data set. In practice they are the same thing: one mode (one most common number) will equal one peak in a graph. However, when you are looking at a graph and trying to decide if it's a unimodal distribution or not, there's no list of numbers to guide you. The normal distribution is an example of a unimodal distribution; the normal curve has one local maximum (peak).

mode

Although you might commonly associate "mode" with being the most frequently occurring number in a data set, the term mode actually has two meanings in statistics, which can be confusing: it can either be a local maximum in a chart, or it can be the most frequently occurring value in a data set. The "mode" in bimodal distribution means a local maximum in a chart (i.e. a local mode). The two meanings usually agree, as the most commonly found item in a data set will show up as a peak. But when you're trying to categorize graphs, it's easier to think of the mode as a "peak" rather than a common number. This helps especially if the axes are not labeled.

when to use mode?

Categorical data (data that is not numerical), or when you want the values that appear most frequently. Ex. Find the most common or most popular results of surveys, elections, and lists of things, such as which school won the tennis championship the most times or students' favorite pizza toppings in a survey.
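Finding the mode works the same way for numbers and for categories. A minimal sketch using `multimode` from Python's standard `statistics` module (available in Python 3.8 and later); the pizza-topping survey responses are invented for illustration.

```python
from statistics import multimode

# Numerical data taken from cards in this set
print(multimode([44, 29, 13, 44, 15]))                    # [44]
print(multimode([92, 100, 100, 23, 24, 99, 13.5, 13.5]))  # [100, 13.5]

# Categorical data: hypothetical survey of favorite pizza toppings
toppings = ["pepperoni", "cheese", "pepperoni", "mushroom"]
print(multimode(toppings))                                # ['pepperoni']
```

`multimode` returns every value tied for most frequent, which is why the second data set has two modes.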

Other Symmetric Distributions

Cauchy distributions are symmetric. You're unlikely to come across these in elementary stats. They are a family of distributions where the expected value doesn't exist. The logistic distribution is also symmetric and has long tails. The logistic and Cauchy distributions are used if the data is symmetric but there are more extreme values than you would expect to find in a normal distribution. The uniform distribution is symmetric as well. The probabilities are exactly the same at each point, so the graph of the distribution is basically a flat, horizontal line. An example of a uniform probability distribution could be picking a card from a deck: the probability of picking any one card is the same, 1/52.

examples of continuous and discrete data

Discrete - kids in a classroom (no fraction of a student); rolling dice (cannot take numbers above 6). Continuous - person's height (in a range of human heights); time in a race (measured to fractions of a second)

34.5

Find the mean: 39, 34, 35, 42, 45, 12

5.3 (64 ÷ 12 ≈ 5.33)

Find the mean: 9, 7, 2, 4, 3, 5, 9, 6, 8, 0, 3, 8

7

Find the median of the data set: 7, 12, 1, 5, 7, 8

how to find the median in an even numbered set of data

Order the data, then find the mean of the two middle values. Ex. 64, 77, 79, 80, 84, 86, 90, 90, 94, 96, 97, 98 The two middle values are 86 and 90. The median is the mean of those two values: (86 + 90) / 2 = 88

28.5

Find the median: 21, 32, 45, 60, 13, 25

86

Find the median: 79, 86, 86, 97, 100, 95, 67

45

Find the median: 87, 56, 45, 23, 23, 34, 64, 23, 54

7

Find the mode(s) of the data set: 7, 12, 1, 5, 7, 8, 7

64

Find the mode(s): 89, 72, 34, 35, 35, 35, 64, 64, 77, 88, 64, 88, 64

100 and 13.5

Find the mode(s): 92, 100, 100, 23, 24, 99, 13.5, 13.5

14

Find the range of the data set: 2, 4, 6, 8, 12, 16

54

Find the range: 2, 8, 10, 12, 56, 9, 5, 2, 4

88

Find the range: 56, 79, 80, 80, 93, 16, 16, 17, 5

96

Find the range: 67, 35, 67, 67, 13, 108, 109

53

Find the range: 73, 45, 67, 87, 98

Multimodal Distribution

A multimodal distribution is a probability distribution with more than one peak, or "mode." A distribution with one peak is called unimodal. A distribution with two peaks is called bimodal. A distribution with two or more peaks is multimodal, so a bimodal distribution is also multimodal. If you can't clearly find one peak or two peaks in a graph, the likelihood is that you have either a uniform distribution (where all the bars are about the same height) or a multimodal distribution with several peaks.
Special shapes: A comb distribution is so-called because the distribution looks like a comb, with alternating high and low peaks. A comb shape can be caused by rounding off. For example, if you are measuring water height to the nearest 10 cm and your class width for the histogram is 5 cm, this could cause a comb shape. An edge peak distribution is where there is an additional, out-of-place peak at the edge of the distribution. This usually means that you've plotted (or collected) your data incorrectly, unless you know for sure your data set has an expected set of outliers (i.e. a few extreme views on a survey). A multimodal distribution is known as a plateau distribution when there are more than a few peaks close together.
Causes of a multimodal distribution: A multimodal distribution in a sample is usually an indication that the distribution in the population is not normal. It can also indicate that your sample has several patterns of response or extreme views, preferences or attitudes. When thinking about the cause of the multimodality, take a close look at your data; what may be going on is that two or more distributions are being mapped at the same time. This is opposed to a true multimodal distribution, where only one distribution is mapped. For example, a graph of results for two groups of students, one of which studied and one of which didn't, can show one peak for each group.

when to use mean?

Normally distributed data (symmetrical) with no extreme highs or lows (outliers). Extreme numbers (outliers) pull the mean toward them, because the mean gives equal weight to every value. Ex. Useful in calculating grade averages, sports statistics, average speeds. Ex. of asymmetrical data: 10, 12, 12, 25, 11, 17, 15, 2

when to use median?

Not normally distributed data (asymmetrical); Less weight to the extreme values (outliers) Ex. Useful in comparing prices (a consumer/customer may not want to buy the cheapest or most expensive but wants to find a price right in the middle) - A buyer looking to buy a house in a neighborhood might look to find the median price of the houses sold in that area, so the buyer can compare prices to see if a house is overpriced (expensive compared to the other houses) or undervalued (inexpensive compared to the other houses) due to its condition.
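The effect of an outlier on the mean and the median can be checked with a short sketch. It reuses the data set 20, 21, 18, 17, 20, 15, 56 from elsewhere in this set and treats 56 as the outlier; the variable names are made up for the example.

```python
from statistics import mean, median

data = [20, 21, 18, 17, 20, 15, 56]
without_outlier = [v for v in data if v != 56]  # drop the extreme value

# The median barely moves when the outlier is removed...
print(median(data), median(without_outlier))    # 20 19.0

# ...while the mean drops from about 23.9 to 18.5
print(mean(data), mean(without_outlier))
```

Because one extreme value shifts the mean by more than 5 but the median by only 1, the median is the safer measure of center for this data.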

range

The difference between the greatest value and the least value.

18

What is the IQR for this data set: 22, 83, 80, 87, 75, 69, 88 ?

35

What is the IQR for this data set: 52, 72, 96, 21, 58, 40, 75?

19

What is the Q1 for this data set: 19, 27, 14, 28, 30, 51, 28 ?

17

What is the Q1 for this data set: 20, 21, 18, 17, 20, 15, 56?

69

What is the Q1 for this data set: 22, 83, 80, 87, 75, 69, 88 ?

40

What is the Q1 for this data set: 52, 72, 96, 21, 58, 40, 75?

30

What is the Q3 for this data set: 19, 27, 14, 28, 30, 51, 28 ?

21

What is the Q3 for this data set: 20, 21, 18, 17, 20, 15, 56?

87

What is the Q3 for this data set: 22, 83, 80, 87, 75, 69, 88 ?

2.4

What is the mean absolute deviation of the numbers? 33, 38, 31, 36, 37

4

What is the mean absolute deviation of the values? $63, $59, $72, $68, $61, $67

10.0

What is the mean absolute deviation of these numbers? 110, 108, 100, 90, 88, 80

4.0

What is the mean absolute deviation of these numbers? 12, 9, 4, 2, 12, 1, 9

80

What is the median for this data set: 22, 83, 80, 87, 69, 66, 88 ?

68%

_____________ of the data points SHOULD fall within +/- 1 SD (standard deviation) of the mean

99.7% (often rounded to 99%)

_____________ of the data points SHOULD fall within +/- 3 SD (standard deviations) of the mean
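The 1 SD, 2 SD, and 3 SD percentages apply to normally distributed data. They can be checked with `NormalDist` from Python's standard `statistics` module (Python 3.8 and later); the helper name `share_within` is made up for the example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
# 1 68.3
# 2 95.4
# 3 99.7
```

Real data sets only approximate these percentages, and skewed or multimodal data may not follow them at all.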

normal symmetric distribution

a distribution in which the data values are evenly (symmetrically) distributed about the mean; A symmetric distribution is a type of distribution where the left side of the distribution mirrors the right side. By definition, a symmetric distribution is never a skewed distribution. The normal distribution is symmetric. It is also a unimodal distribution (it has one peak). In a normal distribution, the mean, mode and median all fall at the same point. The mode is the most common number and it matches with the highest peak (the "mode" here is different from the "mode" in bimodal or unimodal, which refers to the number of peaks). Represented in a bell curve.

trick to find the median

order the data, add 1 to the number of terms, then divide by 2. The number which results tells which term is the median. Ex. 64, 77, 79, 80, 84, 86, 90, 90, 94, 96, 97, 98 There are 12 terms. 12 + 1 = 13. 13/2 = 6.5. The "6.5th term" is halfway between the 6th term (86) and the 7th term (90), so the median is 88
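The position trick can be sketched in Python. The function name `median_by_position` is made up for the example, and the data must be sorted before the position means anything.

```python
def median_by_position(values):
    """Find the median using the (n + 1) / 2 position trick."""
    ordered = sorted(values)
    position = (len(ordered) + 1) / 2        # 1-based position of the median
    if position.is_integer():
        return ordered[int(position) - 1]    # odd count: a single middle term
    below = ordered[int(position) - 1]       # e.g. the 6th term when position is 6.5
    above = ordered[int(position)]           # e.g. the 7th term when position is 6.5
    return (below + above) / 2               # even count: mean of the two middle terms

print(median_by_position([64, 77, 79, 80, 84, 86, 90, 90, 94, 96, 97, 98]))  # 88.0
```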

two types of quantitative data

discrete - can only take certain values (ex. whole numbers): counted; continuous - can take any value within a range: measured

Variance

equation for __________________

data

group of facts; collection of information such as: numbers, words, measurements, observations, or just a description of things

Values

if a data set had the standard deviation of s=1, the data set would have similar __________, while if s=10, the data set would have very different values

first quartile (Q1)

the median of the lower half of a data set.

third quartile (Q3)

the median of the upper half of a data set.

median

the middle value in a data set.

mean

the sum of the data divided by the number of items in the data set.

