STATS Chapter 3 START ALL GRAPHS AT ZERO


Median examples

• 25 30 33 39 39 40 53 67 69 94 130: Answer = 40 • 22 28 30 32 37 40: Answer = (30 + 32)/2 = 31 • 22 28 31 31 37 40: Answer = (31 + 31)/2 = 31

How to find the median

Arrange the numbers in order from least to greatest • Find the number in the middle • If there is an even number of values, the median is the average of the two middle numbers
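A minimal Python sketch of these steps, checked against the lists on the Median examples card. The function name `median` is only illustrative, and the even-count case averages the two middle values, which is the convention those examples follow.

```python
def median(values):
    """Return the middle value of the data after sorting it."""
    ordered = sorted(values)          # least to greatest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: one middle value
        return ordered[mid]
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([25, 30, 33, 39, 39, 40, 53, 67, 69, 94, 130]))  # 40
print(median([22, 28, 30, 32, 37, 40]))                       # 31.0
print(median([22, 28, 31, 31, 37, 40]))                       # 31.0
```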

Notation

For a sample, the notation for the mean is x̄ ("x-bar": an x with a line over it)

Standard Deviation (LOOK AT THE LECTURE AND POWERPOINT AGAIN TO LEARN HOW TO CALCULATE THIS)

Found by taking the square root of the variance
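A short Python sketch of that relationship, using the tornado data from this set. It assumes the sample formulas (dividing by n - 1); the card does not say whether the sample or population version is intended, so check the lecture before relying on this.

```python
import math
import statistics

data = [25, 30, 33, 39, 39, 40, 53, 67, 69, 94, 130]

n = len(data)
mean = sum(data) / n

# Sample variance (assumption: divide by n - 1, not n).
sample_variance = sum((x - mean) ** 2 for x in data) / (n - 1)

# The standard deviation is the square root of the variance.
sample_sd = math.sqrt(sample_variance)

# Cross-check against the standard library's sample standard deviation.
print(sample_sd, statistics.stdev(data))
```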

Mode

Is the value that occurs the most frequently in the data set

How to find the range

Largest data value - smallest data value (largest number - smallest number)
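As a quick check of the Mode and range cards, here is a small Python sketch on the tornado data used elsewhere in this set. `statistics.multimode` returns every most-frequent value, so it also handles data with more than one mode.

```python
import statistics

data = [25, 30, 33, 39, 39, 40, 53, 67, 69, 94, 130]

# Mode: the value(s) that occur most often.
modes = statistics.multimode(data)

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

print(modes)       # [39]
print(data_range)  # 105
```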

Z score example

Let's try an example • The heights of women aged 20 - 29 are approximately Normal with µ = 64 inches and σ = 2.7 inches. Find the z-score of a woman who is: • 70 inches tall: Z = (70 - 64)/2.7 = 2.22 • 60 inches tall: Z = (60 - 64)/2.7 = -1.48 • Given the same information about the heights of women, what is the height of a woman with a z-score of -1.6? • Z = (obs - µ)/σ • Therefore, -1.6 = (obs - 64)/2.7 • And (-1.6 * 2.7) + 64 = obs • So obs = 59.68 inches (enter the numbers on the calculator with parentheses exactly as shown in this example; the answer will be wrong without them)
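The same arithmetic as a small Python sketch. The function names `z_score` and `observation_from_z` are made up for illustration; the numbers are the ones from the card.

```python
def z_score(obs, mu, sigma):
    """Z = (obs - mu) / sigma; the parentheses matter."""
    return (obs - mu) / sigma


def observation_from_z(z, mu, sigma):
    """Solve Z = (obs - mu) / sigma for obs."""
    return (z * sigma) + mu


mu, sigma = 64, 2.7  # heights of women aged 20 - 29, in inches

print(round(z_score(70, mu, sigma), 2))               # 2.22
print(round(z_score(60, mu, sigma), 2))               # -1.48
print(round(observation_from_z(-1.6, mu, sigma), 2))  # 59.68
```

Leaving out the parentheses, as in `70 - 64 / 2.7`, divides first and gives a different (wrong) answer, which is the mistake the card warns about.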

Determining the Shape from a Box Plot

On power point, review before quiz and tests

Box Plot example

On slide 53 and 54

How to check midpoint on calculator (watch in the lecture again)

Stat. Edit list 1. Enter mid points. List 2 enter frequencies of those midpoints.

Approximating the mean for grouped data (LOOK AT THE SLIDE NOTES FOR THE EXAMPLE OF THIS)

Steps: 1. Compute the midpoint of each class by taking the average of the lower limit of the class and the lower limit of the next larger class. 2. Multiply each class midpoint by the class frequency. 3. Sum the products of the midpoints and frequencies. 4. Divide the sum of the products by the sum of the frequencies.
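The four steps can be sketched in Python as follows. The class limits and frequencies here are made-up illustration values, not the example from the slide notes.

```python
# Hypothetical grouped data: classes 0-9, 10-19, 20-29, 30-39.
# The final entry (40) is the lower limit of the class that would come next.
lower_limits = [0, 10, 20, 30, 40]
frequencies = [2, 5, 8, 5]

# Step 1: midpoint = average of a class's lower limit and the next class's lower limit.
midpoints = [(lo + nxt) / 2 for lo, nxt in zip(lower_limits, lower_limits[1:])]

# Step 2: multiply each midpoint by its class frequency.
products = [m * f for m, f in zip(midpoints, frequencies)]

# Steps 3 and 4: sum the products, then divide by the sum of the frequencies.
approx_mean = sum(products) / sum(frequencies)

print(midpoints)    # [5.0, 15.0, 25.0, 35.0]
print(approx_mean)  # 23.0
```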

n

The number of items in my data list

Median

The number that splits the data in half; it is the center value

Range

The simplest measure of the spread of the data is the range. This is simply telling someone how widely the data is dispersed.

Variance

The variance measures how the data values are dispersed from the mean.

Mean

To find the mean, we total up the numbers in the list (find the sum) and then divide that total by n.
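For instance, on the tornado data used elsewhere in this set (a minimal Python sketch):

```python
data = [25, 30, 33, 39, 39, 40, 53, 67, 69, 94, 130]

n = len(data)      # number of items in the list
total = sum(data)  # total of the numbers
mean = total / n

print(n, total)        # 11 619
print(round(mean, 2))  # 56.27
```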

How to calculate mean on calculator

Add the numbers to a list. Press STAT and select 1-Var. Press 2nd 1. Press ENTER.

How to find the median on the calculator

Add the numbers to a list. Press STAT and select 1-Var. Press 2nd 1. Press ENTER. Scroll down until you see "med".

A statistic is said to be resistant if

its value is not influenced by extreme values

If the distribution is symmetric we use

mean

if the distribution is skewed we use

median

Box Plot (IMPORTANT)

• A boxplot is a graph that presents the five number summary. • We will be making modified boxplots, which give a little more information than just the five number summary. • For our boxplots, we will represent any outliers with an asterisk.

Percentiles Example

• Let's look at the Tornado Data • 25 30 33 39 39 40 53 67 69 94 130 • Median = 40 deaths • 25th percentile = 33: L = (25/100 * 11) = 2.75, which rounds up to position 3, and the value in position 3 is 33. • 75th percentile = 69: L = (75/100 * 11) = 8.25, which rounds up to position 9, and the value in position 9 is 69. • 10th percentile = 30: L = (10/100 * 11) = 1.1, which rounds up to position 2, and the value in position 2 is 30. • 90th percentile = 94: L = (90/100 * 11) = 9.9, which rounds up to position 10, and the value in position 10 is 94. Watch lecture on this part again

Five number summary example

• Let's look at the Tornado Data: 25 30 33 39 39 40 53 67 69 94 130 • Minimum = 25 • Q1 = 33 • Median = 40 • Q3 = 69 • Maximum = 130 • IQR = 69-33 = 36 • Lower outlier boundary = 33 - 1.5(36) = -21 • No outliers on lower side (cannot have -21 deaths) • Upper outlier boundary = 69 + 1.5(36) = 123 • One outlier at 130
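A Python sketch that reproduces these numbers. It assumes the quartile method from the Quartiles card (Q1 and Q3 are the medians of the lower and upper halves, leaving the median out of both halves when the count is odd); the function names are illustrative only.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # Leave the median out of the upper half when the count is odd.
    upper_half = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])


tornado = [25, 30, 33, 39, 39, 40, 53, 67, 69, 94, 130]
minimum, q1, med, q3, maximum = five_number_summary(tornado)

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in tornado if v < lower_fence or v > upper_fence]

print(minimum, q1, med, q3, maximum)  # 25 33 40 69 130
print(iqr, lower_fence, upper_fence)  # 36 -21.0 123.0
print(outliers)                       # [130]
```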

Z Scores and the Empirical Rule

• Let's take the empirical rule and put it in terms of z-scores. When a population has a histogram that is approximately bell-shaped, then • Approximately 68% of the data will have z-scores between -1 and +1 • Approximately 95% of the data will have z-scores between -2 and +2 • Approximately 99.7% (or almost all) of the data will have z scores between -3 and +3

Empirical Rule (68-95-99.7 MUST KNOW THESE NUMBERS)

• Many histograms have a bell shape and are symmetric. • When this happens, we can use the empirical rule to help us understand how the data is distributed: • Approximately 68% of the observations fall within 1 standard deviation of the mean • Approximately 95% of the observations fall within 2 standard deviations of the mean • Approximately 99.7% of the observations fall within 3 standard deviations of the mean • These approximations are helpful when we don't have the tools we need for detailed calculations. The empirical rule always says "approximately". • Important note: The empirical rule can only be used when the distribution is approximately bell-shaped and symmetric.

Percentiles (on HW but only 1 or 2 questions on quizzes and tests)

• Percentiles: Percentiles divide a data set into hundredths. When we report a value as being at the pth percentile, we are saying that this value separates the lowest p% of the data from the highest (100 - p)%. • To find the pth percentile, arrange the data in increasing order and take p% of n: L = (p/100) * n. If L is not a whole number, round it up to the next higher whole number; this gives the position in the list where the percentile value is located. If L is a whole number, the percentile is the average of the values in positions L and L + 1.
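The position rule can be sketched in Python like this. The function name `percentile_value` is illustrative, and the sketch assumes p is a whole number from 1 to 99 so that integer arithmetic can decide whether L is whole without rounding error. Other textbooks and software use different percentile rules, so results may differ from a calculator.

```python
def percentile_value(sorted_data, p):
    """Value at the pth percentile, following the card's position rule."""
    n = len(sorted_data)
    product = p * n                  # L = product / 100
    if product % 100 == 0:           # L is a whole number
        L = product // 100
        # Average the values in positions L and L + 1 (1-based positions).
        return (sorted_data[L - 1] + sorted_data[L]) / 2
    position = product // 100 + 1    # round L up to the next whole number
    return sorted_data[position - 1]


tornado = [25, 30, 33, 39, 39, 40, 53, 67, 69, 94, 130]

print(percentile_value(tornado, 25))  # 33  (L = 2.75, position 3)
print(percentile_value(tornado, 75))  # 69  (L = 8.25, position 9)
print(percentile_value(tornado, 10))  # 30  (L = 1.1, position 2)
print(percentile_value(tornado, 90))  # 94  (L = 9.9, position 10)
```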

Computing the Percentile for a data point

• Procedure for computing the percentile for a given data value • Step 1: Arrange the data in increasing order. • Step 2: Let x be the data value whose percentile is to be computed. Percentile = 100 * [(number of values less than x) + 0.5] / (number of values in the data set). Round your answer. • Tornado Data: 25 30 33 39 39 40 53 67 69 94 130 • Median = 40 deaths • Percentile for data point 40 = 100 * [(5 + 0.5)/11] = 50, so 40 is at the 50th percentile • Percentile for data point 67 = 100 * [(7 + 0.5)/11] = 68.2, so 67 is at about the 68th percentile
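A minimal Python sketch of this procedure, checked against the two tornado examples. The function name `percentile_of` is illustrative, and rounding to the nearest whole number is an assumption about what "round your answer" means here.

```python
def percentile_of(x, data):
    """Percentile of data value x: 100 * (count below x + 0.5) / n, rounded."""
    below = sum(1 for v in data if v < x)  # number of values less than x
    return round(100 * (below + 0.5) / len(data))


tornado = [25, 30, 33, 39, 39, 40, 53, 67, 69, 94, 130]

print(percentile_of(40, tornado))  # 50
print(percentile_of(67, tornado))  # 68
```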

Quartiles

• Quartiles divide the data into quarters, or four equal parts, once it has been ordered from smallest to largest. • Basically, we take the median of the first half of the data for Q1 and then do the same for the second half of the data to get Q3. • If there is an odd number of values, we don't include the median in either half when determining the quartile values. • Let's look at the tornado example to see the quartiles: 25 30 33 39 39 40 53 67 69 94 130. The median is 40, so Q1 is the median of 25 30 33 39 39, which is 33, and Q3 is the median of 53 67 69 94 130, which is 69. • Another method for finding the quartiles (as explained by your book) is to find the percentiles. Q1 is at the 25th percentile and Q3 is at the 75th percentile.

Chebyshev's Inequality

• What if we don't know whether the distribution is symmetric? We have another rule that gives an idea of how the distribution spreads. • Summary of Chebyshev's inequality: • At least ¾ (75%) of the data will be within two standard deviations of the mean • At least 8/9 (88.9%) of the data will be within three standard deviations of the mean. • Does the empirical rule meet these criteria? Yes: 95% is at least 75%, and 99.7% is at least 88.9%. • Chebyshev's inequality always says "at least".
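Both fractions come from the general form of Chebyshev's inequality: at least 1 - 1/k² of the data lies within k standard deviations of the mean, for any k greater than 1. The card only lists the k = 2 and k = 3 cases; the sketch below (with an illustrative function name) reproduces them with exact fractions.

```python
from fractions import Fraction


def chebyshev_lower_bound(k):
    """Minimum fraction of data within k standard deviations (k > 1)."""
    k = Fraction(k)
    return 1 - 1 / k ** 2


print(chebyshev_lower_bound(2))         # 3/4
print(chebyshev_lower_bound(3))         # 8/9
print(float(chebyshev_lower_bound(3)))  # about 0.889
```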

Chebyshev's Inequality/Empirical Rule Problems

• The empirical rule can only be used with distributions that are approximately bell-shaped. • Chebyshev's inequality can be used with any distribution, whether it is symmetric or not. • Use the following population parameters to answer the questions: the mean is 120 and the standard deviation is 8. • Assume the population distribution is bell-shaped. What are the cut-off values for the middle 95%? • Assume the population distribution is bell-shaped. What percentage of the values are between 112 and 128? • If we do not assume that the population is bell-shaped, at least what percent of the values are between 104 and 136?

Chebyshev's Inequality/Empirical Rule Problems

• The empirical rule can only be used with distributions that are approximately bell-shaped. • Chebyshev's inequality can be used with any distribution, whether it is symmetric or not. • Use the following population parameters to answer the questions: the mean is 120 and the standard deviation is 8. • Assume the population distribution is bell-shaped. What are the cut-off values for the middle 95%? Empirical rule: 95% is ± 2 standard deviations. 120 ± 2*8 gives 104 and 136. • Assume the population distribution is bell-shaped. What percentage of the values are between 112 and 128? Empirical rule: (112 - 120)/8 = -1 standard deviation and (128 - 120)/8 = 1 standard deviation. So ± 1 standard deviation is about 68% of the values. • If we do not assume that the population is bell-shaped, at least what percent of the values are between 104 and 136? Chebyshev's inequality: 104 and 136 are 2 standard deviations from the mean, so at least 75% of the values are between them.
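The three questions can be worked through in a short Python sketch. The `empirical` lookup table simply encodes the 68-95-99.7 numbers, and the last step uses the general Chebyshev bound 1 - 1/k², which gives the "at least 75%" figure for k = 2.

```python
mu, sigma = 120, 8

# Empirical rule percentages for 1, 2 and 3 standard deviations.
empirical = {1: 68.0, 2: 95.0, 3: 99.7}

# Q1: cut-off values for the middle 95% (bell-shaped): mean ± 2 standard deviations.
cutoffs = (mu - 2 * sigma, mu + 2 * sigma)

# Q2: 112 and 128 as z-scores, then look up the empirical rule.
z_low = (112 - mu) / sigma
z_high = (128 - mu) / sigma
pct_between = empirical[int(abs(z_low))]

# Q3: no shape assumed, so use Chebyshev with k = number of standard deviations.
k = (136 - mu) / sigma
chebyshev_pct = (1 - 1 / k ** 2) * 100

print(cutoffs)        # (104, 136)
print(z_low, z_high)  # -1.0 1.0
print(pct_between)    # 68.0
print(chebyshev_pct)  # 75.0
```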

Five number summary (VERY IMPORTANT FOR QUIZ TEST AND FINAL NEED TO KNOW EVERYTHING ON THIS)

• The five number summary includes the following values calculated from a dataset: • Minimum • Q1 • Median • Q3 • Maximum • Interquartile Range (IQR): The IQR is the difference between Q3 and Q1. • Outliers: An outlier is a value that is much larger or smaller than the rest of the data. These values could be the result of error or may be an anomaly that we need to study further. • Many times an outlier is obvious but it is helpful to have a formal calculation for determining if a value is an outlier or not. To do this, we'll calculate outlier boundaries, also referred to as fences. • Lower outlier boundary = Q1 - 1.5(IQR) • Upper outlier boundary = Q3 + 1.5(IQR)

Comparing Two different distributions

• There are many times that we want to compare two different distributions. However, if they have different means and different standard deviations, the comparison is like comparing apples to oranges. To make a fair comparison, we can standardize the values from the two distributions. • The mean length of one-year-old spotted flounder is 126 mm with a standard deviation of 18 mm, and the mean length of two-year-old spotted flounder is 162 mm with a standard deviation of 28 mm. The distribution of flounder lengths is approximately bell-shaped. • Anna caught a one-year-old flounder that was 150 mm in length. What is the z-score for this length? Z = (150 - 126)/18 = 1.33 • Luis caught a two-year-old flounder that was 190 mm in length. What is the z-score for this length? Z = (190 - 162)/28 = 1.00 • Whose fish is longer relative to fish of the same age? Anna's fish is longer relative to its age group, since her fish is 1.33 standard deviations above the mean while Luis's fish is only 1 standard deviation above the mean.
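The flounder comparison as a small Python sketch; the function and variable names are illustrative.

```python
def z_score(obs, mean, sd):
    """Standardize an observation: (obs - mean) / sd."""
    return (obs - mean) / sd


anna_z = z_score(150, 126, 18)  # one-year-old flounder
luis_z = z_score(190, 162, 28)  # two-year-old flounder

print(round(anna_z, 2))  # 1.33
print(round(luis_z, 2))  # 1.0

# The larger z-score is the fish that is longer relative to its own age group.
longer = "Anna" if anna_z > luis_z else "Luis"
print(longer)  # Anna
```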

Measures of Position Z scores

• Z scores: If we know the mean and standard deviation for a distribution, we can convert the original values of the distribution into z-scores. The z-score for any value is calculated like this: z = (observed value - mean)/standard deviation • Basically, a z-score tells us how many standard deviations the original observation falls away from the mean and in which direction. • The z-score is also known as a standardized value.

