Stats Unit 3


If a distribution is symmetric, the mean is...

equal to the median

Outliers can either be caused by...

errors or can be an actual data point.

(T/F): When an observation that is much larger than the rest of the data is added to a data​ set, the value of the median will increase substantially.

false

To find the standard deviation for a frequency class you must first...

find the mean (if not already given)

When an observation that is much larger than the rest of the data is added to a data​ set, the value of the mean will...

increase.
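
The cards on adding a much larger observation can be checked with a minimal sketch using Python's standard `statistics` module. The data values here are invented for illustration and do not come from the course material.

```python
import statistics

# Hypothetical data set, invented for illustration only.
data = [12, 14, 15, 17, 19]
with_outlier = data + [95]  # add one observation far above the rest

mean_before = statistics.mean(data)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(data)
median_after = statistics.median(with_outlier)

# The mean jumps noticeably; the median only shifts to the next middle position.
print("mean:  ", mean_before, "->", round(mean_after, 2))
print("median:", median_before, "->", median_after)
```

The mean moves by more than ten units while the median moves by one, which is why the median is called resistant.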

Jennifer just received the results of her SAT exam. Her math score of 600 is at the 74th percentile. What is k? What is the observation?

k: 74 (the 74th percentile) Observation: math score of 600

If a distribution is skewed right, the mean is...

larger than the median

If a distribution is skewed left, what is the relation between the mean and median?

mean<median

If a distribution is symmetric, what is the relation between the mean and median?

mean=median

If a distribution is skewed right, what is the relation between the mean and median?

mean>median

The z-score corresponding to the individual is 0.14 and indicates that the data value is _______ standard deviation(s) above the mean.

0.14

Benjamin owns a small Internet business. Besides​ himself, he employs nine other people. The salaries earned by the employees are given below in thousands of dollars​ (Benjamin's salary is the​ largest, of​ course). Complete parts​ (a) through​ (d). 25,35,40,50,50,50,55,55,65,75 ​(c) As a second​ option, Benjamin can give each employee a bonus of​ 5% of his or her original salary. Add the bonuses under this second plan to the original salaries to create a new data set. Recalculate the​ mean, median, and mode. How do they compare to the​ originals? 1. How do you solve problem c? 2. How do the​ mean, median, and mode compare to the​ originals?

1. Add 5% to each data value, e.g. 25 + (0.05 x 25) = 26.25 <- do this to each data value 2. All three measures increased by 5%.

Benjamin owns a small Internet business. Besides​ himself, he employs nine other people. The salaries earned by the employees are given below in thousands of dollars​ (Benjamin's salary is the​ largest, of​ course). Complete parts​ (a) through​ (d). 25,35,40,50,50,50,55,55,65,75 (b) Business has been​ good! As a​ result, Benjamin has a total of​ $25,000 in bonus pay to distribute to his employees. One option for distributing bonuses is to give each employee​ (including himself)​ $2,500. Add the bonuses under this plan to the original salaries to create a new data set. Recalculate the​ mean, median, and mode. How do they compare to the​ originals? 1. How do you solve problem b? 2. How do the​ mean, median, and mode compare to the​ originals?

1. Add 2.5 onto each of the data values and recalculate 2. All three measures increased by​ $2,500.

Find the variance. Standard deviation: 9.8

9.8^2 = 96.04, reported in squared units (make sure to include the squared symbol)

For the distribution​ shown, which letter represents the​ mode?

A

Benjamin owns a small Internet business. Besides​ himself, he employs nine other people. The salaries earned by the employees are given below in thousands of dollars​ (Benjamin's salary is the​ largest, of​ course). Complete parts​ (a) through​ (d). 25,35,40,50,50,50,55,55,65,75 (d) As a third​ option, Benjamin decides not to give his employees a bonus at all.​ Instead, he keeps the​ $25,000 for himself. Use this plan to create a new data set. Recalculate the​ mean, median, and mode. How do they compare to the​ originals? 1. How do you solve problem d? 2. How do the​ mean, median, and mode compare to the​ originals?

1. Add 25 to the largest data value (75 + 25 = 100) and then recalculate with this new data point 2. The mean increased by $2,500, but the median and the mode did not change.
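
The three bonus plans in the Benjamin cards can be compared side by side. This is a minimal sketch that uses the salary list exactly as printed on the cards (in thousands of dollars); the helper name `summarize` is made up for this example.

```python
import statistics

salaries = [25, 35, 40, 50, 50, 50, 55, 55, 65, 75]  # thousands of dollars

def summarize(values):
    """Return (mean, median, mode) for a list of numbers."""
    return (statistics.mean(values),
            statistics.median(values),
            statistics.mode(values))

original = summarize(salaries)
plan_b = summarize([x + 2.5 for x in salaries])              # flat $2,500 for everyone
plan_c = summarize([round(x * 1.05, 2) for x in salaries])   # 5% of each salary
plan_d = summarize(salaries[:-1] + [salaries[-1] + 25])      # all $25,000 to the top salary

for name, result in [("original", original), ("plan b", plan_b),
                     ("plan c", plan_c), ("plan d", plan_d)]:
    print(name, result)
```

Adding a constant shifts all three measures by that constant, and scaling by 1.05 scales all three by 1.05. On this particular list the two plans give the same numbers because 5% of the original mean, median, and mode (all 50) is 2.5. Plan d changes only the largest value, so the mean moves and the resistant measures do not.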

Steps to calculating standard deviation sample

1. Calculate the mean 2. Data value - mean = x 3. Square x 4. Find the sum of all x^2 5. Divide the sum by n - 1 6. Take the square root of the result

Steps to calculating standard deviation population

1. Calculate the mean 2. Data value - mean = x 3. Square x 4. Find the sum of all x^2 5. Divide the sum by the population size N 6. Take the square root of the result
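
The six steps for the sample and population versions differ only in the divisor at step 5. A minimal sketch, assuming a hypothetical helper `std_dev` with a `sample` flag and borrowing the data values 23, 11, 3, 9, 8 from the rounding card elsewhere in this set:

```python
import math
import statistics

def std_dev(values, sample=True):
    """Standard deviation following the six steps on the cards."""
    n = len(values)
    mean = sum(values) / n                      # step 1: calculate the mean
    deviations = [x - mean for x in values]     # step 2: data value - mean
    squared = [d ** 2 for d in deviations]      # step 3: square each deviation
    total = sum(squared)                        # step 4: sum the squares
    divisor = n - 1 if sample else n            # step 5: n - 1 (sample) or N (population)
    return math.sqrt(total / divisor)           # step 6: square root

data = [23, 11, 3, 9, 8]
s = std_dev(data, sample=True)       # sample standard deviation
sigma = std_dev(data, sample=False)  # population standard deviation
print(round(s, 1), round(sigma, 1))
```

The results agree with `statistics.stdev` and `statistics.pstdev` from the standard library, and the sample value rounds to the 7.4 quoted on the rounding card.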

Using a boxplot, how can you tell which brand has more chips per cookie? How can you tell which one is more consistent?

1. Whichever brand has the higher median 2. Look at the range: the brand with the smaller range is more consistent

In​ 1994, major league baseball players went on strike. At the​ time, the average salary was​ $1,049,589, and the median salary was​ $337,500. If you were representing the​ owners, which summary would you use to convince the public that a strike was not​ needed? If you were a​ player, which would you​ use? Why was there such a large discrepancy between the mean and median​ salaries? Explain. If you were representing the​ owners, you would use the _______ to convince the public that a strike was not needed. If you were a​ player, you would use the _______ to convince the public that a strike was needed. The average and median salaries differ so greatly because ________.

1. average salary 2. median salary 3. the distribution of salaries is skewed right.

According to Empirical Rule, with a mean of 100 and a standard deviation of 16.1, what data points lie at 68%, 95%, and 99.7%?

68%: 100 ± 16.1 = 83.9 to 116.1. 95%: 100 ± 2(16.1) = 67.8 to 132.2. 99.7%: 100 ± 3(16.1) = 51.7 to 148.3.
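
The Empirical Rule intervals are the mean plus or minus 1, 2, and 3 standard deviations. A minimal sketch with a hypothetical helper `empirical_intervals`, using the mean of 100 and standard deviation of 16.1 from the card and assuming a roughly bell-shaped distribution:

```python
def empirical_intervals(mean, sd):
    """Return the (low, high) interval for each Empirical Rule percentage."""
    coverage = {1: "68%", 2: "95%", 3: "99.7%"}
    return {
        pct: (round(mean - k * sd, 1), round(mean + k * sd, 1))
        for k, pct in coverage.items()
    }

intervals = empirical_intervals(100, 16.1)
for pct, (low, high) in intervals.items():
    print(pct, "of the data lie between", low, "and", high)
```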

If a variable has a distribution that is​ bell-shaped with mean 18 and standard deviation 4​, then according to the Empirical​ Rule, 68.0​% of the data will lie between which​ values?

14 and 22

1st quartile =

25th percentile

2nd quartile =

50th percentile (Median)

Under the Empirical Rule, what percentage of the data lies within 1, 2, and 3 standard deviations of the mean?

About 68%, 95%, and 99.7%, respectively

3rd quartile =

75th percentile

Use the Empirical Rule to determine the percentage of candies with weights between 0.7 and 0.98 gram. ​Hint: mean=0.84 STD=0.07

95%

List the conditions for determining when to use the following measures of central tendency: A) Mean: B) Median: C) Mode:

A. When data are quantitative and the frequency distribution is roughly symmetric B. When the data are quantitative and the frequency distribution is skewed left or skewed right C. When the most frequent observation is the desired measure of central tendency or the data are qualitative

How do you calculate the midpoint of the last class when the numbers do not have a trend?

Add one to the upper limit of the last class and use that in the midpoint formula for the last class. Ex: (77 + 81) / 2 = 79

What can be said about a set of data with a standard deviation of​ 0?

All the observations are the same value.

Comment on the role that the number of observations plays in resistance.

As the sample size​ increases, the impact of the misrecorded data on the mean decreases.

Explain why it is helpful to think of the mean as the center of gravity.

Because the mean is the value such that a histogram of the data is perfectly balanced, with equal weight on each side of the mean.

In order to use the empirical rule, what shape does the graph need to be?

Bell shaped or roughly bell-shaped

How do we determine the better negative z-score?

By determining which score is more negative, as that is the one farthest away from the mean (even though it is not the largest)

(T/F): Range is resistant

False

Identify the given statement as either true or false. The standard deviation is a resistant measure of spread.

False

Identify the given statement as either true or false. The standard deviation can be negative.

False (because of squaring)

The U.S. Department of Housing and Urban Development​ (HUD) uses the median to report the average price of a home in the United States. Why do you think HUD uses the​ median?

HUD uses the median because the data are skewed right.

Which of the following are resistant measures of​ dispersion?

IQR

What does a positive z-score for a data value indicate? What does a negative z-score indicate?

If a data value is larger than the mean, the z-score is positive. If a data value is smaller than the mean, the z-score is negative. If a data value equals the mean, the z-score is zero.

Using standard deviations, how can you tell if x inches is far away from a mean of y inches?

If the data point is more than two standard deviations from the mean then it is far. If the data point is less than two standard deviations from the mean then it is close.

If the shape of a distribution is skewed left or skewed right, which measure of central tendency and which measure of dispersion should be reported? Why?

If the shape of a distribution is skewed, it is best to use the median as the measure of central tendency and the interquartile range as the measure of dispersion because these measures are resistant.

If the shape of a distribution is symmetric, which measure of central tendency and which measure of dispersion should be reported?

If the shape of a distribution is symmetric, the mean should be used as the measure of central tendency, while the standard deviation should be used as a measure of dispersion.

Under what conditions will a set of data have two modes?

When two values occur with the same frequency, and more often than every other value in the set.

List the three steps for finding quartiles.

Step 1: Arrange the data in ascending order. Step 2: Determine the median, M, or second quartile, Q2 Step 3: Divide the data set into two halves: the observations less than M and the observations greater than M. The first quartile, Q1, is the median of the bottom half, and the third quartile, Q3, is the median of the top half. Do NOT include M in these halves.
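
The three quartile steps can be sketched as follows. The function name `quartiles` is made up for this example, and the halves are split by position in the sorted list, which is how the rule is usually applied when values repeat. Statistical software may use other conventions and give slightly different quartiles.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q2, Q3), excluding the median from both halves."""
    data = sorted(values)                       # step 1: ascending order
    n = len(data)
    q2 = statistics.median(data)                # step 2: the median, M
    half = n // 2
    lower = data[:half]                         # step 3: bottom half
    # For an odd count the middle observation is M itself, so skip it.
    upper = data[half + 1:] if n % 2 else data[half:]
    return statistics.median(lower), q2, statistics.median(upper)

salaries = [25, 35, 40, 50, 50, 50, 55, 55, 65, 75]  # data from the Benjamin cards
print(quartiles(salaries))
print(quartiles([1, 2, 3, 4, 5, 6, 7]))  # odd count: 4 is left out of both halves
```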

Over the past 10​ years, five mutual funds all had the same mean rate of return. The standard deviations for each of the five mutual funds are shown below. Capital​ Investment: 4.2​%; ​Vanity: 7.5​%; Global​ Advisor: 5.7​%; International​ Equities: 8.3​%; ​Nomad: 6.1​% Which mutual fund was least consistent in rate of​ return? Why?

International​ Equities: 8.3​% because it has the largest dispersion

Table 14 shows the red blood cell mass (in millimeters) for 14 rats sent into space (flight group) and for 14 rats that were not sent into space (control group). Construct side-by-side boxplots for red blood cell mass for the flight group and control group. Does it appear that space flight affects the rats' red blood cell mass?

It appears that the red blood cell mass has been lessened by space flight. First, note that the median for the flight group was about 7.7mm, while the median for the control group was about 8.6mm. Another thing to consider is that the interquartile range (the width of the box for each boxplot) is about the same. Solution: It appears that spaceflight has reduced the red blood cell mass of the rats.

What does a measure of central tendency describe?

It numerically describes the average or typical data value

How is dispersion related to range?

Just like standard deviations, the higher the range the higher the dispersion

According to Empirical Rule, with a mean of 100 and a standard deviation of 16.1, what % of students lie between 116.1 and 148.3?

Looking at the Empirical Rule Chart: 13.5+2.35= 15.85%
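
The Empirical Rule chart splits the bell curve into one-standard-deviation bands of 34%, 13.5%, and 2.35% on each side of the mean. A minimal sketch that adds up the bands between two cutoffs, assuming both cutoffs fall a whole number of standard deviations from the mean (within 3) and using a hypothetical helper `percent_between`:

```python
# Percent of data in each one-standard-deviation band,
# keyed by the z-score at the band's lower edge.
BANDS = {-3: 2.35, -2: 13.5, -1: 34.0, 0: 34.0, 1: 13.5, 2: 2.35}

def percent_between(low, high, mean, sd):
    """Empirical Rule percent of data between two whole-z cutoffs."""
    z_low = round((low - mean) / sd)
    z_high = round((high - mean) / sd)
    return round(sum(BANDS[z] for z in range(z_low, z_high)), 2)

print(percent_between(116.1, 148.3, 100, 16.1))  # z = 1 to z = 3
print(percent_between(315, 455, 350, 35))        # organ-weight card: z = -1 to z = 3
```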

Percentiles are used to give the relative standing of an observation. Give a real life example

Many standardized exams, such as the SAT, use percentiles to let students know how they scored on the exam in relation to all others who took the exam.

Who did better in the race? Bob's z-score: -0.49 Mary's z-score: -0.65

Mary did relatively better. Even though Bob's z-score is larger, Mary did better because her time is more standard deviations below the mean.

Which of the following are resistant measures of central​ tendency?

Median

Interpret boxplot. Where is the Median, Q1, Q3, Min #, Max #

Median= 24 (line in the middle) Q1= 22.9 (left side of box) Q3=27.9 (right side of box) Min= 21 (left line) Max= 31 (right line) *There are no outliers in this boxplot

What values does the five-number summary consist of?

Minimum, Q1, M, Q3, Max

When determining the actual percentages of a problem, do we use the empirical rule?

NO, only find percentages using the empirical rule when prompted in the question. When determining actual percentages, use the data table provided

Is standard deviation resistant? Why or why not?

No, because extreme values greatly affect the standard deviation.

List the four steps for checking for outliers by using quartiles.

Step 1: Determine the first and third quartiles of the data Step 2: Compute the interquartile range Step 3: Determine the fences. Fences serve as cutoff points for determining outliers. Step 4: If a data value is less than the lower fence or greater than the upper fence, it is considered an outlier.
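
The four outlier steps can be sketched as below. The card does not state the fence formulas, so this assumes the usual convention of Q1 - 1.5(IQR) for the lower fence and Q3 + 1.5(IQR) for the upper fence. The function name `find_outliers` is made up, and the data are the Benjamin salaries after he keeps the whole bonus.

```python
import statistics

def find_outliers(values):
    """Return (lower_fence, upper_fence, outliers) using the 1.5 * IQR convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = statistics.median(data[:half])                              # step 1
    q3 = statistics.median(data[half + 1:] if n % 2 else data[half:])
    iqr = q3 - q1                                                    # step 2
    lower_fence = q1 - 1.5 * iqr                                     # step 3 (assumed formula)
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in data if x < lower_fence or x > upper_fence]  # step 4
    return lower_fence, upper_fence, outliers

plan_d_salaries = [25, 35, 40, 50, 50, 50, 55, 55, 65, 100]
print(find_outliers(plan_d_salaries))
```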

If the raw data is this: 23, 11, 3, 9, 8 how many decimal places will the standard deviation be?

One: 7.4 (we add one more decimal place than the raw data)

Red Sox Z-Score: 2.67 Rockies Z-Score: 2.05 Who had a better year?

Red Sox, as they were farther above the mean (2.67) than the Rockies (2.05)

Who did better in the race? Sam's z-score: -2.63 Jane's z-score: -1.41

Sam had a more convincing victory because of his lower z-score.

When the left whisker is longer than the right whisker, and the Median is right of the center of the box, what is the most likely shape of the distribution?

Skewed left

If the right whisker of a boxplot is longer than the left whisker and the median is left of the center of the box, what is the most likely shape of the distribution?

Skewed right

List the five steps for drawing a boxplot.

Step 1: Determine the lower and upper fences. Step 2: Draw a number line long enough to include the maximum and minimum values. Insert vertical lines at Q1, M, and Q3. Enclose these vertical lines in a box. Step 3: Label the lower and upper fences with a temporary mark Step 4: Draw a line from Q1 to the smallest data value that is larger than the lower fence. Draw a line from Q3 to the largest data value that is smaller than the upper fence. These lines are called whiskers. Step 5: Plot any data values less than the lower fence or greater than the upper fence as outliers. Outliers are marked with an asterisk (*). Remove the temporary marks labeling the fences.

When the median is in the center of the box, and the whiskers are roughly the same length, what is the shape?

Symmetric

Which measure, the mean or the median, is least affected by extreme observations?

The Median is least affected by extreme observations.

State the reason that we compute the mean.

Because much of statistical inference is based on the mean (even though it is easily swayed by extreme values).

Why can outliers distort both the mean and standard deviation?

because neither is resistant.

What is the mean of the data?

The center of gravity (average)

In a statistics​ class, the standard deviation of the heights of all students was 4.3 inches. The standard deviation of the heights of males was 3.3 inches and the standard deviation of females was 3.1 inches. Why is the standard deviation of the entire class more than the standard deviation of the males and females considered​ separately?

The distribution of the heights for the entire class generally has more spread than the distribution of the heights of the individual sexes.

Jennifer just received the results of her SAT exam. Her math score of 600 is at the 74th percentile. Interpret this result.

The fact that an observation is at the kth percentile means that k percent of the observations are less than or equal to the observation. A percentile rank of 74 means that 74% of the SAT math scores are less than or equal to 600 and 26% of the scores are greater than 600. So 26% of the students who took the exam scored better than Jennifer.

Explain the circumstances for which the interquartile range is the preferred measure of dispersion. What is an advantage that the standard deviation has over the interquartile​ range?

The interquartile range is preferred when the data are skewed or have outliers. An advantage of the standard deviation is that it uses all the observations in its computation.

Define the interquartile range, IQR.

The interquartile range, IQR, is the range of the middle 50% of the observations in a data set.

What does the kth percentile represent?

The kth percentile, denoted Pk, of a set of data is a value such that k percent of the observations are less than or equal to the value.

When comparing two populations, what does a larger standard deviation imply about dispersion?

The larger the standard deviation, the larger or greater the dispersion is (provided the variable of interest from the two populations has the same unit of measure).

Why is the median​ resistant, but the mean is​ not?

The mean is not resistant because when data are​ skewed, there are extreme values in the​ tail, which tend to pull the mean in the direction of the tail. The median is resistant because the median of a variable is the value that lies in the middle of the data when arranged in ascending order and does not depend on the extreme values of the data.

A histogram of a set of data indicates that the distribution of the data is skewed right. Which measure of central tendency will likely be​ larger, the mean or the​ median? Why?

The mean will likely be larger because the extreme values in the right tail tend to pull the mean in the direction of the tail.

Define the median of a variable.

The median of a variable is the value that lies in the middle of the data when arranged in ascending order. We use M to represent the median.

Define the mode of a variable.

The mode of a variable is the observation of the variable that occurs most frequently in the data set.

What does a z-score represent?

The z-score represents the distance that a data value is from the mean in terms of the number of standard deviations.

If the raw data is this: 23.2, 11.5, 3.9, 9.6, 8.1 how many decimal places will the standard deviation be?

Two: 7.43 (not the actual standard dev of these data points, just an example)

Based on standard deviations, which university is more dispersed? University A: 16.1 University B: 8.4

University A has more dispersion, as its scores have a higher standard deviation.

Determine the percentage of students who have IQ scores between 67.8 and 132.2 according to the Empirical Rule. Mean: 100, Standard Dev: 16.1 ^How do you solve problems like this?

Use the z-score formula: (data value - mean) / standard deviation. (67.8 - 100) / 16.1 = -2, so 67.8 is 2 standard deviations below the mean. (132.2 - 100) / 16.1 = 2, so 132.2 is 2 standard deviations above the mean. Within 2 standard deviations = 95%, so 95% is the answer.
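
The z-score calculation on this card can be written as a one-line function. A minimal sketch; the name `z_score` is chosen for illustration, and results are rounded to the nearest hundredth as another card in this set recommends.

```python
def z_score(value, mean, sd):
    """Number of standard deviations a value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

low = round(z_score(67.8, 100, 16.1), 2)
high = round(z_score(132.2, 100, 16.1), 2)
print(low, high)  # two standard deviations below and above the mean
```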

Which variable has more​ dispersion? Why?

Variable y—the interquartile range of variable y is larger than that of variable x.

How can you tell what part of the problem is weights and what part is the values?

Weights will be a whole number, while values will not be.

In weighted mean problems, what are the usual weights and values?

Weights: # of hours Values: grading scale

Charlene and Gary want to make soup. In order to get the right balance of ingredients for their tastes they bought 3 pounds of potatoes at $4.87 per pound, 2 pounds of cod for $2.82 per pound, and 5 pounds of fish broth for $3.29 per pound. Determine the cost per pound of the soup. What are the weights in this problem? Values?

Weights: pounds Values: money
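
The soup problem is a weighted mean: multiply each price by its weight in pounds, add the products, and divide by the total weight. A minimal sketch, with `weighted_mean` as a hypothetical helper name:

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight

prices_per_pound = [4.87, 2.82, 3.29]  # values: potatoes, cod, fish broth
pounds = [3, 2, 5]                     # weights

cost_per_pound = weighted_mean(prices_per_pound, pounds)
print(round(cost_per_pound, 2))  # 36.70 total dollars over 10 pounds
```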

What does it mean when we say that a data set is bimodal? Multimodal?

When a data set has two modes, it is called bimodal. When a data set has three or more modes, it is called multimodal.

What are the rounding rules for the sample standard deviation?

When computing the sample standard deviation, be sure to use x with as many decimal places as possible to avoid round-off error. However, report the standard deviation to one more decimal place than the original data.

How can you tell which boxplot has a larger standard deviation?

Whichever boxplot is wider has more spread, and thus a larger standard deviation

Is a measure of 30 inches​ "far away" from a mean of 20 ​inches? As someone with knowledge of​ statistics, you answer​ "it depends" and request the standard deviation of the underlying data. ​(a) Suppose the data come from a sample whose standard deviation is 2 inches. How many standard deviations is 30 inches from 20 ​inches? ​(b) Is 30 inches far away from a mean of 20 ​inches? ​(c) Suppose the standard deviation of the underlying data is 6 inches. Is 30 inches far away from a mean of 20 ​inches?

a. 30 inches is 5 standard deviations away from 20 inches. b. Yes, because 30 inches is more than two standard deviations from the mean c. No, because 30 inches is less than two standard deviations from the mean
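
The "it depends" answer comes down to the two-standard-deviation rule of thumb used in this set. A minimal sketch with hypothetical helper names; the cutoff of 2 is the convention from the cards, not a universal rule.

```python
def standard_deviations_away(value, mean, sd):
    """Distance from the mean, measured in standard deviations."""
    return abs(value - mean) / sd

def is_far(value, mean, sd, cutoff=2):
    """True when the value is more than `cutoff` standard deviations from the mean."""
    return standard_deviations_away(value, mean, sd) > cutoff

print(standard_deviations_away(30, 20, 2), is_far(30, 20, 2))            # parts (a) and (b)
print(round(standard_deviations_away(30, 20, 6), 2), is_far(30, 20, 6))  # part (c)
```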

The weight of an organ in adult males has a​ bell-shaped distribution with a mean of 350 grams and a standard deviation of 35 grams. Use the empirical rule to determine the following. ​(a) About 68​% of organs will be between what​ weights? ​(b) What percentage of organs weighs between 245 grams and 455 ​grams? ​(c) What percentage of organs weighs less than 245 grams or more than 455 ​grams? ​(d) What percentage of organs weighs between 315 grams and 455 ​grams?

a. 315 and 385 b. 99.7% c. 0.3% d. 83.85%

Explain the meaning of the accompanying percentiles. ​(a) The 5th percentile of the head circumference of males 3 to 5 months of age in a certain city is 41.0 cm. ​(b) The 90th percentile of the waist circumference of females 2 years of age in a certain city is 49.8 cm

a. 5% of 3- to 5-month-old males have a head circumference that is 41.0 cm or less. b. 90% of 2-year-old females have a waist circumference that is 49.8 cm or less.

Explain the meaning of the following percentiles in parts​ (a) and​ (b). ​(a) The 5th percentile of the weight of males 36 months of age in a certain city is 11.0 kg. ​(b) The 90th percentile of the length of newborn females in a certain city is 54.3 cm.

a. 5​% of ​36-month-old males weigh 11.0kg or​ less, and 95​% of​ 36-month-old males weigh more than 11.0kg. b. 90​% of newborn females have a length of 54.3cm or​ less, and 10​% of newborn females have a length that is more than 54.3cm

Scores of an IQ test have a​ bell-shaped distribution with a mean of 100 and a standard deviation of 18. Use the empirical rule to determine the following. ​(a) What percentage of people has an IQ score between 46 and 154​? ​(b) What percentage of people has an IQ score less than 64 or greater than 136​? ​(c) What percentage of people has an IQ score greater than 136​?

a. 99.7% b. 5% c. 2.5%

Are the range, standard deviation, and variance resistant?

no

​_______ divide data sets in fourths.

quartiles

What do s and σ represent?

s: sample standard deviation. σ: population standard deviation

If a distribution is skewed left, the mean is...

smaller than the median

The fact that an observation is at the kth percentile means....

that k percent of the observations are less than or equal to the observation.

When the mean and median are close to each other in value, which measure of central tendency best describes the average birth weight? Ex: Mean= 7.49 Median= 7.35

the mean

What does a z-score measure?

the number of standard deviations an observation is above or below the mean For example, a z-score of 1.24 means the data value is 1.24 standard deviations above the mean. A z-score of -2.31 means that the data value is 2.31 standard deviations below the mean.

Why do we use the five-number summary?

to learn information about the extremes of the data set.

How are z-scores rounded?

to the nearest hundredth.

Are quartiles resistant?

yes

The sum of the deviations about the mean always equals...

zero
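
A short derivation shows why the deviations always cancel. It uses the usual notation (not defined on the card) of x_i for the observations, n for their count, and x-bar for their mean:

```latex
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0,
\qquad \text{since } \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i .
```

This is why step 3 of the standard deviation procedure squares the deviations before adding them.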

Violent crimes include​ rape, robbery,​ assault, and homicide. The following is a summary of the​ violent-crime rate​ (violent crimes per​ 100,000 population) for all states of a country in a certain year. Q1=273.8​ Q2=388.5​ Q3=529.1 Provide an interpretation of these results. Choose the correct answer below.

​25% of the states have a​ violent-crime rate that is 273.8 crimes per​ 100,000 population or less.​ 50% of the states have a​ violent-crime rate that is 388.5 crimes per​ 100,000 population or less.​ 75% of the states have a​ violent-crime rate that is 529.1 crimes per​ 100,000 population or less.

Is the trimmed mean​ resistant?

​Yes, because extreme values do not affect it.

A sample of 20 registered voters was surveyed in which the respondents were asked, "Do you think Chang, Johnson, Ohm, or Smith is most qualified to be a senator?" The results of the survey are shown in the table (Chang occurred the most). Do you think it would be a good idea to rotate the candidate choices in the question? Why?

​Yes, to avoid response bias

