Stats 2 Chapter 3


Both the Empirical Rule and​ Chebyshev's Inequality can be used to determine the percentage of data that lie within a certain range. What assumptions must be made about the underlying distribution before using these​ rules?

The distribution must be roughly​ bell-shaped for the Empirical Rule to​ apply, but​ Chebyshev's Inequality holds for distributions of any shape.

State an advantage and a disadvantage of using the range instead of the variance as a measure of dispersion in sample data.

The range is easier to calculate, but it is strongly affected by extreme values in the data set.

In a typical​ boxplot, the length of the box indicates which measure of​ spread?

Interquartile range (IQR)

The value that divides a histogram into two equal areas is called the​ ____________. The value that serves as a balancing point for a histogram is the​ ____________.

The value that divides a histogram into two equal areas is called the MEDIAN. The value that serves as a balancing point for a histogram is the MEAN.

What is the symbol used to represent the sample mean?

x̄ (x-bar)

Population standard deviation

1. Stat edit 2. insert numbers 3. Stat edit 4. vars-1 5. σx

Population variance

1. Stat edit 2. insert numbers 3. Stat edit 4. vars-1 5. σx squared

Shape of distribution

If the mean and median are very close, the distribution's shape is symmetrical. If the mean is greater than the median, the distribution's shape is right-skewed. If the mean is less than the median, the distribution's shape is left-skewed.

quartiles

First quartile (Q1): 25th percentile. Second quartile (Q2): 50th percentile. Third quartile (Q3): 75th percentile.

percentile

i = (p / 100) · n, where p = desired percentile and n = number of values in the data set. Example: i = 2 refers to the second position in the ordered data set.

What is the symbol used to represent the sample standard​ deviation?

s

Explain why the mean should not be found for a sample of zip codes. Which measure of center should be used​ instead?

Even though they are numeric​ data, zip codes are qualitative since they do not measure or count anything. The mean cannot be found since adding zip codes would be meaningless. For qualitative​ data, the mode is the only measure of center that can be found.

Which measure of center must be equal to an actual data​ value? Explain why.

Since the mode is the most frequent observation that occurs in the data​ set, it must be an actual value from the data set.

Chebyshev's Theorem: at least 75% of values within 2 SD, at least 89% within 3 SD

Using Chebyshev's Theorem, determine the range of prices that includes at least 82% of the homes around the mean. According to Chebyshev's Theorem, for any distribution, the percent of the values that fall within z standard deviations of the mean will be at least (1 − 1/z^2) · 100%, for z greater than 1.
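
The bound above can be checked numerically. The following is a minimal Python sketch; the function names are hypothetical, and it only finds z for a target percentage, because the mean and standard deviation of the home prices are not given on this card.

```python
import math


def chebyshev_min_fraction(z: float) -> float:
    """Minimum fraction of values within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("the bound is only informative for z > 1")
    return 1 - 1 / z**2


def chebyshev_z_for_fraction(fraction: float) -> float:
    """Smallest z for which the bound reaches the given fraction (0 < fraction < 1)."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be strictly between 0 and 1")
    # Solve 1 - 1/z^2 = fraction for z.
    return math.sqrt(1 / (1 - fraction))


print(chebyshev_min_fraction(2))                  # 0.75
print(round(chebyshev_min_fraction(3), 3))        # 0.889
print(round(chebyshev_z_for_fraction(0.82), 3))   # 2.357
```

With z found this way, the interval would be mean ± z · (standard deviation), using whatever mean and standard deviation the problem supplies.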

To find how many of the data values fall within one standard deviation from the​ mean, find the upper and lower bounds of the interval.

Lower bound: x̄ − s. Upper bound: x̄ + s.

Name a feature of a distribution that is more easily seen in a histogram than a boxplot.

the shape of distribution

Empirical Rule

If a distribution follows a bell-shaped, symmetrical curve centered around the mean, approximately 68, 95, and 99.7 percent of the values are expected to fall within one, two, and three standard deviations of the mean, respectively. The formula for expressing x in terms of the z-score is the following, where μ is the population mean, σ is the population standard deviation, and z is the z-score: x = μ + zσ
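
As an illustration of x = μ + zσ, here is a short Python sketch that turns the Empirical Rule into intervals. The values μ = 100 and σ = 15 are hypothetical and are not taken from the chapter; the function names are also made up for this example.

```python
def value_from_z(mu, sigma, z):
    """Convert a z-score back to a data value: x = mu + z * sigma."""
    return mu + z * sigma


def empirical_rule_intervals(mu, sigma):
    """Map each approximate coverage percent to its (lower, upper) interval."""
    coverage = {1: 68.0, 2: 95.0, 3: 99.7}  # standard deviations -> percent
    return {
        pct: (value_from_z(mu, sigma, -k), value_from_z(mu, sigma, k))
        for k, pct in coverage.items()
    }


mu, sigma = 100, 15  # hypothetical population mean and standard deviation
for pct, (low, high) in empirical_rule_intervals(mu, sigma).items():
    print(f"about {pct}% of values lie between {low} and {high}")
```

These intervals are only reasonable when the distribution is roughly bell-shaped.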

Which of the following is NOT a measure of​ spread?

midrange

5 number summary

Minimum value, Q1, Q2 (median), Q3, maximum value

variability

more variability = greater variation less variability = less variation

What is the symbol used to represent the sample​ variance?

s^2

A​ z-score represents how many​ ______________ a data value is above or below the​ ______________.

standard deviations; mean

upper bound

+

lower bound

-

Sample standard deviation

1. Stat edit 2. insert numbers 3. Stat edit 4. vars-1 5. Sx

Sample Variance

1. Stat edit 2. insert numbers 3. Stat edit 4. vars-1 5. Sx squared

If i is an integer

The percentile (for example the 25th, 50th, or 75th) is the average of the values in position i and position i+1.
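
The index rule i = (p / 100) · n and the integer case can be sketched in Python as follows. The data set is hypothetical. The handling of a non-integer i (round up and take that position) is an assumption, since these cards only cover the integer case.

```python
def percentile(sorted_values, p):
    """Locate the p-th percentile (whole-number p, 0 < p < 100) in ordered data."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    n = len(sorted_values)
    scaled = p * n  # i = scaled / 100, kept as an integer to avoid rounding error
    if scaled % 100 == 0:
        # i is an integer: average the values in positions i and i + 1 (1-based).
        i = scaled // 100
        return (sorted_values[i - 1] + sorted_values[i]) / 2
    # Assumption: i is not an integer, so round up and use that position.
    i = -(-scaled // 100)  # ceiling division
    return sorted_values[i - 1]


data = [2, 4, 6, 8, 10, 12, 14, 16]  # hypothetical ordered data, n = 8
print(percentile(data, 25))  # i = 2, average of positions 2 and 3 -> 5.0
print(percentile(data, 50))  # i = 4, average of positions 4 and 5 -> 9.0
print(percentile(data, 30))  # i = 2.4, rounded up to position 3 -> 6
```

Textbooks and software use several slightly different percentile conventions, so results from a calculator may differ.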

coefficient of variation

CV = (s / x̄) · 100%, where s is the sample standard deviation and x̄ is the sample mean.

Interquartile range

Q3-Q1

Range

largest number minus smallest number

Population z-score

z = (x − μ) / σ

What is the symbol used to represent the population​ mean?

μ

What is the symbol used to represent the population standard​ deviation?

σ

What is the symbol used to represent the population​ variance?

σ^2

Suppose the list below shows how many text messages Elyse sent each day for the last 10 days. If Elyse wants to know how many text messages she typically sends each day, which measure of central tendency better describes the typical number of text messages per day? 21 22 24 26 26 29 32 32 33 88

​Median; The median of 27.5 is a better representative of the center since it is resistant to the one extreme value. The mean of 33.3 is not representative of the typical number of texts since only one number is larger than the mean.
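
The two figures in this answer can be reproduced with Python's standard library. This sketch assumes the tenth value is 88, which is the only value that yields the stated mean of 33.3 for ten days when combined with the nine listed values.

```python
from statistics import mean, median

# Nine listed values plus an assumed tenth value of 88 (see the note above).
texts = [21, 22, 24, 26, 26, 29, 32, 32, 33, 88]

print(mean(texts))    # 33.3
print(median(texts))  # 27.5

# Doubling the largest value moves the mean but leaves the median unchanged.
doubled = texts[:-1] + [texts[-1] * 2]
print(mean(doubled))    # 42.1
print(median(doubled))  # 27.5
```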

Why does the formula for calculating the sample variance, s^2 = ∑(x − x̄)^2 / (n − 1), involve division by n − 1 instead of n?

If the formula involved division by​ n, the sample variance would be biased and consistently underestimate the population variance.
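
One way to see the bias is to average the variance estimate over every possible sample. The Python sketch below uses a tiny hypothetical population and all ordered samples of size 2 drawn with replacement; these choices are illustrative assumptions, not part of the chapter.

```python
from itertools import product
from statistics import mean, pvariance

population = [1, 2, 3, 4]  # hypothetical population
n = 2                      # sample size


def variance_with_divisor(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / divisor


# Every ordered sample of size n, drawn with replacement (16 samples here).
samples = list(product(population, repeat=n))

avg_div_n = mean(variance_with_divisor(s, n) for s in samples)
avg_div_n_minus_1 = mean(variance_with_divisor(s, n - 1) for s in samples)

print(pvariance(population))  # 1.25, the population variance
print(avg_div_n)              # 0.625, too small on average
print(avg_div_n_minus_1)      # 1.25, matches the population variance
```

In this small case, dividing by n underestimates the population variance on average, while dividing by n − 1 recovers it.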

Suppose a pediatrician is wondering whether there is more variability in the heights or weights of the​ 2-year-old boys that he sees and collects the data below for a sample of 100​ 2-year-old boys in his practice. He concludes that the​ boys' weights vary more than their heights since the standard deviation is greater for weight than for height. What is wrong with this​ conclusion? ​Heights: mean=30.2 ​in., standard deviation=1.9 in. ​Weights: mean= 29.4 ​lb, standard deviation= 2.1 lb

Since the standard deviations have different units, he cannot compare them directly; otherwise, he is comparing inches to pounds. The coefficient of variation (CV) should be used instead, since it expresses each standard deviation relative to its mean.
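
Applying CV = (s / x̄) · 100% to the figures in the question gives a unit-free comparison. A minimal Python sketch, with a hypothetical function name:

```python
def coefficient_of_variation(std_dev, mean_value):
    """Standard deviation as a percentage of the mean (unit-free)."""
    return std_dev / mean_value * 100


cv_height = coefficient_of_variation(1.9, 30.2)  # inches cancel out
cv_weight = coefficient_of_variation(2.1, 29.4)  # pounds cancel out

print(f"height CV: {cv_height:.2f}%")  # about 6.29%
print(f"weight CV: {cv_weight:.2f}%")  # about 7.14%
```

Both results are percentages, so they can be compared directly even though the original measurements are in inches and pounds.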

Which measure of center​ (mean or​ median) is​ resistant? Explain what it means for that measure to be resistant.

The median is resistant because it is not sensitive to extreme values in the data set. If the largest observation were doubled, for example, the median would not change since that largest value does not factor into its computation.

Sample z-score

z = (x − x̄) / s

