STATS Chapter 3

A company advertises a mean lifespan of 1000 hours for a particular type of light bulb. If you were in charge of quality control at the​ factory, would you prefer that the standard deviation of the lifespans for the light bulbs be 5 hours or 50​ hours? Why?

5 hours would be preferable since a smaller standard deviation indicates more consistency. The company would prefer to have a consistent product on which customers can depend. A smaller standard deviation is more desirable since it indicates that the light bulb lifespans do not vary much.

According to the Empirical​ Rule, ________ will be within two standard deviations of the mean.

According to the Empirical Rule, if a distribution is unimodal and symmetric, approximately 95% of the observations will be within two standard deviations of the mean.

A​ z-score represents how many​ ______________ a data value is above or below the​ ______________.

A​ z-score represents how many standard deviations a data value is above or below the mean.

Because the median is not affected by the size of an outlier and does not change even if a particular outlier is replaced by an even more extreme​ value, we say the median is​ _____ to outliers.

Because the median is not affected by the size of an outlier and does not change even if a particular outlier is replaced by an even more extreme​ value, we say the median is resistant to outliers.

In a typical​ boxplot, the length of the box indicates which measure of​ spread?

IQR Since the box in a boxplot runs from the first quartile to the third​ quartile, the length of the box is the IQR.

The length of the box in a boxplot is proportional to which of the​ following?

IQR The length of the box in a boxplot is proportional to the IQR. The left edge of the box is at the first quartile and the right edge is at the third quartile.

If all the data values in a set are​ identical, what can you conclude about the standard​ deviation?

If the data values are all​ identical, then the mean is equal to that data value.​ Therefore, there is no spread from the​ mean, and the standard deviation is zero.

In a​ boxplot, potential outliers are points that are more than​ ___ IQRs from the edges of the box.

1.5 In a​ boxplot, potential outliers are points that are more than 1.5 IQRs below the first quartile or above the third quartile.
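The 1.5 IQR rule can be sketched in a few lines of Python. This is only an illustration: the function names `outlier_fences` and `potential_outliers`, the quartile values, and the sample data are hypothetical and not part of the original question.

```python
def outlier_fences(q1, q3):
    """Return the lower and upper cutoffs, 1.5 IQRs beyond the box edges."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def potential_outliers(values, q1, q3):
    """Return the values lying below the lower fence or above the upper fence."""
    low, high = outlier_fences(q1, q3)
    return [x for x in values if x < low or x > high]


# Hypothetical data with assumed quartiles Q1 = 10 and Q3 = 20 (IQR = 10),
# so the fences sit at 10 - 15 = -5 and 20 + 15 = 35.
values = [-6, 0, 12, 15, 18, 34, 36]
print(outlier_fences(10, 20))              # (-5.0, 35.0)
print(potential_outliers(values, 10, 20))  # [-6, 36]
```

Note that 34 is not flagged: it is far from the median but still inside the upper fence.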

The median is often used for which of the following types of​ distribution?

Skewed The median is often used for skewed distributions. The mean is not often used for skewed distributions because skew affects the mean more than it affects the median.

If the standard deviation for a data set is​ zero, what can you conclude about the​ data?

The data values must all be equal. Standard deviation measures spread. If there is no​ spread, one can conclude that the values are all the same.

Which statement is NOT true regarding the mean? A. The mean should be used when the distribution is roughly symmetric. B. The mean is the center of gravity or balancing point for the data set. C. The mean is always the best measure of center. D. The calculation of the mean uses all the values in the data set.

The mean is not always the best measure of center. If the distribution is​ skewed, it might be better to report the median since it is resistant. If the data is​ qualitative, the mean cannot be used.

Suppose, on the warmest day of the​ month, the daily high temperature in a city is accidentally recorded as 700 instead of 70 degrees Fahrenheit. Compare the effect this mistake will have on the mean monthly high temperature to the effect on the median monthly high temperature.

The mean will increase​ significantly, but the median will not change as a result of the mistake. Unlike the​ median, the mean is not resistant to extreme values and will be significantly affected by changing only one temperature to such an extreme value.
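A quick way to see this effect is to compute both measures before and after the recording error. The list of daily highs below is made up for illustration (the question gives no actual data); only the 70 and 700 values come from the question.

```python
from statistics import mean, median

# Hypothetical daily highs for a 30-day month; the warmest day is 70 degrees.
highs = [58, 60, 61, 59, 62, 64, 63, 65, 66, 64,
         62, 61, 63, 65, 67, 68, 66, 64, 63, 65,
         67, 69, 70, 68, 66, 65, 64, 63, 62, 61]

# Simulate the mistake: the warmest day is recorded as 700 instead of 70.
recorded = list(highs)
recorded[recorded.index(max(recorded))] = 700

mean_shift = mean(recorded) - mean(highs)        # (700 - 70) / 30 = 21 degrees
median_shift = median(recorded) - median(highs)  # 0: the middle values are unchanged

print(mean_shift, median_shift)
```

Because the mistaken value was already the largest observation, the ordering of the middle values is untouched and the median stays where it was, while the mean absorbs the full 630-degree error spread over 30 days.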

Which statement is NOT true regarding the​ median?

The statement "The median is always one of the values in the data set" is not true. If there is an even number of values, then the median will be the mean of the middle two values, which will not necessarily be a value in the data set.

Which measure of center​ (mean or​ median) is​ resistant? Explain what it means for that measure to be resistant.

The median is resistant because it is not sensitive to extreme values in the data set. If the largest observation was doubled, for example, the median would not change since that largest value does not factor into its computation. This is why the median is used for skewed data.

If an observation has a​ z-score of​ 0, this means which of the​ following?

The observation is equal to the mean. If an observation has a​ z-score of​ 0, then it is equal to the mean. The mean is 0 standard deviations away from​ itself, so it has a​ z-score of 0.

Describe the sample standard deviation in words rather than with a formula.

The sample standard deviation is the square root of the quotient of the sum of the squared deviations from the mean and (n − 1). The formula for the sample standard deviation is $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$, where $\bar{x}$ is the sample mean and $n$ is the sample size. The sample variance, $s^2$, is the square of the sample standard deviation.
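The verbal description translates step by step into code. The sketch below uses a hypothetical helper name, `sample_std`, and made-up data; it is checked against the `stdev` function in Python's standard `statistics` module, which also divides by n − 1.

```python
from math import isclose, sqrt
from statistics import stdev


def sample_std(values):
    """Square root of (sum of squared deviations from the mean) / (n - 1)."""
    n = len(values)
    x_bar = sum(values) / n                                  # sample mean
    squared_deviations = [(x - x_bar) ** 2 for x in values]  # (x - x_bar)^2
    return sqrt(sum(squared_deviations) / (n - 1))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample, mean 5
s = sample_std(data)
print(isclose(s, stdev(data)))   # True: matches the library result
print(sample_std([3, 3, 3]))     # 0.0: identical values have no spread
```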

Name two measures of the variation of a​ distribution, and state the conditions under which each measure is preferred for measuring the variability of a single data set.

The standard deviation is preferred when the data is relatively symmetric. The interquartile range is preferred when the data is strongly skewed or has outliers.

If you calculate the​ z-score for your height in​ inches, what unit is used on the​ z-score?

The​ z-score will have no units. Calculating the​ z-score involves subtracting the mean in inches and then dividing by the standard deviation in inches. The inches divide out and leave a number with no units.

Identify when the interquartile range is better than the standard deviation as a measure of dispersion and explain its advantage.

When the distribution is skewed left or right or contains some extreme​ observations, then the interquartile range is preferred since it is resistant. The IQR is resistant to extreme values in the​ data, making it a better choice for a skewed distribution.

In a​ symmetric, unimodal​ distribution, about​ two-thirds of the observations are​ where?

Within one standard deviation of the mean Approximately​ two-thirds of the observations in a​ symmetric, unimodal distribution are within one standard deviation of the mean.

Which of the following is NOT needed to construct a​ boxplot?

Mean A boxplot uses the median as a measure of the​ center, not the mean.

The value that would be right in the middle if you were to sort the data from smallest to largest is called the​ ______.

Median

Suppose the list below shows how many text messages Elyse sent each day for the last 10 days. If Elyse wants to know how many text messages she typically sends each​ day, which measure of central tendency better describes the typical number of text messages per​ day? 21 22 24 26 26 29 32 32 33 88

Median; The median of 27.5 is a better representative of the center since it is resistant to the one extreme value. The mean of 33.3 is not representative of the typical number of texts since only one number is larger than the mean. The one extremely high value of 88​ texts/day increases the mean substantially so that it is no longer a good measure of the typical number of texts. The median is often a better measure of center when there are extreme values.
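The figures quoted in this answer can be reproduced directly from the ten values in the question, using Python's standard `statistics` module:

```python
from statistics import mean, median

texts = [21, 22, 24, 26, 26, 29, 32, 32, 33, 88]

text_mean = mean(texts)      # 333 / 10 = 33.3
text_median = median(texts)  # average of the middle pair, (26 + 29) / 2 = 27.5

# Only the extreme day exceeds the mean, so the mean is not a typical value.
above_mean = [x for x in texts if x > text_mean]

print(text_mean, text_median, above_mean)  # 33.3 27.5 [88]
```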

The interquartile range tells us how much space the​ _____ of the data occupy.

Middle 50% The interquartile range tells us how much space the middle 50% of the data occupy. It is found by subtracting the first quartile from the third quartile.

Name two measures of the center of a​ distribution, and state the conditions under which each is preferred for describing the typical value of a single data set.

One measure of the center of a distribution is the mean. This measure is preferred when the distribution is relatively symmetric. One measure of the center of a distribution is the median. This measure is preferred when the distribution is strongly skewed.

When an odd number of data values are arranged in​ order, the​ _________ is the middle value.

median

The Empirical Rule applies to distributions that are​ ________.

symmetric and unimodal According to the Empirical​ Rule, if a distribution is unimodal and​ symmetric, approximately​ 68% of the observations will be within one standard deviation of the​ mean, approximately​ 95% of the observations will be within two standard deviations of the​ mean, and nearly all the observations will be within three standard deviations of the mean.
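The 68%, 95%, and "nearly all" figures match the exact proportions for a normal distribution, which is one particular symmetric, unimodal shape. The sketch below assumes a normal model (an assumption beyond what the card states) and uses a hypothetical helper, `share_within`; other symmetric, unimodal distributions will only follow the rule approximately.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```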

The interquartile range​ (IQR) is the difference between the​ _______ quartile and the​ _______ quartile.

third and first Quartiles divide the ordered data into four equal parts. The first quartile is the median of the bottom half of the data and separates the first​ quarter, or​ 25%, of the data from the upper​ three-quarters, or​ 75%, of the data. The second quartile is the median. The third quartile is the median of the top half of the data and separates the lower​ three-quarters, or​ 75%, of the data from the upper​ quarter, or​ 25%, of the data. The IQR is the difference between the third and first​ quartiles, that​ is, IQR=Q3−Q1.
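As a rough illustration of the "median of each half" description, the sketch below computes the quartiles and the IQR for the ten text message counts from the earlier question. The function name `quartiles` is hypothetical, and leaving the middle value out of both halves when the count is odd is an assumed convention; statistical software often uses other interpolation rules and may report slightly different quartiles.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2
    lower_half = data[:half]
    upper_half = data[len(data) - half:]  # skips the middle value if the count is odd
    return median(lower_half), median(data), median(upper_half)


texts = [21, 22, 24, 26, 26, 29, 32, 32, 33, 88]
q1, q2, q3 = quartiles(texts)
iqr = q3 - q1  # IQR = Q3 - Q1

print(q1, q2, q3, iqr)  # 24 27.5 32 8
```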

Can the variance of a data set ever be​ negative? Explain.

​No; since the variance is based on the squared deviations from the mean and​ N, it cannot be negative. A population variance is the sum of the squared deviations from the​ mean, divided by N. Since the deviations from the mean are​ squared, each one is zero or​ positive, never negative. N is the number of objects in the population and must be positive. As a​ result, the variance must be zero or a positive​ number, never negative.

A community college faculty is negotiating a new contract with the school board. The distribution of faculty salaries is skewed right by several faculty members who make over​ $100,000 per year. If the faculty want to give the community the impression that they deserve higher​ salaries, should they advertise the mean or median of their current​ salaries?

The faculty should use the median to make their argument. The median will be lower than the mean since the mean is influenced by the few extremely high salaries. The median is resistant to the extremely high salaries and will not be influenced as much as the mean. As a​ result, the median will be lower than the mean. By reporting the lower measure of​ center, the faculty can better make their case that they deserve higher salaries.

If all the data values in a population are converted to​ z-scores, the distribution of​ z-scores will have what​ mean?

The mean of the​ z-scores will be zero. When data is standardized by converting it to​ z-scores, the new distribution has a mean of zero and a standard deviation of 1.
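This property can be checked numerically. The heights below are hypothetical, and `z_scores` is an illustrative helper; since the question concerns a population, the sketch uses the population standard deviation (`pstdev`).

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize each value: (value - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (divides by N)
    return [(x - mu) / sigma for x in values]


heights = [62, 65, 67, 70, 71]  # hypothetical heights in inches
z = z_scores(heights)

# Up to floating-point rounding, the mean is 0 and the standard deviation is 1.
print(round(mean(z), 10), round(pstdev(z), 10))
```

The inches cancel in the division, so the resulting z-scores carry no units.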

The mean represents the typical value in a set of data for what type of​ distribution?

The mean represents the typical value in a set of data for distributions that are roughly symmetric.

How can you tell from a boxplot if the distribution is​ symmetric?

The median is in the center of the box, and the left and right whiskers are approximately the same length. If a distribution is symmetric, then the distance from the median to the first quartile will be about the same as the distance from the median to the third quartile, and the distance from the minimum to Q1 will be about the same as the distance from Q3 to the maximum.

A community college school board is negotiating a new contract with the college faculty. The distribution of faculty salaries is skewed right by several faculty members who make over​ $100,000 per year. If the school board wants to give the community the impression that the faculty are already​ overpaid, should they advertise the mean or median of the faculty​ salaries?

The school board should use the mean to make their argument. The mean will be higher than the median since it will be influenced by the few high salaries. The mean is not resistant and will be pulled in the direction of the tail for a skewed distribution. Since the distribution is​ right-skewed, the mean will be pulled to the right which is a higher value.

To compute the​ variance, what should one​ do?

Square the standard deviation. The variance is the square of the standard deviation. It is represented symbolically by s^2.

If​ someone's gross annual income has a​ z-score of positive​ 2, what can be​ concluded?

Their income is 2 standard deviations above the mean income.

a. In your own words, describe to someone who knows only a little statistics how to recognize when an observation is an outlier. What action(s) should be taken with an outlier? b. Which measure of the center (mean or median) is more resistant to outliers, and what does "resistant to outliers" mean?

a. Outliers are observed values far from the main group of data. In a histogram they are separated from the others by space. Outliers must be looked at in closer context to know how to treat them. If they are mistakes that happened while collecting the data, they might be removed or corrected. If they are legitimately observed values, you might do the analysis twice, once with and once without the outliers, to give a better idea of what a typical value is. Note that in statistics, potential outliers are defined as observations that are more than 1.5 interquartile ranges below the first quartile or above the third quartile, not above or below the median. Also note that a potential outlier is not the same thing as an outlier.

b. The median is more resistant, which indicates that it usually changes less than the mean when comparing data with and without outliers. This is especially true when the outliers have extreme values. An extreme value can distort the mean because the mean shifts heavily in the direction of the extreme value. The median is determined by the middle value after ordering all the observations from lowest to highest, so the amount it shifts depends on the number of observations rather than on how extreme the outlier is. If there is only one outlier with an extremely large value, the median will shift very slightly, while the mean will change significantly.

