Chapter 3 STATS


3.1.RA-2 Fill in the blank below. The mean of a collection of data is located at the​ ______ of a distribution of data.

The mean of a collection of data is located at the​ "balancing point" of a distribution of data.

In a​ boxplot, the vertical line inside the box marks the location of the

median.

Which of the following is NOT one of the five numbers needed to make a​ boxplot? Q1 The maximum Median Mean

The mean is not shown in a​ boxplot, so it is not used to construct boxplots.

When a distribution contains​ outliers, which of the following is the best choice for a measure of​ center? Choose the correct answer below. Interquartile range Mean Median Standard deviation

note: The median is resistant to​ outliers, so when a distribution contains​ outliers, the median is the best choice for a measure of center.

3.1.RA-4 The mean represents the typical value in a set of data for what type of​ distribution? A. For distributions that are roughly symmetric B. For distributions that are bimodal C. For distributions that are skewed D. For all distributions

A. For distributions that are roughly symmetric

In a​ right-skewed distribution, which of the following is​ true? A. The mean and median are approximately the same. B. The mean tends to be greater than the median. C. The mean tends to be less than the median. D. None of these

B. The mean tends to be greater than the median. note: The mean tends to be greater than the median in a​ right-skewed distribution. This is because the higher values to the right of the center pull the mean up more than they affect the median.
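As a small illustration of the note above, the following Python sketch uses a made-up right-skewed data set (the values are hypothetical, not from the textbook) to show the mean landing above the median.

```python
import statistics

# Hypothetical right-skewed data: most values are small, one is large.
data = [1, 2, 2, 3, 3, 3, 4, 5, 20]

mean = statistics.mean(data)      # pulled upward by the value 20
median = statistics.median(data)  # middle value of the sorted data

print(f"mean = {mean:.2f}, median = {median}")
```

Replacing 20 with an even larger value would raise the mean further while leaving the median at 3.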

The value that would be right in the middle if you were to sort the data from smallest to largest is called the

median note: The median is the value that would be right in the middle if you were to sort the data from smallest to largest. About​ 50% of the observations are below it and about​ 50% of the observations are above it.

Name two measures of the center of a​ distribution, and state the conditions under which each is preferred for describing the typical value of a single data set. What are two measures of the center of a​ distribution? interquartile range and standard deviation first quartile and third quartile median and mean

median and mean

3.2.RA-3 A standard unit measures which of the​ following? A. How many standard deviations away an observation is from the mean B. How many standard deviations away an observation is from the median C. The magnitude of the standard deviation D. The interval within which approximately​ 68% of the observations fall

A. How many standard deviations away an observation is from the mean note: A standard unit is how many standard deviations away an observation is from the mean. A measurement converted to standard units is called a​ z-score.
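A minimal sketch of converting observations to standard units, assuming a small hypothetical data set and the sample standard deviation (dividing by n − 1). The helper name `z_score` is illustrative only.

```python
import statistics

# Hypothetical observations (illustrative only).
data = [60, 64, 68, 72, 76]

mean = statistics.mean(data)  # 68
sd = statistics.stdev(data)   # sample standard deviation (divides by n - 1)

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

z_scores = [z_score(x, mean, sd) for x in data]
print([round(z, 2) for z in z_scores])
```

The observation equal to the mean (68) gets a z-score of 0, and observations below the mean get negative z-scores.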

For most applications, why is the standard deviation preferred over the variance? A. The units for the variance are always squared. B. The standard deviation is easier to calculate than the variance. C. The standard deviation more accurately measures the variability in a distribution. D. All of the above

A. The units for the variance are always squared. note: The units for the variance are always squared, so the variance is not on the same scale as the data or the measures of center. The standard deviation is in the original units, which makes it easier to interpret as a measure of spread.

Boxplots are NOT recommended for use with which of the following types of​ distributions? Unimodal Skewed Symmetric Bimodal or any multimodal distribution

answer: Bimodal or any multimodal distribution note: Boxplots are best used only for unimodal distributions because they hide bimodality or any multimodality.

What is the first step to do with potential​ outliers? Choose the correct answer below. A. Eliminate them from the data set B. Assume there was an error in the sampling process C. Assume there was an error in entering the data D. Investigate further

The first step with potential outliers is always to investigate. A potential outlier might not be an outlier at all. Or a potential outlier might tell an interesting​ story, or it might be the result of an error in entering data.

A dieter recorded the number of calories he consumed at lunch for one week. As you can​ see, a mistake was made on one entry. The calories are listed in increasing order below. 349, 371, 386, 398, 412, 4190 When the error is corrected by removing the extra​ 0, will the mean​ change? Will the​ median? Explain without doing any calculation.

note: The median is resistant to outliers and extreme values because it orders the data from lowest to highest and looks at the middle value. The highest value does not change the​ order, and so it does not change the median. The mean is the balancing point for the data set. When looking at the shape of a​ histogram, the mean is the point which balances the weight on both sides. If an extreme value is placed on one end of the​ mean, it has to shift in that direction to keep everything balanced.
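Although the question asks for an explanation without calculation, the reasoning can be checked numerically. The sketch below uses the calorie values from the card and assumes the corrected entry is 419 (the erroneous 4190 with the extra 0 removed).

```python
import statistics

with_error = [349, 371, 386, 398, 412, 4190]
corrected = [349, 371, 386, 398, 412, 419]  # extra 0 removed from the last entry

median_with_error = statistics.median(with_error)
median_corrected = statistics.median(corrected)
mean_with_error = statistics.mean(with_error)
mean_corrected = statistics.mean(corrected)

# The median is the average of the two middle values (386 and 398) in both lists.
print("median:", median_with_error, "->", median_corrected)
# The mean drops sharply once the extreme value is corrected.
print("mean:", round(mean_with_error, 1), "->", round(mean_corrected, 1))
```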

The symbol ∑ stands for which of the​ following? Multiplication Summation Division Finding the mean

note: The symbol ∑ stands for summation. If x represented a single​ observation, then ∑x would mean that all the values should be added together.

3.1.RA-7 To compute the variance, what should one do? A. Double the standard deviation. B. Square the standard deviation. C. Divide the standard deviation by n − 1. D. Take the square root of the standard deviation.

note: The variance is the square of the standard deviation. It is represented symbolically by s².

For what purpose is the median​ used? A. To give the spread of a distribution B. To measure the variation of a data set C. To give a typical value of a data set D. None of these

C. To give a typical value of a data set note: The median is a typical value of a data set. It is used particularly when the distribution is skewed.

Name two measures of the center of a​ distribution, and state the conditions under which each is preferred for describing the typical value of a single data set. Under what conditions is the median​ preferred? A. The median is preferred when there are few data points. B. The median is preferred when the data is strongly skewed or has outliers. C. The median is preferred when there are many data points. D. The median is preferred when the data is relatively symmetric.

B. The median is preferred when the data is strongly skewed or has outliers. note: The median provides a better measure of center when the data is skewed or has outliers because the presence of an outlier has a much greater effect on the mean.

The median is often used for which of the following types of​ distribution? Uniform Skewed Symmetric Bimodal

Skewed note: The median is often used for skewed distributions. The mean is not often used for skewed distributions because skew affects the mean more than it affects the median.

Why is the mean different from the​ median?

note: The median gives a better measure of center for this distribution because the​ professor's age is an outlying observation. The median also tends to give a better representation of a typical observation in a skewed distribution.

Which measure of the center​ (mean or​ median) is more resistant to​ outliers, and what does​ "resistant to​ outliers" mean?

The median is more resistant, which indicates that it usually changes less than the mean when comparing data with and without outliers. note: The median is more resistant to outliers than the mean, especially when the outliers have extreme values. An extreme value can distort the mean because the mean shifts heavily in the direction of that value. The amount the median shifts depends on the number of observations, because the median is determined by the middle value after ordering all the observations from lowest to highest. If there is only one outlier with an extremely large value, the median will shift very slightly, while the mean will change significantly.

3.2.RA-1 According to the Empirical​ Rule, ________ will be within two standard deviations of the mean.

note: According to the Empirical Rule, if a distribution is unimodal and symmetric, approximately 95% of the observations will be within two standard deviations of the mean.

If the mean and the median of a distribution are approximately the​ same, then the shape of the distribution is likely to be​ _______.

note: If the mean and the median of a distribution are approximately the​ same, then the shape of the distribution is likely to be symmetric.

In a​ boxplot, the whiskers extend to which of the​ following? Choose the correct answer below. A. The smallest and largest values in the data set B. To the most extreme values that are not potential outliers C. To the first and third quartiles D. None of these

note: In a boxplot, the whiskers extend to the most extreme values that are not potential outliers. Potential outliers are then represented by other markers, such as dots.

The interquartile range tells us how much space the​ _____ of the data occupy.

note: The interquartile range tells us how much space the middle 50% of the data occupy. It is found by subtracting the first quartile from the third quartile (IQR = Q3 − Q1).
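A short sketch of the interquartile range computed as Q3 − Q1, on hypothetical data. Quartile conventions differ between textbooks and software; this example assumes the default ("exclusive") method of Python's `statistics.quantiles`, which may not match the textbook's rule on every data set.

```python
import statistics

# Hypothetical data set (illustrative only).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (the median), and Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1  # width of the middle 50% of the data
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```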

The length of the box in a boxplot is proportional to which of the​ following? Choose the correct answer below. IQR Mean Median Standard deviation

note: The length of the box in a boxplot is proportional to the IQR. The left edge of the box is at the first quartile and the right edge is at the third quartile.

In a​ boxplot, potential outliers are points that are more than​ ___ IQRs from the edges of the box.

In a​ boxplot, potential outliers are points that are more than 1.5 IQRs below the first quartile or above the third quartile.
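The 1.5 IQR rule can be sketched as follows. The data are hypothetical, and the quartiles come from the default method of Python's `statistics.quantiles`, which is an assumption; other quartile conventions can give slightly different fences.

```python
import statistics

# Hypothetical data with one unusually large value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30]

q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr  # values below this are potential outliers
upper_fence = q3 + 1.5 * iqr  # values above this are potential outliers

potential_outliers = [x for x in data if x < lower_fence or x > upper_fence]
print(lower_fence, upper_fence, potential_outliers)
```

Here only the value 30 falls outside the fences, so it would be flagged for further investigation rather than removed automatically.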

When comparing​ groups, if one group is strongly skewed or has outliers and the other is​ symmetric, which of the following should be used to compare the​ groups? A. The median and interquartile range for the skewed group and the mean and standard deviation for the symmetric group B. The mean and standard deviation for the skewed group and the median and interquartile range for the symmetric group C. The means and standard deviations D. The medians and interquartile ranges

D. The medians and interquartile ranges note: When comparing two​ distributions, one should always use the same measures of center and spread for both distributions. Since the mean will be affected by the skew or outliers in the first​ distribution, use the median and interquartile range for both distributions.

The​ ______ is a number that measures how far away the typical observation is from the mean.

answer: standard deviation note: For most​ distributions, a majority of observations are within one standard deviation of the mean value.

3.2.RA-2 The Empirical Rule applies to distributions that are​

answer: symmetric and unimodal. note: According to the Empirical​ Rule, if a distribution is unimodal and​ symmetric, approximately​ 68% of the observations will be within one standard deviation of the​ mean, approximately​ 95% of the observations will be within two standard deviations of the​ mean, and nearly all the observations will be within three standard deviations of the mean.
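The percentages in the Empirical Rule can be checked against the normal model, which is one symmetric, unimodal distribution (an assumption made for this sketch; real data will match only approximately). The helper name `share_within` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```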

Because the median is not affected by the size of an outlier and does not change even if a particular outlier is replaced by an even more extreme​ value, we say the median is​ _____ to outliers.

note: The median is resistant to outliers. This makes it a good choice for a measure of center when a distribution is skewed.

The ______ is another term for the arithmetic average.

mean note: The mean is another term for the arithmetic average. It can be thought of as the balancing point of a distribution of data.

3.2.RA-4 If an observation has a​ z-score of​ 0, this means which of the​ following? Choose the correct answer below. A. The observation is equal to the standard deviation. B. The observation is equal to the median. C. The​ z-score was computed incorrectly. D. The observation is equal to the mean.

D. The observation is equal to the mean. note: If an observation has a​ z-score of​ 0, then it is equal to the mean. The mean is 0 standard deviations away from​ itself, so it has a​ z-score of 0.

When a distribution is​ skewed, the​ _______ is used to measure the center and the​ _______ is used to measure variation.

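answer: median; interquartile range note: When a distribution is skewed, the median and interquartile range are used to measure the center and variation, respectively. The mean and standard deviation are used when a distribution is symmetric.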

3.1.1 A sociologist​ says, "Typically, men in a certain country still earn more than​ women." What does this statement​ mean? A. The center of the distribution of salaries for men in the country is greater than the center for women. B. The highest paid people in the country are men. C. All​ women's salaries in the country are less varied than all​ men's salaries. D. All men make more than all women in the country.

A. The center of the distribution of salaries for men in the country is greater than the center for women. note: In a distribution of​ values, the typical value is given by the mean. In this​ case, the average salary of a man is higher than that of a​ woman, so when comparing the distribution of​ men's salaries to​ women's salaries, the center of the distribution for men is greater than the center of the distribution for the women.

3.1.RA-6 In a​ symmetric, unimodal​ distribution, about​ two-thirds of the observations are​ where? A. Within three standard deviations of the mean B. Within two standard deviations of the mean C. More than one standard deviation from the mean D. Within one standard deviation of the mean

D. Within one standard deviation of the mean

In your own​ words, describe to someone who knows only a little statistics how to recognize when an observation is an outlier. What​ action(s) should be taken with an​ outlier?

Outliers are observed values far from the main group of data. In a histogram they are separated from the others by space. Outliers must be looked at in closer context to know how to treat them. If they are​ mistakes, they might be removed or corrected. If they are not​ mistakes, you might do the analysis​ twice, once with and once without the outliers. note: Outliers are observed values that lie outside the range of the main group of data. When an outlier is​ present, the observer needs to consider it more closely. Sometimes it is just a mistake that happened while collecting the data and can be corrected or discarded. Other times it is a legitimately observed value. In those​ cases, the analysis needs to be presented once with outliers and once without outliers to give a better idea of what a typical value is. Note that in​ statistics, potential outliers are defined as observations that are more than 1.5 interquartile ranges below the first quartile or above the third​ quartile, not above or below the median. Also note that a potential outlier is not the same thing as an outlier.

Name two measures of the center of a​ distribution, and state the conditions under which each is preferred for describing the typical value of a single data set. Under what conditions is the mean​ preferred? A. The mean is preferred when the data is relatively symmetric. B. The mean is preferred when the data is strongly skewed or has outliers. C. The mean is preferred when there are few data points. D. The mean is preferred when there are many data points.

A. The mean is preferred when the data is relatively symmetric.

3.2.RA-5 Which of the following can be used to compare values measured in different​ units, such as inches and​ pounds? ​z-score standard deviation standard error interquartile range

answer: ​z-score note: The​ z-score measures distance from a mean in terms of standard​ deviations, so it can be used to compare values measured in different​ units, such as inches and pounds.
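To illustrate comparing measurements in different units, the sketch below converts a height in inches and a weight in pounds to z-scores. All means, standard deviations, and observations are hypothetical values chosen for the example.

```python
def z_score(x, mean, sd):
    """Distance of x from the mean, in standard deviations."""
    return (x - mean) / sd

# Hypothetical group summaries (illustrative only).
height_z = z_score(70, mean=66, sd=4)     # height in inches
weight_z = z_score(190, mean=160, sd=20)  # weight in pounds

# Both results are unitless, so they can be compared directly.
print(height_z, weight_z)
```

In this made-up case the weight is more unusual for its group (1.5 standard deviations above the mean) than the height (1.0 standard deviation above the mean).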

3.1.19 In a recent​ competition, do you think the standard deviation of the running times for all men who ran the​ 100-meter race would be larger or smaller than the standard deviation of the running times for the​ men's marathon? Explain. A. The standard deviation for the​ 100-meter event would be less. All the runners come to the finish line within a few seconds of each other. In the​ marathon, the runners can be quite widely spread after running that long distance. B. The standard deviation for the marathon event would be less. Many more runners compete in a marathon rather than a​ 100-meter event.​ Therefore, the average time will be determined with greater precision. C. The standard deviation for the marathon event would be less. All the runners finish the race in a matter of seconds. In the​ marathon, the runners can be quite widely spread after running that long distance. D. The standard deviation for the​ 100-meter event would be less. All the runners finish the race in a matter of seconds. In the​ marathon, the runners take at least a few hours to complete the course.

note: Since the running times in the 100-meter event will be within a few seconds of each other, the times will have small variation. In the marathon, since the running times are likely to be minutes apart, the times will have greater variation. Thus, the marathon running times will have a greater standard deviation.

