Stats - Chapter 3


In a​ boxplot, potential outliers are points that are more than​ ___ IQRs from the edges of the box.

1.5 IQRs

The mean of a collection of data is located at the​ ________ of a distribution of data.

Balancing Point
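
One way to see why the mean is called the balancing point: the deviations of the observations from the mean always add up to zero. The short sketch below checks this on three made-up numbers (the data are an assumption for illustration only).

```python
from statistics import mean

data = [2, 4, 9]  # made-up observations
center = mean(data)

# Distances below the mean are negative, distances above are positive.
deviations = [x - center for x in data]

# The negative and positive deviations cancel, so the mean "balances" the data.
print(center, deviations, sum(deviations))
```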

Boxplots are NOT recommended for use with which of the following types of​ distributions?

Bimodal or any multimodal distribution. Boxplots are best used only for unimodal distributions because they hide bimodality or any multimodality.

The mean represents the typical value in a set of data for what type of​ distribution?

Distributions that are roughly symmetric.

A standard unit measures which of the​ following?

How many standard deviations away an observation is from the mean

The length of the box in a boxplot is proportional to which of the​ following?

IQR. The length of the box in a boxplot is proportional to the IQR. The left edge of the box is at the first quartile and the right edge is at the third quartile.

If an observation has a​ z-score of​ 0, this means which of the​ following?

If an observation has a​ z-score of​ 0, then it is equal to the mean. The mean is 0 standard deviations away from​ itself, so it has a​ z-score of 0.

What is the first step to do with potential​ outliers?

Investigate further. The first step with potential outliers is always to investigate. A potential outlier might not be an outlier at all. Or a potential outlier might tell an interesting story, or it might be the result of an error in entering data.

The​ five-number summary for a distribution of final exam scores is 40​, 72​, 79​, 90​, 96. Explain why it is not possible to draw a boxplot based on this information.​ (Hint: What more do you need to​ know?)

It​ isn't possible to draw a boxplot based on the​ five-number summary because the boxplot must mark all potential outliers. Since the minimum is lower than the left​ limit, there may be other unknown outliers between the minimum and left​ limit, and so the boxplot cannot mark all potential outliers.
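
The limits mentioned in this answer can be worked out from the five-number summary with the 1.5 IQR rule. The sketch below is only an illustration; the helper name `outlier_fences` is made up for this example.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) limits beyond which points are potential outliers."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Five-number summary from the question: min, Q1, median, Q3, max.
minimum, q1, median, q3, maximum = 40, 72, 79, 90, 96

lower, upper = outlier_fences(q1, q3)
print(lower, upper)      # 45.0 117.0
print(minimum < lower)   # True: the minimum is a potential outlier
print(maximum > upper)   # False: the maximum is not
```

Since the minimum of 40 is below the lower limit of 45, the left whisker has to stop at the smallest value that is at least 45, and the summary alone does not say what that value is.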

What does a Symmetric Distribution look like?

Left and right hand sides are roughly mirror images.

The​ ____ is another term for the arithmetic average.

Mean. The mean is another term for the arithmetic average. It can be thought of as the balancing point of a distribution of data.

When a distribution is​ skewed, the​ _______ is used to measure the center and the​ _______ is used to measure variation.

Median and interquartile range. The mean and standard deviation are used to measure the center and​ variation, respectively, when a distribution is symmetric.

What are the 5 numbers needed for a boxplot?

Min, Q1, Median, Q3, Max

In your own​ words, describe to someone who knows only a little statistics how to recognize when an observation is an outlier. What​ action(s) should be taken with an​ outlier?

Outliers are observed values far from the main group of data. In a histogram they are separated from the others by space. Outliers must be looked at in closer context to know how to treat them. If they are​ mistakes, they might be removed or corrected. If they are not​ mistakes, you might do the analysis​ twice, once with and once without the outliers.

The median is often used for which of the following types of​ distribution?

Skewed. The median is often used for skewed distributions. The mean is not often used for skewed distributions because skew affects the mean more than it affects the median.

The​ ______ is a number that measures how far away the typical observation is from the mean.

Standard deviation. For most distributions, a majority of observations are within one standard deviation of the mean value.

What does the symbol ∑ stand for?

Summation

The Empirical Rule applies to distributions that are​ ________.

Symmetric and unimodal. According to the Empirical Rule, if a distribution is unimodal and symmetric, approximately 68% of the observations will be within one standard deviation of the mean, approximately 95% of the observations will be within two standard deviations of the mean, and nearly all the observations will be within three standard deviations of the mean.
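
The Empirical Rule can be checked roughly by simulation. The sketch below draws made-up data from a bell-shaped (normal) distribution, which is one example of a symmetric, unimodal distribution; the mean of 100, standard deviation of 15, sample size, and seed are all arbitrary assumptions.

```python
import random
from statistics import mean, pstdev

rng = random.Random(0)  # fixed seed so the run is repeatable
data = [rng.gauss(100, 15) for _ in range(20_000)]

center = mean(data)
spread = pstdev(data)

def share_within(k):
    """Fraction of observations within k standard deviations of the mean."""
    return sum(abs(x - center) <= k * spread for x in data) / len(data)

# Shares should come out near 68%, 95%, and nearly 100%.
for k in (1, 2, 3):
    print(k, round(share_within(k), 3))
```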

A sociologist​ says, "Typically, men in a certain country still earn more than​ women." What does this statement​ mean?

The center of the distribution of salaries for men in the country is greater than the center for women. In a distribution of​ values, the typical value is given by the mean. In this​ case, the average salary of a man is higher than that of a​ woman, so when comparing the distribution of​ men's salaries to​ women's salaries, the center of the distribution for men is greater than the center of the distribution for the women.

A dieter recorded the number of calories he consumed at lunch for one week. As you can see, a mistake was made on one entry. The calories are listed in increasing order below. 349, 371, 386, 398, 412, 4190. When the error is corrected by removing the extra 0, will the mean change? Will the median? Explain without doing any calculation.

The corrected value will give a different mean but not a different median. Medians are resistant to outliers and not as affected by extreme​ values, but the more extreme a value​ is, the more the mean is affected by it.
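
Although the question asks for an answer without calculation, the claim is easy to confirm. The sketch below uses the calorie values from the question and assumes the corrected entry is 419 (the extra 0 removed).

```python
from statistics import mean, median

recorded = [349, 371, 386, 398, 412, 4190]   # with the data-entry error
corrected = [349, 371, 386, 398, 412, 419]   # extra 0 removed

# The mean drops sharply once the error is fixed.
print(mean(recorded), mean(corrected))

# The median is the average of the two middle values (386 and 398) in both lists.
print(median(recorded), median(corrected))   # 392.0 392.0
```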

In a​ right-skewed distribution, which of the following is​ true?

The mean tends to be greater than the median in a​ right-skewed distribution. This is because the higher values to the right of the center pull the mean up more than they affect the median.

The value that would be right in the middle if you were to sort the data from smallest to largest is called the​ ______.

The median. The median is the value that would be right in the middle if you were to sort the data from smallest to largest. About 50% of the observations are below it and about 50% of the observations are above it.

For what purpose is the median​ used?

The median is a typical value of a data set. It is used particularly when the distribution is skewed.

Which measure of the center​ (mean or​ median) is more resistant to​ outliers, and what does​ "resistant to​ outliers" mean?

The median is more resistant to outliers than the mean, especially when the outliers have extreme values. "Resistant to outliers" means the measure changes little when extreme values are present. An extreme value can distort the mean, because the mean shifts heavily in the direction of the extreme value. The amount the median shifts by is based on the number of data observations, because it is determined by the middle value after ordering all the observations from lowest to highest. If there is only one outlier with an extremely large value, the median will shift very slightly, while the mean will change significantly.

Because the median is not affected by the size of an outlier and does not change even if a particular outlier is replaced by an even more extreme​ value, we say the median is​ _____ to outliers.

The median is resistant to outliers. This makes it a good choice for a measure of center when a distribution is skewed.

When a distribution contains​ outliers, which of the following is the best choice for a measure of​ center?

The median is resistant to​ outliers, so when a distribution contains​ outliers, the median is the best choice for a measure of center.

For most applications, why is the standard deviation preferred over the variance?

The units for the variance are always squared. Measuring spread with the variance would put the spread in different units from the center (for example, squared inches instead of inches), whereas the standard deviation is in the same units as the data.

In a​ boxplot, the vertical line inside the box marks the location of the​ _______.

The vertical line inside the box marks the location of the median.

In a​ boxplot, the whiskers extend to which of the​ following?

To the most extreme values that are not potential outliers. In a boxplot, the whiskers extend to the most extreme values that are not potential outliers. Potential outliers are then represented by other markers, such as dots.

In a​ symmetric, unimodal​ distribution, about​ two-thirds of the observations are​ where?

Within one standard deviation of the mean

What can be used to compare values measured in different​ units, such as inches and​ pounds?

Z-score. The z-score measures distance from a mean in terms of standard deviations, so it can be used to compare values measured in different units, such as inches and pounds.
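
As a rough illustration of comparing inches with pounds, the sketch below converts a height and a weight to z-scores. All the numbers (the observations, means, and standard deviations) are made up for this example, and `z_score` is a hypothetical helper name.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

# Made-up example: heights in inches, weights in pounds.
z_height = z_score(70, mean=66, sd=3)
z_weight = z_score(190, mean=160, sd=30)

# The units cancel, so the two z-scores can be compared directly.
print(z_height, z_weight)
print(z_score(66, mean=66, sd=3))  # 0.0: an observation equal to the mean
```

Under these assumed numbers, the height is more unusual than the weight because it is more standard deviations above its mean.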

According to the Empirical​ Rule, ________ will be within two standard deviations of the mean.

approximately 95% of the observations

The interquartile range tells us how much space the​ _____ of the data occupy.

Middle 50%. The interquartile range tells us how much space the middle 50% of the data occupy. It is found by subtracting the first quartile from the third quartile (IQR = Q3 − Q1).
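
A minimal sketch of the calculation, using Python's `statistics.quantiles` on made-up data. Software and textbooks use slightly different conventions for locating quartiles, so on other data sets the values may differ a little from a hand calculation.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]  # made-up observations

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4)

# IQR = Q3 - Q1, the width of the middle 50% of the data.
iqr = q3 - q1
print(q1, q3, iqr)
```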

To compute the​ variance, what should one​ do?

Square the standard deviation. The variance is the square of the standard deviation. It is represented symbolically by s².
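
The relationship can be checked on a small made-up data set. The sketch below assumes the sample versions of the formulas (dividing by n − 1), which is what the symbol s² usually denotes.

```python
from statistics import mean, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up observations

# Sample variance by hand: sum of squared deviations divided by n - 1.
xbar = mean(data)
s_squared = sum((x - xbar) ** 2 for x in data) / (len(data) - 1)

# Squaring the sample standard deviation gives the same number.
s = stdev(data)
print(s_squared, variance(data), s ** 2)
```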

