Stats Ch. 2
After constructing a relative frequency distribution summarizing IQ scores of college students, what should be the sum of the relative frequencies?
If percentages are used, the sum should be 100%. If proportions are used, the sum should be 1.
Determine whether the statement is true or false. If it is false, rewrite it as a true statement. The mean is the measure of central tendency most likely to be affected by an extreme value (outlier).
True.
Determine whether the statement is true or false. If it is false, rewrite it as a true statement. The second quartile is the median of an ordered data set.
True.
Z-score
(Value − Mean) ÷ Standard Deviation
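The formula can be sketched in a few lines of Python. The function name `z_score` and its parameter names are chosen here for illustration; the figures come from the baseball-team card in this set (mean 186 cm, standard deviation 9 cm).

```python
def z_score(value, mean, std_dev):
    """Return how many standard deviations `value` lies from `mean`."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    # Subtract first, then divide: (value - mean) / std_dev
    return (value - mean) / std_dev


print(z_score(204, 186, 9))  # 2.0
print(z_score(186, 186, 9))  # 0.0
```

A value equal to the mean gives a z-score of 0, which is why a z-score of 0 is possible.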
Deviation
The difference between the entry and the population mean.
Standard Deviation
The square root of the variance.
Range
(Maximum data entry) − (Minimum data entry)
Construct a data set that has the given statistics. n = 7, x̄ = 19, s = 0 1.) What does the value n mean? 2.) What does the value x̄ mean? 3.) What does the value s mean? 4.) Complete the sample data set.
1.) The number of values in the sample data set. This means there should be 7 values in the sample data set. 2.) The mean of the sample data set. This means the sample mean of the 7 values in the sample data should be 19. 3.) The spread of the data from the sample mean. This means the sample standard deviation of the 7 values should be 0, so every value must equal the mean. 4.) 19, 19, 19, (19), 19, (19), 19
In terms of displaying data, how is a stem-and-leaf plot similar to a dot plot?
Both plots show how data are distributed. Both plots can be used to identify unusual data values. Both plots can be used to determine specific data entries.
Relative frequency
Class frequency ÷ Sample size
Determine whether the statement is true or false. If it is false, rewrite it as a true statement. In a frequency distribution, the class width is the distance between the lower and upper limits of a class.
False. In a frequency distribution, the class width is the distance between the lower or upper limits of consecutive classes.
What are some benefits of using graphs of frequency distributions?
It can be easier to identify patterns of a data set by looking at a graph of the frequency distribution.
What are some benefits of representing data sets using frequency distributions?
Organizing the data into a frequency distribution can make patterns within the data more evident.
Describe the relationship between quartiles and percentiles.
Quartiles are special cases of percentiles. Q1 is the 25th percentile. Q2 is the 50th percentile. Q3 is the 75th percentile.
What is the difference between relative frequency and cumulative frequency?
Relative frequency of a class is the percentage of the data that falls in that class, while cumulative frequency of a class is the sum of the frequencies of that class and all previous classes.
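A small sketch of both calculations may help. The class frequencies below are hypothetical and are not taken from the source.

```python
from itertools import accumulate

# Hypothetical class frequencies for five classes (illustration only).
frequencies = [5, 8, 6, 8, 3]
sample_size = sum(frequencies)

# Relative frequency: the share of the data that falls in each class.
relative = [f / sample_size for f in frequencies]

# Cumulative frequency: running total of this class and all previous classes.
cumulative = list(accumulate(frequencies))

print(cumulative)                # [5, 13, 19, 27, 30]
print(round(sum(relative), 10))  # 1.0
```

The last cumulative frequency equals the sample size, and the relative frequencies sum to 1.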
Weighted Mean
Score × Weight = Product. Weighted mean = (Sum of all products) ÷ (Sum of all weights)
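As a minimal sketch, the two steps (multiply each score by its weight, then divide the sum of the products by the sum of the weights) could look like this. The function name `weighted_mean` and the sample scores and weights are hypothetical.

```python
def weighted_mean(scores, weights):
    """Sum of (score * weight) divided by the sum of the weights."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    products = [s * w for s, w in zip(scores, weights)]
    return sum(products) / total_weight


# Hypothetical scores with weights 5, 3, and 2 (illustration only).
print(weighted_mean([80, 90, 70], [5, 3, 2]))  # 81.0
```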
Mode
The data entry that occurs with the greatest frequency.
Determine whether the statement is true or false. If it is false, rewrite it as a true statement. It is impossible to have a z-score of 0.
The statement is false. A z-score of 0 is a standardized value that is equal to the mean.
Determine whether the statement is true or false. If it is false, rewrite it as a true statement. An ogive is a graph that displays relative frequencies.
The statement is false. An ogive is a graph that displays cumulative frequencies.
Mean
The sum of the data entries divided by the number of entries.
Heights of men on a baseball team have a bell-shaped distribution with a mean of 186 cm and a standard deviation of 9 cm. Using the empirical rule, what is the approximate percentage of the men between the following values? a. 168 cm and 204 cm b. 159 cm and 213 cm
a. 95% of the men are between 168 cm and 204 cm. b. 99.7% of the men are between 159 cm and 213 cm.
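The two answers above can be checked with a short sketch. The helper `empirical_percentage` is a hypothetical name; it only handles intervals that are exactly 1, 2, or 3 standard deviations on each side of the mean, and it assumes the distribution is bell-shaped.

```python
# Approximate percentages from the empirical rule, keyed by the
# number of standard deviations on each side of the mean.
EMPIRICAL_RULE = {1: 68.0, 2: 95.0, 3: 99.7}


def empirical_percentage(low, high, mean, std_dev):
    """Look up the empirical-rule percentage for a symmetric interval."""
    k_low = (mean - low) / std_dev
    k_high = (high - mean) / std_dev
    if k_low != k_high or k_low not in EMPIRICAL_RULE:
        raise ValueError("interval must be the mean +/- 1, 2, or 3 standard deviations")
    return EMPIRICAL_RULE[int(k_low)]


print(empirical_percentage(168, 204, 186, 9))  # 95.0
print(empirical_percentage(159, 213, 186, 9))  # 99.7
```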
Cumulative frequency
The sum of the frequencies of that class and all previous classes.
Interquartile Range (IQR)
Q3 - Q1
Variance
The sum of the squared deviations divided by the number of entries. (For a sample, divide by n − 1 instead.)
Midpoint
[(Lower class limit) + (Upper class limit)] ÷ 2
You are applying for a job at two companies. Company A offers starting salaries with µ = $29,000 and σ = $1,000. Company B offers starting salaries with µ = $29,000 and σ = $3,000. From which company are you more likely to get an offer of $31,000 or more?
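Company B, because $31,000 is within one standard deviation of Company B's mean, and data values that lie within one standard deviation of the mean are considered very usual. For Company A, $31,000 is two standard deviations above the mean.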
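One way to see why is to compute the z-score of a $31,000 offer at each company. This is only a sketch; the helper name `z_score` is chosen for illustration.

```python
def z_score(value, mean, std_dev):
    """Return how many standard deviations `value` lies from `mean`."""
    return (value - mean) / std_dev


offer = 31_000
z_a = z_score(offer, 29_000, 1_000)  # Company A
z_b = z_score(offer, 29_000, 3_000)  # Company B

print(z_a)            # 2.0
print(round(z_b, 2))  # 0.67
```

The offer sits two standard deviations above the mean at Company A but less than one standard deviation above the mean at Company B, so it is a far more usual value at Company B.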
Class Boundary
Lower class boundary = Lower class limit − 0.5. Upper class boundary = Upper class limit + 0.5 (when the data entries are integers).
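A short sketch of the boundary and midpoint calculations for a single class follows. The function names are hypothetical, the 0.5 adjustment assumes integer data, and the class 17 to 31 is the first class from the 8-class exercise in this set.

```python
def class_boundaries(lower_limit, upper_limit):
    """Return (lower boundary, upper boundary), assuming integer data."""
    return lower_limit - 0.5, upper_limit + 0.5


def midpoint(lower_limit, upper_limit):
    """Add the two limits first, then divide the sum by 2."""
    return (lower_limit + upper_limit) / 2


print(class_boundaries(17, 31))  # (16.5, 31.5)
print(midpoint(17, 31))          # 24.0
```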
How is a Pareto chart different from a standard vertical bar graph?
The bars are positioned in order of decreasing height with the tallest bar on the left.
Explain the relationship between variance and standard deviation. Can either of these measures be negative? Explain.
The standard deviation is the positive square root of the variance. The standard deviation and variance can never be negative. Squared deviations can never be negative.
Determine whether the statement is true or false. If it is false, rewrite it as a true statement. The 50th percentile is equivalent to Q1.
The statement is false. The 50th percentile is equivalent to Q2.
Determine whether the statement is true or false. If it is false, rewrite it as a true statement. A data set can have the same mean, median, and mode.
The statement is true.
A student's score on an actuarial exam is in the 78th percentile. What can you conclude about the student's exam score?
The student scored higher than 78% of the students who took the actuarial exam.
Skewed left
The tail of the graph of the distribution elongates more to the left.
Skewed right
The tail of the graph of the distribution elongates more to the right.
The goals scored per game by a soccer team represent the first quartile for all teams in a league. What can you conclude about the team's goals scored per game?
The team scored fewer goals per game than 75% of the teams in the league.
Why is the standard deviation used more frequently than the variance?
The units of variance are squared. Its units are meaningless.
Median
The value that lies in the middle of the data when the data set is ordered.
Construct a data set that has the given statistics. N = 6, µ = 10, σ = 3 1.) What does the value N mean? 2.) What does the value µ mean? 3.) What does the value σ mean? 4.) Complete the population data set.
1.) The number of values in the population data set. This means there should be 6 values in the data set. 2.) The mean of the population data set. This means the population mean of the 6 values in the data set should be 10. 3.) The spread of the data from the population mean. This means the population standard deviation of the 6 values with a population mean of 10 should be 3. 4.) 7, 7, (7), 13, (13), 13
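The completed data set can be checked with Python's standard `statistics` module, where `pstdev` computes the population standard deviation. This is a quick verification sketch, not part of the original exercise.

```python
from statistics import mean, pstdev

population = [7, 7, 7, 13, 13, 13]

n = len(population)       # N, the number of values
mu = mean(population)     # population mean, equals 10
sigma = pstdev(population)  # population standard deviation

print(n)      # 6
print(sigma)  # 3.0
```

Every entry is exactly 3 away from the mean of 10, so each squared deviation is 9 and the population standard deviation is 3.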
What is the difference between a frequency polygon and an ogive?
A frequency polygon displays class frequencies while an ogive displays cumulative frequencies.
Empirical Rule
About 68% of all values fall within 1 standard deviation of the mean. About 95% of all values fall within 2 standard deviations of the mean. About 99.7% of all values fall within 3 standard deviations of the mean.
Uniform
All classes in the distribution have equal frequencies. A uniform distribution is also symmetric.
Outliers
Any data entry less than Q1 - 1.5(IQR) or greater than Q3 + 1.5(IQR)
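The rule can be sketched as below. Quartiles are computed here as the medians of the lower and upper halves of the ordered data (excluding the middle entry when the count is odd), which is one common textbook convention and an assumption on my part; other software may use a different quartile method. The data set is hypothetical.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Skip the middle entry when the number of entries is odd.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


def find_outliers(data):
    """Entries below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR)."""
    q1, _, q3 = quartiles(data)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]


# Hypothetical data (illustration only).
data = [1, 3, 4, 5, 5, 6, 7, 8, 30]
print(quartiles(data))      # (3.5, 5, 7.5)
print(find_outliers(data))  # [30]
```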
Use the given minimum and maximum data entries, and the number of classes, to find the class width, the lower class limits, and the upper class limits. minimum = 17, maximum = 130, 8 classes 1.) The class width is 2.) Choose the correct lower class limits. 3.) Choose the correct upper class limits.
1.) 15 2.) 17, 32, 47, 62, 77, 92, 107, 122 3.) 31, 46, 61, 76, 91, 106, 121, 136
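The steps behind this answer can be sketched as follows. The function name `class_limits` is hypothetical, and the code assumes integer data and the convention of dividing the range by the number of classes and rounding up to the next whole number (moving to the next whole number even when the quotient is already whole).

```python
def class_limits(minimum, maximum, num_classes):
    """Return (class width, lower limits, upper limits) for integer data."""
    data_range = maximum - minimum
    # Round the quotient up to the next whole number.
    width = data_range // num_classes + 1
    lower = [minimum + i * width for i in range(num_classes)]
    # Each upper limit is one less than the next class's lower limit.
    upper = [low + width - 1 for low in lower]
    return width, lower, upper


width, lower, upper = class_limits(17, 130, 8)
print(width)  # 15
print(lower)  # [17, 32, 47, 62, 77, 92, 107, 122]
print(upper)  # [31, 46, 61, 76, 91, 106, 121, 136]
```

The same call with `class_limits(10, 59, 6)` gives a width of 9, matching the 6-class exercise later in this set.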
Symmetric
If a vertical line is drawn through the middle of the graph, the resulting halves are approximately mirror images.
Determine whether the statement is true or false. If it is false, rewrite it as a true statement. Some quantitative data sets do not have medians.
The statement is false. All quantitative data sets have medians.
Describe the difference between the calculation of population standard deviation and that of sample standard deviation. Let N be the number of data entries in a population and n be the number of data entries in a sample data set. Choose the correct answer below.
When calculating the population standard deviation, the sum of the squared deviations is divided by N, then the square root of the result is taken. When calculating the sample standard deviation, the sum of the squared deviations is divided by n - 1, then the square root of the result is taken.
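The only difference between the two calculations is the divisor, as this sketch shows. The function names and the data set are hypothetical.

```python
import math


def population_std(data):
    """Divide the sum of squared deviations by N, then take the square root."""
    n = len(data)
    mu = sum(data) / n
    return math.sqrt(sum((x - mu) ** 2 for x in data) / n)


def sample_std(data):
    """Divide the sum of squared deviations by n - 1, then take the square root."""
    n = len(data)
    if n < 2:
        raise ValueError("a sample needs at least two entries")
    x_bar = sum(data) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))


# Hypothetical data (illustration only).
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_std(data))        # 2.0
print(round(sample_std(data), 3))  # 2.138
```

For the same entries, dividing by n - 1 always gives a slightly larger result than dividing by N.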
Use the given minimum and maximum data entries, and the number of classes, to find the class width, the lower class limits, and the upper class limits. minimum = 10, maximum = 59, 6 classes 1.) The class width is 2.) Choose the correct lower class limits. 3.) Choose the correct upper class limits.
1.) 9 2.) 10, 19, 28, 37, 46, 55 3.) 18, 27, 36, 45, 54, 63
Compare the three data sets on the right. 1.) Which data set has the greatest sample standard deviation? 2.) Which data set has the least sample standard deviation? 3.) How are the data sets the same? How do they differ?
1.) Data set (?), because it has more entries that are farther away from the mean. 2.) Data set (?), because it has more entries that are close to the mean. 3.) The three data sets have the same mean but have different standard deviations.
You are applying for a job at two companies. Company A offers starting salaries with µ = $25,000 and σ = $2,000. Company B offers starting salaries with µ = $25,000 and σ = $5,000. From which company are you more likely to get an offer of $29,000 or more?
Company B, because $29,000 is within one standard deviation of Company B's mean, and data values that lie within one standard deviation of the mean are considered very usual. For Company A, $29,000 is two standard deviations above the mean.
Given a data set, how do you know whether to calculate σ or s?
When given a data set, one would have to determine if it represented the population or if it was a sample taken from the population. If the data are a population, then σ is calculated. If the data are a sample, then s is calculated.