WGU C955 - Modules 4 & 5: Descriptive Statistics for a Single Variable and Two Variables
An outlier
A data point that is significantly distant from the other data points in the data set is called: a) An anomaly b) An outlier c) An outsider d) An original
False
About 95 percent of results in a normal distribution fall between one standard deviation below the mean and one standard deviation above the mean. True or False? a) True b) False
4.7 The Standard Deviation Rule says that 2.35% of the values will fall within 2 to 3 standard deviations above the mean. Since we are looking for the population that both falls above and below the mean, we would multiply 2.35 by 2. Therefore, the answer is 4.7%
According to the Standard Deviation Rule in a population that is normally distributed, what percent of the population will fall between 2 and 3 standard deviations of the mean? (Include both above and below the mean!) a 4.7% b 2.35% c 7.5% d 10.2%
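The band percentages in these Standard Deviation Rule cards all come from subtracting the 68 / 95 / 99.7 figures. A minimal sketch of that arithmetic follows; the names `WITHIN`, `band_one_side`, and `band_both_sides` are made up for illustration and are not part of the course material.

```python
# Percent of a normal population within k standard deviations of the mean
# (the 68-95-99.7 Standard Deviation Rule).
WITHIN = {0: 0.0, 1: 68.0, 2: 95.0, 3: 99.7}

def band_both_sides(k):
    """Percent between k-1 and k standard deviations, above AND below the mean."""
    return WITHIN[k] - WITHIN[k - 1]

def band_one_side(k):
    """Percent between k-1 and k standard deviations on ONE side of the mean."""
    return band_both_sides(k) / 2

print(round(band_one_side(3), 2))    # 2 to 3 SD above the mean only
print(round(band_both_sides(3), 2))  # 2 to 3 SD, both sides
print(round(band_one_side(2), 2))    # 1 to 2 SD above the mean only
```

The one-sided band is always half of the two-sided band because the normal curve is symmetric about the mean.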
Mode, median, mean
Arrange the mean, median, and mode in order from least to greatest in a distribution that is positively skewed. a Median, mode, mean b Mean, median, mode c Mean, mode, median d Mode, median, mean
Mean, median, mode The median will always remain in the middle for skewed data, the mean goes towards the skew, and the mode goes away from the skew.
Arrange the mean, median, and mode in order from least to greatest in a distribution that is skewed left. a Median, mode, mean b Mean, median, mode c Mean, mode, median d Mode, median, mean
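To see the ordering from the two skew cards on actual numbers, here is a small sketch using Python's standard `statistics` module. Both data sets are invented for illustration (the second is just the first mirrored), so treat them as an example rather than course data.

```python
from statistics import mean, median, mode

# Invented right-skewed data: a long tail of large values on the right.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
# Mirror image of the same data, giving a long tail on the left.
left_skewed = [16 - x for x in right_skewed]

def summary(values):
    """Return (mode, median, mean) for a data set."""
    return mode(values), median(values), mean(values)

print(summary(right_skewed))  # mode < median < mean
print(summary(left_skewed))   # mean < median < mode
```

The mean gets pulled toward the tail, the mode stays at the peak, and the median lands between them.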
13.5%
Assume a normal distribution of data. What percent of the population will fall between 1 and 2 standard deviations above the mean? a 2.35% b 13.5% c 14.7% d 16.75%
99.7%
Assuming a population is normally distributed, what percentage of the population will fall within 3 standard deviations of the mean? a 98.6% b 68% c 95% d 99.7%
68%
Assuming that the heights of adult males in the United States are normally distributed, what is the approximate percentage of adult males who have a height (in ft.) within 1 standard deviation of the mean? a) 50% b) 68% c) 95.4% d) 99.7%
10
Determine Q1 for the sorted list of data values. {3,8,8,10,16,20,20,23,27,37,44,45,62,71} a 8 b 10 c 11 d 16
81
Determine Q3 for the following set of data values. {60,47,65,85,81,77,70,56,47,84,75} a 25 b 57 c 68 d 81
44
Determine Q3 for the sorted list of data values. {3,8,8,10,16,20,20,23,27,37,44,45,62,71} a 27 b 37 c 44 d 45
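The answers to the three quartile cards above are consistent with the "median of each half" method: sort the data, split it at the median (leaving the median out of both halves when the count is odd), then take the median of each half. A hedged sketch follows; the function name `quartiles` is my own, and other quartile conventions (including some software defaults) can give slightly different values.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # With an odd count, skip the middle value so it is in neither half.
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)

sorted_list = [3, 8, 8, 10, 16, 20, 20, 23, 27, 37, 44, 45, 62, 71]
scores = [60, 47, 65, 85, 81, 77, 70, 56, 47, 84, 75]

print(quartiles(sorted_list))  # Q1 = 10, Q3 = 44
print(quartiles(scores))       # Q1 = 56, Q2 = 70, Q3 = 81
```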
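A and B The outlier fences are Q1 - 1.5(IQR) and Q3 + 1.5(IQR); any values below the lower fence or above the upper fence are outliers. Here Q1 = 25 and Q3 = 39, so IQR = 14 and the fences are 4 and 60.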
Determine any outliers in the following data set. {1,24,26,28,32,36,38,40,65} a 1 b 65 c A and B d None
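106 The outlier fences are Q1 - 1.5(IQR) and Q3 + 1.5(IQR); any values below the lower fence or above the upper fence are outliers. Here Q1 = 57.5 and Q3 = 76.5, so IQR = 19 and the fences are 29 and 105.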
Determine any outliers in the following data set. {55,77,64,82,58,75,57,49,106,71,68,76} a 49 b 106 c Both A and B d There are no outliers in the data set.
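The 1.5 x IQR rule used in the two outlier cards can be checked with a short script. This is a sketch that assumes the median-of-halves method for Q1 and Q3; `iqr_outliers` is an illustrative name, not something from the course.

```python
from statistics import median

def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = median(data[:half])
    # Skip the middle value when the count is odd.
    q3 = median(data[half + 1:] if n % 2 else data[half:])
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    return [x for x in data if x < low_fence or x > high_fence]

print(iqr_outliers([1, 24, 26, 28, 32, 36, 38, 40, 65]))               # [1, 65]
print(iqr_outliers([55, 77, 64, 82, 58, 75, 57, 49, 106, 71, 68, 76])) # [106]
```

Note the plus sign on the upper fence: the lower fence subtracts 1.5 x IQR from Q1, while the upper fence adds it to Q3.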
137
Determine the best estimate for the total number of ice creams sold on Monday and Tuesday. {{ Bar Chart reflecting the average number of ice creams sold on each day of the week. Mon. 70, Tue. between 60 and 70, Wed. between 50 and 60, Thur. between 40 and 50, Fri. between 60 and 70, Sat between 30 and 40, Sun. 30. }} a 100 b 112 c 137 d 148
25
Determine the interquartile range ( IQR ) for the following set of diagnostic test scores. {60,47,65,85,81,77,70,56,47,84,75} a 25 b 57 c 68 d 80
7.6
Determine the mean for the data. a 7 b 7.5 c 7.6 d Cannot determine
68
Determine the mean for the following set of data values. {60,47,65,85,81,77,70,56,47,84,75} a 67 b 70 c 47 d 68
70
Determine the median for the following set of diagnostic test scores.{60,47,65,85,81,77,70,56,47,84,75} a 67 b 70 c 47 d 67.9
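For the diagnostic test scores used in the mean and median cards, the values can be double-checked with Python's `statistics` module. This is only a quick check; the variable names are my own.

```python
from statistics import mean, median

scores = [60, 47, 65, 85, 81, 77, 70, 56, 47, 84, 75]

score_mean = mean(scores)      # sum of the values divided by how many there are
score_median = median(scores)  # middle value once the list is sorted

print(sum(scores), len(scores))  # 747 11
print(round(score_mean, 1))      # 67.9, which rounds to the answer choice 68
print(score_median)              # 70
```

The exact mean is 747 / 11, so the answer choice 68 is the rounded value, while the median is exactly the 6th of the 11 sorted scores.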
64
Determine the range for the following data set. {1,24,26,28,32,36,38,40,65} a 1 to 65 b 64 c 65 d Cannot determine
37
Determine the range for the following data set. {55,77,64,82,58,75,57,49,86,71,68,76} a 33 b 31 c 27 d 37
No, a box plot does not include the mean. A box plot shows the median, not the mean.
Does a box plot include the mean of a set of data? a Yes, a box plot includes both the mean and the median. b No, a box plot does not include the mean.
94
Given the following data set: {40, 11, 13, 26, 13, 40, 82, 5, 96, 45, 99, 36, 17, 33, 68} What is the range of this data set? a) 91 b) 88 c) 94 d) 89
b. When the distribution is negatively skewed, the mean is less than the median
How are the mean and median of a sample related if the distribution is negatively skewed? a) The mean is greater than the median. b) The mean is less than the median. c) They are equal. d) Cannot be determined
The values are all equal
How are the mean, median, and mode of a sample related if the distribution is normally distributed? a) The mode is greater than the other two. b) The values are all equal c) The mean is largest value d) The mean is less than the median, which is less than the mode
No vertical scale.
Identify the most significant mistake with the graph? {{ Bar chart showing number of software developer jobs in 2014, compared to the projected number in 2024, indicates that there will be a 17% increase in software developer jobs in 2024. The graph does not have a y-axis. }} a) Should have more years between 2014 and 2024. b) No vertical scale. c) Different colors should be used. d) Wrong type of graph used.
500
If 135 employees work in the software division, how many employees were surveyed? {{ Pie chart illustrating the distribution of where employees at Advotech work. 34% Business Division, 12% Online Services Division, 21% Server and Tools Division, 27% Software Division, 6% Other }} a) 400 b) 475 c) 500 d) 625
125 15 / .12 = 125
If 15 people exercised 3 - 4 hours per week, how many people were surveyed? a 18 b 180 c 1250 d 125
6 150 * .04 = 6
If 150 people are surveyed, how many claim to have exercised 6 - 7 hours per week? a 6 b 60 c 38 d Cannot determine
72
If 200 people are surveyed, how many claim to have had no exercise? a 18 b 36 c 72 d Cannot determine
800 To solve this, divide 80 by the percent in decimal form.
If 80 people exercised 4 - 5 hours per week, how many people were surveyed? a 80 b 800 c 400 d 50
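The chart-percentage cards above all use one of two moves: divide a known slice by its percent to get the total, or multiply the total by a percent to get a slice. A minimal sketch follows, with hypothetical helper names and using only the percentages that appear in the cards (12%, 4%, and the 27% Software Division slice).

```python
def total_from_part(part, percent):
    """Total surveyed, given the count in one slice and that slice's percent."""
    return part * 100 / percent

def part_from_total(total, percent):
    """Count in one slice, given the total surveyed and the slice's percent."""
    return total * percent / 100

print(total_from_part(15, 12))   # 125.0  (same as 15 / .12)
print(part_from_total(150, 4))   # 6.0    (same as 150 * .04)
print(total_from_part(135, 27))  # 500.0  (Software Division card)
```

Working with whole-number percents and dividing by 100 at the end is just a way to avoid floating-point rounding noise; it is the same arithmetic as using the decimal form.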
a. Two-way frequency table If both variables are categorical, a two-way frequency table is used to display the data.
If both variables are categorical, what is the best method to display the data? a) Two-way frequency table b) Scatterplot c) Side-by-side box plot d) Histogram
5
If the following data are represented on a stem plot, how many leaves would be shown on the " 6 " stem? {57,63,55,59,52,60,65,56,56,65,55,63} a 5 b 4 c 3 d 2
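To count leaves on a stem, split each two-digit value into its tens digit (the stem) and ones digit (the leaf). Here is a short sketch of that bookkeeping; `stem_plot` is an illustrative name and the approach assumes two-digit whole numbers like the ones in this card.

```python
from collections import defaultdict

def stem_plot(values):
    """Group values into {stem: [leaves]} using the tens digit as the stem."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)

data = [57, 63, 55, 59, 52, 60, 65, 56, 56, 65, 55, 63]
plot = stem_plot(data)

for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))

print(len(plot[6]))  # 5 leaves on the "6" stem
```

Repeated values such as 63 and 65 each get their own leaf, which is why the "6" stem has five leaves rather than three.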
c) Explanatory variable A scatterplot is a graphical display that shows the explanatory variable on the x-axis.
In a scatterplot, which variable is represented on the x-axis? a) Response variable b) Dependent variable c) Explanatory variable d) Qualitative variable
a. Response variable
In a scatterplot, which variable is represented on the y-axis? a. Response variable b. Lurking variable c. Explanatory variable d. Categorical variable
c. Response variable A scatterplot is a graphical display that shows the explanatory variable on the x-axis and the response variable on the y-axis.
In the scatterplot, costs ($) represents what type of variable? a) Qualitative variable b) Explanatory variable c) Response variable d) Categorical variable
b. Explanatory variable A scatterplot is a graphical display that shows the explanatory variable on the x-axis and the response variable on the y-axis.
In the scatterplot, units of output represent what type of variable? a) Qualitative variable b) Explanatory variable c) Response variable d) Categorical variable
a) Response variable The response variable is a quantitative variable. First-quarter sales is a quantitative variable.
In the side-by-side box plot, first-quarter sales represent what type of variable? a) Response variable b) Categorical variable c) Explanatory variable d) Qualitative variable
Zero
It is important to start the frequency scale on a bar chart at which value to be certain not to overemphasize a difference in values? a Zero b The lowest measured value c The smallest frequency d As long as the scale is even, it doesn't matter where you start it.
Age of death
Of the following sets of data, which would you assume should have the greatest range? a Age when a baby gets their first tooth b Age of first-year business students c Age of death d Age of high school graduate
The ages of interns currently in the college summer internship program.
Of the following sets of data, which would you assume should have the smallest range? a Price in dollars ($) of penny stocks currently being traded over-the-counter through the OTC Bulletin Board. b Ages of stockbrokers currently on the trading floor. c The number of trades on the NYSE on any given day. d The ages of interns currently in the college summer internship program.
47.5% The range 70 to 94 Mbps runs from the mean to 2 standard deviations above it (70 + 2 * 12 = 94), which is almost half the data. The Standard Deviation Rule says that 95% of the values will fall within 2 standard deviations of the mean. Since we only want the values between the mean and 2 standard deviations above it, we divide that percentage in half. Therefore, 47.5% will fall between 70 and 94 Mbps.
On average, Lightning Communications delivers to their customers, high-speed Internet connections of 70 Mbps, with a standard deviation of 12 Mbps. About what percentage of customers will experience a network speed between 70.0 Mbps and 94.0 Mbps? Assume a normal distribution. a 68% b 34% c 47.5% d 49.85%
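The 47.5% figure is the Standard Deviation Rule's rounded value. As a cross-check, the sketch below computes the same area with `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are my own, and the exact result differs slightly from the rule of thumb.

```python
from statistics import NormalDist

# Network speed modeled as normal with mean 70 Mbps and SD 12 Mbps.
speed = NormalDist(mu=70, sigma=12)

# Area under the curve between 70 and 94 Mbps.
share = speed.cdf(94) - speed.cdf(70)

z_upper = (94 - 70) / 12  # how many standard deviations 94 is above the mean
print(z_upper)            # 2.0
print(round(share * 100, 1))
```

The exact area is a little above 47.5% because the rule rounds the 2-standard-deviation coverage to 95%.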
8
Please use the dot plot below to answer the following question. Dot Plot displaying the following data points: 5, 6, 6, 7, 8, 8, 8, 9, 9, 10. Determine the median for the data. a 7 b 7.5 c 8 d Cannot determine.
Quantitative
Stem plots display which type of data? a Quantitative b Categorical c Neither Quantitative nor categorical d Quantitative and categorical
68%
The average male height in the U.S. in 2010 was 69.7 inches, with a standard deviation of 4 inches. Assuming a normal distribution, what percent of the data is between 65.7 and 73.7 inches? a) 34% b) 64% c) 58% d) 68%
categorical data, quantitative data
The bar graph is the graphical representation of _______. A histogram is the graphical representation of _____________.
95%
The bitrate of audio files in a sample group of files ranged from 160 kilobits per second (Kbps) to 256 kilobits per second (Kbps). What percentage of the audio files will fall within 2 standard deviations of the mean? Assume the data is normally distributed. a 34% b 68% c 95% d 99.7%
$ 42,700 and $ 87,900
The mean salary of an entry-level financial analyst in Anytown is $ 65,300 and the standard deviation is $ 11,300. What values would 95% of the data fall between? Assume a normal distribution. a) $ 47,200 and $ 89,700 b) $ 42,200 and $ 86,000 c) $ 42,700 and $ 87,900 d) $ 41,600 and $ 86,800
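For "what values would 95% fall between" questions, go 2 standard deviations below and above the mean. A tiny sketch of that calculation, with a hypothetical helper name:

```python
def sd_interval(mean, sd, k):
    """Return (low, high) for the range within k standard deviations of the mean."""
    return mean - k * sd, mean + k * sd

# 95% of a normal population lies within 2 standard deviations of the mean.
low, high = sd_interval(65_300, 11_300, 2)
print(low, high)  # 42700 87900
```

Using k = 1 or k = 3 in the same function gives the 68% and 99.7% ranges.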
TRICK QUESTION! The answer is 34% since the question only states 1 standard deviation above and not above AND below.
The return on investment (ROI) in a sample population of investments ranged from 8% to 23% . What percentage of the sample will fall within 1 standard deviation above the mean? Assume the data is normally distributed. a 47.5% b 13.5% c 68% d 34%
6
Using the dot plot below, determine how many athletes on the cross country team weigh between 116 and 140 pounds. {{ Dot plot with a distribution of cross country athlete's weight; there are six dots between 116 and 140. }} a) 10 b) 8 c) 6 d) Cannot determine.
The % values do not sum to 100%.
Using the pie chart below, identify any possible misconception(s) with the graph. {{ Pie chart illustrating the distribution of ice cream flavor sales for July. 10% Cookie Dough, 14% Double Fudge Brownie, 18% Chocolate Chip, 27% Coffee Toffee Crunch, 29% Oreo Cookie }} a) The % values do not sum to 100%. b) The pieces are not proportional. c) Both A and B d) No misconceptions.
b and c
What can sometimes cause a bar graph or histogram to be misleading? a Whether or not the bars are touching b Starting the numerical scale on the vertical axis at a value other than zero c Leaving out a numerical scale on the vertical axis d b and c
125
What is the second quartile ( Q2 ) of this data set? 78 85 87 90 100 105 117 123 125 125 128 135 140 152 159 165 169 179 a) 123 b) 124 c) 125 d) 125.5
box plot Outliers are determined by Q1 and Q3, which are clearly shown on a box plot. The outliers themselves are also displayed on the box plot.
What is the best type of graph to use where it is easiest to estimate outliers? a) Stem plot b) Histogram c) Dot plot d) Box plot
102.5
What is the first quartile ( Q1 ) of this data set? 78 85 87 90 100 105 115 117 123 125 125 128 135 140 152 159 160 169 179 a) 100 b) 103.5 c) 105 d) 102.5
65.3
What is the mean of the following data set? 68 60 68 64 84 48 60 80 80 58 62 52 a) 65.3 b) 69.2 c) 61.6 d) 64.4
The type of data depicted in the graph
What is the most significant difference between histograms and bar charts? a The number of bars used b The use of labels on the axes c The type of data depicted in the graph d The coloring used for each bar
36
What is the range of the following data set? 68 60 68 64 84 48 60 80 80 58 62 52 a) 26 b) 28 c) 32 d) 36
155.5
What is the third quartile ( Q3 ) of this data set? 78 85 87 90 100 105 115 117 123 125 125 128 135 140 152 159 160 169 179 a) 152 b) 155.5 c) 157 d) 159
Categorical
What type of data is presented in this chart? a Categorical b Quantitative c Numerical d Both b and c
Quantitative
What type of data is represented in a histogram? a Quantitative b Categorical c Both A and B d Neither A nor B
Positively Skewed
What type of distribution is shown in the histogram below? {{ Histogram illustrating a distribution of weights in pounds. The long tail of the graph is to the right of the peak }} a) Normal b) Negatively Skewed c) Positively Skewed d) Cannot determine
Histogram. Remember: the bar graph is the graphical representation of categorical data. A histogram is the graphical representation of quantitative data.
What would be the best type of graph to use to display the age of all employees in a particular division in a company? a Bar chart b Histogram c Scatterplot d Pie chart
b. Side-by-side box plots can be used to compare quantitative data across multiple categories.
What would be the most appropriate method to graphically compare quantitative data across multiple categories? a) Two-way frequency table b) Side-by-side box plot c) Scatterplot d) Histogram
75%
When observing a box plot, what percentage of the measured data lies between Q1 and the maximum value in the list? a 75% b 50% c 25% d Cannot determine.
50 This area represents the middle 50% of the data.
When observing a box plot, what proportion of the data lies between Q1 and Q3? a. 25 b. 50 c. 75 d. 100
Pie chart
Which of the following are best used to display categorical data? a Histogram b Dot plot c Pie chart d Stem plot
Range of pages for volumes in an Encyclopedia complete volume set.
Which of the following best describes a measure of spread? a) Day of the week with the most librarians on staff at BMC Library. b) The average number of books on a bookshelf in the Reference section of the library of a random sample of 15 bookshelves. c) Number of books put on reserve by library patrons in 2014. d) Range of pages for volumes in an Encyclopedia complete volume set.
quantitative data
Which of the following is defined as data that represents values that can be counted or measured? a) descriptive data b) qualitative data c) categorical data d) quantitative data
Histogram
Which of the following would be the best option to graphically display continuous data? a) Bar Chart b) Histogram c) Pie Chart d) Box Plot
1,8,16,25,33
Which of the four different sets of numbers would have the greatest standard deviation? a 5,5,5,5,5 b 5,6,7,8,9 c 1,8,16,25,33 d 40,43,45,47
5,5,5,5,5
Which of the four different sets of numbers would have the least standard deviation? a 5,5,5,5,5 b 5,6,7,8,9 c 1,8,16,25,33 d 40,43,45,47
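You can usually eyeball these two cards (the most spread-out set has the greatest standard deviation, and identical values have none), but a quick computation confirms it. This sketch uses `statistics.pstdev`, the population standard deviation; the ranking is the same with the sample version. The dictionary keys simply mirror the answer letters.

```python
from statistics import pstdev

sets = {
    "a": [5, 5, 5, 5, 5],
    "b": [5, 6, 7, 8, 9],
    "c": [1, 8, 16, 25, 33],
    "d": [40, 43, 45, 47],
}

# Standard deviation of each answer choice.
spread = {label: pstdev(values) for label, values in sets.items()}

for label, sd in spread.items():
    print(label, round(sd, 2))

print(max(spread, key=spread.get))  # c: greatest standard deviation
print(min(spread, key=spread.get))  # a: least standard deviation (zero)
```

Set d has the largest values but they sit close together, so its standard deviation is small; spread depends on distance from the mean, not on the size of the numbers.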
Box plot
You are a professional trainer at a local sports academy. You ask your athletes to determine the number of grams of protein they consume for a particular meal. Which of the following would be the best choice to illustrate the shape of the data you collect? a Bar chart b Pie chart c Box plot d None of the above
Quantitative
You are designing a study to determine if the amount of an individual's annual 401K contribution is dependent upon the number of hours she or he works per week. You survey a group of 900 men and women aged 20 - 60 and record the number of hours each person works per week versus their total annual 401K contribution. What type of data are you collecting? a Categorical b Qualitative c Quantitative d Both a and b.
bar graph While pie charts are used for categorical data, a bar graph is a better choice since the entire population of IT analysts in Boston has not been surveyed. The best way to display a large quantitative data set is a histogram.
You surveyed 50 IT analysts in the Boston area to determine at which tech firm they are employed. What type of graph would be best to use to display this data? a) Bar graph b) Pie chart c) Histogram d) Dot plot