Knewton Alta - Chapter 2 - Descriptive Statistics Part 2

The following data set provides wage information of Seattle by subdivisions. What is the lowest hourly rate in the fourth quartile of the Arts and Culture Department?

$40.93 The lowest hourly rate in the fourth quartile of the Arts and Culture Department is $40.93.

Find the five-number summary of the following data set. 1,2,4,5,6,8,8,10,19

1. First, order the data from least to greatest: 1, 2, 4, 5, 6, 8, 8, 10, 19.
2. The sample minimum is the least value, which is 1.
3. The sample maximum is the greatest value, which is 19.
4. The median (second quartile) is the middle number in the ordered set. Since this data set has an odd number of values, the middle number is in the set. The median is 6.
5. The first quartile is the median of the lower half of the data set: 1, 2, 4, 5. There are two middle numbers in the lower half, so take the average of 2 and 4: (2+4)/2 = 6/2 = 3, so the first quartile is 3.
6. The third quartile is the median of the upper half of the data set: 8, 8, 10, 19. There are two middle numbers in the upper half, so take the average of 8 and 10: (8+10)/2 = 18/2 = 9, so the third quartile is 9.
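
The steps above can be sketched in Python. This is a minimal illustration of the median-of-halves method used in this card, not the only quartile convention in use; the function name `five_number_summary` is an assumption made for the example.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]           # values below the middle position
    upper = data[half + n % 2:]   # values above it; skips the median when n is odd
    return (data[0], median(lower), median(data), median(upper), data[-1])

print(five_number_summary([1, 2, 4, 5, 6, 8, 8, 10, 19]))
```

Other software may use a different quartile rule and give slightly different Q1 and Q3 values for the same data.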

The following frequency table summarizes a set of data. What is the five-number summary?

1. In a frequency table, the value column gives each number and the frequency column gives how many times that number appears. For example, the number 2 appears 3 times. Writing the data from least to greatest gives: 2, 2, 2, 3, 3, 3, 4, 5, 7, 8, 8, 8, 8, 10, 10, 10, 11, 12, 12
2. and 3. We can immediately see that the minimum value is 2 and the maximum value is 12.
4. If we add up the frequencies in the table, we see that there are 19 total values in the data set. Therefore, the median value is the one where there are 9 values below it and 9 values above it. By adding up frequencies, we see that this happens at the value 8, so that is the median.
5. Now, looking at the lower half of the data, there are 9 values there, and so the median value of that half of the data is 3. This is the first quartile.
6. Similarly, the third quartile is the median of the upper half of the data, which is 10.
So the five-number summary is Min = 2, Q1 = 3, Median = 8, Q3 = 10, Max = 12.
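
As a rough sketch of the same procedure in Python, the frequency table can be expanded into an ordered list before taking medians. The table itself is not shown in this card, so the dictionary `freq` below is reconstructed from the ordered list in the answer, and the helper name `expand` is an assumption.

```python
from statistics import median

# Frequencies reconstructed from the ordered list in the worked answer.
freq = {2: 3, 3: 3, 4: 1, 5: 1, 7: 1, 8: 4, 10: 3, 11: 1, 12: 2}

def expand(table):
    """Turn {value: frequency} into a sorted list of raw data values."""
    data = []
    for value in sorted(table):
        data.extend([value] * table[value])
    return data

data = expand(freq)
n = len(data)                   # 19 values in total
lower = data[:n // 2]           # the 9 values below the median position
upper = data[(n + 1) // 2:]     # the 9 values above the median position
summary = (data[0], median(lower), median(data), median(upper), data[-1])
print(summary)
```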

The following frequency table summarizes 60 data values. What is the 3rd quartile of the data?

18 Remember that the 3rd quartile is the value which has 75% of the values below it. Because there are 60 values in the set of data, we compute (60)⋅(75%) = 45. So we want the value P which has 45 values less than or equal to P. Looking through the table, we find that there are 45 values less than or equal to 18, so the 3rd quartile is 18.

Elliot likes to find garden snakes in his backyard and record their lengths. Estimate the mean of the lengths (in inches) of the garden snakes given in the following grouped frequency table. Round the final answer to one decimal place.

11.5 Remember that to estimate the mean, we first find the midpoint of each interval:
Midpoint: 5.5, 9.5, 13.5, 17.5
Frequency: 7, 6, 12, 5
Now, we treat this as if it were a regular frequency table. We take each midpoint multiplied by its frequency, add them up, and divide by the total number of values. The sum is 5.5⋅7 + 9.5⋅6 + 13.5⋅12 + 17.5⋅5 = 345. We find the total number of values by adding up the frequency column: 7 + 6 + 12 + 5 = 30. Finally, dividing the sum by the total number of values, we find our mean estimate of the lengths (in inches) of the garden snakes is: Mean estimate = 345/30 = 11.5
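
The midpoint calculation can be checked with a short Python sketch. The function name `grouped_mean` is an assumption for illustration; the midpoints and frequencies are the ones used in the answer above.

```python
def grouped_mean(midpoints, frequencies):
    """Estimate the mean of grouped data from interval midpoints."""
    weighted_sum = sum(m * f for m, f in zip(midpoints, frequencies))
    return weighted_sum / sum(frequencies)

snake_mean = grouped_mean([5.5, 9.5, 13.5, 17.5], [7, 6, 12, 5])
print(round(snake_mean, 1))
```

The result is only an estimate, because every value in an interval is treated as if it sat at the midpoint.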

The following frequency table summarizes 60 data values. What is the 80th percentile of the data?

17 Use k = 80 and n = 60 to calculate i. i = (80/100)(60+1) = 48.8. Since this is not an integer, round 48.8 up to 49 and down to 48. The 48th value in the ordered set is 17 and the 49th value is 17. The average of 17 and 17 is 17. So, the 80th percentile is 17.

A surveyor would like to find the mean number of pets living in apartments in a city. He collects data from 36 apartments in the area. The graph shows the frequency for the number of pets living in the apartments. Find the mean number of pets living in the 36 city apartments, and round your answer to the nearest tenth. Record your answer by dragging the purple point to the mean.

2.1 The frequency graph shows the frequency for each data value. So, we can compute the mean by adding up all the data values and dividing by the total number of data values. (13⋅1 + 11⋅2 + 9⋅3 + 2⋅4 + 1⋅5 + 0⋅6 + 0⋅7)/36 = 75/36 ≈ 2.08. Rounding to the nearest tenth, the mean is 2.1.
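
A small Python sketch of the same arithmetic, with the frequencies read off the answer above (the graph itself is not reproduced here, and the dictionary name `pets` is an assumption):

```python
# {number of pets: number of apartments}, taken from the worked answer.
pets = {1: 13, 2: 11, 3: 9, 4: 2, 5: 1, 6: 0, 7: 0}

total_pets = sum(value * count for value, count in pets.items())
apartments = sum(pets.values())
mean_pets = total_pets / apartments
print(total_pets, apartments, round(mean_pets, 1))
```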

A cashier would like to find the mean number of dairy products purchased per customer at a grocery store. He collects data from 53 people checking out at the grocery store. The graph shows the frequency for the number of dairy products purchased. Find the mean number of dairy products purchased for the 53 customers, and round your answer to the nearest tenth. Record your answer by dragging the purple point to the mean.

3.9 The frequency graph shows the frequency for each data value. So, we can compute the mean by adding up all the data values and dividing by the total number of data values. (6⋅1 + 8⋅2 + 14⋅3 + 7⋅4 + 5⋅5 + 3⋅6 + 10⋅7)/53 = 205/53 ≈ 3.87. Rounding to the nearest tenth, the mean is 3.9.

The five number summary for a set of data is given below. Min = 50, Q1 = 51, Median = 80, Q3 = 83, Max = 87. What is the interquartile range of the set of data? Enter just the number as your answer. For example, if you found that the interquartile range was 18, you would enter 18.

32 Remember that the interquartile range is the third quartile minus the first quartile. So we find that the interquartile range is 83 − 51 = 32. This summary could represent items sold by bidding on eBay. A manager of those products could see that the IQR is large and decide to set fixed prices to narrow the range of sale prices.

The following data set provides New York City school based programs cost by borough. The five number summary of cost for school based programs in the Manhattan borough is given here. Minimum $35,263, Q1 $63,617, Median $507,206.5, Q3 $1,324,622, Maximum $4,667,715. Using the interquartile range, which of the following are outliers? Select all correct answers.

4,667,715 IQR = Q3 − Q1 = 1,324,622 − 63,617 = $1,261,005. To find any lower outliers: Q1 − 1.5(IQR) = 63,617 − 1.5(1,261,005) = 63,617 − 1,891,507.5 = −$1,827,890.5. There are no numbers below this level. To find any upper outliers: Q3 + 1.5(IQR) = 1,324,622 + 1.5(1,261,005) = 1,324,622 + 1,891,507.5 = $3,216,129.5. There is only one number greater than $3,216,129.5, so there is only one outlier, $4,667,715.
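
The 1.5⋅IQR rule can be sketched as two small Python helpers. The names `iqr_fences` and `find_outliers` are assumptions for illustration, and because the answer choices are not listed in this card, the candidate list below simply reuses the minimum and maximum from the five number summary.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) cutoffs Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(candidates, q1, q3):
    """Keep only the candidates that fall outside the fences."""
    low, high = iqr_fences(q1, q3)
    return [x for x in candidates if x < low or x > high]

# Manhattan summary values from the card; candidates are illustrative only.
print(iqr_fences(63_617, 1_324_622))
print(find_outliers([35_263, 4_667_715], 63_617, 1_324_622))
```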

The dataset below represents the population density per square mile of land area in 25 states in the 2010 U.S. Census. What is the 17th percentile? 1,19,35,43,49,55,56,56,63,67,94,105,110,168,175,181,212,231,239,351,461,595,738,839,9857

46 Use k = 17 and n = 25 to calculate i. i = (17/100)(25+1) = 4.42. Since this is not an integer, round 4.42 up to 5 and down to 4. The 4th value in the ordered set is 43, and the 5th value in the set is 49. The average of 43 and 49 is 46. So, the 17th percentile is 46.
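
A minimal Python sketch of this locator method follows. The function name `percentile_n_plus_1` is an assumption, and so is the handling of an integer locator (take the value at that position), which this card does not spell out.

```python
import math

def percentile_n_plus_1(values, k):
    """k-th percentile using the locator i = (k/100)(n + 1)."""
    data = sorted(values)
    n = len(data)
    i = k * (n + 1) / 100
    if i < 1 or i > n:
        raise ValueError("locator falls outside the data set")
    if i == int(i):
        return data[int(i) - 1]        # assumed rule for an integer locator
    below = data[math.floor(i) - 1]    # value at the rounded-down position
    above = data[math.ceil(i) - 1]     # value at the rounded-up position
    return (below + above) / 2

density = [1, 19, 35, 43, 49, 55, 56, 56, 63, 67, 94, 105, 110, 168, 175,
           181, 212, 231, 239, 351, 461, 595, 738, 839, 9857]
print(percentile_n_plus_1(density, 17))
```

Library routines often interpolate between neighbors instead of averaging them, so their results can differ from this hand method.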

Estimate the mean of the number of pull-ups completed by students during a gym class given in the following grouped frequency table. Round the final answer to one decimal place.
Value Interval: 2-5, 6-9, 10-13
Frequency: 11, 6, 3

5.9 Remember that to estimate the mean, we first find the midpoint of each interval:
Midpoint: 3.5, 7.5, 11.5
Frequency: 11, 6, 3
Now, we treat this as if it were a regular frequency table. We take each midpoint multiplied by its frequency, add them up, and divide by the total number of values. The sum is 3.5⋅11 + 7.5⋅6 + 11.5⋅3 = 118. We find the total number of values by adding up the frequency column: 11 + 6 + 3 = 20. Finally, dividing the sum by the total number of values, we find our mean estimate of the number of pull-ups completed by students during a gym class is: Mean estimate = 118/20 = 5.9

The following table shows 36 data values, sorted and arranged in rows of 5. What is the 3rd quartile of the data? 1 1 1 7 10 13 14 16 18 19 19 22 25 28 31 40 45 46 50 55 56 57 58 59 61 61 64 64 76 91 95 98 98 99 99 100

64

The following data set represents the ages of all 6 of Nancy's grandchildren. 11,8,5,6,3,9

7 Add up the squared deviations and divide by the number of data values: (16+1+4+1+16+4)/6 = 42/6 = 7. So the variance is 7.
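
The calculation can be reproduced step by step in Python. Dividing by the number of data values (6, all of Nancy's grandchildren) corresponds to a population variance, which is what `statistics.pvariance` computes; the variable names are assumptions for illustration.

```python
from statistics import pvariance

ages = [11, 8, 5, 6, 3, 9]
mean_age = sum(ages) / len(ages)                       # 42 / 6
squared_deviations = [(x - mean_age) ** 2 for x in ages]
variance = sum(squared_deviations) / len(ages)         # divide by n, not n - 1
print(mean_age, squared_deviations, variance)
print(pvariance(ages))                                 # library cross-check
```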

The following frequency table of data represents the costs of perfumes at a department store. What is the potential outlier?

Correct answer: 29 Note that most of the values are between 16 and 23, whereas 29 is far above the rest of the values. Therefore, 29 is the potential outlier. Since this data represents costs of perfumes at a certain department store, the outlier shows an uncommon occurrence. Managers may choose to sell the outlier perfume at a lower price.

The following data set provides wage information of Seattle by subdivisions. Background: You are interviewing for the job of Compensation Specialist for the city of Seattle. The hiring manager shows you the database for wage and classification information from February 2017. She asks you: What are the job titles of the two persons with the highest hourly rates in the second quartile of the Arts and Culture Department?

Administrative Staff Assistant and Stage Tech,Lead When separating the jobs into the quartiles, you see that the two job titles with the highest hourly rates in the second quartile of the Arts and Culture Department are Administrative Staff Assistant and Stage Tech,Lead.

The following data set provides wage information of Seattle by subdivisions. Arrange the Departments in ascending order of the maximum hourly rate.

Arts and Culture, City Auditor, City Budget The maximum hourly rate for the three departments in ascending order is: Arts and Culture ($62.59), City Auditor ($72.84), and City Budget ($86.83).

The five number summary for a set of data given below represents completed projects in a certain department. Min = 50, Q1 = 62, Median = 84, Q3 = 96, Max = 99. Using the interquartile range, which of the following are outliers? Select all correct answers.

Correct answer: 3, 4, 6 Remember that outliers are numbers that are more than 1.5⋅IQR below the first quartile or more than 1.5⋅IQR above the third quartile, where IQR stands for the interquartile range. The interquartile range is the third quartile minus the first quartile. So we find IQR = 96 − 62 = 34. So a value is an outlier if it is less than Q1 − 1.5⋅IQR = 62 − (1.5)(34) = 11 or greater than Q3 + 1.5⋅IQR = 96 + (1.5)(34) = 147. So we see that 3, 4, and 6 are outliers. Since this summary represents completed projects in a department, the manager might find that the outliers are due to new employees, or might make changes so that projects are spread more evenly.

The five number summary for a set of data given below represents the cost of ads between different magazines owned by the same company. Min = 57, Q1 = 64, Median = 66, Q3 = 76, Max = 80. Using the interquartile range, which of the following are outliers? Select all correct answers.

Correct answer: 36, 96, 117 Remember that outliers are numbers that are more than 1.5⋅IQR below the first quartile or more than 1.5⋅IQR above the third quartile, where IQR stands for the interquartile range. The interquartile range is the third quartile minus the first quartile. So we find IQR = 76 − 64 = 12. So a value is an outlier if it is less than Q1 − 1.5⋅IQR = 64 − (1.5)(12) = 46 or greater than Q3 + 1.5⋅IQR = 76 + (1.5)(12) = 94. So we see that 36, 96, and 117 are outliers. Since this data represents the cost of ads in magazines all owned by the same company, those in charge could steer advertisers with a larger ad budget to the top outliers and those with a lower ad budget to the bottom outliers.

Which areas of the plot for the Midwest and West regions have potential outliers?

Correct answer: Distance between the minimum and Q1 The length of the lower whisker line for both the Midwest and the West is long compared to the upper whiskers. The lower values may contain outliers.

The following data set provides Oklahoma data on benchmark jobs and relationship to market. You make some final calculations about the Information Systems (IS) staff. Which of the following statements are true? Select all that apply.

Correct answer: IS staff at the 25th percentile earn a salary of $44,237.21. IS staff at the 75th percentile earn a salary of $56,109.65. The IS staff at the median salary earn more than the CAD specialists at the 45th percentile. The calculations of the percentiles and quartiles confirm the correct answers. The only one that is not correct is D. The 87th percentile of the HR staff earn less than the IS staff at the 75th percentile. The numbers are $53,350.12 versus $56,109.65.

In the census population density data set, what are the first, second and third quartiles? 1,19,35,43,49,55,56,56,63,67,94,105,110,168,175,181,212,231,239,351,461,595,738,839,9857

Correct answer: Q1: 55.5, Q2: 110, Q3: 295 First, find the median (or second quartile). There is an odd number of data points, and so the median is the middle number, 110. To find the first quartile, you need to find the middle number (median) of the lower half of the data. There is an even number of data points, so the first quartile is the average of the two middle data points, 55 and 56. The first quartile is 55.5. To find the third quartile, you need to find the middle number (median) of the upper half of the data. There is an even number of data points, so the third quartile is the average of the two middle data points, 239 and 351. The third quartile is 295.

John is the owner of a flower shop in New York City. The changes in weather and temperature are key factors for his inventory. The data below are the monthly average high temperatures for New York City. What is the five-number summary? 40,40,48,61,72,78,84,84,76,65,54,42

Correct answer: Sample minimum: 40, Sample maximum: 84 Q1: 45, Median: 63, Q3: 77 The five-number summary must be found using a data set that is ordered from least to greatest. Once ordered, the sample minimum is the smallest value, and the sample maximum is the largest value. The median is the middle value, which separates the data set into a lower half and an upper half. Q1 is the median of the lower half of the data set, and Q3 is the median of the upper half of the data set.

The following table shows 56 data values, sorted and arranged in rows of 5. What is the 2nd quartile of the data? 1 2 5 7 7 9 10 11 11 14 15 15 15 18 18 18 22 22 22 23 26 27 28 33 34 34 35 38 38 39 41 42 45 48 52 52 54 58 59 61 65 66 68 70 79 84 87 87 87 88 89 91 93 96 97 98

Correct answer: quartile = 38 Remember that the 2nd quartile is the value which has 50% of the values below it. Because there are 56 values in the set of data, we compute (56)⋅(50%) = 28. So we want the value P which has 28 values less than or equal to P. Looking through the table, we find that there are 28 values less than or equal to 38, so the 2nd quartile is 38. A hospital uses percentiles to see how their emergency room compares to several other hospitals. For example (using the same data), a rating on a scale of 1−100 was taken from 56 hospitals. Hospital A has a rating of 22, and from this data we know that is below the 2nd quartile, since the 2nd quartile is 38.

The five number summary for a set of data is given below. Min = 69, Q1 = 74, Median = 81, Q3 = 85, Max = 87. What is the interquartile range of the set of data? Enter just the number as your answer. For example, if you found that the interquartile range was 24, you would enter 24.

Correct answer: 11 Remember that the interquartile range is the third quartile minus the first quartile. So we find that the interquartile range is 85 − 74 = 11.

The dataset below represents the population density per square mile of land area in 15 states in the 2010 U.S. Census. What is the interquartile range? 1,19,35,43,49,55,63,94,105,110,175,231,239,351,738

Correct answer: 188 Find the median, which is 94. Q1 and Q3 can be calculated by finding the medians of the lower and the upper half of the data set, separated by the median. Q1 is 43, and Q3 is 231. The interquartile range is the difference between Q3 and Q1. 231 − 43 = 188, so the interquartile range is 188.

The following data set provides Oklahoma data on benchmark jobs and relationship to market. You find the spread in salaries for HR staff is much larger than that for the CAD (Computer Aided Drafting and Design) specialists. You calculate the 33rd percentile of HR staff salaries to find how it compares to your previous calculation. What is that salary?

Correct answer: $37,175.56 There are 236 HR employees, so the 33rd percentile is between the 78th and the 79th employee. Both of them make the same salary, $37,175.56.

What is the potential outlier in the population density data set? 1,19,35,43,49,55,63,94,105,110,175,231,239,351,738

Correct answer: 738 To be sure a value is an outlier, check whether the data value is less than Q1 − 1.5(IQR) or greater than Q3 + 1.5(IQR). Q1 − 1.5(IQR) = 43 − 1.5(188) = 43 − 282 = −239. Q3 + 1.5(IQR) = 231 + 1.5(188) = 231 + 282 = 513. The value 738 is greater than 513, so it is an outlier.
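
Starting from the raw data rather than from given quartiles, the whole check can be sketched in Python. Quartiles are taken as medians of the lower and upper halves, as in the interquartile range card for this data set; the variable names are assumptions.

```python
from statistics import median

density = [1, 19, 35, 43, 49, 55, 63, 94, 105, 110, 175, 231, 239, 351, 738]

data = sorted(density)
n = len(data)
q1 = median(data[:n // 2])          # median of the lower half
q3 = median(data[(n + 1) // 2:])    # median of the upper half
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]
print(q1, q3, iqr, low_fence, high_fence, outliers)
```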

Find the median of the following set of data. 7,26,7,9,11,4,15,22

Correct answer: median = 10 It helps to put the numbers in order. 4,7,7,9,11,15,22,26 Now, because the list has length 8, which is even, we know the median number will be the average of the middle two numbers, 9 and 11. So the median is 10.
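
The odd and even cases of the median can be sketched in a few lines of Python. The function name `median_of` is an assumption; the standard library's `statistics.median` follows the same rule.

```python
def median_of(values):
    """Middle value of the sorted list, or the average of the two middle values."""
    data = sorted(values)
    n = len(data)
    mid = n // 2
    if n % 2 == 1:
        return data[mid]                    # odd length: single middle item
    return (data[mid - 1] + data[mid]) / 2  # even length: average the middle pair

print(median_of([7, 26, 7, 9, 11, 4, 15, 22]))
```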

Given the following list of the number of pens randomly selected students purchased in the last semester, find the median. 13,7,8,37,32,19,17,32,12,26

Correct answer: median = 18 pens It helps to put the numbers in order. 7,8,12,13,17,19,26,32,32,37 Now, because the list has length 10, which is even, we know the median number will be the average of the middle two numbers, 17 and 19. So the median number of pens randomly selected students purchased in the last semester is 18.

The following data set provides wage information of Seattle by subdivisions. In how many different quartiles would you find people with the job classification of Strategic Advisor 2, Exempt, in the City Budget Department?

Correct answer: 3 The City Budget Department has people with the job classification of Strategic Advisor 2, Exempt, in all quartiles of the compensation spreadsheet except for the first quartile.

Based on the box-and-whisker plot you constructed above, which area of the plot has potential outliers?

Distance between the minimum and Q1 The lower whisker is long compared to the upper whisker. The lower values may contain outliers.

Which interquartile range (IQR) is much smaller than the other three regions?

East The East region has a very tight spread of tuition costs, so its IQR is the smallest.

Find the mode of the following amounts (in thousands of dollars) in checking accounts of randomly selected people aged 20-25. 2,4,4,7,2,9,9,2,4,4,11

If we count the number of times each value appears in the list, we get the following frequency table:
Value: 2, 4, 7, 9, 11
Frequency: 3, 4, 1, 2, 1
Note that 4 occurs 4 times, which is the greatest frequency, so 4 is the mode of the amounts (in thousands of dollars) in checking accounts of randomly selected people aged 20-25.
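
Counting frequencies and picking the most frequent value can be sketched with `collections.Counter`. The function name `modes` is an assumption; it returns a list so that ties, which do not occur in this card, would not be hidden.

```python
from collections import Counter

def modes(values):
    """Return every value that attains the greatest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

amounts = [2, 4, 4, 7, 2, 9, 9, 2, 4, 4, 11]
print(Counter(amounts))   # frequency of each value
print(modes(amounts))
```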

Find the median of the numbers in the following list. 10,7,15,6,24,20,1

It helps to put the numbers in order. 1,6,7,10,15,20,24 Now, because the list has length 7, which is odd, we know the median number will be the middle number. In other words, we can count to item 4 in the list, which is 10. So the median is 10.

Find the median of the following list of dollars spent per customer at a cheese shop in the last hour. 32,19,21,16,27,15

It helps to put the numbers in order. 15,16,19,21,27,32 Now, because the list has length 6, which is even, we know the median number will be the average of the middle two numbers, 19 and 21. So the median number of dollars spent per customer at a cheese shop in the last hour is 20.

Find the median of the following list of inches traveled by randomly selected worms in a two minute time period. 11,7,5,12,20,6

It helps to put the numbers in order. 5,6,7,11,12,20 Now, because the list has length 6, which is even, we know the median number will be the average of the middle two numbers, 7 and 11. So the median number of inches traveled by randomly selected worms in a two minute time period is 9.

Find the mode of the following number of computers available to students at randomly selected high school libraries. 9,19,7,16,13,19,7,13,13

Note that 13 occurs 3 times, which is the greatest frequency, so 13 is the mode of the number of computers available to students at randomly selected high school libraries.

Given the following list of values, is the mean or the median likely to be a better measure of the center of the data set? 29, 56, 27, 29, 27, 28, 28, 30, 30, 27

Median Most of the values are close together in the range between 27 and 30, but because there is one number, 56, which is much larger than the rest of the values, the mean would not be a good measure because that one large value would pull the mean up. Therefore, the median is probably a better measure of the center of this data set.
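
To see the pull of the one large value numerically, a quick Python comparison helps. This sketch uses the list from the question; the variable names are assumptions.

```python
from statistics import mean, median

values = [29, 56, 27, 29, 27, 28, 28, 30, 30, 27]
without_56 = [v for v in values if v != 56]

print(mean(values), median(values))          # mean is pulled above 30
print(mean(without_56), median(without_56))  # both sit near 28 without 56
```

The median barely moves when 56 is dropped, while the mean shifts by several units, which is the behavior the answer describes.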

Given the following frequency table of values, is the mean or the median likely to be a better measure of the center of the data set?
Value: 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37
Frequency: 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 4, 3, 3, 1

Median Most of the values are close together in the range between 33 and 37, but because there is one number, 20, which is much smaller than the rest of the values, the mean would not be a good measure because that one small value would pull the mean down. Therefore, the median is probably a better measure of the center of this data set.

Given the following frequency table of values, is the mean or the median likely to be a better measure of the center of the data set?

Median Most of the values are close together in the range between 40 and 45, but because there is one number, 70, which is much larger than the rest of the values, the mean would not be a good measure because that one large value would pull the mean up. Therefore, the median is probably a better measure of the center of this data set.

The following frequency table summarizes a set of data. What is the five-number summary?

Min = 2, Q1 = 4, Median = 8, Q3 = 10, Max = 12 We can immediately see that the minimum value is 2 and the maximum value is 12. If we add up the frequencies in the table, we see that there are 19 total values in the data set. Therefore, the median value is the one where there are 9 values below it and 9 values above it. By adding up frequencies, we see that this happens at the value 8, so that is the median. Now, looking at the lower half of the data, there are 9 values there, and so the median value of that half of the data is 4. This is the first quartile. Similarly, the third quartile is the median of the upper half of the data, which is 10. The ordered data are 2, 2, 3, 3, 4, 4, 5, 7, 7, 8, 8, 8, 8, 10, 10, 10, 11, 12, 12. A plant director can use a five-number summary to see the distribution of the number of projects completed per manager. The lowest number of projects completed was 2 and the highest was 12, with a median of 8. The median of the lower half is 4, and the median of the upper half is 10. This could tell the plant director whether projects should be distributed differently among managers.

The following frequency table summarizes a set of data. What is the five-number summary?

Min = 5, Q1 = 6, Median = 11, Q3 = 13, Max = 16 We can immediately see that the minimum value is 5 and the maximum value is 16. If we add up the frequencies in the table, we see that there are 15 total values in the data set. Therefore, the median value is the one where there are 7 values below it and 7 values above it. By adding up frequencies, we see that this happens at the value 11, so that is the median. Now, looking at the lower half of the data, there are 7 values there, and so the median value of that half of the data is 6. This is the first quartile. Similarly, the third quartile is the median of the upper half of the data, which is 13. The ordered data are 5, 5, 5, 6, 6, 7, 9, 11, 11, 11, 13, 13, 13, 15, 16.

The following frequency table summarizes a set of data. What is the five-number summary?

Min = 7, Q1 = 8, Median = 11, Q3 = 14, Max = 17 We can immediately see that the minimum value is 7 and the maximum value is 17. If we add up the frequencies in the table, we see that there are 15 total values in the data set. Therefore, the median value is the one where there are 7 values below it and 7 values above it. By adding up frequencies, we see that this happens at the value 11, so that is the median. Now, looking at the lower half of the data, there are 7 values there, and so the median value of that half of the data is 8. This is the first quartile. Similarly, the third quartile is the median of the upper half of the data, which is 14. The ordered data are 7, 7, 8, 8, 8, 10, 10, 11, 13, 13, 14, 14, 16, 17, 17. A new T-shirt shop can use a five-number summary to see the distribution of the number of sweaters sold per day. The lowest number of sweaters sold was 7 and the highest was 17, with a median of 11. The median of the lower half is 8, and the median of the upper half is 14. This could tell the shop whether to keep selling sweaters or look at another item.

The following frequency table summarizes a set of data. What is the five-number summary?

Min = 7, Q1 = 9, Median = 13, Q3 = 15, Max = 17 We can immediately see that the minimum value is 7 and the maximum value is 17. If we add up the frequencies in the table, we see that there are 15 total values in the data set. Therefore, the median value is the one where there are 7 values below it and 7 values above it. By adding up frequencies, we see that this happens at the value 13, so that is the median. Now, looking at the lower half of the data, there are 7 values there, and so the median value of that half of the data is 9. This is the first quartile. Similarly, the third quartile is the median of the upper half of the data, which is 15. The ordered data are 7, 8, 9, 9, 9, 10, 13, 13, 13, 15, 15, 15, 16, 16, 17.

Given the following list of data, what is the five-number summary? 2, 5, 7, 7, 9, 9, 9, 10, 10, 11, 12

Min = 2, Q1 = 7, Median = 9, Q3 = 10, Max = 12 We can immediately see that the minimum value is 2 and the maximum value is 12. There are 11 values in the list, so the median value is the one where there are 5 values below it and 5 values above it. We see that this happens at the value 9, so that is the median. Now, looking at the lower half of the data, there are 5 values there, and so the median value of that half of the data is 7. This is the first quartile. Similarly, the third quartile is the median of the upper half of the data, which is 10. A restaurant can use a five-number summary to see the distribution of customer satisfaction scores about their service. For this, a scale from 1 (least) to 12 (most) could have been used, with 1 not being selected. The lowest satisfaction score given was 2 and the highest was 12, with a median of 9. The median of the lower half is 7, and the median of the upper half is 10. This could tell the restaurant whether to train employees more or keep to their current standards.

Given the following frequency table of values, is the mean or the median likely to be a better measure of the center of the data set?

Most of the values are close together in the range between 33 and 37, but because there is one number, 20, which is much smaller than the rest of the values, the median is probably a better measure of the center of this data set.

Given the following list of values, is the mean or the median likely to be a better measure of the center of the data set? 39, 41, 38, 39, 38, 41, 41, 39, 40

Most of the values are close together in the range between 38 and 41. There are no very large or very small values in the list, so the mean is a good measure of the center because it takes into account all the values but will not be pulled up or down by any one value.

The following data set provides New York City school based programs cost by borough. Carol examines a five number summary. She is given four numbers which are options to be outliers. If the numbers are above the maximum, or below the minimum, are they automatically identified as outliers?

No If a number is outside of the five number maximum and minimum, it does not necessarily mean it is an outlier. The IQR can help to determine potential outliers. An outlier is a data point that is significantly different (or far away) from the other data values. Outliers may be errors or some kind of abnormality, or they may be a key to understanding the data.

The following data set provides New York City school based programs cost by borough. The five number summary of cost for school based programs in the Brooklyn borough is given here. Minimum $35,263, Q1 $150,000, Median $738,703.5, Q3 $1,711,568, Maximum $5,704,790. Using the interquartile range, which of the following are outliers? Select all correct answers.

No outliers IQR = Q3 − Q1 = 1,711,568 − 150,000 = $1,561,568. To find any lower outliers: Q1 − 1.5(IQR) = 150,000 − 1.5(1,561,568) = 150,000 − 2,342,352 = −$2,192,352. There are no numbers less than −$2,192,352. To find any upper outliers: Q3 + 1.5(IQR) = 1,711,568 + 1.5(1,561,568) = 1,711,568 + 2,342,352 = $4,053,920. There are no numbers above this level. So, there are no outliers listed.

Find the mode of the following number of states randomly selected travelers at a service plaza visited in the past three years. 18,13,8,8,13,10,13,10,9,18

Note that 13 occurs 3 times, which is the greatest frequency, so 13 is the mode of the number of states randomly selected travelers at a service plaza visited in the past three years.

Find the mode of the following amounts of exercise (in hours) randomly selected runners completed during a weekend. 2,14,14,4,2,4,1,14,4,4,8

Note that 4 occurs 4 times, which is the greatest frequency, so 4 is the mode of the amount of exercise (in hours) randomly selected runners completed during a weekend.

Find the mode of the following number of times each machine in a car factory needed to be fixed within the last year. 2,5,6,12,14,12,6,2,5,3,14,5

Note that 5 occurs 3 times, which is the greatest frequency, so 5 is the mode of the number of times a machine in a car factory needed to be fixed within the last year.

Find the mode of the following number of stingrays spotted by each person during a snorkeling trip. 11,5,7,5,11,1,7,4,1,7,7

Note that 7 occurs 4 times, which is the greatest frequency, so 7 is the mode of the number of stingrays spotted by each person during a snorkeling trip.

Given the following list of minutes randomly selected customers waited in line to buy tickets for a play, find the median. 17,11,34,22,4,9,12,31,21,15

It helps to put the numbers in order. 4,9,11,12,15,17,21,22,31,34 Now, because the list has length 10, which is even, we know the median number will be the average of the middle two numbers, 15 and 17. So the median number of minutes randomly selected customers waited in line to buy tickets for a play is 16.

Given the following list of data, find the median. 22,33,17,8,17,29,18,13,26,28

It helps to put the numbers in order. 8,13,17,17,18,22,26,28,29,33 Now, because the list has length 10, which is even, we know the median number will be the average of the middle two numbers, 18 and 22. So the median is 20.

A statistics professor surveys all 100 of the students in an introductory statistics lecture. The survey asks the students to estimate when they typically wake up on weekdays. The data are recorded in terms of the number of hours after midnight the students wake up. Would it be more appropriate to find a sample standard deviation or population standard deviation in this situation?

Population standard deviation. The survey was given to all 100 students in the class, which is the entire population of the class. So a population standard deviation would be more appropriate.

The following frequency table summarizes 60 data values. What is the 3rd quartile of the data?
Value:      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Frequency:  5  4  3  1  6  4  1  3  2  2  5  3  2  2  5  1  1  4  1  5

Remember that the 3rd quartile is the value which has 75% of the values at or below it. Because there are 60 values in the set of data, we compute (60)⋅(75%)=45. So we want the 45th value when the data are listed in order. Adding up the frequencies, we find that 43 values are less than or equal to 14 and 48 values are less than or equal to 15, so the 45th value is 15 and the 3rd quartile is 15.
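The cumulative-frequency search described above can be sketched in Python. The function name `value_at_percent` is an assumption made for this illustration, and it follows this card's convention (find the position n⋅p, then read off the value there); other textbooks and software interpolate and may give slightly different quartiles.

```python
def value_at_percent(freq_table, percent):
    """freq_table: (value, frequency) pairs sorted by value.
    Returns the first value whose cumulative frequency reaches
    percent% of the total count."""
    total = sum(freq for _, freq in freq_table)
    running = 0
    for value, freq in freq_table:
        running += freq
        # Integer comparison avoids floating-point surprises.
        if running * 100 >= total * percent:
            return value
    raise ValueError("empty table or percent above 100")


# Frequency table copied from the 60-value card above.
frequencies = [5, 4, 3, 1, 6, 4, 1, 3, 2, 2, 5, 3, 2, 2, 5, 1, 1, 4, 1, 5]
table = list(zip(range(1, 21), frequencies))

print(value_at_percent(table, 75))
```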

Estimate the mean of the amounts (in dollars) randomly selected customers spent on chocolate chip cookies at a winter fair given in the following grouped frequency table. Round the final answer to one decimal place.

Remember that to estimate the mean, we first find the midpoint of each interval (here 1.5, 5.5, 9.5, and 13.5). Now, we treat this as if it were a regular frequency table. We take each midpoint multiplied by its frequency, add them up, and divide by the total number of values. The sum is 1.5⋅5+5.5⋅6+9.5⋅13+13.5⋅1=177.5. We find the total number of values by adding up the frequency column: 5+6+13+1=25. Finally, dividing the sum by the total number of values, we find our mean estimate of the amounts (in dollars) randomly selected customers spent on chocolate chip cookies at a winter fair is: Mean estimate=177.5/25=7.1
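The grouped-mean estimate above amounts to a weighted average of interval midpoints. A minimal sketch follows, assuming the midpoints and frequencies read off the card's arithmetic; the interval endpoints themselves are not shown on the card, so they are not reconstructed here, and `grouped_mean` is a name chosen for this illustration.

```python
def grouped_mean(midpoints, frequencies):
    """Estimate the mean of grouped data: each interval is
    represented by its midpoint, weighted by its frequency."""
    weighted_sum = sum(m * f for m, f in zip(midpoints, frequencies))
    return weighted_sum / sum(frequencies)


# Midpoints and frequencies taken from the cookie-spending card.
midpoints = [1.5, 5.5, 9.5, 13.5]
frequencies = [5, 6, 13, 1]

estimate = grouped_mean(midpoints, frequencies)
print(round(estimate, 1))  # 7.1
```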

A hotel owner is deciding whether to buy new parts, hire a plumber, or allow no changes due to possible issues with the water pressure. To help make her decision, the data set lists the number of complaints about the water pressure at the hotel. For this data set, the minimum is 3, the median is 15, the third quartile is 16, the interquartile range is 4, and the maximum is 19. Construct a box-and-whisker plot that shows the number of complaints. Move the median first, then the first and third quartiles, and last the minimum and maximum.

Remember that the interquartile range is the third quartile minus the first quartile. Since we know the third quartile is 16, and the interquartile range is 4, we find that the first quartile must be 16−4=12. Since the box-and-whisker plot represents the five number summary of a set of data, the left end of the left whisker is the minimum value (3), the left edge of the box is the first quartile (12), the line in the middle of the box is the median (15), the right edge of the box is the third quartile (16), and the right end of the right whisker is the maximum value (19).

The frequency table below summarizes a list of the number of laps completed by swimmers during a fitness class. Find the mean.
Value:     10 11 12 13 14 15 16 17
Frequency:  7  2  1  3  1  6  3  1

Remember that the mean is the sum of all the numbers divided by the number of numbers. The frequency table tells you the number of times that each number appears in the set of data. So to get the sum of all the numbers in the set of data, we take each frequency multiplied by its value and add them all up: Sum=10⋅7+11⋅2+12⋅1+13⋅3+14⋅1+15⋅6+16⋅3+17⋅1=70+22+12+39+14+90+48+17=312. The number of numbers in the list is the sum of the frequencies. Number of numbers=7+2+1+3+1+6+3+1=24. So the mean of the number of laps completed by swimmers during a fitness class is Sum/Number of numbers=312/24=13

The frequency table below summarizes a list of the amounts (in dollars) randomly selected customers spent on hot chocolate during a winter festival. Find the mean.
Value:      8  9 10 11 12 13 14 15
Frequency:  5  2  5  2  2  2  2  3

Remember that the mean is the sum of all the numbers divided by the number of numbers. The frequency table tells you the number of times that each number appears in the set of data. So to get the sum of all the numbers in the set of data, we take each frequency multiplied by its value and add them all up: Sum=8⋅5+9⋅2+10⋅5+11⋅2+12⋅2+13⋅2+14⋅2+15⋅3=40+18+50+22+24+26+28+45=253. The number of numbers in the list is the sum of the frequencies. Number of numbers=5+2+5+2+2+2+2+3=23. So the mean of the amounts (in dollars) randomly selected customers spent on hot chocolate during a winter festival is Sum/Number of numbers=253/23=11

Given the frequency table below for a list of recorded lengths (in inches) of randomly sampled garden snakes, find the mean.
Value:      9 10 11 12 13 14 15
Frequency:  8  3  2  2  1  1  3

Remember that the mean is the sum of all the numbers divided by the number of numbers. The frequency table tells you the number of times that each number appears in the set of data. So to get the sum of all the numbers in the set of data, we take each frequency multiplied by its value and add them all up: Sum=9⋅8+10⋅3+11⋅2+12⋅2+13⋅1+14⋅1+15⋅3=72+30+22+24+13+14+45=220. The number of numbers in the list is the sum of the frequencies. Number of numbers=8+3+2+2+1+1+3=20. So the mean of the recorded lengths (in inches) of randomly sampled garden snakes is Sum/Number of numbers=220/20=11
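The three frequency-table means above all follow the same recipe: multiply each value by its frequency, add the products, and divide by the total frequency. A small sketch follows, with `frequency_table_mean` as a hypothetical helper name and the tables stored as value-to-frequency dictionaries.

```python
def frequency_table_mean(table):
    """table maps each value to its frequency.
    Mean = sum(value * frequency) / sum(frequency)."""
    total = sum(value * freq for value, freq in table.items())
    count = sum(table.values())
    return total / count


# Tables copied from the laps, hot chocolate, and garden snake cards.
laps = {10: 7, 11: 2, 12: 1, 13: 3, 14: 1, 15: 6, 16: 3, 17: 1}
hot_chocolate = {8: 5, 9: 2, 10: 5, 11: 2, 12: 2, 13: 2, 14: 2, 15: 3}
snakes = {9: 8, 10: 3, 11: 2, 12: 2, 13: 1, 14: 1, 15: 3}

for name, table in [("laps", laps),
                    ("hot chocolate", hot_chocolate),
                    ("snakes", snakes)]:
    print(name, frequency_table_mean(table))
```

The same helper applies to the frequency-graph cards, since a frequency graph is a picture of the same value-to-frequency pairs.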

Jon loves to go bird watching at a nearby animal sanctuary. Find the mean of the following numbers of birds he spotted at the sanctuary in the last few days. 14,10,17,11,9,15,6,14

Remember that the mean is the sum of the numbers divided by the number of numbers. There are 8 numbers in the list. So we find that the mean number of birds spotted is (14+10+17+11+9+15+6+14)/8 = 96/8 = 12

A statistics professor gives a survey to each of the 100 students in an introductory statistics lecture. The survey asks the students to estimate when they typically wake up on weekdays. The data are recorded in terms of the number of hours after midnight the students wake up. The data are included below. Use Excel to calculate the population standard deviation and the population variance. Round your answers to three decimal places. Do not round until you've calculated your final answer. Wake up time: 6 7.5 7 8 7.5 6 8.5 7.5 8.5

Population standard deviation: 0.852. Population variance: 0.726.

Put the interquartile ranges in order from smallest to largest.

The IQR is represented by each region's block of color, which makes the plots easy to sort. The smallest IQR is East, then South, then West, and Midwest has the largest IQR.

The following data set provides New York City school based programs cost by borough.

The five number summary of cost for school based programs in the Bronx borough is given here. Minimum $35,239, Q1 $116,987, Median $194,696, Q3 $827,996, Maximum $12,035,084. What is the interquartile range of the set of data? Enter just the number as your answer. For example, if you found that the interquartile range is 25, you would enter 25.

The following data set provides New York City school based programs cost by borough.

The five number summary of cost for school based programs in the Manhattan borough is given here. Minimum $35,263, Q1 $63,617, Median $507,206.5, Q3 $1,324,622, Maximum $4,667,715. Using the interquartile range, which of the following are outliers? Select all correct answers.
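Both New York City cards come down to arithmetic on Q1 and Q3. The sketch below computes the interquartile range and the outlier cutoffs, assuming the common 1.5 × IQR fence rule; the cards do not spell out which rule is intended, and the answer choices are not reproduced in this set, so only the five-number-summary values are checked. `iqr_fences` is a name invented for this illustration.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (IQR, lower fence, upper fence).
    Values outside the fences are flagged as outliers under the
    usual k = 1.5 convention (an assumption here)."""
    iqr = q3 - q1
    return iqr, q1 - k * iqr, q3 + k * iqr


# Bronx: Q1 and Q3 from the five-number summary on the card.
bronx_iqr, _, _ = iqr_fences(116_987, 827_996)
print("Bronx IQR:", bronx_iqr)

# Manhattan: check the minimum and maximum against the fences.
m_iqr, m_low, m_high = iqr_fences(63_617, 1_324_622)
print("Manhattan IQR:", m_iqr, "fences:", m_low, m_high)
for label, value in [("minimum", 35_263), ("maximum", 4_667_715)]:
    is_outlier = value < m_low or value > m_high
    print(label, value, "outlier" if is_outlier else "not an outlier")
```

Under these assumptions the Manhattan maximum falls above the upper fence, while the minimum is well inside the lower one.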

A manager at a shoe factory would like to find the mean number of breaks taken by employees on a particular Friday. He collects data from 15 fellow coworkers in the factory. The graph shows the frequency for the number of breaks taken during this time period. Find the mean number of breaks for the 15 coworkers, and round your answer to the nearest tenth. Record your answer by dragging the purple point to the mean.

The frequency graph shows the frequency for each data value. So, we can compute the mean by adding up all the data values and dividing by the total number of data values. (3⋅1+5⋅2+3⋅3+2⋅4+1⋅5+0⋅6+1⋅7)/15=42/15=2.8. Rounding to the nearest tenth, the mean is 2.8.

An office manager would like to find the mean number of emails sent by employees during a one-hour period. She collects data from 34 employees in the office. The graph shows the frequency for the number of emails sent during this time period. Find the mean number of emails for the 34 employees, and round your answer to the nearest tenth. Record your answer by dragging the purple point to the mean.

The frequency graph shows the frequency for each data value. So, we can compute the mean by adding up all the data values and dividing by the total number of data values. (3⋅1+6⋅2+3⋅3+9⋅4+6⋅5+4⋅6+3⋅7)/34=135/34≈3.97. Rounding to the nearest tenth, the mean is 4.0.

A student would like to find the mean number of people living in households in a neighborhood. She collects data from 65 homes in the area. The graph shows the frequency for the number of people living in the homes. Find the mean number of people living in the 65 homes, and round your answer to the nearest tenth. Record your answer by dragging the purple point to the mean.

The frequency graph shows the frequency for each data value. So, we can compute the mean by adding up all the data values and dividing by the total number of data values. (3⋅1+6⋅2+7⋅3+8⋅4+12⋅5+14⋅6+15⋅7)/65=317/65≈4.88. Rounding to the nearest tenth, the mean is 4.9.

A music teacher would like to find the mean number of songs people listen to on the way home from work. She collects data from 18 teachers at the school. The graph shows the frequency for the number of songs the teachers listen to on their way home from work. Find the mean number of songs listened to for the 18 teachers, and round your answer to the nearest tenth. Record your answer by dragging the purple point to the mean.

The frequency graph shows the frequency for each data value. So, we can compute the mean by adding up all the data values and dividing by the total number of data values. (0⋅1+1⋅2+2⋅3+2⋅4+3⋅5+2⋅6+8⋅7)/18=99/18=5.5. Rounding to the nearest tenth, the mean is 5.5.

The following data set provides wage information of Seattle by subdivisions. Select all the statements that are true:

The lowest hourly rate in the Arts and Culture Department is the same as the lowest hourly rate in the City Budget Department. The first quartile hourly rates in the City Auditor Department are within the pay range of the fourth quartile hourly rates in the Arts and Culture Department. The highest pay in the second quartile of the City Auditor employee hourly rates is within a dollar of the lowest pay in the third quartile.
Three statements are true:
A. The lowest hourly rate in the Arts and Culture Department is the same as the lowest hourly rate in the City Budget Department. Both are $16.12.
C. The first quartile hourly rates in the City Auditor Department ($43.86 to $52.10) are within the pay range of the fourth quartile hourly rates in the Arts and Culture Department ($40.45 to $62.59).
E. The highest pay in the second quartile of the City Auditor employee hourly rates ($56.07) is within a dollar of the lowest pay in the third quartile ($56.24).
Two statements are false:
B. The lowest hourly rate in the Arts and Culture Department ($16.12) is not the same as the lowest hourly rate in the City Auditor Department ($43.86).
D. The number of people in the second quartile in the City Auditor Department is the same as the number of people in the third quartile of that department. Both quartiles have wage information on two people.

A citizen interested in state and local politics obtains data on the percent of voters who are unregistered for state and local elections for each district in the county. There are 33 districts in the county. The data containing the percent of unregistered voters in each of the 33 districts are recorded as percentage points and are given in the table below. Unregistered Voters (percent values by district): 56.1 53.7 60.5 62.5 59.6 65.1 54.2 60.1 60.7 44.1 52.2 64.6 50.7 58.0 62.7 57.7 51.3 60.9 60.3 38.6 62.1 53.0 54.0 72.2 62.7 63.5 50.0 48.3 46.6 55.2 59.0 54.2 63.9

The population standard deviation is σ≈6.80% and the population variance is σ²≈46.30 (%²), rounding both to two decimal places.
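The card expects Excel, but the same population figures can be cross-checked in Python. This sketch assumes the 33 district values parse from the run-together table as one-decimal numbers; `statistics.pstdev` and `statistics.pvariance` divide by N, like Excel's STDEV.P and VAR.P.

```python
import statistics

# Percent of unregistered voters in each of the 33 districts,
# as parsed from the card above (assumed one decimal place each).
unregistered = [
    56.1, 53.7, 60.5, 62.5, 59.6, 65.1, 54.2, 60.1, 60.7, 44.1, 52.2,
    64.6, 50.7, 58.0, 62.7, 57.7, 51.3, 60.9, 60.3, 38.6, 62.1, 53.0,
    54.0, 72.2, 62.7, 63.5, 50.0, 48.3, 46.6, 55.2, 59.0, 54.2, 63.9,
]

sigma = statistics.pstdev(unregistered)        # divides by N
variance = statistics.pvariance(unregistered)  # sigma squared

print(round(sigma, 2), round(variance, 2))
```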

High school students in a random sample from various schools in a particular state were given a survey asking them what they spend their free time on. One of the questions was "How many hours do you spend on a typical school night playing video games?" The students were allowed to write in their own answers. A sample of 30 responses is reproduced in the table below: Typical time (hours) spent gaming on school nights: 2.2 2.0 1.2 1.2 3.0 3.2 3.2 1.4 1.6 1.0 6.0 0.0 0.0 2.0 2.6 0.8 4.4 4.6 1.6 4.0 0.4 5.6 1.2 3.4 3.2 6.2 1.2 1.2 2.8 0.2 Use Excel to determine the sample standard deviation and the sample variance of the 30 times in the table above.

The sample standard deviation is s≈1.73 and the sample variance is s²≈3.01, rounding both to two decimal places.
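For the sample case, a comparable cross-check uses `statistics.stdev` and `statistics.variance`, which divide by n−1 like Excel's STDEV.S and VAR.S. The 30 gaming times are assumed to parse from the card's table as one-decimal values.

```python
import statistics

# The 30 responses (hours of gaming on a school night) from the card,
# assumed to be one-decimal values.
gaming_hours = [
    2.2, 2.0, 1.2, 1.2, 3.0, 3.2, 3.2, 1.4, 1.6, 1.0,
    6.0, 0.0, 0.0, 2.0, 2.6, 0.8, 4.4, 4.6, 1.6, 4.0,
    0.4, 5.6, 1.2, 3.4, 3.2, 6.2, 1.2, 1.2, 2.8, 0.2,
]

s = statistics.stdev(gaming_hours)              # divides by n - 1
s_squared = statistics.variance(gaming_hours)   # s squared

print(round(s, 2), round(s_squared, 2))
```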

A researcher for an organization that collects and reports on crime data is looking into the murder rates of 20 states from a specific year. The murder rate is the number of murders per 100,000 inhabitants. The true murder count can be approximately recovered by multiplying the murder rate by the population divided by 100,000. The murder rate data are reproduced below. Use Excel to calculate the sample standard deviation and the sample variance. Round your answers to one decimal place. Murders per 100k inhabitants: 10.7 1.3 3.1 6 6.2 7.3 1.3 2.8 1.9

The sample standard deviation is s≈2.9 and the sample variance is s²≈8.1, rounding each to one decimal place.

An author is doing research on an upcoming book on skyscrapers in America. The author gathers data for 20 skyscrapers throughout America and decides to focus on the height data, given in meters, which are reproduced below. Use Excel to calculate the sample standard deviation and the sample variance. Round your answers to two decimal places. Do not round until you've calculated your final answer.

The sample standard deviation is s≈21.95 and the sample variance is s²≈481.82, rounding each to two decimal places.

An education reform lobby is compiling data on the state of education in the United States. In their research they looked at the percent of people who graduate high school in 20 different states. The data are provided below. Use Excel to calculate the sample standard deviation and the sample variance. Round your answers to one decimal place. High School Graduation Rate (%): 73.9 76.9 101.3 85.1 86 84.3 81.3 77.8 94

The sample standard deviation is s≈6.5 and the sample variance is s²≈42.4, rounding each to one decimal place.

The manager at a local grocery store thinks customers might benefit from a self-checkout aisle because the variation in customer wait times at checkout is high enough to warrant providing alternative checkout options for customers with fewer needs. To aid the decision, the manager decides to analyze the standard deviation of the wait times for a random sample of customers. A sample of 30 random customers had their wait times recorded. The wait times in minutes for one of these samples can be found in the table below.

The sample standard deviation is s≈8.22 and the sample variance is s²≈67.61, rounding both to two decimal places.

A student studying statistics wants to look at data for his favorite sport, American football. He collects data on the lengths of 100 field goals from various games over several seasons. The data are provided below. Use Excel to calculate the sample standard deviation and the sample variance. Round your answers to one decimal place. Field Goal Distance (yards): 40 36 43 21 24 22 20 27 45

The sample standard deviation is s≈9.1 and the sample variance is s²≈82.8, rounding each to one decimal place.

Quartiles

The values that divide the data into four equal parts

A nonprofit dedicated to eradicating drunk driving is putting together a report on the frequency of drunk driving throughout all 50 states. They could only afford to gather data for 20 states, which they chose randomly. The number of DUI arrests per 100,000 individuals is one of the data points they gathered for each state. The data are included below. Use Excel to calculate the sample standard deviation and the sample variance.

To determine the sample standard deviation and sample variance for a data set {x1,x2,...,xn} using Excel, follow these steps:
1. Open the included data with Excel. The data should occupy cell range A2:A21.
2. Select cell B3 and type "=STDEV.S(", select the range A2:A21, and then hit ENTER. This gives the sample standard deviation.
3. Select cell B4 and type "=VAR.S(", select the range A2:A21, and then hit ENTER. This gives the sample variance.
The sample standard deviation is s≈201 and the sample variance is s²≈40,533, rounding each to the nearest whole number.

A statistics professor asks each of the students in an introductory statistics lecture to fill out a survey. There are 100 students in the course, and each one filled out the survey. One of the questions asked students to state their age. The data are included below. Use Excel to calculate the population standard deviation and the population variance. first age of student 19

To determine the population standard deviation and population variance for a data set {x1,x2,...,xN} using Excel, follow these steps:
1. Open the included data with Excel. The data should occupy cell range A1:A101.
2. Select cell B3 and type "=STDEV.P(", select the range A1:A101, and then hit ENTER. This gives the population standard deviation.
3. Select cell B4 and type "=VAR.P(", select the range A1:A101, and then hit ENTER. This gives the population variance.
The population standard deviation is σ≈3.4 and the population variance is σ²≈11.7, rounding each to one decimal place.

The following data set provides wage information of Seattle by subdivisions. The range of salaries is greater in the Arts and Culture Department than in the City Auditor Department. True or False

True. Salaries in the Arts and Culture Department go from $16.12 to $62.59, a range of $46.47. Salaries in the City Auditor Department go from $43.86 to $72.84, a range of $28.98.

The following data set represents the ages of all 6 of Nancy's grandchildren. 11,8,5,6,3,9 To determine the "spread" of the data, would you employ calculations for the sample standard deviation, or population standard deviation for this data set?

Use calculations for population standard deviation. To determine if sample standard deviation or population standard deviation should be used, determine if the data set represents data values collected from the entire population, or from a subset of the population. If the data values represent data collected from a subset of the population, then the sample standard deviation should be used. If the data values represent data collected from the entire population of interest, then the population standard deviation should be used. In this case, the population standard deviation should be used because the data set represents all of Nancy's grandchildren, that is, the entire population.
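To see what the choice changes numerically, the sketch below computes both versions by hand for the six ages on the card. The only difference is the divisor: N for a population, n−1 for a sample. The function names are invented for this illustration.

```python
import math


def sum_of_squared_deviations(values):
    """Sum of (x - mean)^2 over the data."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def population_variance(values):
    return sum_of_squared_deviations(values) / len(values)        # divide by N


def sample_variance(values):
    return sum_of_squared_deviations(values) / (len(values) - 1)  # divide by n - 1


ages = [11, 8, 5, 6, 3, 9]  # all 6 grandchildren, so a population

pop_var = population_variance(ages)
samp_var = sample_variance(ages)
print("population:", pop_var, round(math.sqrt(pop_var), 3))
print("sample:    ", samp_var, round(math.sqrt(samp_var), 3))
```

Here the mean is 7 and the squared deviations add to 42, so the population variance is 42/6 = 7, while treating the same ages as a sample would give 42/5 = 8.4.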

median

a number that splits a data set in half, with one half smaller and one half larger; the center or middle value of a data set

The sample minimum and sample maximum

are the two values that show the range of the data set. The sample minimum is the least value in the ordered data set. The sample maximum is the greatest value in the ordered data set.

five-number summary

a set of descriptive statistics that provides information about five key percentiles of the data set: the minimum, the first quartile, the median, the third quartile, and the maximum
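A five-number summary can be sketched in a few lines of Python. This version uses the median-of-halves method from the worked examples in this set (for an odd count, the median itself is left out of both halves); other quartile conventions exist and can give slightly different Q1 and Q3. `five_number_summary` is a name chosen for this illustration.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using median-of-halves.
    With an odd count, the middle value is excluded from both halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0],
            statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper),
            ordered[-1])


# Data set from the five-number-summary example earlier in this set.
print(five_number_summary([1, 2, 4, 5, 6, 8, 8, 10, 19]))
```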

Given the following frequency table of values, is the mean or the median likely to be a better measure of the center of the data set?
Value:     32 33 34 35 36 37
Frequency:  2  3  6  2  2  1

mean

The following data set provides Oklahoma data on benchmark jobs and relationship to market. The next department you look at is Information Systems. What is the median salary for those employees?

Median = $50,606.16. The 2nd quartile or 50th percentile is the same as the median. The median is the 167th employee, who makes $50,606.16.

A company uses percentiles to see how their profits in their market are compared to several other companies. For example, a rating on a scale of 1−100 was taken from 40 companies. The following table shows the 40 data values taken from those 40 companies, sorted and arranged in rows of 5. What is the 70th percentile of the data?
 1  5  6  7   8
17 18 18 22  29
31 39 43 46  50
50 51 55 55  57
59 62 64 67  71
72 75 77 77  82
85 92 92 92  93
93 95 98 98 100

70th percentile = 77. Remember that the 70th percentile is the value which has 70% of the values at or below it. Because there are 40 values in the set of data, we compute (40)⋅(70%)=28. So we want the 28th value in the sorted list. Counting through the table, we find that the 28th value is 77, so the 70th percentile is 77. For example (using the same data), Company A has a rating of 82, and from this data we know that is above the 70th percentile, since the 70th percentile is 77.

The following table shows 44 data values, sorted and arranged in rows of 5. What is the 3rd quartile of the data?
 1  4  6  7  7
 9 12 13 13 14
14 14 14 15 17
26 27 29 30 36
43 47 47 48 49
51 55 55 59 63
65 70 70 70 78
83 89 92 93 96
96 99 99 99

3rd quartile = 70. Remember that the 3rd quartile is the value which has 75% of the values at or below it. Because there are 44 values in the set of data, we compute (44)⋅(75%)=33. So we want the 33rd value in the sorted list. Counting through the table, we find that the 33rd value is 70, so the 3rd quartile is 70. A hotel uses percentiles to see how their beds compare to several other hotels. For example (using the same data), a rating on a scale of 1−100 was taken from 44 hotels. Hotel A has a rating of 65, and from this data we know that is below the 3rd quartile, since the 3rd quartile is 70.
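The position-counting rule used in the two percentile cards above can be sketched as follows. `value_at_percent` is a hypothetical helper; it takes the value at position n⋅p in the sorted list and, as an assumption not stated on the cards, rounds a fractional position up. Spreadsheet and library percentile functions often interpolate instead, so their answers may differ.

```python
def value_at_percent(sorted_values, percent):
    """Return the value at position n * percent / 100 (1-indexed)
    in an already sorted list, rounding fractional positions up."""
    n = len(sorted_values)
    position = -(-n * percent // 100)   # integer ceiling of n*percent/100
    return sorted_values[position - 1]


# Sorted data copied from the 40-company and 44-hotel cards.
ratings = [1, 5, 6, 7, 8, 17, 18, 18, 22, 29, 31, 39, 43, 46, 50,
           50, 51, 55, 55, 57, 59, 62, 64, 67, 71, 72, 75, 77, 77, 82,
           85, 92, 92, 92, 93, 93, 95, 98, 98, 100]
hotels = [1, 4, 6, 7, 7, 9, 12, 13, 13, 14, 14, 14, 14, 15, 17,
          26, 27, 29, 30, 36, 43, 47, 47, 48, 49, 51, 55, 55, 59, 63,
          65, 70, 70, 70, 78, 83, 89, 92, 93, 96, 96, 99, 99, 99]

print(value_at_percent(ratings, 70))  # 28th of 40 values
print(value_at_percent(hotels, 75))   # 33rd of 44 values
```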

Given the following list of tips (in dollars) earned in the last hour by waiters in a Japanese restaurant, find the median.

It helps to put the numbers in order. 17,17,17,21,26,30,31,36,47,47 Now, because the list has length 10, which is even, we know the median number will be the average of the middle two numbers, 26 and 30. So the median amount of tips (in dollars) earned in the last hour by waiters in a Japanese restaurant is 28.

mode

the number(s) that occurs most often in a data set

mean

the sum of all the items in a list divided by the number of items in the list. The term "mean" is often used interchangeably with the term "average".

The following data set provides information about the City of Somerville Assessors Valuation for the fiscal year 2016.
Building Type | Land Area in Acres | Living Area | Total Assessed Land Value | Total Assessed Parcel Value
Commercial | 0.20860882 | 318 | 625900 | 730700
Commercial | 0.21751607 | 3630 | 423700 | 741100
Commercial | 5.5474977 | 59506 | 10253200 | 13547100
Commercial | 0.18682277 | 3780 | 293300 | 460200
Commercial | 0.55227273 | 22017 | 681300 | 1214000
Condominium | 0 | 1776 | 0 | 579500
Condominium | 0 | 1270 | 0 | 512100
Condominium | 0 | 2076 | 0 | 578900
Condominium | 0 | 1132 | 0 | 281800
Condominium | 0 | 957 | 0 | 266800
Is there a direct correlation between the commercial living area's standard deviation and number of offices in a building?

There is not enough information.

outliers

values that are very different from the rest of the values in a data set

