Knewton Alta Chapter 2 Descriptive statistics - part 3

A consumer report was released concerning the prices of various food products. The report listed the monthly average price of a pound of beef for 20 different months. Construct a box and whisker plot using Excel and the QUARTILE.INC function. Then, choose the correct answer below.

1. Open the dataset in Excel. The dataset should occupy A1:A21, where A1 is the header and A2:A21 contains the data. Steps (2) through (6) have you construct a table for the quartile values. 2. Use fill series or manually enter the numbers 0, 1, 2, 3, and 4 into cell range B2:B6 from top to bottom, one number per cell. 3. Write the following entries into cell range D2:D6 from top to bottom, one entry per cell: Minimum, Q1, Median, Q3, Maximum. 4. In cell C1, copy the header for the dataset from A1. 5. In cell C2, write "=QUARTILE.INC(A$2:A$21,$B2)." 6. Then copy and paste this cell to C3:C6. Cells C2:C6 contain the five-number summary. The minimum is 5.138, Q1 is 5.26275, the median is 5.332, Q3 is 5.46425, and the maximum is 5.687. Continue on to construct the box plot. Steps (7) through (10) have you construct a table for the quartile differences. 7. Copy and paste cell range B1:C6 into cell range B7:C12. 8. Write the following entries into cell range D8:D12 from top to bottom, one entry per cell: Minimum, Q1−Minimum, Median−Q1, Q3−Median, Maximum−Q3. 9. In cell C9, write "=C3-C2" where C3 is the corresponding cell in the quartiles table. 10. Copy and paste this new cell to the remaining three cells underneath. Steps (11) through (15) have you construct the box plot from the quartile differences. 11. Create a stacked column chart type from the quartile differences. Select cells C7:C12. Then at the top of the Excel window, click the Insert tab, then the Charts button labeled Column, and then the Stacked Column chart type. The Stacked Column chart type is the top row and second column of the Columns chart types. 12. Switch the row and column data on the resulting chart. Click the Design tab at the top of the Excel Window, and then press the button labeled Switch Row/Column. 13. Set the Fill to blank for the bottom two sections of the bar and also the top section. 
Click the bottom segment of one of the rectangles in the plot, click the Format tab, and below that click the Format Selection button. In the Format Data Series window that pops up, click the Fill tab, and then click the No Fill radio button. 14. Select the top visible segment, and give it error bars that stretch from the top of the visible segment to the maximum of each dataset. Select the top visible segment. Then click on the Layout tab at the top of the Excel window, click on the Error Bars button, and then click on the second option from the top labeled Error Bars With Standard Error. Then select the error bars that appear on the plot, and click Format Selection on the toolbar above. In the Format Error Bars window, click the Vertical Error Bars tab, click the Plus button under Direction, and click the Cap button under End Style. Under Error Amount, click Custom, and then press Specify Value. In the resulting pop-up, click on the Positive Error Value field and select the data in cell C12. 15. Create error bars for the bottom visible segment that extend to the minimum of each dataset. Click the segment below the bottom visible segment, being careful not to select the very bottom segment by accident. Select or click Format Selection again. In the Format Error Bars window, click the Vertical Error Bars tab, click the Minus button under Direction, and click the Cap button under End Style. Under Error Amount, click Percentage:, and then set its value to 100.0% without typing the % symbol. Steps (16) through (20) are optional and deal with beautifying the box and whisker plot. 16. Delete the legend. 17. Modify the vertical axis to more tightly contain the data. For this dataset, the vertical axis was changed to run from 5.1 to 5.7 with major tick marks in increments of 0.1 and minor tick marks in increments of 0.025. 18. Set the fill of both visible box segments to solid color. 19. Set the border color of the visible box segments to solid color. 20. Set the line color of the error bars to solid color.
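
The quartile table and the quartile-difference table from steps (2) through (10) can also be sketched outside Excel. The following minimal Python sketch assumes that QUARTILE.INC interpolates linearly between the sorted values at the zero-based position (n − 1)·q/4. The helper name quartile_inc and the six-value sample are hypothetical, since the beef-price dataset itself is not reproduced here.

```python
def quartile_inc(values, quart):
    """Approximate Excel's QUARTILE.INC for quart in 0..4 (assumed linear interpolation)."""
    data = sorted(values)
    # Zero-based fractional position of the requested quartile.
    pos = (len(data) - 1) * quart / 4
    lower = int(pos)      # index of the value at or just below the position
    frac = pos - lower    # how far to move toward the next value
    if lower + 1 < len(data):
        return data[lower] + frac * (data[lower + 1] - data[lower])
    return data[lower]


# Hypothetical sample, not the dataset from the exercise.
sample = [2, 4, 6, 8, 10, 12]

# Five-number summary: minimum, Q1, median, Q3, maximum (cells C2:C6 in the steps).
summary = [quartile_inc(sample, q) for q in range(5)]

# Quartile differences for the stacked column chart (cells C8:C12 in the steps):
# the minimum itself, then each quartile minus the one before it.
segments = [summary[0]] + [summary[i] - summary[i - 1] for i in range(1, 5)]

print(summary)
print(segments)
```

The segments list mirrors what the stacked column chart draws: the first segment is made invisible, the middle two form the box, and the remaining differences become the whiskers.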

The following data set provides information about the City of Somerville Assessors Valuation for the fiscal year 2016. Is there a direct correlation between the commercial living area's standard deviation and number of offices in a building?

Correct answer: There is not enough information to determine a direct correlation between the commercial living area's standard deviation and the number of offices in a building.

deviation

The difference between a data value and the mean of the data set is called its _______. In a data set, there are as many deviations as there are data values. Deviations are used to calculate the variance and the standard deviation.

Z-SCORE

A measure of how many standard deviations a data value is from the mean of the data set. A z-score may also be referred to as a test statistic.

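In a recent national survey, the mean price for a 2000 sq ft home in Florida is $240,000 with a standard deviation of $16,000. The mean price for the same sized home in Ohio is $170,000 with a standard deviation of $12,000. In which state would a home priced at $200,000 be closer to the mean price, compared to the distribution of prices in that state? Find the z-score corresponding to each state.

Florida: z = (200,000 − 240,000)/16,000 = −2.5. Ohio: z = (200,000 − 170,000)/12,000 = 2.5. The two z-scores have the same absolute value, so a $200,000 home is equally far from the mean price in both states, relative to each state's distribution.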
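
As a quick check of the arithmetic, the z-score formula z = (x − mean)/(standard deviation) can be sketched as a small Python function. The helper name z_score is hypothetical; the means and standard deviations are the ones given in the question.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std_dev


# Values taken from the question: a $200,000 home in each state.
florida = z_score(200_000, 240_000, 16_000)
ohio = z_score(200_000, 170_000, 12_000)

print(florida, ohio)
```

A negative z-score means the price is below the state's mean and a positive one means it is above, so the comparison of "closeness" uses the absolute values.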

Given the following histogram, decide if the data is skewed or symmetrical.

Correct answer: The data are skewed to the right. Note that the histogram has most of its values concentrated on the left, with several much larger values on the right. Therefore, the data are skewed right.

Given the following box-and-whisker plot, decide if the data is skewed or symmetrical.

Correct answer: The data are skewed to the left. Note that the whisker on the left is much longer than the whisker on the right. So there are several much smaller values on the left. Therefore, the data are skewed left. Such a plot could arise, for example, from a study of the age at which people first try tobacco: because of the legal age requirement, most people first try tobacco at an older age, while only a few try it younger.

The following set of data represents stock prices of a pharmaceutical company. Find the sample variance of the data: 14, 7, 10, 9. Round your answer to ONE decimal place.

First, we find that the mean is (14 + 7 + 10 + 9)/4 = 40/4 = 10. Now, we need to take the deviations from the mean and square them:
Value | Deviation | Deviation²
14 | 4.0 | 16.0
7 | −3.0 | 9.0
10 | 0.0 | 0.0
9 | −1.0 | 1.0
Finally, we add up the squared deviations and divide by the number of data values minus one (4 − 1 = 3): (16.0 + 9.0 + 0.0 + 1.0)/3 = 26.0/3 ≈ 8.7. The sample variance shows how much the stock prices vary around their mean: the larger the variance, the riskier the stock. Here the prices vary around the mean of 10 with a sample variance of 8.7 (in squared units).
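
The same three steps (mean, squared deviations, divide by n − 1) can be sketched in a few lines of Python. This is only an illustration of the calculation above; the variable names are arbitrary.

```python
prices = [14, 7, 10, 9]

# Step 1: the mean of the data values.
mean = sum(prices) / len(prices)

# Step 2: the squared deviation of each value from the mean.
squared_deviations = [(x - mean) ** 2 for x in prices]

# Step 3: divide the sum of squared deviations by n - 1 (sample variance).
sample_variance = sum(squared_deviations) / (len(prices) - 1)

print(mean)
print(squared_deviations)
print(round(sample_variance, 1))
```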

A policy think tank focused on teenage obesity was able to get several pediatricians to release anonymous data on the weights in pounds of sixteen-year-olds who had come to their office this year. A sample of 20 of the weights is included below. Construct a box and whisker plot using Excel and the QUARTILE.INC function to select the correct plot below.

1. Open the dataset in Excel. The dataset should occupy A1:A21, where A1 is the header and A2:A21 contains the data. Steps (2) through (6) have you construct a table for the quartile values. 2. Use fill series or manually enter the numbers 0, 1, 2, 3, and 4 into cell range B2:B6 from top to bottom, one number per cell. 3. Write the following entries into cell range D2:D6 from top to bottom, one entry per cell: Minimum, Q1, Median, Q3, Maximum. 4. In cell C1, copy the header for the dataset from A1. 5. In cell C2, write "=QUARTILE.INC(A$2:A$21,$B2)." 6. Then copy and paste this cell to C3:C6. Cells C2:C6 contain the five-number summary. The minimum is 110, Q1 is 129.75, the median is 146.5, Q3 is 173, and the maximum is 329. Continue on to construct the box plot. Steps (7) through (10) have you construct a table for the quartile differences. 7. Copy and paste cell range B1:C6 into cell range B7:C12. 8. Write the following entries into cell range D8:D12 from top to bottom, one entry per cell: Minimum, Q1−Minimum, Median−Q1, Q3−Median, Maximum−Q3. 9. In cell C9, write "=C3-C2" where C3 is the corresponding cell in the quartiles table. 10. Copy and paste this new cell to the remaining three cells underneath. Steps (11) through (15) have you construct the box plot from the quartile differences. 11. Create a stacked column chart type from the quartile differences. Select cells C7:C12. Then at the top of the Excel window, click the Insert tab, then the Charts button labeled Column, and then the Stacked Column chart type. The Stacked Column chart type is the top row and second column of the Columns chart types. 12. Switch the row and column data on the resulting chart. Click the Design tab at the top of the Excel Window, and then press the button labeled Switch Row/Column. 13. Set the Fill to blank for the bottom two sections of the bar and also the top section. 
Click the bottom segment of one of the rectangles in the plot, click the Format tab, and below that click the Format Selection button. In the Format Data Series window that pops up, click the Fill tab, and then click the No Fill radio button. 14. Select the top visible segment and give it error bars that stretch from the top of the visible segment to the maximum of each dataset. Select the top visible segment. Then click on the Layout tab at the top of the Excel window, click on the Error Bars button, and then click on the second option from the top labeled Error Bars With Standard Error. Then select the error bars that appear on the plot, and click Format Selection on the toolbar above. In the Format Error Bars window, click the Vertical Error Bars tab, click the Plus button under Direction, and click the Cap button under End Style. Under Error Amount, click Custom, and then press Specify Value. In the resulting pop-up, click on the Positive Error Value field and select the data in cell C12. 15. Create error bars for the bottom visible segment that extend to the minimum of each dataset. Click the segment below the bottom visible segment, being careful not to select the very bottom segment by accident. Select or click Format Selection again. In the Format Error Bars window, click the Vertical Error Bars tab, click the Minus button under Direction, and click the Cap button under End Style. Under Error Amount, click Percentage:, and then set its value to 100.0% without typing the % symbol. Steps (16) through (20) are optional and deal with beautifying the box and whisker plot. 16. Delete the legend. 17. Modify the vertical axis to more tightly contain the data. For this dataset, the vertical axis was changed to run from 100 to 350 with major tick marks in increments of 25 and minor tick marks in increments of 5. 18. Set the fill of both visible box segments to solid color. 19. Set the border color of the visible box segments to solid color. 20. Set the line color of the error bars to solid color. The completed box plot should look something like the following image.

How to find the variance

Given the sample mean, calculate the sample variance: s² = Σ(x − x̄)² / (n − 1), where s² = the sample variance, x = a specific data value, x̄ = the sample mean, and n = the sample size.

A statistics professor gives a survey to each of the 100 students in an introductory statistics lecture. The survey asks the students how many text messages they think they sent yesterday. Would it be more appropriate to find a sample standard deviation or population standard deviation in this situation?

Population standard deviation. The professor surveys all 100 students in the class, that is, the entire population of the class. So a population standard deviation would be more appropriate.

The histograms below are of data sets representing four different stocks. Which has the smallest standard deviation?

Remember that the standard deviation is a measure of how spread out the data is. If the values are concentrated around the mean, then a data set has a lower standard deviation. A histogram with fewer values and higher frequencies has a lower standard deviation than a histogram which has a shorter hill and a wider range of values. A smaller standard deviation also suggests less risk in that particular stock.

Data Analysis ToolPak Installation Steps

Step 1: Open Microsoft Excel.
Step 2: Go to File, then Options.
Step 3: Under the Excel Options menu, select Add-ins.
Step 4: Under Manage:, select the Excel Add-ins menu option and click "Go...".
Step 5: In the Add-Ins menu, check the box for Analysis ToolPak and click OK.
Step 6: To confirm that the Analysis ToolPak is installed, go to Data and check that the Data Analysis menu option is available.
Step 7: Click on the Data Analysis option to view the Data Analysis menu.

Isabel is looking at the prices for round-trip airfare from Setauket to Orchard Park where both flights occur on Wednesday or both flights occur on Sunday. She randomly selects 20 round-trips where both flights occur on Wednesday and 20 round-trips where both flights occur on Sunday. Isabel records the prices for each round-trip airfare in dollars as shown in the samples provided. One of the round-trips on Wednesday and one of the round-trips on Sunday both cost $235. Based on the z-scores you calculated above, is the $235 airfare more likely to occur on Wednesday or on Sunday?

The absolute value of the z-score for a round-trip flight that occurs on Sunday is less than the absolute value of the z-score for a round-trip flight that occurs on Wednesday, so a round-trip flight that costs $235 is more likely to occur on Sunday than on Wednesday. A z-score with a lower absolute value means that the data value is more likely to occur. Since the absolute value of the z-score for flights that occur on Sunday, 0.97, is less than the absolute value of the z-score for flights that occur on Wednesday, 1.84, a round-trip flight that costs $235 is more likely to occur on Sunday than on Wednesday.

For the following dataset, you want to determine the "spread" of the data. Would you use calculations for the sample standard deviation or the population standard deviation: the pulse rates of 5 randomly selected players on a football team?

Use calculations for the sample standard deviation. To determine if sample standard deviation or population standard deviation should be used, determine if the data set represents data values collected from the entire population, or from a subset of the population. If the data values represent data collected from a subset of the population, then the sample standard deviation should be used. If the data values represent data collected from the entire population of interest, then the population standard deviation should be used. In this case, the data set represents a subset of only 5 of the players on a football team, so the sample standard deviation should be used.

The midterm and final exam grades for a statistics course are provided in the data set below. Jaymes, a student in the class, scored 86 on both exams. Treat the given data sets as samples. Based on the z-scores calculated above, which of Jaymes's grades is more unusual, the midterm grade or the final exam grade?

Correct answer: The absolute value of the z-score for the midterm exam grade is greater than for the final grade, so the midterm grade is more unusual. A z-score with a greater absolute value means that the data value is more unusual. Since the absolute value of the z-score for the midterm grade, 1.86, is greater than the absolute value of the z-score for the final grade, 1.18, Jaymes' midterm grade is more unusual.

The following data set provides information about the City of Somerville Assessors Valuation for the fiscal year 2016. For residential buildings, what is the sample variance of the land area? Round your answer to FOUR decimal places.

Correct answer: sample variance = 0.0004. The mean is 0.0624. The sample variance is the sum of the squared differences between each data value and the mean, divided by n − 1, which is 4 here. For this sample, the sample variance is 0.0004.

The following lists of data represent the time that employees in five separate departments spent in meetings for a week. Which of the following lists of data has the largest standard deviation?

Correct answer: 24, 15, 21, 23, 9, 22, 12, 21, 20, 13. Remember that standard deviation is a measure of how spread out the values are. The list 24, 15, 21, 23, 9, 22, 12, 21, 20, 13 has the largest standard deviation because its values are all relatively spread apart. For the department, this indicates inconsistent (highly variable) weekly meeting times across its employees.

The auditors for a health insurance company are reviewing the bills from clients' stays at hospitals from last year. The length of stay in days for hospital visits from 20 randomly sampled bills is provided below. Use Excel and the QUARTILE.INC function to construct a box and whisker plot for the dataset. What is the value of the third quartile? Round your answer to two decimal places.

Correct answer: 7.25. 1. Open the dataset in Excel. The dataset should occupy A1:A21, where A1 is the header and A2:A21 contains the data. Steps (2) through (6) have you construct a table for the quartile values. 2. Use fill series or manually enter the numbers 0, 1, 2, 3, and 4 into cell range B2:B6 from top to bottom, one number per cell. 3. Write the following entries into cell range D2:D6 from top to bottom, one entry per cell: Minimum, Q1, Median, Q3, Maximum. 4. In cell C1, copy the header for the dataset from A1. 5. In cell C2, write "=QUARTILE.INC(A$2:A$21,$B2)." 6. Then copy and paste this cell to C3:C6. Cells C2:C6 contain the five-number summary. The minimum is 2, Q1 is 5.75, the median is 6, Q3 is 7.25, and the maximum is 9. Continue on to construct the box plot. Steps (7) through (10) have you construct a table for the quartile differences. 7. Copy and paste cell range B1:C6 into cell range B7:C12. 8. Write the following entries into cell range D8:D12 from top to bottom, one entry per cell: Minimum, Q1−Minimum, Median−Q1, Q3−Median, Maximum−Q3. 9. In cell C9, write "=C3-C2" where C3 is the corresponding cell in the quartiles table. 10. Copy and paste this new cell to the remaining three cells underneath. Steps (11) through (15) have you construct the box plot from the quartile differences. 11. Create a stacked column chart type from the quartile differences. Select cells C7:C12. Then at the top of the Excel window, click the Insert tab, then the Charts button labeled Column, and then the Stacked Column chart type. The Stacked Column chart type is the top row and second column of the Columns chart types. 12. Switch the row and column data on the resulting chart. Click the Design tab at the top of the Excel Window, and then press the button labeled Switch Row/Column. 13. Set the Fill to blank for the bottom two sections of the bar and also the top section. 
Click the bottom segment of one of the rectangles in the plot, click the Format tab, and below that click the Format Selection button. In the Format Data Series window that pops up, click the Fill tab, and then click the No Fill radio button. 14. Select the top visible segment, and give it error bars that stretch from the top of the visible segment to the maximum of each dataset. Select the top visible segment. Then click on the Layout tab at the top of the Excel window, click on the Error Bars button, and then click on the second option from the top labeled Error Bars With Standard Error. Then select the error bars that appear on the plot, and click Format Selection on the toolbar above. In the Format Error Bars window, click the Vertical Error Bars tab, click the Plus button under Direction, and click the Cap button under End Style. Under Error Amount, click Custom, and then press Specify Value. In the resulting pop-up, click on the Positive Error Value field and select the data in cell C12. 15. Create error bars for the bottom visible segment that extend to the minimum of each dataset. Click the segment below the bottom visible segment, being careful not to select the very bottom segment by accident. Select or click Format Selection again. In the Format Error Bars window, click the Vertical Error Bars tab, click the Minus button under Direction, and click the Cap button under End Style. Under Error Amount, click Percentage:, and then set its value to 100.0% without typing the % symbol. Steps (16) through (20) are optional and deal with beautifying the box and whisker plot. 16. Delete the legend. 17. Modify the vertical axis to more tightly contain the data. 18. Set the fill of both visible box segments to solid color. 19. Set the border color of the visible box segments to solid color. 20. Set the line color of the error bars to solid color. Looking at the box and whisker plot, the third quartile, where the top whisker intersects with the top side of the box, is 7.25.

Wes is the owner of a real estate agency and is analyzing the time the agents take to sell houses. He reviews each house sold by his agency to determine the number of days each house was on the market before it was sold. The data for a random sample of 20 houses sold in the last year are provided below. Use Excel and the QUARTILE.INC function to construct a box and whisker plot for the dataset. What is the value of the third quartile?

Correct answer: 165. 1. Open the dataset in Excel. The dataset should occupy A1:A21, where A1 is the header and A2:A21 contains the data. Steps (2) through (6) have you construct a table for the quartile values. 2. Use fill series or manually enter the numbers 0, 1, 2, 3, and 4 into cell range B2:B6 from top to bottom, one number per cell. 3. Write the following entries into cell range D2:D6 from top to bottom, one entry per cell: Minimum, Q1, Median, Q3, Maximum. 4. In cell C1, copy the header for the dataset from A1. 5. In cell C2, write "=QUARTILE.INC(A$2:A$21,$B2)." 6. Then copy and paste this cell to C3:C6. Cells C2:C6 contain the five-number summary. The minimum is 65, Q1 is 111.75, the median is 140, Q3 is 165, and the maximum is 199. Continue on to construct the box plot. Steps (7) through (10) have you construct a table for the quartile differences. 7. Copy and paste cell range B1:C6 into cell range B7:C12. 8. Write the following entries into cell range D8:D12 from top to bottom, one entry per cell: Minimum, Q1−Minimum, Median−Q1, Q3−Median, Maximum−Q3. 9. In cell C9, write "=C3-C2" where C3 is the corresponding cell in the quartiles table. 10. Copy and paste this new cell to the remaining three cells underneath. Steps (11) through (15) have you construct the box plot from the quartile differences. 11. Create a stacked column chart type from the quartile differences. Select cells C7:C12. Then at the top of the Excel window, click the Insert tab, then the Charts button labeled Column, and then the Stacked Column chart type. The Stacked Column chart type is the top row and second column of the Columns chart types. 12. Switch the row and column data on the resulting chart. Click the Design tab at the top of the Excel Window, and then press the button labeled Switch Row/Column. 13. Set the Fill to blank for the bottom two sections of the bar and also the top section. 
Click the bottom segment of one of the rectangles in the plot, click the Format tab, and below that click the Format Selection button. In the Format Data Series window that pops up, click the Fill tab, and then click the No Fill radio button. 14. Select the top visible segment, and give it error bars that stretch from the top of the visible segment to the maximum of each dataset. Select the top visible segment. Then click on the Layout tab at the top of the Excel window, click on the Error Bars button, and then click on the second option from the top labeled Error Bars With Standard Error. Then select the error bars that appear on the plot, and click Format Selection on the toolbar above. In the Format Error Bars window, click the Vertical Error Bars tab, click the Plus button under Direction, and click the Cap button under End Style. Under Error Amount, click Custom, and then press Specify Value. In the resulting pop-up, click on the Positive Error Value field and select the data in cell C12. 15. Create error bars for the bottom visible segment that extend to the minimum of each dataset. Click the segment below the bottom visible segment, being careful not to select the very bottom segment by accident. Select or click Format Selection again. In the Format Error Bars window, click the Vertical Error Bars tab, click the Minus button under Direction, and click the Cap button under End Style. Under Error Amount, click Percentage:, and then set its value to 100.0% without typing the % symbol. Steps (16) through (20) are optional and deal with beautifying the box and whisker plot. 16. Delete the legend. 17. Modify the vertical axis to more tightly contain the data. 18. Set the fill of both visible box segments to solid color. 19. Set the border color of the visible box segments to solid color. 20. Set the line color of the error bars to solid color. 
Looking at the box and whisker plot, the third quartile, where the top whisker intersects with the top side of the box, is 165.

The box and whisker plots below are of data sets representing four machines' waste outputs each hour. Which has the smallest standard deviation?

Correct answer: A. Remember that the standard deviation is a measure of how spread out the data is. If the values are concentrated around the mean, then a data set has a lower standard deviation. A box and whisker plot with short whiskers and a short box has values that are less spread out, and hence has the smaller standard deviation. A smaller standard deviation also suggests less variability (more consistency) in that machine's hourly waste output.

The following data set provides information about the City of Somerville Assessors Valuation for the fiscal year 2016. Find the standard deviations for the commercial buildings' total assessed land value and total assessed parcel value, and the residential buildings' total assessed land value and total assessed parcel value. Which has the smallest standard deviation?

Correct answer: Residential Total Assessed Land Value. Its standard deviation, 97,477, is the smallest of the four, due to the narrowest range of data.

The following data values represent the daily amount spent by a family each day during a 7 day summer vacation. Find the standard deviation of this data set: $96, $125, $80, $110, $75, $100, $121. Round the final answer to one decimal place.

Correct answer: 17.7. The population standard deviation is the square root of the population variance. The population variance of this data set is 314.3, so the population standard deviation is √314.3 ≈ 17.7.

The following data values represent the daily amount spent by a family each day during a 7 day summer vacation. Find the variance of this dataset: $96, $125, $80, $110, $75, $100, $121. Round the final answer to one decimal place.

Correct answer: 314.3. The amounts spent are listed for every day of the 7 day vacation, so the data represent the entire population of vacation days, and we find the population variance. First, we find that the mean is (96 + 125 + 80 + 110 + 75 + 100 + 121)/7 = 707/7 = 101. Now, we need to take the deviations from the mean and square them:
Value | 96 | 125 | 80 | 110 | 75 | 100 | 121
Deviation | −5 | 24 | −21 | 9 | −26 | −1 | 20
Deviation² | 25 | 576 | 441 | 81 | 676 | 1 | 400
The population variance is the sum of the squared deviations, divided by the number of data values: (25 + 576 + 441 + 81 + 676 + 1 + 400)/7 = 2200/7 ≈ 314.29. So to one decimal place, the variance is 314.3.
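
For comparison, the vacation figures can be run through Python's standard statistics module. This is a minimal sketch: pvariance and pstdev divide by N (population), while variance divides by n − 1 (sample), so the last value shows what would result if the seven days were wrongly treated as a sample.

```python
import statistics

spending = [96, 125, 80, 110, 75, 100, 121]

# Population formulas: divide the sum of squared deviations by N = 7.
population_variance = statistics.pvariance(spending)
population_std_dev = statistics.pstdev(spending)

# Sample formula, for contrast only: divide by n - 1 = 6.
sample_variance = statistics.variance(spending)

print(round(population_variance, 1))
print(round(population_std_dev, 1))
print(round(sample_variance, 1))
```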

The heights, in inches, of the four members of a barbershop quartet of singers are listed below: 72, 68, 67, 73. Find the population variance for this data set. Round to one decimal place if necessary.

Correct answer: 6.5. First, we find that the mean is (72 + 68 + 67 + 73)/4 = 280/4 = 70. Now, we need to take the deviations from the mean and square them:
Value | Deviation | Deviation²
72 | 2 | 4
68 | −2 | 4
67 | −3 | 9
73 | 3 | 9
Since there are exactly 4 members of a barbershop quartet, the 4 measurements given represent the entire population of the singing group. So we calculate the population variance. The population variance is the sum of the squared deviations, divided by the number of data values (4): (4 + 4 + 9 + 9)/4 = 26/4 = 6.5.

Two swimmers, Angie and Beth, from different teams, wanted to find out who had the fastest time when compared to her team.
Swimmer | Time (sec) | Team Mean Time | Team Standard Deviation
Angie | 26.2 | 27.2 | 0.8
Beth | 27.3 | 30.1 | 1.4
Compute the z-scores for Angie and Beth.

Correct answers: Angie: −1.25; Beth: −2. Angie's team has a mean time of μ = 27.2 with a standard deviation of σ = 0.8. The z-score corresponding to Angie's swim time of x = 26.2 seconds is z = (x − μ)/σ = (26.2 − 27.2)/0.8 = −1.25. Beth's team has a mean time of μ = 30.1 with a standard deviation of σ = 1.4. The z-score corresponding to Beth's swim time of x = 27.3 seconds is z = (x − μ)/σ = (27.3 − 30.1)/1.4 = −2.

The following data set provides information of Households by Total Money Income, Race, and Hispanic Origin of Householder. Looking at the data set or chart for household income for all races in 2015, what percent of households are in the category range that contains the mean?

Correct answer: 12.1%. The mean is $79,263, which is in the category of $75,000 to $99,999. This range represents 12.1% of households of all races.

Which of the following frequency tables shows a skewed data set? Select all that apply.

Correct answers:
Value | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20
Frequency | 2 | 5 | 1 | 13 | 23 | 26 | 15 | 12
and
Value | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
Frequency | 4 | 12 | 23 | 28 | 17 | 7 | 6 | 3
Remember that data are left skewed if there is a main concentration of large values with several much smaller values. Similarly, right skewed data have a main concentration of small values with several much larger values. The first table (values 13 to 20) is left skewed because of its concentration of large values with several smaller values. The second table (values 0 to 7) is right skewed because of its concentration of small values with several larger values. The other frequency tables are more balanced and symmetrical.

A math class's mean test score is 88.4. The standard deviation is 4.0. If Kimmie scored 85.9, what is her z-score?

Correct answer: −0.625. To calculate a z-score, use the formula z = (data value − mean)/(standard deviation) = (x − μ)/σ. In this problem, z = (85.9 − 88.4)/4.0 = −0.625.

The following data set provides information about the City of Somerville Assessors Valuation for the fiscal year 2016. For residential buildings, what is the sample standard deviation of the land area? Round your answer to TWO decimal places.

Correct answer: standard deviation = 0.02. To find the sample standard deviation of the land area for residential buildings, use the following formula: s = √( Σ(xᵢ − x̄)² / (n − 1) ). The sample mean is x̄ = 0.062373738.
x | x − x̄ | (x − x̄)²
0.05502755 | −0.007346188 | 0.00005396647813134
0.03826905 | −0.024104688 | 0.00058103598357734
0.09070248 | 0.028328742 | 0.00080251762330256
0.05968779 | −0.002685948 | 0.00000721431665870
0.06818182 | 0.005808082 | 0.00003373381651872
The variance, which is equal to the square of the standard deviation, is equal to the sum of the squares of the deviations divided by one less than the sample size: s² = Σ(x − x̄)²/(n − 1) = 0.00147854/4, so s = √(0.00147854/4) ≈ 0.0192. Therefore, after rounding to two decimal places, we find that the sample standard deviation is about 0.02.
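
The calculation can be re-traced with a short Python sketch. One assumption is worth flagging: the second land-area value is taken as 0.03826905, inferred from its listed deviation of −0.024104688 from the mean 0.062373738, since the value itself appears garbled in the listing.

```python
import math

# Residential land areas; the second value is inferred from its deviation (assumption).
land_area = [0.05502755, 0.03826905, 0.09070248, 0.05968779, 0.06818182]

n = len(land_area)
mean = sum(land_area) / n

# Sum of squared deviations from the mean.
sum_squared_deviations = sum((x - mean) ** 2 for x in land_area)

# Sample variance divides by n - 1; the standard deviation is its square root.
sample_variance = sum_squared_deviations / (n - 1)
sample_std_dev = math.sqrt(sample_variance)

print(round(mean, 4))
print(round(sample_variance, 4))
print(round(sample_std_dev, 2))
```

Rounded to the requested precision, this reproduces the mean of 0.0624, the sample variance of 0.0004, and the sample standard deviation of 0.02 quoted for this data set.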

Find The Variance

In order to calculate the sample standard deviation, you must first calculate the sample variance. To calculate the sample variance, add up the squares of the deviations and then divide this sum by the quantity (n−1) where n is the number of data values in the sample. To calculate the population variance, add up the squares of the deviations and then divide this sum by the quantity N, where N is the number of data values in the population.
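As a minimal sketch of the difference between the two divisors, the Python standard library offers statistics.variance (divides by n − 1) and statistics.pvariance (divides by N). The four data values below are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import statistics

# Hypothetical data: mean is 4, squared deviations sum to 8
data = [2.0, 4.0, 4.0, 6.0]

# Sample variance: sum of squared deviations divided by (n - 1) = 3
sample_var = statistics.variance(data)

# Population variance: sum of squared deviations divided by N = 4
population_var = statistics.pvariance(data)

print(sample_var, population_var)
```

The sample variance is always the larger of the two for the same data, because it divides the same sum by a smaller number.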

A statistics professor surveys all 100 of the students in an introductory statistics lecture. The survey asks the students to estimate when they typically wake up on weekdays. The data are recorded in terms of the number of hours after midnight the students wake up. Would it be more appropriate to find a sample standard deviation or population standard deviation in this situation?

Population standard deviation The survey was given to all 100 students in the class, which is the entire population of the class. So a population standard deviation would be more appropriate.

Tara is a journalist for the newspaper at a large college, and is writing a report about the cost of textbooks in the STEM fields. She is inquiring about how much mathematics majors and chemistry majors spent on textbooks during the past semester. She randomly selects a sample of 100 students of each major to ask how much each student spent on textbooks during the past semester and records their responses. The results of the survey are shown below, where the amounts are in dollars. One student from each major spent $624 on textbooks during the semester. Based on the z-scores you calculated above, would it be more likely for a Mathematics major or a Chemistry major to spend $624 on textbooks?

Spending $624 on textbooks would be more likely for Chemistry majors, because the absolute value of the z-score for Mathematics majors is greater than for Chemistry majors. A z-score with a lower absolute value means that the data value is more likely to occur. Since the absolute value of the z-score for Mathematics majors, 1.68, is greater than the absolute value of the z-score for Chemistry majors, 0.52, spending $624 on textbooks would be more likely for Chemistry majors.

A large company has two major departments, Development and Marketing. 100 employees are randomly selected from each department, and the age of each employee, in years, is recorded in the accompanying samples. Both departments have an employee who is 22 years old. Use Excel to calculate the z-score for the data value that represents the 22-year-old employee in each department. Round your answers to two decimal places.

Correct answers: Development z-score: −1.96; Marketing z-score: −1.70 1. Enter the data for the Development department into column A and the data for the Marketing department into column B in Excel. 2. Select Data, then select Data Analysis, and then select Descriptive Statistics. 3. In the Descriptive Statistics dialog box, enter the cells containing the data sets into Input Range, make sure Columns is selected under Group By, tick the Summary statistics check box, and press OK. 4. Read the sample mean and sample standard deviation of each group from the output. The sample mean for the Development department is 40.6 with a sample standard deviation of 9.494, rounded to three decimal places. The sample mean for the Marketing department is 34.4 with a sample standard deviation of 7.291, rounded to three decimal places. 5. Use the sample mean and sample standard deviation of each data set to compute the z-score for the given values. Compute the z-score for the 22-year-old employee in the Development department, rounding to two decimal places: z ≈ (22 − 40.6) / 9.494 ≈ −1.96. Compute the z-score for the 22-year-old employee in the Marketing department, rounding to two decimal places: z ≈ (22 − 34.4) / 7.291 ≈ −1.70.

You are told that a data set has a median of 13 and a mean of 23. Which of the following is a logical conclusion?

The data are skewed to the right. Because the mean, 23, is greater than the median, 13, we expect that there are some very large values which are bringing the mean up. In other words, the data are skewed to the right. This could represent the price of the first car someone buys: many buyers can only afford vehicles in the lower price range, while a few can afford luxury models.

Two basketball teams had the following point totals in regulation through their first 20 games of the season. In their 21st game of the season, both teams scored 122 points. Treat the given data sets as samples. For which team would this have been a more unusual point total?

1. Enter the data for Team A into Column A. Enter the data for Team B into Column B. 2. Select Data, then select Data Analysis, and then select Descriptive Statistics. 3. In the Descriptive Statistics dialog box, enter the range A1:B21 into Input Range, make sure Columns is selected under Group By, tick the Labels in First Row check box, tick the Summary statistics check box, and press OK. 4. Read the sample mean and sample standard deviation from the output. 5. Compute the z-score for Team A scoring 122 points: z ≈ (122 − 90.1) / 5.590 ≈ 5.71. Compute the z-score for Team B scoring 122 points: z ≈ (122 − 93.7) / 11.859 ≈ 2.39. 6. Compare the z-scores. The absolute value of the z-score for Team A, 5.71, is greater than the absolute value of the z-score for Team B, 2.39. Therefore, it is more unusual for Team A to score 122 points than it is for Team B.

The following data set provides information of Households by Total Money Income, Race, and Hispanic Origin of Householder. Looking at the data set for household income for all races in 2015, what percent of households are in the category range that contains the median?

Correct answer: 16.7% The median is $56,516, which is in the category of $50,000 to $74,999. This range represents 16.7% of households of all races.

The following lists of data represent the vacation days used per year by employees in five separate departments. Which of the following lists of data has the smallest standard deviation?

Correct answer: 17, 19, 17, 18, 17, 16, 16, 16, 17, 20 Remember that standard deviation is a measure of how spread out the values are. The list 17, 19, 17, 18, 17, 16, 16, 16, 17, 20 has the smallest standard deviation because its values are all relatively close together. A smaller standard deviation also suggests less variability (more consistency) in this department's used vacation days per employee per year. The department could interpret this as its employees taking roughly equal numbers of vacation days each year.

Which of the following histograms shows a skewed data set? Select all that apply.

Remember that data are left skewed if there is a main concentration of large values with several much smaller values. A left-skewed histogram could represent ages at retirement: many people (a large concentration of large values) retire after 65, while fewer retire at much younger ages. Similarly, right skewed data have a main concentration of small values with several much larger values. A right-skewed histogram could represent the age at which people first ride a bike: many learn around age 5 (a large concentration of small values), while fewer learn when they are much older.

The following lists of data represent five different shipment centers' outputs. Which has the largest standard deviation?

Correct answer: 8, 17, 15, 21, 16, 6, 14, 12, 11, 6 Remember that standard deviation is a measure of how spread out the values are. The list 8, 17, 15, 21, 16, 6, 14, 12, 11, 6 has the largest standard deviation because its values are all relatively spread apart. A larger standard deviation also suggests that this shipment center needs improvements to have a more consistent release of products to be shipped.

The heights, in inches, of the members of a barbershop quartet of singers are listed below. 72,68,67,73 If the population variance for this data set is 6.5, what is the standard deviation of the heights of the barbershop quartet? Round your answer to 2 decimal places.

Correct answer: 2.55 The standard deviation is the square root of the variance. Since the population variance is 6.5, the (population) standard deviation is √6.5 ≈ 2.549. So to two decimal places, the standard deviation is 2.55.
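The relationship between the variance and the standard deviation can be verified with a brief Python sketch. It recomputes the population variance from the four heights given in the question using statistics.pvariance and then takes the square root.

```python
import math
import statistics

# Heights of the quartet, in inches, from the question
heights = [72, 68, 67, 73]

# Population variance: squared deviations divided by N
pop_var = statistics.pvariance(heights)

# Standard deviation is the square root of the variance
pop_sd = math.sqrt(pop_var)
print(pop_var, round(pop_sd, 2))
```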

Kathy and Linda both accepted new jobs at different companies. Kathy's starting salary is $31,500 and Linda's starting salary is $33,000. They are curious to know who has the better starting salary, when compared to the salary distributions of their new employers. A website that collects salary information from a sample of employees for a number of major employers reports that Kathy's company offers a mean salary of $42,000 with a standard deviation of $7,000. Linda's company offers a mean salary of $45,000 with a standard deviation of $6,000. Find the z-scores corresponding to each woman's starting salary.

Correct answers: Kathy's z-score: −1.5; Linda's z-score: −2 Whether the data are from a sample or a population, the formula for the z-score remains the same: z = (data value − mean) / standard deviation. Kathy's starting salary is $31,500, for a company with a mean salary of $42,000 and standard deviation $7,000. The corresponding z-score is z = (x − μ) / σ = (31,500 − 42,000) / 7,000 = −1.5. So Kathy's starting salary is 1.5 standard deviations below her company's mean salary. Linda's starting salary is $33,000, for a company with a mean salary of $45,000 and standard deviation $6,000. The corresponding z-score is z = (x − μ) / σ = (33,000 − 45,000) / 6,000 = −2.0. So Linda's starting salary is 2 standard deviations below her company's mean salary.

Two students, John and Ali, are from different high schools which use different scales when reporting Grade Point Averages. Each student's GPA is shown in the table, along with the mean and standard deviation of the distribution of GPAs at each school.
Student | GPA | School Mean GPA | School Standard Deviation
John | 2.85 | 3.0 | 0.7
Ali | 77 | 80 | 10
Compare their z-scores to determine which student had the highest GPA with respect to his school.

Solution Notice that GPAs for the two schools are reported on completely different scales, so comparing them directly would not be useful at all. We will compute the z-scores for John's and Ali's GPAs in order to compare their performance at their respective schools. To calculate John's z-score: x (data value) = 2.85, μ (mean) = 3.0, σ (stdev) = 0.7. Substituting these values, we find z = (x − μ) / σ = (2.85 − 3.0) / 0.7 ≈ −0.21. John's z-score is z = −0.21, so John's GPA is 0.21 standard deviations below his school's mean. To calculate Ali's z-score: x (data value) = 77, μ (mean) = 80, σ (stdev) = 10. Substituting these values, we find z = (x − μ) / σ = (77 − 80) / 10 = −0.3. Ali's z-score is z = −0.3, so Ali's GPA is 0.3 standard deviations below his school's mean. John's z-score of −0.21 is greater than Ali's z-score of −0.3. For GPAs, greater values mean higher grades, so we conclude that John has the better GPA when compared to his school.
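The comparison in this solution can be sketched in Python as follows. The helper z_score and the variable names are assumptions made for illustration; the numbers come from the question.

```python
def z_score(x, mean, stdev):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / stdev

# Each student's GPA is standardized against his own school's scale
john_z = z_score(2.85, 3.0, 0.7)
ali_z = z_score(77, 80, 10)

# For GPAs a greater z-score means a better relative standing
better = "John" if john_z > ali_z else "Ali"
print(round(john_z, 2), round(ali_z, 2), better)
```

Standardizing first is what makes the two GPAs comparable, since the raw values 2.85 and 77 are on unrelated scales.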

The following data set represents the ages of all 6 of Nancy's grandchildren. 11,8,5,6,3,9 To determine the "spread" of the data, would you employ calculations for the sample standard deviation, or population standard deviation for this data set?

Use calculations for population standard deviation To determine if sample standard deviation or population standard deviation should be used, determine if the data set represents data values collected from the entire population, or from a subset of the population. If the data values represent data collected from a subset of the population, then the sample standard deviation should be used. If the data values represent data collected from the entire population of interest, then the population standard deviation should be used. In this case, the population standard deviation should be used because the data set represents all of, that is the total population of, Nancy's grandchildren.

A manufacturing plant produces custom hardware for specific applications in construction. One particular kind of bolt that is intended to have a length of 84mm is produced by two different machines, A and B. Tristan is an employee in the department that produces this bolt. He randomly selects samples of 100 bolts from each machine and measures the length of each one. The lengths of the bolts, in millimeters, are shown in the table below. Each machine produced a bolt that has a length of 84.05mm. Based on the z-scores you calculated above, would an 84.05mm bolt more likely be produced by Machine A or Machine B?

An 84.05mm bolt would more likely be produced by Machine A, because the absolute value of the z-score for Machine A is less than for Machine B. A z-score with a lower absolute value means that the data value is more likely to occur. Since the absolute value of the z-score for Machine A, 1.74 is less than the absolute value of the z-score for Machine B, 3.57, an 84.05mm bolt would more likely be produced by Machine A.

A large company has two major departments, Development and Marketing. 100 employees are randomly selected from each department, and the age of each employee, in years, is recorded in the accompanying samples. Both departments have an employee who is 22 years old. Based on the z-scores you calculated above, would it be more likely for the Development or Marketing department to have a 22 year old employee?

Correct answer: A 22 year old employee would more likely be found in the Marketing department, because the absolute value of the z-score for Development is greater than for Marketing. A z-score with a lower absolute value means that the data value is more likely to occur. Since the absolute value of the z-score for the Development department, 1.96, is more than the absolute value of the z-score for the Marketing department, 1.70, a 22 year old employee would more likely be found in the Marketing department.

Based on the z-scores calculated above for Angie and Beth, which swimmer had the fastest time when compared to her team?

Correct answer: Beth Angie has a z-score of −1.25 and Beth's z-score is −2. Both Angie and Beth have negative z-scores, meaning they both swim in less time than their team's mean time. In terms of swim times, lower values are faster times, so Beth has the faster swim time when compared to her team.

Isabel is looking at the prices for round-trip airfare from Setauket to Orchard Park where both flights occur on Wednesday or both flights occur on Sunday. She randomly selects 20 round-trips where both flights occur on Wednesday and 20 round-trips where both flights occur on Sunday. Isabel records the prices for each round-trip airfare in dollars as shown in the samples provided. One of the round-trips on Wednesday and one of the round-trips on Sunday both cost $235. Is the $235 airfare more likely to occur on Wednesday or on Sunday? Use Excel to calculate the z-scores for $235 airfare on Wednesday and on Sunday, rounding to two decimal places.

Correct answers: Wednesday z-score: 1.84; Sunday z-score: −0.97 1. Enter the Wednesday data into column A and the Sunday data into column B in Excel. 2. Select Data, then select Data Analysis, and then select Descriptive Statistics. 3. In the Descriptive Statistics dialog box, enter the cells containing the data sets into Input Range, make sure Columns is selected under Group By, tick the Summary statistics check box, and press OK. 4. Read the sample mean and sample standard deviation of each group from the output. The sample mean for Wednesday is $192.25 with a sample standard deviation of $23.206, rounded to three decimal places. The sample mean for Sunday is $253.65 with a sample standard deviation of $19.228, rounded to three decimal places. 5. Use the sample mean and sample standard deviation of each data set to compute the z-score for the given values. Compute the z-score for a round-trip flight that occurs on Wednesday and whose airfare costs $235, rounding to two decimal places: z ≈ (235 − 192.25) / 23.206 ≈ 1.84. Compute the z-score for a round-trip flight that occurs on Sunday and whose airfare costs $235, rounding to two decimal places: z ≈ (235 − 253.65) / 19.228 ≈ −0.97. Because the absolute value of the Sunday z-score, 0.97, is less than the absolute value of the Wednesday z-score, 1.84, the $235 airfare is more likely to occur on Sunday.

A manufacturing plant produces custom hardware for specific applications in construction. One particular kind of bolt that is intended to have a length of 84mm is produced by two different machines, A and B. Tristan is an employee in the department that produces this bolt. He randomly selects samples of 100 bolts from each machine and measures the length of each one. The lengths of the bolts, in millimeters, are shown in the table below. Each machine produced a bolt that has a length of 84.05mm. Use Excel to calculate each machine's z-score for producing a bolt that has a length of 84.05mm. Round your answers to two decimal places.

Correct answers: Machine A z-score: 1.74; Machine B z-score: 3.57 1. Enter the data for Machine A into column A and the data for Machine B into column B in Excel. 2. Select Data, then select Data Analysis, and then select Descriptive Statistics. 3. In the Descriptive Statistics dialog box, enter the cells containing the data sets into Input Range, make sure Columns is selected under Group By, tick the Summary statistics check box, and press OK. 4. Read the sample mean and sample standard deviation of each group from the output. The sample mean for machine A is 84.02 with a sample standard deviation of 0.0172, rounded to four significant figures. The sample mean for machine B is 83.98 with a sample standard deviation of 0.0196, rounded to four significant figures. 5. Use the sample mean and sample standard deviation of each data set to compute the z-score for the given values. Compute the z-score for machine A producing the bolt that has a length of 84.05mm, rounding to two decimal places: z ≈ (84.05 − 84.02) / 0.0172 ≈ 1.74. Compute the z-score for machine B producing the bolt that has a length of 84.05mm, rounding to two decimal places: z ≈ (84.05 − 83.98) / 0.0196 ≈ 3.57.

Casey is looking to rent a two-bedroom apartment in one of two towns, Gardiner or Augusta. He randomly selects 100 two-bedroom apartments from both towns and records the area of each apartment. The area of each apartment, in square feet, is provided in the samples shown below. Both towns have a two-bedroom apartment that has an area of 645 square feet. Which of the two towns is more likely to have a two-bedroom apartment with 645 square feet? Use Excel to calculate the z-scores corresponding to a 645 square foot apartment in each city. Round each z-score to two decimal places.

1. Enter the Gardiner data into column A and the Augusta data into column B in Excel. 2. Select Data, then select Data Analysis, and then select Descriptive Statistics. 3. In the Descriptive Statistics dialog box, enter the cells containing the data sets into Input Range, make sure Columns is selected under Group By, tick the Summary statistics check box, and press OK. 4. Read the sample mean and sample standard deviation of each group from the output. The sample mean for the Gardiner data is 701.8 with a sample standard deviation of 39.013, rounded to three decimal places. The sample mean for the Augusta data is 680.4 with a sample standard deviation of 25.713, rounded to three decimal places. 5. Use the sample mean and sample standard deviation of each data set to compute the z-score for the given values. Compute the z-score for the apartment in Gardiner having an area of 645 square feet, rounding to two decimal places: z ≈ (645 − 701.8) / 39.013 ≈ −1.46. Compute the z-score for the apartment in Augusta having an area of 645 square feet, rounding to two decimal places: z ≈ (645 − 680.4) / 25.713 ≈ −1.38.

Use z-scores to Compare Values from Different Data Sets with Excel

1. Enter the data for each set into separate columns in Excel. For example, if there are three data sets, enter the first data set into column A, enter the second data set into column B, and enter the third data set into column C. 2. Select Data, then select Data Analysis, and then select Descriptive Statistics. 3. In the Descriptive Statistics dialog box, enter the cells containing the data sets into Input Range, make sure Columns is selected under Group By, tick the Summary statistics check box, and press OK. 4. Read the sample mean and sample standard deviation from the output. 5. Use the sample mean and sample standard deviation of each data set to compute the z-score for the given values. 6. Now compare the z-scores. The data value that has the greater absolute value of its z-score is the one that is more extreme and is, therefore, more unusual to occur.
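For readers without Excel, the same procedure can be sketched in Python. This is only an illustration under stated assumptions: the two data sets and the compared value are hypothetical, the helper name sample_z_score is introduced here, and statistics.stdev is used because it divides by n − 1, matching the sample standard deviation read from the Excel output.

```python
import statistics

def sample_z_score(value, sample):
    """z-score of value relative to a sample's mean and sample SD."""
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1)
    return (value - mean) / sd

# Hypothetical data sets, one per "column"
set_a = [10, 12, 11, 13, 14]
set_b = [5, 15, 10, 20, 0]
value = 16  # hypothetical data value to compare in both sets

z_a = sample_z_score(value, set_a)
z_b = sample_z_score(value, set_b)

# The greater absolute z-score marks the more unusual occurrence
more_unusual = "A" if abs(z_a) > abs(z_b) else "B"
print(round(z_a, 2), round(z_b, 2), more_unusual)
```

Here the value 16 is more unusual in set A, whose values are tightly clustered, than in set B, whose values are widely spread.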

Based on the z scores found above, on which college entrance exam did Sean perform better, compared to the national distributions for each test?

Correct answer: SAT Sean's SAT score is 0.8 standard deviations higher than the national mean SAT score. Sean's ACT score is 0.75 standard deviations higher than the national mean ACT score. The SAT corresponds to a greater z-score, so Sean performed better on the SAT as compared to the national mean.

A student studying statistics wants to look at data for his favorite sport, American football. He collects data on the lengths of 100 field goals from various games over several seasons. Would it be more appropriate to find a sample standard deviation or population standard deviation in this situation?

Correct answer: Sample standard deviation Since the data is from only a sample of games, and not all of the games for all of the seasons, he should calculate a sample standard deviation.

The following data set provides information about the City of Somerville Assessors Valuation for the fiscal year 2016. Which statements are true about the pattern of data for the sample standard deviations of the commercial buildings total assessed land value and total assessed parcel value, and the residential buildings total assessed land value and total assessed parcel value? Select all that apply.

Correct answer: Commercial buildings have a greater standard deviation in both categories than residential. The largest difference in standard deviation is from Residential Total Assessed Land Value to Commercial Total Assessed Parcel Value. The smallest decrease in standard deviation is from Residential Total Assessed Parcel Value to Residential Total Assessed Land Value. Commercial buildings do have a greater standard deviation in both categories than residential. The standard deviation for Commercial Total Assessed Land Value (4,361,842) is not merely about two times the standard deviation for Residential Total Assessed Land Value (97,477); it is about 45 times as large. The largest difference in standard deviation is from Residential Total Assessed Land Value to Commercial Total Assessed Parcel Value. The smallest decrease in standard deviation is from Residential Total Assessed Parcel Value to Residential Total Assessed Land Value.

Karl and Fredo are basketball players who want to find out how they compare to their team in points per game. The mean amount of points per game and standard deviations for their team were calculated. Karl's z-score is 0.9. Fredo's z-score is −0.65. Which of the following statements are true about how Karl and Fredo compare to their team in points per game? Select all that apply.

Correct answer: Karl's average points per game is 0.9 standard deviations greater than his teammates' average points per game. Fredo's average points per game is closer to the team's mean than Karl's. The z-score is the number of standard deviations a data value is from the mean of the data set. Karl's average points per game is 0.9 standard deviations greater than his team mean. Fredo's average points per game is 0.65 standard deviations less than his team mean. Fredo's distance from the mean, 0.65 standard deviations, is less than Karl's distance of 0.9, since |−0.65| < 0.9. So, Fredo's average points per game is closer to the team's mean than Karl's.

Based on the z scores found above, is Kathy or Linda's starting salary higher, when compared to the salary distributions of each company?

Correct answer: Kathy Kathy's starting salary of $31,500 has a z-score of z = −1.5, which means her salary is 1.5 standard deviations below her company's mean salary. Linda's starting salary of $33,000 has a z-score of z = −2, which means her salary is 2 standard deviations below her company's mean salary. Kathy's salary corresponds to a greater z-score, so when each salary is compared to its company's salary distribution, Kathy has the better starting salary.

A regional manager for a pie manufacturing company wants to compare the production in 3 locations. The average production per hour is below with the standard deviation: Plant A: has a mean production of 72.0 pies per hour and a standard deviation of 1.2. Plant B: has a mean production of 70.8 pies per hour and a standard deviation of 0.7. Plant C: has a mean production of 73 pies per hour and a standard deviation of 1.0. On average, which plant has a higher production? Which plant has a more consistent production?

Correct answer: Plant C has a higher production on average, and Plant B has a more consistent production. The mean is the average which tells which has the higher production, and Plant C has the greatest average. Standard deviation is a measure of how far the data values deviate from the mean, as a set. The smaller the standard deviation, the more consistent the data. So, Plant B has the most consistent production average overall, because their standard deviation is the least.

You are the owner of a marketing firm and want to retain talent. One of the benefits you are considering is pet insurance to the full-time employees. For an informed decision, you are conducting a survey on how many pets each employee has in their households. The mean number of pets is 4 per household, and the standard deviation is 2. Rob only owns cats, and he has 10 of them. Which of the following statements is true?

Correct answer: Rob's number of pets is 3 standard deviations to the right of the mean. Rob has 10 − 4 = 6 more pets than the mean, and the standard deviation is 2, so he is 6 / 2 = 3 standard deviations from the mean. Since Rob has more pets than the mean number, his value lies to the right of the mean.

Casey is looking to rent a two-bedroom apartment in one of two towns, Gardiner or Augusta. He randomly selects 100 two-bedroom apartments from both towns and records the area of each apartment. Both towns have a two-bedroom apartment that has an area of 645 square feet. Based on the z-scores you calculated above, which of the two towns is more likely to have a two-bedroom apartment with 645 square feet?

Correct answer: The absolute value of the z-score for the apartment in Augusta is less than for the apartment in Gardiner, so the apartment in Augusta having an area of 645 square feet is more likely. A z-score with a lower absolute value means that the data value is more likely to occur. Since the absolute value of the z-score for Augusta, 1.38, is less than the absolute value of the z-score for Gardiner, 1.46, the apartment that has an area of 645 square feet is more likely to occur in Augusta than in Gardiner.

Given the following histogram, decide if the data is skewed or symmetrical.

Correct answer: The data are skewed to the right. Note that the histogram has most of its values concentrated on the left, with several much larger values on the right. Therefore, the data are skewed right. This could represent the number of employees that call in sick each day in February. On most days the number is small, but around the Super Bowl (one day of the month) it is rather large.

Given the following box-and-whisker plot, decide if the data is skewed or symmetrical.

Correct answer: The data are skewed to the right. Note that the whisker on the right is much longer than the whisker on the left. So there are several larger values on the right. Therefore, the data are skewed right. This could represent houses in one neighborhood. Most are in the lower range, but a few are worth much more.

A data set of the average income of a millennial in 15 U.S. states has a mean of 40 and a median of 32. Which of the following is a logical conclusion?

Correct answer: The data is skewed to the right. Because the mean, 40, is greater than the median, 32, we expect that there are some very large values which are pulling the mean up. In other words, the data is skewed to the right.

An independent census agency polls a random sample of 30 households in a particular neighborhood to find how many people live in each household. Using Excel, calculate the mode(s) of the dataset provided below.

Correct answer: There are two modes. The modes are 2 and 4. The mean, median, and mode can be calculated quickly and easily in Excel using the built-in functions for these calculations. In C1 type "Mode", highlight cell range C2:C5, while highlighted type "=MODE.MULT(A2:A31)", and then press CTRL+SHIFT+ENTER. If there is one mode, each cell in cell range C2:C5 will display that mode's value. If there is more than one mode, each mode is displayed in one of the cells in the cell range C2:C5. If each cell in the cell range C2:C5 has a unique mode value, then there may be more modes and the previous step should be repeated with a taller cell range, for example C2:C10. If there are more cells in the range than there are modes but there is more than one mode, the remaining cells in the range will display #N/A to indicate there were no other modes found. Cell C2 should display 4 and cell C3 should display 2. Cells C4 and C5 both should display #N/A, indicating that there are two modes: 2 and 4.
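A rough Python counterpart to MODE.MULT is statistics.multimode (available in Python 3.8 and later), which returns every most-frequent value in a list. The household sizes below are hypothetical and are not the dataset from the question; the order in which the modes are returned may also differ from Excel's.

```python
import statistics

# Hypothetical household sizes; 2 and 4 each appear twice
household_sizes = [2, 4, 3, 4, 2, 1, 5]

# multimode returns all modes, in the order first encountered
modes = statistics.multimode(household_sizes)
print(modes)
```

Unlike the Excel array formula, no output range has to be sized in advance, so there is no #N/A padding to interpret.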

An event coordinator for a particular marathon held yearly is reviewing the data from the top 30 race finish times from the last race. Using Excel, calculate the mode(s) of the dataset provided below.

Correct answer: There are two modes. The modes are 2.47 and 4.14. The mean, median, and mode can be calculated quickly and easily in Excel using the built-in functions for these calculations. In C1 type "Mode", highlight cell range C2:C5, while highlighted type "=MODE.MULT(A2:A31)", and then press CTRL+SHIFT+ENTER. If there is one mode, each cell in cell range C2:C5 will display that mode's value. If there is more than one mode, each mode is displayed in one of the cells in the cell range C2:C5. If each cell in the cell range C2:C5 has a unique mode value, then there may be more modes and step 1 should be repeated with a taller cell range, for example C2:C10. If there are more cells in the range than there are modes but there is more than one mode, the remaining cells in the range will display #N/A to indicate there were no other modes found. Cell C2 should display 2.47, and cell C3 should display 4.14. Cells C4 and C5 should both display #N/A, indicating that there are two modes: 2.47 and 4.14.

The following data values represent the daily amount spent by a family each day during a 7 day summer vacation. $96,$125,$80,$110,$75,$100,$121 To determine the "spread" of the data, would you employ calculations for the sample standard deviation, or population standard deviation for this data set?

Correct answer: Use calculations for population standard deviation To determine if sample standard deviation or population standard deviation should be used, determine if the data set represents data values collected from the entire population, or from a subset of the population. If the data values represent data collected from a subset of the population, then the sample standard deviation should be used. If the data values represent data collected from the entire population of interest, then the population standard deviation should be used. In this case, the population standard deviation should be used because the data set represents the amount spent each day of the 7 day vacation.

The following data set represents the pulse rate for 5 randomly selected players on a football team. Find the standard deviation of the data set. Round your answer to one decimal place. 66, 70, 75, 80, 84

Correct answers: $s=7.3$ To find the sample standard deviation, follow these steps: 1. Find the mean of the data set: $\frac{66+70+75+80+84}{5}=\frac{375}{5}=75$. 2. Find the deviation of each data value by subtracting the mean from it: $66-75=-9$, $70-75=-5$, $75-75=0$, $80-75=5$, $84-75=9$. 3. Square each of the deviations from Step 2: $(-9)^2=81$, $(-5)^2=25$, $0^2=0$, $5^2=25$, $9^2=81$. 4. Add up the squared deviations from Step 3: $81+25+0+25+81=212$. 5. Divide the total from Step 4 by $n-1$, the number of data values minus 1: $\frac{212}{4}=53$. This result is the sample variance of the data set. 6. Take the square root of the result from Step 5. This result is the sample standard deviation of the data set: $s=\sqrt{53}\approx 7.28$. Then round your answer per the rounding instructions given in the problem, which gives $s\approx 7.3$.
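The steps in this answer translate almost line for line into code. A minimal sketch in Python, with variable names chosen here for illustration:

```python
import math

# Pulse rates from the question above.
pulse = [66, 70, 75, 80, 84]

n = len(pulse)
mean = sum(pulse) / n                        # find the mean
deviations = [x - mean for x in pulse]       # subtract the mean from each value
squared = [d ** 2 for d in deviations]       # square each deviation
total = sum(squared)                         # add up the squared deviations
sample_variance = total / (n - 1)            # divide by n - 1 (sample variance)
s = math.sqrt(sample_variance)               # square root gives s

print(mean, total, sample_variance, round(s, 1))
```

This prints `75.0 212.0 53.0 7.3`, matching the hand calculation.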

The following data set provides information about the City of Somerville Assessors Valuation for the fiscal year 2016. For commercial buildings, what is the sample standard deviation of the living area? Round your answer to the nearest hundred.

Correct answers: standard deviation = 24,800 The standard deviation is the square root of the sample variance of the living area of the five commercial buildings. The sample variance is 615,032,637. Take the square root and round to the nearest hundred: $\sqrt{615{,}032{,}637}\approx 24{,}800$.
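The final step can be checked quickly in Python. This sketch starts from the sample variance quoted in the answer, because the underlying living-area values are not reproduced here.

```python
import math

# Sample variance quoted in the answer above.
sample_variance = 615_032_637

s = math.sqrt(sample_variance)

# A negative second argument to round() rounds to tens, hundreds, and so on;
# -2 rounds to the nearest hundred.
s_rounded = round(s, -2)
print(s, s_rounded)
```

The unrounded root is just under 24,800, so rounding to the nearest hundred gives 24,800.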

A student of statistics and fan of baseball is looking over the player stats for a list she is compiling of "top thirty players nobody remembers." The data for the batting averages of these 30 players are provided below. Using Excel, calculate the mean, median, and mode of the dataset (round your answers to three decimal places as needed).

Correct answers: $\text{Mean}=0.271,\ \text{Median}=0.270,\ \text{Mode}=0.301$ The mean, median, and mode can be calculated quickly and easily in Excel using the built-in functions for these calculations. Open the accompanying dataset in Excel. The range of the data is A2:A31. In B1 type "Mean", in B2 type "=AVERAGE(", select the data or write their range, and then hit ENTER. In C1 type "Median", in C2 type "=MEDIAN(", select the data or write their range, and then hit ENTER. In D1 type "Mode", highlight cell range D2:D5, type "=MODE.MULT(", select the data or write their range, and then hit CTRL+SHIFT+ENTER. Rounding the results in B2 and C2 to three decimal places should result in a mean of 0.271 and a median of 0.270. All cells in the range given to MODE.MULT, D2:D5, should show a value of 0.301. This indicates 0.301 is the only mode.
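The same three statistics can be computed outside Excel. The following is a sketch using Python's `statistics` module on a short, made-up list of batting averages; the 30 values from the exercise are not reproduced here, so the results differ from the answer above.

```python
from statistics import mean, median, multimode

# Hypothetical batting averages, for illustration only.
averages = [0.301, 0.265, 0.301, 0.248, 0.270, 0.290]

mean_value = round(mean(averages), 3)      # counterpart of =AVERAGE(...)
median_value = round(median(averages), 3)  # counterpart of =MEDIAN(...)
modes = multimode(averages)                # counterpart of =MODE.MULT(...)

print(mean_value, median_value, modes)
```

With a single most-frequent value, `multimode` returns a one-element list, which corresponds to every cell of the Excel output range showing the same mode.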

A market researcher sampled 30 randomly selected people in an email survey asking participants how many times they had seen a particular movie in theaters. The data are provided below. Using Excel, calculate the mean, median, and mode of the dataset (round your answers to one decimal place as needed).

Correct answers: $\text{Mean}=1.0,\ \text{Median}=1.0,\ \text{Mode}=1.0$ The mean, median, and mode can be calculated quickly and easily in Excel using the built-in functions for these calculations. Open the accompanying dataset in Excel. The range of the data is A2:A31. In B1 type "Mean", in B2 type "=AVERAGE(", select the data or write their range, and then hit ENTER. In C1 type "Median", in C2 type "=MEDIAN(", select the data or write their range, and then hit ENTER. In D1 type "Mode", highlight cell range D2:D5, type "=MODE.MULT(", select the data or write their range, and then hit CTRL+SHIFT+ENTER. Rounding the results in B2 and C2 to one decimal place should result in a mean of 1.0 and a median of 1.0. All cells in the range given to MODE.MULT, D2:D5, should show a value of 1.0. This indicates 1.0 is the only mode.

A statistician wants to analyze the time spent commuting to work in the morning. To do so she records the time in minutes it takes to get to work each morning for 30 mornings. The data are provided below. Using Excel, calculate the mean, median, and mode of the dataset (round your answers to one decimal place as needed).

Correct answers: $\text{Mean}=21.9,\ \text{Median}=20.5,\ \text{Mode}=20.0$ The mean, median, and mode can be calculated quickly and easily in Excel using the built-in functions for these calculations. Open the accompanying dataset in Excel. The range of the data is A2:A31. In B1 type "Mean", in B2 type "=AVERAGE(", select the data or write their range, and then hit ENTER. In C1 type "Median", in C2 type "=MEDIAN(", select the data or write their range, and then hit ENTER. In D1 type "Mode", highlight cell range D2:D5, type "=MODE.MULT(", select the data or write their range, and then hit CTRL+SHIFT+ENTER. Rounding the results in B2 and C2 to one decimal place should result in a mean of 21.9 and a median of 20.5. All cells in the range given to MODE.MULT, D2:D5, should show a value of 20.0. This indicates 20.0 is the only mode.

A veterinary researcher is studying a particular type of dog called the Australian Cattle Dog. The researcher has acquired data on a sample of 30 dogs, including the weight in pounds of each of the dogs. The dog weight dataset is provided below. Using Excel, calculate the mean, median, and mode of the dataset (round your answers to one decimal place as needed).

Correct answers: $\text{Mean}=40.6,\ \text{Median}=40.6,\ \text{Mode}=38.5$ The mean, median, and mode can be calculated quickly and easily in Excel using the built-in functions for these calculations. Open the accompanying dataset in Excel. The range of the data is A2:A31. In B1 type "Mean", in B2 type "=AVERAGE(", select the data or write their range, and then hit ENTER. In C1 type "Median", in C2 type "=MEDIAN(", select the data or write their range, and then hit ENTER. In D1 type "Mode", highlight cell range D2:D5, type "=MODE.MULT(", select the data or write their range, and then hit CTRL+SHIFT+ENTER. Rounding the results in B2 and C2 to one decimal place should result in a mean of 40.6 and a median of 40.6. All cells in the range given to MODE.MULT, D2:D5, should show a value of 38.5. This indicates 38.5 is the only mode.

A statistics professor gives a survey to each of the 100 students in an introductory statistics lecture. The survey asks the students to estimate when they typically wake up on weekdays. The data are recorded in terms of the number of hours after midnight the students wake up. The data are included below. Use Excel to calculate the population standard deviation and the population variance. Round your answers to three decimal places. Do not round until you've calculated your final answer.

Correct answers: $\text{Standard Deviation}=0.852,\ \text{Variance}=0.726$ To determine the population standard deviation and population variance for a data set $\{x_1, x_2, \ldots, x_N\}$ using Excel, follow these steps: Open the included data with Excel. Because there are 100 responses, the data should occupy cell range A2:A101. Select cell B3 and type "=STDEV.P(", select the range A2:A101, and then hit ENTER. This gives the population standard deviation. Select cell B4 and type "=VAR.P(", select the range A2:A101, and then hit ENTER. This gives the population variance. The population standard deviation is $\sigma\approx 0.852$ and the population variance is $\sigma^2\approx 0.726$, rounding each to three decimal places.

A statistics professor gives a survey to each of the 100 students in an introductory statistics lecture. The survey asks the students how many text messages they think they sent yesterday. The data are included below. Use Excel to calculate the population standard deviation and the population variance. Round your answers to one decimal place. Do not round until you've calculated your final answer.

Correct answers: $\text{Standard Deviation}=53.1,\ \text{Variance}=2819.7$ To determine the population standard deviation and population variance for a data set $\{x_1, x_2, \ldots, x_N\}$ using Excel, follow these steps: Open the included data with Excel. The data should occupy cell range A2:A101. Select cell B3 and type "=STDEV.P(", select the range A2:A101, and then hit ENTER. This gives the population standard deviation. Select cell B4 and type "=VAR.P(", select the range A2:A101, and then hit ENTER. This gives the population variance. The population standard deviation is $\sigma\approx 53.1$ and the population variance is $\sigma^2\approx 2819.7$, rounding each to one decimal place.
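The instruction not to round until the final answer matters in this exercise. A small sketch, using only the two rounded values reported in the answer, shows that squaring the already-rounded standard deviation does not reproduce the reported variance:

```python
# Values reported in the answer above, each already rounded to one decimal place.
reported_sd = 53.1
reported_variance = 2819.7

# Shortcut that rounds too early: square the rounded standard deviation
# instead of computing the variance from the raw data.
variance_from_rounded_sd = round(reported_sd ** 2, 1)

print(variance_from_rounded_sd, reported_variance)
```

The shortcut gives 2819.6 rather than 2819.7, which is why the variance should come from VAR.P on the raw data (or from the unrounded standard deviation).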

A student studying statistics wants to look at data for his favorite sport, American football. He collects data on the lengths of 100 field goals from various games over several seasons. The data are provided below. Use Excel to calculate the sample standard deviation and the sample variance. Round your answers to one decimal place.

Correct answers: $\text{Standard Deviation}=9.1,\ \text{Variance}=82.8$ To determine the sample standard deviation and sample variance for a data set $\{x_1, x_2, \ldots, x_n\}$ using Excel, follow these steps: Open the included data with Excel. The data should occupy cell range A2:A101. Select cell B3 and type "=STDEV.S(", select the range A2:A101, and then hit ENTER. This gives the sample standard deviation. Select cell B4 and type "=VAR.S(", select the range A2:A101, and then hit ENTER. This gives the sample variance. The sample standard deviation is $s\approx 9.1$ and the sample variance is $s^2\approx 82.8$, rounding each to one decimal place.
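For comparison, Python's `statistics` module has counterparts to the four Excel functions used in these exercises. The sketch below runs them on a short, made-up list of field goal lengths (the exercise's 100 values are not reproduced here) to show how the sample and population versions differ.

```python
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical field goal lengths in yards, for illustration only.
lengths = [23, 35, 41, 47, 52, 29, 38, 44]

sample_sd = stdev(lengths)            # like STDEV.S: divides by n - 1
sample_var = variance(lengths)        # like VAR.S
population_sd = pstdev(lengths)       # like STDEV.P: divides by n
population_var = pvariance(lengths)   # like VAR.P

print(round(sample_sd, 1), round(sample_var, 1))
print(round(population_sd, 1), round(population_var, 1))
```

Both variances are built from the same sum of squared deviations, so the sample version is always the larger of the two; the gap shrinks as the number of data values grows.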

National results for the SAT test show that for college-bound seniors the average combined SAT Writing, Math and Verbal score is 1500 with a standard deviation of 250. National results for the ACT test show that for college-bound seniors the average composite ACT score is 20.4 with a standard deviation of 4.8. Sean took both the SAT and the ACT college entrance exams. His SAT score was 1700 and his ACT score was 24. He wants to know on which test he performed better. Find the z-scores for his result on each exam.

Correct answers: 0.8; 0.75 We are told that the mean national SAT score is $\mu=1500$ with a standard deviation of $\sigma=250$. Sean's SAT score of $x=1700$ corresponds to a z-score of $z=\frac{x-\mu}{\sigma}=\frac{1700-1500}{250}=0.8$. So Sean's SAT score is 0.8 standard deviations higher than the national mean. The mean national ACT score is $\mu=20.4$ with a standard deviation of $\sigma=4.8$. Sean's ACT score of $x=24$ corresponds to a z-score of $z=\frac{x-\mu}{\sigma}=\frac{24-20.4}{4.8}=0.75$. So Sean's ACT score is 0.75 standard deviations higher than the national mean.
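The z-score formula is short enough to write as a helper function. In this sketch the function name `z_score` and the final comparison are additions for illustration; the comparison assumes that "performed better" means the higher z-score.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

z_sat = z_score(1700, 1500, 250)
z_act = z_score(24, 20.4, 4.8)

# Assumption: the higher z-score marks the better result relative to
# each test's own national distribution.
better = "SAT" if z_sat > z_act else "ACT"
print(round(z_sat, 2), round(z_act, 2), better)
```

This prints `0.8 0.75 SAT`: relative to the national results, Sean's SAT score is slightly further above the mean than his ACT score.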

Tara is a journalist for the newspaper at a large college, and is writing a report about the cost of textbooks in the STEM fields. She is inquiring about how much mathematics majors and chemistry majors spent on textbooks during the past semester. She randomly selects a sample of 100 students of each major to ask how much each student spent on textbooks during the past semester and records their responses. The results of the survey are shown below, where the amounts are in dollars. One student from each department spent $624 on textbooks during the past semester. Use Excel to calculate the z-score for the data value that represents the $624 spent on textbooks for each major. Round your answers to two decimal places.

Correct answers: 1.68 (mathematics); 0.52 (chemistry) 1. Enter the mathematics data into column A and the data for chemistry into column B in Excel. 2. Select Data, then select Data Analysis, and then select Descriptive Statistics. 3. In the Descriptive Statistics dialog box, enter the cells containing the data sets into Input Range, make sure Columns is selected under Group By, tick the Summary statistics check box, and press OK. 4. Read the sample mean and sample standard deviation of each group from the output. The sample mean for the mathematics majors is $525 with a sample standard deviation of $58.764, rounded to three decimal places. The sample mean for the chemistry majors is $590.25 with a sample standard deviation of $64.581, rounded to three decimal places. 5. Use the sample mean and sample standard deviation of each data set to compute the z-score for the given values. Compute the z-score for the mathematics major who spent $624 on textbooks this past semester, rounding to two decimal places: $z\approx\frac{624-525}{58.764}\approx 1.68$. Compute the z-score for the chemistry major who spent $624 on textbooks this past semester, rounding to two decimal places: $z\approx\frac{624-590.25}{64.581}\approx 0.52$.
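The same workflow (sample mean, sample standard deviation, then z-score) can be sketched in Python. The helper name `sample_z_score` and the spending values below are hypothetical; the survey data from the exercise are not reproduced here, so the result will not match the answers above.

```python
from statistics import mean, stdev

def sample_z_score(data, x):
    """z-score of x relative to the sample mean and sample (n - 1) standard deviation of data."""
    return (x - mean(data)) / stdev(data)

# Hypothetical textbook spending in dollars, for illustration only.
spending = [480, 525, 610, 455, 624, 530, 570, 498]

z = sample_z_score(spending, 624)
print(round(z, 2))
```

`stdev` divides by n − 1, which corresponds to the sample standard deviation reported by Excel's Descriptive Statistics output.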

The midterm and final exam grades for a statistics course are provided in the data set below. Jaymes, a student in the class, scored 86 on both exams. Treat the given data sets as samples. Jaymes wants to know which grade is more unusual, the midterm grade or the final exam grade. Use Excel to calculate the z-scores corresponding to each grade. Round to three decimal places. Do not round until you've calculated your final answer.

Correct answers: 1.857 (midterm); 1.181 (final) 1. Enter the Midterm data into column A and the Final data into column B in Excel. 2. Select Data, then select Data Analysis, and then select Descriptive Statistics. 3. In the Descriptive Statistics dialog box, enter the cells containing the data sets into Input Range, make sure Columns is selected under Group By, tick the Summary statistics check box, and press OK. 4. Read the sample mean and sample standard deviation of each group from the output. The sample mean for the Midterm data is 81.05 with a sample standard deviation of 2.665, rounded to three decimal places. The sample mean for the Final data is 77.4 with a sample standard deviation of 7.279, rounded to three decimal places. 5. Use the sample mean and sample standard deviation of each data set to compute the z-score for the given values. Compute the z-score for Jaymes scoring 86 points on the midterm, rounding to three decimal places: $z\approx\frac{86-81.05}{2.665}\approx 1.857$. Compute the z-score for Jaymes scoring 86 points on the final, rounding to three decimal places: $z\approx\frac{86-77.4}{7.279}\approx 1.181$.
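To answer Jaymes's actual question, the two z-scores can be compared directly. The sketch below starts from the summary statistics quoted in the answer and assumes that "more unusual" means the larger absolute z-score; the dictionary layout is just one convenient way to hold the numbers.

```python
# Sample means and sample standard deviations quoted in the answer above.
summary = {
    "midterm": {"mean": 81.05, "sd": 2.665},
    "final": {"mean": 77.4, "sd": 7.279},
}
score = 86

z_scores = {exam: (score - s["mean"]) / s["sd"] for exam, s in summary.items()}

# Assumption: "more unusual" is read as the larger absolute z-score.
more_unusual = max(z_scores, key=lambda exam: abs(z_scores[exam]))
print({exam: round(z, 3) for exam, z in z_scores.items()}, more_unusual)
```

Under that reading, the midterm grade is the more unusual one: the same score of 86 sits about 1.857 standard deviations above the midterm mean but only about 1.181 above the final exam mean.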

