Alta- Ch. 2 Descriptive Statistics Pt. 3
A consumer report was released concerning the prices of various food products. The report listed the monthly average price of a pound of beef for 20 different months. Construct a box and whisker plot using Excel and the QUARTILE.INC function. Then, choose the correct answer below.
The five-number summary (minimum, Q1, median, Q3, maximum) of the correct box and whisker plot is as follows: 5.138, 5.26275, 5.332, 5.46425, 5.687.
The heights, in inches, of the four members of a barbershop quartet of singers are listed below: 72, 68, 67, 73. Find the population variance for this data set. Round to one decimal place if necessary.
$6.5$

First, we find the mean: $\mu=\frac{72+68+67+73}{4}=\frac{280}{4}=70$

Now, we need to take the deviations from the mean and square them:

Value   Deviation   Deviation²
72       2          4
68      −2          4
67      −3          9
73       3          9

Since there are exactly 4 members of a barbershop quartet, the 4 measurements given represent the entire population of the singing group. So we calculate the population variance. The population variance is the sum of the squared deviations, divided by the number of data values (4):

$\sigma^2=\frac{4+4+9+9}{4}=\frac{26}{4}=6.5$
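The same arithmetic can be checked outside Excel. What follows is a minimal Python sketch, not part of the original solution; the variable names are illustrative, and `statistics.pvariance` from the standard library serves only as a cross-check.

```python
import statistics

heights = [72, 68, 67, 73]  # the four quartet members, in inches

mean = sum(heights) / len(heights)                        # 280 / 4 = 70.0
squared_deviations = [(h - mean) ** 2 for h in heights]   # [4.0, 4.0, 9.0, 9.0]

# Population variance: divide by N because the four singers are the whole population.
population_variance = sum(squared_deviations) / len(heights)

print(population_variance)              # 6.5
print(statistics.pvariance(heights))    # 6.5
```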
The auditors for a health insurance company are reviewing the bills from clients' stays at hospitals from last year. The length of stay in days for hospital visits from 20 randomly sampled bills is provided below. Use Excel and the QUARTILE.INC function to construct a box and whisker plot for the dataset. What is the value of the third quartile? Round your answer to two decimal places.
$7.25$

1. Open the dataset in Excel. The dataset should occupy A1:A21, where A1 is the header and A2:A21 contains the data.

Steps (2) through (6) have you construct a table for the quartile values.

2. Use fill series or manually enter the numbers 0, 1, 2, 3, and 4 into cell range B2:B6 from top to bottom, one number per cell.
3. Write the following entries into cell range D2:D6 from top to bottom, one entry per cell: Minimum, Q1, Median, Q3, Maximum.
4. In cell C1, copy the header for the dataset from A1.
5. In cell C2, write "=QUARTILE.INC(A$2:A$21,$B2)".
6. Then copy and paste this cell to C3:C6.

Cells C2:C6 contain the five-number summary. The minimum is 2, Q1 is 5.75, the median is 6, Q3 is 7.25, and the maximum is 9. Continue on to construct the box plot.

Steps (7) through (10) have you construct a table for the quartile differences.

7. Copy and paste cell range B1:C6 into cell range B7:C12.
8. Write the following entries into cell range D8:D12 from top to bottom, one entry per cell: Minimum, Q1−Minimum, Median−Q1, Q3−Median, Maximum−Q3.
9. In cell C9, write "=C3-C2" where C3 is the corresponding cell in the quartiles table.
10. Copy and paste this new cell to the remaining three cells underneath.

Steps (11) through (15) have you construct the box plot from the quartile differences.

11. Create a stacked column chart type from the quartile differences. Select cells C7:C12. Then at the top of the Excel window, click the Insert tab, then the Charts button labeled Column, and then the Stacked Column chart type. The Stacked Column chart type is the top row and second column of the Columns chart types.
12. Switch the row and column data on the resulting chart. Click the Design tab at the top of the Excel window, and then press the button labeled Switch Row/Column.
13. Set the Fill to blank for the bottom two sections of the bar and also the top section. Click the bottom segment of one of the rectangles in the plot, click the Format tab, and below that click the Format Selection button. In the Format Data Series window that pops up, click the Fill tab, and then click the No Fill radio button.
14. Select the top visible segment, and give it error bars that stretch from the top of the visible segment to the maximum of each dataset. Select the top visible segment. Then click on the Layout tab at the top of the Excel window, click on the Error Bars button, and then click on the second option from the top labeled Error Bars With Standard Error. Then select the error bars that appear on the plot, and click Format Selection on the toolbar above. In the Format Error Bars window, click the Vertical Error Bars tab, click the Plus button under Direction, and click the Cap button under End Style. Under Error Amount, click Custom, and then press Specify Value. In the resulting pop-up, click on the Positive Error Value field and select the data in cell C12.
15. Create error bars for the bottom visible segment that extend to the minimum of each dataset. Click the segment below the bottom visible segment, being careful not to select the very bottom segment by accident. Select or click Format Selection again. In the Format Error Bars window, click the Vertical Error Bars tab, click the Minus button under Direction, and click the Cap button under End Style. Under Error Amount, click Percentage:, and then set its value to 100.0% without typing the % symbol.

Steps (16) through (20) are optional and deal with beautifying the box and whisker plot.

16. Delete the legend.
17. Modify the vertical axis to more tightly contain the data.
18. Set the fill of both visible box segments to solid color.
19. Set the border color of the visible box segments to solid color.
20. Set the line color of the error bars to solid color.

Looking at the box and whisker plot, the third quartile, where the top whisker intersects with the top side of the box, is 7.25.
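For readers who want to check the quartile table without Excel, here is a rough Python sketch of steps (2) through (10). It assumes QUARTILE.INC uses inclusive linear interpolation at the zero-based position q(n−1)/4 in the sorted data. The function name `quartile_inc` and the data values are hypothetical, since the problem's dataset is not reproduced here.

```python
def quartile_inc(values, quart):
    """Return quartile `quart` (0..4) using inclusive linear interpolation."""
    data = sorted(values)
    position = quart * (len(data) - 1) / 4   # zero-based fractional index
    lower = int(position)
    fraction = position - lower
    if lower + 1 < len(data):
        return data[lower] + fraction * (data[lower + 1] - data[lower])
    return data[lower]                        # reached when quart == 4 (maximum)

# Hypothetical data, NOT the dataset from the problem.
stays = [3, 7, 8, 5, 12, 14, 21, 13, 18]

# Counterpart of cells C2:C6: Minimum, Q1, Median, Q3, Maximum.
five_number_summary = [quartile_inc(stays, q) for q in range(5)]

# Counterpart of cells C8:C12, the stacked-column segment heights:
# Minimum, Q1 - Minimum, Median - Q1, Q3 - Median, Maximum - Q3.
segments = [five_number_summary[0]] + [
    upper - lower
    for lower, upper in zip(five_number_summary, five_number_summary[1:])
]

print(five_number_summary)
print(segments)
```

The segment heights are what the stacked column chart plots, which is why the bottom two segments and the top one are later set to No Fill.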
An automobile company partnered with its competition to compile a large sample of email contacts from which they sent questionnaires to a random sample that includes car owners who aren't necessarily their customers. This method was used to ask 20 car owners the age of their primary vehicle in years. The data from their responses are provided below. Use Excel and the QUARTILE.INC function to construct a box and whisker plot for the dataset. What is the value of the maximum for this dataset?
$8$

1. Open the dataset in Excel. The dataset should occupy A1:A21, where A1 is the header and A2:A21 contains the data.

Steps (2) through (6) have you construct a table for the quartile values.

2. Use fill series or manually enter the numbers 0, 1, 2, 3, and 4 into cell range B2:B6 from top to bottom, one number per cell.
3. Write the following entries into cell range D2:D6 from top to bottom, one entry per cell: Minimum, Q1, Median, Q3, Maximum.
4. In cell C1, copy the header for the dataset from A1.
5. In cell C2, write "=QUARTILE.INC(A$2:A$21,$B2)".
6. Then copy and paste this cell to C3:C6.

Cells C2:C6 contain the five-number summary. The minimum is 0, Q1 is 3, the median is 4, Q3 is 6, and the maximum is 8. Continue on to construct the box plot.

Steps (7) through (10) have you construct a table for the quartile differences.

7. Copy and paste cell range B1:C6 into cell range B7:C12.
8. Write the following entries into cell range D8:D12 from top to bottom, one entry per cell: Minimum, Q1−Minimum, Median−Q1, Q3−Median, Maximum−Q3.
9. In cell C9, write "=C3-C2" where C3 is the corresponding cell in the quartiles table.
10. Copy and paste this new cell to the remaining three cells underneath.

Steps (11) through (15) have you construct the box plot from the quartile differences.

11. Create a stacked column chart type from the quartile differences. Select cells C7:C12. Then at the top of the Excel window, click the Insert tab, then the Charts button labeled Column, and then the Stacked Column chart type. The Stacked Column chart type is the top row and second column of the Columns chart types.
12. Switch the row and column data on the resulting chart. Click the Design tab at the top of the Excel window, and then press the button labeled Switch Row/Column.
13. Set the Fill to blank for the bottom two sections of the bar and also the top section. Click the bottom segment of one of the rectangles in the plot, click the Format tab, and below that click the Format Selection button. In the Format Data Series window that pops up, click the Fill tab, and then click the No Fill radio button.
14. Select the top visible segment, and give it error bars that stretch from the top of the visible segment to the maximum of each dataset. Select the top visible segment. Then click on the Layout tab at the top of the Excel window, click on the Error Bars button, and then click on the second option from the top labeled Error Bars With Standard Error. Then select the error bars that appear on the plot, and click Format Selection on the toolbar above. In the Format Error Bars window, click the Vertical Error Bars tab, click the Plus button under Direction, and click the Cap button under End Style. Under Error Amount, click Custom, and then press Specify Value. In the resulting pop-up, click on the Positive Error Value field and select the data in cell C12.
15. Create error bars for the bottom visible segment that extend to the minimum of each dataset. Click the segment below the bottom visible segment, being careful not to select the very bottom segment by accident. Select or click Format Selection again. In the Format Error Bars window, click the Vertical Error Bars tab, click the Minus button under Direction, and click the Cap button under End Style. Under Error Amount, click Percentage:, and then set its value to 100.0% without typing the % symbol.

Steps (16) through (20) are optional and deal with beautifying the box and whisker plot.

16. Delete the legend.
17. Modify the vertical axis to more tightly contain the data.
18. Set the fill of both visible box segments to solid color.
19. Set the border color of the visible box segments to solid color.
20. Set the line color of the error bars to solid color.

Looking at the box and whisker plot, the maximum, which is the top side of the top whisker, is 8.
A student of statistics and fan of baseball is looking over the player stats for a list she is compiling of "top thirty players nobody remembers." The data for the batting averages of these 30 players are provided below. Using Excel, calculate the mean, median, and mode of the dataset (round your answers to three decimal places as needed).
$\text{Mean}=0.271,\ \text{Median}=0.270,\ \text{Mode}=0.301$

The mean, median, and mode can be calculated quickly and easily in Excel using the built-in functions for these calculations. Open the accompanying dataset in Excel. The range of the data is A2:A31. In B1 type "Mean", in B2 type "=AVERAGE(", select the data or write their range, and then hit ENTER. In C1 type "Median", in C2 type "=MEDIAN(", select the data or write their range, and then hit ENTER. In D1 type "Mode", highlight cell range D2:D5, type "=MODE.MULT(", select the data or write their range, and then hit CTRL+SHIFT+ENTER. Rounding the results in B2 and C2 to three decimal places should result in a mean of 0.271 and a median of 0.270. Every cell in the MODE.MULT output range D2:D5 should show a value of 0.301, which indicates that 0.301 is the only mode.
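As a cross-check outside Excel, the three statistics can be computed with Python's standard `statistics` module. This is only a sketch: the values below are hypothetical (the problem's dataset is not reproduced here), and `multimode` is used as a rough counterpart of MODE.MULT because it returns every mode instead of just one.

```python
import statistics

# Hypothetical values, NOT the dataset from the problem.
values = [20, 25, 20, 18, 30, 22]

mean = statistics.mean(values)          # counterpart of =AVERAGE(...)
median = statistics.median(values)      # counterpart of =MEDIAN(...)
modes = statistics.multimode(values)    # counterpart of =MODE.MULT(...): all modes

print(round(mean, 1), round(median, 1), modes)
```

A single-element list from `multimode` plays the same role as every cell of D2:D5 showing the same value: there is only one mode.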
A market researcher sampled 30 randomly selected people in an email survey asking participants how many times they had seen a particular movie in theaters. The data are provided below. Using Excel, calculate the mean, median, and mode of the dataset (round your answers to one decimal place as needed).
$\text{Mean}=1.0,\ \text{Median}=1.0,\ \text{Mode}=1.0$

The mean, median, and mode can be calculated quickly and easily in Excel using the built-in functions for these calculations. Open the accompanying dataset in Excel. The range of the data is A2:A31. In B1 type "Mean", in B2 type "=AVERAGE(", select the data or write their range, and then hit ENTER. In C1 type "Median", in C2 type "=MEDIAN(", select the data or write their range, and then hit ENTER. In D1 type "Mode", highlight cell range D2:D5, type "=MODE.MULT(", select the data or write their range, and then hit CTRL+SHIFT+ENTER. Rounding the results in B2 and C2 to one decimal place should result in a mean of 1.0 and a median of 1.0. Every cell in the MODE.MULT output range D2:D5 should show a value of 1.0, which indicates that 1.0 is the only mode.
A statistician wants to analyze the time spent commuting to work in the morning. To do so she records the time in minutes it takes to get to work each morning for 30 mornings. The data are provided below. Using Excel, calculate the mean, median, and mode of the dataset (round your answers to one decimal place as needed).
$\text{Mean}=21.9,\ \text{Median}=20.5,\ \text{Mode}=20.0$

The mean, median, and mode can be calculated quickly and easily in Excel using the built-in functions for these calculations. Open the accompanying dataset in Excel. The range of the data is A2:A31. In B1 type "Mean", in B2 type "=AVERAGE(", select the data or write their range, and then hit ENTER. In C1 type "Median", in C2 type "=MEDIAN(", select the data or write their range, and then hit ENTER. In D1 type "Mode", highlight cell range D2:D5, type "=MODE.MULT(", select the data or write their range, and then hit CTRL+SHIFT+ENTER. Rounding the results in B2 and C2 to one decimal place should result in a mean of 21.9 and a median of 20.5. Every cell in the MODE.MULT output range D2:D5 should show a value of 20.0, which indicates that 20.0 is the only mode.
An analytics firm is trying to determine whether a client would do well to set up an office in a particular county. The client is a temp agency, which staffs local small businesses with temporary workers. The client would benefit from a region with a lot of small businesses with a high mean employee count. The analytics firm surveyed 30 randomly selected small businesses in the county who supplied the number of full-time employees they had on their payroll. The data are provided below. Using Excel, calculate the mean, median, and mode of the dataset (round your answers to one decimal place as needed).
$\text{Mean}=34.5,\ \text{Median}=26.5,\ \text{Mode}=39.0$

The mean, median, and mode can be calculated quickly and easily in Excel using the built-in functions for these calculations. Open the accompanying dataset in Excel. The range of the data is A2:A31. In B1 type "Mean", in B2 type "=AVERAGE(", select the data or write their range, and then hit ENTER. In C1 type "Median", in C2 type "=MEDIAN(", select the data or write their range, and then hit ENTER. In D1 type "Mode", highlight cell range D2:D5, type "=MODE.MULT(", select the data or write their range, and then hit CTRL+SHIFT+ENTER. Rounding the results in B2 and C2 to one decimal place should result in a mean of 34.5 and a median of 26.5. Every cell in the MODE.MULT output range D2:D5 should show a value of 39.0, which indicates that 39.0 is the only mode.
A veterinary researcher is studying a particular type of dog called the Australian Cattle Dog. The researcher has acquired data on a sample of 30 dogs, including the weight in pounds of each of the dogs. The dog weight dataset is provided below. Using Excel, calculate the mean, median, and mode of the dataset (round your answers to one decimal place as needed).
$\text{Mean}=40.6,\ \text{Median}=40.6,\ \text{Mode}=38.5$

The mean, median, and mode can be calculated quickly and easily in Excel using the built-in functions for these calculations. Open the accompanying dataset in Excel. The range of the data is A2:A31. In B1 type "Mean", in B2 type "=AVERAGE(", select the data or write their range, and then hit ENTER. In C1 type "Median", in C2 type "=MEDIAN(", select the data or write their range, and then hit ENTER. In D1 type "Mode", highlight cell range D2:D5, type "=MODE.MULT(", select the data or write their range, and then hit CTRL+SHIFT+ENTER. Rounding the results in B2 and C2 to one decimal place should result in a mean of 40.6 and a median of 40.6. Every cell in the MODE.MULT output range D2:D5 should show a value of 38.5, which indicates that 38.5 is the only mode.
A statistics professor gives a survey to each of the 100 students in an introductory statistics lecture. The survey asks the students to estimate when they typically wake up on weekdays. The data are recorded in terms of the number of hours after midnight the students wake up. The data are included below. Use Excel to calculate the population standard deviation and the population variance. Round your answers to three decimal places. Do not round until you've calculated your final answer.
$\text{Standard Deviation}=0.852,\ \text{Variance}=0.726$

To determine the population standard deviation and population variance for a data set $\{x_1,x_2,\ldots,x_N\}$ using Excel, follow these steps: Open the included data with Excel. With 100 students and a header in A1, the data should occupy cell range A2:A101. Select cell B3 and type "=STDEV.P(", select the range A2:A101, and then hit ENTER. This gives the population standard deviation. Select cell B4 and type "=VAR.P(", select the range A2:A101, and then hit ENTER. This gives the population variance. The population standard deviation is $\sigma\approx0.852$ and the population variance is $\sigma^2\approx0.726$, rounding each to three decimal places.
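The two Excel functions have close counterparts in Python's standard library. The sketch below uses a short list of hypothetical wake-up times, since the 100 survey responses are not reproduced here; the point is only to show which call matches which Excel function and that rounding happens last.

```python
import statistics

# Hypothetical wake-up times in hours after midnight, NOT the survey data.
wake_times = [6.0, 7.5, 7.0, 8.0, 6.5]

sigma = statistics.pstdev(wake_times)             # counterpart of =STDEV.P(...)
sigma_squared = statistics.pvariance(wake_times)  # counterpart of =VAR.P(...)

# Round only at the end, as the problem instructs.
print(round(sigma, 3), round(sigma_squared, 3))
```

Squaring an already rounded standard deviation can give a slightly different variance, which is why the problem asks for rounding only at the final step.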
A statistics professor gives a survey to each of the 100 students in an introductory statistics lecture. The survey asks the students how many text messages they think they sent yesterday. The data are included below. Use Excel to calculate the population standard deviation and the population variance. Round your answers to one decimal place. Do not round until you've calculated your final answer.
$\text{Standard Deviation}=53.1,\ \text{Variance}=2819.7$

To determine the population standard deviation and population variance for a data set $\{x_1,x_2,\ldots,x_N\}$ using Excel, follow these steps: Open the included data with Excel. The data should occupy cell range A2:A101. Select cell B3 and type "=STDEV.P(", select the range A2:A101, and then hit ENTER. This gives the population standard deviation. Select cell B4 and type "=VAR.P(", select the range A2:A101, and then hit ENTER. This gives the population variance. The population standard deviation is $\sigma\approx53.1$ and the population variance is $\sigma^2\approx2819.7$, rounding each to one decimal place.
A student studying statistics wants to look at data for his favorite sport, American football. He collects data on the lengths of 100 field goals from various games over several seasons. The data are provided below. Use Excel to calculate the sample standard deviation and the sample variance. Round your answers to one decimal place.
$\text{Standard Deviation}=9.1,\ \text{Variance}=82.8$

To determine the sample standard deviation and sample variance for a data set $\{x_1,x_2,\ldots,x_n\}$ using Excel, follow these steps: Open the included data with Excel. The data should occupy cell range A2:A101. Select cell B3 and type "=STDEV.S(", select the range A2:A101, and then hit ENTER. This gives the sample standard deviation. Select cell B4 and type "=VAR.S(", select the range A2:A101, and then hit ENTER. This gives the sample variance. The sample standard deviation is $s\approx9.1$ and the sample variance is $s^2\approx82.8$, rounding each to one decimal place.
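The only difference between the sample functions (STDEV.S, VAR.S) and the population functions (STDEV.P, VAR.P) is the divisor. The following Python sketch makes that explicit on a small hypothetical list; the values are made up for illustration and are not the field goal data.

```python
import statistics

# Hypothetical field goal lengths, NOT the dataset from the problem.
lengths = [28, 45, 33, 52, 37]

n = len(lengths)
mean = sum(lengths) / n
sum_sq = sum((x - mean) ** 2 for x in lengths)   # sum of squared deviations

sample_variance = sum_sq / (n - 1)    # what =VAR.S(...) computes
population_variance = sum_sq / n      # what =VAR.P(...) computes

print(sample_variance, statistics.variance(lengths))   # both use n - 1
print(round(sample_variance ** 0.5, 1))                # sample standard deviation
```

Because the field goals are a sample of all kicks, the n − 1 divisor is the appropriate one in this problem.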
The following data set provides information about the City of Somerville Assessors Valuation for the fiscal year 2016. For residential buildings, what is the sample mean of the living area? Round your answer to ONE decimal place.
$\text{sample mean}=2945.8$

The sample mean is just all the data added together and divided by the sample size.

$2296+1620+3343+4032+3438=14729$

$\frac{14729}{5}=2{,}945.8$
Using the following set of data (the same as in the previous problem), find the sample standard deviation: 8, 6, 3, 11, 7. The sample variance of this data set is 8.5. Round the final answer to one decimal place.
$\text{sample std}=2.9$

Remember that the sample standard deviation is the square root of the sample variance. Since we just found that the sample variance is 8.5, we find that the sample standard deviation is $\sqrt{8.5}\approx2.9$.
The high temperature, in °C, on the first day of winter was recorded in a certain city every year from 1915 to 2015. The following six temperature values are a sample chosen from the data. What is the sample variance of these temperatures? 4, 12, 6, 9, 6, 11. Round the final answer to one decimal place.
$\text{sample variance}=10.0$

In this case, the sample variance should be used because the six temperature values were randomly selected from the full set of data values for the years 1915 to 2015.

First, we find the mean: $\bar{x}=\frac{4+12+6+9+6+11}{6}=\frac{48}{6}=8$

Now, we need to take the deviations from the mean and square them:

Value        4      12     6      9      6      11
Deviation    −4.0   4.0    −2.0   1.0    −2.0   3.0
Deviation²   16.0   16.0   4.0    1.0    4.0    9.0

Finally, we add up the squared deviations and divide by the number of data values minus one ($6-1=5$):

$s^2=\frac{16.0+16.0+4.0+1.0+4.0+9.0}{5}=\frac{50.0}{5}=10.0$
The following data set provides information about the City of Somerville Assessors Valuation for the fiscal year 2016. For commercial buildings, what is the sample variance of the living area? Round to the nearest whole number.
$\text{sample variance}=615032637$

The mean is 17850.2. The sample variance is the sum of the squared differences between each data value and the mean, divided by $n-1=4$.

$(318-17850.2)^2=307378036.8$
$(3630-17850.2)^2=202214088$
$(59506-17850.2)^2=1735205674$
$(3780-17850.2)^2=197970528$
$(22017-17850.2)^2=17362222.2$

$307378036.8+202214088+1735205674+197970528+17362222.2=2460130549$

$\frac{2460130549}{5-1}=\frac{2460130549}{4}=615{,}032{,}637.2$

Rounded to the nearest whole number, the sample variance in living area for commercial buildings is 615,032,637.
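With numbers this large, the squared deviations are easy to mistype, so a quick programmatic check can help. Below is a small Python sketch that repeats the steps above on the five living areas taken from the worked solution; the variable names are illustrative.

```python
import statistics

# Living areas of the five commercial buildings, read off the worked solution.
areas = [318, 3630, 59506, 3780, 22017]

mean = sum(areas) / len(areas)                          # 17850.2
squared_deviations = [(a - mean) ** 2 for a in areas]
sample_variance = sum(squared_deviations) / (len(areas) - 1)   # divide by n - 1 = 4

print(round(sample_variance))               # 615032637
print(round(statistics.variance(areas)))    # 615032637
```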
The following set of data represents the top managers' bonuses (in thousands). Find the sample variance: 10, 3, 6, 3, 8. Round your answer to ONE decimal place.
$\text{sample variance}=9.5\ \text{thousand}^2$

First, we find the mean: $\bar{x}=\frac{10+3+6+3+8}{5}=\frac{30}{5}=6$

Now, we need to take the deviations from the mean and square them:

Value   Deviation   Deviation²
10       4.0        16.0
3       −3.0         9.0
6        0.0         0.0
3       −3.0         9.0
8        2.0         4.0

Finally, we add up the squared deviations and divide by the number of data values minus one ($5-1=4$):

$s^2=\frac{16.0+9.0+0.0+9.0+4.0}{4}=\frac{38.0}{4}=9.5$

The sample variance shows how the bonuses differ from the mean bonus. The larger the variance, the more varied the bonuses. Here the bonuses vary around the mean of 6 (thousand) with a variance of 9.5 (thousand²).
The following data set provides information about the City of Somerville Assessors Valuation for the fiscal year 2016. For residential buildings, what is the sample standard deviation for the living area? Round your answer to ONE decimal place.
$\text{standard deviation}=969.5$

The mean is 2,945.8. The sample variance is the sum of the squared differences between each data value and the mean, divided by 4, which is $n-1$. Then, the sample standard deviation is the square root of the sample variance. For this sample, the sample standard deviation is 969.5.
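The solution states the result without showing the intermediate arithmetic. As a rough check, the Python sketch below recomputes it from the five residential living areas listed in the earlier sample-mean problem (2296, 1620, 3343, 4032, 3438); `statistics.stdev` uses the n − 1 divisor, matching the sample standard deviation described above.

```python
import statistics

# Living areas of the five residential buildings from the earlier sample-mean problem.
areas = [2296, 1620, 3343, 4032, 3438]

mean = statistics.mean(areas)
sample_std = statistics.stdev(areas)   # square root of the n - 1 variance

print(round(mean, 1), round(sample_std, 1))   # 2945.8 969.5
```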
The following data set provides information about the City of Somerville Assessors Valuation for the fiscal year 2016. For commercial buildings, what is the sample standard deviation of the living area? Round your answer to the nearest hundred.
$\text{standard deviation}=24{,}800$

The standard deviation is the square root of the sample variance of the living area of the five commercial buildings. The sample variance is 615,032,637. Take the square root, rounding to the nearest hundred.

$\sqrt{615{,}032{,}637}\approx24{,}800$
The following set of data represents how many times per minute a person looks at their cell phone. Find the sample standard deviation: 5, 9, 2, 10, 4. Round your answer to ONE decimal place.
$\text{standard deviation}=3.4\ \text{times}$

Remember that the sample standard deviation is the square root of the sample variance. Once we find that the sample variance is 11.5, we can find that the sample standard deviation is $\sqrt{11.5}\approx3.4$. The standard deviation shows how spread out the values are from the mean. The larger the standard deviation (the further each value is from the mean), the wider the range of values. This says that a person looks at their cell phone about 6 ± 3.4 times per minute.
Casey is looking to rent a two-bedroom apartment in one of two towns, Gardiner or Augusta. He randomly selects 100 two-bedroom apartments from both towns and records the area of each apartment. The area of each apartment, in square feet, is provided in the samples shown below. Both towns have a two-bedroom apartment that has an area of 645 square feet. Which of the two towns is more likely to have a two-bedroom apartment with 645 square feet? Use Excel to calculate the z-scores corresponding to a 645 square foot apartment in each city. Round each z-score to two decimal places.
Gardiner: $-1.46$
Augusta: $-1.38$

1. Enter the Gardiner data into column A and the Augusta data into column B in Excel.
2. Select Data, then select Data Analysis, and then select Descriptive Statistics.
3. In the Descriptive Statistics dialog box, enter the cells containing the data sets into Input Range, make sure Columns is selected under Group By, tick the Summary statistics check box, and press OK.
4. Read the sample mean and sample standard deviation of each group from the output. The sample mean for the Gardiner data is 701.8 with a sample standard deviation of 39.013, rounded to three decimal places. The sample mean for the Augusta data is 680.4 with a sample standard deviation of 25.713, rounded to three decimal places.
5. Use the sample mean and sample standard deviation of each data set to compute the z-score for the given values.

Compute the z-score for the apartment in Gardiner having an area of 645 square feet, rounding to two decimal places.

$z\approx\frac{645-701.8}{39.013}\approx-1.46$

Compute the z-score for the apartment in Augusta having an area of 645 square feet, rounding to two decimal places.

$z\approx\frac{645-680.4}{25.713}\approx-1.38$
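The five steps above reduce to "sample mean, sample standard deviation, then one subtraction and one division." Here is a minimal Python sketch of that pipeline. The helper name `z_score_from_sample` and the list of areas are hypothetical, since the Gardiner and Augusta samples are not reproduced here.

```python
import statistics

def z_score_from_sample(sample, value):
    """z-score of `value` relative to the sample mean and sample standard deviation."""
    return (value - statistics.mean(sample)) / statistics.stdev(sample)

# Hypothetical apartment areas in square feet, NOT the Gardiner or Augusta data.
areas = [600, 650, 700, 750, 800]

print(round(z_score_from_sample(areas, 645), 2))   # -0.7
```

Applied to each town's full column of data, the same function would be expected to reproduce the z-scores read from the Descriptive Statistics output.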
Kathy and Linda both accepted new jobs at different companies. Kathy's starting salary is $31,500 and Linda's starting salary is $33,000. They are curious to know who has the better starting salary, when compared to the salary distributions of their new employers. A website that collects salary information from a sample of employees for a number of major employers reports that Kathy's company offers a mean salary of $42,000 with a standard deviation of $7,000. Linda's company offers a mean salary of $45,000 with a standard deviation of $6,000. Find the z-scores corresponding to each woman's starting salary.
Kathy: $-1.5$
Linda: $-2$

Whether the data is from a sample or a population, the formula for the z-score remains the same:

$z=\frac{\text{data value}-\text{mean}}{\text{standard deviation}}$

Kathy's starting salary is $31,500, for a company with a mean salary of $42,000 and standard deviation $7,000. The corresponding z-score is

$z=\frac{x-\mu}{\sigma}=\frac{31{,}500-42{,}000}{7{,}000}=-1.5$

So Kathy's starting salary is 1.5 standard deviations below her company's mean salary.

Linda's starting salary is $33,000, for a company with a mean salary of $45,000 and standard deviation $6,000. The corresponding z-score is

$z=\frac{x-\mu}{\sigma}=\frac{33{,}000-45{,}000}{6{,}000}=-2.0$

So Linda's starting salary is 2 standard deviations below her company's mean salary.
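The z-score formula translates directly into a one-line function. The sketch below is illustrative (the name `z_score` is an assumption, not something from the course materials) and reproduces the two salary calculations.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that `value` lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev

kathy = z_score(31_500, 42_000, 7_000)   # -1.5
linda = z_score(33_000, 45_000, 6_000)   # -2.0

# The z-score nearer to zero belongs to the salary nearer to its company's mean.
print(kathy, linda)
```

Since −1.5 is greater than −2.0, Kathy's starting salary sits higher within her company's distribution than Linda's does within hers, even though Linda's dollar amount is larger.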
A large company has two major departments, Development and Marketing. 100 employees are randomly selected from each department, and the age of each employee, in years, is recorded in the accompanying samples. Both departments have an employee who is 22 years old. Use Excel to calculate the z-score for the data value that represents the 22-year-old employee in each department. Round your answers to two decimal places.
Development: $-1.96$
Marketing: $-1.70$

1. Enter the data for the Development department into column A and the data for the Marketing department into column B in Excel.
2. Select Data, then select Data Analysis, and then select Descriptive Statistics.
3. In the Descriptive Statistics dialog box, enter the cells containing the data sets into Input Range, make sure Columns is selected under Group By, tick the Summary statistics check box, and press OK.
4. Read the sample mean and sample standard deviation of each group from the output. The sample mean for the Development department is 40.6 with a sample standard deviation of 9.494, rounded to three decimal places. The sample mean for the Marketing department is 34.4 with a sample standard deviation of 7.291, rounded to three decimal places.
5. Use the sample mean and sample standard deviation of each data set to compute the z-score for the given values.

Compute the z-score for the 22-year-old employee in the Development department, rounding to two decimal places.

$z\approx\frac{22-40.6}{9.494}\approx-1.96$

Compute the z-score for the 22-year-old employee in the Marketing department, rounding to two decimal places.

$z\approx\frac{22-34.4}{7.291}\approx-1.70$
In a recent national survey, the mean price for a 2000 sq ft home in Florida is $240,000 with a standard deviation of $16,000. The mean price for the same sized home in Ohio is $170,000 with a standard deviation of $12,000. In which state would a home priced at $200,000 be closer to the mean price, compared to the distribution of prices in the state? Find the z-score corresponding to each state.
Florida: $-2.5$
Ohio: $2.5$

We are told that the mean price for a 2000 sq ft home in Florida is $240,000 with a standard deviation of $16,000. A house priced at x=200,000 dollars corresponds to a z-score of

$z=\frac{x-\mu}{\sigma}=\frac{200{,}000-240{,}000}{16{,}000}=-2.5$

A house priced at $200,000 in Florida has a z-score of z=−2.5.

The mean price for the same sized home in Ohio is $170,000 with a standard deviation of $12,000. A house priced at x=200,000 dollars corresponds to a z-score of

$z=\frac{x-\mu}{\sigma}=\frac{200{,}000-170{,}000}{12{,}000}=2.5$

A house priced at $200,000 in Ohio has a z-score of z=2.5.
National results for the SAT test show that for college-bound seniors the average combined SAT Writing, Math and Verbal score is 1500 with a standard deviation of 250. National results for the ACT test show that for college-bound seniors the average composite ACT score is 20.4 with a standard deviation of 4.8. Sean took both the SAT and the ACT college entrance exams. His SAT score was 1700 and his ACT score was 24. He wants to know on which test he performed better. Find the z-scores for his result on each exam.
SAT: $0.8$
ACT: $0.75$

We are told that the mean national SAT score is μ=1500 with a standard deviation of σ=250. Sean's SAT score of x=1700 corresponds to a z-score of

$z=\frac{x-\mu}{\sigma}=\frac{1700-1500}{250}=0.8$

So Sean's SAT score is 0.8 standard deviations higher than the national mean.

The mean national ACT score is μ=20.4 with a standard deviation of σ=4.8. Sean's ACT score of x=24 corresponds to a z-score of

$z=\frac{x-\mu}{\sigma}=\frac{24-20.4}{4.8}=0.75$

So Sean's ACT score is 0.75 standard deviations higher than the national mean.
Tara is a journalist for the newspaper at a large college, and is writing a report about the cost of textbooks in the STEM fields. She is inquiring about how much mathematics majors and chemistry majors spent on textbooks during the past semester. She randomly selects a sample of 100 students of each major to ask how much each student spent on textbooks during the past semester and records their responses. The results of the survey are shown below, where the amounts are in dollars. One student from each department spent $624 on textbooks during the past semester. Use Excel to calculate the z-score for the data value that represents the $624 spent on textbooks for each major. Round your answers to two decimal places.
Mathematics: $1.68$
Chemistry: $0.52$

1. Enter the mathematics data into column A and the data for chemistry into column B in Excel.
2. Select Data, then select Data Analysis, and then select Descriptive Statistics.
3. In the Descriptive Statistics dialog box, enter the cells containing the data sets into Input Range, make sure Columns is selected under Group By, tick the Summary statistics check box, and press OK.
4. Read the sample mean and sample standard deviation of each group from the output. The sample mean for the mathematics majors is $525 with a sample standard deviation of $58.764, rounded to three decimal places. The sample mean for the chemistry majors is $590.25 with a sample standard deviation of $64.581, rounded to three decimal places.
5. Use the sample mean and sample standard deviation of each data set to compute the z-score for the given values.

Compute the z-score for the mathematics major who spent $624 on textbooks this past semester, rounding to two decimal places.

$z\approx\frac{624-525}{58.764}\approx1.68$

Compute the z-score for the chemistry major who spent $624 on textbooks this past semester, rounding to two decimal places.

$z\approx\frac{624-590.25}{64.581}\approx0.52$
A manufacturing plant produces custom hardware for specific applications in construction. One particular kind of bolt that is intended to have a length of 84mm is produced by two different machines, A and B. Tristan is an employee in the department that produces this bolt. He randomly selects samples of 100 bolts from each machine and measures the length of each one. The lengths of the bolts, in millimeters, are shown in the table below. Each machine produced a bolt that has a length of 84.05mm. Use Excel to calculate each machine's z-score for producing a bolt that has a length of 84.05mm. Round your answers to two decimal places.
Machine A z-score: 1.74; Machine B z-score: 3.57 1. Enter the data for Machine A into column A and the data for Machine B into column B in Excel. 2. Select Data, then select Data Analysis, and then select Descriptive Statistics. 3. In the Descriptive Statistics dialog box, enter the cells containing the data sets into Input Range, make sure Columns is selected under Group By, tick the Summary statistics check box, and press OK. 4. Read the sample mean and sample standard deviation of each group from the output. The sample mean for machine A is 84.02 with a sample standard deviation of 0.0172, rounded to four significant figures. The sample mean for machine B is 83.98 with a sample standard deviation of 0.0196, rounded to four significant figures. 5. Use the sample mean and sample standard deviation of each data set to compute the z-score for the given values. Compute the z-score for machine A producing the bolt that has a length of 84.05mm, rounding to two decimal places: z ≈ (84.05 − 84.02)/0.0172 ≈ 1.74. Compute the z-score for machine B producing the bolt that has a length of 84.05mm, rounding to two decimal places: z ≈ (84.05 − 83.98)/0.0196 ≈ 3.57.
The midterm and final exam grades for a statistics course are provided in the data set below. Jaymes, a student in the class, scored 86 on both exams. Treat the given data sets as samples. Jaymes wants to know which grade is more unusual, the midterm grade or the final exam grade. Use Excel to calculate the z-scores corresponding to each grade. Round to three decimal places. Do not round until you've calculated your final answer.
Midterm z-score: 1.857; Final z-score: 1.181 1. Enter the Midterm data into column A and the Final data into column B in Excel. 2. Select Data, then select Data Analysis, and then select Descriptive Statistics. 3. In the Descriptive Statistics dialog box, enter the cells containing the data sets into Input Range, make sure Columns is selected under Group By, tick the Summary statistics check box, and press OK. 4. Read the sample mean and sample standard deviation of each group from the output. The sample mean for the Midterm data is 81.05 with a sample standard deviation of 2.665, rounded to three decimal places. The sample mean for the Final data is 77.4 with a sample standard deviation of 7.279, rounded to three decimal places. 5. Use the sample mean and sample standard deviation of each data set to compute the z-score for the given values. Compute the z-score for Jaymes scoring 86 points on the midterm, rounding to three decimal places: z ≈ (86 − 81.05)/2.665 ≈ 1.857. Compute the z-score for Jaymes scoring 86 points on the final, rounding to three decimal places: z ≈ (86 − 77.4)/7.279 ≈ 1.181.
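Steps 1 through 5 can also be mirrored outside Excel. The sketch below assumes the raw scores are available as a Python list; the three scores shown are hypothetical placeholders, because the course data set is not reproduced in this card. `statistics.stdev` divides by n − 1, which corresponds to the sample standard deviation used in the steps above.

```python
from statistics import mean, stdev

def sample_z_score(value, data):
    """z-score of value relative to the sample mean and sample standard deviation of data."""
    return (value - mean(data)) / stdev(data)

# Hypothetical placeholder scores, NOT the course data set.
scores = [70, 80, 90]

z = sample_z_score(86, scores)
print(round(z, 3))
```

Rounding only in the final `print` follows the card's instruction not to round until the final answer has been calculated.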
The following data values represent the daily amount spent by a family each day during a 7 day summer vacation. Find the variance of this dataset: $96,$125,$80,$110,$75,$100,$121 Round the final answer to one decimal place.
314.3 In this case, the population variance should be used because the data set represents the amount spent on each day of the 7 day vacation. First, we find that the mean is (96 + 125 + 80 + 110 + 75 + 100 + 121)/7 = 707/7 = 101. Now, we need to take the deviations from the mean and square them. Value: 96, 125, 80, 110, 75, 100, 121. Deviation: −5, 24, −21, 9, −26, −1, 20. Deviation²: 25, 576, 441, 81, 676, 1, 400. The amounts spent are listed for every day of the vacation. Since we have data for the total population of vacation days, we will find the population variance. The population variance is the sum of the squared deviations, divided by the number of data values: (25 + 576 + 441 + 81 + 676 + 1 + 400)/7 = 2200/7 ≈ 314.29. So to one decimal place, the variance is 314.3.
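The hand calculation can be reproduced in Python as a check. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from statistics import mean, pvariance

# Daily spending over the 7 day vacation, as listed in the question.
spending = [96, 125, 80, 110, 75, 100, 121]

mu = mean(spending)                                   # 707 / 7
squared_devs = [(x - mu) ** 2 for x in spending]      # squared deviations from the mean
pop_var = sum(squared_devs) / len(spending)           # divide by N, not N - 1

print(mu, sum(squared_devs), round(pop_var, 1))
print(round(pvariance(spending), 1))                  # library cross-check
```

Dividing by `len(spending)` rather than `len(spending) - 1` is what makes this the population variance.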
Isabel is looking at the prices for round-trip airfare from Setauket to Orchard Park where both flights occur on Wednesday or both flights occur on Sunday. She randomly selects 20 round-trips where both flights occur on Wednesday and 20 round-trips where both flights occur on Sunday. Isabel records the prices for each round-trip airfare in dollars as shown in the samples provided. One of the round-trips on Wednesday and one of the round-trips on Sunday both cost $235. Is the $235 airfare more likely to occur on Wednesday or on Sunday? Use Excel to calculate the z-scores for the $235 airfare on Wednesday and on Sunday, rounding to two decimal places.
1. Enter the Wednesday data into column A and the Sunday data into column B in Excel. 2. Select Data, then select Data Analysis, and then select Descriptive Statistics. 3. In the Descriptive Statistics dialog box, enter the cells containing the data sets into Input Range, make sure Columns is selected under Group By, tick the Summary statistics check box, and press OK. 4. Read the sample mean and sample standard deviation of each group from the output. The sample mean for Wednesday is $192.25 with a sample standard deviation of $23.206, rounded to three decimal places. The sample mean for Sunday is $253.65 with a sample standard deviation of $19.228, rounded to three decimal places. 5. Use the sample mean and sample standard deviation of each data set to compute the z-score for the given values. Compute the z-score for a round-trip flight that occurs on Wednesday and whose airfare costs $235, rounding to two decimal places: z ≈ (235 − 192.25)/23.206 ≈ 1.84. Compute the z-score for a round-trip flight that occurs on Sunday and whose airfare costs $235, rounding to two decimal places: z ≈ (235 − 253.65)/19.228 ≈ −0.97. Since the absolute value of the z-score for Sunday, 0.97, is less than the absolute value of the z-score for Wednesday, 1.84, the $235 airfare is more likely to occur on Sunday.
The Bureau of Labor Statistics compiles and makes publicly available data from a range of different sectors of the economy. One number it reports is a weighted average of the costs of certain goods, called the Consumer Price Index (CPI). The CPI is related to the price but is not a dollar amount. The rise in prices is used as a metric of inflation. To make the metric more reliable, a chained CPI was created. The regular CPI does not update the types of goods it averages often enough to reflect the market trends, like a switch from apples to oranges resulting from a rise in apple prices. The chained CPI is a measure of price that takes this into account, and there are several bills that are tied to the chained CPI index in order to determine the payout the bill allots each year. The monthly average chained CPI for urban apparel for all urban consumers for 20 consecutive months is provided below. The data are not seasonally adjusted. Construct a box and whisker plot using Excel and the QUARTILE.INC function to choose the correct plot below.
1. Open the dataset in Excel. The dataset should occupy A1:A21, where A1 is the header and A2:A21 contains the data. Steps (2) through (6) have you construct a table for the quartile values. 2. Use fill series or manually enter the numbers 0, 1, 2, 3, and 4 into cell range B2:B6 from top to bottom, one number per cell. 3. Write the following entries into cell range D2:D6 from top to bottom, one entry per cell: Minimum, Q1, Median, Q3, Maximum. 4. In cell C1, copy the header for the dataset from A1. 5. In cell C2, write "=QUARTILE.INC(A$2:A$21,$B2)". 6. Then copy and paste this cell to C3:C6. Cells C2:C6 contain the five-number summary. The minimum is 89.02, Q1 is 90.1975, the median is 91.66, Q3 is 94.825, and the maximum is 99.15. Continue on to construct the box plot. Steps (7) through (10) have you construct a table for the quartile differences. 7. Copy and paste cell range B1:C6 into cell range B7:C12. 8. Write the following entries into cell range D8:D12 from top to bottom, one entry per cell: Minimum, Q1−Minimum, Median−Q1, Q3−Median, Maximum−Q3. 9. In cell C9, write "=C3-C2", where C3 and C2 are the corresponding cells in the quartiles table. 10. Copy and paste this new cell to the remaining three cells underneath. Steps (11) through (15) have you construct the box plot from the quartile differences. 11. Create a stacked column chart type from the quartile differences. Select cells C7:C12. Then at the top of the Excel window, click the Insert tab, then the Charts button labeled Column, and then the Stacked Column chart type. The Stacked Column chart type is the top row and second column of the Columns chart types. 12. Switch the row and column data on the resulting chart. Click the Design tab at the top of the Excel Window, and then press the button labeled Switch Row/Column. 13. Set the Fill to blank for the bottom two sections of the bar and also the top section. 
Click the bottom segment of one of the rectangles in the plot, click the Format tab, and below that click the Format Selection button. In the Format Data Series window that pops up, click the Fill tab and then click the No Fill radio button. 14. Select the top visible segment, and give it error bars that stretch from the top of the visible segment to the maximum of each dataset. Select the top visible segment. Then click on the Layout tab at the top of the Excel window, click on the Error Bars button, and then click on the second option from the top labeled Error Bars With Standard Error. Then select the error bars that appear on the plot, and click Format Selection on the toolbar above. In the Format Error Bars window, click the Vertical Error Bars tab, click the Plus button under Direction, and click the Cap button under End Style. Under Error Amount, click Custom, and then press Specify Value. In the resulting pop-up, click on the Positive Error Value field and select the data in cell C12. 15. Create error bars for the bottom visible segment that extend to the minimum of each dataset. Click the segment below the bottom visible segment, being careful not to select the very bottom segment by accident. Select or click Format Selection again. In the Format Error Bars window, click the Vertical Error Bars tab, click the Minus button under Direction, and click the Cap button under End Style. Under Error Amount, click Percentage:, and then set its value to 100.0% without typing the % symbol. Steps (16) through (20) are optional and deal with beautifying the box and whisker plot. 16. Delete the legend. 17. Modify the vertical axis to more tightly contain the data. For this dataset, the vertical axis was changed to run from 88 to 100 with major tick marks in increments of 1 and minor tick marks in increments of 0.25. 18. Set the fill of both visible box segments to solid color. 19. Set the border color of the visible box segments to solid color. 20. Set the line color of the error bars to solid color. The completed box plot should look something like the following image. The five-number summary of the correct box and whisker plot is as follows: 89.02, 90.1975, 91.66, 94.825, 99.15.
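The two tables built in steps (2) through (10) can be sketched in Python. This assumes that `statistics.quantiles` with `method="inclusive"` applies the same interpolation rule as Excel's QUARTILE.INC, which is how the inclusive method is usually described; the four-value list is a hypothetical placeholder, since the CPI data set is not reproduced here. The helper names are illustrative.

```python
from statistics import quantiles

def five_number_summary(data):
    """Minimum, Q1, median, Q3, maximum using inclusive quartiles."""
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    return [min(data), q1, med, q3, max(data)]

def stacked_segments(summary):
    """Minimum followed by Q1-Minimum, Median-Q1, Q3-Median, Maximum-Q3."""
    return [summary[0]] + [b - a for a, b in zip(summary, summary[1:])]

# Hypothetical placeholder values, NOT the CPI data set.
sample = [1, 2, 3, 4]

summary = five_number_summary(sample)
segments = stacked_segments(summary)
print(summary)
print(segments)
```

The `segments` list corresponds to the quartile-differences table (cells C8:C12), which is what the stacked column chart actually plots.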
The following lists of data represent the vacation days used per year by employees in five separate departments. Which of the following lists of data has the smallest standard deviation?
17, 19, 17, 18, 17, 16, 16, 16, 17, 20 Remember that standard deviation is a measure of how spread out the values are. The list 17, 19, 17, 18, 17, 16, 16, 16, 17, 20 has the smallest standard deviation because its values are all relatively close together. A smaller standard deviation also suggests less variability (more consistency) in this department's used vacation days per employee per year. The department could interpret this as each employee taking roughly the same number of vacation days each year.
The following lists of data represent the time that employees in five separate departments spent in meetings during a week. Which of the following lists of data has the largest standard deviation?
24, 15, 21, 23, 9, 22, 12, 21, 20, 13 Remember that standard deviation is a measure of how spread out the values are. The list 24, 15, 21, 23, 9, 22, 12, 21, 20, 13 has the largest standard deviation because its values are all relatively spread apart. The department could interpret this as inconsistent (highly variable) weekly meeting times across its employees.
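To see how "close together" and "spread apart" translate into numbers, the sketch below computes the sample standard deviation of the tightly clustered vacation-days list from the earlier card and of the meeting-time list from this one. The two lists come from different questions and are paired here only for illustration; the variable names are assumptions.

```python
from statistics import stdev

# Answer lists from the vacation-days card and the meeting-time card.
vacation_days = [17, 19, 17, 18, 17, 16, 16, 16, 17, 20]
meeting_time = [24, 15, 21, 23, 9, 22, 12, 21, 20, 13]

sd_vacation = stdev(vacation_days)
sd_meetings = stdev(meeting_time)

print(round(sd_vacation, 2), round(sd_meetings, 2))
```

The clustered list has a standard deviation well under 2, while the spread-out list has one above 5.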
A large company has two major departments, Development and Marketing. 100 employees are randomly selected from each department, and the age of each employee, in years, is recorded in the accompanying samples. Both departments have an employee who is 22 years old. Based on the z-scores you calculated above, would it be more likely for the Development or Marketing department to have a 22 year old employee?
A 22 year old employee would more likely be found in the Marketing department, because the absolute value of the z-score for Development is greater than for Marketing. A z-score with a lower absolute value means that the data value is more likely to occur. Since the absolute value of the z-score for the Development department, 1.96, is more than the absolute value of the z-score for the Marketing department, 1.70, a 22 year old employee would more likely be found in the Marketing department.
A regional manager for a pie manufacturing company wants to compare the production in 3 locations. The average production per hour is below with the standard deviation: Plant A: has a mean production of 72.0 pies per hour and a standard deviation of 1.2. Plant B: has a mean production of 70.8 pies per hour and a standard deviation of 0.7. Plant C: has a mean production of 73 pies per hour and a standard deviation of 1.0. On average, which plant has a higher production? Which plant has a more consistent production?
Plant C has a higher production on average, and Plant B has a more consistent production. The mean is the average, so it tells us which plant has the higher production; Plant C has the greatest mean (73 pies per hour). Standard deviation is a measure of how far the data values deviate from the mean, as a set. The smaller the standard deviation, the more consistent the data. So Plant B has the most consistent production, because its standard deviation (0.7) is the smallest.
Which of the following histograms shows a skewed data set? Select all that apply.
Remember that data are left skewed if there is a main concentration of large values with several much smaller values. A left-skewed histogram could represent ages at retirement: most people retire at around 65 or later (a large concentration of large values), while a few retire much younger. Similarly, right skewed data have a main concentration of small values with several much larger values. A right-skewed histogram could represent the age at which people first ride a bike: most learn at around age 5 (a large concentration of small values), while a few learn when they are much older.
Which of the following frequency tables shows a skewed data set? Select all that apply.
Remember that data are left skewed if there is a main concentration of large values with several much smaller values. Similarly, right skewed data have a main concentration of small values with several much larger values. We can see that the following is left skewed because of the concentration of large values with several smaller values (Value: Frequency): 13: 2, 14: 5, 15: 1, 16: 13, 17: 23, 18: 26, 19: 15, 20: 12. And the following is right skewed because of its concentration of small values with several larger values (Value: Frequency): 0: 4, 1: 12, 2: 23, 3: 28, 4: 17, 5: 7, 6: 6, 7: 3. The other frequency tables are more balanced and symmetrical.
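One rough numerical check for skew is to compare the mean with the median: a mean below the median hints at left skew, and a mean above it hints at right skew. The sketch below applies that rule of thumb to the two frequency tables in this answer, with the value and frequency pairs as read from the flattened tables. It is a heuristic, not a formal skewness test, and the helper name `expand` is illustrative.

```python
from statistics import mean, median

def expand(freq_table):
    """Turn a {value: frequency} table into a flat list of observations."""
    return [value for value, count in freq_table.items() for _ in range(count)]

# Frequency tables as read from the answer above.
left_table = {13: 2, 14: 5, 15: 1, 16: 13, 17: 23, 18: 26, 19: 15, 20: 12}
right_table = {0: 4, 1: 12, 2: 23, 3: 28, 4: 17, 5: 7, 6: 6, 7: 3}

left_data = expand(left_table)
right_data = expand(right_table)

left_mean, left_median = mean(left_data), median(left_data)
right_mean, right_median = mean(right_data), median(right_data)

print(round(left_mean, 2), left_median)    # mean below median: tail on the left
print(round(right_mean, 2), right_median)  # mean above median: tail on the right
```

In the second table the mean exceeds the median only slightly, so the shape of the table itself remains the clearer evidence of skew.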
The histograms below are of data sets representing four different stocks. Which has the smallest standard deviation?
Remember that the standard deviation is a measure of how spread out the data is. If the values are concentrated around the mean, then a data set has a lower standard deviation. A histogram with a narrow range of values and high frequencies near the center has a lower standard deviation than a histogram which has a shorter hill and a wider range of values. A smaller standard deviation also suggests less risk in that particular stock.
The following data set provides information about the City of Somerville Assessors Valuation for the fiscal year 2016. Find the standard deviations for the commercial buildings total assessed land value and total assessed parcel value, and the residential buildings total assessed land value and total assessed parcel value. Which has the smallest standard deviation?
Residential Total Assessed Land Value Residential Total Assessed Land Value has the smallest standard deviation of the four, due to the narrowest range of data. Its standard deviation is 97,477.
You are the owner of a marketing firm and want to retain talent. One of the benefits you are considering is pet insurance to the full-time employees. For an informed decision, you are conducting a survey on how many pets each employee has in their households. The mean number of pets is 4 per household, and the standard deviation is 2. Rob only owns cats, and he has 10 of them. Which of the following statements is true?
Rob's number of pets is 3 standard deviations to the right of the mean. Rob has 10 pets, which is 6 more than the mean of 4. Since the standard deviation is 2 and 6 = 2(3), he is exactly 3 standard deviations from the mean. Because Rob has more pets than the mean number, his value lies to the right of the mean.
A student studying statistics wants to look at data for his favorite sport, American football. He collects data on the lengths of 100 field goals from various games over several seasons. Would it be more appropriate to find a sample standard deviation or population standard deviation in this situation?
Sample standard deviation Since the data is from only a sample of games, and not all of the games for all of the seasons, he should calculate a sample standard deviation.
Tara is a journalist for the newspaper at a large college, and is writing a report about the cost of textbooks in the STEM fields. She is inquiring about how much mathematics majors and chemistry majors spent on textbooks during the past semester. She randomly selects a sample of 100 students of each major to ask how much each student spent on textbooks during the past semester and records their responses. The results of the survey are shown below, where the amounts are in dollars. One student from each department spent $624 on textbooks during the past semester. Based on the z-scores you calculated above, would it be more likely for a Mathematics major or a Chemistry major to spend $624 on textbooks?
Spending $624 on textbooks would be more likely for Chemistry majors, because the absolute value of the z-score for Mathematics majors is greater than for Chemistry majors. A z-score with a lower absolute value means that the data value is more likely to occur. Since the absolute value of the z-score for Mathematics majors, 1.68, is greater than the absolute value of the z-score for Chemistry majors, 0.52, spending $624 on textbooks would be more likely for Chemistry majors.
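The comparison rule used here (the smaller absolute z-score marks the more typical value) can be written as a one-line selection. This is a small sketch using the two z-scores calculated for the textbook data; the dictionary layout is an illustrative choice.

```python
# z-scores for a $624 textbook bill, from the earlier textbook card.
z_scores = {"Mathematics": 1.68, "Chemistry": 0.52}

# The group whose z-score is closest to zero is where $624 is most typical.
more_likely = min(z_scores, key=lambda major: abs(z_scores[major]))
more_unusual = max(z_scores, key=lambda major: abs(z_scores[major]))

print(more_likely, more_unusual)
```

Taking the absolute value matters when one of the z-scores is negative, as in the airfare and home-price cards.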
The midterm and final exam grades for a statistics course are provided in the data set below. Jaymes, a student in the class, scored 86 on both exams. Treat the given data sets as samples. Based on the z-scores calculated above, which of Jaymes's grades is more unusual, the midterm grade or the final exam grade?
The absolute value of the z-score for the midterm exam grade is greater than for the final grade, so the midterm grade is more unusual. A z-score with a greater absolute value means that the data value is more unusual. Since the absolute value of the z-score for the midterm grade, 1.86, is greater than the absolute value of the z-score for the final grade, 1.18, Jaymes' midterm grade is more unusual.
Given the following box-and-whisker plot, decide if the data is skewed or symmetrical.
The data are skewed to the left. Note that the whisker on the left is much longer than the whisker on the right. So there are several much smaller values on the left. Therefore, the data are skewed left. This could represent a researcher studying the age at which people first try tobacco. Because there is a minimum legal age, most people first try tobacco at or after that age, while a few try it much younger.
You are told that a data set has a median of 13 and a mean of 23. Which of the following is a logical conclusion?
The data are skewed to the right. Because the mean, 23, is greater than the median, 13, we expect that there are some very large values which are bringing the mean up. In other words, the data are skewed to the right. This could represent the price of the first car someone buys. While many buyers can only afford vehicles in the lower price range, some can afford luxury vehicles.
The data have a long tail to the left-hand side of the chart.
False The data have a long tail to the right-hand side of the chart, which again proves that the data is skewed to the right. There is a main concentration of small values with several much larger values.
The following data set provides information of Households by Total Money Income, Race, and Hispanic Origin of Householder. Looking at the data set or chart for household income for all races in 2015, what percent of households are in the category range that contains the mean?
12.1% The mean is $79,263, which is in the category of $75,000 to $99,999. This range contains 12.1% of households of all races.
Based on the z scores found above, in which city would a home priced at 200,000 be closer to the mean price, compared to the distribution of prices in the city?
Neither Previously, we found that a house priced at $200,000 in Florida has a z-score of z = −2.5 and a house priced at $200,000 in Ohio has a z-score of z = 2.5. Although one is positive and one negative, both of these z-scores are the same number (2.5) of standard deviations from the mean for the two cities, so for neither city is the value closer to the mean.
A statistics professor gives a survey to each of the 100 students in an introductory statistics lecture. The survey asks the students how many text messages they think they sent yesterday. Would it be more appropriate to find a sample standard deviation or population standard deviation in this situation?
Population standard deviation The professor surveys all 100 students in the class - that is, the entire population of the class. So a population standard deviation would be more appropriate.
Based on the z scores found above, on which college entrance exam did Sean perform better, compared to the national distributions for each test?
SAT Sean's SAT score is 0.8 standard deviations higher than the national mean SAT score. Sean's ACT score is 0.75 standard deviations higher than the national mean ACT score. The SAT corresponds to a greater z-score, so Sean performed better on the SAT as compared to the national mean.
Given the following histogram, decide if the data is skewed or symmetrical.
The data are skewed to the right. Note that the histogram has most of its values concentrated on the left, with several much larger values on the right. Therefore, the data are skewed right.
The following data values represent the daily amount spent by a family each day during a 7 day summer vacation. $96,$125,$80,$110,$75,$100,$121 To determine the "spread" of the data, would you employ calculations for the sample standard deviation, or population standard deviation for this data set?
Use calculations for population standard deviation To determine if sample standard deviation or population standard deviation should be used, determine if the data set represents data values collected from the entire population, or from a subset of the population. If the data values represent data collected from a subset of the population, then the sample standard deviation should be used. If the data values represent data collected from the entire population of interest, then the population standard deviation should be used. In this case, the population standard deviation should be used because the data set represents the amount spent each day of the 7 day vacation.
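The practical difference between the two choices is the divisor: N for a population and N − 1 for a sample. The sketch below computes both for the vacation data so the effect is visible; only the population figure is the one the card calls for.

```python
from statistics import pstdev, stdev

# Daily spending for every day of the 7 day vacation (the whole population).
spending = [96, 125, 80, 110, 75, 100, 121]

population_sd = pstdev(spending)  # divides the squared deviations by N = 7
sample_sd = stdev(spending)       # divides by N - 1 = 6; shown for contrast only

print(round(population_sd, 2), round(sample_sd, 2))
```

The sample formula always gives the larger value, and the gap shrinks as the number of data values grows.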
The following lists of data represent five different shipment centers' outputs. Which has the largest standard deviation?
8, 17, 15, 21, 16, 6, 14, 12, 11, 6 Remember that standard deviation is a measure of how spread out the values are. The list 8, 17, 15, 21, 16, 6, 14, 12, 11, 6 has the largest standard deviation because its values are all relatively spread apart. A larger standard deviation also suggests that this shipment center needs improvements to have a more consistent release of products to be shipped.
The following data set provides information about the City of Somerville Assessors Valuation for the fiscal year 2016. Which statements are true about the pattern of data for the sample standard deviations of the commercial buildings total assessed land value and total assessed parcel value, and the residential buildings total assessed land value and total assessed parcel value? Select all that apply.
Commercial buildings have a greater standard deviation in both categories than residential. The largest difference in standard deviation is from Residential Total Assessed Land Value to Commercial Total Assessed Parcel Value. The smallest decrease in standard deviation is from Residential Total Assessed Parcel Value to Residential Total Assessed Land Value. Commercial buildings do have a greater standard deviation in both categories than residential. The standard deviation for Commercial Total Assessed Land Value (4,361,842) is not just two times the standard deviation for Residential Total Assessed Land Value (97,477); it is about 45 times as large. The largest difference in standard deviation is from Residential Total Assessed Land Value to Commercial Total Assessed Parcel Value. The smallest decrease in standard deviation is from Residential Total Assessed Parcel Value to Residential Total Assessed Land Value.
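The "about 45 times" figure is a simple ratio of the two land-value standard deviations quoted in this answer, which can be confirmed directly. The variable names are illustrative.

```python
# Standard deviations quoted in the answer above.
commercial_land_sd = 4_361_842
residential_land_sd = 97_477

ratio = commercial_land_sd / residential_land_sd
print(round(ratio, 2))
```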
An insurance company is looking over accident statistics. They want to estimate the standard deviation and variance of the age of the driver involved in an automobile accident. They record the age of the driver for a random selection of 100 accidents from around the country. Would it be more appropriate to find a sample standard deviation or population standard deviation in this situation?
Sample standard deviation Since the data includes the ages of drivers involved in only 100 accidents, and not all of the accidents ever, they are using a sample from a larger population. So a sample standard deviation would be more appropriate.
Given the following histogram, decide if the data is skewed or symmetrical.
The data are skewed to the right. Note that the histogram has most of its values concentrated on the left, with several much larger values on the right. Therefore, the data are skewed right. This could represent the number of employees who call in sick each day in February. On most days the number is small, but around the Super Bowl (one day of the month) it is rather large.
A large, multi-company construction workers union is gathering data on the number of workplace injuries that occurred last year. It gathers the number of injuries from 20 randomly selected companies among the hundreds of construction companies at which its members work. The data are provided below. Use Excel and the QUARTILE.INC function to construct a box and whisker plot for the dataset. What is the value of the median? Round your answer to one decimal place.
15.5 1. Open the dataset in Excel. The dataset should occupy A1:A21, where A1 is the header and A2:A21 contains the data. Steps (2) through (6) have you construct a table for the quartile values. 2. Use fill series or manually enter the numbers 0, 1, 2, 3, and 4 into cell range B2:B6 from top to bottom, one number per cell. 3. Write the following entries into cell range D2:D6 from top to bottom, one entry per cell: Minimum, Q1, Median, Q3, Maximum. 4. In cell C1, copy the header for the dataset from A1. 5. In cell C2, write "=QUARTILE.INC(A$2:A$21,$B2)". 6. Then copy and paste this cell to C3:C6. Cells C2:C6 contain the five-number summary. The minimum is 0, Q1 is 10.75, the median is 15.5, Q3 is 28.5, and the maximum is 33. Continue on to construct the box plot. Steps (7) through (10) have you construct a table for the quartile differences. 7. Copy and paste cell range B1:C6 into cell range B7:C12. 8. Write the following entries into cell range D8:D12 from top to bottom, one entry per cell: Minimum, Q1−Minimum, Median−Q1, Q3−Median, Maximum−Q3. 9. In cell C9, write "=C3-C2", where C3 and C2 are the corresponding cells in the quartiles table. 10. Copy and paste this new cell to the remaining three cells underneath. Steps (11) through (15) have you construct the box plot from the quartile differences. 11. Create a stacked column chart type from the quartile differences. Select cells C7:C12. Then at the top of the Excel window, click the Insert tab, then the Charts button labeled Column, and then the Stacked Column chart type. The Stacked Column chart type is the top row and second column of the Columns chart types. 12. Switch the row and column data on the resulting chart. Click the Design tab at the top of the Excel Window, and then press the button labeled Switch Row/Column. 13. Set the Fill to blank for the bottom two sections of the bar and also the top section. 
Click the bottom segment of one of the rectangles in the plot, click the Format tab, and below that click the Format Selection button. In the Format Data Series window that pops up, click the Fill tab, and then click the No Fill radio button. 14. Select the top visible segment and give it error bars that stretch from the top of the visible segment to the maximum of each dataset. Select the top visible segment. Then click on the Layout tab at the top of the Excel window, click on the Error Bars button, and then click on the second option from the top labeled Error Bars With Standard Error. Then select the error bars that appear on the plot, and click Format Selection on the toolbar above. In the Format Error Bars window, click the Vertical Error Bars tab, click the Plus button under Direction, and click the Cap button under End Style. Under Error Amount, click Custom, and then press Specify Value. In the resulting pop-up, click on the Positive Error Value field and select the data in cell C12. 15. Create error bars for the bottom visible segment that extend to the minimum of each dataset. Click the segment below the bottom visible segment, being careful not to select the very bottom segment by accident. Select or click Format Selection again. In the Format Error Bars window, click the Vertical Error Bars tab, click the Minus button under Direction, and click the Cap button under End Style. Under Error Amount, click Percentage:, and then set its value to 100.0% without typing the % symbol. Steps (16) through (20) are optional and deal with beautifying the box and whisker plot. 16. Delete the legend. 17. Modify the vertical axis to more tightly contain the data. 18. Set the fill of both visible box segments to solid color. 19. Set the border color of the visible box segments to solid color. 20. Set the line color of the error bars to solid color. Looking at the box and whisker plot, the median, which is the dividing line inside the box, is 15.5.
The box and whisker plots below are of data sets representing four machines' waste outputs each hour. Which has the smallest standard deviation?
A Remember that the standard deviation is a measure of how spread out the data is. If the values are concentrated around the mean, then a data set has a lower standard deviation. A box and whisker plot with short whiskers and a short box has values that are less spread out, and hence has the smaller standard deviation. A smaller standard deviation also suggests less variability (more consistency) in the waste output for that machine per hour.
A data set of the average income of a millennial in 15 U.S. states has a mean of 40 and a median of 32. Which of the following is a logical conclusion?
The data is skewed to the right. Because the mean, 40, is greater than the median, 32, we expect that there are some very large values which are pulling the mean up. In other words, the data is skewed to the right.
The following data set provides information about the City of Somerville Assessors Valuation for the fiscal year 2016. Is there a direct correlation between the commercial living area's standard deviation and number of offices in a building?
There is not enough information to determine direct correlation. There is not enough information to determine direct correlation between the commercial living area's standard deviation and number of offices in a building.
A math class's mean test score is 88.4. The standard deviation is 4.0. If Kimmie scored 85.9, what is her z-score?
−0.625 To calculate a z-score, use the formula $z=\frac{\text{data value}-\text{mean}}{\text{standard deviation}}=\frac{x-\mu}{\sigma}$. In this problem, $z=\frac{85.9-88.4}{4.0}=-0.625$.
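The same arithmetic can be checked outside Excel. The following is a minimal Python sketch; the function name `z_score` is an illustrative choice, not something from the problem set.

```python
def z_score(x, mean, stdev):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / stdev

# Kimmie's score of 85.9 against a class mean of 88.4 and standard deviation of 4.0
kimmie_z = z_score(85.9, 88.4, 4.0)
print(round(kimmie_z, 3))
```

A negative result means the score is below the class mean.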
Of the 459,000 people living in Miami, 68% indicated on a recent, city-wide survey that they were employed. For a random sample of size 490 people, what is the standard deviation for the sampling distribution of the sample proportions, rounded to three decimal places?
0.021 Given the population proportion p=68%=0.68 and a sample size of n=490, the standard deviation of the sampling distribution of sample proportions is $\sigma_{\hat{p}}=\sqrt{\frac{p(1-p)}{n}}=\sqrt{\frac{0.68(1-0.68)}{490}}\approx 0.021$.
A company sells classes on its speed-reading technique, which it advertises to customers through a free, online survey. The results of 20 of these tests are included below. Use Excel and the QUARTILE.INC function to construct a box and whisker plot for the dataset. What is the value of the first quartile?
$222$222 1. Open the dataset in Excel. The dataset should occupy A1:A21, where A1 is the header and A2:A21 contains the data. Steps (2) through (6) have you construct a table for the quartile values. 2. Use fill series or manually enter the numbers 0, 1, 2, 3, and 4 into cell range B2:B6 from top to bottom, one number per cell. 3. Write the following entries into cell range D2:D6 from top to bottom, one entry per cell: Minimum, Q1, Median, Q3, Maximum. 4. In cell C1, copy the header for the dataset from A1. 5. In cell C2, write "=QUARTILE.INC(A$2:A$21,$B2)." 6. Then copy and paste this cell to C3:C6. Cells C2:C6 contain the five-number summary. The minimum is 191, Q1 is 222, the median is 239.5, Q3 is 257, and the maximum is 302. Continue on to construct the box plot. Steps (7) through (10) have you construct a table for the quartile differences. 7. Copy and paste cell range B1:C6 into cell range B7:C12. 8. Write the following entries into cell range D8:D12 from top to bottom, one entry per cell: Minimum, Q1−Minimum, Median−Q1, Q3−Median, Maximum−Q3. 9. In cell C9, write "=C3-C2" where C3 is the corresponding cell in the quartiles table. 10. Copy and paste this new cell to the remaining three cells underneath. Steps (11) through (15) have you construct the box plot from the quartile differences. 11. Create a stacked column chart type from the quartile differences. Select cells C7:C12. Then at the top of the Excel window, click the Insert tab, then the Charts button labeled Column, and then the Stacked Column chart type. The Stacked Column chart type is the top row and second column of the Columns chart types. 12. Switch the row and column data on the resulting chart. Click the Design tab at the top of the Excel Window, and then press the button labeled Switch Row/Column. 13. Set the Fill to blank for the bottom two sections of the bar and also the top section. 
Click the bottom segment of one of the rectangles in the plot, click the Format tab, and below that click the Format Selection button. In the Format Data Series window that pops up, click the Fill tab, and then click the No Fill radio button. 14. Select the top visible segment, and give it error bars that stretch from the top of the visible segment to the maximum of each dataset. Select the top visible segment. Then click on the Layout tab at the top of the Excel window, click on the Error Bars button, and then click on the second option from the top labeled Error Bars With Standard Error. Then select the error bars that appear on the plot, and click Format Selection on the tool bar above. In the Format Error Bars window, click the Vertical Error Bars tab, click the Plus button under Direction, and click the Cap button under End Style. Under Error Amount, click Custom, and then press Specify Value. In the resulting pop-up, click on the Positive Error Value field and select the data in cell C12. 15. Create error bars for the bottom visible segment that extend to the minimum of each dataset. Click the segment below the bottom visible segment, being careful not to select the very bottom segment by accident. Select or click Format Selection again. In the Format Error Bars window, click the Vertical Error Bars tab, click the Minus button under Direction, and click the Cap button under End Style. Under Error Amount, click Percentage:, and then set its value to 100.0% without typing the % symbol. Steps (16) through (20) are optional and deal with beautifying the box and whisker plot. 16. Delete the legend. 17. Modify the vertical axis to more tightly contain the data. 18. Set the fill of both visible box segments to solid color. 19. Set the border color of the visible box segments to solid color. 20. Set the line color of the error bars to solid color. 
Looking at the box and whisker plot, the first quartile, which is where the bottom whisker intersects with the bottom side of the box, is 222.
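As a cross-check on the spreadsheet, the five-number summary can also be computed in Python. This is a sketch under assumptions: the list `scores` is hypothetical data (the speed-reading dataset is not reproduced here), and `statistics.quantiles` with `method="inclusive"` is used because it interpolates at positions i(n−1)/4 in the sorted data, which is the same rule QUARTILE.INC applies.

```python
import statistics

# Hypothetical data for illustration only, not the dataset from the question.
scores = [1, 2, 3, 4, 5, 6, 7, 8]

# method="inclusive" treats the data as including its own minimum and maximum,
# mirroring the interpolation used by Excel's QUARTILE.INC.
q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")

five_number_summary = (min(scores), q1, median, q3, max(scores))
print(five_number_summary)
```

For these eight values the quartiles fall between data points, so Q1 and Q3 are interpolated rather than read directly from the list.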
Last year, 56% of investors followed the buy low, sell high strategy. A financial analyst is conducting research on various investing strategies and plans to sample investors from their company. For what sample size, n, will the sampling distribution of sample proportions have a standard deviation of 0.03?
274 investors We are given a population proportion of p=0.56 and want to know the sample size n for which the sampling distribution has a standard deviation of $\sigma_{\hat{p}}=0.03$. Substituting the known values into the formula for the standard error, $\sigma_{\hat{p}}=\sqrt{\frac{p(1-p)}{n}}$, gives $0.03=\sqrt{\frac{0.56(1-0.56)}{n}}$. Squaring both sides and solving for n yields $0.0009\,n=0.56(1-0.56)$, so $n=\frac{0.56(1-0.56)}{0.0009}\approx 273.777$. Since the sample size is always a whole number, round up to the desired sample size of 274 investors.
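The rearranged formula lends itself to a short check. Below is a minimal Python sketch; the helper name `sample_size_for_se` is hypothetical.

```python
import math

def sample_size_for_se(p, target_se):
    """Smallest whole n with sqrt(p * (1 - p) / n) <= target_se."""
    # Solve target_se**2 = p * (1 - p) / n for n, then round up.
    return math.ceil(p * (1 - p) / target_se ** 2)

n_investors = sample_size_for_se(0.56, 0.03)
print(n_investors)
```

Rounding up rather than to the nearest integer guarantees the standard deviation does not exceed the target.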
An organization that monitors the average annual salaries of different professions collects data on the salaries of 20 randomly selected secretaries. The annual salary, in U.S. dollars, of the secretaries in the sample is given below. Use Excel and the QUARTILE.INC function to construct a box and whisker plot for the dataset. What is the value of the median?
$46240$46240 1. Open the dataset in Excel. The dataset should occupy A1:A21, where A1 is the header and A2:A21 contains the data. Steps (2) through (6) have you construct a table for the quartile values. 2. Use fill series or manually enter the numbers 0, 1, 2, 3, and 4 into cell range B2:B6 from top to bottom, one number per cell. 3. Write the following entries into cell range D2:D6 from top to bottom, one entry per cell: Minimum, Q1, Median, Q3, Maximum. 4. In cell C1, copy the header for the dataset from A1. 5. In cell C2, write "=QUARTILE.INC(A$2:A$21,$B2)." 6. Then copy and paste this cell to C3:C6. Cells C2:C6 contain the five-number summary. The minimum is 26,940, Q1 is 28,655, the median is 46,240, Q3 is 48,480, and the maximum is 59,460. Continue on to construct the box plot. Steps (7) through (10) have you construct a table for the quartile differences. 7. Copy and paste cell range B1:C6 into cell range B7:C12. 8. Write the following entries into cell range D8:D12 from top to bottom, one entry per cell: Minimum, Q1−Minimum, Median−Q1, Q3−Median, Maximum−Q3. 9. In cell C9, write "=C3-C2" where C3 is the corresponding cell in the quartiles table. 10. Copy and paste this new cell to the remaining three cells underneath. Steps (11) through (15) have you construct the box plot from the quartile differences. 11. Create a stacked column chart type from the quartile differences. Select cells C7:C12. Then at the top of the Excel window, click the Insert tab, then the Charts button labeled Column, and then the Stacked Column chart type. The Stacked Column chart type is the top row and second column of the Columns chart types. 12. Switch the row and column data on the resulting chart. Click the Design tab at the top of the Excel Window, and then press the button labeled Switch Row/Column. 13. Set the Fill to blank for the bottom two sections of the bar and also the top section. 
Click the bottom segment of one of the rectangles in the plot, click the Format tab, and below that click the Format Selection button. In the Format Data Series window that pops up, click the Fill tab, and then click the No Fill radio button. 14. Select the top visible segment, and give it error bars that stretch from the top of the visible segment to the maximum of each dataset. Select the top visible segment. Then click on the Layout tab at the top of the Excel window, click on the Error Bars button, and then click on the second option from the top labeled Error Bars With Standard Error. Then select the error bars that appear on the plot, and click Format Selection on the toolbar above. In the Format Error Bars window, click the Vertical Error Bars tab, click the Plus button under Direction, and click the Cap button under End Style. Under Error Amount, click Custom, and then press Specify Value. In the resulting pop-up, click on the Positive Error Value field and select the data in cell C12. 15. Create error bars for the bottom visible segment that extend to the minimum of each dataset. Click the segment below the bottom visible segment, being careful not to select the very bottom segment by accident. Select or click Format Selection again. In the Format Error Bars window, click the Vertical Error Bars tab, click the Minus button under Direction, and click the Cap button under End Style. Under Error Amount, click Percentage:, and then set its value to 100.0% without typing the % symbol. Steps (16) through (20) are optional and deal with beautifying the box and whisker plot. 16. Delete the legend. 17. Modify the vertical axis to more tightly contain the data. 18. Set the fill of both visible box segments to solid color. 19. Set the border color of the visible box segments to solid color. 20. Set the line color of the error bars to solid color. Looking at the box and whisker plot, the median, which is the dividing line inside the box, is 46,240.
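Steps (7) through (10) build the stacked-column heights from differences between consecutive quartile values. A small Python sketch of that table, using the five-number summary stated for the salary data (the variable names are illustrative):

```python
# Five-number summary for the salary data: min, Q1, median, Q3, max
summary = [26940, 28655, 46240, 48480, 59460]

# The first stacked segment is the minimum itself; each later segment is the
# gap between one summary value and the previous one (Q1-Min, Median-Q1, ...).
segments = [summary[0]] + [upper - lower for lower, upper in zip(summary, summary[1:])]
print(segments)

# Stacking every segment reaches the maximum, which is why the chart lines up.
assert sum(segments) == summary[-1]
```

The two middle gaps (Median−Q1 and Q3−Median) are the visible box; the first and last gaps are the ones replaced by whiskers.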
An insurance company is looking over accident statistics. They want to model the population of all drivers for the population parameter of the standard deviation and variance of the age of the driver involved in an automobile accident. They have a sample of 100 accidents from around the country. The age of the driver in each of these accidents is provided below. Use Excel to calculate the sample standard deviation and the sample variance. Round your answers to one decimal place. Do not round until you've calculated your final answer.
Standard Deviation = 19.8, Variance = 390.1 To determine the sample standard deviation and sample variance for a data set $\{x_1,x_2,\ldots,x_n\}$ using Excel, follow these steps: Open the included data with Excel. The data should occupy cell range A2:A101. Select cell B3 and type "=STDEV.S(", select the range A2:A101, and then hit ENTER. This gives the sample standard deviation. Select cell B4 and type "=VAR.S(", select the range A2:A101, and then hit ENTER. This gives the sample variance. The sample standard deviation is $s\approx 19.8$ and the sample variance is $s^2\approx 390.1$, rounding each to one decimal place.
A nutrition expert is publishing a study on the amount of sodium present in fast food. A sample of 20 different burgers from various fast food restaurant chains had their sodium content recorded. The data are included below. Use Excel to calculate the sample standard deviation and the sample variance. Round your answers to the nearest whole number.
Standard Deviation = 197, Variance = 38677 To determine the sample standard deviation and sample variance for a data set $\{x_1,x_2,\ldots,x_n\}$ using Excel, follow these steps: Open the included data with Excel. The data should occupy cell range A2:A21. Select cell B3 and type "=STDEV.S(", select the range A2:A21, and then hit ENTER. This gives the sample standard deviation. Select cell B4 and type "=VAR.S(", select the range A2:A21, and then hit ENTER. This gives the sample variance. The sample standard deviation is $s\approx 197$ and the sample variance is $s^2\approx 38{,}677$, rounding each to the nearest whole number.
A statistics professor asks each of the students in an introductory statistics lecture to fill out a survey. There are 100 students in the course, and each one filled out the survey. One of the questions asked students to state their age. The data are included below. Use Excel to calculate the population standard deviation and the population variance. Round your answers to one decimal place. Do not round until you've calculated your final answer.
Standard Deviation = 3.4, Variance = 11.8 To determine the population standard deviation and population variance for a data set $\{x_1,x_2,\ldots,x_N\}$ using Excel, follow these steps: Open the included data with Excel. The data should occupy cell range A1:A101. Select cell B3 and type "=STDEV.P(", select the range A1:A101, and then hit ENTER. This gives the population standard deviation. Select cell B4 and type "=VAR.P(", select the range A1:A101, and then hit ENTER. This gives the population variance. The population standard deviation is $\sigma\approx 3.4$ and the population variance is $\sigma^2\approx 11.7$, rounding each to one decimal place.
To fairly divide their patients on a hospital ward, oncology nurses are going to examine their typical sample proportions. They find that, of 780 patients admitted to a hospital at any given time, there are 130 patients on the oncology ward. In a sample of 70 patients, use Excel to find the proportion of patients that separates the lowest 25% from the upper 75%. Round your answer to three decimal places.
(1) 0.045 (2) 0.136 1. We can use the Central Limit Theorem here since $np=70\cdot 0.167\approx 12$ and $n(1-p)=70(1-0.167)\approx 58$ are both greater than 10. 2. We know that $p=\frac{130}{780}\approx 0.16667$, so $\mu_{\hat{p}}=0.167$ by the Central Limit Theorem. We can also find $\sigma_{\hat{p}}=\sqrt{\frac{0.167(1-0.167)}{70}}\approx 0.045$. We want to know the point in the sampling distribution for which 25% of sample proportions fall to the left and the remaining 75% fall to the right. Remember that =NORM.INV( works with left-tail probabilities, so we use 0.25 for 25%. 3. Open Excel and click on any cell. We can either set up the spreadsheet with the information given, or we can continue with the =NORM.INV function on its own. 4. Type =NORM.INV(. Then enter the probability, 0.25, the mean (the population proportion), 0.167, and the standard_dev, 0.045. Hit ENTER. We find the answer is 0.136. This means there is a 25% chance that the sample proportion will be at or below 0.136.
The percentage of married couples who own a single family home is 33% for a given population. A financial analyst is interested in the impact that home mortgages have on a couple's finances. As the financial analyst sets up a study, they are curious about the impact of sample size. What is the standard error of the sampling distribution of sample proportions for samples of size n=200, n=300, and n=400? Round all answers to the nearest thousandths if applicable.
(1) 0.033 (2) 0.027 (3) 0.024 Remember that $\sigma_{\hat{p}}=\sqrt{\frac{p(1-p)}{n}}$. With a population proportion p=0.33, this means: for n=200, $\sigma_{\hat{p}}=\sqrt{\frac{0.33(1-0.33)}{200}}\approx 0.033$; for n=300, $\sigma_{\hat{p}}=\sqrt{\frac{0.33(1-0.33)}{300}}\approx 0.027$; for n=400, $\sigma_{\hat{p}}=\sqrt{\frac{0.33(1-0.33)}{400}}\approx 0.024$. Notice that, as the sample size n increases, the standard deviation of the sampling distribution $\sigma_{\hat{p}}$ decreases. This means that as n increases, the sample proportion better approximates the population proportion.
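To see the effect of sample size at a glance, the three standard errors can be computed in a loop. This is a minimal Python sketch; `standard_error` is an illustrative helper name.

```python
from math import sqrt

def standard_error(p, n):
    """Standard deviation of the sampling distribution of sample proportions."""
    return sqrt(p * (1 - p) / n)

p = 0.33
errors = {n: round(standard_error(p, n), 3) for n in (200, 300, 400)}
for n, se in errors.items():
    print(n, se)
```

Because n sits under the square root, doubling the sample size from 200 to 400 shrinks the standard error by a factor of about 1.41, not 2.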
A college student recently purchased a multiplayer video game. Before playing, the college student looked up statistics on how fast gamers were completing the full playthrough of the game. Of the 500 players that have completed the game, 331 of them completed the full game in less than 49 minutes. Use the graph below to determine the probability that in a sample of 45 games, 30 or fewer of them were completed in less than 49 minutes. Drag and move the blue dot to select the appropriate probability graph area from the four options on the left. (Note - there are four graphs available to choose from. Only select between less than, greater than, and area between graphs.); Use the Central Limit Theorem to find $\hat{p}$ and $\sigma_{\hat{p}}$; Calculate the z-score for $\hat{p}$ and move the slider along the x-axis to the appropriate z-score; The purple area under the curve represents the probability of the event occurring. Interpret the purple area under the curve. Round your answers to two decimal places.
(1) $0.66\pm0.01$ (2) $0.67\pm0.01$ (3) $0.07\pm0.01$ (4) $0.14\pm0.01$ (5) $0.56\pm0.01$ 1. We will select the less than graph option, with the purple area in the left tail of the curve, because we want to find the probability that 30 or fewer of the games were completed in less than 49 minutes. We know the number of playthroughs in the total population, so we know the population proportion: $p=\frac{331}{500}\approx 0.66$. Next, we will check that the Central Limit Theorem for Sample Proportions applies. n=45 and p=0.66, so $n(1-p)=45(1-0.66)=15.3$ and $np=29.7$. These are greater than 10, so we may proceed and use the standard curve graph above to find our probability. 2. In order to calculate the z-score, we need to identify $\mu_{\hat{p}}$ and calculate $\hat{p}$ and $\sigma_{\hat{p}}$. The Central Limit Theorem for sample proportions tells us that $\hat{p}$ is normally distributed with mean $\mu_{\hat{p}}=p$ and standard deviation $\sigma_{\hat{p}}=\sqrt{\frac{p(1-p)}{n}}$. We have $\mu_{\hat{p}}=p=0.66$. The standard deviation of the sampling distribution of sample proportions is thus $\sigma_{\hat{p}}=\sqrt{\frac{0.66(1-0.66)}{45}}\approx 0.07$. For the given sample of 45 playthroughs, the sample proportion is $\hat{p}=\frac{30}{45}\approx 0.67$. Then, the corresponding z-score is $z=\frac{\hat{p}-\mu_{\hat{p}}}{\sigma_{\hat{p}}}=\frac{0.67-0.66}{0.07}\approx 0.14$. 3. Slide the black dot to the z-score we found above: 0.14. The area shown is 0.5557. Rounding to two decimal places, we have 0.56. That is, the probability that 30 or fewer out of 45 playthroughs will be completed in less than 49 minutes is 0.56.
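The left-tail area can be reproduced without the interactive graph. The following Python sketch assumes the same two-decimal rounding at each intermediate step as the worked solution, and uses the standard library's `statistics.NormalDist` for the standard normal curve.

```python
from math import sqrt
from statistics import NormalDist

n = 45
p = round(331 / 500, 2)                # population proportion, 0.66
se = round(sqrt(p * (1 - p) / n), 2)   # standard error, 0.07
p_hat = round(30 / n, 2)               # sample proportion, 0.67
z = round((p_hat - p) / se, 2)         # z-score, 0.14

# Area under the standard normal curve to the left of z
probability = NormalDist().cdf(z)
print(z, round(probability, 4))
```

Skipping the intermediate rounding would give a slightly different z-score; the rounding is kept here only to match the worked answer.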
A statistics professor asks the students in an introductory statistics lecture to fill out a survey. There are 100 students in the course, and each one filled out the survey. One of the questions asked students to state their age. Would it be more appropriate to find a sample standard deviation or population standard deviation in this situation?
Population standard deviation There are 100 students in the class, and the professor has data for all of them - that is, the entire population of students in the class. So a population standard deviation would be more appropriate.
Dalton plays on his high school's baseball team and his friend Caleb plays on the high school's lacrosse team. Since both teams have several of the school's most prolific scholars, they wondered which team has a greater cumulative GPA on average. They randomly selected 20 baseball players and 20 lacrosse players and found each player's cumulative GPA with the help of a guidance counselor. The results of the survey are shown in the accompanying samples. A player on each team has a 3.92 cumulative GPA. Use Excel to calculate the z-score for the data value that represents the player with the 3.92 GPA for each team. Round your answers to two decimal places.
The z-score for the player on the baseball team is 1.44 and the z-score for the player on the lacrosse team is 1.17. 1. Enter the data for the baseball team into column A and the data for the lacrosse team into column B in Excel. 2. Select Data, then select Data Analysis, and then select Descriptive Statistics. 3. In the Descriptive Statistics dialog box, enter the cells containing the data sets into Input Range, make sure Columns is selected under Group By, tick the Summary statistics check box, and press OK. 4. Read the sample mean and sample standard deviation of each group from the output. The sample mean for the baseball team is 3.285 with a sample standard deviation of 0.441, rounded to three decimal places. The sample mean for the lacrosse team is 3.409 with a sample standard deviation of 0.438, rounded to three decimal places. 5. Use the sample mean and sample standard deviation of each data set to compute the z-score for the given values. Compute the z-score for the player on the baseball team who has a 3.92 GPA, rounding to two decimal places: $z\approx\frac{3.92-3.285}{0.441}\approx 1.44$. Compute the z-score for the player on the lacrosse team who has a 3.92 GPA, rounding to two decimal places: $z\approx\frac{3.92-3.409}{0.438}\approx 1.17$.
The following data set provides information about the City of Somerville Assessors Valuation for the fiscal year 2016. For residential buildings, what is the sample standard deviation of the land area?
standard deviation = 0.02 To find the sample standard deviation of the land area for residential buildings, use the following formula: $s=\sqrt{\frac{\sum_i (x_i-\bar{x})^2}{n-1}}$. The sample mean is $\bar{x}=0.062373738$. The values $x$, deviations $x-\bar{x}$, and squared deviations $(x-\bar{x})^2$ are: x = 0.05502755, deviation −0.007346188, squared deviation 0.00005396647813134; x = 0.03826905, deviation −0.024104688, squared deviation 0.00058103598357734; x = 0.09070248, deviation 0.028328742, squared deviation 0.00080251762330256; x = 0.05968779, deviation −0.002685948, squared deviation 0.00000721431665870; x = 0.06818182, deviation 0.005808082, squared deviation 0.00003373381651872. The variance, which is equal to the square of the standard deviation, is equal to the sum of the squares of the deviations divided by one less than the sample size: $s^2=\frac{\sum (x-\bar{x})^2}{n-1}=\frac{0.00147854}{4}$, so $s=\sqrt{\frac{0.00147854}{4}}\approx 0.0192$. Therefore, after rounding to two decimal places, we find that the sample standard deviation is about 0.02.
The following data values represent the daily amount spent by a family each day during a 7 day summer vacation. Find the standard deviation of this data set: $96,$125,$80,$110,$75,$100,$121 Round the final answer to one decimal place.
17.7 The population standard deviation is the square root of the population variance. Since we just found the population variance to be 314.3, the population standard deviation is $\sqrt{314.3}\approx 17.7$.
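The variance and standard deviation for the vacation data can be confirmed with Python's standard library. This is a sketch that treats the seven days as the entire population, as the solution does, so it uses `pvariance` and `pstdev` rather than their sample counterparts.

```python
import statistics

# Daily amounts spent (in dollars) over the 7-day vacation
daily_spending = [96, 125, 80, 110, 75, 100, 121]

population_variance = statistics.pvariance(daily_spending)  # divides by N
population_stdev = statistics.pstdev(daily_spending)        # square root of the above

print(round(population_variance, 1), round(population_stdev, 1))
```

Using `statistics.variance` and `statistics.stdev` instead would divide by N − 1 and give larger values, which is the sample-versus-population distinction these problems keep returning to.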
A consumer report was released concerning the prices of various food products. The report listed the monthly average price of a pound of beef for 20 different months. Construct a box and whisker plot using Excel and the QUARTILE.INC function. Then, choose the correct answer below.
1. Open the dataset in Excel. The dataset should occupy A1:A21, where A1 is the header and A2:A21 contains the data. Steps (2) through (6) have you construct a table for the quartile values. 2. Use fill series or manually enter the numbers 0, 1, 2, 3, and 4 into cell range B2:B6 from top to bottom, one number per cell. 3. Write the following entries into cell range D2:D6 from top to bottom, one entry per cell: Minimum, Q1, Median, Q3, Maximum. 4. In cell C1, copy the header for the dataset from A1. 5. In cell C2, write "=QUARTILE.INC(A$2:A$21,$B2)." 6. Then copy and paste this cell to C3:C6. Cells C2:C6 contain the five-number summary. The minimum is 5.138, Q1 is 5.26275, the median is 5.332, Q3 is 5.46425, and the maximum is 5.687. Continue on to construct the box plot. Steps (7) through (10) have you construct a table for the quartile differences. 7. Copy and paste cell range B1:C6 into cell range B7:C12. 8. Write the following entries into cell range D8:D12 from top to bottom, one entry per cell: Minimum, Q1−Minimum, Median−Q1, Q3−Median, Maximum−Q3. 9. In cell C9, write "=C3-C2" where C3 is the corresponding cell in the quartiles table. 10. Copy and paste this new cell to the remaining three cells underneath. Steps (11) through (15) have you construct the box plot from the quartile differences. 11. Create a stacked column chart type from the quartile differences. Select cells C7:C12. Then at the top of the Excel window, click the Insert tab, then the Charts button labeled Column, and then the Stacked Column chart type. The Stacked Column chart type is the top row and second column of the Columns chart types. 12. Switch the row and column data on the resulting chart. Click the Design tab at the top of the Excel Window, and then press the button labeled Switch Row/Column. 13. Set the Fill to blank for the bottom two sections of the bar and also the top section. 
Click the bottom segment of one of the rectangles in the plot, click the Format tab, and below that click the Format Selection button. In the Format Data Series window that pops up, click the Fill tab, and then click the No Fill radio button. 14. Select the top visible segment, and give it error bars that stretch from the top of the visible segment to the maximum of each dataset. Select the top visible segment. Then click on the Layout tab at the top of the Excel window, click on the Error Bars button, and then click on the second option from the top labeled Error Bars With Standard Error. Then select the error bars that appear on the plot, and click Format Selection on the toolbar above. In the Format Error Bars window, click the Vertical Error Bars tab, click the Plus button under Direction, and click the Cap button under End Style. Under Error Amount, click Custom, and then press Specify Value. In the resulting pop-up, click on the Positive Error Value field and select the data in cell C12. 15. Create error bars for the bottom visible segment that extend to the minimum of each dataset. Click the segment below the bottom visible segment, being careful not to select the very bottom segment by accident. Select or click Format Selection again. In the Format Error Bars window, click the Vertical Error Bars tab, click the Minus button under Direction, and click the Cap button under End Style. Under Error Amount, click Percentage:, and then set its value to 100.0% without typing the % symbol. Steps (16) through (20) are optional and deal with beautifying the box and whisker plot. 16. Delete the legend. 17. Modify the vertical axis to more tightly contain the data. For this dataset, the vertical axis was changed to run from 5.1 to 5.7 with major tick marks in increments of 0.1 and minor tick marks in increments of 0.025. 18. Set the fill of both visible box segments to solid color. 19. Set the border color of the visible box segments to solid color. 20. Set the line color of the error bars to solid color. The completed box plot should look something like the following image.
Natasha records the price for a gallon of home-heating oil from 20 randomly selected providers in her region on May 15. She does it again with another 20 randomly selected providers on November 15. The results (in dollars) are shown in the accompanying samples. Each date had one provider that had a price of $2.879 per gallon. Is the $2.879 price per gallon more unusual to occur on May 15 or November 15? Use Excel to calculate the z-scores for a price of $2.879 on May 15 and on November 15, rounding to two decimal places.
May 15 z-score: 1.42 November 15 z-score: −1.58 1. Enter the May 15 data into column A and the November 15 data into column B in Excel. 2. Select Data, then select Data Analysis, and then select Descriptive Statistics. 3. In the Descriptive Statistics dialog box, enter the cells containing the data sets into Input Range, make sure Columns is selected under Group By, tick the Summary statistics check box, and press OK. 4. Read the sample mean and sample standard deviation of each group from the output. The sample mean for the May 15 data is 2.788 with a sample standard deviation of 0.064, rounded to three decimal places. The sample mean for the November 15 data is 3.004 with a sample standard deviation of 0.079, rounded to three decimal places. 5. Use the sample mean and sample standard deviation of each data set to compute the z-score for the given values. Compute the z-score for a provider having a price of $2.879 per gallon on May 15, rounding to two decimal places: $z\approx\frac{2.879-2.788}{0.064}\approx 1.42$. Compute the z-score for a provider having a price of $2.879 per gallon on November 15, rounding to two decimal places: $z\approx\frac{2.879-3.004}{0.079}\approx -1.58$.
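To answer the "more unusual" part of the question, compare the absolute values of the two z-scores. The Python sketch below uses the rounded means and standard deviations quoted above; the dictionary layout and the name `more_unusual` are illustrative choices.

```python
price = 2.879

# (sample mean, sample standard deviation) as read from the Excel output
summary_stats = {
    "May 15": (2.788, 0.064),
    "November 15": (3.004, 0.079),
}

z_scores = {
    date: round((price - mean) / stdev, 2)
    for date, (mean, stdev) in summary_stats.items()
}

# The date whose z-score is farther from zero has the more unusual price.
more_unusual = max(z_scores, key=lambda date: abs(z_scores[date]))
print(z_scores, more_unusual)
```

The sign only says whether the price is above or below that date's mean; the distance from zero is what measures how unusual it is.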
A nutrition expert is publishing a study on the amount of sodium present in fast food. She recorded the sodium content of 10 different burgers from various fast food restaurant chains. Would it be more appropriate to find a sample standard deviation or population standard deviation in this situation?
Sample standard deviation The nutrition expert collected data for only a sample of 10 different burgers from various fast food chains. Since she does not have data for all possible burgers from all fast food chains, a sample standard deviation would be more appropriate.
A dental student is conducting a study on the number of people who visit their dentist regularly. Of the 520 people surveyed, 312 indicated that they had visited their dentist within the past year. Find the population proportion, as well as the mean and standard deviation of the sampling distribution for samples of size n=60. Round all answers to 3 decimal places.
The population proportion is $p=\frac{312}{520}=0.6$. This is also the mean of the sampling distribution: $\mu_{\hat{p}}=p=0.6$. For samples of size n=60, the standard deviation of the sampling distribution is $\sigma_{\hat{p}}=\sqrt{\frac{p(1-p)}{n}}=\sqrt{\frac{0.600(1-0.600)}{60}}\approx 0.063$.
An independent census agency polls a random sample of 30 households in a particular neighborhood to find how many people live in each household. Using Excel, calculate the mode(s) of the dataset provided below.
There are two modes. The modes are 2 and 4. The mean, median, and mode can be calculated quickly and easily in Excel using the built-in functions for these calculations. In C1 type "Mode", highlight cell range C2:C5, while highlighted type "=MODE.MULT(A2:A31)", and then press CTRL+SHIFT+ENTER. If there is one mode, each cell in cell range C2:C5 will display that mode's value. If there is more than one mode, each mode is displayed in one of the cells in the cell range C2:C5. If each cell in the cell range C2:C5 has a unique mode value, then there may be more modes and this step should be repeated with a taller cell range, for example C2:C10. If there are more cells in the range than there are modes but there is more than one mode, the remaining cells in the range will display #N/A to indicate there were no other modes found. Cell C2 should display 4 and cell C3 should display 2. Cells C4 and C5 both should display #N/A, indicating that there are two modes: 2 and 4.
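Python's `statistics.multimode` plays a role similar to MODE.MULT: it returns every value tied for the highest frequency. The sketch below uses hypothetical household sizes, since the census dataset itself is not reproduced here.

```python
import statistics

# Hypothetical household sizes for illustration only.
household_sizes = [2, 4, 4, 2, 1, 3, 2, 4, 5, 3]

# multimode returns all values tied for most frequent, in order of first appearance.
modes = sorted(statistics.multimode(household_sizes))
print(modes)
```

Unlike the Excel array formula, there is no need to guess the output size in advance; the returned list is exactly as long as the number of modes.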
The high temperature, in ∘C, on the first day of winter was recorded in a certain city every year from 1915 to 2015. The following six temperature values are a sample chosen from the data. 4,12,6,9,6,11 Given that the sample variance of this data set is 10, find the sample standard deviation. Round the final answer to one decimal place.
sample standard deviation = 3.2 Remember that the sample standard deviation is the square root of the sample variance. Since the sample variance is 10.0, the sample standard deviation is $\sqrt{10.0}\approx 3.2$.
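Both the given sample variance and the resulting standard deviation can be verified from the six temperatures. A minimal Python sketch using the standard library:

```python
import statistics

# The six sampled high temperatures, in degrees Celsius
temperatures = [4, 12, 6, 9, 6, 11]

sample_variance = statistics.variance(temperatures)  # divides by n - 1
sample_stdev = statistics.stdev(temperatures)        # square root of the sample variance

print(sample_variance, round(sample_stdev, 1))
```

The squared deviations from the mean of 8 sum to 50, and dividing by n − 1 = 5 gives the stated variance of 10.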
Wes is the owner of a real estate agency and is analyzing the time the agents take to sell houses. He reviews each house sold by his agency to determine the number of days each house was on the market before it was sold. The data for a random sample of 20 houses sold in the last year are provided below. Use Excel and the QUARTILE.INC function to construct a box and whisker plot for the dataset. What is the value of the third quartile?
$165$165 1. Open the dataset in Excel. The dataset should occupy A1:A21, where A1 is the header and A2:A21 contains the data. Steps (2) through (6) have you construct a table for the quartile values. 2. Use fill series or manually enter the numbers 0, 1, 2, 3, and 4 into cell range B2:B6 from top to bottom, one number per cell. 3. Write the following entries into cell range D2:D6 from top to bottom, one entry per cell: Minimum, Q1, Median, Q3, Maximum. 4. In cell C1, copy the header for the dataset from A1. 5. In cell C2, write "=QUARTILE.INC(A$2:A$21,$B2)." 6. Then copy and paste this cell to C3:C6. Cells C2:C6 contain the five-number summary. The minimum is 65, Q1 is 111.75, the median is 140, Q3 is 165, and the maximum is 199. Continue on to construct the box plot. Steps (7) through (10) have you construct a table for the quartile differences. 7. Copy and paste cell range B1:C6 into cell range B7:C12. 8. Write the following entries into cell range D8:D12 from top to bottom, one entry per cell: Minimum, Q1−Minimum, Median−Q1, Q3−Median, Maximum−Q3. 9. In cell C9, write "=C3-C2" where C3 is the corresponding cell in the quartiles table. 10. Copy and paste this new cell to the remaining three cells underneath. Steps (11) through (15) have you construct the box plot from the quartile differences. 11. Create a stacked column chart type from the quartile differences. Select cells C7:C12. Then at the top of the Excel window, click the Insert tab, then the Charts button labeled Column, and then the Stacked Column chart type. The Stacked Column chart type is the top row and second column of the Columns chart types. 12. Switch the row and column data on the resulting chart. Click the Design tab at the top of the Excel Window, and then press the button labeled Switch Row/Column. 13. Set the Fill to blank for the bottom two sections of the bar and also the top section. 
Click the bottom segment of one of the rectangles in the plot, click the Format tab, and below that click the Format Selection button. In the Format Data Series window that pops up, click the Fill tab, and then click the No Fill radio button. 14. Select the top visible segment, and give it error bars that stretch from the top of the visible segment to the maximum of each dataset. Select the top visible segment. Then click on the Layout tab at the top of the Excel window, click on the Error Bars button, and then click on the second option from the top labeled Error Bars With Standard Error. Then select the error bars that appear on the plot, and click Format Selection on the toolbar above. In the Format Error Bars window, click the Vertical Error Bars tab, click the Plus button under Direction, and click the Cap button under End Style. Under Error Amount, click Custom, and then press Specify Value. In the resulting pop-up, click on the Positive Error Value field and select the data in cell C12. 15. Create error bars for the bottom visible segment that extend to the minimum of each dataset. Click the segment below the bottom visible segment, being careful not to select the very bottom segment by accident. Select or click Format Selection again. In the Format Error Bars window, click the Vertical Error Bars tab, click the Minus button under Direction, and click the Cap button under End Style. Under Error Amount, click Percentage:, and then set its value to 100.0% without typing the % symbol. Steps (16) through (20) are optional and deal with beautifying the box and whisker plot. 16. Delete the legend. 17. Modify the vertical axis to more tightly contain the data. 18. Set the fill of both visible box segments to solid color. 19. Set the border color of the visible box segments to solid color. 20. Set the line color of the error bars to solid color. 
Looking at the box and whisker plot, the third quartile, where the top whisker intersects with the top side of the box, is 165.
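For readers who want to verify a five-number summary without building the chart, here is a minimal Python sketch. It assumes that the standard library's `quantiles` with `method="inclusive"` interpolates the same way as QUARTILE.INC; the function name `five_number_summary` and the sample list are hypothetical, not the 20 values from the problem.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using inclusive quartiles."""
    # n=4 splits the data into quarters and returns the three cut points.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

# Hypothetical days-on-market values, NOT the dataset from the problem.
days_on_market = [10, 20, 30, 40, 50]

summary = five_number_summary(days_on_market)
print(summary)
```

The differences between consecutive entries of `summary` are the segment heights that the stacked column chart in the Excel steps is built from.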
The following data set represents the pulse rate for 5 randomly selected players on a football team. Find the standard deviation of the data set. Round your answer to one decimal place. 66, 70, 75, 80, 84
$s = 7.3$ To find the sample standard deviation, follow these steps: 1. Find the mean of the data set: $\frac{66+70+75+80+84}{5} = 75$. 2. Find the deviation for each data value by subtracting the mean from it: $66-75=-9$, $70-75=-5$, $75-75=0$, $80-75=5$, $84-75=9$. 3. Square each of the deviations from Step 2: $(-9)^2=81$, $(-5)^2=25$, $0^2=0$, $5^2=25$, $9^2=81$. 4. Add up the squared deviations from Step 3: $81+25+0+25+81=212$. 5. Divide the total from Step 4 by $(n-1)$, which is the number of data values minus 1: $\frac{212}{4}=53$. This result is the sample variance of the data set. 6. Take the square root of the result from Step 5. This result is the sample standard deviation of the data set: $s=\sqrt{53}\approx 7.28$. Then round your answer per the rounding instructions given in the problem, which gives 7.3.
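The same steps can be traced in a few lines of Python. This is only a sketch that mirrors the hand calculation; the variable names are illustrative.

```python
import math

pulse_rates = [66, 70, 75, 80, 84]
n = len(pulse_rates)

mean = sum(pulse_rates) / n                     # mean of the data set
deviations = [x - mean for x in pulse_rates]    # each value minus the mean
squared = [d ** 2 for d in deviations]          # squared deviations
sample_variance = sum(squared) / (n - 1)        # divide by n - 1 for a sample
sample_std = math.sqrt(sample_variance)         # square root of the variance

print(mean, sample_variance, round(sample_std, 1))
```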
When Natalie moved from Nashville, TN to Philadelphia, PA, her average monthly water bill increased from $32 to $47. She is curious to know whether her TN or PA water bill is relatively more or less expensive, when compared to the distribution of water bills for each city. The mean and standard deviation for water bills in each city are shown in the table.
1: 0.8 2: 0.625 The mean monthly water bill in Nashville, TN is $30, with a standard deviation of $2.50. The z-score corresponding to Natalie's TN water bill of $32 is $z = \frac{x-\mu}{\sigma} = \frac{32-30}{2.5} = 0.8$. The mean monthly water bill in Philadelphia, PA is $45, with a standard deviation of $3.20. The z-score corresponding to Natalie's PA water bill of $47 is $z = \frac{x-\mu}{\sigma} = \frac{47-45}{3.2} = 0.625$.
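The z-score formula used here is simple enough to wrap in a small helper. The function name `z_score` is an assumption for illustration; the inputs are the values quoted in the explanation above.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

z_tn = z_score(32, 30, 2.50)   # Nashville, TN bill
z_pa = z_score(47, 45, 3.20)   # Philadelphia, PA bill

print(round(z_tn, 3), round(z_pa, 3))
```

Because the TN z-score is larger, the TN bill sits relatively higher within its own city's distribution than the PA bill does, even though the PA bill is larger in dollars.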
Colin is a student in a consumer mathematics class creating a report on the annual interest rates of savings accounts. He randomly selects 25 brick-and-mortar banks and 25 online banks and then records the annual interest rate for each financial institution. The results of the samples are provided in the accompanying data set. Colin found an interest rate of 2.15% from a brick-and-mortar bank and an online bank. For which type of bank is this interest rate a more unusual result, the brick-and-mortar bank or the online bank? Use Excel to calculate the z-scores for an interest rate of 2.15% for each type of bank. Round to two decimal places.
1: 3.41 2: 1.78 1. Enter the Brick-and-Mortar data into column A and the Online data into column B in Excel. 2. Select Data, then select Data Analysis, and then select Descriptive Statistics. 3. In the Descriptive Statistics dialog box, enter the cells containing the data sets into Input Range, make sure Columns is selected under Group By, tick the Summary statistics check box, and press OK. 4. Read the sample mean and sample standard deviation of each group from the output. The sample mean for the Brick-and-Mortar data is 0.998 with a sample standard deviation of 0.338, rounded to three decimal places. The sample mean for the Online data is 1.642 with a sample standard deviation of 0.286, rounded to three decimal places. 5. Use the sample mean and sample standard deviation of each data set to compute the z-score for the given values. Compute the z-score for a brick-and-mortar bank having a 2.15% interest rate, rounding to two decimal places: $z \approx \frac{2.15-0.998}{0.338} \approx 3.41$. Compute the z-score for an online bank having a 2.15% interest rate, rounding to two decimal places: $z \approx \frac{2.15-1.642}{0.286} \approx 1.78$.
A policy think tank focused on teenage obesity was able to get several pediatricians to release anonymous data on the weights in pounds of sixteen-year-olds who had come to their office this year. A sample of 20 of the weights is included below. Construct a box and whisker plot using Excel and the QUARTILE.INC function to select the correct plot below.
1. Open the dataset in Excel. The dataset should occupy A1:A21, where A1 is the header and A2:A21 contains the data. Steps (2) through (6) have you construct a table for the quartile values. 2. Use fill series or manually enter the numbers 0, 1, 2, 3, and 4 into cell range B2:B6 from top to bottom, one number per cell. 3. Write the following entries into cell range D2:D6 from top to bottom, one entry per cell: Minimum, Q1, Median, Q3, Maximum. 4. In cell C1, copy the header for the dataset from A1. 5. In cell C2, write "=QUARTILE.INC(A$2:A$21,$B2)." 6. Then copy and paste this cell to C3:C6. Cells C2:C6 contain the five-number summary. The minimum is 110, Q1 is 129.75, the median is 146.5, Q3 is 173, and the maximum is 329. Continue on to construct the box plot. Steps (7) through (10) have you construct a table for the quartile differences. 7. Copy and paste cell range B1:C6 into cell range B7:C12. 8. Write the following entries into cell range D8:D12 from top to bottom, one entry per cell: Minimum, Q1−Minimum, Median−Q1, Q3−Median, Maximum−Q3. 9. In cell C9, write "=C3-C2" where C3 is the corresponding cell in the quartiles table. 10. Copy and paste this new cell to the remaining three cells underneath. Steps (11) through (15) have you construct the box plot from the quartile differences. 11. Create a stacked column chart type from the quartile differences. Select cells C7:C12. Then at the top of the Excel window, click the Insert tab, then the Charts button labeled Column, and then the Stacked Column chart type. The Stacked Column chart type is the top row and second column of the Columns chart types. 12. Switch the row and column data on the resulting chart. Click the Design tab at the top of the Excel Window, and then press the button labeled Switch Row/Column. 13. Set the Fill to blank for the bottom two sections of the bar and also the top section. 
Click the bottom segment of one of the rectangles in the plot, click the Format tab, and below that click the Format Selection button. In the Format Data Series window that pops up, click the Fill tab, and then click the No Fill radio button. 14. Select the top visible segment and give it error bars that stretch from the top of the visible segment to the maximum of each dataset. Select the top visible segment. Then click on the Layout tab at the top of the Excel window, click on the Error Bars button, and then click on the second option from the top labeled Error Bars With Standard Error. Then select the error bars that appear on the plot, and click Format Selection on the toolbar above. In the Format Error Bars window, click the Vertical Error Bars tab, click the Plus button under Direction, and click the Cap button under End Style. Under Error Amount, click Custom, and then press Specify Value. In the resulting pop-up, click on the Positive Error Value field and select the data in cell C12. 15. Create errors bars for the bottom visible segment that extend to the minimum of each dataset. Click the segment below the bottom visible segment, being careful not to select the very bottom segment by accident. Select or click Format Selection again. In the Format Error Bars window, click the Vertical Error Bars tab, click the Minus button under Direction, and click the Cap button under End Style. Under Error Amount, click Percentage:, and then set its value to 100.0% without typing the % symbol. Steps (16) through (20) are optional and deal with beautifying the box and whisker plot. 16. Delete the legend. 17. Modify the vertical axis to more tightly contain the data. For this dataset, the vertical axis was changed to run from 100 to 350 with major tick marks in increments of 25 and minor ticks marks in increments of 5. 18. Set the fill of both visible box segments to solid color. 19. Set the border color of the visible box segments to solid color. 20. 
Set the line color of the error bars to solid color. The completed box plot should look something like the following image.
Colin is a student in a consumer mathematics class creating a report on the annual interest rates of savings accounts. He randomly selects 25 brick-and-mortar banks and 25 online banks and then records the annual interest rate for each financial institution. The results of the samples are provided in the accompanying data set. Colin found an interest rate of 2.15% from a brick-and-mortar bank and an online bank. Based on the z-scores you calculated above, for which type of bank is this interest rate a more unusual result, the brick-and-mortar bank or the online bank?
The absolute value of the z-score for the brick-and-mortar bank is greater than for the online bank, so the brick-and-mortar bank having a 2.15% interest rate is more unusual. A z-score with a greater absolute value means that the data value is more unusual. Since the absolute value of the z-score for the brick-and-mortar bank, 3.41, is greater than the absolute value of the z-score for the online bank, 1.78, the brick-and-mortar bank's interest rate is more unusual.
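The comparison of absolute z-scores can be sketched as follows. The means and standard deviations are the rounded summary statistics quoted in the earlier answer for this data set; the dictionary layout and names are assumptions made for illustration.

```python
rate = 2.15  # interest rate (percent) found at both types of bank

# (sample mean, sample standard deviation), rounded to three decimals.
summary_stats = {
    "brick-and-mortar": (0.998, 0.338),
    "online": (1.642, 0.286),
}

# z-score of the 2.15% rate within each group.
z_scores = {
    bank: (rate - mean) / std_dev
    for bank, (mean, std_dev) in summary_stats.items()
}

# The group with the larger |z| is where the rate is more unusual.
more_unusual = max(z_scores, key=lambda bank: abs(z_scores[bank]))

print({bank: round(z, 2) for bank, z in z_scores.items()}, more_unusual)
```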
In a large town, 57% of the population over the age of 25 have a college degree. Use the graph below to calculate the probability that if a random sample of 50 people over the age of 25 are chosen, over half of them have a college degree. Use the graph below to calculate the probability that at least half the people in the sampling distribution will have a college degree: Drag and move the blue dot to select the appropriate probability graph area from the four options on the left. (Note - there are four graphs available to choose from. Only select between less than, greater than, and area between graphs.); Use the Central Limit Theorem to find p^ and σp^; Calculate the z-score for p^ and move the slider along the x-axis to the appropriate z-score; The purple area under the curve represents the probability of the event occurring. Interpret the purple area under the curve. Enter your answer in decimal form and round to two decimal places.
$0.84 \pm 0.01$ 1. We will select the greater than graph option, with the purple area in the right tail of the curve, because we want to find $P(\hat{p} > 0.50)$, the probability that the sample proportion $\hat{p}$ will be more than 50% = 0.50. We are given a population proportion of $p = 0.57$. Next, we will check that the Central Limit Theorem for Sample Proportions applies. $n = 50$ and $p = 0.57$, so $np = 28.5$ and $n(1-p) = 21.5$. Both are greater than 10, so we may proceed and use the standard curve graph above to find our probability. 2. In order to calculate the z-score, we need to identify $\mu_{\hat{p}}$ and $\hat{p}$ and calculate $\sigma_{\hat{p}}$. The Central Limit Theorem for sample proportions tells us that $\hat{p}$ is approximately normally distributed with mean $\mu_{\hat{p}} = p$ and standard deviation $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$. We have $\mu_{\hat{p}} = p = 0.57$. The standard deviation of the sampling distribution of sample proportions is $\sigma_{\hat{p}} = \sqrt{\frac{0.57(1-0.57)}{50}} \approx 0.0700$. The sample proportion is $\hat{p} = 0.50$. Then, the corresponding z-score is $z = \frac{0.50-0.57}{0.07} = -1$. 3. Slide the black dots to the z-score we found above: −1. The area shown is 0.8413. Rounding to the nearest hundredth, we have $P(\hat{p} > 0.50) = 0.84$.
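The right-tail area can also be computed directly rather than read from the graph. The sketch below uses Python's standard-library `NormalDist` as a stand-in for the interactive curve; it keeps the standard deviation unrounded, so the z-score is very slightly different from −1, but the rounded probability is the same.

```python
from math import sqrt
from statistics import NormalDist

p = 0.57   # population proportion
n = 50     # sample size

# Standard deviation of the sampling distribution of the sample proportion.
sigma_p_hat = sqrt(p * (1 - p) / n)

# Normal approximation to the sampling distribution of p-hat.
sampling_dist = NormalDist(mu=p, sigma=sigma_p_hat)

# P(p-hat > 0.50) is the area to the right of 0.50.
prob_over_half = 1 - sampling_dist.cdf(0.50)

print(round(sigma_p_hat, 4), round(prob_over_half, 2))
```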
Klayton is the owner of a franchised hotel chain and manages its brand carefully. He thinks the standard set for the number of pillows each guest bed must have needs to be updated. Klayton requests that the franchisees of the chain ask each guest what number of pillows is considered ideal. A random sample of the responses from 20 guests is provided below. Use Excel and the QUARTILE.INC function to construct a box and whisker plot for the dataset. What is the value of the maximum for this dataset?
$6$6 1. Open the dataset in Excel. The dataset should occupy A1:A21, where A1 is the header and A2:A21 contains the data. Steps (2) through (6) have you construct a table for the quartile values. 2. Use fill series or manually enter the numbers 0, 1, 2, 3, and 4 into cell range B2:B6 from top to bottom, one number per cell. 3. Write the following entries into cell range D2:D6 from top to bottom, one entry per cell: Minimum, Q1, Median, Q3, Maximum. 4. In cell C1, copy the header for the dataset from A1. 5. In cell C2, write "=QUARTILE.INC(A$2:A$21,$B2)." 6. Then copy and paste this cell to C3:C6. Cells C2:C6 contain the five-number summary. The minimum is 2, Q1 is 3, the median is 5, Q3 is 5, and the maximum is 6. Continue on to construct the box plot. Steps (7) through (10) have you construct a table for the quartile differences. 7. Copy and paste cell range B1:C6 into cell range B7:C12. 8. Write the following entries into cell range D8:D12 from top to bottom, one entry per cell: Minimum, Q1−Minimum, Median−Q1, Q3−Median, Maximum−Q3. 9. In cell C9, write "=C3-C2" where C3 is the corresponding cell in the quartiles table. 10. Copy and paste this new cell to the remaining three cells underneath. Steps (11) through (15) have you construct the box plot from the quartile differences. 11. Create a stacked column chart type from the quartile differences. Select cells C7:C12. Then at the top of the Excel window, click the Insert tab, then the Charts button labeled Column, and then the Stacked Column chart type. The Stacked Column chart type is the top row and second column of the Columns chart types. 12. Switch the row and column data on the resulting chart. Click the Design tab at the top of the Excel Window, and then press the button labeled Switch Row/Column. 13. Set the Fill to blank for the bottom two sections of the bar and also the top section. 
Click the bottom segment of one of the rectangles in the plot, click the Format tab, and below that click the Format Selection button. In the Format Data Series window that pops up, click the Fill tab, and then click the No Fill radio button. 14. Select the top visible segment, and give it error bars that stretch from the top of the visible segment to the maximum of each dataset. Select the top visible segment. Then click on the Layout tab at the top of the Excel window, click on the Error Bars button, and then click on the second option from the top labeled Error Bars With Standard Error. Then select the error bars that appear on the plot, and click Format Selection on the tool bar above. In the Format Error Bars window, click the Vertical Error Bars tab, click the Plus button under Direction, and click the Cap button under End Style. Under Error Amount, click Custom, and then press Specify Value. In the resulting pop-up, click on the Positive Error Value field and select the data in cell C12. 15. Create error bars for the bottom visible segment that extend to the minimum of each dataset. Click the segment below the bottom visible segment, being careful not to select the very bottom segment by accident. Select or click Format Selection again. In the Format Error Bars window, click the Vertical Error Bars tab, click the Minus button under Direction, and click the Cap button under End Style. Under Error Amount, click Percentage:, and then set its value to 100.0% without typing the % symbol. Steps (16) through (20) are optional and deal with beautifying the box and whisker plot. 16. Delete the legend. 17. Modify the vertical axis to more tightly contain the data. 18. Set the fill of both visible box segments to solid color. 19. Set the border color of the visible box segments to solid color. 20. Set the line color of the error bars to solid color. Looking at the box and whisker plot, the maximum, which is the top side of the top whisker, is 6.
According to a study, 60% of people who are murdered knew their murderer. Suppose that in a particular state there are currently 50 cold cases. What is the probability that in between 28 and 33 of those 50 cold cases, the victim knew the murderer? Round all numbers to two decimal places. Use the z-tables below:
$\sigma_{\hat{p}} = 0.07 \pm 0.01$ $P(28 \le X \le 33) = 0.52 \pm 0.01$ We are given the population proportion of $p = 60\% = 0.6$. Again, we will check the conditions of the CLT. $n = 50$ and $p = 0.6$, so $np = 30$ and $n(1-p) = 20$. Both are greater than 10, so we may proceed as if our distribution is normal by using the standard normal table for probabilities. The standard deviation of the sampling distribution of sample proportions is thus $\sigma_{\hat{p}} = \sqrt{\frac{0.6(1-0.6)}{50}} \approx 0.07$. The proportions of interest are then $\hat{p}_1 = \frac{28}{50} = 0.56$ and $\hat{p}_2 = \frac{33}{50} = 0.66$, which correspond to the two z-scores $z_1 = \frac{0.56-0.6}{0.07} = -0.57$ and $z_2 = \frac{0.66-0.6}{0.07} = 0.86$. The probabilities corresponding to these z-scores can be found in the given tables: $P(X \le 28) = P(z \le -0.57) = 0.284$ and $P(X \le 33) = P(z \le 0.86) = 0.805$. The probability that, in a sample of $n = 50$, the sample proportion $\hat{p}$ falls between $\frac{28}{50}$ and $\frac{33}{50}$ is the difference between these probabilities: $P(28 \le X \le 33) = P(-0.57 \le z \le 0.86) = 0.805 - 0.284 = 0.521$. So to two decimal places, $P(28 \le X \le 33) = 0.52$.
According to a study, 81% of pro gamers own a dedicated video game console. Suppose that at a gaming convention, 54 pro gamers are sampled. Use Excel to calculate the probability that of those 54 gamers, between 35 and 37 of them own a dedicated video game console. Round your answers to three decimal places.
1: 0.053 2: 0.009 1. We can use the Central Limit Theorem for Proportions because we have $np = 54 \cdot 0.81 = 43.74$ and $n(1-p) = 54(1-0.81) = 10.26$, both of which are greater than 10. 2. The following information is given: The population proportion is $p = 0.81$, so we know $\mu_{\hat{p}} = p = 0.81$. The sample size is 54, so we know the standard deviation of the sampling distribution is $\sigma_{\hat{p}} = \sqrt{\frac{0.81(1-0.81)}{54}} \approx 0.053$. Since $\frac{35}{54} \approx 0.648$ and $\frac{37}{54} \approx 0.685$, we want to know the probability $P(0.648 \le \hat{p} \le 0.685)$. 3. Open Excel and click on any cell. 4. Type =NORM.DIST(. Then enter the value 0.685 as x, the mean of the sampling distribution $\mu_{\hat{p}} = 0.81$ as the mean, the standard deviation of the sampling distribution $\sigma_{\hat{p}} = 0.053$ as the standard_deviation, and TRUE for cumulative. Hit ENTER. This probability is 0.010. Because we want to find an in-between probability, we need to do this again. So type =NORM.DIST(0.648, 0.81, 0.053, TRUE). This probability is 0.0012. Then, the difference between the two probabilities gives us $P(35 \le X \le 37) = 0.0085$. So to three decimal places, $P(35 \le X \le 37) = 0.009$.
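Both "between" probabilities worked above (the cold cases and the pro gamers) follow the same pattern: convert the counts to proportions, then subtract two cumulative normal probabilities. A minimal Python sketch, with the hypothetical helper name `prob_count_between`, is shown here. It keeps every intermediate value unrounded, so its results can differ slightly in the third decimal place from answers built on rounded z-scores.

```python
from math import sqrt
from statistics import NormalDist

def prob_count_between(low, high, n, p):
    """Normal-approximation probability that a sample count lies in [low, high]."""
    sigma_p_hat = sqrt(p * (1 - p) / n)          # std. dev. of sample proportions
    dist = NormalDist(mu=p, sigma=sigma_p_hat)   # sampling distribution of p-hat
    # Difference of two cumulative probabilities, as with two NORM.DIST calls.
    return dist.cdf(high / n) - dist.cdf(low / n)

cold_cases = prob_count_between(28, 33, n=50, p=0.60)
pro_gamers = prob_count_between(35, 37, n=54, p=0.81)

print(round(cold_cases, 2), round(pro_gamers, 3))
```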
A college student recently purchased a multiplayer video game. Before playing, the college student looked up statistics on how fast gamers were completing the full playthrough of the game. Of the 500 players that have completed the game, 331 of them completed the full game in less than 49 minutes. Use Excel to determine the probability that in a sample of 45 games, 30 or fewer of them were completed in less than 49 minutes. Round your answers to two decimal places.
1: 0.66 2: 0.67 3: 0.07 4: 0.53 1. We can use the Central Limit Theorem for Proportions because we have $np = 45 \cdot 0.662 = 29.79$ and $n(1-p) = 45(1-0.662) = 15.21$, both of which are greater than 10. 2. The following information is given: The population proportion is $p = \frac{331}{500} = 0.662$, so we know $\mu_{\hat{p}} = p = 0.662$. The sample size is 45, so we know the standard deviation of the sampling distribution is $\sigma_{\hat{p}} = \sqrt{\frac{0.662(1-0.662)}{45}} \approx 0.071$. Since $\frac{30}{45} \approx 0.667$, we want to know the probability $P(\hat{p} \le 0.667)$. 3. Open Excel and click on any cell. 4. Type =NORM.DIST(. Then enter the value 0.667 as x, the mean of the sampling distribution $\mu_{\hat{p}} = 0.662$ as the mean, the standard deviation of the sampling distribution $\sigma_{\hat{p}} = 0.071$ as the standard_deviation, and TRUE for cumulative. Hit ENTER. This probability is 0.526. That is, the probability that 30 or fewer out of 45 playthroughs will be completed in less than 49 minutes is 0.53, when rounded to two decimal places.
Dalton plays on his high school's baseball team and his friend Caleb plays on the high school's lacrosse team. Since both teams have several of the school's most prolific scholars, they wondered which team has a greater cumulative GPA on average. They randomly selected 20 baseball players and 20 lacrosse players and found each player's cumulative GPA with the help of a guidance counselor. The results of the survey are shown in the accompanying samples. A player on each team has a 3.92 cumulative GPA. Based on the z-scores you calculated above, is a baseball or lacrosse player more likely to have a 3.92 cumulative GPA?
A lacrosse player is more likely to have a 3.92 cumulative GPA, because the absolute value of the z-score for baseball is greater than for lacrosse. A z-score with a lower absolute value means that the data value is more likely to occur. Since the absolute value of the z-score for the baseball player, 1.44, is greater than the absolute value of the z-score for the lacrosse player, 1.17, a player on the school's lacrosse team is more likely to have a 3.92 GPA.
Based on the z-scores calculated above for Angie and Beth, which swimmer had the fastest time when compared to her team?
Beth. Angie has a z-score of −1.25 and Beth's z-score is −2. Both Angie and Beth have negative z-scores, meaning they both swim in less time than their team's mean time. In terms of swim times, lower values are faster times, so Beth has the faster swim time when compared to her team. Angie's team has a mean time of $\mu = 27.2$ with a standard deviation of $\sigma = 0.8$. The z-score corresponding to Angie's swim time of $x = 26.2$ seconds is $z = \frac{x-\mu}{\sigma} = \frac{26.2-27.2}{0.8} = -1.25$. Beth's team has a mean time of $\mu = 30.1$ with a standard deviation of $\sigma = 1.4$. The z-score corresponding to Beth's swim time of $x = 27.3$ seconds is $z = \frac{x-\mu}{\sigma} = \frac{27.3-30.1}{1.4} = -2$.
Based on the z-scores calculated above for Stephan's electric bills in IL and FL, in which state is his electric bill higher, when compared to their respective distributions?
Illinois. Both z-scores are negative, meaning Stephan's bills are below the average in both states. His Florida bill is 0.75 standard deviations below the Florida state mean, but his Illinois bill is only 0.625 standard deviations below the Illinois state mean. This means his bill was higher in Illinois, when compared to the state distributions of electric bills.
Karl and Fredo are basketball players who want to find out how they compare to their team in points per game. The mean amount of points per game and standard deviations for their team were calculated. Karl's z-score is 0.9. Fredo's z-score is −0.65. Which of the following statements are true about how Karl and Fredo compare to their team in points per game? Select all that apply.
Karl's average points per game is 0.9 standard deviations greater than his teammates' average points per game. Fredo's average points per game is closer to the team's mean than Karl's. The z-score is the number of standard deviations a data value is from the mean of the data set. Karl's average points per game is 0.9 standard deviations greater than his team mean. Fredo's average points per game is 0.65 standard deviations less than his team mean. Fredo's z-score is −0.65, so his distance from the mean is 0.65 standard deviations, which is less than Karl's 0.9, since $|-0.65| < 0.9$. So, Fredo's average points per game is closer to the team's mean than Karl's.
Key Terms
Spread: a general term that describes how closely concentrated or how far apart data values are to one another
Deviation: the difference between a particular data value and the mean of the data set
Sample variance: a measure of spread calculated as the sum of the squares of the deviations divided by (n−1), where n is the number of data values in the sample
Sample standard deviation: a measure of spread calculated as the square root of the sample variance
Population variance: a measure of spread calculated as the sum of the squares of the deviations divided by N, where N is the number of data values in the population
Population standard deviation: a measure of spread calculated as the square root of the population variance
The charts below show the stock price for four different technology companies in the past 16 months. Which of the following histograms shows a skewed data set? Select all correct answers.
The data in the second graph above is skewed to the left because most values are in the 17 to 18 range, and then there is a tail of smaller values to the left. The data in the third graph above is skewed to the right because most values are in the 2 to 3 range and then there is a tail of larger values to the right. The data from the first and fourth graphs above is roughly even and centered around 10.
An event coordinator for a particular marathon held yearly is reviewing the data from the top 30 race finish times from the last race. Using Excel, calculate the mode(s) of the dataset provided below.
There are two modes. The modes are 2.47 and 4.14. The mean, median, and mode can be calculated quickly and easily in Excel using the built-in functions for these calculations. In C1 type "Mode", highlight cell range C2:C5, while highlighted type "=MODE.MULT(A2:A31)", and then press CTRL+SHIFT+ENTER. If there is one mode, each cell in cell range C2:C5 will display that mode's value. If there is more than one mode, each mode is displayed in one of the cells in the cell range C2:C5. If each cell in the cell range C2:C5 has a unique mode value, then there may be more modes and step 1 should be repeated with a taller cell range, for example C2:C10. If there are more cells in the range than there are modes but there is more than one mode, the remaining cells in the range will display #N/A to indicate there were no other modes found. Cell C2 should display 2.47, and cell C3 should display 4.14. Cells C4 and C5 should both display #N/A, indicating that there are two modes: 2.47 and 4.14.
The following data set represents the ages of all 6 of Nancy's grandchildren. 11,8,5,6,3,9 To determine the "spread" of the data, would you employ calculations for the sample standard deviation, or population standard deviation for this data set?
Use calculations for population standard deviation. To determine if sample standard deviation or population standard deviation should be used, determine if the data set represents data values collected from the entire population, or from a subset of the population. If the data values represent data collected from a subset of the population, then the sample standard deviation should be used. If the data values represent data collected from the entire population of interest, then the population standard deviation should be used. In this case, the population standard deviation should be used because the data set represents all of, that is the total population of, Nancy's grandchildren.
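To see what the choice changes numerically, the sketch below computes both versions for the six ages. The population functions divide by N and are the appropriate ones here; the sample variance, which divides by n − 1, is included only for contrast.

```python
from statistics import pstdev, pvariance, variance

# Ages of all 6 of Nancy's grandchildren: the entire population of interest.
ages = [11, 8, 5, 6, 3, 9]

population_variance = pvariance(ages)   # sum of squared deviations / N
population_std = pstdev(ages)           # square root of the population variance
sample_variance = variance(ages)        # sum of squared deviations / (n - 1)

print(population_variance, round(population_std, 2), sample_variance)
```

The sample formula always gives the larger value because it divides the same sum of squared deviations by a smaller number.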
Of the 1,300 children participating in a town's parks and recreations programs, 765 are under the age of 8. To encourage participation across all ages, city officials are studying which programs are the most popular in each age group. In the sampling distribution of sample proportions of size 100, above what proportion will 35% of all sample proportions be? Select all answers that apply to your calculation below. Use the z-table given below to answer the question:
$z = 0.38$ $\hat{p} = 0.61$ Since we are looking for the top 35% of sample proportions, we need to look in the z-table for the bottom 65%. So the top 35% corresponds to a z-score of $z = 0.38$. Next, we will find the sample proportion that goes with the z-score. Note the population proportion is not directly given, but can be calculated as follows: $p = \frac{765}{1300} \approx 0.588$, or 0.59 to two decimal places. To find the sample proportion $\hat{p}$ that corresponds to $z = 0.38$, we substitute the known values into the formula $z = \frac{\hat{p}-p}{\sigma_{\hat{p}}}$, which gives $0.38 = \frac{\hat{p}-0.59}{\sqrt{\frac{0.59(1-0.59)}{100}}}$, and solve for $\hat{p}$ to get $\hat{p} \approx 0.609$. Rounded to two decimal places, $\hat{p} = 0.61$. This means: the top 35% of possible sample proportions are at or above 0.61.
A sandwich shop finds that of 112 customers they serve during the day, 64 purchase a coffee with their sandwich. The shop owner is trying to decide how many baristas to hire per shift. In the sampling distribution of sample proportions of size 36, above what proportion will 58% of all sample proportions be? Select all answers that apply to your calculation below. Use the z-table given below to answer the question:
$z = -0.21$ $\hat{p} = 0.55$ Since we are looking for the top 58% of sample proportions, we need to look in the z-table for the bottom 42%. So the top 58% corresponds to a z-score of $z = -0.21$. Next, we will find the sample proportion that goes with the z-score. Note the population proportion is not directly given, but can be calculated as follows: $p = \frac{64}{112} \approx 0.57$. To find the sample proportion $\hat{p}$ that corresponds to $z = -0.21$, we substitute the known values into the formula $z = \frac{\hat{p}-p}{\sigma_{\hat{p}}}$, which gives $-0.21 = \frac{\hat{p}-0.57}{\sqrt{\frac{0.57(1-0.57)}{36}}}$, and solve for $\hat{p}$ to get $\hat{p} \approx 0.553$. Rounded to two decimal places, $\hat{p} = 0.55$. This means: the top 58% of possible sample proportions are at or above 0.55.
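The two cutoff problems above run the z-table in reverse: start from an area, find z, then convert back to a proportion. A minimal Python sketch follows, where `cutoff_for_top_share` is a hypothetical helper name and `NormalDist.inv_cdf` stands in for the table lookup. Because nothing is rounded along the way, the z-values differ slightly from the table readings, but the cutoffs agree to two decimal places.

```python
from math import sqrt
from statistics import NormalDist

def cutoff_for_top_share(successes, population, n, top_share):
    """Sample proportion above which `top_share` of sample proportions fall."""
    p = successes / population                  # population proportion
    sigma_p_hat = sqrt(p * (1 - p) / n)         # std. dev. of sample proportions
    dist = NormalDist(mu=p, sigma=sigma_p_hat)
    # The top `top_share` starts where the bottom (1 - top_share) ends.
    return dist.inv_cdf(1 - top_share)

parks_cutoff = cutoff_for_top_share(765, 1300, n=100, top_share=0.35)
coffee_cutoff = cutoff_for_top_share(64, 112, n=36, top_share=0.58)

print(round(parks_cutoff, 2), round(coffee_cutoff, 2))
```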
The heights, in inches, of the members of a barbershop quartet of singers are listed below. 72,68,67,73 If the population variance for this data set is 6.5, what is the standard deviation of the heights of the barbershop quartet? Round your answer to 2 decimal places.
2.55. The standard deviation is the square root of the variance. Since the population variance is 6.5, the (population) standard deviation is $\sqrt{6.5} \approx 2.549$. So to two decimal places, the standard deviation is 2.55.
The following data set provides information about the City of Somerville Assessors Valuation for the fiscal year 2016. For residential buildings, what is the sample variance of the land area? Round your answer to FOUR decimal places.
Sample variance = 0.0004. The mean is 0.0624. The sample variance is the sum of the squared differences between each data value and the mean, divided by 4, which is n − 1. For this sample, the sample variance is 0.0004.
Find the sample variance of the following set of data: 8, 6, 3, 11, 7. Round the final answer to one decimal place.
Sample variance = 8.5. First, we find that the mean is $\frac{8 + 6 + 3 + 11 + 7}{5} = \frac{35}{5} = 7$. Now, we need to take the deviations from the mean and square them:
Value         8      6      3      11     7
Deviation     1.0    −1.0   −4.0   4.0    0.0
Deviation²    1.0    1.0    16.0   16.0   0.0
Finally, we add up the squared deviations and divide by the number of data values minus one (5 − 1 = 4): $\frac{1.0 + 1.0 + 16.0 + 16.0 + 0.0}{4} = 8.5$
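The hand calculation above can be checked with a short Python sketch. It is an illustration only: it recomputes the deviations step by step and then compares the result with `statistics.variance`, which also divides by n − 1.

```python
from statistics import mean, variance

data = [8, 6, 3, 11, 7]

m = mean(data)                                      # 35 / 5 = 7
squared_deviations = [(x - m) ** 2 for x in data]   # [1, 1, 16, 16, 0]

# Sample variance: sum of squared deviations divided by n - 1
sample_variance = sum(squared_deviations) / (len(data) - 1)

print(sample_variance)      # 8.5
print(variance(data))       # same value from the standard library
```

For a full population, as in the barbershop quartet question, `statistics.pvariance` divides by n instead.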
When Stephan moved from Illinois to Florida, his average monthly electric bill increased from $83 to $102. He is curious to know whether his IL or FL electric bill is relatively more or less expensive, when compared to the distribution of electric bills for each state. In Illinois, the mean monthly electric bill is $85, with a standard deviation of $3.20. In Florida, the mean monthly electric bill is $105, with a standard deviation of $4.00. Compute the z-scores for Stephan's IL and FL electric bills. Round to three decimal places if necessary.
Answers: (1) −0.625, (2) −0.75. The mean monthly electric bill in Illinois is $85, with a standard deviation of $3.20. The z-score corresponding to Stephan's IL electric bill of $83 is $z = \frac{x - \mu}{\sigma} = \frac{83 - 85}{3.2} = -0.625$. The mean monthly electric bill in Florida is $105, with a standard deviation of $4.00. The z-score corresponding to Stephan's FL electric bill of $102 is $z = \frac{x - \mu}{\sigma} = \frac{102 - 105}{4.00} = -0.75$.
Two swimmers, Angie and Beth, from different teams, wanted to find out who had the fastest time when compared to her team.
Swimmer   Time (sec)   Team Mean Time   Team Standard Deviation
Angie     26.2         27.2             0.8
Beth      27.3         30.1             1.4
Compute the z-scores for Angie and Beth.
Answers: (1) −1.25, (2) −2. Angie's team has a mean time of μ = 27.2 with a standard deviation of σ = 0.8. The z-score corresponding to Angie's swim time of x = 26.2 seconds is $z = \frac{x - \mu}{\sigma} = \frac{26.2 - 27.2}{0.8} = -1.25$. Beth's team has a mean time of μ = 30.1 with a standard deviation of σ = 1.4. The z-score corresponding to Beth's swim time of x = 27.3 seconds is $z = \frac{x - \mu}{\sigma} = \frac{27.3 - 30.1}{1.4} = -2$.
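The z-score formula used in the electric bill and swimmer questions is the same one-line calculation each time. A minimal Python sketch (the function name `z_score` is just an illustrative choice):

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


# Swimmers: (time, team mean, team standard deviation)
z_angie = z_score(26.2, 27.2, 0.8)
z_beth = z_score(27.3, 30.1, 1.4)

# Electric bills: (bill, state mean, state standard deviation)
z_il = z_score(83, 85, 3.20)
z_fl = z_score(102, 105, 4.00)

print(round(z_angie, 3), round(z_beth, 3), round(z_il, 3), round(z_fl, 3))
```

Because a lower time is better in swimming, Beth's more negative z-score means she is faster relative to her team.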
Based on the z scores found above, is Kathy or Linda's starting salary higher, when compared to the salary distributions of each company?
Kathy. Kathy's starting salary of $31,500 has a z-score of z = −1.5, which means her salary is 1.5 standard deviations below her company's mean salary. Linda's starting salary of $33,000 has a z-score of z = −2, which means her salary is 2 standard deviations below her company's mean salary. Because Kathy's salary corresponds to the greater z-score, Kathy has the better starting salary when each salary is compared to its own company's salary distribution.
Based on the z-scores calculated above for Natalie's water bills in TN and PA, in which city is her water bill closer to the city's mean water bill, when compared to their respective distributions?
Philadelphia, PA Natalie's Nashville, TN water bill has a z-score of 0.8 and her Philadelphia, PA water bill has a z-score of 0.625. For both bills, the z-scores are positive, meaning Natalie's water bill is above the mean in both cities. A z-score of 0.625 is closer to the mean, so her Philadelphia, PA water bill is closer to the PA mean water bill.
A statistics professor surveys all 100 of the students in an introductory statistics lecture. The survey asks the students to estimate when they typically wake up on weekdays. The data are recorded in terms of the number of hours after midnight the students wake up. Would it be more appropriate to find a sample standard deviation or population standard deviation in this situation?
Population standard deviation The survey was given to all 100 students in the class, which is the entire population of the class. So a population standard deviation would be more appropriate.
Casey is looking to rent a two-bedroom apartment in one of two towns, Gardiner or Augusta. He randomly selects 100 two-bedroom apartments from both towns and records the area of each apartment. Both towns have a two-bedroom apartment that has an area of 645 square feet. Based on the z-scores you calculated above, which of the two towns is more likely to have a two-bedroom apartment with 645 square feet?
The absolute value of the z-score for the apartment in Augusta is less than for the apartment in Gardiner, so the apartment in Augusta having an area of 645 square feet is more likely. A z-score with a lower absolute value means that the data value is more likely to occur. Since the absolute value of the z-score for Augusta, 1.38, is less than the absolute value of the z-score for Gardiner, 1.46, an apartment with an area of 645 square feet is more likely to occur in Augusta than in Gardiner.
The following data set provides information of Households by Total Money Income, Race, and Hispanic Origin of Householder. Looking at the data set for household income for all races in 2015, you see that the median is $56,516 and the mean is $79,263. Which of the following is most likely?
The data are skewed to the right. If the mean is greater than the median, the mean has been pulled to the right, and the data are skewed to the right.
Given the following box-and-whisker plot, decide if the data is skewed or symmetrical.
The data are skewed to the right. Note that the whisker on the right is much longer than the whisker on the left. So there are several larger values on the right. Therefore, the data are skewed right. This could represent houses in one neighborhood. Most are in the lower range, but a few are worth much more.
If the median of a data set is 11 and the mean is 11, which of the following is most likely?
The data are symmetrical. Because the mean and the median are equal, we expect that the data are symmetric. This could represent machine output of a clothing company. The majority of product output falls in the center, and a few times the machine produces more or less. This allows the average output and middle output value to be equal.
Studies show that out of the 40,000 people in a mid-sized city, 18,000 will suffer from some type of common cold in any given winter. In order to reduce the spread of the diseases, doctors are studying the statistics of communicable diseases such as the common cold. Use Excel to find above what proportion will 95% of all sample proportions be in the sampling distribution of sample proportions of size 1,000.
0.424. 1. We can use the Central Limit Theorem here since np = 1000 ⋅ 0.45 = 450 and n(1 − p) = 1000(1 − 0.45) = 550 are both greater than 10. 2. We know that $p = \frac{18{,}000}{40{,}000} = 0.45$, so $\mu_{\hat{p}} = 0.45$ by the Central Limit Theorem. We can also find $\sigma_{\hat{p}} = \sqrt{\frac{0.45(1 - 0.45)}{1000}} \approx 0.016$. We want to know the point in the sampling distribution that marks off the top 95% of sample proportions. Remember that =NORM.INV( returns left-tail values. So, we want to use 1 − 0.95 = 0.05, the complement of 0.95. 3. Type =NORM.INV(. Then enter the probability, 0.05, the population proportion, 0.45, and the standard_deviation, 0.016. Hit ENTER. The top 95% of the sampling distribution is identified by the sample proportion with the value 0.424, when rounded to three decimal places.
A sandwich shop finds that of 112 customers they serve during the day, 64 purchase a coffee with their sandwich. The shop owner is trying to decide how many baristas to hire per shift. Use Excel to find above what proportion 58% of all sample proportions will be in a sampling distribution of sample proportions of size 25. Round your answer to three decimal places.
0.551. 1. We can use the Central Limit Theorem here since np = 25 ⋅ 0.571 = 14.29 and n(1 − p) = 25(1 − 0.571) = 10.71 are both greater than 10. 2. We know that $p = \frac{64}{112} \approx 0.571$, so $\mu_{\hat{p}} = 0.571$ by the Central Limit Theorem. We can also find $\sigma_{\hat{p}} = \sqrt{\frac{0.571(1 - 0.571)}{25}} \approx 0.099$. We want to know the point in the sampling distribution that marks off the top 58% of sample proportions. Remember that =NORM.INV( returns left-tail values. So, we want to use 1 − 0.58 = 0.42, the complement of 0.58. 3. Type =NORM.INV(. Then enter the probability, 0.42, the population proportion, 0.571, and the standard_deviation, 0.099. Hit ENTER. The top 58% of the sampling distribution is identified by the sample proportion with the value 0.551, when rounded to three decimal places.
Of the 4,000 undergraduate students at a state university, 2,500 indicate on a university survey that they have taken student loans to help pay for college. In order to help make college more affordable for their students, the university's administrative team is conducting a statistical study. Use Excel to find above what proportion will 46% of all sample proportions be in the sampling distribution of sample proportions of size 250. Round your answer to three decimal places.
0.628. 1. We can use the Central Limit Theorem here since np = 250 ⋅ 0.625 = 156.25 and n(1 − p) = 250(1 − 0.625) = 93.75 are both greater than 10. 2. We know that $p = \frac{2500}{4000} = 0.625$, so $\mu_{\hat{p}} = 0.625$ by the Central Limit Theorem. We can also find $\sigma_{\hat{p}} = \sqrt{\frac{0.625(1 - 0.625)}{250}} \approx 0.031$. We want to know the point in the sampling distribution that marks off the top 46% of sample proportions. Remember that =NORM.INV( returns left-tail values. So, we want to use 1 − 0.46 = 0.54, the complement of 0.46. 3. Type =NORM.INV(. Then enter the probability, 0.54, the population proportion, 0.625, and the standard_deviation, 0.031. Hit ENTER. The top 46% of the sampling distribution is identified by the sample proportion with the value 0.628, when rounded to three decimal places.
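For readers without Excel, the NORM.INV steps in the three answers above can be reproduced with Python's standard library. This is a sketch under the assumption that `statistics.NormalDist(mean, sd).inv_cdf(probability)` plays the role of `=NORM.INV(probability, mean, standard_dev)`; the wrapper name `norm_inv` is hypothetical.

```python
from statistics import NormalDist


def norm_inv(probability, mean, standard_dev):
    """Left-tail inverse of the normal CDF, mirroring Excel's NORM.INV argument order."""
    return NormalDist(mean, standard_dev).inv_cdf(probability)


# Each call uses the complement of the "top" share, as in the worked answers.
cold_cutoff = norm_inv(1 - 0.95, 0.45, 0.016)     # top 95%, n = 1000
coffee_cutoff = norm_inv(1 - 0.58, 0.571, 0.099)  # top 58%, n = 25
loans_cutoff = norm_inv(1 - 0.46, 0.625, 0.031)   # top 46%, n = 250

print(round(cold_cutoff, 3), round(coffee_cutoff, 3), round(loans_cutoff, 3))
```

Rounded to three decimal places, the results match the Excel answers of 0.424, 0.551, and 0.628.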
The percentage of students at a local college that purchased a newly released video game is 17%. An avid gamer plans to study the video game industry and the impact of gaming on college life. For her study, what sample size, n, will the sampling distribution of sample proportions have a standard deviation of 0.01?
1411 college students. We are given a population proportion of p = 0.17 and want to know the sample size n for which the sampling distribution has a standard deviation of $\sigma_{\hat{p}} = 0.01$. Substituting the known values into the formula for the standard error, $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$, gives $0.01 = \sqrt{\frac{0.17(1 - 0.17)}{n}}$. Squaring both sides and solving for n yields $0.0001\,n = 0.17(1 - 0.17)$, so $n = \frac{0.17(1 - 0.17)}{0.0001} = 1411$. A sample size is always a whole number, so we would round UP if needed; here the result is already whole, so the desired sample size is 1,411 college students.
A doctor is concerned that hospital patients with minor injuries have longer than necessary hospital stays due to low nurse staffing. Of the last 5 years, the percentage of patients who had minor injuries and stayed at the hospital for over 3 days was 22%. If the doctor chooses to study this further, for what sample size n will the sampling distribution of sample proportions have a standard deviation of 0.15?
8 patients with minor injuries. We are given a population proportion of p = 0.22 and want to know the sample size n for which the sampling distribution has a standard deviation of $\sigma_{\hat{p}} = 0.15$. Substituting the known values into the formula for the standard error, $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$, gives $0.15 = \sqrt{\frac{0.22(1 - 0.22)}{n}}$. Squaring both sides and solving for n yields $0.0225\,n = 0.22(1 - 0.22)$, so $n = \frac{0.22(1 - 0.22)}{0.0225} \approx 7.627$. Since the sample size is always a whole number, round UP to the desired sample size of 8 patients with minor injuries.
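The two sample-size answers above solve the standard error formula for n and round up. A minimal Python sketch of that step follows; the function name `sample_size_for_se` is hypothetical, and exact fractions are used (an implementation choice, not something from the original) so that floating-point noise cannot push an exact result such as 1411 up to 1412.

```python
from fractions import Fraction
from math import ceil


def sample_size_for_se(p, se):
    """Smallest whole n with sqrt(p(1-p)/n) <= se, i.e. n = p(1-p)/se^2 rounded up."""
    p, se = Fraction(p), Fraction(se)   # exact arithmetic from decimal strings
    return ceil(p * (1 - p) / se**2)


n_gamers = sample_size_for_se("0.17", "0.01")    # video game question
n_patients = sample_size_for_se("0.22", "0.15")  # hospital stay question

print(n_gamers, n_patients)   # 1411 8
```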
A veterinary researcher is studying a particular type of dog called the Australian Cattle Dog. The researcher has acquired data on a sample of 30 dogs, including the weight in pounds of each of the dogs. The dog weight dataset is provided below.
Mean = 40.6, Median = 40.6, Mode = 38.5. The mean, median, and mode can be calculated quickly and easily in Excel using the built-in functions for these calculations. Open the accompanying dataset in Excel. The range of the data is A2:A31. In B1 type "Mean", in B2 type "=AVERAGE(", select the data or write their range, and then hit ENTER. In C1 type "Median", in C2 type "=MEDIAN(", select the data or write their range, and then hit ENTER. In D1 type "Mode", highlight cell range D2:D5, type "=MODE.MULT(", select the data or write their range, and then hit CTRL+SHIFT+ENTER. Rounding the results in B2 and C2 to one decimal place should result in a mean of 40.6 and a median of 40.6. All cells given to MODE.MULT, the cell range D2:D5, should show a value of 38.5. This indicates 38.5 is the only mode.
The following set of data represents stock prices of a pharmaceutical company. Find the sample variance of the data: 14, 7, 10, 9. Round your answer to ONE decimal place.
Sample variance = 8.7 price². First, we find that the mean is $\frac{14 + 7 + 10 + 9}{4} = \frac{40}{4} = 10$. Now, we need to take the deviations from the mean and square them:
Value   Deviation   Deviation²
14      4.0         16.0
7       −3.0        9.0
10      0.0         0.0
9       −1.0        1.0
Finally, we add up the squared deviations and divide by the number of data values minus one (4 − 1 = 3): $\frac{16.0 + 9.0 + 0.0 + 1.0}{3} \approx 8.7$. The sample variance shows how the stock values differ from the stock's mean. The larger the difference, the riskier the stock. This would say that the stock varies from 10 (the mean) by 8.7 (units²).
A college has a total enrollment of 2445 students, and 469 of them are left-handed. Use the graph below to determine the probability that a survey of 50 students will find that 9 or fewer students are left-handed. Use the graph below to calculate the probability that the sampling distribution will have a proportion of left-handed students less than or equal to 9: Drag and move the blue dot to select the appropriate probability graph area from the four options on the left. (Note - there are four graphs available to choose from. Only select between less than, greater than, and area between graphs.); Use the Central Limit Theorem to find p^ and σp^; Calculate the z-score for p^ and move the slider along the x-axis to the appropriate z-score; The purple area under the curve represents the probability of the event occurring. Interpret the purple area under the curve. Round to two decimal places.
Answers: p = 0.19 ± 0.01, $\hat{p}$ = 0.18 ± 0.01, $\sigma_{\hat{p}}$ = 0.06 ± 0.01, z = −0.17 ± 0.01, P(X ≤ 9) = 0.43 ± 0.01. 1. We will select the less than graph option, with the purple area in the left tail of the curve, because we want to know the probability that 9 or fewer people in the sample are left-handed. Before we calculate the z-score, we must find p, the population proportion. We know the total number of left-handed people in the total population, so we know the population proportion: $p = \frac{469}{2445} \approx 0.19$. We should also check that we have met the condition for the Central Limit Theorem by plugging p and n into n(1 − p) ≥ 10. We have a sample size of 50 students, so we have 50(1 − 0.19) = 40.5. Since 40.5 is greater than 10, we can use the Central Limit Theorem. The CLT for sample proportions tells us that $\hat{p}$ is normally distributed with mean $\mu_{\hat{p}} = p$ and standard deviation $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$. 2. In order to calculate the z-score, we need to identify $\mu_{\hat{p}}$ and calculate $\hat{p}$ and $\sigma_{\hat{p}}$. We have $\mu_{\hat{p}} = p = 0.19$. The standard deviation of the sampling distribution of sample proportions is $\sigma_{\hat{p}} = \sqrt{\frac{0.19(1 - 0.19)}{50}} \approx 0.06$. For the given sample of 50 students, the sample proportion is $\hat{p} = \frac{9}{50} = 0.18$. Then, the corresponding z-score is $z = \frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \frac{0.18 - 0.19}{0.06} \approx -0.17$. 3. Slide the black dot to the z-score we found above: −0.17. The area shown is 0.4325. Rounding to two decimal places, we have P(X ≤ 9) = 0.43.
The following data set provides information of Households by Total Money Income, Race, and Hispanic Origin of Householder. Looking at the data set for household income for all races in 2015, what percent of households are in the category range that contains the median?
16.7% The median is $56,516, which is in the category of $50,000 to $74,999. This range represents 16.7% of all races.
In a town, a pediatric nurse is concerned about the number of children who have whooping cough during the winter season. Of the 3,492 children living in a town, 623 of them have whooping cough. Use Excel to determine the probability that in a sample of 60 children, 7 or fewer of them have whooping cough. Round your answers to three decimal places.
Answers: (1) 0.178, (2) 0.117, (3) 0.049, (4) 0.106. 1. We can use the Central Limit Theorem for Proportions because we have np = 60 ⋅ 0.178 = 10.704 and n(1 − p) = 60(1 − 0.178) = 49.296 (both computed with the unrounded p), both of which are greater than 10. 2. The following information is given: The population proportion is p = 0.178, so we know $\mu_{\hat{p}} = p = 0.178$. The sample size is 60, so we know the standard deviation of the sampling distribution is $\sigma_{\hat{p}} = \sqrt{\frac{0.178(1 - 0.178)}{60}} \approx 0.049$. We want to know the probability $P(\hat{p} < 0.117)$. 3. Open Excel and click on any cell. 4. Type =NORM.DIST(. Then enter the value 0.117 as x, the mean of the sampling distribution $\mu_{\hat{p}} = 0.178$ as the mean, the standard deviation of the sampling distribution $\sigma_{\hat{p}} = 0.049$ as the standard_deviation, and TRUE for cumulative. Hit ENTER. This probability is 0.106. Rounding to three decimal places, we have $P(\hat{p} < 0.117) = 0.106$. That is, in a sample of 60 children, the probability that 7 or fewer of them have whooping cough is 0.106.
At a small business convention, 63% of businesses turned a profit within the first three years of opening. A reporter would like to interview these successful businesses. If a random sample of 60 businesses at the convention is chosen, use Excel to calculate the probability that more than 80% of them turned a profit within the first three years of opening. Round your answers to three decimal places.
Answer: 0.003. 1. We can use the Central Limit Theorem for Proportions because we have np = 60 ⋅ 0.63 = 37.8 and n(1 − p) = 60(1 − 0.63) = 22.2, both of which are greater than 10. 2. The following information is given: The population proportion is p = 0.63, so we know $\mu_{\hat{p}} = p = 0.63$. The sample size is 60, so we know the standard deviation of the sampling distribution is $\sigma_{\hat{p}} = \sqrt{\frac{0.63(1 - 0.63)}{60}} \approx 0.062$. We want to know the probability $P(\hat{p} > 0.8)$. 3. Open Excel and click on any cell. 4. Type =NORM.DIST(. Then enter the value 0.8 as x, the mean of the sampling distribution $\mu_{\hat{p}} = 0.63$ as the mean, the standard deviation of the sampling distribution $\sigma_{\hat{p}} = 0.062$ as the standard_deviation, and TRUE for cumulative. Hit ENTER. This probability is 0.997. Because we want to find a greater than probability, we must use the complement rule. So, $P(\hat{p} > 0.8) = 1 - 0.997 = 0.003$. So, the probability that more than 80% of a random sample turned a profit within the first three years of opening is 0.003, or 0.3%.
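The NORM.DIST step and the complement rule above can be mirrored in Python. This sketch assumes `statistics.NormalDist(mean, sd).cdf(x)` corresponds to `=NORM.DIST(x, mean, standard_dev, TRUE)`, and it rounds the standard deviation to three decimals, as the worked answer does, before using it.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.63, 60

# Central Limit Theorem conditions: np and n(1 - p) both at least 10
assert n * p >= 10 and n * (1 - p) >= 10

se = round(sqrt(p * (1 - p) / n), 3)     # 0.062, rounded as in the solution
left_tail = NormalDist(p, se).cdf(0.8)   # P(p_hat <= 0.8)
right_tail = 1 - left_tail               # complement rule: P(p_hat > 0.8)

print(round(left_tail, 3), round(right_tail, 3))   # 0.997 0.003
```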
According to a study, 80% of pro gamers own a dedicated video game console. Suppose that at a gaming convention, 50 pro gamers are sampled. What is the probability that of those 50 gamers, between 35 and 37 of them own a dedicated video game console? Round all numbers to two decimal places. Use the z-table below:
Answers: (1) 0.06 ± 0.01, (2) 0.11 ± 0.01. We are given the population proportion of p = 80% = 0.80. The standard deviation of the sampling distribution of sample proportions is thus $\sigma_{\hat{p}} = \sqrt{\frac{0.8(1 - 0.8)}{50}} \approx 0.06$. The proportions of interest are then $\hat{p}_1 = \frac{35}{50} = 0.70$ and $\hat{p}_2 = \frac{37}{50} = 0.74$, which correspond to the two z-scores $z_1 = \frac{0.70 - 0.8}{0.06} \approx -1.67$ and $z_2 = \frac{0.74 - 0.8}{0.06} = -1$. The probabilities corresponding to these z-scores can be found in the given z-table: P(X ≤ 35) = P(z ≤ −1.67) = 0.047 and P(X ≤ 37) = P(z ≤ −1) = 0.159. The probability that, in a sample of n = 50, the sample proportion $\hat{p}$ falls between $\frac{35}{50}$ and $\frac{37}{50}$ is the difference between these probabilities: P(35 ≤ X ≤ 37) = P(−1.67 ≤ z ≤ −1) = 0.159 − 0.047 = 0.112. So to two decimal places, P(35 ≤ X ≤ 37) = 0.11.
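The "area between" calculation above subtracts one left-tail area from another. A short Python sketch of the same steps, keeping the solution's rounding (standard deviation and z-scores to two decimals) and using `statistics.NormalDist` in place of the z-table:

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.80, 50
se = round(sqrt(p * (1 - p) / n), 2)   # 0.06, rounded as in the solution

z1 = round((35 / n - p) / se, 2)       # z-score for 35 of 50
z2 = round((37 / n - p) / se, 2)       # z-score for 37 of 50

std_normal = NormalDist()              # mean 0, standard deviation 1
prob_between = std_normal.cdf(z2) - std_normal.cdf(z1)

print(z1, z2, round(prob_between, 2))  # -1.67 -1.0 0.11
```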
In a town, a pediatric nurse is concerned about the number of children who have whooping cough during the winter season. Of the 3,492 children living in a town, 623 of them have whooping cough. Determine the probability that in a sample of 60 children, 7 or fewer of them have whooping cough. Round the z-score to two decimal places and all other answers to three decimal places. Use the z-table below:
Answers: (1) 0.178, (2) 0.117, (3) 0.049, (4) −1.24, (5) 0.107. We know the total number of children in the total population, so we know the population proportion: $p = \frac{623}{3{,}492} \approx 0.178$. The standard deviation of the sampling distribution of sample proportions is thus $\sigma_{\hat{p}} = \sqrt{\frac{0.178(1 - 0.178)}{60}} \approx 0.049$. For the given sample of 60 children, the sample proportion is $\hat{p} = \frac{7}{60} \approx 0.117$. The corresponding z-score is $z = \frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \frac{0.117 - 0.178}{0.049} \approx -1.24$. Since we are looking for the probability that the number of children who have whooping cough is less than or equal to 7, we want to know the area under the left tail of the standard normal distribution, to the left of z = −1.24. Using the z-table, z = −1.24 corresponds to a left-tail probability of 0.107, so $P(\hat{p} \le 0.117) = P(z \le -1.24) = 0.107$. That is, the probability that 7 or fewer out of 60 children have whooping cough is approximately 0.107, when rounded to three decimal places.
A credit card company surveys its customers to determine the number of times they use the card each month. There are 5,500 customers and 2,756 indicate that they use the card at least twice each month. Find the population proportion, as well as the mean and standard deviation of the sampling distribution for samples of size n=70 customers. Round all answers to 3 decimal places.
Answers: (1) 0.501, (2) 0.501, (3) 0.060. The population proportion is $p = \frac{2{,}756}{5{,}500} \approx 0.501$. This is also the mean of the sampling distribution: $\mu_{\hat{p}} = p = 0.501$. For samples of size n = 70, the standard deviation of the sampling distribution is $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.501(1 - 0.501)}{70}} \approx 0.060$.
A casino in Las Vegas, reporting to city officials, states that 34% of all gamblers at their facility end up winning money. The city officials take a random sample of 150 gamblers to test the casino's claims. With a 60% chance, the officials' sample proportion will be no greater than what value of p^? Round your answer to the nearest hundredth. Use the z-table given below:
Answers: (1) 0.25, (2) 0.04, (3) 0.35. The question asks about the likelihood of sample proportions, so it is the sampling distribution of sample proportions we are concerned with. Since the population proportion is given to be 0.34, we know the mean of the sampling distribution is also 0.34. The standard deviation of the sampling distribution for samples of size 150 is $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.34(1 - 0.34)}{150}} \approx 0.04$. We want to know the point in the sampling distribution for which 60% of sample proportions fall to the left and the remaining 40% fall to the right. We can work backwards in the z-table below to find the z-score corresponding to this separating percentage of 60%. Note the table does not include exactly 0.600. We want to find the value closest to the percentage that does not exceed it. So we will say 60% corresponds to a z-score of about 0.25. To find the sample proportion $\hat{p}$ that corresponds to this z-score, we substitute the known values into the formula $z = \frac{\hat{p} - p}{\sigma_{\hat{p}}}$ to get $0.25 = \frac{\hat{p} - 0.34}{0.04}$, so $\hat{p} \approx 0.35$. If the area under the sampling distribution to the left is 0.60, the value of the sample proportion $\hat{p}$ that separates this area from the other 0.40 is approximately 0.35. This means there is a 60% chance that the officials' sample proportion will be at or below 0.35.
Of the 13,500 savings accounts in a bank, 4,675 belong to people younger than 40 years old. The bank president would like to increase her institution's marketing strategy to younger customers, so she is examining the population proportions in order to create a statistical study. Find the population proportion, as well as the mean and standard deviation of the sampling distribution for samples of size n=400. Round all answers to 3 decimal places.
Answers: (1) 0.346, (2) 0.346, (3) 0.024. The population proportion is $p = \frac{4{,}675}{13{,}500} \approx 0.346$. This is also the mean of the sampling distribution: $\mu_{\hat{p}} = p = 0.346$. For samples of size n = 400, the standard deviation of the sampling distribution is $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.346(1 - 0.346)}{400}} \approx 0.024$.
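The mean and standard deviation of the sampling distribution come from two formulas, which the following Python sketch wraps in one helper. The name `sampling_distribution` is hypothetical, and unlike the worked answers the helper keeps p unrounded until the end; for these two questions the rounded results are the same.

```python
from math import sqrt


def sampling_distribution(successes, population, n):
    """Return (mean, standard deviation) of the sampling distribution of p-hat."""
    p = successes / population      # population proportion, also the mean of p-hat
    se = sqrt(p * (1 - p) / n)      # standard deviation of p-hat
    return p, se


bank_mean, bank_sd = sampling_distribution(4675, 13500, 400)  # savings accounts
card_mean, card_sd = sampling_distribution(2756, 5500, 70)    # credit card customers

print(round(bank_mean, 3), round(bank_sd, 3))   # 0.346 0.024
print(round(card_mean, 3), round(card_sd, 3))   # 0.501 0.06
```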
A graduate student majoring in linguistics is interested in studying the number of students in her college who are bilingual. Of the 1,320 students at the college, 466 of them are bilingual. If the graduate student conducts a study and samples 50 students at the college, determine the probability that 17 or fewer of them are bilingual. Round your answer for σp^ to three decimal places and all other answers to two decimal places.
Answers: (1) 0.35, (2) 0.34, (3) 0.068, (4) 0.42. 1. We can use the Central Limit Theorem for Proportions because we have np = 50 ⋅ 0.353 = 17.652 and n(1 − p) = 50(1 − 0.353) = 32.348, both of which are greater than 10. 2. The following information is given: The population proportion is p = 0.353, so we know $\mu_{\hat{p}} = p = 0.353$. The sample size is 50, so we know the standard deviation of the sampling distribution is $\sigma_{\hat{p}} = \sqrt{\frac{0.353(1 - 0.353)}{50}} \approx 0.068$. We want to know the probability $P(\hat{p} < 0.34)$. 3. Open Excel and click on any cell. 4. Type =NORM.DIST(. Then enter the value 0.34 as x, the mean of the sampling distribution $\mu_{\hat{p}} = 0.353$ as the mean, the standard deviation of the sampling distribution $\sigma_{\hat{p}} = 0.068$ as the standard_deviation, and TRUE for cumulative. Hit ENTER. This probability is 0.424. That is, the probability that 17 or fewer out of 50 students at this college are bilingual is 0.42, when rounded to two decimal places.
A graduate student majoring in linguistics is interested in studying the number of students in her college who are bilingual. Of the 1,320 students at the college, 466 of them are bilingual. If the graduate student conducts a study and samples 50 students at the college, use the graph below to determine the probability that 17 or fewer of them are bilingual. Drag and move the blue dot to select the appropriate probability graph area from the four options on the left. (Note - there are four graphs available to choose from. Only select between less than, greater than, and area between graphs.); Use the Central Limit Theorem to find p^ and σp^; Calculate the z-score for p^ and move the slider along the x-axis to the appropriate z-score; The purple area under the curve represents the probability of the event occurring. Interpret the purple area under the curve. Round the σp^ answer to three decimal places and all other answers to two decimal places.
Answers: (1) 0.35 ± 0.01, (2) 0.34 ± 0.01, (3) 0.067 ± 0.001, (4) −0.15 ± 0.01, (5) 0.44 ± 0.01. 1. We will select the less than graph option, with the purple area in the left tail of the curve, because we want to know the probability that 17 or fewer people in the sample are bilingual. Before we calculate the z-score, we must find p, the population proportion. We know the total number of college students in the total population, so we know the population proportion: $p = \frac{466}{1320} \approx 0.35$. We should also check that we have met the condition for the Central Limit Theorem by plugging p and n into n(1 − p) ≥ 10 and np ≥ 10. We have a sample size of 50 students and p = 0.35, so we have 50(1 − 0.35) = 32.5 and 50(0.35) = 17.5. Both 32.5 and 17.5 are greater than 10, so we can use the Central Limit Theorem. The CLT for sample proportions tells us that $\hat{p}$ is normally distributed with mean $\mu_{\hat{p}} = p$ and standard deviation $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$. 2. In order to calculate the z-score, we need to identify $\mu_{\hat{p}}$ and calculate $\hat{p}$ and $\sigma_{\hat{p}}$. We have $\mu_{\hat{p}} = p = 0.35$. The standard deviation of the sampling distribution of sample proportions is thus $\sigma_{\hat{p}} = \sqrt{\frac{0.35(1 - 0.35)}{50}} \approx 0.067$. For the given sample of 50 students, the sample proportion is $\hat{p} = \frac{17}{50} = 0.34$. The corresponding z-score is $z = \frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \frac{0.34 - 0.35}{0.067} \approx -0.15$. 3. Then, slide the black dot on the x-axis to the z-score we found above: −0.15. The area shown is 0.4404, and rounding to two decimal places we have 0.44. That is, the probability that 17 or fewer out of 50 students at this college are bilingual is 0.44.
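The Excel version of this bilingual-students question gives 0.42, while the graph version gives 0.44. The gap comes only from how early the intermediate values are rounded, which the following Python sketch makes visible. It is an illustration using `statistics.NormalDist`; the variable names are arbitrary.

```python
from statistics import NormalDist

# Excel-style inputs: p and sigma kept to three decimals (0.353 and 0.068)
prob_three_decimals = NormalDist(0.353, 0.068).cdf(0.34)

# Graph-style inputs: p rounded to 0.35, sigma to 0.067, z rounded to two decimals
z = round((0.34 - 0.35) / 0.067, 2)          # -0.15
prob_two_decimals = NormalDist().cdf(z)      # area to the left of z

print(round(prob_three_decimals, 2), round(prob_two_decimals, 2))   # 0.42 0.44
```

Both answers are correct for their own rounding instructions, so it is worth following the rounding stated in each question exactly.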
A kitchen supply store has a total of 642 unique items available for purchase. Of their available kitchen items, 260 are kitchen tools. The store manager would like to study this further when conducting item inventory. If the store manager surveys 52 store items, use Excel to calculate the probability that 24 or fewer of them are kitchen tools. Round your answers to three decimal places.
Answers: (1) 0.405, (2) 0.462, (3) 0.068, (4) 0.797. 1. We can use the Central Limit Theorem for Proportions because we have np = 52 ⋅ 0.405 = 21.059 and n(1 − p) = 52(1 − 0.405) = 30.940, both of which are greater than 10. 2. The following information is given: The population proportion is p = 0.405, so we know $\mu_{\hat{p}} = p = 0.405$. The sample size is 52, so we know the standard deviation of the sampling distribution is $\sigma_{\hat{p}} = \sqrt{\frac{0.405(1 - 0.405)}{52}} \approx 0.068$. We want to know the probability $P(\hat{p} < 0.462)$. 3. Open Excel and click on any cell. 4. Type =NORM.DIST(. Then enter the value 0.462 as x, the mean of the sampling distribution $\mu_{\hat{p}} = 0.405$ as the mean, the standard deviation of the sampling distribution $\sigma_{\hat{p}} = 0.068$ as the standard_deviation, and TRUE for cumulative. Hit ENTER. This probability is 0.797. This tells us that out of 52 products selected, the probability that 24 or fewer of them are kitchen tools is 0.797.
In a small town, there are 4954 adults and 2998 of them own a home. A real estate business would like to perform a study on homeowners to help with their business's advertising. If the real estate business surveys 199 adults in the town, use the graph below to determine the probability that 120 or fewer of them own a home. Drag and move the blue dot to select the appropriate probability graph area from the four options on the left. (Note - there are four graphs available to choose from. Only select between less than, greater than, and area between graphs.); Use the Central Limit Theorem to find p^ and σp^; Calculate the z-score for p^ and move the slider along the x-axis to the appropriate z-score; The purple area under the curve represents the probability of the event occurring. Interpret the purple area under the curve.
Answers: (1) 0.61 ± 0.01, (2) 0.60 ± 0.01, (3) 0.03 ± 0.01, (4) −0.33 ± 0.01, (5) 0.37 ± 0.01. 1. We will select the less than graph option, with the purple area in the left tail of the curve, because we want to know the probability that 120 or fewer people in the sample own a home. Before we calculate the z-score, we must find p, the population proportion. We know the total number of adults in the total population, so we know the population proportion: $p = \frac{2998}{4954} \approx 0.61$. We should also check that we have met the condition for the Central Limit Theorem by plugging p and n into n(1 − p) ≥ 10 and np ≥ 10. We have a sample size of 199 adults and p = 0.61, so we have 199(1 − 0.61) = 77.61 and 199(0.61) = 121.39. Both 77.61 and 121.39 are greater than 10, so we can use the Central Limit Theorem. The CLT for sample proportions tells us that $\hat{p}$ is normally distributed with mean $\mu_{\hat{p}} = p$ and standard deviation $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$. 2. In order to calculate the z-score, we need to identify $\mu_{\hat{p}}$ and calculate $\hat{p}$ and $\sigma_{\hat{p}}$. We have $\mu_{\hat{p}} = p = 0.61$. The standard deviation of the sampling distribution of sample proportions is thus $\sigma_{\hat{p}} = \sqrt{\frac{0.61(1 - 0.61)}{199}} \approx 0.03$. For the given sample of 199 adults, the sample proportion is $\hat{p} = \frac{120}{199} \approx 0.60$. The corresponding z-score is $z = \frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \frac{0.60 - 0.61}{0.03} \approx -0.33$. 3. Then, slide the black dot to the z-score we found above: −0.33. Since we are looking for the probability that the number of homeowners is less than or equal to 120, we want the area under the left tail of the standard normal distribution, to the left of z = −0.33. The area shown is 0.3707. Rounding to two decimal places, we have P(X ≤ 120) = 0.37. That is, the probability that 120 or fewer out of 199 adults in the town will be homeowners is 0.37.
A manufacturing plant produces custom hardware for specific applications in construction. One particular kind of bolt that is intended to have a length of 84mm is produced by two different machines, A and B. Tristan is an employee in the department that produces this bolt. He randomly selects samples of 100 bolts from each machine and measures the length of each one. The lengths of the bolts, in millimeters, are shown in the table below. Each machine produced a bolt that has a length of 84.05mm. Based on the z-scores you calculated above, would an 84.05mm bolt more likely be produced by Machine A or Machine B?
An 84.05mm bolt would more likely be produced by Machine A, because the absolute value of the z-score for Machine A is less than for Machine B. A z-score with a lower absolute value means that the data value is closer to the mean and therefore more likely to occur. Since the absolute value of the z-score for Machine A, 1.74, is less than the absolute value of the z-score for Machine B, 3.57, an 84.05mm bolt would more likely be produced by Machine A.
Natasha records the price for a gallon of home-heating oil from 20 randomly selected providers in her region on May 15. She does it again with another 20 randomly selected providers on November 15. The results (in dollars) are shown in the accompanying samples. Each date had one provider that had a price of $2.879 per gallon. Based on the z-scores you calculated above, is the $2.879 price per gallon more unusual to occur on May 15 or November 15?
The absolute value of the z-score for November 15 is greater than the absolute value of the z-score for May 15, so a provider having a price of $2.879 per gallon on November 15 is more unusual. A z-score with a greater absolute value means that the data value is more unusual. Since the absolute value of the z-score for November 15, 1.58, is greater than the absolute value of the z-score for May 15, 1.42, a provider having a price of $2.879 per gallon on November 15 is more unusual.
Isabel is looking at the prices for round-trip airfare from Setauket to Orchard Park where both flights occur on Wednesday or both flights occur on Sunday. She randomly selects 20 round-trips where both flights occur on Wednesday and 20 round-trips where both flights occur on Sunday. Isabel records the prices for each round-trip airfare in dollars as shown in the samples provided. One of the round-trips on Wednesday and one of the round-trips on Sunday both cost $235. Based on the z-scores you calculated above, is the $235 airfare more likely to occur on Wednesday or on Sunday?
The absolute value of the z-score for a round-trip flight that occurs on Sunday is less than the absolute value of the z-score for a round-trip flight that occurs on Wednesday, so a round-trip flight that costs $235 is more likely to occur on Sunday than on Wednesday. A z-score with a lower absolute value means that the data value is more likely to occur. Since the absolute value of the z-score for flights that occur on Sunday, 0.97, is less than the absolute value of the z-score for flights that occur on Wednesday, 1.84, a round-trip flight that costs $235 is more likely to occur on Sunday than on Wednesday.
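The three comparison cards above (bolts, heating oil, airfare) all use the same rule: compute a z-score for the value within each sample, then compare absolute values. A minimal sketch follows; the helper names `z_score` and `more_typical` are hypothetical, and because the original data tables are not reproduced here, the comparison uses only the absolute z-scores quoted in the answers.

```python
from statistics import mean, stdev


def z_score(x, sample):
    """z-score of x relative to a sample (uses the sample standard deviation)."""
    return (x - mean(sample)) / stdev(sample)


def more_typical(z_by_group):
    """Return the group whose z-score has the smallest absolute value."""
    return min(z_by_group, key=lambda group: abs(z_by_group[group]))


def more_unusual(z_by_group):
    """Return the group whose z-score has the largest absolute value."""
    return max(z_by_group, key=lambda group: abs(z_by_group[group]))


# Absolute z-scores as quoted in the answers above
print(more_typical({"Machine A": 1.74, "Machine B": 3.57}))
print(more_unusual({"May 15": 1.42, "November 15": 1.58}))
print(more_typical({"Wednesday": 1.84, "Sunday": 0.97}))
```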
The following data set provides information of Households by Total Money Income, Race, and Hispanic Origin of Householder. If the measurement of the median and the mean were reversed, which of the following is most likely?
The data are skewed to the left. If the median is greater than the mean, the mean has been pulled to the left, and the data are skewed to the left.
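The mean-versus-median rule can be illustrated with a small sketch. The function name `skew_direction` and the sample values are hypothetical, chosen only to show the effect; the rule itself is a rule of thumb rather than a formal test of skewness.

```python
from statistics import mean, median


def skew_direction(data):
    """Rule of thumb: compare the mean with the median to guess the skew."""
    m, med = mean(data), median(data)
    if m < med:
        return "left"       # low values pull the mean below the median
    if m > med:
        return "right"      # high values pull the mean above the median
    return "symmetric"


# Hypothetical values: one very low observation drags the mean down
low_tail = [1, 7, 8, 8, 9, 9, 10]
print(mean(low_tail) < median(low_tail), skew_direction(low_tail))
```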
For the following dataset, you are interested to determine the "spread" of the data. Would you employ calculations for the sample standard deviation, or population standard deviation for this dataset: The pulse rate for 5 randomly selected players on a football team.
Use calculations for sample standard deviation. To determine if sample standard deviation or population standard deviation should be used, determine if the data set represents data values collected from the entire population, or from a subset of the population. If the data values represent data collected from a subset of the population, then the sample standard deviation should be used. If the data values represent data collected from the entire population of interest, then the population standard deviation should be used. In this case, the data set represents a subset of only 5 of the players on a football team, so the sample standard deviation should be used.
The high temperature, in ∘C, on the first day of winter was recorded in a certain city every year from 1915 to 2015. The following six temperature values were randomly selected from the data. 4,12,6,9,6,11 To determine the "spread" of the data, would you employ calculations for the sample standard deviation, or population standard deviation for this data set?
Use calculations for sample standard deviation. To determine if sample standard deviation or population standard deviation should be used, determine if the data set represents data values collected from the entire population, or from a subset of the population. If the data values represent data collected from a subset of the population, then the sample standard deviation should be used. If the data values represent data collected from the entire population of interest, then the population standard deviation should be used. In this case, the sample standard deviation should be used because the six temperature values were randomly selected from the full set of data values for the years 1915 to 2015.
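To see what the choice changes numerically, here is a brief sketch using the six temperature values from the question. It relies on Python's standard `statistics` module, where `stdev` divides by n − 1 (sample) and `pstdev` divides by n (population); the variable names are illustrative.

```python
from statistics import mean, pstdev, stdev

temperatures = [4, 12, 6, 9, 6, 11]   # six randomly selected values, in degrees C

avg = mean(temperatures)
# Sum of squared deviations from the mean, shared by both formulas
squared_devs = sum((t - avg) ** 2 for t in temperatures)

sample_sd = stdev(temperatures)        # sqrt(squared_devs / (n - 1)), appropriate here
population_sd = pstdev(temperatures)   # sqrt(squared_devs / n), shown for comparison only

print(avg, squared_devs)
print(round(sample_sd, 2), round(population_sd, 2))
```

The sample version is always the larger of the two, because it divides the same sum of squared deviations by a smaller number.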
There are 19,500 inmates that are wrongfully imprisoned in the United States out of 2,200,000 inmates. A sociologist is working to exonerate those wrongfully imprisoned. In the sampling distribution of sample proportions of size 500, above what proportion will 75% of all sample proportions be? Select all answers that apply to your calculation below. Use the z-table given below to answer the question:
$z=-0.68$, $\hat{p}=0.01$. Since we are looking for the value that the top 75% of sample proportions lie above, we need to look in the z-table for the bottom 25%. So the top 75% corresponds to a z-score of $z=-0.68$. Next, we find the sample proportion that goes with this z-score. Note that the population proportion is not directly given, but can be calculated as $p=\frac{19{,}500}{2{,}200{,}000}\approx 0.009$, which is $0.01$ when rounded to two decimal places. To find the sample proportion $\hat{p}$ that corresponds to $z=-0.68$, we substitute the known values into the formula $z=\frac{\hat{p}-p}{\sigma_{\hat{p}}}$, where $\sigma_{\hat{p}}=\sqrt{\frac{p(1-p)}{n}}$, and solve for $\hat{p}$: $-0.68=\frac{\hat{p}-0.01}{\sqrt{\frac{0.01(1-0.01)}{500}}}$, which gives $\hat{p}\approx 0.0070$. Rounded to two decimal places, $\hat{p}=0.01$. This means that the top 75% of possible sample proportions are at or above 0.01.
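Solving the z-score formula for $\hat{p}$ gives $\hat{p}=p+z\,\sigma_{\hat{p}}$, which can be sketched as follows. The function name `proportion_cutoff` is hypothetical; the z-value of −0.68 and the two-decimal rounding of $p$ are taken from the worked solution, and `NormalDist().inv_cdf` is shown only as a cross-check on the table lookup.

```python
from math import sqrt
from statistics import NormalDist


def proportion_cutoff(p, n, z):
    """Sample proportion located z standard deviations from p."""
    sigma = sqrt(p * (1 - p) / n)   # standard deviation of p-hat
    return p + z * sigma


z_table = -0.68                          # table value used in the solution
z_exact = NormalDist().inv_cdf(0.25)     # 25th percentile of the standard normal

p = round(19_500 / 2_200_000, 2)         # population proportion, rounded as in the solution
p_hat = proportion_cutoff(p, 500, z_table)

print(round(z_exact, 2))
print(round(p_hat, 2))
```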