Probability and Statistics: Exam 1 Review (Chapters 1, 2, 3)
(c) What is the variable for question A? Classify the variable as qualitative or quantitative. What is the level of measurement?
hours scheduled; quantitative; ratio
(d) What is the variable for question B? Classify the variable as qualitative or quantitative. What is the level of measurement?
rating of applicability of work experience to future employment; qualitative; ordinal
(e) Is the proportion of responses "3 = very" to question B a statistic or a parameter?
statistic
Driving under the influence of alcohol (DUI) is a serious offense. The following data give the ages of a random sample of 50 drivers arrested while driving under the influence of alcohol. This distribution is based on the age distribution of DUI arrests given in the Statistical Abstract of the United States (112th Edition). 46 16 41 26 22 33 30 22 36 34 63 21 26 18 27 24 31 38 26 55 31 47 27 43 35 22 64 40 58 20 49 37 53 25 29 32 23 49 39 40 24 56 30 51 21 45 27 34 47 35 (a) Make a stem-and-leaf display of the age distribution. (Use the tens digit as the stem and the ones digit as the leaf. Enter numbers from smallest to largest separated by spaces. Enter NONE for stems with no values.)
(a)
1 | 6 8
2 | 0 1 1 2 2 2 3 4 4 5 6 6 6 7 7 7 9
3 | 0 0 1 1 2 3 4 4 5 5 6 7 8 9
4 | 0 0 1 3 5 6 7 7 9 9
5 | 1 3 5 6 8
6 | 3 4
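The display can be checked by machine. The following is a minimal Python sketch, not part of the original exercise; the function name `stem_and_leaf` is made up for illustration, and the ages are copied from the problem statement.

```python
from collections import defaultdict

# Ages of the 50 drivers, copied from the problem statement.
ages = [
    46, 16, 41, 26, 22, 33, 30, 22, 36, 34,
    63, 21, 26, 18, 27, 24, 31, 38, 26, 55,
    31, 47, 27, 43, 35, 22, 64, 40, 58, 20,
    49, 37, 53, 25, 29, 32, 23, 49, 39, 40,
    24, 56, 30, 51, 21, 45, 27, 34, 47, 35,
]

def stem_and_leaf(values):
    """Map each tens digit (stem) to its sorted ones digits (leaves)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    # Include every stem between the smallest and largest, even if empty.
    return {s: leaves.get(s, []) for s in range(min(leaves), max(leaves) + 1)}

display = stem_and_leaf(ages)
for stem, row in display.items():
    # An empty stem prints NONE, as the exercise asks.
    print(stem, "|", " ".join(map(str, row)) or "NONE")
```

Sorting before splitting each value into stem and leaf keeps the leaves in increasing order within every stem.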
Each of the following data sets has a mean of x̄ = 10. (i) 8 9 10 11 12 (ii) 7 9 10 11 13 (iii) 7 8 10 12 13 (a) Without doing any computations, order the data sets according to increasing value of standard deviations. (b) Why do you expect the difference in standard deviations between data sets (i) and (ii) to be greater than the difference in standard deviations between data sets (ii) and (iii)? Hint: Consider how much the data in the respective sets differ from the mean.
(a) (i), (ii), (iii) (b) The change in the data from set (i) to set (ii) increases the sum of squared deviations Σ(x − x̄)² by more than the change from set (ii) to set (iii) does.
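The answer to (b) can be confirmed numerically, even though the question asks for reasoning without computation. In the Python sketch below, the helper name `sum_sq_dev` is an assumption for illustration. The sums of squared deviations come out to 10, 20, and 26, so the step from (i) to (ii) adds 10 while the step from (ii) to (iii) adds only 6.

```python
from statistics import mean, stdev

data_sets = {
    "(i)": [8, 9, 10, 11, 12],
    "(ii)": [7, 9, 10, 11, 13],
    "(iii)": [7, 8, 10, 12, 13],
}

def sum_sq_dev(xs):
    """Sum of squared deviations from the mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

ss = {name: sum_sq_dev(xs) for name, xs in data_sets.items()}
sd = {name: stdev(xs) for name, xs in data_sets.items()}  # sample standard deviation

for name in data_sets:
    print(name, "sum of squares:", ss[name], "s:", round(sd[name], 2))
```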
The following data represent glucose blood levels (mg/100 ml) after a 12-hour fast for a random sample of 70 women (Reference: American Journal of Clinical Nutrition, Vol. 19, pp. 345-351). 45 66 83 71 76 64 59 59 76 82 80 81 85 77 82 90 87 72 79 69 83 71 87 69 81 76 96 83 67 94 101 94 89 94 73 99 93 85 83 80 78 80 85 83 84 74 81 70 65 89 70 80 84 77 65 46 80 70 75 45 101 71 109 73 73 80 72 81 63 74 For this problem, use six classes. (a) Find the class width.
(a) 11
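The class width comes from dividing the range by the number of classes and rounding up to a whole number: (109 − 45) / 6 ≈ 10.67, which rounds up to 11. Below is a minimal Python sketch of that rule. It uses `math.ceil`, which is an assumption about the rounding convention: some textbooks move to the next whole number even when the quotient is already whole, and this sketch does not.

```python
import math

# Glucose levels (mg/100 ml) for the 70 women, copied from the problem statement.
glucose = [
    45, 66, 83, 71, 76, 64, 59, 59, 76, 82,
    80, 81, 85, 77, 82, 90, 87, 72, 79, 69,
    83, 71, 87, 69, 81, 76, 96, 83, 67, 94,
    101, 94, 89, 94, 73, 99, 93, 85, 83, 80,
    78, 80, 85, 83, 84, 74, 81, 70, 65, 89,
    70, 80, 84, 77, 65, 46, 80, 70, 75, 45,
    101, 71, 109, 73, 73, 80, 72, 81, 63, 74,
]

n_classes = 6
data_range = max(glucose) - min(glucose)          # largest minus smallest value
class_width = math.ceil(data_range / n_classes)   # round up to a whole number

print(class_width)
```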
Government agencies carefully monitor water quality and its effect on wetlands (Reference: Environmental Protection Agency Wetland Report EPA 832-R-93-005). Of particular concern is the concentration of nitrogen in water draining from fertilized lands. Too much nitrogen can kill fish and wildlife. Twenty-eight samples of water were taken at random from a lake. The nitrogen concentration (milligrams of nitrogen per liter of water) was determined for each sample. (a) Identify the variable. (b) Is the variable quantitative or qualitative? (c) What is the implied population?
(a) nitrogen concentration (b) quantitative (c) the entire lake
Where does all the water go? According to the Environmental Protection Agency (EPA), in a typical wetland environment, 40% of the water is outflow; 45% is seepage; 7% evaporates; and 8% remains as water volume in the ecosystem (Reference: United States Environmental Protection Agency Case Studies Report 832-R-93-005). Chloride compounds as residuals from residential areas are a problem for wetlands. Suppose that in a particular wetland environment the following concentrations (mg/l) of chloride compounds were found: outflow, 55.3; seepage, 74.7; remaining due to evaporation, 58.7; in the water volume, 41.7. (a) Compute the weighted average of chlorine compound concentration (mg/l) for this ecological system. (Round your answer to one decimal place.) (b) Suppose the EPA has established an average chlorine compound concentration target of no more than 58 mg/l. Does this wetlands system meet the target standard for chlorine compound concentration?
(a) 63.2 (b) No. The average chlorine compound concentration (mg/l) is too high.
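The weighted average multiplies each concentration by its share of the water and adds the products: 0.40(55.3) + 0.45(74.7) + 0.07(58.7) + 0.08(41.7) = 63.18, or 63.2 to one decimal place. The Python sketch below is only an illustration of that arithmetic; the variable names are assumptions.

```python
# (share of water, concentration in mg/l), from the problem statement.
components = {
    "outflow": (0.40, 55.3),
    "seepage": (0.45, 74.7),
    "evaporation": (0.07, 58.7),
    "water volume": (0.08, 41.7),
}

# Dividing by the total weight keeps the formula valid even if the
# weights did not sum to exactly 1.
total_weight = sum(w for w, _ in components.values())
weighted_avg = sum(w * x for w, x in components.values()) / total_weight

target = 58  # mg/l, the stated EPA target
meets_target = weighted_avg <= target

print(round(weighted_avg, 1), meets_target)
```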
Consider these types of graphs: histogram, bar graph, Pareto chart, pie chart, stem-and-leaf display. (a) Which are suitable for qualitative data? (Select all that apply.) (b) Which are suitable for quantitative data? (Select all that apply.)
(a) bar graph, pie chart, Pareto chart (b) Pareto chart, pie chart, bar graph, histogram, stem-and-leaf display
Categorize the type of sampling (simple random, stratified, systematic, cluster, or convenience) used in each of the following situations. (a) To conduct a preelection opinion poll on a proposed amendment to the state constitution, a random sample of 10 telephone prefixes (first three digits of the phone number) was selected, and all households from the phone prefixes selected were called. (b) To conduct a study on depression among the elderly, a sample of 30 patients in one nursing home was used. (c) To maintain quality control in a brewery, every 20th bottle of beer coming off the production line was opened and tested. (d) Subscribers to a new smart phone app that streams songs were assigned numbers. Then a sample of 30 subscribers was selected by using a random-number table. The subscribers in the sample were invited to rate the process for selecting the songs in the playlist. (e) To judge the appeal of a proposed television sitcom, a random sample of 10 people from each of three different age categories was selected and those chosen were asked to rate a pilot show.
(a) cluster sample (b) convenience sample (c) systematic sample (d) simple random sample (e) stratified sample
Categorize these measurements associated with a robotics company according to level: nominal, ordinal, interval, or ratio. (a) Salesperson's performance: below average, average, above average. (b) Price of company's stock (c) Names of new products (d) Temperature (°F) in CEO's private office (e) Gross income for each of the past 5 years (f) Color of product packaging
(a) ordinal (b) ratio (c) nominal (d) interval (e) ratio (f) nominal
An important part of employee compensation is a benefits package, which might include health insurance, life insurance, child care, vacation days, retirement plan, parental leave, bonuses, etc. Suppose you want to conduct a survey of benefits packages available in private businesses in Hawaii. You want a sample size of 100. Some sampling techniques are described below. Categorize each technique as simple random sample, stratified sample, systematic sample, cluster sample, or convenience sample. (a) Assign each business in the Island Business Directory a number, and then use a random-number table to select the businesses to be included in the sample. (b) Use postal ZIP Codes to divide the state into regions. Pick a random sample of 10 ZIP Code areas and then include all the businesses in each selected ZIP Code area. (c) Send a team of five research assistants to Bishop Street in downtown Honolulu. Let each assistant select a block or building and interview an employee from each business found. Each researcher can have the rest of the day off after getting responses from 20 different businesses. (d) Use the Island Business Directory. Number all the businesses. Select a starting place at random, and then use every 50th business listed until you have 100 businesses. (e) Group the businesses according to type: medical, shipping, retail, manufacturing, financial, construction, restaurant, hotel, tourism, other. Then select a random sample of 10 businesses from each business type.
(a) simple random sample (b) cluster sample (c) convenience sample (d) systematic sample (e) stratified sample
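The four probability-based techniques above can be contrasted in code. This is a rough Python sketch on an entirely hypothetical directory: the 5000 businesses, their types, and their ZIP labels are made up for illustration and do not come from the Island Business Directory. A convenience sample is omitted because it involves no random selection.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical directory: made-up ids, business types, and ZIP regions.
types = ["medical", "retail", "hotel", "other"]
businesses = [
    {"id": i, "type": types[i % 4], "zip": f"Z{i % 50:02d}"}
    for i in range(5000)
]

# Simple random sample: every set of 100 businesses is equally likely.
simple = rng.sample(businesses, 100)

# Systematic sample: random starting place, then every k-th business.
k = 50
start = rng.randrange(k)
systematic = businesses[start::k][:100]

# Stratified sample: a random sample drawn from every stratum (type).
stratified = []
for t in types:
    stratum = [b for b in businesses if b["type"] == t]
    stratified.extend(rng.sample(stratum, 25))

# Cluster sample: choose whole clusters (ZIP regions) at random,
# then include every business in the chosen clusters.
all_zips = sorted({b["zip"] for b in businesses})
chosen_zips = set(rng.sample(all_zips, 10))
cluster = [b for b in businesses if b["zip"] in chosen_zips]
```

The stratified sample draws from every group, while the cluster sample takes everything from a few groups.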
You are conducting a study of students doing work-study jobs on your campus. Among the questions on the survey instrument are the following. A. How many hours are you scheduled to work each week? Answer to the nearest hour. B. How applicable is this work experience to your future employment goals? Respond using the following scale: 1 = not at all, 2 = somewhat, 3 = very (a) Suppose you take random samples from the following groups: freshmen, sophomores, juniors, and seniors. What kind of sampling technique are you using (simple random, stratified, systematic, cluster, multistage, convenience)? (b) Describe the individuals of this study.
(a) stratified sample (b) Students on your campus with work-study jobs.
(f) Suppose only 40% of the students you selected for the sample respond. What is the nonresponse rate? Do you think the nonresponse rate might introduce bias into the study? Explain.
60%. Yes, the people choosing not to respond may have some characteristics that would bias the study.
Describe how data outliers might be revealed in histograms and stem-and-leaf plots.
Any large gaps between bars or stems might indicate potential outliers.
(b) Make a frequency table using seven classes.
Class Limits   Class Boundaries   Midpoint   Frequency   Relative Frequency   Cumulative Frequency
16 − 22        15.5 − 22.5        19         8           0.16                 8
23 − 29        22.5 − 29.5        26         11          0.22                 19
30 − 36        29.5 − 36.5        33         11          0.22                 30
37 − 43        36.5 − 43.5        40         7           0.14                 37
44 − 50        43.5 − 50.5        47         6           0.12                 43
51 − 57        50.5 − 57.5        54         4           0.08                 47
58 − 64        57.5 − 64.5        61         3           0.06                 50
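The frequency table can be rebuilt from the DUI age data. The Python sketch below assumes the usual textbook conventions: class width is the range divided by the number of classes, rounded up; the first class starts at the smallest value; boundaries sit 0.5 beyond the limits. The dictionary keys are illustrative names only.

```python
import math

# Ages of the 50 drivers, copied from the DUI problem statement.
ages = [
    46, 16, 41, 26, 22, 33, 30, 22, 36, 34,
    63, 21, 26, 18, 27, 24, 31, 38, 26, 55,
    31, 47, 27, 43, 35, 22, 64, 40, 58, 20,
    49, 37, 53, 25, 29, 32, 23, 49, 39, 40,
    24, 56, 30, 51, 21, 45, 27, 34, 47, 35,
]

n_classes = 7
lowest = min(ages)
# Class width: range / number of classes, rounded up to a whole number.
width = math.ceil((max(ages) - lowest) / n_classes)

rows = []
cumulative = 0
for i in range(n_classes):
    lower = lowest + i * width
    upper = lower + width - 1
    freq = sum(lower <= a <= upper for a in ages)  # count ages in this class
    cumulative += freq
    rows.append({
        "limits": (lower, upper),
        "boundaries": (lower - 0.5, upper + 0.5),
        "midpoint": (lower + upper) / 2,
        "freq": freq,
        "rel_freq": freq / len(ages),
        "cum_freq": cumulative,
    })

for r in rows:
    print(r)
```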
How are dotplots and stem-and-leaf displays similar?
Dotplots and stem-and-leaf displays both show every data value.
(g) Would it be appropriate to generalize the results of your study to all work-study students in the nation? Explain.
No, the sampling frame is restricted to one campus.
How are dotplots and stem-and-leaf displays different?
Stem-and-leaf displays group the data with the same stem, whereas dotplots only group the data with identical values.