Intro Into Stats
Will the following variables have positive correlation, negative correlation, or no correlation? (1) outside temperature and the day of the week (2) number of doctors and number of administrators at a hospital (3) interest rates on car loans and number of cars sold
(1) no correlation (2) positive correlation (3) negative correlation
The length of human pregnancies is approximately normal with mean μ = 266 and standard deviation σ = 16 days.
(a) Use the StatCrunch Normal Calculator. If 100 pregnant individuals were selected independently from this population, we would expect 33 pregnancies to last less than 259 days. (b) The sampling distribution of x̄ for n = 8 is ( normal ) with μx̄ = 266 and σx̄ = 16 / √8 = 5.6569 (c) Replace σ with the σx̄ found above in the Normal Calculator. The probability that the mean of a random sample of 8 pregnancies is less than 259 days is .1080. If 100 independent random samples of size n = 8 pregnancies were obtained from this population, we would expect 11 sample(s) to have a sample mean of 259 days or less. (d) Replace σx̄ again with the value for the new sample size (.0060) ...(1) sample to have a sample mean of 259 days or less (e) This result would be unusual, so the sample likely came from a population whose mean gestation period is less than 266 days. (f) What is the probability a random sample of size 18 will have a mean gestation period within 9 days of the mean? Find σx̄ = σ / √n for n = 18 (3.771236166). Add and subtract 9 from the mean: 266 - 9 = 257 and 266 + 9 = 275. Put 257 < x̄ < 275 and the new σx̄ into the Normal Calculator. ANSWER: .9830
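The Normal Calculator steps above can also be checked outside StatCrunch. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the original card.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 266, 16  # population mean and standard deviation from the card

# (a) P(X < 259) for a single pregnancy
p_single = NormalDist(mu, sigma).cdf(259)

# (b)/(c) sampling distribution of the mean for n = 8, then P(x-bar < 259)
se_8 = sigma / sqrt(8)
p_mean_8 = NormalDist(mu, se_8).cdf(259)

# (f) P(257 < x-bar < 275) for n = 18
se_18 = sigma / sqrt(18)
dist_18 = NormalDist(mu, se_18)
p_within_9 = dist_18.cdf(275) - dist_18.cdf(257)

print(round(p_single, 4), round(se_8, 4), round(p_mean_8, 4), round(p_within_9, 4))
```

The printed values should agree with the card's answers (33 in 100, 5.6569, .1080, .9830) up to rounding.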
According to flightstats.com, American Airlines flights from Dallas to Chicago are on time 80% of the time. Suppose 13 flights are randomly selected, and the number of on-time flights is recorded
(a) There are two mutually exclusive outcomes, The experiment is performed a fixed number of times, the trials are independent, the probability of success is the same for each trial of the experiment (b) n = 13 and p =.80 (c) Use StatCrunch 0.0691 In 100 trials of this experiment, it is expected that about 7 will result in exactly 8 flights being on time (d) Use StatCrunch .0300 In 100 trials of this experiment, it is expected that about 3 will result in fewer than 8 flights being on time (e) Use StatCrunch .9700 97 (f) Use StatCrunch .0979 10
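The StatCrunch binomial results above can be reproduced from the binomial formula P(X = k) = C(n, k) p^k (1 - p)^(n - k). A minimal sketch in Python follows; `binom_pmf` is a helper name chosen here for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 13, 0.80  # 13 flights, 80% on-time rate

p_exactly_8 = binom_pmf(8, n, p)                              # part (c)
p_fewer_than_8 = sum(binom_pmf(k, n, p) for k in range(8))    # part (d): k = 0..7
p_at_least_8 = 1 - p_fewer_than_8                             # part (e)

print(round(p_exactly_8, 4), round(p_fewer_than_8, 4), round(p_at_least_8, 4))
```

Multiplying each probability by 100 gives the expected count in 100 trials of the experiment.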
For students who first enrolled in two-year public institutions in a recent semester, the proportion who earned a bachelor's degree within six years was 0.389. The president of a certain junior college believes that students who enrolled in her institution have a higher completion rate. (a) State the null and alternative hypotheses in words (b) State the null and alternative hypotheses symbolically (c) Explain what it would mean to make a Type I error (d) Explain what it would mean to make a Type II error
(a) Among students who enroll at the certain junior college, the completion rate is 0.389. Among students who enroll at the certain junior college, the completion rate is greater than 0.389 (b) H0: p = 0.389 H1: p > 0.389 Type I error: is committed when the null hypothesis is rejected when, in fact, it is true. Type II error: is committed when the null hypothesis is not rejected when, in fact, the alternative hypothesis is true (c) The president REJECTS the hypothesis that the proportion of students who earn a bachelor's degree within six years is EQUAL TO 0.389 when, in fact, the proportion is EQUAL TO 0.389 (d) The president FAILS TO REJECT the hypothesis that the proportion of students who earn a bachelor's degree within six years is EQUAL TO 0.389 when, in fact, the proportion is GREATER THAN 0.389
Several years ago, the mean height of women 20 years of age or older was 63.7 inches. Suppose that a random sample of 45 women who are 20 years of age or older today results in a mean height of 64.6 inches (a) State the appropriate null and alternative hypothesis to assess whether women are taller today. (b) Suppose the P-value for this test is 0.16. Explain what this value represents. (c) Write a conclusion for this hypothesis test assuming an α = 0.05 level of significance.
(a) H0: μ = 63.7 in. versus H1: μ > 63.7 in. (b) There is a 0.16 probability of obtaining a sample mean height of 64.6 inches or taller from a population whose mean height is 63.7 inches. (c) Do not reject the null hypothesis. There is not sufficient evidence to conclude that the mean height of women 20 years of age or older is greater today.
To test H0: μ = 37 versus H1: μ ≠ 37, a simple random sample of size n = 35 is obtained. Complete parts (a) through (f) below.
(a) No - since the sample size is at least 30, the underlying population does not need to be normally distributed. (b) If x̄ = 41.4 and s = 10.3, compute the test statistic. USE STATCRUNCH T SUMMARY CALC t0 = 2.53 (c) Very small amount of shading on BOTH edges of graph (d) The probability of observing a SAMPLE statistic AS EXTREME OR MORE EXTREME THAN the one observed, assuming H0 is true, is in the range of 0.01 < P-value < 0.02 (e) Because the P-Value is GREATER than α, the researcher will FAIL TO REJECT the null hypothesis. (f) USE THE CONFIDENCE LEVEL IN THE SAME GRAPH TO FIND BOUNDARIES Because the value 37 lies WITHIN the confidence interval, we FAIL TO REJECT the null hypothesis
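For part (b), the test statistic is t0 = (x̄ - μ0) / (s / √n). A short Python sketch of that arithmetic, with variable names chosen here for illustration:

```python
from math import sqrt

mu_0 = 37               # hypothesized mean under H0
x_bar, s, n = 41.4, 10.3, 35

standard_error = s / sqrt(n)
t_0 = (x_bar - mu_0) / standard_error
df = n - 1              # degrees of freedom used to look up the P-value range

print(round(t_0, 2), df)
```

The P-value range in part (d) still comes from a t-table or StatCrunch; the standard library has no t-distribution.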
Determine if the following probability experiment represents a binomial experiment. If not, explain why. If the probability experiment is a binomial experiment, state the number of trials, n. (a) A random sample of 15 middle school students is obtained, and the individuals selected are asked to state their weights. (b) An experimental drug is administered to 80 randomly selected individuals, with the number of individuals responding favorably recorded. (c) Four cards are selected from a standard 52-card deck without replacement. The number of aces selected is recorded. (d) An investor randomly purchases 9 stocks listed on a stock exchange. Historically, the probability that a stock listed on this exchange will increase in value over the course of a year is 49%. The number of stocks that increase in value is recorded.
(a) No, because there are more than two mutually exclusive outcomes for each trial. (b) Yes, because the experiment satisfies all the criteria for a binomial experiment, n = 80 (c) No, because the trials of the experiment are not independent since the probability of success differs from trial to trial (d) Yes, because the experiment satisfies all the criteria for a binomial experiment, n = 9 and p = .49
A researcher wishes to estimate the average blood alcohol concentration (BAC) for the drivers involved in fatal accidents who are found to have positive BAC values. He randomly selected records from 82 such drivers in 2009 and determines the sample mean BAC to be 0.17 g/dL with a standard deviation of 0.060 g/dL.
(a) Since the distribution of blood alcohol concentrations is highly skewed right, a large sample size is necessary to ensure that the distribution of the sample mean is approximately normal. (b) The sample size is likely less than 5% of the population. (c) USE STATCRUNCH SAMPLE T SUMMARY CALC x̄ = 0.17 s = 0.060 sample size = 82 d) While it is possible that the population mean is not captured in the confidence interval, it is not likely.
Suppose a simple random sample of size n = 64 is obtained from a population that is skewed right with μ =70 and σ = 32. (a) Describe the sampling distribution of x̄ (b) What is P ( x̄ > 75.6) (c) What is P ( x̄ ≤ 61.4) (d) What is P ( 65.6 < x̄ < 78.8)
(a) The distribution is approximately normal. μx̄ = 70, σx̄ = σ / √n = 32 / √64 ANSWER: 4 (b) Use the StatCrunch Normal Calculator with the values found above. (.0808) (c) Use the StatCrunch Normal Calculator with the values found above. (.0158) (d) Use the StatCrunch Normal Calculator with the values found above. (.8504)
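As a cross-check on the calculator, the three probabilities can be computed from the sampling distribution of x̄. This is a sketch with Python's `statistics.NormalDist`; it assumes the normal approximation from part (a).

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 70, 32, 64
se = sigma / sqrt(n)              # standard deviation of the sample mean
xbar_dist = NormalDist(mu, se)    # approximately normal by the central limit theorem

p_b = 1 - xbar_dist.cdf(75.6)                     # P(x-bar > 75.6)
p_c = xbar_dist.cdf(61.4)                         # P(x-bar <= 61.4)
p_d = xbar_dist.cdf(78.8) - xbar_dist.cdf(65.6)   # P(65.6 < x-bar < 78.8)

print(se, round(p_b, 4), round(p_c, 4), round(p_d, 4))
```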
Determine whether the random variable is discrete or continuous. In each case, state the possible values of the random variable. (a) The number of fish caught during a fishing tournament. (b) The square footage of a house.
(a) The number of fish caught during a fishing tournament The random variable is discrete. The possible values are x = 0, 1, 2 ... (b) The square footage of a house The random variable is continuous. The possible values are a > 0 (c) The number of free-throw attempts before the first shot is made The random variable is discrete. The possible values are x = 0, 1, 2 ... (d) The time it takes for a light bulb to burn out The random variable is continuous. The possible values are a > 0 (e) The number of people with blood type A in a random sample of 39 people The random variable is discrete. The possible values are x = 0, 1, 2, ..., 39 (f) The time it takes to fly from City A to City B The random variable is continuous. The possible values are a > 0 (g) The number of students The random variable is discrete. The possible values are x = 0, 1, 2 ... (h) The amount of snow that falls The random variable is continuous. The possible values are a ≥ 0 (i) Flight time accumulated The random variable is continuous. The possible values are a > 0 (j) Number of points scored The random variable is discrete. The possible values are x = 0, 1, 2 ...
In a survey conducted by a reputable marketing agency, 249 of 1000 adults 19 years of age or older confessed to bringing and using their cell phone every trip to the bathroom (confessions included texting and answering phone calls).
(a) The sample is the 1000 adults 19 years of age or older. (b) The population is all adults 19 years of age or older. (c) The variable of interest is BRINGING ONE'S CELL PHONE EVERY TRIP TO THE BATHROOM. The variable is QUALITATIVE WITH TWO OUTCOMES because INDIVIDUALS ARE CLASSIFIED BASED ON A CHARACTERISTIC. (d) p̂ = x / n = 249 / 1000 (0.249) (e) Why is the point estimate found in part (d) a statistic? ITS VALUE IS BASED ON A SAMPLE Why is the point estimate found in part (d) a random variable? ITS VALUE MAY CHANGE DEPENDING ON THE INDIVIDUALS IN THE SURVEY. What is the source of variability in the random variable? THE INDIVIDUALS SELECTED TO BE IN THE STUDY. (f) We are 95% confident the proportion of adults 19 years or older who bring their cell phone every trip to the bathroom is between 0.222 and 0.276. (g) What ensures that the results of this study are representative of all adults 19 years of age or older? RANDOM SAMPLING
The shape of the distribution of the time required to get an oil change at a 20-minute oil change facility is skewed right. However, records indicate that the mean time is 21.1 minutes, and the standard deviation is 3.7 minutes.
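The interval in part (f) follows from p̂ ± z·√(p̂(1 - p̂)/n). A minimal Python sketch, assuming the usual large-sample (normal) interval for a proportion; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x, n = 249, 1000
p_hat = x / n                                # point estimate from part (d)

confidence = 0.95
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # critical value, about 1.96
margin = z * sqrt(p_hat * (1 - p_hat) / n)

lower, upper = p_hat - margin, p_hat + margin
print(round(lower, 3), round(upper, 3))
```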
(a) The sample size needs to be greater than 30 (b) Find σx̄ and put it into the Normal Calculator (.0393) (c) Put the 10% chance in the outcome of the calculator (20.3)
Determine whether the following sampling is dependent or independent. Indicate whether the response variable is qualitative or quantitative. (a) A researcher wishes to compare the academic aptitudes of married lawyers and their spouses. She obtains a random sample of 961 such couples who take an academic aptitude test and determines each spouse's academic aptitude.
(a) The sampling is dependent because an individual selected for one sample does dictate which individual is to be in the second sample. The variable is quantitative because it is a numerical measure (an aptitude test score).
In a survey, 900 adults in a certain country were asked how many hours they worked in the previous week. Based on the results, a 95% confidence interval for mean number of hours worked was lower bound: 40.5 and upper bound: 42.6. Which of the following represents a reasonable interpretation of the result? For those that are not reasonable, explain the flaw.
(a) There is a 95% chance the mean number of hours worked by adults in this country in the previous week was between 40.5 and 42.6 hours. FLAWED. THIS INTERPRETATION IMPLIES THAT THE POPULATION MEAN VARIES RATHER THAN THE INTERVAL. (b) We are 95% confident that the mean number of hours worked by adults in this country in the previous week was between 40.5 and 42.6 hours. CORRECT. THIS INTERPRETATION IS REASONABLE. (c) 95% of adults in this country worked between 40.5 and 42.6 hours last week. FLAWED. THIS INTERPRETATION MAKES AN IMPLICATION ABOUT INDIVIDUALS RATHER THAN THE MEAN. (d) We are 95% confident that the mean number of hours worked by adults in a particular area of this country in the previous week was between 40.5 and 42.6 hours. FLAWED. THE INTERPRETATION SHOULD BE ABOUT THE MEAN NUMBER OF HOURS WORKED BY ADULTS IN THE WHOLE COUNTRY, NOT ABOUT ADULTS IN THE PARTICULAR AREA.
In the probability distribution to the right, the random variable X represents the number of hits a baseball player obtained in a game over the course of a season.
(a) This is a discrete probability distribution because (All of the probabilities are) between (0) and (1), inclusive, and the (sum) of the probabilities is (1). (b) C (the graph that matches the given table) (c) The distribution (Has one mode) and is (skewed right) (d) mean = 1.6296 (e) Over the course of many games, one would expect the mean number of hits per game to be the mean of the random variable (f) s.d = 1.182 (g) 0.2820
In the following probability distribution, the random variable x represents the number of activities a parent of a 6th to 8th grade student is involved in.
(a) This is a discrete probability distribution because the sum of the probabilities is 1 and each probability is between 0 and 1, inclusive. (b) Graph C, match the (0, 0.415) and find the graph (c) 1.333 (Use StatCrunch: Stats > Custom Calc > Values in X and Weight in P(X)). The mean of a discrete random variable can be found by using either technology or the formula ∑(x * P(x)), where P(x) is the probability of observing the value x. (d) As the number of experiments increases, the mean of the observations will approach the mean of the random variable (e) 1.3 activities (found same way as mean) (f) 0.132 (Pulled from given table) (g) 0.196 (added the two columns in table)
The following data represents the muzzle velocity (in feet per second) of rounds fired from a 155-mm gun. For each round, two measurements of the velocity were recorded using two different measuring devices, resulting in the following data. Complete parts (a) through (d) below.
(a) Two measurements (A and B) are taken on the same round. (b) H0: μd = 0 versus H1: μd ≠ 0 (c) Use T stat PAIRED CALC (-0.59) The P-value is in the range of 0.25 < P-value < 1 DO NOT REJECT H0. There IS NOT sufficient evidence at the α = 0.01 level of significance to conclude that there is a difference in the measurements of velocity between device A and device B. (d) Use the same PAIRED calc but use confidence levels L.B. -10.34 U.B. 7.71 One can be 99% confident that the mean difference in measurement lies in the interval found above. (e) Click edit on the graph and click make a boxplot Yes, because 0 is contained in the boxplot
The reading speed of second grade students in a large city is approximately normal, with a mean of 88 words per minute (wpm) and a standard deviation of 10 wpm.
(a) Use the StatCrunch Normal Calculator (.2743) ...27 to read more than 94 words per minute (b) Recompute σx̄ and put it into the calculator (.0289) ...3 sample(s) to have a sample mean reading rate of more than 94 words per minute (c) Recompute σx̄ and put it into the calculator (.0036) ...0 sample(s) to have a sample mean reading rate of more than 94 words per minute (d) Increasing the sample size DECREASES the probability because σx̄ DECREASES as n INCREASES (e) Change σx̄ to the value for the new sample size and find the probability of a sample mean of 90.8 or more, then multiply that probability by 100 to get the expected number of samples out of 100. ANSWER: A mean reading rate of 90.8 wpm is not unusual since the probability of obtaining a result of 90.8 or more is .1111. This means that we would expect a mean reading rate of 90.8 or higher from a population whose mean reading rate is 88 in 11 of every 100 random samples of size n = 19. The new program is not abundantly more effective than the old program. (f) Change σx̄ to the value for the new sample size and put the given percentage in the outcome of the calculator. ANSWER: 91.43
Complete parts (a) through (d) for the sampling distribution of the sample mean shown in the accompanying graph.
(a) Use the middle line at the peak (300) (b) σx̄ is the distance between the peak (mean) and one of the lines (20) (c) If the sample size is n = 9, what is likely true about the shape of the population? ANSWER: The shape of the population is approximately normal (d) Use σx̄ = σ / √n, but σx̄ is known (20), so work backwards: σ = σx̄ * √n = 20 * 3 (60)
A researcher studies water clarity at the same location in a lake on the same dates during the course of a year and repeats the measurements on the same dates 5 years later. The researcher immerses a weighted disk painted black and white and measures the depth (in inches) at which it is no longer visible. The collected data is given in the table below. Complete parts (a) through (c) below.
(a) Using the same dates makes the second sample dependent on the first and reduces variability in water clarity attributable to the date. (b) Clarity is improving: H0: μd = 0 versus H1: μd < 0 Test statistic: Use PAIRED T CALC (-1.09) The P-value is in the range 0.10 < P-value < 0.25 DO NOT REJECT H0. There IS NOT sufficient evidence at the α = 0.05 level of significance to conclude that the clarity of the lake is improving. (c) Click edit on the graph and click make a boxplot Yes, because 0 is contained in the boxplot
In a survey of 1045 adults in a certain country conducted during a period of economic uncertainty, 55% thought that wages paid to workers in industry were too low. The margin of error was 4 percentage points with 95% confidence. For parts (a) through (d) below, which represents a reasonable interpretation of the survey results? For those that are not reasonable, explain the flaw.
(a) We are 95% confident 55% of adults in the country during the period of economic uncertainty felt wages paid to workers in industry were too low. THE INTERPRETATION IS FLAWED. THE INTERPRETATION PROVIDES NO INTERVAL ABOUT THE POPULATION PROPORTION. (b) We are 91% to 99% confident 55% of adults in the country during the period of economic uncertainty felt wages paid to workers in industry were too low. THE INTERPRETATION IS FLAWED. THE INTERPRETATION INDICATES THAT THE LEVEL OF CONFIDENCE IS VARYING. (c) We are 95% confident that the interval from 0.51 to 0.59 contains the true proportion of adults in the country during the period of economic uncertainty who believed wages paid to workers in industry were too low. THE INTERPRETATION IS REASONABLE. (d) In 95% of samples of adults in the country during the period of economic uncertainty, the proportion who believed wages paid to workers in industry were too low is between 0.51 and 0.59. THE INTERPRETATION IS FLAWED. THE INTERPRETATION SUGGESTS THAT THIS INTERVAL SETS THE STANDARD FOR ALL THE OTHER INTERVALS, WHICH IS NOT TRUE.
Conduct a test at the α = 0.02 level of significance by determining (a) the null and alternative hypothesis, (b) the test statistic, and (c) the P-value. Assume the samples were obtained independently from a large population using simple random sampling. Test whether p1 > p2. The sample data are x1 = 117, n1 = 251, x2 = 137, and n2 = 303
(a) H0: p1 = p2 versus H1: p1 > p2 (b) USE TWO SAMPLE PROPORTION CALC z-score = 0.33 (c) USE TWO SAMPLE PROPORTION CALC P-value = .371 (d) Do not reject the null hypothesis because there is not sufficient evidence to conclude that p1 > p2.
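The two-sample proportion calculator uses a pooled estimate of p under H0. The following Python sketch reproduces the z-score and right-tailed P-value; it assumes the standard pooled two-proportion z-test, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 117, 251
x2, n2 = 137, 303

p1_hat, p2_hat = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)    # pooled proportion, valid under H0: p1 = p2

standard_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / standard_error

p_value = 1 - NormalDist().cdf(z)   # right-tailed because H1: p1 > p2
alpha = 0.02
reject_h0 = p_value < alpha

print(round(z, 2), round(p_value, 3), reject_h0)
```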
A researcher wishes to estimate the proportion of adults who have high-speed internet access. What size sample should be obtained if she wishes the estimate to be within 0.06 with 99% confidence if (a) she uses a previous estimate of 0.47? (b) she does not use any prior estimates?
(a) n = p̂(1 - p̂) * (z α/2 / E)^2, rounded up to the next whole number, where z α/2 is the critical z-score for the confidence level. α is 1 - confidence level (.01), so z α/2 = z 0.005 = 2.575 (from table). E is the margin of error (0.06). p̂ is the previous estimate (0.47). n = 0.47(0.53)(2.575 / 0.06)^2 = 458.8, so n = 459 (b) n = 0.25(z α/2 / E)^2 = 0.25(2.575 / 0.06)^2 = 460.5, so n = 461 TABLE: 90% 1.645 95% 1.96 99% 2.575
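The sample-size formula is easy to get wrong at the rounding step, since n must always be rounded up. A small Python sketch, using the table value z = 2.575 for 99% confidence; the function name is hypothetical.

```python
from math import ceil

def sample_size_for_proportion(margin, z, p_prior=0.5):
    """Smallest n with margin of error <= margin.

    p_prior = 0.5 gives the no-prior-estimate case, since
    p(1 - p) is largest (0.25) at p = 0.5.
    """
    return ceil(p_prior * (1 - p_prior) * (z / margin) ** 2)

z_99 = 2.575   # table value for 99% confidence
n_with_prior = sample_size_for_proportion(0.06, z_99, p_prior=0.47)   # part (a)
n_no_prior = sample_size_for_proportion(0.06, z_99)                   # part (b)

print(n_with_prior, n_no_prior)
```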
Determine the point estimate of the population mean and margin of error for the confidence interval. Lower bound is 18, upper bound is 28
(a) x̄ = (upper bound + lower bound) / 2 = (28 + 18) / 2 (23) (b) m.e. = (upper bound - lower bound) / 2 = (28 - 18) / 2 (5)
What is the probability of obtaining three heads in a row when flipping a coin? Interpret this probability.
.5 * .5 * .5 = .125 Interpretation: about 1250 of every 10,000 sets of three flips would result in three heads in a row.
A survey of 100 randomly selected high school students determined that 33 play organized sports. (a) What is the probability that a randomly selected high school students plays organized sports? (b) Interpret this probability.
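(a) 33 / 100 = 0.33 (b) Interpretation: about 330 of every 1000 randomly selected high school students would play organized sports.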
Find the probability P(E^c) if P(E)=0.19
0.81
Complete the sentence below. In a probability model, the sum of the probabilities of all outcomes must equal
1
Which of the following are properties of the linear correlation coefficient?
1) A linear correlation of -0.742 suggests a stronger negative association between two variables than a linear correlation of -0.472 2) The linear correlation coefficient is always between -1 and 1. 3) If r= -1, then a perfect negative linear relation exists between the two variables
Consider the data set given in the accompanying table. Construct a frequency marginal distribution:
1) Make a table with a marginal distribution column and a total row:
         x1     x2     x3     Marginal distribution
   y1    20     25     50
   y2    30     25     50
   Total
2) Find the totals of each row and column.
3) Finding the relative frequency marginal distribution: divide all column and row totals by the table's total (20 + 25 + 50 + 30 + 25 + 50 = 200):
         x1               x2               x3                Marginal distribution
   y1    20               25               50                95/200 = 0.475
   y2    30               25               50                105/200 = 0.525
   Total 50/200 = 0.25    50/200 = 0.25    100/200 = 0.50    200 (table total)
4) Construct a conditional distribution by x: divide all columns by column totals:
         x1           x2           x3
   y1    20/50        25/50        50/100
   y2    30/50        25/50        50/100
   Total 50/50 = 1    50/50 = 1    100/100 = 1
Which of the following are criteria for a binomial probability experiment?
1. The experiment consists of a fixed number, n, of trials. 2. The trials are independent 3. Each trial has two possible mutually exclusive outcomes: success and failure 4. The probability of success, p, remains constant for each trial of the experiment.
According to an almanac, 80% of adult smokers started smoking before turning 18 years old.
a) μ = np = 100(0.80) Answer: 80 b) S.D. = (np(1 - p))^1/2 = (100(0.80)(0.20))^1/2 Answer: 4 c) It is expected that in a random sample of 100 adult smokers, 80 will have started smoking before turning 18.
The graph to the right is the uniform probability density function for a friend who is x minutes late. (a) Find the probability that the friend is between 20 and 30 minutes late (b) It is 10 AM. There is a 30% probability the friend will arrive within how many minutes?
(a) The density is uniform from 0 to 30 minutes: 30 - 20 = 10 minutes, 10/30 = 1/3 = .333 (b) .30 = x min / 30, so x = .30 * 30 = 9 minutes
A woman has nine skirts and nine blouses. Assuming that they all match, how many different skirt-and-blouse combinations can she wear?
9 *9 = 81
Match each word or phrase with the definition.
A HYPOTHESIS is a statement regarding a characteristic of one or more populations The NULL HYPOTHESIS is a statement of no change, no effect, or no difference. The ALTERNATIVE HYPOTHESIS is a statement we are trying to find evidence to support.
Define each of the following terms. (a) Point estimate (b) Confidence Interval (c) Level of confidence (d) Margin of error
A range of numbers based on a point estimate of an unknown parameter- Confidence interval The expected proportion of intervals that contain the parameter if a large number of different samples is obtained - Level of confidence Determines the width of the confidence interval - Margin of error The value of a statistic that estimates the value of a parameter - Point estimate
A simple random sample of size n=66 is obtained from a population that is skewed left with μ = 52 and σ = 10. Does the population need to be normally distributed for the sampling distribution of x̄ to be approximately normally distributed? Why? What is the sampling distribution of x̄?
A) No. The central limit theorem states that regardless of the shape of the underlying population, the sampling distribution of x̄ becomes approximately normal as the sample size, n, increases. B) μx̄ = μ (use same mean), σx̄ = σ / √n = 10 / √66 ANSWER: The sampling distribution of x̄ is approximately normal with μx̄ = 52 and σx̄ = 1.231
The following table shows the distribution of murders by type of weapon for murder cases in a particular country over the past 12 years. Complete parts (a) through (e).
A) Yes; the rules required for a probability model are both met. B) P(rifle or shotgun) = 0.053 If 1000 murders were randomly selected, we would expect about 53 of them to have resulted from a rifle or shotgun C) P(handgun, rifle, or shotgun) = 0.53 If 1000 murders were randomly selected, we would expect about 530 of them to have resulted from a handgun, rifle, or shotgun. D) P(weapon other than a gun) = 0.327 If 1000 murders were randomly selected, we would expect about 327 of them to have resulted from a weapon other than a gun e) Are murders with a shot unusual? YES
A probability experiment is conducted in which the sample space of the experiment is S ={4,5,6,7,8,9,10,11,12,13,14,15}. Let event E = {5,6,7,8,9,10} and event F = {9,10,11,12}. List the outcomes in E and F. Are E and F mutually exclusive?
A. {9,10} B. No. E and F have outcomes in common.
Which histogram depicts a higher standard deviation?
Answer: Histogram B depicts the higher standard deviation, because the distribution has more dispersion. Tips: Graph B had a bigger range on the x axis, and was more bell shaped
State the requirements to perform a goodness-of-fit test
At least 80% of expected frequencies ≥ 5 All expected frequencies ≥ 1
What is a bar graph? What is a Pareto chart?
Bar graph: A bar graph is a horizontal or vertical representation of the frequency or relative frequency of the categories. The height of each rectangle represents the category's frequency or relative frequency. Pareto chart: A Pareto chart is a bar graph whose bars are drawn in decreasing order of frequency or relative frequency.
In a survey, 44% of the respondents stated that they talk to their pets on the telephone. A veterinarian believed this result to be too high, so he randomly selected 170 pet owners and discovered that 68 of them spoke to their pets on the telephone. Does the veterinarian have a right to be skeptical? Use the α = 0.05 level of significance.
(a) Because np0(1 - p0) = 41.9 > 10, the sample size is LESS THAN 5% of the population size, and the sample is given to be random, all of the requirements for testing the hypothesis ARE satisfied. (b) H0: p = .44 versus H1: p < .44 (c) USE STATCRUNCH Proportion summary calc. The answer is the Z-Stat and P-value. THE Z-STAT CAN BE NEGATIVE (d) The veterinarian does not have a right to be skeptical. There is not sufficient evidence to conclude that the true proportion of pet owners who talk to their pets on the telephone is less than 44%
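The card does not record the z-statistic or P-value, so here is a sketch of how they could be computed by hand with a one-proportion z-test. It assumes the standard test statistic z = (p̂ - p0) / √(p0(1 - p0)/n); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p_0 = 0.44            # claimed proportion under H0
x, n = 68, 170
p_hat = x / n

requirement = n * p_0 * (1 - p_0)   # must be at least 10 for the normal model

z = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)
p_value = NormalDist().cdf(z)       # left-tailed because H1: p < 0.44

alpha = 0.05
reject_h0 = p_value < alpha

print(round(requirement, 1), round(z, 2), round(p_value, 4), reject_h0)
```

A P-value above 0.05 is consistent with the card's conclusion in part (d).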
To determine customer opinion of their food quality, General Foods randomly selects 140 city blocks during a certain week and surveys all homes within the city blocks.
Cluster
To determine customer opinion of their service, Business Depot randomly selects 100 stores during a certain week and surveys all customers present in the store.
Cluster
Fill in the blank. The --------, R^2, measures the proportion of total variation in the response variable that is explained by the least squares regression line.
Coefficient of determination can be found by r^2 (correlation coefficient)
A police officer randomly selected 561 police records of larceny thefts. The accompanying data represent the number of offenses for various types of larceny thefts.
Constructing Probability Model: divide number by total b) Yes, because p(pocket-picking) < 0.05 c) No, because p(shoplifting larcenies) > 0.05
The contingency table below shows the results of a random sample of 200 registered voters that was conducted to see whether their opinions on a bill are related to their party affiliation.
Contingency Table WITH SUMMARY Select Columns (All except Party) Row Labels: Party Display: Expected Count 0.05 < P-value < 0.10
A magazine asks its readers to call in their opinion regarding the amount of advertising in the present issue.
Convenience
What does it mean if a statistic is resistant?
Definition: A statistic is resistant if it is not sensitive to extreme values Answer: Extreme values (very large or small) relative to the data do not affect its value substantially
Assume that the differences for the given data are normally distributed. Complete parts (a) through (d).
(a) Determine di = Xi - Yi for each pair of data. Subtract values by hand (b) Use StatCrunch Summary Stats. Type all found values into a column in StatCrunch. Find the mean and standard deviation (c) H0: μd = 0 versus H1: μd < 0 (d) Use the DATA T SAMPLE PAIRED CALC test statistic = -2.18 The P-value is BETWEEN 0.025 AND 0.05 REJECT the null hypothesis. There is SUFFICIENT evidence that μd < 0 at the α = 0.05 level of significance (e) Use the same calculator but switch to confidence L.B. = -3.39 U.B. = 0.19
Finding the linear correlation coefficient:
Find the mean of the x values (x̄), the mean of the y values (ȳ), the standard deviation of the x values (sx), and the standard deviation of the y values (sy). Find (xi - x̄)/sx for each x and (yi - ȳ)/sy for each y. Multiply each pair: (xi - x̄)/sx * (yi - ȳ)/sy. Add all the products. Divide by n - 1. If the absolute value of the correlation coefficient is not greater than the critical value, NO linear relationship exists
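The steps above translate directly into code. This is a minimal Python sketch; the data set is made up for illustration and does not come from the original card.

```python
from statistics import mean, stdev

def linear_correlation(xs, ys):
    """r = sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)     # sample standard deviations
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = linear_correlation(xs, ys)
print(round(r, 4))
```

The result is then compared in absolute value with the critical value for the given n.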
The following data represents the pH of rain for a random sample of 12 rain dates. A normal probability plot suggests the data could come from a population that is normally distributed. A boxplot indicates there are no outliers.
Find the sample mean: USE STAT CRUNCH SUMMARY STATS TO FIND MEAN b) USE STATCRUNCH ONE SAMPLE T CALC There is a 95% confidence that the population mean pH of rain water is between 4.73 and 5.24 c) USE STATCRUNCH ONE SAMPLE T CALC There is 99% confidence that the population mean pH of rain water is between 4.62 and 5.34 d) As the level of confidence increases, the width of the interval INCREASES. This makes sense since the MARGIN OF ERROR INCREASES AS WELL.
Describe the sampling distribution of p̂. Assume the size of the population is 20,000. n = 700, p = 0.7
For a simple random sample size n such that n≤ 0.05N, the shape of the sampling distribution of p̂ is approximately normal provided np(1 - p) ≥ 10. Step 1: Determine 5% of the population size .05(20000) = 1,000 n≤ 0.05N Where N is the size of the population Step 2: Put into np(1 - p) ≥ 10 equation (700)(.7) * (1-.7) = 147 The shape of the sampling distribution is normal because 147 ≥ 10 ANSWER: Approximately normal because n ≤ 0.05N and np(1 - p) ≥ 10. b) The mean of the sampling distribution is the same as the population proportion μ = p (.7) c) σ = [ (p(1-p)) / n ] ^1/2 (.017)
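The two checks and the two parameters can be computed in a few lines. A Python sketch of the steps above, with illustrative variable names:

```python
from math import sqrt

N = 20_000          # population size
n, p = 700, 0.7     # sample size and population proportion

small_sample = n <= 0.05 * N            # Step 1: n is at most 5% of N
normality_check = n * p * (1 - p)       # Step 2: must be at least 10
approximately_normal = small_sample and normality_check >= 10

mean_p_hat = p                          # part (b)
sd_p_hat = sqrt(p * (1 - p) / n)        # part (c)

print(approximately_normal, round(normality_check, 1), mean_p_hat, round(sd_p_hat, 3))
```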
Outside a home, there is a 10-key keypad numbered 0 through 9. The correct five-digit code will open the garage door. The numbers can be repeated in the code. a) How many codes are possible? b) What is the probability of entering the correct code on the first try, assuming that the owner doesn't remember the code?
Formula n^r 10^5 = 100,000 1/100,000
One graph in the figure represents a normal distribution with mean μ = 11 and standard deviation σ = 1. The other graph represents a normal distribution with mean μ = 15 and standard deviation σ = 1. Determine which graph is which and explain how you know.
Graph A has a mean of μ = 11 and graph B has a mean of μ = 15 because a larger mean shifts the graph to the right
Calcium is essential to tree growth. In 1990, the concentration of calcium in precipitation in Chautauqua, New York, was 0.11 milligrams per liter (mg/L). A random sample of 8 precipitation dates in 2018 results in the following data: 0.262 0.126 0.183 0.120 0.234 0.313 0.108 0.065 A normal probability plot suggests the data could come from a population that is normally distributed. A boxplot does not show any outliers. Does the sample evidence suggest that calcium concentrations have changed since 1990? Use the α = 0.01 level of significance.
a) H0: μ = 0.11 versus H1: μ ≠ 0.11 b) Use the StatCrunch T calculator with data: t0 = 2.17, P-value = 0.066. Since P-value > α, DO NOT REJECT the null hypothesis and conclude that there IS NOT sufficient evidence that the calcium level in rainwater has changed.
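As a cross-check on the calculator, the test statistic t0 = (x̄ - μ0) / (s / √n) can be computed by hand from the eight data values in the question. The sketch below uses only Python's standard library, so it stops at the test statistic (which should agree with the 2.17 above); the P-value still comes from StatCrunch or a t-table with n - 1 = 7 degrees of freedom.

```python
import math
import statistics

# Data from the question (mg/L); mu0 is the 1990 concentration.
calcium = [0.262, 0.126, 0.183, 0.120, 0.234, 0.313, 0.108, 0.065]
mu0 = 0.11

x_bar = statistics.mean(calcium)
s = statistics.stdev(calcium)                 # sample standard deviation (n - 1)
t0 = (x_bar - mu0) / (s / math.sqrt(len(calcium)))
print(round(x_bar, 4), round(s, 4), round(t0, 2))
```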
A researcher wanted to determine if carpeted or uncarpeted rooms contain more bacteria. The table shows the results for the number of bacteria per cubic foot for both types of rooms. A normal probability plot and boxplot indicate that the data are approximately normally distributed with no outliers. Do carpeted rooms have more bacteria than uncarpeted rooms at the α = 0.01 level of significance?
H0: μ(carpet) = μ(no carpet) versus H1: μ(carpet) > μ(no carpet). Put into T SUMMARY CALC: t0 = 2.24, 0.01 < P-value < 0.05. No, because the P-value is greater than the level of significance α = 0.01, so H0 is not rejected.
Assume that both populations are normally distributed. (a) Test whether μ1 ≠ μ2 at the α = 0.01 level of significance for the given sample data. (b) Construct a 99% confidence interval about μ1 - μ2.
a) H0: μ1 = μ2 versus H1: μ1 ≠ μ2. Put into T SUMMARY CALC (P-value = 0.325). Do not reject H0; there is not sufficient evidence to conclude that the two populations have different means. b) Put into T SUMMARY CALC with Confidence: L.B. -4.49, U.B. 2.09
Fill in the blank below. A researcher wants to show the mean from population 1 is less than the mean from population 2 in match-pairs data. If the observation from sample 1 is Xi and the observations from sample 2 are Yi, and di = Xi - Yi, then the null hypothesis is H0: μd = 0 and the alternative hypothesis is H1: μ d __ 0
H1: μ d < 0
Suppose a simple random sample of size n = 12 is obtained from a population with μ = 64 and σ = 15. (a) What must be true regarding the distribution of the population in order to use the normal model to compute probabilities regarding the sample mean? Assuming the normal model can be used, describe the sampling distribution x̄. (b) Assuming the normal model can be used, determine P( x̄ < 67.7 ). (c) Assuming the normal model can be used, determine P( x̄ ≥ 65.2).
IF n IS LESS THAN 30: (a) The population must be normally distributed. The sampling distribution of x̄ is normal, with μ- = 64 and σ- = 15 / (12)^1/2 ≈ 4.330 (b through c) Use the StatCrunch Normal Calculator (copy and paste the entire σ- value). IF n IS EQUAL TO OR GREATER THAN 30 (example from a version of the problem with μ = 66, σ = 18, n = 40): (a) Since the sample size is large enough, the population distribution does not need to be normal. The sampling distribution of x̄ is approximately normal, with μ- = 66 and σ- = 18 / (40)^1/2 (b through c) Use StatCrunch
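For the n = 12 version in the question, parts (b) and (c) can also be checked without StatCrunch. The sketch below assumes the population is normal, as part (a) requires, and uses `statistics.NormalDist` from Python's standard library in place of the Normal Calculator.

```python
import math
from statistics import NormalDist

mu, sigma, n = 64, 15, 12
x_bar_dist = NormalDist(mu, sigma / math.sqrt(n))   # sampling distribution of x-bar

p_less = x_bar_dist.cdf(67.7)          # (b) P(x-bar < 67.7)
p_at_least = 1 - x_bar_dist.cdf(65.2)  # (c) P(x-bar >= 65.2)
print(round(p_less, 4), round(p_at_least, 4))
```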
Suppose that E and F are two events and that P(E and F)=0.2 and P(E)=0.4. What is P(F|E)? P(F|E) is read as "the probability of event F given event E."
If E and F are any two events, then the probability of event F occurring given the occurrence of event E is found by: P(F|E) = P(E and F) / P(E) = N(E and F) / N(E). Here P(F|E) = 0.2 / 0.4 = 0.5
Explain the difference between a single-blind and a double-blind experiment.
In a single-blind experiment, the subject does not know which treatment is received. In a double-blind experiment, neither the subject nor the researcher in contact with the subject knows which treatment is received.
A sampling method is (Blank) when an individual selected for one sample does not dictate which individual is to be in the second sample.
Independent
Determine the level of measurement of the variable. Year of birth of college students
Interval
Class Width
The difference between consecutive lower class limits. For example, the classes 60-69 and 70-79 have a class width of 70 - 60 = 10.
Explain why it does not make sense to interpret the y-intercept. Choose the correct answer below.
It does not make sense to interpret the y-intercept because an x-value of 0 is outside the scope of the model.
State the conclusion based on the results of the test. According to the report, the mean monthly cell phone bill was $48.78 three years ago. A researcher suspects that the mean monthly cell phone bill is different today. The null hypothesis is rejected.
LOOK TO SEE IF THE NULL HYPOTHESIS IS REJECTED OR NOT, THEN CHECK WHETHER THE CLAIM IS DIFFERENT, LESS THAN, OR GREATER THAN. Here the claim is "different". If rejected: There is sufficient evidence to conclude that the mean monthly cell phone bill is different from its level three years ago of $48.78. If not rejected: There is not sufficient evidence to conclude that the mean monthly cell phone bill is different from its level three years ago of $48.78.
A researcher wishes to estimate the percentage of adults who support abolishing the penny. What size sample should be obtained if he wishes the estimate to be within 2 percentage points with 99% confidence if (a) he uses a previous estimate of 38%? (b) he does not use any prior estimates?
MAKE SURE TO ROUND UP EVERY SINGLE TIME!!!! (a) n = p̂(1 - p̂) * (z α/2 / E)^2 = 0.38(0.62)(2.575 / 0.02)^2, which rounds up to 3906 (b) n = 0.25 * (z α/2 / E)^2 = 0.25(2.575 / 0.02)^2, which rounds up to 4145. TABLE of z α/2: 90% 1.645, 95% 1.96, 99% 2.575
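The two formulas and the "always round up" rule can be sketched as a small Python helper. The function name `sample_size_for_proportion` is made up for this example; the z value 2.575 is taken from the table on the card.

```python
import math

def sample_size_for_proportion(z, margin, p_hat=None):
    """Sample size for estimating a proportion; uses 0.25 when there is no prior estimate."""
    spread = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    return math.ceil(spread * (z / margin) ** 2)   # ceil = always round up

print(sample_size_for_proportion(2.575, 0.02, p_hat=0.38))  # 3906
print(sample_size_for_proportion(2.575, 0.02))              # 4145
```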
Recently, a random sample of 25-34 year olds was asked, "How much do you currently have in the savings, not including retirement savings?" The data in the table represent the responses to the survey. Approximate the mean and standard deviation amount of savings.
Make a table: Class # | Class avg (midpoint) xi | Freq fi | xi*fi | xi - x̄ | (xi - x̄)^2 * fi 1) Find all class averages (midpoints) 2) Find xi*fi (multiply class avg by class freq) 3) Find the xi*fi column total 4) Find the freq total by adding the fi column 5) Divide the xi*fi total by the freq total = SAMPLE MEAN 6) Find xi - x̄, where x̄ is the sample mean 7) Find (xi - x̄)^2 * fi 8) Find the total of the (xi - x̄)^2 * fi column 9) Divide that total by (freq total - 1) = SAMPLE VARIANCE 10) Square root of the sample variance = STANDARD DEVIATION
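The ten steps above can be sketched in Python. The survey table is not reproduced on this card, so the classes and frequencies below are hypothetical values chosen only to show the mechanics.

```python
import math

# Hypothetical (lower limit, next lower limit, frequency) rows, for illustration only.
classes = [(0, 200, 5), (200, 400, 3), (400, 600, 2)]

midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]          # step 1
freqs = [f for _, _, f in classes]
n = sum(freqs)                                                # step 4
mean = sum(x * f for x, f in zip(midpoints, freqs)) / n       # steps 2, 3, 5
variance = sum((x - mean) ** 2 * f
               for x, f in zip(midpoints, freqs)) / (n - 1)   # steps 6 to 9
std_dev = math.sqrt(variance)                                 # step 10
print(mean, round(variance, 2), round(std_dev, 2))
```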
Find the population mean or sample mean as indicated. Sample: 18, 17, 7, 8, 15
Mean = average μ = population mean x̄ = sample mean Answer: x̄ = 13
The graph of a normal curve is given on the right. Use the graph to identify the values of μ and σ.
The mean μ is the value under the peak (11). The S.D. σ is the distance from the mean to the closest labeled value, an inflection point at μ ± σ (4)
Determine the expected count for each outcome.
Multiply n by each probability (as a decimal), for example 888 * 0.12 = 106.56
If the linear correlation between two variables is negative, what can be said about the slope of the regression line?
Negative
Determine whether the following graph can represent a normal curve
No, because the graph is not symmetric about its mean
Is the following a probability model? What do we call the outcome "brown"?
No, because the probabilities do not sum to 1. The outcome "brown" is called an impossible event.
A simple random sample of size n = 79 is obtained from a population that is skewed left with μ = 84 and σ = 4. Does the population need to be normally distributed for the sampling distribution of x̄ to be approximately normally distributed? Why? What is the sampling distribution of x̄?
No. The central limit theorem states that regardless of the shape of the underlying population, the sampling distribution of x̄ becomes approximately normal as the sample size, n, increases. The sampling distribution is approximately normal with μ- = μ = 84 and σ- = σ / (n)^1/2 = 4 / (79)^1/2 ≈ 0.450
Determine the level of measurement of the variable below. Dress color
Nominal
Many track hurdlers believe that they have a better chance of winning if they start in the inside lane that is closest to the field. For the data below, the lane closest to the field is Lane 1, the next is Lane 2, and so on until the outermost lane, Lane 6. The data lists the number of wins for track hurdlers in the different starting positions. Find the P-value to test the claim that the probabilities of winning are the same in the different positions. Use α = 0.05. The results are based on 240 wins.
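Use the ONE SAMPLE VARIANCE CALC and highlight the Number of Wins column. Expected count if every lane is equally likely: 240/6 = 40 wins per lane. Enter H0: σ^2 = 40 versus H1: σ^2 ≠ 40. This is a calculator shortcut for the chi-square goodness-of-fit test of H0: all six lanes have the same probability of winning; because every expected count equals the sample mean of 40, the reported statistic (n - 1)s^2 / 40 equals Σ(O - E)^2 / E with 5 degrees of freedom. P-value > 0.10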
Assume the random variable X is normally distributed with mean μ = 50 and standard deviation σ = 7. Compute the probability. Be sure to draw a normal curve with the area corresponding to the probability shaded P(x > 42)
a) Only one boundary value (42) means the shaded area is a single solid region (no stripes), and "greater than" means shade to the right of 42. b) Use the StatCrunch Normal Calculator with mean 50, S.D. 7, and P(x > 42)
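The Normal Calculator step can be reproduced with Python's standard library. This is a sketch of the same right-tail area, P(x > 42) = 1 - P(x ≤ 42), not a replacement for drawing the curve.

```python
from statistics import NormalDist

x = NormalDist(mu=50, sigma=7)
p_greater = 1 - x.cdf(42)      # area to the right of 42
print(round(p_greater, 4))
```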
A company wants to determine if its employees have any preference among 5 different health plans that it offers to them. A sample of 200 employees provided the data below. Calculate the chi-square test statistic χ² to test the claim that the probabilities show no preference. Use α = 0.01. Round to two decimal places.
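Open the ONE SAMPLE VARIANCE calculator and highlight only the employees column. Expected count under no preference: 200/5 = 40 for each plan. Enter H0: σ^2 = 40 versus H1: σ^2 ≠ 40 (a shortcut for the goodness-of-fit test: with every expected count equal to the mean of 40, the reported statistic equals Σ(O - E)^2 / E). Chi-Square Stat: 37.45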
Determine the level of measurement of the variable. Birth order among siblings in a family
Ordinal
About 15% of the population of a large country is allergic to pollen. If two people are randomly selected, what is the probability both are allergic to pollen? What is the probability at least one is allergic to pollen?
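P(both allergic) = 0.15 * 0.15 = 0.0225. P(at least one allergic): 1) P(only 1 or both) = P(only 1) + P(both), which is easiest to find with the complement, 1 - P(neither) 2) P(not allergic) = 1 - 0.15 = 0.85 3) P(neither allergic) = 0.85 * 0.85 = 0.7225 4) 1 - P(neither allergic) = 0.2775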
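A short Python sketch of the multiplication and complement rules, using the 15% stated in the question and assuming the two selections are independent (reasonable for a large population):

```python
p_allergic = 0.15   # from the question

p_both = p_allergic * p_allergic            # multiplication rule for independent events
p_neither = (1 - p_allergic) ** 2
p_at_least_one = 1 - p_neither              # complement rule
print(round(p_both, 4), round(p_at_least_one, 4))
```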
Determine the required value of the missing probability to make the distribution a discrete probability distribution.
P(4) = .22
Suppose that events E and F are independent, P(E) = 0.3, and P(F) = 0.7. What is the p(E and F)?
P(E and F) = P(E) * P(F) 0.21
According to a research agency, in 23% of marriages the woman has a bachelor's degree and the marriage lasts at least 20 years. According to a census report, 45% of women have a bachelor's degree. What is the probability a randomly selected marriage will last at least 20 years if the woman has a bachelor's degree? Note: 47% of all marriages last at least 20 years.
Use P(F|E). E is the event which is known to have occurred: the question asks for the probability that the marriage lasts at least 20 years if the woman has a bachelor's degree, so it is known that the woman in the selected marriage has a bachelor's degree. F is the event for which the probability is sought: the selected marriage lasts at least 20 years. P(F|E) = P(E and F) / P(E) = P(woman has a bachelor's degree and marriage lasts at least 20 years) / P(woman has a bachelor's degree) = 0.23 / 0.45. ANSWER = 0.511
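The same conditional probability rule as a tiny Python sketch; the helper name `conditional` is made up for this example. Note that the 47% in the question is not needed.

```python
def conditional(p_e_and_f, p_e):
    """P(F | E) = P(E and F) / P(E)."""
    return p_e_and_f / p_e

print(round(conditional(0.23, 0.45), 3))  # 0.511
```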
Suppose that E and F are two events and that P(E and F) = 0.15 and P(E)=0.8. What is P(F|E)?
P(F|E) = P(E and F) / P(E) = 0.15 / 0.8 = 0.1875
Ramp metering is a traffic engineering idea that requires cars entering a freeway to stop for a certain period of time before joining the traffic flow. The theory is that ramp metering controls the number of cars on the freeway and the number of cars accessing the freeway, resulting in a freer flow of cars, which ultimately results in faster travel times. To test whether ramp metering is effective in reducing travel times, engineers conducted an experiment in which a section of freeway had ramp meters installed on the on-ramps. The response variable for the study was the speed of vehicles. A random sample of 15 cars on the highway for a Monday at 6 p.m. with the ramp meters on and a second random sample of 15 cars on a different Monday at 6 p.m. with the meters off resulted in the following speeds (in miles per hour).
Put into T SUMMARY CALC and click "include boxplots". Yes, the Meters On data appear to have higher speeds. No, there do not appear to be any outliers. H0: μ(on) = μ(off) versus H1: μ(on) > μ(off). P-value: 0.073. Do not reject H0. There is not sufficient evidence at the α = 0.05 level of significance that the ramp meters are effective in maintaining higher speed on the freeway.
Two researchers conducted a study in which two groups of students were asked to answer 42 trivia questions from a board game. The students in group 1 were asked to spend 5 minutes thinking about what it would mean to be a professor, while the students in group 2 were asked to think about soccer hooligans. These pretest thoughts are a form of priming. The 200 students in group 1 had a mean score of 23.5 with the standard deviation of 4.4, while the 200 students in group 2 had a mean score of 19.4 with a standard deviation of 3.5. Complete parts (a) and (b) below.
Put into T SUMMARY CALC using confidence: L.B. 3.318, U.B. 4.882. The researchers are 95% confident that the difference of the means is in the interval (3.318, 4.882). Since the 95% confidence interval does not contain zero, the results suggest that priming does have an effect on scores.
Determine the level of measurement of the variable below. Height of a child: 28 in, 29 in, 30 in, 31 in, and 32 in
Ratio
Determine the level of measurement of the variable below. Volume of water used by a household in a day
Ratio
Complete the sentence below. The ______ ________, denoted p̂, is given by the formula p̂ = _______, where x is the number of individuals with a specified characteristic in a sample of n individuals.
Sample Proportion x/n
Find the sample variance and standard deviation. 20, 13, 3, 7, 8
Sample variance (s^2): a measure of dispersion of the observations around their sample mean. Sample standard deviation (s): the square root of the sample variance, in the same units as the data. Both divide by n - 1 because the data are a SAMPLE (a population variance divides by N). Steps for sample variance: 1) Find the sample mean: x̄ = 51/5 = 10.2 2) Add the squared deviations: (1st data point - x̄)^2 + (2nd data point - x̄)^2 + ... = 170.8 3) Divide by the number of data points minus 1 (n - 1): s^2 = 170.8 / 4 = 42.7 Equation: s^2 = [ (x1 - x̄)^2 + (x2 - x̄)^2 + ... ] / (n - 1) Sample standard deviation: s = square root of s^2 ≈ 6.53
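The steps can be checked in Python. The sketch below computes the variance step by step and then compares it with `statistics.variance`, which also divides by n - 1.

```python
import math
import statistics

data = [20, 13, 3, 7, 8]

x_bar = sum(data) / len(data)                       # step 1
squared_devs = sum((x - x_bar) ** 2 for x in data)  # step 2
s2 = squared_devs / (len(data) - 1)                 # step 3
s = math.sqrt(s2)
print(x_bar, round(s2, 2), round(s, 2))
print(round(statistics.variance(data), 2))          # library cross-check
```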
Toshiba wants to administer a satisfaction survey to its current customers. Using their customer database, the company randomly selects 70 customers and asks them about their level of satisfaction with the company.
Simple random
Put the following in order from narrowest to widest interval. Assume the sample size and sample proportion are the same for all four confidence intervals. (a) 84% confidence interval (b) 97% confidence interval (c) 92% confidence interval (d) 96% confidence interval
Smallest level of confidence = narrowest interval ANSWER: (a) (c) (d) (b)
Determine whether the underlined numerical value is a parameter or a statistic. Explain your reasoning. A survey of 42 out of hundreds in a dining hall showed that 40.5% enjoyed their meal.
Statistic, because the data set of 42 people in a dining hall is a sample.
A simple random sample of size n is drawn from a population that is normally distributed. The sample mean, x̄, is found to be 114, and the sample standard deviation, s, is found to be 10. (a) Construct a 95% confidence interval about μ if the sample size, n, is 15 (b) Construct a 95% confidence interval about μ if the sample size, n, is 26 (c) Construct a 99% confidence interval about μ if the sample size, n, is 15 (d) Could we have computed the confidence intervals in part (a)-(c) if the population had not been normally distributed?
Step 1: Find the critical value tα/2 in the t-table (not the z-table). The numbers down the side are the degrees of freedom, df = n - 1. The numbers across the top are the areas in the right tail; use α/2 = (1 - confidence level) / 2, for example 0.05 / 2 = 0.025 for 95% confidence. Step 2: Plug into the equations: x̄ - (tα/2) * s / (n)^1/2, where s is the sample S.D. (LOWER BOUNDARY) and x̄ + (tα/2) * s / (n)^1/2 (UPPER BOUNDARY). ANSWERS: As the sample size decreases, the margin of error increases. As the level of confidence decreases, the size of the interval decreases. (d) No, the population needs to be normally distributed because the sample sizes are small.
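The boundary formulas can be sketched in Python. The critical values 2.145, 2.060, and 2.977 below are assumptions read from a standard t-table (df = 14 or 25, right-tail area 0.025 or 0.005); they are not computed by the code, and the helper name `t_interval` is made up for this example.

```python
import math

def t_interval(x_bar, s, n, t_crit):
    """Confidence interval for a mean: x_bar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# t_crit values taken from a t-table (assumed, not computed here).
ci_a = t_interval(114, 10, 15, 2.145)   # (a) 95%, df = 14
ci_b = t_interval(114, 10, 26, 2.060)   # (b) 95%, df = 25
ci_c = t_interval(114, 10, 15, 2.977)   # (c) 99%, df = 14
for lo, hi in (ci_a, ci_b, ci_c):
    print(round(lo, 2), round(hi, 2))
```

Comparing the printed widths shows the two patterns in the answers: the larger sample gives a narrower interval, and the higher confidence level gives a wider one.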
The following data represents the number of games played in each series of an annual tournament from 1923 to 2019.
a) Steps: 1) Add up all frequency values 2) Divide each frequency by the total b) Pick the graph that represents the data c) Use StatCrunch: mean = 5.8 d) The series, if played many times, would be expected to last about 5.8 games on average. e) S.D. = 1.1
To determine her air quality, Carrie divides up her day into three parts: morning, afternoon, and evening. She then measures her air quality at 4 randomly selected times during each part of the day.
Stratified
Shapes of Distributions
Symmetric and bell-shaped: pyramid shape with the tall point in the center. Uniform (symmetric): all bars about the same height. Skewed right: the tail extends to the right of the tall point. Skewed left: the tail extends to the left of the tall point.
To estimate the percentage of defects in a recent manufacturing batch, a quality control manager at General Foods selects every 20th soup can that comes off the assembly line starting with the third until she obtains a sample of 80 soup cans.
Systematic
The accompanying data represents the miles per gallon of a random sample of cars with a three-cylinder, 1.0 liter engine. (a) Compute the z-score corresponding to the individual who obtained 35.3 miles per gallon. Interpret this result. (b) Determine the quartiles (c) Compute and interpret the interquartile range, IQR. (d) Determine the lower and upper fences. Are there any outliers?
TIPS: Use the correct standard deviation for the z-score calculation. Finding quartiles (sort the data first): Q2: there are 24 data points and 24/2 = 12, so Q2 = (12th + 13th) / 2. Q1: 24/4 = 6, so Q1 = (6th + 7th) / 2. Q3: the average of the 6th and 7th values of the top 50% of the data, that is (18th + 19th) / 2
A manufacturer of colored candies states that 13% of the candies in a bag should be brown, 14% yellow, 13% red, 24% blue, 20% orange, and 16% green. A student randomly selected a bag of colored candies. He counted the number of candies of each color and obtained the results shown in the table. Test whether the bag of colored candies follows the distribution stated above at α = 0.05 level of significance. Using the level of significance α = 0.05, test whether the color distribution is the same.
a) H0: The distribution of colors is the same as stated by the manufacturer. H1: The distribution of colors is not the same as stated by the manufacturer. b) Expected counts: add up all observed frequencies, then multiply the total by each claimed proportion. c) Use the Chi-Square goodness-of-fit Calc with Observed: Freq and Expected: Claimed Prop. Chi-Square = 14.117 d) The P-value is found next to Chi-Square in the same Calc. e) The P-value is less than α, so reject H0. There is sufficient evidence that the distribution of colors is not the same as stated by the manufacturer.
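The expected counts and the goodness-of-fit statistic Σ(O - E)^2 / E can be sketched in Python. The claimed proportions come from the question, but the student's table is not reproduced on this card, so the observed counts below are hypothetical and will not give the 14.117 reported above.

```python
# Claimed proportions from the question; observed counts are HYPOTHETICAL.
claimed = {"brown": 0.13, "yellow": 0.14, "red": 0.13,
           "blue": 0.24, "orange": 0.20, "green": 0.16}
observed = {"brown": 15, "yellow": 12, "red": 11,
            "blue": 26, "orange": 22, "green": 14}

total = sum(observed.values())
expected = {color: total * prop for color, prop in claimed.items()}
chi_square = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in claimed)
degrees_of_freedom = len(claimed) - 1
print(expected)
print(round(chi_square, 3), degrees_of_freedom)
```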
The manufacturer of hardness testing equipment uses steel-ball indenters to penetrate metal that is being tested. However, the manufacturer thinks it would be better to use a diamond indenter so that all types of metal can be tested. Because of differences between the two types of indenters, it is suspected that the two methods will produce different hardness readings. The metal specimens to be tested are large enough so that two indentions can be made. Therefore, the manufacturers uses both indenters on each specimen and compares the hardness readings. Construct a 95% confidence interval to judge whether the two indenters result in different measurements. Note: A normal probability plot and boxplot of the data indicate that the differences are approximately normally distributed with no outliers.
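a) Use PAIRED T CALC with confidence. MAKE SURE THAT THE COLUMNS ARE CORRECTLY LABELED: DIAMOND MINUS STEEL BALL. L.B. 0.2, U.B. 2.5 b) If the interval from L.B. to U.B. includes 0, there is insufficient evidence that the two indenters give different readings. If it does not include 0, there is sufficient evidence. Here the interval (0.2, 2.5) does not include 0, so there is sufficient evidence.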
A license plate is to consist of 4 digits followed by 4 uppercase letters. Determine the number of different license plates possible if the first and second digits must be odd, and repetition is not permitted.
The Multiplication Rule of Counting states that if a task consists of a sequence of choices in which there are p selections for the first choice, q selections for the second choice, r selections for the third choice, and so on, then the task of making these selections can be done in p * q * r ... different ways. Digits 0-9: 10 options. Letters: 26 options. Odd digits 1, 3, 5, 7, 9: 5 options. 5 * 4 (odd digits - 1) * 8 (digits - 2) * 7 (digits - 3) * 26 (full alphabet) * 25 (alpha - 1) * 24 (alpha - 2) * 23 (alpha - 3) = 401,856,000
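The product can be checked in Python. This sketch uses `math.perm` (Python 3.8+) for the letters, since choosing 4 different letters in order is the permutation 26 * 25 * 24 * 23.

```python
import math

odd_digit_ways = 5 * 4               # first two digits: odd, no repetition
other_digit_ways = 8 * 7             # third and fourth digits from the 8 digits left
letter_ways = math.perm(26, 4)       # 26 * 25 * 24 * 23

total_plates = odd_digit_ways * other_digit_ways * letter_ways
print(total_plates)  # 401856000
```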
Which of the following are properties of the normal curve?
The area under the normal curve to the right of the mean is 0.5 The high point is located at the value of the mean The graph of a normal curve is symmetric
Consider the following question from a recent poll. Thinking about how the gun control issue might affect your vote for major offices, would you vote only for a candidate who shares your views on gun control or consider a candidate's position on gun control as just one of many important factors? [rotated] Why is it important to rotate the two choices presented in the question?
The choices need to be rotated to minimize response bias
The following data represents the time between eruptions and the length of the eruption for 8 randomly selected geyser eruptions. The coefficient of determination is 91.4%. Provide an interpretation of this value
The least-squares regression line explains 91.4% of the variation in length of eruption.
Determine whether the scatter diagram indicates that a linear relation may exist between two variables. If the relation is linear, determine whether it indicates a positive or negative association between the variables. Use this information to answer the following.
The data points do not have a linear relationship because they do not lie mainly in a straight line. The relationship is not linear
After giving a statistics exam, Professor Dang determined the following five-number summary for her class results. 55, 63, 73, 86, 95 Use this information to draw a boxplot of the exam scores
The five-number summary of a set of data consists of: the smallest data value, the first quartile (Q1), the median, the third quartile (Q3), and the largest data value. Here: 55, 63, 73, 86, 95. Drawing a boxplot, steps: 1) Draw a vertical line at Q1 (63) 2) Draw a vertical line at the median (73) 3) Draw a vertical line at Q3 (86) 4) Enclose the three lines in a box 5) Draw a horizontal line from the box out to the smallest (55) and largest (95) values
A study was conducted that resulted in the following relative frequency histogram. Determine whether or not the histogram indicates that a normal distribution could be used as a model for the variable.
The histogram is not bell-shaped, so a normal distribution could not be used as a model for the variable
Violent crimes include rape, robbery, assault, and homicide. The following is a summary of the violent-crime rate (violent crimes per 100,000 population) for all states of a country in a certain year. Q1 =273.8, Q2 =387.4, Q3 =529.7
The kth percentile of a set of data is a value such that k percent of the observations are LESS than or equal to the value. Quartiles divide the data into fourths: Q1 = bottom 25% (25th percentile), Q2 = bottom 50%, Q3 = bottom 75%. Determine and interpret the interquartile range: the IQR is the range of the middle 50% of the data, IQR = Q3 - Q1. Check for outliers using quartiles: first determine the first and third quartiles of the data, then compute the interquartile range and determine the fences: Lower fence = Q1 - 1.5(IQR), Upper fence = Q3 + 1.5(IQR). Fences serve as cutoff points for determining outliers. If a data value is less than the lower fence or greater than the upper fence, it is considered an outlier. Determining the shape of the distribution using quartiles: compare Q2 - Q1 and Q3 - Q2. If the differences are about equal, the distribution is symmetric. If Q2 - Q1 is larger, then it is skewed left; if Q3 - Q2 is larger, then it is skewed right.
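Using the quartiles given in the question, the IQR and fences can be sketched in Python. The helper `is_outlier` and the rates passed to it are made up for this example.

```python
q1, q2, q3 = 273.8, 387.4, 529.7   # quartiles from the question

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

def is_outlier(value):
    """A value is an outlier if it falls outside the fences."""
    return value < lower_fence or value > upper_fence

print(round(iqr, 2), round(lower_fence, 2), round(upper_fence, 2))
print(is_outlier(1000.0), is_outlier(500.0))   # hypothetical crime rates
```

Here Q3 - Q2 = 142.3 is larger than Q2 - Q1 = 113.6, which by the rule above points to a right-skewed distribution.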
Explain the meaning of the accompanying percentiles. (a) The 5th percentile of the head circumference of males 3 to 5 months of age in a certain city is 41.0 cm (b) The 90th percentile of the waist circumference of females 2 years of age in a certain city is 49.8 cm (c) Anthropometry involves the measurement of the human body. One goal of these measurements is to assess how body measurements may be changing over time. The following table represents the standing height of males aged 20 years or older for various age groups in a certain city in 2015. Based on the percentile measurements of the different age groups, what might you conclude?
The kth percentile of a set of data is a value such that k percent of the observations are less than or equal to the value. (a) 5% of 3- to 5-month-old males have a head circumference that is 41.0 cm or LESS (b) 90% of 2-year-old females have a waist circumference that is 49.8 cm or LESS (c) Decrease -> taller
Suppose a life insurance company sells a $170,000 1-year term life insurance policy to a 20-year-old female for $300, According to the National Vital Statistics Report, 58(21), the probability that the female survives the year is 0.999544. Compute and interpret the expected value of this policy to the insurance company.
The mean of a random variable represents what is expected to happen in the long run, and so the mean of a random variable is also called the expected value. There are two possible outcomes to the experiment: survival or death. Let the random variable X represent the company's gain or loss on the policy, depending on survival or death of the female. X is a discrete random variable with two values: x = $300 with P(survives) = 0.999544, and x = $300 - $170,000 = -$169,700 with P(dies) = 1 - 0.999544 = 0.000456. Mean = 300(0.999544) + (-169,700)(0.000456) = 222.48 b) The insurance company expects to make a profit of $222.48 on every 20-year-old female it insures for 1 year.
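The expected value is the sum of each outcome times its probability. A minimal Python sketch using the premium, payout, and survival probability from the question:

```python
premium = 300
payout = 170_000
p_survive = 0.999544
p_die = 1 - p_survive

# (company's gain or loss, probability) for each outcome
outcomes = [(premium, p_survive), (premium - payout, p_die)]
expected_value = sum(x * p for x, p in outcomes)
print(round(expected_value, 2))
```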
Determine the point estimate of the population proportion, the margin of error for the following confidence interval, and the number of individuals in the sample with the specified characteristic, x, for the sample size provided. Lower bound = 0.195, upper bound 0.585, n = 1000
p̂ = point estimate. u = p̂ + E and b = p̂ - E, where u is the upper bound, b is the lower bound, and E is the margin of error. Find p̂: p̂ = (u + b) / 2 = (0.585 + 0.195) / 2 = 0.39 Find E: E = (u - b) / 2 = (0.585 - 0.195) / 2 = 0.195 Find x (number of individuals in the sample with the specified characteristic): x = np̂ = 1000(0.39) = 390 ANSWERS: The point estimate of the population proportion is 0.39. The margin of error is 0.195. The number of individuals is 390.
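The three formulas can be sketched as one Python helper, applied to the bounds and sample size stated in the question. The function name `unpack_interval` is made up for this example.

```python
def unpack_interval(lower, upper, n):
    """Recover the point estimate, margin of error, and count x from a CI for a proportion."""
    p_hat = (upper + lower) / 2
    margin = (upper - lower) / 2
    x = round(n * p_hat)
    return p_hat, margin, x

p_hat, margin, x = unpack_interval(0.195, 0.585, 1000)
print(round(p_hat, 3), round(margin, 3), x)
```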
In a certain card game, the probability that a player is dealt a particular hand is 0.39. Explain what this probability means. If you play this card game 100 times, will you be dealt this hand exactly 39 times? Why or why not?
The probability 0.39 means that approximately 39 out of every 100 dealt hands will be that particular hand. No, you will not be dealt this hand exactly 39 times since the probability refers to what is expected in the long-term, not short-term.
Exit polling is a popular technique used to determine the outcome of an election prior to results being tallied. Suppose a referendum to increase funding for education is on the ballot in a large town (voting population over 100,000). An exit poll of 200 voters finds that 98 voted for the referendum. How likely are the results of your sample if the population proportion of voters in the town in favor of the referendum is 0.52? Based on your result, comment on the dangers of using exit polling to call elections.
The probability that 98 or fewer of the 200 sampled voters voted for the referendum is about 0.198. Step 1: Find the sample proportion: p̂ = x/n = 98/200 = 0.49 Step 2: Check that the sampling distribution is approximately normal: np(1 - p) = 200(0.52)(0.48) = 49.92 ≥ 10 (it is normal) Step 3: Find the mean and S.D.: μ = p = 0.52 and σ = [ p(1 - p) / n ]^1/2, where p = 0.52 and n = 200, so σ = 0.03532704347 Step 4: Put into the StatCrunch Normal Calculator with 0.49 in the less-than box b) The result is not unusual because the probability that p̂ is equal to or more extreme than the sample proportion is greater than 5%. Thus, it is not unusual for a wrong call to be made in an election if exit polling alone is considered.
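The steps can be reproduced with Python's standard library in place of the Normal Calculator. This is a sketch under the same normal approximation described above.

```python
import math
from statistics import NormalDist

p, n, x = 0.52, 200, 98

p_hat = x / n                              # sample proportion
normal_ok = n * p * (1 - p) >= 10          # normality check
sigma = math.sqrt(p * (1 - p) / n)         # S.D. of p-hat
prob = NormalDist(p, sigma).cdf(p_hat)     # P(p-hat <= 0.49)
print(p_hat, normal_ok, round(sigma, 6), round(prob, 3))
```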
In a relative frequency distribution, what should the relative frequencies add up to?
The relative frequencies add up to 1.
A researcher with the Department of Education followed a cohort of students who graduated from high school in a certain year, monitoring the progress the students made toward completing a bachelor's degree. One aspect of his research was to determine whether students who first attended community college took longer to attain a bachelor's degree than those who immediately attended and remained at a 4-year institution. The data in the table attached below summarizes the results of his study. Complete parts a) through e) below.
a) The response variable is the time to graduate. The explanatory variable is the use of community college or not. The samples can be reasonably assumed to be random. The sample sizes are large (both greater than or equal to 30). The sample sizes are not more than 5% of the population. The samples are independent. H0: μ(community college) = μ(no transfer), H1: μ(community college) > μ(no transfer) b) Put into T SUMMARY CALC: T = 13.33, P-value < 0.01. REJECT the null hypothesis because the P-value is LESS THAN the level of significance. There IS sufficient evidence that community college transfer students take longer to attain a bachelor's degree. d) Put into T SUMMARY CALC using confidence: L.B. 0.887, U.B. 1.193 NO
Determine whether the study depicts an observational study or an experiment. Fifty children are divided into two groups. One group is exposed to a video on bullying. The other is not. After one week, both groups are questioned about their attitudes about violence.
The study is an experiment because the researchers control one variable to determine the effect on the response variable.
Determine whether the study depicts an observational study or an experiment A study is conducted to determine if there is a relationship between Parkinson's disease and childhood head trauma. Doctors look at the hospital records for patients with Parkinson's disease for any childhood head trauma
The study is an observational study because the study examines individuals in a sample, but does not try to influence the response variable
Determine whether the study depicts an observational study or an experiment. A study is conducted to determine if there is a relationship between heart arrhythmias and caffeine consumption. A sample of 100 people with a heart arrhythmia are asked about their caffeine consumption.
The study is an observational study because the study examines individuals in a sample, but does not try to influence the response variable.
Determine whether the underlined value is a parameter or a statistic. The average age of men who had walked on the moon was 39 years, 11 months, 15 days.
The value is a parameter because the men who had walked on the moon are a population
Determine whether the underlined value is a parameter or a statistic. In a national survey on substance abuse, 66.4% of respondents who were full-time college students aged 18 to 22 reported using alcohol within the past month.
The value is a statistic because the respondents who were full-time college students aged 18 to 22 are a sample
Determine whether the quantitative variable is discrete or continuous. Weight of a child
The variable is continuous because it is not countable
Determine whether the quantitative variable is discrete or continuous. Number of cars owned
The variable is discrete because it is countable
Determine whether the quantitative variable is discrete or continuous. Number of field goals attempted by a kicker
The variable is discrete because it is countable
Determine whether the quantitative variable is discrete or continuous. Number of words in a poem
The variable is discrete because it is countable
Determine whether the variable is qualitative or quantitative. Favorite basketball player
The variable is qualitative because it is an attribute characteristic
Determine whether the variable is qualitative or quantitative. Address
The variable is qualitative because it is an attribute characteristic.
Determine whether the variable is qualitative or quantitative. Distance in miles to nearest outdoor stadium
The variable is quantitative because it is a numerical measure
Determine whether the variable is qualitative or quantitative. Price of a new computer
The variable is quantitative because it is a numerical measure
Ways that Graphs can be misleading
The vertical axis starts at a value other than 0, which makes things appear to change faster than they actually did. Class widths/intervals are different. Not taking into account the size or population of different regions/groups. Faulty representations in pictures or illustrations. Pie graphs drawn in 3D.
A national survey of 2500 adult citizens of a nation found that 18% dreaded Valentine's Day. The margin of error for the survey was 1.1 percentage points with 85% confidence. Explain what this means.
There is 85% confidence that the proportion of the adult citizens of the nation that dreaded Valentine's Day is between 0.169 and 0.191. AND There is 85% confidence that the proportion of the adult citizens of the nation that did not dread Valentine's Day is between 0.809 and 0.831.
A national survey of 2000 adult citizens of a nation found that 25% dreaded Valentine's Day. The margin of error for the survey was 1.6 percentage points with 90% confidence. Explain what this means.
There is 90% confidence that the proportion of adult citizens of the nation that dreaded Valentine's Day is between 0.234 and 0.266 AND There is 90% confidence that the proportion of the adult citizens of the nation that did not dread Valentine's Day is between 0.734 and 0.766.
Suppose the null hypothesis is rejected. State the conclusion based on the results of the test. Three years ago, the mean price of a single-family home was $243737. A real estate broker believes that the mean price has increased since then.
There is sufficient evidence to conclude that the mean price of a single-family home has increased.
Refer to the table linked below. Is constructing a conditional distribution by level of education different from constructing a conditional distribution by employment status? If they are different, explain the difference.
They are different because constructing a conditional distribution by level of education computes the relative frequency for each employment status, given the individual's level of education. Constructing a conditional distribution by employment status computes the relative frequency for each level of education, given the individual's employment status.
An educator wants to determine whether a new curriculum significantly improves standardized test scores for third-grade students. She randomly divides 90 third-graders into two groups. Group 1 is taught using the new curriculum, while Group 2 is taught using the traditional curriculum. At the end of the school year, both groups are given the standardized test and the mean scores are compared. Determine whether the sampling is dependent or independent. Indicate whether the response variable is qualitative or quantitative.
This sampling is independent because the individuals selected for one sample do not dictate which individuals are to be in a second sample. The variable is quantitative because it is a numerical measure.
A histogram of a set of data indicates that the distribution of the data is skewed right. Which measure of central tendency will likely be larger, the mean or the median? Why?
Tips: Note that extreme values tend to change the value of the mean, while having little effect on the median. Answer: The mean will likely be larger because the extreme values in the right tail tend to pull the mean in the direction of the tail.
Suppose that a recent poll found that 67% of adults believe that the overall state of moral values is poor.
a) To calculate the mean, use the formula μ = np. Answer: 201 b) To calculate the S.D., use the formula S.D. = (np(1 - p))^(1/2). Answer: 8.1 c) For every 300 adults, the mean is the number of them that would be expected to believe that the overall state of moral values is poor. d) No
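The two formulas can be checked with a minimal Python sketch. It assumes n = 300 (the sample size implied by part c) and p = 0.67 from the poll; the function name is only illustrative.

```python
import math

def binomial_mean_sd(n, p):
    """Return the mean np and the standard deviation sqrt(np(1 - p)) of a binomial count."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean, sd

# Assumed inputs: n = 300 adults, p = 0.67
mean, sd = binomial_mean_sd(300, 0.67)
print(round(mean), round(sd, 1))
```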
A binomial experiment is performed a fixed number of times. What is each repetition of the experiment called? For each repetition of a binomial experiment, what are the two mutually exclusive outcomes?
Trial Success/Failure
Is the statement below true or false? The distribution of the sample mean, x̄, will be normally distributed if the sample is obtained from a population that is normally distributed, regardless of the sample size.
True
True or False: When comparing two populations, the larger the standard deviation, the more dispersion the distribution has, provided that the variable of interest from the two populations has the same unit of measure.
True, because the standard deviation describes how far, on average, each observation is from the typical value. A larger standard deviation means that observations are more distant from the typical value, and therefore, more dispersed.
Suppose a doctor measures the height, x, and head circumference, y, of 8 children and obtains the data below. The correlation coefficient is 0.890 and the least squares regression line is y = 0.165x + 12.873
USE STATCRUNCH R^2 = 79.3% b) Approximately 79.3% of the variation in head circumference is explained by the least-squares regression model. According to the residual plot, the linear model appears to be appropriate. There must be no discernible pattern in the plot of the residuals in order to be appropriate
Construct a 99% confidence interval of the population proportion using the given information. x = 75, n = 250
USE STATCRUNCH Stat > Proportion Stats > One Sample > Summary x = # of successes (75) n = # of observations (250) Confidence level: 0.99 Method Standard-Wald ANSWERS: Lower: .225 Upper: .375
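As a cross-check on the calculator, here is a rough Python sketch of the Standard-Wald interval, p̂ ± z(α/2) * (p̂(1 - p̂)/n)^(1/2). The function name is hypothetical and only the standard library is used.

```python
import math
from statistics import NormalDist

def wald_interval(x, n, confidence):
    """Standard-Wald confidence interval for a population proportion."""
    p_hat = x / n
    # Critical value z(alpha/2) from the standard normal distribution
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

lower, upper = wald_interval(75, 250, 0.99)
print(round(lower, 3), round(upper, 3))
```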
The following data represents the speed at which a ball was hit (in miles per hour) and the distance it traveled (in feet) for a random sample of home runs in a Major League baseball game in 2018.
USE STATCRUNCH a) ŷ = 4.175x - 23.831 b) The slope of this least-squares regression line says that the distance the ball travels increases by the slope (4.175 feet) with every 1 mile per hour increase in the speed that the ball was hit. c) Interpreting the y-intercept is not appropriate d) The mean distance of all home runs hit at 105 miles per hour is 414.5 feet (PLUG THE SPEED INTO THE EQUATION) e) If a ball was hit with a speed of 105 miles per hour, the distance that it is most likely to travel is 414.5 feet f) The ball did not travel farther than the 419.6 feet that would have been predicted given the speed with which the ball was hit g) No, because the least-squares regression model cannot predict the distance of a home run when the speed of the ball is outside of the scope of the model
A survey was conducted that asked 1007 people how many books they had read in the past year. Results indicated that x̄ = 11.7 and s = 16.6 books. Construct a 90% confidence interval for the mean number of books people read. Interpret the interval.
USE STATCRUNCH SAMPLE T SUMMARY CALC x̄ = 11.7 s = 16.6 Sample size = 1007
Test the hypothesis using the P-value approach. Be sure to verify the requirements of the test. H0: p = 0.4 versus H1 : p > 0.4 n = 200; x = 90; α = 0.1
USE STATCRUNCH: one sample proportion summary calc x = successes n = observations put in p equations a) Whatever the Z-stat is (1.44) b) Whatever the P-value is (0.075) c) Reject the null hypothesis, because the P-value is less than α. For the P-value approach, compare the P-value with α. If the P-Value < α, reject the null hypothesis
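For readers without StatCrunch, the same test can be sketched in Python under the usual normal approximation, z0 = (p̂ - p0) / (p0(1 - p0)/n)^(1/2). The function name is an assumption for illustration; the result should land close to the Z-stat and P-value quoted above.

```python
import math
from statistics import NormalDist

def right_tailed_proportion_test(x, n, p0):
    """Return (z statistic, P-value) for H0: p = p0 versus H1: p > p0."""
    p_hat = x / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = 1 - NormalDist().cdf(z)  # area to the right of z
    return z, p_value

z, p_value = right_tailed_proportion_test(90, 200, 0.4)
alpha = 0.1
print(round(z, 2), round(p_value, 4), "reject H0" if p_value < alpha else "do not reject H0")
```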
Construct a confidence interval for p1 - p2 at the given level of confidence x1 = 358, n1 = 511, x2 = 444, n2 = 566, 95% confidence
USE TWO SAMPLE PROPORTION CALC The researchers are 95% confident the difference between the two population proportions, p1 - p2, is between -0.136 and -0.032
A survey asked, "How many tattoos do you currently have on your body?" Of the 1236 males surveyed, 190 responded that they had at least one tattoo. Of the 1068 females surveyed, 140 responded that they had at least one tattoo. Construct a 95% confidence interval to judge whether the proportion of males that have at least one tattoo differs significantly from the proportion of females that have at least one tattoo. Interpret the interval.
USE TWO SAMPLE PROPORTION CALC L.B.: -0.006 U.B.: 0.051 There is 95% confidence that the difference of the proportions is in the interval. Because the interval contains 0, conclude that there is insufficient evidence of a significant difference in the proportion of males and females that have at least one tattoo.
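Both two-sample interval problems above follow the same recipe, (p̂1 - p̂2) ± z(α/2) * (p̂1(1 - p̂1)/n1 + p̂2(1 - p̂2)/n2)^(1/2). The sketch below is one way to reproduce the calculator output in Python; the function name is hypothetical.

```python
import math
from statistics import NormalDist

def two_proportion_interval(x1, n1, x2, n2, confidence):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# First problem: x1 = 358, n1 = 511, x2 = 444, n2 = 566
print([round(b, 3) for b in two_proportion_interval(358, 511, 444, 566, 0.95)])
# Tattoo survey: males 190 of 1236, females 140 of 1068
print([round(b, 3) for b in two_proportion_interval(190, 1236, 140, 1068, 0.95)])
```

An interval that contains 0 (as in the tattoo survey) gives no evidence of a difference; an interval entirely below 0 suggests p1 < p2.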
An investment counselor calls with a hot stock tip. He believes that if the economy remains strong, the investment will result in a profit of $30,000. If the economy grows at a moderate pace, the investment will result in a profit of $20,000. However, if the economy goes into recession, the investment will result in a loss of $30,000. You contact an economist who believes there is a 30% probability the economy will remain strong, a 60% probability the economy will grow at a moderate pace, and a 10% probability the economy will slip into recession. What is the expected profit from this investment?
Use StatCrunch x p(event) 30,000 .3 20,000 .6 -30,000 .1 Find the mean = $18,000
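The expected profit is just the probability-weighted sum of the outcomes, E(X) = Σ x * P(x). A short Python sketch with the values from the problem (the variable names are illustrative):

```python
# (profit in dollars, probability) for strong, moderate, and recession scenarios
outcomes = [(30_000, 0.3), (20_000, 0.6), (-30_000, 0.1)]

# The probabilities of a discrete distribution must add up to 1
total_probability = sum(p for _, p in outcomes)

expected_profit = sum(x * p for x, p in outcomes)
print(round(expected_profit))
```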
A binomial probability experiment is conducted with the given parameters. Compute the probability of x successes in the n independent trials of the experiment. n =9, p = 0.9, x ≤ 3
Use StatCrunch Binomial calculator
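If the calculator is not at hand, P(X ≤ 3) can be summed directly from the binomial formula P(x) = C(n, x) * p^x * (1 - p)^(n - x). A minimal Python sketch (function name assumed for illustration):

```python
import math

def binomial_cdf(k, n, p):
    """P(X <= k) for a binomial random variable with n trials and success probability p."""
    return sum(math.comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

prob = binomial_cdf(3, 9, 0.9)
print(prob)
```

With p = 0.9, three or fewer successes in nine trials is a very rare outcome, so a probability far below 0.001 is expected.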
Assume the random variable X is normally distributed, with mean μ = 60 and standard deviation σ = 8. Find the 12th percentile.
Use StatCrunch and leave the X value blank and put .12 into the output
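The same percentile lookup can be sketched in Python with the standard library's inverse normal CDF, which returns the x value with the given area to its left:

```python
from statistics import NormalDist

# X is normal with mean 60 and standard deviation 8 (from the problem)
x_12th = NormalDist(mu=60, sigma=8).inv_cdf(0.12)
print(round(x_12th, 1))
```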
The table to the right contains observed values and expected values in parentheses for two categorical variables, X and Y, where variable X has three categories and variable Y has two categories. Use the table to complete parts (a) and (b) below.
a) Use the Cell Test equation: χ²₀ = ((31 - 32.66)^2 / 32.66) + ... b) H0: The Y category and X category are independent H1: The Y category and X category are dependent c) All Chi-Square tests for independence are right-tailed, so the P-value is the area to the right of χ²₀ = 1.840 with 2 degrees of freedom. The P-value is GREATER THAN 0.10. No, do not reject H0. There is not sufficient evidence at the α = 0.05 level of significance to conclude that X and Y are dependent because the P-value > α.
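For the special case of 2 degrees of freedom, the right-tail area of the chi-square distribution has the closed form e^(-χ²/2), so the P-value range can be checked without a table. The sketch below relies on that shortcut and does not apply to other degrees of freedom.

```python
import math

def chi_square_p_value_2df(chi_square):
    """Right-tail area of the chi-square distribution with exactly 2 degrees of freedom."""
    return math.exp(-chi_square / 2)

p_value = chi_square_p_value_2df(1.840)
print(round(p_value, 3), p_value > 0.10)
```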
In randomized, double-blind clinical trials of a new vaccine, infants were randomly divided into two groups. Subjects in group 1 received the new vaccine while subjects in group 2 received a control vaccine. After the second dose, 108 of 653 subjects in the experimental group (group 1) experienced drowsiness as a side effect. After the second dose, 65 of 534 of the subjects in the control group (group 2) experienced drowsiness as a side effect. Does the evidence suggest that a higher proportion of subjects in group 1 experience drowsiness as a side effect than subjects in group 2 at the α = 0.10 level of significance?
a) Verify the model requirements. Select all that apply. The samples are independent. n1p̂1(1 - p̂1) ≥ 10 and n2p̂2(1 - p̂2) ≥ 10. The sample size is less than 5% of the population size for each sample. b) H0: p1 = p2 H1: p1 > p2 c) USE TWO SAMPLE PROPORTION CALC z-score = 2.12 P-value = 0.017 If the population proportions are EQUAL, one would expect a sample difference in proportions GREATER THAN the one observed in about 17 out of 1000 repetitions of this experiment. P-value (.017) * 1000 = 17 Reject H0. There is sufficient evidence to conclude that a higher proportion of subjects in group 1 experience drowsiness as a side effect than subjects in group 2 at the α = 0.10 level of significance.
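A rough Python version of the two-sample proportion test, assuming the usual pooled estimate p̂ = (x1 + x2) / (n1 + n2) under H0: p1 = p2. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z statistic, right-tailed P-value) for H0: p1 = p2 versus H1: p1 > p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # pooled estimate used under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 1 - NormalDist().cdf(z)

z, p_value = two_proportion_z_test(108, 653, 65, 534)
print(round(z, 2), round(p_value, 3), round(p_value * 1000))
```

The last printed value is the "out of 1000 repetitions" count used in the interpretation.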
Determine whether the distribution is a discrete probability distribution.
Yes, because the sum of the probabilities is equal to 1 and each probability is between 0 and 1, inclusive
Suppose a surveyor wants to conduct a phone survey about a new song. He plans to take a simple random sample. However, some people are not at home. Do you believe this can affect the ability of the surveyor to obtain accurate polling results? If so, how?
Yes, especially if the people who are not at home have a trait that is not accurately represented by the remaining people in the sample.
On an international exam, students are asked to respond to a variety of background questions. For the 41 nations that participated in the exam, the correlation between the percentage of items answered in the background questionnaire (used as a proxy for student task persistence) and mean score on the exam was 0.821. Does this suggest there is a linear relation between student task persistence and achievement score? Write a sentence that explains what this result might mean.
Yes, since |0.821| is greater than the critical value for n = 30. Countries in which students answered a greater percentage of items in the background questionnaire tended to have higher mean scores on the exam.
Suppose babies born after a gestation period of 32 to 35 weeks have a mean weight of 3000 grams and a standard deviation of 700 grams while babies born after a gestation period of 40 weeks have a mean weight of 3200 grams and a standard deviation of 470 grams. If a 34-week gestation period baby weighs 3450 grams and a 40-week gestation period baby weighs 3650 grams, find the corresponding z-scores. Which baby weighs more relative to the gestation period?
Z-score = (x - μ) / σ (where x equals weight). Z-SCORES CAN BE NEGATIVE. All μ and σ are given in the word problem. 34-week baby: (3450 - 3000) / 700 = 0.64. 40-week baby: (3650 - 3200) / 470 = 0.96. Answer: The 40-week baby weighs relatively more since its z-score, 0.96, is larger than the z-score of 0.64 for the 34-week baby. REMEMBER: the larger the z-score, the more the baby weighs relative to its gestation period.
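Using the numbers stated in the question, the z-scores can be worked out with a few lines of Python. This is only a sketch of z = (x - μ) / σ; the function name is illustrative.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (positive) or below (negative) the mean."""
    return (x - mu) / sigma

z_34_week = z_score(3450, 3000, 700)  # 34-week baby: mean 3000 g, S.D. 700 g
z_40_week = z_score(3650, 3200, 470)  # 40-week baby: mean 3200 g, S.D. 470 g
print(round(z_34_week, 2), round(z_40_week, 2))
```

The baby with the larger z-score weighs more relative to its gestation period.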
The following data represents the level of health and the level of education for a random sample of 1666 residents. Complete parts (a) and (b) below.
a) H0: Level of education and health are independent H1: Level of education and health are dependent b) Contingency Table WITH SUMMARY Select Columns (All but education) Row Labels: Education Display: Expected Count χ²₀ = 31.358 The P-value is LESS THAN 0.01. Reject H0. There is sufficient evidence that level of education and health are associated. c) Using the same table, DIVIDE each cell (original values, not the ones in ( )'s) by the ROW total d) Match bar graphs to correct cells
Given below is the smoking status by level of education for residents 18 years old or older living in a certain country from a random sample of 1052 residents. Complete parts (a) and (b)
a) H0: Smoking status and level of education are independent H1: Smoking status and level of education are dependent b) Contingency Table WITH SUMMARY Select Columns (All but Number of Years of Education) Row Labels: Number of Years of Education Display: Expected Count χ²₀ = 9.999 The P-value is GREATER THAN 0.10. The P-value is GREATER THAN α, so DO NOT REJECT H0. There IS NOT sufficient evidence that smoking status and level of education are related. c) Using the same table, DIVIDE each cell (original values, not the ones in ( )'s) by the ROW total d) Choose the matching bar graph for each group of cells e) Yes, none of the groups show a clear relationship with education.
A traffic safety company publishes reports about motorcycle fatalities and helmet use. In the first accompanying data table, the distribution shows the proportion of fatalities by location of injury for motorcycle accidents. The second data table shows the location of injury and fatalities for 2051 riders not wearing a helmet. Complete parts (a) and (b) below.
a) H0: The distribution of fatal injuries for riders not wearing a helmet FOLLOWS the same distribution for all other riders. H1: The distribution of fatal injuries for riders not wearing a helmet DOES NOT FOLLOW the same distribution for all other riders. b) Multiply the total number of riders (2051) by percentages c)Chi-Square goodness-of-fit Calc Observed: Frequency Expected: Probability P-value is less than 0.01 Chi-Square = 115.943 P-Value = 0.000 The range of P-values for the test is LESS THAN 0.01 REJECT H0. There IS sufficient evidence that the distribution of fatal injuries for riders not wearing a helmet DOES NOT FOLLOW the distribution for all riders. Motorcycle fatalities from HEAD INJURIES occur MORE frequently for riders not wearing a helmet, while they occur LESS frequently for all other locations.
A book claims that more hockey players are born in January through March than in October through December. The following data shows the number of players selected in a draft of new players for a hockey league according to their birth months. Is there evidence to suggest that hockey players' birthdates are not uniformly distributed throughout the year? Use the level of significance α = 0.01.
a) H0: The distribution of hockey players' birth months is uniformly distributed H1: The distribution of hockey players' birth months is not uniformly distributed. b) Add up total number of hockey players and multiply them by their probabilities. Jan-March = 3/12 (You should get the same value for all (46.75)) c) Chi-Square goodness-of-fit Calc YOU HAVE TO MANUALLY INSERT THE EXPECTED COUNT INTO THE NEXT COLUMN Observed: Frequency Expected: Expected (all 46.75) P-value is less than 0.005 Chi-Square = 15.76 P-Value = 0.0013 Yes, because the calculated P-value is less than the given α level of significance
A book claims that more hockey players are born in January through March than in October through December. The following data shows the number of players selected in a draft of new players for a hockey league according to their birth months. Is there evidence to suggest that hockey players' birthdates are not uniformly distributed throughout the year? Use the level of significance α = 0.01.
a) H0: The distribution of hockey players' birth months is uniformly distributed H1: The distribution of hockey players' birth months is not uniformly distributed. b) Add up the total number of hockey players and multiply them by their probabilities. Jan-March = 3/12 (You should get the same value for all (44.5)) c) Chi-Square goodness-of-fit Calc YOU HAVE TO MANUALLY INSERT THE EXPECTED COUNT INTO THE NEXT COLUMN Observed: Frequency Expected: Expected (all 44.5) Chi-Square = 14.40 P-Value = 0.002 Yes, because the calculated P-value is less than the given α level of significance
A researcher wanted to determine whether certain accidents were uniformly distributed over the days of the week. The data shows the day of the week for n = 303 randomly selected accidents. Is there a reason to believe that the accidents occur with equal frequency with respect to the day of the week at the α = 0.05 level of significance?
a) H0: p1 = p2 = ... = p7 = 1/7 H1: At least one proportion is different from the others. b) Multiply the total number of accidents by their expected proportion (1/7) = 43.286 SHOULD ALL BE THE SAME c)Chi-Square goodness-of-fit Calc YOU HAVE TO MANUALLY INSERT THE EXPECTED COUNT INTO THE NEXT COLUMN Observed: Frequency Expected: Expected (all 43.286) Chi-Square = 14.541 P-Value = 0.0241 The P-value is BETWEEN 0.01 AND 0.025. Reject H0, because the calculated P-value is less than the given level of significance.
To test the belief that sons are taller than their fathers, a student randomly selects 13 fathers who have adult male children. She records the height of both the father and son in inches and obtains the following data. Are sons taller than their fathers? Use the α = 0.05 level of significance. Note: a normal probability plot and boxplot of the data indicate that the differences are approximately normally distributed with no outliers.
a) The sampling method results in a dependent sample. The differences are normally distributed or the sample size is large. The sample size is no more than 5% of the population size. H0: μd = 0 versus H1: μd < 0 Test statistic: Use PAIRED T CALC (0.02) The P-value is in the range of 0.25 < P-value < 1. DO NOT REJECT H0 because the P-value is GREATER than the level of significance. There IS NOT sufficient evidence to conclude that sons ARE TALLER THAN their fathers at the 0.05 level of significance.
Determine whether the events E and F are independent or dependent. Justify your answer.
a) E: A person having a high GPA F: The same person being a heavy reader of assigned course material. E and F are dependent because being a heavy reader of assigned course materials can affect the probability of a person having a high GPA. b) E: A randomly selected person planting tulip bulbs in October F: A different randomly selected person planting tulip bulbs in April E cannot affect F and vice versa because the people were randomly selected, so the events are independent c) E: The consumer demand for synthetic diamonds. F: The amount of research funding for diamond synthesis The consumer demand for synthetic diamonds could affect the amount of research funding for diamond synthesis, so E and F are dependent.
A simple random sample of size n is drawn. The sample mean, x̄, is found to be 17.5, and the sample standard deviation, s, is found to be 4.1.
a) LOWER BOUNDARY: x̄ - t(α/2) * s / (n)^(1/2), UPPER BOUNDARY: x̄ + t(α/2) * s / (n)^(1/2), where s is the sample S.D. b) Same formulas with the new sample size. How does increasing the sample size affect the margin of error, E? THE MARGIN OF ERROR DECREASES c) Same formulas with the new level of confidence. Compare the results to those obtained in part (a). How does increasing the level of confidence affect the size of the margin of error E? THE MARGIN OF ERROR INCREASES d) If the sample size is 13, what conditions must be satisfied to compute the confidence interval? THE SAMPLE DATA MUST COME FROM A POPULATION THAT IS NORMALLY DISTRIBUTED WITH NO OUTLIERS.
Twelve jurors are randomly selected from a population of 5 million residents. Of these 5 million residents, it is known that 47% are of a minority race. Of the 12 jurors selected, 2 are minorities.
a) 2 minority jurors / 12 total jurors = .17 b) Use Statcrunch Use ≤ c) The number of minorities on the jury is unusually low, given the composition of the population from which it came
A doctor wants to estimate the mean HDL cholesterol of all 20 - to 29-year-old females. How many subjects are needed to estimate the mean HDL cholesterol within 3 points with 99% confidence assuming s = 18.3 based on earlier studies? Suppose the doctor would be content with 95% confidence. How does the decrease in confidence affect the sample size required?
a) 247, from n = [(z(α/2) * s) / E]^2 with E = 3 points (NOT A DECIMAL). REMEMBER TO ROUND UP!!!! b) 143, from the same formula with the z-value for 95% confidence c) Decreasing the confidence level decreases the sample size needed. USING STATCRUNCH: T Confidence calculator for QUANTITATIVE measures Confidence level = .99 Std dev. = 18.3 Width = Margin of error * 2, so 3 points * 2 = 6 width
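The sample-size formula n = [(z(α/2) * s) / E]^2, rounded up, can be sketched in Python as follows. The function name is an assumption; the inputs come from the problem.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(s, margin, confidence):
    """Smallest n that estimates a mean within the given margin at the given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * s / margin) ** 2)  # always round UP

n_99 = sample_size_for_mean(18.3, 3, 0.99)
n_95 = sample_size_for_mean(18.3, 3, 0.95)
print(n_99, n_95)
```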
The following graph represents the results of a survey, in which a random sample of adults was asked if they believe that divorce was morally wrong in general.
a) 70% b) 54 million c) If a polling organization claimed that the results of the survey indicate that 10% of adults believe that divorce is acceptable in certain situations, would you say this statement is descriptive or inferential? Why? The statement is inferential because it makes a prediction
What is meant by a marginal distribution? What is meant by a conditional distribution?
a) A marginal distribution is a frequency or relative frequency distribution of either the row or column variable in a contingency table b) A conditional distribution lists the relative frequency of each category of the response variable, given a specific value of the explanatory variable in a contingency table
A polling organization contacts 1689 adult men who are 40 to 60 years of age and live in the United States and asks whether or not they had seen their family doctor within the past 6 months.
a) Adult men who are 40 to 60 years of age and live in the United States. b) The 1689 adult men who are 40 to 60 years of age and live in the United States
A quality-control manager randomly selects 100 bottles of olive oil that were filled on August 14 to assess the calibration of the filling machine. a) What is the population in the study? b) What is the sample in the study?
a) All bottles of olive oil produced in the plant on August 14. b) The 100 bottles of olive oil selected in the plant on August 14.
The following newspaper type graphic illustrates the ideal family size (total children) based on a survey of adults from a certain country.
a) Bar graph b) A reader cannot tell whether the graph ends at the top of the nipple on the baby bottle or at the end of the milk c) the accurate bar graph
A certain drug can be used to reduce the acid produced by the body and heal damage to the esophagus due to acid reflux. The manufacturer of the drug claims that more than 91% of patients taking the drug are healed within 8 weeks. In clinical trials, 219 of 237 patients suffering from acid reflux disease were healed after 8 weeks. Test the manufacturer's claim at the α = 0.01 level of significance.
a) Because np0(1 - p0) = 19.4 > 10, the sample size is LESS THAN 5% of the population size, and the sample CAN BE REASONABLY ASSUMED TO BE RANDOM, all of the requirements for testing the hypothesis ARE satisfied. n = 237 p0 = 91% b) H0: p = .91 versus H1: p > .91 c) USE STATCRUNCH Proportion summary calc The answer is the Z-Stat and P-value d) Do not reject the null hypothesis. There is insufficient evidence to conclude that more than 91% of patients taking the drug are healed within 8 weeks.
Researchers selected 839 patients at random among those who take a certain widely-used prescription drug daily. In a clinical trial, 24 out of the 839 patients complained of flu-like symptoms. Suppose that it is known that 2.5% of patients taking competing drugs complain of flu-like symptoms. Is there sufficient evidence to conclude that more than 2.5% of this drug's users experience flu-like symptoms as a side effect at the α = 0.01 level of significance?
a) Because np0( 1 - p0) = 20.5 > 10, the sample size is LESS THAN 5% of the population size, and the patients in the sample WERE selected at random, all of the requirements for testing the hypothesis ARE satisfied. n = 839 p0 = 2.5% b) μ = population mean, σ = population S.D., and p = population proportion H0: p = 0.025 versus H1: p > 0.025 c) USE STAT CRUNCH Proportion summary calc The answer is the Z-Stat and P-value d) Since the P-value is GREATER than α, DO NOT REJECT the null hypothesis. There IS NOT sufficient evidence at the α = 0.01 level of significance to conclude that MORE THAN 2.5% of the users who take the prescription daily complained of flu-like symptoms.
In a previous poll, 31% of adults with children under the age of 18 reported that their family ate dinner together seven nights a week. Suppose that, in a more recent poll, 1189 adults with children under the age of 18 were selected at random, and 344 of those 1189 adults reported that their family ate dinner together seven nights a week. Is there sufficient evidence that the proportion of families with children under the age of 18 who eat dinner together seven nights a week has decreased? Use α = 0.01 the significance level.
a) Because np0( 1 - p0) = 254.3 > 10, the sample size is LESS THAN 5% of the population size, and the adults in the sample WERE selected at random, all of the requirements for testing the hypothesis ARE satisfied. n = 1189 p0 = 31% b) H0: p = .31 versus H1: p < .31 c) USE STAT CRUNCH Proportion summary calc The answer is the Z-Stat and P-value THE Z-STAT CAN BE NEGATIVE d) Since the P-value is GREATER than α, DO NOT REJECT the null hypothesis. There IS NOT sufficient evidence at the α = 0.01 level of significance to conclude that the proportion of families with children under the age of 18 who eat dinner together seven nights a week is LESS THAN .31.
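The three one-sample proportion tests above (acid reflux drug, flu-like symptoms, family dinners) share the same steps: check np0(1 - p0) ≥ 10, compute the z statistic, and compare the P-value with α. A rough Python sketch under the normal approximation, with a hypothetical helper name:

```python
import math
from statistics import NormalDist

def proportion_test(x, n, p0, tail):
    """Return (np0(1 - p0), z statistic, P-value) for a one-sample proportion test.

    tail is "right" for H1: p > p0 and "left" for H1: p < p0.
    """
    check = n * p0 * (1 - p0)
    z = (x / n - p0) / math.sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf(z)
    p_value = 1 - cdf if tail == "right" else cdf
    return check, z, p_value

problems = {
    "acid reflux drug": (219, 237, 0.91, "right"),
    "flu-like symptoms": (24, 839, 0.025, "right"),
    "family dinners": (344, 1189, 0.31, "left"),
}
alpha = 0.01
for name, args in problems.items():
    check, z, p_value = proportion_test(*args)
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(name, round(check, 1), round(z, 2), round(p_value, 4), decision)
```

Note that the z statistic for the left-tailed test comes out negative, as the notes point out.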
Researchers wish to know if there is a link between hypertension (high blood pressure) and consumption of salt. Past studies have indicated that the consumption of vegetables offsets the negative impact of salt consumption. It is also known that there is quite a bit of person-to-person variability as far as the ability of the body to process and eliminate salt. However, no method exists for identifying individuals who have a higher ability to process salt. It is recommended that daily intake of salt should not exceed 2500 milligrams (mg). The researchers want to keep the design simple, so they choose to conduct their study using a completely randomized design. Complete parts (a) through (c).
a) Blood pressure b) Body's ability to process salt, daily consumption of salt, and daily consumption of vegetables c) BP -not a factor Daily consump. of salt - can be controlled Daily consump. of veggies - can be controlled Body's ability to process salt - cannot be controlled Age- not a factor gender - not a factor Experimental units should be randomly assigned to each treatment group
Researchers wanted to determine if there was an association between the level of stress of an individual and their risk of breast cancer. The researchers studied 1555 people over the course of 7 years. During this 7-year period, they interviewed the individuals and asked questions about their daily lives and the hassles they face. In addition, hypothetical scenarios were presented to determine how each individual would handle the situation. These interviews were videotaped and studied to assess the emotions of the individuals. The researchers also determined which individuals in the study experienced any type of breast cancer over the 7-year period. After their analysis, the researchers concluded that the stress-free individuals were less likely to experience breast cancer. Complete parts (a) through (c).
a) Cohort study -> by observing them over a long period of time b) Whether or not breast cancer was contracted -> is the variable of interest level of stress -> affects the other variables c) The researchers may be concerned with confounding that occurs when the effects of two or more explanatory variables are not separated or when there are some explanatory variables that were not considered in a study, but that affect the value of the response variable.
Researchers wanted to evaluate whether ginkgo, an over-the-counter herb marketed as enhancing memory, improves memory in elderly adults as measured by objective tests. To do this, they recruited 87 men and 135 women older than 50 years and in good health. Participants were randomly assigned to receive ginkgo, 30 milligrams (mg) 3 times per day, or a matching placebo. The measure of memory improvement was determined by a standardized test of learning and memory. After 6 weeks of treatment, the data indicated that ginkgo did not increase performance on standard tests of learning, memory, attention, and concentration. These data suggest that, when taken following the manufacturer's instructions, ginkgo provides no measurable increase in memory or related cognitive function to adults with healthy cognitive function. Complete parts (a) through (g) below.
a) Completely randomized design b) Adults older than 50 years and in good health c) The score on the standardized test of learning and memory d) The drug: 30 mg 3 times a day or a matching placebo. The 30 mg of the drug 3 times a day or a matching placebo. e) The 87 men and 135 women older than 50 who are in good health that participated in the study f) The placebo group serves as the control group because this group corresponds to the reference point that will be compared to the other group g) The picture with a group of (herb) and (placebo), not the one that has two of each
Suppose a simple random sample of size n = 200 is obtained from a population whose size is N = 30,000 and whose population proportion with a specified characteristic is p = 0.6
a) Find 5% of the population and use np(1 - p) ≥ 10 ANSWER: Approximately normal because n ≤ 0.05N and np(1 - p) ≥ 10.
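The two conditions in this answer are simple arithmetic checks, sketched here in Python with the values from the problem (variable names are illustrative):

```python
n = 200        # sample size
N = 30_000     # population size
p = 0.6        # population proportion

independent_enough = n <= 0.05 * N        # sample is at most 5% of the population
large_enough = n * p * (1 - p) >= 10      # np(1 - p) is at least 10

print(0.05 * N, round(n * p * (1 - p), 1), independent_enough and large_enough)
```

When both checks pass, the sampling distribution of p̂ can be treated as approximately normal.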
Several years ago, 43% of parents with children in grades K-12 were satisfied with the quality of education the students receive. A recent poll found that 488 of 1,135 parents with children in grades K-12 were satisfied with the quality of education the students receive. Construct a 90% confidence interval to assess whether this represents evidence that parents' attitudes toward the quality of education have changed.
a) H0: p = .43 versus H1: p ≠ .43 b) USE STATCRUNCH Proportion summary calc but use confidence instead of hypothesis c) Since the interval contains the proportion stated in the null hypothesis, there is insufficient evidence that parents' attitudes toward the quality of education have changed.
One year, the mean age of an inmate on death row was 40.8 years. A sociologist wondered whether the mean age of a death-row inmate has changed since then. She randomly selects 32 death-row inmates and finds that their mean age is 39.8, with a standard deviation of 8.4. Construct a 95% confidence interval about the mean age. What does the interval imply?
a) H0: μ = 40.8 versus H1: μ ≠ 40.8 b) USE STATCRUNCH T SUMMARY CALC mean = 39.8 With 95% confidence, the mean age of a death row inmate is between 36.77 years and 42.83 years. c) Since the mean age from the earlier year is contained in the interval, there is not sufficient evidence to conclude that the mean age had changed.
A credit score is used by credit agencies (such as mortgage companies and banks) to assess the creditworthiness of individuals. Values range from 300 to 850, with a credit score over 700 considered to be a quality credit risk. According to a survey, the mean credit score is 710.6. A credit analyst wondered whether high-income individuals (incomes in excess of $100,000 per year) had higher credit scores. He obtained a random sample of 42 high-income individuals and found the sample mean credit score to be 723.9 with a standard deviation of 84.9. Conduct the appropriate test to determine if high-income individuals have higher credit scores at the α = 0.10 level of significance.
a) H0: μ = 710.6 H1: μ > 710.6 b) USE STATCRUNCH T SUMMARY CALC mean = 723.9 t0 = 1.02 c) The P-value is in the range of P-value > 0.10 DO NOT REJECT the null hypothesis. There IS NOT sufficient evidence to conclude that the mean credit score of high-income individuals is GREATER THAN 710.6.
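The test statistic t0 can be reproduced without StatCrunch. A small sketch, assuming the usual one-sample formula t0 = (x̄ - μ0) / (s/√n); the helper name `t_statistic` is hypothetical. The P-value itself needs a t-distribution with n - 1 = 41 degrees of freedom, which the Python standard library does not provide, so only the statistic is computed here.

```python
from math import sqrt

def t_statistic(sample_mean, mu0, s, n):
    """One-sample t statistic: distance from mu0 in standard-error units."""
    return (sample_mean - mu0) / (s / sqrt(n))

# Credit score problem: x-bar = 723.9, mu0 = 710.6, s = 84.9, n = 42
t0 = t_statistic(723.9, 710.6, 84.9, 42)
print(round(t0, 2))
```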
Does a platelet-rich plasma (PRP) injection into the scalp promote hair growth? Researchers identified 40 female patients with female pattern hair loss. The patients ranged in age from 30 to 60 years. For each patient, two areas with hair loss were identified. A coin toss was used to decide which area received the PRP injection and which received a saline injection. Injections were given to each patient at one-week intervals. After 6 months the change in the patients' hair density (number of hairs per square centimeter) and hair diameter (millimeters) was measured. The mean difference in hair density (PRP minus saline) was 62.65 hairs/cm² and the mean difference in hair thickness was 0.15 mm. Complete parts (a) through (g) below.
a) Matched-pairs design b) Females with hair loss 30 to 60 years of age c) Hair diameter and hair density d) The treatment is the injection and it is set at two levels: platelet-rich plasma (PRP) injection or saline injection e) The forty female patients f) Randomly choose the area of the scalp that receives the treatment g) Figure 2 that only has 1 row of boxes
The following survey has bias. (a) Determine the type of bias. (b) Suggest a remedy. A creamery is considering opening a new store in another town. Before opening the store, the company would like to know the percentage of households in this town that regularly visit an ice cream shop. The market researcher obtains a list of households and randomly selects 150 of them. He mails a questionnaire to the 150 households that asks about ice cream eating habits. Of the 150 questionnaires mailed, 4 are returned.
a) Nonresponse bias b) Conduct face-to-face or telephone interviews.
The survey has bias. (a) Determine the types of bias. (b) Suggest a remedy. A polling organization conducts a study to estimate the percentage of households that have high-speed Internet access. It mails a questionnaire to 1640 randomly selected households across the country and asks the head of each household if he or she has high-speed Internet access. Of the 1640 households selected, 28 responded.
a) Nonresponse bias b) The polling organization should try contacting households that do not respond by phone or face-to-face.
A survey of 36 randomly selected students who dropped a course was conducted at a college. The following results were collected. Complete parts (a) through (c).
a) Open Statcrunch and sort both categories by Ascending and count up the frequencies MAKE SURE TO ADD ALL UP TO DOUBLE CHECK b) H0: Gender and drop reason are independent H1: Gender and drop reason are dependent c) Delete qualitative data and input the found contingency table (including the genders in row 1 and 2, and reasons as column titles) Contingency Table WITH SUMMARY Select Columns (All Except Gender) Row Labels: Gender Display: Expected Count χ²₀ = 0.007 The P-value is GREATER THAN 0.10 Fail to reject H0. There is not sufficient evidence that gender and drop reasons are dependent. Using the same table DIVIDE Each Cell (original values not the ones in ( )'s) by the ROW total Choose the matching bar graph for each group of cells
A test to determine whether a certain antibody is present is 99.4% effective. This means that the test will accurately come back negative if the antibody is not present (in the test subject) 99.4% of the time. The probability of a test coming back positive when the antibody is not present (a false positive) is 0.006. Suppose the test is given to four randomly selected people who do not have the antibody. a) What is the probability that the test comes back negative for all four people? b) What is the probability that the test comes back positive for at least one of the four people?
a) P(all 4 tests are negative)= 0.9762 (Multiplied .994 four times) b) P(at least one positive)= 0.0238 (1-0.9762)
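Both answers follow from the multiplication rule for independent events and the complement rule. A quick check in Python (variable names are just for illustration):

```python
# Probability a single test is (correctly) negative when no antibody is present
p_negative = 0.994
n_people = 4

# Independent tests: multiply the individual probabilities
p_all_negative = p_negative ** n_people

# "At least one positive" is the complement of "all negative"
p_at_least_one_positive = 1 - p_all_negative

print(round(p_all_negative, 4))
print(round(p_at_least_one_positive, 4))
```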
Michael went to the driving range with his range finder and hit 75 golf balls with his pitching wedge and measured the distance each ball traveled (in yards). The accompanying table shows his data.
a) Pick the most accurate graph (check range of data) and pick the one with the y scale being decimals (c) The shape of the distribution is approximately normal b) Pick the one where the line curves and has decimal y scale Yes, because the histogram shape resembles a normal curve, and the area of each bar is roughly equal to the area under the normal curve for the same region.
The survey has bias. (a) Determine the type of bias. (b) Suggest a remedy. An anti-smoking advocate wants to estimate the percentage of people who favor decreasing the use of advertising to sell cigarettes. He conducts a nationwide survey of 1440 randomly selected adults 18 years and older. The interviewer asks the respondents, "Do you favor higher health insurance premiums for smokers?"
a) Response bias b) The interviewer should reword the question
The owner of a shopping mall wishes to expand the number of shops available in the food court. He has a market researcher survey the first 90 customers who come into the food court during weekend afternoons to determine what types of food the shoppers would like to see added to the food court. Complete parts (a) and (b) below.
a) Sampling bias b) Ask customers throughout the day on both weekdays and weekends
A college entrance exam company determined that a score of 23 on the mathematics portion of the exam suggests that a student is ready for college-level mathematics. To achieve this goal, the company recommends that students take a core curriculum of math courses in high school. Suppose a random sample of 250 students who completed this core set of courses results in a mean math score of 23.3 on the college entrance exam with a standard deviation of 3.5. Do these results suggest that the students who complete the core curriculum are ready for college-level mathematics? That is, are they scoring above 23 on the mathematics portion of the exam?
a) The appropriate null and alternative hypotheses are H0: μ = 23 versus H1: μ > 23. b) The students' test scores were independent of one another. The sample size is larger than 30 The students were randomly sampled c) USE STATCRUNCH T SUMMARY CALC mean = 23.3 t0 = 1.36 p-value = 0.088 d) DO NOT REJECT the null hypothesis and claim that there IS NOT sufficient evidence to conclude that the population mean is GREATER than 23.
The random-number generator on calculators randomly generates a number between 0 and 1. The random variable x, the number generated, follows a uniform probability distribution. (a) Identify the graph of the uniform density function. (b) What is the probability of generating a number between 0.89 and 1? (c) What is the probability of generating a number greater than 0.84?
a) The graph with height 1 that runs from 0 to 1 b) 1 - .89 = .11 c) 1 - .84 = .16
Three years ago, the mean price of an existing single-family home was $243,799. A real estate broker believes that existing home prices in her neighborhood are lower. (a) State the null and alternative hypotheses in words (b) State the null and alternative hypotheses symbolically (c) Explain what it would mean to make a Type I error (d) Explain what it would mean to make a Type II error
a) The mean price of a single-family home in the broker's neighborhood is $243,799. The mean price of a single-family home in the broker's neighborhood is less than $243,799. b) H0: μ = 243799 H1: μ < 243799 c) The broker REJECTS the hypothesis that the mean price is EQUAL TO 243799, when the true mean price is EQUAL TO 243799 d) The broker FAILS TO reject the hypothesis that the mean price is EQUAL TO 243799, when the true mean price is LESS THAN 243799
Suppose that a single card is selected from a standard 52-card deck. What is the probability that the card drawn is a king? Now suppose that a single card is drawn from a standard 52-card deck, but it is told that the card is a face card (jack, queen, or king). What is the probability that the card drawn is a king?
a) The probability that the card drawn from a standard 52-card deck is a king is 0.077 (4/52) b) The probability that the card drawn from a standard 52-card deck is a king, given that this card is a face card, is 0.333 (4/12)
According to a study, the proportion of people who are satisfied with the way things are going in their lives is .88. Suppose that a random sample of 100 people is obtained.
a) The response is qualitative because the responses can be classified based on the characteristic of being satisfied or not. b) The sample proportion p̂ is a random variable because the value of p̂ varies from sample to sample. The variability is due to the fact that different people feel differently regarding their satisfaction. c) Since the sample size is NO MORE than 5% of the population size and np(1-p)= 10.56 ≥10, the distribution of p̂ is approximately normal with μ = .88 and σ = .032 d) USE STATCRUNCH put .90 into the greater than box e) USE STATCRUNCH put .82 into the less than box ANSWER: The probability that 82 or fewer people in the sample are satisfied is .0324, which IS unusual because this probability IS less than 5%
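Parts (c) through (e) can be reproduced with a normal model for p̂. A sketch using `statistics.NormalDist` in place of the StatCrunch Normal Calculator (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.88, 100

# Requirement check: np(1 - p) must be at least 10
requirement = n * p * (1 - p)

# Sampling distribution of p-hat: mean p, standard deviation sqrt(p(1-p)/n)
sigma_p_hat = sqrt(p * (1 - p) / n)
p_hat_dist = NormalDist(mu=p, sigma=sigma_p_hat)

# Part (d): P(p-hat > 0.90); part (e): P(p-hat <= 0.82)
prob_above_90 = 1 - p_hat_dist.cdf(0.90)
prob_at_most_82 = p_hat_dist.cdf(0.82)

print(round(requirement, 2), round(sigma_p_hat, 3))
print(round(prob_above_90, 4), round(prob_at_most_82, 4))
```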
In a certain game of chance, a wheel consists of 38 slots numbered 00, 0, 1, 2,..., 36. To play the game, a metal ball is spun around the wheel and is allowed to fall into one of the numbered slots. Complete parts (a) through (c) below.
a) The sample space is {00, 0, 1, 2,..., 36} b) The probability that the metal ball falls into the slot marked 8 is 0.0263. If the wheel is spun 1000 times, it is expected that about 26 of those times result in the ball landing in slot 8. c) The probability that the metal ball lands in an odd slot is 0.4737 (00 and 0 do NOT count as odd numbers, so 18 of the 38 slots are odd). If the wheel is spun 100 times, it is expected that about 47 of those times result in the ball landing on an odd number.
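The two probabilities come from counting equally likely slots. A small sketch that builds the sample space and counts, treating 00 and 0 as neither odd nor even for this game (names are illustrative):

```python
from fractions import Fraction

# Sample space: 00, 0, and 1 through 36 (38 equally likely slots)
slots = ["00", "0"] + [str(k) for k in range(1, 37)]

# P(ball lands in slot 8) = 1/38
p_eight = Fraction(slots.count("8"), len(slots))

# Odd slots are 1, 3, ..., 35; 00 and 0 are excluded explicitly
odd_slots = [s for s in slots if s not in ("00", "0") and int(s) % 2 == 1]
p_odd = Fraction(len(odd_slots), len(slots))

print(len(slots), round(float(p_eight), 4), round(float(p_odd), 4))
print(round(1000 * p_eight), round(100 * p_odd))
```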
Researchers wanted to determine if there was an association between daily cantaloupe consumption and the occurrence of breast cancer. The researchers looked at 94,899 women and asked them to report their cantaloupe-eating habits. The researchers also determined which of the women had IDC breast cancer. After their analysis, the researchers concluded that the consumption of two or more servings of cantaloupe per day was associated with a reduction in IDC breast cancer. a) What type of observational study was this? Explain. b) What was the response variable in the study? Is the response variable qualitative or quantitative? What is the explanatory variable? c) In their report, the researchers stated that "After adjusting for various demographics and lifestyle variables, daily consumption of two or more servings was associated with a 30% reduced prevalence of IDC breast cancer." Why was it important to adjust for these
a) This was a cross-sectional study because all information about the individuals was collected at a specific point in time. b) The response variable is whether the woman has IDC breast cancer or not. The response variable is qualitative. The explanatory variable is the consumption of cantaloupe. c) The researchers may be concerned with confounding, which occurs when the effects of two or more explanatory variables are not separated or when there are some explanatory variables that were not considered in a study, but that affect the value of the response variable.
The first significant digit in any number must be 1,2,3,4,5,6,7,8, or 9. It was discovered that first digits do not occur with equal frequency. Probabilities of occurrence of the first digit in a number are shown in the accompanying table. The probability distribution is now known as Benford's Law. For example, the following distribution represents the first digits in 213 allegedly fraudulent checks written to a bogus company by an employee attempting to embezzle funds from his employer. Complete parts (a) through (c) below.
a) To lower the chances of a Type I error, use α = 0.01. Use the Chi-Square goodness-of-fit calc with Observed: Frequency and Expected: Probability. Chi-Square = 34.246, P-value = 0.000, so the P-value is less than 0.01. Reject H0 because the calculated P-value is less than the given α level of significance. Yes, the first digits do not obey Benford's Law.
The null and alternative hypotheses are given. Determine whether the hypothesis test is left-tailed, right-tailed, or two-tailed. What parameter is being tested? H0: μ = 3 H1: μ ≠ 3
a) Two-tailed test. If H0 uses = and H1 uses ≠: two-tailed. If H0 uses = and H1 uses >: right-tailed. If H0 uses = and H1 uses <: left-tailed. (μ = population mean, σ = population S.D., and p = population proportion) b) Population mean
The accompanying data represents the total travel tax (in dollars) for a 3-day business trip in 8 randomly selected cities. A normal probability plot suggests the data could come from a population that is normally distributed. A box plot indicates there are no outliers.
a) USE STATCRUNCH SUMMARY STATS for MEAN b) USE STATCRUNCH ONE SAMPLE T CALC One can be 95% confident that the mean travel tax for all cities is between $73.86 and $93.69 c) The researcher could decrease the level of confidence.
According to a study done by Nick Wilson of Otago University Wellington, the probability a randomly selected individual will not cover his or her mouth when sneezing is 0.267. Suppose you sit on a bench in a mall and observe people's habits as they sneeze.
a) Use StatCrunch with the = option b) Use StatCrunch with the < option c) Recall that success is defined as a person NOT covering their mouth when sneezing, so find P(X > 7)
Test the hypothesis using the P-value approach. Be sure to verify the requirements of the test. H0: p = 0.5 versus H1: p > 0.5 n = 250; x = 145; α = 0.1
a) Yes b) Use STATCRUNCH one sample proportion summary calc REJECT the null hypothesis, because the P-value is LESS than α
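The requirement check and the P-value can be reproduced by hand. A sketch of the one-proportion z-test, assuming the usual normal approximation with standard error √(p0(1-p0)/n); variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0, n, x, alpha = 0.5, 250, 145, 0.1

# Requirement: n * p0 * (1 - p0) >= 10 (the sample is also assumed to be
# no more than 5% of the population)
requirement = n * p0 * (1 - p0)

p_hat = x / n
z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Right-tailed test because H1 is p > 0.5
p_value = 1 - NormalDist().cdf(z0)
reject_h0 = p_value < alpha

print(requirement, round(z0, 2), reject_h0)
```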
Suppose a simple random sample of size n = 200 is obtained from a population whose size is N = 30000 and whose population proportion with a specified characteristic is p = 0.6
a) Find 5% of the population and use np(1 - p) ≥ 10. ANSWER: Approximately normal because n ≤ 0.05N and np(1 - p) ≥ 10. μp̂ = p = .6 σp̂ = [ (p(1-p)) / n ]^(1/2) = .034641 b) use statcrunch (.2819) c) use statcrunch (.3864)
A polling agency reported that the proportion of 12th grade students who are engaged is 0.34. Engagement is defined as a student's involvement and enthusiasm with school. The superintendent of one high school district believes the engagement of her students is higher than the proportion reported. She randomly surveys 90 of the 3600 students in the district and finds that 38 reported being engaged in school.
a) qualitative b) Find the sample proportion p̂ = x/n = 38/90, so p̂ = .4222 c) The sample size, n = 90, is LESS THAN 5% of the population size, since 0.05N = .05(3600) = 180, and np̂(1-p̂) = 22.0 ≥ 10 d) μp̂ = p = .34 (USE THE SAME ONE FROM THE QUESTION) σp̂ = [ (p(1-p)) / n ]^(1/2) = .0499 e) Use Normal Calculator with mean = .34 and S.D. = .0499 and put .4222 in the greater than box. If 100 different random samples of 90 students were obtained, 5 would be expected to have at least 38 students that are engaged.
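Parts (b), (d), and (e) chain together, so it helps to see them in one place. A sketch using `statistics.NormalDist` as a stand-in for the Normal Calculator (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

p, n, x = 0.34, 90, 38

# Part (b): sample proportion
p_hat = x / n

# Part (d): the sampling distribution uses the reported p, not p-hat
sigma_p_hat = sqrt(p * (1 - p) / n)

# Part (e): P(p-hat >= 38/90) under the reported proportion
prob = 1 - NormalDist(mu=p, sigma=sigma_p_hat).cdf(p_hat)
expected_out_of_100 = round(100 * prob)

print(round(p_hat, 4), round(sigma_p_hat, 4), expected_out_of_100)
```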
A media statistics agency reports that the mean number of televisions in a household in a particular country is 2.24. Assume that the population standard deviation number of television sets in this country is 1.38. a) What type of variable is number of television sets in a household? b) A random sample of 40 households results in a total of 102 television sets. What is the mean number of televisions in these 40 households? c) What is the probability of obtaining the sample mean obtained in part (b) or higher if the population mean is 2.24?
a) quantitative and discrete b) Find the sample mean x̄ = Σx/n = 102/40, so x̄ = 2.55 c) Find σx̄ = σ / (n)^(1/2) and put it into the Normal Calculator with mean 2.24 and 2.55 in the greater than box
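The same steps without the calculator, as a rough sketch. It assumes the sample mean is approximately normal (n = 40 is at least 30); variable names are illustrative and the printed probability is whatever the normal model gives, not a value quoted from the notes.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n, total_sets = 2.24, 1.38, 40, 102

# Part (b): sample mean
x_bar = total_sets / n

# Part (c): standard error of the mean, then the upper-tail probability
sigma_x_bar = sigma / sqrt(n)
prob = 1 - NormalDist(mu=mu, sigma=sigma_x_bar).cdf(x_bar)

print(round(x_bar, 2), round(sigma_x_bar, 4), round(prob, 4))
```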
People were polled on how many books they read the previous year. Initial survey results indicate that s = 10.8 books.
a) 29 b) 113 c) Doubling the required accuracy nearly quadruples the sample size. d) 49 Increasing the level of confidence increases the sample size required. For a fixed margin of error, greater confidence can be achieved with a larger sample size. For (a), (b), and (d), use n = [(zα/2 · s) / E]². E is the margin of error WRITTEN IN THE WORD PROBLEM as a whole number of books (for example, 3 books), NOT A DECIMAL. REMEMBER TO ROUND UP!!!!
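The formula is easy to script. A sketch of the sample-size calculation; the notes do not record the margins of error or confidence levels for each part, so the values below (E = 4 and E = 2 books at 95%, E = 4 books at 99%) are assumptions chosen because they reproduce the listed answers. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(s, margin, confidence):
    """n = ((z * s) / E)^2, always rounded UP to the next whole number."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * s / margin) ** 2)

s = 10.8  # initial estimate of the standard deviation, in books

# Assumed inputs (not stated in the notes)
n_a = required_sample_size(s, margin=4, confidence=0.95)
n_b = required_sample_size(s, margin=2, confidence=0.95)
n_d = required_sample_size(s, margin=4, confidence=0.99)

print(n_a, n_b, n_d)
```

Halving E from 4 to 2 multiplies the unrounded n by exactly 4, which is why the rounded answer is "nearly" quadrupled.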
The mean incubation time for a type of fertilized egg kept at a certain temperature is 24 days. Suppose that the incubation times are approximately normally distributed with a standard deviation of 1 day.
b) Use statcrunch (.0228) If 100 fertilized eggs were randomly selected, 2 of them would be expected to hatch in less than 22 days. c) (.0228) If 100 fertilized eggs were randomly selected, 2 of them would be expected to take more than 26 days to hatch. d) (.3413) If 100 fertilized eggs were randomly selected, 34 of them would be expected to hatch in between 23 and 24 days. e) The probability of an egg hatching in less than 21 days is .0013, so it WOULD be unusual since the probability is LESS than 0.05
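All four probabilities are areas under the same normal curve. A sketch using `statistics.NormalDist` with mean 24 and standard deviation 1 in place of StatCrunch (variable names are illustrative):

```python
from statistics import NormalDist

incubation = NormalDist(mu=24, sigma=1)

p_less_22 = incubation.cdf(22)                      # part (b)
p_more_26 = 1 - incubation.cdf(26)                  # part (c)
p_23_to_24 = incubation.cdf(24) - incubation.cdf(23)  # part (d)
p_less_21 = incubation.cdf(21)                      # part (e)

for p in (p_less_22, p_more_26, p_23_to_24, p_less_21):
    # Probability, then expected count out of 100 eggs
    print(round(p, 4), round(100 * p))
```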
The length of a 10-year-old female's upper arm is approximately normally distributed with mean μ = 30.7 and a standard deviation of σ = 2.1 cm.
c) The proportion of 10-year-old females with an upper arm length OF AT MOST 30 cm is 0.3694. The probability that a randomly selected 10-year-old female has an upper arm length OF AT MOST 30 cm is 0.3694.
The number of chocolate chips in an 18-ounce bag of chocolate chip cookies is approximately normally distributed with mean of 1552 and standard deviation 129 chips
d) A bag that contains 1450 chocolate chips is in the _th percentile. Find P(X < 1450), convert it to a percent, and round to the nearest integer
According to a survey in a country, 18% of adults do not own a credit card. Suppose a simple random sample of 300 adults is obtained.
d) First find the sample proportion p̂ = x/n, then put it into StatCrunch in the less than box
If you constructed one hundred 92% confidence intervals based on one hundred different random samples of size n, how many of the intervals would you expect to include the unknown parameter? Assume all model requirements are satisfied.
Expected number of intervals = confidence level × number of intervals = 0.92 × 100. ANSWER: 92
Suppose 19 cars start at a car race. In how many ways can the top 3 cars finish the race?
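nPr = n! / (n-r)! (order matters, so this is a permutation, not a combination) n = number of cars (19) r = number of cars chosen (3) 19P3 = 19! / 16! = 19 * 18 * 17 = 5814 Desmos scientific calculator lets you use !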
Suppose Allan is going to build a playlist that contains 12 songs. In how many ways can Allan arrange the 12 songs on the playlist?
nPr = n! / (n-r)! n = number of songs to choose from = 12 r = number of songs to arrange = 12 12P12 = 12! / (12 - 12)! = 12! / 0! = 12! (the ! mark means 12 * 11 * 10 * 9 * ... * 1) Answer: 479001600
Four members from a 59-person committee are to be selected randomly to serve as chairperson, vice-chairperson, secretary, and treasurer. The first person selected is the chairperson; the second, the vice-chairperson; the third, the secretary; and the fourth, the treasurer. How many different leadership structures are possible?
nPr = n! / (n-r)! n = the number of people to choose from (59) r = number of people to choose (4) 59! / (59-4)! = 59 * 58 * 57 * 56 = 10,923,024
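The three counting problems above (race finish, playlist, committee officers) are all ordered selections, so each is a permutation nPr = n! / (n-r)!. A quick check with `math.perm` from the Python standard library (Python 3.8 or later), alongside the factorial formula:

```python
from math import factorial, perm

def n_p_r(n, r):
    """Permutations via the factorial formula n! / (n - r)!."""
    return factorial(n) // factorial(n - r)

race_top_3 = perm(19, 3)        # top 3 finishers out of 19 cars
playlist_orders = perm(12, 12)  # arrange all 12 songs, same as 12!
officer_slates = perm(59, 4)    # 4 distinct offices from 59 people

print(race_top_3, playlist_orders, officer_slates)
# The library call and the formula agree
print(n_p_r(59, 4) == officer_slates)
```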
A survey of 2323 adults in a certain large country aged 18 and older conducted by a reputable polling organization found that 428 have donated blood in the past two years.
(a) p̂ = Point Estimate p̂ = x / n = 428 / 2323 ANSWER: 0.184 (b) The sample CAN BE ASSUMED TO BE a simple random sample, the value of np̂(1-p̂) is 349.143, which is GREATER THAN OR EQUAL TO 10, and the SAMPLE SIZE CAN BE ASSUMED TO BE less than or equal to 5% of the POPULATION SIZE. (c) USE STATCRUNCH PROPORTION CALC We are 90% confident the proportion of adults in the country aged 18 and older who have donated blood in the past two years is between 0.171 and 0.197.
Match the coefficient of determination to the scatter diagram. The scales on the x-axis and y-axis are the same for each scatter diagram
the higher the R^2 the more linear (1 = perfect line)
Determine whether the study depicts an observational study or an experiment Seventh-grade students are randomly divided into two groups. One group is taught math using traditional techniques; the other is taught math using a reform method. After 1 year, each group is given an achievement test to compare proficiency.
the study is an experiment because the researchers control one variable to determine the effect on the response variable.
For the histogram on the right determine whether the mean is greater than, less than, or approximately equal to the median. Justify your answer.
x̄ = M because the histogram is symmetric
The data below represent commute times (in minutes) and scores on a well-being survey. (Least-Square Regression Lines)
y=b1x+b0 Where b1= r * sy/sx and b0= ȳ - b1x̄ https://www.socscistatistics.com/tests/regression/default.aspx b) For every unit increase in commute time, the index score falls by 0.047, on average. NO NEGATIVE c) For a commute time of zero minutes, the index score is predicted to be 69.064 d) 67.7 e) No, Barbara is less well-off because the typical individual who has a 20-minute commute scores 68.1
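The slope and intercept formulas can be coded directly instead of using the website. A sketch: the commute data table is not reproduced in these notes, so the five points below are hypothetical (they lie exactly on y = 2x + 1 to make the result easy to verify); only the last line uses the fitted values quoted in the answers (slope -0.047, intercept 69.064).

```python
from math import sqrt

def least_squares(xs, ys):
    """Return (b1, b0) using b1 = r * s_y / s_x and b0 = y_bar - b1 * x_bar."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = sxy / sqrt(sxx * syy)      # linear correlation coefficient
    s_x = sqrt(sxx / (n - 1))      # sample standard deviations
    s_y = sqrt(syy / (n - 1))
    b1 = r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    return b1, b0

# Hypothetical data on the line y = 2x + 1 (NOT the commute data)
b1, b0 = least_squares([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
print(round(b1, 6), round(b0, 6))

# Part (e): predicted index score for a 20-minute commute, from the notes' line
predicted_20 = -0.047 * 20 + 69.064
print(round(predicted_20, 1))
```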
Determine μx̄ and σx̄ from the given parameters of the population and sample size. μ = 70, σ = 14, n = 49
μx̄ = μ = 70 (use same mean) σx̄ = σ / (n)^(1/2) = 14 / (49)^(1/2) = 2
What are the two requirements for a discrete probability distribution?
∑P(x) = 1 and 0 ≤ P(x) ≤ 1