Statistics Final Review


A binomial probability experiment is conducted with the given parameters. Compute the probability of x successes in the n independent trials of the experiment. n=30​, p=0.02​, x=2

p(2) = .0988 1. 2nd --> VARS --> A:binompdf 2. input values (trials = 30, x = 2, p = 0.02)
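
The calculator's binompdf result can be cross-checked by hand with the binomial formula P(x) = C(n, x) p^x (1 - p)^(n - x). A minimal Python sketch, where the function name `binom_pmf` is only an illustrative choice and not part of the original card:

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a binomial experiment with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# n = 30, p = 0.02, x = 2, the values from the card above
print(round(binom_pmf(2, 30, 0.02), 4))  # 0.0988
```

The same function reproduces the other binompdf cards in this set by swapping in their n, p, and x.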

A binomial probability experiment is conducted with the given parameters. Compute the probability of x successes in the n independent trials of the experiment. n=6​, p=0.25​, x=5

p(5) = .0044 1. 2nd --> VARS --> A:binompdf

Find the population mean or sample mean as indicated. ​Sample: 25​, 18​, 5​, 8​, 9

x̄ = 13 1. STAT --> ENTER --> input values 2. STAT --> CALC --> 1:1-Var Stats --> ENTER 3. List: L1 and Frequency: N/A

??? is the​Z-score such that the area under the curve to the left is 0.52.

.05 1. 2nd --> VARS --> 3:invNorm 2. Plug in given area 3. Enter, Solve NOTE: use given value if it is area to the LEFT
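
As a rough cross-check of invNorm, Python's standard library has an inverse normal CDF. This sketch assumes the standard normal curve (mean 0, standard deviation 1), as the card does:

```python
from statistics import NormalDist

# Z-score with area 0.52 to its LEFT under the standard normal curve
z = NormalDist(mu=0, sigma=1).inv_cdf(0.52)
print(round(z, 2))  # 0.05
```

If a problem gives the area to the RIGHT instead, pass 1 minus that area to `inv_cdf`.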

Determine whether the following statement is true or false. The shape of the distribution shown is best classified as skewed left. Description: Its peak is on the left-hand side and its tail is to the right (it decreases as it goes right)

FALSE note: The given graph is skewed in the direction of the tail.​ Thus, the graph is classified as skewed right.

Determine if the following statement is true or false. When two events are​ disjoint, they are also independent.

False

True or False​: The population proportion and sample proportion always have the same value.

False

The notation P(F | E) means the probability of event ??? given event ???

F and E

Could the graph represent a normal density​ function? Description: Tall curve with sides going off below x-axis

NO

Let the sample space be S={1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Suppose the outcomes are equally likely. Compute the probability of the event E =​ "an odd number less than 10​."

P(E) = 0.5 note: "an odd number less than 10" = 1,3,5,7,9 therefore, 5 / 10 = 0.5

Suppose that E and F are two events and that P(E and F) = 0.2 and P(E) = 0.8. What is P(F|E)​?

P(F|E)​ = .25 note: P(F|E)​ = P(E and F) / P(E) therefore, 0.2 / 0.8 = 0.25

Determine whether the study depicts an observational study or an experiment. Fourth-grade students are randomly divided into two groups. One group is taught English using traditional techniques. The other is taught English using a reform method. After 1 year, each

The study is an experiment because the researchers control one variable to determine the effect on the response variable.

Is the statement below true or false? The mean of the sampling distribution of p̂ is p.

True

True or False​: In a probability​ model, the sum of the probabilities of all outcomes must equal 1.

True

a. What is the probability of an event that is​ impossible? b. Suppose that a probability is approximated to be zero based on empirical results. Does this mean that the event is​ impossible?

a. 0 b. No note: Just because the event is not​ observed, does not mean that the event is impossible.

In a previous​ poll, 46​% of adults with children under the age of 18 reported that their family ate dinner together seven nights a week. Suppose​ that, in a more recent​ poll, 493 of 1112 adults with children under the age of 18 reported that their family ate dinner together seven nights a week. Is there sufficient evidence that the proportion of families with children under the age of 18 who eat dinner together seven nights a week has​ decreased? Use the α=0.05 significance level. (a) Because np0(1−p0) = ??? ??? ​10, the sample size is ??? than 5% of the population​ size, and the sample ???, the requirements for testing the hypothesis ??? satisfied. (b) What are the null and alternative​ hypotheses? (c) Find the test statistic, z0? (d) Find the P-value?

(a) 276.2, >, less than, can be reasonably assumed to be random, are (b) H0​: p = .46 H1​: p < .46 (c) -1.11 1. STAT --> TESTS --> 5:1-PropZTest 2. Input info (d) .133 (e) No​, there is not sufficient evidence because the​P-value is greater than the level of significance.​Therefore, do not reject the null hypothesis.
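
The 1-PropZTest output can be reproduced from the formula z0 = (p̂ - p0) / sqrt(p0(1 - p0) / n). A minimal sketch for the left-tailed case on this card; the function name is hypothetical, chosen only for illustration:

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test_left(x, n, p0):
    """Left-tailed one-proportion z-test: returns (z0, P-value)."""
    p_hat = x / n                                  # sample proportion
    z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)    # test statistic
    p_value = NormalDist().cdf(z0)                 # area to the left of z0
    return z0, p_value

check = 1112 * 0.46 * (1 - 0.46)                   # np0(1 - p0) requirement
z0, p_value = one_prop_z_test_left(493, 1112, 0.46)
print(round(check, 1), round(z0, 2), round(p_value, 3))
```

Because the P-value is compared with α = 0.05, any value above 0.05 leads to "do not reject H0", which matches part (e).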

The following data represent exam scores in a statistics class taught using traditional lecture and a class taught using a​ "flipped" classroom. (a) Which course has more dispersion in exam scores using the range as the measure of​ dispersion? The traditional course has a range of ???​, while the​ "flipped" course has a range of ???. The ??? course has more dispersion. (b) Which course has more dispersion in exam scores using the sample standard deviation as the measure of​ dispersion? The traditional course has a standard deviation of ???​, while the​ "flipped" course has a standard deviation of ???. The ??? course has more dispersion. (c) Suppose the score of 60.5 in the traditional course was incorrectly recorded as 605. How does this affect the​ range? The range is now ??? (d) How does this affect the standard​ deviation? The standard deviation is now ??? (e) What property does this illustrate?

(a) 29.1, 28.8, traditional 1. STAT --> ENTER --> input values for L1 and L2 2. STAT --> CALC --> 1:1-Var Stats --> ENTER 3. List: L1 or L2 and Frequency: N/A 4. Take highest value - lowest value (b) 8.925, 7.858, traditional note: look at the "Sx" values; the higher number has more dispersion (c) 548.5 note: change 60.5 to 605 in L1 (d) 147.871 (e) Neither the range nor the standard deviation is resistant.

​(a) Determine the total area under the standard normal curve to the left of z=−2 or to the right of z=2. (b) The total area under the standard normal curve to the left of z=−2 or to the right of z=2 is ???

(a) Graph has area to the left and right shaded, but not the middle (b) .0455 1. 2nd --> VARS --> 2:normalcdf 2. μ = 0 and σ = 1; lower = -2 and upper = 2 3. HOWEVER, since it's asking for the area NOT between the two values, subtract the result (.9545) from 1: 1 - .9545 = .0455
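
The two-tail area can also be found by adding the tails directly instead of subtracting the middle from 1. A short sketch using the standard normal distribution from Python's standard library:

```python
from statistics import NormalDist

std_normal = NormalDist()            # mean 0, standard deviation 1

left_tail = std_normal.cdf(-2)       # area to the left of z = -2
right_tail = 1 - std_normal.cdf(2)   # area to the right of z = 2
total = left_tail + right_tail

print(round(total, 4))  # 0.0455
```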

A stock analyst wondered whether the mean rate of return of​ financial, energy, and utility stocks differed over the past 5 years. He obtained a simple random sample of eight companies from each of the three sectors and obtained the​ 5-year rates of return shown in the accompanying table​ (in percent) (a) State the null and alternative hypotheses (b) ​Normal probability plots indicate that the sample data come from normal populations. Are the requirements to use the​ one-way ANOVA procedure​ satisfied? (c) F0 = ??? (d) Since the​ P-value is ??? there ??? enough evidence to reject the null hypothesis.​ Thus, we ??? conclude that the mean rates of return are different at the α=0.05 level of significance. (e) Pick graph

(a) H0: μfinancial = μenergy = μutilities versus H1: at least one mean is different note: "differed over" = something is different/changed (b) Yes, because there are k=3 simple random samples, one from each of k populations, the k samples are independent of each other, and the populations are normally distributed and have the same variance. note: SAME variance (c) F0 = 2.06 1. plug values into L1, L2, L3 2. STAT --> TESTS --> A:ANOVA --> ANOVA(L1, L2, L3) (d) 0.153, is not, cannot (e) Graph fits each data set 2nd --> STAT PLOT --> set each to its own L --> Graph

A manufacturer of alloy steel beams requires that the standard deviation of yield strength not exceed 7000psi. The​ quality-control manager selected a sample of 20 beams and measured their yield strength. The standard deviation of the sample was 8000psi. Assume that the yield strengths are normally distributed. Does the evidence suggest that the standard deviation of yield strength exceeds 7000psi at the α=0.01 level of​ significance? (a) State the hypotheses. (b) P-value? (c) State the conclusion for the test? (d) State the conclusion in context of the problem. There ??? sufficient evidence at the α=0.01 level of significance to conclude that the standard deviation of yield strength exceeds 7000 psi.

(a) Hypotheses: H0: σ = 7000psi H1: σ > 7000psi note: "exceeds" = > (b) P-value = .167 note: test statistic χ² = [(20 - 1)(8000^2)] / 7000^2 = 24.82 with 20 - 1 = 19 degrees of freedom; look up the right-tail P-value at https://www.socscistatistics.com/pvalues/chidistribution.aspx (c) Do not reject H0, because the P-value is greater than the α=0.01 level of significance. (d) is not
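
The chi-square test statistic used for the P-value lookup is (n - 1)s² / σ0². A small sketch of that arithmetic only; the P-value still comes from a chi-square table or calculator, since Python's standard library has no chi-square distribution:

```python
def chi_square_stat(n, s, sigma0):
    """Test statistic for a hypothesis test about a population standard deviation."""
    return (n - 1) * s**2 / sigma0**2

stat = chi_square_stat(n=20, s=8000, sigma0=7000)
df = 20 - 1                  # degrees of freedom for the lookup
print(round(stat, 2), df)    # 24.82 19
```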

A certain vehicle emission inspection station advertises that the wait time for customers is less than 9 minutes. A local resident wants to test this claim and collects a random sample of 49 wait times for customers at the testing station. He finds that the sample mean is 7.54 minutes, with a standard deviation of 3.5 minutes. Does the sample evidence support the inspection station's claim? Use the α=0.005 level of significance to test the advertised claim that the wait time is less than 9 minutes. (a) What are the hypotheses for this test? (b) What is the test statistic? (c) What is the P-value? (d) Since the P-value is ??? α, ??? the null hypothesis. There ??? sufficient evidence to conclude that the mean wait time is less than 9 minutes. In other words, the evidence ??? the advertised claim.

(a) Hypotheses: ​H0: μ = 9 minutes H1​: μ < 9 minutes (b) t0 = -2.92 (c) P-value = .003 (d) less than, reject, is, supports
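
The test statistic in (b) follows from t0 = (x̄ - μ0) / (s / sqrt(n)). A minimal sketch of that step; the helper name `t_statistic` is illustrative, and the P-value itself still needs a t-table or the calculator's T-Test:

```python
from math import sqrt

def t_statistic(x_bar, mu0, s, n):
    """One-sample t statistic from summary statistics."""
    return (x_bar - mu0) / (s / sqrt(n))

t0 = t_statistic(x_bar=7.54, mu0=9, s=3.5, n=49)
print(round(t0, 2))  # -2.92
```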

A highway safety institution conducts experiments in which cars are crashed into a fixed barrier at 40 mph. In the institute's 40-mph offset test, 40% of the total width of each vehicle strikes a barrier on the driver's side. The barrier's deformable face is made of aluminum honeycomb, which makes the forces in the test similar to those involved in a frontal offset crash between two vehicles of the same weight, each going just less than 40 mph. You are in the market to buy a family car and you want to know if the mean head injury resulting from this offset crash is the same for large family cars, passenger vans, and midsize utility vehicles (SUVs). The data in the accompanying table were collected from the institute's study. (a) State the null hypothesis (b) Normal probability plots indicate that the sample data come from normal populations. Are the requirements to use the one-way ANOVA procedure satisfied? (c) F0 = ??? (d) Since the P-value is ??? there is ??? evidence to reject the null hypothesis. Thus, we ??? conclude that the means are different at the α=0.01 level of significance.

(a) Hypothesis: H0: μCars=μVans=μSUVs H1: at least one mean is different (b) Yes, all the requirements for use of a​ one-way ANOVA procedure are satisfied. (c) F0 = 0.419 1. plug values into L1, L2, L3 2. STAT --> TESTS --> A:ANOVA --> ANOVA(L1, L2, L3) (d) 0.664, insufficient, cannot (e) Graph fits each data 1. 2nd --> STAT PLOT --> set each to its own L --> 2. Graph --> FIX WINDOW SETTINGS

According to the Centers for Disease Control and​ Prevention, 10.4​% of high school students currently use electronic cigarettes. A high school counselor is concerned the use of​ e-cigs at her school is higher. ​(a) Determine the null and alternative hypotheses. (b) If the sample data indicate that the null hypothesis should not be​ rejected, state the conclusion of the high school counselor. (c) Suppose, in​ fact, that the proportion of students at the​ counselor's high school who use electronic cigarettes is 0.234. Was a type I or type II error​ committed?

(a) Hypothesis: H0​: p = .104 H1​: p > .104 (b) There is not sufficient evidence to conclude that the proportion of high school students exceeds 0.104 at this​ counselor's high school. (c) A Type II error was committed because the sample evidence led the counselor to conclude the proportion of​ e-cig users was 0.104​, when, in​ fact, the proportion is higher.

One​ year, the mean age of an inmate on death row was 40.4 years. A sociologist wondered whether the mean age of a​ death-row inmate has changed since then. She randomly selects 32 death-row inmates and finds that their mean age is 39.8​, with a standard deviation of 8.6. Construct a​ 95% confidence interval about the mean age. What does the interval​ imply? (a) State the hypotheses. (b) Construct a​ 95% confidence interval about the mean age. With​ 95% confidence, the mean age of a death row inmate is between ??? years and ??? years. (c) What does the interval imply?

(a) Hypothesis: H0: μ = 40.4 H1: μ ≠ 40.4 (b) 36.7 and 42.9 1. STAT --> TESTS --> 8:T-Interval 2. Input values (c) Since the mean age from the earlier year is contained in the interval, there is not sufficient evidence to conclude that the mean age had changed.
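
The interval in (b) is x̄ ± t* · s / sqrt(n). In the sketch below, the critical value t* = 2.0395 (95% confidence, 31 degrees of freedom) is an assumption taken from a standard t-table, because Python's standard library cannot compute it:

```python
from math import sqrt

T_STAR = 2.0395   # assumed t-table value for 95% confidence, df = 32 - 1 = 31

x_bar, s, n = 39.8, 8.6, 32
margin = T_STAR * s / sqrt(n)          # margin of error
lower, upper = x_bar - margin, x_bar + margin

print(round(lower, 1), round(upper, 1))  # 36.7 42.9
```

Since 40.4 lies between the two bounds, the interval agrees with the conclusion in (c).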

It has long been stated that the mean temperature of humans is 98.6°F. ​However, two researchers currently involved in the subject thought that the mean temperature of humans is less than 98.6°F. They measured the temperatures of 61 healthy adults 1 to 4 times daily for 3​ days, obtaining 275 measurements. The sample data resulted in a sample mean of 98.3°F and a sample standard deviation of 1.1°F. Use the​ P-value approach to conduct a hypothesis test to judge whether the mean temperature of humans is less than 98.6°F at the α=0.01 level of significance. (a) State the hypotheses. (b) Identify the t-statistics? (c) Identify the P-value? (d) Make a conclusion regarding the hypothesis?

(a) Hypothesis: H0​: μ = 98.6°F H1​: μ < 98.6°F (b) -4.52 (c) 0 (CHECK SCIENTIFIC NOTATION) (d) Reject H0 since the​ P-value is less than the significance level.

A can of soda is labeled as containing 10 fluid ounces. The quality control manager wants to verify that the filling machine is neither over-filling nor under-filling the cans. (a) Determine the null and alternative hypotheses that would be used to determine if the filling machine is calibrated correctly. (b) The quality control manager obtains a sample of 77 cans and measures the contents. The sample evidence leads the manager to reject the null hypothesis. Write a conclusion for this hypothesis test. There ??? sufficient evidence to conclude that the machine is out of calibration. (c) Suppose, in fact, the machine is not out of calibration. Has a Type I or Type II error been made? A ??? has been made since the sample evidence led the quality-control manager to ??? the ???, when the null hypothesis is true. (d) Management has informed the quality control department that it does not want to shut down the filling machine unless the evidence is overwhelming that the machine is out of calibration. What level of significance would you recommend the quality control manager to use? The level of significance should be ??? because this makes the probability of Type I error ???

(a) hypothesis: H0​: μ = 10 H1​: μ ≠ 10 (b) is note: reject null hypothesis = is sufficient evidence do not reject = is not sufficient (c) type I, reject, null hypothesis (d) 0.01, small

According to a food​ website, the mean consumption of popcorn annually by Americans is 56 quarts. The marketing division of the food website unleashes an aggressive campaign designed to get Americans to consume even more popcorn. ​(a) Determine the null and alternative hypotheses that would be used to test the effectiveness of the marketing campaign. (b) A sample of 859 Americans provides enough evidence to conclude that marketing campaign was effective. Provide a statement that should be put out by the marketing department. (c) Suppose, in​ fact, the mean annual consumption of popcorn after the marketing campaign is 56 quarts. Has a Type I or Type II error been made by the marketing​ department? If we tested this hypothesis at the α=0.1 level of​ significance, what is the probability of committing this​ error? Select the correct choice below and fill in the answer box within your choice.

(a) hypothesis: H0​: μ = 56 H1​: μ > 56 (b) There is sufficient evidence to conclude that the mean consumption of popcorn has risen. (c) The marketing department committed a Type I error because the marketing department rejected the null hypothesis when it was true. The probability of making a Type I error is 0.1

Determine the point estimate of the population mean and margin of error for the confidence interval. Lower bound is 20, upper bound is 30. (a) The point estimate of the population mean is ??? (b) The margin of error is ???

(a) mean = 25 note: The point estimate of the population mean should be in the middle of the confidence interval. (20 + 30) / 2 = 25 (b) Margin of Error = 5 note: mean + E = upper bound --> 25 + E = 30 --> E = 5 The margin of error of a confidence interval estimate of a parameter is a measure of how accurate the point estimate is.
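
The two notes above amount to taking the midpoint and the half-width of the interval. A tiny sketch of that arithmetic:

```python
lower, upper = 20, 30

point_estimate = (lower + upper) / 2    # middle of the interval
margin_of_error = (upper - lower) / 2   # half the interval's width

print(point_estimate, margin_of_error)  # 25.0 5.0
```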

According to a recent article about individuals who have credit​ cards, the mean number of cards per person with credit cards is 4. To test this result a random survey of 60 individuals who have credit cards was conducted. The survey only includes the number of credit cards per participant. The results of the survey are attached below. ​(a) What is the variable of interest in this​ study? Is it qualitative or​ quantitative? The variable of interest is ???. It is a ??? variable. ​(b) State the null and alternative hypotheses. (c) Determine the​ t-statistic. (d) P-value = ??? (e) Make a conclusion regarding the hypothesis. The​ P-value is ??? the level of significance. ??? the null hypothesis. There ??? sufficient evidence to claim that the mean number of credit cards is ??? 4.

(a) number of credit cards, quantitative (b) Hypotheses: H0: μ = 4 H1: μ < 4 (c) t0 = -.84 1. Input values into L1 2. STAT --> TESTS --> 2:T-Test (d) P-value = .202 (e) greater than, do not reject, is not, less than

Could the graph represent a normal density function? (a) Description: Even curve with sides hovering above the x-axis (b) Description: Even curve with sides crossing x-axis

(a) yes (b) no

The frequency distribution was obtained using a class width of 0.5 for data on cigarette tax rates. Use the frequency distribution to approximate the population mean and population standard deviation. Compare these results to the actual mean μ =​ $1.715 and standard deviation σ = ​$1.112 (a) Population mean? (b) Standard Deviation? (c) Compare these results to the values found using the actual data.

(a) μ = 1.769 note: use midpoints of values and plug in to L1 and L2 (b) σ = 1.168 (c) The grouped values are both slightly larger

Suppose babies born after a gestation period of 32 to 35 weeks have a mean weight of 2800 grams and a standard deviation of 700 grams while babies born after a gestation period of 40 weeks have a mean weight of 3100 grams and a standard deviation of 420 grams. If a 33​-week gestation period baby weighs 2900 grams and a 41​-week gestation period baby weighs 3200 ​grams, find the corresponding​ z-scores. Which baby weighs more relative to the gestation​ period? Which baby weighs relatively more​? The baby born in week 41 weighs relatively more since its​ z-score, ??? is larger than the​ z-score of ??? for the baby born in week 33

.24 and .14 note: use z = (x - μ) / σ 32-35 week: (2900 - 2800) / 700 = .14; 40 week: (3200 - 3100) / 420 = .24; .24 > .14
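
The z-score formula in the note is easy to script when several values need comparing. A minimal sketch with a hypothetical helper `z_score`:

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

z_33_week = z_score(2900, 2800, 700)   # baby from the 32-35 week group
z_41_week = z_score(3200, 3100, 420)   # baby from the 40 week group

print(round(z_33_week, 2), round(z_41_week, 2))  # 0.14 0.24
```

The larger z-score marks the baby that weighs more relative to its own group.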

Find the population variance and standard deviation. 6​, 15​, 27​, 33​, 39

1. STAT --> ENTER --> input values 2. STAT --> CALC --> 1:1-Var Stats --> ENTER 3. List: L1 and Frequency: N/A 4. It gives you the population standard deviation (σx), so square it to get the population variance σ^2 = 144 σ = 12
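
For a cross-check without the calculator, Python's statistics module has population versions of both measures (they divide by N, not N - 1):

```python
from statistics import pstdev, pvariance

population = [6, 15, 27, 33, 39]

pop_variance = pvariance(population)   # divides by N
pop_std_dev = pstdev(population)       # square root of the population variance

print(pop_variance, pop_std_dev)
```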

Find the value of the permutation. 2P0

2P0 = 1 1. Input first value into calc (in this case 2) 2. MATH --> PRB --> 2: nPr 3. input in second value (r = 0)

Find the value of the permutation. 2P2

2P2 = 2 1. Input first value into calc (in this case 2) 2. MATH --> PRB --> 2: nPr 3. input in second value (r = 2)

Find the value of the permutation. 4P3

4P3 = 24 1. Input first value into calc (in this case 4) 2. MATH --> PRB --> 2: nPr 3. input in second value (r = 3)
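
The nPr key computes n! / (n - r)!. Python's `math.perm` does the same, so the three permutation cards can be checked in one go:

```python
from math import perm

# nPr = n! / (n - r)!
print(perm(2, 0))  # 1
print(perm(2, 2))  # 2
print(perm(4, 3))  # 24
```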

What does it mean for an event to be​ unusual? Why should the cutoff for identifying unusual events not always be​ 0.05?

An event is unusual if it has a low probability of occurring. The choice of a cutoff should consider the context of the problem.

Could the graph represent a normal density​ function? Description: Tall curve, even distribution, with ends curving back up

NO note: has to come near, but never cross, x-axis

Determine whether the distribution is a discrete probability distribution. x: P(x) 0: 0.13 10: 0.36 20: 0.14 30: 0.31 40: 0.42 Is the distribution a discrete probability​ distribution?

No, because the sum of the probabilities is not equal to 1. note: the probabilities add up to 1.36
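
The two requirements (each probability between 0 and 1 inclusive, and a total of 1) can be checked mechanically. A small sketch; the function name is illustrative, and `math.isclose` is used because decimal fractions do not always add exactly in floating point:

```python
from math import isclose

def is_discrete_distribution(probabilities):
    """True if every P(x) is in [0, 1] and the probabilities sum to 1."""
    in_range = all(0 <= p <= 1 for p in probabilities)
    return in_range and isclose(sum(probabilities), 1.0)

card_probs = [0.13, 0.36, 0.14, 0.31, 0.42]   # P(x) values from the card above
print(is_discrete_distribution(card_probs))   # False
print(round(sum(card_probs), 2))              # 1.36
```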

A binomial probability experiment is conducted with the given parameters. Compute the probability of x successes in the n independent trials of the experiment. n = 12​, p = 0.7​, x = 9

P(9) = .2397 1. 2nd --> VARS --> A:binompdf 2. input values (trials = 12, x = 9, p = 0.7)

Find the probability of the indicated event if ​P(E)=0.30 and ​P(F)=0.45. Find​ P(E and​ F) if​ P(E or ​F) = 0.50

P(E and F) = 0.25 note: P(E or​ F) = P(E) +​ P(F) −​ P(E and​ F) 0.50 = 0.30 + 0.45 - P(E and F) --> 0.50 = 0.75 - P(E and F) --> -0.25 = -P(E and F) --> P(E and F) = 0.25
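
The rearrangement in the note is the general addition rule solved for P(E and F). A short sketch of that step:

```python
p_e, p_f, p_e_or_f = 0.30, 0.45, 0.50

# General addition rule: P(E or F) = P(E) + P(F) - P(E and F),
# rearranged to isolate P(E and F)
p_e_and_f = p_e + p_f - p_e_or_f

print(round(p_e_and_f, 2))  # 0.25
```

Rounding is applied only for display, because floating-point subtraction may leave a tiny error in the last digits.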

Find the probability​ P(E or​ F) if E and F are mutually​ exclusive, ​P(E)=0.34​, and P(F)=0.51.

P(E or F) = .85 note: P(E or​ F) = P(E) +​ P(F) −​ P(E and​ F) 0.34 + 0.51 - 0 = .85 First determine the probability​ P(E and​ F). Recall that E and F are mutually exclusive. Therefore, P(E and F) = 0

Find the probability of the indicated event if ​P(E)=0.35 and ​P(F)=0.45. Find​ P(E or​ F) if​ P(E and ​F)=0.05.

P(E or F) = 0.75 note: P(E or​ F) = P(E) +​ P(F) −​ P(E and​ F) 0.35 + 0.45 - 0.05 = 0.75

Let the sample space be S={1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Suppose the outcomes are equally likely. Compute the probability of the event E={4, 5, 6}.

P(E) = 0.3 note: 3 / 10 = 0.3

Suppose that E and F are two events and that N(E and F) = 350 and N(E) = 520. What is P(F|E)​?

P(F|E) = .673 note: P(F|E) = N(E and F) / N(E) therefore, 350 / 520 = .673
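
The conditional probability rule works the same way with counts or with probabilities. A minimal sketch, with a hypothetical helper name:

```python
def conditional(joint, given):
    """P(F | E) from P(E and F) and P(E), or from the counts N(E and F) and N(E)."""
    return joint / given

print(round(conditional(350, 520), 3))  # 0.673 (counts)
print(round(conditional(0.2, 0.8), 2))  # 0.25  (probabilities)
```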

In a​ boxplot, if the median is to the left of the center of the box and the right whisker is substantially longer than the left​ whisker, the distribution is skewed

Right note: it is skewed in the direction of the whisker/tail

Determine whether the following probability experiment represents a binomial experiment and explain the reason for your answer. An experimental drug is administered to 150 randomly selected​ individuals, with the number of individuals responding favorably recorded.

Yes, because the experiment satisfies all the criteria for a binomial experiment.

Determine whether the distribution is a discrete probability distribution. x: P(x) 10: 0.44 20: 0.18 30: 0.07 40: 0.16 50: 0.15 Is the distribution a discrete probability​ distribution?

Yes​, because the sum of the probabilities is equal to 1 and each probability is between 0 and 1, inclusive.

a. Draw a scatter diagram of the data. b. The correlation coefficient is ??? c. Because the correlation coefficient is ??? and the absolute value of the correlation​ coefficient, ???, is ??? than the critical value for this data​ set, ???​, ??? linear relation exists between x and y.

a. 2nd --> STAT PLOT --> check scatter plot b. r = -.274 1. Double check: 2nd --> CATALOG (above 0) 2. DiagnosticOn --> press ENTER twice 3. STAT --> CALC --> 4:LinReg(ax+b) 4. XList:L1 and YList:L2 5. make sure FreqList is clear!!! c. negative, .274, not greater, .878, no note: if absolute value < critical value, no linear relation exists (also hinted by low r value)

A concrete mix is designed to withstand 3000 pounds per square inch (psi) of pressure. The following data represent the strength of nine randomly selected casts (in psi). 3970, 4090, 3100, 3000, 2950, 3830, 4090, 4030, 3430 a. The mean strength is ??? b. The median strength is ??? c. The mode is ???

a. 3610 1. STAT --> ENTER --> input values 2. STAT --> CALC --> 1:1-Var Stats --> ENTER 3. List: L1 and Frequency: N/A b. 3830 c. 4090 note: the number repeated the most
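
All three measures of center are available in Python's statistics module, which gives a quick check on the calculator output:

```python
from statistics import mean, median, mode

strengths = [3970, 4090, 3100, 3000, 2950, 3830, 4090, 4030, 3430]  # psi

center_mean = mean(strengths)
center_median = median(strengths)   # middle value of the sorted data
center_mode = mode(strengths)       # most frequently occurring value

print(center_mean, center_median, center_mode)
```

For data with no repeated value, `statistics.multimode` returns every value, which corresponds to "the mode does not exist" on these cards.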

a. What is the median of variable ​x? b. What is the third quartile of variable​ y? c. Which variable has more dispersion? d. Describe the shape of variable x? e. Describe the shape of variable y?

a. 60 note: median of a box plot is the value where the line in the middle of the box is b. 94 note: third quartile is where the box ends (right end of the box) c. Variable y—the interquartile range of variable y is larger than that of variable x. note: y has a longer box and longer whiskers; therefore, a longer range d. Symmetric—the median is the center of the box and the left and right whiskers are about the same length. e. Skewed left—the median is right of center in the box and the left whisker is longer than the right whisker.

An experiment was conducted in which two fair dice were thrown 100 times. The sum of the pips showing on the dice was then recorded. The frequency histogram to the right gives the results. Description: Peak at 7 and it evens off to the side equally a. What was the most frequent outcome of the experiment? b. What was the least frequent outcome of the experiment? c. How many times did they roll a 9? d. How many more 6's were observed than 12's? e. Determine the percentage of time a 9 was observed. f. Describe the shape of the distribution. Choose the correct answer below.

a. 7 note: the one with the highest bar/frequency b. 12 note: the one with the lowest bar/frequency c. 11 note: look how high the frequency is d. 15 note: 16 (frequency of 6's) - 1 (frequency of 12's) e. 11% note: 11 (frequency of 9's) / 100 (total trials) = 0.11 f. Bell-shaped note: Uniform means flat and equal for all values; bell-shaped means it peaks in the middle and evens out

The following data represent the amount of time​ (in minutes) a random sample of eight students took to complete the online portion of an exam in a particular statistics course. Compute the​ mean, median, and mode time. 62.9​, 74.9​, 89.9​, 108.9​, 128.4​, 94.9​, 94.7​, 117.6 a. The mean exam time is ??? b. The median exam time is ??? c. The mode is ???

a. 96.53 1. STAT --> ENTER --> input values 2. STAT --> CALC --> 1:1-Var Stats --> ENTER 3. List: L1 and Frequency: N/A b. 94.8 c. The mode does not exist

a. What is an observational​ study? b. What is a designed​ experiment? c. Which allows the researcher to claim causation between an explanatory variable and a response​ variable?

a. An observational study measures the value of the response variable without attempting to influence the value of either the response or explanatory variables b. A designed experiment is when a researcher assigns individuals to a certain​ group, intentionally changing the value of an explanatory​ variable, and then recording the value of the response variable for each group. c. A designed experiment allows the researcher to claim causation between an explanatory variable and a response variable

a. Explain what is meant by confounding. b. What is a lurking​ variable? c. What is a confounding​ variable?

a. Confounding in a study occurs when the effects of two or more explanatory variables are not separated.​ Therefore, any relation that may exist between an explanatory variable and the response variable may be due to some other variable or variables not accounted for in the study. b. A lurking variable is an explanatory variable that was not considered in a​ study, but that affects the value of the response variable in the study. In​ addition, lurking variables are typically related to explanatory variables in the study. c. A confounding variable is an explanatory variable that was considered in a study whose effect cannot be distinguished from a second explanatory variable in the study.

a. Draw a scatter diagram of the data. b. The correlation coefficient is ??? c. Because the correlation coefficient is ??? and the absolute value of the correlation​ coefficient, ???, is ??? than the critical value for this data​ set, ???​, ??? linear relation exists between x and y.

a. Look at points and graph b. r = .921 1. Double check: 2nd --> CATALOG (above 0) 2. DiagnosticOn --> press ENTER twice 3. STAT --> CALC --> 4:LinReg(ax+b) 4. XList:L1 and YList:L2 5. make sure FreqList is clear!!! c. positive, .921, greater, .878, a positive note: critical value is given

a. Thirty university students are divided into two groups. One group receives free tutoring in mathematics, the other doesn't. After one semester, scores on final mathematical examinations are compared. b. A study is conducted to determine if there is a relationship between colon cancer and fat consumption. Patients with colon cancer are asked about their fat consumption. Does the description correspond to an observational study or an​ experiment?

a. The study is an experiment because the researchers control one variable to determine b. The study is an observational study because the study examines individuals in a sample,

a. Dogs are randomly divided into two groups. One group is trained using food as a reward; the other is trained using sound cues. After 2 months, each group is taught a new skill to compare obedience. Determine whether the study depicts an observational study or an experiment.

a. The study is an experiment because the researchers control one variable to determine the effect on the response variable.

a. Area of a park b. Volume of liquid in a glass Is the variable discrete or​ continuous?

a. The variable is continuous because it is not countable. b. The variable is continuous because it is not countable. note: these values are measured, not counted

a. Shots saved by a goaltender in a hockey game b. weight of a child Is the variable discrete or​ continuous?

a. The variable is discrete because it is countable. b. The variable is continuous because it is not countable. note: can't look at it and easily count

a. Number of lightning strikes in a city in a year b. Number of touchdowns done by a quarterback in a game Is the variable discrete or continuous?

a. The variable is discrete because it is countable. b. The variable is discrete because it is countable. note: easy to count using "0,1,2,3,4"

a. Gallons of water in a swimming pool b. Amount of money won in a lottery Is the variable qualitative or​ quantitative?

a. The variable is quantitative because it is a numerical measure. b. The variable is quantitative because it is a numerical measure.

a. For ages of students in a public school​, state whether you would expect a histogram of the data to be​ bell-shaped, uniform, skewed​ left, or skewed right. b. For a number of children in a family​, state whether you would expect a histogram of the data to be​ bell-shaped, uniform, skewed​ left, or skewed right.

a. uniform note: each grade level has about the same number of students b. skewed right note: not a lot of families have many kids, so it dwindles off the further right you go

The ??? ???​, denoted p^​, is given by the formula p^ = ???​, where x is the number of individuals with a specified characteristic in a sample of n individuals.

sample, proportion, x/n

Find the population mean or sample mean as indicated. ​Population: 4​, 8​, 16​, 14​, 23

μ = 13 1. STAT --> ENTER --> input values 2. STAT --> CALC --> 1:1-Var Stats --> ENTER 3. List: L1 and Frequency: N/A

Determine whether the following probability experiment represents a binomial experiment and explain the reason for your answer. Three cards are selected from a standard​ 52-card deck without replacement. The number of jacks selected is recorded.

​No, because the trials of the experiment are not independent and the probability of success differs from trial to trial

Determine if the following probability experiment represents a binomial experiment. A random sample of 30 high school seniors is​ obtained, and the individuals selected are asked to state their hair length.

​No, this probability experiment does not represent a binomial experiment because the variable is​ continuous, and there are not two mutually exclusive outcomes.

Determine whether the distribution is a discrete probability distribution. x:P(x) 0:0.07 1:0.21 2:0.33 3:0.24 4: 0.15 Is the distribution a discrete probability​ distribution?

​Yes, because the sum of the probabilities is equal to 1 and each probability is between 0 and​ 1, inclusive.

For a large sporting event the broadcasters sold 64 ad slots for a total revenue of $134 million. What was the mean price per ad​ slot?

$2.1 million note: $134 / 64 = 2.09 rounded to one decimal = 2.1

Find the sample variance and standard deviation. 22​, 14​, 2​, 8​, 10

1. STAT --> ENTER --> input values 2. STAT --> CALC --> 1:Var Stats --> ENTER 3. List: L1 and Frequency: N/A 4. It gives you standard dev (s) so solve for sample variance by squaring it s^2 = 55.2 s = 7.4
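
note: if no TI calculator is handy, the same check can be sketched in Python. This is not part of the original card; it assumes Python 3.8+ and uses only the standard library. `statistics.variance` and `statistics.stdev` are the sample versions, which divide by n - 1.

```python
import statistics

data = [22, 14, 2, 8, 10]

s2 = statistics.variance(data)  # sample variance (divides by n - 1)
s = statistics.stdev(data)      # sample standard deviation, sqrt of s2

print(round(s2, 1), round(s, 1))
```

For a population instead of a sample, `statistics.pvariance` and `statistics.pstdev` divide by n.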

The following data represent the number of people aged 25 to 64 years covered by health insurance​ (private or​ government) in 2018. Value: Frequency 25-34: 22.8 35-44: 33.9 45-54: 35.8 55-64: 24.7 (a) What is the mean? (b) What is the standard deviation?

(a) μ = 45.32 note: find the midpoint of each class by adding its lower limit to the lower limit of the next class and dividing by 2 (i.e., (25 + 35) / 2 = 30) (b) σ = 10.29
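
note: a rough Python sketch of the grouped-data calculation (not part of the original card). It assumes the class midpoints 30, 40, 50, 60 and treats the data as a population, so it divides by the total frequency.

```python
import math

midpoints = [30, 40, 50, 60]      # (25+35)/2, (35+45)/2, (45+55)/2, (55+65)/2
freqs = [22.8, 33.9, 35.8, 24.7]  # frequencies from the card

total = sum(freqs)

# Weighted mean: sum of (midpoint * frequency) over total frequency
mu = sum(x * f for x, f in zip(midpoints, freqs)) / total

# Population variance: frequency-weighted squared deviations from mu
variance = sum(f * (x - mu) ** 2 for x, f in zip(midpoints, freqs)) / total
sigma = math.sqrt(variance)

print(round(mu, 2), round(sigma, 2))
```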

One year Mike had the lowest ERA​ (earned-run average, mean number of runs yielded per nine innings​ pitched) of any male pitcher at his​ school, with an ERA of 2.57. ​Also, Jane had the lowest ERA of any female pitcher at the school with an ERA of 3.49. For the​ males, the mean ERA was 4.081 and the standard deviation was 0.636. For the​ females, the mean ERA was 4.321 and the standard deviation was 0.536. Find their respective​ z-scores. Which player had the better year relative to their​ peers, Mike or Jane​? ​(Note: In​ general, the lower the​ ERA, the better the​ pitcher.) (a) Mike had an ERA with a​ z-score of ??? (b) Jane had an ERA with a​ z-score of ??? (c) Which player had a better year in comparison with their peers?

(a) -2.38 z = (x - μ) / σ --> (2.57 - 4.081) / 0.636 = -2.38 (b) -1.55 (3.49 - 4.321) / 0.536 = -1.55 (c) Mike had a better year because of a lower​ z-score. -2.38 < -1.55
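
note: the z-score formula is short enough to check by hand, but here is a small Python sketch (the function name `z_score` is just for illustration, not from the card).

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies from the mean."""
    return (x - mean) / sd

mike = z_score(2.57, 4.081, 0.636)
jane = z_score(3.49, 4.321, 0.536)

# Lower ERA is better, so the more negative z-score is the better year.
better = "Mike" if mike < jane else "Jane"
print(round(mike, 2), round(jane, 2), better)
```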

Determine the area under the standard normal curve that lies between... (a) Z=−0.93 and Z=0.93​ ​(b) Z=−1.12 and Z=0​ (c) Z=−1.54 and Z=0.13

(a) .6476 1. 2nd --> VARS --> 2:normalcdf 2. μ = 0 and σ = 1 (b) .3686 (c) .4899
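
note: a Python stand-in for normalcdf, sketched with `statistics.NormalDist` from the standard library (Python 3.8+). The helper name `area_between` is made up for this example.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def area_between(lower, upper):
    """Area under the standard normal curve between two z-values."""
    return Z.cdf(upper) - Z.cdf(lower)

area_a = area_between(-0.93, 0.93)
area_b = area_between(-1.12, 0)
area_c = area_between(-1.54, 0.13)
print(round(area_a, 4), round(area_b, 4), round(area_c, 4))
```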

Twenty years ago, 54% of parents of children in high school felt it was a serious problem that high school students were not being taught enough math and science. A recent survey found that 255 of 750 parents of children in high school felt it was a serious problem that high school students were not being taught enough math and science. Do parents feel differently today than they did twenty years ago? Use the α=0.01 level of significance. (a) Because np0(1−p0) = ??? ??? 10, the sample size is ??? than 5% of the population size, and the sample ???, the requirements for testing the hypothesis ??? satisfied. (b) What are the null and alternative hypotheses? (c) Find the test statistic, z0? (d) Determine the critical value(s). (e) Choose the correct statement.

(a) 186.3, >, less than, can be reasonably, are note: 750(.54)(1 - .54) = 186.3 (b) H0: p = .54 H1: p ≠ .54 (c) -10.99 1. STAT --> TESTS --> 5:1-PropZTest 2. Input info (d) ±z(α/2) = ±2.58
note: critical values by level of significance
α: left-tailed, right-tailed, two-tailed
0.10: -1.28, 1.28, ±1.645
0.05: -1.645, 1.645, ±1.96
0.01: -2.33, 2.33, ±2.575
(e) Reject the null hypothesis. There is sufficient evidence to conclude that the proportion of parents who feel that students are not being taught enough math and science is significantly different from 20 years ago.
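
note: the critical values on this card can be reproduced with the inverse normal CDF. A minimal Python sketch (standard library only; `critical_values` is a made-up helper name, and z0 = -10.99 is taken from the card rather than recomputed):

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def critical_values(alpha):
    """Return (left-tailed, right-tailed, two-tailed) critical z-values."""
    left = Z.inv_cdf(alpha)
    right = Z.inv_cdf(1 - alpha)
    two_tailed = Z.inv_cdf(1 - alpha / 2)
    return left, right, two_tailed

for alpha in (0.10, 0.05, 0.01):
    left, right, two = critical_values(alpha)
    print(f"{alpha:.2f}: {left:.3f}, {right:.3f}, ±{two:.3f}")

# Decision for this card: two-tailed test at alpha = 0.01
z0 = -10.99
reject = abs(z0) > critical_values(0.01)[2]
print("reject H0:", reject)
```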

Find the sample variance and standard deviation. 7​, 48​, 14​, 48​, 37​, 23​, 32​, 31​, 26​, 31

1. STAT --> ENTER --> input values 2. STAT --> CALC --> 1:Var Stats --> ENTER 3. List: L1 and Frequency: N/A 4. It gives you the sample standard deviation (s), so square the unrounded value to get the sample variance s^2 = 172.46 s = 13.1

According to a certain government agency for a large​ country, the proportion of fatal traffic accidents in the country in which the driver had a positive blood alcohol concentration​ (BAC) is 0.39. Suppose a random sample of 109 traffic fatalities in a certain region results in 55 that involved a positive BAC. Does the sample evidence suggest that the region has a higher proportion of traffic fatalities involving a positive BAC than the country at the α=0.05 level of​ significance? (a) Because np0(1−p0) = ??? ??? ​10, the sample size is ??? than 5% of the population​ size, and the sample ???, the requirements for testing the hypothesis ??? satisfied. (b) What are the null and alternative​ hypotheses? (c) Find the test statistic, z0? (d) Find the P-value?

(a) 25.9, >, less, is given to be random, are (b) Hypothesis: H0​: p = .39 H1​: p > .39 (c) z0 = 2.45 1. STAT --> TESTS --> 5:1-PropZTest 2. Input info (d) P-value = .007 (e) Since ​P-value<α​, reject the null hypothesis and conclude that there is sufficient evidence that the region has a higher proportion of traffic fatalities involving a positive BAC than the country.
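
note: a rough Python equivalent of 1-PropZTest for this right-tailed test (not part of the original card; `one_prop_z` is a hypothetical helper name). It uses the usual test statistic z0 = (p^ - p0) / √[p0(1 - p0)/n].

```python
import math
from statistics import NormalDist

def one_prop_z(x, n, p0):
    """Test statistic for a one-proportion z-test."""
    p_hat = x / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

z0 = one_prop_z(55, 109, 0.39)

# Right-tailed test (H1: p > 0.39): P-value is the area to the right of z0
p_value = 1 - NormalDist().cdf(z0)

alpha = 0.05
print(round(z0, 2), round(p_value, 3), "reject H0" if p_value < alpha else "fail to reject H0")
```

For a two-tailed test, the P-value would instead be twice the area beyond |z0|.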

Assume the random variable X is normally distributed with mean μ=50 and standard deviation σ=7. Compute the probability. Be sure to draw a normal curve with the area corresponding to the probability shaded. P(34<X<63) (a) Which of the following normal curves corresponds to P(34<X<63)? (b) P(34<X<63) = ???

(a) Description: Shaded area is to the right of 34 and left of 63, so only the area in between. (b) P(34<X<63) = .9572 1. 2nd --> VARS --> 2:normalcdf 2. lower = 34, upper = 63, μ=50, σ=7

Assume the random variable X is normally distributed with mean μ=50 and standard deviation σ=7. Compute the probability. Be sure to draw a normal curve with the area corresponding to the probability shaded. P(56≤X≤70)​ (a) Which of the following normal curves corresponds to P(56≤X≤70)? (b) P(56≤X≤70)​ = ???

(a) Description: Shaded area is to the right of 56 and left of 70, so only the area in between. (b) P(56≤X≤70)​ = .1935 1. 2nd --> VARS --> 2:normalcdf 2. lower = 56, upper = 70, μ=50, σ=7
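
note: for a non-standard normal you can either give the mean and standard deviation directly (like normalcdf does) or convert the bounds to z-scores first. A small Python sketch showing both routes agree (standard library only, not part of the original card):

```python
from statistics import NormalDist

mu, sigma = 50, 7
X = NormalDist(mu=mu, sigma=sigma)

# Route 1: use the distribution of X directly
p_direct = X.cdf(70) - X.cdf(56)

# Route 2: standardize the bounds, then use the standard normal
Z = NormalDist()
z_low = (56 - mu) / sigma
z_high = (70 - mu) / sigma
p_standardized = Z.cdf(z_high) - Z.cdf(z_low)

print(round(p_direct, 4), round(p_standardized, 4))
```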

Determine whether the events E and F are independent or dependent. Justify your answer. ​(a) E: A person having an at-fault accident. ​F: The same person being prone to road rage. (b) ​E: A randomly selected person planting tulip bulbs in October. ​F: A different randomly selected person planting tulip bulbs in April. (c) E: The consumer demand for synthetic diamonds. ​F: The amount of research funding for diamond synthesis.

(a) E and F are dependent because being prone to road rage can affect the probability of a person having an at-fault accident. (b) E cannot affect F and vice versa because the people were randomly​ selected, so the events are independent. (c) The consumer demand for synthetic diamonds could affect the amount of research funding for diamond synthesis​, so E and F are dependent.

List all the permutations of four objects a, b, c, and d taken two at a time without repetition. What is 4P2​? (a) List all the permutations of four objects a, b, c, and d taken two at a time without repetition.

(a) ab, ac, ad, ba, bc, bd, ca, cb, cd, da, db, dc note: solve 4P2 = 12, which means there are 12 permutations (order matters, so ab and ba count separately)
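
note: the list can be generated instead of written out by hand. A minimal Python sketch using `itertools.permutations` and `math.perm` (Python 3.8+), not part of the original card:

```python
import math
from itertools import permutations

# Every ordered pair of distinct letters from a, b, c, d
perms = ["".join(p) for p in permutations("abcd", 2)]

print(perms)
print(len(perms), math.perm(4, 2))  # both count the 4P2 arrangements
```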

A credit score is used by credit agencies​ (such as mortgage companies and​ banks) to assess the creditworthiness of individuals. Values range from 300 to​ 850, with a credit score over 700 considered to be a quality credit risk. According to a​ survey, the mean credit score is 709.1. A credit analyst wondered whether​ high-income individuals​ (incomes in excess of​ $100,000 per​ year) had higher credit scores. He obtained a random sample of 30 ​high-income individuals and found the sample mean credit score to be 721.4 with a standard deviation of 84.1. Conduct the appropriate test to determine if​ high-income individuals have higher credit scores at the α=0.05 level of significance. (a) State the null and alternative hypotheses. (b) Identify the t-statistics? (c) Identify the P-value? (d) Make a conclusion regarding the hypothesis? ??? the null hypothesis. There ??? sufficient evidence to claim that the mean credit score of​high-income individuals is ??? ???

(a) Hypothesis: H0: μ = 709.1 H1: μ > 709.1 (b) t-statistic = 0.80 1. STAT --> TESTS --> 2:T-Test 2. Input values (c) P-value = .215 (d) Fail to reject, is not, > 709.1
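
note: the t-statistic is t0 = (x̄ - μ0) / (s / √n), which is easy to check in Python (sketch only; `t_statistic` is a made-up name). The standard library has no t-distribution, so the P-value still needs the calculator or, if SciPy happens to be installed, `scipy.stats.t.sf(t0, df=29)`.

```python
import math

def t_statistic(x_bar, mu0, s, n):
    """One-sample t statistic: distance of x_bar from mu0 in standard errors."""
    return (x_bar - mu0) / (s / math.sqrt(n))

t0 = t_statistic(x_bar=721.4, mu0=709.1, s=84.1, n=30)
df = 30 - 1  # degrees of freedom for the one-sample t-test

print(round(t0, 2), df)
```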

Match the histograms on the right to the summary statistics given. I. Mean: 57, Median: 57, Standard Deviation: 2.1 II. Mean: 65, Median: 65, Standard Deviation: 10 III. Mean: 57, Median: 57, Standard Deviation: 12 IV. Mean: 57, Median: 57, Standard Deviation: 21 (a) Peaks around 59 and displays some bell-shaped (b) Does not appear bell shaped at all (c) Very good bell-shape to the left, peaks around 60 (d) Peaks around 50, bell-shaped until it levels off to the right

(a) I (b) II (c) III (d) IV note: a higher standard deviation means that the values are more spread out. Since III and IV are the same except for the standard deviation, the one with the larger range has the larger standard deviation. Low standard deviation = low range.

a. Choose the correct answer below for the shape of the distribution. b. The​ five-number summary is ???

(a) The distribution is skewed left note: that is where the whisker/tail is (b) 0, 14, 17, 19, 20 note: summary is the end of the left whisker, start of the box, middle of the box, end of the box, end of the right whisker

Determine the point estimate of the population mean and margin of error for the confidence interval. Lower bound is 19​, upper bound is 23. (a) The point estimate of the population mean is ??? (b) The margin of error is ???

(a) mean = 21 note: (19 + 23) / 2 = 21 (b) Margin of Error = 2 note: mean + E = upper bound --> 21 + E = 23 --> E = 2
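
note: both answers come straight from the two bounds, since the interval is (point estimate - E, point estimate + E). A tiny Python sketch, not part of the original card:

```python
lower, upper = 19, 23

point_estimate = (lower + upper) / 2   # midpoint of the interval
margin_of_error = (upper - lower) / 2  # half the interval width

print(point_estimate, margin_of_error)
```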

List all the combinations of five objects x, y, z, s, and t taken two at a time. What is 5C2​? (a) List all the combinations of five objects x, y, z, s, and t taken two at a time

(a) xy, xz, xs, xt, yz, ys, yt, zs, zt, st note: solve 5C2 = 10, which means there are 10 combinations 1. Input first value into calc (n = 5) 2. MATH --> PRB --> 3:nCr 3. Input second value (r = 2)
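
note: combinations ignore order, so xy and yx are the same selection. A minimal Python sketch using `itertools.combinations` and `math.comb` (Python 3.8+), not part of the original card:

```python
import math
from itertools import combinations

# Every unordered pair of letters from x, y, z, s, t
combos = ["".join(c) for c in combinations("xyzst", 2)]

print(combos)
print(len(combos), math.comb(5, 2))  # both count the 5C2 selections
```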

The following data represent the​ high-temperature distribution for a summer month in a city for some of the last 130 years. Treat the data as a population. (a) Mean? (b) Standard deviation? (c) Use the frequency histogram of the data to verify that the distribution is bell shaped. (d) According to the empirical​ rule, 95% of days in the month will be between what two​ temperatures?

(a) μ = 80.6 note: use midpoints of values and plug in to L1 and L2 (b) σ = 8.1 (c) Yes, the frequency histogram of the distribution is bell-shaped (d) 64.4 and 96.8 note: 95% of the data lie between μ-2σ and μ+2σ μ+2σ --> 80.6 + 2(8.1) = 96.8 μ-2σ --> 80.6 - 2(8.1) = 64.4

The approximate​ Z-score that corresponds to a right tail area of 0.42 is ???

.2 1. Since the given area is in the right tail, subtract it from 1 and use the result to solve for the z-score (i.e., 1 - 0.42 = .58) 2. 2nd --> VARS --> 3:invNorm 3. Plug in the solved-for area (.58)
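
note: a Python stand-in for invNorm, sketched with `statistics.NormalDist.inv_cdf` (Python 3.8+, not part of the original card). Like invNorm, it expects the area to the LEFT, so the right-tail area is converted first.

```python
from statistics import NormalDist

right_tail_area = 0.42
left_area = 1 - right_tail_area        # area to the left of the z-score

z = NormalDist().inv_cdf(left_area)    # inverse of the standard normal CDF
print(round(z, 2))
```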

In a certain​ city, the average​ 20- to​ 29-year old man is 69.6 inches​ tall, with a standard deviation of 3.0 ​inches, while the average​ 20- to​ 29-year old woman is 64.3 inches​ tall, with a standard deviation of 3.9 inches. Who is relatively​ taller, a​ 75-inch man or a​ 70-inch woman? Find the corresponding​ z-scores. Who is relatively​ taller, a​ 75-inch man or a​ 70-inch woman? The​ z-score for the man​, ???​, is larger than the​ z-score for the woman​, ???​, so he is relatively taller.

1.8 and 1.46 note: use z = (x - μ) / σ man: (75 - 69.6) / 3 = 1.8 woman: (70 - 64.3) / 3.9 = 1.46

a. Age of a car driven Is the variable qualitative or​ quantitative?

a. The variable is quantitative because it is a numerical measure note: quant = count vs. qual = quality characteristic

Which histogram depicts a higher standard​ deviation? Graph (a): Peaks around 55 and displays more evening out to the sides (more bell-shaped) Graph (b): Appears more uniform and scattered

Histogram a depicts the higher standard​ deviation, because the distribution has more dispersion. note: a has a greater variety in frequencies

Construct a 90​% confidence interval of the population proportion using the given information. x=105, n=150 Lower Bound = ??? Upper Bound = ???

Lower Bound = .638 Upper Bound = .762 p^ ± z(α/2) • √[p^(1−p^)/n] --> 0.70 ± 1.645 • √[0.70(1 - 0.70) / 150] = .638 and .762 note: p^ = x/n (in this case 105 / 150 = 0.70) 90% = 1.645

Construct a 90​% confidence interval of the population proportion using the given information. x=160, n=200 Lower Bound = ??? Upper Bound = ???

Lower Bound = .753 Upper Bound = .847 p^ ± z(α/2) • √[p^(1−p^)/n] --> 0.80 ± 1.645 • √[0.80(1 - 0.80) / 200] = .753 and .847 note: p^ = x/n (in this case 160 / 200 = 0.80) critical values: 90% = 1.645 95% = 1.96 99% = 2.575
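
note: the interval formula p^ ± z(α/2) • √[p^(1−p^)/n] can be sketched in Python as below (standard library only; `proportion_ci` is a hypothetical helper, not part of the original card). It pulls z(α/2) from the inverse normal CDF instead of the rounded table value, so the last digit can differ slightly from a hand calculation.

```python
import math
from statistics import NormalDist

def proportion_ci(x, n, level):
    """Confidence interval for a population proportion (normal approximation)."""
    p_hat = x / n
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)        # critical value z(alpha/2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(160, 200, 0.90)
print(round(low, 3), round(high, 3))
```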

a. Number of students at a high school b. Favorite TV Show Is the variable qualitative or​ quantitative?

a. The variable is quantitative because it is a numerical measure. b. The variable is qualitative because it is an attribute characteristic. note: Qualitative variables allow for classification of individuals based on some attribute or characteristic. Quantitative variables provide numerical measures of​ individuals, and can be added or subtracted and provide meaningful results.

