STAT 217 Unit 1-4 Study Review


b) will

In the Wrap-up videos I created about Activity 3.2, I mentioned that if a number is a plausible value for the parameter of interest, then this number ___________ be included in the interval. Fill in the blank: a) will not b) will

increases margin of error

Increasing the confidence level has what effect, if any, on the margin of error of this interval?

a) produce a shorter interval

Increasing the sample size in a study will have what impact(s) on the confidence interval (check all that apply) a) produce a shorter interval b) increase the confidence level c) shift the midpoint of the interval d) improve our ability to generalize from the sample to the population

reject

Indicate whether or not you would reject the null hypothesis, at the α = 0.05 significance level, for the p-value = 0.001.

from a parachute center

The observational units were obtained from a parachute center or from a standard test

the amount students spent; more than two

A quantitative variable from this scenario is ---- (what students were told, the amount students spent), for which there are ---- possible outcomes.

d) distance

A "standardized statistic" is a measure of a) shape b) center c) spread d) distance

a) True

A 2SD 95% confidence interval for the probability the competent-face method will work was calculated to be 0.72 ± 2(0.09) . Based on your confidence interval, we can conclude that the probability the competent-face method will work is greater than 50%. a) True b) False
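
The check behind this answer can be sketched in a few lines of Python. The variable names are illustrative and not taken from the course materials.

```python
# 2SD (approximately 95%) confidence interval from the question above
p_hat = 0.72   # observed sample proportion
sd = 0.09      # SD of the sample proportion, as given in the question

lower = p_hat - 2 * sd
upper = p_hat + 2 * sd

# 0.5 is a plausible value only if it falls inside the interval
half_is_plausible = lower <= 0.5 <= upper
print(round(lower, 2), round(upper, 2), half_is_plausible)
```

Because the whole interval sits above 0.5, the data support a probability greater than 50%.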

the explanatory and the response variable

A randomized experiment allows for the possibility of drawing a cause-and-effect conclusion between _________ and ________.

what students were told; two

A categorical variable from this scenario is ---- for which there are ---- possible outcomes.

decreases; slowly

As n increases, the SD of p^ _____ more and more ______.

The sample size is greater than 20, but we do not know the shape of the sample distribution of IQs (it cannot be strongly skewed). So, at this point we cannot tell whether the validity conditions are met.

Are the validity conditions met to conduct a one-sample t-test? Explain.

Well, we CANNOT say this for sure, but we can say .5 is a plausible value for the probability of landing up because .5 is captured inside our 95% confidence interval.

Based on your analyses, would it be legitimate to conclude that the probability a spun tennis racquet lands up is .5? Discuss both the validity of this conclusion based on your analyses and what it means to say "probability" in this context.

income levels; median age; population

Consider all of the countries in the world as the observational units in a study Which of the following are quantitative variables that could be recorded on these countries? employment college education median age population ethnicity income levels

a) quantitative variable

The length of the song represents a --- a) quantitative variable b) categorical variable

-true

Consider the scenario: "An instructor wants to investigate whether the color of paper (blue or green) on which an exam is printed has an effect on students' exam scores.​" The observational units in this scenario are the students. -true -false

The parameter of interest is the probability of a spun racquet landing up.

Define the parameter of interest in words.

two

Gilbert uses a sample size of 25. Sullivan uses a sample size of 100. Gilbert's estimated SD for p^ will be ________ times as large as Sullivan's.
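
A minimal sketch of why the answer is "two", assuming the usual formula SD(p^) = sqrt(p(1 - p)/n). The value p = 0.5 is an arbitrary placeholder; the ratio does not depend on it.

```python
from math import sqrt

def sd_phat(p, n):
    """Theory-based SD of a sample proportion."""
    return sqrt(p * (1 - p) / n)

p = 0.5  # assumed for illustration; it cancels in the ratio
ratio = sd_phat(p, 25) / sd_phat(p, 100)
print(ratio)
```

Quadrupling the sample size only halves the SD, which is also why the SD of p^ decreases more and more slowly as n grows.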

In this context μ represents the average IQ of all people who claim to have had an intense experience with a UFO.

Identify clearly what the symbol μ represents in this context.

true

Is the reasoning below a good explanation as to why it makes sense (at least in hindsight) that a couple with four children is more likely not to have two of each sex than to have two of each sex? This fact can be correctly explained by saying that having exactly two of each gender is pretty specific. Even though a specific 3-1 or 4-0 split is less likely than a 2-2 split, there are more ways to obtain a result other than a 2-2 split. true false

c) Less than 0.97, since there are many pairs of people in the room that do not involve you and only 49 pairs of people that involve you.

Now consider the question of how likely it is that at least one person in the room of 50 matches your particular birthday. (Do not attempt to calculate this probability.) ​ Is this event more, less, or equally likely as the event that at least two people share any birthday? In other words, do you think this probability will be smaller than 0.97, larger than 0.97, or equal to 0.97 and why? a) The probability cannot be predicted since we don't know how the people were selected to be in the room. b) Equal to 0.97 since the probability has been determined previously. c) Less than 0.97, since there are many pairs of people in the room that do not involve you and only 49 pairs of people that involve you. d) Greater than 0.97 since your birthday is as likely to occur as any other.

will be smaller

Now keeping the sample size the same, you take a new sample and find a sample proportion of 0.55. How will the new p-value compare to the p-value of 0.08 you first obtained? will be smaller won't change will be larger will double

H0: π = 0.5 Ha: π > 0.5

Null and Alternative Hypothesis State in symbols the appropriate null and alternative hypotheses to be tested.

b) false

Random sampling is a more important consideration than random assignment if the research question is whether students tend to receive higher scores on essays if they are encouraged to submit a draft than if they are not so encouraged. a) true b) false

The strongest evidence is 65 out of 300 The least strong evidence is 100 out of 400

Select the strongest and least strong evidence of the long-run proportion that the friend will choose scissors is less than 1/3. The strongest evidence is The least strong evidence is

We are 95% confident that the probability a spun racquet lands up is between .366 and .557 for racquets similar to the one in the study.

The 95% confidence interval for the probability a racquet lands up is (0.366, 0.557). Interpret this 95% confidence interval.

experienced; novice

The average anxiety score for --- skydivers was 27, and the average anxiety score for --- skydivers was 43.

binary

The categorical variable in this scenario can be classified as nonbinary binary

colors of paper

The categorical variable in this scenario is the exam scores color of paper

is/are ; do not

The categorical variable(s) --- binary because they --- have more than two possible outcomes. is/are or is/are not; do or do not

a) Buzz's long-run probability of pushing the correct lever

The following questions refer to the Doris and Buzz study (Example 1.1). Recall that in this study Buzz the dolphin pushed the correct lever 15 out of 16 times when given a choice between the two levers. Buzz chose a lever after (possibly) being communicated to by Doris (another dolphin). The parameter in the Doris and Buzz study is a) Buzz's long-run probability of pushing the correct lever b) whether or not Doris and Buzz can communicate c) Buzz pushing the correct lever 15 out of 16 times

b) If you repeatedly record whether or not it rains for a large number of days with the same weather conditions as tomorrow, in the long run you will see rain on 30% of such days.

The probability of rain tomorrow is 0.3. a) Thirty percent of the region will receive rain tomorrow. b) If you repeatedly record whether or not it rains for a large number of days with the same weather conditions as tomorrow, in the long run you will see rain on 30% of such days. c) Residents of the city should prepare for about 7.2 hours (30% of 24 hours) of rain on the next day. d) None of these are correct.

exam scores

The quantitative variable in this scenario is the exam scores color of paper

anxiety score; expertise level

The quantitative variable in this study is the---- and the categorical variable is the -----.

novice; experienced

The study included 11 -- skydivers and 13 --- skydivers.

d) categorical variable

The subject in which the student is majoring would be considered which of the following? a) observational unit b) quantitative variable c) research question d) categorical variable

a) large

To be strong evidence against the null hypothesis, I want the standardized statistic to be a) large b) small c) the standardized statistic doesn't tell me about strength of evidence

Friend D because they played more games

Which friend's data do you think provides more evidence against the null hypothesis, given that both friends played rock the same proportion of times? Friend D because they played more games Friend D because they played fewer games Friend E because they played fewer games Friend E because they played more games

ethnicity; employment; education level

Which of the following are categorical variables that could be recorded on these countries? employment education level median age population ethnicity income levels

false

You should not take a random sample of more than 5% of the population size. T F

a) how far the cat can jump and the length of the cat

Identify the variables in this question: "Can you predict how far a cat can jump based on factors such as its length?" a) how far the cat can jump and the length of the cat b) how far the cat can jump and the number of jumps c) length of jump and age of cat d) length of cat and frequency of jumps

-which way the label lands

in this scenario, ---- is the variable. -tennis racket -each of 100 spins -which way the label lands -the probability of the label lands facing up

a) which group the prof placed students

--- is the categorical variable in this study. a) which group the prof placed students b) how many hours students practiced each week c) violin students d) Music Academy of West Berlin

fail to reject

Indicate whether or not you would reject the null hypothesis, at the α = 0.05 significance level, for the p-value = 0.078.

p^ (phat) = 80/124 = 0.645 categorical data, statistic symbol= phat

A German bio-psychologist, Onur Güntürkün, was curious whether the human tendency for right-sidedness (e.g., right-handed, right-footed, right-eyed) manifested itself in other situations as well. In trying to understand why human brains function asymmetrically, with each side controlling different abilities, he investigated whether kissing couples were more likely to lean their heads to the right than to the left. He and his researchers observed couples (estimated ages 13 to 70 years, not holding any other objects like luggage that might influence their behavior) in public places such as airports, train stations, beaches, and parks in the United States, Germany, and Turkey. Of the 124 couples observed, 80 leaned their heads to the right when kissing. Research question: Are couples more likely to lean to the right when kissing? Statistic: In the study conducted by Onur Güntürkün, what proportion of couples leaned their heads to the right? Use the appropriate symbol.

b) how many hours students practiced each week

A famous study titled "The Effect of Deliberative Practice in the Acquisition of Expert Performance," published in Psychological Review (Ericsson, Krampe, and Tesch-Romer, 1993) led to the now-conventional wisdom that 10,000 hours of deliberate practice are necessary to achieve expert performance in skills such as music and sports. The researchers in this study asked violin students at the Music Academy of West Berlin to keep a diary indicating how they spent their time. The researchers also asked the students' professors to indicate which were the top students with the potential for careers as international soloists, which were good violinists but not among the best, and which were studying to become music teachers. Researchers found that those in the top two groups devoted much more time per week to individual practice than did those in the group studying to become music teachers. --- is the quantitative variable in this study. a) which group the prof placed students b) how many hours students practiced each week c) violin students d) Music Academy of West Berlin

c. (0.60, 0.66) d. (0.47, 0.53)

A recent Gallup poll showed the president's approval rating at 60%. Some friends use this information (along with the sample size from the poll) and find theory-based confidence intervals for the proportion of all adult Americans that approve of the president's performance. Of the following four confidence intervals, identify the ones that were definitely done incorrectly. (There may be more than one interval that is incorrect.) a. (0.57, 0.63) b. (0.58, 0.62) c. (0.60, 0.66) d. (0.47, 0.53)

d) Do novice skydivers tend to have higher levels of self-reported anxiety prior to a skydive than experienced skydivers?

A recent study investigated self-reported anxiety levels immediately before a skydive among 11 first-time skydivers and 13 experienced skydivers (at least 30 jumps) recruited from a parachute center in northern England (Hare et al., 2013). The researchers found that anxiety levels were substantially higher (average anxiety score of 43, higher means more anxiety) among the first-time skydivers as compared to the experienced skydivers (average anxiety score of 27) on a standard test for anxiety (Spielberger StateTrait Anxiety Inventory). Which of the following represents Step 1 of a statistical investigation? a) The 24 skydivers b) From a parachute center in Northern England c) Novice or expert skydiver d) Do novice skydivers tend to have higher levels of self-reported anxiety prior to a skydive than experienced skydivers?

a) theory-based inference applet

In the Wrap-up videos I created about Activity 2.1, I talked about conducting the theory-based approach using a different applet. What is the name of the applet? a) theory-based inference applet b) theory applet c) one proportion applet

d) The new interval would be narrower than (2.619, 3.401) hours, because the sample size is bigger.

According to a 2011 report by the United States Department of Labor, civilian Americans spend 2.75 hours per day watching television. A faculty researcher, Dr. Sameer, at California Polytechnic State University (Cal Poly) conducts a study to see whether a different average applies to Cal Poly students. Suppose that for a random sample of 100 Cal Poly students, the mean and standard deviation of hours per day spent watching TV turns out to be 3.01 and 1.97 hours, respectively. The data were used to find a 95% confidence interval: (2.619, 3.401) hours/day.​ Suppose that the data had actually been collected from a sample of 150 students, and not 100, but everything else (mean and SD) was the same as reported earlier. How, if at all, would the new 95% confidence interval based on these data differ from the interval mentioned earlier: (2.619, 3.401) hours?​ a) More information is needed to answer this question. b) The new interval would be wider than (2.619, 3.401) hours, because the sample size is bigger. c) The new interval would still be (2.619, 3.401) hours, because we are still 95% confident. d) The new interval would be narrower than (2.619, 3.401) hours, because the sample size is bigger.
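
A rough sketch of the comparison, using the 2 × SE shortcut instead of the exact t multiplier (an assumption made to keep the example dependency-free, so the n = 100 interval differs slightly from the reported (2.619, 3.401)). The function name `approx_ci` is hypothetical.

```python
from math import sqrt

def approx_ci(mean, sd, n, multiplier=2.0):
    """Approximate 95% interval: mean +/- multiplier * SD / sqrt(n)."""
    se = sd / sqrt(n)
    return mean - multiplier * se, mean + multiplier * se

ci_100 = approx_ci(3.01, 1.97, 100)
ci_150 = approx_ci(3.01, 1.97, 150)

width_100 = ci_100[1] - ci_100[0]
width_150 = ci_150[1] - ci_150[0]
print(ci_100, ci_150)
```

With the same mean and SD, the larger sample shrinks the standard error, so the interval gets narrower while staying centered at 3.01.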

a) If you repeatedly draw M&Ms at random a very large number of times, in the long-run 20% of those M&Ms will be red.

Answer this question for each of the following statements: Which of the following explains what it means to say "the probability of ..." while describing the random process that is repeated over and over again? The probability of getting a red M&M candy is 0.2. a) If you repeatedly draw M&Ms at random a very large number of times, in the long-run 20% of those M&Ms will be red. b) Each time you draw ten M&Ms, two of the M&Ms should be red. c) For every 100 M&Ms, there should be 20 M&Ms in the bag of candies. d) All of these are correct interpretations of probability.

d) there is very strong evidence against the null hypothesis

As part of a class project, a student conducts a survey of 50 students at her school finding that, on average, the students in the sample report spending 2.3 hours per day watching TV with a standard deviation of s = 2 hours per day. The student wonders whether this is significantly less than the national average of 4.9 hours per day. The t-statistic for this data is given as -9.19. This means that a) there is little to no evidence against the null hypothesis b) there is moderate evidence against the null hypothesis c) there is strong evidence against the null hypothesis d) there is very strong evidence against the null hypothesis
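
The reported t-statistic can be reproduced from the numbers in the question. This is a sketch of the standard one-sample formula t = (x̄ - μ0) / (s / sqrt(n)); the variable names are illustrative.

```python
from math import sqrt

x_bar = 2.3   # sample mean (hours per day)
mu_0 = 4.9    # national average under the null hypothesis
s = 2.0       # sample standard deviation
n = 50        # sample size

t = (x_bar - mu_0) / (s / sqrt(n))
print(round(t, 2))
```

A standardized statistic this far from zero (well beyond 3 in absolute value) is what justifies "very strong evidence".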

a) The patient does not have the disease, but the test says the patient has the disease

As with a jury trial, another analogy to hypothesis testing involves medical diagnostic tests. These tests aim to indicate whether or not the patient has a particular disease. But the tests are not infallible, so errors can be made. The null hypothesis can be regarded as the patient being healthy. The alternative hypothesis can be regarded as the patient having the disease. Describe what Type I error (false alarm) represents in this situation. a) The patient does not have the disease, but the test says the patient has the disease. b) The patient has the disease, but the test says the patient is healthy.

0.625

Based on the probability given in part b (p = 0.375), what is the probability that a couple with four children does not have 2 boys and 2 girls?
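
Both probabilities can be checked by listing all 16 birth orders, assuming boys and girls are equally likely and births are independent (an assumption stated here, not in the card).

```python
from itertools import product

# every possible sequence of four children, e.g. ('B', 'G', 'G', 'B')
outcomes = list(product("BG", repeat=4))
two_of_each = [o for o in outcomes if o.count("B") == 2]

p_two_two = len(two_of_each) / len(outcomes)
p_not_two_two = 1 - p_two_two
print(p_two_two, p_not_two_two)
```

Six of the 16 sequences are 2-2 splits, so the remaining ten (the 3-1 and 4-0 splits together) are more likely as a group.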

1) z = 3.012. (This number may vary a bit since the SD of the simulated null distribution will vary a bit.) 2) Because the value of z is greater than 2, we have enough evidence to reject the null hypothesis. 3) The standard deviation of the simulated sample proportions should be around 0.083, so the standardized statistic is approximately z = (0.50 - 0.25)/0.083 = 3.012.

Calculate the standardized statistic (z). Show/type every step of your calculations. Based on the value of (z) you found, state whether you would reject or fail to reject the null hypothesis. Please use this template for your answer: 1) z = 2) Answer (Reject the null OR fail to reject the null): 3) Type your work for the calculation of z:
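
The calculation in the answer can be written out as follows. The SD of 0.083 comes from a simulated null distribution, so treat it as an approximate input.

```python
observed = 0.50      # observed sample proportion
null_value = 0.25    # proportion under the null hypothesis
sd_null = 0.083      # SD of the simulated null distribution (approximate)

z = (observed - null_value) / sd_null
reject_null = abs(z) > 2   # rule of thumb used in the answer
print(round(z, 3), reject_null)
```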

c) cats

Can you predict how far a cat can jump based on factors such as its length? Identify the observational units in this question. a) length of cat b) number of jumps c) cats d) distance jumped

d) The bimodal clustering is likely due to widely varying geography in San Luis Obispo County, including locations nearer to the ocean (cooler) and farther away from the ocean (warmer).

Can you suggest an explanation for the bimodal (two clusters/peaks) nature of this distribution? a) There should be no difference in the predictions because the area is in the same region of California and should have the same temperatures each day. b) Different organizations are making the predictions using different measuring instruments and computer programming. c) There are two competing weather organizations - the national weather service and the local weather service which are providing the predictions. d) The bimodal clustering is likely due to widely varying geography in San Luis Obispo County, including locations nearer to the ocean (cooler) and farther away from the ocean (warmer).

Yes. We had 46 up (success) and 54 down (failure), both above 10. So yes, we would predict a normal shape to the null distribution of sample proportions.

Can you use a Theory-based approach to answer the research question? Explain.

In this study, the proportion of couples who leaned their heads to the right was 0.645. With a very small p-value of 0.00 we have very strong evidence against the null hypothesis. We can reject the null hypothesis and conclude that it was very unlikely to observe 64.5% of the couples (or more) leaning to the right, under the assumption that the null hypothesis is true (couples have no preference to a side when they kiss). Thus, we have evidence to say that couples do have a preference for leaning to the right when kissing.

Conclusion Summarize the conclusion that you draw from this study based on your analysis. Make sure you explain the reasoning process behind your conclusion by referring to the (1) observed statistic, (2) p-value (3) strength of evidence against the null model (very strong, strong, moderate, weak), (4) reject/fail to reject the null hypothesis, and (5) overall conclusion in context.
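
A simulation along the lines of the course applet can be sketched as below. The seed and the number of repetitions are arbitrary choices, and the one-sided direction follows the research question.

```python
import random

rng = random.Random(1)   # fixed seed so the run is repeatable (arbitrary)
n_couples = 124
observed_right = 80
reps = 10_000

at_least_as_extreme = 0
for _ in range(reps):
    # one simulated study under the null: each couple leans right with probability 0.5
    right = sum(rng.random() < 0.5 for _ in range(n_couples))
    if right >= observed_right:
        at_least_as_extreme += 1

p_hat = observed_right / n_couples
p_value = at_least_as_extreme / reps
print(round(p_hat, 3), p_value)
```

Very few simulated studies reach 80 or more right-leaning couples, which is why the p-value is reported as essentially zero.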

a) residence situation of each student

Consider the following question: "Is the residence situation of a college student (on-campus, off-campus with parents, off-campus without parents) related to how much alcohol the student consumes in a typical week?" Which of the following would be a categorical, non-binary variable? a) residence situation of each student b) alcohol consumption of each student c) type of alcohol consumed d) type of college

d) alcohol consumption of each student

Consider the following question: "Is the residence situation of a college student (on-campus, off-campus with parents, off-campus without parents) related to how much alcohol the student consumes in a typical week?" Which of the following would be a quantitative variable? a) type of alcohol consumed b) type of college c) residence situation of each student d) alcohol consumption of each student

true

Consider the scenario: "Statistical evidence was used in the murder trial of Kristen Gilbert, a nurse who was accused of killing patients. More than 1,000 eight-hour shifts were analyzed. Was the proportion of shifts with a death substantially higher for the shifts that Gilbert worked?​" The observational units in this scenario are the 8-hour shifts. true false

students' height; students' age

Consider the students in your class as the observational units in a study. Which of the following are quantitative variables that could be recorded on these students? -students' hair color -students' residence situation -students' height -students with backpacks -students' age

quantitative

Consider this question: "Can you predict how far a cat can jump based on factors such as its length?" The variables in this question are quantitative categorical

a) Amanda, because there is more consistency in her scores... they are all equal to each other (there is no variation). b) Charlene, because there are large differences between her individual scores and her mean score. For this question, you need to recognize that the standard deviation measures the average distance from the mean. Charlene is the one that has most of the values away from her mean score.

Consider three students with the following distributions of 24 quiz scores: Amanda: 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8 Barney: 5, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 10 Charlene: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10 a) Which student has the smallest standard deviation of quiz scores? Explain. Do not do any calculations. b) Which student has the largest standard deviation of quiz scores? Explain. Do not do any calculations.
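
The question asks for reasoning without calculation, but the ordering is easy to confirm afterwards. This sketch uses the population SD from Python's statistics module; the sample SD would give the same ranking.

```python
from statistics import pstdev

amanda = [8] * 24
barney = [5] + [6] * 4 + [7] * 7 + [8] * 7 + [9] * 4 + [10]
charlene = [0] * 12 + [10] * 12

sd = {
    "Amanda": pstdev(amanda),
    "Barney": pstdev(barney),
    "Charlene": pstdev(charlene),
}
print(sd)
```

Amanda's scores never leave the mean, while every one of Charlene's scores sits 5 points away from her mean of 5.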

Flip a coin 100 times and record the number of times the coin lands heads. Let this represent the racquet landing "up." Repeat this process a large number of times and look at the distribution of the proportion of heads across all these trials. Then see where 0.46 falls in this distribution. If it falls in the tail of the distribution, we have evidence that the racquet was not landing up and down equally often in the long run. If 0.46 falls among the "typical" values in this distribution, then we don't have evidence against the belief that the racquet is 50/50 when spun this way.

Describe how you could use a coin to conduct a simulation analysis of this study and its result. Give sufficient detail that someone else could implement this simulation analysis based on your description. Be sure to indicate how you would decide whether the observed data provide much evidence against believing that spinning the racquet is really a fair 50-50 process.
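
The coin-flipping procedure translates directly into a short simulation. The seed, the 10,000 repetitions, and the two-sided definition of "as extreme" are choices made for this sketch.

```python
import random

rng = random.Random(46)  # arbitrary seed for repeatability
spins = 100
observed_up = 46
reps = 10_000

distance = abs(observed_up - spins / 2)   # how far 46 is from the expected 50
as_extreme = 0
for _ in range(reps):
    heads = sum(rng.random() < 0.5 for _ in range(spins))  # 100 fair coin flips
    if abs(heads - spins / 2) >= distance:
        as_extreme += 1

p_value = as_extreme / reps
print(p_value)
```

Roughly half of the simulated sets of 100 flips land at least as far from 50 heads as 46 does, so 0.46 is a typical result for a fair 50-50 process.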

b) The patient has the disease, but the test says the patient is healthy.

Describe what Type II error (missed opportunity) represents in this situation. a) The patient does not have the disease, but the test says the patient has the disease. b) The patient has the disease, but the test says the patient is healthy.

b) two sided

For the following scenario, classify the alternative hypothesis as one-sided or two-sided. A snack-food company produces a 454 g bag of pretzels. If the mean net weight is less than 454 g, the company will be short-changing its customers; and if the mean net weight exceeds 454 g, the company will be unnecessarily overfilling the bags. The quality assurance department periodically tests whether the packaging machine is working properly. a) one sided b) two sided

true

For this question: "Do college students who pull all-nighters tend to have lower grade point averages than those who do not pull all-nighters?", an example of a categorical binary variable is whether or not each student pulls all-nighters. true false

quantitative variable

For this question: "Do college students who pull all-nighters tend to have lower grade point averages than those who do not pull all-nighters?", the students' GPA represents a --- (quantitative variable or categorical variable)

We cannot claim that the mean IQ of the population of those who claim to have had an intense experience with a UFO is greater than 100. In other words, it is plausible that the mean IQ of this population is 100.

Give a conclusion in context, make sure you refer to the parameter.

The confidence interval should get narrower (less wide) with the larger sample size

How do you expect the confidence interval to change if the sample size was 200 and the sample proportion was still .46?

a) alcohol consumption of each student and residence situation of each student

Identify the variables in this question a) alcohol consumption of each student and residence situation of each student b) alcohol consumption of each student and types of alcohol consumed c) age of student and residence situation of each student d) type of alcohol consumed and age of student

b) not correct

If a researcher conducted a randomized experiment, then we know for sure that random assignment did NOT happen in the study. a) correct b) not correct

b) No, the proportion of games that team A wins will likely be close to but not exactly 2/3 (such as 18, 19, 20, 21, 22 wins), so there could still be a low probability that A wins exactly 2/3 of the 30 games.

If team A plays team B for 30 games, is it very likely that team A will win exactly 20 times? Explain. a) This cannot be determined without seeing each game's outcome. b) No, the proportion of games that team A wins will likely be close to but not exactly 2/3 (such as 18, 19, 20, 21, 22 wins), so there could still be a low probability that A wins exactly 2/3 of the 30 games. c) Yes, since the probability has been calculated to be 2/3, we expect team A to win 20 games. d) Yes, team A will be likely to win twenty of thirty games because the long-run probability is 2/3.
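
The exact chance of 20 wins can be computed with the binomial formula, assuming the 30 games are independent and A wins each with probability 2/3.

```python
from math import comb

n_games = 30
p_win = 2 / 3

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_exactly_20 = binom_pmf(20, n_games, p_win)
p_18_to_22 = sum(binom_pmf(k, n_games, p_win) for k in range(18, 23))
print(round(p_exactly_20, 3), round(p_18_to_22, 3))
```

Twenty wins is the single most likely count, yet it still happens in only about 15% of 30-game series; a result near 20 is far more likely than exactly 20.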

b) make the ages of respondents similar

If the average age of novice skydivers was 21 and the average age of experienced skydivers was 28, how could the study of anxiety levels of skydivers based on experience be better identified in a future study? a) make the experiences of respondents similar b) make the ages of respondents similar c) use a different inventory d) use respondents of only one gender

age

If the average age of novice skydivers was 21 and the average age of experienced skydivers was 28, this could impact the conclusions of the researchers by implying that the difference in anxiety levels is due to age gender level of expertise type of test administered

c) It would be preferable for team A to play the best-of-three series, because in the longer series there is less of a chance for a weaker team to achieve the upset win multiple times.

If you are a fan of team A, would it be preferable to play a single game or a best-of-three series, or is there no difference? a) It would be preferable for team A to play a single game, because they can play their best pitcher and probably win. b) It doesn't matter whether they play a single game or a best-of-three series. They should win anyway with the large probability. c) It would be preferable for team A to play the best-of-three series, because in the longer series there is less of a chance for a weaker team to achieve the upset win multiple times.
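
A quick calculation supports option c, assuming team A wins each game independently with probability 2/3 (the value used in the related question; this card only says the probability is large).

```python
p = 2 / 3   # assumed single-game win probability for team A

# A wins the series 2-0, or wins 2-1 (loses exactly one of the first two games)
p_series = p**2 + 2 * p**2 * (1 - p)
print(round(p, 3), round(p_series, 3))
```

The series probability (20/27, about 0.741) exceeds the single-game probability, so the longer format favors the stronger team.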

b) False

If you are concerned that the validity conditions aren't met, use a theory-based approach to compute a confidence interval for the mean. a) True b) False

d) Draw a slip of paper out of the hat and write down the number on a piece of paper. Replace this slip, mix up the slips, and repeat the draw 49 more times until you have at least 50 numbers recorded. If any of the 49 numbers are the same, then you have a 'match.' Repeat this process 1000 times and see in what proportion of the sets of 50 numbers there is a match.

Imagine you had a hat with 365 slips of paper in it, each with a number from 1 to 365 listed on it to represent each day of the year. Select from the simulations below the best simulation method you could use to confirm that the probability is 0.97 (as given in Part (a)). a) Draw a slip of paper out of the hat and write down the number on a piece of paper. Replace this slip, mix up the slips, and repeat the draw 49 more times until you have at least 50 numbers recorded. If any of the 49 numbers are the same as the first number drawn, then you have a 'match.' Repeat this process 10 times and find the proportion of the 10 sets of 50 numbers for which there is a match. b) Draw a slip of paper out of the hat and write down the number on a piece of paper. Replace this slip, mix up the slips, and repeat the draw 49 more times until you have at least 50 numbers recorded. If any of the 49 numbers are the same as the first number drawn, then you have a 'match.' Repeat this process 100 times and find the proportion of the 10 sets of 50 numbers for which there is a match. c) All simulations are equivalent. d) Draw a slip of paper out of the hat and write down the number on a piece of paper. Replace this slip, mix up the slips, and repeat the draw 49 more times until you have at least 50 numbers recorded. If any of the 49 numbers are the same, then you have a 'match.' Repeat this process 1000 times and see in what proportion of the sets of 50 numbers there is a match.

d) Draw a slip of paper out of the hat and write down the number on a piece of paper. Replace this slip, mix up the slips, and repeat the draw 49 more times until you have 50 numbers recorded. If any of the 50 numbers are the same as the first number drawn, then you have a 'match'. Repeat this process 1000 times and find the proportion of the 1000 sets of 50 numbers for which there is a match.

Imagine you had a hat with 365 slips of paper in it, each with a number from 1 to 365 listed on it to represent each day of the year. ​​​Select from the simulations below the best simulation method you could use to confirm the probability of someone in the room matching your particular birthday will be smaller than 0.97, larger than 0.97, or equal to 0.97. a) All simulations are equivalent. b) Draw a slip of paper out of the hat and write down the number on a piece of paper. Replace this slip, mix up the slips, and repeat the draw 49 more times until you have 50 numbers recorded. If any of the 50 numbers are the same, then you have a 'match'. Repeat this process many, many times and see in what proportion of the sets of 50 numbers there is a match. c) Draw a slip of paper out of the hat and write down the number on a piece of paper. Replace this slip, mix up the slips, and repeat the draw 49 more times until you have 50 numbers recorded. If any of the 50 numbers are the same as the first number drawn, then you have a 'match'. Repeat this process 10 times and find the proportion of the 10 sets of 50 numbers for which there is a match. d) Draw a slip of paper out of the hat and write down the number on a piece of paper. Replace this slip, mix up the slips, and repeat the draw 49 more times until you have 50 numbers recorded. If any of the 50 numbers are the same as the first number drawn, then you have a 'match'. Repeat this process 1000 times and find the proportion of the 1000 sets of 50 numbers for which there is a match.
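
Both hat simulations can be run in code. In this sketch the first number drawn stands for your own birthday, and the seed and repetition count are arbitrary.

```python
import random

rng = random.Random(217)  # arbitrary seed for repeatability

def simulate(reps=10_000, people=50, days=365):
    any_match = 0    # at least two of the 50 numbers are the same
    mine_match = 0   # someone else matches the first number drawn
    for _ in range(reps):
        draws = [rng.randint(1, days) for _ in range(people)]  # draw with replacement
        if len(set(draws)) < people:
            any_match += 1
        if draws[0] in draws[1:]:
            mine_match += 1
    return any_match / reps, mine_match / reps

p_any, p_mine = simulate()
print(p_any, p_mine)
```

The first proportion should land near 0.97, while the chance that someone matches one particular birthday is far smaller, consistent with the "less than 0.97" answer.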

Ho: μ = 100 vs. Ha: μ > 100.

In a 1993 study, researchers took a sample of people who claimed to have had an intense experience with an unidentified flying object (UFO) and a sample of people who did not claim to have had such an experience (Spanos et al., 1993). They then compared the two groups on a wide variety of variables, including IQ. Suppose you want to test whether or not the average IQ of those who have had such a UFO experience is higher than 100. The mean IQ of the 25 people in the study who claimed to have had an intense experience with a UFO was 101.6; the standard deviation of these IQs was 8.9. State the null and alternative hypothesis in symbols.

false

In general, there will be more variability in the sample means across samples than in the quantitative variable across observational units. True False

99%

In order to investigate how many hours a day students at their school tend to spend on course work outside of regularly scheduled class time, a statistics student takes a random sample of 150 students from their school by randomly choosing names from a list of all full-time students at their school that semester. The student finds that the average reported daily study hours among the 150 students is 2.23 hours. The standard deviation of the hours studied is 1.05 hours. Which confidence interval would be the widest?​ 95% 90% 99% 85%
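A short sketch of why the 99% interval is the widest, using the sample size and standard deviation from the question. It assumes the normal multiplier as a stand-in for the t multiplier (reasonable for n = 150); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, sd = 150, 1.05      # sample size and sample SD from the question
se = sd / sqrt(n)      # standard error of the sample mean

widths = {}
for level in (0.85, 0.90, 0.95, 0.99):
    # Normal multiplier for the given confidence level (approximation to t)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    widths[level] = 2 * z * se   # full width of the interval

for level, width in widths.items():
    print(f"{level:.0%} interval width: {width:.3f} hours")
```

The multiplier grows with the confidence level while the standard error stays fixed, so the width grows with it.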

representative

In order to make their conclusions more relevant, the researchers are hoping that the 24 skydivers in this study are (representative or not representative) when compared to all skydivers in general.

Replacement: 0.167 Regular: 0.070

In the 2012 National Football League (NFL) season, the first three weeks' games were played with replacement referees because of a labor dispute between the NFL and its regular referees. Many fans and players were concerned with the quality of the replacement referees' performance. We could examine whether data might reveal any differences between the three weeks' games played with replacement referees and the next three weeks' games that were played with regular referees. For example, did games generally take less or more time to play with replacement referees than with regular referees? The pair of dotplots below display data about the duration of games (in minutes), separated by the type of referees officiating the game. What proportion of the 48 games officiated by replacement referees lasted for at least 3.5 hours (210 minutes)? What proportion of the 43 games officiated by regular referees lasted for this long? (Round your answer to 3 decimal places, ex. 5.245.) Replacement: Regular:

b) Both distributions are fairly symmetric, centered around 12 penalties, with a minimum of 4 penalties and a maximum of 24-25 penalties. c) The game that had the highest number of penalties assessed was officiated by a regular referee. d) The games with 23-25 penalties are a bit unusual for both types of referees, with a few more of these extreme games for the replacement referees.

In the 2012 National Football League (NFL) season, the first three weeks' games were played with replacement referees because of a labor dispute between the NFL and its regular referees. Many fans and players were concerned with the quality of the replacement referees' performance. We could examine whether data might reveal any differences between the three weeks' games played with replacement referees and the next three weeks' games that were played with regular referees. The pair of dotplots shown display data on the total number of penalties called in the game, again separated by the type of referee. Select all the statements below which might reveal whether the two types of referees differ with regard to the distributions of this variable. a) The number of penalties assessed by regular referees seems to be higher than the number of penalties assessed by replacement referees. b) Both distributions are fairly symmetric, centered around 12 penalties, with a minimum of 4 penalties and a maximum of 24-25 penalties. c) The game that had the highest number of penalties assessed was officiated by a regular referee. d) The games with 23-25 penalties are a bit unusual for both types of referees, with a few more of these extreme games for the replacement referees.

a) the 99% confidence interval is wider

In the Wrap-up videos I created about Activity 2.2_3.1, I compared a 95% confidence interval with a 99% confidence interval. Based on what I said in the videos, which of the following is the correct answer? a) the 99% confidence interval is wider b) the 95% confidence interval is wider c) both intervals are the same width

a) True

In this scenario, how long the subjects thought the song snippet lasted is the variable. a) True b) False

categorical

In this scenario, the variable(s) would be -- categorical or quantitative

it decreases margin of error

Increasing the sample size, if all else remains the same, has what effect, if any, on the margin of error of this interval?

reject

Indicate whether or not you would reject the null hypothesis, at the α = 0.05 significance level, for the p-value = 0.045.

fail to reject

Indicate whether or not you would reject the null hypothesis, at the α = 0.05 significance level, for the p-value = 0.051.

b) college students

Is the residence situation of a college student (on-campus, off-campus with parents, off-campus without parents) related to how much alcohol the student consumes in a typical week? Answer each of the following questions. Identify the observational units. a) volume of alcohol consumed b) college students c) place of residence d) where alcohol is consumed

This is a one-sided test, because we wish to test whether or not the average IQ of this group is greater than 100. So there is a clear direction in the research question.

Is this a one-sided or a two-sided test? Explain how you can tell.

true

It has been reported that the probability that a new business closes or changes owners within its first 3 years is about 0.6. Saying, "About 60% of all new businesses close or change owners within the first three years," would be a correct interpretation of the probability in this case. true false

b) In a very large number of couples with four children, roughly 37.5% of the couples will have 2 boys and 2 girls (assuming each birth is equally likely to be a boy or a girl).

It turns out that the probability is 0.375 that a couple with four children would have two boys and two girls. Which is the best interpretation of what this probability means? a) In 1000 couples with 4 children, you would expect 375 couples to have 2 boys and 2 girls. b) In a very large number of couples with four children, roughly 37.5% of the couples will have 2 boys and 2 girls (assuming each birth is equally likely to be a boy or a girl). c) Both A and B are correct statements. d) The probability cannot be 0.375 because that would be a fractional number of couples which cannot happen.

a) Team A has a higher chance of winning the best-of-three series than the 2/3 chance of winning any one game. b) If teams A and B repeatedly play a best-of-three series, then in the long run team A will win 74.1% of those.

It turns out that the probability is 0.741 that team A would win this best-of-three series against team B. What does this probability mean? Select all that apply. a) Team A has a higher chance of winning the best-of-three series than the 2/3 chance of winning any one game. b) If teams A and B repeatedly play a best-of-three series, then in the long run team A will win 74.1% of those. c) All of these statements are correct. d) If teams A and B play a best-of-three series 1000 times, then team A will win 741 of those series. e) None of these statements are correct.

false

It would be correct to say that if team A plays team B for 3 games, A is guaranteed to win exactly twice. true false

false

It would be correct to say that if team A plays team B for 30 games, A is guaranteed to win exactly twenty games. true false

true

Larger random samples are always better than smaller random samples. True False

False

Larger samples are always better than smaller samples, regardless of how the sample was collected. True False

The corresponding p-value will be smaller than 0.01.

Let π denote some population proportion of interest and suppose a 99% confidence interval for π is calculated to be (0.60, 0.70). Also, suppose that we want to test H0: π = 0.74 vs. Ha: π ≠ 0.74. What can you say about the corresponding p-value? The corresponding p-value will be smaller than 0.01. I can't say anything about the corresponding p-value until I run the test. The corresponding p-value will be larger than 0.05. The corresponding p-value will be smaller than 0.05 but larger than 0.01.

"If you took another sample of 1467 adult Americans, there is a 95% chance that its sample mean would fall within this interval" is incorrect because we are not trying to capture sample means - we are trying to capture the population mean.

Look back at the previous question and explain why alternative B is incorrect.

a) Let rolls 1 and 2 represent team B winning a game and 3-6 represent team A. Roll the die and record who wins the game until one team has won two games (two or three times). Repeat the simulation a large number of times (say 1000) and record how often team A wins divided by the number of repetitions.

Of the methods listed below, select which would be the best use of a six-sided die to approximate the probability that team A would win the best-of-three series against team B. a) Let rolls 1 and 2 represent team B winning a game and 3-6 represent team A. Roll the die and record who wins the game until one team has won two games (two or three times). Repeat the simulation a large number of times (say 1000) and record how often team A wins divided by the number of repetitions. b) Let rolls 1 and 2 represent team A winning a game and 3-6 represent team B. Roll the die and record who wins the game until one team has won two games (two or three times). Repeat the simulation for 100 times and record how often team A wins divided by the number of repetitions. c) There is not a preferred method of the three listed. d) Let rolls 1, 2, and 3 represent team A winning a game and 4, 5, and 6 represent team B winning a game. Roll the die and record who wins the game until one team has won two games (two or three times). Repeat the simulation 50 times and record how often team A wins divided by the number of repetitions.
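Option a) can be sketched as follows, assuming a seeded pseudo-random generator plays the role of the six-sided die. The function name and seed are illustrative; the exact value is included only as a check on the simulation.

```python
import random

def series_win_rate(reps=1000, seed=217):
    """Simulate best-of-three series with a die: 1-2 means B wins a game, 3-6 means A wins."""
    rng = random.Random(seed)  # arbitrary seed, only for reproducibility
    a_series_wins = 0
    for _ in range(reps):
        a_wins = b_wins = 0
        while a_wins < 2 and b_wins < 2:   # stop once a team has won two games
            if rng.randint(1, 6) >= 3:
                a_wins += 1
            else:
                b_wins += 1
        a_series_wins += (a_wins == 2)
    return a_series_wins / reps

p = 2 / 3
# A wins the first two games, or wins the series 2-1 (one loss in the first two games)
exact = p**2 + 2 * p**2 * (1 - p)
simulated = series_win_rate()
print(f"simulated: {simulated:.3f}, exact: {exact:.3f}")
```

With only 50 or 100 repetitions the simulated proportion would bounce around much more, which is why the option with 1000 repetitions is preferred.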

d) 0.12

Suppose the student finds a 95% confidence interval of 0.64 to 0.88. What is the margin of error for this confidence interval? a) 0.76 b) 0.38 c) 0.24 d) 0.12 e) 0.05
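The arithmetic behind this answer, as a minimal sketch (the helper name is hypothetical): the margin of error is half the width of the interval, and the midpoint is the sample statistic.

```python
def midpoint_and_margin(lower, upper):
    """Return (midpoint, margin of error) for a symmetric confidence interval."""
    return (lower + upper) / 2, (upper - lower) / 2

midpoint, margin = midpoint_and_margin(0.64, 0.88)
print(f"{midpoint:.2f} ± {margin:.2f}")
```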

a) If you repeatedly select an adult American at random a large number of times, in the long run, roughly 30% of the time the selected adult will vote to get rid of the penny.

Pennies can be a nuisance. Suppose 30% of the population of adult Americans want to get rid of the penny. If I randomly select one person from this population, the probability this person wants to get rid of the penny is 0.30. a) If you repeatedly select an adult American at random a large number of times, in the long run, roughly 30% of the time the selected adult will vote to get rid of the penny. b) If you ask 10 people if they are in favor of getting rid of the penny, 3 of them will say yes. c) There are 30 out of every 100 people who would agree that they would vote to get rid of the penny. d) All of these are correct.

it has no effect on midpoint

Recall that the form of a theory-based confidence interval for a population proportion π is p^ ± multiplier × √(p^(1 − p^) / n). Increasing the confidence level has what effect, if any, on the midpoint of this interval?

statistic: t = (101.6 − 100) / (8.9 / √25) = 0.90

Regardless of your answer to the previous question, calculate the test statistic and use technology to find the theory based p value
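One way to reproduce the test statistic and approximate the one-sided theory-based p-value without a statistics package. This is a sketch that integrates the t density numerically with Simpson's rule; the helper names and the number of integration steps are assumptions, and course technology such as an applet may report a slightly different rounding.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3   # half the area lies above 0 by symmetry

xbar, mu0, s, n = 101.6, 100, 8.9, 25   # values from the UFO study question
t_stat = (xbar - mu0) / (s / sqrt(n))
p_value = upper_tail(t_stat, df=n - 1)
print(f"t = {t_stat:.2f}, one-sided p-value = {p_value:.3f}")
```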

-3.47 (100 out of 400; 25%), -3.80 (20 out of 120; 16.7%), -4.17 (65 out of 300; 21.7%)

Rock-paper-scissors: Have you ever played rock-paper-scissors (or Rochambeau)? It's considered a "fair game" in that the two players are equally likely to win (like a coin toss). Both players simultaneously display one of three hand gestures (rock, paper, or scissors), and the objective is to display a gesture that defeats that of your opponent. The main gist is that rocks break scissors, scissors cut paper, and paper covers rock. Suppose that you play the game with three different friends separately with the following results: Friend A chose scissors 100 times out of 400 games, Friend B chose scissors 20 times out of 120 games, and Friend C chose scissors 65 times out of 300 games. Suppose that for each friend you want to test whether the long-run proportion that the friend will pick scissors is less than 1/3. Select the appropriate standardized statistics for each friend from the null distribution produced by the applet. -3.80 (100 out of 400; 25%), -3.47 (20 out of 120; 16.7%), -4.17 (65 out of 300; 21.7%) -3.47 (100 out of 400; 25%), -3.80 (20 out of 120; 16.7%), -4.17 (65 out of 300; 21.7%) -3.47 (100 out of 400; 25%), -4.17 (20 out of 120; 16.7%), -3.80 (65 out of 300; 21.7%) -4.17 (100 out of 400; 25%), -3.80 (20 out of 120; 16.7%), -3.47 (65 out of 300; 21.7%)
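A sketch of the standardized statistic for each friend, assuming the theory-based standard deviation √(π0(1 − π0)/n) in place of the applet's simulated one. The values in the answer come from a simulated null distribution, so they presumably differ slightly from these, but the ordering of the three friends is the same.

```python
from math import sqrt

def standardized_statistic(successes, n, pi0):
    """z = (p-hat - pi0) / sqrt(pi0 * (1 - pi0) / n), the theory-based version."""
    p_hat = successes / n
    return (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)

# (times scissors was chosen, number of games) for each friend
friends = {"A": (100, 400), "B": (20, 120), "C": (65, 300)}
z = {name: standardized_statistic(x, n, 1 / 3) for name, (x, n) in friends.items()}

for name, value in z.items():
    print(f"Friend {name}: z = {value:.2f}")
```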

a) Both types of referees saw most games take between roughly 180 and 210 minutes (3-3.5 hours) b) Neither distribution of game durations is symmetric, because both have a few games that took a very long time compared to the others. d) Games with replacement referees displayed more variability in game durations, compared to slightly greater consistency in game durations for regular referees. e) Games with replacement referees tended to take a bit longer than those with regular referees, roughly 10 minutes longer on average.

Select the statements below which best describe the distributions of game durations between the two types of referees. Be sure to consider shape, center, variability, and unusual observations. a) Both types of referees saw most games take between roughly 180 and 210 minutes (3-3.5 hours) b) Neither distribution of game durations is symmetric, because both have a few games that took a very long time compared to the others. c) The two distributions show very little overlap. d) Games with replacement referees displayed more variability in game durations, compared to slightly greater consistency in game durations for regular referees. e) Games with replacement referees tended to take a bit longer than those with regular referees, roughly 10 minutes longer on average. f) Both distributions of game durations are symmetric, because both have a few games that took a very long time compared to the others.

Null hypothesis: the probability of the racquet landing up is 0.50 (Ho: π = 0.5) Alternative hypothesis: the probability of the racquet landing up is not 0.50 (Ha: π ≠ 0.5) We are using a two-sided alternative hypothesis because there is no clear "direction" in the research question. We wouldn't want to use the method of spinning the racquet to see who serves first if either the racquet lands up more than half the time or if it lands up less than half the time.

State the null and alternative hypotheses corresponding to this research question.

The 95% interval will not contain 0.25, but the 99% interval will contain 0.25.

Suppose I am conducting a test of significance where the null hypothesis is my cat Grayce will pick the correct cancer specimen 25% of the time and the alternative hypothesis is that she will pick the cancer specimen at a rate different than 25%. I end up with a p-value of 0.02. I also construct 95% and 99% confidence intervals from my data. What will be true about my confidence intervals? The 95% interval will contain 0.25, but the 99% interval will not contain 0.25. Neither the 95% nor the 99% intervals will contain 0.25. The 95% interval will not contain 0.25, but the 99% interval will contain 0.25. Both the 95% and the 99% intervals will contain 0.25.

0.38 ± 0.11

Suppose a 95% confidence interval for a population proportion is (0.27, 0.49). Rewrite this interval in the form of p^ ± margin of error.

true

Suppose a polling organization takes a random sample of 100 people from the population of adults in a city, (where 30% of this population wants to get rid of the penny). Then the probability is 0.015 that the sample proportion who want to get rid of the penny is less than 0.20. Stating, "If you repeatedly select a sample of 100 adults from this city and record the proportion that want to get rid of the penny for each sample, in the long run roughly 1.5% of these samples will have at most 20% of the sample wanting to get rid of the penny," would explain what it means to say "the probability of ..." while describing the random process that is repeated over and over again. true false
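The 0.015 figure can be checked with a normal approximation to the sampling distribution of the sample proportion. This is a sketch under that assumption; the exact binomial value would differ slightly.

```python
from math import sqrt
from statistics import NormalDist

pi, n = 0.30, 100
sd_p_hat = sqrt(pi * (1 - pi) / n)   # SD of the sample proportion
# P(p-hat < 0.20) under the normal approximation
prob = NormalDist(mu=pi, sigma=sd_p_hat).cdf(0.20)
print(f"P(p-hat < 0.20) is about {prob:.3f}")
```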

a) Sandy - temperatures in San Diego tend to be higher than in New York

Suppose that Nellie records the highest temperature every day for one year in New York City, while Sandy does the same in San Diego. Who would you expect to have the larger mean of these temperatures? a) Sandy - temperatures in San Diego tend to be higher than in New York b) They would be the same because they are both on the coasts c) can't be determined d) Nellie - New York has high temperatures similar to San Diego, just not as many

reject

Suppose that you perform a significance test and, based on the p-value, decide to reject the null hypothesis at the α = 0.05 significance level. Then suppose that your colleague decides to conduct the same test on the same data but using α = 0.065 as the significance level. For the new significance level, would you reject the null hypothesis, fail to reject the null hypothesis, or would you not have enough information to say? (Hint: First ask yourself what must be true about the p-value based on your decision to reject the null hypothesis at the α = 0.05 significance level.)

reject

Suppose that you perform a significance test and, based on the p-value, decide to reject the null hypothesis at the α = 0.05 significance level. Then suppose that your colleague decides to conduct the same test on the same data but using α = 0.10 as the significance level. For the new significance level, would you reject the null hypothesis, fail to reject the null hypothesis, or would you not have enough information to say? (Hint: First ask yourself what must be true about the p-value based on your decision to reject the null hypothesis at the α = 0.05 significance level.)

b) Let heads represent a boy and tails a girl. Flip the coin four times and record the number of boys (heads) in those four children (tosses). Repeat this 1,000 times, and look at what proportion of those 1,000 repetitions results in 2 boys and 2 girls. This is the probability that a couple will have four children with 2 boys and 2 girls.

Suppose that a birth is equally likely to be a boy or a girl, and the outcome of one birth does not change this probability for future births. Select the best simulation below for how a coin could be used to approximate the probability that a couple with four children would have two boys and two girls. a) Let heads represent a boy and tails a girl. Flip the coin four times and record the number of boys (heads) in those four children (tosses). Divide the number of outcomes with 2 boys and 2 girls by 4. This is the probability that a couple will have four children with 2 boys and 2 girls. b) Let heads represent a boy and tails a girl. Flip the coin four times and record the number of boys (heads) in those four children (tosses). Repeat this 1,000 times, and look at what proportion of those 1,000 repetitions results in 2 boys and 2 girls. This is the probability that a couple will have four children with 2 boys and 2 girls. c) Let heads represent a boy and tails a girl. Flip the coin four times and record the number of boys (heads) in those four children (tosses). Repeat this 10 times, and look at what proportion of those 10 repetitions results in 2 boys and 2 girls. This is the probability that a couple will have four children with 2 boys and 2 girls. d) There is no best way to calculate the probability of having 2 boys and 2 girls.
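Option b) as a minimal Python sketch, assuming a seeded pseudo-random generator stands in for the coin. The function name and seed are illustrative; the exact value (6 of the 16 equally likely boy/girl sequences) is included for comparison.

```python
import random
from math import comb

def two_boys_two_girls_rate(reps=1000, seed=217):
    """Proportion of simulated four-child families with exactly 2 boys."""
    rng = random.Random(seed)  # arbitrary seed, only for reproducibility
    hits = 0
    for _ in range(reps):
        boys = sum(rng.random() < 0.5 for _ in range(4))  # four fair coin flips
        hits += (boys == 2)
    return hits / reps

simulated = two_boys_two_girls_rate()
exact = comb(4, 2) / 2**4
print(f"simulated: {simulated:.3f}, exact: {exact:.3f}")
```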

b) If these two teams play each other many, many times under identical conditions, team A will win 2/3 of the games in the long run.

Suppose that baseball team A is better than baseball team B. Team A is better by enough that it has a 2/3 probability of beating team B in any one game, and this probability remains the same for each game, regardless of the outcomes of previous games. Explain what it means to say that team A has a 2/3 probability of beating team B in any one game. a) If these two teams play each other 99 times, team A will win 66 of the 99 games. b) If these two teams play each other many, many times under identical conditions, team A will win 2/3 of the games in the long run. c) All answers are equivalent. d) If these two teams play each other 3 times, team A will win 2 of the three games.

Blood Type: categorical; Waiting time: quantitative; Mode of arrival (ambulance, personal car, on foot, other): categorical; Whether or not men have to wait longer than women: this is a research question/comparison, not a variable posed to visitors (the variables asked of the visitors would be the length of wait and gender); Number of patients who arrive before noon: this is a summary of the data, not a variable posed to individual visitors (the variable would be whether or not the patient arrived before noon); Whether or not the patient is insured: categorical (binary); Number of stitches required: quantitative; Whether or not stitches are required: categorical (binary); Number of patients who are insured: this is a summary of the data, not a variable posed to individual visitors; Assigned room number: categorical

Suppose that the observational units in a study are the patients arriving at an emergency room in a given day. For each of the following, indicate whether it can legitimately be considered a variable or not. If it is a variable, classify it as categorical (and if it is binary) or quantitative. If it is not a variable, explain why not. Blood Type: Waiting time: Mode of arrival (ambulance, personal car, on foot, other): Whether or not men have to wait longer than women: Number of patients who arrive before noon: Whether or not the patient is insured: Number of stitches required: Whether or not stitches are required: Number of patients who are insured: Assigned room number:

a) In many, many rooms each containing 50 people, 97% of the rooms will have at least two people with the same birthday.

Suppose that there are 50 people in a room. It can be shown (under certain assumptions) that the probability is approximately 0.97 that at least two people in the room have the same birthday (month and date, not necessarily year). Select which statement would best explain what 0.97 means to a person who has never studied probability or statistics. a) In many, many rooms each containing 50 people, 97% of the rooms will have at least two people with the same birthday. b) If you have 100 rooms with 50 people, then 97 of the rooms should have at least two people with the same birthday. c) Each of these statements are equivalent statements about probability. d) Neither of these statements correctly explain what the probability means.
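Where the 0.97 comes from, as a short sketch. It assumes 365 equally likely birthdays and independent people (the "certain assumptions" mentioned in the question); the function name is illustrative.

```python
def shared_birthday_probability(people, days=365):
    """P(at least two of `people` share a birthday), assuming equally likely days."""
    p_all_distinct = 1.0
    for k in range(people):
        # The (k+1)-th person must avoid the k birthdays already taken
        p_all_distinct *= (days - k) / days
    return 1 - p_all_distinct

prob_50 = shared_birthday_probability(50)
print(f"{prob_50:.3f}")
```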

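not enough information to say (rejecting at α = 0.05 only tells us the p-value is at most 0.05; it may or may not be at most 0.0001)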

Suppose that you perform a significance test and, based on the p-value, decide to reject the null hypothesis at the α = 0.05 significance level. Then suppose that your colleague decides to conduct the same test on the same data but using α = 0.0001 as the significance level. For the new significance level, would you reject the null hypothesis, fail to reject the null hypothesis, or would you not have enough information to say? (Hint: First ask yourself what must be true about the p-value based on your decision to reject the null hypothesis at the α = 0.05 significance level.)

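not enough information to say (rejecting at α = 0.05 only tells us the p-value is at most 0.05; it may or may not be at most 0.01)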

Suppose that you perform a significance test and, based on the p-value, decide to reject the null hypothesis at the α = 0.05 significance level. Then suppose that your colleague decides to conduct the same test on the same data but using α = 0.01 as the significance level. For the new significance level, would you reject the null hypothesis, fail to reject the null hypothesis, or would you not have enough information to say? (Hint: First ask yourself what must be true about the p-value based on your decision to reject the null hypothesis at the α = 0.05 significance level.)

p values greater than 0.05

Suppose that you perform a significance test using the α = 0.05 significance level. For what p-values would you fail to reject the null hypothesis?

p values less than or equal to 0.05

Suppose that you perform a significance test using the α = 0.05 significance level. For what p-values would you reject the null hypothesis?

will be smaller

Suppose you are testing the hypothesis H0: π=0.50 versus Ha : π>0.50. You get a sample proportion of 0.54 and find that your p-value is 0.08. Now suppose you redid your study with each of the following changes. You increase the sample size and still find a sample proportion of 0.54. How will the new p-value compare to the p-value of 0.08 you first obtained? will be larger will double will be smaller won't change
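A sketch of why the p-value shrinks when the sample size grows and the sample proportion stays at 0.54. The original sample size is not given in the question, so the sizes below are hypothetical, and the normal approximation is assumed.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_p_value(p_hat, n, pi0=0.5):
    """Normal-approximation p-value for Ha: pi > pi0."""
    z = (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)
    return 1 - NormalDist().cdf(z)

sample_sizes = [300, 600, 1200]   # hypothetical sizes, chosen only for illustration
p_values = [one_sided_p_value(0.54, n) for n in sample_sizes]

for n, p in zip(sample_sizes, p_values):
    print(f"n = {n}: p-value = {p:.4f}")
```

A larger n shrinks the standard deviation of the null distribution, so the same 0.54 sits further out in the tail.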

each of the 100 spins

Tennis players often spin a racquet to decide who serves first. The spun racquet can land with the manufacturer's label facing up or down. A reasonable question to investigate is whether a spun tennis racquet is equally likely to land with the label facing up or down. (If the spun racquet is equally likely to land with the label facing in either direction, we say that the spinning process is fair). Suppose that you gather data by spinning your tennis racquet 100 times, each time recording whether it lands with the label facing up or down. In this scenario, --- are the observational units. - the tennis racquet - each of the 100 spins - which way the label lands - the probability that the label lands facing up

A bar graph showing the proportion that landed up (success) and the proportion that landed down (failure): phat: 46/100 = 0.46 sample proportion of spins that landed with the logo facing up

Tennis players often spin a tennis racquet and observe whether it lands with the logo facing up or down, to determine who serves first. But is this really a 50-50 process, equally likely to land with the logo facing up or down? To investigate this, a tennis player spun his racquet 100 times, and he obtained 46 "up" and 54 "down" results. Does this provide much evidence against believing that spinning the racquet is really a fair 50-50 process? Produce a graph of these sample results. Identify the statistic and use an appropriate symbol to represent it.

The observational units are the adult Americans who were interviewed by the GSS. The variable is the number of close friends that the adult American has. This variable is quantitative.

The 2004 General Social Survey (GSS) interviewed a random sample of adult Americans. For one question the interviewer asked: "From time to time, most people discuss important matters with other people. Looking back over the last six months-who are the people with whom you discussed matters important to you? Just tell me their first names or initials." The interviewer then recorded how many names or initials the respondent mentioned. Results are tallied in the following table. The mean was 1.987 friends and the standard deviation was 1.7708 friends. Identify the observational units and variable in the study. Is the variable categorical or quantitative?

b) proportions

The Gallup Organization frequently polls adult Americans to determine whether they approve of the job being done by President Obama. a) means b) proportions

a) There is a bimodal (two clusters) distribution of predicted high temperatures in San Luis Obispo County on July 8, 2012. c) The center of the overall distribution is between 70 and 75 degrees. d) Many predictions fall between 63 and 73 degrees, with another cluster of predictions between 85 and 96 degrees.

The July 8, 2012 edition of the San Luis Obispo Tribune listed predicted high temperatures (in degrees Fahrenheit) for locations throughout San Luis Obispo (SLO, California) on that date. The following dotplot displays the distribution of these temperatures. Select all the statements below which describe the distribution of these predicted high temperatures. Be sure to select statements about shape, center, variability, and unusual observations. a) There is a bimodal (two clusters) distribution of predicted high temperatures in San Luis Obispo County on July 8, 2012. b) The distribution of predicted high temperatures in San Luis Obispo County on July 8, 2012 has a lot of single value predictions which should be disregarded because they are so far from the center. c) The center of the overall distribution is between 70 and 75 degrees. d) Many predictions fall between 63 and 73 degrees, with another cluster of predictions between 85 and 96 degrees.

a) whether there was a death on the shift and whether Kristin worked the shift

The categorical variable(s) in this scenario is/are a) whether there was a death on the shift and whether Kristin worked the shift b) whether there was a death on the shift c) whether Kristin worked the shift d) none are correct

true

The variables in this scenario are whether or not there was at least one death on a shift and if Kristin Gilbert had worked the shift. true false

c) If you repeatedly play the lottery a very large number of times, in the long run, you will win 0.1% of the times you play.

The probability of winning at a "daily number" lottery game is 1/1000. a) If 1000 people play the lottery, exactly one of those playing will win the lottery. b) Each time the lottery is played, one of 1000 people playing will win. c) If you repeatedly play the lottery a very large number of times, in the long run, you will win 0.1% of the times you play. d) All of these are correct interpretations of probability.

true

The variables in this scenario are the exam scores for each student and the color of paper on which the student took the exam. true false

a) Manny's; a small sample size will result in more variability and hence a wider interval.

To estimate the proportion of city voters who will vote for the Republican candidate in the election, two students, Manny and Nina, each decide to conduct polls in the city. Manny selects a random sample of 50 voters, while Nina selects a random sample of 100 voters. Suppose both samples result in 48% of the voters saying they will vote for the Republican candidate. Whose 95% confidence interval will have the larger margin of error: Manny's or Nina's? How are you deciding? a) Manny's; a small sample size will result in more variability and hence a wider interval. b) Nina's; a large sample size will result in more variability and hence a wider interval. c) Neither; the sample size does not affect the margin of error.
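The comparison can be checked numerically with the theory-based margin of error. This sketch assumes the multiplier 1.96 for 95% confidence; the function name is illustrative.

```python
from math import sqrt

def margin_of_error(p_hat, n, multiplier=1.96):
    """Theory-based margin of error for a one-proportion confidence interval."""
    return multiplier * sqrt(p_hat * (1 - p_hat) / n)

manny = margin_of_error(0.48, 50)    # Manny's sample of 50 voters
nina = margin_of_error(0.48, 100)    # Nina's sample of 100 voters
print(f"Manny: ±{manny:.3f}, Nina: ±{nina:.3f}")
```

Doubling the sample size divides the margin of error by √2, not by 2.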

The horizontal axis represents the proportion of heads (equivalently, the proportion of times the logo was facing up). From this output, we see that 0.46 is not unusual under the null hypothesis that the probability of the logo facing up is 0.50. Therefore, we would not consider a result of 46 "ups" and 54 "downs" to be surprising if the racquet spinning truly was a 50/50 process. We do not have convincing evidence that this is not a 50/50 process. Conclusion: Based on the sample statistic of 0.46 we obtained a large p-value of 0.4780. This p-value gives us little evidence against the null hypothesis, so we fail to reject it. We cannot claim that the probability of a spun racquet landing up is not equal to 0.5. In other words, it is plausible that the probability of a spun racquet landing up is 0.5.

Use the One Proportion applet to conduct a simulation analysis with at least 1000 repetitions. Report the inputs you use in the applet and draw a sketch of the resulting distribution. Make sure you clearly label the distribution. Based on this distribution, do you believe the observed data provide much evidence against believing that spinning the racquet is really a fair 50-50 process? Explain your reasoning.
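For readers without the applet, a rough Python stand-in for the simulation, assuming n = 100 spins, a null probability of 0.5, and 1000 repetitions. The seed is arbitrary, and an exact two-sided binomial p-value is computed alongside for comparison. A simulated p-value varies from run to run, which is why an applet can report a value such as 0.4780.

```python
import random
from math import comb

n, observed = 100, 46
distance = abs(observed - n / 2)   # the observed count is 4 away from the expected 50

# Exact two-sided p-value: counts at least as far from 50 as the observed 46
exact = sum(comb(n, k) for k in range(n + 1) if abs(k - n / 2) >= distance) / 2**n

# Simulation: spin a fair "racquet" 100 times, repeated 1000 times
rng = random.Random(217)   # arbitrary seed, only for reproducibility
reps = 1000
extreme = 0
for _ in range(reps):
    ups = sum(rng.random() < 0.5 for _ in range(n))
    extreme += (abs(ups - n / 2) >= distance)
simulated = extreme / reps

print(f"exact: {exact:.4f}, simulated: {simulated:.4f}")
```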

- Stronger because the statistic is the same (60%) but the sample size is much larger (100 vs. 20).

What if she had played 100 games and won 60? Would that provide stronger, weaker, or evidence of similar strength compared to 12 wins out of 20, to conclude that her long-run proportion of winning at Minesweeper is higher than 50%? Explain how you are deciding. -Weaker because the statistic is the same (50%) but the sample size is much smaller (20 vs.100). -Weaker because the statistic is the same (60%) but the sample size is much larger (100 vs. 20). - Stronger because the statistic is the same (60%) but the sample size is much larger (100 vs. 20). - Stronger because the statistic is not the same (50%) but the sample size is much larger (100 vs. 20).

- Stronger because the statistic (18/20 = 90%) is much farther away from the null hypothesized value (50%) than before (12/20 = 60%).

What if she had played 20 games and won 18? Would that provide stronger, weaker, or evidence of similar strength compared to 12 wins out of 20, to conclude that her long-run proportion of winning at Minesweeper is higher than 50%? Explain how you are deciding. - Weaker because the statistic (18/20 = 90%) is much farther away from the null hypothesized value (50%) than before (12/20 = 60%). - Stronger because the statistic (18/20 = 90%) is much farther away from the null hypothesized value (50%) than before (12/20 = 60%). - Weaker because the statistic (18/20 = 90%) is much closer to the null hypothesized value (50%) than before (12/20 = 60%). - Stronger because the statistic (18/20 = 90%) is much closer to the null hypothesized value (50%) than before (12/20 = 60%).

- No, 40% is less than the null hypothesis value of 50%, so this is not evidence that the long-run proportion of wins is more than 50%.

What if she had played 30 games and won 12? Would that provide evidence that her long-run proportion of winning at Minesweeper is higher than 50%? Explain how you are deciding. - No, 40% is less than the null hypothesis value of 50%, so this is not evidence that the long-run proportion of wins is more than 50% - No, 50% is more than the null hypothesis value of 40%, so this is not evidence that the long-run proportion of wins is more than 50% - Yes, 50% is more than the null hypothesis value of 40%, so this is evidence that the long-run proportion of wins is more than 50% - Yes, 40% is less than the null hypothesis value of 50%, so this is evidence that the long-run proportion of wins is more than 50%
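The four Minesweeper scenarios (12 of 20, 60 of 100, 18 of 20, 12 of 30) can be compared with exact one-sided binomial p-values under the null value of 0.50. This is a sketch that uses an exact calculation in place of the course's simulation; `upper_tail_p_value` is an illustrative name.

```python
from math import comb

def upper_tail_p_value(wins, games, p_null=0.5):
    """Exact P(X >= wins) when X ~ Binomial(games, p_null)."""
    return sum(
        comb(games, k) * p_null**k * (1 - p_null) ** (games - k)
        for k in range(wins, games + 1)
    )

scenarios = {
    "12 of 20": (12, 20),
    "60 of 100": (60, 100),
    "18 of 20": (18, 20),
    "12 of 30": (12, 30),
}
p_values = {label: upper_tail_p_value(w, n) for label, (w, n) in scenarios.items()}

for label, p in p_values.items():
    print(f"{label}: p-value = {p:.4f}")
```

The same 60% statistic gives a much smaller p-value with 100 games than with 20, a statistic farther from 50% (18 of 20) gives a smaller p-value still, and a statistic below 50% (12 of 30) gives a large p-value because it points in the opposite direction from the alternative.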

Replacement: 0.104 Regular: 0.256

What proportion of the 48 games officiated by replacement referees lasted for less than 3 hours (180 minutes)? What proportion of the 43 games officiated by regular referees lasted for this long? Replacement: Regular:

Friend D because this is more evidence against the null hypothesis

When both of the friends played rock the same proportion of times and one of their data provided more evidence against the null hypothesis, which friend's data yielded a larger standardized statistic? Friend E because this is more evidence against the null hypothesis Friend D because this is less evidence against the null hypothesis Friend D because this is more evidence against the null hypothesis Friend E because this is less evidence against the null hypothesis

Friend D because this is more evidence against the null hypothesis

When both of the friends played rock the same proportion of times and one of their data provided more evidence against the null hypothesis, which friend's data yielded a smaller p-value? Friend D because this is less evidence against the null hypothesis Friend D because this is more evidence against the null hypothesis Friend E because this is more evidence against the null hypothesis Friend E because this is less evidence against the null hypothesis

d) The replacement referees tended to have more variability in the game lengths.

Would you say that either type of referee tended to have more variability in game durations? If so, which type of referee tended to have more variability? a) There is no way to tell which group of referees had more variability in the game lengths. b) The two types of referees had the same amount of variability in the game lengths. c) The regular referees tended to have more variability in the game lengths. d) The replacement referees tended to have more variability in the game lengths.

Friend D because a smaller standard deviation leads to a larger standardized statistic

When both of the friends played rock the same proportion of times and one of their data provided more evidence against the null hypothesis, which friend's null distribution had a smaller standard deviation? Friend D because a smaller standard deviation leads to a smaller standardized statistic Friend D because a smaller standard deviation leads to a larger standardized statistic Friend E because a larger standard deviation leads to a larger standardized statistic Friend E because a larger standard deviation leads to a smaller standardized statistic

Always about the parameter only.

When stating null and alternative hypotheses, the hypotheses are: Always about the parameter only. Always about both the statistic and the parameter. Sometimes about the statistic and sometimes about the parameter. Always about the statistic only.

step 4

When the researchers infer that self-reported anxiety levels are higher among novice skydivers, it is an example of which of the six steps of statistical investigation? step 3 step 4 step 5 step 6

c) research question

Whether male students are more likely to have a tattoo than female students would be considered which of the following? a) observational unit b) quantitative variable c) research question d) categorical variable

d) categorical variable

Whether or not the student has a tattoo would be considered which of the following? a) observational unit b) quantitative variable c) research question d) categorical variable

c) research question

Whether students majoring in Liberal Arts tend to have more tattoos than students majoring in Engineering would be considered what? a) observational unit b) quantitative variable c) research question d) categorical variable

-students' hair color -students with tablets vs laptops -students' residence situation

Which of the following are categorical variables that could be recorded on these students? -students' height -students' hair color -students with tablets vs laptops -students' residence situation -students' age

b) Let rolls 1 and 2 represent team B winning a game and 3-6 represent team A. Roll the die and record who wins the game until one team has won two games (two or three times).

Which of the following describes how a six-sided die could be used to simulate one repetition of a best-of-three series between teams A and B? a) Let rolls 1, 2, and 3 represent team A winning a game and 4, 5, and 6 represent team B winning a game. Roll the die and record who wins the game until one team has won two games (two or three times). b) Let rolls 1 and 2 represent team B winning a game and 3-6 represent team A. Roll the die and record who wins the game until one team has won two games (two or three times). c) Let rolls 1 and 2 represent team A winning a game and 3-6 represent team B. Roll the die and record who wins the game until one team has won two games (two or three times). d) There is not a preferred method of the three listed.
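The die procedure in option b) can be sketched as a short simulation. The mapping (rolls 1 and 2 are a win for team B, rolls 3 to 6 a win for team A) comes from that option; the function name, the seed, and the number of repetitions are illustrative choices.

```python
import random

def simulate_series(roll_die):
    """Play one best-of-three series; roll_die() must return an integer from 1 to 6."""
    wins = {"A": 0, "B": 0}
    # Keep playing until one team has two wins (two or three games).
    while max(wins.values()) < 2:
        roll = roll_die()
        winner = "B" if roll in (1, 2) else "A"
        wins[winner] += 1
    return "A" if wins["A"] == 2 else "B"

rng = random.Random(0)  # fixed seed so the run is repeatable
results = [simulate_series(lambda: rng.randint(1, 6)) for _ in range(10_000)]
share_a = results.count("A") / len(results)
print(f"Team A won {share_a:.1%} of the simulated series")
```

Each call to `simulate_series` is one repetition; repeating it many times estimates how often team A wins the series under this mapping.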

observational unit

Which of the following do the 24 skydivers represent?

c. You can be 95% confident that the mean number of close friends in the population of adult Americans is between the endpoints of this interval.

Which of the following is a reasonable interpretation of this confidence interval and its confidence level? a. Ninety five percent of all adult Americans in this sample reported a number of close friends within this interval. b. If you took another sample of 1467 adult Americans, there is a 95% chance that its sample mean would fall within this interval. c. You can be 95% confident that the mean number of close friends in the population of adult Americans is between the endpoints of this interval. d. This interval captures the number of close friends for 95% of the people in the population of adult Americans.

a) To produce similar (experimental) groups so any differences in the response variable can be attributed to the explanatory variable

Which of the following is the primary purpose of randomly assigning subjects to treatments in an experiment? a) To produce similar (experimental) groups so any differences in the response variable can be attributed to the explanatory variable b) To give each subject a 50-50 chance of obtaining a successful outcome c) To produce a representative sample so results can be generalized to a larger population d) To simulate what would happen in the long run Both a) and c)

d) Nellie, because New York temperatures vary more throughout the year

Who would you expect to have the larger standard deviation of these temperatures? a) can't be determined b) Sandy, because San Diego temperatures do not vary as much as New York's c) same d) Nellie, because New York temperatures vary more throughout the year

will double

With your original sample, you decided to test a two sided alternative instead of Ha: π>0.50. How will the new p-value compare to the p-value of 0.08 you first obtained? will be smaller will double won't change will be larger

b) Replacement Referees have longer game times with games about 195 minutes on average.

Would you say that either type of referee tended to have longer games than the other on average, and if so, which type of referee tended to have longer games and by about how much on average? a) Regular Referees have longer game times with games about 185 minutes on average. b) Replacement Referees have longer game times with games about 195 minutes on average. c) It is too hard to tell from the graph which type of referee has longer games. d) The game lengths for both referees seem to be the same.

The p-value of 0.1888 is the probability of observing a sample mean of 101.6 or larger values, if the average IQ in the population is really 100 (null hypothesis is true).

Write a sentence interpreting the p-value in the context of this sample and these hypotheses.

Ho: μ = 2.75 Ha: μ > 2.75

Write the appropriate null and alternative hypotheses using symbols.

b) quantitative variable

Consider the students in your class as the observational units in a statistical study about tattoos. How many tattoos the student has would be classified as which of the following? a) observational unit b) quantitative variable c) research question d) categorical variable

sex, washed hands, and location

In August of 2005, researchers for the American Society for Microbiology and the Soap and Detergent Association monitored the behavior of more than 6,300 users of public restrooms. They observed people in public venues such as Turner Field in Atlanta and Grand Central Station in New York City. For each person they kept track of the person's sex and whether or not the person washed his or her hands along with the person's location. Which of the following are the variables recorded on each observational unit?

a) true

Are newborns from couples where both parents smoke less likely to be boys than newborns from couples where neither parent smokes? Answer each of the following questions. In the question "Are newborns from couples where both parents smoke less likely to be boys than newborns from couples where neither parent smokes?" the observational units are newborns. a) true b) false

Part of the validity conditions are met since the sample size is larger than 20 (n = 70). However, we do NOT have the distribution of the sample so we do not know if it is skewed or not. If the sample distribution is not heavily skewed then the validity conditions are met.

Are the validity conditions met if you want to conduct an analysis based on the theory-based approach (one-sample t-test)? Explain. Make sure your answer follows the format below. In other words, copy, paste and fill out with your answers: Answer (yes or no): Explanation:

a) We should never use the word prove.

In the Wrap-up videos I created about Activity 2.2_3.1, I made specific comments related to the word "prove". a) We should never use the word prove. b) We can only use the word prove when the results are statistically significant.

b) There is convincing evidence that less than half of SLO residents dine at restaurants at least once a week, because the interval does not contain 0.50.

According to the 95% confidence interval: a) There is not convincing evidence that less than half of SLO residents dine at restaurants at least once a week, because the interval does not contain 0.50. b) There is convincing evidence that less than half of SLO residents dine at restaurants at least once a week, because the interval does not contain 0.50. c) Neither of the other choices is correct.

b) 47 students

An article in a 2006 issue of the Journal of Behavioral Decision Making reports on a study involving 47 undergraduate students at Harvard. All of the participants were given $50, but some (chosen at random) were told that this was a "tuition rebate," while the others were told that this was "bonus income." After one week, the students were contacted again and asked how much of the $50 they had spent and how much they had saved. Researchers wanted to know whether those receiving the "rebate" would tend to save more money than those receiving the "bonus." What are the observational units? a) one week b) 47 students c) $50 d) amount of bonus

b) Correct

An instructor selected a random sample of students in her school and asked them how many hours per week they expected to spend studying for their courses outside of class. The sample average was 7.00 hours and the sample standard deviation was 4.19 hours. The data were not strongly skewed. The theory-based 95% confidence interval for the parameter of interest was 5.6030 and 8.3970. Judge the correctness of the following interpretation: We are 95% confident that the average hours/week all students in this school will spend studying is between 5.6030 and 8.3970. a) Not correct b) Correct

a) correct

In the Wrap-up videos I created about Activity 2.2_3.1, I talked about the interpretation of a confidence interval. Judge the correctness of the following statement: The confidence interval gives us plausible values for the parameter we are interested in. a) correct b) not correct

a) include 0.76

As part of a class project, a student conducts a survey of 50 students at her school asking whether the students owned a smartphone. Thirty-eight of the 50 students (76%) reported owning a smartphone. A 95% confidence interval for the proportion of all students at her school that own a smart phone will a) include 0.76 b) not include 0.76 c) not enough information to tell

d) The confidence interval is estimating the proportion of all American adults that thought math was the most valuable subject they studied in school.

Based on an August 2013 Gallup poll, a 95% confidence interval for the proportion of American adults that thought math was the most valuable subject they studied in school is 0.31 to 0.37. Explain exactly what the confidence interval is estimating. a) The confidence interval is estimating the proportion of all respondents to the poll that thought math was the most valuable subject they studied in school. b) The confidence interval is estimating the mean of all American adults that thought math was the most valuable subject they studied in school. c) The confidence interval is estimating the mean of all respondents to the poll that thought math was the most valuable subject they studied in school. d) The confidence interval is estimating the proportion of all American adults that thought math was the most valuable subject they studied in school.

a) increase sample size b) increase significance level

In the Wrap-up videos I created about Activity 2.2_3.1, I talked about two ways we can improve the power of a test. What are the two ways? Check all that apply. a) increase sample size b) increase significance level c) increase type 2 error rate

a) SDnull = 0.009 and z = 9.3

In the Wrap-up videos I created about Activity 2.2_3.1, I talked about calculating the z-statistic using the results of the simulation. According to what I say in the video, what is the value of the standard deviation of the null and the standardized statistic using the simulation approach? a) SDnull = 0.009 and z = 9.3 b) SDnull = 0.0094 and z = 8.9

The average number of hours per day that all Cal Poly students spend watching TV (μ).

HOW MUCH TV DO YOU WATCH? According to a 2011 report by the U.S. Department of Labor, civilian Americans spend 2.75 hours per day watching television. A faculty researcher, Dr. Sameer, at California Polytechnic State University (Cal Poly) conducts a study to see whether a greater average of hours applies to Cal Poly students. Define the parameter of interest in context and assign an appropriate symbol to denote it.

a) True

Consider the scenario: "Subjects listened to 10 seconds of the Jackson 5's song "ABC" and then were asked how long they thought the song snippet lasted. Do people tend to overestimate the song length?" In this scenario, the subjects are the observational units. a) true b) false

true

Consider the scenario: "There are many different types of diets, but do some work better than others? Is low-fat better than low-carb, or is some combination best? Researchers conducted a study involving three popular diets: Atkins (very low carb), Zone (40:30:30 ratio of carbs, protein, fat), and Ornish (low fat). They randomly assigned overweight women to one of the three diets. The 232 women who volunteered for the program were educated on their assigned diet and were observed periodically as they stayed on the diet for a year. At the end of the year, the researchers calculated the changes in body mass index for each woman and compared the results across the three diets." In this scenario, diet and change in body mass index are the variables. true false

Since this is a two-sided test, Dr. Elliot's p-value should be about twice that of Dr. Sameer's: p = 2 * 0.127 = 0.254. You can also answer this question a different way: look at the plot above and count how many dots are at or above 3 or at or below 2.5. Counting this way gives a p-value of about 0.213 (= 32/150).

DO NOT USE THE APPLET TO ANSWER THIS QUESTION. Another faculty researcher, Dr. Elliot, had hypothesized that Cal Poly students might spend a different number of hours, on average. In other words, she hypothesized that students' average number of hours is different than 2.75 hours/day watching TV. If Dr. Elliot were to use the same data as Dr. Sameer to conduct an investigation, how much would the p-value be? Explain and show your work. Ho: μ = 2.75; Ha: μ ≠ 2.75
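The two routes to the two-sided p-value in the answer can be written out as arithmetic. The numbers (one-sided p-value 0.127, 32 dots in the two tails, 150 simulated samples) are the ones reported in the answer; the variable names are illustrative.

```python
total_samples = 150

# Route 1: double the one-sided p-value (relies on a roughly symmetric null distribution).
one_sided_p = 0.127
doubled = 2 * one_sided_p

# Route 2: count the dots in both tails of the plot (at or above 3, or at or below 2.5).
both_tails_count = 32
counted = both_tails_count / total_samples

print(round(doubled, 3), round(counted, 3))  # 0.254 0.213
```

The two values differ because the null distribution of only 150 simulated samples is not perfectly symmetric, so doubling is an approximation.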

a) cal poly students

In the Wrap-up videos I created about Activity 2.1, I gave an example of generalization statement. What was the context of that example? a) cal poly students b) plastic ducks c) american citizens

a) correct

If a researcher conducted a randomized experiment, then we are NOT sure if the researcher used random sampling or not. a) correct b) not correct

a) correct

If a researcher conducted an observational study, then we are NOT sure if the researcher used random sampling or not. a) correct b) not correct

a) the p-value

In the Wrap-up videos I created about Activity 2.1, I talked about the importance of the statistic value and I say that we use the statistic value to calculate... a) the p-value b) the null hypothesis

b) 95%

In the Wrap-up videos I created about Activity 3.2, I mentioned that the 2SD method only works for a specific confidence level. For which confidence level does the 2SD method work? a) 90% b) 95% c) 99%

a) the sample size is greater than 20

In Example 3.3, the sample distribution is slightly skewed to the left. We can still calculate a valid theory-based confidence interval for the mean because a) the sample size is greater than 20 b) the population distribution is approximately normal c) the population standard deviation is unknown d) they took a random sample

a) true

In Section 1.4, several factors were listed that impact the size of a p-value. For each case below, indicate whether it significantly impacts the size of the p-value in a study. (select true if it's relevant to the p-value, false if not) how far the statistic falls from the hypothesized value of the parameter a) true b) false

b) false

In Section 1.4, several factors were listed that impact the size of a p-value. For each case below, indicate whether it significantly impacts the size of the p-value in a study. (select true if it's relevant to the p-value, false if not) the number of repetitions of a simulation (assume at least 5,000 repetitions) a) true b) false

b) false

In Section 1.4, several factors were listed that impact the size of a p-value. For each case below, indicate whether it significantly impacts the size of the p-value in a study. (select true if it's relevant to the p-value, false if not) the population size a) true b) false

a) true

In Section 1.4, several factors were listed that impact the size of a p-value. For each case below, indicate whether it significantly impacts the size of the p-value in a study. (select true if it's relevant to the p-value, false if not) the sample size a) true b) false

a) true

In Section 1.4, several factors were listed that impact the size of a p-value. For each case below, indicate whether it significantly impacts the size of the p-value in a study. (select true if it's relevant to the p-value, false if not) whether the alternative hypothesis is one sided or two sided a) true b) false

We should not use the word "away". We should say either "above" or "below".

In the Wrap-up videos I created about Activity 1.2, I am very specific about how you should interpret the z-statistic. I specifically talk about a word that should not be used. Which word is that? Which words should be used instead?

The mistake is using the simulated mean of the null distribution (0.502) instead of the hypothesized mean of the null distribution (0.5)

In the Wrap-up videos I created about Activity 1.2, I called students' attention to a specific part of the computation of the z-statistic. I usually see students making a mistake in that specific part of the calculation. What is this mistake?

a) interpretation is correct

In the Wrap-up videos I created about Activity 1.2, I explain why the p-value increases when the observed result is closer to the null hypothesis parameter values. Based on what you saw in the video, judge the correctness of the following interpretation. The red line was originally at 0.8 but it shifts to be closer to the center of the null distribution when we change the statistic to 0.6. Now we have more data in the tail of the distribution which causes the p-value to increase. a) interpretation is correct b) interpretation is not correct

a) correct

In the Wrap-up videos I created about Activity 1.2, I explained why the p-value increases when we have a smaller sample size. According to my explanation in the video, judge the correctness of the following statement: The p-value increases because there is more variability in the null distribution (larger standard deviation). With more data in the tail of the distribution the p-value will increase. a) correct b) incorrect

a) correct

In the Wrap-up videos I created about Activity 1.2, I explained why the p-value will be larger when we conduct a two-sided test, instead of a one-sided test. According to my explanation in the video, judge the correctness of the following statement: Because the null distribution is pretty symmetric, the two-sided p-value will be about 2 times greater than the one-sided p-value. a) correct b) incorrect

a) we need to conduct the simulation to see if the observed result (statistic) could have happened by chance

In the Wrap-up videos I created about Activity 1.2, I made a comment about the need for performing a simulation and not just using the observed statistic to answer the research question. Based on what I say in the video, explain why we need to conduct the simulation. a) we need to conduct the simulation to see if the observed result (statistic) could have happened by chance b) we need to conduct the simulation to find the z-statistic c) we need to conduct the simulation to see if the t-statistic will be smaller than 2

a) observed statistic

In the Wrap-up videos I created about Activity 3.2, I mentioned that when you use the 2SD method or the Theory-based method to find the confidence interval, you will always have the ____________ as the midpoint of your interval. Fill in the blank: a) observed statistic b) p value

b) not correct

In the Wrap-up videos I created about Activity 3.2, I talked about how the intervals change when we change the confidence level. Judge the correctness of the following statement. When we increase the confidence level from 95% to 99%, the midpoint of the interval increases. a) correct b) not correct

a) yes

In the Wrap-up videos I created about Activity 4.1, I talked about confounding variables. Based on what I said, is it correct to say that SEX, HEIGHT, GENE and X-VAR are examples of possible confounding variables? a) yes b) no

a) correct

In the Wrap-up videos I created about Activity 4.1, I talked about the conclusion we are allowed to make based on the study. Based on what I said, judge the correctness of the following statement: If the original study used random assignment and found evidence that patients using the elevating strategy show higher improvement on average than patients using the lowering strategy, then I know that confounding variables are balanced out between the groups and I am allowed to make a causal claim. a) correct b) not correct

b) not correct

In the Wrap-up videos I created about Activity 4.1, I talked about the difference between Random Sampling and Random Assignment. Based on what I said, judge the correctness of the following statements: If a researcher conducted an observational study, then we know for sure that random assignment happened in the study. a) correct b) not correct

sex of the child; binary

In this question "Are newborns from couples where both parents smoke less likely to be boys than newborns from couples where neither parent smokes?" --- is a categorical variable that is --- sex of the child or age of the child binary or nonbinary

do both parents smoke ; two

In this question "Are newborns from couples where both parents smoke less likely to be boys than newborns from couples where neither parent smokes?" ---- is a categorical variable that has ---- possible outcomes do both parents smoke or does one parent smoke two or more than two

a) true

In this question "Are newborns from couples where both parents smoke less likely to be boys than newborns from couples where neither parent smokes?", the variables are the sex of the baby and whether or not both parents smoke. a) true b) false

nonbinary

In this scenario, the categorical variable(s) is/are nonbinary or binary

a) the change in bmi after one year

In this scenario, the quantitative variable is a) the change in bmi after one year b) type of diet c) each woman's bmi d) the overweight women

a) means

Is there evidence that the average daily spending among US adults over the past few days is more than $75 (the average daily spending at the same time last year)? a) means b) proportions

a) true

One of the authors came across an article (USA Today, 2008) that said that on average Americans have visited 16 states in the United States. Recall that in the author's sample of 50 students the average number of states the students had visited was 9.48 and the standard deviation was 7.13. The data are not strongly skewed. The 95% confidence interval for the average number of states all students at the author's school have visited is (7.4537, 11.5063). Judge the validity of the following statement: The 95% confidence interval that you calculated provides evidence that the average number of states all students at the author's school have visited is different from 16. a) True b) False
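The reported interval can be reproduced from the summary statistics in the question (n = 50, mean 9.48, standard deviation 7.13). In this sketch the t multiplier for 49 degrees of freedom is hard-coded as roughly 2.0096; that value is an assumption supplied here from a t-table, not something stated in the question, and `t_interval` is an illustrative name.

```python
import math

def t_interval(mean, sd, n, t_multiplier):
    """Theory-based confidence interval for a mean: mean +/- t* x sd / sqrt(n)."""
    margin = t_multiplier * sd / math.sqrt(n)
    return mean - margin, mean + margin

T_STAR_DF49 = 2.009575  # assumed 95% multiplier for 49 degrees of freedom

lower, upper = t_interval(mean=9.48, sd=7.13, n=50, t_multiplier=T_STAR_DF49)
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
print("16 inside interval:", lower <= 16 <= upper)
```

Because 16 falls outside the interval, it is not a plausible value for the school's average, which supports answer a).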

a) true

Random sampling is a more important consideration than random assignment if the research question is whether faculty tend to drive older cars than students drive on your campus. a) true b) false

To find the p-value, we should calculate the proportion of all simulated samples that are at the statistic/observed result (x-bar = 3) and above (more extreme values, according to the alternative hypothesis). Number of simulated samples at and above 3: 19 Total number of simulated samples: 150 p = 19/150 = 0.127

Reconsider Dr. Sameer's research question about how much time Cal Poly students spend on watching television. Suppose that Dr. Sameer surveys a random sample of 100 Cal Poly students, and for this sample the mean number of hours per day spent watching TV turns out to be 3.0 hours. Suppose that to answer the research question Dr. Sameer conducted a simulation under the null hypothesis. The plot below shows the null distribution with a total of 150 simulated samples (Tip: note that I did not run 1000 simulated samples here!). The mean of the null distribution is 2.79 The standard deviation of the null distribution is 1.6. Calculate and report the p-value. Show your calculations.
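The counting rule in the answer (simulated samples at or above the observed mean, divided by the total) can be sketched as a small function. The actual 150 simulated means are not available here, so the list below is hypothetical stand-in data built only to match the reported counts (19 at or above 3, 131 below).

```python
def simulation_p_value(simulated_stats, observed):
    """Proportion of simulated statistics at or above the observed one (for Ha: mu > mu0)."""
    at_or_above = sum(1 for s in simulated_stats if s >= observed)
    return at_or_above / len(simulated_stats)

# Hypothetical stand-in data: 19 values at or above 3 and 131 values below 3.
simulated = [3.2] * 19 + [2.6] * 131

p = simulation_p_value(simulated, observed=3.0)
print(round(p, 3))  # 0.127
```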

To compute the standardized statistic students need to use - The observed statistic: 3 - The hypothesized mean of the null distribution: 2.75 (students are NOT supposed to use 2.79) - The standard deviation of the null distribution: 1.6 t = (3-2.75)/1.6 = 0.25/1.6 = 0.15625

Regardless of your answer to the previous question, conduct a one-sample t-test to find and report a standardized statistic (t-statistic). Show/type every step of your calculations. Make sure your answer follows the format below. In other words, copy, paste and fill out with your answers: t-statistic = Type your calculations:
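The calculation in the answer, along with the common mistake it warns about, can be checked with a few lines. The inputs (observed mean 3, hypothesized mean 2.75, simulated null mean 2.79, null standard deviation 1.6) are taken from the question and answer; the function name is illustrative.

```python
def standardized_statistic(observed, hypothesized, sd_null):
    """(observed statistic - hypothesized value) / standard deviation of the null distribution."""
    return (observed - hypothesized) / sd_null

# Correct: center on the hypothesized mean from the null hypothesis (2.75).
t_correct = standardized_statistic(3.0, 2.75, 1.6)

# Common mistake: centering on the simulated mean of the null distribution (2.79).
t_mistake = standardized_statistic(3.0, 2.79, 1.6)

print(f"t using 2.75: {t_correct:.5f}")
print(f"t using 2.79: {t_mistake:.5f}")
```

The observed mean is only about 0.16 standard deviations above the hypothesized value, which is far from the usual threshold of 2.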

Probability of success (π) = 0.50 Sample size (n) = 124

Simulate Use the One Proportion applet (http://www.rossmanchance.com/ISIapplets.html) to perform a simulation that will help you answer the research question. Use 1000 repetitions of this study assuming each couple is equally likely to lean right or left. Report what values you input into the applet. a) What value did you use for the Probability of success (π)? b) What value did you use for the Sample size (n)?

a) The proportion of students who would read the paper is 0.10, but we decide it is more than 0.10

Specify what Type I error (false alarm) represents in this situation. A Type I error happens when we reject the null hypothesis, but the null hypothesis was true. a) The proportion of students who would read the paper is 0.10, but we decide it is more than 0.10 b) The proportion of students who would read the paper is more than 0.10, but we do NOT find strong evidence that it is more than 0.10.

We conducted a simulation to examine the average number of hours students spend watching TV. The null hypothesis was μ = 2.75 and the alternative hypothesis was μ > 2.75. Based on the observed statistic of x̄ = 3 hours, we obtained a large p-value (0.127) and a small t-statistic of 0.156. Both the p-value and the standardized statistic (t) provide only weak evidence against the null hypothesis, so we fail to reject the null hypothesis. Therefore, we cannot claim that students are watching more than 2.75 hours of TV per day, on average.

Summarize the conclusion that you draw from this study based on your analysis. Make sure you explain the reasoning process behind your conclusion by referring to the (1) observed statistic, (2) simulation-based p-value and the theoretical t-statistic (3) strength of evidence against the null model (very strong, strong, moderate, weak), (4) reject/fail to reject the null hypothesis, and (5) overall conclusion in context.

c) The sample proportion

Suppose a 95% confidence interval for a population proportion is found using the 2SD or theory-based method. Which of the following will definitely be contained in that interval? a) The p-value b) The population proportion c) The sample proportion

b) A 99% confidence interval constructed from the same sample proportion will definitely contain 0.50.

Suppose a 95% confidence interval is constructed from a sample proportion and 0.50 is contained in the interval. Which of the following are true? a) A 99% confidence interval constructed from the same sample proportion will definitely NOT contain 0.50. b) A 99% confidence interval constructed from the same sample proportion will definitely contain 0.50. c) A 90% confidence interval constructed from the same sample proportion will definitely NOT contain 0.50. d) A 90% confidence interval constructed from the same sample proportion will definitely contain 0.50.

a) Smaller (more negative) than -9.19

Suppose another student took a random sample of 500 students and also found a sample mean of 2.3 hours with a sample standard deviation of 2 hours. How would this student's t-statistic compare? a) Smaller (more negative) than -9.19 b) Larger (closer to zero) than -9.19 c) Above zero d) Still -9.19
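
The reasoning behind answer (a) can be read off the one-sample t-statistic formula. The null value is not stated on this card, so it is left symbolic as \mu_0 here:

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \sqrt{n}\,\frac{\bar{x} - \mu_0}{s}
```

With the same sample mean (2.3) and sample standard deviation (2), only \sqrt{n} changes. The original t-statistic is negative, so \bar{x} < \mu_0, and a larger n pushes t farther below zero. This assumes the original sample had fewer than 500 students, which the card does not state but the keyed answer implies.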

b) The proportion of students who would read the paper is more than 0.10, but we do NOT find strong evidence that it is more than 0.10.

Suppose that you are considering whether to publish a weekly alternative newspaper on campus. You decide to survey a random sample of students on your campus to ask if they would be likely to read such a newspaper. Your plan is to proceed with publication only if the sample data provide strong evidence that more than 10% of all students on your campus would be likely to read such a newspaper. The parameter of interest is the proportion of all students who would likely read an alternative campus newspaper (π). The hypotheses are: Ho: π = 0.10 and Ha: π > 0.10 Specify what Type II error (missed opportunity) represents in this situation. A Type II error happens when you fail to reject the null hypothesis, but the null hypothesis was false. a) The proportion of students who would read the paper is 0.10, but we decide it is more than 0.10. b) The proportion of students who would read the paper is more than 0.10, but we do NOT find strong evidence that it is more than 0.10.

H0: πFrontTire = 0.25 Ha: πFrontTire > 0.25 (Under the null hypothesis, we would expect students to choose among the four tires at random. One of the 4 tires is the right front (success), so under the idea of "just by chance" the probability of choosing the right front tire would be 1/4 = 0.25.)

TIRE STORY FALLS FLAT A legendary story on college campuses concerns two students who miss a chemistry exam because of excessive partying but blame their absence on a flat tire. The professor allows them to take a make-up exam and sends them to separate rooms to take it. The first question, worth five points, is quite easy. The second question, worth ninety-five points, asks: Which tire was it? Research question: Do students pick which tire went flat in equal proportions? It has been conjectured that when students are asked this question and forced to give an answer (left front, left rear, right front, or right rear) off the top of their head, they tend to answer "right front" more than would be expected by random chance. To test this conjecture about the right front tire, a recent class of 28 students was asked if they were in this situation, which tire would they say had gone flat. We obtained the following results: Left front: 6 Left rear: 4 Right Front: 14 Right Rear: 4 The parameter is the long-run proportion of students who will pick the right front tire (π). State in symbols the appropriate null and alternative hypotheses to be tested.
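
As a rough check on these hypotheses, the sketch below computes the observed proportion, a standardized statistic, and a one-sided p-value from the class data (14 "right front" answers out of 28). It uses an exact binomial tail instead of the simulation applet the course relies on, so treat it as an illustration and not as the course's method. The function name `binomial_upper_tail` is made up for this example.

```python
from math import comb, sqrt


def binomial_upper_tail(successes: int, n: int, pi_null: float) -> float:
    """Return P(X >= successes) for X ~ Binomial(n, pi_null)."""
    return sum(
        comb(n, k) * pi_null**k * (1 - pi_null) ** (n - k)
        for k in range(successes, n + 1)
    )


# Data from the card: 28 students, 14 chose the right front tire.
n = 28
right_front = 14
pi_null = 0.25  # null hypothesis: each of the 4 tires is equally likely

p_hat = right_front / n  # observed sample proportion
p_value = binomial_upper_tail(right_front, n, pi_null)  # one-sided, Ha: pi > 0.25

# Standardized statistic using the null standard deviation of p-hat.
z = (p_hat - pi_null) / sqrt(pi_null * (1 - pi_null) / n)

print(f"p-hat = {p_hat:.2f}, z = {z:.2f}, exact one-sided p-value = {p_value:.4f}")
```

A simulation with many repetitions should give a p-value close to the exact tail probability, since both describe the same null model.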

a) means

The Harris Polling Organization interviewed American adults and asked how many "close friends" they have a) means b) proportions

The distribution of sentence lengths is slightly skewed to the right due to a few unusually long sentences. A typical sentence length is about 13 words, and many sentences consist of 8 to 15 words. The variation in sentence lengths can be measured by the interquartile range (IQR) of 8 (17 - 9 = 8), which means that the middle 50% of the sentence lengths span about 8 words. Two sentences were as short as 2 words, and the longest sentence contained 39 words. (mention the variability using standard deviation, range, or IQR.)

The dotplot below displays the distribution of sentence lengths (number of words in a sentence) for 55 sentences selected from John Grisham's novel The Confession. Describe what this graph reveals, paying attention to shape, center, variability, and unusual observations. In your description, make sure you use the context of the problem and refer to some of the summary statistics below. Summary measures: n: 55; mean: 13.98; std. deviation: 8.03; median: 13; range: 37; min: 2; max: 39; Q1: 9; Q3: 17

c) Buzz pushing the correct lever 15 out of 16 times

The statistic in the Doris and Buzz study is a) Buzz's long-run probability of pushing the correct lever b) whether or not Doris and Buzz can communicate c) Buzz pushing the correct lever 15 out of 16 times

1) The sampling method is biased. 2) overestimate 3) Students at the bus stop may use a higher-SPF sunscreen because they wait for the bus in the sun, compared to other students at the school who might travel by car. Therefore, this study will most likely overestimate the average SPF.

WHAT'S THE SPF OF YOUR SUNSCREEN? Most dermatologists recommend using sunscreens that have a Sun Protection Factor (SPF) of at least 30. One researcher wanted to find out whether the SPF of sunscreens used by students at her school (which is in a very sunny part of the United States) exceeds this value, on average. To collect data, the researcher surveyed students who were at the unsheltered bus stop, and found that in a sample of 48 students, the average SPF was 35.29, and the standard deviation of SPF was 17.19. Is the sampling method used biased or unbiased? Are you more likely to overestimate or underestimate the parameter of interest? Explain. Make sure your answer follows the format below. In other words, copy, paste, and fill in your answers: 1) Answer (biased OR unbiased): 2) Answer (overestimate OR underestimate): 3) Explanation:
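
This card asks only about sampling bias, but the summary statistics it gives are enough to work out a one-sample t-statistic for the research question. The sketch below assumes the hypotheses H0: μ = 30 and Ha: μ > 30, which are inferred from the wording "exceeds this value, on average" and are not stated on the card. The helper name `one_sample_t` is invented for this example.

```python
from math import sqrt


def one_sample_t(sample_mean: float, null_mean: float, sample_sd: float, n: int) -> float:
    """Standardized statistic: (x-bar - mu_0) / (s / sqrt(n))."""
    standard_error = sample_sd / sqrt(n)
    return (sample_mean - null_mean) / standard_error


# Values from the card; the null value of 30 is an assumption (see lead-in).
t_stat = one_sample_t(sample_mean=35.29, null_mean=30, sample_sd=17.19, n=48)

print(f"t-statistic = (35.29 - 30) / (17.19 / sqrt(48)) = {t_stat:.2f}")
```

Because the bus-stop sample is likely biased toward higher SPF values, a t-statistic computed this way would not generalize to all students at the school.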

- what the student was told - how much money each student spent

What are the variables in this scenario? - how long it took the student to spend the bonus - what the students spent the money on - what the student was told - the number of students participating - how much money each student spent

A dot represents the proportion of couples out of 124 that leaned to the RIGHT when kissing. (mention that a dot represents a proportion of couples not the number of times a couple leaned to the right when kissing)

What does a dot on the plot above represent? Answer using the context of the study.

a) Wider, because to be more confident we need to widen the interval.

What proportion of San Luis Obispo (SLO) residents dine at restaurants at least once a week? To investigate, a local high school student, Deidre, decides to conduct a survey. She selects a random sample of adult residents of SLO, asks each participant whether he/she dines at restaurants at least once a week, and records their responses. Then, she uses her data to find a 95% confidence interval for the proportion of SLO adults who dine out at least once a week to be (0.38, 0.44). The 99% confidence interval based on the same data would be: a) Wider, because to be more confident we need to widen the interval. b) We need more information to answer this. c) Narrower, because the more confident we are the narrower the interval.
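
To see why the 99% interval is wider, the sketch below rescales Deidre's 95% interval (0.38, 0.44) to the 99% level. It assumes the interval came from the theory-based method with a normal multiplier (about 1.96 for 95%), which the card does not confirm. The function name `rescale_interval` is made up for this example.

```python
from statistics import NormalDist


def rescale_interval(lower: float, upper: float, old_level: float, new_level: float):
    """Rebuild a normal-based interval at a new confidence level.

    Assumes the interval has the form estimate +/- multiplier * SE.
    """
    center = (lower + upper) / 2       # the sample proportion (midpoint)
    old_margin = (upper - lower) / 2   # margin of error at the old level
    z_old = NormalDist().inv_cdf(1 - (1 - old_level) / 2)
    z_new = NormalDist().inv_cdf(1 - (1 - new_level) / 2)
    standard_error = old_margin / z_old
    new_margin = z_new * standard_error
    return center - new_margin, center + new_margin


lo99, hi99 = rescale_interval(0.38, 0.44, old_level=0.95, new_level=0.99)

print(f"95% interval: (0.38, 0.44), width = {0.44 - 0.38:.3f}")
print(f"99% interval: ({lo99:.3f}, {hi99:.3f}), width = {hi99 - lo99:.3f}")
```

The midpoint stays at 0.41 in both intervals. Only the multiplier grows, so the 99% interval contains the 95% interval.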

b) type 2 error

What type of error would you consider to be more serious (i.e. could lead to a more disastrous outcome)? a) type 1 error b) type 2 error

b) be in the 95% confidence interval

When we say that 0.50 is a plausible value for a population proportion (parameter), plausible means 0.50 will _________. a) not be in the 95% confidence interval b) be in the 95% confidence interval

a) We are 95% confident that the percentage of all students at the school who own a smartphone is between 64% and 88%

Which of the following is a correct interpretation of this confidence interval? a) We are 95% confident that the percentage of all students at the school who own a smartphone is between 64% and 88% b) We are 95% confident that the percentage of students in the sample who own a smartphone is between 64% and 88%

Randomly assigning the observational units to different treatment groups

Which of the following must happen in a study to allow us to determine cause and effect?

