Exam 1
A student in a statistics class is about to start a survey sampling project. She has 113 Facebook friends and wants to distribute a questionnaire to 20 of them. Which of the following sampling plans would be the most like a real-world simple random sample?
Select 20 names at random from the 113 friends and contact those 20 with the questionnaire.
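A minimal sketch of this sampling plan in Python, assuming the 113 friends are held in a list. The `friend_i` labels, the `draw_srs` helper, and the seed are illustrative and not part of the project.

```python
import random

def draw_srs(population, n, seed=None):
    """Draw a simple random sample of size n without replacement.

    random.Random.sample gives every subset of size n the same
    chance of being selected, which is the defining property of an SRS.
    """
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical labels standing in for the 113 Facebook friends.
friends = [f"friend_{i}" for i in range(1, 114)]

# The seed only makes the illustration reproducible.
chosen = draw_srs(friends, 20, seed=1)
print(chosen)
```

Every set of 20 friends is equally likely under this scheme, which is what separates it from plans such as "the first 20 who reply."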
Exhibit 1, Question 2 - The nurse injected the patient four times with a full 0.9 milliliter syringe. What was the nurse's mistake?
She got her decimal points wrong. She injected with 3.6 milliliters instead of 0.36 milliliters, a full ten times as much as she should have.
Exhibit 3, Question 3 - Let's play a game. It costs you $25.00 to play. Here are the rules. You get to pick two wingspans at random, eyes closed, out of either Data Set 1 or Data Set 2 (not both). Your choice. Call your choices x1 and x2. You will receive a reward of $(80.5 − x1)^2 + $(80.5 − x2)^2. If you pick from Data Set 2, what is the maximum profit you can make?
$25
Exhibit 2, Question 2 noted that for a simple random sample of size 2, all samples of size 2 have the same chance of being chosen. What would the likelihood be of choosing any one of these samples (expressed as a decimal) if there were 100 different samples of size 2?
0.01
Exhibit 1, Question 1 - If 1 milliliter equals 100 insulin units, how many milliliters should the patient have been given if she had been prescribed 36 units?
1 milliliter = 100 units so 0.01 milliliter = 1 unit. It follows that 0.36 milliliters = 36 units. So she should have injected the patient with 0.36 milliliters.
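The unit conversion, and the size of the overdose from the four full 0.9 milliliter syringes, can be checked with exact fractions. This is only a sketch; the helper name `units_to_ml` is made up for illustration.

```python
from fractions import Fraction

UNITS_PER_ML = 100  # given in the exhibit: 1 milliliter = 100 insulin units

def units_to_ml(units):
    """Convert insulin units to milliliters, kept exact as a fraction."""
    return Fraction(units, UNITS_PER_ML)

prescribed_ml = units_to_ml(36)        # the dose that was prescribed
injected_ml = 4 * Fraction(9, 10)      # four full 0.9 mL syringes
overdose_factor = injected_ml / prescribed_ml

print(float(prescribed_ml), float(injected_ml), overdose_factor)
```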
If the variance of a set of data is computed to be 4, then the standard deviation is:
2
Suppose you have a data set with 1000 observations in it. 500 of those are all 5 and 500 are 15. Then the standard deviation is about?
About 5. The mean is 10, and every observation lies exactly 5 away from it.
You have enough money to interview 90 residents. Working much the way Gallup did in the 1930s, you want your sample of 90 to mirror the distribution of subjects in the population exactly (at least along the lines of gender, income, and political affiliation). How many people would your sample place in the group "Males making between $40,000 and $80,000 yearly"? If your calculation results in a partial person, leave the number as it is; don't round.
20
A population has 50 items, 21 green and 29 red. Turns out there are 2118760 possible samples of size 5 that can be taken from this population. If the parameter of interest is the true proportion green in the population, what is the parameter in this situation?
21/50 = 0.42, the proportion of green items in the population.
In the piece on Deathly Hallows book sales, it was claimed that more than 50,000 were sold per minute on average. What was wrong with this figure?
300,000 were sold on average each hour. That means only 5000 were sold per minute. Decimal error.
How much does the typical homework assignment (a single BN) count in this class?
5 points
A recent poll of 1500 college-age students found that 885 agreed with U.S. foreign policy toward Israel. What is the corresponding 95% confidence interval (choose closest answer)?
59% plus or minus 3%
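A sketch of the calculation, assuming the conservative 95% margin of error of 1/sqrt(n) that the other answers in this set rely on. The function name `conservative_ci` is illustrative.

```python
from math import sqrt

def conservative_ci(successes, n):
    """Return (p_hat, margin of error, interval) for a sample proportion.

    Assumes the conservative 95% margin of error 1/sqrt(n), which is
    the rule of thumb used in this course, not the exact formula.
    """
    p_hat = successes / n
    moe = 1 / sqrt(n)
    return p_hat, moe, (p_hat - moe, p_hat + moe)

p_hat, moe, interval = conservative_ci(885, 1500)
print(f"{p_hat:.0%} plus or minus {moe:.1%}")
```

Rounded to whole percentage points, this gives the 59% and 3% in the answer above.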
You have enough money to interview 90 residents. Working much the way Gallup did in the 1930s, you want your sample of 90 to mirror the distribution of subjects in the population exactly (at least along the lines of gender, income, and political affiliation). How many people would your sample place in the group "Male Republicans making over $80,000 per year"? If your calculation results in a partial person, leave the number as it is; don't round.
6.4
In a 2012 Gallup poll, eighty-two percent of adult U.S. Catholics say birth control is morally acceptable. Results for this poll are based on telephone interviews conducted May 3-6, 2012, with a random sample of 1,024 adult Catholics, aged 18 and older, living in all 50 U.S. states and the District of Columbia. What is the corresponding 95% confidence interval for the proportion of all adult U.S. Catholics who feel birth control is morally acceptable?
82% plus or minus 3%
A population has 50 items, 21 green and 29 red. Turns out there are 2118760 possible samples of size 5 that can be taken from this population. Suppose you computed the sample proportion of green in each of these 2118760 samples of size 5 and added them all up. What would you get?
889879.2
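One way to check this total without listing all 2118760 samples is to group them by how many green items they contain. The sketch below does that with exact arithmetic; the function name is illustrative.

```python
from fractions import Fraction
from math import comb

def sum_of_sample_proportions(green, red, n):
    """Add up the sample proportion green over every possible sample of size n.

    There are comb(green, k) * comb(red, n - k) samples with exactly
    k green items, and each of them has sample proportion k / n.
    """
    total = Fraction(0)
    for k in range(n + 1):
        count = comb(green, k) * comb(red, n - k)  # comb returns 0 when k is too large
        total += count * Fraction(k, n)
    return total

n_samples = comb(50, 5)
total = sum_of_sample_proportions(21, 29, 5)
print(n_samples, float(total), float(total / n_samples))
```

Dividing the total by the number of samples gives 0.42, the population proportion green (21/50). In other words, the sample proportions average out to the population proportion.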
Refer to the graphic below. We encountered this summary of the sampling distribution of the sample proportion in class. Suppose n = 100. What are the chances of an SRS of this size yielding a phat that is somewhere between p - 0.1 and p + 0.1?
95 out of 100
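The "95 out of 100" figure is an approximation. As a rough check, the sketch below computes the exact binomial probability for one particular case. The question does not fix p, so p = 0.5 is an assumption chosen here for illustration, and the function name is made up.

```python
from fractions import Fraction
from math import comb

def prob_within(n, p, width):
    """Exact chance that the sample proportion lands within `width` of p.

    Treats the count of successes as Binomial(n, p). p and width are
    passed as strings so the comparisons are done with exact fractions.
    """
    p = Fraction(p)
    width = Fraction(width)
    total = Fraction(0)
    for k in range(n + 1):
        if abs(Fraction(k, n) - p) <= width:
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return float(total)

chance = prob_within(100, "0.5", "0.1")  # p = 0.5 is an assumed value
print(round(chance, 3))
```

For this choice of p the exact chance comes out a little above 95%, consistent with 1/sqrt(n) being a conservative margin of error.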
What are the two keys to having confidence in your parameter estimate?
The probabilistic nature of the sample selection, and some neat mathematics that follow from this.
What is a simple random sample?
A sample chosen in such a way that all samples of that same size have the same chance of being chosen.
What is a cross-sectional sample?
An attempt to match the sample characteristics exactly to those of the population.
What is a non-sampling error?
An error caused by something other than the fact that a sample was selected instead of the entire population.
Exhibit 2, Question 2 - Initially, the government was reluctant to collect more than the most basic census information of race, sex, and age. Why?
As one representative said, it would "occasion an alarm" among the people, for "they would suppose the government intended something, by putting the union to this additional expense, besides gratifying an idle curiosity."
Which of the following are examples of strategies for reducing non-sampling errors?
Awareness of the psychology of question order; use of technology-assisted confidential interview techniques; use of inducements for non-responders.
Recall the Harris Poll disclaimer mentioned in the Read All About It (or the video). Harris is a major polling organization that refuses to accompany its poll reports with a margin of error. What is one reason that was given for such a bold omission?
Harris recognizes that there are many sources of error that are not addressed by the MOE, so reporting it might be misleading.
What happens to the margin of error as the sample size gets larger?
It decreases
Suppose we measured all of your wingspans in class. What would happen to the average wingspan if we added ex-UK player and NBA star Anthony Davis's wingspan to the data?
It would increase
A 1996 Gallup poll of eligible New Hampshire primary voters reported that "of 1200 voters surveyed, 24% would vote for Senator Bob Dole if the primary election were held today". The Gallup organization also reported that the margin of error for a sample of 1200 people is 3 percentage points. If the Gallup organization had wanted to make a confidence statement based on the same data, only with more confidence that the interval had captured the parameter, what do you think would happen to the margin of error? It would be
Larger than 3%.
What does the word "parameter" refer to in statistical science?
Number that describes the population
In the MOE Doesn't Apply Read All About It (or video), what was the issue with the question "Have you often, sometimes, hardly ever, or never felt bad because you were unfaithful to your wife?"
Of the 85% who said they "never felt bad about it" surely a large part of those had never been unfaithful to their wives. But the way the question was asked this wasn't an option for an answer.
What do we mean by "human inference?"
Off-hand phrase taken to mean inference we make from statistical constructs
Refer to Question 3, Exhibit 2. Suppose for a sample of size two to be "representative" of the population, it has to have exactly one man and one woman, and one Democrat and one Republican. What is the chance of selecting a simple random sample of size two from this population that is representative (in this sense of the word)? Assume your samples are (MR, MD), (MR, FR), (MR, FD), (MD,FR), (MD, FD), (FR,FD), where M stands for male, F for female, R for Republican and D for Democrat.
Only (MR, FD) and (MD, FR) will work, so the chance is 2 out of 6, or about 0.33.
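The six samples can be listed and checked mechanically. This is a small sketch that uses the labels from the question; the helper `is_representative` is illustrative.

```python
from fractions import Fraction
from itertools import combinations

# One person of each type, labeled gender then party as in the question.
people = ["MR", "MD", "FR", "FD"]

def is_representative(sample):
    """True when the pair has one man, one woman, one Democrat, one Republican."""
    a, b = sample
    return a[0] != b[0] and a[1] != b[1]

samples = list(combinations(people, 2))  # all 6 samples of size 2
representative = [s for s in samples if is_representative(s)]
chance = Fraction(len(representative), len(samples))

print(representative, chance)
```

Because an SRS gives all six samples the same chance, the probability is just the count of representative samples divided by 6.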
The claim has been made that over 4 million women in the U.S. are battered to death each year by a spouse or boyfriend. What is wrong with this claim?
Only about 2.4 million people die in the U.S. each year from all causes.
Consider the following survey question: "The Mac operating system rarely gets infected by viruses and therefore Department of Education should only purchase Mac computers. Please answer Yes or No". What is one objection to this question, as asked?
Having only two possible answers severely limits the breadth of expression for the respondent.
What kind of error does the margin of error address?
Random sampling error
In BN 2.20 we found that out of 594 people asked (students, researchers), 281 reported that the right way to interpret a 95% confidence interval of 0.1 to 0.4 was to say that "the probability that the true proportion is bigger than 0 is at least 95%." What is wrong with that interpretation?
That makes it sound like the parameter is random. It is not.
Robert Niles is a former mathematics geek turned journalist who is continually trying to educate other journalists about how to interpret statistical arguments. He recently noted "Don't overlook that fact that the margin of error is a 95 percent confidence interval, either. That means that for every 20 times you repeat this poll, statistics say that one time you'll get an answer that is completely off the wall." What does Niles mean by this statement?
That the "confidence" is in a repeated sampling sense; and to say one gets an interval that is "right" 95% of the time, is to say one will get a "wrong" one 5% of the time.
Exhibit 1, Question 1 - The word "doubles" is used in the subheading to describe the increase in cocaine use among children. Look carefully at Table 1.2. Where does the word "doubles" come from?
The article says it doubled from 1% to 2% between 2004 and 2005. But the entry for 2004 is 1.4 and 1.9 for 2005. Good guess that the authors just rounded the 1.4 to 1 and the 1.9 to 2.
Exhibit 1, Question 1 - We breathe about 15 times per minute, and a hummingbird flaps its wings about 3,000 times per minute. So a rate of 50,000 copies per minute would truly take our breath away and be faster than we could discern with our eyes. Is the 50,000 figure right?
The figure is not correct. 300,000 copies per hour is 300000/60 = 5000 a minute, not 50,000. That's still a lot, but the 50,000 wasn't right.
Is the letter writer correct to claim that the Times overstated the number of cases of domestic violence against women?
The math isn't correct. An incident every 15 seconds is 4 per minute, 240 per hour, 5760 per day, 40320 per week, 2096640 per year. That's about 2.1 million, not 21 million.
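The chain of multiplications in this answer can be reproduced step by step. The sketch follows the same convention of 52 weeks per year.

```python
SECONDS_PER_INCIDENT = 15  # "an incident every 15 seconds"

per_minute = 60 // SECONDS_PER_INCIDENT
per_hour = per_minute * 60
per_day = per_hour * 24
per_week = per_day * 7
per_year = per_week * 52  # 52 weeks, matching the answer's arithmetic

print(per_minute, per_hour, per_day, per_week, per_year)
```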
Which of the following statements is true?
The mean and standard deviation are sensitive to outliers
Suppose you have a data set with 1000 observations in it. 500 of those are all 5 and 500 are 15. Which of the following is true?
The mean equals the median (both are 10).
What is sampling variability?
The variability seen in statistics from sample to sample
A carefully chosen simple random sample may not be representative of the population. How could this be?
There is always some chance that a random sample won't be representative. There is nothing about how an SRS is taken that guarantees it will be representative. In a class with 10 men and 100 women, an SRS of size 10 gives the sample made up of all 10 men the same chance of being chosen as any other sample of size 10.
Suppose the cross-sectional sample taken above represents a perfect microcosm of the larger population with respect to the legalization of marijuana. Is there any uncertainty involved in using this sample to represent the proportion of people in Gulliver who favor the legalization of marijuana? Say why or why not.
There would be no uncertainty about what the population felt at that very moment in time. Not if you had a perfect microcosm.
What is "response substitution?"
This is the tendency for survey respondents to present their answers in a way that allows them to express their opinions about other issues that aren't the topic of the survey
The well-respected journal Science, in an article on insects and plants, mentioned a California field that produced 750,000 melons per acre. How do you react to that? It may help you know that an acre is 43,560 square feet.
This is unreasonable, suggesting about 17 melons per square foot
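A quick plausibility check of the claimed yield, using only the two numbers given in the question:

```python
MELONS_PER_ACRE = 750_000       # figure quoted from the article
SQUARE_FEET_PER_ACRE = 43_560   # given in the question

melons_per_square_foot = MELONS_PER_ACRE / SQUARE_FEET_PER_ACRE
print(round(melons_per_square_foot, 1))
```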
What is the goal of sampling?
To make inferences about a population from what we know about our sample.
Recall the sampling distribution of the sample proportion (page 162 in your book). Specify an interval (range) in which 68% of all sample proportions based on samples of size n could be expected to occur.
Within 0.5 × (1/sqrt(n)) on either side of the parameter p.
Recently the Central Kentucky Youth Orchestras started posting rehearsal participation as percentages. So if there are 10 trumpets in an orchestra and 9 showed up to rehearsal, then they'd report a 90% participation rate for trumpets. For sake of simplicity, suppose unbeknownst to the public there are 4 flutes and 10 trumpets. Suppose one flute misses rehearsal and one trumpet misses rehearsal. Would reporting participation results as percentages for each group be potentially misleading? Why or why not?
Yes. In each case only one person missed, but the participation rates would be 75% for flutes and 90% for trumpets, making trumpets look more compliant.
The University of Kentucky has 21,441 undergraduates, with a gender distribution of 49 percent male students and 51 percent female students. You take a simple random sample of 100 undergraduates (30 males and 70 females) and ask the question "Have you ever attended a date party?" 100% of the males say "yes" and 50% of the females say "yes." If the estimate of all undergraduates who would say "yes" to this question is reweighted to reflect the distribution of males and females in the U.K. population, what would that be in this case?
about 75%
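The reweighting can be sketched as a weighted average of the group "yes" rates, first with the sample's gender shares and then with the population's. The dictionary layout and the function name are illustrative.

```python
def weighted_yes(shares, yes_rates):
    """Overall 'yes' proportion: each group's rate weighted by its share."""
    return sum(shares[group] * yes_rates[group] for group in shares)

yes_rates = {"male": 1.00, "female": 0.50}          # observed in the sample
sample_shares = {"male": 0.30, "female": 0.70}      # 30 males, 70 females
population_shares = {"male": 0.49, "female": 0.51}  # undergraduate population

raw_estimate = weighted_yes(sample_shares, yes_rates)
reweighted_estimate = weighted_yes(population_shares, yes_rates)

print(round(raw_estimate, 3), round(reweighted_estimate, 3))
```

The unweighted sample gives 65%, while weighting by the population's gender split gives 74.5%, which rounds to the "about 75%" above.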
In what sense can a directory-assisted random-digit-dial sample be thought of as a simple random sample?
The numbers selected are chosen at random (by a computer) from all working exchanges. So in that sense any set of working numbers should have the same chance of being chosen as any other set of working numbers of the same size.
What can one say about the sampling distribution of a sample statistic based on a simple random sample?
It is approximately bell-shaped and peaks above the parameter.
You ask a question to a random sample of 1500 adults in Texas (population 18 million people) and to a separate random sample of 500 adults in Indiana (population 5.7 million people). You make separate 95% confidence statements about the percent of all adults in each state who agree. Your margin of error for Indiana is
larger than in Texas, because there are fewer people in the Indiana sample.
If you want a 95% margin of error to be 1%, what will your sample size have to be?
n = 10000
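This follows from solving 1/sqrt(n) = margin of error for n. The sketch below assumes that conservative 95% rule of thumb. The function name is illustrative, and the margin is passed as a string so the arithmetic stays exact.

```python
from fractions import Fraction
from math import ceil

def required_sample_size(moe):
    """Smallest n with 1/sqrt(n) <= moe, i.e. n >= (1/moe)^2.

    Assumes the conservative 95% margin of error 1/sqrt(n).
    `moe` is a decimal string such as "0.01" (one percentage point).
    """
    m = Fraction(moe)
    return ceil((1 / m) ** 2)

n_for_one_point = required_sample_size("0.01")
n_for_three_points = required_sample_size("0.03")
print(n_for_one_point, n_for_three_points)
```

Cutting the margin of error from 3 points to 1 point takes roughly nine times as many respondents, because the margin shrinks only with the square root of n.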
You ask a question to a random sample of 1000 adults in Texas (population 18 million people) and to a separate random sample of 1000 adults in Indiana (population 5.7 million people). You make separate 95% confidence statements about the percent of all adults in each state who agree. Your margin of error for Indiana is
the same as in Texas, because the two samples are the same size.
