Bio 260 Exam 2
Similarly, if a pair of polls were created that avoided the issue above, but one was conducted on people exiting a church while the other was conducted on college students leaving a Women's Studies final exam, this is also poor practice. This exemplifies which of the following misuses of statistics?
Biased samples
Values in 3D charts can be visually minimized in which manner?
By rotating the segments representing these values to the back
Every year, in the months when ice cream sales are highest, the number of children that drown is also highest (this is true). Politician Maxwell Dogoody (who owns a frozen yogurt franchise) then proposes a bill to outlaw the sales of ice cream (but not frozen yogurt) to help save children from drowning. What misuse of statistics is he guilty of?
False causality.
The t distribution is wider than the Z distribution because:
It includes the uncertainty in our estimate of the population variance.
Which of the following most accurately describes the Tuskegee experiment?
Pre-existing cases of syphilis were identified in African-American men and they were observed, but not treated, despite effective treatments being available.
If we make a type II error and accept our conclusion as being demonstrated beyond a reasonable doubt, which of the misuses of statistics listed on the website do we commit?
Proof of the null hypothesis.
Jane is comparing the blood pressure values for a control group given a placebo to the values in a treatment group where patients were given an experimental drug. Her t test returns a p value of 0.08, so she tells her boss that the drug definitely doesn't work and should be abandoned. Which of the following misuses of statistics might she be making?
Proof of the null hypothesis.
If you do a statistical test and the p value is 0.045 what is your conclusion?
Reject H0 and accept HA.
A type I error occurs when we:
Reject a true null hypothesis.
How can 3D pie charts manipulate the magnitude of values?
Segments at the back appear to represent smaller proportions than they do.
A researcher is interested in whether a drug alters the pH of blood in users. She takes blood samples from a set of individuals before and after taking the experimental medication. Unfortunately a small number of the sample jar labels got switched within each time period and she cannot determine which samples are from the same individuals in a few cases (the "before" and "after" information is correct). What option below best explains her next step?
She cannot do a paired t test, but she can analyze the data with a heteroscedastic t test.
A 95% confidence interval calculated from a sample is the region:
in which there is a 95% probability that the population mean lies.
The best description of what a p value represents is:
p is the smallest alpha we could choose and still reject H0.
All else being equal, if you used a larger sample size, what would tend to happen to your calculated t value and the critical t value you would use?
tcalc increases, tcrit decreases.
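As a rough illustration of the answer above, the sketch below holds the difference between the sample means and the standard deviation fixed and varies only the sample size. The difference of 3 and standard deviation of 3 are made-up numbers, the function name is hypothetical, and the critical values are copied from a standard two-tailed t table at alpha = 0.05.

```python
import math

# Two-tailed critical t values at alpha = 0.05, copied from a standard t table
# (key = degrees of freedom). Only the rows needed here are included.
T_CRIT_TWO_TAILED = {8: 2.306, 18: 2.101, 28: 2.048}

def t_calc(diff, sd, n):
    """t statistic for two equal-sized samples that share a standard deviation."""
    standard_error = sd * math.sqrt(2 / n)
    return diff / standard_error

# Hypothetical values: the sample means differ by 3 and both samples have sd = 3.
for n in (5, 10, 15):
    df = 2 * n - 2
    print(n, round(t_calc(3, 3, n), 2), T_CRIT_TWO_TAILED[df])
```

Each printed row shows the calculated t rising and the critical t falling as n grows, assuming the sample standard deviation does not change with n.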
Imagine you take a sample of 21 values from a population and the sample mean and standard deviation are 36 and 25 respectively. Which of the following pairs of values are the closest to the 95% confidence interval for the population mean?
{ 24.3 , 47.7 }
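The interval in the previous question can be checked with a short calculation. This is a minimal sketch, assuming the usual formula mean +/- t_crit * sd / sqrt(n) and a critical value of 2.086 read from a two-tailed t table (alpha = 0.05, df = 20); the function name is hypothetical.

```python
import math

def confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin

# n = 21 gives df = 20; 2.086 is the two-tailed table value at alpha = 0.05.
lower, upper = confidence_interval(mean=36, sd=25, n=21, t_crit=2.086)
print(round(lower, 1), round(upper, 1))  # 24.6 47.4
```

The computed interval is roughly 24.6 to 47.4, which is consistent with the question asking for the closest pair rather than an exact match.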
Consider conducting a two-tailed homoscedastic t test with two samples of size n=16 from two populations that differ. The standard deviations of the populations are 30% of the mean. Which of the following is closest to the minimum difference in population means that would result in a p value less than 0.05 and us correctly identifying that the population means differ?
22%
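One way to arrive at an answer like the one above is to ask how large the difference must be for the calculated t to reach the critical t. The sketch below assumes the sample standard deviations equal the population value and that the observed difference equals the true difference, which is a simplification; the function name is hypothetical, and 2.042 is the two-tailed table value at alpha = 0.05 for df = 30.

```python
import math

def min_detectable_difference(sd, n, t_crit):
    """Smallest difference in means whose t statistic reaches t_crit,
    for two samples of size n that share standard deviation sd."""
    return t_crit * sd * math.sqrt(2 / n)

# sd is 30% of the mean, n = 16 per sample, so df = 2 * 16 - 2 = 30.
print(round(min_detectable_difference(sd=30, n=16, t_crit=2.042), 1))  # 21.7
```

The result is in the same units as sd, so here it reads as about 21.7% of the mean.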
Consider a situation in which there is a population with a mean of 25 and a second population with a larger mean. The populations are homoscedastic and have a standard deviation of 3. If we took a sample of size 11 from each population, which of the values below is closest to the minimum value for the mean of the second population that we would be able to detect with a one-tailed homoscedastic t test?
27.0
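For the one-tailed case the only change is which column of the t table is used. A minimal sketch, assuming one sample of size 11 from each population and the one-tailed table value 1.725 (alpha = 0.05, df = 20); the variable names are illustrative only.

```python
import math

mean_1 = 25.0
sd = 3.0
n = 11                      # size of each sample, so df = 2 * 11 - 2 = 20
t_crit_one_tailed = 1.725   # one-tailed table value at alpha = 0.05, df = 20

# Standard error of the difference between two means with a shared sd.
standard_error = sd * math.sqrt(2 / n)
min_mean_2 = mean_1 + t_crit_one_tailed * standard_error
print(round(min_mean_2, 1))  # 27.2
```

Using the two-tailed value for df = 20 (2.086) instead would push the threshold up to about 27.7, which is why the direction of the test matters here.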
Consider a situation in which two homoscedastic populations have means that differ by 3.0 and their standard deviations are 3.0. If we take identically sized samples, one from each population, which of the values below is closest to the minimum size of each sample we would need to take in order to correctly detect this difference with a two-tailed homoscedastic t test?
9 & 9
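The sample size question above can be approached by trial: increase n until the expected t reaches the critical t for df = 2n - 2. The sketch below assumes the samples reproduce the population difference and standard deviation exactly; the function name is hypothetical and the critical values are copied from a standard two-tailed t table at alpha = 0.05.

```python
import math

# Two-tailed critical t values at alpha = 0.05 from a standard t table.
# With equal sample sizes df = 2n - 2 is always even, so only even rows are listed.
T_CRIT = {2: 4.303, 4: 2.776, 6: 2.447, 8: 2.306, 10: 2.228,
          12: 2.179, 14: 2.145, 16: 2.120, 18: 2.101, 20: 2.086}

def min_sample_size(diff, sd):
    """Smallest n per sample at which the expected t reaches the critical t."""
    for n in range(2, 12):
        t_expected = diff / (sd * math.sqrt(2 / n))
        if t_expected >= T_CRIT[2 * n - 2]:
            return n
    return None  # not reached within the rows of the table above

print(min_sample_size(diff=3.0, sd=3.0))  # 9
```

At n = 8 the expected t is 2.00 against a critical value of 2.145, while at n = 9 it is about 2.121 against 2.120, so 9 is only just enough.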
Which of the following is the closest to the proportion of the t distribution (for degrees of freedom 20) that lies between t = -2.00 and t = 2.00?
90%
If we decided to use the normal distribution instead of the t distribution to make our 95% confidence intervals they would be more narrow and when n is small the difference between the t and normal distributions can be large. If comparing the width of the 95% confidence intervals calculated in these two ways, the interval based on the normal distribution is roughly ____ % as wide as that using the t distribution for n=20. (provide the closest value)
94%
Which of the following is the closest to the proportion of the t distribution (for df=15) that lies between t = -2.25 and t = 2.25?
95%
If we decided to use the normal distribution instead of the t distribution to make our 95% confidence intervals they would be more narrow, but when n is large the difference between the t and normal distributions is minor. If comparing the width of the 95% confidence intervals calculated in these two ways, the interval based on the normal distribution is roughly ____ % as wide as that using the t distribution for n=100. (provide the closest value)
98.8%
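Because both intervals share the same standard error, the ratio of their widths is simply the ratio of the critical values. A minimal sketch for n = 20 and n = 100, assuming two-tailed t table values of 2.093 (df = 19) and 1.984 (df = 99); the variable names are illustrative.

```python
from statistics import NormalDist

# Two-tailed 95% critical value of the standard normal distribution (about 1.96).
z_crit = NormalDist().inv_cdf(0.975)

# Two-tailed t table values at alpha = 0.05 for df = n - 1.
t_crit_by_n = {20: 2.093, 100: 1.984}

for n, t_crit in t_crit_by_n.items():
    # Width of the normal-based interval as a percentage of the t-based one.
    print(n, round(100 * z_crit / t_crit, 1))
```

This prints about 93.6 for n = 20 and 98.8 for n = 100.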
If we have data from two homoscedastic populations (and the sample sizes are equal) with a calculated t value of 2.10, what is the minimum size each sample must be in order to reject a null hypothesis that the means are the same if doing a two-tailed test?
11
If we have data from two populations (the sample sizes and variances are equal) and the calculated t value for a two-tailed t test is 2.09, what is the minimum sample size of each sample needed for us to conclude that the population means are "significantly different"?
11
If we have data from two populations (the sample sizes and variances are equal) and the calculated t value for a two-tailed t test is 2.05, what is the minimum sample size of each sample needed for us to conclude that the population means are "significantly different"?
15
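The last three questions can be answered by scanning down a t table until the critical value drops below the calculated t. The sketch below does that lookup, assuming equal sample sizes (df = 2n - 2) and values copied from a standard two-tailed table at alpha = 0.05; the function name is hypothetical and the table is deliberately truncated to the rows near these answers.

```python
# Two-tailed critical t values at alpha = 0.05 from a standard t table
# (even df only, since equal sample sizes give df = 2n - 2).
T_CRIT = {14: 2.145, 16: 2.120, 18: 2.101, 20: 2.086, 22: 2.074,
          24: 2.064, 26: 2.056, 28: 2.048, 30: 2.042}

def min_n_for_t(t_calc):
    """Smallest per-sample n (within the table rows above) whose
    critical t is below t_calc."""
    for df in sorted(T_CRIT):
        if t_calc > T_CRIT[df]:
            return (df + 2) // 2
    return None  # t_calc is below every critical value listed

for t in (2.10, 2.09, 2.05):
    print(t, min_n_for_t(t))
```

For t = 2.10 the df = 18 row (2.101) is just too large, so the first usable row is df = 20, giving n = 11; for t = 2.05 the first usable row is df = 28, giving n = 15.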
Consider conducting a two-tailed homoscedastic t test with two samples of size n=14 from two populations that differ. The standard deviations of the populations are 20% of the mean. Which of the following is closest to the minimum difference in population means that would result in a p value less than 0.05 and us correctly identifying that the population means differ?
15%
Consider a situation in which there is a population with a mean of 15.0 and a second population with a mean that differs. The populations are homoscedastic and have a standard deviation of 3.0. If we took a sample of size 14 from each population, which of the values below is closest to the minimum magnitude of the difference between the means of the two populations that we would be able to detect with a two-tailed homoscedastic t test?
2.30
If you do a statistical test and the p value is 0.06 what is your conclusion?
Accept H0 and reject HA.
If you do a statistical test and the p value is 0.5 what is your conclusion?
Accept H0 and reject HA.
Which of the following was NOT a piece of advice I gave during the discussion of making good graphs and figures?
Always use figures instead of data tables.
FDA approval requires that to be an approved "drug" a substance must:
Be effective and relatively harmless.
When researchers deliberately publish studies that support their preferred hypothesis and suppress those that don't support it, we say they are guilty of the following bias:
Cherry-picking.
Continuing the story from the previous question. Due to the large sample size of her study, it turns out that the difference in mean blood pressure values is only 0.5%. Nevertheless, the boss decides that they should market the drug since it has a proven effect on blood pressure. Which of the following misuses of statistics might the boss be making?
Confusing practical and statistical significance
We now have the technology to study hundreds of thousands of genetically variable points in people's genomes (termed SNPs) quickly and affordably. A GWAS study is one in which we look at the frequencies of all these SNP genetic variants in two groups, one with a disorder known to be at least partially genetically influenced and one set of healthy controls. These studies are often successful in identifying places in the genome associated with the genetic disorder, but the results must be interpreted with caution because this method is prone to which of the following misuses of statistics?
Data dredging.
Continuing the story from the previous question. Jane's boss disagrees and tells her to keep working. Jane then notices that her data have some outliers which inflate the variance, so she removes them, performs the analysis again, and obtains a p value of 0.02. Now she is able to go back to her boss and tell him that the mean blood pressure values in the populations differ. Which of the following misuses of statistics might she now be making?
Data manipulation.
For a given difference between two sample means, the p value associated with that difference:
Decreases as the sample sizes increase and decreases as the sample variances decrease.
What name did the website give to the bias I called "cherry-picking"?
Discarding unfavorable data.
Why is it generally better to perform two-tailed heteroscedastic t tests than one-tailed heteroscedastic t tests?
Doing a two-tailed t test instead of a one-tailed reduces the chance of type I error.
Which of the following accurately describes one of the biases I described in medical drug testing in the US?
Drugs are tested more thoroughly on males than on females.
All else staying the same, which of the following would tend to result in smaller p values when doing a t test comparing two samples from two different populations?
Larger sample sizes.
Polls of people's opinions about abortion can differ widely when done by organizations favoring or not favoring legal abortion. Polls that ask whether people "favor laws that allow the destruction of unborn babies" will yield very different results from those that ask whether people "favor laws that allow a woman to decide to undergo a medical procedure that terminates pregnancy." This exemplifies which of the following misuses of statistics?
Loaded questions
I once asked a question on an exam in which I asked students to estimate my age under two different conditions ("Dr. Carter" vs "Ashley Carter"). That little test was an exercise in deliberately trying to observe the effects of which of the following?
Loaded questions.
Polls of people's opinions about abortion can differ widely when done by organizations favoring or opposing legal abortion. Polls that ask whether people "favor laws that allow the destruction of unborn babies" tend to yield different results from those that ask whether people "favor laws that allow a woman to decide to undergo a medical procedure that terminates pregnancy." This exemplifies which of the following misuses of statistics?
Loaded questions.
When we look at tables of t test critical values, the values typically get smaller as we move down the table. Which of the following best describes why?
Lower on the table the degrees of freedom increase and the t distribution becomes more normal.
The hypnotic effect caused by closely drawn parallel lines is called the ______ effect.
Moire
Which of the following does NOT accurately describe one of the biases I described in medical drug testing in the US?
More drugs are tested on prisoners than on other types of people
Which of the following was NOT one of the biases I described in medical drug testing in the US?
More females enroll in studies than males.
Many psychology studies are performed on undergraduate students in the US who tend to be young, Caucasian, and from families with incomes above the median for the US. Because of this, when these results are used to make statements about the psychology of all humans these studies may be guilty of which misuse of statistics?
Overgeneralization
Many evolutionary psychology studies are performed on undergraduate students in the US who tend to be white and from families with incomes above the median for the US. When these results are used to make statements about the evolutionary history of all humans these studies may be guilty of which misuse of statistics?
Overgeneralization.
Many psychological studies are conducted using large numbers of WEIRD (Western, Educated, and from Industrialized, Rich, and Democratic countries) students. This population has been used for almost all studies of modern psychology and has been used to help us understand how the human mind works. There is a fundamental problem with this approach however; the problem of:
Overgeneralization.
Which of the following was NOT discussed as a bias in the samples we have historically used for drug testing?
Participants are often financially compensated by the researchers.
Which of the following was NOT discussed as a bias in the samples we have historically used for drug testing?
Participants are often personally connected by friendship to the researchers.
Which of the following was observed in the Milgram experiment and related experiments?
Participants were less likely to follow the experiment to the end when ordered via phone rather than in person.
A researcher is interested in whether a drug alters the pH of blood in users. She takes blood samples from a set of individuals before and after taking the experimental medication. Unfortunately, a small number of the sample jar labels within each set got switched and she cannot guarantee which samples are from the same individuals in a few cases. What option below best explains her next step?
She cannot do a paired t test, but she can analyze the data with a heteroscedastic t test.
Global warming is happening, but so is death. Each year many older people pass away and with them the memory of how cold typical winters used to be. The younger individuals that remain do not have this experience. This process can lead to a society (which is just living people after all) not realizing that temperatures have increased as much as they have, since things are not that different from what everyone personally recalls. This general phenomenon was described in class and termed which of the following?
Shifting baselines
There is a famous French saying, "plus ça change, plus c'est la même chose" which translates as, "the more things change, the more they stay the same." One way to interpret this concept is to realize that as things change we don't realize it since our reference changes as well. Which of the following concepts mentioned in class most closely parallels this idea?
Shifting baselines.
All else staying the same, which of the following would tend to result in smaller p values when doing a t test comparing two samples from two different populations?
Smaller sample variances.
The Tuskegee experiment involved denying medical treatment to individuals with which disease?
Syphilis
The bias that arises in polling when respondents provide answers that make them look better or are the ones they think the questioner wants to hear was termed which of the following?
The Bradley effect.
The development and implementation of very strict informed consent processes for studies with human subjects was most strongly spurred by which of the following studies?
The Tuskegee experiment showing how the health and welfare of subjects can be ignored.
The bias introduced into science by the selective publishing of results that are statistically significant and clear, and therefore easier to publish with less work, is best described as:
The file drawer problem.
When researchers don't publish studies because they are busy and prioritize other studies with a better chance of publication, we say they are guilty of the following bias:
The file drawer problem.
Why is the F test not recommended in general?
The test provides faulty results if the values are not normally distributed.
The figure from class showing deaths from serial killers had several misleading aspects; which of the following was NOT one of them?
The use of gray instead of color.
If we do a two sample t test which of the following would result in the lowest p value?
The means are not similar and the standard errors are low
Which of the following is the best description of the Central Limit Theorem?
The means from a set of samples taken from a population will be normally distributed.
Which of the following is TRUE?
The normal and t distributions are symmetric whereas the F distribution is skewed.
The best description of what a p value represents is:
The probability of obtaining the test statistic if H0 is true.
The best description of what a p value represents is:
The probability of seeing the sample data if H0 is true.
If you are using a table of critical t values and you accidentally use the values for more degrees of freedom than you really have which of these is true:
The risk of making a type I error is increased and the risk of type II error is decreased.
Consider two populations with means that may or may not differ. When doing a t test which of the following would make us most likely to reject the null hypothesis?
The sample means are not very close and the variances are fairly small.
The technical definition of a p value is:
The smallest value we could choose and still reject H0.
Which of the following best describes the technical terms "standard deviation" and "standard error"?
The standard deviation describes the spread of a set of data and the standard error describes a range within which we expect the population mean to lie.
Which of the following is correct?
The standard deviation measures spread of a data set whereas the standard error describes the width of a region within which we believe the population mean lies.
The Z distribution has a lower variance than the t distribution because:
The t distribution includes the uncertainty in our estimate of the population variance.
Why do values in the t table decrease as you move downwards?
The t distribution is narrowing to resemble the normal distribution.
Which of the following most accurately describes the Milgram experiment?
The willingness of people to administer lethal punishment when instructed to do so by authority figures was examined.
I mentioned a biological technique in class called a "microarray" which allows the comparison of the expression levels of thousands of genes simultaneously. Typically, a sample is taken from a treatment group (e.g., cancer cells, drug treated cells) and compared to a sample from a control treatment. Consider an example where the expression levels of 3,000 different genes are compared and t tests are used to calculate significant differences. If the tests return 200 significant differences, what is the best interpretation of this result?
There are probably 50 genuine differences and 150 type I errors.
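The arithmetic behind this answer is short enough to write out. A minimal sketch, assuming alpha = 0.05 for every test and that nearly all of the 3,000 null hypotheses are true, so about alpha times the number of tests come out significant by chance; the variable names are illustrative.

```python
n_tests = 3000
alpha = 0.05
n_significant = 200

# If every null hypothesis were true, about alpha * n_tests tests would still
# come out significant by chance (type I errors).
expected_false_positives = round(n_tests * alpha)
likely_genuine = n_significant - expected_false_positives
print(expected_false_positives, likely_genuine)  # 150 50
```

This is only an expectation; the actual number of false positives in any one experiment will vary around 150.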
When scientists use the word "significant" in their professional capacity, what is the technically best way to interpret this statement?
They have conducted a statistical test and obtained a p value less than 0.05.
The clinical trial involving 200 children was done in 1996; Pfizer settled the legal case with the city of Kano, Nigeria, out of court in 2009. Trovafloxacin has since been essentially pulled from use in developed countries due to a high risk of severe liver damage. Which of the following studies described in class is this the most similar to?
Tuskegee experiment.
Doris thinks that she may be pregnant, but being a good statistician, decides her null hypothesis will be that she is not pregnant when she takes an EPT pregnancy test. The test returns a "false positive"; it tells her she is pregnant when in fact she is not. When Doris then decides that she is pregnant what type of mistake has been made?
Type I error
Which of the following was a piece of advice I gave during the discussion of making good graphs and figures?
Use shades of gray to indicate magnitudes instead of colors.
What is the best description of the logic of statistical testing?
We calculate statistics based on samples and try to infer patterns in the data that are stronger than can easily be accounted for by sampling error.
Which of the following is an accurate statement?
We perform t tests to decide if two populations have different means, whereas we perform F tests to decide if two populations have different variances.
Which of the following is TRUE?
When comparing two population means we can always perform a heteroscedastic t test.
Which of the following is the best description of the way the technical term "statistically significant" is most commonly used?
When the null hypothesis of a statistical test can be rejected at the alpha=5% level.
Why is the F test not recommended in general?
When the values are not normally distributed, the test provides faulty results.
A 90% confidence interval calculated from a set of sample data is the region in which:
there is only a 10% probability the population mean is outside this region.