STA 210 ch. 8-12
5d. Put Barber back in the data set and remove Chouaa. Re-compute the correlation coefficient for the variables Javelin Distance and Hurdles Time. What is it? A. <0.001 B. -0.252 C. -0.025 D. -0.461
D. -0.461
12a. Use the applet above (website) to test the following hypothesis involving the average pain rating for the 0-3 and the 11-hour sleep groups. H0: μ(0-3 hour) = μ(11-hour) HA: μ(0-3 hour) ≠ μ(11-hour) The results are not statistically significant. What is the p-value? A. 0.3013 B. 0.0315 C. 0.0027 D. 0.1102
D. 0.1102
4e. What is the chance the condition will be present, given the test comes back negative, for the Ottawa Ankle Test based on these new screening results (rounded to two decimal places)? (table 8.17) A. 0.08 B. 0.16 C. 0.95 D. 0.81
D. 0.81
1c. Suppose the p-value you got in part a is denoted by the capital letter P. What would be the p-value for testing the hypothesis: H0: p ≥ 0.50 and HA: p < 0.50? A. abs(P) B. also P C. P - 1 D. 1 - P
D. 1 - P
What is the value of the ratio ((Pvytorin)/(1-Pvytorin))/((Pplacebo)/(1-Pplacebo))? A. 0.08 B. 0.65 C. 0.12 D. 1.64
D. 1.64
1e. How often did the error of saying "positive" when the patient really didn't have bowel cancer occur? (table 8.9) A. 2 times B. 1 time C. 182 times D. 18 times
D. 18 times
2c. According to this new table, what is the positive predictive value of the FST with this new rule? (table 8.12) A. 18/29 B. 18/40 C. 245/267 D. 245/256
D. 245/256
5b. What is the standard score associated with the hypothesis shown? A. 2.14 B. 0.05 C. 1.40 D. 4.10
D. 4.10
14a. The correct values for the missing entries in the table are: A. A = 0.010; B = 0.002; C = 0.018; D = 0.057; E = 0.289; F = 0.111; G = 0.385 B. A = 0.002; B = 0.018; C = 0.010; D = 0.289; E = 0.111; F = 0.385; G = 0.057 C. A = 0.010; B = 0.018; C = 0.002; D = 0.111; E = 0.385; F = 0.289; G = 0.057 D. A = 0.002; B = 0.010; C = 0.018; D = 0.057; E = 0.111; F = 0.289; G = 0.385
D. A = 0.002; B = 0.010; C = 0.018; D = 0.057; E = 0.111; F = 0.289; G = 0.385
1f. What might be the consequences of the FOB saying "negative" when the patient really did have bowel cancer? A. The patient would now be much more likely to also have a positive outcome on the gold standard treatment, thereby prolonging an unnecessary stay in the health care system B. This would potentially create unnecessary anxiety for the patient C. There are no immediate consequences since not having bowel cancer is a very positive outcome D. A patient with a potentially fatal disease might be deprived of the quickest post screening intervention possible
D. A patient with a potentially fatal disease might be deprived of the quickest post screening intervention possible
4b. Suppose you could plan this study all over again. Given the seriousness of cancer, you decide you want to be able to detect a small effect size (Cohen's d), with a Type I error rate of 0.05 and a Type II error rate of 0.20. What percentage of the 7500 that were studied in each group would actually have been needed? Assume you were conducting a two-sided test. A. About 50% B. About 500% C. About 0.5% D. About 5%
D. About 5%
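Worked check for 4b (a sketch, not the book's own method): the card does not say how the sample size was computed, so this uses the common normal-approximation formula n = 2((z_alpha/2 + z_beta)/d)^2 per group. It also assumes that a "small" effect means d = 0.2, Cohen's conventional benchmark. The function name `n_per_group` is made up for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, beta=0.20):
    """Approximate per-group sample size for a two-sided, two-group comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(1 - beta)        # quantile matching power = 1 - beta
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)


needed = n_per_group(d=0.2)   # assumption: "small" effect means d = 0.2
share = needed / 7500         # fraction of the 7500 actually studied per group
print(f"about {needed} per group, or {share:.1%} of 7500")
```

An exact t-based power calculation gives a slightly larger n, but the share of 7500 still rounds to about 5%.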
8b. After you ran enough people through your study, you were able to report that your results were statistically significant (see table). So you decide to begin seeking funding. What are two reasons why you are still likely to have an unconvincing case? A. Actual difference in treatments is only 1 percent, and the statistical significance is an artifact of decreasing sample size. B. Actual difference in treatments is only 10 percent, and the statistical significance is an artifact of increasing sample size. C. Actual difference in treatments is only 51 percent, and the statistical significance is an artifact of increasing sample size. D. Actual difference in treatments is only 1 percent, and the statistical significance is an artifact of increasing sample size.
D. Actual difference in treatments is only 1 percent, and the statistical significance is an artifact of increasing sample size.
10a. The phrase "statistically significant" is used in Mr. Wolf's testimony. What are the null (H0) and alternative (HA) hypotheses in the context of that testimony? A. H0: Vouchers help graduation rates vs. HA: no they don't B. H0: vouchers help students get jobs vs. HA: no they don't C. H0: Vouchers don't help students get jobs vs. HA: yes, they do D. H0: vouchers don't help graduation rates vs. HA: yes, they do
D. H0: vouchers don't help graduation rates vs. HA: yes, they do
1a. What is the alternative hypothesis that is being tested? A. taking pumpkin seed oil will improve symptoms in males by around 11% B. pumpkin seed oil is no better at improving urinary symptoms than a placebo is C. 65% of men who take pumpkin seed oil will like it better than a placebo D. pumpkin seed oil is better at improving urinary symptoms than a placebo is
D. pumpkin seed oil is better at improving urinary symptoms than a placebo is
3a. The CEO of a large electric utility claims that more than 80% of his customers are very satisfied with the service they receive. To test this claim, the local newspaper surveyed 100 customers using simple random sampling. Among the sampled customers, 81% said that they were very satisfied. Do these results provide sufficient evidence to accept or reject the CEO's claim? To answer this question, you will have to test the hypothesis H0: p ≤ 0.80 versus HA: p > 0.80. Assume a Type I error rate of α = 0.05. Report the standard score, the p-value, and state what your decision is. A. z = 0.40; p-value = 0.250; fail to reject the null. B. z = 0.81; p-value = 0.250; fail to reject the null. C. z = 0.80; p-value = 0.401; fail to reject the null D. z = 0.25; p-value = 0.401; fail to reject the null.
D. z = 0.25; p-value = 0.401; fail to reject the null.
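Worked check for 3a: a minimal sketch of the one-proportion z-test, assuming the usual large-sample normal approximation with the standard error computed under the null value. The helper name `one_proportion_z_test` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(p_hat, p0, n):
    """Upper-tailed z-test of H0: p <= p0 versus HA: p > p0."""
    se = sqrt(p0 * (1 - p0) / n)        # standard error under the null
    z = (p_hat - p0) / se               # standard score
    p_value = 1 - NormalDist().cdf(z)   # area to the right of z
    return z, p_value


z, p_value = one_proportion_z_test(p_hat=0.81, p0=0.80, n=100)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

Because the p-value is well above α = 0.05, the decision is to fail to reject the null.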
3b. Compute the p-value for testing the two-sided hypothesis shown for each of the final three weeks. Which of the following graphs is correct for a plot of the p-values over all four weeks? (table 11.3) Graph A-D?
Graph A
5a. Which of the following is the correct scatterplot of Javelin Distance on the vertical axis versus Hurdles Time on the horizontal axis? Graph A-D?
Graph A
2a. Which of the following is the correct scatterplot of variable y1 on the vertical axis versus variable x1 on the horizontal axis? Graph A-D?
Graph B
3c. Patients with suspected hypothyroidism were screened by Goldstein and Mushlin (J. Gen. Intern. Med. 1987;2:20-24) using thyroxine levels, often abbreviated as T4. Thyroxine is a hormone secreted into the bloodstream by the thyroid. The authors looked at three cutoff values for thyroxine level: 5 or less; 7 or less; 9 or less. With these three cutoffs they found the following: (table 8.15) A plot of sensitivities (along a y-axis) versus FPRs (along an x-axis) for different rules is called a receiver operating characteristic curve (ROC). An ROC is a convenient way of deciding what cutoff rule is best for a particular screening test. Which of the following plots shown below is the correct ROC plot for the hypothyroid study? Graph A - D?
Graph B
3a. Compute Cohen's d for each of the final three weeks. Which of the following graphs is correct for a plot of Cohen's d over all four weeks? (table 11.3) Graph A-D?
Graph C
7a. Which of the following plots is most appropriate for assessing the association between Smoking Status and Myocardial Infarction in 10 years? Graphs A-D?
Graph D
5b. Compute the correlation coefficient for the variables Javelin Distance and Hurdles Time. What is it? A. -0.252 B. -0.461 C. -0.025 D. <0.001
A. -0.252
12b. Use the applet above (or one of your choice) to test the following hypothesis about the average pain ratings for the 5-hour and the 8-hour sleep groups. H0: μ(5-hour) = μ(8-hour) HA: μ(5-hour) ≠ μ(8-hour) The results are statistically significant. What is the p-value? A. 0.0027 B. 0.3013 C. 0.0315 D. 0.1261
A. 0.0027
7c. What is the overall accuracy for this gender screening test based on the screening results above (answer rounded to three decimal places)? (table 8.20) A. 0.809 B. 0 C. 0.948 D. 0.853
A. 0.809
6c. What is the overall accuracy for this cancer screening test based on the screening results above (answer rounded to three decimal places)? (table 8.19) A. 0.922 B. 0.050 C. 0.400 D. 0.949
A. 0.922
7a. What is the sensitivity of the test based on these screening results (rounded to three decimal places)? (table 8.20) A. 0.948 B. 0 C. 0.809 D. 0.853
A. 0.948
4a. What is the sensitivity of the test (rounded to two decimal places)? (table 8.16) A. 0.95 B. 0.16 C. 0.81 D. 0.09
A. 0.95
4d. Now let's change the data and suppose the screen test produced the results shown in Table 8.17. What is the sensitivity of the test based on these new screening results (rounded to two decimal places)? (table 8.17) A. 0.95 B. 0.08 C. 0.16 D. 0.81
A. 0.95
1g. How often did the error saying "negative" when the patient really did have bowel cancer occur? (table 8.9) A. 1 time B. 2 times C. 18 times D. 182 times
A. 1 time
3a. Change the rule as needed and find the remaining six entries of Table 8.14 (rounded to two decimal places). (table 8.14) A-F?
A. 1.00 B. 0.79 C. 0.48 D. 1.00 E. 0.98 F. 0.91
2d. According to this new table, what is the negative predictive value of the FST with this new rule? (table 8.12) A. 18/40 B. 18/29 C. 245/256 D. 245/267
A. 18/40
1b. How many times did the FOB make the right decision? (table 8.9) A. 184 times B. 19 times C. 200 times D. 3 times
A. 184 times
2e. According to this new table, what is the sensitivity of the FST with this new rule? (table 8.12) A. 245/267 B. 18/29 C. 18/40 D. 245/256
A. 245/267
13a. The correct values in the first column (under the True Mean of 10.5) are: A. A = 0.06; B = 0.17; C = 0.89; D = 1.00 B. A = 0.16; B = 0.85; C = 1.00; D = 1.00 C. A = 0.25; B = 0.98; C = 1.00; D = 1.00 D. A = 0.10; B = 0.52; C = 1.00; D = 1.00
A. A = 0.06; B = 0.17; C = 0.89; D = 1.00
13d. The correct values in the fourth column (under the True Mean of 12) are: A. A = 0.25; B = 0.98; C = 1.00; D = 1.00 B. A = 0.06; B = 0.17; C = 0.89; D = 1.00 C. A = 0.10; B = 0.52; C = 1.00; D = 1.00 D. A = 0.16; B = 0.85; C = 1.00; D = 1.00
A. A = 0.25; B = 0.98; C = 1.00; D = 1.00
4a. An experiment like the one described above ultimately has to make a choice between two possible outcomes. What are those outcomes here? A. A choice has to be made between ginkgo biloba being no better than a placebo and ginkgo biloba being better than a placebo. B. A choice has to be made between ginkgo biloba creating a notable change from baseline cognitive ability and ginkgo biloba not creating a notable change from baseline cognitive ability C. A choice has to be made between ginkgo biloba not being approved by the University of Virginia and ginkgo biloba being approved by the University of Virginia D. A choice has to be made between ginkgo biloba being better for Alzheimer's patients than for patients with other forms of dementia
A. A choice has to be made between ginkgo biloba being no better than a placebo and ginkgo biloba being better than a placebo.
2a. This is clearly a case wherein results that were not statistically significant were judged by the Supreme Court to be practically significant. What practical significance was of primary interest to the court in this article? A. At issue is the practical importance of users who risked losing their sense of smell after having used Zicam, even if those numbers were too small to be statistically significant. B. At issue is the practical importance of users who risked losing their sense of smell after having used Zicam, even though those numbers were big enough to be statistically significant. C. At issue is the practical importance of investors who stood to lose large amounts of money over unreported safety issues, even though those numbers were big enough to be statistically significant. D. At issue is the practical importance of investors who stood to lose large amounts of money over unreported safety issues, even if those numbers were too small to be statistically significant.
A. At issue is the practical importance of users who risked losing their sense of smell after having used Zicam, even if those numbers were too small to be statistically significant.
2b. Was the alternative accepted or not? How do you know? A. HA was not accepted because the results were not statistically significant B. HA was not accepted because the results were statistically significant C. HA was accepted because the results were not statistically significant D. HA was accepted because the results were statistically significant
A. HA was not accepted because the results were not statistically significant
6a. What alternative hypothesis was under scrutiny when the phrase "significant difference" was used? A. That Vytorin was better than a placebo with respect to risk of needing valve replacement or having heart failure. B. That Vytorin was no better than a placebo with respect to risk of needing valve replacement or having heart failure. C. That Vytorin was no worse than a placebo with respect to the risk of developing cancer at some point after being taken. D. That Vytorin was worse than a placebo with respect to the risk of developing cancer at some point after being taken.
A. That Vytorin was better than a placebo with respect to risk of needing valve replacement or having heart failure.
6b. What null hypothesis was under scrutiny when the phrase "statistical significance" was used? A. That Vytorin was no worse than a placebo with respect to the risk of developing cancer at some point after being taken. B. That Vytorin was no better than a placebo with respect to risk of needing valve replacement or having heart failure. C. That Vytorin was worse than a placebo with respect to the risk of developing cancer at some point after being taken. D. That Vytorin was better than a placebo with respect to risk of needing valve replacement or having heart failure.
A. That Vytorin was no worse than a placebo with respect to the risk of developing cancer at some point after being taken.
2a. What is the null hypothesis that is being tested? A. Pagoclone is no better at improving stuttering symptoms than a placebo is B. 88 out of 132 patients who take pagoclone will experience improvements with their stuttering symptoms C. pagoclone is better at improving stuttering symptoms than a placebo is D. pagoclone is no better at improving stuttering symptoms than Indevus
A. Pagoclone is no better at improving stuttering symptoms than a placebo is
1b. Regardless of what the p-value is in part a., how would it change if the sample percentage of 60% were based on a sample that is a lot bigger than 75, instead of n = 75? A. The p-value would decrease B. It is impossible to know if it would change or not, unless you have a specific n to do the computation with C. the p-value would increase D. the p-value would not change
A. the p-value would decrease
4b. What is the chance the condition will be present, given the test came back negative, for the Ottawa Ankle Test (rounded to two decimal places)? (table 8.16) A. 0.16 B. 0.09 C. 0.95 D. 0.81
B. 0.09
13c. The correct values in the third column (under the True Mean of 11.5) are: A. A = 0.10; B = 0.52; C = 1.00; D = 1.00 B. A = 0.16; B = 0.85; C = 1.00; D = 1.00 C. A = 0.06; B = 0.17; C = 0.89; D = 1.00 D. A = 0.25; B = 0.98; C = 1.00; D = 1.00
B. A = 0.16; B = 0.85; C = 1.00; D = 1.00
1b. Was the alternative accepted or not? How do you know? A. HA was not accepted because the results were statistically significant B. HA was accepted because the results were statistically significant C. HA was accepted because the results were not statistically significant D. HA was not accepted because the results were not statistically significant
B. HA was accepted because the results were statistically significant
What is a reasonable interpretation of the rate (Pvytorin)/(1-Pvytorin)? A. it's the proportion of subjects who did not develop cancer in the Vytorin group B. It's the odds of developing cancer in the Vytorin group C. It's the proportion of subjects who developed cancer in the Vytorin group D. It's the odds of not developing cancer in the Vytorin group
B. It's the odds of developing cancer in the Vytorin group
1d. What might be the consequences of the FOB screening test saying "positive" when the patient really didn't have bowel cancer? A. The patient would now be much more likely to also have a positive outcome on the gold standard treatment, thereby prolonging an unnecessary stay in the health care system B. This would potentially create unnecessary anxiety for the patient C. There are no immediate consequences since not having bowel cancer is a very positive outcome D. A patient with a potentially fatal disease might be deprived of the quickest post screening intervention possible
B. This would potentially create unnecessary anxiety for the patient
4f. The prevalence of a condition is just the probability that the condition will be present in the population studied. What is the prevalence based on each of the two sets of screening data? (table 8.16 and 8.17) A. 0.93 (rounded) in both cases B. 0.93 based on the first; 0.22 based on the second C. 0.22 based on the first; 0.93 based on the second D. 0.22 (rounded) in both cases
C. 0.22 based on the first; 0.93 based on the second
6a. What is the sensitivity of the test based on these screening results (rounded to three decimal places)? (table 8.19) A. 0.949 B. 0.922 C. 0.400 D. 0.050
C. 0.400
2b. Compute the correlation coefficient for the variables x1 and y1. What is it? A. -0.67 B. -0.82 C. 0.82 D. 0.67
C. 0.82
2b. Suppose we change the rule to say that any score in any of the three categories that is a "3" or higher means the roadside test tagged the participant as drunk. This is what the table of counts will look like with this new rule. According to this new table, what is the specificity of the FST with this new rule? (table 8.12) A. 245/267 B. 18/40 C. 18/29 D. 245/256
C. 18/29
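Worked check for 2b through 2e: Table 8.12 is not reproduced in this deck, so the four counts below are inferred from the answer fractions themselves (245/267, 18/29, 245/256, 18/40) and should be read as an assumption. They do add up to the 296 participants mentioned in 2a. This sketch shows how each measure reads off the 2x2 table.

```python
# Counts inferred from the answers to 2b-2e (assumption: the table itself is not shown here).
tp, fn = 245, 22   # legally drunk: tagged drunk / tagged not drunk
fp, tn = 11, 18    # not drunk:     tagged drunk / tagged not drunk

measures = {
    "sensitivity": (tp, tp + fn),                 # of those truly drunk, share tagged drunk
    "specificity": (tn, tn + fp),                 # of those not drunk, share tagged not drunk
    "positive predictive value": (tp, tp + fp),   # of those tagged drunk, share truly drunk
    "negative predictive value": (tn, tn + fn),   # of those tagged not drunk, share truly not drunk
}

for name, (num, den) in measures.items():
    print(f"{name}: {num}/{den} = {num / den:.3f}")
```

Sensitivity and specificity condition on the true state (row totals here), while the predictive values condition on the test result (column totals).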
1c. What percentage of the time did the FOB make the wrong decision? (table 8.9) A. About 49% of the time B. About 1% of the time C. About 9% of the time D. About 41% of the time
C. About 9% of the time
4b. Which outcome was chosen? How is this related to the Type I Error Rate (false positive rate)? A. Since the results were NOT statistically significant, we know the p-value computed must have been bigger than a Type II error rate (our analogy to an FPR) that we assume was taken to be 0.05. This means that HA was chosen, so not enough evidence to say ginkgo biloba was effective. B. Since the results were NOT statistically significant, we know the p-value computed must have been smaller than a Type II error rate (our analogy to an FPR) that we assume was taken to be 0.05. This means that HA was not chosen, so not enough evidence to say ginkgo biloba was effective. C. Since the results were NOT statistically significant, we know the p-value computed must have been bigger than a Type I error rate (our analogy to an FPR) that we assume was taken to be 0.05. This means that HA was not chosen, so not enough evidence to say ginkgo biloba was effective. D. Since the results were NOT statistically significant, we know the p-value computed must have been smaller than a Type I error rate (our analogy to an FPR) that we assume was taken to be 0.05. This means that HA was chosen, so enough evidence to say ginkgo biloba was effective.
C. Since the results were NOT statistically significant, we know the p-value computed must have been bigger than a Type I error rate (our analogy to an FPR) that we assume was taken to be 0.05. This means that HA was not chosen, so not enough evidence to say ginkgo biloba was effective.
3b. Regardless of what the p-value is in part a., how would it change if the sample percentage based on a sample of 100 customers were larger than 81%? A. it is impossible to know if it would change or not, unless you have a specific sample percentage to do the computation with B. the p-value would increase C. the p-value would decrease D. the p-value would not change
C. the p-value would decrease
1a. Report the standard score, the p-value, and state what your decision is. A. z = 0.60; p-value = 0.420; fail to reject the null. B. z = 0.60; p-value = 0.420; reject the null. C. z = 1.73; p-value = 0.042; reject the null. D. z = 1.73; p-value = 0.042; fail to reject the null.
C. z = 1.73; p-value = 0.042; reject the null.
2a. Recall that this study assumed that a BAC of 0.04% or above means that a person is legally drunk. There were 296 participants in the study, and part of the table is already filled out for the participants who were legally drunk. Use the rule that any score in any of the three categories that is a "2" or higher means the roadside test tagged the participant as drunk. What are the values of the missing entries in Table 8.11? (table 8.11) A-E?
A. 2 B. 27 C. 29 D. 4 E. 292
3c. Suppose the p-value you got in part a is denoted by the capital letter P. What would be the p-value for testing the hypothesis: H0: p = 0.80 and HA: p ≠ 0.80? A. 2 times P B. Abs(P/2) C. 1 - abs(P) D. 2 times the absolute value of (1-P)
A. 2 times P
1b. We don't know the variance of the measurements in the flibanserin group, but let's assume it is 2.3; similarly, we will assume that the variance in the placebo group was 1.5. Likewise, we don't know exactly how the 1,378 women were divided, but let's assume 700 were in the flibanserin group and 678 were in the placebo group. If 0.8 is the difference in the average number of sexually satisfying events between the flibanserin and the placebo group, what is the value for Cohen's d in this study? A. 0.80 B. 0.58 C. 0.38 D. 1.58
B. 0.58
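Worked check for 1b: a minimal sketch that assumes Cohen's d is the mean difference divided by the pooled standard deviation. The course text may define the denominator slightly differently, but a simple average of the two variances gives the same rounded value here. The function name `cohens_d` is made up for illustration.

```python
from math import sqrt


def cohens_d(mean_diff, var1, n1, var2, n2):
    """Mean difference divided by the pooled standard deviation."""
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    return mean_diff / sqrt(pooled_var)


# Values assumed in the question: variances 2.3 and 1.5, group sizes 700 and 678.
d = cohens_d(mean_diff=0.8, var1=2.3, n1=700, var2=1.5, n2=678)
print(f"Cohen's d = {d:.2f}")
```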
6b. What is the specificity for the test based on these screening results (rounded to three decimal places)? (table 8.19) A. 0.050 B. 0.949 C. 0.922 D. 0.400
B. 0.949
5c. Remove Barber from the data set and compute the correlation coefficient for the variables Javelin Distance and Hurdles Time. What is it? A. -0.252 B. <0.001 C. -0.025 D. -0.461
B. <0.001
13b. The correct values in the second column (under the True Mean of 11) are: A. A = 0.06; B = 0.17; C = 0.89; D = 1.00 B. A = 0.10; B = 0.52; C = 1.00; D = 1.00 C. A = 0.16; B = 0.85; C = 1.00; D = 1.00 D. A = 0.25; B = 0.98; C = 1.00; D = 1.00
B. A = 0.10; B = 0.52; C = 1.00; D = 1.00
8a. Consider the following hypothesis. Complete the entries in the table below for the different sample sizes shown. Remember that p̂ = 0.51 in all cases. (table 10.2) A. A = 0.11; B = 0.35; C = 0.01; D < 0.00011 B. A = 0.42; B = 0.26; C = 0.02; D < 0.00011 C. A = 0.20; B = 0.63; C = 2.00; D = 6.32 D. A = 0.50; B = 0.51; C = 0.01; D > 1.00
B. A = 0.42; B = 0.26; C = 0.02; D < 0.00011
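Worked check for 8a: neither the hypothesis nor Table 10.2 is reproduced here, so this sketch assumes H0: p ≤ 0.50 versus HA: p > 0.50 and sample sizes of 100, 1,000, 10,000 and 100,000. Those sizes are inferred from the standard scores listed in option C (0.20, 0.63, 2.00, 6.32) and are not confirmed by the source.

```python
from math import sqrt
from statistics import NormalDist

p_hat, p0 = 0.51, 0.50   # same sample proportion at every sample size
results = {}

for n in (100, 1_000, 10_000, 100_000):   # assumed sample sizes
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # standard score grows like sqrt(n)
    p_value = 1 - NormalDist().cdf(z)            # upper-tail area
    results[n] = (z, p_value)
    print(f"n = {n:>7}: z = {z:.2f}, p-value = {p_value:.2g}")
```

The observed difference stays at one percentage point throughout. Only the sample size changes, which is why the p-value alone says little about practical importance (see 8b).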
4a. The results say that there were 8% fewer TOTAL cancers in the vitamin group than in the placebo group. If there were 1000 total cancers reported in the placebo group, what was the difference in the average number of cancers reported between the two groups? A. About 0.25 B. About 0.01 C. About 0.50 D. About 1500
B. About 0.01
1b. What is the measurement level of the two variables "number of Nobel Prize winners" and "nation's chocolate consumption"? A. Both are nominal. B. Both are ratio. C. Number of Nobel Prize winners is ratio and chocolate consumption is nominal D. number of Nobel prize winners is nominal and chocolate consumption is ratio
B. Both are ratio
1a. What is the measurement level of the two variables "smoking status" and "myocardial infarction"? A. Myocardial Infarction is ratio and smoking status is nominal B. both are nominal C. both are ratio D. myocardial infarction is nominal and smoking status
B. both are nominal
7b. What is the specificity for the test based on these screening results (rounded to three decimal places)? (table 8.20) A. 0.948 B. 0.809 C. 0 D. 0.853
C. 0
7b. What is Cramér's V for these data? A. 0.110 B. 59.59 C. 0.073 D. 40.25
C. 0.073
5a. What proportion of Vytorin users in the study developed cancer? A. 0.0728 B. 0.0107 C. 0.1074 D. 102
C. 0.1074
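Worked check for 5a: the study table is not reproduced here, so this sketch assumes 102 cancers (the count listed as option D) among the 950 Vytorin patients mentioned in 5d. It also shows how the same counts give the odds Pvytorin/(1 - Pvytorin) used in the odds-ratio cards.

```python
# Assumed counts: 102 cancers among 950 Vytorin patients (inferred, not shown in this deck).
cancers, n_vytorin = 102, 950

p_vytorin = cancers / n_vytorin            # proportion who developed cancer
odds_vytorin = p_vytorin / (1 - p_vytorin) # odds of developing cancer

print(f"proportion = {p_vytorin:.4f}, odds = {odds_vytorin:.4f}")
```

The odds equal cancers divided by non-cancers (102/848), which differs from the proportion because the denominator excludes the cases.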
10b. What does the decision that was made have to do with a false positive rate? A. The FPR is essentially the same as a p-value. In this case the FPR must have been bigger for the DC OSP than for non-voucher programs. B. The FPR is essentially the same as a p-value. In this case, the FPR must have been smaller than 0.05. C. The FPR is essentially the same as an assumed Type I error rate, typically taken to be 0.05. In this case, the p-value must have been smaller than 0.05. D. The FPR is essentially the same as an assumed Type I error rate, typically taken to be 0.05. In this case, the p-value must have been bigger than 0.05.
C. The FPR is essentially the same as an assumed Type I error rate, typically taken to be 0.05. In this case, the p-value must have been smaller than 0.05.
5c. Using the standard score table in this book, what is the most you can say about the p-value for this hypothesis? A. it is between 0.001 and 0.01 B. it is larger than 0.00011 C. it is smaller than 0.00011 D. it is larger than 0.01
C. it is smaller than 0.00011
5d. Suppose a new study is done with n Vytorin patients, where n is a lot smaller than 950. If the sample proportion of the Vytorin patients in the new study who developed cancer is the same as the real study with 950 patients, how would the new standard score compare to the one from the real study? A. it would be larger B. you can't answer this if you don't know n C. it would be smaller D. it would be the same since the proportion of those developing cancer is the same
C. it would be smaller
1a. Where does the statement "about 0.8 sexually satisfying events" come from? A. it's the difference between the 3.5 and 2.7 B. it's the result of computing Cohen's d for the full data set C. it's the difference between the 4.5 and the 3.7 D. it's the difference between the 4.5 and the 2.8 and averaged over the two groups
C. it's the difference between the 4.5 and the 3.7
1c. Can we be certain that pumpkin seed oil is effective? A. Yes, because the p-value must have been less than the pre-set alpha level B. yes, because the results were statistically significant C. no, because there is always a chance a type 1 error occurred D. no, because the results were not statistically significant
C. no, because there is always a chance a type 1 error occurred
2c. Can we be certain that Pagoclone is ineffective? A. Yes, because there is always a chance a type 2 error occurred B. yes, because the p-value must have been greater than the pre-set alpha level C. no, because there is always a chance a type 2 error occurred D. no, because the results were not statistically significant
C. no, because there is always a chance a type 2 error occurred