PH 223 Biostats
For a 99% confidence interval, there is a 99% probability that the true population parameter lies within that interval.
True False - False
True or False: Suppose that, in a random sample of 100 men between the ages of 60 and 79, you found that 30 had heart disease. According to the central limit theorem, the resulting sample proportion arises from a sampling distribution that is normally shaped with the center of the distribution at 30%.
True False - False -- The central limit theorem tells us that a sample statistic such as this sample proportion comes from an approximately normal sampling distribution if the observations are independent and the sample size is reasonably large. That sampling distribution is centered at the true population proportion, which is unknown, not at the observed 30%. The correct answer is 'False'
For data following the normal distribution, one would expect the mean to be larger than the median.
True False - False -- False, the mean approximately equals the median for normally distributed data
All else being equal, increasing the sample size of a study will increase the width of a confidence interval representing greater precision in our estimate of the population parameter.
True False - False --Increasing the sample size will decrease the margin of error leading to a narrower confidence interval. The correct answer is 'False'.
True or False: A researcher pre-specifies that they are willing to tolerate a Type I error rate of 0.05. Study results arise for which the p-value is 0.06. Therefore, the researcher can accept the null hypothesis.
True False - False --One can never accept a null hypothesis. With a p-value greater than the pre-specified type I error rate, your study does not provide adequate evidence to reject the null hypothesis. The correct answer is 'False'.
One of the assumptions for the chi-square test of independence is that the observed frequencies must be at least 5 in each cell.
True False - False --The expected frequency, not the observed frequency, must be at least 5 in each cell. The correct answer is 'False'.
If the null hypothesis that the means of four groups are all the same is rejected using ANOVA at a 5% significance level, then we can conclude that all the means are different from one another.
True False - False --The rejected null hypothesis means that at least one of the means is different. This does not mean that all the means are different from one another. The correct answer is 'False'.
In order for the Central Limit Theorem to be used to justify using a Z-test to test hypotheses about proportions, we must be satisfied that 1) all observed cases are independent of each other and 2) the study is large enough as conventionally indicated by assessing if at least 10 "successes" and 10 "failures" have been observed in each group.
True False - True
The primary purpose of randomization is to control confounding by allocating a similar distribution of patients across comparator groups with respect to background factors that represent potential confounders.
True False - True
True or False: In a study evaluating how long it takes subjects to attain a clinically significant response (e.g. the "event"), time-to-event analysis methods are used to integrate knowledge about 1) each subject's observed time-at-risk for the "event" and 2) whether or not the subject ultimately experienced the "event" during the study.
True False - True
True or False: The following is an example of a one-sided alternative hypothesis: People who take a vitamin C supplement have a lower risk of getting a cold compared to those who don't take vitamin C supplements.
True False - True
True or false: A paired t-test can be treated as an inference about the mean of differences between two experimental conditions for a single sample of independent subjects.
True False - True
True or false: Inclusion of an interaction effect in a multiple regression model enables us to assess if the extent of the primary association of interest depends on the subgroup of another factor.
True False - True
True or false: Inclusion of possible confounding factors in a multiple regression model enables us to get a more precise estimate of the primary association of interest independent of the confounding factors.
True False - True
A 99% CI for a point estimate will be wider than a 95% CI for a given point estimate and standard error.
True False - True -- 99% margin of error = 2.576*SE; 95% margin of error = 1.96*SE. So the 99% CI is wider because of the larger z-score. The correct answer is 'True'.
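A short sketch of where the 1.96 and 2.576 multipliers come from, using Python's standard library. The function name z_critical and the standard error of 1.0 are illustrative assumptions, not part of the course material.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical z value for a confidence level such as 0.95."""
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)

se = 1.0  # hypothetical standard error, chosen only for illustration
margin_95 = z_critical(0.95) * se
margin_99 = z_critical(0.99) * se
print(round(z_critical(0.95), 3), round(z_critical(0.99), 3))
print(margin_99 > margin_95)  # the 99% interval is the wider one
```

For any fixed point estimate and standard error, the larger multiplier is the only thing that changes, which is why the 99% interval is always wider.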
If we know that the probability that a randomly selected household owns 1 or more cats is 0.3, we can say that the probability that a randomly selected house does not own a cat is 0.7
True False - True -- Because of the complement rule, 1-0.3=0.7 The correct answer is 'True'.
True or False: Researchers are interested in whether two therapeutic alternatives yield different response rates. Assuming they are only willing to tolerate a 5% type I error rate, if it is observed that one therapy has a response rate 30 percentage points higher than the other with a standard error of the difference in proportions of 15 percentage points, then the researchers would reject the null hypothesis that the response rates are the same for the two therapies versus the alternative hypot
True False - True -- The alternative hypothesis is 2-sided since we are asking if there is evidence of a difference. With a type I error rate of 5%, one would therefore reject the null hypothesis if the 2-sided p-value is less than 0.05. In this case, we observe Z=(30-0)/15=2 (2-sided p-value=0.046) so we would reject the null hypothesis. The correct answer is 'True'.
What would the 95% confidence interval be for a sample proportion of 0.45, if the sample proportion was based on a survey of 250 adults?
a. (0.30 - 0.60) b. (0.39 - 0.51) c. (0.40 - 0.50) d. (0.42 - 0.48) - b. (0.39 - 0.51) --Sample p = 0.45, n = 250, and the critical z associated with a 95% CI is 1.96. 0.45 +/- 1.96 * sqrt(0.45*(1-0.45)/250) = 0.45 +/- 1.96*0.031 = (0.39 - 0.51) The correct answer is: (0.39 - 0.51)
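The interval above can be checked with a few lines of Python. This is a minimal sketch of the normal-approximation (Wald) interval used in the feedback; the helper name proportion_ci is an assumption made for illustration.

```python
import math
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence=0.95):
    """Normal-approximation confidence interval for a single proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of p_hat
    return p_hat - z * se, p_hat + z * se

low, high = proportion_ci(0.45, 250)
print(round(low, 2), round(high, 2))
```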
The 2010 General Social Survey asked 1,259 US residents: "Do you think the use of marijuana should be made legal?" 48% of the respondents said it should be made legal. Suppose legislators have agreed to discuss modifying marijuana possession laws if at least 45% of US residents support legalization. Calculate the p-value to assess whether we have enough evidence to conclude that at least 45% of US residents support legalization.
a. 0.017 b. Not enough information c. 0.016 d. 0.032 - c. 0.016 -- This is a one-sided alternative hypothesis (p>0.45). Z = (.48-.45)/sqrt(.45(1-.45)/1259) = 2.14 p-value = Pr(Z>2.14) = 0.0162 Alternatively, you can use the "Proportion test calculator" in STATA to derive this value. The correct answer is: 0.016
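As a cross-check that does not rely on the STATA calculator, the same one-sided test can be sketched in Python. The function name is hypothetical; the standard error uses the null proportion, as in the feedback.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(p_hat, p_null, n):
    """Upper-tailed z-test for one proportion (HA: p > p_null)."""
    se_null = math.sqrt(p_null * (1 - p_null) / n)  # SE under the null
    z = (p_hat - p_null) / se_null
    p_upper = 1 - NormalDist().cdf(z)               # right-tail area
    return z, p_upper

z, p_value = one_proportion_z_test(0.48, 0.45, 1259)
print(round(z, 2), round(p_value, 3))
```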
The prevalence of a disease is 2 in 100 (0.02) and researchers have developed a new test for this disease. With this new test, the probability of testing positive for the disease, given that you have the disease is 0.88. What is the probability that a randomly selected person tests positive and has the disease?
a. 0.018 b. 0.023 c. 0.02 d. 0.88 - a. 0.018 --A=testing positive B=has the disease P(A|B)=0.88 P(B)=0.02 P (A and B) = P(A|B) × P(B) = 0.88 × 0.02 = 0.0176, which rounds to 0.018 The correct answer is: 0.018
As part of a study of the association between smoking and risk of squamous cell carcinoma, a logistic regression model was estimated as logit = -4.6 + 0.5*SMOKER where SMOKER=1 for smokers and 0 for non-smokers. What is the predicted probability of having squamous cell carcinoma among smokers?
a. 0.094 b. 0.016 c. 0.018 d. 0.047 - b. 0.016 --predicted prob = exp(-4.6+(0.5*1)) / [1+exp(-4.6+(0.5*1))] = 0.0163 which rounds to 0.016 The correct answer is: 0.016
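The inverse-logit step in the feedback can be sketched as follows. The function name predicted_probability is an illustrative assumption; the coefficients are the ones given in the question.

```python
import math

def predicted_probability(intercept, slope, x):
    """Convert a logit (log-odds) from a logistic model into a probability."""
    log_odds = intercept + slope * x
    return math.exp(log_odds) / (1 + math.exp(log_odds))

p_smoker = predicted_probability(-4.6, 0.5, 1)      # SMOKER = 1
p_non_smoker = predicted_probability(-4.6, 0.5, 0)  # SMOKER = 0
print(round(p_smoker, 3), round(p_non_smoker, 3))
```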
A study is conducted to determine if the difference in mean FEV-1 between treated and control groups after a month of intervention is different than 0. The study observes a between-group difference of 10 percentage points (84% in the treated vs. 74% in the controls) with a standard error of 5. Calculate the p-value.
a. 0.18 b. 0.23 c. 0.046 - c. 0.046 -- H0: difference = 0 HA: difference =/= 0 Z = (10 - 0) / 5 = 2 p-value is two-sided (corresponding to the alternative hypothesis being "different than") = 0.046 The correct answer is: 0.046
A study is conducted to determine if the mean LDL-C levels in a group of secondary prevention subjects (e.g. subjects who have had a prior cardiovascular event) is above 70 mg/dL. The study group is observed to have a mean of 74 mg/dL with a standard error of 3. Calculate the p-value.
a. 0.18 b. 0.91 c. 0.09 - c. 0.09 -- H0: mean LDL <= 70 HA: mean LDL > 70 Z = (74 - 70) / 3 = 1.33 p-value is the one-sided right tail (corresponding to the alternative hypothesis being "greater than") = 0.09 The correct answer is: 0.09
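The FEV-1 and LDL-C questions use the same Z statistic but different tails. A minimal sketch, assuming the normal approximation used in the feedback (the function name z_p_values is hypothetical):

```python
from statistics import NormalDist

def z_p_values(estimate, null_value, se):
    """Return z, the upper one-sided p-value, and the two-sided p-value."""
    z = (estimate - null_value) / se
    upper = 1 - NormalDist().cdf(z)
    two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, upper, two_sided

# FEV-1: HA is "different than 0", so the two-sided p-value applies.
_, _, fev_p = z_p_values(10, 0, 5)
# LDL-C: HA is "above 70", so the upper-tail p-value applies.
_, ldl_p, _ = z_p_values(74, 70, 3)
print(round(fev_p, 3), round(ldl_p, 2))
```

Which tail to report is decided by the alternative hypothesis, not by the data.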
In the 2014 NFL season, passing yards per game were approximately normally distributed with a mean of 225 yards per game and a standard deviation of 30 yards per game. What percentage of games would you expect had between 195 and 255 passing yards? You may round to the nearest whole percentage point.
a. 16% b. 32% c. 0.6% d. 68% - d. 68% --195 and 255 are within -1 and 1 SD of the mean - so you could use the 68%, 95%, 99.7% rule to answer the question. Or z = (255 - 225) / 30 = 1 Pr (z<1) = 0.841 (the left tail corresponding to the probability of values lower than 255). z = (195 - 225) / 30 = -1 Pr (z<-1) = 0.159 (the left tail corresponding to the probability of values lower than 195). To get probability in between: 195 and 255: 0.841 - 0.159 = 0.682 The correct answer is: 68%
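The "area between two values" calculation can be reproduced directly, without converting to z-scores by hand. This is only a sketch; the variable names are illustrative.

```python
from statistics import NormalDist

passing_yards = NormalDist(mu=225, sigma=30)
# Area between 195 and 255 = left tail at 255 minus left tail at 195.
prob_between = passing_yards.cdf(255) - passing_yards.cdf(195)
print(round(prob_between, 2))
```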
Suppose that in a population of adults between the ages of 18 and 49, glucose follows a normal distribution with a mean of 93.5 and standard deviation of 19.8. What is the probability that glucose exceeds 120 in this population?
a. 18% b. 91% c. 9% - c. 9% --Z-score = (120 - 93.5) / 19.8 = 1.34 An area of 0.91 is below that value and 1-0.91=0.09 or 9% is above it. The correct answer is: 9%
What is the median for the following series of data? 30 36 29 30 35 25 26 24
a. 29.5 b. 32.5 c. 27.5 d. 30 - a. 29.5
            Category of Clinical Response
Treatment   Worsened   Unchanged   Improved   Total
Active          8          27         15        50
Placebo        22          23          5        50
Total          30          50         20       100
The cross-tabulation above represents hypothetical results from a clinical trial comparing an active treatment to placebo for which the primary efficacy parameter is an ordinal categorical variable. What is the probability that a study subject did not improve (i.e. worsened or remained unchanged)?
a. 30% b. 70% c. 80% - c. 80% --The question focuses on one random process (category of clinical outcome). The event is comprised of two possible outcomes so P(worsened or unchanged) = P(worsened) + P(unchanged) - P(worsened AND unchanged) = (30/100) + (50/100) - 0 = 80/100 (80%) NOTE: P(worsened AND unchanged) =0 since these are disjoint outcomes (they can not both be observed for a subject) The correct answer is: 80%
What is the 80th percentile of a metric that is normally distributed with mean 27 and standard deviation 4? You can round to the nearest hundredth (e.g. 2nd decimal place).
a. 30.37 b. 0.842 c. 32.13 - a. 30.37 --The Z score for which 80% of the normal curve is below it is 0.842. We need to derive the corresponding value of the metric as Z = 0.842 = (? - 27)/4. Algebra --> ? = 0.842*4 + 27 = 30.37. The correct answer is: 30.37
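Going from a percentile back to a value is the inverse of the usual z-score lookup. A small sketch using the standard library (variable names are illustrative):

```python
from statistics import NormalDist

z_80 = NormalDist().inv_cdf(0.80)          # z with 80% of the area below it
metric = NormalDist(mu=27, sigma=4)
percentile_80 = metric.inv_cdf(0.80)       # same as 27 + z_80 * 4
print(round(z_80, 3), round(percentile_80, 2))
```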
The Cardiovascular Risk in Young Finns Study showed average baseline triglyceride levels among 920 people of 0.926 mmol/L associated with a total sum of squared deviations of approximately 160 mmol/L. If a linear regression analysis of baseline triglyceride levels on physical activity levels had an R-squared of 0.30, what are the sum of squared deviations explained by the regression model? (HINT: write out the equation for R-squared and plug in the relevant information that is provided in the eq
a. 48 mmol/L b. 112 mmol/L c. Not enough information - a. 48 mmol/L --Since R-squared = (sum of squared deviations associated with the regression model) / (total sum of squared deviations) we know that 0.30 = (sum of squared deviations associated with the regression model) /160 so (sum of squared deviations associated with the regression model) = 160*0.30 = 48 The correct answer is: 48 mmol/L
A strength training program among marine recruits leads to an improvement in the number of push-ups completed in 5 minutes between the beginning and end of the program that is approximately normally distributed with a mean improvement of 9 push-ups (standard deviation = 3). What is the probability that a recruit improves by at least 12 push-ups? You may round to the nearest whole percentage point.
a. 50% b. 16% c. 68% - b. 16% -- For z = (12-9)/3 = 1, the Pr(Z>1) = 0.159 (16%) The correct answer is: 16%
            Category of Clinical Response
Treatment   Worsened   Unchanged   Improved   Total
Active          8          27         15        50
Placebo        22          23          5        50
Total          30          50         20       100
The cross-tabulation above represents hypothetical results from a clinical trial comparing an active treatment to placebo for which the primary efficacy parameter is an ordinal categorical variable. What is the probability that a study subject receiving active treatment improved?
a. 50% b. 30% c. 15% - b. 30% --The question revolves around two random processes (treatment group and category of clinical outcome). The event is a conditional probability P(improved | active) = P(improved AND active) / P(active) = (15/100) / (50/100) = 15/50 (30%) The correct answer is: 30%
A test has scores that are normally distributed with a mean of 290 and standard deviation of 90. What is the percentile ranking of a student scoring 350? Round off to the nearest whole percentile point.
a. 75th percentile b. 68th percentile c. 93rd percentile d. 81st percentile - a. 75th percentile -- z = (350-290) / 90 = 0.667 By definition, a percentile requires us to assess the probability LESS THAN the observed value which is 0.748 Rounds off to 75th. The correct answer is: 75th percentile
A group of 20 men that followed a higher fiber diet and 50 men that followed a conventional diet undergo a cholesterol test to measure triglyceride levels. The results of the study are in the table below:
                    High Triglyceride   Normal Triglyceride
High fiber diet             4                   16
Conventional diet          15                   35
Which of the following statements is not accurate (select one)?
a. 80% of men on a high fiber diet have normal triglycerides b. 20% of the men in this study were on a high fiber diet c. 30% of men on a conventional diet have high triglycerides - b. 20% of the men in this study were on a high fiber diet
Researchers investigated the effects of a new drug on blood pressure. They found that patients taking the drug lowered their blood pressure on average 15 mmHg more than patients taking the placebo ( p-value = 0.013). If the pre-specified type I error rate the researchers are willing to tolerate was set at 5%, what can the researchers conclude regarding the null hypothesis that there is no difference between drug and placebo?
a. Accept the null b. Fail to reject the null c. Reject the null d. Not enough information to determine - c. Reject the null -- Feedback: Since the p-value is <0.05, there is sufficient evidence to reject the null hypothesis. The correct answer is: Reject the null
In a cardiovascular disease (CVD) study, the proportion of those with CVD among non-smokers was 298/3055=0.0975 and the proportion among smokers was 81/744=0.1089. Is the proportion of those with CVD different among smokers and non-smokers? Use alpha=0.05
a. CVD rates are not statistically different between smokers and non-smokers b. CVD rates are statistically different between smokers and non-smokers - a. CVD rates are not statistically different between smokers and non-smokers --Two-proportion test, 2-sided H0: psmk=pnonsmk HA: psmk≠pnonsmk psmk=0.1089, pnonsmk=0.0975 ppooled=(81+298)/(744+3055)=0.0998 z=(0.1089-0.0975)/sqrt(0.0998*(1-0.0998)/744 + 0.0998*(1-0.0998)/3055)=0.927 Using a probability calculator, the two-sided p-value for the z-score of 0.927 is 0.354, which is >0.05, so we fail to reject the null. The correct answer is: CVD rates are not statistically different between smokers and non-smokers
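A sketch of the pooled two-proportion z-test, working from the raw counts rather than the rounded proportions. Because nothing is rounded along the way, the z value it produces is close to, but not exactly, the 0.927 quoted in the feedback; the conclusion is the same. The function name is an assumption for illustration.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for a difference in proportions with a pooled SE."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided

# Smokers: 81 of 744 with CVD; non-smokers: 298 of 3055 with CVD.
z, p_value = two_proportion_z_test(81, 744, 298, 3055)
print(round(z, 2), round(p_value, 2))
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```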
If a researcher was interested in knowing whether the age distribution of their cardiac study sample was similar to the distribution for a national study, which test would be appropriate?
a. Chi-square test of independence b. 1-proportion Z-test c. 2-proportion Z-test d. Chi-square test for goodness of fit - d. Chi-square test for goodness of fit
A researcher investigates the relationship between blueberry consumption and cognitive performance among students by randomly assigning 20 students to consume a freeze-dried blueberry snack and 20 students to take a placebo snack for 2 months and then having students complete a cognitive assessment. The response variable in this study is:
a. Cognitive performance b. Freeze-dried blueberry snack (yes / no) c. Not enough information to tell - a. Cognitive performance -- Cognitive performance is the response variable while freeze-dried blueberry snack (yes / no) is the explanatory variable in this study. The correct answer is: Cognitive performance
If the probability of either outcome A or outcome B arising during a single random process is 0.8 and we also know that the probability of outcome A is 0.4 while the probability of outcome B is 0.4, how are these two outcomes related?
a. Dependent b. Independent c. Non-disjoint d. Disjoint - d. Disjoint --Based on the General Addition Rule, where P(A or B) = P(A)+P(B)-P(A and B), we can conclude from the provided information that P(A and B) = 0 [0.8 = 0.4+0.4-0] indicating that the two outcomes are disjoint since P(A and B) is equal to 0. The correct answer is: Disjoint
Two processes are said to be __ if knowing the outcome of one provides no useful information about what outcome to expect from the other.
a. Disjoint b. Independent c. Dependent - b. Independent --Feedback: Independence occurs when the outcome from one random process does not affect the probability of an outcome from the other random process. P(A|B) = P(A). Be careful not to confuse this with the concept of disjoint which relates to the probability of two alternative outcomes arising in the context of a single random process.
Suppose a university's asymptomatic COVID-19 testing program indicates that 0.04% of tests are positive. What probability concept motivates us to be able to conclude that the probability of a negative test result is 99.96%?
a. Disjoint event b. Complementary event c. Dependent event - b. Complementary event
Researchers are interested in which comorbidities are associated with mortality from COVID-19, so they conduct a study by identifying adults with a COVID-19 diagnosis in their medical record between March 2020 and June 2020. They then identified from the medical records whether the included adults had comorbidities (yes/no) and whether they died (yes/no). What type of study design is this?
a. Experimental b. Retrospective observational c. Prospective observational - b. Retrospective observational
A study in which researchers assign a treatment to participants is an observational study.
a. False b. True - a. False
Consider the following data values: 3, 6, 7, 8, 9, 9, 9, 10, 10, 11, 12. If I changed the score of 12 to 22, what would happen to the median?
a. Increase b. Stay the same c. Decrease - b. Stay the same
Which assumption does the scatterplot appear to violate?
a. Independent observations b. Linearity c. Nearly normal residuals d. Homoscedasticity - d. Homoscedasticity --The data cloud shows a fan shaped pattern - indicating that the data likely does not have constant variance (homoscedasticity). The correct answer is: Homoscedasticity
The General Social Survey collects data on demographics, education, and work, among many other characteristics of US residents. The histograms below display the distributions of hours worked per week for two education groups: those with and without a college degree. What is the p-value assessing the null hypothesis that the number of hours worked per week by all Americans without a college degree is greater than for those with a college degree? Summary information for each group is shown in the
a. Indeterminate b. 0.007 c. 0.004 - c. 0.004 --The question indicates a one-sided hypothesis test. We will use a two-sample t-test. degrees of freedom = smaller sample size - 1 = 505-1 = 504 standard error of difference in means = sqrt (15.1^2/505 + 15.1^2/667) = 0.8907 Because of the large sample sizes, you will get virtually the same p-value if you use t or z. p-value = Pr (t>(41.8-39.4)/0.8907) = Pr(t>2.6945) = 0.0036 (can be rounded to 0.004) NOTE: Pr (Z>2.6945) = 0.0035 is virtually identical The correct answer is: 0.004
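The standard error and test statistic above can be reproduced from the summary statistics alone. This sketch uses the normal (Z) tail area, which the feedback notes is virtually identical to the t-based value at these sample sizes; the standard library has no t distribution, so the t-based 0.0036 is not recomputed here. Function and variable names are illustrative.

```python
import math
from statistics import NormalDist

def two_sample_summary_test(mean1, sd1, n1, mean2, sd2, n2):
    """Upper-tailed test for a difference in means from summary statistics."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)  # unpooled standard error
    stat = (mean1 - mean2) / se
    p_upper = 1 - NormalDist().cdf(stat)       # normal approximation
    return se, stat, p_upper

se, stat, p_value = two_sample_summary_test(41.8, 15.1, 505, 39.4, 15.1, 667)
print(round(se, 4), round(stat, 3), round(p_value, 4))
```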
In class, we discussed a quality control inspection procedure for ketchup bottle fill amounts in which they are expected to be normally distributed with mean 36 oz. and standard deviation 0.11 oz. and fill volumes between 35.8 oz. and 36.2 oz. are required to pass inspection. What proportion of bottles are expected to pass inspection?
a. Indeterminate b. 0.069 c. 0.931 - c. 0.931 --Lower bound: z = (35.8 - 36)/0.11 = -1.82 Upper bound: z = (36.2 - 36)/0.11 = +1.82 The area under the standard normal distribution curve between -1.82 and +1.82 is 0.931
As part of a study of the association between smoking and risk of squamous cell carcinoma (SCC), a logistic regression model was estimated as logit = -4.6 + 0.5*SMOKER where SMOKER=1 for smokers and 0 for non-smokers. What is the estimated odds ratio for smokers' risk of SCC versus non-smokers' risk of SCC?
a. Indeterminate b. 1.65 c. 1.82 - b. 1.65 -- The odds ratio can be calculated by exponentiating the coefficient for SMOKER: OR = exp(0.5) = 1.65 The correct answer is: 1.65
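A quick sketch showing why exponentiating the slope gives the odds ratio: the intercept cancels when the two groups' odds are divided. Variable names are illustrative.

```python
import math

intercept, slope = -4.6, 0.5

odds_smoker = math.exp(intercept + slope * 1)      # odds when SMOKER = 1
odds_non_smoker = math.exp(intercept + slope * 0)  # odds when SMOKER = 0

odds_ratio = math.exp(slope)
print(round(odds_ratio, 2))
print(round(odds_smoker / odds_non_smoker, 2))     # same value; intercept cancels
```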
When an observed distribution of data is left skewed, meaning that a modest number of observations have values far less than values for the bolus of observed cases, what is the expected relationship between the median and mean?
a. Indeterminate b. Mean > median c. Mean approximately equal to the median d. Mean < median - d. Mean < median
In the presence of skewed data, what would be the best set of descriptive statistics to represent the central tendency and spread of the data?
a. Mean and standard deviation b. Median and variance c. Median and interquartile range d. Median and standard deviation - c. Median and interquartile range
Which of the following are measures of central tendency? Select all that apply.
a. Median b. Interquartile range c. Mean d. Standard deviation - a. Median c. Mean --The median and mean describe central tendency; the interquartile range and standard deviation are measures of spread.
Which of the following statistics is not considered a robust statistic?
a. Median b. Interquartile range c. Standard deviation - c. Standard deviation
P(A or B) = P(A) + P(B) is the formula for the...
a. Multiplication rule for independent processes b. Addition rule for disjoint outcomes c. General addition rule for non-disjoint outcomes - b. Addition rule for disjoint outcomes --Feedback: The general addition rule states that P(A or B) = P(A) + P(B) - P(A and B). However, when two events are disjoint, P(A and B) is equal to zero which leaves us with just P(A or B) = P(A) + P(B). The correct answer is: Addition rule for disjoint outcomes
A group of 80 men that followed a higher fiber diet and 50 men that followed a conventional diet were assessed for the presence of colon polyps. The results of the study are in the table below:
                    Colon Polyps   No Colon Polyps   Total
High Fiber               16              64            80
Conventional Diet        15              35            50
Total                    31              99           130
Does this study indicate that men following a high fiber diet tend to have a higher risk of colon polyps than men that follow a conventional diet?
a. No b. Yes - a. No --20% of men on a high fiber diet had colon polyps (16/80). 30% of men on a conventional diet had colon polyps (15/50). Therefore, the men on a conventional diet tended to have a higher risk of having colon polyps. The correct answer is: No
The table below shows the distribution of cases of depression by amount of coffee consumption among a random sample of over 50 thousand individuals.
Clinical      Coffee Consumption (cups/week)
Depression    0        1-6      7-12     13-18    >18      Total
Yes           670      373      905      564      95       2,607
No            11,545   6,244    16,329   11,726   2,288    48,132
Total         12,215   6,617    17,234   12,290   2,383    50,739
Is there evidence to conclude that coffee consumption and depression are not independent at the 5% alpha level? How many degrees of freedom does you
a. No with 4 degrees of freedom b. Yes with 4 degrees of freedom c. Yes with 9 degrees of freedom d. Yes with 8 degrees of freedom - b. Yes with 4 degrees of freedom --1) Input the data as follows into the table calculator: 670 373 905 564 95 \ 11545 6244 16329 11726 2288 2) Select the Pearson chi-squared test statistic 3) p-value = 0.000 which is less than 0.05 so we can reject the null hypothesis (independence) and conclude that the incidence of depression varies depending on coffee consumption (e.g. depression status and coffee consumption are not independent). The correct answer is: Yes with 4 degrees of freedom
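For readers without the table calculator, here is a minimal sketch of the Pearson chi-square test of independence for the coffee table. The degrees of freedom are (rows - 1) * (columns - 1) = (2 - 1) * (5 - 1) = 4. The p-value helper uses a closed form for the chi-square upper tail that holds only when the degrees of freedom are even; both function names are assumptions made for illustration.

```python
import math

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a 2-D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def chi_square_sf_even_df(stat, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even df")
    half = stat / 2
    return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))

coffee = [
    [670, 373, 905, 564, 95],            # depression: yes
    [11545, 6244, 16329, 11726, 2288],   # depression: no
]
stat, df = chi_square_independence(coffee)
p_value = chi_square_sf_even_df(stat, df)
print(df, p_value < 0.05)
```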
Which of the following needs to be pre-specified before conducting a hypothesis test? Select all that apply.
a. Null hypothesis b. Type 2 error rate c. Type I error rate researchers are willing to tolerate d. Alternative hypothesis - a. Null hypothesis c. Type I error rate researchers are willing to tolerate d. Alternative hypothesis -- Null and alternative hypotheses as well as the type I error rate researchers are willing to tolerate (commonly referred to as the alpha level) need to be pre-specified. The correct answers are: Null hypothesis, Type I error rate researchers are willing to tolerate, Alternative hypothesis
Which of the following aspects of statistical inference does not need to be pre-specified before conducting a hypothesis test?
a. Null hypothesis b. Type 2 error rate c. Type I error rate researchers are willing to tolerate d. Alternative hypothesis - b. Type 2 error rate --Null and alternative hypotheses as well as the type I error rate researchers are willing to tolerate (commonly referred to as the alpha level) need to be pre-specified. The correct answer is: Type 2 error rate
Suppose you assessed 100 adults with hypercholesterolemia taking a statin regarding their level of cholesterol control as measured by low density lipoprotein (LDL). What type of variable is LDL?
a. Numerical b. Binary categorical c. Ordinal categorical - a. Numerical
Researchers studying the link between prenatal vitamin use and autism surveyed the mothers of a random sample of children with autism (cases) and children with typical development (controls). The table below shows the number of mothers in each group who did and did not use prenatal vitamins during the three months before pregnancy (periconception period) cross-tabulated with whether or not their child had autism. Autism Typical development Total Periconception prenatal Vitamin use No 111 7
a. One-sample test of proportion b. Two-sample test for difference in proportions c. Test for difference in sample means d. Chi-square test of independence - b. Two-sample test for difference in proportions --We are making comparisons between two groups so a two-sample test for difference in proportions is appropriate. We could either test the null hypothesis that the proportion of mothers using vitamins is the same among children with autism compared to children with typical development or we could compare the proportion of children with autism between mothers who used vitamins vs. those who did not. The correct answer is: Two-sample test for difference in proportions
In a study of geographic variation in hospitalization rates, what type of variable is a patient's zip code?
a. Ordinal categorical b. Categorical c. Discrete numerical - b. Categorical
Using the scatterplot and the regression table results, what would be the equation for the linear regression line?
a. Predicted BAC = 0.0180 + -0.0127 (beer cans) b. Predicted BAC = 0.0126 + 0.0024 (beer cans) c. Predicted BAC = -0.0127 + 0.0180 (beer cans) d. Predicted beer cans = -0.0127 + 0.0180 (BAC) - c. Predicted BAC = -0.0127 + 0.0180 (beer cans) --The general formula for linear regression is predicted y = intercept + slope * x. The intercept estimate is -0.0127 and the slope estimate (labeled beers) is 0.0180. The correct answer is: Predicted BAC = -0.0127 + 0.0180 (beer cans)
In the context of a study examining whether carrots can improve eyesight, which of the following would be a type 2 error?
a. Researchers conclude that carrots don't improve eyesight, and the truth is that carrots don't improve eyesight. b. Researchers conclude that carrots don't improve eyesight, but the truth is that carrots improve eyesight. c. Researchers conclude that carrots improve eyesight, but the truth is that carrots don't improve eyesight. d. Researchers conclude that carrots improve eyesight, and the truth is that carrots improve eyesight. - b. Researchers conclude that carrots don't improve eyesight, but the truth is that carrots improve eyesight. -- A type 2 error is failing to reject the null hypothesis when the alternative hypothesis is actually true. H0: Carrots do not improve eyesight HA: Carrots improve eyesight The correct answer is: Researchers conclude that carrots don't improve eyesight, but the truth is that carrots improve eyesight.
In a study assessing the impact of a carbohydrate-protein drink on extent of muscle recovery, what role does the consumption of carbohydrate-protein drink (yes / no) play?
a. Response variable b. Explanatory variable c. Indeterminate - b. Explanatory variable -- In this example carbohydrate-protein drink is the explanatory variable, and muscle recovery is the response variable. The correct answer is: Explanatory variable
A study among student-athletes assessing the association between amount of time pursuing athletics and academic performance randomly samples student-athletes from each of the varsity teams. What type of sampling scheme is this?
a. Simple random sample b. Stratified random sample c. Multi-stage sample d. Clustered sample - b. Stratified random sample
VO2 max is measured among student athletes on the rowing and tennis teams to assess which sport is associated with higher aerobic capacity. Which statement is true?
a. Sport (rowing vs tennis) is the explanatory variable and VO2 max is the response variable b. Sport (rowing vs tennis) is the response variable and VO2 max is the explanatory variable c. Indeterminate - a. Sport (rowing vs tennis) is the explanatory variable and VO2 max is the response variable
A survey on binge drinking by undergraduate students is administered to all students living in randomly selected dorms. What sampling method was used?
a. Stratified sample b. Simple random sample c. Clustered sample d. Multi-stage sample - c. Clustered sample
The Great Britain Office of Population Census and Surveys collected data on a random sample of 170 married couples in Britain recording the age and heights of the husbands and wives. The scatterplot on the left shows the wife's age plotted against her husband's age, and the plot on the right shows wife's height plotted against husband's height. Which figure indicates a stronger linear relationship?
a. The couples' ages b. The couples' heights - a. The couples' ages
Does the following distribution follow a normal distribution? (use alpha=0.05)
Bin                   Observed Frequency   Expected Proportion for Normal Distribution
< 2 SD below mean            10                  0.023
1-2 SD below mean            19                  0.136
0-1 SD below mean            54                  0.341
0-1 SD above mean            70                  0.341
1-2 SD above mean            35                  0.136
>2 SD above mean             12                  0.023
Total                       200                  1.000
a. The observed distribution matches the normal distribution. b. The observed distribution does not match the normal distribution. - b. The observed distribution does not match the normal distribution --X2 = 25.96, df = 5, p-value < 0.0001. The p-value is smaller than alpha, so we reject the null --> the observed distribution does not fit the normal distribution. The correct answer is: The observed distribution does not match the normal distribution.
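The goodness-of-fit statistic can be rebuilt from the six bins. This is a sketch only: expected counts are the total sample size times each expected proportion, df = number of bins - 1 = 5, and the p-value helper chi_square_sf_df5 is a hypothetical name for a closed-form chi-square upper tail that is specific to 5 degrees of freedom.

```python
import math

observed = [10, 19, 54, 70, 35, 12]
expected_props = [0.023, 0.136, 0.341, 0.341, 0.136, 0.023]

n = sum(observed)
stat = sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, expected_props))
df = len(observed) - 1

def chi_square_sf_df5(x):
    """Upper-tail chi-square probability for exactly 5 degrees of freedom."""
    root = math.sqrt(x)
    normal_upper = 0.5 * math.erfc(root / math.sqrt(2))      # Pr(Z > root)
    normal_pdf = math.exp(-x / 2) / math.sqrt(2 * math.pi)   # density at root
    return 2 * normal_upper + 2 * normal_pdf * (root + root**3 / 3)

p_value = chi_square_sf_df5(stat)
print(round(stat, 2), df, p_value < 0.0001)
```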
The Central Limit Theorem enables us to use the standard normal distribution as a frame of reference to evaluate probabilities associated with a wide range of summary statistics including, but not limited to, proportions, means, differences in proportions, and differences in means.
a. True b. False - a. True
The Central Limit Theorem states that, if you study a "large" number of independent observations, many resulting summary statistics can be thought of as one outcome from a distribution of possible outcomes that follows an approximately normal distribution.
a. True b. False - a. True
After observing statistically significant ANOVA results, Bonferroni-corrected inferences to determine which groups differ from each other require the following steps (select all that apply):
a. Use the mean square error from the ANOVA when calculating the standard error to be used in the pairwise two-sample t-tests b. Use the degrees of freedom associated with the mean square error from the ANOVA when calculating the p-values for each of the pairwise two-sample t-tests c. Compute the type I error rate allowable for each pairwise comparison in order to ensure that the type I error rate across all comparisons does not exceed some pre-specified level d. Determine that the difference between a specific pair of groups is statistically significant if the t-test p-value is less than the Bonferroni-corrected type I error rate - a. Use the mean square error from the ANOVA when calculating the standard error to be used in the pairwise two-sample t-tests b. Use the degrees of freedom associated with the mean square error from the ANOVA when calculating the p-values for each of the pairwise two-sample t-tests c. Compute the type I error rate allowable for each pairwise comparison in ord
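As a rough sketch of the steps listed in the options above: the per-comparison error rate is the overall rate divided by the number of pairwise comparisons, and each pairwise t statistic uses the ANOVA mean square error in its standard error. All numbers below (group means, group sizes, MSE) are hypothetical illustration values, not data from the source, and the function names are made up for this example.

```python
import math

def bonferroni_alpha(overall_alpha, n_groups):
    """Per-comparison type I error rate across all pairwise comparisons."""
    n_comparisons = n_groups * (n_groups - 1) // 2
    return overall_alpha / n_comparisons

def pairwise_t(mean1, mean2, n1, n2, mse):
    """t statistic whose standard error is built from the ANOVA MSE."""
    se = math.sqrt(mse * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se

# Hypothetical values: 3 groups of 30, MSE = 3.0, two group means.
alpha_star = bonferroni_alpha(0.05, n_groups=3)   # 0.05 / 3 comparisons
t_stat = pairwise_t(mean1=7.5, mean2=6.5, n1=30, n2=30, mse=3.0)
df_error = 3 * 30 - 3   # df of the MSE: total n minus number of groups

print(f"alpha* = {alpha_star:.4f}, t = {t_stat:.3f}, df = {df_error}")
```

The last step would compare the two-sided p-value for `t_stat` on `df_error` degrees of freedom (from a t table or a statistics package) with `alpha_star`.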
A new study has concluded that athletes eating a gluten-free diet had a statistically significantly higher VO2 max (an indicator of cardiovascular fitness) than athletes who do not eat a gluten-free diet (45 mL/kg/min vs. 40 mL/kg/min; p=0.02). Suppose the truth is that there is no relationship between gluten-free diets and cardiovascular fitness. What type of error occurred?
a. no error occurred b. type 3 error c. type 1 error d. type 2 error - c. type 1 error -- Incorrectly rejecting a true null hypothesis is a type 1 error. The correct answer is: type 1 error
A study examined 30 randomly selected participants in each of three age groups (15-20 years, 21-35 years, and 36-55 years) and asked for their average number of hours of sleep per night. Does the mean hours of sleep per night differ among the three age groups? Given the partially completed ANOVA table, determine the values that belong in the BOLD boxes in order to find the p-value. What is the p-value assessing the null hypothesis that the mean hours of sleep is the same in the three age groups?
a. p=0.03 b. p<0.0001 c. p=0.35 d. p=0.01 - a. p=0.03 -- df group = 3 categories - 1 = 2; MSE = SS error / df error = 271.10/87 = 3.12; F = MSG/MSE = 11.94/3.12 = 3.83. The p-value for an F-statistic of 3.83 with dfg = 2 and dfe = 87 is 0.03. This is smaller than the alpha level of 0.05, so we reject the null hypothesis (that the means are the same) and conclude that at least one of the group means is different. The correct answer is: p=0.03
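The arithmetic in the explanation above can be checked with a small sketch. The sums of squares and mean squares are the values quoted in the answer. The p-value line relies on a closed form for the F distribution's upper tail that holds only when the numerator degrees of freedom equal 2. That shortcut is an assumption of this sketch, not a method stated in the source.

```python
# Values quoted in the answer explanation.
ss_error = 271.10
df_error = 87          # 90 participants minus 3 groups
df_group = 3 - 1       # 3 age groups minus 1
msg = 11.94            # mean square for groups

mse = ss_error / df_error   # mean square error
f_stat = msg / mse          # F statistic

# Closed form valid only for numerator df = 2:
# P(F > f) = (1 + 2 * f / df_error) ** (-df_error / 2)
assert df_group == 2
p_value = (1 + 2 * f_stat / df_error) ** (-df_error / 2)

print(f"MSE = {mse:.2f}, F = {f_stat:.2f}, p = {p_value:.2f}")
```

For other numerator degrees of freedom, an F table or a statistics package would be needed instead of the closed form.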
A researcher observed that cholesterol levels tended to be progressively higher among subjects consuming progressively more red meat. How are these two variables related?
a. red meat consumption causes increases in cholesterol b. positive correlation c. negative correlation d. zero correlation - b. positive correlation
The Nurses Health Study began following 121,700 female nurses in 1976 with questionnaires every two years asking about lifestyle factors and whether cancer or heart disease had been diagnosed since the prior questionnaire. Is this study retrospective or prospective?
a. retrospective b. indeterminate c. prospective - c. prospective
If a researcher is interested in knowing whether the proportion of participants in a study sample who currently smoke is lower than the current smoking prevalence reported in a national survey, what test is appropriate?
a. z-test for 2 proportions, one-sided hypothesis test b. z-test for 1 proportion, two-sided hypothesis test c. z-test for 1 proportion, one-sided hypothesis test d. z-test for 2 proportions, two-sided hypothesis test - c. z-test for 1 proportion, one-sided hypothesis test -- This question asks about a single proportion, so it calls for a 1-proportion test. It is not comparing proportions between two groups. The question also specifies "lower", indicating a direction, so this is a one-sided hypothesis test. The correct answer is: z-test for 1 proportion, one-sided hypothesis test
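A minimal sketch of the one-sided, one-proportion z-test described above follows. The counts and the national prevalence are hypothetical values chosen only for illustration, and the function name is made up. The standard error uses the null proportion, and the p-value is the lower-tail normal probability because the alternative is "lower".

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_lower(successes, n, p0):
    """z statistic and lower-tail p-value for H0: p = p0 vs. Ha: p < p0."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)      # standard error under the null
    z = (p_hat - p0) / se
    p_value = NormalDist().cdf(z)     # P(Z <= z), the lower tail
    return z, p_value

# Hypothetical example: 30 current smokers out of 200 participants,
# compared with an assumed national prevalence of 20%.
z, p = one_prop_z_lower(successes=30, n=200, p0=0.20)
print(f"z = {z:.3f}, one-sided p-value = {p:.4f}")
```

A two-sided version would instead double the smaller tail probability.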