Public Health


1. The formula for a paired t-test is t = d̄/(s/√N), where d̄ is the mean of the paired differences and s is their standard deviation. To determine the needed sample size, this formula is rearranged algebraically to solve for N. In the process, z must be substituted for t because: A. t is dependent on degrees of freedom and z is not B. z is dependent on degrees of freedom and t is not C. t provides too large a sample D. t provides too small a sample E. z takes beta error into account

1. A. Both z and t are distributions used to provide unit-free measures of dispersion about a mean value. To know the value of t, degrees of freedom (df) must be known. Unfortunately df depends on N, which is what we are trying to solve for, creating a circular problem. A solution to the problem is to substitute z, which is independent of df, not dependent (B). The t distribution is approximated by the z distribution when the sample size is large, but having a large (C) or small (D) sample size does not necessitate substitution with z. Substitution with z confers a slight risk of underestimating the sample size required. Both z and t represent critical ratios to determine the probability of alpha error (false-positive error) if the null hypothesis is rejected. Neither takes beta error (false-negative error) into account (E) when we assume that there is in fact no treatment difference.
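
The circular problem described above can be sketched numerically. In this hedged example (s = 6 and d = 5 are assumed values; alpha = 0.05, two tailed), an N is first obtained with z, and then the t critical value for the resulting degrees of freedom shows how the z shortcut can slightly underestimate the required sample size:

```python
import math

# Sketch of why z replaces t in sample size estimation. Assumed values:
# s = 6 is the expected SD of paired differences, d = 5 the difference
# sought, alpha = 0.05 two tailed.
s, d = 6.0, 5.0
z = 1.96                       # z critical value; independent of df
n_z = math.ceil(z**2 * s**2 / d**2)

# To use t instead, we would need df = N - 1 -- but N is the unknown.
# Plugging in the z-based N just obtained (df = n_z - 1 = 5, for which a
# standard two-tailed t-table gives t = 2.571) shows the z shortcut
# underestimates N somewhat:
t = 2.571
n_t = math.ceil(t**2 * s**2 / d**2)
```

In practice the z-based estimate is often accepted as-is, or a small correction is added, rather than iterating on t.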

1. A joint distribution graph can be used to display: A. Causality B. Correlation C. Kurtosis D. Power E. Specificity

1. B. A joint distribution graph is a plot of the relationship between two continuous variables. The more closely the data points cluster about a line, the greater the linear correlation between the two variables. If two variables are related in a nonlinear manner, the correlation may not be displayed as a line but as a curvilinear distribution of data points (e.g., S-shaped, U-shaped, or J-shaped curve). Methods for calculating the correlation coefficient in such a setting are available, but their description is beyond the scope of this text. Correlation alone does not establish causality (A). For example, someone who owns a car is also likely to own a television set, but ownership of one item does not cause ownership of the other item. Joint distributions do not display kurtosis (C), a measure of how peaked or spread out a probability distribution is; power (D), probability of detecting an association when one exists (also known as sensitivity); or specificity (E), probability of not detecting an association when there actually is no true association.

1. In your work at a local hospital, the first three medical interns you meet report feeling much better since they started taking a commonly prescribed antidepressant. You reluctantly draw the conclusion that internship is associated with depression. Drawing this conclusion is an example of: A. Hypothesis testing B. Inductive reasoning C. Deductive reasoning D. Interpolation E. Extrapolation

1. B. A logical progression from the specific to the general, inductive reasoning is a process in which one draws conclusions about general associations based on specific data. In this example, based on the specific experience of three medical interns (obviously a small and limited sample), you draw a general association between internship and depression. In contrast, deductive reasoning (C) involves making specific predictions based on general rules or hypotheses. For example, if it was established as a general rule that medical interns tend to be miserable and depressed, you might predict there would be a good chance that any specific intern you met would be taking (or needing) an antidepressant. Hypothesis testing (A) involves running an experiment to test predictions. For instance, holding the general rule that medical interns tend to be miserable and depressed, we might predict that antidepressants would benefit such interns, and we could randomize a sample of interns to antidepressant or placebo to test this hypothesis. Interpolation (D) is the process of predicting new data points within a given data range. For example, in a plot of depression severity versus daily work hours, if we know how depressed interns are who work 10 hours per day, and how depressed interns are who work 14 hours per day, we can interpolate how depressed an intern who works 12 hours per day might be. Extrapolation (E) is the prediction of new data points outside of the known range. In the example, we would have to extrapolate how depressed an intern who works 16 hours per day would be since this data point would be outside the known range of depression severity when working 10-14 hours per day.
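
Interpolation and extrapolation as described above amount to simple linear prediction. The sketch below uses hypothetical depression scores (20 at 10 hours/day, 32 at 14 hours/day); all numbers are invented for illustration:

```python
# Linear interpolation/extrapolation for the intern example
# (all data are hypothetical).
def predict(x, x0, y0, x1, y1):
    """Linearly predict y at x from known points (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# Known (hypothetical) data: 10 h/day -> score 20; 14 h/day -> score 32.
score_12h = predict(12, 10, 20, 14, 32)   # interpolation: inside the range
score_16h = predict(16, 10, 20, 14, 32)   # extrapolation: outside the range
```

The same formula serves both purposes; the difference is that the extrapolated value at 16 hours assumes the linear trend continues beyond the observed range, which is less reliable.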

1. A multivariable analysis is needed when: A. Multiple groups of participants are being compared B. Multiple outcome variables are under investigation C. Multiple repetitions of an experiment are planned D. Multiple confounders need to be controlled simultaneously E. Multiple investigators will be repeating analyses for comparability

1. D. Multivariable analyses allow investigators to examine associations among multiple independent variables and an outcome variable simultaneously. Often, only one of the independent variables will be of interest to investigators, and others will be potential confounders. However, independent variables may also exert a truly meaningful effect on the outcome or function as effect modifiers (see Chapter 4). Multivariable analyses allow investigators to estimate the association between the independent variable of interest and the outcome, controlling for or adjusting for the potential influence of possible confounding variables in the regression model. Multiple groups of participants (A), multiple investigators (E), or multiple repetitions of the experiment planned (C) have no bearing on whether investigators choose multivariable analytic techniques. If multiple outcomes are under investigation (B), what is needed is multivariate analysis.
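
A multivariable analysis of the kind described can be sketched with ordinary least squares. The data below are fabricated so that the outcome follows y = 2 + 3·x1 + 0.5·x2 exactly; fitting both independent variables at once recovers the coefficient for x1 "adjusted for" x2:

```python
# Minimal multiple linear regression sketch: one outcome y, one exposure of
# interest (x1), one potential confounder (x2). Solves the least-squares
# normal equations (X'X)b = X'y by Gaussian elimination. Data are made up.

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Fabricated data generated from y = 2 + 3*x1 + 0.5*x2 (no noise), so the
# fit should recover those coefficients exactly (up to float error).
x1 = [0, 1, 2, 3, 4, 5]
x2 = [1, 0, 2, 1, 3, 2]
y = [2 + 3 * u + 0.5 * v for u, v in zip(x1, x2)]

X = [[1.0, u, v] for u, v in zip(x1, x2)]     # design matrix with intercept
XtX = [[sum(X[i][r] * X[i][c] for i in range(len(X))) for c in range(3)]
       for r in range(3)]
Xty = [sum(X[i][r] * y[i] for i in range(len(X))) for r in range(3)]
a, b1, b2 = solve(XtX, Xty)    # b1 is the x1 effect, adjusted for x2
```

Here b1 estimates the association of x1 with y while holding x2 constant, which is exactly the "controlling for confounders" function of multivariable analysis.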

1. In Bayes theorem, the numerator represents: A. False-negative results B. Incidence C. Prevalence D. Sensitivity E. True-positive results

1. E. Bayes theorem describes the probability that a patient has the disease, conditional on the diagnostic test for the disease being positive. The theorem is essentially an expression of the positive predictive value (i.e., given a positive test result, how likely the disease is present). The numerator of the formula for the theorem represents true-positive test results (E); the denominator, as in the formula for the positive predictive value, represents all positive test results (i.e., true-positive and false-positive results). Bayes theorem adjusts prior probability: the probability of disease in a patient before running a test, also known as pretest probability. The pretest probability is analogous to the prevalence (C) of the disease in a population (the probability of disease for individuals in that population). Bayes theorem does not comment on sensitivity (D), the probability of a test being positive conditional on the patient having the disease. However, the Bayes theorem numerator of true-positive test results = sensitivity × prevalence. Bayes theorem is not associated with incidence (B; mathematically the prevalence of disease over duration of illness) or false-negative results (A), which relate to sensitivity.
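
Bayes theorem as described, with the true-positive term (sensitivity × prevalence) in the numerator and all positives in the denominator, can be written as a short function; the sensitivity, specificity, and prevalence values here are illustrative only:

```python
# Bayes theorem expressed as positive predictive value: the numerator is
# the true-positive fraction, the denominator all positive results.
def ppv(sensitivity, specificity, prevalence):
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Illustrative values: 90% sensitive, 95% specific test, 10% prevalence.
p = ppv(sensitivity=0.90, specificity=0.95, prevalence=0.10)   # ~0.67
```

Even a fairly accurate test yields a positive predictive value well below its sensitivity when the disease is uncommon, because false positives from the large disease-free group swell the denominator.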

10. In the trial described in question 7, the difference between pretrial and posttrial cholesterol values ultimately is not statistically significant. Before concluding that oat bran does not reduce cholesterol levels, you would want to consider: A. Type II error B. The alpha level C. The critical ratio D. The p value E. Type I error

10. A. A negative result in a trial may indicate that the null hypothesis is true (that there is in fact no difference) or it may be caused by type II error (false-negative error). Beta error generally receives less attention in medicine than alpha error. Consequently, the likelihood of a false-negative outcome is often unknown. A negative result can occur if the sample size is too small, if an inadequate dosage is administered (e.g., of oat bran in this case), or if the difference one is trying to detect is too large (e.g., one greater than needed for clinical significance). (See Chapter 12 for further discussion of type II error and the related concept of power.) Changing the alpha level (B), specifically increasing it, would make it more likely that an observed difference achieves statistical significance. For example, if the difference between pretrial and posttrial values had p = 0.25, this would not be significant at alpha = 0.05. However, it would be significant had one reset alpha, for example, to 0.3. The problem with such a strategy is that one always sets alpha before doing a trial. To reset alpha later would be difficult to justify and in most cases unethical. Moreover, setting alpha at 0.3 means one would be willing to accept a 30% chance of finding a difference when a difference does not actually exist (i.e., 30% risk of type I error) (E). Beta is also set before the trial and cannot be changed after the trial. However, if one truly believes oat bran lowers cholesterol despite a result showing no difference, one should consider beta and use this consideration to redesign a trial with a new, lower beta (i.e., lower chance of false-negative error, higher sensitivity, and higher power). For example, one might run the trial again enrolling a larger number of volunteers or giving a larger dose of oat bran to see if this improves the situation. The critical ratio (C) is set by the statistical test chosen and is unmodifiable.
In this case the critical ratio is the difference in pretrial and posttrial means over the standard errors of the pretrial and posttrial means. The p value (D) derives from running the statistical test and informs one how likely it is that a difference as large as the one observed could have been found by chance alone.
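
Type II error can also be illustrated by simulation. In this sketch, every parameter is assumed for illustration (true mean difference 0.3, SD 1, 20 paired observations), and a one-sample z test with known SD stands in for the paired t-test to keep the example self-contained. A real effect exists, yet most simulated trials fail to reach p < 0.05:

```python
import random
from statistics import NormalDist, mean

# Simulation sketch of type II (beta) error: a true difference exists, but
# an underpowered trial usually fails to detect it. All parameters assumed.
rng = random.Random(42)
Z = NormalDist()

def one_trial(n=20, true_diff=0.3, sd=1.0, alpha=0.05):
    """Run one simulated paired trial; return True if p < alpha."""
    diffs = [rng.gauss(true_diff, sd) for _ in range(n)]
    z = mean(diffs) * n**0.5 / sd            # z statistic (known SD)
    p = 2 * (1 - Z.cdf(abs(z)))              # two-tailed p value
    return p < alpha

trials = 2000
power = sum(one_trial() for _ in range(trials)) / trials   # roughly 0.27
beta = 1 - power            # estimated false-negative (type II error) rate
```

With these assumed values, roughly three of every four trials are false negatives, which is exactly why one must consider type II error before concluding that no effect exists.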

10. Considering the previous findings in questions 7 through 9, based on the results of cardiac catheterization, the probability of coronary artery disease is: A. 11% B. 33% C. 76% D. 96% E. Unknown

10. A. Up to this point, we have been discussing positive predictive value (PPV; i.e., how likely a disease will be present if a test result is positive). Presented with a negative or normal test result as in this question, however, what we are concerned about is negative predictive value (NPV; i.e., how likely a disease is absent if the test is negative). From the NPV we can calculate the likelihood of coronary artery disease (CAD) given a negative test result (1 - NPV). For NPV we can use the formula d/(c + d). To use this formula, we must first set up a 2 × 2 table (Bayes theorem is not helpful in this situation). Assume a sample size of 100. The prior probability, 76%, becomes the prevalence. Cell a is sensitivity × prevalence, or (0.96)(76) = 73. Cells a plus c sum to the prevalence, so cell c is 3. Cell d is specificity × (1 - prevalence), or (0.99)(24) = 23.8. Cells b plus d sum to (1 - prevalence), so cell b is 0.2. The NPV is thus d/(c + d) = 23.8/(3 + 23.8) = 23.8/26.8 = 0.89. This is the probability that the patient does not have CAD. What we are interested in, however, is the probability that the patient actually has CAD, which is 1 - NPV, or 1 - 0.89 (100% - 89%), or 11%. This is the posterior probability (probability after catheterization) given a prior probability of 76% (probability after stress test, but before catheterization). In other words, at the end of the workup (which included a stress test followed by a catheterization), the final likelihood (posterior probability) of ischemic heart disease is relatively small, about a 1 in 10 chance.
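
The 2 × 2 arithmetic above can be checked directly, using the values given in the answer (prevalence 76%, sensitivity 96%, specificity 99%, sample of 100):

```python
# Reproducing the 2 x 2 table arithmetic from the answer (unrounded).
n, prevalence, sens, spec = 100, 0.76, 0.96, 0.99

a = sens * prevalence * n             # true positives  = 72.96 (~73)
c = prevalence * n - a                # false negatives = 3.04  (~3)
d = spec * (1 - prevalence) * n       # true negatives  = 23.76 (~23.8)
b = (1 - prevalence) * n - d          # false positives = 0.24  (~0.2)

npv = d / (c + d)                     # ~0.89
p_cad_given_negative = 1 - npv        # ~0.11, the posterior probability
```

Carrying the unrounded cell values through gives essentially the same answer as the rounded hand calculation: about an 11% chance of CAD after the negative catheterization.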

2. A direct correlation is noted between the number of times per year one puts on boxing gloves and the frequency of being punched in the head. The correlation coefficient (r) in this case: A. Cannot be determined because the data are dichotomous B. May be close to 1 C. May be greater than 1 D. Must be less than 0.05 E. Must be statistically significant

2. B. The Pearson product-moment correlation coefficient, also referred to as the r value, is a measure of the linear relationship between two continuous variables. Its value range is the same as that for the slope of a line, from -1 (perfect inverse, or negative, correlation), through 0 (no correlation), to +1 (perfect direct, or positive, correlation). A correlation such as the one in the question could produce a correlation coefficient close to 1. However, the correlation coefficient can never be greater than 1 (C), nor can any correlation coefficient ever be less than -1. The correlation coefficient could theoretically be less than 0.05 (D), but such a value is near zero, implying an essentially nonexistent correlation. The correlation does not have to be statistically significant (E). If the sample size is small, the association (correlation) cannot be measured with much precision, and even a large correlation could have a wide confidence interval and could be nonsignificant. Although the question does not reveal the actual data, the description of the variables gives enough information to show the data are not dichotomous (A), that is, variables are not constrained to only two possible values. Correlation requires that both the variables being compared be continuous (i.e., data can assume any numerical value) or at least pseudocontinuous/interval (i.e., data can assume any integer value). In this case the variables are the number of times putting on gloves expressed as a natural number (i.e., 0, 1, 2, 3) and the frequency of being hit as a fraction or ratio (e.g., 2 of 3 times).

2. It is important to consider beta error when: A. The difference under consideration is not clinically meaningful B. The difference under investigation is statistically significant C. The null hypothesis is not rejected D. The null hypothesis is rejected E. The sample size is excessively large

2. C. Beta error (also called type II error and false-negative error) is the failure to detect a true difference when one exists. Only a negative result (i.e., failure to reject the null hypothesis) is at risk for being a false-negative result. If the null hypothesis is rejected (D) or if the difference under investigation is statistically significant (B), a difference is being detected. Although alpha error (also called type I or false-positive error) may be a possibility in either case, beta error is not a concern. Beta error is the result of inadequate power to detect the difference under investigation and occurs when the sample size is small, not excessively large (E). Either enlarging the sample or increasing the effect size sought would increase statistical power and help reduce the risk of beta error. However, a difference that is not clinically meaningful (A) is not worth detecting in the first place. Beta error is a concern when the difference under consideration is clinically meaningful, but not found.

2. A multivariable model is best exemplified by which of the following statements? A. Height and weight vary together. B. Height and weight vary with age. C. Height varies with weight and age. D. Height varies with gender. E. Height varies with gender, and weight varies with age.

2. C. Multivariable models may be expressed conceptually and mathematically. A conceptual understanding of multivariable analysis is facilitated by a verbal description of the relationships under study. To conform to the requirements for multivariable analysis, the model must postulate the influence of more than one independent variable on one, and only one, dependent (outcome) variable. In general, such a model, expressed in words, would take the following form: dependent variable y varies with independent variables x1, x2, and so on. The only choice provided that fits this pattern is C; the other answer choices are all bivariable associations.

2. As clinicians, we are usually interested in using Bayes theorem to determine: A. Cost effectiveness B. Disease prevalence C. False-negative results D. Positive predictive value E. Test sensitivity

2. D. It is necessary to know how likely disease is when the results of testing are positive in a screening program. Bayes theorem is used to establish the positive predictive value of a screening test, based on an estimate of the pretest probability (i.e., underlying population prevalence of disease under investigation). Disease prevalence (B) must be known or estimated to use Bayes theorem; it is not used to determine prevalence. The test sensitivity (E) is an intrinsic property of the test when applied to a population with given characteristics; it is unaffected by prevalence and is not determined by use of Bayes theorem. The cost effectiveness (A) of a screening program is an important consideration but is distinct from the predictive value of the test. Bayes theorem is based on the proportion of true-positive results to all positive results (true-positive + false-positive results); false-negative results (C) are irrelevant to use of the theorem.

3. When applying Bayes theorem to the care of an individual patient, the prior probability is analogous to: A. Prevalence B. Sensitivity C. Specificity D. The likelihood ratio E. The odds ratio

3. A. Conceptually, Bayes theorem states that the probability of a disease in an individual with a positive test result depends partly on the prevalence of that disease in the population of which the individual is a part. For a child with fever in New Haven, Connecticut, the probability of malaria is low. The probability (prior probability or pretest probability) is low in this child because the prevalence of malaria is so low in the population. For a child with fever in a refugee population in Nigeria, the pretest probability of malaria might be high because malaria is prevalent in the population. The prior probability of a disease in an individual is an estimate based on the prevalence of that disease in a population of similar persons. In another example to emphasize this point, the prior probability of pregnancy in a male patient presenting with abdominal distention and complaining of nausea in the morning is zero. It is not zero because the status of the individual is known with certainty; rather, it is zero because the prevalence of pregnancy in a population of males is zero. When the prior probability of a condition is either 0 or 100%, no further testing for that condition is indicated. Sensitivity (B) and specificity (C) are important performance characteristics of clinical tests, but do not relate at all to prior probability; these measures are completely independent of prevalence. Likewise, likelihood ratios (D) and odds ratios (E) are independent of prevalence.
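
The dependence of the posterior probability on prevalence can be shown by applying the same test to two populations with very different priors. The sensitivity (95%) and specificity (90%) here are illustrative values, not properties of any actual malaria test:

```python
# Same test, different populations: the posterior probability after a
# positive result tracks the prior (prevalence). Test characteristics
# are illustrative only.
def ppv(sens, spec, prev):
    tp = sens * prev
    fp = (1 - spec) * (1 - prev)
    return tp / (tp + fp)

low_prev = ppv(0.95, 0.90, 0.001)   # e.g., febrile child in New Haven
high_prev = ppv(0.95, 0.90, 0.30)   # e.g., febrile child where malaria is endemic
```

With a prior of 0.1%, a positive result still leaves the probability of disease under 1%; with a prior of 30%, the same positive result pushes it above 80%. The test is identical; only the prevalence differs.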

3. The basic goal of hypothesis testing in clinical medicine is: A. To confirm the alternative hypothesis B. To determine if there is a meaningful difference in outcome between groups C. To establish a p value D. To establish an alpha value E. To establish a beta value

3. B. The fundamental goals of hypothesis testing in clinical medicine are to determine whether differences between groups exist, are large enough to be statistically significant (i.e., not likely the result of random variation), and substantial enough to be clinically meaningful (i.e., matter in terms of clinical practice). The p value (C) allows you to determine if a result is statistically significant. However, this value does not tell you whether observed differences between groups are clinically important (vs. being too small to affect patient care). Only consideration of trial findings in the context of other clinical knowledge can answer the question, "Is the difference large enough to mean something in practice?" Also, a p value is only interpretable in the context of a predetermined alpha value. Both the alpha value (D) and beta value (E) should be established before hypothesis testing begins and are therefore not themselves goals of conducting hypothesis tests. Alpha and beta are preconditions influencing the stringency of statistical significance and investigators' ability to detect a difference between groups. Both alpha and beta influence the likelihood of rejecting the null hypothesis. Conventionally, trials in clinical medicine are set up to reject the null hypothesis rather than confirm the alternative hypothesis (A).

3. Which one of the following characteristics of a diagnostic test is analogous to the statistical power of a study? A. Positive predictive value B. Negative predictive value C. Sensitivity D. Specificity E. Utility

3. C. The sensitivity of a diagnostic test, or a/(a + c), is the ability of the test to detect a condition when it is present. Statistical power, or (1 - β), is the ability of a study to detect a difference when it exists. The two terms are mathematically equivalent since β, or beta error, equals 1 - sensitivity. Thus power = 1 - (1 - sensitivity) = sensitivity = a/(a + c). Power is essentially unrelated to specificity (D), or positive (A) or negative (B) predictive values. Utility (E) is a term used colloquially to mean "usefulness," and it has other meanings in economics/econometrics, but it is not a term that has statistical meaning.

3. In linear regression, the slope represents: A. The value of x when y is zero B. The value of y when x is zero C. The error in the line describing the relationship between x and y D. The change in y when x changes by 1 unit E. The mathematical value of (y/x) minus the y-intercept

3. D. Simple linear regression is a method of describing the association between two variables using a straight line. The slope of the line describes the unit change in y (the dependent variable) for every unit change in x (the independent variable). If the mathematical formula for a line is y = α + βx (where β is the slope and α is the y-intercept), then the mathematical formula for the slope is (y - α)/x, not (y/x) - α (E). The y-intercept and x-intercept can also be determined from the equation. The y-intercept is the value of y when x is zero (B); the value in this case is y = β(0) + α = α. (Note: The α here has an entirely different meaning than that used to represent false-positive error.) The x-intercept is the value of x when y is zero (A); the value in this case is x = (0 - α)/β = -α/β. Most often, the line used to represent an association between two variables is just an approximation; the actual relationship will almost never be precisely linear. To account for the imprecision, our formula should include an error term (e.g., y = α + βx + ε), which, no matter how it is symbolized (ε in this case), is separate and entirely different from the slope (C). (Note: Slope symbolized as β here has an entirely different meaning than that used to represent false-negative error.)

3. The basic equation for a multivariable model generally includes a dependent (outcome) variable (y); a starting point or regression constant (a); weights or coefficients (the b terms); and a residual or error term (e). The least-squares solution to a multivariable equation is determined when: A. The model is statistically significant B. The residual is maximized C. The value of a is minimized D. The value of e² is minimized E. The regression constant is 0

3. D. The basic equation for a multivariable model is as follows: y = a + b1x1 + b2x2 + … + e. The outcome variable is y, the regression constant or starting point is a, the b terms represent weights, and e is the residual or error term. The goal of the least-squares approach to multivariable analysis is to find the model that produces the smallest sum of squares of the error term, e. The error term is also called the residual, and the least-squares solution minimizes, rather than maximizes (B), this value. The values of a and the b terms that lead to this result, which are not minimized (C), maximized, or set at zero (E), produce the best fit or model. The least-squares model may or may not be statistically significant (A), depending on the strength of association between the independent and dependent variables under investigation.

4. A distinction between one-tailed and two-tailed tests of significance is that a one-tailed test: A. Does not affect statistical significance but does affect power B. Does not affect the performance of the statistical test but does affect the conversion to a p value C. Is based on the number of independent variables D. Requires that the sample size be doubled E. Should be performed during data analysis

4. B. The choice of a one-tailed or two-tailed test of significance should be made before a study is conducted, not during data analysis (E). The choice of a one-tailed test versus a two-tailed test is based on the hypothesis to be tested, not the number of independent variables (C). If the outcome can differ from the null hypothesis in only one direction (e.g., if you are comparing a placebo with an antihypertensive drug that you are thoroughly convinced would not cause the blood pressure to increase), a one-tailed test of significance is appropriate. If the outcome may differ from the null hypothesis in either direction (e.g., if you are comparing a placebo with a type of drug that may cause the blood pressure to increase or decrease), a two-tailed test of significance is warranted. The stipulation of a one-tailed or a two-tailed test affects the associated p value. When a one-tailed test is chosen, statistical significance (i.e., a p value less than alpha) is more readily achieved, because the extreme 5% of the distribution that differs sufficiently from the null hypothesis to warrant rejection of the null hypothesis (when alpha is set at 0.05) is all to one side. When a two-tailed test of significance is chosen, the rejection region is divided into two areas, with half (or 0.025 of the distribution when alpha is set at 0.05) at either extreme of the curve. Thus choosing a one-tailed test affects both significance level and power (A). A one-tailed test inherently has greater power to detect a statistically significant difference in an expected direction and thus would not require a doubling of sample size (D); if anything, a smaller sample size would be required for a one-tailed test. The implications of choosing a one-tailed test or a two-tailed test are discussed in the chapter.
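
The p value relationship described above is easy to verify for a z-based test: when the observed effect lies in the hypothesized direction, the one-tailed p is half the two-tailed p, so a borderline result can be significant one tailed but not two tailed. The statistic z = 1.80 is an assumed value for illustration:

```python
from statistics import NormalDist

# One-tailed vs two-tailed p value for an assumed test statistic z = 1.80.
z = 1.80
p_one = 1 - NormalDist().cdf(z)   # one-tailed: upper-tail area only (~0.036)
p_two = 2 * p_one                 # two-tailed: both tails (~0.072)
```

At alpha = 0.05 the one-tailed p is significant while the two-tailed p is not, illustrating why the choice of tails must be fixed before the analysis rather than after.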

4. Alpha, in statistical terms, represents: A. False-negative error B. Sample size C. Selection bias D. Type I error E. 1 - beta

4. D. The value of alpha represents the probability of committing type I error (i.e., false-positive error; or probability of believing an association exists when in fact it does not). By convention, alpha is set at 0.05, indicating that a statistically significant result is one with no more than a 5% chance of false-positive error occurring (i.e., error caused by chance or random variation). The smaller the value of alpha, the less likely one is to make a type I error. False-negative error (A)—type II error or the probability of believing an association does not exist when one in fact does—is represented by beta. The quantity 1 - beta (E) is power (a.k.a. sensitivity), the likelihood of finding an association when one actually exists. Alpha influences sample size (B) but does not represent sample size directly. Alpha is entirely unrelated to selection bias (C). Note: In addition to representing type I or false-positive error, the Greek letter alpha (α) is also often used in statistics to represent the intercept in regression equations, where it has an entirely different meaning.

4. A study is designed to test the effects of sleep deprivation on academic performance among medical students. Each subject serves as his or her own control. A 10-point difference (10% difference) in test scores is considered meaningful. The standard deviation of test scores in a similar study was 8. Alpha is set at 0.05 (two-tailed test), and beta is set at 0.2. The appropriate sample size is: A. [(0.05)² × 8]/(10)² B. [(1.96)² × (8)²]/(10)² C. [(0.05 + 0.2)² × (8)²]/(10)² D. [(1.96 + 0.84)² × (8)²]/(10)² E. [(1.96 + 0.84)² × 2 × (8)²]/(10)²

4. D. This is a study for which a paired t-test is appropriate. The corresponding sample size formula, as detailed in Box 12.1, is N = [(z1−α)² × s²]/d². In this example, however, the value for beta also is specified, so that the formula becomes N = [(z1−α + z1−β)² × s²]/d². The value of z1−α when alpha is 0.05 and the test is two tailed is 1.96. The value of z1−β when beta is 20%, or 0.2, is 0.84; z1−β is one tailed by convention. The standard deviation (s) is derived from a prior study and is 8. The difference sought (d) is 10 points. Thus [(1.96 + 0.84)² × (8)²]/(10)² is the correct sample size formula. The corresponding value calculated from this formula is 6 (5.02 rounded up to the nearest whole person). The sample size is so small in this case because of the large difference sought (10% change in test score is substantial), the small standard deviation, and the use of each subject as his or her own control. If each subject had not served as his or her own control and there were two distinct study groups, a larger number of participants would be needed. In such a case, formula E would be correct, representing the sample size calculation for a Student's t-test. Formula B would be the correct sample size calculation for a paired t-test based on alpha error only. Such a calculation would necessarily result in a sample size smaller than one considering both alpha error and beta error. A and C are nonsense distracters, substituting alpha and beta values for the expected z values at these thresholds.
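
The arithmetic in this answer can be reproduced directly, using the values stated in the question (alpha = 0.05 two tailed, beta = 0.2 one tailed, s = 8, d = 10):

```python
import math

# Sample size arithmetic from the answer: z for alpha = 0.05 two tailed
# is 1.96; z for beta = 0.2 one tailed is 0.84.
z_alpha, z_beta, s, d = 1.96, 0.84, 8.0, 10.0

n_raw = (z_alpha + z_beta) ** 2 * s ** 2 / d ** 2   # 5.02 (formula D)
n = math.ceil(n_raw)                                # round up: 6 subjects

# With two independent groups (Student's t-test), the variance term
# doubles (formula E), giving the sample size needed per group:
n_two_group = math.ceil((z_alpha + z_beta) ** 2 * 2 * s ** 2 / d ** 2)
```

The paired design needs only 6 subjects, while the two-group design would need 11 per group, illustrating the efficiency of using each subject as his or her own control.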

4. The application of Bayes theorem to patient care generally results in: A. Greater sensitivity B. Greater specificity C. Higher costs D. Improved selection of diagnostic studies E. More diagnostic testing

4. D. When Bayes theorem is used appropriately in planning patient care, the result is more careful selection and refined use of diagnostic testing. Even when the theorem is used qualitatively, it suggests that if the prior probability of disease is high enough, no further testing is required, because neither a positive nor a negative test result will substantively modify pretest probability of disease or in any way change clinical decision making. Performing a test in such a scenario is pointless and can only result in unnecessary costs (e.g., time, resources) and possible harm from invasive tests (e.g., biopsy, catheterization). At other times, when successive tests fail to establish a posterior probability that is sufficiently high for decisive action to be taken, the theorem compels the clinician to continue testing. Neither sensitivity (A) nor specificity (B) is enhanced by use of the theorem; both depend in part on the choice of diagnostic test. Using Bayes theorem in planning patient care should result in better use of diagnostic tests—sometimes more frequent use (E), sometimes less frequent use. The overall cost (C) varies accordingly. For example, an otherwise healthy 32-year-old man with no significant medical problems visits your office complaining of chronic lower back pain. His symptoms have been fairly constant over the last few months. He works in custodial services but denies any significant trauma or injury on the job or elsewhere. He reports no fever, chills, recent infection, or illicit drug use. He has no night pain or pain that wakes him from sleep. He has no difficulty walking and reports no numbness or tingling or incontinence. He is using no medications. His physical examination reveals tense, tender lumbar paraspinal muscles with no neurologic deficits or functional limitations. 
In a patient with these findings, you estimate that the prior probability of nonspecific musculoskeletal lower back pain (e.g., lumbar sprain or strain) is greater than 95%. By reflex—the type of reflex acquired only with medical training—you might order some blood tests and a back x-ray film and consider magnetic resonance imaging (MRI). However, you may realize that you have already made up your mind to start with a prescription for back exercises, stretching, and acetaminophen, and see him again in about 6 weeks (or sooner for worsening). When the prior probability is sufficiently high, further diagnostic testing would not, and should not, alter your course of action and is not indicated. Conversely, it may be inappropriate to take immediate action when you have a considerable amount of uncertainty about what to do next for your patient (e.g., a patient at intermediate vascular risk presenting with chest pain that you are unsure sounds more like angina or esophageal spasm), and further diagnostic studies would help reduce the uncertainty. If you apply the concept of Bayes theorem to your clinical care planning, without even using the actual formula, the standards of your practice are likely to improve.

4. In a survival study of patients with pancreatic cancer, one group was treated with a chemotherapeutic agent and another group with a placebo. Which of the following procedures would be best for comparing the survival experience of the two groups over time while adjusting for other variables, such as age, gender, and cancer stage? A. ANCOVA B. Linear regression analysis C. Logrank test D. McNemar chi-square test E. Proportional hazards (Cox) model

4. E. In a survival study showing the proportion surviving after a fixed number of months, the outcome variable is dichotomous (live/die). Consequently, a standard chi-square can be used to test the hypothesis of independence, or the McNemar chi-square (D) can be used for paired data. A survival study can provide much more information about the participants in each group, however, than simply the proportion of participants who were alive at the end of the study. In a study that may span years, the timing of death is equally important. Intergroup comparison should address the distribution and the number of deaths. The logrank test (C) is designed for making such a comparison, but it cannot adjust for other independent variables. The multivariable method appropriate for survival analysis is the proportional hazards (Cox) model, a method now usually employed in the analysis of clinical trial data. ANCOVA (A) is a multivariable method used with a continuous outcome variable, when the independent variables are a mix of continuous and categoric. Linear regression (B) is either a bivariable method or, in the case of multiple linear regression, a multivariable method; it is also used to analyze a continuous outcome variable, regardless of the type of independent variables.

5. Statistical significance is achieved when: A. Alpha is greater than or equal to p B. Beta is greater than or equal to alpha C. p is greater than or equal to alpha D. p is greater than or equal to beta E. The result is two tailed

5. A. The value of p is the probability that an observed outcome difference is caused by random variation, or chance, alone. Alpha is the maximum risk that one is willing to take that the observed outcome difference is caused by chance. One rejects the null hypothesis ("no difference") whenever p is less than or equal to the preselected value of alpha. By convention, alpha is usually set at 0.05, so a p value of 0.05 or less indicates statistical significance in most cases. Answer choice C reverses the relationship between p and alpha. Statistical significance has little to do with beta (B and D); beta is important only in setting how much false-negative error one is willing to accept. By convention, beta is set at 0.2, meaning one is willing to accept a 20% chance of not finding a difference when a difference actually exists. Beta influences power (i.e., sensitivity) and affects the sample size needed to detect the statistically significant difference set by alpha. Whether a test is one tailed or two tailed (E) affects the p value at which a result is "statistically significant," but the number of tails (i.e., whether the hypothesis is one tailed and directional or two tailed and nondirectional) does not itself equate to statistical significance.

5. A different investigator plans a study designed to test the effects of sleep deprivation on academic performance among medical students. Each subject serves as his or her own control. A 10-point difference (10% difference) in test scores is considered meaningful. The standard deviation of test scores in a similar study was 8. Alpha is set at 0.05 (two-tailed test), and beta is set at 0.5. The required sample size for this study is: A. [(0.05)2 × 8]/(10)2 B. [(1.96)2 × (8)2]/(10)2 C. [(0.05 + 0.2)2 × (8)2]/(10)2 D. [(1.96 + 0.84)2 × (8)2]/(10)2 E. [(1.96 + 0.84)2 × 2 × (8)2]/(10)2

5. B. The sample size calculation here is the same as for question 4, except this time we can remove the beta term because when the power is 50%, the z beta term is equal to zero. Thus in this case the sample size needed is even smaller. In fact the sample size is 3 (rounded up from 2.46). A sample this small may make intuitive sense if you consider how many participants you would need to show that test scores are higher if the test taker is well rested rather than sleep deprived. However, a study with only 50% power is unusual. Incorrect answer choices for this question are in the explanation for question 4.
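The arithmetic above can be checked directly. The following is a minimal sketch of the rearranged sample-size formula for a paired study, using the vignette's values (z of 1.96 for a two-tailed alpha of 0.05; z beta of 0 at 50% power):

```python
import math

def paired_sample_size(z_alpha, z_beta, sd, diff):
    # N = (z_alpha + z_beta)^2 * sd^2 / diff^2 for a paired (before-after)
    # study, rounded up to the next whole participant
    return math.ceil(((z_alpha + z_beta) ** 2 * sd ** 2) / diff ** 2)

# Beta chosen so that power is 50%: the z beta term is zero
print(paired_sample_size(1.96, 0.0, sd=8, diff=10))   # 3 (rounded up from 2.46)

# For comparison, the conventional 80% power (beta = 0.2) gives z beta = 0.84
print(paired_sample_size(1.96, 0.84, sd=8, diff=10))  # 6 (rounded up from 5.02)
```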

5. You are interested in comparing the effects of various agents on the management of pain caused by osteoarthritis. Pain is measured on a continuous pain scale. You design a study in which equal numbers of patients are assigned to three groups defined by treatment (acetaminophen, ibuprofen, or placebo). Other independent variables are gender, age (dichotomous: <50 years or ≥50 years), and severity of arthritis (categoric: mild, moderate, or severe). Which of the following would be the most appropriate statistical method for analyzing your data? A. Logistic regression analysis B. One-way ANOVA C. N-way ANOVA D. ANCOVA E. Wilcoxon matched-pairs signed-ranks test

5. C. As shown in Table 11.1, ANOVA is appropriate when the outcome variable is continuous and multiple independent variables are categoric. The study described meets these criteria. The method is N-way ANOVA, rather than one-way ANOVA (B), because the model includes several different independent variables. ANCOVA (D) would not be appropriate in this case because at least one of the independent variables has to be continuous to employ this method. Logistic regression (A) is for dichotomous, not continuous, outcomes. The Wilcoxon signed-ranks test (E) is a nonparametric test requiring a single dichotomous independent variable and dependent/matched groups; if investigators had been looking at pain on a 10-point scale for a given group of individuals before and after acetaminophen treatment, the Wilcoxon matched-pairs signed-ranks test might have been appropriate.

5. A study is conducted to determine the efficacy of influenza vaccine. Volunteers agree to participate for 2 years. During the first year, participants are randomly assigned to be injected with either an inert substance (a placebo) or the active vaccine. During the second year, each participant who previously received the placebo is given the active vaccine, and each who previously received the active vaccine is given the placebo. Each participant serves as his or her own control. All incident cases of influenza are recorded, and the occurrence of influenza when vaccinated is compared with the occurrence when unvaccinated. The appropriate test of significance for this study is the: A. Kaplan-Meier method B. Kruskal-Wallis test C. Mann-Whitney U-test D. McNemar test E. Pearson correlation coefficient

5. D. In the study described, the outcome for each participant is binary (dichotomous); that is, the disease (influenza) occurs or does not occur. The proportion of participants who acquire the disease in the year they were vaccinated is compared with the proportion of participants who acquire it in the year they were not vaccinated. Chi-square analysis is appropriate for this sort of comparison, and because each participant in the study serves as his or her own control, the McNemar test (chi-square test for paired data) is used. The contingency table for such a test contains paired data, with its four cells representing the following:
• In cell a, participants who acquired influenza both when vaccinated and when not vaccinated
• In cell b, participants who acquired influenza only when not vaccinated
• In cell c, participants who acquired influenza only when vaccinated
• In cell d, participants who remained free of influenza whether vaccinated or not
The Kaplan-Meier method (A) is a type of survival analysis or life table analysis, considering length of survival time (or perhaps influenza-free time, as in this case). Had data been collected such that a vaccinated group and a separate unvaccinated group were followed for their time until contracting influenza, then the Kaplan-Meier method might have been appropriate. The Kruskal-Wallis test (B) is appropriate when the dependent variable is ordinal (or otherwise analyzed by ranks) and the independent variable defines three or more independent groups; if you were comparing ordinal illness-severity ratings among participants vaccinated with a low-, medium-, or high-dose vaccine, the Kruskal-Wallis test might be appropriate. The Mann-Whitney U-test (C) is the two-group analog of the Kruskal-Wallis test; like the Kruskal-Wallis test, it requires independent rather than paired groups, so it would not fit the paired design described here.
The Pearson correlation coefficient (E) is appropriate for comparing two different (continuous or pseudocontinuous) variables, when the relationship between the variables is approximately linear. If you wanted to compare concentration of flu vaccine to some continuous measure of immunologic response, a Pearson correlation coefficient may be appropriate.
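Because the McNemar statistic uses only the discordant cells (b and c), it can be computed in a few lines. The counts below are hypothetical, chosen only to illustrate the calculation:

```python
def mcnemar_chi2(b, c):
    # McNemar chi-square for paired dichotomous data, with Yates
    # continuity correction; only discordant cells b and c contribute
    return (abs(b - c) - 1) ** 2 / (b + c)

# Hypothetical counts: 20 participants acquired influenza only when
# unvaccinated (cell b) and 5 only when vaccinated (cell c); the
# concordant cells a and d drop out of the statistic entirely
chi2 = mcnemar_chi2(20, 5)
print(round(chi2, 2))  # 7.84, above the 3.84 critical value (1 df, alpha = 0.05)
```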

5. A young man complains of reduced auditory acuity on the left side. You take his medical history and perform a physical examination. Before you begin diagnostic testing, you estimate a 74% chance that the patient has a large foreign object in his left ear. This estimate is the: A. Likelihood ratio B. Odds ratio C. Posterior probability D. Relative risk E. Prior probability

5. E. The prior probability is an estimate of the probability of a given condition in a given patient at a given point in time, before further testing. Posterior probability (C) is the revised probability of a given condition in a given patient at a given point in time, after testing. In actual practice, prior and posterior probabilities flow freely into one another. Estimating the probability of sinusitis after interviewing a patient gives you the prior probability of disease before you examine the patient. The revision in your estimate after the examination is the posterior probability, but this revision is also the prior probability before any diagnostic testing, such as sinus x-ray films, that you may be considering. A likelihood ratio (A) describes performance characteristics of a test, such as the diagnostic test in the question that has not yet been performed. Odds ratios (B) and relative risks (D) are ways of comparing probabilities of conditions between two different groups.

6. Concerning question 5, to calculate an F ratio, you must first establish: A. The between-groups mean square and the degrees of freedom B. The between-groups variance and within-groups variance C. The degrees of freedom and the value of p D. The least squares and the residual E. The standard error and the mean for each group

6. B. The F ratio is simply the ratio of between-groups variance to within-groups variance. In ANOVA, variance is often called the mean square, which is different from the least squares or the residual (D) of linear regression. One does not need to know the standard error or the mean for each group (E) to calculate an F ratio. Likewise, degrees of freedom (A and C) are not needed to calculate an F ratio. However, degrees of freedom are needed to determine the p value from an F ratio (e.g., using a table of the F distribution).
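The ratio can be made concrete with a minimal one-way example; the group values below are arbitrary, for illustration only:

```python
def f_ratio(groups):
    # F = between-groups variance (mean square) / within-groups variance
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)
                    for g in groups)
    ms_between = ss_between / (k - 1)    # between-groups mean square
    ms_within = ss_within / (n - k)      # within-groups mean square
    return ms_between / ms_within

print(f_ratio([[1, 2, 3], [4, 5, 6]]))  # 13.5
```

The degrees of freedom (k - 1 and n - k) reappear when the F ratio is converted to a p value from the F distribution.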

6. To apply Bayes theorem to a screening program to estimate the posterior probability that a patient has a disease, given that the patient has tested positive, which of the following information must be known? A. Prior probability, sensitivity, and false-negative error rate B. Prevalence, sensitivity, and specificity C. Prevalence, specificity, and posterior probability D. Prior probability, false-positive error rate, and incidence E. False-positive error rate, sensitivity, and false-negative error rate

6. B. The only choice that includes everything that must be known to use Bayes theorem, in terms appropriate for a screening program, is (B). The prevalence (if discussing a population) or prior probability (if discussing an individual patient) must be known, ruling out (E). Sensitivity must be known, ruling out (C) and (D). Specificity (or 1 - false-positive error rate) must be known, ruling out (A). The numerator for Bayes theorem is the sensitivity of the test being used, multiplied by the prevalence. The denominator contains the numerator term plus a term that is the false-positive error rate, or (1 - specificity) × (1 - prevalence). The numerator of Bayes theorem describes cell a in a 2 × 2 table (see Table 7.1), and the denominator includes the term from the numerator (cell a, or true-positive results) and adds to it the term for the false-positive results (cell b from the 2 × 2 table). Bayes theorem can thus be rewritten as a/(a + b), which is the formula for the positive predictive value. This is what Bayes theorem is used to determine, so it is not necessary to know the predictive value before using the theorem.
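The relationship described above can be written as a small function. This is a minimal sketch; the screening numbers in the example are hypothetical:

```python
def bayes_posterior(prevalence, sensitivity, specificity):
    # Posterior probability of disease given a positive test:
    # numerator = sensitivity * prevalence (true positives, cell a);
    # the denominator adds the false positives,
    # (1 - specificity) * (1 - prevalence) (cell b)
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical screening program: 1% prevalence, 90% sensitivity,
# 95% specificity; the posterior probability is only about 15%
print(round(bayes_posterior(0.01, 0.90, 0.95), 3))  # 0.154
```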

6. For a test of the latest athlete's foot treatment, alpha is set at 0.01 and beta is set at 0.30. In a two-tailed test, the new treatment is superior to the standard of care at p = 0.04, producing feet that are 1 point less malodorous on a 100-point olfactory assault scale in 5% of patients. This result: A. Shows a statistically significant difference between therapies B. Shows a clinically meaningful difference between therapies C. Would be less significant if the test were one tailed D. Favors continued use of the standard of care in clinical practice E. Would allow investigators to reject the null hypothesis

6. D. The p value in this case is not less than the predetermined alpha of 0.01. Thus the result shows no statistically significant difference between therapies (A) and would therefore not allow investigators to reject the null hypothesis of no difference between therapies (E). The new treatment does not appear to be different from usual therapy in a statistically significant way; thus the results of the study favor continued use of the standard of care. The difference between therapies would have been more significant by a one-tailed test (C). However, even if p were then less than alpha, the clinical difference (1 point on a 100-point olfactory scale, in only 5% of patients) is trivial, so the result does not show a clinically meaningful difference between therapies (B).

6. Two groups of participants are assembled on the basis of whether they can identify a newt in a pondlife sample. The groups are then asked to rate the probability that industrial emissions cause global warming, using a scale with five choices, ranging from "improbable" to "highly probable." The industrial emissions data in this study are: A. Continuous B. Pseudocontinuous C. Dichotomous D. Nominal E. Ordinal

6. E. Ordinal data are data that can be ranked from lowest to highest, but on scales that are subjective and that do not necessarily have equal intervals between values, as interval/pseudocontinuous data (B) do, or that can assume any value between intervals, as with continuous data (A). Unlike nominal (D) or dichotomous data (C), which are merely categoric without direction (e.g., red, yellow, blue or yes, no, respectively), the scale in the described study clearly has directionality (with choices ranging from "improbable" to "highly probable").

6. A study designed to test the effects of sleep deprivation on academic performance among medical students is conducted by yet another group. The investigators use separate intervention and control groups. A 10-point difference (10% difference) in test scores is considered meaningful. The standard deviation of test scores in a similar study was 8. Alpha is set at 0.05 (two-tailed test), and beta is set at 0.2 (one-tailed test). The values for z1−α and z1−β, for the variance, and for the minimum difference remain the same. The required sample size for this study is: A. [(0.05)2 × 8]/(10)2 B. [(1.96)2 × (8)2]/(10)2 C. [(0.05 + 0.2)2 × (8)2]/(10)2 D. [(1.96 + 0.84)2 × (8)2]/(10)2 E. [(1.96 + 0.84)2 × 2 × (8)2]/(10)2

6. E. The formula for this calculation is shown in Box 12.2. The difference between this equation and the equation for the before-and-after study in the previous question is the 2 in the numerator. This equation calculates the sample size for a two-sample t-test (sample needed for each of two separate groups: test and control) rather than a paired t-test (number needed for a single group serving as its own control). The total number of participants in this case is not N (the number of participants in each group, as it would be for the single group of a paired t-test) but 2N (the number of participants needed in total). In this case the total number of participants needed is 22 (10.04 rounded up to 11 and doubled), almost four times the number needed for the analogous paired study. Incorrect answer choices for this question are in the explanation for question 4.
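The effect of the factor of 2 in the numerator can be verified directly. A minimal sketch using the study's values:

```python
import math

def two_sample_n_per_group(z_alpha, z_beta, sd, diff):
    # Per-group N for a two-sample t-test; note the extra factor of 2
    # relative to the paired (before-after) formula
    return math.ceil((z_alpha + z_beta) ** 2 * 2 * sd ** 2 / diff ** 2)

n = two_sample_n_per_group(1.96, 0.84, sd=8, diff=10)
print(n, 2 * n)  # 11 per group (rounded up from 10.04), 22 in total
```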

7. A study is designed to test the effects of sleep deprivation on academic performance among medical students. A 10-point difference (10% difference) in test scores is considered meaningful. The standard deviation of test scores in a similar study was 8. Alpha is set at 0.05 (two-tailed test), and beta is set at 0.2. The before-after study in the previous question is revised to detect a difference of only 2 points (2%) in test scores. All other parameters of the original study remain unchanged, except investigators are undecided as to whether to use a single group of participants for test and control conditions or two separate groups of participants. Given this uncertainty, the required sample size would: A. Be unaffected regardless of whether the study had a single group of participants (serving as their own controls) or two separate groups of participants (test group and control group) B. Increase regardless of whether the study had a single group of participants (serving as their own controls) or two separate groups of participants (test group and control group) C. Increase only if the study had a single group of participants serving as their own controls D. Decrease regardless of whether the study had a single group of participants (serving as their own controls) or two separate groups of participants (test group and control group) E. Decrease only if the study had a single group of participants serving as their own controls

7. B. As already demonstrated, if investigators switch from a single study group to two groups, the sample size will necessarily increase considerably (see explanation to question 6). Even if they decide to keep a single study group, however, reducing the difference to be detected necessarily means that the sample size will need to increase substantially. Such an increase makes intuitive sense because it is more difficult to detect a smaller difference, so more study participants are required. The sample size will not stay the same (A) or decrease (D and E) in either possible scenario; it will increase in both scenarios, and especially so if two separate groups are used rather than a single group serving as its own control (C). In fact, in the scenario with one study group the sample size needed to detect a 2% difference would be 126 (125.44 rounded up), whereas with two groups it would be 502 (250.88 rounded up to 251 per group and doubled).

7. In regard to the scenario in question 6, to analyze the responses of the two groups statistically, researchers might compare the data on the basis of: A. Means B. Ranking C. Standard deviation D. Standard error E. Variance

7. B. Researchers often use the nonparametric methods of statistical analysis for ordinal data, such as comparative approaches based on rankings. Ordinal data do not have a mean (A) or a definable variance (E) and cannot be characterized by a standard deviation (C) or standard error (D).

What is the standard error for these data? A. 33.8/√7 B. 33.8/√6 C. 33.8/√5 D. √33.8/6 E. √33.8/5

7. B. The standard error (SE) is the standard deviation (SD) divided by the square root of the sample size, or SE = SD/√N. In this case, SD is 33.8 and the sample size is 6, so the SE is 33.8/√6. Thus answer choices A, C, D, and E are incorrect. The SE is smaller than the SD. This is to be expected conceptually and mathematically. Conceptually, SD is a measure of dispersion (variation) among individual observations, and SE is a measure of variation among means derived from repeated trials. One would expect that mean outcomes would vary less than their constituent observations. Mathematically, SE is SD divided by the square root of the sample size. The larger the sample size, the smaller the SE and the greater the difference between the SD and SE.
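The calculation is simple enough to verify directly (the raw observations behind the SD of 33.8 are not reproduced here):

```python
import math

# SD and sample size as given in the question
sd, n = 33.8, 6
se = sd / math.sqrt(n)   # SE = SD / sqrt(N)
print(round(se, 1))      # 13.8, smaller than the SD of 33.8, as expected
```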

7. A 62-year-old woman complains that during the past several months she has been experiencing intermittent left-sided chest pain when she exercises. Her medical records indicate she has a history of mild dyslipidemia, with a high-density lipoprotein (HDL) cholesterol level of 38 mg/dL and a total cholesterol level of 232 mg/dL. She says she has a family history of heart disease in male relatives only. Her current blood pressure is 132/68 mm Hg, and her heart rate is 72 beats/min. On cardiac examination, you detect a physiologically split second heart sound (S2) and a faint midsystolic click without appreciable murmur. The point of maximal impulse is nondisplaced, and the remainder of the examination is unremarkable. You are concerned that the patient's chest pain may represent angina pectoris and decide to initiate a workup. You estimate that the chance of the pain being angina is 1 in 3. You order an electrocardiogram, which reveals nonspecific abnormalities in the ST segments and T waves across the precordial leads. You decide to order a perfusion stress test, the sensitivity of which is 98% and the specificity of which is 85% for ischemia. The stress test results are positive, showing reversible ischemia in the distribution of the circumflex artery on perfusion imaging, with compatible electrocardiogram changes. The prior probability of angina pectoris is: A. 2% B. 15% C. 33% D. 67% E. Unknown

7. C. Prior probability (for an individual) is analogous to prevalence (for a population) and represents an estimate of the likelihood of disease. In this case, the prior probability is your estimate of the likelihood of angina pectoris before any testing. You estimate that the chance of the patient's pain being angina pectoris is 1 in 3, which is a 33% (1/3) prior probability. This value is easily calculated and thus not unknown (E). The probability of the pain not being angina pectoris is 1 - (1/3) = 2/3, or 67% (D). The 2% (A) is 1 - sensitivity of the perfusion stress test and represents the false-negative error rate (FNER) for the test; 15% (B) is 1 - specificity of the perfusion stress test and represents the false-positive error rate (FPER) for the test.

8. In the trial described in question 7, the appropriate test of statistical significance is: A. The critical ratio B. The odds ratio C. The two-sample t-test D. The paired t-test E. The z-test

8. D. A t-test is appropriate whenever two means are being compared and the population data from which the observations are derived are normally distributed. When the data represent pretrial and posttrial results for a single group of subjects (i.e., when participants serve as their own controls), the paired t-test is appropriate. Conversely, when the two means are from distinct groups, a two-sample t-test (C) is appropriate. The paired t-test is better able to detect a statistically significant difference than the two-sample t-test when the data are paired, because the variation has been reduced to that from one group rather than two groups. A critical ratio (A) is not a specific test of statistical significance, but a metric common across many tests of statistical significance: a ratio of some parameter over the standard error for that parameter, allowing for the calculation of a p value. In the case of t-tests (paired or two-sample tests), the parameter in question is the difference between two means. The odds ratio (B) is a measure of association rather than a test of statistical significance (see Chapter 6). The odds ratio would be appropriate if, for example, one were looking at participants with low versus high cholesterol values and comparing who had eaten oat bran and who had not. The z-test (E) is appropriate when considering differences in proportions rather than means (e.g., the proportion of high-cholesterol participants having eaten oat bran vs. the corresponding proportion of low-cholesterol participants).
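The paired t statistic itself is simply the mean within-person difference divided by its standard error. A minimal sketch with hypothetical pretrial and posttrial cholesterol values:

```python
import math

def paired_t(before, after):
    # t = mean of the paired differences / (SD of differences / sqrt(n)),
    # evaluated against the t distribution with n - 1 degrees of freedom
    d = [a - b for a, b in zip(after, before)]
    n = len(d)
    mean_d = sum(d) / n
    sd_d = math.sqrt(sum((x - mean_d) ** 2 for x in d) / (n - 1))
    return mean_d / (sd_d / math.sqrt(n))

# Hypothetical cholesterol values (mg/dL) for four participants
before = [220, 240, 210, 230]
after = [210, 238, 205, 224]
print(round(paired_t(before, after), 2))  # about -3.48
```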

8. Considering the details from question 7, the posterior probability of angina pectoris after the stress test is: A. 10% B. 33% C. 67% D. 76% E. Unchanged

8. D. Either Bayes theorem or a 2 × 2 table can be used to calculate the posterior probability. Use of Bayes theorem: The sensitivity is provided in the vignette as 98%, or 0.98. The prior probability estimate is substituted for the prevalence and is 33%, or 0.33. The false-positive error rate is (1 - specificity). The specificity is provided as 85%, so the false-positive error rate is 15%, or 0.15. In this case, (1 - prevalence) is the same as (1 - prior probability) and is (1 - 0.33), or 0.67. The posterior probability is then (0.98 × 0.33)/[(0.98 × 0.33) + (0.15 × 0.67)] = 0.32/0.42, or approximately 76%. Use of a 2 × 2 table: To use this method, an arbitrary sample size must be chosen. Assuming that the sample size is 100, the following is true: Cell a is the true-positive results, or sensitivity × prevalence. The prior probability becomes the prevalence. Cell a is (0.98)(33) = 32.3 (rounded to 32). Cells a plus c must sum to 33, so cell c is 0.7 (rounded to 1). Cell d is the true-negative result, which is specificity × (1 - prevalence), or (0.85)(67) = 57. Cells b plus d must sum to 67, so cell b is 10. When the 2 × 2 table is established, the formula for positive predictive value, which is a/(a + b), can be used to calculate the posterior probability: 32/(32 + 10) = 76%. Given either method of reasoning or calculation, we arrive at the same answer, and all other answer choices, (A), (B), (C), and (E), can be excluded.
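The 2 × 2 route above can be reproduced numerically. A minimal sketch using the vignette's values and the assumed sample size of 100:

```python
prevalence, sensitivity, specificity = 0.33, 0.98, 0.85
n = 100                                   # arbitrary, as the explanation notes

a = sensitivity * prevalence * n          # true positives, about 32
c = prevalence * n - a                    # false negatives, about 1
d = specificity * (1 - prevalence) * n    # true negatives, about 57
b = (1 - prevalence) * n - d              # false positives, about 10

ppv = a / (a + b)                         # posterior probability = a / (a + b)
print(round(ppv, 2))  # 0.76
```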

8. In regard to the scenario in question 6, the appropriate statistical method for the analysis would be: A. Chi-square analysis B. Linear regression analysis C. Nonparametric analysis D. Fisher exact probability test E. Two-sample t-test

8. C. Nonparametric methods of statistical analysis are often used for ordinal data and are based on the ranking of the data. To analyze the probability that industrial emissions cause global warming (ordinal responses provided by the two groups of participants), the Mann-Whitney U-test or Kruskal-Wallis test would be appropriate (see Table 10.1). The chi-square test (A) and its nonparametric counterpart, the Fisher exact probability test (D), are appropriate when dealing with nominal/categoric data (including dichotomous data). Linear regression (B) requires a continuous outcome. If in this case we were looking not at the presence or absence of newts but at the number of newts in a pond sample, and we had some measure of global warming as an independent variable, then linear regression might be appropriate provided certain assumptions were met (although Poisson regression, describing uncommon events, would probably be more appropriate in such a case; see Chapter 11). A two-sample t-test (E) could be used if we had a continuous measure of global warming as our dependent variable (e.g., concentration of carbon emissions in the atmosphere) and a dichotomous grouping variable (e.g., newts vs. no newts in the sample).
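The rank-based logic of the Mann-Whitney U-test can be shown in a few lines. This sketch computes U by direct pairwise comparison; the ordinal ratings are hypothetical:

```python
def mann_whitney_u(x, y):
    # U counts, over all (x, y) pairs, how often an x ranks below a y,
    # with ties counting one-half; for small samples the smaller of
    # U and n1*n2 - U is compared against a table of critical values
    u = 0.0
    for xi in x:
        for yj in y:
            if xi < yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u

# Hypothetical 5-point ordinal ratings from the two groups of participants
group_a = [4, 5, 5, 3]
group_b = [2, 3, 1, 2]
print(mann_whitney_u(group_a, group_b))  # 0.5
```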

9. In the trial described in question 7, by mere inspection of the data, what can you conclude about the difference in pretrial and posttrial cholesterol values? A. Even if clinically significant, the difference cannot be statistically significant B. Even if statistically significant, the difference is probably not clinically significant C. If clinically significant, the difference must be statistically significant D. If statistically significant, the difference must be clinically significant E. The difference cannot be clinically or statistically significant

9. B. Statistical significance and clinical significance are not synonymous. A clinically important intervention might fail to show statistical benefit over another intervention in a trial if the sample size is too small to detect the difference. Conversely, a statistically significant difference in outcomes may result when the sample size is very large, but the magnitude of difference may have little or no clinical importance. In the cholesterol-lowering diet described, inspection of the pretrial and posttrial data suggests that the data are unlikely to result in statistical significance, but one cannot be certain of this without formal hypothesis testing (and formal testing confirms there is no effect). Regardless, the data do not show a clinically significant effect, because greater changes in cholesterol would be needed to affect clinical outcomes (e.g., heart attacks) or even intermediate outcomes (e.g., reductions in cholesterol toward target levels). In contrast to statistical significance, which is purely numerical, clinical significance is the product of judgment. Statistical and clinical significance are separate, independent considerations, and thus answer choices C and D are incorrect. Answers A and E are incorrect because the difference could conceivably be statistically significant, but there is no way to tell by mere inspection alone.

9. Based on the previous findings in questions 7 and 8, you decide to treat the 62-year-old female patient for angina pectoris. You prescribe aspirin, a statin, and a beta-adrenergic receptor antagonist (beta blocker) to be taken daily, and you prescribe sublingual nitroglycerin tablets to be used if pain recurs. Pain does recur, with increasing frequency and severity, despite treatment. After you consult a cardiologist, you recommend cardiac catheterization. Assume that the sensitivity of cardiac catheterization for coronary artery disease is 96%, and that the specificity is 99%. When the procedure is performed, it yields negative results. At this stage of the workup, the prior probability of angina pectoris is: A. 11% B. 33% C. 76% D. 96% E. Unknown

9. C. As discussed in the explanation for question 5, prior and posterior probabilities may flow freely into one another in the sequence of diagnostic workups. In this case the disease probability after the stress test (posterior probability) is the pretest or prior probability before the catheterization. For this reason, the explanation for question 9 and its supporting logic are the same as for question 8. After the stress test, the posterior probability of angina is 76%. This is also the probability of angina before any further diagnostic studies, such as catheterization. For these reasons, answer choices (A), (B), (D), and (E) are incorrect.

9. Given the following data comparing two drugs, A and B, you can conclude: A. Drug A shows superior survival at 5 years B. Drug B shows superior survival at 5 years C. A rational patient would choose drug A D. A rational patient would choose drug B E. Drug A is initially less beneficial than drug B

9. E. The figure shows a life table analysis with perplexing results: survival curves that cross. At 5 years, neither drug A (A) nor drug B (B) shows superior survival. Five years is the intersection point where survival is momentarily the same for the two drugs: 75%. With drug A, a number of patients seem to die rapidly, but then survival for remaining patients is stable at about 75%. Drug B shows immediate but short-lived high survival. With drug B, almost all patients live until about year 4, when a large percentage start dying precipitously. Deaths in patients taking drug B do not plateau until about year 6, and then the surviving 50% of patients go on for the next 4 years with no fatal events. From the figure we can conclude that drug A is initially less beneficial than drug B. However, it is unclear overall which drug is better. If a rational patient had to choose one drug or the other, whether he or she would choose drug A (C) or drug B (D) would depend on individual preferences, life considerations, and priorities. It would also depend on knowing what the natural history of survival is for the disease untreated (e.g., if the natural history of the disease is that almost 100% of patients survive to year 8 and 90% survive to year 10 untreated, then neither drug is a good option because both produce premature mortality).

1. In drafting Goldilocks and the Three Bears, before settling on the ordinal scale of "too hot, too cold, just right," story authors first considered describing the porridge in terms of (a) degrees Kelvin (based on absolute zero), (b) degrees Fahrenheit (with arbitrary zero point), and even (c) "sweet, bitter, or savory." Respectively, these three unused candidate scales are: A. Ratio, continuous, nominal B. Nominal, ratio, ordinal C. Dichotomous, continuous, nominal D. Continuous, nominal, binary E. Ratio, continuous, ordinal

A. Scales based on true or absolute zero, such as the Kelvin temperature scale, are ratio. On a ratio scale, 360 degrees (hot porridge) would be twice as hot as 180 degrees (subfreezing porridge). The same would not be true on a scale such as Fahrenheit with an arbitrary zero point; 200°F (hot porridge) would not be twice as hot as 100°F (lukewarm porridge). The Fahrenheit scale is not ratio, but only continuous. Data in named categories without implied order, such as "sweet, bitter, or savory," are nominal (i.e., categoric). For these reasons, answer A is correct and B, C, D, and E are incorrect. None of the candidate scales demonstrates binary (i.e., dichotomous) variables (C or D). Dichotomous (or binary) is just a special case of nominal (or categoric) when there are only two possible categories (e.g., yes/no, true/false, positive/negative, yummy/yucky).
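The ratio-versus-arbitrary-zero distinction can be checked numerically. A small sketch (the temperatures are the ones used in the explanation above):

```python
def f_to_kelvin(f):
    """Convert Fahrenheit (arbitrary zero point) to Kelvin (true absolute zero)."""
    return (f - 32) * 5 / 9 + 273.15

# On the Kelvin (ratio) scale, 360 K really is twice as hot as 180 K.
print(360 / 180)  # 2.0

# On the Fahrenheit scale, 200°F is NOT twice as hot as 100°F once both
# are expressed relative to absolute zero:
ratio = f_to_kelvin(200) / f_to_kelvin(100)
print(round(ratio, 2))  # roughly 1.18, not 2
```

Doubling a Fahrenheit reading does not double the underlying physical quantity, which is exactly why Fahrenheit is continuous but not ratio.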

To determine whether this diet is effective in promoting weight loss, you intend to perform a statistical test of significance on the differences in weights before and after the intervention. Unfortunately, you do not know how to do this until you read Chapter 10. What you do know now is that to use a parametric test of significance: A. The data in both data sets must be normally distributed (Gaussian) B. The data must not be skewed C. The distribution of weight in the underlying population must be normal (Gaussian) D. The means for the two data sets must be equal E. The variances for the two data sets must be equal

C. All so-called parametric tests of significance rely on assumptions about the parameters that define a frequency distribution (e.g., mean and standard deviation). To employ parametric methods of statistical analysis, the data being analyzed need not be normally distributed (A) and may be skewed (B). Neither the means (D) nor the variances (E) of two data sets under comparison need to be equal. However, the means of repeated samples drawn from the underlying population should be normally distributed. To employ a parametric test of significance, one should assume that the distribution of weight in the general population is normal, but even this assumption can usually be relaxed because of the central limit theorem.
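The central limit theorem invoked above is easy to demonstrate by simulation. A minimal sketch: draws come from a heavily right-skewed (exponential) population, yet the means of repeated samples cluster tightly and symmetrically around the population mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is reproducible

# 2000 samples of size 50 from an exponential population (mean = 1.0, skewed).
sample_means = [
    statistics.mean(random.expovariate(1.0) for _ in range(50))
    for _ in range(2000)
]

# The distribution of sample means is approximately normal, centered
# near the population mean of 1.0, despite the skewed parent population.
print(round(statistics.mean(sample_means), 2))
```

This is why the normality assumption on the raw data can usually be relaxed for reasonably large samples.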

8. Regarding question 3, for distribution A in Fig. 8.8 (see text): A. The distribution is normal (Gaussian) B. Mean > median > mode C. Mean < median < mode D. Mean = median = mode E. Outliers pull the mean to the right

C. The distribution in Fig. 8.8 is left skewed. Outliers in the data pull what might otherwise be a normal distribution (A) to the left (not right, E). For normally distributed data, the mean = median = mode (D); for left-skewed data, the mean < median < mode, not the other way around, as would be the case for a right-skewed distribution (B).
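The mean < median < mode ordering for a left-skewed distribution can be verified on a small hypothetical data set (the values below are illustrative, not from Fig. 8.8): a single low outlier drags the mean below the median, which in turn sits below the mode.

```python
import statistics

# A small left-skewed data set: the long tail of low values (the 1)
# pulls the mean down more than the median, and the mode stays high.
data = [1, 7, 8, 8, 9, 9, 9, 10]

mean = statistics.mean(data)      # 7.625
median = statistics.median(data)  # 8.5
mode = statistics.mode(data)      # 9

print(mean, median, mode)  # mean < median < mode
```

Flipping the data (a high outlier with a low mode) reverses the ordering, giving the mean > median > mode pattern of a right-skewed distribution.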

2. When 100 students were asked to read this chapter, the mean amount of time it took for each student to vaguely comprehend the text was 3.2 hours, with a 95% confidence interval of 1.4 to 5.0 hours. The 95% confidence interval in this case represents: A. The sample mean of 3.2 ± 1 standard deviation B. The population mean of 3.2 ± the standard error C. The maximum and minimum possible values for the true population mean D. The range of values for which we can be 95% confident contains the sample mean E. The range of values for which we can be 95% confident contains the population mean

E. The 95% confidence interval is the range of values within which we can be reasonably certain the true population mean lies. Specifically, an investigator can be 95% confident that the true population mean lies within this range. The sample mean (D) serves as the midpoint of the 95% confidence interval. The sample mean can be determined with complete confidence (it is calculated directly from known sample values); thus there is a 100% chance that the 95% confidence interval contains the sample mean, and a 0% chance that the value falls outside this range. The sample mean ± 1 standard deviation (A) is a convention for reporting values in the medical literature. The sample mean ± 1 standard error is also sometimes seen in medical literature, but not the population mean ± standard error (B), because usually the population mean is not known. To determine the maximum and minimum possible values for a population mean (C), you would need to have a 100% confidence interval (i.e., the sample mean ± infinity standard errors). In other words, all possible values would have to be known.
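The arithmetic behind the reported interval can be reconstructed. A small sketch using the numbers from the question (mean 3.2, 95% CI 1.4 to 5.0) and the standard z critical value of 1.96:

```python
# Back out the standard error from the reported 95% confidence interval.
mean, lower, upper = 3.2, 1.4, 5.0
z = 1.96  # critical value for a 95% confidence level

se = (upper - lower) / (2 * z)       # half-width of the interval divided by z
ci = (mean - z * se, mean + z * se)  # reconstructs the reported interval

print(round(se, 2))  # standard error implied by the interval, about 0.92
print(ci)
```

Note that the interval is symmetric about the sample mean, which is why the sample mean is always its midpoint.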

The mode of the weights before the intervention is: A. Greater than the mode of weights after the intervention B. Less than the mode of weights after the intervention C. The same as the mode of weights after the intervention D. Undefined E. Technically equal to every value of weight before the intervention

E. The mode is the most frequently occurring value in a set of data and is thus not undefined (D). Before the intervention, there are 10 different values for weight. Each value occurs only once. Thus there is a 10-way tie for the most frequently occurring value in the set, and technically each point can be considered a mode (i.e., the data set is decamodal). Since there is no single mode before the intervention, it would be incorrect to speak of a single value as being greater than (A), less than (B), or equal to (C) the mode of weights after the intervention.
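Python's `statistics.multimode` makes the 10-way tie concrete. The "before" weights are the ten values listed in question 6's answer choices; the "after" weights below are hypothetical, added only to contrast with a data set that has one well-defined mode.

```python
from statistics import multimode

# Ten distinct pre-intervention weights (from question 3): each occurs once,
# so every value is tied for "most frequent" and all ten are modes.
before = [81, 79, 92, 112, 76, 126, 87, 75, 68, 78]
print(len(multimode(before)))  # 10 -- the data set is "decamodal"

# Hypothetical post-intervention weights with a repeated value:
after = [80, 80, 85, 90, 90, 90, 75, 70, 70, 72]
print(multimode(after))  # [90] -- a single, well-defined mode
```

(`statistics.mode` would raise or arbitrarily pick for the tied data, which is why `multimode` is the safer call here.)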

2. When a patient is asked to evaluate his chest pain on a scale of 0 (no pain) to 10 (the worst pain), he reports to the evaluating clinician that his pain is an 8. After the administration of sublingual nitroglycerin and high-flow oxygen, the patient reports that the pain is now a 4 on the same scale. After the administration of morphine sulfate, given as an intravenous push, the pain is 0. This pain scale is a: A. Continuous scale B. Dichotomous scale C. Nominal scale D. Qualitative scale E. Ratio scale

E. The scale described is a ratio scale (i.e., for a continuous variable with a true 0 point). The pain scale has a true 0, indicating the absence of pain. A score of 8 on the scale implies that the pain is twice as severe as pain having a score of 4. Of course, the exact meaning of "twice as much pain" may be uncertain, and the concept of "pain" itself is quite subjective. Thus as opposed to other ratio scales (e.g., blood pressure), which have greater comparability between individuals, on a pain scale one person's 4 may be another's 7. For the clinician attempting to alleviate a patient's pain, this potential difference is of little importance. The pain scale provides essential information about whether the pain for a given individual is increasing or decreasing and by relatively how much. Although the pain scale is indeed continuous (A), it is a special and more specific case of continuous (i.e., ratio). Dichotomous scales (B) are binary (only two options) and are a special case of nominal scales (C). Neither binary nor nominal applies to the 0 to 10 pain scale discussed in this question. Qualitative measures (D) are completely devoid of objective scales by definition.

6. Regarding question 3, assuming a mean weight of m, the variance of the weights before the intervention is: A. (81 + 79 + 92 + 112 + 76 + 126 + 87 + 75 + 68 + 78) - m²/10 B. [(81 + 79 + 92 + 112 + 76 + 126 + 87 + 75 + 68 + 78) - m]²/(10 - 1) C. [(81 - m) + (79 - m) + (92 - m) + (112 - m) + (76 - m) + (126 - m) + (87 - m) + (75 - m) + (68 - m) + (78 - m)]²/(10 - 1) D. [(81 - m) + (79 - m) + (92 - m) + (112 - m) + (76 - m) + (126 - m) + (87 - m) + (75 - m) + (68 - m) + (78 - m)]²/(10 - m) E. [(81 - m)² + (79 - m)² + (92 - m)² + (112 - m)² + (76 - m)² + (126 - m)² + (87 - m)² + (75 - m)² + (68 - m)² + (78 - m)²]/(10 - 1)

E. The variance is the square of the standard deviation. The numerator for variance is the sum of squared deviations from the mean, Σ(x - m)². In other words, after the mean is subtracted from the first observation and the difference is squared, the process is repeated for each observation in the set, and the values obtained are all added together. The numerator is then divided by the degrees of freedom (N - 1). In this case of observed weights before the intervention, N is 10. Choice D divides by the wrong quantity (10 - m rather than the degrees of freedom), as does choice A (which divides by 10); choices A, B, and C also use the wrong calculation for the numerator.
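Choice E can be spelled out directly in code and checked against the standard library. A short sketch using the ten pre-intervention weights listed in the answer choices:

```python
import statistics

# Pre-intervention weights from question 3, as listed in the answer choices.
weights = [81, 79, 92, 112, 76, 126, 87, 75, 68, 78]
m = statistics.mean(weights)  # the mean weight, 87.4

# Choice E spelled out: sum of squared deviations from the mean,
# divided by the degrees of freedom, N - 1.
variance = sum((x - m) ** 2 for x in weights) / (len(weights) - 1)

print(round(variance, 1))  # agrees with statistics.variance(weights)
```

`statistics.variance` uses this same N - 1 (sample) denominator; `statistics.pvariance` is the N-denominator population version, corresponding to the wrong divisor in choice A.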

