Biostatistics: Correlations and Regression Analysis as Related to GI Cases + Q(Jung)

zero

A confidence interval that doesn't overlap ____ means that the correlation coefficient is significantly different from ____.

one

A confidence interval that doesn't overlap ____ means that the odds ratio is significant for regressions.

linear

A correlation coefficient is a measure of ______ association between 2 variables.

no

A correlation coefficient of 0 indicates __ correlation.

negative

A correlation coefficient of negative 1 (r=-1) indicates a perfect ________ correlation.

positive

A correlation coefficient of positive 1 indicates a perfect ________ correlation.
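
As a minimal sketch of what the cards above describe, the Pearson correlation coefficient can be computed by hand. The function name `pearson_r` and the small data sets are hypothetical, chosen only so that the results land exactly on +1 and -1.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]                               # hypothetical data
perfect_positive = pearson_r(xs, [2, 4, 6, 8, 10])  # y rises with x
perfect_negative = pearson_r(xs, [10, 8, 6, 4, 2])  # y falls as x rises

print(perfect_positive, perfect_negative)
```

The sign of the result gives the direction of the association and its absolute value gives the strength.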

McNemar's test (Rationale: a. Incorrect. In the given study design, the data are paired, so an assumption for a chi-square test is violated. b. Incorrect. Student's t-test is a method for continuous type data, while the given data type is categorical.) c. Correct. In the given study design, the data are paired, so an appropriate method is McNemar's test. (d. Incorrect. Analysis of variance is a method for continuous type data, while the given data type is categorical. e. Incorrect. Mann-Whitney U test is a method for nonparametric continuous type data, while the given data type is categorical.)

A group of researchers conducted a research study about different screening strategies for colorectal cancer on 2000 patients. They specifically compared FOBT and sigmoidoscopy in their neoplasia detection rates. They had all of the eligible patients complete both procedures. What statistical test would have been used in this comparison? A. Chi-square test B. Student's t-test C. McNemar's test D. Analysis of variance E. Mann-Whitney U test
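
To illustrate why paired data call for McNemar's test, here is a rough sketch of the statistic computed from the discordant pairs only (patients positive on one test but not the other). The counts are hypothetical and not taken from any study, and the version shown omits the continuity correction that some textbooks apply.

```python
import math

def mcnemar_test(b, c):
    """McNemar's chi-square (no continuity correction) from discordant pair counts."""
    statistic = (b - c) ** 2 / (b + c)
    # Upper-tail probability of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical discordant counts among patients who had both procedures:
fobt_only = 40            # neoplasia detected by FOBT but not by sigmoidoscopy
sigmoidoscopy_only = 20   # neoplasia detected by sigmoidoscopy but not by FOBT

statistic, p_value = mcnemar_test(fobt_only, sigmoidoscopy_only)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

Pairs where both tests agree do not enter the statistic, which is the practical difference from an ordinary chi-square test.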

C. Chi-square test is the right statistical test since the data type is categorical comparing two independent groups (Rationale: a. Incorrect. The data from Fecal DNA Panel are not paired with the data from Hemoccult II. b. Incorrect. They want to compare categorical data between two independent groups.) c. Correct. Chi-square test is the right statistical test since the data type is categorical comparing two independent groups. (d. Incorrect. They want to compare categorical data between two independent groups. e. Incorrect. They want to compare categorical data between two independent groups.)

A group of researchers conducted research about different screening strategies for colorectal cancer on 2000 patients. They specifically compared Fecal DNA Panel and Hemoccult II in their detection rates. They randomly assigned eligible patients to either Fecal DNA Panel or Hemoccult II but not both. Which of the following statements correctly describes the study design and a suitable statistical test? A. The data from Fecal DNA Panel are paired with the data from Hemoccult II B. They want to compare categorical data between two dependent groups C. Chi-square test is the right statistical test since the data type is categorical comparing two independent groups D. Analysis of variance is the right statistical test since the data type is continuous comparing three independent groups E. McNemar's test is the right statistical test since the data type is categorical comparing two dependent groups

Chi-square test

A group of researchers conducted research about different screening strategies for colorectal cancer. They specifically compared FOBT and sigmoidoscopy in their neoplasia detection rates. They randomly assigned eligible patients to either FOBT or sigmoidoscopy but not both. What statistical test would have been used in this comparison? A. Chi-square test B. Student's t-test C. McNemar's test D. Analysis of variance E. Kruskal-Wallis test
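
For two independent groups with a categorical outcome, a chi-square test compares the observed counts with the counts expected if detection were unrelated to group. The sketch below does this by hand for a 2x2 table. The counts and the function name `chi_square_2x2` are assumptions made for illustration only.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test for a 2x2 table of counts (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and outcome were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # Upper-tail probability of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: rows are the two randomized groups,
# columns are (neoplasia detected, not detected).
table = [[30, 970],
         [60, 940]]

statistic, p_value = chi_square_2x2(table)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```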

A. Tobacco smoking is a significant risk factor showing 2.5 times higher odds of early onset (Rationale: a. Correct. b. Incorrect. Alcohol drinking is not a significant risk factor, since the 95% confidence interval of its odds ratio overlaps the null value 1. c. Incorrect. Obesity is not a significant risk factor, since the 95% confidence interval of its odds ratio overlaps the null value 1. d. Incorrect. Gender is not a significant risk factor, since the 95% confidence interval of its odds ratio overlaps the null value 1. e. Incorrect. None of alcohol drinking, obesity, and gender is a significant factor, since the 95% confidence intervals of their odds ratios overlap the null value 1.)

A group of researchers were interested in the role of smoking in early onset of colorectal pathology, and compared the age of colorectal cancer diagnosis between the groups of smokers and non-smokers with other probable risk factors including alcohol drinking, obesity, and gender. To analyze data, they performed a multiple logistic regression to predict an early onset (before age 60 yr) of colorectal cancer from the four risk factors, and the results were presented in the table. Which of the following statements interprets correctly the odds ratios and their 95% confidence intervals in the last column of the table? (see attachment) A. Tobacco smoking is a significant risk factor showing 2.5 times higher odds of early onset B. Alcohol drinking shows a significant difference from no-alcohol drinking to predict an early onset C. Being obese is a significant risk factor to predict an early onset D. Being a male shows a significant difference from being a female to predict an early onset E. Alcohol drinking, obesity, and being a male showed significantly protective effects delaying the onset

significant

A p-value greater than α (= 0.05) means we fail to reject the null hypothesis, and the result is not:

correlation coefficient

A p-value less than α (= 0.05) means that the _______ ___________ is significantly different from zero. (hint: r≠0)

regression model

A p-value less than α (= 0.05) means that the __________ _____ fits the data well, significantly better than using the average value. (hint: r^2 is better fit than avg values)

.05

A p-value less than α or ____ means null hypothesis can be rejected, aka significance.

curvilinear

A strong ___________ relationship may be identified as no LINEAR correlation, which is a problem with linear correlation.
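
A small sketch can make this limitation concrete: below, y is determined entirely by x (y = x squared), yet the linear correlation coefficient comes out as zero. The data and the helper name `pearson_r` are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical U-shaped data: y depends perfectly on x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)
```

This is one reason to draw a scatter plot before relying on r alone.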

small

According to the guidelines offered by Cohen, an r value of 0.1 indicates a _____ correlation.

medium

According to the guidelines offered by Cohen, an r value of 0.3 indicates a _____ correlation.

large

According to the guidelines offered by Cohen, an r value of 0.5 indicates a _____ correlation.
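
The three Cohen cards above can be summarized as a lookup on the absolute value of r. This is only a sketch: treating 0.1, 0.3, and 0.5 as lower cut-offs, and the label "negligible" for values below 0.1, are assumptions not stated in the cards.

```python
def cohen_label(r):
    """Label the strength of a correlation using Cohen's guideline values."""
    strength = abs(r)  # the sign gives direction, not strength
    if strength >= 0.5:
        return "large"
    if strength >= 0.3:
        return "medium"
    if strength >= 0.1:
        return "small"
    return "negligible"  # assumed label; the cards do not name this range

for r in (0.1, -0.3, 0.5):
    print(r, cohen_label(r))
```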

mis-specified

Be cautious to determine which variables to include, because if important variables are overlooked or irrelevant variables are included, the regression model is:

unexplained

For example, if r = 0.922, then r^2 = 0.850, which means that 15% of the total variation in y is __________ by the linear relationship between x and y.

explained

For example, if r = 0.922, then r^2 = 0.850, which means that 85% of the total variation in y is ________ by the linear relationship between x and y.
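
The arithmetic behind the two cards above can be checked in a few lines. The variable names are illustrative only.

```python
r = 0.922

# Coefficient of determination: the square of the correlation coefficient.
r_squared = r ** 2

# Multiply by 100 to read it as a percentage of the variation in y.
explained_pct = round(r_squared * 100, 1)
unexplained_pct = round(100 - explained_pct, 1)

print(f"r^2 = {r_squared:.3f}")
print(f"explained: {explained_pct}%, unexplained: {unexplained_pct}%")
```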

92

Suppose r = 0.96 and R^2 = 0.92, as in the previous example. Which number belongs in the statement below? __% of the variability in weight loss (y) is accounted for by exercise time (x)

McNemar's test (Rationale: every patient completed both procedures, so the data are paired and categorical.)

Imperiale et al. conducted a research study about different screening strategies for colorectal cancer. They specifically compared Fecal DNA Panel and Hemoccult II in their detection rates. They had all 4,404 eligible patients complete both procedures. What statistical test would have been used in this comparison? A. Sign test B. Binomial test C. Fisher's exact test D. Chi-square test E. McNemar's test

A. Tobacco smoking is a significant risk factor showing 2.49 times higher odds of early onset Rationale: a. Correct. Tobacco smoking is a significant risk factor showing 2.5 times higher odds of early onset. (b. Incorrect. Alcohol drinking is not a significant risk factor with the 95% confidence interval of odds ratio that overlaps the null value 1. c. Incorrect. Alcohol drinking is not a significant risk factor with the 95% confidence interval of odds ratio that overlaps the null value 1. d. Incorrect. Obesity is not a significant risk factor with the 95% confidence interval of odds ratio that overlaps the null value 1 e. Incorrect. Gender is not a significant risk factor with the 95% confidence interval of odds ratio that overlaps the null value 1.)

In an article titled "Tobacco Smoking: A Factor of Early Onset of Colorectal Cancer", the authors were interested in the role of smoking in early onset of colorectal pathology, and compared the age of colorectal cancer diagnosis between the groups of smokers and non-smokers with confounding factors including alcohol drinking, obesity, and gender. Specifically, they performed a multiple logistic regression analysis to predict an early onset (before age 60 yr) of colorectal cancer from the four risk factors, and the results were presented in the table. Which of the following interprets correctly the odds ratios and their 95% confidence intervals in the last column of the attached table? A. Tobacco smoking is a significant risk factor showing 2.49 times higher odds of early onset B. Alcohol drinking is a significant risk factor showing protective effect delaying the onset C. Alcohol drinking is a significant risk factor showing 0.77 times higher odds of early onset D. Obesity is a significant factor showing protective effect delaying the onset E. Being a male showed a significantly higher risk of early onset than being a female

Multiple logistic regression (Rationale: multiple x's and a categorical y)

In an article titled Tobacco Smoking: A Factor of Early Onset of Colorectal Cancer, the authors were interested in the role of smoking in early onset of colorectal pathology, and compared the age of colorectal cancer diagnosis between the groups of smokers and non-smokers with confounding factors including alcohol drinking, obesity, and gender. Specifically, they performed a statistical method to predict an early onset (before age 60 yr) of colorectal cancer from four risk factors, and the results were presented in the table below. Which statistical method would have been performed to get the results presented in the table? A. Chi-square test B. Student's t-test C. Mann-Whitney U test D. Pearson product moment correlation coefficient E. Multiple logistic regression

tobacco smoking

In the table, a multiple logistic regression was performed to predict an early onset (before age 60 yr) of CRC from four risk factors. Based on the odds ratios and their 95% confidence intervals in the last column of the table, which risk factor is significant to predict an early onset of CRC? (hint: p-value < 0.05, and the confidence interval of the odds ratio can't contain 1)

odds ratio

Ratio of the odds of an event occurring in one group vs. the odds of it occurring in another group
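
As a rough sketch of this definition, the odds ratio can be computed from a 2x2 table of counts. The counts below are hypothetical, and the confidence interval uses the common log-based (Woolf) approximation, which is an assumption here and not a method named in the cards.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table.

    a, b: events and non-events in group 1
    c, d: events and non-events in group 2
    """
    odds_ratio = (a / b) / (c / d)
    # Standard error of log(OR) under the log-based (Woolf) approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: 50 of 100 smokers and 25 of 100 non-smokers with early onset.
odds_ratio, lower, upper = odds_ratio_with_ci(50, 50, 25, 75)
print(f"OR = {odds_ratio:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```

An interval that excludes 1 would be read as a significant odds ratio.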

logistic (rationale: categorical y variable = early or not early onset)

Should linear or logistic regression be performed to predict an early onset (before age 60 yr) of CRC from four risk factors: tobacco, alcohol, obesity, and gender?

multiple (rationale: four x's)

Should simple or multiple regression be performed to predict an early onset (before age 60 yr) of CRC from four risk factors: tobacco, alcohol, obesity, and gender?

practical

Statistical significance of regression does not always imply _________ significance.

predicted variance

The R^2 value is used as an index of:

sign

The ____ of r indicates the direction of correlation.

absolute value

The ________ _____ of r indicates the strength of correlation.

100

The coefficient of determination (R^2) is a decimal between 0 and 1 that must be multiplied by ___ to express it as a percentage. (hint: R^2 = *%*)

negative and positive

The correlation coefficient is a decimal number somewhere between ________ and ________ 1.

r=0

The null hypothesis for assessing correlation coefficient significance is: (hint: no correlation when r=_)

average

The null hypothesis for assessing regression models is that the regression model is no better at predicting variance in the dependent variable than the ______ value of the dependent variable. (hint: x does not predict y)

confidence interval

The range of values within which a population parameter is estimated to lie. (hint: 1-α)

interchangeable

The variables in correlation have no specific roles (ie: dependent or independent), therefore are _______________. (hint: called A and B, not X and Y)

cor*relation* coefficient (aka linear or Pearson correlation coeff)

To *quantify* the *relation*ship between 2 variables, calculate a ___________ ___________.

determination

To quantify the relationship of two variables in a cause-and-effect situation, calculate the coefficient of _____________.

scatter

To visualize the relationship between 2 variables, draw a _______ plot.

logistic regression

Type of regression for a *categorical dependent* variable: (ie: y= hair color, race, ethnicity)

linear regression

Type of regression for a *continuous dependent* variable: (ie: y=any number from neg to pos infin, including 0)

simple regression (aka bivariate)

Type of regression for one independent variable (one x):

adjusted regression

Type of regression for relevant covariates is called covariate _______ ______.

multiple regression (aka multivariable)

Type of regression for two or more independent variables (multiple x's):

p-value

What value from a statistical hypothesis test is compared with α to decide whether to reject the null hypothesis, which determines SIGNIFICANCE?

regression

When we look at the relationship of two variables in a cause-and-effect situation, draw a scatter plot with a __________ line.

E. Odds ratio = 0.99 and its 95% confidence interval = 0.41 to 2.36

Which of the following shows "not significant odds ratio" based on its 95% confidence interval? A. Odds ratio = 1.66 and its 95% confidence interval = 1.25 to 2.20 B. Odds ratio = 3.58 and its 95% confidence interval = 2.49 to 5.14 C. Odds ratio = 0.60 and its 95% confidence interval = 0.45 to 0.80 D. Odds ratio = 4.13 and its 95% confidence interval = 1.70 to 6.43 E. Odds ratio = 0.99 and its 95% confidence interval = 0.41 to 2.36
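
The rule used in this question (an odds ratio is not significant when its 95% confidence interval contains the null value 1) can be sketched as a simple check over the five options listed above. The function name is hypothetical.

```python
def ci_contains_null(lower, upper, null_value=1.0):
    """True when the confidence interval includes the null value (not significant)."""
    return lower <= null_value <= upper

# (odds ratio, CI lower, CI upper) for options A to E in the question above.
options = {
    "A": (1.66, 1.25, 2.20),
    "B": (3.58, 2.49, 5.14),
    "C": (0.60, 0.45, 0.80),
    "D": (4.13, 1.70, 6.43),
    "E": (0.99, 0.41, 2.36),
}

not_significant = [
    label for label, (_, lower, upper) in options.items()
    if ci_contains_null(lower, upper)
]
print(not_significant)
```

Option C shows that an interval lying entirely below 1 is also significant, in the protective direction.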

A. Odds ratio = 1.7 and its 95% confidence interval = 1.2 to 2.2 Rationale: a. Correct, since the confidence interval does not contain 1 which is the null value for odds ratio (b. Incorrect, since the confidence interval contains 1 which is the null value for odds ratio c. Incorrect, since the confidence interval contains 1 which is the null value for odds ratio d. Incorrect, since the confidence interval contains 1 which is the null value for odds ratio e. Incorrect, since the confidence interval contains 1 which is the null value for odds ratio)

Which of the following shows a significant odds ratio based on its 95% confidence interval? A. Odds ratio = 1.7 and its 95% confidence interval = 1.2 to 2.2 B. Odds ratio = 2.8 and its 95% confidence interval = 0.5 to 5.1 C. Odds ratio = 0.9 and its 95% confidence interval = 0.2 to 1.5 D. Odds ratio = 3.6 and its 95% confidence interval = 0.7 to 6.4 E. Odds ratio = 1.4 and its 95% confidence interval = 0.4 to 2.4

odds ratio, confidence interval

With logistic regression, consult the ____ ______ with the associated ________ ________ for significance.

Outliers

________ can have a large influence on the correlation.

alpha (α)

another term for SIGNIFICANCE LEVEL

discrete variable

data that has specific values and cannot have values between these specific values (ie: can't have half a person!)

beta coefficient

regression coefficient that can be standardized; Regression coefficient that is the degree of change (slope) in the outcome variable (y) for every 1-unit of change in the predictor variable (x) (hint: y=_x+c)
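
To make the slope interpretation concrete, here is a minimal least-squares sketch for simple linear regression (y = slope * x + intercept). The data are hypothetical and the coefficient shown is unstandardized.

```python
def fit_simple_linear(xs, ys):
    """Least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx                    # change in y per 1-unit change in x
    intercept = mean_y - slope * mean_x  # predicted y when x is 0
    return slope, intercept

# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_simple_linear(xs, ys)
print(slope, intercept)
```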

coefficient of determination

the proportion (%) of the variability in the dependent variable (y) that has been accounted for by the independent variable (x): (pic: green=unexplained variance, pink= total variance)

y (Rationale: continuous y = linear; categorical y = logistic)

When deciding between linear and logistic regression, look to what variable?

