Modules 3 and 4 - Hypothesis Testing and Correlation/Predictive Techniques

What are the common features t and z-tests share?

- their hypotheses refer to a population parameter, i.e. the population mean
- their hypotheses concern numerical data
- they assume that the population data are normally distributed

What is the purpose of clinical trials?

- to evaluate the effect of an intervention (usually a treatment, but could be a diagnostic procedure that is then used to guide treatment)
- randomized controlled double-blind trials are considered the most robust

What is the chi-square test?

- used for testing hypotheses about nominal scale (categorical) data
- tells us whether the proportions of observations falling in different categories differ significantly from those that would be expected by chance
- in the contingency table, the categories of one variable define the rows while the categories of the other variable define the columns
- tests whether there is an association between the row variable and the column variable

What are response/outcome/dependent variables?

- variables of interest; they 'depend' on the values of the independent variables
- usually on the y-axis of graphs

What is the outcome of logistic regression?

A binary outcome. Results are reported as an Odds Ratio (OR)

What is a variable?

A characteristic or attribute of a population that could differ from individual to individual

Researchers are testing the safety profile of a new chemotherapeutic drug and set the level of significance (alpha) at α ≤ 0.001. The chance of incorrectly rejecting H0 is called which of the following? A. Type I error B. Type II error C. Alternative hypothesis D. Zone of acceptance

A. Type I error

What is a statistical population?

A complete set of individuals, objects or measurements that share at least one common observable characteristic that is the subject of statistical analysis. For example: first-year medical students share class, university etc.; residents of LA share geographic location, environmental conditions etc.

What is a statistical sample?

A subset or subgroup drawn from the population to represent the population in a statistical analysis. For example: medical students with last names starting with the letters A, M, Z.

Researchers are testing the safety profile of a new chemotherapeutic drug. If they set the level of significance (alpha) at α ≤ 0.001, then the chance of incorrectly rejecting H0 is which of the following? A. 1 chance out of 1000 or fewer samples drawn B. 1 percent C. 0.001 percent D. 95% E. 5%

A. 1 chance out of 1000 or fewer samples drawn

Researchers have designed a study investigating the effect of a type of treatment on chronic back pain. Their preliminary proposal shows a power of 0.72 to detect a significant difference between the treatment groups. They need a power of 0.8 or greater in order for the IRB to approve the study. Which of the following is the BEST method to increase the power of their study: A. Increase the sample size B. Decrease the effect size C. Decrease precision of their measurement D. Decrease their alpha level of acceptance

A. Increase the sample size

Researchers wish to define the association of migraine headaches in patients with Systemic Lupus Erythematosus (SLE) and volume of gray matter (VGM) in cm³ estimated from MRI images. Which of the following is the BEST representation of the Null Hypothesis (H0) in this study? A. There is no difference in VGM between SLE patients with and without migraine headaches B. SLE patients with migraine headache have a larger VGM than patients without migraine headaches C. SLE patients with migraine headache have a smaller VGM than patients without migraine headaches D. There is a difference in VGM between SLE patients with and without migraine headaches

A. There is no difference in VGM between SLE patients with and without migraine headaches

Which statistical analysis would you use to compare the effects of 3 different blood pressure medications administered to 3 different groups of patients?

ANOVA

Dr. Junkins wishes to predict student course grades based on student performance on the first Chapter quiz. Given the information in the table below, which of the following is the BEST statistical test to use? A. Pearson Correlation B. Spearman correlation C. Logistic regression D. Multiple linear regression E. Simple linear regression

E. Simple linear regression

T/F: A correlation between two variables demonstrates a causal relationship between the two variables, no matter how strong it is.

F: A correlation between two variables *does not* demonstrate a causal relationship between the two variables, no matter how strong it is.

T/F: Statistical significance implies clinical significance.

F: statistical significance does not equate to clinical significance

You would like to determine if a history of living in a certain climate is a risk factor for developing amyotrophic lateral sclerosis (ALS), a neurologic disorder diagnosed in two per 100,000 people per year. The best study design would be to collect a cohort of 1000 individuals living in a tropical rainforest region and another in a subarctic region and follow the groups prospectively for new cases of ALS. True or False?

False. Case-control studies are usually more practical for studying rare diseases. With the study design proposed, you might have zero cases of ALS over several years of follow-up

In which situations would you decrease the level of significance?

If the cost of making a Type I error is high. For example, you are conducting a trial that tests the efficacy of a new, expensive and highly toxic cancer treatment

What is the two-tailed test?

If you are using a significance level of 0.05, a two-tailed test allots half of your alpha to testing statistical significance in one direction and half to testing statistical significance in the other direction. This means that 0.025 is in each tail of the distribution of your test statistic. It is used if the statistical question is non-directional.
*For example*:
H0: VGM (cm³) in SLE patients does not differ between those with and those without headaches.
HA: VGM (cm³) in SLE patients differs between those with and those without headaches.
A result is statistically significant if p-value < alpha.
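The split of alpha between the tails can be made concrete with a z statistic. The following is a minimal sketch using only the Python standard library; the function names and the observed z value of 1.8 are illustrative assumptions, not part of the course material.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1

def critical_z(alpha, two_tailed=True):
    """Cut-off z value that leaves alpha (in total) in the rejection region."""
    tail_area = alpha / 2 if two_tailed else alpha
    return STD_NORMAL.inv_cdf(1 - tail_area)

def p_value(z, two_tailed=True):
    """p-value for an observed z statistic.

    Two-tailed: area beyond |z| in both tails.
    One-tailed: area above z only (upper-tail alternative).
    """
    if two_tailed:
        return 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return 1 - STD_NORMAL.cdf(z)

z_observed = 1.8  # hypothetical test statistic
p_two = p_value(z_observed, two_tailed=True)
p_one = p_value(z_observed, two_tailed=False)
```

With this made-up z value, the one-tailed p-value falls below 0.05 while the two-tailed p-value does not, which is why the choice of tails has to be made before looking at the data.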

Human studies must be reviewed by _____ before collecting data or making changes.

Institutional Review Board (IRB); each hospital or academic institution often has its own IRB

Which statistical analysis would you use to predict volume from body weight in healthy adult males?

Linear regression

Which statistical analysis would you use to determine whether patients' cholesterol levels improve after a statin regimen?

Paired t-test (each person serves as their own control)
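Because each person serves as their own control, the paired t-test works on the per-patient differences. A minimal sketch, assuming made-up cholesterol values and a hypothetical helper name:

```python
import math
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for paired measurements."""
    # One difference per subject: after minus before.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # Standard error of the mean difference (sample standard deviation / sqrt(n)).
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical total cholesterol (mg/dL) for five patients.
before_statin = [240, 225, 260, 250, 235]
after_statin = [210, 205, 230, 240, 215]
t_stat, df = paired_t_statistic(before_statin, after_statin)
```

The resulting t is then compared with the critical t value for the same degrees of freedom, taken from a t table or a statistics package.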

Which statistical analysis would you use to determine whether there is an association between undergraduate GPA and preclinical GPA?

Pearson correlation

You are hoping to do a study on hypertension in prisoners. In order to encourage participation, you arrange with the prison to allow extra family visitation time to prisoners who participate in the study. Your study may violate which of the following ethical principles? Justice / Respect for persons / Beneficence

Respect for persons. Coercion, by providing excessive, unwarranted or inappropriate reward to obtain compliance, violates autonomy, which is a component of respect for persons

T/F: Correlation is merely a measure of the variables' statistical association, not of their causal relationship

T/F: Inferring a causal relationship between two variables on the basis of a correlation is a common and fundamental error

What is the alternative hypothesis?

There is a difference between the two means

What does an OR = 1 mean?

There is no association between the two variables: the odds of the outcome are the same in both groups
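As an illustration, the odds ratio can be computed from a 2x2 table of counts. This is a sketch with hypothetical counts and a hypothetical function name:

```python
def odds_ratio(exposed_cases, exposed_noncases,
               unexposed_cases, unexposed_noncases):
    """Odds of the outcome in the exposed group divided by the odds in the unexposed group."""
    odds_exposed = exposed_cases / exposed_noncases
    odds_unexposed = unexposed_cases / unexposed_noncases
    return odds_exposed / odds_unexposed

# Same odds in both groups: OR equals 1, i.e. no association.
or_no_association = odds_ratio(20, 80, 20, 80)

# Higher odds in the exposed group: OR is greater than 1.
or_positive_association = odds_ratio(30, 70, 10, 90)
```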

What is the null hypothesis?

There is no difference between the two means

What is the purpose for control groups, randomization and blinding techniques in clinical trials?

To control for confounding variables

Which statistical analysis would you use to determine whether the mean COMLEX Level 1 scores are lower now than 5 years ago?

Unpaired t-test

What is regression?

Used to express the functional relationship between two variables so that the value of one variable can be predicted from knowledge of the other

VGM (cm³) - numerical, continuous, ratio
Migraines - categorical, nominal

VGM - outcome
Migraine - predictor

When should a paired t-test be used?

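When testing for a difference between two means measured on paired (dependent) observations, so that each subject serves as their own control. For example: before and after treatment in the same patients. Comparisons of independent groups (e.g. effects of two different drugs on patient outcomes, or mean survival times of patients receiving two different treatments) call for an unpaired t-test.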

Explain censoring in survival analysis.

Patients who are lost to follow-up are included for the duration of their participation and are then censored: they are eliminated from the proportion calculation from the next time period onward

What is the difference between Pearson vs Spearman correlation coefficients?

Pearson - for interval or ratio scale data, e.g. salt intake and blood pressure
Spearman - for ordinal scale (ranked) data, e.g. birth order or class position at school
*Pearson evaluates the strength of a straight-line relationship between two variables; Spearman applies the same calculation to the ranks of the data, so it evaluates the strength of a monotonic relationship*
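The relationship between the two coefficients can be sketched in code: Spearman's coefficient is Pearson's coefficient computed on ranks. The helper names and the cubic example data below are assumptions chosen for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# A strictly increasing but curved relationship (y = x cubed).
x_values = [1, 2, 3, 4, 5, 6]
y_values = [x ** 3 for x in x_values]
r_pearson = pearson_r(x_values, y_values)
rho_spearman = spearman_rho(x_values, y_values)
```

For this curved but strictly increasing data the rank-based coefficient reaches 1 while Pearson's r stays below 1, because the points do not lie on a straight line.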

What is a p-value?

The probability, if H0 were true, of observing the data that we did or more extreme data

What are the axes of survival analysis?

X-axis: time from start of study
Y-axis: proportion/percentage of event of interest (survival, without disease, remission, marriage etc.)

What is the range within which we would expect the majority of sample means to fall called?

the area of acceptance

What is the critical value?

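the value of the test statistic that marks the boundary of the area of rejection; it is set by the level of significance (alpha), and H0 is rejected when the calculated test statistic falls beyond it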

What are some sampling techniques?

*simple random sample* - each individual has the same probability of being chosen at any stage during the process
*systematic sample* - pick according to a pattern (e.g. every 10th person)
*stratified sample* - randomize within categories (men/women) to ensure that the sample is representative of the population for those categories
*cluster sample* - randomly select groups (cities, hospitals) and then randomly select individuals within those groups

What is a Type II error?

- *false negative error:* a negative conclusion has been drawn about a hypothesis that is actually true (known as a *beta* error)
- the null hypothesis is accepted (not rejected) when it is in fact false

What is a Type I error?

- *false positive error:* a positive conclusion has been reached about a hypothesis that is actually false (an *alpha* error)
- the probability of making a Type I error when the null hypothesis is true is equal to the level of significance, alpha
- the null hypothesis is rejected when it is true

What is a one-tailed test?

- A test of a statistical hypothesis where the region of rejection is on only one side of the sampling distribution
- The alternative hypothesis is said to be directional
- The one-tailed test provides more *power to detect an effect* in the hypothesized direction, so it is tempting to use one whenever you have a hypothesis about the direction of an effect; before doing so, consider the consequences of missing an effect in the other direction
*For example*:
H0: VGM (cm³) in SLE patients with headaches is smaller than or equal to that in those without headaches.
HA: VGM (cm³) in SLE patients with headaches is larger than in those without headaches.

Which test should be used to compare more than two groups?

- ANOVA: used when researchers wish to compare sample means from more than two groups. ANOVA takes into account the variability within the groups and between the groups.
*F = between-groups MS / within-groups MS (MS = mean square)*
*One-way* - when subgroups are defined by just one factor; uses the F-test
*Two-way* - when subdivision is based upon more than one factor
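The F ratio for a one-way ANOVA can be worked through by hand. The sketch below assumes three made-up groups of blood pressure reductions; the function name and data are illustrative only.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups sum of squares: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between   # mean square between groups
    ms_within = ss_within / df_within      # mean square within groups
    return ms_between / ms_within, df_between, df_within

# Hypothetical reductions in systolic pressure (mmHg) for three medications.
drug_groups = [[10, 12, 14], [11, 13, 15], [16, 18, 20]]
f_stat, df_b, df_w = one_way_anova_f(drug_groups)
```

A large F means the group means differ by more than the within-group variability would suggest; significance is judged against the F distribution with the two degrees of freedom returned here.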

What is linear regression?

- If two variables are highly correlated, you can predict the value of the dependent variable from the independent variable using regression techniques
- The regression equation quantifies the straight-line relationship between two variables: in simple linear regression, X is used to predict the value of Y by means of a simple linear mathematical function
- the regression line is the line of best fit
- simple linear regression is tied to the t-test for significance
- variables must be continuous
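The line of best fit can be sketched with ordinary least squares. The data points and function names below are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return slope, intercept

def predict(x, slope, intercept):
    """Predicted y for a given x using the fitted regression equation."""
    return intercept + slope * x

# Hypothetical paired observations of a predictor (x) and an outcome (y).
x_obs = [1, 2, 3, 4, 5]
y_obs = [2.2, 4.1, 6.0, 8.1, 9.9]
slope, intercept = fit_line(x_obs, y_obs)
y_at_6 = predict(6, slope, intercept)
```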

What is correlation?

- Measures the closeness of the association between two continuous variables using the correlation coefficient *r* (Pearson's correlation coefficient)
- the values of *r* range between +1 and -1
- perfect correlation = 1 (positive correlation) or -1 (negative correlation)
- no correlation = 0
- It is used to establish and quantify the strength and direction of the relationship between two variables

What is a research hypothesis?

- a testable question or educated prediction that attempts to answer a question about a clinical, pharmacological, biological or social process
- often leads to statistical hypothesis development

What is data safety monitoring?

- to check safety
- to see if the treatment is so much better that the trial can stop early
- to see if the treatment is so ineffective that it is futile to continue

What is hypothesis testing?

- conducting a test of statistical significance that quantifies the chance that random sampling variation accounts for the observed results
- you are asking whether the sample mean is consistent with a certain hypothesized value for the population mean
- a method of assessing whether a result is likely to be due to chance or due to a real effect

What is the power of a test?

- equal to 1 - beta; indicates the probability of not making a Type II error
- the likelihood of rejecting H0 when H0 is in fact false, i.e. of detecting an effect that really exists
- *high yield*: a study is required to have a power of at least 0.8 (beta of 0.2 or less) to be acceptable (a study that has less than an 80% chance of detecting a false null hypothesis is judged to be unacceptable)

What is the coefficient of determination?

- expresses the proportion of the variance in one variable that is accounted for or explained by the variance in the other variable
- found by squaring the value of r (symbol is *r^2*)
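A short sketch of the calculation, using made-up data: r is computed first and then squared to give the coefficient of determination. The function name is hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements.
x_data = [1, 2, 3, 4, 5]
y_data = [2, 4, 5, 4, 5]

r = pearson_r(x_data, y_data)
r_squared = r ** 2  # proportion of variance in y explained by x
```

Here r is about 0.77, so roughly 60% of the variance in one variable is accounted for by the other.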

How can you increase power?

- increased sample size (n)
- increased expected effect size
- increased precision of the measurement
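The effect of these factors on power can be sketched for a simple case. The example below assumes a two-tailed one-sample z-test with known standard deviation, which is a simplification chosen for illustration; the function name and all numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Approximate power of a two-tailed one-sample z-test.

    effect: true difference between the population mean and the H0 value
    sigma:  population standard deviation (smaller = more precise measurement)
    n:      sample size
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # How many standard errors the true mean lies from the H0 value.
    shift = effect * math.sqrt(n) / sigma
    # Probability that the test statistic lands in either rejection region.
    upper = 1 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower

power_n25 = z_test_power(effect=0.5, sigma=1.0, n=25)
power_n32 = z_test_power(effect=0.5, sigma=1.0, n=32)               # larger sample
power_big_effect = z_test_power(effect=0.7, sigma=1.0, n=25)        # larger effect
power_precise = z_test_power(effect=0.5, sigma=0.8, n=25)           # less measurement noise
power_strict = z_test_power(effect=0.5, sigma=1.0, n=25, alpha=0.01)  # smaller alpha
```

Under these assumptions, raising n from 25 to 32 moves power from about 0.71 to just above 0.8, while lowering alpha reduces it.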

What is the p-value dependent on?

- number of observations (sample size)
- magnitude of difference between samples, strength of association etc. (effect size)
- level of variation among individuals

What is the F-test?

- the F-test in one-way ANOVA is used to assess whether the expected values (means) of a quantitative variable within several pre-defined groups differ from each other, provided that all populations are normally distributed and all populations have the same standard deviation

What are the ethical principles for research ethics?

- respect for persons (autonomy/informed consent/protection of vulnerable people)
- beneficence (benefits > risks/harms)
- justice (fairness in selecting research subjects)
Clinical equipoise - there should be genuine uncertainty in the expert medical community over whether a treatment is beneficial

What is a statistical hypothesis?

- speaks to the generalizability of a result or observation
- based on a research hypothesis
- consists of a null and an alternative hypothesis

What are non-parametric tests?

- tests that do not assume the population is normally distributed (called distribution-free tests)
- they are used to test categorical (nominal or ordinal) data
- the most important nonparametric test is the chi-square test

What is multiple regression?

- the dependency of a dependent variable on several independent variables
- the test of significance used is the F-test, which can assess multiple coefficients simultaneously, unlike t-tests

What is the median survival time in survival analysis?

- the time at which half the patients are expected to be alive
- the smallest survival time for which the survivor function is less than or equal to 0.5
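The definition above can be sketched with a small Kaplan-Meier style calculation. The source does not name a specific estimator, so the product-limit approach here is an assumption, as are the function names and the follow-up data.

```python
def kaplan_meier(times, died):
    """Product-limit survival estimates.

    times: follow-up time for each patient
    died:  True if the patient died at that time, False if censored
    Returns a list of (time, survival proportion) at each time with a death.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for tt, d in zip(times, died) if tt == t and d)
        leaving = sum(1 for tt in times if tt == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Censored patients count as at risk up to t, then drop out.
        at_risk -= leaving
    return curve

def median_survival(curve):
    """Smallest time at which the survival estimate is <= 0.5 (None if never reached)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Hypothetical follow-up in months for eight patients.
followup_months = [2, 3, 3, 5, 6, 8, 9, 10]
patient_died = [True, True, False, True, True, False, True, False]
km_curve = kaplan_meier(followup_months, patient_died)
median_months = median_survival(km_curve)
```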

What are predictor/explanatory/independent variables?

- variables that affect the outcome of interest; the 'cause' of change
- usually on the x-axis of graphs

What are confounding variables?

- variables that correlate with both the independent and dependent variable creating the false impression of a direct correlation between the two

Interpret "significance at p < 0.05."

- if the null hypothesis were true, results at least as extreme as those observed would occur by chance less than 5% of the time
- the null hypothesis is rejected at the 5% level of significance
- although the null hypothesis is being rejected, a true null hypothesis would be wrongly rejected in this way up to 5% of the time, so the data may still have come from the population specified by the null hypothesis

In medicine, we usually consider that differences are significant if the probability is less than...

0.05; this means that if the null hypothesis is true, we would make an incorrect decision less than 5 times out of 100 chances by accepting the alternative hypothesis

What are the steps involved in hypothesis testing?

1. State the null and alternative hypotheses
2. Select the decision criterion or level of significance
3. Establish critical values
4. Draw a random sample from the population and calculate the mean of that sample
5. Calculate the standard deviation and estimated standard error of the sample
6. Calculate the value of the test statistic, t, that corresponds to the mean of the sample
7. Compare the calculated value of
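The steps listed above can be sketched for a one-sample t-test. The sample values, the hypothesized mean and the function name are made up for illustration. The last step is cut off in the source, so the final comparison against the critical value is an assumption based on the usual procedure. The critical value 2.262 is the standard table entry for a two-tailed test at alpha = 0.05 with 9 degrees of freedom.

```python
import math
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return (t, degrees of freedom) for H0: population mean equals mu0."""
    n = len(sample)
    standard_error = stdev(sample) / math.sqrt(n)  # estimated standard error
    return (mean(sample) - mu0) / standard_error, n - 1

# Steps 1-2: H0: mean = 5.0, HA: mean != 5.0, alpha = 0.05 (two-tailed).
mu0 = 5.0
# Step 3: critical t from a t table for df = 9, two-tailed alpha = 0.05.
t_critical = 2.262
# Step 4: hypothetical random sample of 10 measurements.
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.7, 5.9, 5.2, 5.4]
# Steps 5-6: standard error and test statistic.
t_stat, df = one_sample_t(sample, mu0)
# Assumed final step: reject H0 if |t| exceeds the critical value.
reject_h0 = abs(t_stat) > t_critical
```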

Researchers wish to define the association of migraine headaches in patients with Systemic Lupus Erythematosus (SLE) and volume of gray matter (VGM) in cm³ estimated from MRI images. Which of the following is the BEST representation of the appropriate test of significance? A. A two-tailed test using categorical variables B. A two-tailed test using numerical variables C. A one-tailed test using categorical variables D. A one-tailed test using numerical variables

B. A two-tailed test using numerical variables
Reason: There is a single category - headaches. The basis of the study is the volume of gray matter, which is a numerical value, and its link to migraine headaches. Since you just want to know the *association* between the two variables, and are not inferring any direction, you should use a two-tailed test, which would tell you if there is a negative OR positive association between the two variables.

Researchers wish to predict the success of medical students, defined as passing vs. not passing the boards, based on a variety of variables, including: GPA, class rank, cumulative exam scores and MCAT scores. Which of the following represents the dependent variable in their predictive model? A. Grade point average B. Passing vs. Not passing the boards C. Cumulative exam score D. MCAT score E. Class rank

B. Passing vs. Not passing the boards

The previous results showing the linear relationship of patient height and weight is an example of which of the following types of analyses? A. Spearman correlation B. Pearson correlation C. Multiple regression D. Logistic regression E. Chi-square

B. Pearson correlation

Researchers wish to define the association of migraine headaches in patients with Systemic Lupus Erythematosus (SLE) and volume of gray matter (VGM) in cm³ estimated from MRI images. Which of the following is the BEST representation of the Alternative Hypotheses (HA) in this study? A. There is no difference in VGM between SLE patients with and without migraine headaches B. SLE patients with migraine headache have a larger VGM than patients without migraine headaches C. SLE patients with migraine headache have a smaller VGM than patients without migraine headaches D. There is a difference in VGM between SLE patients with and without migraine headaches

B. SLE patients with migraine headache have a larger VGM than patients without migraine headaches
C. SLE patients with migraine headache have a smaller VGM than patients without migraine headaches
D. There is a difference in VGM between SLE patients with and without migraine headaches

Researchers wish to compare the performance of two antibiotics at lowering the erythrocyte sedimentation rate (ESR; measured in mm/h) of patients being treated for osteomyelitis. They include 12 patients in one group and 11 patients in the other. Which of the following is the most appropriate test of significance for this study: A. Paired, Student's t-test B. F-test for ANOVA C. Unpaired, Student's t-test D. Chi-squared test E. Pearson's correlation coefficient

C. Unpaired, Student's t-test

Administrators at WesternU conducted a convenience sample survey regarding the safety of the Pomona-based campus. The results of the survey are listed in the table below. Which of the following is the most appropriate test of significance for this study? A. Paired, Student's t-test B. Unpaired, Student's t-test C. F-test for ANOVA D. Chi-squared test E. Pearson's correlation coefficient

D. Chi-squared test

How do you calculate the expected value and degree of freedom from a chi squared table?

Degrees of freedom = (number of rows - 1) x (number of columns - 1)
E = (row total x column total) / grand total
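Both formulas can be applied to a small contingency table. The sketch below uses a hypothetical 2x2 table of observed counts and a hypothetical function name; it also sums (observed - expected)^2 / expected to obtain the chi-square statistic.

```python
def chi_square(table):
    """Return (expected counts, chi-square statistic, degrees of freedom).

    table: list of rows, each a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # E = (row total x column total) / grand total, for every cell.
    expected = [[rt * ct / grand_total for ct in col_totals]
                for rt in row_totals]

    # Sum of (observed - expected)^2 / expected over all cells.
    statistic = sum((o - e) ** 2 / e
                    for obs_row, exp_row in zip(table, expected)
                    for o, e in zip(obs_row, exp_row))

    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return expected, statistic, degrees_of_freedom

# Hypothetical survey: rows = two respondent groups, columns = yes / no answers.
observed = [[30, 20],
            [20, 30]]
expected, chi2, df = chi_square(observed)
```

The statistic is then compared with the critical chi-square value for the computed degrees of freedom (3.841 for 1 degree of freedom at alpha = 0.05).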
