Biostats Final
Positive Control
Changes are expected. Standard agent against which to measure difference in severity amongst groups. Used to determine that a response can be detected.
Cross sectional study
Conducted at the present time (a single point in time). Involves descriptive statistics but cannot determine cause-and-effect relationships.
Nominal
Categorical: does not have a clear ordering/hierarchy
type 1 Error
a false positive, found a difference when there really isn't one. Alpha
Continuous
Can take any value within a range rather than fixed, countable values; forms a continuum
Vehicle Control
A substance is used as a vehicle for a solution of the experimental compound. Determines if vehicle alone causes any effects.
Increases, increasing
As power _____________, you have a higher chance of detecting the significant difference as well as reducing the likelihood of making a type 2 error, while _________ the likelihood of making a type 1 error.
Standard Deviation
Dispersion measurement, average distance from the mean; degree of dispersion. As it gets larger it is harder to detect statistical differences.
Williams' Correction
Divide the Chi^2 value or G-value by q = 1 + (a^2 - 1)/(6nv), where a = # of categories, n = sample size, v = DOF. Use with data with > 1 DOF.
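The correction factor on this card can be sketched in a few lines of Python. This is only an illustration of the formula as written on the card; the function names `williams_q` and `williams_corrected` are made up for this example.

```python
def williams_q(a: int, n: int, v: int) -> float:
    """Correction factor q = 1 + (a^2 - 1) / (6 * n * v).

    a = number of categories, n = sample size, v = degrees of freedom.
    """
    return 1 + (a**2 - 1) / (6 * n * v)


def williams_corrected(stat: float, a: int, n: int, v: int) -> float:
    """Divide a Chi^2 or G statistic by q to get the corrected value."""
    return stat / williams_q(a, n, v)


# Hypothetical example: 3 categories, 100 observations, 2 DOF.
print(williams_q(3, 100, 2))
print(williams_corrected(6.0, 3, 100, 2))
```

Because q is always greater than 1, the corrected statistic is always slightly smaller than the uncorrected one, and the difference shrinks as n grows.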
retrospective studies
Focus on a risk factor or exposure factor for some outcome of the past. Used to evaluate association and uses odds ratios.
Prospective studies
From present time to the future. Start with an individual disease free with an exposure factor.
Ratio
Interval data with a true zero
interval
Intervals of equivalent magnitude; does not have a true zero
Significance
Likelihood that the change you observe is due to chance. Alpha level = 0.05 usually. Likelihood of making a type 1 error.
Confounding Variables
May affect the dependent variable. Need to control, match, randomize, etc. Can give a false association or make it more difficult to determine the relationship.
Sham Control
Mimic a procedure or treatment without the actual use of the procedure on the subject. ex: placebo
Effect Size
Minimum deviation from the null hypothesis that you want to detect. It determines whether a statistically significant result is big/important enough for decision making. Unique to particular experiments.
Negative Control
No changes are expected. Ensures that an unknown variable is not adversely affecting the subjects, resulting in a false positive.
Ordinal
Ordered data (ranks with hierarchies)
Power
Probability of finding a true difference. Likelihood of not making a "type 2 error" (1 - Beta). Probability of rejecting the null hypothesis when there really is a statistical difference.
Randomization
Random assignment to treatment groups
experimental Studies
Study and control groups with independent variables and dependent variables
Epidemiological Studies
Study the distribution and determinants of health and disease in populations. Largely observational; rely on natural experiments.
Yates Correction
Subtract 0.5 from each observed value greater than the expected, add 0.5 to each observed value that is less than the expected. Then do a Chi^2 or G-test. Only with one DOF.
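As a rough sketch of the steps on this card, the adjustment can be applied before computing the Chi^2 statistic. The function name `yates_chi_square` and the example counts are invented for illustration.

```python
def yates_chi_square(observed, expected):
    """Chi^2 statistic with the Yates adjustment described on the card.

    Each observed count is moved 0.5 toward its expected count before
    the usual sum of (observed - expected)^2 / expected is taken.
    """
    total = 0.0
    for o, e in zip(observed, expected):
        if o > e:
            o_adj = o - 0.5
        elif o < e:
            o_adj = o + 0.5
        else:
            o_adj = o  # already equal to expected, nothing to adjust
        total += (o_adj - e) ** 2 / e
    return total


# Hypothetical two-category example (1 DOF): 60 vs 40 observed, 50/50 expected.
print(yates_chi_square([60, 40], [50, 50]))
```

The adjusted statistic is smaller than the uncorrected Chi^2 (4.0 for these counts), which makes the test more conservative.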
Cohort Study
Uses relative risk as a measure of the association between exposure to a risk factor and disease
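Relative risk can be illustrated with a small calculation. The sketch below assumes the usual 2x2 layout (exposed/unexposed by disease/no disease); the function name and the counts are hypothetical.

```python
def relative_risk(exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases):
    """Risk of disease in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)
    return risk_exposed / risk_unexposed


# Hypothetical cohort: 20 of 100 exposed and 10 of 100 unexposed develop disease.
print(relative_risk(20, 80, 10, 90))
```

A value above 1 suggests the exposure is associated with higher risk, a value below 1 with lower risk, and a value of 1 with no association.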
Type 2 error
a false negative, found no difference when one really exists. Beta
Comparative Control
a positive control with a known treatment compared directly with a different treatment.
Odds ratio
An effect size statistic that gives information about which treatment approach has the best odds for a patient. Evaluates whether the odds of an event or outcome are the same for two groups.
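A minimal sketch of the calculation, assuming a 2x2 table of counts. The cell names (a, b, c, d) and the example numbers are assumptions made for illustration only.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a = group 1 with the outcome,    b = group 1 without it
    c = group 2 with the outcome,    d = group 2 without it
    """
    odds_group1 = a / b
    odds_group2 = c / d
    return odds_group1 / odds_group2


# Hypothetical counts: 20/80 in group 1, 10/90 in group 2.
print(odds_ratio(20, 80, 10, 90))
```

An odds ratio of 1 means the odds are the same in both groups; the further from 1, the larger the difference in odds.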
Discrete
definite number ex: number of children
Matching
independent variable is nominal, observe as pairs, good for before/after experiments
Negative Skew
left tailed (mode > median > mean)
Positive Skew
right tailed (mean > median > mode)
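The ordering of mean, median, and mode under skew can be checked on a small made-up data set. The values below are invented for illustration; the left-tailed set is simply the mirror image of the right-tailed one.

```python
import statistics

# Hypothetical right-tailed data: most values are small, a few are large.
right_tailed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
# Mirror image gives a left-tailed data set.
left_tailed = [-x for x in right_tailed]


def center_summary(values):
    """Return (mean, median, mode) for a list of numbers."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.mode(values),
    )


print(center_summary(right_tailed))  # mean > median > mode
print(center_summary(left_tailed))   # mode > median > mean
```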
Probability
the likelihood of an occurrence of events due to chance
Alpha Level
the probability of rejecting the null hypothesis when it is actually true (when you should fail to reject it)
Bonferroni Correction
used when performing multiple comparisons; controls the familywise error rate by dividing the alpha level by the number of tests (equivalently, multiplying each P value by the number of tests).
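One common way to apply the correction is to divide the alpha level by the number of tests and compare each P value against that stricter threshold. The sketch below assumes that approach; the function names and P values are hypothetical.

```python
def bonferroni_alpha(alpha, number_of_tests):
    """Per-test significance threshold that keeps the familywise rate at alpha."""
    return alpha / number_of_tests


def bonferroni_significant(p_values, alpha=0.05):
    """Flag which P values remain significant after the correction."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [p < threshold for p in p_values]


# Hypothetical P values from three tests; the threshold is 0.05 / 3.
print(bonferroni_significant([0.001, 0.02, 0.04]))
```

Note that 0.02 and 0.04 would be significant at 0.05 on their own, but not after correcting for three tests.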
Parameter
values about a population
Statistics
values about a sample of a population
Central limit theorem
when you take many large random samples from a population, the means of those samples are approximately normally distributed, even if the population itself is not.
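A small simulation can make the theorem concrete. The sketch below draws repeated samples from a strongly skewed (exponential) distribution and looks at the sample means; the distribution, sample size, and seed are arbitrary choices for illustration.

```python
import random
import statistics


def sample_means(sample_size, repetitions, seed=0):
    """Means of repeated random samples from an exponential distribution (mean 1)."""
    rng = random.Random(seed)
    means = []
    for _ in range(repetitions):
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


means = sample_means(sample_size=50, repetitions=2000)
# The means cluster around the population mean (1.0) with a spread
# close to 1 / sqrt(sample_size), even though the raw data are skewed.
print(statistics.fmean(means))
print(statistics.stdev(means))
```

Plotting a histogram of `means` would show a roughly bell-shaped curve, while a histogram of the raw exponential draws would be heavily right tailed.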
Arithmetic Mean
• Average value of all data points • Most common • Works for values that fit normal distribution • Sensitive to extreme values.
Range
• Difference between largest and smallest observations • Not informative for statistical purposes • Tends to increase with sample size
Sum of Squares
• Forms the basis of variance and standard deviation, but not a measure of dispersion in and of itself • Σ(x_i - μ)^2
Confidence limits
• How accurate the estimate of your mean is likely to be
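As a rough illustration, confidence limits for a mean can be computed as mean ± z × standard error. The sketch below assumes a normal approximation (z critical value); for small samples a t critical value would give wider limits. The function name and data are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def confidence_limits(values, confidence=0.95):
    """Lower and upper confidence limits for the mean (normal approximation)."""
    n = len(values)
    mean = statistics.fmean(values)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    standard_error = statistics.stdev(values) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error


# Hypothetical measurements.
lower, upper = confidence_limits([2, 4, 4, 4, 5, 5, 7, 9])
print(lower, upper)
```

Narrower limits mean the sample mean is likely a more precise estimate of the population mean.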
Median
• Middle value when data is ranked highest to lowest • if "n" is even, then you average the middle values • Useful for highly skewed data and when it is impractical to measure every single value
Mode
• Most common value in a data set • Used for descriptive purposes (anecdotal stuff) • used to distinguish the multimodal status of the distribution • useful for epidemiologists, indicates multiple cause and effect relationships
Sample variance (s^2)
• SS/(n-1) • Better than parametric variance, because very rarely do you have measurements for the entire population • Unbiased
Parametric variance
• SS/n • Increases as observations get spread out
Standard deviation (sigma)
• Square root of the sample variance • Summarizes how close observations are to the mean • A correction factor can be applied, because taking the square root of the sample variance slightly underestimates the population standard deviation.
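The sum of squares, the two variances, and the standard deviation build on one another, which a short sketch can show. The function names and data are made up for illustration, and the mean of the data stands in for μ.

```python
import math


def sum_of_squares(values):
    """SS: sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def sample_variance(values):
    """SS / (n - 1)."""
    return sum_of_squares(values) / (len(values) - 1)


def parametric_variance(values):
    """SS / n."""
    return sum_of_squares(values) / len(values)


def standard_deviation(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))


# Hypothetical data with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sum_of_squares(data))        # 32.0
print(parametric_variance(data))   # 4.0
print(sample_variance(data))       # 32 / 7
print(standard_deviation(data))
```

Dividing by n - 1 instead of n always gives the slightly larger value, and the gap closes as n grows.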
Geometric Mean
• nth root of the product of the y-values • if n=5, (y1*y2*y3*y4*y5)^(1/5) • use with data whose effects are multiplicative • 0 or (-) values will cause the mean to be undefined • will be slightly smaller than the arithmetic mean unless data is highly skewed • best used with outliers
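A minimal sketch of the nth-root-of-products definition, using made-up values. The function name is hypothetical; it rejects zero and negative values, matching the note that those leave the mean undefined.

```python
import math


def geometric_mean(values):
    """nth root of the product of n positive values."""
    if any(y <= 0 for y in values):
        raise ValueError("geometric mean is undefined for 0 or negative values")
    return math.prod(values) ** (1 / len(values))


# Hypothetical multiplicative data: each value is 3x the previous one.
print(geometric_mean([1, 3, 9]))
# The arithmetic mean of the same data is larger.
print(sum([1, 3, 9]) / 3)
```

Python's standard library also provides `statistics.geometric_mean`, which computes the same quantity.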
Harmonic Mean
• reciprocal of the arithmetic mean of the reciprocals • n/Σ(1/y_i) • like parallel resistors, kind of • less sensitive to outliers • use for highly skewed data • 0 values will cause mean to be undefined
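The n/Σ(1/y) formula can be sketched directly. The function name and the example values are hypothetical.

```python
def harmonic_mean(values):
    """n divided by the sum of reciprocals; undefined if any value is 0."""
    if any(y == 0 for y in values):
        raise ValueError("harmonic mean is undefined when a value is 0")
    return len(values) / sum(1 / y for y in values)


# Hypothetical example: the harmonic mean of 40 and 60 is 48,
# lower than the arithmetic mean of 50.
print(harmonic_mean([40, 60]))
```

Python's standard library also provides `statistics.harmonic_mean` for positive data.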
Coefficient of variation
• σ/μ • summarizes variation as a percentage or a