Stats
What is the range of possible correlation values?
between -1 and 1 (inclusive)
a z score equals
(observation - mean) / standard deviation
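A minimal sketch of the z-score formula on this card. The function name `z_score` and the example numbers are made up for illustration.

```python
def z_score(observation, mean, sd):
    """Return how many standard deviations `observation` lies from `mean`."""
    return (observation - mean) / sd

# Hypothetical numbers: a score of 130 on a scale with mean 100 and SD 20.
z = z_score(130, 100, 20)
print(z)  # 1.5
```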
If n =17, approximately how big is the 95% CI of the correlation?
about ±0.5 (the standard error is 1/sqrt(16) = 0.25, and a 95% CI spans roughly ±2 standard errors)
Let's say you had a study with 5 groups, one of which was a control group, and you compared each other group to the control group. If one of the p-values came out to be 0.02, the Bonferroni-corrected p-value would be:
.08
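The arithmetic behind this card, as a rough sketch: with 5 groups and one control there are 4 comparisons against the control, and the Bonferroni correction multiplies the raw p-value by that count (capped at 1). The helper name `bonferroni` is an assumption for illustration.

```python
def bonferroni(p_value, n_comparisons):
    """Multiply the raw p-value by the number of comparisons, capped at 1."""
    return min(1.0, p_value * n_comparisons)

n_groups = 5
n_comparisons = n_groups - 1  # every non-control group vs. the control group
corrected = bonferroni(0.02, n_comparisons)
print(round(corrected, 4))  # 0.08
```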
In 10 flips of a fair coin, what is the probability of getting at least one head?
.999
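One way to check this answer is the complement rule: "at least one head" is everything except "no heads at all". A short sketch, assuming independent flips of a fair coin:

```python
n_flips = 10
p_no_heads = 0.5 ** n_flips          # all ten flips come up tails
p_at_least_one_head = 1 - p_no_heads
print(round(p_at_least_one_head, 3))  # 0.999
```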
If a data value is 1.5 standard deviations from the mean value, its z-score will be
1.5
The probability of getting three heads in three coin flips is
1/8
What is the standard error of the correlation?
1/sqrt(n-1)
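A small sketch using the 1/sqrt(n-1) approximation from this card, together with the rule of thumb that a 95% interval spans about ±2 standard errors. The function name `correlation_se` is made up for illustration.

```python
import math

def correlation_se(n):
    """Approximate standard error of a correlation for sample size n."""
    return 1 / math.sqrt(n - 1)

n = 17
se = correlation_se(n)
half_width = 2 * se  # rough 95% CI is r plus/minus this amount
print(se, half_width)  # 0.25 0.5
```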
If you had an experiment with 5 groups, how many meaningful 1 group vs. 1 group comparisons could you make?
10
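The count can be checked by listing every pair of groups. The group labels below are placeholders, not from the original card.

```python
import math
from itertools import combinations

groups = ["control", "A", "B", "C", "D"]  # hypothetical labels for 5 groups
pairs = list(combinations(groups, 2))     # every 1 group vs. 1 group comparison
print(len(pairs), math.comb(len(groups), 2))  # 10 10
```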
If a z-score was +2 then, using our 68/95/almost-all rule, what percentage of the normal distribution would be above this z-score?
2.5
Plus/minus one standard deviation encloses how much of a normal distribution?
68%
The true mean would be expected to fall within a confidence interval of +/- 1 standard error about what percentage of the time?
68%
If a z-score was -1 (negative 1), what percentage of the normal distribution would be above that score?
84
A z score of 1 will always be above
84% of the distribution and below 16%
The sampling distribution of the mean is _______, as guaranteed by the Central Limit Theorem.
Gaussian
In statistics, the word "significant" means
a probability smaller than some arbitrary "significance level".
sampling distribution
a theoretical distribution of a sample statistic across many repeated samples (its spread is the standard error, SE)
In the equation y = a*x^2 + b*x + c, which of the following is a coefficient?
all of them: a, b, and c
The Central Limit Theorem tells us that the sample means from repetitions of an experiment
are normally distributed.
the average or mean standard deviation
best estimate of true standard deviation
The distribution of the number of things that fall into one of two categories is called the
binomial distribution
For a two group t-test experiment, the p-value is computed
by assuming the difference between the groups is really zero
whenever we collect data and compute standard deviation
chances are that it is too small
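A quick simulation can illustrate this tendency. It is only a sketch: the population (normal, true SD of 1), the sample size of 5, and the number of repeats are all arbitrary choices made for the example.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

TRUE_SD = 1.0
N = 5            # a small sample, chosen arbitrarily
REPEATS = 5000

# Compute the sample standard deviation for many simulated experiments.
sample_sds = [
    statistics.stdev([random.gauss(0.0, TRUE_SD) for _ in range(N)])
    for _ in range(REPEATS)
]

# Fraction of experiments whose sample SD fell below the true SD.
fraction_too_small = sum(sd < TRUE_SD for sd in sample_sds) / REPEATS
print(fraction_too_small)              # expected to be above 0.5
print(statistics.median(sample_sds))   # the 50/50 split, expected below TRUE_SD
```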
hopefully our sample is
a representative sample, with a mean close to the true mean and a standard deviation close to the true standard deviation
If a z-score is around 4, you would probably
conclude that the null hypothesis is incorrect
Confidence intervals most commonly use error bars
denoting 95% CIs
Graphically, we represent confidence intervals using
error bars
In the laboratory, two group experiments generally have
experimental and control group
Given what we know about bacteria and interest rates, U.S. population growth might very well look like
exponential growth
Strictly speaking, regression is a procedure for
fitting a pattern to data
If you did an experiment several times, and recorded the mean each time, your means would
form a normal distribution centered on the true population mean. (Sample means are estimates of the true mean.)
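A rough simulation of repeating an experiment many times and recording the mean each time. The population values (true mean 50, true SD 10), the sample size, and the number of repeats are assumptions chosen only for the example.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_MEAN, TRUE_SD = 50.0, 10.0   # hypothetical population
N = 25                            # observations per experiment
REPEATS = 2000                    # number of simulated experiments

def run_experiment():
    """Draw one sample of size N and return its mean."""
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    return statistics.mean(sample)

means = [run_experiment() for _ in range(REPEATS)]

center = statistics.mean(means)    # should land near TRUE_MEAN
spread = statistics.stdev(means)   # should land near TRUE_SD / sqrt(N)
expected_se = TRUE_SD / math.sqrt(N)

# Share of sample means within one standard error of the true mean.
within_one_se = sum(abs(m - TRUE_MEAN) <= expected_se for m in means) / REPEATS
print(center, spread, within_one_se)
```

The share within one standard error should come out near 68%, which ties this card to the 68/95 rule.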
68-95-99.7 rule
in a normal model, about 68% of values fall within 1 standard deviation of the mean, about 95% fall within 2 standard deviations of the mean, and about 99.7% fall within 3 standard deviations of the mean
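The rule can be checked against the standard normal distribution with Python's `statistics.NormalDist`. The helper name `within` is made up for this sketch.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(within(k) * 100, 1))

# Fraction above z = -1: roughly the 84% quoted on the z-score cards.
above_minus_one = 1 - std_normal.cdf(-1)
print(round(above_minus_one * 100))
```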
If we plot two variables against one another (y vs. x) and we see a distinct pattern, then the correlation must be
it would depend on the pattern, silly
if it's complicated
it's normal
A best fit line is the two-dimensional analog to the
mean
all normal distributions can be described by
mean and standard deviation
measures of central tendency
mean, median, mode
a small standard error
means the sample mean is likely to be close to the true mean
The number that divides a distribution in half (so that there are an equal number of observations above and below that value) is called the
median
normal distribution
many things are approximately normally distributed, with a bell-shaped (Gaussian) curve.
If you count enough things that fall into one of two categories, the distribution becomes really, really close to a
normal distribution
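A sketch comparing an exact binomial probability with its normal approximation. The numbers (100 fair coin flips, at most 55 heads) are arbitrary, and the 0.5 added to the cutoff is the usual continuity correction.

```python
import math
from statistics import NormalDist

n, p = 100, 0.5   # hypothetical: 100 flips of a fair coin
cutoff = 55       # probability of at most 55 heads

def binom_cdf(k):
    """Exact probability of k or fewer successes in n trials."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

exact = binom_cdf(cutoff)

# Normal distribution with the same mean and standard deviation as the binomial.
normal = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
approx = normal.cdf(cutoff + 0.5)

print(round(exact, 3), round(approx, 3))
```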
A correlation of 0.82 means that
not enough info
The hypothesis used to compute p-values is generally the ____ hypothesis
null hypothesis
An "interaction" is when
one variable influences the effect of another.
"Rarity", when expressed as a probability given a hypothesis, is called a
p-value
If a post hoc comparison yielded a Bonferroni-corrected p-value of 0.002, it needs to be
replicated
Which two patterns look almost like upside-down versions of one another?
saturation and exponential decay
You can easily compute the standard error of the mean with your
standard deviation and sample size
standard error of mean can be calculated by
standard deviation of the means = SD / sqrt(n), where n is the sample size
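A minimal sketch of the calculation: the sample standard deviation divided by the square root of the sample size. The data values and the function name `standard_error` are hypothetical.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    return sd / math.sqrt(len(sample))

sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]  # made-up observations
sem = standard_error(sample)
print(round(sem, 3))
```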
A first order polynomial corresponds to a
straight line
standard error of the mean
tells you the precision with which your sample mean estimates the true mean.
The cloud that stat_smooth() puts around the data is
the 95% confidence interval around your fit.
As always, a very small p-value tells you that
the data you got would be very rare if the null hypothesis were true
The basic question the ANOVA answers is whether
the group means are more spread out than would be predicted by chance.
For the ANOVA, F is computed assuming
the null hypothesis is true
The denominator for a two group t is
the pooled standard error
Linear Regression:
the procedure for fitting polynomials to data
true mean and true standard deviation are not equal to
the sample mean and sample standard deviation
In a straight line fit, y = a*x + b, the "a" coefficient represents the
slope of the line
the standard error refers to
the standard deviation of the sampling distribution of the mean, i.e. the spread of the means you would get from repeating the experiment many times.
If there is no "main effect" of a variable, that means
the variable might or might not have an effect.
If a distribution is positively skewed, the mean will be ______ the median.
to the right of
If our standard deviation from our sample underestimates the real (True / Population) standard deviation, then our z-score will be
too big
If our standard deviation from our sample underestimates the real (True / Population) standard deviation, then our p-value will be
too small
A sample of 20 observations (n = 20) would be considered a "small" sample.
true
F is basically the ______, scaled by the sample size.
variance of the means over the mean of the variances.
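A rough sketch of this description of F for equal-sized groups: the variance of the group means, multiplied by the group size, divided by the mean of the within-group variances. The function name `f_ratio` and the data are invented for illustration, and equal group sizes are assumed.

```python
import statistics

def f_ratio(groups):
    """F for a one-way ANOVA, assuming every group has the same size."""
    n = len(groups[0])  # observations per group (assumed equal across groups)
    group_means = [statistics.mean(g) for g in groups]
    variance_of_means = statistics.variance(group_means)
    mean_of_variances = statistics.mean([statistics.variance(g) for g in groups])
    return n * variance_of_means / mean_of_variances

# Hypothetical data: three groups of three observations each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]
F = f_ratio(groups)
print(round(F, 2))  # 21.0
```

Here the group means are 2, 3, and 7 (variance 7), each group has variance 1, and the group size is 3, so F = 3 × 7 / 1 = 21.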
If you fit a straight line to data and got a slope of 8 and a standard error of 2, about what percentage of the time would you expect to see a negative slope if you repeated the experiment a kazillion times?
virtually never
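The reasoning can be sketched numerically: a slope of 8 with a standard error of 2 sits 4 standard errors above zero. This assumes the slope estimates are normally distributed around 8 with a spread of 2, which is an approximation.

```python
from statistics import NormalDist

slope, se = 8.0, 2.0
z = slope / se  # how many standard errors the slope is from zero

# Probability that a repeated experiment gives a slope below zero,
# assuming slope estimates follow a normal distribution (an approximation).
p_negative = NormalDist(mu=slope, sigma=se).cdf(0.0)
print(z, p_negative)
```

The resulting probability is a few in 100,000, which matches "virtually never".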
standard error
width of sampling distribution of the mean
the median (50/50 split) standard deviation
will be to the left of the average standard deviation, so more often than not we underestimate the true value of the standard deviation. Your sample standard deviation has to be higher than the 50/50 split to be considered close to the true standard deviation.
sample mean
x bar
second order polynomial
y = ax^2 + bx + c (a parabola)
A p-value is the probability that
you would get a mean or score at least as big as yours, given the null hypothesis.
All normal distributions can be converted into a single distribution called the __________.
z distribution (the standard normal distribution)
When we compute a correlation, we are assuming that
a linear relationship exists
Regression:
a general set of procedures for fitting patterns to data
first order polynomial
a line is a polynomial y = ax + b
What is t?
t is simply the calculated difference expressed in units of standard error (difference / standard error), i.e. the number of standard errors separating your result from the reference mean. The greater the magnitude of t, the greater the evidence against the null hypothesis.
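A sketch of t for a two-group experiment, using the pooled standard error as the denominator. The function name `two_group_t` and the data are hypothetical, and the pooled form assumes the two groups have similar variances.

```python
import math
import statistics

def two_group_t(a, b):
    """Difference between group means in units of the pooled standard error."""
    na, nb = len(a), len(b)
    # Pooled variance: within-group variances weighted by their degrees of freedom.
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    pooled_se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / pooled_se

control = [1.0, 2.0, 3.0]       # made-up control group
experimental = [4.0, 5.0, 6.0]  # made-up experimental group
t = two_group_t(experimental, control)
print(round(t, 2))
```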
A key difference between the correlation coefficient, r, and R^2 is that
R^2 is useful for straight and curved line fits.
Sample standard deviation
s
sample distribution
a frequency distribution of a sample (its spread is the standard deviation, SD)
the concept of sampling distribution is
a fundamental concept in statistics