statistics final
Central Limit Theorem
- As our sample size gets sufficiently large, the distribution of the sample mean becomes approximately normal. The law of large numbers goes hand in hand with this: as our sample size grows toward the size of the population, our sample statistics (x-bar and s) approach the population parameters (µ and σ).
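A small simulation can make this card concrete. This is only a sketch: the exponential population, the seed, and the helper name `sample_means` are assumptions chosen for illustration, not part of the course material.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

def sample_means(n, draws=2000):
    """Means of `draws` random samples of size n from a skewed population.

    The population is exponential with mean 1.0 (an assumption for the demo).
    """
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(draws)
    ]

small = sample_means(2)    # tiny samples: means are still widely spread
large = sample_means(50)   # larger samples: means cluster near the population mean

# The spread of the sample means shrinks as n grows,
# and their average stays close to the population mean of 1.0.
print(statistics.stdev(small), statistics.stdev(large))
print(statistics.fmean(large))
```

Plotting a histogram of `large` would show the roughly bell-shaped curve the theorem describes, even though the population itself is strongly skewed.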
the Pearson correlation coefficient
We use the t-distribution again, but our degrees of freedom change slightly: df = n - 2, where n is the number of data pairs we have.
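As a rough illustration, the usual test statistic for a correlation is t = r·√(n − 2) / √(1 − r²). The function name and the example values (r = 0.6, n = 18) below are made up for the sketch.

```python
import math

def correlation_t(r, n):
    """Return (t, df) for testing whether a Pearson r differs from zero.

    r is the sample correlation and n is the number of data pairs.
    """
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df

# Hypothetical example values: r = 0.6 from 18 data pairs.
t, df = correlation_t(0.6, 18)
print(df, round(t, 3))
```

The resulting t would then be compared against the t-distribution with n - 2 degrees of freedom.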
Which statement accurately describes the proportions in the body of a normal distribution?
Body proportions are always greater than or equal to .50
If α = .05 for a two-tailed hypothesis test, how are the boundaries for the critical region determined?
Boundaries are drawn so there is 2.5% (.025) in each tail of the distribution.
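One way to see where those boundaries land is to ask the standard normal distribution for the point that leaves 2.5% in the upper tail. This sketch uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # mean 0, standard deviation 1

# Split alpha evenly: .025 in each tail.
upper = std_normal.inv_cdf(1 - alpha / 2)
lower = -upper

print(round(lower, 2), round(upper, 2))  # -1.96 1.96
```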
A two tailed test
For a two tailed test, the area will be the sum of the area to the left of -Z and the area to the right of +Z.
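The area described above can be computed directly from the standard normal CDF. The helper below is a sketch with an assumed `tail` parameter so that the same function also covers the one-tailed cases.

```python
from statistics import NormalDist

def p_value(z, tail="two"):
    """p-value for a Z statistic; tail is 'left', 'right', or 'two'."""
    std = NormalDist()
    if tail == "left":
        return std.cdf(z)          # area to the left of Z
    if tail == "right":
        return 1 - std.cdf(z)      # area to the right of Z
    # Two-tailed: area left of -|Z| plus area right of +|Z|.
    return std.cdf(-abs(z)) + (1 - std.cdf(abs(z)))

print(round(p_value(1.96), 3))  # 0.05
```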
Why are t statistics more variable than z-scores?
The extra variability is caused by variations in the sample variance.
When we use the t-distribution, we are introduced to a concept called degrees of freedom
The number of degrees of freedom (df) for a t-test will be as follows: df = n - 1 where n is our sample size.
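A minimal sketch of a one-sample t statistic shows where df = n - 1 comes in. The sample values and the hypothesized mean of 10 are invented for the example.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t-test against hypothesized mean mu0."""
    n = len(sample)
    m = statistics.fmean(sample)
    s = statistics.stdev(sample)   # sample SD (divides by n - 1)
    se = s / math.sqrt(n)          # estimated standard error
    return (m - mu0) / se, n - 1

# Hypothetical data: five scores tested against mu0 = 10.
t, df = one_sample_t([12, 15, 11, 14, 13], mu0=10)
print(df, round(t, 3))
```

Because `s` is estimated from the sample, it changes from sample to sample, which is the extra source of variability in t compared with z.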
What is standard error of estimate?
The standard deviation of the observed yi-values about the predicted ŷ-values
percentiles
They represent the same things, regardless of distribution used. We just have a few different methods we use to find them
What is a strong correlation?
absolute value of the correlation coefficient is close to 1 (approximately between 0.8 and 1).
The things that will impact the power of a hypothesis test are
adjusting the value of alpha (as alpha increases, so does power), adjusting the sample size (as sample size increases, so does power), adjusting the type of hypothesis test run (one tailed tests have higher power than two tailed tests), and the assumed alternative distribution used.
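These effects can be checked numerically for a Z test. The sketch below assumes a known σ, a true mean that sits `effect` units above the null value, and (for the one tailed case) a test run in that same direction; the function name and example numbers are illustrative only.

```python
import math
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05, two_tailed=True):
    """Approximate power of a Z test when the true mean is `effect` above the null."""
    std = NormalDist()
    # Distance between null and alternative, in standard-error units.
    shift = effect * math.sqrt(n) / sigma
    if two_tailed:
        crit = std.inv_cdf(1 - alpha / 2)
        # Probability of landing in either rejection region under the alternative.
        return (1 - std.cdf(crit - shift)) + std.cdf(-crit - shift)
    crit = std.inv_cdf(1 - alpha)
    return 1 - std.cdf(crit - shift)

base = z_test_power(2, 10, 25)
print(base)
print(z_test_power(2, 10, 100))                  # larger n
print(z_test_power(2, 10, 25, alpha=0.10))       # larger alpha
print(z_test_power(2, 10, 25, two_tailed=False)) # one tailed
```

Each of the last three calls should report more power than `base`, matching the list on the card.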
When is the Scheffe test performed?
after an ANOVA has been completed to see if any of the pairs of means are different from one another
As sample size increases, what happens to df?
As the sample size increases, the df increases as well, and the critical values of the t-distribution decrease as they approach the critical values of the z-distribution.
Formula for standard error?
The standard error is σ/√n. As our sample size increases, our standard error decreases in proportion to the square root of n.
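A quick numeric check, assuming the usual formula σ/√n for the standard error of the mean (the function name is illustrative): quadrupling the sample size cuts the standard error in half.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean for population SD sigma and sample size n."""
    return sigma / math.sqrt(n)

print(standard_error(10, 25))   # 2.0
print(standard_error(10, 100))  # 1.0
```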
What if there is extreme value or outlier in data set?
it can have a significant impact on the value of our correlation coefficient
There are three types of hypotheses we can set up:
left tailed, right tailed, and two tailed hypothesis tests.
When running a standard hypothesis test using the t-distribution, our df is?
n-1
When we do a t-test on the Pearson correlation coefficient, the df is?
n-2
In an analysis of variance, differences between participants contribute to which of the following variances?
both between-treatments variance and within-treatments variance
right-tailed test
p-value will be the area to the right of Z.
Which of the following is an accurate definition of a Type I error?
rejecting a true null hypothesis
Type I error (alpha)
rejecting the null hypothesis when it is true (false positive)
MSb and MSw
represent the mean variability between conditions and within conditions
A hypothesis test involves a comparison of which two elements?
research results from a sample and a hypothesis about a population
Which is the standard error of M?
the standard deviation of the distribution of sample means
The correlation coefficient falls between?
-1 and +1
Create rejection regions to stop the hypothesis test process one step sooner
If our Z-value falls in the rejection region, we reject the null. If our Z-value does not fall in the rejection region, we fail to reject the null.
Weak correlation
A weak correlation exists when the absolute value of the correlation coefficient is less than 0.5
medium correlation
A moderate correlation exists when the absolute value of the correlation coefficient is approximately between 0.5 and 0.8.
The sign of the correlation coefficient tells us what?
Whether the slope is positive or negative
How does sample variance influence the estimated standard error and measures of effect size such as r² and Cohen's d?
Larger sample variance increases the estimated standard error but decreases measures of effect size.
Can you use the Tukey test or the Scheffe test on 2 treatment conditions?
Neither the Tukey nor the Scheffe test is necessary for an analysis of variance comparing only two treatment conditions
To run a t-test, we need to ensure the following assumptions are met
We do not know the population standard deviation, our data comes from a random sample, the elements in our sample are independent of one another, our data comes from a normal distribution or assumed normal through the CLT, and we have no outliers. The t-distribution can often be used at smaller sample sizes if the assumptions are met.
To run a Z test, we need to ensure the following assumptions are met
We know the population standard deviation, our data comes from a random sample, the elements in our sample are independent of one another, and our data comes from a normal distribution, or we can assume normality through the Central Limit Theorem (usually n > 30)
All else constant, which combination of factors would increase the width of a confidence interval most?
a smaller sample size and a higher confidence level
A researcher is interested in having as much ability as possible to identify a treatment effect if one really exists. Which of the following strategies should they employ?
change the sample size from n = 25 to n = 100
Anova test?
compares three or more means, while a t-test can only compare one or two means
A researcher uses analysis of variance to test for mean differences among three treatments with a sample of n = 12 in each treatment. The F-ratio for this analysis would have which df values?
df = 2, 33
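The two df values follow from df between = k - 1 and df within = N - k, where k is the number of treatments and N is the total number of scores. A small sketch (the function name is illustrative, and equal group sizes are assumed):

```python
def anova_df(k, n_per_group):
    """Return (df_between, df_within) for a one-way ANOVA with equal group sizes."""
    total_n = k * n_per_group
    return k - 1, total_n - k

print(anova_df(3, 12))  # (2, 33)
```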
Type II error
failing to reject a false null hypothesis
Standard error of estimate
gives a measure of the standard distance between the predicted y values on the regression line and the actual y values in the data.
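As a sketch of how this could be computed for simple linear regression: fit the least-squares line, sum the squared residuals, divide by n - 2, and take the square root. The n - 2 divisor is the usual convention and is an assumption here, as are the function name and the small data set.

```python
import math

def standard_error_of_estimate(xs, ys):
    """Standard distance between actual y values and the regression line."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx          # slope of the least-squares line
    a = my - b * mx        # intercept
    # Sum of squared distances between actual and predicted y values.
    ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(ss_residual / (n - 2))

# Hypothetical data pairs.
see = standard_error_of_estimate([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(see, 3))
```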
The things that will impact the width of a confidence interval are
sample size (as the sample size increases, the width will decrease) and chosen confidence level (as confidence increases, the width will also increase)
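Both effects can be seen in a Z-based interval, where the width is 2 · z · σ/√n. This sketch assumes a known population standard deviation; the function name and the numbers are illustrative.

```python
import math
from statistics import NormalDist

def ci_width(sigma, n, confidence):
    """Full width of a Z-based confidence interval for the mean."""
    # Critical z that leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 2 * z * sigma / math.sqrt(n)

print(ci_width(10, 25, 0.95))   # baseline
print(ci_width(10, 100, 0.95))  # larger sample: narrower
print(ci_width(10, 25, 0.99))   # higher confidence: wider
```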
Consider a researcher who is exploring new potential treatments for specific forms of cancer. This researcher is extremely focused on avoiding mistakes in concluding that treatments that may very well be effective are ineffective when conducting their research. What should this researcher do?
set a higher alpha level
Each of these bits of information is obtainable from the results of a t-test written up in a statistical report.
the alpha level used in the hypothesis test, whether the null hypothesis is rejected or fails to be rejected, and the degrees of freedom used in the hypothesis test
In ANOVA, MS represents?
the mean of squared deviations
The Tukey test compares what?
the mean of every treatment to the mean of every other treatment, two at a time.
a left tailed test
the p-value will be found as the area to the left of Z
Which of the following is consistent with what r² represents as a measure of effect size?
the portion of variability in a sample attributable to a treatment effect relative to the total variability in the sample
What is the F ratio?
the ratio of MSbetween to MSwithin
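To show how the ratio comes together, here is a sketch of a one-way ANOVA computed by hand: each MS is a sum of squares divided by its df. The function name and the three small groups of scores are invented for the example.

```python
from statistics import fmean

def f_ratio(groups):
    """F = MS_between / MS_within for a one-way ANOVA on a list of groups."""
    k = len(groups)
    all_scores = [x for g in groups for x in g]
    grand_mean = fmean(all_scores)
    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variability of scores around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (len(all_scores) - k)
    return ms_between / ms_within

# Hypothetical scores for three treatment conditions.
f = f_ratio([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f, 3))
```

An F near 1 would suggest the treatment means differ no more than chance alone predicts; larger values point toward a treatment effect.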