501 Stats test number 2
For a chi-square test, degrees of freedom are determined by:
Multiplying (rows-1) by (columns-1)
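A minimal Python sketch with a hypothetical 2 x 3 table:
rows, columns = 2, 3                  # hypothetical table dimensions
df = (rows - 1) * (columns - 1)
print(df)                             # 2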
Odds
Number of events divided by the number of non-events (this measure does not use the marginal totals).
t =
(Observed difference between sample means − expected difference between population means, usually 0 if the null hypothesis is true) divided by (the estimate of the standard error of the difference between the two sample means).
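A minimal Python sketch of this formula; the sample means and standard error below are hypothetical numbers for illustration only:
mean1, mean2 = 104.0, 100.0           # hypothetical sample means
expected_diff = 0.0                   # expected difference under the null hypothesis
se_diff = 1.5                         # hypothetical estimate of the standard error of the difference
t = ((mean1 - mean2) - expected_diff) / se_diff
print(round(t, 2))                    # 2.67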
Two types of Independence
1) A basic assumption of chi-square is "independence of observations": one participant's choice has no effect on another's. 2) In the case of contingency tables, it is independence of variables, because independence is what is being tested.
The chi-square statistic.
= the sum, over all cells, of (observed value − expected value)² / expected value.
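A sketch of the calculation in Python, using hypothetical observed and expected cell counts:
observed = [30, 20, 25, 25]               # hypothetical cell counts
expected = [27.5, 22.5, 27.5, 22.5]       # hypothetical expected (model) counts
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_sq, 3))                   # 1.01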
Sampling Error
Expected variability from sample to sample that is due to chance
t-test
Is used for samples where the population parameters are not known.
Z-test
Is used when we know the population parameters.
The cutoffs for t are higher than those for z to avoid making what type of error?
Type I errors (rejecting H0 when it is true).
Risk
Number of events divided by the total number (of people) in that group.
The t distribution varies as a function of degrees of freedom, in such a way that as the number of degrees of freedom increases, the skewness and the tendency for s² to underestimate σ²......
Decrease, and the distribution becomes more normal; as the degrees of freedom approach infinity it becomes normally distributed and equivalent to z.
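A small check of this using scipy (a sketch): the two-tailed .05 cutoffs for t shrink toward the z cutoff of 1.96 as the degrees of freedom grow.
from scipy.stats import t, norm
for df in (5, 30, 1000):
    print(df, round(t.ppf(0.975, df), 3))   # 2.571, 2.042, 1.962
print("z", round(norm.ppf(0.975), 3))       # 1.96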
s²
Denotes the sample variance, the estimate of the population variance σ².
True or False: Regarding confidence intervals at α = .05: if we were to repeat the experiment over and over, 95% of the time the true mean falls within the confidence interval.
FALSE!!! The correct way to state the interpretation is: if we were to repeat the experiment over and over, then 95% of the time the confidence intervals would contain the true mean.
Confidence Interval for Matched Sample.
For t, use the critical value from the table at the correct degrees of freedom, and solve for μ.
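A sketch of the interval for a matched sample, using hypothetical difference scores and scipy's t critical value:
from math import sqrt
from scipy.stats import t
diffs = [2, 3, -1, 4, 1, 2, 0, 3]         # hypothetical difference scores
n = len(diffs)
mean_d = sum(diffs) / n
s_d = sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
t_crit = t.ppf(0.975, n - 1)              # critical value at alpha = .05, df = n - 1
half_width = t_crit * s_d / sqrt(n)
print(mean_d - half_width, mean_d + half_width)   # roughly 0.35 to 3.15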
Independent Samples t-test
For which t-test do we use the difference between means and the standard error of the differences to calculate the confidence interval?
Expected Frequencies in a contingency table represent those frequencies that we would expect if the two variables forming the table were independent or dependent?
Independent :-)
If standard error is large ....
Large differences in the sample means are more likely.
Dividing by an s² value that is smaller than σ² would be (if we knew it) ultimately causes the value of t to be larger, or smaller, than z would be?
Larger
Odds ratio
Odds of one condition divided by the odds of another condition
The shape of the sampling distribution for s² is positively skewed or negatively skewed?
Positively skewed!
Risk ratio
Risk of one condition divided by the risk of another condition
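The four definitions above (odds, risk, odds ratio, risk ratio) can be tied together with a hypothetical 2 x 2 table; a Python sketch, with made-up counts:
# Hypothetical counts:          event   no event
# group 1 (e.g. treatment)        10        90
# group 2 (e.g. control)          25        75
a, b = 10, 90
c, d = 25, 75
odds_1, odds_2 = a / b, c / d               # events / non-events
risk_1, risk_2 = a / (a + b), c / (c + d)   # events / total in the group
odds_ratio = odds_1 / odds_2                # about 0.33
risk_ratio = risk_1 / risk_2                # 0.40
print(round(odds_ratio, 2), round(risk_ratio, 2))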
Chi-Square Expected Values (Model)
(Row Total × Column Total) / n
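A sketch of that calculation for every cell of a hypothetical observed table:
observed = [[10, 90],
            [25, 75]]                        # hypothetical 2 x 2 table
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]
print(expected)                              # [[17.5, 82.5], [17.5, 82.5]]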
Size of the Confidence Interval is a function of the......
Sample size
Chi-square test statistic is an overall result. This can be broken down using standardized residuals.
Standardized residuals have a direct relationship with the test statistic (they are a standardized version of the difference between the observed and expected frequencies). These standardized residuals are z-scores: if a value lies outside of ±1.96, it is significant at p < .05.
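A sketch of one cell's standardized residual, (observed − expected) / √expected, with hypothetical values:
from math import sqrt
observed, expected = 25, 17.5               # hypothetical values for one cell
std_resid = (observed - expected) / sqrt(expected)
print(round(std_resid, 2))                  # 1.79, inside ±1.96, so not significant at p < .05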
Contingency table
Table with rows (i) and columns (j) recording the observed frequencies, which are used to tabulate the expected (model) values.
Fisher's exact test
Takes all the possible 2x2 tables that can be formed from the fixed set of marginal totals, then sums the probabilities of those tables whose results were as extreme as, or more extreme than, the table obtained in our data. If the sum is less than alpha, we reject the null hypothesis that the variables are independent and conclude that there is a statistically significant relationship between the two variables that make up the contingency table. AKA: conditional test.
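In practice this is usually run with software; a sketch using scipy's fisher_exact on a hypothetical small-count table:
from scipy.stats import fisher_exact
table = [[8, 2],
         [1, 5]]                             # hypothetical 2 x 2 table with small counts
odds_ratio, p_value = fisher_exact(table)    # two-sided by default
print(round(p_value, 3), p_value < 0.05)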
Sampling distributions
The most basic concept underlying all statistical tests. It tells us what values we might or might not expect to obtain for a particular statistic under a set of predefined conditions
Standard error
The standard deviation of the distribution of differences between sample means
In matched samples, repeated measures, or related samples (AKA: matched-sample t-test)
The t statistic is calculated using the Difference scores.
Variance Sum Law
The variance of a sum or difference of 2 independent variables is equal to the sum of their variances.
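Applied to the difference between two independent sample means (hypothetical variances and sample sizes below), this law is why the standard error of the difference adds the two variance terms; a sketch:
s1_sq, n1 = 16.0, 20                  # hypothetical sample variance and size, group 1
s2_sq, n2 = 25.0, 25                  # group 2
var_diff = s1_sq / n1 + s2_sq / n2    # variance of (M1 - M2) for independent samples
se_diff = var_diff ** 0.5             # standard error of the difference
print(round(se_diff, 3))              # 1.342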
Pooled Variance
The weighted average of the two sample variances (each s² is multiplied by its degrees of freedom, n − 1, in the numerator).
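A sketch of the weighting with hypothetical sample variances and sizes:
s1_sq, n1 = 16.0, 20                  # hypothetical sample variance and size, group 1
s2_sq, n2 = 25.0, 25                  # group 2
pooled = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
print(round(pooled, 3))               # 21.023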
Independent samples t-test
This t-test is used when there are two experimental conditions and different participants were assigned to each condition.
Cohen's d in terms of sample standard deviation
To put computations in terms of statistics rather than parameters, we substitute sample means and standard deviations for the population values. (The example shows an effect size of an increase of nearly one and a half standard deviations.)
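A sketch with hypothetical sample statistics, chosen so the result matches the "nearly one and a half standard deviations" mentioned above:
mean1, mean2 = 110.0, 100.0           # hypothetical sample means
s = 7.0                               # hypothetical sample standard deviation used as the standardizer
d = (mean1 - mean2) / s
print(round(d, 2))                    # 1.43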
Chi Square analyzes frequencies: True or False
True: the mean of a categorical variable is meaningless, and the numeric values you attach to different categories are arbitrary; the meaning of those numeric values depends on how many members each category has. Therefore we analyze frequencies.
True or False: The confidence interval does not allow one to make probability estimates about hypotheses.
True: just as with NHST, the confidence interval doesn't allow one to make probability estimates about the hypotheses; rather, the probability applies to the method, not to a particular interval.
Because the shape of the sampling distribution for s² is positively skewed, are we more likely to underestimate or overestimate the population variance (σ²)? (especially for small samples)
Underestimate!
Pearson's chi-squared test
Used to determine whether there is a relationship between two categorical variables.
Paired Samples t-test
Used when there are two experimental conditions and the same participants took part in both conditions of the experiment. AKA: matched-pairs or dependent t-test.
If standard error is small....
We expect most samples to have similar means.
Weighted average
When sample variances are weighted by their degrees of freedom (n − 1), as in the pooled variance of the independent-samples t-test.
Homogeneity of Variance
Is the assumption that the variance within each of the populations is equal.
What sample size is large enough for the t-statistic (t-obtained) to be reasonably compared to the t-distribution?
n = 25 to 30 is considered sufficiently large to produce a "normal distribution".