Statistics chapter 15-Chi-Square test
The Chi-Square test is typically used to analyze the relationship between two variables under the following conditions:
1) Both variables are qualitative in nature (that is, measured on a nominal level). 2) The two variables have been measured on the same individuals. 3) The observations on each variable are between-subjects in nature.
Inference of a relationship using the Chi-Square Test:
1) State a null hypothesis. 2) Obtain a set of observed frequencies. 3) Derive a set of expected frequencies based on the assumption that the null hypothesis is true. 4) Compare the expected frequencies with the observed frequencies. 5) Reject the null hypothesis if the overall difference between the observed frequencies and the expected frequencies is large (as defined by a given alpha level).
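The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the 2 X 2 table is made-up data, the function name chi_square_test is hypothetical, and the critical value 3.841 is the standard table value for 1 degree of freedom at an alpha level of .05.

```python
def chi_square_test(observed, critical_value):
    """Return (chi-square statistic, reject-null decision) for a two-way table."""
    # Step 3: expected frequencies, assuming the null hypothesis is true
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Step 4: compare observed and expected frequencies cell by cell
    chi2 = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )

    # Step 5: reject the null hypothesis if the difference is large enough
    return chi2, chi2 > critical_value


# Step 1: null hypothesis = the two variables are unrelated in the population
# Step 2: observed frequencies (hypothetical data)
observed = [[30, 10], [10, 30]]
chi2, reject = chi_square_test(observed, critical_value=3.841)
print(chi2, reject)
```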
Unique issues raised by the analysis of 2 X 2 contingency tables:
1) Yates' correction for continuity 2) Computational formula for 2 X 2 tables
Strength of the relationship
*A number of indexes have been proposed for measuring the strength of the relationship between two variables in a contingency table, but the most common are: 1) Fourfold point correlation coefficient: used when both variables have two levels each. 2) Cramer's statistic: used when one or both variables have more than two levels. *These indexes can range from 0 to 1.00, with 0 indicating no relationship and 1.00 indicating a perfect relationship.
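Both indexes can be computed from the chi-square statistic. The sketch below assumes the standard definitions, which the notes do not spell out: the fourfold point correlation coefficient as the square root of chi-square divided by N, and Cramer's statistic as the square root of chi-square divided by N times (L - 1), where L is the smaller of the number of rows and columns. The function names and tables are hypothetical.

```python
import math


def chi_square(observed):
    """Chi-square statistic for a two-way table of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected frequency
            total += (o - e) ** 2 / e
    return total


def cramers_statistic(observed):
    """Cramer's statistic; equals the fourfold coefficient for a 2 X 2 table."""
    n = sum(sum(row) for row in observed)
    smaller_dim = min(len(observed), len(observed[0]))
    return math.sqrt(chi_square(observed) / (n * (smaller_dim - 1)))


phi = cramers_statistic([[30, 10], [10, 30]])        # 2 X 2 table
v = cramers_statistic([[20, 30, 50], [30, 20, 50]])  # 2 X 3 table
print(round(phi, 3), round(v, 3))
```

For a 2 X 2 table, L - 1 equals 1, so the same function returns the fourfold point correlation coefficient.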
Expected frequencies and the Chi-Square statistic
*Application of the chi-square test requires the computation of an expected frequency for each cell of a contingency table under the assumption that there is no relationship between the two variables in the population. *Expected frequencies tend to increase as the size of the overall sample increases.
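As a minimal sketch, the expected frequency for a cell is taken here to be (row total x column total) / N, the usual formula under the assumption of no relationship. The function name and the data are hypothetical. Doubling every observed count also shows how expected frequencies grow with sample size.

```python
def expected_frequencies(observed):
    """Expected frequency per cell: (row total * column total) / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


small_sample = [[10, 20], [30, 40]]   # N = 100 (hypothetical data)
large_sample = [[20, 40], [60, 80]]   # same pattern, N = 200

print(expected_frequencies(small_sample))
print(expected_frequencies(large_sample))
```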
Nonparametric statistics
*Are a class of statistical tests that focus on distributions of scores rather than means and the measures of variability that are central to the parametric tests that we have considered previously.
Cell
*Each unique combination of variables in a contingency table is referred to as this.
Alternatives to the chi-square test
*Fisher's exact test
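For illustration only, a two-sided Fisher's exact test for a 2 X 2 table can be sketched with the standard library. The assumptions here are that the marginal totals are treated as fixed, that the top-left cell follows a hypergeometric distribution, and that the two-sided p value sums the probabilities of all tables no more likely than the observed one (one common convention among several). The function name and data are hypothetical.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p value for the 2 X 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    total = comb(n, col1)  # number of ways to fill the first column

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k
        return Fraction(comb(row1, k) * comb(row2, col1 - k), total)

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_observed = prob(a)
    # Sum over every table that is no more probable than the observed one
    return sum(
        prob(k) for k in range(k_min, k_max + 1) if prob(k) <= p_observed
    )


p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(float(p_value))
```

Exact fractions are used so that the comparison between table probabilities is not affected by floating-point rounding.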
Computational formula for 2 X 2 tables
*If both variables have only two levels, the chi-square statistic can be derived using a different computational formula.
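The notes do not give the formula itself. A minimal sketch, assuming the standard shortcut for a table with cells a, b (top row) and c, d (bottom row): chi-square = N(ad - bc)^2 / [(a + b)(c + d)(a + c)(b + d)]. The function name and data are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Shortcut chi-square for the 2 X 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    # Product of the four marginal frequencies
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


print(chi_square_2x2(30, 10, 10, 30))
```

This gives the same value as summing (O - E)^2 / E over the four cells, without computing the expected frequencies first.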
Sampling distribution of the chi-square statistic
*If two variables are unrelated in the population, the population value of the chi-square statistic will equal 0. -However, because of sampling error, a chi-square statistic computed from sample data might be greater than 0 even when the null hypothesis is true.
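A small simulation can make this concrete. The sketch below assumes a hypothetical population in which two two-level variables are unrelated (each level equally likely), draws repeated random samples of 100 individuals, and computes the chi-square statistic for each sample. The sample size, number of samples, and seed are arbitrary choices.

```python
import random


def chi_square(observed):
    """Chi-square statistic for a two-way table of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            if e > 0:  # a zero expected count implies a zero observed count
                total += (o - e) ** 2 / e
    return total


def sample_table(n, rng):
    """Draw n individuals from a population where the variables are unrelated."""
    table = [[0, 0], [0, 0]]
    for _ in range(n):
        table[rng.randrange(2)][rng.randrange(2)] += 1
    return table


rng = random.Random(0)  # fixed seed so the run is repeatable
values = [chi_square(sample_table(100, rng)) for _ in range(2000)]
share_positive = sum(v > 0 for v in values) / len(values)
mean_value = sum(values) / len(values)
print(share_positive, mean_value)
```

Most simulated samples yield a statistic above 0 even though the null hypothesis is true by construction, which is why the test compares the statistic with a critical value rather than with 0.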
Yates' Correction for Continuity
*Involves subtracting .5 from each absolute O - E value before these quantities are squared, divided by E, and summed across cells. *Research on this issue clearly indicates that Yates' correction should not be used because it tends to reduce the power of the chi-square test below what it would otherwise be while adding little control over type I errors.
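The correction as described above can be sketched as follows, with the uncorrected statistic alongside for comparison. The function name and the data are hypothetical, and the code simply subtracts .5 from every absolute difference, as the definition states.

```python
def chi_square(observed, yates=False):
    """Chi-square statistic, optionally with Yates' correction for continuity."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            diff = abs(o - e)
            if yates:
                diff -= 0.5  # subtract .5 from each absolute O - E value
            total += diff ** 2 / e  # square, divide by E, sum across cells
    return total


observed = [[30, 10], [10, 30]]
uncorrected = chi_square(observed)
corrected = chi_square(observed, yates=True)
print(uncorrected, corrected)
```

The corrected value is smaller than the uncorrected one, which is consistent with the loss of power noted above.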
Chi-Square test
*Is nonparametric in nature *Is designed to analyze the relationship between two variables using frequency information.
Chi-square statistic
*Reflects the overall differences between the observed and expected frequencies.
Chi-square distribution
*Takes on different shapes depending on how many degrees of freedom are associated with it.
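As a brief sketch, the degrees of freedom for a two-way contingency table are assumed here to follow the standard rule (rows - 1) x (columns - 1), which the notes do not state. The critical values are the standard chi-square table values for an alpha level of .05. The names are hypothetical.

```python
def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom for a two-way contingency table."""
    return (n_rows - 1) * (n_cols - 1)


# Critical values at alpha = .05 from a standard chi-square table
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

df_2x2 = degrees_of_freedom(2, 2)
df_2x3 = degrees_of_freedom(2, 3)
print(df_2x2, CRITICAL_05[df_2x2])
print(df_2x3, CRITICAL_05[df_2x3])
```

Because the distribution changes shape with the degrees of freedom, the critical value that defines a "large" chi-square statistic changes as well.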
Two-Way Contingency Tables
*The basis of analysis for the chi-square test is a contingency table. *Called a two-way table because it examines two variables. *A table with two rows and three columns, for example, is also referred to as a 2 X 3 (two by three) table.
Assumptions of the Chi-square test
*The chi-square test is typically used to analyze the relationship between two qualitative variables; however, it can also be applied when one or both variables are quantitative. -Assumptions: 1) The observations are independently and randomly sampled from the population of all possible observations. 2) The expected frequency for each cell is nonzero.
Observed Frequencies
*The entries within the cells represent the number of individuals who are characterized by the corresponding values of the variables and are referred to as this.
Expected frequencies
*The logic underlying the chi-square test focuses on this concept. -Expected frequencies (for example, the numbers of heads and tails expected from a fair coin) can be compared with the frequencies that are observed when you actually flip the coin.
Null and Alternative Hypotheses
*The null hypothesis states that these two variables are not related in the population *The alternative hypothesis states that there is a population relationship between them.
Marginal Frequencies
*The numbers in the last column and the bottom row indicate how many individuals have each separate characteristic. *Are the sums of the frequencies in the corresponding rows or columns.
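A minimal sketch of how marginal frequencies are obtained from the cells of a contingency table. The 2 X 3 table is hypothetical data.

```python
# Observed frequencies for a hypothetical 2 X 3 contingency table
observed = [
    [20, 30, 50],
    [30, 20, 50],
]

# Row marginals: sum across the columns of each row (the last column)
row_totals = [sum(row) for row in observed]

# Column marginals: sum down the rows of each column (the bottom row)
col_totals = [sum(col) for col in zip(*observed)]

# Overall sample size
n = sum(row_totals)

print(row_totals, col_totals, n)
```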
Sampling distribution of the chi-square statistic:
*The sample chi-square statistic values, if computed for all possible random samples of a given size, would constitute this.
Chi-Square tests of homogeneity
*When the marginal frequencies are random for one variable and fixed for the other, the analytical procedures are identical but the test is referred to as this.
Chi-Square tests of independence
*When the marginal frequencies for both variables are random, the test is referred to as this.