Two-Way Contingency Table Analysis Using Crosstabs


Cramér's V

If both the row and the column variables have more than two levels, phi can exceed 1 and therefore be hard to interpret. Cramér's V rescales phi so that it also ranges between 0 and 1: $V = \sqrt{\chi^2 / [N(k - 1)]}$, where $k$ is the smaller of the number of rows and the number of columns. For 2 × 2, 2 × 3, and 3 × 2 tables, $k - 1 = 1$, so phi and Cramér's V are identical.
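A minimal sketch of the rescaling, assuming the chi-square statistic and sample size have already been obtained. The function names and the example values are illustrative and do not come from SPSS output.

```python
import math

def phi_coefficient(chi2, n):
    """Phi as the square root of chi-square divided by sample size."""
    return math.sqrt(chi2 / n)

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V: phi rescaled by (k - 1), with k = min(rows, columns)."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi2 / (n * (k - 1)))

# Hypothetical 3 x 3 table with a perfect association (30 cases on the
# diagonal, 10 per cell), for which chi-square equals N * (k - 1) = 60.
phi_3x3 = phi_coefficient(60.0, 30)
v_3x3 = cramers_v(60.0, 30, 3, 3)

# Hypothetical 2 x 3 table: k - 1 = 1, so V reduces to phi.
phi_2x3 = phi_coefficient(12.5, 120)
v_2x3 = cramers_v(12.5, 120, 2, 3)
```

In the 3 × 3 case phi is larger than 1 while V stays within the 0 to 1 range, which is the interpretability problem that the rescaling addresses.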

Understanding a Two-Way Contingency Table Analysis

The same chi-square test for a two-way contingency table may be applied to studies investigating both the independence and homogeneity hypotheses. If the resulting chi-square value is significant and the table includes at least one variable with more than two levels, a follow-up test may be conducted. These follow-up tests are particularly crucial for studies assessing homogeneity of three or more proportions. The methods employed in these follow-up tests are conceptually similar to those used to conduct follow-up tests for a one-way ANOVA to evaluate differences in means among three or more levels of a factor.
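One common form of follow-up test is a set of pairwise 2 × 2 comparisons between rows. The sketch below assumes that approach, uses a Bonferroni correction (an assumption, since the text above does not name a specific correction), applies no continuity correction, and runs on invented counts for three groups of 30.

```python
import math
from itertools import combinations

def chi_square_2x2(row_a, row_b):
    """Pearson chi-square and p-value (df = 1) for two rows of counts."""
    table = [row_a, row_b]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With df = 1, chi-square is a squared standard normal variable,
    # so the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts: [not condescending, condescending] per group.
counts = {
    "young": [25, 5],
    "middle-aged": [20, 10],
    "elderly": [10, 20],
}

pairs = list(combinations(counts, 2))
alpha_per_test = 0.05 / len(pairs)  # Bonferroni-adjusted alpha

results = {}
for a, b in pairs:
    stat, p_value = chi_square_2x2(counts[a], counts[b])
    results[(a, b)] = (stat, p_value, p_value < alpha_per_test)
```

Each entry in `results` holds the chi-square value, its p-value, and whether the pair differs at the adjusted alpha level. A sequential procedure such as Holm's could be substituted for the Bonferroni step.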

Three types of studies can be analyzed with the use of two-way contingency tables

*Independence between variables.* Research participants are sampled and then measured on two response variables, the row and the column variables. In other words, in these studies, the total number of participants is controlled, but not the number of individuals in the columns or in the rows. The relationship between the row and the column variables in the population is being evaluated.
*Homogeneity of proportions.* Samples are drawn to represent populations of interest, either by sampling from different populations or by randomly assigning subjects to groups and treating them differently. The individuals in these samples are measured on a response variable. The rows might represent the different populations, and the columns might be the response categories on the response variable. For these studies, the total number of research participants in each row is controlled, but not the total number of participants in each column. The contingency table analysis evaluates whether the proportions of individuals in the levels of the column variable are the same for all populations (i.e., for all levels of the row variable).
*Unrelated classification.* Here, we control the total number of subjects in each row and in each column. Researchers infrequently design studies investigating unrelated classifications.

Two-Way Contingency Table

A two-way contingency table analysis evaluates whether a statistical relationship exists between two variables. A two-way contingency table consists of two or more rows and two or more columns. The rows represent the different levels of one variable, and the columns represent different levels of a second variable. Such a table is sometimes described as an r × c table, where r is the number of rows, and c is the number of columns. For example, a 3 × 4 contingency table is a table with a 3-level row variable and a 4-level column variable. The cells in the table, which are the combinations of the levels of the row and the column variables, contain frequencies. For a 3 × 4 contingency table, there are 12 cells with frequencies. A cell frequency for a particular row and column represents the number of individuals in a study who can be cross-classified as belonging to that particular level of the row variable and that particular level of the column variable. Analyses of two-way contingency tables focus on these cell frequencies to evaluate whether the row and column variables are related.
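To make the role of the cell frequencies concrete, here is a minimal sketch of the Pearson chi-square statistic computed from an r × c table of counts. The function name and the 2 × 2 example counts are hypothetical. Expected frequencies are taken as row total × column total / N, the values implied when the row and column variables are unrelated.

```python
def chi_square_statistic(table):
    """Return expected frequencies, Pearson chi-square, and df for a table.

    `table` is a list of rows, each a list of observed cell frequencies.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected frequency of a cell under no relationship between variables.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, stat, df

# Hypothetical 2 x 2 table of observed frequencies.
observed = [[20, 30], [30, 20]]
expected, chi2, df = chi_square_statistic(observed)
```

The larger the gaps between observed and expected frequencies, the larger the statistic, which is then evaluated against a chi-square distribution with (r − 1)(c − 1) degrees of freedom.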

Two-Way Contingency Table Analysis: Independence between Variables

Application Example: Carrie is interested in assessing the relationship between religion and occupation for men. She samples a group of 2,000 men between the ages of 30 and 50 and asks them to complete a demographic questionnaire. On the basis of their answers, she classifies each man as practicing one of four religions (Protestant, Catholic, Jewish, and Other) and employed in one of six occupations (Professional, Business-Management, Business-Nonmanagement, Skilled Tradesman, Laborer, and Unemployed). Religion and occupation are the row and the column variables, and the frequencies are the number of men who are classified as belonging to the 24 (4 × 6) combinations of religious and occupational categories. Using the standard structure, Carrie would have 2,000 cases to represent the 2,000 men. Each man would have scores on at least two variables, religion with four levels and occupation with six levels. With the weighted cases structure, Carrie would have 24 cases, representing the 24 combinations of the two variables. Each case would have scores on at least three variables: the two research variables of religion and occupation and a weight variable. A score on the weight variable for a case would represent the number of individuals who are classified as belonging to the cell represented by that case.

Two-Way Contingency Table Analysis: Homogeneity of Proportions

Application Example: Claude wants to determine whether young men are unjustifiably more likely to treat elderly people in a condescending manner. To address this hypothesis, he recruits a young woman from his class who indicates that both her mother and her grandmother are physically, emotionally, and intellectually active. The daughter, mother, and grandmother are 20, 43, and 72 years old, respectively. Claude gains the cooperation of all three women to participate as confederates in his study. The three women take a two-hour lesson to learn a computer game called Wits. At the end of the two hours, they pass a criterion indicating total mastery of the game. In addition, they learn a script so that they can all act in the same manner as confederates in the study. Claude recruits 90 male college students between the ages of 17 and 22 to participate in his study. All 90 males take a 20-minute lesson to learn the game of Wits and pass a test indicating minimal knowledge of the game, although on the basis of their test scores none of the males show total mastery. Next, he tells each student that he is now going to instruct a woman who has no experience with the game. The students are then asked to switch seats from the chair in front of the computer to the one behind it. Each student is then introduced to one of the women, as determined by random assignment. A third of the men meet the daughter, a third meet the mother, and a third meet the grandmother. The men next give verbal instructions from their seats on how to win at Wits. Regardless of the instructions given by the men, all three women show exactly the same rate of improvement for all 90 participants. All sessions are videotaped. The tapes show only the faces of the male participants and exclude all verbal comments made by the women so that observers of the tape are completely blind to the age of the woman being "trained." 
A reliable judge observes the tapes and concludes whether each male college student acts condescendingly or not. For this study, the row variable is age of women with three levels (young, middle-aged, and elderly) and the column variable is condescension with two levels (not condescending and condescending). The frequencies are the number of participants who are classified as belonging to the six combinations of age and condescension (3 levels × 2 levels).

Assumptions Underlying a Two-Way Contingency Table Analysis

Assumption 1: The observations for a two-way contingency table analysis are independent of each other. To meet this assumption, studies should be designed to prevent dependency in the data. If this assumption is violated, the test is likely to yield inaccurate results.
Assumption 2: Two-way contingency table analyses yield a test statistic that is approximately distributed as a chi-square when the sample size is relatively large. There is no simple answer to the question of what sample size is large enough. However, to answer this question, the size of the expected cell frequencies rather than the total sample size should be examined. For tables with two rows and two columns, there is probably little reason to worry if all the expected frequencies are greater than or equal to 5. For large tables, if more than 20% of the cells have expected frequencies that are less than 5, you should be concerned about the validity of the results.
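The expected-frequency guideline can be checked directly from a table of counts. The sketch below is one possible way to do so. The helper names, the default threshold argument, and the example counts are all assumptions made for illustration.

```python
def expected_counts(table):
    """Expected cell frequencies: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def small_expected_summary(table, threshold=5):
    """Return the smallest expected count and the share of cells below threshold."""
    cells = [e for row in expected_counts(table) for e in row]
    n_small = sum(1 for e in cells if e < threshold)
    return min(cells), n_small / len(cells)

# Hypothetical 2 x 2 table with one sparse column.
observed = [[2, 8], [3, 87]]
smallest, share_small = small_expected_summary(observed)

# For a 2 x 2 table, any expected count below 5 is a warning sign;
# for larger tables, a share above 0.20 is the concern.
```

Note that the check uses expected counts, not observed ones, so a table can contain small observed frequencies and still satisfy the guideline.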

SPSS data file for a two-way contingency table analysis

Regardless of the design of the study, an SPSS data file for a two-way contingency table analysis can be structured in one of two ways. With the standard method, the SPSS data file contains as many cases as individuals. There are two variables, each of which can have two or more values that represent the two or more categories for that variable. With the weighted cases method, the SPSS data file contains as many cases as category combinations (cells) across the two focal variables. For example, there are 12 cases for a 3 × 4 contingency table. For the weighted cases method, there are three variables. Two are the focal variables with values representing their categories, and the third variable is a weight variable containing frequencies for the category combinations (cells) across the two focal variables.
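The two structures carry the same information, which the following sketch illustrates outside SPSS. The records, category labels, and function names are hypothetical. The standard structure is modeled as one (row, column) pair per individual, and the weighted cases structure as one (row, column, weight) triple per cell.

```python
from collections import Counter

# Standard structure: one record per individual (hypothetical data).
standard = (
    [("A", "yes")] * 3
    + [("A", "no")] * 2
    + [("B", "yes")] * 1
    + [("B", "no")] * 4
)

def to_weighted(cases):
    """Collapse individual records into one weighted record per cell."""
    counts = Counter(cases)
    return [(row, col, weight) for (row, col), weight in sorted(counts.items())]

def to_standard(weighted_cases):
    """Expand weighted records back into one record per individual."""
    return [
        (row, col)
        for row, col, weight in weighted_cases
        for _ in range(weight)
    ]

weighted = to_weighted(standard)
```

Here `weighted` has four records for the 2 × 2 table, and the weights sum to the ten individuals. One caveat of this sketch: a cell that no individual falls into does not appear in the weighted list at all, whereas a hand-built weighted file could list it with a weight of 0.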

Effect Size Statistics for a Two-Way Contingency Table Analysis

SPSS provides a number of indices that assess the strength of the relationship between row and column variables. They include the contingency coefficient, phi, Cramér's V, lambda, and the uncertainty coefficient. We will focus our attention on two of these coefficients, phi and Cramér's V. The phi coefficient for a 2 × 2 table is a special case of the Pearson product-moment correlation coefficient. Because most behavioral researchers have a good working knowledge of the Pearson product-moment correlation coefficient, they are likely to feel comfortable using its derivative, the phi coefficient. Phi, $\phi$, is a function of the Pearson chi-square statistic, $\chi^2$, and sample size, $N$: $\phi = \sqrt{\chi^2 / N}$. Like the Pearson product-moment correlation coefficient, phi ranges in value from −1 to +1. *Values close to 0 indicate a very weak relationship, and values close to 1 indicate a very strong relationship.* If the row and column variables are qualitative, the sign of phi is not meaningful and any negative phi values can be changed to positive values without affecting their meaning. By convention, phis of .10, .30, and .50 represent small, medium, and large effect sizes, respectively. However, what counts as a small versus a large phi should depend on the area of investigation.
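The link between phi and the Pearson correlation can be verified numerically for a 2 × 2 table. The sketch below assumes the rows and columns are each coded 0 and 1, uses the signed shortcut formula (ad − bc) divided by the square root of the product of the four marginal totals, and runs on hypothetical counts.

```python
import math

def phi_2x2(table):
    """Signed phi for a 2 x 2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

def pearson_r_from_2x2(table):
    """Pearson r after expanding the table into 0/1-coded pairs."""
    xs, ys = [], []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            xs.extend([i] * count)  # row level coded 0 or 1
            ys.extend([j] * count)  # column level coded 0 or 1
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical 2 x 2 table; its Pearson chi-square is 4.0 with N = 100.
observed = [[20, 30], [30, 20]]
phi = phi_2x2(observed)
r = pearson_r_from_2x2(observed)
phi_from_chi2 = math.sqrt(4.0 / 100)
```

The signed phi and the correlation agree, and the chi-square-based value matches their magnitude. The sign depends only on how the levels happen to be coded, which is why it carries no meaning for qualitative variables.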

