Quiz 3: Ch 9 - Chi-Square Test of Independence


Conditional Probability

the probability that one event (A) happens given that another event (B) is already known to have occurred. INVOLVES A COMBINATION OF AN INNER CELL NUMBER (NOT IN THE OUTSIDE ROWS OF THE TABLE) AND AN OUTER CELL NUMBER
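A minimal sketch of the inner-cell-over-margin idea, using a hypothetical aspirin/heart-attack table (all counts invented for illustration):

```python
# Hypothetical 2x2 contingency table of counts:
#                  aspirin   no aspirin | row total
# heart attack        10         20    |    30
# no heart attack     90         80    |   170
# col total          100        100    |   200 (table sum)

# Conditional probability: an inner cell divided by the margin
# total of the event being conditioned on
p_attack_given_aspirin = 10 / 100   # P(heart attack | aspirin) = 0.10
p_aspirin_given_attack = 10 / 30    # P(aspirin | heart attack) ~ 0.33
```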

Marginal Probability

the values in the margins of a joint probability table that give the probabilities of each event separately *depends only on totals found in the margins of the table* INVOLVES NUMBERS FOUND ONLY IN THE OUTERMOST ROW AND COLUMN TOTALS. Example marginal probabilities from a table of 30 total subjects: 22/30, 8/30, 23/30, or 7/30
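For contrast, a quick sketch of marginal probabilities using the same hypothetical aspirin/heart-attack counts as above (margin totals only, no inner cells):

```python
# Hypothetical margins: row totals 30 and 170, column totals 100 and 100, n = 200
p_heart_attack = 30 / 200    # P(heart attack) = 0.15
p_aspirin = 100 / 200        # P(aspirin) = 0.50
```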

Expected frequencies, alpha, degrees of freedom, chi critical value, chi distribution, and test statistic

*Expected frequencies:* *Expected value = (row sum * column sum) / table sum.*
*Alpha:* 0.05 (the chi-square test is ALWAYS right-tailed)
*Degrees of freedom:* df = (number of rows − 1)(number of columns − 1)
---- get the chi-square critical value from df and alpha ---- get the chi-square distribution curve from df
*Test statistic:* we calculate the χ² statistic to test whether the discrepancies between observed and expected counts are greater than expected *by chance:* χ² = Σ (Observed − Expected)² / Expected
*Decision rule:* REJECT NULL if χ² > critical (then p-value < α); FAIL TO REJECT NULL if χ² < critical (then p-value > α)
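A minimal end-to-end sketch of these formulas in Python (the 2×2 counts are invented for illustration; `numpy` and `scipy` are assumed to be available):

```python
import numpy as np
from scipy import stats

observed = np.array([[18, 12],    # hypothetical observed counts
                     [ 6, 24]])

# Expected value = (row sum * column sum) / table sum, for every cell at once
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()

df = (observed.shape[0] - 1) * (observed.shape[1] - 1)   # (rows-1)(cols-1) = 1
critical = stats.chi2.ppf(1 - 0.05, df)                  # right-tail cutoff, ~3.84

chi2_stat = ((observed - expected) ** 2 / expected).sum()  # = 10.0 for these counts

print(chi2_stat > critical)   # True -> reject the null (p-value < alpha)
```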

contingency analysis

estimates and tests for *an association between two or more categorical variables* use when we want to determine to what extent one variable is "contingent" on the other. Analysis of contingency data can be used to answer questions such as the following:
■ Do bright and drab butterflies differ in the probability of being eaten?
■ How much more likely to drink are smokers than nonsmokers?
■ Are heart attacks less likely among people who take aspirin daily?
At the heart of contingency analysis is the investigation of the *independence of variables*: recall that two events are independent when the occurrence of one does not change the probability of the other. ex. chi-square test of independence

chi-square test for independence

hypothesis-testing procedure that examines whether the distribution of frequencies over the categories of one nominal variable is unrelated to the distribution of frequencies over the categories of a second nominal variable. TWO CATEGORICAL VARIABLES! *LARGER SAMPLES HAVE MORE POWER* *Tests the frequencies of outcomes in a contingency table to see whether two categorical variables are related, i.e., are or are not independent.* Compares the observed counts (from a random sample) in a contingency table to what we would expect if the *two variables were independent*
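In practice the whole test can be run with one library call; a sketch using `scipy.stats.chi2_contingency` on invented counts (`correction=False` turns off the Yates continuity correction so the result matches the hand formula above):

```python
from scipy import stats

table = [[18, 12],   # hypothetical observed counts
         [ 6, 24]]

chi2, p, dof, expected = stats.chi2_contingency(table, correction=False)
print(chi2, p, dof)   # chi2 = 10.0, p ~ 0.0016, dof = 1
print(expected)       # counts expected if the two variables were independent
```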

Assumptions for Chi-square test of independence

1. Data are from a random sample.
2. Cases must be independent observations; therefore, the sum of all cell frequencies in the table must equal the number of subjects in the experiment.
3. Must have a sufficient sample size: • All expected counts ≥ 1 • At least 80% of expected counts ≥ 5 (a quick check is sketched below)
If these rules are not met and the table is bigger than 2 × 2, then two or more row categories (or two or more column categories) can be combined to produce larger expected frequencies. This should be done carefully, though, so that the resulting categories are still meaningful.
Expected value: we calculate the expected value for each cell in the contingency table using the following formula: *Expected value = (row sum * column sum) / table sum.*
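A small sketch of the sample-size checks, assuming the expected counts have already been computed with the formula above (the numbers here are hypothetical):

```python
import numpy as np

expected = np.array([[12.0, 18.0],   # hypothetical expected counts
                     [12.0, 18.0]])

rule1 = (expected >= 1).all()            # all expected counts >= 1
rule2 = (expected >= 5).mean() >= 0.80   # at least 80% of expected counts >= 5

print(rule1 and rule2)   # True -> the chi-square approximation is reasonable
```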

Chi-square test of independence steps:

1. State the hypotheses.
2. Calculate expected counts (under H0) and check assumptions.
3. State alpha, degrees of freedom, and the critical chi-square value; calculate the test statistic (χ²) and compare it to the critical value.
4. Make a conclusion in context, citing the appropriate statistics (χ², df, p-value/alpha).
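The four steps strung together in one hypothetical sketch (invented counts; `numpy` and `scipy` assumed):

```python
import numpy as np
from scipy import stats

# Step 1: H0: the two variables are independent; Ha: they are associated.
observed = np.array([[18, 12], [6, 24]])   # hypothetical 2x2 counts

# Step 2: expected counts under H0, plus the assumption checks
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()
assert (expected >= 1).all() and (expected >= 5).mean() >= 0.80

# Step 3: alpha, df, critical value, and the test statistic
alpha = 0.05
df = (observed.shape[0] - 1) * (observed.shape[1] - 1)
critical = stats.chi2.ppf(1 - alpha, df)
chi2_stat = ((observed - expected) ** 2 / expected).sum()

# Step 4: conclusion in context, citing chi2, df, and the p/alpha relation
p_value = stats.chi2.sf(chi2_stat, df)
print(f"chi2 = {chi2_stat:.2f}, df = {df}, p = {p_value:.4f}, "
      f"reject H0: {chi2_stat > critical}")
```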

chi-square test for independence vs independent samples t-test

Chi-Square Test for independence: allows you to test whether there is a statistically significant association between two categorical variables. When you reject the null hypothesis of a chi-square test for independence, it means there is a significant association between the two variables. t-Test for a difference in means: allows you to test whether there is a statistically significant difference between two population means. When you reject the null hypothesis of a t-test for a difference in means, it means the two population means are not equal. The easiest way to know whether to use a chi-square test or a t-test is to simply *look at the types of variables you are working with.* *If you have two variables that are both categorical (i.e., they can be placed in categories like male/female or republican/democrat/independent), then you should use a chi-square test.*
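A sketch of the "look at your variable types" rule in code; both data sets are invented for illustration:

```python
from scipy import stats

# Two categorical variables (counts in a table) -> chi-square test of independence
party_by_sex = [[30, 20, 10],   # hypothetical: rows = male/female,
                [25, 15, 20]]   # columns = republican/democrat/independent
chi2, p_chi, dof, _ = stats.chi2_contingency(party_by_sex)

# One numeric variable measured in two groups -> independent-samples t-test
group_a = [5.1, 4.8, 5.6, 5.0, 4.9]   # hypothetical measurements
group_b = [5.9, 6.1, 5.7, 6.3, 6.0]
t_stat, p_t = stats.ttest_ind(group_a, group_b)
```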

Null and Alternative hypothesis for Chi-square test of independence

H0: there is NO association between the two variables (they are independent). Ha: there IS an association between the two variables (they are not independent). *write these in context* An equivalent way of writing the same hypotheses: H0: The two variables are independent. Ha: The two variables are not independent (i.e., they are associated).

odds ratio for categorical analysis (effect size)

IS KNOWN AS THE EFFECT SIZE FOR THE TEST OF INDEPENDENCE! *Odds ratio = ODDS of category 1 / ODDS of category 2*, where ODDS = p / (1 − p) (probability of success over probability of failure). *Calculate the odds for each group, then divide one by the other.* The interpretation can be written two ways, and both are acceptable:
--- if the final odds ratio is greater than 1, state "the odds of _____ improving are _____ times *higher* for the (numerator) group"
--- if the final odds ratio is a decimal (less than one), SUBTRACT THAT DECIMAL FROM ONE and state "the odds of _____ improving are _____ times *lower* for the (numerator) group"
If the *odds ratio is equal to one*, then the odds of success in the response variable are independent of treatment; the odds of success are the same for both groups. If the *odds ratio is greater than one*, then the event has higher odds in the first group than in the second group. Alternatively, if the *odds ratio is less than one*, then the odds are higher in the second group. The odds ratio measures the magnitude of association between two categorical variables when each variable has only two categories. *The odds of success are the probability of success divided by the probability of failure.* *The odds ratio is the odds of success in one group divided by the odds of success in a second group.*
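A minimal worked sketch of the odds ratio with invented counts:

```python
# Hypothetical 2x2 table: rows = treatment/control, columns = improved/not improved
# treatment: 40 improved, 10 did not (n = 50)
# control:   25 improved, 25 did not (n = 50)

p1 = 40 / 50                 # P(improve | treatment) = 0.8
odds1 = p1 / (1 - p1)        # 4.0

p2 = 25 / 50                 # P(improve | control) = 0.5
odds2 = p2 / (1 - p2)        # 1.0

odds_ratio = odds1 / odds2   # 4.0 -> the odds of improving are 4 times
                             # higher for the treatment (numerator) group
```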

decision rule for chi-square test for independence

REJECT NULL if χ² > critical (then p-value < α). FAIL TO REJECT NULL if χ² < critical (then p-value > α). *state the conclusion in context, citing χ², the p-value/α relation, and df*
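The two forms of the rule always agree; a quick sketch using the hypothetical χ² = 10.0 from the earlier example:

```python
from scipy import stats

alpha, df = 0.05, 1
chi2_stat = 10.0                           # hypothetical test statistic

critical = stats.chi2.ppf(1 - alpha, df)   # ~3.84
p_value = stats.chi2.sf(chi2_stat, df)     # ~0.0016

# chi2 > critical exactly when p < alpha: the same decision either way
print(chi2_stat > critical, p_value < alpha)   # True True -> reject H0
```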

chi-square goodness of fit test vs chi-square test of independence

The χ² contingency test is a special case of the χ² goodness-of-fit test. You may have noticed that, once we specified the expected values, the χ² contingency test was very similar to the χ² goodness-of-fit test introduced in Chapter 8. This resemblance is no accident, because the χ² contingency test is a special application of the more general goodness-of-fit test in which the probability model being tested is the independence of the variables. The number of degrees of freedom for the contingency test obeys the same rules as those for the goodness-of-fit test. *ONE CATEGORICAL VARIABLE: GOODNESS-OF-FIT TEST. TWO CATEGORICAL VARIABLES: TEST OF INDEPENDENCE*
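Side by side in code (hypothetical counts throughout): `scipy.stats.chisquare` for one categorical variable, `scipy.stats.chi2_contingency` for two.

```python
from scipy import stats

# One categorical variable -> goodness-of-fit test
observed = [18, 22, 20, 40]        # hypothetical counts in four categories
expected = [25, 25, 25, 25]        # H0: all four categories equally likely
gof_stat, gof_p = stats.chisquare(observed, f_exp=expected)

# Two categorical variables -> test of independence
table = [[18, 12], [6, 24]]        # hypothetical 2x2 counts
ind_stat, ind_p, dof, exp = stats.chi2_contingency(table)
```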

key points and end of chapter summary:

■ The odds of success are the probability of success divided by the probability of failure, where "success" refers to the outcome of interest.
■ The odds ratio is the odds of success in one of two groups (the treatment group, if one is present) divided by the odds of success in the second group (the control group, if one is present). The odds ratio is used to quantify the magnitude of association between two categorical variables, each of which has two categories.
■ The χ² contingency test makes it possible to test the null hypothesis that two categorical variables are independent.
■ The sampling distribution of the χ² statistic under the null hypothesis is approximately χ² distributed with (r − 1)(c − 1) degrees of freedom. The χ² approximation works well, provided that two rules are met: no more than 20% of the expected frequencies can be less than five, and none can be less than one.

