Stats 1430 Chapter 4: Two-Way Tables
What do we call the 4 boxes inside a two-way table?
Cell
Is X related to Y? Method 2 (Categorical Variables)
-Compare a conditional distribution to the marginal (overall) distribution.
~Compare row 1 and row 3
~Or compare row 2 and row 3
Is X related to Y? Method 1 (Categorical Variables)
-Compare the conditional distributions
~If the same = no relationship
~If different = there is a relationship
-Use % to describe it
-Use conditional distribution
-Look at row 1 and row 2
Separate (Marginal) Distribution
-In the margins of the table
Marginal distribution of Ad:
-Saw: 120/200 = 0.60
-Didn't see: 80/200 = 0.40
-0.60 + 0.40 = 1.00
Marginal distribution of Purchase:
-Yes: 60/200 = 0.30
-No: 140/200 = 0.70
-0.30 + 0.70 = 1.00
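The marginal arithmetic can be checked with a short Python sketch. The counts are the ones from the ad/purchase survey in these notes; the dictionary layout and variable names are only an assumed way of storing the table.

```python
# Counts from the ad/purchase survey
# (rows: saw the ad or not; columns: purchase or no purchase).
counts = {
    "saw ad": {"purchase": 36, "no purchase": 84},
    "didn't see ad": {"purchase": 24, "no purchase": 56},
}

# Grand total (lower right-hand corner of the table).
n = sum(sum(row.values()) for row in counts.values())

# Marginal distribution of Ad: each row total divided by the grand total.
ad_marginal = {ad: sum(row.values()) / n for ad, row in counts.items()}

# Marginal distribution of Purchase: each column total divided by the grand total.
purchase_marginal = {
    outcome: sum(row[outcome] for row in counts.values()) / n
    for outcome in ("purchase", "no purchase")
}

print(ad_marginal)
print(purchase_marginal)
```

Each marginal distribution looks at one variable at a time, so each one sums to 1 on its own.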
Joint ("and") Distribution
-Overall percentage in each cell
-The cell percentages sum to one (shown in the lower right-hand corner)
Exploring relationships between categorical variables
-Should we market BMWs more to men or women? Or is it about the same?
-Which age group purchases more iPods?
-Is political affiliation related to gender?
-If a child's parents both smoke, is the child more likely to smoke also?
-Does taking aspirin reduce heart attacks?
Simpson's Paradox
Def: When a relationship appears in a two-way table but reverses when a third, confounding variable is added
-Watch for Simpson's Paradox
~Break down results further, not just by one variable
~Look for lurking (confounding) variables and collect data on them also
~Which is the most informed data set? The one that is further broken down, up to a point
~This is a common phenomenon in the real world
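A small sketch of how the reversal can happen. The counts below are HYPOTHETICAL, invented only for illustration; they do not come from these notes or from any real study.

```python
# HYPOTHETICAL (successes, trials) counts, made up for illustration only.
# Two treatments, A and B, each tried in two groups (the third variable).
data = {
    "group 1": {"A": (8, 10), "B": (70, 100)},
    "group 2": {"A": (30, 100), "B": (2, 10)},
}

# Success rate of each treatment within each group.
within = {
    group: {t: s / k for t, (s, k) in arms.items()}
    for group, arms in data.items()
}

# Success rate of each treatment when the groups are lumped together.
overall = {}
for t in ("A", "B"):
    successes = sum(arms[t][0] for arms in data.values())
    trials = sum(arms[t][1] for arms in data.values())
    overall[t] = successes / trials

for group, rates in within.items():
    print(group, {t: round(r, 3) for t, r in rates.items()})
print("overall", {t: round(r, 3) for t, r in overall.items()})
```

In this made-up table A has the higher rate inside each group, yet B has the higher rate overall, because most of A's trials fall in the group where success is rare for both treatments.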
Whenever it says this is... it is what comes first
EX: Probability that a female student is wearing a backpack. P(B|F)
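In the notation, the event being asked about (B, wearing a backpack) is written first, and the group we restrict to (F, female students) goes after the bar. As a reminder, the standard definition of conditional probability (not stated in these notes) is:

```latex
P(B \mid F) = \frac{P(B \text{ and } F)}{P(F)}
```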
What type of distribution shows all the "AND" cells (percentages)?
Joint Distribution
What shows one single variable at a time?
Marginal Distribution
Two Way Table Example (Did they see the AD in the Newspaper and make a purchase?)
              Purchase   No Purchase   Total
Saw Ad           36          84         120
Didn't See       24          56          80
Total            60         140         200
Joint Distribution of Ad and Purchase
Saw Ad?   Purchase      No Purchase    Total
Yes       36/200=.18    84/200=.42     120/200=.60
No        24/200=.12    56/200=.28     80/200=.40
Total     60/200=.30    140/200=.70    200/200=1.0
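The joint distribution is just every cell count divided by the grand total. A minimal Python sketch, using the survey counts from these notes (the tuple-keyed dictionary is an assumed layout):

```python
# Observed counts from the ad/purchase table (200 shoppers in total).
counts = {
    ("saw ad", "purchase"): 36,
    ("saw ad", "no purchase"): 84,
    ("didn't see ad", "purchase"): 24,
    ("didn't see ad", "no purchase"): 56,
}

n = sum(counts.values())  # grand total

# Joint ("and") distribution: each cell count divided by the grand total.
joint = {cell: count / n for cell, count in counts.items()}

for (ad, purchase), p in joint.items():
    print(f"{ad} and {purchase}: {p:.2f}")

# The four joint percentages together account for everyone surveyed.
print(f"sum of all cells: {sum(joint.values()):.2f}")
```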
          Yes   No   Total
Male       10   20     30
Female     30   40     70
Total      40   60    100
Q: What is the joint distribution of gender and opinion?
A: Male and yes 10/100, male and no 20/100, female and yes 30/100, female and no 40/100; n=100
Q: What is the marginal distribution for gender?
A: 30/100=0.3 male, 70/100=0.7 female; n=100
Q: What is the marginal distribution for opinion?
A: 40/100=0.4 yes, 60/100=0.6 no; n=100
Q: What is the conditional distribution of gender given yes?
A: 10/40=0.25 male, 30/40=0.75 female; n=40
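The last question conditions on the "yes" column, so the denominator becomes the "yes" total (40) rather than the grand total (100). A short Python sketch of that step; the helper name `conditional_on_column` is made up for this example.

```python
# Gender-by-opinion counts from the table in these notes.
counts = {
    "male": {"yes": 10, "no": 20},
    "female": {"yes": 30, "no": 40},
}

def conditional_on_column(table, column):
    """Distribution of the row variable among the cases in one column."""
    column_total = sum(row[column] for row in table.values())
    return {label: row[column] / column_total for label, row in table.items()}

gender_given_yes = conditional_on_column(counts, "yes")
print(gender_given_yes)  # {'male': 0.25, 'female': 0.75}
```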
Seeing ads vs. making a purchase
-Survey of 200 people leaving a retail outlet
~Did they see the ad in the newspaper? (Y/N)
~Did they make a purchase? (Y/N)
-Overall distribution:
1) 36 saw the ad and made a purchase (36/200=18%)
2) 84 saw the ad but did not make a purchase (84/200=42%)
3) 24 didn't see the ad and made a purchase (24/200=12%)
4) 56 didn't see the ad and did not make a purchase (56/200=28%)
Saw Ad?   Purchase      No Purchase    Total
Yes       36/200=.18    84/200=.42     120/200=.60
No        24/200=.12    56/200=.28     80/200=.40
Total     60/200=.30    140/200=.70    200/200=1.0
Conditional Distribution
1) Conditional distribution of purchase for those who saw the ad (Purchase | Ad), n=120
Purchase: 36/120=0.3
No Purchase: 84/120=0.7
2) Conditional distribution of purchase for those who didn't see the ad (Purchase | No Ad), n=80
Purchase: 24/80=0.3
No Purchase: 56/80=0.7
**The ad had no effect because the percentages didn't change**
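Both comparison methods from these notes (conditional vs. conditional, and conditional vs. marginal) can be sketched in Python for the ad survey. The function names and the tolerance used to call two distributions "the same" are assumptions made for this example.

```python
import math

# Counts from the ad/purchase survey in these notes.
counts = {
    "saw ad": {"purchase": 36, "no purchase": 84},
    "didn't see ad": {"purchase": 24, "no purchase": 56},
}

def row_conditional(row):
    """Conditional distribution within one row: each cell / that row's total."""
    total = sum(row.values())
    return {outcome: count / total for outcome, count in row.items()}

conditionals = {ad: row_conditional(row) for ad, row in counts.items()}

# Marginal (overall) distribution of purchase: column totals / grand total.
n = sum(sum(row.values()) for row in counts.values())
purchase_marginal = {
    outcome: sum(row[outcome] for row in counts.values()) / n
    for outcome in ("purchase", "no purchase")
}

def same_distribution(p, q, tol=1e-9):
    """True if two distributions agree on every outcome (within tol)."""
    return all(math.isclose(p[k], q[k], abs_tol=tol) for k in p)

# Method 1: compare the two conditional distributions with each other.
method_1_same = same_distribution(conditionals["saw ad"], conditionals["didn't see ad"])

# Method 2: compare each conditional distribution with the marginal one.
method_2_same = all(same_distribution(c, purchase_marginal) for c in conditionals.values())

print(conditionals)
print("no relationship" if method_1_same and method_2_same else "relationship")
```

Here both rows give 0.3 / 0.7, matching the overall 0.30 / 0.70, so the two methods agree. With real sample data the percentages rarely match exactly, and judging how large a difference matters is a separate question.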
Sample Space
The set of all possible outcomes of a random process.
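As a small illustration, here is the sample space for a random process that is not in these notes and is chosen only as an example: flipping a coin twice.

```python
from itertools import product

# Hypothetical random process (an assumption for illustration): two coin flips.
# Each outcome is an ordered pair such as ('H', 'T').
sample_space = set(product("HT", repeat=2))

print(sorted(sample_space))
print(len(sample_space), "possible outcomes")
```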