Chapter 8
In a study of red/green color blindness, 500 men and 2100 women are randomly selected and tested. Among the men, 45 have red/green color blindness. Among the women, 6 have red/green color blindness (based on data from USA Today). Is there sufficient evidence to support the claim that men have a higher rate of red/green color blindness than women? Use a 0.01 significance level.
a) What is the parameter of interest? Provide notation. What is the hypothesis of interest?
b) What is the appropriate test? Explain. What is the name and value of the test statistic?
c) What is the p-value?
d) What do you conclude?
e) Construct the 98% confidence interval for the difference between the color blindness rates of men and women. Does there appear to be a substantial difference?
1. Men: x1 = 45, n1 = 500; x̃1 = 47, ñ1 = 504 --> p̃1 = 47/504 = 0.09325
2. Women: x2 = 6, n2 = 2100; x̃2 = 8, ñ2 = 2104 --> p̃2 = 8/2104 = 0.00380
Use the p̃-bar method: p̃-bar = (x1+x2+2) / (n1+n2+4) = 53/2604 = 0.02035
a) p1: true population proportion of men with red/green colour blindness
p2: true population proportion of women with red/green colour blindness
Ho: p1 = p2 vs Ha: p1 > p2
Equivalently, Ho: p1-p2 = 0 vs Ha: p1-p2 > 0
'do' is the value of the difference in the population under Ho. Here do = 0
One-sided right-tail test
b) Z-test by CLT (approx). Use the p̃ method
Z-statistic = [(p̃1-p̃2) - do] / sqrt[p̃-bar(1-p̃-bar)(1/ñ1 + 1/ñ2)] = 12.77
c) One-sided right-tail test
Pv = P(Z > Z-stat) = P(Z > 12.77) = normcdf(12.77,1e99,0,1)
Pv is on the order of 1e-37, so Pv ~ 0
d) α = 0.01
Since Pv < α we reject Ho, so conclude that the proportion of men with colour blindness is greater than the proportion of women with colour blindness (Ha)
e) 2-PropZInt with C-level 0.98
Only adjust for proportion questions: enter x̃1 as x1, x̃2 as x2, ñ1 as n1, ñ2 as n2
(0.05916, 0.11975)
^^ Since the interval does not include 0 and is entirely positive, the difference is significant, with p1-p2 > 0. The proportion of red/green colour blindness is higher in men
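The steps above can be checked off the calculator. This is a minimal sketch that assumes the adjustment exactly as written in these notes (add 2 successes and 4 trials to each sample, and p̃-bar = (x1+x2+2)/(n1+n2+4)). The function name `two_prop_z` is made up for illustration and is not a calculator command.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z(x1, n1, x2, n2, conf=0.98, d0=0.0):
    """Right-tail z-test and CI for p1 - p2 using the adjusted counts from the notes."""
    # Adjusted counts (assumption taken from the notes): +2 successes, +4 trials per sample.
    xt1, nt1 = x1 + 2, n1 + 4
    xt2, nt2 = x2 + 2, n2 + 4
    pt1, pt2 = xt1 / nt1, xt2 / nt2
    # Pooled p-bar exactly as the notes write it.
    p_bar = (x1 + x2 + 2) / (n1 + n2 + 4)
    se_test = sqrt(p_bar * (1 - p_bar) * (1 / nt1 + 1 / nt2))
    z = (pt1 - pt2 - d0) / se_test
    # Right-tail area; for a z this large it evaluates to 0.0 in floating point.
    p_value = 1 - NormalDist().cdf(z)
    # The interval uses the unpooled standard error.
    se_ci = sqrt(pt1 * (1 - pt1) / nt1 + pt2 * (1 - pt2) / nt2)
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    diff = pt1 - pt2
    return z, p_value, (diff - z_star * se_ci, diff + z_star * se_ci)

z, p_value, (low, high) = two_prop_z(45, 500, 6, 2100)
print(round(z, 2), p_value < 0.01, round(low, 4), round(high, 4))
```

The test uses the pooled standard error because Ho assumes p1 = p2, while the interval uses the unpooled one because it makes no such assumption.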
Tree diagram difference?
If any sigma is unknown, use the T-interval regardless of n
An experiment is designed to test the effectiveness of paroxetine for treating bipolar depression. Subjects were measured using the Hamilton Depression Scale; the data summary is given below:
Placebo: n = 43, x-bar = 21.57, s = 3.87
Paroxetine: n = 33, x-bar = 20.38, s = 3.91
The objective is to test the claim that the placebo and paroxetine groups come from populations with the same mean. Using a significance level of 5%, test the above claim.
Let placebo group be index 1 (μ1 = average depression score for placebo group)
Let treatment group be index 2 (μ2 = average depression score for treatment group)
Ho: μ1 = μ2 (from the question)
Ha: μ1 ≠ μ2 (by opposite)
Equivalently, Ho: μ1-μ2 = 0 and Ha: μ1-μ2 ≠ 0
'do' is the value of the difference under the null hypothesis. Here do = 0
Pool test: (3.91/3.87)^2 = 1.02
1.02 is <= 2 so it is pooled
2-Sample-T-Test by tree diagram
t-stat = 1.32275 with df = 74 and p = 0.190
Since 0.190 > 0.05, we fail to reject Ho: there is not sufficient evidence that the group means differ
Also validate via CI
2-Sample-T-Interval = (-0.6026, 2.9826)
Since this interval contains 0, there is a possibility that the difference is 0, therefore we do not have sufficient evidence to reject the hypothesis that the two groups have the same mean
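The pooled t-statistic can be reproduced from the summary statistics alone. A minimal sketch, assuming the usual pooled-variance formula with df = n1 + n2 - 2; the helper name `pooled_t` is hypothetical.

```python
from math import sqrt

def pooled_t(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """Pooled two-sample t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean1 - mean2 - d0) / se, df

# Placebo is index 1, paroxetine is index 2, as defined in the notes.
t_stat, df = pooled_t(21.57, 3.87, 43, 20.38, 3.91, 33)
print(round(t_stat, 3), df)
```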
P. 397, #11
An experiment was conducted to test the effects of alcohol. The errors were recorded in a test of visual and motor skills for a treatment group that drank ethanol and another group that was given a placebo. The claim is that the alcohol group will make more errors than the placebo group.
Ethanol: n1 = 22, x-bar1 = 4.20, s1 = 2.20
Placebo: n2 = 22, x-bar2 = 1.71, s2 = 0.72
Let μ1 be the average error score for the ethanol group, and μ2 denote the average error score for the placebo group
Ho: μ1 = μ2 vs Ha: μ1 > μ2
Use tree to see 2-sample-T-test
WAIT, NO: n < 30 in both groups, so we NEED NORMALITY, and it is not given. CAN'T ANSWER!!
Okay, now assume normality
Pool test = (2.20/0.72)^2 = 9.34 > 2, so not pooled
t-statistic = 5.045 with df = 25.45 and p = 1.579x10^-5
Since p = 1.579x10^-5 < 0.05, we have enough evidence to reject Ho and conclude that μ1 > μ2 with regards to errors made
Also validate with CI
2-Sample-T-Interval, Pooled = no: (1.4745, 3.5055)
Since the interval does not contain 0 we reject the null, and since both endpoints are positive we have evidence that μ1-μ2 > 0
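When the pool check fails, the unpooled statistic and its non-integer df can be checked by hand. A minimal sketch, assuming the calculator's "Pooled = no" option corresponds to the standard Welch-Satterthwaite formula; the helper name `welch_t` is hypothetical.

```python
from math import sqrt

def welch_t(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """Unpooled two-sample t statistic with Welch-Satterthwaite degrees of freedom."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2  # per-group variance of the sample mean
    t = (mean1 - mean2 - d0) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Ethanol is index 1, placebo is index 2.
t_stat, df = welch_t(4.20, 2.20, 22, 1.71, 0.72, 22)
print(round(t_stat, 3), round(df, 2))
```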
What kind of populations are compared?
Populations that are dependent, e.g. behaviour of the same individuals before and after an intervention. Like smoking of certain students from high school to university (with an intervention in between). Dependency comes from the same subjects being measured twice, with a change/intervention over time. If not, the populations are independent.
Populations that are independent (like placebo vs treatment)
As part of the National Health Survey conducted by the Department of Health and Human Services, self-reported heights and measured heights were obtained for males aged 12-16. Listed below are sample results. Is there sufficient evidence to support the claim that there is a difference between self-reported heights and measured heights of males aged 12-16? Use a 0.05 significance level. Assume the differences are normally distributed.
Reported Height: 68 71 63 70 71 60 65 64 54 63 66 72
Measured Height: 67.9 69.9 64.9 68.3 70.3 60.6 64.5 67 55.6 74.2 65 70.8
a) What is the parameter of interest? Provide notation. What is the hypothesis of interest?
b) What is the appropriate test? Explain. What is the name and value of the test statistic?
c) What is the p-value?
d) What do you conclude?
e) Construct a 95% confidence interval estimate of the mean difference between reported heights and measured heights. Interpret the resulting confidence interval.
Reported height = L1
Measured height = L2
L3 = L1-L2 (can be reversed)
a) μR = true mean of reported heights
μM = true mean of measured heights
μd = μR-μM is the true average difference in reported and measured heights
Ha: μR ≠ μM << from question (Ha: μd ≠ 0)
Ho: μR = μM (Ho: μd = 0)
Two-tailed test
b) Use tree w/ single population
Sigma unknown and n < 30 and normality given, so T-test
t-stat = -0.984, df = 12-1 = 11
<DATA,0,L3,1,≠μ>
c) Pv = 0.346
d) α = 0.05
Since 0.346 > 0.05, we fail to reject Ho, so there is no significant difference between reported and measured heights
e) T-int via tree diagram
<DATA,L3,C.level=0.95>
(-3.236, 1.236)
^^ Contains zero. So reported and measured heights of males aged 12-16 are not significantly different. Same as (d)
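The L3 = L1-L2 step and the t-statistic can be reproduced directly from the data. A minimal sketch using only the standard library; the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import mean, stdev

reported = [68, 71, 63, 70, 71, 60, 65, 64, 54, 63, 66, 72]
measured = [67.9, 69.9, 64.9, 68.3, 70.3, 60.6, 64.5, 67, 55.6, 74.2, 65, 70.8]

# L3 = L1 - L2: one difference per subject, treated as a single sample.
d = [r - m for r, m in zip(reported, measured)]

n = len(d)
d_bar = mean(d)   # sample mean of the differences
s_d = stdev(d)    # sample standard deviation (n - 1 in the denominator)
t_stat = (d_bar - 0) / (s_d / sqrt(n))  # Ho: μd = 0
df = n - 1
print(round(d_bar, 3), round(s_d, 3), round(t_stat, 3), df)
```

Reversing the subtraction (L2-L1) flips the sign of d-bar and of the t-statistic but leaves the two-tailed p-value unchanged.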
What is μd?
The mean of the population differences: μd = μA-μB (or μB-μA)
We treat the differences as a single sample and use μd in the single-population tree diagrams from Chapter 6 and Chapter 7, with the same conditions applying (normality, n, etc)
What data do we use for dependent populations?
Use the difference between the two paired measurements
So d = A - B (or B - A)
^^ Just use the context of the question; the only difference is that one direction may give negative differences.
^^ Just *be consistent* with how you define 'd' in the question.
Many studies have been conducted to test the effects of marijuana use on mental abilities. In one such study, groups of light and heavy users of marijuana in college were tested for memory recall, with the results given below.
Items sorted correctly by light marijuana users: n = 64, x̄ = 53.3, s = 3.6
Items sorted correctly by heavy marijuana users: n = 65, x̄ = 51.3, s = 4.5
Use a 0.01 significance level to test the claim that the population of heavy marijuana users has a lower mean than the light users. Should marijuana use be of concern to college students?
a) What is the parameter of interest? Provide notation. What is the hypothesis of interest?
b) What is the appropriate test? What is the name and value of the test statistic?
c) What is the p-value?
d) What do you conclude?
e) Construct a 98% confidence interval for the difference between the two population means. Interpret this confidence interval.
a) μ1 denotes the true average memory recall score of light marijuana users
μ2 denotes the true average memory recall score of heavy marijuana users
Ha: μ1 > μ2 (from question)/(since we made 1=light, 2=heavy)
Ho: μ1 = μ2 (by opposite)
b) 2-sample T-test
^^ n > 30 in both groups, so CLT gives approximate normality; sigma unknown, so use T
"pool" check by [(bigsx)/(smallsx)]^2: <= 2 = pool, > 2 = not pool
(4.5/3.6)^2 = 1.5625 = pool
t-statistic <STATS,μ1>μ2,"pooled"=yes>
t-stat = 2.785
c) Pv = 0.003089
d) α = 0.01
0.003089 < 0.01, so reject Ho, so the average score of light marijuana users is greater than the average score of heavy marijuana users
e) 2-sample T-int, "pool" = yes, C-level 0.98
(0.30788, 3.6921)
^^ Does not include zero, and all positive, so reject Ho as μ1-μ2 > 0 (μ1 > μ2). Same conclusion as (d)
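The "pool" check is a rule of thumb from these notes, not something the calculator decides for you. A minimal sketch of the rule and of how it selects the standard error; the function names `should_pool` and `two_sample_t` are hypothetical.

```python
from math import sqrt

def should_pool(s1, s2):
    """Rule from the notes: pool when (bigger s / smaller s)^2 <= 2."""
    return (max(s1, s2) / min(s1, s2)) ** 2 <= 2

def two_sample_t(mean1, s1, n1, mean2, s2, n2):
    """t statistic for Ho: μ1 = μ2, pooled or unpooled according to should_pool."""
    if should_pool(s1, s2):
        sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
        se = sqrt(sp2 * (1 / n1 + 1 / n2))
    else:
        se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (mean1 - mean2) / se

ratio = (4.5 / 3.6) ** 2                             # light vs heavy users
t_stat = two_sample_t(53.3, 3.6, 64, 51.3, 4.5, 65)  # 1 = light, 2 = heavy
print(round(ratio, 4), should_pool(3.6, 4.5), round(t_stat, 3))
```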
A study was conducted to investigate the effectiveness of hypnotism in reducing pain. Results for randomly selected subjects are given in the accompanying table. The values are before and after hypnosis; the measurements are in cm on a pain scale.
Before: 6.6 6.5 9.0 10.3 11.3 8.1 6.3 11.6
After: 6.8 2.4 7.4 8.5 8.1 6.1 3.4 2.0
a) Construct a 95% CI for the mean of before-after differences. Assume normality.
b) Use a 0.05 significance level to test the claim that the sensory measurements are lower after hypnotism.
c) Does hypnotism appear to be effective in reducing pain? What would have happened if the CI included 0?
a) Where μ is the pain score a person is experiencing
μA: mean pain score after hypnosis
μB: mean pain score before hypnosis
<STAT>,<EDIT>, INPUT <L1 = BEFORE>, INPUT <L2 = AFTER>
We use the definition μd = μB-μA for this question (before minus after, matching L1-L2). So we will be consistent with this.
Get our single sample of differences using <L3="L1-L2">
Use tree diagram
- No sigma
- n < 30
- Normality
For the textbook example we can't go further since normality is not given
For the class example (normality assumed) we can use the T-interval on L3
T-int(L3,1,0.95) = (0.69, 5.56)
d-bar = 3.125, Sd = 2.9114, n = 8
Since the CI is entirely positive:
- μd > 0
- μB-μA > 0
- μB > μA
So mean pain before hypnosis is greater than mean pain after hypnosis
So hypnosis is effective!
^^ Ok, but n = 8 and we assumed normality. Take with a grain of salt.
If the CI included 0 we couldn't say there was a significant difference, because a difference of 0 (Ho) would still be plausible
b) α = 0.05
Claim: hypnosis is effective in reducing pain
Ha: μA < μB and Ho: μA = μB (by opposite)
- Now rewrite so it has 0, negative, positive
Ha: μB-μA = positive and Ho: μB-μA = 0
- Now rewrite in terms of differences
Ha: μd > 0 and Ho: μd = 0
Use T-test(0,L3,1,>μo)
t-stat = 3.036, df = 8-1 = 7, p = 0.00948
c) Since 0.00948 < 0.05, we reject Ho and conclude that hypnosis is effective in reducing pain (Ha), since μd > 0, μB-μA > 0, μB > μA
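The right-tail p-value for the paired test can be approximated without a calculator. This is a sketch only: it integrates the Student t density numerically with Simpson's rule, which is a stand-in for whatever the calculator does internally, and the helper names `t_pdf` and `t_upper_tail` are made up for illustration.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: 0.5 minus the Simpson's-rule area from 0 to t."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

before = [6.6, 6.5, 9.0, 10.3, 11.3, 8.1, 6.3, 11.6]
after = [6.8, 2.4, 7.4, 8.5, 8.1, 6.1, 3.4, 2.0]
d = [b - a for b, a in zip(before, after)]  # L3 = L1 - L2 (before minus after)

df = len(d) - 1
t_stat = mean(d) / (stdev(d) / sqrt(len(d)))  # Ho: μd = 0
p_value = t_upper_tail(t_stat, df)            # Ha: μd > 0, right tail
print(round(t_stat, 3), df, round(p_value, 5))
```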
Many people have high anxiety about visiting the dentist. Researchers want to know if this affects blood pressure (B.P.) in such a way that the mean blood pressure while waiting to see the dentist is higher than it is an hour after the visit. Ten individuals have their systolic blood pressure measured while they are in the dentist's waiting room and again an hour after the conclusion of the visit to the dentist. The data are as follows:
B.P. Before: 132 135 149 133 119 121 128 132 119 110
B.P. After: 118 137 140 139 107 116 122 124 115 103
Assume normality if needed for this problem. Let μBefore denote the true mean blood pressure while waiting to see the dentist, μAfter denote the true mean blood pressure after seeing the dentist.
a) The appropriate alternative hypothesis for testing the researcher's claim is
b) The appropriate test statistic, its value and the respective p-value for the hypothesis
c) Suppose the p-value corresponding to the appropriate test statistic turns out to be 0.1, then what would be your conclusion at 10% significance level?
d) The 92% confidence interval for the true average *difference* in blood pressure that compares the blood pressure before and after seeing the dentist, is
a) Ha: μB > μA (μd > 0); Ho: μB = μA (μd = 0), where μd = μB-μA
b) T-test by single population tree with L1 = before data, L2 = after data and L3 = L1-L2 as the single sample
T-statistic = 2.99, df = 10-1 = 9, P = 0.0075. (Reject Ho)
c) Here p = 0.1 = α. Recall the rule: reject Ho if p ≤ α, fail to reject if p > α. So reject Ho.
d) T-interval by single population tree (L3,1,0.92) = (1.945, 9.455). The interval is entirely positive, so reject Ho and say that BP before the dentist is higher than after the dentist
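The 92% interval needs a t critical value that most printed tables do not list. A rough sketch, assuming the critical value can be found by bisection on a numerically integrated t density; this is an illustration of the idea, not the calculator's algorithm, and `t_pdf`, `area_zero_to` and `t_critical` are hypothetical helper names.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def area_zero_to(t, df, steps=2000):
    """Area under the t density between 0 and t (Simpson's rule, even step count)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return total * h / 3

def t_critical(conf, df):
    """Find t* whose central area is conf, by bisection on the area from 0 to t*."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if area_zero_to(mid, df) < conf / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

before = [132, 135, 149, 133, 119, 121, 128, 132, 119, 110]
after = [118, 137, 140, 139, 107, 116, 122, 124, 115, 103]
d = [b - a for b, a in zip(before, after)]  # L3 = L1 - L2

n = len(d)
se = stdev(d) / sqrt(n)
t_star = t_critical(0.92, n - 1)
low, high = mean(d) - t_star * se, mean(d) + t_star * se
print(round(mean(d) / se, 2), round(low, 3), round(high, 3))
```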
A researcher suggests that there are occupational differences in mean testosterone levels. Looking at a subset of the data, we have the following summary statistics for testosterone levels in two occupational groups: medical doctors and university professors. Assume that the testosterone levels are (approximately) normally distributed in each of the groups.
Group: n, Mean, Sx
Doctors: 16, 11.6, 3.39
Professors: 10, 10.7, 2.59
Let 'μDoctors' denote the true mean testosterone level in doctors, 'μProfessors' denote the true mean testosterone level in professors.
a) Suppose you want a 95% confidence interval for μDoctors, then what is the appropriate method and why?
b) What is the 95% confidence interval for μDoctors?
c) What is the 95% confidence interval for μProfessors?
d) In comparing the 95% confidence intervals for μDoctors and μProfessors, what is the conclusion, if any?
e) The appropriate null and alternative hypotheses for testing the researcher's claim about the testosterone levels in the two groups are
f) The appropriate test statistic, its value and the respective p-value for the hypothesis in (e) is
g) Suppose the p-value of the appropriate test statistic turns out to be 0.06, then what would be your conclusion at 5% significance level?
h) The 95% confidence interval for the true mean difference in the testosterone levels of the two occupational groups is
a) Single population, no sigma, n < 30 = T-interval (normality given)
b) (9.79, 13.4)
c) (8.85, 12.6)
d) Since the intervals overlap, no significant difference can be concluded.
e) Ha: μD ≠ μP, Ho: μD = μP
f) Pool check: (3.39/2.59)^2 = 1.71 <= 2, so pooled. T-statistic from 2SampleT-test = 0.717; p = 0.48
g) Since 0.06 > 0.05, we fail to reject Ho: there is not sufficient evidence that the mean testosterone levels differ
h) 2SampleT-interval (-1.69, 3.49). Since this CI includes zero, we fail to reject Ho: same conclusion as in (g)