
Hypothesis Tests 3 (M12)


So we all know that with the normal curve sometimes you land out in the tail. So if you make 100 tests you could expect that

5 of them would show up in those tails just by chance

What do you think the chances are? Anything that is making you worried?

The 5s and 2s look worrisome.

• Finish off 28 by talking about a slightly different kind of chi-square test. o To review: we had multiple categories (multiple M&M colors, different sides of a die) where you had expected values for each of those categories. That type of test is called a goodness-of-fit test.

Does what you observe fit your expected percentages?

o That this difference isn't due to chance, that maybe something went wrong with the mixing machine or with a filling machine, so we are ending up with way too many oranges or something like that. Now this bottom bullet is really important, and it's why you can't do this with the fun size bags. You

REQUIRE an expected value of at least 5 in each category for this test to be valid.

What do I do when the p-value is 0?

Reject the null. So were gender and survival independent? No: far more women survived than you would have expected if survival did not depend on gender.

o So in a little fun size bag there are only 20 M&Ms, so 10% of 20 M&Ms would be 2 M&Ms, and that's too small for the test to be valid. So you do have to reach this

critical mass of sample size before a chi-square test of goodness of fit is a sensible thing to do.
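The minimum-expected-count rule is easy to check before running the test. Below is a minimal sketch in Python that uses only the numbers mentioned above (a 20-piece fun size bag and a color expected 10% of the time); the function and variable names are made up for illustration.

```python
MIN_EXPECTED = 5  # rule from the notes: each category needs an expected count of at least 5


def smallest_expected_count(total, smallest_share):
    """Expected count for the rarest category: bag total times its expected share."""
    return total * smallest_share


# Fun size bag: 20 M&Ms, rarest color expected 10% of the time
fun_size = smallest_expected_count(20, 0.10)
print(fun_size, fun_size >= MIN_EXPECTED)  # expected count of 2, so the test is not valid

# Smallest sample that satisfies the rule when the rarest category is 10%
needed = MIN_EXPECTED / 0.10
print(needed)
```

The second calculation shows why a "critical mass" is needed: with a 10% category, the sample has to reach about 50 pieces before every expected count can be at least 5.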

Chi square tests can also be used to

decide if two qualitative variables are dependent, that is, if they are related to or influence each other.

Data Snooping

Deciding what hypothesis to test after looking at the data is bad practice. Data snooping makes p-values hard to interpret. Reliable investigators test their conclusions on an independent batch of data.

o So people take it seriously, but the point here is that sometimes it will just happen by chance. So as a conclusion on that,

deciding what hypothesis to test after you look at your data isn't a great idea

Box models Is the difference chance? A box model must

define chance. Tests of significance should not be performed on data for the whole population or on samples of convenience.

o Chi (χ) is a Greek letter that sounds like "k." The chi-square distribution is kind of like the t-distribution in that you have

different degrees of freedom and you get a different curve depending on how many degrees of freedom you have

So what do I get when I add them all up? The total was 300. I'm going to divide by 5 which means I expect 60 on each day. If it was

evenly distributed it would be 60 each day.

o So I can find this number, 529.5, by taking the row total 848 times the column total 806 and dividing by 1291. That gives 529.4. o So what percentage? I would have expected about 35% of the deaths to be women. o Calculate that by taking the row total for the women times the column total for deaths and dividing by 1291. So I would have expected 276.6. You are just calculating the expected count for each cell based off of the totals. o In the next row, to find the number of men I expected to survive, what would I multiply? 848 times 485, divided by 1291. I get 318.5. And for the last one, I'm going to take the row total 443 times the column total and divide by 1291. Student question: if you put the men and women in as the columns and died and survived in as the rows, so the row totals and column totals were switched, would the answer ever come out differently?

No, it wouldn't, because multiplication is commutative. If you happened to take the column total times the row total, that works too.
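The row-total-times-column-total rule can be sketched in a few lines of Python. This is only an illustration using the totals quoted above (848 men, 443 women, 806 deaths, 485 survivors, 1291 people); the dictionary layout and names are assumptions made for the example.

```python
# Totals quoted in the notes for the Titanic table
row_totals = {"men": 848, "women": 443}
col_totals = {"died": 806, "survived": 485}
n = 1291  # total sample size

# Expected count for each cell = row total * column total / sample size
expected = {
    (row, col): row_totals[row] * col_totals[col] / n
    for row in row_totals
    for col in col_totals
}

for cell, value in expected.items():
    print(cell, round(value, 1))

# Swapping the order of the multiplication changes nothing (commutativity)
swapped = col_totals["died"] * row_totals["men"] / n
print(swapped == expected[("men", "died")])
```

Unrounded arithmetic puts the men/survived cell nearer 318.6 than the 318.5 quoted in the notes, which is a rounding-level difference. The four expected counts also add back up to 1291, which is a handy check.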

o There are a lot of calculations on these problems, so when you're doing this on the test it's really easy to mistype something. When we grade this we look at your work, so it's really important, especially on this problem, that you write down what you intended to put into your calculator, because that will count more than whether you actually performed the calculation correctly. o Whereas if you just write down the wrong number and I can't tell that you knew what you were supposed to do, I can't give you as much partial credit. o So showing your work is super important on this type of question. o So here is my test statistic: chi-square is 4.81. How many degrees of freedom on this one? 4. It's 5 days of the week minus 1, so that's 4 degrees of freedom. o So my test statistic is 4.81. Student question: always go to the right?

On these yes.
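To see what the upper-tail area looks like for this bus example, here is a small Python sketch. It relies on a known closed form that holds only for 4 degrees of freedom (area to the right of x equals exp(-x/2) times (1 + x/2)); the function name is hypothetical, and the calculator's chi-square cdf should give the same area.

```python
import math


def upper_tail_df4(x):
    """Area to the right of x under the chi-square curve with 4 degrees of freedom.

    Uses the closed form exp(-x/2) * (1 + x/2), which is valid only for df = 4.
    """
    return math.exp(-x / 2) * (1 + x / 2)


categories = 5        # five days of the week
df = categories - 1   # goodness of fit: number of categories minus 1
chi_square = 4.81     # test statistic quoted in the notes

p_value = upper_tail_df4(chi_square)
print(df, round(100 * p_value, 1))  # degrees of freedom and p-value as a percentage
```

The area comes out at roughly 31%, far above 5%, so this statistic gives no reason to reject the null hypothesis of equal ridership.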

So what is that percentage that I expect for each one. Or in a related question, how many ones do I expect to get if I roll a die 240 times?

So I have 240 rolls, I have 6 different categories, and I expect them to be equal. So I can just take 240 divided by 6 and get 40.

The alternate is that

gender and survival are dependent

So in a situation like this I actually want to

include both sides of it. Too much on one side or too much on the other, both of those would cause me to reject my null hypothesis.

The only difference is now we have

a bunch of different parameters, like a parameter for each different category, and we're trying to match a bunch of different categories instead of just one or two like what we were doing before.

The last type of chi-square test we will talk about is

a test for independence

Investigators should not

• Only compare P to 5% or 1%

• The P-value for a test depends on the sample size.

• Small differences can be "significant" in large samples, but not be important. • Important differences may not be statistically significant if the sample is too small.

Investigators should

• Summarize the data • Say what test was used • Report the P-value

From slides: one tailed tests

• Use when the alternative hypothesis says that the average of the box is bigger than a given value.

o What's the difference between significant and nonsignificant? We have been using 5% for a p-value as the cutoff for significant. Basically you are saying that you are

♣ accepting a 5% chance that when somebody lays their cards on the table and says I have 3 fives and you call BS, it was a mistake: you rejected a null hypothesis that was true. That's your chance of making the mistake of rejecting a null hypothesis that is true.

o So for this first one I'm going to take (18 minus 13.65) squared divided by 13.65, then do the same for the other rows. I'm going to strongly recommend that you

♣ write this out each time because it's really a lot of typing numbers into your calculator and it's really easy to type the wrong numbers if you don't have them written out. ♣ And you can round your calculations to one decimal place. We will get some round-off error here, but it's not too big of a problem.

Okay so what I'm going to do on this slide is

take the observed minus the expected, squared, divided by the expected, for each row

If there is no planned use of chance

it's garbage.

Only reject the null hypothesis if

the p-value is less than 5%

And so say you collected all this data, and you have 50 different variables that you're collecting, and it turned out that two of them were significant. Does that necessarily mean those two things are significant?

Not really, because you would expect to see some significant results even if your null hypothesis is true. You expect to make that mistake sometimes.
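The arithmetic behind "you expect to make that mistake sometimes" can be sketched as follows. This is an illustration only: it assumes every null hypothesis is true, a 5% cutoff, and (for the second function) that the tests are independent of each other, which the notes do not state. The function names are hypothetical.

```python
ALPHA = 0.05  # the 5% significance cutoff used in the notes


def expected_false_alarms(num_tests, alpha=ALPHA):
    """Expected number of 'significant' results when every null hypothesis is true."""
    return num_tests * alpha


def chance_of_at_least_one(num_tests, alpha=ALPHA):
    """Chance of one or more false alarms, ASSUMING the tests are independent."""
    return 1 - (1 - alpha) ** num_tests


print(expected_false_alarms(100))              # 100 tests: about 5 by chance alone
print(expected_false_alarms(50))               # 50 variables: about 2.5 by chance alone
print(round(chance_of_at_least_one(50), 3))    # chance of at least one false alarm
```

With 50 variables, two significant results is about what chance alone would produce, which is why they do not by themselves show that anything real is going on.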

The other thing is that it's

not symmetric. So we are just going to be doing upper tail p-values here

Okay our rule for degrees of freedom in this kind of test is that we take the

number of categories minus 1, and that's my degrees of freedom.

What is the 2-1 standing for? This is the

number of rows minus 1 times the number of columns minus 1. It's the categories. There were two categories in the rows and two categories in the columns. So just like with the M&Ms, where you had 6 minus 1 degrees of freedom, this is (categories in the rows minus 1) times (categories in the columns minus 1).

The alternate hypothesis would be that

our observed counts are not consistent with our expected percentages or expected counts

Data snooping makes

p values hard to interpret and reliable investigators will test their conclusions on an independent batch of data.

And here is how I recommend you do this, so

put parentheses around the numerator. (10:00) So (18 minus 13.65) squared over 13.65, and that will give me 1.39. The next is .74, and so forth.
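The same keystrokes can be written out as a short Python sketch. It uses the brown M&M numbers given in these notes (105 M&Ms in total, 13% expected to be brown, 18 observed); the variable names are made up for the example.

```python
total = 105            # total M&Ms counted
expected_share = 0.13  # 13% expected to be brown
observed = 18          # brown M&Ms actually observed

# Expected COUNT, not percentage: total times the expected share
expected = total * expected_share

# Parentheses around the numerator, then square, then divide by the expected count
contribution = (observed - expected) ** 2 / expected

print(round(expected, 2), round(contribution, 2))  # 13.65 1.39
```

Repeating this for every color and adding the contributions gives the chi-square test statistic.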

Expected values calculated using the formula:

(row total × column total) / sample size

o And you have to be careful of that. Generally what will happen is they run a huge bunch of tests (they have different ways of setting significance depending on how many tests you're doing) and then they'll take a completely new group of data and see if they can replicate it. So if you've done it once and it was

significant, and then you collect a new sample and it's significant again, then you can say okay, maybe something real is going on here.

Our null hypothesis has always been that our

sample comes from this population with this claimed parameter.

o And there are again some examples in the textbook. o If you are doing significance tests on the entire population, you already know the average of this population and that population. If they are different, you don't have to do a test to see that they are different. If you actually know the population parameters you don't need a hypothesis test to see if they're different. Only when you've taken

samples, so this box model, this idea of taking random samples, is CRITICAL to the process.

So if you

snoop for a significant result, you will likely find one.

There's a difference between something that's

statistically significant when you take really large samples and you get these really skinny normal curves, and something that is important.

• Try one more. o So for this one you're back in the situation of the M&Ms because you don't have equal proportions for the different fish. o See slide. o The total is 500. So take .3 times 500, which is 150, etc. Where are you getting the .3 from? That's the 30%. Multiply them all by 500, just like what we did with the M&Ms. If the lake still had the original proportions, we would still expect 30% of those 500 fish to be catfish, but not exactly, because

we just randomly sampled. Supposedly there's a whole lot more than 500 fish in the lake, so we're expecting some differences, but not too many

If you snoop for a significant result

you are likely to find one.

• Small differences can be

"significant" in large samples, but not be important. • Important differences may not be statistically significant if the sample is too small.

o So here is what my chi-square curve looks like. Most of the area is over here, and our test statistic is clear over in the next room on this graph, way, way out there. o So I need to know the area past 332.2 under the chi-square curve. That will be my p-value. You can ask your calculator for that area: the lower bound is 332.2, the upper bound is a big number, and there is one degree of freedom. Calculator is going to tell you o With the chi-square curve we aren't talking about standard deviations. The mathematics is just a little goofier there, so you can think of it as 332 deaths, not exactly that but kind of the idea. I got a p-value of 3.1 x 10 to the -72%. That's

Our test statistic was 332.2 and degrees of freedom we had 2 categories, male and female in the rows. So I take 2-1 and I had died and survived so categories in the other variable so I end up with how many degrees of freedom here?

Calculate expected

COUNTS by multiplying the expected percentage (as a decimal) by the observed total

So this time I need to figure out how many I expected each day. There are 5 different days, so how do I calculate the expected values?

Do you add them all up and divide by 5? Right I need to find the total first and then divide by 5. That's what would happen if it was equal.

**Note:** we REQUIRE an

EXPECTED value of at least 5 in each category for the test to be valid.

o But when they are filling sharing-size bags of M&Ms, no one is coming along and saying put this exact color in. They put the percentages into a vat like they should be, stir it all up, and then just funnel them into the packages and seal them up. o So because they stir them up in a big vat, we do get randomness from that. Now one of the things that's super, super, super important here is that when we are calculating observed and expected values for chi-square tests we

HAVE to do observed and expected counts. Not percentages.

What happens, though, when I start to take random samples? What happens to the shape of the normal curve when I talk about sampling the average? Here we're talking about the Central Limit Theorem. What happens to the shape of the normal curve as n gets bigger? What happens if I start taking a really large sample?

I'm going to get a normal curve for the boys that looks like this and a normal curve for the girls that looks like this, and they start to separate. So you get a statistically significant difference of maybe a point or two on the SAT math between boys and girls because you have these super high scoring boys that pull it off.
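A rough sketch of why the curves separate as n grows: the standard error of the difference shrinks like one over the square root of n, so a fixed 1-point gap becomes many standard errors wide. The standard deviation of 100 and the group sizes below are purely hypothetical values chosen for illustration, and a two-sample z-test with equal group sizes is assumed.

```python
import math


def two_sample_p(diff, sd, n_per_group):
    """Two-tailed p-value for a difference in averages between two equal-sized groups.

    ASSUMPTIONS: both groups share the same standard deviation `sd`, and the
    samples are large enough for the normal curve to apply.
    """
    se = sd * math.sqrt(2 / n_per_group)        # standard error of the difference
    z = diff / se
    return math.erfc(abs(z) / math.sqrt(2))     # area in both tails of the normal curve


# Same 1-point difference, hypothetical sd of 100, two very different sample sizes
print(two_sample_p(1, 100, 100))        # small samples: nowhere near significant
print(two_sample_p(1, 100, 1_000_000))  # huge samples: overwhelmingly "significant"
```

The difference itself is still 1 point in both cases; only the sample size changed, which is the gap between statistically significant and important.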

o So for instance, say you want to find the area under the chi-square curve with 6 degrees of freedom to the right of 8.7. Here's that picture, and 8.7 would be about right here. I want this area, and this has 6 degrees of freedom. If you look on your calculator under the distribution menu you have to scroll down; it's usually the 8th. It's called chi-square cdf. Don't use the pdf. Once you select it, enter the information separated by commas:

Lower, upper, degrees of freedom, and then multiply by 100%. Check to make sure you can get the same thing on your calculator.
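For checking calculator answers without a calculator, here is a minimal Python sketch of the same upper-tail area. It is not the calculator's algorithm; it assumes a whole-number degrees of freedom and uses a standard recurrence that starts from the closed forms for 1 and 2 degrees of freedom and steps up two at a time. The function name is hypothetical.

```python
import math


def chi2_upper_tail(x, df):
    """Area to the right of x under the chi-square curve with df degrees of freedom.

    Sketch for whole-number df only. Base cases: df = 1 and df = 2 have closed
    forms; each step from df - 2 to df adds (x/2)^(df/2 - 1) * exp(-x/2) / Gamma(df/2).
    """
    if x <= 0:
        return 1.0  # the curve never goes below 0, so all of the area is to the right
    half = x / 2.0
    if df % 2 == 1:
        tail = math.erfc(math.sqrt(half))  # df = 1
        k = 1
    else:
        tail = math.exp(-half)             # df = 2
        k = 2
    while k < df:
        k += 2
        tail += math.exp((k / 2 - 1) * math.log(half) - half - math.lgamma(k / 2))
    return tail


# Area to the right of 8.7 with 6 degrees of freedom, as a percentage
print(round(100 * chi2_upper_tail(8.7, 6), 1))

# Area to the right of 8.5 with 15 degrees of freedom, as a percentage
print(round(100 * chi2_upper_tail(8.5, 15), 1))
```

The first area comes out near 19%, and the second near 90%, which matches the 90% answer mentioned elsewhere in these notes for 15 degrees of freedom.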

Two tailed tests

Use when the alternative hypothesis says that the average of the box differs from the given value.

And when you have all the numbers calculated you

add them up and that's what makes our test statistic. So my chi-square statistic is going to be the sum of all those contributions.

o So let's look and see what changes with this curve when we change the degrees of freedom. A couple of things are different with the chi-square distribution. One, it's

always positive. So it never goes below 0, so if you are going to do a lower tail value you can always just start at zero, because it never goes past that. It's not defined there

So the null hypothesis is

always that the variables are independent and the alternate hypothesis is the variables are dependent.

Want to stress again that the null hypothesis is always

that the variables are independent. You just have to look at the 2 variables being measured and test whether they are independent.

Okay, is a 1 point difference between males and females on the SAT math test an important difference? Does it change how you should treat an individual person? An individual person could have landed anywhere in there. So the fact that there's a statistically significant difference doesn't mean it's

an important difference, it just means that it's a statistically significant difference.

Data snooping

an investigator who makes 100 tests can expect 5 results that are statistically significant even if the null hypothesis is true.

Null is

birthweight and graduation are independent and alternate is birth weight and graduation are dependent

Then you will

calculate the degrees of freedom: take the number of rows minus 1 times the number of columns minus 1. And you get 1 degree of freedom.

o Which is a little backwards because we have spent all semester turning things into a percentage, so now we are changing back from a percentage into a count. So the first thing you need to do is

calculate how many M&Ms you had in total, so I'm going to add these all up, and I ended up with 105 M&Ms. When I go to calculate my expected values, I'm expecting 13% of the 105 M&Ms to be brown. So I'm going to take 105 times .13, which gives me 13% of that number: 13.65.

o So similar to what happened with the central limit theorem, where you have to have a sample size of at least 30 in order to use the normal curve, again we have this minimum sample size requirement. • So how do we calculate the test statistic? This is the part that is most significantly different, although it does have similarities to what we've done before. o When we did z statistics and t statistics we took observed minus expected divided by the standard error. What we are going to do now is a little bit more involved. We are going to

calculate the difference between observed and expected for each category.

Now for this question we have quite a lot of calculations to do to get our test statistic. The first thing I need to do is

calculate the expected values for each classification of employee. So each classification for teaching level and each classification for job satisfaction

• Slide 2 pg. 6 o So low birth weight is associated with a lot of different poor health outcomes for a lot of children, and we want to see if graduation rate is one of those outcomes. o We don't have the totals calculated for us, so we need to do that first. We have 2 classifications for birth weight and 2 classifications for graduation: graduated and did not graduate. So before we get started let's fill in the rest of the table. See slide. So there's a total sample size of 200 children. And let's again

calculate the expected values. See slide for math.

o So to get this first one right here, for the high school teachers who are satisfied, I need to take the row total of 395 times the column total of 350 and divide by the grand total, the total sample size over here. See slide for math. This gives me an expected value of 209.2. I'm rounding to one decimal place. So again, take the row total times the column total divided by the sample size. o I then continue for the elementary school teachers who are satisfied. See slide for math. And you might want to at this point

check to make sure that the row and column totals for the expected values add up to the same totals as the observed values. So 209.2 plus 185.8 should give me 395.

• For objectives, we're still talking about claims about a parameter and info about a sample, and conducting a hypothesis test. It follows the same structure as before. • Today the focus is on the chi-square test of goodness of fit. • With goodness-of-fit tests and independence tests we will be using a new distribution. We've done the normal distribution and the t-distribution; this is the last continuous distribution we will look at. It's called the

chi-square distribution

o It's possible there are some scenarios where you want to do a lower-tailed test, but that's not done in this class. o As the degrees of freedom increase, it stretches the curve and moves it along the axis. It gets flatter and wider as the degrees of freedom go up, and skinnier and more squished as they get smaller. The other thing you can notice about the curve is that the big humpy part is

close to the degrees of freedom. So for this curve with 6 degrees of freedom, 5 is right here, so the bulk of the curve is in the neighborhood of 6. But if I raise my degrees of freedom up to 20 over here, the bulk of the curve is in the neighborhood of 20.

There's another example in the textbook that talks about this. The other last thing is that these things have to

come from random samples.

o So that's one thing to look for as you are reading journal articles in your field: what the p-value is. o Most of the scholarly papers she sees do report p-values, so it's a good thing to understand what they are. o Generally they say "p equals" or something like that. They will tell you what test was used, and now that you understand the process of hypothesis testing (there are lots more details that we didn't cover), whether you're talking about a t test, or z test, or chi-square test, or any of the other tests they come up with, most of them

come up with and report a p-value, which is the chance of getting a result at least this extreme if the null hypothesis is true.

o So I get 42.8 from the first one, see slide for math. o Now add those guys up. And you get possibly the largest chi-square test statistic I have ever seen anywhere. o So once we have that number let's do the whole thing start to finish. The null hypothesis is that

gender and survival, our two variables are independent.

o They'll explain clearly in the article what test was used and will always report a p-value. We've seen a great range of p-values, from 2.6% to 2.3 times 10 to the negative 70. The difference between these two is really huge, as far as

how confident you feel in rejecting the null hypothesis. So knowing that p-value and reporting it allows people to make their own conclusions about how strongly they can reject this null hypothesis or not.

o So it does just kind of move its way over: when I'm up to 35, the bulk of the curve is around 35. o It gets more and more symmetric as we get more degrees of freedom, but it doesn't converge to the normal curve like the t-distribution does. • The nice thing about the chi-square distribution is that your knowledge of how to calculate areas from normal curves and t curves transfers over really nicely to chi-square curves. So just like with the t-curves you have to

identify the number of degrees of freedom.

• Try this again. Different example: we are going to roll a die 240 times, and say we get the following results. Is the die fair? o So let's do the calculation first and then we'll write down the hypotheses. I have my observed counts. They are listed here under frequency, but this is my observed column. o And I need to come up with some expected values. Now in this problem it doesn't tell me what percent I expect to get. Why didn't it do that? It's a die. So

it's implicit in the statement. If the die is fair we are expecting all of the percentages to be the same, right?

And that any difference we are seeing between that parameter and our sample information is

just due to randomness and that's our null hypothesis here

o That's your p-value, and all of them give you a p-value which you can treat in the same way regardless of what kind of test it came from. o They will report the test they used. They generally have a table and will report the p-values. And they generally have lots of different things being tested at once, so they'll have a p-value on each of them, like showing a significant difference in gender but not age, so you would have gender in the table for people to see. A large p-value is what I expect to see based on just sampling error. But a p-value of .007, that's starting to

look important like there is a real difference there.

o So there is my test statistic. One of the nice things about writing it out in a table like this is that when you look at the contributions you can see, by the size of each one, which is the largest discrepancy. So even if you couldn't just visually see it, we do have the largest discrepancy between what we observed and what we expected for green. That's where the largest problem was, and that's identified in this test statistic. So what that's doing is saying okay, it's

making a contribution for each of those so that you're accounting for each different category in our test statistic.

♣ So here is the idea (we're not going to do it this way in this class, but this is just the idea). What we do to fix this is we calculate it, and if it turns out that my z value comes in right here, I'm going to take this area, and I'm actually going to say well, it could have turned out the other way as well, so I'm going to take the area on the opposite side and include that in my p-value. So basically I take the area in one tail and

multiply it by 2. And that gives me my p value
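The doubling idea can be sketched in Python with the normal curve. The z value of 1.96 below is a hypothetical example, not a number from these notes, and the function names are made up; the tail area uses the standard relationship between the normal curve and the complementary error function.

```python
import math


def one_tail_area(z):
    """Area under the standard normal curve beyond |z|, in one tail only."""
    return 0.5 * math.erfc(abs(z) / math.sqrt(2))


def two_tailed_p(z):
    """Two-tailed p-value: the one-tail area multiplied by 2."""
    return 2 * one_tail_area(z)


z = 1.96  # hypothetical test statistic, chosen only for illustration
print(round(100 * one_tail_area(z), 2))  # one tail, as a percentage
print(round(100 * two_tailed_p(z), 2))   # both tails, as a percentage
```

Because the normal curve is symmetric, a z of -1.96 gives the same two-tailed p-value as a z of 1.96.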

o So conclusion is Do not reject the null. Is it possible to get a negative test statistic?

o Teacher answer: not with a chi-square distribution. It definitely is with z's and t's; we saw lots of negatives that way. But because we are squaring at each step we are getting a positive number, so these are always going to be positive. If you get a negative, you did something wrong.

o If I go vertical 209.2 plus 140.8 should add up to 350. And doing that check on the values will help you make sure you've got the correct expected values. o It's very easy to grab the wrong number and get the wrong numbers for these values. So helps to do check. Once I have my expected values I'm going to

o calculate my test statistic. To calculate my chi-square test statistic I take the observed value minus the expected value, squared, divided by the expected value. So for this first cell I take (224 minus 209.2) squared divided by 209.2. And then I do that for each of these 4 classifications. So I take (171 minus 185.8) squared divided by 185.8, and keep going. See slide for math. o And then you carefully type all this into your calculator to get the correct test statistic. So my test statistic is 5.53.
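A hedged sketch of this step in Python: it computes the contributions for the two cells whose numbers are spelled out above, then finds the upper-tail area for the quoted total of 5.53. The other two cells are on the slide and are not reproduced here. With 1 degree of freedom the area has a closed form using the complementary error function, which is the shortcut assumed below.

```python
import math

# (observed, expected) pairs for the two cells spelled out in the notes
cells = [(224, 209.2), (171, 185.8)]
contributions = [(obs - exp) ** 2 / exp for obs, exp in cells]
print([round(c, 2) for c in contributions])

chi_square = 5.53          # total over all four cells, as quoted in the notes
df = (2 - 1) * (2 - 1)     # (rows minus 1) times (columns minus 1)

# Upper-tail area for 1 degree of freedom: erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi_square / 2))
print(df, round(100 * p_value, 1))  # p-value as a percentage
```

The p-value lands a little under 2%, which is below the 5% cutoff, so this statistic would lead to rejecting the null hypothesis that teaching level and satisfaction are independent.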

My p-value, I'm going to use

o chi-square cdf and do just an upper-tail p-value from the test statistic that I just calculated. See slide for picture. o Then, like with all cdf functions, multiply by 100%. And that's much less than 5%, so we reject the null.

o The important thing on an exam is to write down all your work. So a lot of partial credit to be had, not just did you get the right answer at the end. Want to see your process. Once I have my test statistic I need my

o degrees of freedom. You take that number of rows minus one time the number of columns minus one. 2 categories for each variable so end up with just one degree of freedom

o See slide 5 pg. 1 bullet 2 ♣ Seems like a really big number. If I look at the curve with 15 degrees of freedom, 8.5 is going to be clear back here, so based on our picture does our answer of 90% look reasonable? About 90% of the area? Yeah. • Try one more. See slide for math. o This is how we calculate our p-values for our new tests. So now all we have to do is

o figure out how to get our test statistic. So these numbers that I have been giving you to calculate the area to the right of, those are going to be the number like our test statistics and we will calculate our p-values based on that

For test statistic going to take

o each of the observed values and subtract the expected value, square that, and then divide by the expected value. Type it into the calculator exactly as you wrote it down. Put it in parentheses and then use the squared key; you can type it all at once if you would like. And keep adding those together.

o So for the p-value, see slide. Even after multiplying by 100 we still have something in scientific notation, so we're going to call it 0. That's because the test statistic is really big. Rejecting the null here means

o that birthweight and graduation are not independent. It would appear that low birthweight does affect the likelihood that a child will graduate from high school by the age of 19. o So the conclusion is to reject the null.

• Let's try one more. See slide 4 pg. 3 o We selected a week randomly and counted up the people riding the bus on that route on those days. o And we want to check whether our assumption about ridership being equal each day is true. o Just looking at the numbers, what does it look like? It looks like we have a lot more people on Mondays than on the rest of the days. But is that just because a few extra people decided to ride Monday, or is there something going on Mondays that we may need to account for? o So that's the kind of thinking we would go through on this type of problem. The null hypothesis would just be that

o ridership is equal on all of the days, that there's no difference. And the alternate would just be the opposite of that,

o Okay, when I rolled this die I got way more 5s than I was expecting, and way fewer 2s than I was expecting. It reminds me of the epic Settlers of Catan game where we never rolled a 9. So that can happen, but when it does happen we get a little suspicious, right? o So now that we have the test statistic, my hypotheses here: the null hypothesis is that

o the die is fair. That we're getting the same percentage within randomness of all the different sides of the die. Alternate is that the die is not fair.

So before like with a t test degrees of freedoms was o the number of measurements minus 1. o For the goodness of fit it was the number of categories minus one. But in this case we actually have a little bit more complicated degrees of freedom rule. We take

o the number of rows minus 1 (that's the number of categories in the one variable minus 1) times the number of categories in the other variable minus 1. I write this as (number of rows minus 1) times (number of columns minus 1) because we take the one variable and write all of its categories as rows in a table, and we write the other variable as the columns.

o And now we can calculate the test statistic. • In an effort to make this as similar as possible to what we were doing before, let's again make 2 columns, one of the observed values and one of the expected values, and just make sure they go together. o So 680 goes with 529.4, and 168 goes with 318.5. If you mix these up you will be in trouble, so make sure you put them together in pairs. Then I'm going to take the

o observed minus the expected, squared, divided by the expected, for each of these numbers. See slide for math. o And even if you are a calculator wizard, write this down, because it's very easy to mistype it on your calculator. If you enter the number wrong and you don't write it down, I can't give you partial credit because I can't see that you knew what you were doing. If you write all this down and then I see you made a calculator error, you get more credit than if you just give a wrong number.
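Putting the Titanic pieces together, here is a sketch of the whole calculation in Python. The men's observed counts (680 died, 168 survived) are quoted in the notes; the women's counts (126 died, 317 survived) are not quoted directly and are derived here from the stated totals (806 deaths, 443 women), so treat them as an inference. The p-value line uses the 1-degree-of-freedom shortcut with the complementary error function.

```python
import math

# Observed counts; the women's row is DERIVED from the totals quoted in the notes
observed = {
    ("men", "died"): 680,
    ("men", "survived"): 168,
    ("women", "died"): 806 - 680,          # 126
    ("women", "survived"): 443 - (806 - 680),  # 317
}
rows = ["men", "women"]
cols = ["died", "survived"]

row_totals = {r: sum(observed[(r, c)] for c in cols) for r in rows}
col_totals = {c: sum(observed[(r, c)] for r in rows) for c in cols}
n = sum(observed.values())

# Expected count = row total * column total / sample size, paired with its observed count
expected = {(r, c): row_totals[r] * col_totals[c] / n for r in rows for c in cols}
contributions = {
    cell: (observed[cell] - expected[cell]) ** 2 / expected[cell] for cell in observed
}

chi_square = sum(contributions.values())
df = (len(rows) - 1) * (len(cols) - 1)
p_value = math.erfc(math.sqrt(chi_square / 2))  # upper tail, valid for df = 1 only

print(round(contributions[("men", "died")], 1))  # first contribution
print(round(chi_square, 1), df)
print(p_value)
```

Keeping observed and expected values in the same keyed structure is one way to avoid the mix-up the notes warn about, since each observed count is always paired with its own expected count.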

• Slide 6 pg. 4 o Here is our first one. o Our categories here are gender, that's men and women (mutually exclusive, you can't be both), and then whether you died or survived, which is another variable with two categories. So if death and gender were independent you would expect

o the same proportion of men and women to have died on the Titanic. If 70% of the passengers were men you would have expected 70% of the deaths to be men. If 30% were women you would have expected 30% of the deaths to be women. So what we need to do is work out, based off of the total numbers of men and women and the total number of people who died, what we would have expected to see.

o Now when we were talking about chance, what did it mean for things to be independent? One does not affect the other. So an example of things that might or might not be independent: say your major and your income or starting salary. Probably not independent. o But your major and your brand of cell phone might be independent. It might be that business majors all chose a specific brand, but you might not see quite as big of a difference. So when we are going to decide if two qualitative variables are dependent, that is, if they are related or influence each other, we set the null hypothesis to be that

o they're independent, and we look for evidence to reject the null. o Now this is one case where it's really nice because the null hypothesis is the same all the time.

• Read chapter 29. • It talks about the pitfalls of testing for significance. o One of the things you don't see in a math class is the fight that sometimes goes on between mathematicians deciding if what you guys are being taught is actually real or true or consistent with the rest of mathematics. o Once it's proved it's proved, but she didn't realize that stuff taught in the 80s and 90s had been the subject of large contention in the 40s and 50s. o People hadn't proved it. And so you get misled into thinking that this was just handed down and this is just the way it is. But with statistics specifically there continues to be a lot of dissent among statisticians about when hypothesis tests are appropriate. o What she has taught is a basic, easy thing to do, step 1, 2, 3, but it rests on a lot of assumptions. Things like

o was this really normally distributed to begin with, is the hypothesis test an appropriate thing to do?

Now I'm going to take these numbers here, and they are the ones I'm going to use for my observed and expected. The typed-in numbers are the

observed number of deaths. And the blue is what we would expect if they were independent.

Slide 2 pg. 5, the chi-square statistic is calculated the same way as the statistic for goodness of fit. You take your

observed value minus your expected value, you square that, you divide by the expected, then you add them all up. We do, however, have a different degrees-of-freedom rule.
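Written out, with O for each observed count and E for the matching expected count (symbols chosen here for illustration, not taken from the slides), the statistic described above looks like this. The degrees-of-freedom rule is not spelled out in these notes; the standard rule for a table with r rows and c columns is included on the assumption that it is the one the slide shows.

```latex
\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E},
\qquad
\text{df} = (r - 1)(c - 1)
```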

o And then we compare that to what actually happened. o Now let's think about how we would do that. What percentage of people on the Titanic were male? 848/1291 times 100%: 65.7%. o Now how many people died, as a percentage? We are trying to calculate the expected percentage of men who died. o 806/1291 times 100, so about 62%. Let's talk through this; there is a shortcut, but we'll do it the long way first so we understand. 65.7% of the people who died should have been men, if it was independent. So 65.7% of 806: .657 times 806, and I would have expected to see 529.5 men die if 65.7% of the deaths were men. Now, an easier way to do that: I can calculate all of my expected values by taking the

row total times the column total and divide by the sample size, the total sample size.
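A minimal sketch of that shortcut in Python, using the totals quoted in the lecture (848 men, 806 deaths, 1291 people). The counts for women and survivors are not given in the notes and are derived here by subtraction; the function and variable names are made up for illustration.

```python
# Totals from the lecture; 443 women and 485 survivors are derived by subtraction.
n = 1291
row_totals = {"men": 848, "women": n - 848}
col_totals = {"died": 806, "survived": n - 806}


def expected_counts(row_totals, col_totals, n):
    """Expected count per cell under independence: row total * column total / n."""
    return {
        (row, col): row_total * col_total / n
        for row, row_total in row_totals.items()
        for col, col_total in col_totals.items()
    }


expected = expected_counts(row_totals, col_totals, n)
for cell, value in expected.items():
    print(cell, round(value, 1))
```

The men/died cell comes out near 529.4; the 529.5 in the hand calculation comes from rounding the percentage to 65.7% before multiplying.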

Where you

run a bunch of tests and collect a bunch of data and then you say well this one is significant so that must be the one.
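To see why picking out the one significant result is risky, here is a small sketch of the arithmetic. It assumes every null hypothesis is actually true and that the tests are independent of each other; the function names are made up for illustration.

```python
def expected_false_positives(num_tests, alpha=0.05):
    """Average number of tests that look significant purely by chance."""
    return num_tests * alpha


def prob_at_least_one_false_positive(num_tests, alpha=0.05):
    """Chance that at least one test looks significant when nothing is going on.

    Assumes independent tests and that every null hypothesis is true.
    """
    return 1 - (1 - alpha) ** num_tests


for k in (1, 5, 20, 100):
    print(k, expected_false_positives(k), round(prob_at_least_one_false_positive(k), 3))
```

With 100 tests at the 5% level you expect about 5 chance "hits", and it is almost certain that at least one shows up.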

We calculate the χ² statistic in the same way:

see slide 2 pg. 5

o But we have to square those differences, because if we didn't we would just end up with a test statistic of zero all the time; because of the way things are constructed, they would cancel each other out. If you get too many blues, that has to come from somewhere, so maybe it came from the oranges: the blues would be over and the oranges under, so when you added all those up, they would cancel each other out. o So, kind of like how we can't just take the average of deviations and have to square the deviations first, that's what's happening here. So for each category I have to

take the observed value minus the expected value and square it, divide by the expected value, then add all those up across every category.
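The recipe above can be sketched in a few lines of Python. The counts below are invented for illustration (five colors, 100 candies, 20 expected of each) and are not from the lecture; the function name is also made up.

```python
def chi_square_statistic(observed, expected):
    """Sum over categories of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: every expected value is at least 5, as the test requires.
observed = [18, 22, 20, 25, 15]
expected = [20, 20, 20, 20, 20]

# Without squaring, the overs and unders cancel out to zero.
raw_difference_total = sum(o - e for o, e in zip(observed, expected))

statistic = chi_square_statistic(observed, expected)
print(raw_difference_total, statistic)
```

The raw differences add up to zero, which is why each one is squared before it is divided by the expected count.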

This is where we

take two variables and we compare them to see if we think they are independent or not.

We set the null hypothesis to be

that the variables are independent and look for evidence to reject the null.

Okay, when you're doing null and alternative hypotheses for goodness of fit tests, the null hypothesis will always be

that your observed counts are consistent with your expected percentages.

• This is another example of a test of independence. Slide 5 pg. 5. So when we go to write the null hypothesis, it is just going to be that

the 2 variables are independent. So we are measuring job satisfaction and teaching level. When I write my null hypothesis, I just say that the null is that job satisfaction and teaching level are independent, and my alternative is that they are dependent.

♣ Well, what if you're not willing to accept a 5% risk? It's 5% because Ronald Fisher decided he liked 5%. And later Ronald Fisher was arguing about whether statistical tests were valid. So the guy that practically invented the genre himself came back later and said, I don't know that this is completely a good idea. o So what if I have a p-value of 5.1%? We've been taught in this class that we would not reject the null. But a p-value of 4.9% is significant, and I would reject the null. What's the difference between 5.1% and 4.9%? You're talking about area in the tail, and even if you're talking about risk, between 5.1 and 4.9 there's really nothing practically different, so this decision of whether or not it's significant is an arbitrary one based on

the level of risk that you're comfortable with.

From slides: • The P-value for a test depends on

the sample size.
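One way to see the sample-size effect, sketched with made-up counts rather than anything from the slides: keep the observed percentages the same (55% versus an expected 50%) but collect ten times as much data. The chi-square statistic grows by the same factor, and with the degrees of freedom unchanged, a larger statistic means a smaller p-value.

```python
def chi_square_statistic(observed, expected):
    """Sum over categories of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: same 55% / 45% split, expected 50% / 50%.
small_sample = chi_square_statistic([55, 45], [50, 50])       # n = 100
large_sample = chi_square_statistic([550, 450], [500, 500])   # n = 1000

print(small_sample, large_sample)
```

The same five-point gap that looks unremarkable with 100 observations produces a statistic ten times larger with 1000.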

So what we do with hypothesis tests is try to decide if

the sum difference that we see is too much of a difference. Maybe I got more bluegill in the lake than I expected and a lot fewer catfish. Maybe there was a problem there.

Your expected counts do depend on

the total that you get for your experiment

There is in fact an entire division of the CDC that investigates clusters like this. If you think something like this is happening you can submit a request to the CDC and they will send out a team to investigate whether

they think this is by chance or actually environmental.

o So as people came by their camp he would ask them all kinds of things: their age, their gender, take their blood pressure, did you have any painkiller today, what medications have you been on, how are you feeling; he had a scale of how sick you were. And he collected all this different data on these people as they came through. Well, some of those things might turn out to be significant, like the difference between gender and mountain sickness, so if you test a whole bunch of things, it's like how with confidence intervals you sometimes miss just because of chance. Sometimes
