ANOVA
Middle of the road: Tukey, Scheffe
58. Construct a line graph from a hypothetical experiment that illustrates a main effect, but not an interaction.
...
Testing variation among the means of two or more groups
A One-Way ANOVA refers to the fact that the design has only 1 IV. A Two-Way ANOVA, by contrast, has 2 IVs, and a Three-Way ANOVA has 3 IVs.
45. Under what situations can a researcher make a Type II error
A researcher can make a Type II error when there is a mean difference in the population but random chance yields a sample with a small mean difference and a p-value > .05. Type II errors are easier for researchers to make when power is low. So factors that negatively influence power can create situations when a researcher makes a Type II error (such as small effect size or low sample size).
8. What type of research questions does ANOVA address
ANOVA is used to address comparative research questions involving 2+ groups, where the IV is categorical and the DV is continuous. Ex: Do 3 different therapy types differentially impact client depression scores?
One-Way ANOVA: needs a grouping variable (remember the independent samples t-test?)
Analyze → Compare Means → One-Way ANOVA → Factor = IV; Dependent List = DV → Options → Descriptives; Post Hoc (more about this later)
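The same one-way ANOVA the SPSS menus produce can be computed by hand; here is a minimal numpy sketch with made-up scores for three hypothetical groups, building F from the sums of squares discussed below:

```python
import numpy as np

# Hypothetical scores for three conditions (made-up data)
groups = [np.array([3.0, 5.0, 4.0, 6.0]),
          np.array([7.0, 8.0, 6.0, 9.0]),
          np.array([2.0, 3.0, 4.0, 3.0])]

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

# Between-group SS: each group's mean vs. the grand mean, weighted by n
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group SS: each score vs. its own group mean (error/noise)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_between = len(groups) - 1               # k - 1
df_within = len(all_scores) - len(groups)  # N - k

ms_between = ss_between / df_between
ms_within = ss_within / df_within
F = ms_between / ms_within                 # treatment variance / error variance
print(F)                                   # ≈ 15.75 for these made-up scores
```

Note how SS-between plus SS-within equals the total SS of all scores around the grand mean, which is what partitioning the variance means.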
46. As effect size increases (i.e., the groups become more different), what happens to power
As effect size increases, power increases.
34. How does Fisher's LSD compare to Tukey's test in terms of Type I error control
Fisher's LSD provides no protection against Type I errors, whereas Tukey's HSD keeps the familywise Type I error rate at .05.
13. If the null hypothesis is false, what does MSbetween quantify
If the null hypothesis is false, the groups have different means in the population, indicating that there is a treatment effect. MSbetween then quantifies not only sampling error/noise but also the treatment effect.
15. If the null hypothesis is false, what value (or range of values) would we expect the F to take on
If the null is false (i.e. there is an effect), the F statistic should be greater than 1.
53. What information do you need in order to conduct a power analysis
In order to conduct a power analysis you need at least 3 of the 4 pieces of information below: 1) Alpha level 2) Effect size 3) Sample Size 4) Power
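Power (given alpha, effect size, and sample size) can also be estimated by simulation; a sketch under made-up assumptions (3 groups, population means 0, 0.5, and 1.0, SD = 1, n = 30 per group), counting how often a one-way ANOVA rejects a false null:

```python
import numpy as np
from scipy.stats import f_oneway

# Monte Carlo power estimate for a 3-group one-way ANOVA
# (illustrative assumptions: these means, SDs, and n are made up)
rng = np.random.default_rng(42)
pop_means = [0.0, 0.5, 1.0]
n_per_group, n_sims, alpha = 30, 2000, 0.05

rejections = 0
for _ in range(n_sims):
    samples = [rng.normal(m, 1.0, n_per_group) for m in pop_means]
    _, p = f_oneway(*samples)
    if p < alpha:
        rejections += 1

power = rejections / n_sims  # proportion of simulated true effects detected
print(round(power, 3))
```

Changing any one of alpha, effect size (the spread of `pop_means`), or `n_per_group` and re-running shows how the fourth quantity, power, responds.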
23. What null hypothesis is being tested by Levene's test
Levene's test computes the absolute value of each score's distance from the group mean (i.e., a participant's contribution to the within-group variability) and then uses those scores as the DV in an ANOVA analysis. The null hypothesis is that the standard deviations of the groups are equal (σ1 = σ2 = ... = σk, i.e., homogeneity of variance).
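That mechanism can be verified directly: the mean-centered version of Levene's test is exactly a one-way ANOVA on each score's absolute deviation from its own group mean. A sketch on made-up data:

```python
import numpy as np
from scipy.stats import levene, f_oneway

# Made-up scores for three groups
g1 = np.array([4.0, 6.0, 5.0, 7.0])
g2 = np.array([2.0, 9.0, 1.0, 10.0])   # much more spread out
g3 = np.array([5.0, 5.5, 4.5, 5.0])

# Levene's statistic (mean-centered version) ...
W, p = levene(g1, g2, g3, center='mean')

# ... equals a one-way ANOVA where the DV is each score's absolute
# distance from its group mean (its contribution to within-group variability)
abs_dev = [np.abs(g - g.mean()) for g in (g1, g2, g3)]
F, p2 = f_oneway(*abs_dev)
```

(SciPy's default is `center='median'`, the Brown-Forsythe variant; `center='mean'` matches the original Levene formulation described above.)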
31. What is the difference between post-hoc and planned comparisons? When would you use one versus the other
Planned comparisons: a researcher specifies hypotheses about specific groups that she wants to compare PRIOR to the study. Post hoc (unplanned) comparisons: a researcher performs exploratory analyses to determine which groups differ; this usually involves every possible pairwise comparison. You'd use planned comparisons when you have a specific hypothesis about specific groups (hypothesis testing) and post hoc comparisons when you don't (exploratory research).
10. Conceptually, what does between-group sums of squares quantify
Sum of Squares Between is the variability in the dependent variable that is due to the independent variable... this represents our treatment effect.
11. Conceptually, what does within-group sums of squares quantify
Sum of Squares Within is the leftover variability in the dependent variable that is NOT due to the independent variable... this represents error/noise.
67. How is "error" (e.g., SSW or MSW) defined in a two-factor ANOVA
The definition is the same as in the one-factor ANOVA: error is the left-over variation not due to the independent variable(s). It may come from random noise (e.g., human error) or from systematic variability (e.g., gender differences). With systematic variability, adding the relevant variable as an additional independent variable lets you explain some of the within-group variability (error) and thus reduce error (and increase power).
41. How would you characterize the magnitude of each of the above effect sizes
The eta squared would be a medium to large effect, and the Hedges's g would be a very small effect (it misses the conventional cutoff for small).
18. What null hypothesis is being tested by ANOVA
The groups will have identical means in the population.
56. How is a main effect defined in a two-factor ANOVA
The main effect, in two-factor ANOVA, compares the means of one factor while completely ignoring the second factor. The main effect of a factor is examined by looking at the differences in marginal means within the factor of interest.
9. What type of data are required for the IV and DV (categorical, continuous, etc.)
To use ANOVA, we need continuous DVs and categorical/nominal IVs (note that covariates- control variables- can be categorical or continuous in ANOVA).
Does decision time influence the number of correct answers on a multiple-choice test?
We randomly assign 12 people to 3 different conditions. We then let all of our subjects read a paragraph about global warming. Next, each person is given the same multiple-choice exam to test their comprehension of the paragraph. The first group has 1000 msec to choose an alternative, the second group has 5000 msec, and the third group is given unlimited time to choose an answer.
A raw score is composed of three components: X = μ + T + e, where
μ = population mean
T = treatment effect (difference due to your IV manipulation)
e = error (individual differences, errors in measurement)
The F-Ratio
F = treatment variance / error variance = good variance / bad variance = variability between groups / variability within groups (the numerator contains treatment effect plus slop; the denominator contains slop alone)
59. Construct a line graph from a hypothetical experiment that illustrates an interaction.
...
61. Construct a table of means from a hypothetical experiment that illustrates the presence of an interaction.
Means and Standard Deviations by Gender and Condition

Variable     M    SD    n
Males
  Control    2    1.22  5
  Violent    8    1.58  5
Females
  Control    3    2.00  5
  Violent    4    1.00  5
6. What is the alpha level, and what purpose does it serve in the significance testing process
Alpha is the researcher-designated significance level of the test. It is the probability level associated with the decision rule: a p-value greater than alpha means the data are consistent with the null, and a p-value less than alpha results in rejection of the null.
21. Give an example of a study or data collection scenario where independence is violated.
An example of study with independence violation would be a treatment study where certain clients share the same therapist.
52. Still referring to the scenario above, which of the influences on power most likely caused this set of outcomes
An extremely large sample size probably caused this set of outcomes to occur.
Conservative test: Bonferroni
48. Changing your alpha level from .05 to .10 will have what effect on power
Changing your alpha level from .05 to .10 will increase your power.
66. In a 3x2 ANOVA, how many different simple effects tests are possible
          Group A   Group B   Group C
Level 1   1A        1B        1C
Level 2   2A        2B        2C

There are 5 simple effects tests possible: 2 from the levels (comparing the three groups within Level 1 and within Level 2) and 3 from the groups (comparing the two levels within Group A, within Group B, and within Group C).
22. What impact will independence violations have on your results, and how serious is this violation
Independence violations will bias your results so that random error is underestimated and the rate of false-positive significance tests increases. This is a VERY serious violation that would INVALIDATE the ANOVA analysis. Hierarchical linear models (AKA multilevel models or mixed linear models) are more appropriate in these scenarios because they can explicitly estimate the extent of the clustering effect.
27. Levene's test yields a p-value of .47. What is your conclusion regarding homogeneity of variance
Recall that Levene's test tests the null hypothesis that the group variances are equal. A Levene's test yielding a p-value of .47 (non-significant!) indicates that the group variances are not significantly different from one another, and thus the homogeneity of variance assumption has not been violated.
42. Define a Type I error.
Type one error is specified by alpha in advance of the study. It can be defined as the probability of rejecting the null hypothesis (saying there is an effect) when there is in fact no effect in the population. Type one error can only occur when there is no mean difference in the population, and when random chance yields a sample with an extreme mean difference (p <.05).
54. What factors can and cannot (or should not) be manipulated in conducting a power analysis
Usually, in a power analysis, the desired level of significance (alpha) and power (1 − β) are fixed. The estimated effect size should not be changed, but in reality these estimates may fluctuate depending on how comfortable PIs are with the initial results of the power analysis (the effect can also sometimes be increased experimentally, like by giving the rats more hormone). Sample size is the factor most often manipulated in a power analysis, depending on study resources. You can also decrease the heterogeneity of your sample and thus reduce noise.
70. What are the analytic steps following a significant interaction
When the interaction is significant, ignore the main effects and perform additional analyses that help you understand the interaction. These additional analyses test simple effects, examining the influence of one IV within each group of the other IV. Simple effects are similar to one-factor ANOVAs performed on a subset of the participants. Also, since you have two factors, you can test the simple effects by splitting the design by whichever IV you prefer. (No need to perform both sets of analyses; choose the split that best addresses your research question.)
50. How do you interpret a power value of, say .70
A power value of .70 means you have a 70% chance of detecting a treatment effect of a particular magnitude, if the effect truly exists.
17. Suppose that the standard deviations within each group went from 5 to 8. Holding all other values constant, what impact would this have on the F statistic
If the standard deviations increase, then MSw will also increase, because it is based on the within-group standard deviations and represents error. So if the standard deviations increase, the F statistic will decrease: the amount of error in the study increased, changing the ratio of treatment effect to error (F).
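This can be demonstrated numerically: take made-up groups, rescale each score's deviation from its group mean (leaving the means, and hence MS-between, untouched), and watch F shrink:

```python
import numpy as np
from scipy.stats import f_oneway

# Three hypothetical groups with fixed means 4, 6, 8 (made-up data)
base = [np.array([1.0, -1.0, 0.5, -0.5]) + m for m in (4.0, 6.0, 8.0)]

def inflate(groups, c):
    # rescale each score's deviation from its group mean by c,
    # leaving the group means (and hence MS-between) untouched
    return [g.mean() + c * (g - g.mean()) for g in groups]

F_small, _ = f_oneway(*inflate(base, 1.0))  # original spread
F_large, _ = f_oneway(*inflate(base, 1.6))  # SDs scaled up (like 5 -> 8)
print(F_small, F_large)
```

Because MS-between is unchanged while MS-within scales with the square of the spread factor, F drops by exactly that squared factor.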
49. Why is changing your alpha level a bad way to manipulate power
It's a bad way to manipulate power because it increases your Type I error rate (the chances of having a false positive).
68. What is an unbalanced design, and what negative outcome can result from such a design
An unbalanced design is one in which the group sizes are unequal. The negative outcome is that the benefit of adding a second factor (IV) decreases: with unequal group sizes the IVs become correlated, which reduces the SS attributable to each IV. In contrast, when IVs are uncorrelated in a balanced design, each IV accounts for unique variation in the DV. Adding a factor (IV) often increases power by reducing error, but in unbalanced designs this benefit is compromised.
14. If the null hypothesis is true, what is the expected value of the F statistic
If the null is true (i.e. there is no effect), the F statistic should be 1 (or close to 1). The ratio of the between-group variability to the within-group variability is nearly equal, which is why the F value is close to 1.
16. Suppose that the between-‐group variability doubled. Holding all other values constant, what impact would this have on the F statistic
The F statistic would increase because an increase in the between-group variability indicates a larger difference between the groups (i.e. a larger treatment effect) and this would be reflected by a larger F statistic.
The F-ratio can assume many different values; the exact shape of the sampling distribution is determined by the df.
F = 1 is not good; F < 1 is not good; F > 1 is (potentially) good.
44. Under what situations can a researcher make a Type I error
A Type I error can only occur when there is no mean difference in the population and random chance yields a sample with an extreme mean difference. This can happen because of independence violations (which shrink within-group variability, since each individual is not contributing one unit of unique information). Type I errors can also occur when the homogeneity of variance assumption is violated: if group sizes are equal, the Type I error rate increases to roughly 8% instead of 5%; if the groups are unequal, the ANOVA is driven by the group with the largest n, and if that largest group also has the smallest standard deviation, the within-group variability is reduced, increasing the Type I error rate. (Note that a massive n does not by itself inflate Type I error: when the null is true, the rejection rate stays at alpha regardless of sample size; a large n inflates power only when an effect actually exists.)
69. What assumptions are required by a two-factor ANOVA
1) Independence of observations (pg. 2). The independence assumption requires that one participant's score is not related to or influenced by other participants' scores.
2) Homogeneity of variance (pg. 4). Homogeneity of variance means that the variability is the same within each group, i.e., the standard deviations of the groups must be the same.
3) Normality (pg. 10). ANOVA assumes that the population of scores is normally distributed within each group. For example, if we were to look at an individual therapy group, anger outbursts must follow a normal curve in that population. It is not useful to look at the distribution of the entire sample, because that distribution may not be normal even when the individual groups are normally distributed.
Factors that produce power for t will produce power for F:
1. The difference between groups gets larger
2. The within-group variance gets smaller
3. Sample sizes get larger
2. Suppose that a large p-value is obtained for a particular sample (e.g., p = .60). In this case, are the data consistent or inconsistent with the null hypothesis
A large p-value (p = .60) means that the data are consistent with the null hypothesis: based on this sample, there is a 60% chance of getting a result this extreme or more extreme by random chance from a population where the null is true.
36. What is the difference between a simple comparison and a complex comparison
A simple comparison involves only two conditions. A complex comparison pools two (or more) conditions and compares the pooled group to another condition.
62. Describe (in words), the results from a hypothetical study where an interaction effect is present.
An interaction effect was present in the current analysis. Specifically, the effect of video-game condition on aggression was moderated by gender, such that the violent video game condition compared to controls resulted in significantly more aggressive behavior in boys, but not in girls. For girls, there was no significant difference between the video-game and control condition on aggressive behavior.
47. As individual differences among subjects increase, what happens to power
As individual differences among subjects increase, power decreases.
63. How will the addition of a second factor change the SSB and SSW values from a one-factor ANOVA
Compared to the SSw of a one-factor ANOVA, the addition of a second factor would decrease the value of SSw in a two-factor ANOVA. SSw represents the left-over variation that is not due to the independent variable, and the addition of a second factor can help explain some of the systematic variation in SSw, and thereby decrease its value, or error. The SSb term will stay the same. Since SSb represents the comparison of each group's average (e.g. IV-video game condition, or gender) to the grand mean, this will be the same regardless of how many factors you add to the analysis.
7. Describe what a confidence interval is and how to interpret one.
Confidence intervals provide a range of values intended to estimate a population parameter based on our sample. So say we have a 95% confidence interval with a lower bound of 2 and an upper bound of 3.5. This means that for 95% of the samples we could collect, an interval constructed this way would contain the true population value; so we are 95% confident that the true population value falls between 2 and 3.5.
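The standard construction (mean plus or minus a t critical value times the standard error) can be sketched for a made-up sample:

```python
import numpy as np
from scipy import stats

# 95% CI for a mean, computed from a hypothetical sample (made-up data)
sample = np.array([2.1, 3.4, 2.8, 3.0, 2.5, 3.3, 2.7, 2.9])
n = len(sample)
mean = sample.mean()
se = sample.std(ddof=1) / np.sqrt(n)        # standard error of the mean

t_crit = stats.t.ppf(0.975, df=n - 1)       # two-tailed, alpha = .05
lower, upper = mean - t_crit * se, mean + t_crit * se
print(round(lower, 2), round(upper, 2))
```

Larger samples shrink the standard error (and the critical t), so the interval narrows, which is the same logic that links sample size to power.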
64. What will happen to power when a second factor is added to an ANOVA
Could power increase or decrease? Under what situation could an increase or decrease occur? Adding a second factor to your ANOVA can cause power either to increase or to decrease, depending on the context. Adding a second factor that is actually related to the DV can help explain some of the systematic variation in SSw, reduce error, and therefore increase power. However, adding a second factor that is unrelated to your DV can actually reduce power. When you calculate the within-group mean square for the F statistic, MSw = SSw/dfw, and adding a second factor "diverts" degrees of freedom to the other IV: having two factors necessarily makes your dfw term smaller. Dividing SSw by a smaller number creates a larger MSw (error) term, and since F = MSb/MSw, a larger MSw makes F smaller and therefore less likely to reach significance. The key point is that an unrelated second factor will not account for much error in the data, so SSw will not shrink; this, coupled with the necessarily smaller dfw of a two-factor ANOVA, can result in reduced power.
65. What are simple effects
Describe a hypothetical study, and explain the concept of simple effects within the context of that study. A simple effect is the effect of one IV examined within a single level of the other IV; it is like a one-factor ANOVA performed on the subset of participants at that level. Example: a 2 (gender) x 3 (therapy: individual, group, control) study of anxiety symptoms. The simple effect of therapy for females tests whether the three therapy conditions differ among females only, and the simple effect of therapy for males does the same among males. Within a significant simple effect, contrast codes can then yield the desired follow-up comparisons:

                     Individual   Group   Control
Indiv. vs. Control   1            0       -1
Group vs. Control    0            1       -1
38. Under which situations can you use the various effect size measures
For example, when might you choose eta squared over Hedges's g? Eta squared is the appropriate effect size measure for an ANOVA with more than two groups. Both eta squared and omega squared give you the proportion of variance in the DV explained by the IVs, but the two have different computations. Eta squared overestimates the proportion of variance accounted for in the population, though it is accurate for the sample. Omega squared is an "adjusted" estimate of the proportion of variance accounted for, and therefore a better estimate for the population (and sample). These differences matter for small sample sizes; the two effect sizes become comparable with large samples. The cutoffs are small > .01, medium > .06, large > .14. Hedges's g is an appropriate effect size measure for planned comparisons. It is very similar to Cohen's d in that it uses a pooled standard deviation to standardize the mean difference. The cutoffs are small > .20, medium > .50, large > .80 (the same as Cohen's d).
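Both families of effect size can be computed from the ANOVA sums of squares; a numpy sketch on made-up data for three groups, with a planned g1-vs-g2 comparison (the small-sample correction that distinguishes Hedges's g from Cohen's d is omitted for brevity):

```python
import numpy as np

# Hypothetical data for three groups (made up for illustration)
g1 = np.array([3.0, 5.0, 4.0, 6.0])
g2 = np.array([7.0, 8.0, 6.0, 9.0])
g3 = np.array([2.0, 3.0, 4.0, 3.0])
groups = [g1, g2, g3]

scores = np.concatenate(groups)
grand = scores.mean()
ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
ss_total = ((scores - grand) ** 2).sum()
ss_within = ss_total - ss_between
df_b = len(groups) - 1
ms_within = ss_within / (len(scores) - len(groups))

eta_sq = ss_between / ss_total  # sample-based; overestimates the population value
omega_sq = (ss_between - df_b * ms_within) / (ss_total + ms_within)  # adjusted

# Standardized mean difference for the g1-vs-g2 planned comparison
# (pooled-SD version, without the small-sample correction)
sp = np.sqrt(((len(g1) - 1) * g1.var(ddof=1) + (len(g2) - 1) * g2.var(ddof=1))
             / (len(g1) + len(g2) - 2))
g = (g2.mean() - g1.mean()) / sp
```

As the notes say, the adjusted omega squared comes out smaller than eta squared, and the gap narrows as n grows.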
Repeated-measures ANOVA: conceptually related to the within-subjects t-test. No grouping variable is needed; the data are in separate columns.
Analyze → General Linear Model → Repeated Measures → Name the factor and report the number of levels (ADD, then DEFINE) → Move the levels over → Options (click DESCRIPTIVE STATISTICS) → OK
28. F-max yields a value of 9.78. What is your conclusion regarding homogeneity of variance
Going by Craig's rule of thumb that 3 is the cutoff for F-max values we should be concerned about (i.e., if F-max is less than 3, the accuracy of the test may not be substantially compromised), homogeneity most likely HAS been violated. (Note that different sources suggest different rules of thumb, so we should check p-values for these F statistics to be safe.)
24. Why is homogeneity of variance important in ANOVA
Homogeneity of variance is important in ANOVA because the F statistic is influenced by within-group variability (error), which is calculated by adding up the sums of squares within each group. If the group variances differ from one another, then pooling them is not appropriate and will not yield an estimate of the common within-group variance (since no common variance exists).
55. What is a marginal mean
How are marginal means computed? In a two-factor ANOVA, the marginal means for one factor are the means for that factor averaged across all levels of the other factor. For example, in the phone data, the marginal mean for the Android group would be the mean number of sexual partners in the Android group averaged across gender. So if Android females had a mean of 6 and Android males had a mean of 8, the Android marginal mean = 7. NOTE: The marginal means for the groups within one factor are what get compared when considering the main effect of that factor. So the 3 marginal means for the phone groups would be compared to test the main effect of phone group.
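A small numpy sketch of the same computation, using the Android cell means from the notes plus made-up values for the other cells (simple averaging of cell means assumes a balanced design with equal cell sizes):

```python
import numpy as np

# Cell means: rows = gender (female, male), cols = phone group
# Android values (6, 8) come from the notes; the other cells are made up
cell_means = np.array([[6.0, 5.0, 4.0],    # females
                       [8.0, 3.0, 2.0]])   # males

phone_marginals = cell_means.mean(axis=0)   # average over gender
gender_marginals = cell_means.mean(axis=1)  # average over phone group
print(phone_marginals[0])                   # Android marginal mean: 7.0
```

The main effect of phone group compares the three entries of `phone_marginals`; the main effect of gender compares the two entries of `gender_marginals`.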
3. Suppose that a researcher was comparing males and females on some dependent variable using an ANOVA or t-test. The p-value was .08. What is the interpretation of the p-value (Note: the answer has nothing to do with whether the p-value is significant or not).
If the null hypothesis is true, a p-value of .08 means that a t statistic as extreme as (or more extreme than) the one calculated for this sample has an 8% probability of occurring by chance. Since we usually use .05 as our cutoff, we consider the data consistent with the null hypothesis, meaning we cannot conclude that males and females differ in the population on the dependent variable being measured.
Liberal: Least Significant Difference (LSD)
Do the means of the samples differ more than expected if the null hypothesis were true?
Like the t-test, the F ratio is a ratio of GOOD variance to BAD variance: the between-groups population variance estimate (GOOD variance) divided by the within-groups population variance estimate (BAD variance).
12. If the null hypothesis is true, what does MSbetween quantify
If the null hypothesis is true, MSbetween quantifies only noise/sampling error: there is no treatment effect, so any mean differences we find are due to chance. (When a true null nevertheless produces a large MSbetween, that is what leads to a Type I error.)
5. If the mean difference between two groups would be relatively common from a population where the null is true, what would the p-value look like
Mean differences that would be relatively common if the null were true would yield large p-values, much closer to 1 (i.e., p-values greater than .1).
4. If the mean difference between two groups would occur very rarely from a population where the null is true, what would the p-value look like?
Mean differences that would occur rarely if the null were true will have very small probability values, i.e. p-values less than .05.
60. Construct a table of means from a hypothetical experiment that illustrates a main effect for one (or both) factor(s), but not an interaction.
Means and Standard Deviations by Condition

Variable      M     SD    n
Control       2.5   1.5   10
Non-Violent   3.5   1.4   10
Violent       6     1.2   10
57. What is meant by moderation, or interaction
Moderation/interaction occurs when the relationship of an IV to a DV depends on, or changes across, the levels of a second IV. A test of the interaction examines whether the effects of one IV are uniform across all groups of the second IV.
32. Suppose that a researcher conducted an ANOVA with three groups, and wanted to do pairwise comparisons among all the groups. Which follow-up procedure would be most powerful: Tukey or Scheffe
The Tukey test would be most powerful. Tukey is appropriate when you want to compare all possible pairwise comparisons. The Scheffe procedure compares all possible pairwise comparisons and all possible complex comparisons. The Scheffe procedure is usually undesirable because it adjusts the p-value for too many comparisons making it difficult to detect group differences (i.e., the test lacks power).
20. What is the independence assumption
The independence assumption is the requirement in ANOVA that one participant's score not be related to or influenced by another participant's score. Statistically speaking, in ANOVA the standard error of the mean is calculated as σ/√N. Because it divides by N, the formula assumes that each individual contributes one "unit" of information. Violating independence causes redundancies in the data, such that each score contains less than one unit of unique information.
33. What is the difference between Tukey and other post hoc tests (e.g., Fisher, Scheffe)
The main difference is the level of protection against false positives: Fisher is at the no-protection end, Scheffe is at the maximum-protection end, and Tukey and Dunnett are in the middle.

Fisher's LSD — Does not protect against Type I error. Logic: if F is significant, the null is false, so Type I errors are not possible.
Scheffe test — Goal is to compare all possible pairwise comparisons AND complex comparisons. "Over-corrects" (adjusts the p-value for too many comparisons), giving maximum protection against Type I errors but making it difficult to detect any group differences.
Tukey's HSD — Goal is to compare all possible pairwise comparisons. Keeps the familywise Type I error rate at .05. The most common procedure used in psych.
Dunnett's test — Goal is to compare a reference group to every other group. Similar to Tukey but uses a smaller correction factor because it assumes fewer comparisons. If you're comparing intervention vs. control, this is the best to use.
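Recent SciPy versions ship a direct implementation of Tukey's HSD; a sketch on made-up data for three groups (assumes a SciPy version that includes `scipy.stats.tukey_hsd`):

```python
import numpy as np
from scipy.stats import tukey_hsd

# Hypothetical scores for three groups (made-up data)
g1 = np.array([3.0, 5.0, 4.0, 6.0])
g2 = np.array([7.0, 8.0, 6.0, 9.0])
g3 = np.array([2.0, 3.0, 4.0, 3.0])

# All pairwise comparisons, familywise Type I error rate held at .05
res = tukey_hsd(g1, g2, g3)
print(res.pvalue)   # matrix of adjusted p-values for each pair of groups
```

For these made-up scores the g1-vs-g2 pair is significant while g1-vs-g3 is not, illustrating that HSD adjudicates each pair while controlling the familywise rate.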
1. What role does the null hypothesis play in the significance testing process
The null hypothesis defines the situation when the treatment makes no difference. When the null is true, the groups of interest are equivalent (and have equivalent means) in the population.
35. Suppose that a researcher did an ANOVA with four groups. Further, she wanted to compare the first two groups against the second two groups using a planned comparison. What are the contrast coefficients that would be used for this analysis
The way you go about assigning weights (i.e., contrast coefficients):
● Each mean gets a weight.
● Means not involved in the contrast get a weight of zero.
● Means being compared in the contrast have weights with opposite signs (positive or negative).
● All weights must sum to zero.
Therefore, in this question, where you have four groups and want to compare the first two against the second two, you can use the following contrast coefficients:

                     Group 1   Group 2   Group 3   Group 4
Complex comparison   -1        -1        +1        +1
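Applying those weights is just a weighted sum of the group means; a sketch with made-up means for the four groups:

```python
import numpy as np

# Complex comparison: first two groups vs. second two (hypothetical means)
means = np.array([4.0, 5.0, 8.0, 9.0])
weights = np.array([-1, -1, 1, 1])

assert weights.sum() == 0           # contrast weights must sum to zero
contrast = (weights * means).sum()  # (mean3 + mean4) - (mean1 + mean2)
print(contrast)                     # 8.0
```

A contrast value far from zero (relative to its standard error) is evidence that the pooled pairs of groups differ.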
40. How do you interpret a Hedge's g of, say .10
This is a tiny effect size (the conventional standard for small is .20). It tells us there is a .10 standard deviation difference in the means.
39. How do you interpret an eta squared of, say, .10
This would mean that the proportion of variance explained by the IVs is 10%. This is a medium to large effect according to conventional standards.
43. Define a Type II error.
Type II error is the probability of failing to reject the null when there is in fact an effect in the population. A Type II error can only occur when there is a mean difference in the population and random chance yields a sample with a small mean difference. Type II error is inversely related to power: 1 − power = Type II error rate. Since we usually want power of .80, we are accepting a Type II error rate (failing to detect an effect even though there is one in the population) of 20%.
25. Under what condition will violating homogeneity of variance lead to an increase in Type I errors
Violating homogeneity of variance can lead to an increase in Type I errors (false positives) when: (1) the group sample sizes are equal, OR (2) the group sizes are unequal and the group with the largest n has the smallest standard deviation. These conditions make the within-group variability estimate too small, giving an F statistic that is too big and biasing the results toward a Type I error.
30. What is a Bonferroni adjustment, and when would you use it
When the familywise Type I error rate is high, we can use post hoc procedures that inflate the probability values for each comparison to protect against Type I errors; the Bonferroni procedure is one such procedure. Specifically, the Bonferroni adjustment multiplies each unadjusted p-value by the number of comparisons (e.g., if we are comparing 3 different treatment groups, we would multiply each p-value by 3) to produce Bonferroni p-values. Note that there is also an "alternate Bonferroni procedure" that reduces the alpha level for each test by dividing the desired alpha by the number of comparisons (i.e., instead of multiplying the unadjusted p-values by 3, you would divide the desired alpha by 3). Finally, note that the Bonferroni adjustment is popular because it can be applied to any statistical test (e.g., a table of correlations), but it is not ideal for post hoc tests because it tends to overcorrect the p-value and make it very difficult to detect group differences.
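Both versions of the adjustment are one-liners; a sketch with made-up p-values from 3 pairwise comparisons:

```python
import numpy as np

# Unadjusted p-values from 3 hypothetical pairwise comparisons
p_raw = np.array([0.010, 0.030, 0.400])
k = len(p_raw)

# Bonferroni: multiply each p by the number of comparisons (capped at 1) ...
p_bonf = np.minimum(p_raw * k, 1.0)

# ... or, equivalently, test each unadjusted p against alpha / k
alpha_adj = 0.05 / k
print(p_bonf, alpha_adj)
```

Either route yields the same decisions: a raw p of .030 survives neither form here (.09 > .05, and .030 > .0167), illustrating the overcorrection the notes warn about.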
71. What are the analytic steps following a non-significant interaction
When the interaction is non-significant, examine the F statistics for the main effects only. In addition, when the interaction is not significant, each main effect is similar to a one-factor ANOVA. If a main effect is significant, perform pairwise comparisons.
51. Suppose that a study is comparing 2 groups and finds an effect size close to zero (e.g., d = 0.1). However, the results were statistically significant (e.g., p < 0.05). Does the study have too much, too little, or just the right amount of power
When we detect a statistically significant result even with a "trivial" effect size (an effect size close to zero), the study may have had a sample size that was too large; the study probably had too much power, driven by an extremely large sample size. As an aside, other scenarios: 1) We probably have just the right amount of power when we have a "decent" effect size and p < 0.05, or a trivial effect size and p > 0.05, because n is just the right size. 2) We do not have enough power when we have a "decent" effect size and p > 0.05, probably because n is too small.
26. Under what condition will violating homogeneity of variance lead to an increase in Type II errors
When the within-group variability estimate is too large, we can expect an increase in Type II errors (false negatives). By the logic of the Type I case, this occurs with unequal group sizes when the group with the largest n also has the largest standard deviation, which inflates the error term and makes the F statistic too small.
37. Can p-values be used as measures of effect size across studies
Why or why not? P-values cannot be used as measures of effect size across studies because they are dependent on sample size. On the other hand, effect size measures quantify the magnitude of the association in a way that is independent of sample size.
19. Suppose that the ANOVA yielded a significant F statistic (e.g., p < .05). What conclusion can you draw from this
You could conclude that the results are statistically significant, indicating that the mean of at least one group is significantly different from the others.
29. What is a familywise Type I error rate
The familywise Type I error rate is the Type I error rate for the experiment as a whole, including all of the comparisons tested in the analysis. The separate per-comparison probabilities combine to produce a much larger value: the familywise rate is the probability that at least one Type I error (a false positive; finding evidence for the alternative hypothesis when the null is actually true) has been committed somewhere among the various tests conducted in the analysis. Familywise Type I error corrections are only needed for post hoc comparisons, not planned comparisons. (From page 6 of the post-hoc test notes: the probability of making one or more Type I errors across a set of tests is called the familywise Type I error rate.)