# pasta and vinegar

mind/tech bazar from outer space

# [Research] Do your data violate one-way ANOVA assumptions?

A quick reminder I used to explain stuff to our students:

If the populations sampled for a one-way analysis of variance (ANOVA) violate one or more of the test's assumptions, the results of the analysis may be incorrect or misleading. For example, if the assumption of independence is violated, the one-way ANOVA is simply not appropriate, although another test (perhaps a blocked one-way ANOVA) may be. If the assumption of normality is violated, or outliers are present, the one-way ANOVA may not be the most powerful test available, and this could mean the difference between detecting a true difference among the population means and missing it. A nonparametric test or a transformation of the data may yield a more powerful test. A potentially more damaging violation occurs when the population variances are unequal, especially if the sample sizes are not approximately equal (unbalanced).

Often, the effect of an assumption violation on the one-way ANOVA result depends on the extent of the violation (such as how unequal the population variances are, or how heavy-tailed one or another population distribution is). Some small violations may have little practical effect on the analysis, while others may render the result incorrect or uninterpretable. In particular, small or unbalanced sample sizes can increase vulnerability to assumption violations.
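To make these quantities concrete, here is a minimal sketch in plain Python (standard library only) that computes the one-way ANOVA F statistic by hand, together with two rough indicators of how exposed the analysis is: the ratio of the largest to the smallest sample variance, and the ratio of the largest to the smallest sample size. The data and the function names (`one_way_anova_f`, `variance_ratio`, `size_ratio`) are made up for illustration. Turning F into a p-value requires the F distribution, which the standard library does not provide, so that step is left to a statistics package.

```python
from statistics import mean, variance


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def variance_ratio(groups):
    """Largest sample variance divided by the smallest."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


def size_ratio(groups):
    """Largest sample size divided by the smallest (1.0 means balanced)."""
    sizes = [len(g) for g in groups]
    return max(sizes) / min(sizes)


# Hypothetical data: three balanced groups with equal spread.
groups = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0],
]

f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
print(f"variance ratio = {variance_ratio(groups):.2f}")
print(f"size ratio = {size_ratio(groups):.2f}")
```

The further both ratios drift from 1, and especially when they do so together, the less the F statistic should be trusted without a closer look at the data.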

Potential assumption violations include:

- Implicit factors: lack of independence within a sample
- Lack of independence: lack of independence between samples
- Outliers: apparent nonnormality caused by a few data points
- Nonnormality: nonnormality of entire samples
- Unequal population variances
- Patterns in plots of data: detecting assumption violations graphically
- Special problems with small sample sizes
- Special problems with unbalanced sample sizes
- Multiple comparisons: effects of assumption violations on multiple comparison tests
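A couple of the items above (outliers, unequal spread, unbalanced sizes) can be screened numerically before plotting anything. The sketch below is one possible first pass, not a substitute for looking at the data: it flags points outside Tukey's 1.5 × IQR fences (a common convention, chosen here as an assumption) and prints the size, mean and standard deviation of each group. The data and the helper names `tukey_outliers` and `group_summary` are hypothetical.

```python
from statistics import mean, quantiles, stdev


def tukey_outliers(sample, k=1.5):
    """Return the points lying outside (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = quantiles(sample, n=4)  # quartiles of the sample
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in sample if x < low or x > high]


def group_summary(groups):
    """Return one (n, mean, sd) tuple per group."""
    return [(len(g), mean(g), stdev(g)) for g in groups]


# Hypothetical data: the last point of the second group looks suspicious.
groups = [
    [4.1, 4.8, 5.0, 5.3, 5.9, 6.2],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 100],
]

flagged = [tukey_outliers(g) for g in groups]
for (n, m, s), out in zip(group_summary(groups), flagged):
    print(f"n={n:2d}  mean={m:6.2f}  sd={s:6.2f}  flagged={out}")
```

A flagged point is only a prompt to investigate; whether it is a recording error or a genuine heavy tail decides what to do next (correct it, transform the data, or switch to a nonparametric test).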