Can I trust ANOVA results for a non-normally distributed DV?

I have analyzed an experiment with a repeated measures ANOVA. The ANOVA is a 3x2x2x2x3 with 2 between-subject factors and 3 within (N = 189). Error rate is the dependent variable. The distribution of error rates has a skew of 3.64 and a kurtosis of 15.75. The skew and kurtosis are the result of 90% of the error rate means being 0. Reading some of the previous threads on normality tests here has me a little confused. I thought that if you had data that were not normally distributed, it was in your best interest to transform them if possible, but it seems that a lot of people think analyzing non-normal data with an ANOVA or a t-test is acceptable. Can I trust the results of the ANOVA? (FYI, in the future I intend to analyze this type of data in R with mixed models with a binomial distribution.)

asked Dec 21, 2010 at 21:38

$\begingroup$ Could you link to some of those threads? My gut instinct is "NOOO no no no", but I'm hardly an expert and I'd be interested in reading some of those arguments. $\endgroup$

Commented Dec 21, 2010 at 21:54

$\begingroup$ You sure can't trust any p-values derived from F distributions with those kinds of data! $\endgroup$

Commented Dec 21, 2010 at 21:56

$\begingroup$ Many cite the ANOVA's robustness as justification for using it with non-normal data. IMHO, robustness is not a general attribute of a test; you have to state precisely a) against which violations of its assumptions a test is robust (normality, sphericity, ...), b) to what degree these violations have no big effect, and c) what the prerequisites are for the test to show robustness (large & equal cell size, ...). In your split-plot design, I'd love to have somebody state the precise assumptions of sphericity & equality of covariance matrices. It's already mind-boggling in the 2-factorial case. $\endgroup$

Commented Dec 21, 2010 at 22:26

$\begingroup$ @Matt It sounds like 90% of the residuals are zero. If that's the case, no transformation is going to make the residuals remotely close to normal. Simulation studies have shown that p-values from F-tests are highly sensitive to deviations from normality. (In your case it's fairly likely that some denominators in the F-tests will be zero: a sharp indicator of how far things can go wrong.) You need a different approach. What to do depends on why so many residuals are zero. Lack of sufficient precision in measurements? $\endgroup$

Commented Dec 22, 2010 at 15:11

$\begingroup$ @Matt That sounds more appropriate, assuming your data are counts. Another attractive option is a zero-inflated negative binomial response (ats.ucla.edu/stat/r/dae/zinbreg.htm). $\endgroup$

Commented Dec 22, 2010 at 16:42

5 Answers

$\begingroup$

Like other parametric tests, the analysis of variance assumes that the data fit the normal distribution. If your measurement variable is not normally distributed, you may be increasing your chance of a false positive result if you analyze the data with an anova or other test that assumes normality. Fortunately, an anova is not very sensitive to moderate deviations from normality; simulation studies, using a variety of non-normal distributions, have shown that the false positive rate is not affected very much by this violation of the assumption (Glass et al. 1972, Harwell et al. 1992, Lix et al. 1996). This is because when you take a large number of random samples from a population, the means of those samples are approximately normally distributed even when the population is not normal.
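The kind of simulation study cited above can be sketched in a few lines. This is only an illustration under stated assumptions, not a reproduction of any of the cited studies: it uses a one-way, between-subjects design with three groups of 21 (far simpler than the split-plot design in the question), draws every group from the same population so that any rejection is a false positive, and hard-codes 3.15 as the tabulated 5% critical value of F(2, 60). The helper names `one_way_f` and `false_positive_rate` are made up for this sketch.

```python
import random
import statistics


def one_way_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [statistics.fmean(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def false_positive_rate(draw, n_sims=2000, k=3, n_per_group=21, f_crit=3.15, seed=1):
    """Share of null experiments (all groups from one population) with F > f_crit.

    f_crit=3.15 is an assumed table value for alpha = 0.05 with (2, 60) degrees
    of freedom; it only matches the defaults k=3 and n_per_group=21.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        groups = [[draw(rng) for _ in range(n_per_group)] for _ in range(k)]
        if one_way_f(groups) > f_crit:
            rejections += 1
    return rejections / n_sims


# Same design, two populations: normal versus a right-skewed exponential.
normal_rate = false_positive_rate(lambda rng: rng.gauss(0.0, 1.0))
skewed_rate = false_positive_rate(lambda rng: rng.expovariate(1.0))
print(f"normal: {normal_rate:.3f}  exponential: {skewed_rate:.3f}")
```

Other populations can be tried by passing a different `draw` function. A population with a large point mass at zero, like the error rates in the question, can produce a within-group sum of squares of exactly zero, which this sketch does not handle.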

It is possible to test the goodness-of-fit of a data set to the normal distribution. I do not suggest that you do this, because many data sets that are significantly non-normal would be perfectly appropriate for an anova.

Instead, if you have a large enough data set, I suggest you just look at the frequency histogram. If it looks more-or-less normal, go ahead and perform an anova. If it looks like a normal distribution that has been pushed to one side, like the sulphate data above, you should try different data transformations and see if any of them make the histogram look more normal. If that doesn't work, and the data still look severely non-normal, it's probably still okay to analyze the data using an anova. However, you may want to analyze it using a non-parametric test. Just about every parametric statistical test has a non-parametric substitute, such as the Kruskal–Wallis test instead of a one-way anova, Wilcoxon signed-rank test instead of a paired t-test, and Spearman rank correlation instead of linear regression. These non-parametric tests do not assume that the data fit the normal distribution. They do assume that the data in different groups have the same distribution as each other, however; if different groups have different shaped distributions (for example, one is skewed to the left, another is skewed to the right), a non-parametric test may not be any better than a parametric one.
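As a rough numerical companion to looking at histograms, one can compare the sample skewness of the data under a few candidate transformations. The sketch below uses made-up data, not the data from the question: a log-normal sample stands in for a distribution that has been "pushed to one side", and a second sample with 90% exact zeros mimics the error rates described in the question. The function `sample_skewness` and both data sets are assumptions introduced for illustration.

```python
import math
import random
import statistics


def sample_skewness(values):
    """Moment-based skewness: mean cubed deviation divided by the cubed SD."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([(v - mean) ** 3 for v in values]) / sd ** 3


rng = random.Random(42)

# Hypothetical right-skewed, strictly positive measurements (log-normal).
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(2000)]
candidates = {
    "raw": raw,
    "sqrt": [math.sqrt(v) for v in raw],
    "log": [math.log(v) for v in raw],
}
skews = {name: sample_skewness(vals) for name, vals in candidates.items()}
for name, s in skews.items():
    print(f"{name:>5}: skewness = {s:.2f}")

# Hypothetical zero-heavy error rates: 90% exact zeros, 10% small proportions.
zero_heavy = [0.0] * 1800 + [rng.uniform(0.05, 0.5) for _ in range(200)]
arcsine = [math.asin(math.sqrt(v)) for v in zero_heavy]
zero_skew_raw = sample_skewness(zero_heavy)
zero_skew_arcsine = sample_skewness(arcsine)
print(f"zero-heavy raw: {zero_skew_raw:.2f}  arcsine-sqrt: {zero_skew_arcsine:.2f}")
```

For the log-normal sample, the log transform should bring the skewness close to zero. For the zero-heavy sample, the arcsine square-root transform maps every zero to zero, so the spike at zero stays in place regardless of which monotone transformation is chosen.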

References

  1. Glass, G.V., P.D. Peckham, and J.R. Sanders. 1972. Consequences of failure to meet assumptions underlying fixed effects analyses of variance and covariance. Rev. Educ. Res. 42: 237-288.
  2. Harwell, M.R., E.N. Rubinstein, W.S. Hayes, and C.C. Olds. 1992. Summarizing Monte Carlo results in methodological research: the one- and two-factor fixed effects ANOVA cases. J. Educ. Stat. 17: 315-339.
  3. Lix, L.M., J.C. Keselman, and H.J. Keselman. 1996. Consequences of assumption violations revisited: A quantitative review of alternatives to the one-way analysis of variance F test. Rev. Educ. Res. 66: 579-619.