
ANOVA analysis and normal distribution of data


5 replies to this topic

#1 makiyo


Posted 13 April 2009 - 07:25 AM

I have a data set on which I want to run an ANOVA. Briefly, I have a putative mating gene and two lines of animals, WT and knock-out. The aim is to find the effect of the knock-out in males or females. I was told the data need to be normally distributed and have similar variances before ANOVA applies. Unfortunately, after looking at the data, they do not seem to be normally distributed.

I read some published papers dealing with similar data. They did not even check for normality before running the ANOVA.

Now I am confused. Can ANOVA be done without checking whether the data are normally distributed? If the data are not normally distributed, what could go wrong with the conclusions? I tried transformations but did not find a good one, and I have not seen any paper applying one to this kind of data. Are there any other statistical methods I could use?

Thanks for any help!

#2 pcrman


Posted 13 April 2009 - 09:01 PM

For the probability levels from a parametric test to be valid, the data must come from a normal distribution. If not, you have to use a nonparametric test such as the Mann-Whitney U test (the nonparametric version of the two-group unpaired t test), the Wilcoxon signed-rank test (paired t test), or the Kruskal-Wallis test (the nonparametric equivalent of a one-way ANOVA).

#3 makiyo


Posted 14 April 2009 - 03:28 AM

Thank you very much for your advice. I am still confused about a few things.

In my data, there are actually four groups according to the female and male genotypes: WT-WT, WT-knockout, knockout-WT, and knockout-knockout. I should check that each group is normally distributed, shouldn't I? What if one group has only a few data points (e.g. 4)? And what if there is an extreme value in the data? Should I discard it without a biological reason?

One more stupid question about ANOVA in publications: should I assume the authors checked normality and variance even though they don't say so in the paper?





#4 DRT


Posted 27 April 2009 - 10:10 PM

Sorry for the delayed reply; you may have already found the answers you need.

In my data, there are actually four groups according to the female and male genotypes: WT-WT, WT-knockout, knockout-WT, and knockout-knockout. I should check that each group is normally distributed, shouldn't I? What if one group has only a few data points (e.g. 4)?


Look for normality as a larger group first (i.e., express each data point as a deviation from its group mean, then combine all points from all groups). If that doesn't work, then treat each group individually.
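As a rough illustration of the pooling step described above, here is a minimal Python sketch using only the standard library. The group names follow the thread, but the numbers are made up, and the function names `pooled_residuals` and `qq_correlation` are hypothetical. The check shown is a normal Q-Q correlation (values close to 1 suggest the pooled deviations are roughly normal); it is one possible diagnostic, not necessarily the one intended in the reply, and with this few points no normality check has much power.

```python
from statistics import NormalDist, mean


def pooled_residuals(groups):
    """Express each value as a deviation from its own group mean, then pool."""
    residuals = []
    for values in groups.values():
        group_mean = mean(values)
        residuals.extend(v - group_mean for v in values)
    return residuals


def qq_correlation(residuals):
    """Correlation between sorted residuals and standard normal quantiles."""
    n = len(residuals)
    ordered = sorted(residuals)
    # Blom plotting positions give the expected normal quantile for each rank.
    theoretical = [
        NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)
    ]
    mx, my = mean(theoretical), mean(ordered)
    sxy = sum((x - mx) * (y - my) for x, y in zip(theoretical, ordered))
    sxx = sum((x - mx) ** 2 for x in theoretical)
    syy = sum((y - my) ** 2 for y in ordered)
    return sxy / (sxx * syy) ** 0.5


# Made-up measurements for the four female-male genotype groups (illustration only).
groups = {
    "WT-WT": [12.1, 10.4, 11.8, 13.0, 11.2],
    "WT-knockout": [9.8, 8.9, 10.7, 9.1],
    "knockout-WT": [10.2, 11.5, 9.6, 10.9, 10.1],
    "knockout-knockout": [7.4, 8.8, 6.9, 8.1],
}

residuals = pooled_residuals(groups)
r = qq_correlation(residuals)
print(f"n = {len(residuals)}, Q-Q correlation = {r:.3f}")
```

Pooling this way gives 18 points to look at instead of 4 or 5 per group, which is the practical advantage of checking the deviations together.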

And what if there is an extreme value in the data? Should I discard it without a biological reason?


With the exception of experimental errors, I don't think any data should be 'discarded' (which is not to say that analysis can't be done with some points temporarily missing so long as it is acknowledged). I finally got this through to a PI of mine following a nasty incident when we had to repeat an assay for a commercial client and discovered that the data she had (unbeknownst to me) deleted as outliers turned out to be important.


One more stupid question about ANOVA in publications: should I assume the authors checked normality and variance even though they don't say so in the paper?


Generally, if someone has gone to the trouble of checking the assumptions in their analysis (and found them to be valid), they will make a note of it in the paper; otherwise it is probably safest to assume they have not.

#5 bob1


Posted 28 April 2009 - 05:05 PM


Limited data means that you cannot assume that it is normally distributed. Parametric tests such as ANOVA rely on normal distributions, and a common rule of thumb is that you need about 30 samples before you can lean on that assumption. However, as DRT says, you can look for normality in your data as a whole. Papers that run an ANOVA without checking the normality of their data are doing it wrong, and the reviewers should have picked that up. That said, the results of such an analysis are not necessarily erroneous: all these tests are approximations of the real situation, so the results may be right, but for the wrong reason.

To me it sounds like you need a Kruskal-Wallis test, possibly followed by a post-hoc test such as Tukey's if you want to find out which groups actually differ significantly, rather than just saying that one of them is different without knowing which one.
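For readers who want to see what the Kruskal-Wallis test computes, here is a minimal standard-library Python sketch. The data are made up, and the function names `kruskal_wallis_h` and `chi2_sf_df3` are hypothetical. The p-value uses the chi-square approximation with 3 degrees of freedom, which only applies to four groups and is known to be rough when groups are as small as 4 or 5. In practice a statistics package would be used instead (for example, SciPy provides `scipy.stats.kruskal`).

```python
from collections import Counter
from math import erfc, exp, pi, sqrt


def kruskal_wallis_h(samples):
    """Kruskal-Wallis H statistic, with the standard correction for ties."""
    pooled = sorted(v for s in samples for v in s)
    n = len(pooled)
    # Give each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    # H is built from the squared rank sum of each group.
    h = 12 / (n * (n + 1)) * sum(
        sum(rank_of[v] for v in s) ** 2 / len(s) for s in samples
    ) - 3 * (n + 1)
    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over groups of tied values.
    ties = sum(t**3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n**3 - n)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)


# Made-up measurements for the four genotype groups (illustration only).
samples = [
    [12.1, 10.4, 11.8, 13.0, 11.2],  # WT-WT
    [9.8, 8.9, 10.7, 9.1],           # WT-knockout
    [10.2, 11.5, 9.6, 10.9, 10.1],   # knockout-WT
    [7.4, 8.8, 6.9, 8.1],            # knockout-knockout
]

h = kruskal_wallis_h(samples)
p = chi2_sf_df3(h)  # valid here only because there are 4 groups (df = 3)
print(f"H = {h:.3f}, approximate p = {p:.4f}")
```

A significant H only says that at least one group differs from the others; identifying which pairs differ still needs a post-hoc comparison.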

#6 hobglobin


Posted 29 April 2009 - 06:25 AM



You should also look at what type of data you have: nominal, ordinal or interval. ANOVA is usable for interval variables.
And if you choose a non-parametric test such as the Kruskal-Wallis test, you should also use a non-parametric post-hoc test. Examples are the Nemenyi test (similar to Tukey's, very conservative) and the Steel test.





©1999-2013 Protocol Online, All rights reserved.