Replace samples with mean?


4 replies to this topic

#1 czz

czz

    member

  • Members
  • Pip
  • 2 posts
0
Neutral

Posted 04 July 2015 - 10:31 PM

Hi all,

 

I would really appreciate some suggestions concerning an issue. 

I have a dataset with 3 groups, originally n=6 per group. Let's say I delete 1 sample from each group because they are outliers. Now that I have n=5 per group, may I replace each deleted value with the mean of the remaining 5 samples in its group, so that I have n=6 again? I have a marginally non-significant result, p=0.053, and if I do so it reaches significance. I have seen this approach used several times by experimental researchers working with a low number of individuals or animals per group, but I could not tell whether it's correct.

 

Thank you very much for your kind help in advance!

 



#2 DRT

DRT

    Veteran

  • Active Members
  • PipPipPipPipPipPipPipPipPipPip
  • 174 posts
10
Good

Posted 05 July 2015 - 12:12 PM

No; a desire to reach significance is not a good reason to substitute data. It’s bad enough deleting one sample from each group in the first place.

 

The only occasion I can think of when it is appropriate to substitute a data point with a mean is when a single data point is missing, for instance an animal died unexpectedly, and the experimental design requires absolute balance between treatments.



#3 bob1

bob1

    Thelymitra pulchella

  • Global Moderators
  • PipPipPipPipPipPipPipPipPipPip
  • 6,285 posts
494
Excellent

Posted 05 July 2015 - 12:43 PM

I agree with DRT. Deleting a value and replacing it with the mean of the remaining values will skew your data, especially with a small n, and gives you a false result that does not reflect the actual data. For instance, how do you know that the one you removed isn't the first of a pool of individuals (1/6 = 17% of the total population) that are non-responders to the treatment? If you remove these and replace them with the mean, your data will be very different from the actual situation!
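
A minimal sketch of why the p value drops, assuming the groups are compared with a one-way ANOVA (the thread does not say which test was used). The measurements are made-up numbers and `anova_f` is a hypothetical helper written for this illustration. Appending each group's own mean leaves the group means and the within-group sum of squares unchanged, but the analysis now counts more observations and more degrees of freedom, so the F statistic grows without any new information.

```python
from statistics import mean


def anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical data (NOT from the thread): 3 groups, n=5 after exclusion.
observed = [
    [4.1, 5.0, 4.6, 5.3, 4.8],
    [5.2, 5.9, 4.9, 6.1, 5.5],
    [5.6, 6.3, 5.1, 6.0, 5.8],
]

# Mean imputation: pad each group back to n=6 with its own mean.
imputed = [g + [mean(g)] for g in observed]

f_obs, _, dfw_obs = anova_f(observed)
f_imp, _, dfw_imp = anova_f(imputed)

print(f"observed: F = {f_obs:.3f}, within-group df = {dfw_obs}")
print(f"imputed:  F = {f_imp:.3f}, within-group df = {dfw_imp}")
print(f"F inflation factor = {f_imp / f_obs:.2f}")
```

In a balanced design like this one, going from 5 to 6 per group multiplies the between-group sum of squares by 6/5 and raises the within-group degrees of freedom from 12 to 15, so F is inflated by (6/5) × (15/12) = 1.5 whatever the data are. The invented values carry no information, yet the test treats them as real observations.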



#4 phage434

phage434

    Veteran

  • Global Moderators
  • PipPipPipPipPipPipPipPipPipPip
  • 2,747 posts
304
Excellent

Posted 05 July 2015 - 01:28 PM

You can't increase your confidence in a result by making things up -- which is essentially what you propose.  This is absolutely a bad idea, and just shows how badly skewed our idea of experiments is when people will go to these lengths to hit a magic p=.05 number, when most of those numbers mean very little. If you have good reason to throw out the outliers, fine, but tell us why. I'd be happy to see a p value before and after they are discarded. I think there is little shame in a p value near .05 (as you have), and it would be far, far better to report it that way than to make up ways of artificially getting a lower number.



#5 czz

czz

    member

  • Members
  • Pip
  • 2 posts
0
Neutral

Posted 06 July 2015 - 12:46 AM

Thank you very much. Actually, the p value does not change much if I don't discard the outliers: p=0.055 before and p=0.053 after. This is a technically complex procedure in which you can get some postoperative complications even if the procedure itself goes well. I have excluded individuals where a complication might have influenced the reproducibility of the measurement.

 

Thank you, then I will avoid this mean imputation procedure and report the result as it is. I have heard of and seen researchers using this technique multiple times, but I never had the opportunity to discuss it with others.






©1999-2013 Protocol Online, All rights reserved.