Microarray analysis - (Jul/21/2012 )
Hi, I need to do some microarray analysis, and I have been reading a lot and getting very confused by the different analyses to be done. I have to analyse a mutant and a wild type exposed to two treatments. I did a two-way ANOVA and got a list of significantly different genes. I am very confused about how to go ahead with further analysis. Could somebody give me some suggestions on how to proceed with the data analysis?
To analyze microarray data, you need either sophisticated tools or bioinformatics people to help you, because of the sheer volume of the data. Here is a workflow for microarray analysis and the types of analysis involved:
1. You need some kind of data normalization and transformation
2. Since you have two treatments (I assume you have repeats for each treatment), you can do a pairwise analysis using a t-test to derive a subset of genes that changed (up or down)
3. Clustering analysis to group genes with similar expression patterns together
4. To gain biological insight into your data, you can do enrichment analysis, including GO and pathway analysis, using your list of genes from the pairwise analysis. For this purpose, DAVID is a pretty nice and easy-to-use tool.
Those are what I can think of. I hope that helps.
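As a rough illustration of step 1, here is a minimal Python sketch of one common choice: a log2 transform followed by quantile normalization. The function name and the intensity values are made up for the example, it assumes NumPy is installed, and it is not necessarily what any particular package does internally.

```python
import numpy as np

def log2_quantile_normalize(intensities):
    """Log2-transform raw intensities, then quantile-normalize the arrays.

    `intensities` is a genes x arrays matrix of positive raw values.
    Ties are broken by position, which is a simplification.
    """
    logged = np.log2(np.asarray(intensities, dtype=float))
    # Rank of every gene within its own array (column): 0 = lowest value.
    ranks = logged.argsort(axis=0).argsort(axis=0)
    # Reference distribution: mean of the sorted columns, rank by rank.
    reference = np.sort(logged, axis=0).mean(axis=1)
    # Replace each value with the reference value at its rank.
    return reference[ranks]

# Hypothetical raw intensities: 4 genes (rows) x 3 arrays (columns).
raw = np.array([[120.0, 300.0, 90.0],
                [800.0, 1500.0, 640.0],
                [60.0, 110.0, 45.0],
                [2000.0, 5200.0, 1800.0]])
normalized = log2_quantile_normalize(raw)
print(normalized)
```

After this step every array has the same distribution of values, so differences between arrays reflect the ranking of genes rather than overall brightness.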
Thank you very much for the information.
I have done the normalisation and transformation. The organism I am working on is C. elegans. I have tried DAVID, but it did not help me much, so I thought of going with the GeneSpring trial version and I am working with it now. I also tried Babelomics for pathway analysis, because it is the only software that does a GSA analysis for C. elegans; unfortunately it did not work either, as there is a bug that they are still fixing.
Could you tell me how I should select a gene list for clustering? (I have been reading a few journals and review papers, and they say that you need to do a t-test or ANOVA to get the significantly differentially expressed genes before doing k-means clustering and SVM. Is it the same case for hierarchical clustering too?)
For C. elegans, I am not sure there are pathways, but DAVID should be able to do a nice GO analysis, since worm genes should have well-defined GO terms.
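To show what a GO enrichment test computes in principle, here is a minimal sketch of a plain one-sided hypergeometric test using only the Python standard library. This is not DAVID's exact method (DAVID uses its own modified score), and the gene counts below are invented for illustration.

```python
from math import comb

def enrichment_p_value(list_hits, list_size, background_hits, background_size):
    """One-sided hypergeometric p-value for over-representation.

    Probability of seeing at least `list_hits` annotated genes when
    `list_size` genes are drawn without replacement from a background of
    `background_size` genes, of which `background_hits` carry the GO term.
    """
    total = comb(background_size, list_size)
    upper = min(list_size, background_hits)
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, list_size - k)
        for k in range(list_hits, upper + 1)
    )
    return tail / total

# Hypothetical numbers: 20000 genes on the array, 200 annotated with the term,
# 300 genes in the significant list, 12 of them annotated (3 expected by chance).
p = enrichment_p_value(list_hits=12, list_size=300,
                       background_hits=200, background_size=20000)
print(p)
```

The choice of background matters: it should be the genes actually measured on the array, not the whole genome, and the p-values for many GO terms need multiple-testing correction as well.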
To obtain a set of significantly changed genes for clustering analysis, you apply a t-test to each gene across the two groups. The t-test should return a significant difference (e.g. p<0.01) between the two groups, and the change should be at least 1.5-fold either way (the threshold). The t-tests that can be used are:
* t-test: 2-tailed unpaired Student's t-test assuming equal variance
* Welch's t-test: 2-tailed unpaired t-test without assuming equal variance
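The per-gene filter described above could be sketched in Python roughly as follows. It assumes NumPy and SciPy are installed and that the values are already on the log2 scale; the function name, cutoff defaults and data are made up for the example.

```python
import numpy as np
from scipy import stats

def select_changed_genes(group_a, group_b, p_cutoff=0.01, fold_cutoff=1.5,
                         equal_var=False):
    """Return indices of genes passing both the p-value and fold-change cutoffs.

    group_a, group_b: genes x replicates matrices of log2 expression values.
    equal_var=True gives Student's t-test, False gives Welch's t-test.
    """
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    # Two-tailed unpaired test, run row by row (one test per gene).
    _, p_values = stats.ttest_ind(a, b, axis=1, equal_var=equal_var)
    # On the log2 scale a difference of means is a log2 fold change.
    log2_fc = b.mean(axis=1) - a.mean(axis=1)
    passed = (p_values < p_cutoff) & (np.abs(log2_fc) >= np.log2(fold_cutoff))
    return np.flatnonzero(passed), p_values, log2_fc

# Hypothetical log2 data: 3 genes x 4 replicates per group.
control = [[8.0, 8.1, 7.9, 8.05],     # gene 0: clearly induced by treatment
           [6.0, 6.3, 5.8, 6.1],      # gene 1: unchanged
           [9.0, 9.02, 8.98, 9.01]]   # gene 2: significant but tiny change
treated = [[10.0, 10.2, 9.9, 10.1],
           [6.1, 5.9, 6.2, 6.0],
           [9.2, 9.22, 9.19, 9.21]]
selected, p_values, log2_fc = select_changed_genes(control, treated)
print(selected.tolist())
```

Gene 2 illustrates why the fold-change threshold is used alongside the p-value: very tight replicates can make a small change statistically significant. Note that the p-values here are still unadjusted.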
Besides the t-test, you have to perform a multiple-testing correction such as Bonferroni (single-step adjusted p-value for multiple testing). The correction must be used in conjunction with the statistical test to derive an adjusted p-value for each gene. Probably this is not easy to understand, but you can read this paper:
Dudoit, et al. 2003. Multiple hypothesis testing in microarray experiments. Statistical Science 18(1): 71-103.
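For what it is worth, the single-step Bonferroni adjustment itself is simple: each raw p-value is multiplied by the number of tests and capped at 1. A minimal Python sketch, with made-up p-values and a hypothetical function name:

```python
def bonferroni_adjust(p_values):
    """Single-step Bonferroni: multiply each p-value by the number of tests, cap at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]

# Hypothetical raw p-values from five per-gene t-tests.
raw_p = [0.00001, 0.0004, 0.003, 0.02, 0.6]
adjusted = bonferroni_adjust(raw_p)
# Genes that are still significant at 0.01 after adjustment.
significant = [i for i, p in enumerate(adjusted) if p < 0.01]
print(adjusted, significant)
```

With thousands of genes on an array the multiplier is in the thousands, so Bonferroni is very strict; the paper above discusses this and other adjustment procedures.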
Thank you very much pcrman, you have been very helpful.
As mentioned before, I am working on a mutant and a wild type. In total I will have 2 treatments, and I will also be doing recovery after treatment (2 time points). I would like to know which genes (transcripts) are co-expressed or co-regulated. Could you please suggest how I should proceed to determine this interaction?
microarray analysis can take ages! the whole duration of a phd actually! what i'd recommend is that you analyse your data by a hypothesis-driven approach, otherwise you'll end up with so much information that it will drive you crazy. you can try and find genes that are differentially regulated in both treatments, or before or after recovery, or pretty much whatever you can think of ;-)
Multiexperiment Viewer (MEV) is a great tool for visualising clusters of genes and it's easy to use, but it's so much better once you kinda know what you're after.
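If you want to see what this kind of clustering does outside a GUI, here is a minimal Python sketch of hierarchical clustering with a correlation-based distance, which groups genes whose profiles rise and fall together across conditions. It assumes NumPy and SciPy are installed; the function name and the expression values are invented, and this is not a description of what MEV does internally.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_genes(expression, n_clusters):
    """Group genes by the shape of their expression profile.

    expression: genes x conditions matrix (e.g. log2 values per treatment/time point).
    Distance is 1 - Pearson correlation, so magnitude is ignored and only the
    pattern matters. Genes with a constant profile must be removed beforehand,
    because their correlation is undefined.
    """
    data = np.asarray(expression, dtype=float)
    tree = linkage(data, method="average", metric="correlation")
    # Cut the tree so that at most n_clusters groups remain.
    return fcluster(tree, t=n_clusters, criterion="maxclust")

# Hypothetical profiles: 4 genes x 4 conditions.
profiles = [[1.0, 2.0, 3.0, 4.0],    # rising
            [2.0, 4.0, 6.0, 8.5],    # rising, larger amplitude
            [4.0, 3.0, 2.0, 1.0],    # falling
            [8.0, 6.5, 4.0, 2.0]]    # falling, larger amplitude
labels = cluster_genes(profiles, n_clusters=2)
print(labels)
```

Genes landing in the same cluster are candidates for co-expression, but clustering alone does not show that they are co-regulated.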
I haven't tried DAVID, thanks for that info pcrman