microarray data - is it to be believed? (Jan/18/2007)
I'm just looking at a whole host of journal articles, and i'm hoping an expert (or wannabe) in microarrays could give me a bit of help.
i'm looking at the expression levels from data obtained in the microarrays, and then comparing them (when available) with RT-PCR or western blots. the thing is, either the microarray says the gene i'm interested in is not really expressed all that highly, but then i look at RT-PCR or westerns and my gene is expressed in huge amounts; or, when the gene is supposed to be overexpressed, the microarray says it's expressed at normal levels.
the more i look at microarray data on my gene of interest, the less i believe the microarray.
so, i'm supposed to be finding a link between 2 genes, which no one has cared that much about before. the problem is, the one paper that did FISH, RT-PCR, and northerns says my gene is overexpressed and amplified when the other gene is mutated. however, 3 microarray studies (which have no RT-PCR validation) say there is nothing of the sort.
who should i believe? or better still, is there a paper out there that highlights the shortcomings of microarrays? could interesting links between genes and cancers and all types of things be lost because microarrays aren't accurate?
I am not an expert in microarrays, but from what I know, a microarray is not as sensitive as RT-PCR for determining whether a gene is overexpressed or not. Only when the gene is overexpressed by a large enough factor does the microarray pick it up. There are some exact numbers for this which I cannot remember. Microarray is fancier and everyone wants to jump into it, but many microarray papers are being torn apart because of its limitations.
A friend compared both procedures (RT-PCR and microarray) while planning his project and ended up with RT-PCR because it is more sensitive. Others in the same field tried microarray and didn't find anything. In the end my friend published 4 papers because of RT-PCR.
I'm not an expert either, but I have had several colleagues in several labs who did microarrays and protein arrays.
My conclusion is that it is a tool you can use when you have no idea what you are looking for. It will give you an idea of which genes or proteins you should investigate, and then you should confirm with RT-PCR or ELISA. However, you might miss some genes or proteins whose transcription or expression is increased, but not enough to be seen in this first screening.
It's just to give you a first hint. But it should always be confirmed.
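To make the "increased, but not enough to be seen" point concrete, here is a minimal sketch of a fold-change screen. The gene names, the intensities, and the 2-fold cutoff (|log2 fold change| >= 1) are all made up for illustration; nobody in this thread quoted an actual threshold.

```python
import math

# Hypothetical (control, treated) intensities in arbitrary units.
intensities = {
    "geneA": (200.0, 1600.0),   # 8-fold up
    "geneB": (500.0, 700.0),    # 1.4-fold up: real, but modest
    "geneC": (1000.0, 980.0),   # essentially unchanged
}

def screen(data, min_abs_log2fc=1.0):
    """Return {gene: log2 fold change} for genes that meet the cutoff.

    The default cutoff of 1.0 (a 2-fold change) is an assumed value.
    """
    hits = {}
    for gene, (control, treated) in data.items():
        log2fc = math.log2(treated / control)
        if abs(log2fc) >= min_abs_log2fc:
            hits[gene] = log2fc
    return hits

hits = screen(intensities)
missed = [g for g in intensities if g not in hits]
print("passed screen:", sorted(hits))
print("below cutoff:", sorted(missed))
```

With these toy numbers geneB is up, but it falls below the cutoff and never makes it onto the list of candidates to confirm.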
I am a microarray expert and microarrays are not just for blind exploration. They are consistent and reliable. If you do your job correctly they do not have to be validated. The problem with microarrays is not the microarray technology. The problem is in the assumptions people make about genes and gene regulation and in the bioinformatics. I often say that bioinformatics is the only discipline of biological research where you can be wrong half the time and still get your paper published. This is especially true of microarray experiments. Every microarray platform is prone to errors because of the sloppy way probes are associated with genes. For the bioinformatics community every splice variant is equivalent and they are all called the same gene. But if your microarray probe detects one splice variant and your PCR primers detect a different splice variant, then it is unlikely that the two assays will agree. Yet no one takes the time to make sure that the PCR primers will detect the same thing that the microarray probes detect. If you do, you will find that microarray technology is accurate and reproducible. The microarray community is beginning to realize this, and they are starting to move away from the concept of one probe equals one gene and toward trying to determine exactly what each array detects. Here are a few references for a better identification of probes on an array.
Mecham BH, Klus GT, Strovel J, et al. (2004) Sequence-matched probes produce increased cross-platform consistency and more reproducible biological results in microarray-based gene expression measurements. Nucleic Acids Res. 32(9):e74.
Carter SL, Eklund AC, Mecham BH, Kohane IS, Szallasi Z. (2005) Redefinition of Affymetrix probe sets by sequence overlap with cDNA microarray probes reduces cross-platform inconsistencies in cancer-associated gene expression measurements. BMC Bioinformatics. 6(1):107.
Harbig J, Sprinkle R, Enkemann SA. (2005) A sequence-based identification of the genes detected by probesets on the Affymetrix U133 plus 2.0 array. Nucleic Acids Res. 33(3):e31.
Ji Y, Coombes K, Zhang J, Wen S, Mitchell J, Pusztai L, Symmans WF, Wang J. (2006) RefSeq refinements of UniGene-based gene matching improve the correlation of expression measurements between two microarray platforms. Appl Bioinformatics. 5(2):89-98.
For further proof that this is the way the microarray community is moving, here are a few references that demonstrate that the microarray data is consistent and accurately correlates across platforms when one takes the time to sequence match the probes used in the different platforms.
Larkin JE, Frank BC, Gavras H, Sultana R, Quackenbush J. (2005) Independence and reproducibility across microarray platforms. Nat Methods. 2:337-44.
Dallas PB, Gottardo NG, Firth MJ, Beesley AH, Hoffmann K, Terry PA, Freitas JR, Boag JM, Cummings AJ, Kees UR. (2005) Gene expression levels assessed by oligonucleotide microarray analysis and quantitative real-time RT-PCR -- how well do they correlate? BMC Genomics. 6:59.
Barnes M, Freudenberg J, Thompson S, Aronow B, Pavlidis P. (2005) Experimental comparison and cross-validation of the Affymetrix and Illumina gene expression analysis platforms. Nucleic Acids Res. 33:5914-23.
Petersen D, Chandramouli GV, Geoghegan J, Hilburn J, Paarlberg J, Kim CH, Munroe D, Gangi L, Han J, Puri R, Staudt L, Weinstein J, Barrett JC, Green J, Kawasaki ES. (2005) Three microarray platforms: an analysis of their concordance in profiling gene expression. BMC Genomics. 6:63.
Zhu B, Ping G, Shinohara Y, Zhang Y, Baba Y. (2005) Comparison of gene expression measurements from cDNA and 60-mer oligonucleotide microarrays. Genomics. 85:657-65.
de Reynies A, Geromin D, Cayuela JM, Petel F, Dessen P, Sigaux F, Rickman DS. (2006) Comparison of the latest commercial short and long oligonucleotide microarray technologies. BMC Genomics. 7:51.
Wang Y, Barbacioru C, Hyland F, Xiao W, Hunkapiller KL, Blake J, Chan F, Gonzalez C, Zhang L, Samaha RR. (2006) Large scale real-time PCR validation on gene expression measurements from two commercial long-oligonucleotide microarrays. BMC Genomics. 7:59.
So what does a microarray researcher need to do? Think about the biology of your system. What are you trying to investigate? Then verify genes that will improve the knowledge of the system you are studying. If you get PCR primers or antibodies, do it in a way that makes those reagents useful for other assays that will help your research. Don't buy stuff just to validate the microarrays.
However, your first step is to sequence verify the probe on the microarray platform that you found. Consider what it detects, not just perfectly but with some mismatches. You can't tell the difference between a high concentration of a gene that hybridizes poorly and a low concentration of a gene that hybridizes perfectly. If a probe can detect 5 different things you will need to determine which of the 5 things gave rise to your signal on the array. You can't just pick one and assume it accounts for everything. Design primers that will account for the different splice variants and the gene family homologs. If you do not do this molecular biology analysis you will end up with a lot of results where the microarray data does not match the PCR or antibody results. The last thing to consider is that there are a lot of ways that a gene can be regulated. Consider changes to the 3 prime untranslated region and the regulation of translation by antisense microRNAs. A transcript can go down when the amount of protein increases. This does not mean that the microarray data disagrees with the downstream assay. It means that cells are complex and we have to be careful about our assumptions.
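A rough sketch of the probe-versus-primer check described above, under loud assumptions: the sequences are invented toy strings, the probe is written in the same sense as the transcripts, matching is ungapped with a simple mismatch count (no hybridization thermodynamics), and the function names are hypothetical. A real check would use the actual probe and transcript sequences and a proper alignment tool.

```python
def revcomp(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def best_match(probe, transcript):
    """Fewest mismatches over all ungapped placements, or None if too short."""
    n = len(probe)
    counts = [
        sum(a != b for a, b in zip(probe, transcript[i:i + n]))
        for i in range(len(transcript) - n + 1)
    ]
    return min(counts) if counts else None

def probe_targets(probe, transcripts, max_mismatches=2):
    """Transcripts the probe could plausibly detect, with mismatch counts."""
    hits = {}
    for name, seq in transcripts.items():
        mismatches = best_match(probe, seq)
        if mismatches is not None and mismatches <= max_mismatches:
            hits[name] = mismatches
    return hits

def amplifies(fwd, rev, transcript):
    """True if both primers match exactly and in the right order."""
    start = transcript.find(fwd)
    end = transcript.find(revcomp(rev))
    return start != -1 and end != -1 and start < end

# Toy sequences (made up): the probe sits in a middle exon.
exon1, exon3 = "ATGGCCATTGTAAT", "CCGATAGTTACGGA"
probe = "GGGCCGCTGAAAGGGTGC"
transcripts = {
    "variant1": exon1 + probe + exon3,                 # contains the probe exon
    "variant2": exon1 + exon3,                         # skips the probe exon
    "homolog": exon1 + "GGGCCGCTGTAAGGGTGC" + exon3,   # one mismatch to probe
}
fwd_primer, rev_primer = "ATGGCCATTG", "CGTAACTATC"    # sit in exon1 and exon3

probe_hits = probe_targets(probe, transcripts)
pcr_hits = {n for n, s in transcripts.items() if amplifies(fwd_primer, rev_primer, s)}
only_pcr = pcr_hits - set(probe_hits)
print("probe detects:", probe_hits)
print("PCR detects but probe does not:", sorted(only_pcr))
```

In this toy case the primers pick up the splice variant that skips the probe's exon while the array cannot see it, and the probe also cross-reacts with the homolog. That is exactly the kind of situation where the two assays are measuring different things and should not be expected to agree.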
I agree with what Florida1 says. I didn't believe microarrays until I had done one myself. Recently, in a microarray experiment, we identified thousands of genes up- and down-regulated by a treatment. I would say over 95% are reproducible by RT-PCR. The critical part is repeats, both biological and technical. Without repeats you cannot apply the statistical analysis that decides which genes should be selected for verification.
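As a small illustration of why the repeats matter, here is a sketch using Welch's t statistic on log2 expression values. The numbers and gene names are hypothetical, and Welch's test is only one reasonable choice (this is not necessarily what was used in the experiment above). Real microarray analyses also correct for testing thousands of genes at once, which is left out here.

```python
import math
from statistics import StatisticsError, mean, variance

def welch_t(control, treated):
    """Welch's t statistic and approximate degrees of freedom."""
    va, vb = variance(control), variance(treated)
    na, nb = len(control), len(treated)
    se2 = va / na + vb / nb
    t = (mean(treated) - mean(control)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical log2 intensities, three biological replicates per condition.
gene_x = ([8.0, 8.2, 7.9], [10.1, 9.8, 10.0])   # consistent shift
gene_y = ([6.0, 7.5, 5.1], [7.0, 5.5, 8.0])     # noisy, no clear shift

t_x, df_x = welch_t(*gene_x)
t_y, df_y = welch_t(*gene_y)
print(f"geneX: t = {t_x:.2f}, df = {df_x:.1f}")
print(f"geneY: t = {t_y:.2f}, df = {df_y:.1f}")

# With a single array per condition there is no variance to estimate.
try:
    welch_t([8.0], [10.1])
    single_replicate_ok = True
except StatisticsError:
    single_replicate_ok = False
print("statistics possible with one replicate:", single_replicate_ok)
```

The last part is the point of the post: with one array per condition the variance cannot be estimated at all, so there is no principled way to rank genes for verification.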
i get that microarrays aren't very sensitive, and that it really depends on the competency of the investigator... i'm not doing microarrays (fingers crossed), but data mining at the moment for a lit review.
the problem i've encountered is this:
BRCA1 mutant primary breast cancers have been put through a couple of microarrays, and some genes are upregulated, some down. The gene i'm looking at has been shown in the past, using FISH, southern, and RT-PCR, to be amplified and overexpressed. in the microarrays, this gene is shown to be downregulated. i don't know who to believe. i'm siding with the FISH/southern/RT-PCR people for the time being, just because they show more than just dots on a piece of paper.
similarly, my gene of interest is significantly overexpressed in some breast cancers. microarrays show that it isn't significantly overexpressed.
my gene is also supposed to not be expressed in some cell lines, as shown by RT-PCR, westerns, and northerns... but the microarray paper (all of 1) shows that my gene is actually expressed in these cell lines.
cells are complex, but to have such different results from the same cell line?
going to do microarray experiments. only to screen, nothing more than that.
me too. but i'll be doing my experiment to find out the difference between infected and uninfected cells using illumina's platform. has anyone tried illumina before? how is it? good or not so good?
I agree with Florida1, totally. I did microarray, Serial analysis of Gene Expression (SAGE), RT-PCR and QRT-PCR with the same RNAs.
The result summary was like this (unpublished):
-Microarray and SAGE sense differential expression of different genes. They rarely contradict each other (one says up, the other says down: ca. 3%).
-For some of the genes microarray detected, qPCR confirmed the result. For some other genes coming from the SAGE data, qPCR confirmed the result.
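For anyone who wants to run the same kind of comparison, here is a minimal sketch of counting direction contradictions between two methods. The gene names and log2 fold changes are hypothetical and are not my data; the function name is made up too.

```python
def direction_agreement(calls_a, calls_b):
    """Compare the direction (up/down) of genes called by both methods.

    calls_a, calls_b: {gene: log2 fold change} for each method's
    differentially expressed genes.
    """
    shared = sorted(set(calls_a) & set(calls_b))
    contradicting = [g for g in shared if (calls_a[g] > 0) != (calls_b[g] > 0)]
    fraction = len(contradicting) / len(shared) if shared else 0.0
    return {
        "shared": shared,
        "only_a": sorted(set(calls_a) - set(calls_b)),
        "only_b": sorted(set(calls_b) - set(calls_a)),
        "contradicting": contradicting,
        "contradicting_fraction": fraction,
    }

# Hypothetical calls from two methods run on the same RNA.
microarray = {"g1": 1.8, "g2": -2.1, "g3": 1.2, "g4": 0.9}
sage = {"g2": -1.5, "g3": -1.1, "g4": 1.4, "g5": 2.0}

result = direction_agreement(microarray, sage)
print("called by both:", result["shared"])
print("microarray only:", result["only_a"])
print("SAGE only:", result["only_b"])
print("opposite direction:", result["contradicting"])
```

Keeping "called by only one method" separate from "opposite direction" matters: the first just means the methods see different genes, and only the second is a real contradiction.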
So, in my view, it is totally wrong to give up on microarray data. The thing is to design the experiment efficiently, trying to make it as biologically representative as possible, and so cover more variability.
Besides, microarray data is being used for further downstream high-throughput analysis, such as TF binding analysis or molecular cell phenotype analysis (e.g. after this effect, the cells activated PDGF signalling). This type of high-dimensional data, at least for now, cannot be produced by other methods (too expensive).