Use of ORFs - (Nov/24/2005)
Is there any use for ORFs in eukaryotic genes?
For instance, they're useful in prokaryotes, as these genes tend not to have introns, so you can just look for the longest ORF to find coding regions. But can you do a similar thing for eukaryotic genes, which have introns?
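For reference, the prokaryotic "longest ORF" approach mentioned above can be sketched roughly as follows. This is only a minimal illustration, assuming the standard stop codons, ATG as the only start codon, and an uppercase/lowercase ACGT sequence with no ambiguity codes; the function names (`orfs_in_frame`, `longest_orf`) are made up for this example and are not from any particular tool.

```python
# Minimal six-frame ORF scan: a sketch, not a gene finder.
# Assumptions: ATG is the only start codon, standard stop codons,
# and the sequence contains only A, C, G, T.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame):
    """Yield (start, end) of ATG...stop ORFs in one forward frame (0, 1 or 2)."""
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i                  # first in-frame ATG opens the ORF
        elif start is not None and codon in STOPS:
            yield start, i + 3         # end is exclusive and includes the stop
            start = None


def longest_orf(seq):
    """Return (strand, start, end, orf_sequence) for the longest ORF, or None.

    Coordinates on the "-" strand are relative to the reverse complement.
    """
    seq = seq.upper()
    best = None
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(s, frame):
                if best is None or end - start > best[2] - best[1]:
                    best = (strand, start, end, s[start:end])
    return best


if __name__ == "__main__":
    print(longest_orf("CCATGAAATTTGGGTAACC"))
```

On an intron-containing genomic sequence the same scan would tend to return only exon-sized fragments, or to run into stop codons inside introns, which is why it does not carry over directly to eukaryotes.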
A bit understated perhaps...
Finding eukaryotic genes from chromosomal DNA sequence data is, of course, a bit tougher than finding them in prokaryotic sources as relying on the presence of long open reading frames is insufficient. There are, however, several DNA features one can look for to indicate the presence of a gene in a eukaryotic DNA sequence:
- transcription factor binding sites
- polyadenylation signal
- Kozak signal
- non-random distribution of bases in coding sequences
- shifts in GC content (coding regions tend to have higher GC content than non-coding sequences)
- 5' splice site (e.g. AG/GUAAGU)
- 3' splice site (e.g. (C/U)N<10(C/U)AG/G)
- branch site (e.g. UAUAAC 20-50 bp upstream of the 3' splice site)
- six-frame translation similarity with known proteins (e.g. BLAST)
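A few of the sequence-level features in the list can be looked for with very simple pattern matching. The sketch below is only illustrative: the patterns are written in the DNA alphabet (U as T), the donor-site pattern is the example from the list, and the polyadenylation hexamer and the Kozak-like pattern are simplified assumptions of mine rather than anything given above. Real signals are degenerate, so actual gene finders use weight matrices or probabilistic models instead of exact matches. The names `MOTIFS`, `find_motifs` and `gc_windows` are hypothetical.

```python
import re

# Simplified DNA-alphabet patterns. These are illustrative assumptions,
# not definitive consensus sequences.
MOTIFS = {
    "polyA_signal": "AATAAA",    # commonly cited hexamer (assumption)
    "kozak_like": "[AG]..ATGG",  # simplified: purine at -3, G at +4 (assumption)
    "donor_site": "AGGTAAGT",    # the AG/GUAAGU example from the list
}


def find_motifs(seq):
    """Return {name: [0-based start positions]}, overlapping hits included."""
    seq = seq.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # Wrapping the pattern in a lookahead makes finditer report
        # overlapping matches as well.
        hits[name] = [m.start() for m in re.finditer(f"(?={pattern})", seq)]
    return hits


def gc_windows(seq, window=100, step=50):
    """Return [(start, gc_fraction)] for sliding windows over seq."""
    seq = seq.upper()
    out = []
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        chunk = seq[start:start + window]
        if chunk:
            gc = chunk.count("G") + chunk.count("C")
            out.append((start, gc / len(chunk)))
    return out


if __name__ == "__main__":
    demo = "GCCACCATGGCTAGGTAAGTTTTAATAAAGG"
    print(find_motifs(demo))
    print(gc_windows(demo, window=10, step=5))
```

Hits from a scan like this are only candidates; any single signal turns up by chance many times in a long sequence, so it is the combination of several features that is informative.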
An ORF alone is not enough to tell whether you have obtained a complete coding sequence. I highly recommend a related paper that covers this topic in detail:
Interpreting cDNA sequences: some insights from studies on translation (Kozak, 1996)