How to identify the stop codon? - (Oct/08/2007)
Hi, I am working on a new sequence that has not been published before. Through PCR, blunt-end cloning and a BLAST search, I obtained the new sequence, which mostly aligns with sequences of the same gene from a range of other species. Can anyone please tell me how to identify the stop codon of an unpublished sequence? The 5’ end of my sequence is very similar to the NCBI sequences, so I can identify the start codon by comparing my sequence with the others. However, the 3’ end is different, and I found several “TGA” and “TAG” triplets there, so I am not sure where my stop codon is located.
Imagine your ORF encodes a protein sequence that is a sentence, for example: "My name is Lily123." In that case, the start codon encodes 'M' and the stop codon plays the role of the '.' (it encodes no amino acid; it just ends the sentence). Since you already know your start codon, just start reading from there, three bases at a time, according to the genetic code, and you will find your stop codon: it is the first TAA, TAG or TGA you reach in that reading frame. There is only one stop mark for each sentence, after all.
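To make the "read the sentence until the full stop" idea concrete, here is a minimal Python sketch. It assumes the standard genetic code (some organisms and organelles use a different table) and a known 0-based start position; `CODON_TABLE` and `translate_from` are hypothetical names chosen for this illustration, not part of any library.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_from(seq, start):
    """Translate seq codon by codon from the 0-based index `start`.

    Returns (protein, stop_index). stop_index is the 0-based position of the
    first in-frame stop codon, or None if the sequence ends before one is found.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # 'X' for ambiguous codons
        if aa == "*":
            return "".join(protein), i
        protein.append(aa)
    return "".join(protein), None


# Toy sequence (made up for this example): ATG at index 2, then AAA, then TAG.
print(translate_from("GGATGAAATAGCTGA", 2))  # ('MK', 8)
```

Note that the toy sequence also contains "TGA" at index 12, but it is never read as a codon because it does not sit in the frame set by the ATG.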
Many thanks for your help. But if I see a few "stop marks" in my sentence, how do I identify the right one? The 3' end of my new sequence is very different from the GenBank records, so it's hard for me to identify the location of the stop codon by referring to the CDS features of other published sequences. Moreover, there is more than one TAG at the 3' end.
If it is a simple bacterial gene, then the first stop codon you meet (reading in frame from the first AUG) is the stop codon. If it is a eukaryotic genomic sequence, then you will have to be aware of splice variants, which may have alternative terminal exons (so you have to look out for splice sites).
Is your sequence a eukaryotic mRNA sequence or a genomic sequence? I agree with the above post: there is no way to know exactly where your stop codon is if you are looking at unspliced, unedited raw genomic DNA sequence. But if you are looking at a mature mRNA sequence, then the stop codon is most likely the first TGA/TAA/TAG that is in the same reading frame as the AUG start codon.
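For anyone puzzled by several TGA/TAG triplets near the 3' end, a rough sketch like the one below sorts them by reading frame. It assumes a mature mRNA (or cDNA) sequence and a known 0-based start position; `stop_positions` is a hypothetical helper name and the example sequence is invented.

```python
STOPS = ("TAA", "TAG", "TGA")


def stop_positions(seq, start):
    """List stop-like triplets downstream of the start codon at index `start`.

    Returns (in_frame, out_of_frame): two lists of 0-based positions. Only the
    in-frame ones can terminate the ORF that begins at `start`.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    in_frame, out_of_frame = [], []
    for i in range(start + 3, len(seq) - 2):
        if seq[i:i + 3] in STOPS:
            if (i - start) % 3 == 0:
                in_frame.append(i)
            else:
                out_of_frame.append(i)
    return in_frame, out_of_frame


# In-frame reading: ATG GCT GAC CTA GGA AAA TAA
mrna = "ATGGCTGACCTAGGAAAATAAGG"
in_frame, out_of_frame = stop_positions(mrna, mrna.find("ATG"))
print(in_frame)      # [18]    -> the TAA that ends the ORF
print(out_of_frame)  # [5, 10] -> a TGA and a TAG that are not read as codons
```

The first entry of `in_frame` is the candidate stop codon; the out-of-frame hits can be ignored for this ORF.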
Thank you so much, guys. My sequence is a eukaryotic mRNA. I identified the open reading frame according to your suggestions. By the way, I found this website quite useful for finding ORFs in a sequence: http://www.ncbi.nlm.nih.gov/gorf/gorf.html