DNA sequence data came back with lots of wrong bases - (Sep/29/2011 )

I had my genes of interest synthesized and used PCR to amplify them out of the plasmid they came in. I used a touchdown PCR protocol just to be safe. I ran the DNA on a gel and got bright bands with virtually no smearing, so I skipped gel purification and used a Qiagen PCR purification kit instead. I then TOPO cloned the purified product into Invitrogen TOPO vectors and transformed bacteria. Next I grew up 4 colonies for each gene, did minipreps, and sent the samples off for sequencing. I should note that my DNA concentration was on the low side for sequencing, so a lot of the reactions failed. In the reactions that worked, though, all but one had weird, random "mutated" bases (I put it in quotes because I wasn't doing any site-directed mutagenesis). So, what could cause these mismatches during amplification of the DNA? Could this just be an artifact of the low DNA concentration used for sequencing? I am really going crazy over this. One more note: none of the mutations occurred where the primers bind to my templates, and most of the mismatches occurred as single or double bases, with very few of the long stretches of mismatches I would expect from primer mispairing. Anyone have any ideas? Thanks!


Hard to answer without seeing the actual electropherogram.



It depends on what polymerase you used for the PCR. I have had weird PCR reactions with Taq where I get clones with a much higher mutation rate than expected, and if I redo the PCR it usually comes out fine.

The other question is: were the mutations the same across multiple clones, or were they random in each clone? This could have been an issue with the gene synthesis company, as they will sometimes codon-optimize genes without telling you (I had this happen once). If all of your mutations are the same, then the problem was in your starting material and not the PCR.
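The shared-versus-random question can be checked quickly once the reads are aligned to the ordered sequence. Below is a minimal sketch, assuming the reads are already trimmed and aligned without gaps to the same length as the reference; the sequences and the helper names (`mismatches`, `shared_and_private`) are made up for illustration.

```python
def mismatches(reference, read):
    """Return a set of (position, ref_base, read_base) for an ungapped pair.

    Positions are 1-based. Bases called as N in the read are ignored.
    """
    if len(reference) != len(read):
        raise ValueError("sequences must be pre-aligned to the same length")
    pairs = zip(reference.upper(), read.upper())
    return {
        (pos, ref_base, read_base)
        for pos, (ref_base, read_base) in enumerate(pairs, start=1)
        if ref_base != read_base and read_base != "N"
    }


def shared_and_private(reference, clones):
    """Split mismatches into those seen in every clone and those seen in only some."""
    per_clone = {name: mismatches(reference, seq) for name, seq in clones.items()}
    shared = set.intersection(*per_clone.values()) if per_clone else set()
    private = {name: muts - shared for name, muts in per_clone.items()}
    return shared, private


# Hypothetical toy data, not real reads.
reference = "ATGGCCAAGTTGACC"
clones = {
    "clone1": "ATGGCTAAGTTGACC",
    "clone2": "ATGGCTAAGTTGTCC",
}

shared, private = shared_and_private(reference, clones)
print("shared by all clones:", sorted(shared))
for name, muts in private.items():
    print(name, "only:", sorted(muts))
```

Changes that show up in every clone point to the starting material; changes that differ from clone to clone are more consistent with polymerase errors.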

Lastly, I agree with pDNA that seeing an example of the chromatogram would be helpful to spot issues with the sequencing reactions themselves.

Best of Luck.


I used Platinum Taq polymerase (from Invitrogen) because I needed to do TOPO cloning. I compared sequences between clones that were transformed with the same PCR product, and the nucleotide changes are virtually identical. I did consider codon usage optimization, but (1) I was using the sequence data from the company as one of my templates (they claimed to have sequenced the DNA before sending it to me), and (2) although some of the changes still code for the same amino acid, many code for totally different amino acids. Not to mention we are talking about well over 100 changes in a gene of a little over 1400 bases. I tried to upload the data, but it wouldn't let me upload .ab1 files. Thanks for the help.


Sounds weird!
100 mutations per 1400 bp is an exceptionally high mutation rate, and I would not expect it to be due to the polymerase.
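A rough back-of-the-envelope estimate shows why. The sketch below assumes a Taq error rate of about 1e-4 per base per cycle (toward the high end of commonly quoted values) and 30 cycles; neither number comes from this thread, and the simple product is only a crude upper bound on errors per cloned molecule.

```python
def expected_pcr_errors(length_bp, cycles, error_rate):
    """Crude upper bound on polymerase errors accumulated per amplified molecule."""
    return length_bp * cycles * error_rate


length_bp = 1400        # gene length from the thread
cycles = 30             # assumed cycle number, not stated in the thread
taq_error_rate = 1e-4   # assumed errors per base per cycle, high-end figure
observed = 100          # "well over 100 changes" reported in the thread

expected = expected_pcr_errors(length_bp, cycles, taq_error_rate)
print(f"expected at most ~{expected:.1f} errors per clone, observed {observed}")
print(f"observed is ~{observed / expected:.0f}x the pessimistic estimate")
```

Even with pessimistic assumptions the estimate is a handful of changes per clone, far fewer than 100, and polymerase errors would also not be identical from clone to clone.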

However, for cloning purposes I would use a polymerase with a lower error rate than Taq (e.g. Phusion polymerase; I don't want to advertise, but for me this enzyme turned out to be the best choice).

To me this looks like the template is somehow compromised. Maybe you can check whether the template is already wrong by doing a restriction digest with an enzyme whose site is "unique" to your sequenced clone, i.e. present in the sequence you got back but not in the sequence you ordered. If this enzyme also cuts the template, then you can assume that the template is already wrong.
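Picking such an enzyme can be done in silico first. Here is a minimal sketch in plain Python, assuming you have both the ordered sequence and the sequence read from the clone as strings; the enzyme table is a small hand-picked set of common palindromic sites, and the example sequences and function names are hypothetical.

```python
# Recognition sequences for a few common enzymes (all palindromic,
# so searching one strand is enough).
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
    "NotI": "GCGGCCGC",
}


def site_positions(seq, site):
    """Return 1-based start positions of a recognition site in seq."""
    seq = seq.upper()
    width = len(site)
    return [i + 1 for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


def diagnostic_enzymes(expected_seq, sequenced_seq, sites=SITES):
    """Enzymes whose site is present in one sequence but absent from the other."""
    result = {}
    for name, site in sites.items():
        in_expected = site_positions(expected_seq, site)
        in_sequenced = site_positions(sequenced_seq, site)
        if bool(in_expected) != bool(in_sequenced):
            result[name] = {"expected": in_expected, "sequenced": in_sequenced}
    return result


# Hypothetical toy sequences, not real data.
expected_seq = "AAAGAATTCAAACCCGGG"
sequenced_seq = "AAAGAGTTCAAAGGATCC"

for enzyme, hits in diagnostic_enzymes(expected_seq, sequenced_seq).items():
    print(enzyme, hits)
```

An enzyme listed only under "sequenced" is the kind of candidate described above: if it cuts the original synthesized plasmid insert too, the changes were already in the template. Any site in the vector backbone would need to be accounted for separately.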

Good luck!



Just an idea, are you sure your E. coli aren't, by some mistake, a random mutagenesis strain?

As for the sequencing file.
Upload it to (for example), and then share the link.