Strange Vector Problems - (Sep/28/2011)
I have cloned environmental DNA into a fosmid vector, which is essentially an 8.1 kb plasmid whose copy number is inducible with L-arabinose. I am using Sanger sequencing to read about 800 bases from the ends of the approximately 40 kb insert. Most of my sequences are fine; however, when I run BLASTn, about 10% of the sequences come back as vector, even after vector trimming.
What is even stranger is that the vector that shows up in the BLAST results is not the vector I used. Typically about the first 300 bases are reported as vector (and they are the same 300 bases in each case), and if I remove them manually and BLAST the remaining 500 bases, I get a legitimate insert. Also, each clone can be sequenced from either end of the insert, and this problem is usually present on only one side (with no pattern in terms of forward or reverse primer). When I take the 300 or so bases that BLAST calls "vector" and search my own vector sequence for them, they are not found even when I allow 25 mismatches.
There are other cases where a read hits yet another vector, different from the two mentioned above, and the hit covers only about 80 bases. No other results show up, but if I cut out this vector sequence (which is not my vector, because I have already clipped that and the sequence has no homology to the vector I am using) and BLAST the rest of the read, a legitimate insert is detected.
Any ideas on what could be causing this? Mispriming during sequencing? Software limitations?
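For anyone wanting to repeat the check described above (looking for the ~300 "vector" bases inside the real fosmid vector while tolerating mismatches), here is a minimal Python sketch. The function names are made up for illustration. It only tries ungapped placements on both strands, so it is a rough sanity check and not a substitute for a proper alignment, which would also catch insertions and deletions.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T, N only)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def best_hit(query, target):
    """Best ungapped placement of query in target, checking both strands.

    Returns (mismatches, position, strand), where position is the 0-based
    start on the target as given, or None if the query is longer than the
    target. The target is treated as linear, so a match spanning the
    arbitrary start/end of a circular vector sequence would be missed.
    """
    target = target.upper()
    best = None
    candidates = (("+", query.upper()),
                  ("-", reverse_complement(query).upper()))
    for strand, q in candidates:
        for pos in range(len(target) - len(q) + 1):
            window = target[pos:pos + len(q)]
            mismatches = sum(1 for a, b in zip(q, window) if a != b)
            if best is None or mismatches < best[0]:
                best = (mismatches, pos, strand)
    return best
```

If `best_hit(extra_bases, fosmid_vector)` comes back with far more than 25 mismatches for a 300-base query, that agrees with the observation that the sequence is not derived from the cloning vector.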
Are the vectors that are showing up in your BLASTn searches being used in your lab? If they are, then I would say that your cloning reactions or reagents may have been contaminated with some of these vectors, which then got ligated in along with some of your inserts. Are the junctions between the extra "vector" sequence and your "insert" sequence at sites for restriction enzymes that were used in the cloning, or are they ragged junctions (i.e. different for every clone)? This will tell you whether the problem is systematic or random, and could indicate where the contamination is coming from.
Best of Luck.
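One way to answer the junction question without eyeballing every alignment is to tabulate where each vector hit ends. The sketch below assumes the BLASTn results were saved in the standard 12-column tabular format (`-outfmt 6`: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The function names and the subject IDs passed in are placeholders, not anything from the original posts.

```python
from collections import Counter


def junctions(tabular_text, vector_ids):
    """Map each read ID to (query_end, subject_end) of its first vector hit.

    tabular_text: contents of a BLAST tabular (-outfmt 6) file.
    vector_ids:   set of subject IDs that should be treated as vector.
    BLAST lists the best hit for a query first, so only the first
    matching line per read is kept.
    """
    found = {}
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        read_id, subject_id = fields[0], fields[1]
        query_end, subject_end = int(fields[7]), int(fields[9])
        if subject_id in vector_ids and read_id not in found:
            found[read_id] = (query_end, subject_end)
    return found


def subject_end_counts(found):
    """Count how often each vector coordinate appears as the junction."""
    return Counter(subject_end for _, subject_end in found.values())
```

A single dominant value in `subject_end_counts` would point to a systematic junction (the same breakpoint in every clone), while many different values would suggest ragged, random junctions.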
Hi, thanks for answering.
The vectors that come up in the BLASTn searches are not being used in the lab, and the inserts were prepared by mechanical shearing, so no restriction enzymes were involved. The "extra vector" that persists after trimming, and is not the vector used in construction, is the same for each clone and is consistently about 294-300 bases long. What's strange is that, for example, the forward read may have this problem while the reverse read won't. In some cases I have resequenced the clone and the vector sequence is not there. Could this be a problem with the sequencing, since it is the same vector sequence for each clone and resequencing has some effect? I should mention again that this vector is not one that I or anyone in my lab uses, so it is puzzling why it is there, and interesting that clipping it off manually seems to give an appropriate insert.
If you can resequence the same clone with the same primer and not have the vector sequence show up, then the problem is definitely with the sequencing. Are you doing the sequencing yourself or are you sending it out? Again, if the extra vector sequence is from a vector not being used in the lab, then the contamination must be coming from the sequencer or the sequencing lab, but I can't think of a way for this to happen without the actual sequence data looking really ugly (i.e. overlapping sequence, poor peak resolution). Are there any unique restriction sites that are in this extra vector sequence that could be used to digest your fosmid DNA preps to verify the presence or absence of this sequence? This is definitely something I've never heard of before.
Best of Luck.