NGS of bi-partite viral genome - (Apr/18/2014)
We asked a company to run NGS on an unknown purified virus, but the results look really strange after alignment and BLAST.
We think our virus is a begomovirus. These viruses have either two circular DNA genome components (bi-partite) or just one (mono-partite). We think the weird sequence could be because the company aligned the reads against a mono-partite genome, but we're not sure. Some parts of the sequence they gave us match begomoviruses, but other parts match rat and other species!
We asked the company to give us the raw data so we can align it ourselves, but I'm not really good with the software or with reading raw data, so I'm not sure what to do if we have a bi-partite genome.
Should I just add the sequence of the second genome component to the first one and align against both? And does a bi-partite genome affect the NGS reads in the first place?
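One way to read the "add the second genome" idea is to keep the two components as separate records in a single reference file instead of gluing them into one long sequence, since aligners generally accept a reference FASTA with more than one record. Below is a minimal Python sketch of that. The record names DNA-A and DNA-B, the toy sequences, and the helper names `parse_fasta` and `build_reference` are made up for illustration. The optional `pad` argument is an assumption on my part: it copies the first few bases of each circular sequence onto its end so that reads crossing the arbitrary start position still have something to align to.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def build_reference(records, pad=0, width=60):
    """Return one FASTA string holding every record as a separate entry.

    pad > 0 appends the first `pad` bases of each sequence to its end,
    a simple workaround for reads that span the start of a circular genome.
    """
    lines = []
    for header, seq in records:
        padded = seq + seq[:pad]
        lines.append(">" + header)
        for i in range(0, len(padded), width):
            lines.append(padded[i:i + width])
    return "\n".join(lines) + "\n"


# Toy example with hypothetical component sequences.
dna_a = ">DNA-A\nACGTACGTAA\n"
dna_b = ">DNA-B\nTTGGCCAA\n"
reference = build_reference(parse_fasta(dna_a + dna_b), pad=3)
print(reference)
```

With real data, the two inputs would be the downloaded sequences of the closest known DNA-A and DNA-B, and `pad` would be roughly one read length.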
I think it gets tricky when you are doing NGS on an unknown genomic sample. Is there any homology between the two circular DNA genomes, so that your reads could have been treated as one population instead of as two independent genomes? I have seen population contamination with eukaryotic samples (mouse, human), but I have never done any virus work. Do you know how they sequenced your DNA?
Are you just trying to confirm what organism you are dealing with?
My favorite software to use is SeqMan by DNASTAR. You could fragment your sequence and feed it through SeqMan to see what you are dealing with. If you are just trying to confirm whether it is a begomovirus, you shouldn't have any problem.
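If it helps, the "fragment your sequence" step can be done with a few lines of Python before loading anything into an assembler. This is only a sketch of one way to do it: the window size, the step, and the function name `fragment` are arbitrary choices for illustration, not anything SeqMan requires.

```python
def fragment(seq, size=500, step=250):
    """Split `seq` into overlapping windows.

    Returns a list of (start, piece) tuples, where `start` is the 0-based
    position of the piece in the original sequence. Windows are `size` bases
    long and begin every `step` bases; the last window may be shorter.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    if not seq:
        return []
    pieces = []
    start = 0
    while True:
        pieces.append((start, seq[start:start + size]))
        # Stop once the current window reaches the end of the sequence.
        if start + size >= len(seq):
            break
        start += step
    return pieces


# Toy example; a real contig would use something like size=500, step=250.
for start, piece in fragment("ACGTACGTAC", size=4, step=2):
    print(f">frag_{start + 1}_{start + len(piece)}")
    print(piece)
```

Each fragment can then be BLASTed or assembled on its own, which makes it easier to see which stretches look like begomovirus and which look like host or contaminant sequence.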
Sorry if this is scattered, it is early and I forgot my coffee at home.
Haha, thanks jerryshelly
I'll have a look at SeqMan. The funny thing is that I have an NGS certificate issued by Illumina, but I forgot how to use the software since my students mostly dealt with it. I have already received the raw data from the company. I'll try loading it into this software and see.
I will also read up on old articles about viruses with two circular genomes to see how people figured out, before NGS was even invented, that these viruses have two genomes.
Since these genomes are single-stranded DNA, they should give multiple bands on a gel. I am not sure whether any restriction enzyme (RE) that cuts ssDNA exists, so that I could make them linear. If our virus has a single genome, then its length should be around 2.8 kb.
I actually BLASTed the raw data just now. I can see a 99% match with another strain over 1.3 kb of it. So I think I'm getting closer.