How to identify unknown sequence? - No BLAST identity!! (Nov/05/2008)
When I BLASTn/MegaBLAST against the NCBI databases (non-redundant nr/nt), I get very low similarities, like <20% base matches. Same goes for BLASTx. What does this mean? Does my gene have no known identity and no known function?
Are you blasting it as DNA? Have you tried it as an amino acid sequence? Or with TBLASTX?
Yes, I BLASTed it as DNA (BLASTn and BLASTx), and I also translated it using GeneRunner and BLASTed it as an amino acid sequence (TBLASTN). Here's an example of my TBLASTX result. But when I BLASTn against the Candida Genome Database's Assembly 19 complete sequence supercontigs, I get one good match (96%). With TBLASTX against CGD's ORF DNA, there are no significant matches.
What do I make of all this?
The image I uploaded was that of TBLASTX. Nothing very significant. VecScreen yielded no significant similarity.
Sorry - I didn't read your post carefully enough.
So, my next step would be to sequence it in the other direction, and see if you get the same sequence (to confirm that what you have is the true sequence, and not the result of a sequencing artifact).
If you've already done that, my next step would be to pick a forward primer from the 3' end and a reverse primer from the 5' end and extend the sequence.
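For the first check (sequencing in the other direction), here is a minimal Python sketch of how the two reads could be compared. It assumes both reads have already been trimmed to the same region and have the same length; the function names are made up for illustration and are not from any particular package.

```python
# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def mismatches(forward_read, reverse_read):
    """List 0-based positions where the two reads disagree.

    Assumption: both reads cover exactly the same region, so the
    reverse read only needs to be reverse-complemented, not aligned.
    """
    fwd = forward_read.upper()
    rev = reverse_complement(reverse_read).upper()
    if len(fwd) != len(rev):
        raise ValueError("reads must be trimmed to the same length")
    return [i for i, (a, b) in enumerate(zip(fwd, rev)) if a != b]
```

An empty list means the two directions agree. Any positions returned are worth checking on the chromatogram before extending the sequence.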
Oh no, I figured I might as well BLAST against the patent database in NCBI, and guess what? My gene has like 50% similarity (216/220 out of 500bp subject length, my sequence is 400bp) to a patented gene.....
Am I infringing on someone's patent? If not, should I continue with the primer walking?
In fact, a lot of my other "unknown" genes seem to have some similarity to patented genes as well. Why didn't these genes show up in the non-redundant nucleotide (nr/nt) database? Does this mean that anyone who thinks they have a sequence with no homology to anything should also BLAST it against the patented sequences?
Does anyone have any suggestions?
50%? 216/220 out of 4-500?
How does the similarity map look?
Is there a long run of similarity, or is it scattered? If there is a long run, then it may be an alternatively spliced gene product, or there may just be some conservation.
I would continue working on the project. If it turns out that it is from the same gene as the patent, then you can reference it. It may be from the same gene, but it is obviously not the same product.
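To make the "50%" question concrete, here is a rough Python sketch of the arithmetic. It assumes that "216/220" means 216 identical bases over a 220-base aligned region and that the query is 400 bp, as quoted above. The second function counts the longest unbroken run of matches in a BLASTN-style midline string (the row of `|` characters between query and subject). Both function names are hypothetical.

```python
def alignment_stats(identities, align_len, query_len):
    """Return (percent identity, percent query coverage, overall percent).

    identity = identical bases / aligned length
    coverage = aligned length / query length
    overall  = identical bases / query length
    """
    identity_pct = 100.0 * identities / align_len
    coverage_pct = 100.0 * align_len / query_len
    overall_pct = 100.0 * identities / query_len
    return identity_pct, coverage_pct, overall_pct


def longest_match_run(midline):
    """Length of the longest unbroken run of '|' in an alignment midline."""
    best = current = 0
    for ch in midline:
        if ch == "|":
            current += 1
            best = max(best, current)
        else:
            current = 0  # a mismatch or gap breaks the run
    return best


# Numbers quoted in the thread (interpretation assumed, see above).
identity, coverage, overall = alignment_stats(216, 220, 400)
print(f"{identity:.1f}% identity over {coverage:.1f}% of the query")
```

Under that reading, the hit is about 98% identical over roughly half of the query, which is a very different situation from 50% identity scattered along the whole sequence.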
thanks for your response.
I've uploaded the BLAST result against the patented sequence database in NCBI. What do you think?
Yes, I think continuing the project is good. After all the hard work....