When did BLAST become so f... up? - (Feb/19/2012 )

I've been BLASTing for about 10 years now, usually primers.

But some time ago BLAST made changes that really made me wonder.

When I BLASTed a primer in the past with Human genomic and transcript selected, it showed me a table with the transcript hits first, then the genomic hits, nicely sorted by score or E-value or whatever is actually important.

Now what I get is:
Genomic sequences.
Transcripts
Genomic sequences (again!)
Transcripts (again!)
Genomic sequences (here we go..)

and what's worse, it isn't sorted at all!
I actually got 100% coverage and 100% identity for my sequence, but that particular hit is shown NEAR THE BOTTOM!
First line of the table:
NT_007933.15 Homo sapiens chromosome 7 genomic contig, GRCh37.p5 Primary Assembly 34.2 569 100% 1.2 100% (Accession, Description, Max score, Total score, Query coverage, E value, Max ident - that's my gene all right)
The link leads to the first detailed alignment, which doesn't contain the 100% coverage hit. Like... WHAT?

Then there is another line in the table, in the third Genomic sequences part, that shows a much lower E-value even though the coverage and identity are the same:
NT_007933.15 Homo sapiens chromosome 7 genomic contig, GRCh37.p5 Primary Assembly 44.1 44.1 100% 0.001 100%
WHAT THE..? That is the same sequence, why is it there twice?? And the link leads to... the same detailed alignment as the first one.

So I Primer-BLASTed both my primers and got a hit on NT_007933.15 (surprise!) at position 38802528 (start of the forward primer). I searched for that position on the BLAST page, and there it was, near the bottom: another detailed alignment for NT_007933.15, but this time with my 100% hit. Found it. After 15 minutes!

Try it yourself with BLAST result M21N9F2A012 (or BLASTn cacagagagagtctggacacgt on Human, no other options selected). I tried it with two different browsers, so it shouldn't be a display error or something.

What the hell happened with that? It's not even able to show me a 100% hit on my sequence anymore. I just hope it's some mistake of mine, because this can't be real. I really hope.

(If this sounds angry, frustrated and aggressive, that's because I'm angry, frustrated and aggressive)

-Trof-

hi Trof,

I just did a small test on the NCBI BLAST webpage. I'm using Firefox 9.0.1 under Ubuntu 11.04, but as you said, the web browser shouldn't be a problem. I don't know exactly what you mean. What I did was select nucleotide blast from this webpage http://blast.ncbi.nlm.nih.gov/ then the database (in this case Human genomic + transcript) and "Optimize for Highly similar sequences (megablast)", with all the other settings at their defaults.


correction:

Right. I did what you said. It's true, you do have 100% identity, but the score is lower (and the E-value correspondingly worse) because the alignment does not cover the whole primer sequence; it only spans nucleotides 6 to 21, whereas the first hit:

ref|NT_007933.15| Homo sapiens chromosome 7 genomic contig, GRCh37.p5 Primary Assembly

does include all 22 nucleotides of your sequence.
Does this make more sense now?


cheers.

tj.

-toejam-
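
To put numbers on the score/E-value point above, here is a rough sketch, assuming the usual relation between a BLAST bit score S' and its E-value, E = N * 2^(-S'), where N is the effective search space. N is not given anywhere in this thread, so the code back-computes it from one of the two table rows quoted in the first post; the function names are made up for illustration and are not part of any BLAST tool.

```python
def e_value(bit_score, search_space):
    """Expected number of chance hits, assuming E = N * 2**(-S')."""
    return search_space * 2.0 ** (-bit_score)


def search_space_from_hit(bit_score, e):
    """Invert the same relation to estimate N from one reported hit."""
    return e * 2.0 ** bit_score


# The two rows quoted in the thread for NT_007933.15: (Max score, E value).
higher_scoring_hit = (44.1, 0.001)
lower_scoring_hit = (34.2, 1.2)

# N is an estimate only: the reported E value 0.001 is heavily rounded.
n_eff = search_space_from_hit(*higher_scoring_hit)

# Predict the E-value of the 34.2-bit hit from that estimated N.
predicted = e_value(lower_scoring_hit[0], n_eff)

# Each bit of score halves the E-value, so the score gap alone fixes the ratio.
fold = 2.0 ** (higher_scoring_hit[0] - lower_scoring_hit[0])

print(f"estimated effective search space: {n_eff:.2e}")
print(f"predicted E for the 34.2-bit hit: {predicted:.2f} "
      f"(BLAST reported {lower_scoring_hit[1]})")
print(f"E-value ratio implied by the 9.9-bit gap: about {fold:.0f}x")
```

The predicted E-value comes out close to 1, the same order of magnitude as the reported 1.2, so under this assumption the two rows are consistent with each other: identity and query coverage are both 100% in the table, but a roughly 10-bit drop in the score of the best segment is enough to move the E-value by about three orders of magnitude.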

Some time after I wrote this, BLAST stopped behaving this way and now displays results as before. Thank Darwin!

The problem wasn't my understanding of the E-value (the E-value gets worse when the alignment doesn't cover the whole query, that's logical, I get the concept) but the fact that the results were sorted in a really weird way and the on-page links didn't point where they should. It was probably some problem on the BLAST side, since it's OK now.

I should have taken a screenshot to show how it looked, but I was too angry.

-Trof-