Which BLAST db to download for mouse genome? - (May/19/2011 )

I've got local BLAST+ set up on a few computers here and I've been using it for a while against my own compiled databases. However I'd now also like to do some BLASTing against the mouse genome - something I've always done online in the past.

For the life of me I can't work out from the BLAST documentation which files I actually need to download from their FTP server to achieve this. Can someone point me in the right direction? I'm after the file/s for the mouse genomic reference assembly - I don't need transcripts, or anything other than that (unless they're all bundled together anyway).

From the server I can see the refseq_genomic_**.tar.gz files, but the names don't specify a species, so I assume they include the human assemblies as well. Given that there are 35 of them at 1 GB apiece, I'm hoping there's somewhere I can get the files for only the mouse sequence?

Thanks in advance :)

-A.N.Other-


RefSeq is a multi-species database containing curated sequences from NCBI, so it's a huge database. Take a look at this page:
NCBI RefSeq Project.

Maybe you can try downloading the mouse sequences from the ENSEMBL databases and converting them with the formatdb command (makeblastdb is the BLAST+ equivalent). Here's the link to the sequences.
ENSEMBL FTP

Good luck.

Christian.

-CHODAR-

Thanks. I downloaded the files from ENSEMBL and compiled them with makeblastdb and blastdb_aliastool.
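
For anyone following along, here is a rough sketch of what that step can look like. It is illustrative only and not necessarily the exact commands used here. It assumes the ENSEMBL chromosome FASTA files have already been decompressed into the current directory as *.fa. The database names and the alias name "mouse_genome" are made up for the example. The script builds one makeblastdb call per FASTA file, then a single blastdb_aliastool call that groups the resulting databases under one name.

```python
from pathlib import Path
import shlex
import shutil
import subprocess


def makeblastdb_cmd(fasta, db_name):
    """Build the makeblastdb call for one nucleotide FASTA file."""
    return [
        "makeblastdb",
        "-in", str(fasta),
        "-dbtype", "nucl",
        "-out", str(db_name),
        "-title", Path(db_name).name,
    ]


def aliastool_cmd(db_names, alias_name, title):
    """Build the blastdb_aliastool call that groups several databases under one alias."""
    return [
        "blastdb_aliastool",
        "-dblist", " ".join(str(name) for name in db_names),
        "-dbtype", "nucl",
        "-out", str(alias_name),
        "-title", title,
    ]


def build_commands(fasta_files, alias_name="mouse_genome"):
    """Return one makeblastdb command per FASTA file, followed by the alias command."""
    # Each database is named after its FASTA file minus the final extension.
    db_names = [str(Path(f).with_suffix("")) for f in fasta_files]
    cmds = [makeblastdb_cmd(f, db) for f, db in zip(fasta_files, db_names)]
    cmds.append(aliastool_cmd(db_names, alias_name, "Mouse genome (Ensembl)"))
    return cmds


if __name__ == "__main__":
    # Assumption: decompressed chromosome FASTA files sit in the current directory.
    fastas = sorted(Path(".").glob("*.fa"))
    if not fastas:
        print("No *.fa files found; nothing to do.")
    else:
        for cmd in build_commands(fastas):
            print(shlex.join(cmd))
            # Only run the command if the BLAST+ tool is on the PATH.
            if shutil.which(cmd[0]):
                subprocess.run(cmd, check=True)
```

Once the alias exists, passing its name to -db in blastn should search all the chromosome databases in one go.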

All sorted and up and running. Thanks for the quick response :)

-A.N.Other-