Downloading a complete trace archive for a species - (Oct/26/2012)

Hi,

I'm interested in downloading all of the EST sequences for a species that are stored in the trace archives reachable through "BLAST".

Do you know how to do that?

cheers,

-mordiano-

Yes, you can. You have to access the NCBI FTP site, which is linked from here: http://www.ncbi.nlm.nih.gov/guide/data-software/#downloads_

http://www.ncbi.nlm.nih.gov/books/NBK49541/
Can I retrieve a large dataset for a particular organism?

For large datasets, you can formulate a search limited to an organism (e.g., pig) in Entrez, display all the records in your desired format, and then save them using the "Send to file" option from the toolbar. Confirm the message that asks whether you want to download that number of records. You can also use Batch Entrez to download a database-specific file of accessions or GIs. You can download some organism-specific files, for example genomes, from the NCBI FTP site. You can also use the Entrez Utilities (E-Utils).
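
The E-Utils route mentioned above could look roughly like the sketch below: one ESearch call stores the organism-limited result set on the NCBI history server, then EFetch pulls it down as FASTA in batches. The function names, the batch size and the pause between requests are illustrative assumptions, and the database name is left to the caller because the right one depends on where the records actually live.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(organism, db):
    """Build an ESearch URL limited to one organism (hypothetical helper)."""
    params = {
        "db": db,                          # assumption: caller picks the Entrez database
        "term": f"{organism}[Organism]",   # same idea as the "pig" example in the quote
        "usehistory": "y",                 # keep the result set on the history server
        "retmode": "json",
        "retmax": 0,                       # only the count and history keys are needed
    }
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def efetch_url(db, webenv, query_key, retstart, retmax=500):
    """Build an EFetch URL for one batch of FASTA records (hypothetical helper)."""
    params = {
        "db": db,
        "WebEnv": webenv,
        "query_key": query_key,
        "retstart": retstart,
        "retmax": retmax,
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def download_fasta(organism, db, out_path, batch=500):
    """Search once, then append every batch of FASTA text to out_path."""
    with urlopen(esearch_url(organism, db)) as resp:
        result = json.load(resp)["esearchresult"]
    count = int(result["count"])
    with open(out_path, "w") as out:
        for start in range(0, count, batch):
            url = efetch_url(db, result["webenv"], result["querykey"], start, batch)
            with urlopen(url) as resp:
                out.write(resp.read().decode())
            time.sleep(0.4)  # assumption: stay under the unauthenticated request limit
    return count
```

A call such as `download_fasta("pig", "nucleotide", "pig.fasta")` would then write everything the search matches into one file. For very large result sets the FTP site is probably still the more practical option.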

-pcrman-

Thanks for the answer, and sorry for being such a noob,

but when I click on "trace archives" it's just empty, with no species listed as promised:

ftp://ftp.ncbi.nlm.nih.gov/pub/TraceDB

When I go through "BLAST -> trace archives", sure enough, the list of species appears...

-mordiano-

Right....

I also have to wait about 5 s for the folders to appear.

That one was on me.

-mordiano-

I used Firefox to access the FTP site ftp://ftp.ncbi.nlm.nih.gov/pub/TraceDB/. Organism-specific trace files are there.

Try to access the FTP site using a different browser or an FTP client program.
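
If no FTP client is at hand, a few lines of Python with the standard `ftplib` module can stand in for one. This is only a sketch: the host and directory are the ones from the URL above, the function names are made up for illustration, and the file names inside each species folder are not assumed here, so list a folder first and pick from what is actually there.

```python
from ftplib import FTP

HOST = "ftp.ncbi.nlm.nih.gov"   # host taken from the URL in this thread
TRACE_DIR = "/pub/TraceDB"      # directory taken from the URL in this thread


def connect(host=HOST):
    """Open an anonymous FTP session (hypothetical helper)."""
    ftp = FTP(host, timeout=60)
    ftp.login()  # no arguments means anonymous login
    return ftp


def list_entries(ftp, path=TRACE_DIR):
    """Return the sorted names found in a remote directory."""
    ftp.cwd(path)
    return sorted(ftp.nlst())


def download_file(ftp, remote_name, local_path):
    """Fetch one file from the current remote directory in binary mode."""
    with open(local_path, "wb") as fh:
        ftp.retrbinary(f"RETR {remote_name}", fh.write)
```

Typical use would be `ftp = connect()`, then `list_entries(ftp)` to see the species folders, `list_entries(ftp, TRACE_DIR + "/" + species)` to see the files for one of them, and `download_file(ftp, name, name)` for each file wanted, finishing with `ftp.quit()`.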

-pcrman-

Got it!


...as a FASTA file... Is there a way to access the metadata?

I was told by my first source (1.) that the sequences for my species were from an EST analysis. The second source (2.) told me there were no ESTs there, and that they are in the EST repository instead.

I'm sticking with (1.) for now. It is a "strange" species, so maybe they just "dumped" everything in the Trace archives... :/

-mordiano-