Posted 09 February 2009 - 07:25 AM
Posted 09 February 2009 - 09:41 AM
I know this works with a file of fasta sequences; I've never tried entering a list of accession numbers, though I'd be surprised if it didn't work with multiple accession numbers, just like it works with a single file containing multiple fasta sequences.
Posted 09 February 2009 - 11:46 AM
Perhaps I should describe what I'm trying to do, which might give you an idea. I did a ChIP-chip experiment with primary mouse cells, and I want to compare that data set with an expression microarray experiment our lab did a while back with rat tissue. The samples are from the same tissues, just in different species.
Posted 09 February 2009 - 07:40 PM
The web is not a problem per se; there is a way to send specific queries to the database (see here), and Perl can handle the web scraping with a module like WWW::Mechanize, but the iHOP people say:
Please note that iHOP is a freely available tool from the academic domain. Not a company! Thus, it is necessary to limit server load and to give preference to individual users.
Bulk downloads may lead to the banning of IP addresses for specific servers or institutions!
Please contact directly with Robert Hoffmann if there should be a scientific reason for bulk downloads. Thank you for your cooperation!
I don't know whether they would consider several dozen requests a minute originating from the same IP address to be "bulk downloading"....
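Whatever their threshold turns out to be, spacing the requests out is the polite default. Here is a minimal sketch of a throttle, written in Python rather than Perl for brevity. The `Throttle` class is a hypothetical helper, and the 5-second interval in the usage comment is an arbitrary assumption, not a rate iHOP has approved; asking them directly is still the safer route.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        # clock and sleep are injectable so the class can be tested
        # without actually waiting.
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        """Block until at least min_interval_s has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Usage sketch (fetch_page is a placeholder for whatever does the request):
#
#     throttle = Throttle(5.0)   # assumed interval, tune to what the site allows
#     for gene in genes:
#         throttle.wait()
#         fetch_page(gene)
```

The first call returns immediately; every later call sleeps only for whatever part of the interval has not already elapsed, so slow responses do not add extra delay.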
Posted 10 February 2009 - 05:34 AM
I looked into that, but I'm not too interested in sequence homology; rather, I want to know what the rat homolog of a given mouse gene is.
Of course, the rat homolog of a given mouse gene (otherwise known as an ortholog) is defined by its sequence homology to the mouse gene, no?
Waiting for BLAST to run takes longer (per gene) than what I've been doing.
This may be true if you use the web interface, but if you create a local BLASTable database of the rat genes (using the NCBI program makeblastdb.exe or the older formatdb.exe) and a local copy of the BLAST program (either blastn.exe, blastp.exe, or the older blastall.exe), you can speed things up considerably. Using a Perl script, I can take ~5,000 genes from one organism and find the closest match to each of them in another organism in minutes...
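To give an idea of the post-processing step, here is a rough sketch (in Python, not the Perl script mentioned above) that picks the top-scoring rat hit for each mouse query. It assumes BLAST was run locally with tabular output (`-outfmt 6` in BLAST+, `-m 8` in the older blastall), where the default 12 columns start with query ID and subject ID and end with the bit score. The function name `best_hits` and the IDs in the example are made up for illustration.

```python
import csv
import io


def best_hits(tabular_text):
    """Return {query_id: (subject_id, bitscore)} with the best hit per query.

    Expects default tabular BLAST output: 12 tab-separated columns,
    query ID first, subject ID second, bit score last.
    """
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        # Skip blank lines, comment lines, and rows that are too short.
        if len(row) < 12 or row[0].startswith("#"):
            continue
        query_id, subject_id = row[0], row[1]
        bitscore = float(row[11])
        # Keep the hit only if it beats what we have seen for this query.
        if query_id not in best or bitscore > best[query_id][1]:
            best[query_id] = (subject_id, bitscore)
    return best


# Example with invented IDs and scores:
example = (
    "mouseA\tratX\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-50\t180.0\n"
    "mouseA\tratY\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-20\t90.5\n"
    "mouseB\tratZ\t70.0\t100\t30\t0\t1\t100\t1\t100\t1e-10\t60.0\n"
    "mouseB\tratW\t75.0\t100\t25\t0\t1\t100\t1\t100\t1e-15\t75.0\n"
)
hits = best_hits(example)
```

Keep in mind that a one-way best hit is only a first approximation of an ortholog; running the search in both directions and keeping reciprocal best hits is a stricter check.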
Posted 11 February 2009 - 02:54 PM
You could do this in pure SQL or dollop some #!/usr/bin/perl on top and use DBI;
Oh, and you could do rat and human at the same time.
mysql -u anonymous -h ensembldb.ensembl.org
Edited by perlmunky, 11 February 2009 - 03:06 PM.