Protocol Online logo
Top : Forum Archives: : Bioinformatics and Biostatistics

Using Remote NCBI blast - (Mar/07/2006 )

Hi
I have tried to do a remote blast by doing RemoteBlast.pm but I am unable to get the results. I am attaching my perl script and error.

I am new to bioperl please help me to solve it


SCRIPT:

package Bio::Tools::Run::RemoteBlast;
package Bio::Search::Result::ResultI;
use Bio::Tools::Run::RemoteBlast;
use Bio::Search::Result::ResultI;
use strict;
my $prog = 'blastp';
my $db = 'swissprot';
my $e_val= '1e-10';

my @params = ( '-prog' => $prog,
'-data' => $db,
'-expect' => $e_val,
'-readmethod' => 'SearchIO' );

my $factory = Bio::Tools::Run::RemoteBlast->new(@params);

#change a paramter
$Bio::Tools::Run::RemoteBlast::HEADER{'ENTREZ_QUERY'} = 'Homo sapiens [ORGN]';

#remove a parameter
delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};

my $v = 1;
#$v is just to turn on and off the messages

my $str = Bio::SeqIO->new(-file=>'hum' , '-format' => 'fasta' );

while (my $input = $str->next_seq()){
#Blast a sequence against a database:
print "$input\n\n";
#Alternatively, you could pass in a file with many
#sequences rather than loop through sequence one at a time
#Remove the loop starting 'while (my $input = $str->next_seq())'
#and swap the two lines below for an example of that.
my $r = $factory->submit_blast($input);
#my $r = $factory->submit_blast('amino.fa');

print STDERR "waiting..." if( $v > 0 );
while ( my @rids = $factory->each_rid ) {
foreach my $rid ( @rids ) {
my $bl= '.BLASTQ1';my $rid ="$rid$bl";
print "RID $rid\n";
my $rc = $factory->retrieve_blast($rid);
print "RC $rc\n\n";
if( !ref($rc) ) {
if( $rc < 0 ) {
$factory->remove_rid($rid);
}
print STDERR "." if ( $v > 0 );
sleep 5;
} else {
my $result = $rc->next_result();
print "Result\n $result\n";
#save the output
my $filename = $result->query_name()."\.out";
print "File $filename\n";
$factory->save_output($filename);
$factory->remove_rid($rid);
print "\nQuery Name: ", $result->query_name(), "\n";
while ( my $hit = $result->next_hit ) {
next unless ( $v > 0);
print "\thit name is ", $hit->name, "\n";
while( my $hsp = $hit->next_hsp ) {
print "\t\tscore is ", $hsp->score, "\n";
}
}
}
}
}
}

# This example shows how to change a CGI parameter:
$Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'} = 'BLOSUM25';

# And this is how to delete a CGI parameter:
delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};


OUTPUT:

Bio::Seq=HASH(0x850ca60)

waiting...RID 1141732920-14923-211018296944.BLASTQ1
RC Bio::SearchIO::blast=HASH(0x86ed580)

Result

Can't call method "query_name" on an undefined value at 2.pl line 55, <GEN4> line 18.




Thanks

SAnjibGupta

-sanjibgupta-

I actually gave up on the RemoteBlast module, and instead use a system call to NCBI's netblast client, blastcl3.exe (see here), and parse the blast output file with Bio::SearchIO. So, I might do something like:

CODE
#!/usr/bin/perl -w
use strict;
use warnings;
use Bio::SearchIO;
my $netblast_dir = $ENV{USERPROFILE} . "\\Desktop\\netblast";
my $blast_report;

chdir ($netblast_dir) or die;

system ("blastcl3 -p blastn -d nr -i in.fasta -o out.blast");

$blast_report = new Bio::SearchIO ('-format' => 'blast', '-file' => 'out.blast');

while ( my $result = $blast_report->next_result ) {
    my $hit_num = 0;
    while( my $hit = $result->next_hit ) {
    $hit_num++;
    while( my $hsp = $hit->next_hsp ) {
        my $hsp_num++;
        my $hitName = $hit->name;
        my $percent_id = sprintf ("%.2f", $hsp->percent_identity);
        my $hspLength = $hsp->hsp_length;
        my $hspStart = $hsp->start('hit');
        my $hspEnd = $hsp->end('hit');
        my $numID = $hsp->num_identical;
        my $numGaps = $hsp->gaps;
        print "Hit number: $hit_num, hsp number: $hsp_num\n\t$hitName\n" .
        "\t\tPercent ID: $percent_id\n" .
        "\t\tLength:  $hspLength\n" .
        "\t\tRange: $hspStart - $hspEnd\n" .
        "\t\tIdentical residues: $numID\n" .
        "\t\tGaps: $numGaps\n\n";
    }
    }
}

print "Done.\n";

#other things available:
#    result level:
#    my $queryName = $result->query_name;
#    my $queryAcc = $result->query_accession;
#    my $queryLength = $result->query_length;
#    my $queryDesc = $result->query_description;
#    my $dbName = $result->database_name;
#    my $numSeqsInDb = $result->database_entries;
#    my $numHits = $result->num_hits;
#    hit level:
#    my $hitAcc = $hit->accession;
#    my $hitDesc = $hit->description;
#    my $hitEvalue = $hit->significance;
#    my $hitBits = $hit->bits;
#    my $numHsps = $hit->num_hsps;
#    hsp level:
#    my $hspEvalue = $hsp->evalue;
#    my $fracIdentical = $hsp->frac_identical;
#    my $fracConserved = $hsp->frac_conserved;
#    my $hspQuerySeq = $hsp->query_string;
#    my $hspHitSeq = $hsp->hit_string;
#    my $hspConsensus = $hsp->homology_string;
#    my $hspRank = $hsp->rank;
#    my $queryStrand = $hsp->strand('query');
#    my $hitStrand = $hsp->strand('hit');
#    my $queryStart = $hsp->start('query');
#    my $queryEnd = $hsp->end('query');
#    my $hspScore = $hsp->score;
#    my $hspBits = $hsp->bits;


Hope this helps!

-HomeBrew-

Thanks for you quick help. I am running it Linux. Your scripts runs ok nut it shows 2 errors. Please suggest what i should do.



Can't locate object method "hsp_length" via package "Bio::Search::HSP::GenericHSP" at ./6.pl line 22, <GEN1> line 10072.

Can't locate object method "num_identical" via package "Bio::Search::HSP::GenericHSP" at ./6.pl line 25, <GEN1> line 10072.

Thanks
Sanjib Gupta

-sanjibgupta-

This is a bit of a hack, but what you could do is use lynx.
Open linux with the keylogger attached, make one example input (it has to be a complete run). Now using perl edit that keylogged file to change the sequence and parameters, whatever. write a bash loop to submit all your jobs.

Alternativly you could you web::mechanize - which would be a nicer non-hacky way of doing things.

-DPK-

I get no such error on either a Windows machine or a Linux box...

There are clearly methods named "hsp_length" and "num_identical" in the GenericHSP module (see here).

All I can think of is your BioPerl installation is out of date?

-HomeBrew-

QUOTE (DPK @ Mar 8 2006, 04:12 PM)
This is a bit of a hack, but what you could do is use lynx.
Open linux with the keylogger attached, make one example input (it has to be a complete run). Now using perl edit that keylogged file to change the sequence and parameters, whatever. write a bash loop to submit all your jobs.

Alternativly you could you web::mechanize - which would be a nicer non-hacky way of doing things.


I am a new user can u please send me a script.

-sanjibgupta-

I can't find any of my scripts (tar gzipped someplace) and I have my own work to do. Simply read the man page of lynx.
In any case this should be a last resort, as BioBrw suggested, update your copy of bioperl.

-DPK-