
Search all of GenBank for a given sequence


4 replies to this topic

#1 GerryB

Posted 29 May 2009 - 02:28 PM

OK, this is a naive question, but is it possible to search GenBank (ALL of GenBank) for a given nucleotide (or protein) sequence?
So I have some sequence "ATTCGTAGCTGATGACGATGACATGGGATTTTGAGGGAAC" and I am curious what known sequences happen to have that very same substring somewhere in them.

This is different from a BLAST-like search where I am aligning a sequence to a known genome I provide. I mean a more general search, where I have some sequence I am trying to classify, or even just determine whether it's related to ANYTHING.

I do realize there are alignment tools like Maq, SOAP and MUMmer for aligning a given sequence against a reference. I'm asking more about searching "all known GenBank DNA sequences" in a database.

Is there such a tool? It would be interesting if I've found some sequence and I'm wondering what it may be related to, and boom, GenBank can say "hey, that's found here in the human genome, in 44 places, and it's in the mouse genome here, and there's a weird cancer variant in dogs that has it here..." and so on.

Or is such a tool a bizarre fantasy? GenBank has about 100M sequences (roughly 100G bases!), so maybe searching like that is impractical.
But if it weren't, would it be useful?
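
To make the "very same substring" idea concrete, here is a minimal sketch of an exact-match scan in Python. It is only an illustration of the question, not how GenBank or BLAST works internally: the record names and sequences are made-up toy data, and a linear scan like this would not scale to a database of GenBank's size. It checks both strands, since a match may sit on the reverse complement.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_exact_hits(query, records):
    """Return (record_id, strand, 0-based start) for every exact occurrence.

    `records` is a dict mapping a record name to its sequence. Positions on
    the "-" strand refer to where the reverse complement of the query starts
    in the record as written.
    """
    query = query.upper()
    patterns = [("+", query)]
    rc = reverse_complement(query)
    if rc != query:  # avoid reporting palindromic queries twice
        patterns.append(("-", rc))

    hits = []
    for strand, pattern in patterns:
        for record_id, sequence in records.items():
            sequence = sequence.upper()
            start = sequence.find(pattern)
            while start != -1:
                hits.append((record_id, strand, start))
                # step by one so overlapping occurrences are also found
                start = sequence.find(pattern, start + 1)
    return hits


QUERY = "ATTCGTAGCTGATGACGATGACATGGGATTTTGAGGGAAC"

# Hypothetical toy records, invented for this example only.
toy_records = {
    "toy_record_1": "GGG" + QUERY + "TTT",
    "toy_record_2": "CC" + reverse_complement(QUERY) + "AA",
    "toy_record_3": "ACGTACGTACGT",
}

hits = find_exact_hits(QUERY, toy_records)
for record_id, strand, start in hits:
    print(record_id, strand, start)
```

A real search also has to tolerate mismatches and gaps, which is where alignment tools come in.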

#2 HomeBrew

Posted 29 May 2009 - 07:48 PM

Why is that different from just BLASTing your sequence against the non-redundant nucleotide database?

#3 GerryB

Posted 29 May 2009 - 10:21 PM

Don't you have to specify what you want to compare against?
Or does the online BLAST server really search ALL of GenBank (the non-redundant parts at least, I assume)?

#4 HomeBrew

Posted 30 May 2009 - 07:44 AM

You do have to select what you want to search against, but you don't have to select a particular organism. Go here and click on the "nucleotide blast" link in the "Basic BLAST" section. Paste your sequence into the box in the "Enter Query Sequence" section, and select "Nucleotide collection (nr/nt)" from the database drop-down list in the "Choose Search Set" section, and click the BLAST button.
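
For anyone who would rather script this than click through the web form, the same choices (program, database, query) can be expressed as a request to NCBI's BLAST URL API. The sketch below only builds the request and does not send it. The parameter names (CMD, PROGRAM, DATABASE, QUERY) are the ones I believe the URL API uses, and "nt" is assumed to be the database name behind "Nucleotide collection (nr/nt)"; please check NCBI's current documentation and usage guidelines before relying on either. The function name build_blast_submission is made up for this example.

```python
from urllib.parse import urlencode, parse_qs

# NCBI BLAST endpoint (the same site the web form lives on).
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blast_submission(query, program="blastn", database="nt"):
    """Build (url, form_body) for a BLAST search submission.

    Nothing is sent over the network here. The parameter names and the
    default database name "nt" are assumptions to verify against NCBI's
    URL API documentation.
    """
    params = {
        "CMD": "Put",          # submit a new search
        "PROGRAM": program,    # nucleotide-vs-nucleotide search
        "DATABASE": database,  # assumed name of the nr/nt collection
        "QUERY": query,        # the raw sequence to search with
    }
    return BLAST_URL, urlencode(params)


url, body = build_blast_submission("ATTCGTAGCTGATGACGATGACATGGGATTTTGAGGGAAC")

# Decode the form body again just to show what would be posted.
decoded = parse_qs(body)
print(url)
print(decoded)
```

As far as I know, an actual submission returns a request ID that you then poll for results, so a complete script needs a second step that is not shown here.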

#5 GerryB

Posted 30 May 2009 - 07:57 AM

Ha! That shows where I misunderstood. I really thought you had to specify the genomes!
So that's pretty impressive that it can search the whole database!
Thanks for fixing my broken understanding.





©1999-2013 Protocol Online, All rights reserved.