How to use NCBI?


4 replies to this topic

#1 lyok


Posted 23 May 2012 - 02:34 AM

Hello all,

I have a problem with NCBI.

I have a sequence and I want to know what it might be.

What I do now is the following:
I go to the NCBI website and BLAST the sequence.
When I do this, I get a page with some results (see picture 1).
I see a lot of names with results.

But what's next?


What I do is check the best result and look up the position of that gene.

E.g. here it states that the gene starts at position 24.409 (see picture 2).
But how do I find that gene without manually searching for it on the page that shows the genome sequence of the bacterium containing my sequence? Or is this the only way to do it, so that I need to manually look up position 24.409 in the genome of that bacterium to find my gene (see picture 3)? Or is there an easier method?

I hope my question is reasonably clear.
fotke1.jpg fotke2.jpg fotke3.jpg


Another question: what if you have more than one result with 99% similarity? Here I have several. I can't look them all up manually, can I? That would take days.

Also, I did it with one gene and checked two results, and each result gives me a different gene (one says it's a phage DNA packaging gene and one says it's a bacterial gene).

Edited by lyok, 23 May 2012 - 03:55 AM.


#2 Felipillo


Posted 23 May 2012 - 10:36 AM

Hi lyok

Maybe you're looking for a BLAST parser like this one, http://kirill-kryuko...s/blast-parser/, to get the chromosome locations in a simple output. Also see this post: http://www.biostars....rom-the-genome/
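To illustrate what such a parser does, here is a minimal sketch in Python. It is not the tool linked above. It assumes you saved your results in BLAST's 12-column tabular format (the "Hit Table" text download on the website, or -outfmt 6 on the command line), and the function names and the example rows are made up for illustration. It picks the hit with the lowest E-value and cuts the matching region out of a genome sequence, so you do not have to scroll to the position by hand.

```python
import io

# The 12 default columns of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
INT_FIELDS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")
FLOAT_FIELDS = ("pident", "evalue", "bitscore")


def parse_tabular(handle):
    """Yield one dict per hit; skip blank lines and '#' comment lines."""
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        hit = dict(zip(COLUMNS, line.split("\t")))
        for key in INT_FIELDS:
            hit[key] = int(hit[key])
        for key in FLOAT_FIELDS:
            hit[key] = float(hit[key])
        yield hit


def best_hit(hits):
    """Lowest E-value wins; ties are broken by the higher bit score."""
    return min(hits, key=lambda h: (h["evalue"], -h["bitscore"]))


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def extract_region(genome, start, end):
    """Cut out 1-based, inclusive coordinates.

    In nucleotide searches a hit on the minus strand is reported with
    start > end; in that case the reverse complement is returned.
    """
    low, high = sorted((start, end))
    region = genome[low - 1:high]
    return reverse_complement(region) if start > end else region


# Hypothetical example rows, not real BLAST output.
EXAMPLE = (
    "query1\tgenomeA\t99.0\t12\t0\t0\t1\t12\t5\t16\t1e-50\t200\n"
    "query1\tgenomeB\t99.0\t12\t0\t0\t1\t12\t30\t19\t1e-50\t180\n"
)

if __name__ == "__main__":
    top = best_hit(parse_tabular(io.StringIO(EXAMPLE)))
    print(top["sseqid"], top["sstart"], top["send"])
```

With a real result file you would pass open("hits.tsv") instead of the example text, read the genome of the top hit from its FASTA file, and call extract_region(genome, top["sstart"], top["send"]).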

I think you should judge homology by the E-value rather than by percent similarity. Also, if you do not need to look for regulatory elements in the DNA, it is easier to work with proteins: they have conserved signatures and belong to protein families, which helps you infer the function of an unknown sequence.

Edited by Felipillo, 23 May 2012 - 10:50 AM.

Chance favors the prepared mind
Louis Pasteur.

#3 lyok


Posted 23 May 2012 - 11:29 AM


Ok, thanks for the links.

Why should I use the E-value instead?
BTW: in my specific situation here, the E-value is the same for the 99% hits (or at least for some of them).
And to be honest, I am not that inclined to check all the "high scoring" genes. It would take me days. I just checked one or two each time.



I am not sure what you mean by: "Also if you do not need to look for regulatory elements in DNA, it's easier to work with proteins, they have signature conservation and protein families, that help you to infer protein function for unknown sequences."

How can I work with a protein if I just have a DNA sequence?
Can I simply translate the DNA sequence into a protein? But how can I do this, given that I can't know in advance which part of the sequence codes for the protein?

#4 Felipillo


Posted 24 May 2012 - 08:54 PM

The E-value tells you how many matches with a score at least this good you would expect to get purely by chance in a database of that size, so you should look at the hits with the lowest values.
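As a rough illustration of where that number comes from, the sketch below uses the textbook relation between bit score and E-value, E = m * n * 2^(-S'), where S' is the bit score, m the query length and n the total database length. Real BLAST additionally uses corrected "effective" lengths, so its reported values will differ somewhat; the function names here are made up for this example. The second function converts an expected count into the probability of seeing at least one chance hit, P = 1 - e^(-E), which is nearly the same number as E when E is small.

```python
import math


def expected_hits(bit_score, query_len, db_len):
    """Expected number of chance hits scoring at least bit_score.

    Simplified: E = m * n * 2**(-S'), ignoring BLAST's effective-length
    corrections.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


def prob_at_least_one(e_value):
    """Probability of one or more chance hits, P = 1 - exp(-E)."""
    return -math.expm1(-e_value)


if __name__ == "__main__":
    # Hypothetical numbers: a 500 bp query against a 1 Gbp database.
    for score in (30, 50, 100):
        e = expected_hits(score, 500, 1_000_000_000)
        print(score, e, prob_at_least_one(e))
```

This also shows why several hits can share the same E-value: hits with the same bit score against the same database get the same value, and very strong hits are all reported as (practically) zero.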

"Also if you do not need to look for regulatory elements in DNA, it's easier to work with proteins, they have signature conservation and protein families, that help you to infer protein function for unknown sequences.".


In other words, for gene prediction you want to predict all the binding sites for regulatory elements, such as transcription factors, activators, repressors, etc. Sometimes this is a difficult task. Proteins, however, are built from a limited number of blocks (signatures, motifs), so it is relatively easy to look for known blocks in databases built for that purpose.

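If you want BLAST results for proteins, you should use blastx against Swiss-Prot. blastx translates your DNA query in all six reading frames and compares the translations with the protein database, so you do not need to know in advance which part of the sequence is coding.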
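To make the idea of translating DNA without knowing the coding part more concrete, here is a minimal sketch of a six-frame translation in Python: three frames on the given strand and three on the reverse complement. It is only an illustration of the principle, not what blastx runs internally. It assumes the standard genetic code, writes stop codons as "*" and unknown codons as "X", and the function names are made up for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the first base; '*' marks a stop."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translation(seq):
    """Return a dict mapping frame (+1..+3, -1..-3) to its translation."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(seq[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in sorted(six_frame_translation("ATGGCCTAA").items()):
        print(frame, protein)
```

A frame that contains a long stretch without "*" is a candidate open reading frame; frames that are interrupted by stops every few codons are usually not coding.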

This online training material could help you a lot to grasp the BLAST basics: http://bioinfbook.or...pter4/index.php (from Bioinformatics and Functional Genomics, one of the best bioinformatics books I have read).

@FelipeRiveroll

Edited by Felipillo, 24 May 2012 - 09:01 PM.


#5 lyok


Posted 01 June 2012 - 09:01 AM

OK, thanks.
But in the end I am just looking for genes, not so much the proteins themselves.

So you would recommend that book? Is it more about statistics and bioinformatics in general, or also about the practical use of databases?

And what is that last link? It takes me to a Twitter page.



