BLAST analysis - (Apr/15/2007 )

Dear all,

Good day to one and all. Suppose my blastn and blastx results show more than one identity value for a single gene, e.g.:

giXXXXXXrefXXXX: Gene A

Score: XXX E: XXXX
Identity: 80/100 Positives: XXXX....

Query: ATCGGCATCAGACTACGCATAC.....
Sbjct: ATCGGCATCAGACTACGCATAC.....


Score: XXXX E: XXXX
Identity: 20/30 Positives: XXXX....

Query: TACGACTACAGCAT....
Sbjct: TACGACTACAGCAT....

Does this mean that there are gaps in the alignment or in the gene? Should I add the identities up when explaining homology? E.g.
80/100 + 20/30 = 100/130?

Hope you guys understand?
Thanks...

-chris_sylim02-

Good question.

QUOTE (chris_sylim02 @ Apr 15 2007, 08:16 PM)
Does this mean that there are gaps in the alignment or in the gene? Should I add the identities up when explaining homology? E.g.
80/100 + 20/30 = 100/130?


You are partially right. You could say that there are gaps in the alignment, in the sense that you are thinking of the whole gene sequence that you are trying to BLAST. Strictly speaking, however, that is wrong, because you are doing a local alignment rather than a global alignment. BLAST (Basic Local Alignment Search Tool) tries to find pieces of the sequences that match each other. If the program finds that extending the alignment across the unmatched region into the next matching region loses more score than it gains, it calls the extension off and gives you the first matching segment on its own. In this sense, it is doing its job of finding the best local alignments, and there is no "gap" between the aligned local pieces.
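
To make the "losing more score than gaining" idea concrete, here is a minimal Python sketch of an ungapped extension with a drop-off threshold. It is only an illustration, not BLAST's actual implementation: the function name extend_ungapped and the values match=1, mismatch=-3 and x_drop=5 are assumptions chosen for the example.

```python
def extend_ungapped(query, subject, match=1, mismatch=-3, x_drop=5):
    """Extend an ungapped alignment to the right from position 0.

    Simplified illustration only (scores and threshold are assumed values).
    The extension is abandoned once the running score falls more than
    x_drop below the best score seen so far.
    Returns (best_length, best_score).
    """
    best_score = 0
    best_len = 0
    score = 0
    for i, (q, s) in enumerate(zip(query, subject), start=1):
        score += match if q == s else mismatch
        if score > best_score:
            # New best: remember how far the alignment reaches.
            best_score, best_len = score, i
        elif best_score - score > x_drop:
            # Losing more than we could reasonably gain back: call it off.
            break
    return best_len, best_score


# Two matching pieces separated by a stretch that does not match at all.
query = "ATCGGCATCA" + "GGGGGGGG" + "TACGAC"
subject = "ATCGGCATCA" + "TTTTTTTT" + "TACGAC"
print(extend_ungapped(query, subject))
```

With these assumed scores the extension stops inside the mismatching stretch and reports only the first 10 bases; the second matching piece would have to be found and reported as a separate local alignment.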

In BLAST, a gap is a position where one sequence has no counterpart in the other and is padded with dashes, for example:
AAATTGCGA----TGAC
AAATTGCGAACGTTGAC
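
As a small sketch of how identities and gaps are counted within one alignment, the Python below walks the two aligned rows column by column. The helper name alignment_stats is made up for this example; it assumes both rows have the same length and that "-" marks a gap.

```python
def alignment_stats(query_row, subject_row):
    """Count identities and gap columns in one pairwise alignment.

    Both rows must be the same length; '-' marks a gap.
    Returns (identities, gaps, alignment_length).
    """
    if len(query_row) != len(subject_row):
        raise ValueError("aligned rows must have the same length")
    identities = 0
    gaps = 0
    for q, s in zip(query_row, subject_row):
        if q == "-" or s == "-":
            gaps += 1          # one sequence has nothing at this column
        elif q == s:
            identities += 1    # same base in both sequences
    return identities, gaps, len(query_row)


print(alignment_stats("AAATTGCGA----TGAC", "AAATTGCGAACGTTGAC"))
```

For the example above this gives 13 identities and 4 gap columns over an alignment length of 17.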


In your case, each alignment should come with positions telling you that it runs from ## to ## of the query (for example, the 100 positions of which 80 matched). From those you can work out which pieces are being left out, and you will see that the left-out pieces are not "gaps"; in most cases, they simply don't match.
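
If it helps, working out the left-out pieces from those positions can be sketched in Python as follows. The function name uncovered_regions is hypothetical, and the coordinates in the usage line are invented purely for illustration; they are not taken from the results in the question.

```python
def uncovered_regions(query_length, hsp_ranges):
    """List the query stretches not covered by any aligned segment.

    hsp_ranges: (start, end) pairs in 1-based, inclusive query coordinates.
    Returns a list of (start, end) pairs for the left-out pieces.
    """
    uncovered = []
    next_pos = 1  # first query position not yet covered
    for start, end in sorted(hsp_ranges):
        if start > next_pos:
            uncovered.append((next_pos, start - 1))
        next_pos = max(next_pos, end + 1)
    if next_pos <= query_length:
        uncovered.append((next_pos, query_length))
    return uncovered


# Hypothetical example: a 200-base query with two aligned segments.
print(uncovered_regions(200, [(1, 100), (151, 180)]))
```

Here the pieces 101 to 150 and 181 to 200 are simply not part of any local alignment.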

From a biological perspective, the unmatched stretch might be sequence coding for a loop region between an alpha helix and a beta sheet.

Hope this helps.

-cyberpostdoc-

Could it be that the sequence (TACGACTACAGCAT....) is a repetitive sequence in the genome?

-rogersun-