Query: Multiple Sequence Alignment - (Jul/09/2008)

Hi

I am trying to align a set of protein sequences in order to find a consensus sequence. I used two different programs available on the internet, i.e. STRAP (http://www.charite.de/bioinf/strap/) and M-Coffee (http://tcoffee.vital-it.ch/cgi-bin/Tcoffee/tcoffee_cgi/index.cgi). Each of the two gives me a different alignment, and both look fine to me. Is there any way to determine which of the two alignments is the more correct one? If not, is there any way I can reconcile the two alignments?

Thanks
dsain

-dsain-

QUOTE (dsain @ Jul 9 2008, 09:37 PM)
Is there any way to determine which of the two alignments is the most correct one? If not then is there any way I can reconcile the two alignments?

I have no real answer, but:

1. Check both programs' documentation for an explanation.

2. Eyeball the results and see which looks better.

3. Do a Google search for both programs; whichever turns up more articles, in higher-impact journals, that used the program is probably better (unless one is brand new and the other is from the previous century).

4. Find a third program, and use the result that is shared by at least two of the three programs.

And let us know if you do find an answer by talking with your institute's bioinformaticians.
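
Suggestion 4 can be made a little more concrete. The Python sketch below is one possible way to measure how much two alignments of the same sequences agree: it collects every pair of residues that an alignment places in the same column and reports the fraction of such pairs that two alignments share. The function names and the three toy alignments are invented for illustration; they are not output from STRAP, M-Coffee or ClustalW.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    `alignment` maps a sequence name to its gapped row ('-' marks a gap);
    all rows must have the same length. A pair looks like
    ((name_a, i), (name_b, j)), with i and j 0-based residue indices.
    """
    names = sorted(alignment)
    seen = dict.fromkeys(names, 0)  # residues consumed so far, per sequence
    pairs = set()
    for col in range(len(alignment[names[0]])):
        present = []
        for name in names:
            if alignment[name][col] != "-":
                present.append((name, seen[name]))
                seen[name] += 1
        pairs.update(combinations(present, 2))
    return pairs


def agreement(aln_a, aln_b):
    """Fraction of aligned residue pairs shared by two alignments (Jaccard index)."""
    a, b = aligned_pairs(aln_a), aligned_pairs(aln_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy alignments of the same three sequences (hypothetical data).
aln_1 = {"s1": "MKV-LA", "s2": "MKVALA", "s3": "M-VALA"}
aln_2 = {"s1": "MKVL-A", "s2": "MKVALA", "s3": "M-VALA"}
aln_3 = {"s1": "MKV-LA", "s2": "MKVALA", "s3": "MV-ALA"}

alignments = {"aln_1": aln_1, "aln_2": aln_2, "aln_3": aln_3}

# Average agreement of each alignment with the other two.
mean_agreement = {
    name: sum(agreement(aln, other)
              for other_name, other in alignments.items() if other_name != name)
    / (len(alignments) - 1)
    for name, aln in alignments.items()
}
best = max(mean_agreement, key=mean_agreement.get)
print(mean_agreement, best)
```

The alignment with the highest average agreement is the one "most shared" by the others. That makes it the most reproducible of the three, which is not necessarily the same thing as the most biologically correct.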

-cellcounter-

QUOTE (dsain @ Jul 9 2008, 10:37 PM)
Is there any way to determine which of the two alignments is the most correct one? If not then is there any way I can reconcile the two alignments?

Check the parameters (gap opening penalty, gap extension penalty and substitution matrix) of both alignment programs, because every program has its own defaults. Also check critically whether the active site or binding site residues are conserved between your protein of interest and the sequences aligned to it.

Try giving both programs the same parameters and then compare the alignments again.
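
When a functional residue is known, the conservation check can be done programmatically. The Python sketch below assumes the alignment is available as a dict of gapped rows; it maps a residue position in the protein of interest to its alignment column and reports how conserved that column is. The names `column_of_residue` and `column_conservation`, and the toy sequences, are hypothetical.

```python
from collections import Counter


def column_of_residue(gapped_row, residue_index):
    """Map a 0-based residue index of the ungapped sequence to its alignment column."""
    seen = -1
    for col, char in enumerate(gapped_row):
        if char != "-":
            seen += 1
            if seen == residue_index:
                return col
    raise IndexError("residue index is beyond the end of the sequence")


def column_conservation(alignment, col):
    """Return (most common residue, fraction of rows carrying it) for one column.

    Rows with a gap in this column count towards the total but never
    as the residue, so gaps lower the conservation.
    """
    column = [row[col] for row in alignment.values()]
    residues = Counter(c for c in column if c != "-")
    if not residues:
        return "-", 0.0
    residue, count = residues.most_common(1)[0]
    return residue, count / len(column)


# Toy alignment (hypothetical data); "query" is the protein of interest.
alignment = {
    "query":    "MK-HDSA",
    "homolog1": "MKAHDTA",
    "homolog2": "M-AHESA",
}

# Suppose residue 2 of the query (the H, counting from 0) were a known site.
site_col = column_of_residue(alignment["query"], 2)
print(site_col, column_conservation(alignment, site_col))
```

Running the same check on the output of each program shows whether they place the known site in an equally conserved column.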

-Biorad-

Each of the programs you've used has undoubtedly shown you its "best" alignment. Its judgment, however, is mathematical; yours is not.

As mentioned by Biorad, any alignment program assigns weights (numerical penalties or rewards) to the decisions it must make (e.g. "Is aligning these two additional consecutive amino acid residues with each other worth introducing a three-space gap in one of the sequences?"), and seeks the alignment with the optimal total score.
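
This kind of trade-off can be shown with a small Python sketch. It scores a pairwise alignment with a toy match/mismatch scheme and affine gap penalties, where a gap of length n costs gap_open + (n - 1) * gap_extend. The numbers are assumptions chosen for illustration: real programs use substitution matrices such as BLOSUM or PAM and their own defaults, and the sequences are made up.

```python
def alignment_score(row_a, row_b, match=4, mismatch=-1,
                    gap_open=-10, gap_extend=-1):
    """Score a pairwise alignment with a toy scheme and affine gap penalties."""
    score = 0
    in_gap_a = in_gap_b = False  # currently inside a run of gaps in row a / row b?
    for a, b in zip(row_a, row_b):
        if a == "-" and b == "-":
            continue  # a column of two gaps carries no information
        if a == "-":
            score += gap_extend if in_gap_a else gap_open
            in_gap_a, in_gap_b = True, False
        elif b == "-":
            score += gap_extend if in_gap_b else gap_open
            in_gap_a, in_gap_b = False, True
        else:
            score += match if a == b else mismatch
            in_gap_a = in_gap_b = False
    return score


# Two candidate alignments of the same two toy sequences.
one_long_gap = ("MKTAYIAKQR",
                "MKT---YIKR")   # 4 matches, 3 mismatches, one gap of length 3
three_short_gaps = ("MKTAYIAKQR",
                    "MKT-YI-K-R")  # 7 matches, three gaps of length 1

for gap_open in (-10, -5):
    long_score = alignment_score(*one_long_gap, gap_open=gap_open)
    short_score = alignment_score(*three_short_gaps, gap_open=gap_open)
    print(gap_open, long_score, short_score)
```

With a gap opening penalty of -10 the single long gap scores higher (1 versus -2); with -5 the three short gaps win (13 versus 6). The same pair of sequences therefore has a different "best" alignment depending on the parameters.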

This is one of the reasons just a small subset of alignment programs get cited in the literature -- it's easier to go with an "accepted" alignment algorithm than to explain why you chose another program's alignment or why you created your own. The thinking is "If Clustal's default alignment is good enough for this paper, and this paper, and this paper, etc., it's good enough for my paper, and good enough to stand without comment".

Ultimately, you must judge which alignment is best, or use them as guidance only and create your own alignment. You should, however, try a third program, as cellcounter suggests. How about ClustalW2?

-HomeBrew-

QUOTE (cellcounter @ Jul 9 2008, 11:17 PM)
2. Eyeball the results, whichever looks better.

3. Do Google search for both programs, whichever brings more and high-impact journal articles having used this program is probably better (unless one is brand new and the other is from the previous century).

4. Find a third program, use the result that is most shared by two of these three programs.


Thanks for your suggestions. I did try a third program, ClustalW, and it gave me an alignment different from the previous two. My PI has been using T-Coffee and M-Coffee for MSA for quite some time now, but he wanted me to try out STRAP this time. Both programs are fairly new. I searched the journal articles and found that STRAP has more of them than M-Coffee. Still, I am inclined to think that the alignment from M-Coffee should be the better one, as it combines a multitude of MSA programs to compute the alignment (ClustalW, MAFFT, DIALIGN etc.), whereas STRAP uses only ClustalW.

I am not sure how to 'eyeball' the alignments. Can you suggest a way of doing that?

Thanks
dsain

-dsain-

QUOTE (Biorad @ Jul 10 2008, 02:39 AM)
Check the parameters (gap extension and gap opening penalty, matrix used) of both the alignment programs. Because every program has its own default parameters. And critically check for the active site or binding site and see whether they are conserved between your protein sequence of interest and the given aligned sequence.


Thanks for your reply. I am not sure it is possible to check the parameters, as M-Coffee uses the output of various other MSA programs to generate an alignment.

Regarding the active sites... unfortunately the active sites are still unknown in the proteins I am aligning.

Please also look at the reply I posted.

dsain

-dsain-

QUOTE (HomeBrew @ Jul 10 2008, 04:34 AM)
Ultimately, you must judge which alignment is best, or use them as guidance only and create your own alignment. You should, however, try a third program, as cellcounter suggests. How about ClustalW2?



Thanks. The thing is that if I were using a well-tried-and-tested program like ClustalW, I wouldn't be bothered about the results the other programs give me. But in my case my PI has specifically asked me to use STRAP and M-Coffee and then come up with the most appropriate alignment.

Can you shed some light on how I can "create" my own alignment from the ones I got? I don't have any knowledge of the active sites etc. of my sequences.

dsain

-dsain-

Hi all

I have a possible approach to determine which of the alignments is better; please tell me if you think it could be correct. I computed the consensus sequence of both alignments using Jalview: the STRAP alignment gives me a 660-identity consensus sequence, whereas the M-Coffee alignment gives me a 661-identity consensus sequence. Does that show that M-Coffee gives the better MSA?

Thanks
dsain

-dsain-

QUOTE (dsain @ Jul 10 2008, 03:22 PM)
using Jalview, the STRAP alignment gives me a 660-identity consensus sequence whereas the M-Coffee alignment gives me a 661-identity consensus sequence. Does that show that M-Coffee gives a better MSA?


That only means that the parameters used by the programs differ slightly (apparently not significantly; 660 and 661 are pretty much the same when it comes to protein sequence alignments), as suggested earlier by Biorad and HomeBrew.
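
To see why a difference of one identity says little, here is a minimal Python sketch. It builds a plurality consensus and counts the columns in which every row carries the same residue. This is an assumed, simplified definition; how Jalview computes its consensus is not reproduced here, and the toy alignments are invented.

```python
from collections import Counter


def consensus_and_identity(rows):
    """Return (consensus string, number of fully identical columns).

    The consensus takes the most common residue of each column, ignoring gaps.
    A column counts as identical only if every row has that same residue.
    """
    consensus = []
    identical = 0
    for column in zip(*rows):
        residues = [c for c in column if c != "-"]
        if not residues:
            consensus.append("-")
            continue
        top, count = Counter(residues).most_common(1)[0]
        consensus.append(top)
        if count == len(column):  # no gaps and no substitutions in this column
            identical += 1
    return "".join(consensus), identical


# The same three toy sequences aligned in two slightly different ways.
rows_a = ["MKV-LA", "MKVALA", "M-VALA"]
rows_b = ["MKVL-A", "MKVALA", "M-VALA"]

print(consensus_and_identity(rows_a))
print(consensus_and_identity(rows_b))
```

Both toy alignments yield the same consensus string, but the identity count changes from 4 to 3 because a single gap sits one column further along. A difference of this size reflects gap placement rather than overall alignment quality.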

-toejam-

QUOTE (toejam @ Jul 10 2008, 03:45 PM)
that only means that the parameters used by the programs have slight differences (apparently not significant, 660 and 661 is pretty much the same when it comes to protein sequence alignments) as suggested earlier by biorad and HB.


Thanks. I wasn't too convinced by this theory either, but didn't know why. So if I can't decide which of the alignments is the better one, I will have to reconcile them. What if I BLAST the alignments from the two programs against each other and then take the resulting consensus as my final consensus sequence? Would that be the best way to reconcile the two alignments?
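
One possible way to reconcile two alignments is to keep only the columns on which they agree and treat the rest as uncertain. The Python sketch below assumes both alignments contain exactly the same sequences under the same names. It is not a BLAST-based procedure and not a feature of STRAP or M-Coffee; the helper names and toy data are hypothetical.

```python
def column_signatures(alignment):
    """Describe each column by the residue index it holds per sequence (None for a gap)."""
    names = sorted(alignment)
    seen = dict.fromkeys(names, 0)  # residues consumed so far, per sequence
    signatures = []
    for col in range(len(alignment[names[0]])):
        signature = []
        for name in names:
            if alignment[name][col] == "-":
                signature.append(None)
            else:
                signature.append(seen[name])
                seen[name] += 1
        signatures.append(tuple(signature))
    return signatures


def shared_columns(aln_a, aln_b):
    """Return the columns of aln_a, in order, that also occur unchanged in aln_b."""
    names = sorted(aln_a)
    in_b = set(column_signatures(aln_b))
    return [
        "".join(aln_a[name][col] for name in names)
        for col, signature in enumerate(column_signatures(aln_a))
        if signature in in_b
    ]


# Toy alignments of the same three sequences (hypothetical data).
aln_a = {"s1": "MKV-LA", "s2": "MKVALA", "s3": "M-VALA"}
aln_b = {"s1": "MKVL-A", "s2": "MKVALA", "s3": "M-VALA"}

core = shared_columns(aln_a, aln_b)
print(core)
```

Here four of the six columns are identical in both alignments. A consensus built from this shared core does not depend on which program is trusted, and the remaining columns are the ones worth inspecting by eye.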

dsain

-dsain-
