
Workflow of constructing a phylogenetic tree - Sequence alignment (Mar/13/2006)

A very simple question, but I do not know what to do with my sequences.
I have amplified the hrp gene in my bacterial samples and sent the products for sequencing. My question is: what do I do next? And how do I interpret the results?

What I have done is align the sequences with ClustalW in the MEGA software, then create a phylogenetic tree from the aligned sequences. Should I have removed the gaps first? Is there a step-by-step process I should use to analyse my sequence data?

Any help would be appreciated.


Why have you created a phylogenetic tree?

I hope you mean that you used the Clustal dendrogram, and even then, why didn't you use MUSCLE? ClustalW can be a bit flaky.

If you created an alignment, why would you want to remove the gaps? They're a crucial part of the alignment! Indeed, where gaps are opened and extended is a major factor in the result, which is why you can change the weights of the gap opening and gap extension penalties.
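To make the opening and extension penalties concrete, here is a minimal sketch of affine gap scoring. It assumes the common convention where a gap of length n costs open + extend * (n - 1); aligners differ in the exact convention, and the default values and function names below are illustrative only, not the defaults or API of ClustalW, MUSCLE, or MEGA.

```python
def affine_gap_cost(gap_length, gap_open=10.0, gap_extend=0.5):
    """Cost of a single gap of `gap_length` residues under an affine scheme.

    Assumed convention: the first gapped position pays `gap_open`,
    every further position pays `gap_extend`.
    """
    if gap_length <= 0:
        return 0.0
    return gap_open + gap_extend * (gap_length - 1)


def total_gap_cost(aligned_seq, gap_open=10.0, gap_extend=0.5):
    """Sum the affine cost of every run of '-' in one aligned sequence."""
    total = 0.0
    run = 0  # length of the gap run currently being read
    for ch in aligned_seq:
        if ch == "-":
            run += 1
        else:
            total += affine_gap_cost(run, gap_open, gap_extend)
            run = 0
    # Account for a gap that runs to the end of the sequence.
    total += affine_gap_cost(run, gap_open, gap_extend)
    return total


if __name__ == "__main__":
    # One long gap is much cheaper than several short ones of the same total length.
    print(total_gap_cost("A----C"))    # one gap of length 4
    print(total_gap_cost("A-C-G-T-"))  # four gaps of length 1
```

With a high opening penalty and a low extension penalty, the aligner prefers a few long gaps over many scattered ones; raising the extension penalty pushes it towards shorter gaps.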

When you say analyse sequence data, what do you mean? What do you hope to achieve? Throw me a frickin' bone here.


Yes, you should remove the gaps, as they bias the results. And there is nothing wrong with ClustalW. The following article is a good introduction to phylogenetic tree construction:

Trends Genet. 2003 Jun;19(6):345-51.

Phylogeny for the faint of heart: a tutorial.

Baldauf SL.

Department of Biology, University of York, Box 373, York, UK, YO10 5YW.

Phylogenetic trees seem to be finding ever broader applications, and researchers from very different backgrounds are becoming interested in what they might have to say. This tutorial aims to introduce the basics of building and interpreting phylogenetic trees. It is intended for those wanting to understand better what they are looking at when they look at someone else's trees or to begin learning how to build their own. Topics covered include: how to read a tree, assembling a dataset, multiple sequence alignment (how it works and when it does not), phylogenetic methods, bootstrap analysis and long-branch artefacts, and software and resources.

PMID: 12801728 [PubMed - indexed for MEDLINE]
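For anyone wanting to try the gap removal suggested in this reply, here is a minimal sketch that drops every alignment column containing a gap in any sequence. The function name and the dict-based input are hypothetical choices for illustration; it assumes the alignment is already loaded as equal-length strings with '-' as the gap character.

```python
def strip_gap_columns(alignment):
    """Return a copy of `alignment` without any column that contains '-'.

    `alignment` maps sequence names to aligned sequences of equal length.
    """
    seqs = list(alignment.values())
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    # Keep only the column indices where no sequence has a gap.
    keep = [i for i in range(length) if all(s[i] != "-" for s in seqs)]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


if __name__ == "__main__":
    aln = {
        "strain1": "ATG-CGTA",
        "strain2": "ATGACGTA",
        "strain3": "AT--CGTA",
    }
    for name, seq in strip_gap_columns(aln).items():
        print(name, seq)
```

Note that this discards every site where even one sequence has a gap, so a single poorly aligned sequence can remove a lot of data; it is worth checking how many columns survive before building the tree.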


No, there is nothing wrong with ClustalW, but MUSCLE is better; have a look at the literature.


Are we talking about these programs in terms of alignment, tree reconstruction algorithms, or both? I wouldn't use ClustalW/ClustalX for trees where accuracy is required.