All that you know about outgroup - Outgroup and tree (Aug/12/2005 )
I am relatively new to the world of tree building and phylogenetics. Recently I was told to construct a tree out of the sequences that I have and to root it using an "outgroup". I tried to find information on outgroups on the net but wasn't satisfied. I would appreciate it if someone could explain all about outgroups to me. My questions are:
1. What is an outgroup?
2. How do you choose an outgroup? I have a set of viral species from one group. How should I decide which of them to use as an outgroup?
3. Why do you need an outgroup?
4. What can you infer when you include an outgroup in the tree?
5. How much does it matter if an outgroup is not included in the tree? Does the tree become unreliable, and if so, why?
6. How does an outgroup make a tree better?
If anyone can point me to a reference, a link or any other info, it will be highly appreciated.
You use an outgroup to root the tree: it gives you a reference point against which the relative distances and the branching order of your sequences can be read. In the case of viral species, you can choose a member of the same group, or a more distantly related virus (that depends on whether you want to show that all of the viruses you are studying are closely related or that they are quite varied).
It sounds to me like you have your chosen virus sequences all in one group - I'd go to another virus in the same family, but not too distant.
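To make the rooting idea concrete, here is a minimal sketch in plain Python (no phylogenetics package; the function name, the edge-list format and the taxon names are all made up for illustration). It takes an unrooted tree and places the root on the branch leading to whichever leaf you nominate as the outgroup:

```python
from collections import defaultdict

def root_on_outgroup(edges, outgroup):
    """Return a Newick string rooted on the branch leading to `outgroup`.

    edges: list of (node_a, node_b) pairs describing an unrooted tree.
    outgroup: name of a leaf, i.e. a node with exactly one neighbour.
    (Hypothetical helper for illustration, not part of any library.)
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    if len(neighbours[outgroup]) != 1:
        raise ValueError("outgroup must be a leaf of the tree")

    def subtree(node, parent):
        # Everything reachable from `node` without going back through `parent`.
        children = sorted(n for n in neighbours[node] if n != parent)
        if not children:
            return node  # a leaf: just its name
        return "(" + ",".join(subtree(c, node) for c in children) + ")"

    (attach,) = neighbours[outgroup]  # the single node the outgroup hangs from
    return "(" + outgroup + "," + subtree(attach, outgroup) + ");"

# Hypothetical unrooted tree: four ingroup viruses A-D plus one outgroup "Out";
# n1-n3 are unnamed internal nodes.
edges = [("A", "n1"), ("B", "n1"), ("n1", "n2"), ("C", "n2"),
         ("n2", "n3"), ("D", "n3"), ("Out", "n3")]

print(root_on_outgroup(edges, "Out"))  # (Out,(D,(C,(A,B))));
print(root_on_outgroup(edges, "A"))    # (A,(B,(C,(D,Out))));
```

The branching pattern is identical in both outputs; only the position of the root differs, and with it the apparent order in which the lineages split off. That is why the choice of outgroup matters for how the tree is read.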
The biggest problem I have come across with viral phylogenies is that all* of the programs and algorithms for constructing a tree are based on vertical evolution (i.e. that differences are carried down through the generations in small steps), whereas the big differences in viral sequences are caused by horizontal evolution (literally genes jumping from one "species" to another). The result is that a phylogeny based on the entire genome is unlikely to match the one you get if you analyse each ORF on its own. This can cause all manner of headaches.
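One way to put a number on "the ORFs disagree" is to count the groupings (splits) that appear in one tree but not the other, which is the idea behind the Robinson-Foulds distance. The following is only a rough sketch in plain Python: trees are written as nested tuples, and the virus names V1 to V5 and both ORF trees are invented for the example.

```python
def leaves(tree):
    """Set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def splits(tree):
    """Non-trivial bipartitions of the taxa implied by the tree's internal branches."""
    all_leaves = leaves(tree)
    anchor = min(all_leaves)  # always record the side that does NOT hold this taxon
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaves(node)
        if anchor in side:
            side = all_leaves - side
        # Skip splits that only cut off a single taxon (or nothing at all).
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return found

def rf_distance(tree_a, tree_b):
    """Number of splits present in exactly one of the two trees."""
    if leaves(tree_a) != leaves(tree_b):
        raise ValueError("trees must contain the same taxa")
    return len(splits(tree_a) ^ splits(tree_b))

# Hypothetical trees built from two different ORFs of the same five viruses.
orf1_tree = ((("V1", "V2"), "V3"), ("V4", "V5"))
orf2_tree = ((("V1", "V3"), "V2"), ("V4", "V5"))

print(rf_distance(orf1_tree, orf1_tree))  # 0: identical topologies
print(rf_distance(orf1_tree, orf2_tree))  # 2: V2 and V3 have swapped places
```

Because the comparison works on splits rather than on the root, it does not matter where (or whether) either tree is rooted. A distance well above zero between ORF trees is the kind of conflict you would expect if genes have moved horizontally.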
Let me sum up my experience: I used viral phylogenies to attempt to classify a new virus into a known group. I did NOT use an outgroup, since I had no idea where my virus might fall. Because I did not use an outgroup, I could view the tree as "unrooted", which makes classification easy since you can draw nice big circles over the groups. My problem was that the groupings did not agree with the classification given by the ICTV, so I constructed trees for each ORF (protein sequence) using every algorithm I could lay my hands on. Eventually, I found a few trees that were consistent with the ICTV classification and used these to place my virus into a group (which they have recently gone on to change anyway!!).
In another case, I was looking at variants of a virus that was well studied, so I could easily pick an outgroup from amongst the closely related viruses, and in this case the phylogenetics programs were "better" at the analysis since the differences were small.
In either case, make sure that your tree is robust by doing bootstrap analysis. A low bootstrap score on a branch is indicative of a poorly supported grouping, and lots of low scores mean a not-so-good tree. All of the terminology and explanations of the different algorithms should be available in any good phylogenetics textbook (dontchya just *love* the library?)
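For anyone wondering what the bootstrap actually does: it resamples the columns of your alignment with replacement, repeats the analysis on each resampled alignment, and reports how often a given grouping comes back. The sketch below shows only that resampling idea in plain Python. As a loud simplification, it does not rebuild a full tree per replicate the way a real phylogenetics program does; it just checks whether two sequences remain each other's nearest neighbour by simple p-distance. The function names and the toy alignment are made up.

```python
import random

def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    diffs = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    return diffs / len(seq_a)

def resample_columns(alignment, rng):
    """One bootstrap replicate: draw alignment columns with replacement."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

def pair_support(alignment, a, b, replicates=200, seed=0):
    """Fraction of replicates in which a and b are each other's nearest neighbour.

    Stand-in for clade support; a real bootstrap would infer a whole tree
    from every replicate and count how often each clade recurs.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    hits = 0
    for _ in range(replicates):
        rep = resample_columns(alignment, rng)

        def nearest(name):
            others = [n for n in rep if n != name]
            return min(others, key=lambda n: p_distance(rep[name], rep[n]))

        if nearest(a) == b and nearest(b) == a:
            hits += 1
    return hits / replicates

# Toy alignment (hypothetical sequences, all the same length).
alignment = {
    "virusA": "ACGTACGTAC",
    "virusB": "ACGTACGTAC",
    "virusC": "TTTTTTTTTT",
    "virusD": "GGGGGGGGGG",
}
print(pair_support(alignment, "virusA", "virusB"))
```

A value near 1 means the pairing survives almost any reshuffling of the sites; a value near 0.5 or below means a handful of columns are doing all the work and the grouping should not be trusted much.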
* If I am wrong, and there is a program that takes horizontal evolution into account more than vertical evolution, then please let me know. It's been a few years since I had to do this work.