
using clustalw to align/compare - (Oct/14/2004 )

hello everybody!

this is my first time using clustalw to align and compare more than 2 protein sequences for similarity, using:
http://www.ebi.ac.uk/clustalw/


but how do i add the different sequences before running them? i have read the Help but it doesn't really explain this. hope you can guide me.


thanks for helping!

-justwonder-

First of all, you need to input your sequences in an appropriate format (FASTA works). The Sequence Retrieval System (reached through the Swiss-Prot website - http://bo.expasy.org/sprot ) is probably the best way to get these (I think, anyway). Use it to search the Swiss-Prot and TrEMBL databases, but remember to set the output field to "Sequence". Each result should look something like this, a ">" title line followed by the sequence:

>14KD_RHOSH
MFSFIDDIPSFEQIKARVRDDLRKHGWEKRWNDSRLVQKSRELLNDEELKIDPATWIWKRMPSREEVAARRQRDFETVWK
YRYRLGGFASGALLALALAGIFSTGNFGGSSDAGNRPSVVYPIE

Copy and paste the whole lot (the ">" and title included) into the big white box on ClustalW. Do this for each sequence you wish to align, pasting them all into the same box, one after another. Start each sequence on a new line, though; the ">" marker is what indicates the start of a new sequence.
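
If you have a lot of sequences, you could also build the text for the box with a few lines of script rather than pasting by hand. This is only a rough sketch in Python: the `to_fasta` helper, the 80-character line width, and the second record (`PLACEHOLDER_2`, with a made-up sequence) are my own assumptions for illustration, not anything ClustalW requires.

```python
def to_fasta(records, width=80):
    """Join (title, sequence) pairs into one multi-sequence FASTA string.

    `records` is a list of (title, sequence) tuples. `width` is the number
    of residues per line (an assumed value; it is only cosmetic).
    """
    blocks = []
    for title, seq in records:
        # Drop spaces and line breaks picked up while copying the sequence.
        seq = "".join(seq.split())
        # Wrap the sequence into fixed-width lines.
        lines = [seq[i:i + width] for i in range(0, len(seq), width)]
        # Each sequence starts on its own line with a ">" title line.
        blocks.append(">" + title + "\n" + "\n".join(lines))
    return "\n".join(blocks) + "\n"


records = [
    # The example entry from this post.
    ("14KD_RHOSH",
     "MFSFIDDIPSFEQIKARVRDDLRKHGWEKRWNDSRLVQKSRELLNDEELKIDPATWIWKRMPSREEVAARRQRDFETVWK"
     "YRYRLGGFASGALLALALAGIFSTGNFGGSSDAGNRPSVVYPIE"),
    # Hypothetical second entry: made-up title and sequence, replace with your own.
    ("PLACEHOLDER_2", "MFSFVDDIPSFEQ IKARVRDDLRK"),
]

fasta_text = to_fasta(records)
print(fasta_text)  # paste this output into the ClustalW input box
```

Whatever it prints is what goes into the box: one ">" title line per sequence, with the sequence on the lines underneath.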

Once you're done, just click "Run" and you should get an alignment within a few minutes (might take longer, depending on time of day). You can experiment with the parameters to get the right sort of alignment.

Let me know how you get on.

Greg

-gmcg-

thanks greg!

that helps!


but do you know what the difference is between a conserved and a semi-conserved substitution in an alignment? and when interpreting the result, what should i focus on? only the identical columns, or also the conserved and semi-conserved substitutions?

p.s. this is my first time doing an alignment


many thanks!

-justwonder-