phosphorylated protein sequences with <20% homology - (May/26/2006 )
Does anyone know where I can obtain a dataset of phosphorylated protein sequences that have <20% sequence similarity?
Any suggestions would be very much appreciated.
I looked at the Phospho.ELM database a couple of days ago, but I won't be able to get the dataset until mid-June. The first website you gave deals with prokaryotes, whereas my research focuses on eukaryotic phosphorylated proteins.
Do you have any other ideas on where I can get a good dataset of proteins with less than 20% homology? I've looked at other databases such as dbPTM and KinasePhos, but unfortunately had no luck. Anyway, thank you for your suggestions. They are very much appreciated.
Hi. If you can obtain a list of all these proteins, you could use blastclust to cluster your sequences (you can get sequences and structures from ASTRAL) and then select one protein from each cluster, giving you a set with at most 25% identity.
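To make the "select one protein from each cluster" step concrete, here is a minimal Python sketch. It assumes blastclust has already been run and that its output lists one cluster per line as whitespace-separated sequence IDs. The file names and the command in the comment are only illustrative assumptions, and the helper function names are hypothetical, so check the options against the help text of your own blastclust version.

```python
# Assumed (illustrative) clustering command for the legacy NCBI toolkit:
#   blastclust -i phospho.fasta -o clusters.txt -p T -S 25 -L 0.9 -b T
# -S 25 asks for clustering at 25% identity; the file names are made up.
from typing import Dict, Iterable, List


def parse_clusters(text: str) -> List[List[str]]:
    """Treat each non-empty line as one cluster of whitespace-separated IDs."""
    return [line.split() for line in text.splitlines() if line.strip()]


def pick_representatives(clusters: List[List[str]]) -> List[str]:
    """Take the first ID listed in each cluster as its representative."""
    return [cluster[0] for cluster in clusters]


def filter_fasta(fasta_text: str, keep_ids: Iterable[str]) -> Dict[str, str]:
    """Return {id: sequence} for FASTA records whose ID is in keep_ids.

    The ID is assumed to be the first token after '>' on the header line.
    """
    keep = set(keep_ids)
    selected: Dict[str, str] = {}
    current = None  # ID of the record being collected, or None to skip it
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            current = seq_id if seq_id in keep else None
            if current is not None:
                selected[current] = ""
        elif current is not None:
            selected[current] += line
    return selected


if __name__ == "__main__":
    with open("clusters.txt") as handle:
        reps = pick_representatives(parse_clusters(handle.read()))
    with open("phospho.fasta") as handle:
        nonredundant = filter_fasta(handle.read(), reps)
    for seq_id, seq in nonredundant.items():
        print(f">{seq_id}\n{seq}")
```

Picking the first ID in each line is just one simple choice. You may prefer a different representative, for example the entry with the best annotated phosphorylation sites.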
Where can I download blastclust from? Is it available through the web? I'm only familiar with blast and have not come across blastclust. I checked on NCBI's website and couldn't find it (only found the regular blast site). Please could you tell me how to go about it?
Thank you very much for your help.
If you download the NCBI toolset (which includes blast and blastpgp), you will also get blastclust.