Monophyletic clustering and characterization of protein families
2007
Zhang Jian | Zhao Zhiyuan | Evershed Jennifer | Li Guoying
A protein family contains sequences that are evolutionarily related, and this relatedness is generally reflected in sequence similarity. There have been many attempts to organize the set of protein families into evolutionarily homogeneous clusters using various clustering methods. How do we characterize these clusters? How can we cluster protein families using these characterizations? In this work, these questions are addressed using a concept called group-wide co-evolution, which is illustrated with real and simulated protein family data. The results show that the trend of a group of monophyletic proteins might be characterized by a normal distribution, while the strength and variability of this trend can be described by the sample mean and variance of the observed correlation coefficients after a suitable transformation. To exploit this property, we have developed a monophyletic clustering method called monophyletic k-medoids clustering. A software package written in R has been made available at http://www.kent.ac.uk/ims/personal/jz .
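The summary statistics described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' R package: the abstract does not name the "suitable transformation", so the Fisher z-transform is assumed here because it is the usual way to bring correlation coefficients closer to a normal distribution. The function names and the example correlation values are hypothetical.

```python
import math
import statistics


def fisher_z(r):
    """Fisher z-transform of a correlation coefficient r in (-1, 1).

    Assumption: the abstract only says "a suitable transformation";
    Fisher's z = atanh(r) is used here as a common choice.
    """
    return math.atanh(r)


def summarize_correlations(rs):
    """Return (sample mean, sample variance) of the transformed correlations.

    The mean stands in for the strength of the group-wide trend and the
    variance for its variability, following the abstract's description.
    """
    zs = [fisher_z(r) for r in rs]
    return statistics.mean(zs), statistics.variance(zs)


# Hypothetical pairwise correlation coefficients for one candidate cluster.
example_rs = [0.82, 0.75, 0.90, 0.68, 0.79]
strength, variability = summarize_correlations(example_rs)
print(f"mean z = {strength:.3f}, variance of z = {variability:.3f}")
```

Under this assumption, a cluster whose transformed correlations have a high mean and a low variance would show a strong, consistent trend across its members.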
This bibliographic record has been provided by Directory of Open Access Journals