Homology-based inference sets the bar high for protein function prediction
Open Access
- 28 February 2013
- journal article
- Published by Springer Science and Business Media LLC in BMC Bioinformatics
- Vol. 14 (S3), S7
- https://doi.org/10.1186/1471-2105-14-s3-s7
Abstract
Any method that de novo predicts protein function should do better than random. More challenging, it also ought to outperform simple homology-based inference. Here, we describe a few methods that predict protein function exclusively through homology. Together, they set the bar or lower limit for future improvements. During the development of these methods, we faced two surprises. Firstly, our most successful implementation for the baseline ranked very high at CAFA1. In fact, our best combination of homology-based methods fared only slightly worse than the top-of-the-line prediction method from the Jones group. Secondly, although the concept of homology-based inference is simple, this work revealed that the precise details of the implementation are crucial: not only did the methods span from top to bottom performers at CAFA, but also the reasons for these differences were unexpected. In this work, we also propose a new rigorous measure to compare predicted and experimental annotations. It puts more emphasis on the details of protein function than the other measures employed by CAFA and may best reflect the expectations of users. Clearly, the definition of proper goals remains one major objective for CAFA.
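To make the idea of homology-based inference concrete, the following is a minimal sketch of one generic way to transfer annotations from sequence-similarity hits to a query protein. It is not the authors' implementation: the function name `transfer_annotations`, the scoring rule (highest percent identity among hits carrying a term), and the accession placeholders `P1` and `P2` are all assumptions made for illustration. The hits are assumed to come from an alignment search such as BLAST that has already been run and parsed.

```python
from collections import defaultdict


def transfer_annotations(hits, annotations):
    """Score GO terms for a query by transferring them from its hits.

    hits: list of (hit_id, percent_identity) pairs from a prior search.
    annotations: dict mapping hit_id to a set of GO term identifiers.

    Assumed scoring rule (illustrative only): each term gets the highest
    identity, rescaled to 0..1, among the hits annotated with it.
    """
    scores = defaultdict(float)
    for hit_id, pct_identity in hits:
        for term in annotations.get(hit_id, ()):
            scores[term] = max(scores[term], pct_identity / 100.0)
    # Highest score first; ties broken by term identifier for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical hits and annotations; P1 and P2 are placeholder accessions.
hits = [("P1", 90.0), ("P2", 45.0)]
annotations = {
    "P1": {"GO:0003824"},
    "P2": {"GO:0003824", "GO:0005515"},
}
predictions = transfer_annotations(hits, annotations)
```

Even in a sketch this small, several choices are open: which similarity statistic to use, how to combine evidence from multiple hits, and whether to propagate terms to their parents in the ontology. The abstract's point that implementation details matter applies to exactly these kinds of decisions.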