Pfam: the protein families database
Open Access
- 27 November 2013
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 42 (D1), D222-D230
- https://doi.org/10.1093/nar/gkt1223
Abstract
Pfam, available via servers in the UK (http://pfam.sanger.ac.uk/) and the USA (http://pfam.janelia.org/), is a widely used database of protein families, containing 14 831 manually curated entries in the current release, version 27.0. Since the last update article 2 years ago, we have generated 1182 new families and maintained sequence coverage of the UniProt Knowledgebase (UniProtKB) at nearly 80%, despite a 50% increase in the size of the underlying sequence database. Since our 2012 article describing Pfam, we have also undertaken a comprehensive review of the features that are provided by Pfam over and above the basic family data. For each feature, we determined the relevance, computational burden, usage statistics and the functionality of the feature in a website context. As a consequence of this review, we have removed some features, enhanced others and developed new ones to meet the changing demands of computational biology. Here, we describe the changes to Pfam content. Notably, we now provide family alignments based on four different representative proteome sequence data sets and a new interactive DNA search interface. We also discuss the mapping between Pfam and known 3D structures.