BAR-PLUS: the Bologna Annotation Resource Plus for functional and structural annotation of protein sequences
Open Access
- 26 May 2011
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 39 (suppl), W197-W202
- https://doi.org/10.1093/nar/gkr292
Abstract
We introduce BAR-PLUS (BAR+), a web server for functional and structural annotation of protein sequences. BAR+ is based on a large-scale genome cross comparison and a non-hierarchical clustering procedure characterized by a metric that ensures a reliable transfer of features within clusters. In this version, the method takes advantage of a large-scale pairwise sequence comparison of 13 495 736 protein chains, including those of 988 complete proteomes. Available sequence annotation is derived from UniProtKB, GO, Pfam and PDB. When PDB templates are present within a cluster (with or without their SCOP classification), profile Hidden Markov Models (HMMs) are computed on the basis of sequence-to-structure alignment and are associated with the cluster (Cluster-HMM). The resulting library of 10 858 HMMs is made available for aligning even distantly related sequences for structural modelling. The server also provides pairwise query sequence–structural target alignments computed from the corresponding Cluster-HMM. BAR+ in its present version allows three main categories of annotation: PDB [with or without SCOP (*)] together with GO and/or Pfam; PDB (*) without GO and/or Pfam; and GO and/or Pfam without PDB (*). A cluster may also carry no annotation. Each category can further comprise clusters where GO and Pfam functional annotations are or are not statistically significant. BAR+ is available at http://bar.biocomp.unibo.it/bar2.0.
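The annotation categories listed in the abstract can be illustrated with a minimal sketch. The `Cluster` record and the `annotation_category` function are hypothetical names introduced only for this example and are not part of the BAR+ server or its interface. The sketch assumes that a cluster can be summarized by three flags (PDB template present, GO terms present, Pfam domains present). It ignores the SCOP distinction and the statistical significance of the GO and Pfam annotations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical summary of the annotation available for one cluster."""
    has_pdb: bool   # at least one PDB template in the cluster (SCOP optional)
    has_go: bool    # at least one GO term in the cluster
    has_pfam: bool  # at least one Pfam domain in the cluster


def annotation_category(cluster: Cluster) -> str:
    """Map a cluster to one of the categories named in the abstract."""
    has_function = cluster.has_go or cluster.has_pfam
    if cluster.has_pdb and has_function:
        return "PDB with GO and/or Pfam"
    if cluster.has_pdb:
        return "PDB without GO and/or Pfam"
    if has_function:
        return "GO and/or Pfam without PDB"
    return "no annotation"


if __name__ == "__main__":
    example = Cluster(has_pdb=True, has_go=False, has_pfam=True)
    print(annotation_category(example))
```

Treating "GO and/or Pfam" as a single boolean mirrors the wording of the abstract. A fuller model would also record whether each functional annotation is statistically significant within the cluster.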