BAR-PLUS: the Bologna Annotation Resource Plus for functional and structural annotation of protein sequences

Open Access

26 May 2011

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 39 (suppl), W197-W202
https://doi.org/10.1093/nar/gkr292

Abstract

We introduce BAR-PLUS (BAR⁺), a web server for functional and structural annotation of protein sequences. BAR⁺ is based on a large-scale genome cross-comparison and a non-hierarchical clustering procedure characterized by a metric that ensures a reliable transfer of features within clusters. In this version, the method takes advantage of a large-scale pairwise sequence comparison of 13 495 736 protein chains, including 988 complete proteomes. Available sequence annotation is derived from UniProtKB, GO, Pfam and PDB. When PDB templates are present within a cluster (with or without their SCOP classification), profile Hidden Markov Models (HMMs) are computed on the basis of sequence-to-structure alignment and are associated with the cluster (Cluster-HMM). The resulting library of 10 858 HMMs is made available for aligning even distantly related sequences for structural modelling. The server also provides pairwise query sequence–structural target alignments computed from the corresponding Cluster-HMM. In its present version, BAR⁺ allows three main categories of annotation: PDB [with or without SCOP (*)] and GO and/or Pfam; PDB (*) without GO and/or Pfam; and GO and/or Pfam without PDB (*); clusters may also carry no annotation. Each category can further comprise clusters where GO and Pfam functional annotations are or are not statistically significant. BAR⁺ is available at http://bar.biocomp.unibo.it/bar2.0.
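The annotation categories listed in the abstract can be read as a simple decision on which annotation sources are present in a cluster. The following is a minimal sketch of that reading, not the server's actual implementation: the `Cluster` record, its field names and the `annotation_category` function are hypothetical and introduced here only for illustration, and the statistical significance test on GO and Pfam terms is not modelled.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Cluster:
    """Hypothetical record of the annotation sources found in one cluster."""
    pdb_templates: Set[str] = field(default_factory=set)
    go_terms: Set[str] = field(default_factory=set)
    pfam_domains: Set[str] = field(default_factory=set)


def annotation_category(cluster: Cluster) -> str:
    """Assign a cluster to one of the categories named in the abstract."""
    # Structural annotation: at least one PDB template (SCOP is optional).
    has_structure = bool(cluster.pdb_templates)
    # Functional annotation: GO and/or Pfam.
    has_function = bool(cluster.go_terms or cluster.pfam_domains)

    if has_structure and has_function:
        return "PDB with GO and/or Pfam"
    if has_structure:
        return "PDB without GO and/or Pfam"
    if has_function:
        return "GO and/or Pfam without PDB"
    return "no annotation"
```

For example, a cluster holding one PDB template and one Pfam domain falls into the first category, while an empty cluster is reported as having no annotation.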
