Inter-Residue Distance Prediction From Duet Deep Learning Models
Open Access
- 16 May 2022
- journal article
- research article
- Published by Frontiers Media SA in Frontiers in Genetics
- Vol. 13, 887491
- https://doi.org/10.3389/fgene.2022.887491
Abstract
Residue distance prediction from sequence is critical for many biological applications, such as protein structure reconstruction, protein–protein interaction prediction, and protein design. However, predicting fine-grained distances between residues with long sequence separations remains challenging. In this study, we propose DuetDis, a method for protein inter-residue distance prediction based on duet feature sets and a deep residual network with squeeze-and-excitation (SE). DuetDis learns and fuses features extracted directly or indirectly from whole-genome/metagenomic databases and thereby minimizes information loss by ensembling models trained on different feature sets. We evaluate DuetDis and 11 widely used peer methods on a large-scale test set (610 protein chains). The experimental results suggest that 1) prediction results from different feature sets show obvious differences; 2) ensembling different feature sets can improve the prediction performance; 3) high-quality multiple sequence alignment (MSA) used for both training and testing can greatly improve the prediction performance; and 4) DuetDis is more accurate than peer methods for the overall prediction, more reliable in terms of model prediction score, and more robust against shallow MSAs.
This publication has 73 references indexed in Scilit.