Bayesian optimization with evolutionary and structure-based regularization for directed protein evolution
Open Access
- 1 July 2021
- journal article
- research article
- Published by Springer Science and Business Media LLC in Algorithms for Molecular Biology
- Vol. 16 (1), 1-15
- https://doi.org/10.1186/s13015-021-00195-4
Abstract
Background: Directed evolution (DE) is a protein engineering technique that uses iterative rounds of mutagenesis and screening to search for sequences that optimize a given property, such as binding affinity to a specified target. Unfortunately, the underlying optimization problem is under-determined, so mutations introduced to improve the specified property may come at the expense of unmeasured but nevertheless important properties (e.g., solubility, thermostability). We address this issue by formulating DE as a regularized Bayesian optimization problem in which the regularization term reflects evolutionary or structure-based constraints.
Results: We applied our approach to three representative proteins, GB1, BRCA1, and SARS-CoV-2 Spike, and evaluated both evolutionary and structure-based regularization terms. The results of these experiments demonstrate that: (i) structure-based regularization usually leads to better designs (and never hurts) compared to the unregularized setting; (ii) evolutionary regularization tends to be least effective; and (iii) regularization leads to better designs because it focuses the search on certain areas of sequence space, making better use of the experimental budget. Additionally, like previous work in machine learning-assisted DE, we find that our approach significantly reduces the experimental burden of DE relative to model-free methods.
Conclusion: Introducing regularization into a Bayesian ML-assisted DE framework alters the exploratory patterns of the underlying optimization routine and can shift variant selections towards those with a range of targeted and desirable properties. In particular, we find that structure-based regularization often improves variant selection compared to unregularized approaches, and never hurts.
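The general idea of regularized Bayesian optimization described in the abstract can be illustrated with a small sketch. This is not the authors' formulation: it assumes an upper-confidence-bound acquisition with an additive penalty, and it uses a site-independent amino-acid profile as a stand-in for an evolutionary regularizer. The names `regularized_ucb`, `profile_penalty`, `select_batch`, and the parameters `beta` and `lam` are hypothetical, as is the `predict` callback that would normally wrap a surrogate model such as a Gaussian process.

```python
import math


def regularized_ucb(mean, std, penalty, beta=2.0, lam=1.0):
    """Acquisition score: optimistic fitness estimate minus a weighted penalty.

    Assumption: the regularizer enters additively; other forms are possible.
    """
    return mean + beta * std - lam * penalty


def profile_penalty(seq, profile, floor=1e-6):
    """Negative log-likelihood of `seq` under a site-independent profile.

    `profile[i]` maps residues to probabilities at position i. Unseen
    residues get probability `floor`. This is a toy stand-in for an
    evolutionary regularizer, not a real sequence model.
    """
    return -sum(math.log(profile[i].get(aa, floor)) for i, aa in enumerate(seq))


def select_batch(candidates, predict, penalty_fn, batch_size, beta=2.0, lam=1.0):
    """Rank candidate sequences by regularized score and return the top batch.

    `predict(seq)` is assumed to return (posterior mean, posterior std).
    """
    scored = []
    for seq in candidates:
        mean, std = predict(seq)
        score = regularized_ucb(mean, std, penalty_fn(seq), beta, lam)
        scored.append((score, seq))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [seq for _, seq in scored[:batch_size]]


if __name__ == "__main__":
    # Toy setup: the surrogate rewards tryptophan (W), the profile prefers alanine (A).
    toy_profile = [{"A": 0.9, "W": 0.1}, {"A": 0.9, "W": 0.1}]
    toy_predict = lambda seq: (float(seq.count("W")), 0.0)
    toy_penalty = lambda seq: profile_penalty(seq, toy_profile)
    pool = ["AA", "AW", "WW"]

    print(select_batch(pool, toy_predict, toy_penalty, 1, lam=0.0))
    print(select_batch(pool, toy_predict, toy_penalty, 1, lam=1.0))
```

With `lam=0.0` the selection follows the surrogate alone; raising `lam` shifts selections towards sequences the profile considers plausible, which is the kind of change in search behaviour the abstract attributes to regularization.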
Funding Information
- National Institute of Biomedical Imaging and Bioengineering (T32 EB009403)
- School of Computer Science, Carnegie Mellon University