Machine-learning-assisted materials discovery using failed experiments
- 4 May 2016
- journal article
- research article
- Published by Springer Science and Business Media LLC in Nature
- Vol. 533 (7601), 73-76
- https://doi.org/10.1038/nature17439
Abstract
Inorganic-organic hybrid materials(1-3) such as organically templated metal oxides(1), metal-organic frameworks (MOFs)(2) and organohalide perovskites(4) have been studied for decades, and hydrothermal and (non-aqueous) solvothermal syntheses have produced thousands of new materials that collectively contain nearly all the metals in the periodic table(5-9). Nevertheless, the formation of these compounds is not fully understood, and development of new compounds relies primarily on exploratory syntheses. Simulation- and data-driven approaches (promoted by efforts such as the Materials Genome Initiative(10)) provide an alternative to experimental trial-and-error. Three major strategies are: simulation-based predictions of physical properties (for example, charge mobility(11), photovoltaic properties(12), gas adsorption capacity(13) or lithium-ion intercalation(14)) to identify promising target candidates for synthetic efforts(11,15); determination of the structure-property relationship from large bodies of experimental data(16,17), enabled by integration with high-throughput synthesis and measurement tools(18); and clustering on the basis of similar crystallographic structure (for example, zeolite structure classification(19,20) or gas adsorption properties(21)). Here we demonstrate an alternative approach that uses machine-learning algorithms trained on reaction data to predict reaction outcomes for the crystallization of templated vanadium selenites. We used information on 'dark' reactions (failed or unsuccessful hydrothermal syntheses) collected from archived laboratory notebooks from our laboratory, and added physicochemical property descriptions to the raw notebook information using cheminformatics techniques. We used the resulting data to train a machine-learning model to predict reaction success.
When carrying out hydrothermal synthesis experiments using previously untested, commercially available organic building blocks, our machine-learning model outperformed traditional human strategies, and successfully predicted conditions for new organically templated inorganic product formation with a success rate of 89 per cent. Inverting the machine-learning model reveals new hypotheses regarding the conditions for successful product formation.
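As a rough illustration of the idea in the abstract (labelled reaction records, including failed ones, used to predict whether a new reaction succeeds), here is a minimal Python sketch. It is not the authors' method: the abstract does not name the algorithm, and this sketch uses a simple k-nearest-neighbour vote for brevity. The feature set (pH, temperature, amine molar mass), every numeric value, and the names `REACTIONS` and `predict_success` are hypothetical and invented purely for illustration.

```python
import math

# Hypothetical reaction records: (pH, temperature in deg C, amine molar mass in g/mol).
# Label 1 = crystalline product, 0 = failed ('dark') reaction.
# All values are invented for illustration; none come from the paper's dataset.
REACTIONS = [
    ((3.0, 90.0, 60.0), 1),
    ((3.5, 110.0, 88.0), 1),
    ((4.0, 100.0, 74.0), 1),
    ((7.5, 90.0, 60.0), 0),
    ((8.0, 110.0, 88.0), 0),
    ((7.0, 100.0, 74.0), 0),
]


def column_stats(rows):
    """Return per-feature (mean, std) so features on different scales are comparable."""
    stats = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        var = sum((x - mean) ** 2 for x in col) / len(col)
        # Fall back to 1.0 when a feature is constant, to avoid dividing by zero.
        stats.append((mean, math.sqrt(var) or 1.0))
    return stats


def standardize(row, stats):
    """Scale one feature vector to zero mean and unit variance per feature."""
    return tuple((x - m) / s for x, (m, s) in zip(row, stats))


def predict_success(query, reactions=REACTIONS, k=3):
    """Majority vote among the k nearest recorded reactions (1 = success, 0 = failure)."""
    stats = column_stats([features for features, _ in reactions])
    q = standardize(query, stats)
    ranked = sorted(
        reactions,
        key=lambda rec: math.dist(q, standardize(rec[0], stats)),
    )
    votes = [label for _, label in ranked[:k]]
    return 1 if sum(votes) * 2 > len(votes) else 0


print(predict_success((3.2, 95.0, 70.0)))  # 1
print(predict_success((7.8, 95.0, 70.0)))  # 0
```

In this toy dataset the failed reactions are the only source of negative labels: if the records labelled 0 were dropped, every query would be predicted as a success, which is one way to see why archived failures carry information.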
This publication has 36 references indexed in Scilit:
- Open-source platform to benchmark fingerprints for ligand-based virtual screening. Journal of Cheminformatics, 2013
- Introduction to Metal–Organic Frameworks. Chemical Reviews, 2012
- From computational discovery to experimental characterization of a high hole mobility organic crystal. Nature Communications, 2011
- LIBSVM. ACM Transactions on Intelligent Systems and Technology, 2011
- Identifying Zeolite Frameworks with a Machine Learning Approach. The Journal of Physical Chemistry C, 2009
- The WEKA data mining software. ACM SIGKDD Explorations Newsletter, 2009
- Organically-templated metal sulfates, selenites and selenates. Chemical Society Reviews, 2006
- Exploration of a Simple Universal Route to the Myriad of Open-Framework Metal Phosphates. Journal of the American Chemical Society, 2000
- Oxyfluorinated microporous compounds ULM-n: chemical parameters, structures and a proposed mechanism for their molecular tectonics. Journal of Fluorine Chemistry, 1995
- Reduced molybdenum phosphates: octahedral-tetrahedral framework solids with tunnels, cages, and micropores. Chemistry of Materials, 1992