Machine learning–assisted molecular design and efficiency prediction for high-performance organic photovoltaic materials
Open Access
- 1 November 2019
- journal article
- research article
- Published by American Association for the Advancement of Science (AAAS) in Science Advances
- Vol. 5 (11), eaay4275
- https://doi.org/10.1126/sciadv.aay4275
Abstract
In the process of finding high-performance materials for organic photovoltaics (OPVs), it is meaningful if one can establish the relationship between chemical structures and photovoltaic properties even before synthesizing them. Here, we first establish a database containing over 1700 donor materials reported in the literature. Through supervised learning, our machine learning (ML) models can build up the structure-property relationship and, thus, implement fast screening of OPV materials. We explore several expressions for molecule structures, i.e., images, ASCII strings, descriptors, and fingerprints, as inputs for various ML algorithms. It is found that fingerprints with length over 1000 bits can obtain high prediction accuracy. The reliability of our approach is further verified by screening 10 newly designed donor materials. Good consistency between model predictions and experimental outcomes is obtained. The result indicates that ML is a powerful tool to prescreen new OPV materials, thus accelerating the development of the OPV field.
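The idea of turning an ASCII string into a fixed-length bit fingerprint and feeding it to a supervised learner can be illustrated with a minimal sketch. This is not the authors' method: the hashed-substring fingerprint below is a toy stand-in for real chemical fingerprints, the nearest-neighbour rule stands in for the ML algorithms the abstract refers to, and the function names (`toy_fingerprint`, `tanimoto`, `predict_nearest`) and the two class labels are hypothetical placeholders rather than data from the paper.

```python
import hashlib


def toy_fingerprint(smiles, n_bits=1024, max_len=3):
    """Hash every substring of length 1..max_len into an n_bits bit vector.

    Assumption: a toy stand-in for a chemical fingerprint; it looks only
    at the characters of the string, not at the molecular graph.
    """
    bits = [0] * n_bits
    for size in range(1, max_len + 1):
        for start in range(len(smiles) - size + 1):
            fragment = smiles[start:start + size]
            digest = hashlib.sha256(fragment.encode("utf-8")).digest()
            index = int.from_bytes(digest[:4], "big") % n_bits
            bits[index] = 1
    return bits


def tanimoto(a, b):
    """Tanimoto similarity: shared on-bits divided by total on-bits."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 0.0


def predict_nearest(query_smiles, labelled, n_bits=1024):
    """Return the label of the most similar labelled string (1-nearest neighbour)."""
    query = toy_fingerprint(query_smiles, n_bits)
    best = max(
        labelled,
        key=lambda item: tanimoto(query, toy_fingerprint(item[0], n_bits)),
    )
    return best[1]


# Hypothetical placeholder training set: (SMILES string, class label).
training = [
    ("c1ccsc1", "thiophene-like"),
    ("c1ccccc1", "benzene-like"),
]
prediction = predict_nearest("Cc1ccsc1", training)
```

In practice a cheminformatics toolkit would generate graph-based fingerprints, and the 1024-bit default here merely echoes the abstract's observation about fingerprints longer than 1000 bits.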
Funding Information
- National Natural Science Foundation of China (21801238)
- Municipal Natural Science Foundation of Chongqing (cstc2017jcyjAX0451)
- Municipal Natural Science Foundation of Chongqing (cstc2017rgznzdyfX0023)
- Municipal Natural Science Foundation of Chongqing (cstc2018jcyjAX0556)
- Municipal Natural Science Foundation of Chongqing (cstc2017rgzn-zdyfX0030)
- National Youth Thousand Program Project (R52A199Z11)
- National Special Funds for Repairing and Purchasing Scientific Institutions (Y72Z090Q10)
- Municipal Natural Science Foundation of Chongqing (cstc2018jszx-cyzd0603)
- CAS Pioneer Hundred Talents Program B (Y92A010Q10)