Charting, Navigating, and Populating Natural Product Chemical Space for Drug Discovery

Abstract
Natural products are a heterogeneous group of compounds with diverse yet distinctive molecular properties compared to synthetic compounds and drugs. All relevant analyses show that natural products occupy parts of chemical space not explored by available screening collections while largely adhering to the rule-of-five. This renders them a valuable, unique, and necessary component of screening libraries used in drug discovery. With ChemGPS-NP on the Web and Scaffold Hunter, two tools are available to the scientific community to guide exploration of biologically relevant natural product (NP) chemical space in a focused and targeted fashion, with a view to guiding novel synthesis approaches. Several of the examples given illustrate the possibility of bridging the gap between computational methods and compound library synthesis, and of integrating cheminformatics and chemical space analyses with synthetic chemistry and biochemistry to successfully explore chemical space for the identification of novel small-molecule modulators of protein function.
The examples also illustrate the synergistic potential of the chemical space concept and modern chemical synthesis for biomedical research and drug discovery. Chemical space analysis can map underexplored, biologically relevant parts of chemical space and identify the structure types occupying these parts. Modern synthetic methodology can then be applied to efficiently fill this “virtual space” with real compounds.
From a cheminformatics perspective, there is a clear demand for open-source, easy-to-use tools that can be readily applied by educated nonspecialist chemists and biologists in their daily research. This will include further development of Scaffold Hunter, ChemGPS-NP, and related approaches on the Web.
Such a “cheminformatics toolbox” would enable chemists and biologists to mine their own data in an intuitive and highly interactive process, without the need for specialized computer science and cheminformatics expertise. We anticipate that it may be a viable, if not necessary, step for research initiatives based on large high-throughput screening campaigns, in particular in the pharmaceutical industry, to make the most of recent advances in computational tools and take full advantage of the large data sets generated and available in house. There are “holes” in these data sets that can and should be identified and explored by chemistry and biology.