Aggregating Data for Computational Toxicology Applications: The U.S. Environmental Protection Agency (EPA) Aggregated Computational Toxicology Resource (ACToR) System

Abstract
Computational toxicology combines data from high-throughput test methods, chemical structure analyses and other biological domains (e.g., genes, proteins, cells, tissues) with the goals of predicting and understanding the underlying mechanistic causes of chemical toxicity and for predicting toxicity of new chemicals and products. A key feature of such approaches is their reliance on knowledge extracted from large collections of data and data sets in computable formats. The U.S. Environmental Protection Agency (EPA) has developed a large data resource called ACToR (Aggregated Computational Toxicology Resource) to support these data-intensive efforts. ACToR comprises four main repositories: core ACToR (chemical identifiers and structures, and summary data on hazard, exposure, use, and other domains), ToxRefDB (Toxicity Reference Database, a compilation of detailed in vivo toxicity data from guideline studies), ExpoCastDB (detailed human exposure data from observational studies of selected chemicals), and ToxCastDB (data from high-throughput screening programs, including links to underlying biological information related to genes and pathways). The EPA DSSTox (Distributed Structure-Searchable Toxicity) program provides expert-reviewed chemical structures and associated information for these and other high-interest public inventories. Overall, the ACToR system contains information on about 400,000 chemicals from 1100 different sources. The entire system is built using open source tools and is freely available to download. This review describes the organization of the data repository and provides selected examples of use cases.