Cognitive analysis of metabolomics data for systems biology
- 22 January 2021
- journal article
- research article
- Published by Springer Science and Business Media LLC in Nature Protocols
- Vol. 16 (3), 1376-1418
- https://doi.org/10.1038/s41596-020-00455-4
Abstract
Cognitive computing is revolutionizing the way big data are processed and integrated, with artificial intelligence (AI) natural language processing (NLP) platforms helping researchers to efficiently search and digest the vast scientific literature. Most available platforms have been developed for biomedical researchers, but new NLP tools are emerging for biologists in other fields and an important example is metabolomics. NLP provides literature-based contextualization of metabolic features that decreases the time and expert-level subject knowledge required during the prioritization, identification and interpretation steps in the metabolomics data analysis pipeline. Here, we describe and demonstrate four workflows that combine metabolomics data with NLP-based literature searches of scientific databases to aid in the analysis of metabolomics data and their biological interpretation. The four procedures can be used in isolation or consecutively, depending on the research questions. The first, used for initial metabolite annotation and prioritization, creates a list of metabolites that would be interesting for follow-up. The second workflow finds literature evidence of the activity of metabolites and metabolic pathways in governing the biological condition on a systems biology level. The third is used to identify candidate biomarkers, and the fourth looks for metabolic conditions or drug-repurposing targets that two diseases have in common. The protocol can take 1–4 h or more to complete, depending on the processing time of the various software used.
Funding Information
- DOE | Office of Science (DE-AC02-05CH11231)
- U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (cloud computing credits)
- Deutsche Forschungsgemeinschaft (RI2811/1-1)