We need to keep a reproducible trace of facts, predictions, and hypotheses from gene to function in the era of big data
Open Access
- 30 November 2020
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Biology
- Vol. 18 (11), e3000999
- https://doi.org/10.1371/journal.pbio.3000999
Abstract
How do we scale biological science to meet the demands of next-generation biology and medicine while keeping track of facts, predictions, and hypotheses? Enormous amounts of DNA sequence and other omics data are now being generated. Since these data contain the blueprint for life, it is imperative that we interpret them accurately. The abundance of DNA sequence is only one part of the challenge. Artificial Intelligence (AI) and network methods routinely build on large screens, single-cell technologies, proteomics, and other modalities to infer or predict biological functions and phenotypes associated with proteins, pathways, and organisms. As a first step, how do we systematically trace the provenance of knowledge from experimental ground truth to gene function predictions and annotations? Here, we review the main challenges in tracking the evolution of biological knowledge and propose several specific solutions to provenance and computational tracing of evidence in functional linkage networks.