Unraveling the Complexities of Life Sciences Data

1 March 2013

journal article
research article
Published by Mary Ann Liebert Inc in Big Data

Vol. 1 (1), 42-50
https://doi.org/10.1089/big.2012.1505

Abstract

The life sciences have entered into the realm of big data and data-enabled science, where data can either empower or overwhelm. These data bring the challenges of the 5 Vs of big data: volume, veracity, velocity, variety, and value. Both independently and through our involvement with DELSA Global (Data-Enabled Life Sciences Alliance, DELSAglobal.org), the Kolker Lab (kolkerlab.org) is creating partnerships that identify data challenges and solve community needs. We specialize in solutions to complex biological data challenges, as exemplified by the community resource of MOPED (Model Organism Protein Expression Database, MOPED.proteinspire.org) and the analysis pipeline of SPIRE (Systematic Protein Investigative Research Environment, PROTEINSPIRE.org). Our collaborative work extends into the computationally intensive tasks of analysis and visualization of millions of protein sequences through innovative implementations of sequence alignment algorithms and creation of the Protein Sequence Universe tool (PSU). Pushing into the future together with our collaborators, our lab is pursuing integration of multi-omics data and exploration of biological pathways, as well as assigning function to proteins and porting solutions to the cloud. Big data have come to the life sciences; discovering the knowledge in the data will bring breakthroughs and benefits.
