Flexible informatics for linking experimental data to mathematical models via DataRail
Open Access
- 24 January 2008
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 24 (6), 840-847
- https://doi.org/10.1093/bioinformatics/btn018
Abstract
Motivation: Linking experimental data to mathematical models in biology is impeded by the lack of suitable software to manage and transform data. Model calibration would be facilitated and models would increase in value were it possible to preserve links to training data along with a record of all normalization, scaling, and fusion routines used to assemble the training data from primary results.
Results: We describe the implementation of DataRail, an open source MATLAB-based toolbox that stores experimental data in flexible multi-dimensional arrays, transforms arrays so as to maximize information content, and then constructs models using internal or external tools. Data integrity is maintained via a containment hierarchy for arrays, imposition of a metadata standard based on a newly proposed MIDAS format, assignment of semantically typed universal identifiers, and implementation of a procedure for storing the history of all transformations with the array. We illustrate the utility of DataRail by processing a newly collected set of ∼22 000 measurements of protein activities obtained from cytokine-stimulated primary and transformed human liver cells.
Availability: DataRail is distributed under the GNU General Public License and available at http://code.google.com/p/sbpipeline/
Contact: sbpipeline@hms.harvard.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
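The idea of storing the history of all transformations with the array can be illustrated with a minimal sketch. It is written in Python rather than MATLAB and is not DataRail's actual API: the `DataCube` class, its fields, the `transform` method, the `scale_to_max` routine, and the readout labels are all hypothetical names chosen for illustration. The only assumption carried over from the abstract is that each derived array keeps an identifier, a link to its source, and an ordered record of the routines applied.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple
import uuid


@dataclass(frozen=True)
class DataCube:
    """Hypothetical stand-in for a stored array: values plus provenance."""
    values: Tuple[Tuple[float, ...], ...]  # rows = conditions, columns = readouts
    labels: Tuple[str, ...]                # one label per column
    history: Tuple[str, ...] = ()          # ordered names of applied routines
    uid: str = field(default_factory=lambda: uuid.uuid4().hex)
    parent_uid: str = ""                   # identifier of the source array

    def transform(self, name: str,
                  fn: Callable[[List[float]], List[float]]) -> "DataCube":
        """Apply fn column by column and return a new cube.

        The original cube is left untouched; the new cube records the
        routine name and points back to the cube it was derived from.
        """
        columns = list(zip(*self.values))
        new_columns = [tuple(fn(list(col))) for col in columns]
        return DataCube(
            values=tuple(zip(*new_columns)),
            labels=self.labels,
            history=self.history + (name,),
            parent_uid=self.uid,
        )


def scale_to_max(column: List[float]) -> List[float]:
    """Divide each value by the column maximum (one possible normalization)."""
    peak = max(column)
    return [v / peak if peak else 0.0 for v in column]


# Illustrative values only; these are not measurements from the paper.
raw = DataCube(values=((2.0, 10.0), (4.0, 5.0)),
               labels=("readout_A", "readout_B"))
scaled = raw.transform("scale_to_max", scale_to_max)
```

Because every transformation returns a new object instead of modifying the old one, the primary data stay intact and any derived array can be traced back through `parent_uid` and `history` to the routines that produced it.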