Scaling and Normalization Effects in NMR Spectroscopic Metabonomic Data Sets

18 February 2006

journal article
research article
Published by American Chemical Society (ACS) in Analytical Chemistry

Vol. 78 (7), 2262-2267
https://doi.org/10.1021/ac0519312

Abstract

Considerable confusion appears to exist in the metabonomics literature as to the real need for, and the role of, preprocessing the acquired spectroscopic data. A number of studies have presented various data manipulation approaches, some suggesting an optimum method. In metabonomics, data are usually presented as a table where each row relates to a given sample or analytical experiment and each column corresponds to a single measurement in that experiment, typically individual spectral peak intensities or metabolite concentrations. Here we suggest definitions for and discuss the operations usually termed normalization (a table row operation) and scaling (a table column operation) and demonstrate their need in ¹H NMR spectroscopic data sets derived from urine. The problems associated with “binned” data (i.e., values integrated over discrete spectral regions) are also discussed, and the particular biological context problems of analytical data on urine are highlighted. It is shown that care must be exercised in calculation of correlation coefficients for data sets where normalization to a constant sum is used. Analogous considerations will be needed for other biofluids, other analytical approaches (e.g., HPLC−MS), and indeed for other “omics” techniques (i.e., transcriptomics or proteomics) and for integrated studies with “fused” data sets. It is concluded that data preprocessing is context dependent and there can be no single method for general use.
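The distinction drawn above between normalization (a row operation) and scaling (a column operation), and the caution about correlation coefficients under constant-sum normalization, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the function names `normalize_constant_sum` and `autoscale`, the constant total of 100, the choice of unit-variance scaling as the column operation, and the simulated log-normal data, which stand in for a table of spectral intensities.

```python
import numpy as np


def normalize_constant_sum(X, total=100.0):
    """Normalization (row operation): rescale each sample so its row sums to `total`."""
    X = np.asarray(X, dtype=float)
    return total * X / X.sum(axis=1, keepdims=True)


def autoscale(X):
    """Scaling (column operation): mean-center each variable, divide by its sample SD."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


# Simulated table (an assumption, not real NMR data): 1000 samples x 3 variables,
# each column drawn independently of the others. Column 0 is large and highly
# variable; columns 1 and 2 are small and less variable.
rng = np.random.default_rng(0)
X = rng.lognormal(mean=[3.0, 1.0, 1.0], sigma=[0.5, 0.2, 0.2], size=(1000, 3))

X_norm = normalize_constant_sum(X)  # operates along rows (samples)
X_scaled = autoscale(X_norm)        # operates along columns (variables)

# Correlation between the two minor variables, before and after normalization.
r_raw = np.corrcoef(X[:, 1], X[:, 2])[0, 1]
r_norm = np.corrcoef(X_norm[:, 1], X_norm[:, 2])[0, 1]

print(f"row sums after normalization: {X_norm.sum(axis=1)[:3]}")
print(f"correlation of columns 1 and 2, raw:        {r_raw:.2f}")
print(f"correlation of columns 1 and 2, normalized: {r_norm:.2f}")
```

In this sketch columns 1 and 2 are generated independently, so any sizeable correlation between them after normalization comes from dividing both by the same row total, which is dominated by column 0. Unit-variance scaling afterwards does not change that correlation, because the Pearson coefficient is unaffected by shifting and positively rescaling individual columns.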

