Optimizing transformations for automated, high throughput analysis of flow cytometry data

Open Access

4 November 2010

journal article
research article
Published by Springer Science and Business Media LLC in BMC Bioinformatics

Vol. 11 (1), 546
https://doi.org/10.1186/1471-2105-11-546

Abstract

In a high throughput setting, effective flow cytometry data analysis depends heavily on proper data preprocessing. While the usual preprocessing steps of quality assessment, outlier removal, normalization, and gating have received considerable scrutiny from the community, the influence of data transformation on the output of high throughput analysis has been largely overlooked. Flow cytometry measurements can vary over several orders of magnitude, and cell populations can have variances that depend on their mean fluorescence intensities and may exhibit heavily skewed distributions. Consequently, the choice of data transformation can influence the output of automated gating. An appropriate data transformation aids in data visualization and in gating cell populations across the range of the data. Experience shows that the choice of transformation is data specific. Our goal here is to compare the performance of different transformations applied to flow cytometry data in the context of automated gating in a high throughput, fully automated setting. We examine the most common transformations used in flow cytometry, including the generalized hyperbolic arcsine, biexponential, linlog, and generalized Box-Cox, all within the Bioconductor flowCore framework that is widely used in high throughput, automated flow cytometry data analysis. All of these transformations have adjustable parameters whose effects upon the data are non-intuitive for most users. By making some modelling assumptions about the transformed data, we develop maximum likelihood criteria to optimize parameter choice for these different transformations.
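The idea of choosing a transformation parameter by maximum likelihood can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a one-parameter hyperbolic arcsine, y = asinh(b * x), and assumes the transformed data are univariate normal, which is only one possible modelling assumption. The function names, the synthetic data, and the candidate grid are all hypothetical. The log-Jacobian term is what makes likelihoods comparable across different values of b.

```python
import numpy as np

def arcsinh_transform(x, b):
    """One-parameter hyperbolic arcsine (illustrative form): y = asinh(b * x)."""
    return np.arcsinh(b * x)

def profile_log_likelihood(x, b):
    """Profile log-likelihood of b, assuming asinh(b * x) is normally distributed.

    The mean and variance are replaced by their ML estimates, and additive
    constants that do not depend on b are dropped.
    """
    y = arcsinh_transform(x, b)
    n = x.size
    sigma2 = y.var()  # ML variance estimate (ddof=0)
    # log |dy/dx| = log(b) - 0.5 * log(1 + (b * x)^2), summed over all events
    log_jacobian = np.sum(np.log(b) - 0.5 * np.log1p((b * x) ** 2))
    return -0.5 * n * np.log(sigma2) + log_jacobian

def fit_scale(x, candidates):
    """Return the candidate b with the highest profile log-likelihood."""
    scores = [profile_log_likelihood(x, b) for b in candidates]
    return candidates[int(np.argmax(scores))]

# Synthetic "fluorescence" values that become standard normal under b = 0.01.
rng = np.random.default_rng(0)
true_b = 0.01
x = np.sinh(rng.normal(loc=0.0, scale=1.0, size=5000)) / true_b

candidates = np.logspace(-4, 0, 41)  # hypothetical search grid for b
best_b = fit_scale(x, candidates)
print(f"selected b = {best_b:.4g} (data generated with b = {true_b})")
```

A grid search keeps the sketch short; a one-dimensional optimizer could replace it. The same pattern would apply to other parameterized transformations, provided the corresponding log-Jacobian is used.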
