VASC: dimension reduction and visualization of single cell RNA sequencing data by deep variational autoencoder
Open Access
- 6 October 2017
- preprint content
- Published by Cold Spring Harbor Laboratory
- p. 199315
- https://doi.org/10.1101/199315
Abstract
Single-cell RNA sequencing (scRNA-seq) is a powerful technique for analyzing transcriptomic heterogeneity at the single-cell level. Finding an effective low-dimensional representation and visualization of the original data is an important step in studying cell sub-populations and lineages from scRNA-seq data. scRNA-seq data are much noisier than traditional bulk RNA-seq data: at the single-cell level, transcriptional fluctuations are much larger than in the average of a cell population, and the low amount of RNA transcripts increases the rate of technical dropout events. In this study, we propose VASC (deep Variational Autoencoder for scRNA-seq data), a deep multi-layer generative model for unsupervised dimension reduction and visualization of scRNA-seq data. It explicitly models dropout events and finds nonlinear hierarchical feature representations of the original data. Tested on twenty datasets, VASC shows superior performance in most cases and broader dataset compatibility compared with four state-of-the-art dimension reduction methods. In a case study of pre-implantation embryos, VASC successfully re-establishes the cell dynamics and identifies several candidate marker genes associated with early embryo development.
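To make the notions of technical dropout and low-dimensional visualization concrete, the sketch below simulates dropout on a toy expression matrix and projects the result to two dimensions. It is not the VASC model: the projection is plain PCA (one of the classical baselines for this task), and the dropout probability `exp(-lam * x**2)` is an assumed form borrowed from zero-inflated models such as ZIFA. The function names, the toy data, and the value of `lam` are all hypothetical.

```python
import numpy as np


def simulate_dropout(expr, lam=0.5, rng=None):
    """Zero out entries with probability exp(-lam * expr**2).

    Assumption: low-expressed genes drop out more often. This functional
    form is illustrative and not taken from the abstract above.
    """
    rng = np.random.default_rng() if rng is None else rng
    p_drop = np.exp(-lam * expr ** 2)
    mask = rng.random(expr.shape) >= p_drop  # True means the value is observed
    return expr * mask, mask


def pca_2d(x):
    """Project the rows of x onto the top two principal components."""
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


rng = np.random.default_rng(0)

# Two hypothetical cell populations: 50 cells each, 200 genes,
# with log-scale expression drawn around a population-specific profile.
profile_a = rng.gamma(2.0, 1.0, size=200)
profile_b = rng.gamma(2.0, 1.0, size=200)
cells = np.vstack([
    profile_a + rng.normal(0.0, 0.3, size=(50, 200)),
    profile_b + rng.normal(0.0, 0.3, size=(50, 200)),
]).clip(min=0)

observed, mask = simulate_dropout(cells, lam=0.5, rng=rng)
embedding = pca_2d(observed)  # one 2-D point per cell, ready for a scatter plot
```

A linear projection like this treats the dropout zeros as real measurements, which is the limitation that motivates models that account for dropout explicitly.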