Identification of cell types from single-cell transcriptomes using a novel clustering method
Open Access
- 11 February 2015
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 31 (12), 1974-1980
- https://doi.org/10.1093/bioinformatics/btv088
Abstract
Motivation: The recent advance of single-cell technologies has brought new insights into complex biological phenomena. In particular, genome-wide single-cell measurements such as transcriptome sequencing enable the characterization of cellular composition as well as functional variation in homogenic cell populations. An important step in the single-cell transcriptome analysis is to group cells that belong to the same cell types based on gene expression patterns. The corresponding computational problem is to cluster a noisy high dimensional dataset with substantially fewer objects (cells) than the number of variables (genes).
Results: In this article, we describe a novel algorithm named shared nearest neighbor (SNN)-Cliq that clusters single-cell transcriptomes. SNN-Cliq utilizes the concept of shared nearest neighbor that shows advantages in handling high-dimensional data. When evaluated on a variety of synthetic and real experimental datasets, SNN-Cliq outperformed the state-of-the-art methods tested. More importantly, the clustering results of SNN-Cliq reflect the cell types or origins with high accuracy.
Availability and implementation: The algorithm is implemented in MATLAB and Python. The source code can be downloaded at http://bioinfo.uncc.edu/SNNCliq.
Contact: zcsu@uncc.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
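The shared nearest neighbor concept named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' SNN-Cliq implementation: the function names, the Euclidean distance, the convention that a cell counts as its own neighbor, and the final connected-components grouping are all assumptions made for illustration, and the abstract does not describe how SNN-Cliq derives clusters from the neighbor graph.

```python
import numpy as np


def knn_lists(X, k):
    """Indices of the k nearest rows (Euclidean) for each row of X.

    X is a cells-by-genes matrix; the distance metric is an assumption.
    """
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a cell is never returned as its own neighbor
    return np.argsort(dist, axis=1)[:, :k]


def snn_similarity(neighbors):
    """Count the neighbors shared by each pair of cells.

    Assumption: each cell is added to its own neighbor set before comparing.
    """
    n = neighbors.shape[0]
    sets = [set(row.tolist()) | {i} for i, row in enumerate(neighbors)]
    S = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            S[i, j] = S[j, i] = len(sets[i] & sets[j])
    return S


def cluster_by_components(S, min_shared):
    """Group cells linked by at least `min_shared` shared neighbors.

    Connected components are a simple stand-in here, not the SNN-Cliq step.
    """
    n = S.shape[0]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and S[i, j] >= min_shared:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def demo():
    # Synthetic data with fewer cells (12) than genes (50), in two groups.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (6, 50)), rng.normal(10, 1, (6, 50))])
    S = snn_similarity(knn_lists(X, k=3))
    print(cluster_by_components(S, min_shared=2))


if __name__ == "__main__":
    demo()
```

Because the similarity depends only on neighbor rankings and not on raw distances, it is less sensitive to the distance concentration that affects high-dimensional data, which is the advantage the abstract alludes to.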