Visualizing structure and transitions in high-dimensional biological data
- 3 December 2019
- journal article
- research article
- Published by Springer Science and Business Media LLC in Nature Biotechnology
- Vol. 37 (12), 1482-1492
- https://doi.org/10.1038/s41587-019-0336-3
Abstract
The high-dimensional data created by high-throughput technologies require visualization tools that reveal data structure and patterns in an intuitive form. We present PHATE, a visualization method that captures both local and global nonlinear structure using an information-geometric distance between data points. We compare PHATE to other tools on a variety of artificial and biological datasets, and find that it consistently preserves a range of patterns in data, including continual progressions, branches and clusters, better than other tools. We define a manifold preservation metric, which we call denoised embedding manifold preservation (DEMaP), and show that PHATE produces lower-dimensional embeddings that are quantitatively better denoised as compared to existing visualization methods. An analysis of a newly generated single-cell RNA sequencing dataset on human germ-layer differentiation demonstrates how PHATE reveals unique biological insight into the main developmental branches, including identification of three previously undescribed subpopulations. We also show that PHATE is applicable to a wide variety of data types, including mass cytometry, single-cell RNA sequencing, Hi-C and gut microbiome data.
Funding Information
- Gruber Foundation
- U.S. Department of Health & Human Services | NIH | Eunice Kennedy Shriver National Institute of Child Health and Human Development (F31HD097958)
- Alfred P. Sloan Foundation (FG-2016-6607)
- United States Department of Defense | Defense Advanced Research Projects Agency (D16AP00117)
- U.S. Department of Health & Human Services | National Institutes of Health (1R01HG008383, R01GM107092, R01GM130847)
- Institut de valorisation des données