Learning unsupervised feature representations for single cell microscopy images with paired cell inpainting
Open Access
- 3 September 2019
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Computational Biology
- Vol. 15 (9), e1007348
- https://doi.org/10.1371/journal.pcbi.1007348
Abstract
Cellular microscopy images contain rich insights about biology. To extract this information, researchers use features, or measurements of the patterns of interest in the images. Here, we introduce a convolutional neural network (CNN) to automatically design features for fluorescence microscopy. We use a self-supervised method to learn feature representations of single cells in microscopy images without labelled training data. We train CNNs on a simple task that leverages the inherent structure of microscopy images and controls for variation in cell morphology and imaging: given one cell from an image, the CNN is asked to predict the fluorescence pattern in a second different cell from the same image. We show that our method learns high-quality features that describe protein expression patterns in single cells in both yeast and human microscopy datasets. Moreover, we demonstrate that our features are useful for exploratory biological analysis, by capturing high-resolution cellular components in a proteome-wide cluster analysis of human proteins, and by quantifying multi-localized proteins and single-cell variability. We believe paired cell inpainting is a generalizable method to obtain feature representations of single cells in multichannel microscopy images.
Author summary
To understand the cell biology captured by microscopy images, researchers use features, or measurements of relevant properties of cells, such as the shape or size of cells, or the intensity of fluorescent markers. Features are the starting point of most image analysis pipelines, so their quality in representing cells is fundamental to the success of an analysis. Classically, researchers have relied on features manually defined by imaging experts. In contrast, deep learning techniques based on convolutional neural networks (CNNs) automatically learn features, which can outperform manually-defined features at image analysis tasks.
However, most CNN methods require large manually-annotated training datasets to learn useful features, limiting their practical application. Here, we developed a new CNN method that learns high-quality features for single cells in microscopy images, without the need for any labeled training data. We show that our features surpass other comparable features in identifying protein localization from images, and that our method can generalize to diverse datasets. By exploiting our method, researchers will be able to automatically obtain high-quality features customized to their own image datasets, facilitating many downstream analyses, as we highlight by demonstrating many possible use cases of our features in this study.
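The pretext task described in the abstract (given one cell, predict the fluorescence pattern of a second, different cell from the same image) can be illustrated with a minimal sketch of how training pairs might be assembled. This is not the authors' implementation. It assumes each single-cell crop has two channels, a structural marker and the protein of interest, and that the network would receive the full source cell plus the target cell's marker channel. The names `sample_cell_pairs` and `inpainting_loss`, the array layout, and the mean-squared-error objective are all hypothetical choices made for illustration.

```python
import numpy as np


def sample_cell_pairs(cells_by_image, rng, n_pairs):
    """Sample (source, target) cell pairs where both cells share an image.

    cells_by_image: dict mapping an image id to an array of shape
        (n_cells, 2, H, W). Channel 0 is assumed to be a structural
        marker and channel 1 the protein of interest (an assumption,
        not a layout stated in the abstract).
    Returns three stacked arrays:
        sources         (n_pairs, 2, H, W)  full source cells
        target_markers  (n_pairs, H, W)     marker channel of target cells
        target_proteins (n_pairs, H, W)     protein channel to be predicted
    """
    # Only images with at least two cells can supply a pair.
    eligible = [k for k, v in cells_by_image.items() if len(v) >= 2]
    if not eligible:
        raise ValueError("need at least one image with two or more cells")

    sources, target_markers, target_proteins = [], [], []
    for _ in range(n_pairs):
        image_id = eligible[rng.integers(len(eligible))]
        crops = cells_by_image[image_id]
        # Draw two distinct cell indices from the same image.
        i, j = rng.choice(len(crops), size=2, replace=False)
        sources.append(crops[i])
        target_markers.append(crops[j, 0])
        target_proteins.append(crops[j, 1])
    return (np.stack(sources),
            np.stack(target_markers),
            np.stack(target_proteins))


def inpainting_loss(predicted, target):
    """Pixel-wise mean squared error (an assumed objective)."""
    return float(np.mean((predicted - target) ** 2))
```

In a full training loop, a CNN would map `sources` and `target_markers` to a predicted protein channel, and `inpainting_loss` would compare that prediction with `target_proteins`. Because both cells come from the same image, they share the same protein and imaging conditions, so the network has to encode the protein's localization pattern rather than the shape of any one cell.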