Selecting XFEL single-particle snapshots by geometric machine learning
Open Access
- 1 January 2021
- journal article
- research article
- Published by AIP Publishing in Structural Dynamics
- Vol. 8 (1), 014701
- https://doi.org/10.1063/4.0000060
Abstract
A promising new route for structural biology is single-particle imaging with an X-ray Free-Electron Laser (XFEL). This method has the advantage that the samples do not require crystallization and can be examined at room temperature. However, high-resolution structures can only be obtained from a sufficiently large number of diffraction patterns of individual molecules, so-called single particles. Here, we present a method that allows for efficient identification of single particles in very large XFEL datasets, operates at low signal levels, and is tolerant to background. This method uses supervised Geometric Machine Learning (GML) to extract low-dimensional feature vectors from a training dataset, fuse test datasets into the feature space of the training dataset, and separate the data into binary distributions of "single particles" and "non-single particles." As a proof of principle, we tested simulated and experimental datasets of the Coliphage PR772 virus. We created a training dataset and classified three types of test datasets. The first was a noise-free simulated test dataset, which gave near-perfect separation. The second comprised simulated test datasets that were modified to reflect different levels of photon counts and background noise; these modified datasets were used to quantify the predictive limits of our approach. The third was an experimental dataset collected at the Stanford Linear Accelerator Center. The single-particle identification for this experimental dataset was compared with previously published results, and it was found that GML covers a wide photon-count range, outperforming other single-particle identification methods. Moreover, a major advantage of GML is its ability to retrieve single particles in the presence of structural variability.
Funding Information
- National Science Foundation (STC 1231306)
- National Science Foundation (DBI-2029533)
- U.S. Department of Energy (DE-SC0002164)
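The three steps named in the abstract (extract low-dimensional feature vectors from a training dataset, fuse test data into that feature space, separate into two classes) can be illustrated with a minimal sketch. The abstract does not specify the embedding or the classifier, so everything here is an assumption made for illustration: a diffusion-map embedding stands in for the feature extraction, a Nyström out-of-sample extension stands in for the fusion step, and a nearest-centroid rule stands in for the binary separation. The function names, the kernel width `eps`, and the label convention (1 for single particles, 0 otherwise) are hypothetical and are not taken from the authors' implementation.

```python
import numpy as np


def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between the rows of a and the rows of b."""
    d2 = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.maximum(d2, 0.0)  # clip small negative values caused by round-off


def fit_diffusion_map(train, labels, eps, n_coords=2):
    """Embed labelled training snapshots (one flattened pattern per row)."""
    k = np.exp(-pairwise_sq_dists(train, train) / eps)  # Gaussian kernel
    d = k.sum(axis=1)
    # S = D^-1/2 K D^-1/2 is symmetric and shares eigenvalues with P = D^-1 K.
    s = k / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(s)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / np.sqrt(d)[:, None]  # right eigenvectors of P
    # Drop the trivial constant eigenvector (eigenvalue 1).
    lam, psi = vals[1:n_coords + 1], psi[:, 1:n_coords + 1]
    centroids = np.stack([psi[labels == c].mean(axis=0) for c in (0, 1)])
    return {"train": train, "eps": eps, "lam": lam, "psi": psi, "centroids": centroids}


def embed_new(model, test):
    """Nystrom extension: place unseen snapshots in the training feature space."""
    k = np.exp(-pairwise_sq_dists(test, model["train"]) / model["eps"])
    p = k / k.sum(axis=1, keepdims=True)  # row-normalised transition weights
    return p @ model["psi"] / model["lam"]


def classify(model, test):
    """Return 1 (single particle) or 0 for each test snapshot (assumed convention)."""
    coords = embed_new(model, test)
    dist = pairwise_sq_dists(coords, model["centroids"])
    return dist.argmin(axis=1)  # index 0 or 1 matches the label convention
```

In this sketch, a row of `train` or `test` would be a flattened diffraction pattern, and `eps` would need to be tuned to the data. The nearest-centroid rule is only a placeholder; any binary classifier could operate on the embedded coordinates.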