Image-based crystal detection: a machine-learning approach

Open Access

18 November 2008

journal article
research article
Published by International Union of Crystallography (IUCr) in Acta Crystallographica Section D-Structural Biology

Vol. 64 (12), 1187-1195
https://doi.org/10.1107/s090744490802982x

Abstract

The ability of computers to learn from and annotate large databases of crystallization-trial images not only reduces the workload of crystallization studies, but also offers an opportunity to annotate crystallization trials as part of a framework for improving screening methods. Here, a system is presented that scores sets of images based on the likelihood of containing crystalline material as perceived by a machine-learning algorithm. The system can be incorporated into existing crystallization-analysis pipelines, whereby specialists examine images as they normally would, except that the images appear in rank order according to a simple real-valued score. Promising results are shown for 319 112 images associated with 150 structures solved by the Joint Center for Structural Genomics pipeline during the 2006-2007 year. Overall, the algorithm achieves a mean receiver operating characteristic score of 0.919 and a 78% reduction in human effort per set when an absolute score cutoff is used for screening images, while incurring a loss of five out of 150 structures.
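The abstract describes three quantities: a rank ordering of images by score, a receiver operating characteristic (ROC) score, and the fraction of images a reviewer can skip under an absolute score cutoff. The following is a minimal sketch of how such quantities could be computed for one image set. It is not the authors' implementation. The function names, the toy scores and labels, and the cutoff of 0.3 are all hypothetical, and the ROC score is computed here as the standard area under the ROC curve via pairwise comparison.

```python
from typing import List, Sequence


def rank_for_review(scores: Sequence[float]) -> List[int]:
    """Return image indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve: the probability that a randomly chosen
    crystal image outscores a randomly chosen non-crystal image.
    Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def effort_reduction(scores: Sequence[float], cutoff: float) -> float:
    """Fraction of images scoring below the cutoff, i.e. images a
    reviewer would not need to inspect."""
    skipped = sum(1 for s in scores if s < cutoff)
    return skipped / len(scores)


# Hypothetical toy data for a single image set (1 = contains crystals).
scores = [0.92, 0.15, 0.40, 0.77, 0.05, 0.33]
labels = [1, 0, 0, 1, 0, 1]

order = rank_for_review(scores)
auc = roc_auc(scores, labels)
saved = effort_reduction(scores, cutoff=0.3)
print(order, round(auc, 3), round(saved, 3))
```

In this sketch a mean ROC score would be the average of `roc_auc` over image sets. The cutoff trades reviewer effort against the risk of skipping a set's only crystal-bearing images, which corresponds to the structure loss the abstract reports.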
