Low Cost, Scalable Proteomics Data Analysis Using Amazon’s Cloud Computing Services and Open Source Search Algorithms
- 9 April 2009
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Proteome Research
- Vol. 8 (6), 3148-3153
- https://doi.org/10.1021/pr800970z
Abstract
One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step-by-step instructions on how to implement the virtual proteomics analysis clusters as well as a list of currently available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center Web site (http://proteomics.mcw.edu/vipdac).