The PRIDE Proteomics Identifications Database: Data Submission, Query, and Dataset Comparison

Abstract
The PRIDE database was developed to allow the proteomics community to share, publicly or within private collaborations, the vast volume of data generated by proteomics laboratories across the globe. These data are being generated at an increasing rate as more sophisticated technologies become available. Compounding the challenge of data volume, the infrastructure and techniques used to generate these data vary in the instrumentation used, the protein sequence databases searched, the search engines employed, and the automatic or manual filtering of identifications following the initial automated search. The PRIDE project provides an infrastructure to address these problems, including a generic, standards-based format that can be annotated to capture data generated using any proteomics pipeline, a protein accession mapping service to reconcile identifications obtained from disparate protein sequence databases, and tools for query, comparison, and analysis of proteomics data. This chapter describes the main practical considerations in making use of PRIDE, including the available resources: the PRIDE database, the Ontology Lookup Service (OLS), the protein identifier cross-referencing service (PICR), the Proteome Harvest PRIDE submission spreadsheet, and the PRIDE BioMart. PRIDE can be accessed at http://www.ebi.ac.uk/pride.