AllerTOP - a server for in silico prediction of allergens
Open Access
- 17 April 2013
- journal article
- conference paper
- Published by Springer Science and Business Media LLC in BMC Bioinformatics
- Vol. 14 (S6), S4-9
- https://doi.org/10.1186/1471-2105-14-s6-s4
Abstract
Background: Allergy is a form of hypersensitivity to normally innocuous substances, such as dust, pollen, foods or drugs. Allergens are small antigens that commonly provoke an IgE antibody response. There are two types of bioinformatics-based allergen prediction. The first approach follows the FAO/WHO Codex Alimentarius guidelines and searches for sequence similarity. The second approach is based on identifying conserved allergenicity-related linear motifs. Both approaches assume that allergenicity is a linearly coded property. In the present study, we applied ACC pre-processing to sets of known allergens, developing alignment-independent models for allergen recognition based on the main chemical properties of amino acid sequences.
Results: A set of 684 food, 1,156 inhalant and 555 toxin allergens was collected from several databases. A set of non-allergens from the same species was selected to mirror the allergen set. The amino acids in the protein sequences were described by three z-descriptors (z1, z2 and z3), and the sequences were converted into uniform vectors by auto- and cross-covariance (ACC) transformation. Each protein was represented as a vector of 45 variables. Five machine learning methods for classification were applied to derive models for allergen prediction: discriminant analysis by partial least squares (DA-PLS), logistic regression (LR), decision tree (DT), naïve Bayes (NB) and k nearest neighbours (kNN). The best performing model was derived by kNN at k = 3. It was optimized, cross-validated and implemented in a server named AllerTOP, freely accessible at http://www.pharmfac.net/allertop. AllerTOP also predicts the most probable route of exposure. Compared with other servers for allergen prediction, AllerTOP performs best, with 94% sensitivity.
Conclusions: AllerTOP is the first alignment-free server for in silico prediction of allergens based on the main physicochemical properties of proteins. Significantly, as well as allergenicity, AllerTOP is able to predict the route of allergen exposure: food, inhalant or toxin.
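The ACC step and the kNN classifier described in the abstract can be illustrated with a minimal Python sketch. It is not the AllerTOP implementation and rests on several assumptions. The maximum lag of 5 is inferred only from the arithmetic 3 × 3 × 5 = 45. The averaged lagged-product formula is one common form of ACC. Euclidean distance is assumed for kNN. The descriptor values are random stand-ins, not the published z-scales. The names `acc_transform` and `knn_predict` are hypothetical.

```python
import numpy as np


def acc_transform(z, max_lag=5):
    """Map a (n_residues, n_descriptors) array to a fixed-length ACC vector.

    For each lag, entry [j, k] is the mean of z[i, j] * z[i + lag, k] over
    all valid positions i. With 3 descriptors and max_lag=5 (an assumption),
    the result has 3 * 3 * 5 = 45 values regardless of sequence length.
    """
    z = np.asarray(z, dtype=float)
    n, _ = z.shape
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    features = []
    for lag in range(1, max_lag + 1):
        # Diagonal entries are auto-covariances, off-diagonal are cross-covariances.
        cov = z[:-lag].T @ z[lag:] / (n - lag)
        features.append(cov.ravel())
    return np.concatenate(features)


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    dists = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = np.bincount(train_y[nearest])
    return int(np.argmax(votes))


# Random stand-ins for per-residue descriptors (NOT real z-scale values).
rng = np.random.default_rng(0)
short_protein = rng.normal(size=(30, 3))
long_protein = rng.normal(size=(200, 3))
print(acc_transform(short_protein).shape, acc_transform(long_protein).shape)
```

Both proteins yield vectors of the same length even though their sequence lengths differ. That uniform length is what allows an alignment-free classifier such as kNN to compare them directly.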