Real Time Classification of Viruses in 12 Dimensions
Open Access
- 22 May 2013
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 8 (5), e64328
- https://doi.org/10.1371/journal.pone.0064328
Abstract
The International Committee on Taxonomy of Viruses authorizes and organizes the taxonomic classification of viruses. Thus far, the detailed classifications for all viruses are neither complete nor free from dispute. For example, the current missing label rates in GenBank are 12.1% for family labels and 30.0% for genus labels. Using the proposed Natural Vector representation, all 2,044 single-segment referenced viral genomes in GenBank can be embedded in 12-dimensional space (ℝ¹²). Unlike other approaches, this allows us to determine phylogenetic relations for all viruses at any level (e.g., Baltimore class, family, subfamily, genus, and species) in real time. Additionally, the proposed graphical representation for virus phylogeny provides a visualization of the distribution of viruses in ℝ¹². Unlike the commonly used tree visualization methods, which suffer from uniqueness and existence problems, our representation always exists and is unique. This approach is successfully used to predict and correct viral classification information, as well as to identify viral origins; e.g., a recent public health threat, the West Nile virus, is closer to the Japanese encephalitis antigenic complex based on our visualization. Based on cross-validation results, the accuracy rates of our predictions are as high as 98.2% for Baltimore class labels, 96.6% for family labels, 99.7% for subfamily labels and 97.2% for genus labels.
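The abstract does not spell out how a genome becomes a point in 12 dimensions. The following is a minimal sketch, assuming the definition commonly used in the natural vector literature: for each nucleotide, its count, its mean position, and its normalized second central moment of position (3 values for each of 4 nucleotides, 12 in total). The function names `natural_vector` and `nearest_label`, the use of Euclidean distance, and the nearest-neighbour labelling rule are illustrative assumptions, not details confirmed by this abstract.

```python
from math import dist


def natural_vector(seq):
    """Return a 12-dimensional vector for a DNA sequence (assumed definition).

    Layout: counts for A, C, G, T; then mean positions; then normalized
    second central moments. Positions are 1-indexed.
    """
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for base in "ACGT":
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        n_k = len(positions)
        if n_k == 0:
            # Assumption: an absent nucleotide contributes zeros.
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu_k = sum(positions) / n_k
        # Second central moment, normalized by n_k * n.
        d2_k = sum((p - mu_k) ** 2 for p in positions) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments


def nearest_label(query, labelled):
    """Predict a label from the nearest labelled sequence (hypothetical rule).

    `labelled` is a list of (sequence, label) pairs.
    """
    qv = natural_vector(query)
    best = min(labelled, key=lambda item: dist(qv, natural_vector(item[0])))
    return best[1]
```

Because each genome maps to a fixed-length vector without any alignment step, comparing a new genome against a reference set reduces to distance computations in 12 dimensions, which is consistent with the real-time claim in the abstract.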