novoSNP, a novel computational tool for sequence variation discovery
- 1 March 2005
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 15 (3), 436-442
- https://doi.org/10.1101/gr.2754005
Abstract
Technological improvements shifted sequencing from low-throughput, work-intensive, gel-based systems to high-throughput capillary systems. This resulted in a broad use of genomic resequencing to identify sequence variations in genes and regulatory, as well as extended genomic regions. We describe a software package, novoSNP, that conscientiously discovers single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs) in sequence trace files in a fast, reliable, and user-friendly way. We compared the performance of novoSNP with that of PolyPhred and PolyBayes on two data sets. The first data set comprised 1028 sequence trace files obtained from diagnostic mutation analyses of SCN1A (neuronal voltage-gated sodium channel α-subunit type I gene). The second data set comprised 9062 sequence trace files from a genomic resequencing project aiming at the construction of a high-density SNP map of MAPT (microtubule-associated protein tau gene). Visual inspection of these data sets had identified 38 sequence variations for SCN1A and 488 for MAPT. novoSNP automatically identified all 38 SCN1A variations including five INDELs, while for MAPT only 15 of the 488 variations were not correctly marked. PolyPhred detected far fewer SNPs as compared to novoSNP and missed nearly all INDELs. PolyBayes, designed for the sequence analysis of cloned templates, detected only a limited number of the variations present in the data set. Besides the significant improvement in the automated detection of sequence variations both in diagnostic mutation analyses and in SNP discovery projects, novoSNP also offers a user-friendly interface for inspecting possible genetic variations.
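The counts reported in the abstract can be turned into detection rates with a few lines of arithmetic. The sketch below is illustrative only and is not part of novoSNP: the function name `detection_rate` and the `datasets` layout are hypothetical, and it assumes that the 15 MAPT variations described as "not correctly marked" are all counted as misses against the 488 found by visual inspection.

```python
from fractions import Fraction

# Counts taken from the abstract. Treating the 15 MAPT variations that were
# "not correctly marked" as misses is an assumption of this sketch.
datasets = {
    "SCN1A": {"known": 38, "missed": 0},
    "MAPT": {"known": 488, "missed": 15},
}


def detection_rate(known: int, missed: int) -> Fraction:
    """Fraction of visually confirmed variations that were detected."""
    return Fraction(known - missed, known)


for name, counts in datasets.items():
    detected = counts["known"] - counts["missed"]
    rate = detection_rate(counts["known"], counts["missed"])
    print(f"{name}: {detected}/{counts['known']} = {float(rate):.1%}")
```

Under that assumption, the figures work out to 38/38 (100.0%) for SCN1A and 473/488 (about 96.9%) for MAPT.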