inGAP: an integrated next-generation genome analysis pipeline

Abstract
Summary: We have developed a novel mining pipeline, the Integrative Next-generation Genome Analysis Pipeline (inGAP), which is guided by a Bayesian principle and detects single nucleotide polymorphisms (SNPs) and insertions/deletions (indels) by comparing high-throughput pyrosequencing reads with a reference genome from a related organism. inGAP can map both Roche/454 and Illumina reads, with no restriction on read length. Experiments on simulated and experimental data show that the pipeline achieves an overall accuracy of 97% in SNP detection and 94% in indel detection. All detected SNPs and indels can be evaluated further in the graphical editor included in the pipeline. inGAP also provides functions for comparing multiple genomes and for assisting bacterial genome assembly.

Availability: inGAP is available at http://sites.google.com/site/nextgengenomics/ingap

Contact: scs@bx.psu.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
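As an illustration of the Bayesian principle mentioned in the Summary, the following minimal sketch computes the posterior probability of each candidate base at a single haploid site from the read bases aligned to it. It is not inGAP's actual model, which the abstract does not specify: the function name `snp_posterior`, the uniform per-base `error_rate` and the prior `prior_snp` are all assumptions made for illustration.

```python
from math import exp, log

BASES = "ACGT"


def snp_posterior(ref_base, read_bases, error_rate=0.01, prior_snp=0.001):
    """Return the posterior probability of each base at one haploid site.

    Hypothetical illustration only; parameter values are assumed, not
    taken from inGAP.
    """
    log_post = {}
    for base in BASES:
        # Prior: the reference base is favoured; the three alternative
        # bases share the (assumed) prior probability of a SNP equally.
        prior = 1.0 - prior_snp if base == ref_base else prior_snp / 3.0
        # Likelihood: each read base matches the true base with
        # probability 1 - error_rate, otherwise it is one of three errors.
        log_lik = 0.0
        for read_base in read_bases:
            if read_base == base:
                log_lik += log(1.0 - error_rate)
            else:
                log_lik += log(error_rate / 3.0)
        log_post[base] = log(prior) + log_lik
    # Normalise in log space to avoid underflow at high coverage.
    max_log = max(log_post.values())
    total = sum(exp(v - max_log) for v in log_post.values())
    return {b: exp(v - max_log) / total for b, v in log_post.items()}


# Eight reads all showing G at a site where the reference has A.
posterior = snp_posterior("A", "GGGGGGGG")
best_base = max(posterior, key=posterior.get)
```

Under these assumptions, a site would be reported as a SNP when the most probable base differs from the reference and its posterior exceeds a chosen threshold. A single discordant read among many concordant ones leaves the reference base as the most probable call.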