Characterization and remediation of sample index swaps by non-redundant dual indexing on massively parallel sequencing platforms
Open Access
- 8 May 2018
- journal article
- research article
- Published by Springer Science and Business Media LLC in BMC Genomics
- Vol. 19 (1), 1-10
- https://doi.org/10.1186/s12864-018-4703-0
Abstract
Here we present an in-depth characterization of the mechanism of sequencer-induced sample contamination caused by index swapping, a phenomenon that affects Illumina sequencers employing patterned flow cells with Exclusion Amplification (ExAmp) chemistry (HiSeq X, HiSeq 4000, and NovaSeq). We also present a remediation method that minimizes the impact of such swaps.

Leveraging data collected over a two-year period, we demonstrate the widespread prevalence of index swapping in patterned flow cell data. We calculate mean swap rates across multiple sample preparation methods and sequencer models, demonstrating that different library methods can have vastly different swapping rates and that even non-ExAmp chemistry instruments display trace levels of index swapping. We provide methods for eliminating sample data cross-contamination by utilizing non-redundant dual indexing for complete filtering of index-swapped reads, and share the sequences for 96 non-combinatorial dual indexes we have validated across various library preparation methods and sequencer models. Finally, using computational methods, we provide greater insight into the mechanism of index swapping.

Index swapping in pooled libraries is a prevalent phenomenon that we observe at a rate of 0.2 to 6% in all sequencing runs on HiSeq X, HiSeq 4000/3000, and NovaSeq. Utilizing non-redundant dual indexing allows for the removal (flagging/filtering) of these swapped reads and eliminates swapping-induced sample contamination, which is critical for sensitive applications such as RNA-seq, single cell, blood biopsy using circulating tumor DNA, or clinical sequencing.
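The filtering idea described in the abstract can be illustrated with a short Python sketch. It is not the authors' pipeline: the function names, the six-base index sequences, and the swap-rate definition (swapped reads divided by expected plus swapped reads) are all assumptions made for illustration. With non-redundant dual indexing, each i7 and each i5 index is used by exactly one sample, so a read carrying two known indexes in a pairing that was never pooled can be flagged as swapped instead of being assigned to the wrong sample.

```python
from collections import Counter


def is_non_redundant(expected_pairs):
    """True if no i7 and no i5 index is shared between samples."""
    i7s = [i7 for i7, _ in expected_pairs]
    i5s = [i5 for _, i5 in expected_pairs]
    return len(set(i7s)) == len(i7s) and len(set(i5s)) == len(i5s)


def classify_read(i7, i5, expected_pairs, known_i7, known_i5):
    """Label one read by its observed (i7, i5) index pair."""
    if (i7, i5) in expected_pairs:
        return "expected"   # pairing belongs to a pooled sample
    if i7 in known_i7 and i5 in known_i5:
        return "swapped"    # both indexes are known, pairing is not
    return "unknown"        # sequencing error or foreign index


def swap_summary(reads, expected_pairs):
    """Count read classes and return (counts, swap_rate).

    swap_rate is an illustrative definition (an assumption):
    swapped / (expected + swapped).
    """
    expected_pairs = set(expected_pairs)
    if not is_non_redundant(expected_pairs):
        raise ValueError("index design reuses an i7 or i5 index")
    known_i7 = {i7 for i7, _ in expected_pairs}
    known_i5 = {i5 for _, i5 in expected_pairs}
    counts = Counter(
        classify_read(i7, i5, expected_pairs, known_i7, known_i5)
        for i7, i5 in reads
    )
    assigned = counts["expected"] + counts["swapped"]
    rate = counts["swapped"] / assigned if assigned else 0.0
    return counts, rate


# Hypothetical two-sample pool; the index sequences are made up.
pool = {("ATCACG", "TTAGGC"), ("CGATGT", "TGACCA")}
demo_reads = (
    [("ATCACG", "TTAGGC")] * 97     # correctly paired reads
    + [("ATCACG", "TGACCA")] * 2    # i5 taken from the other sample
    + [("CGATGT", "TTAGGC")]        # i5 taken from the other sample
    + [("NNNNNN", "TTAGGC")]        # unreadable i7
)
demo_counts, demo_rate = swap_summary(demo_reads, pool)
```

In a combinatorial design, where the same i7 or i5 is shared between samples, the pair `("ATCACG", "TGACCA")` could itself be a valid sample, and the swapped reads would be indistinguishable from genuine ones. The `is_non_redundant` check makes that precondition explicit.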
Funding Information
- Center for Common Disease (UM1HG008895)