High-accuracy long-read amplicon sequences using unique molecular identifiers with Nanopore or PacBio sequencing

Abstract
High-throughput amplicon sequencing of large genomic regions remains challenging for short-read technologies. Here, we report a high-throughput amplicon sequencing approach combining unique molecular identifiers (UMIs) with Oxford Nanopore Technologies (ONT) or Pacific Biosciences circular consensus sequencing, yielding high-accuracy single-molecule consensus sequences of large genomic regions. We applied our approach to sequence ribosomal RNA operon amplicons (~4,500 bp) and genomic sequences (>10,000 bp) of reference microbial communities in which we observed a chimera rate <0.02%. To reach a mean UMI consensus error rate <0.01%, a UMI read coverage of 15x (ONT R10.3), 25x (ONT R9.4.1) and 3x (Pacific Biosciences circular consensus sequencing) is needed, which provides a mean error rate of 0.0042%, 0.0041% and 0.0007%, respectively.
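The general idea of a UMI consensus can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes reads have already been assigned a UMI and aligned to equal length, and it replaces alignment and polishing with a simple per-position majority vote. The function name `umi_consensus`, the platform keys and the toy reads are hypothetical. Only the coverage thresholds (15x, 25x, 3x) come from the abstract.

```python
from collections import Counter, defaultdict

# Minimum UMI read coverage per platform, taken from the abstract.
# The dictionary keys are illustrative labels, not official names.
MIN_COVERAGE = {"ONT_R10.3": 15, "ONT_R9.4.1": 25, "PacBio_CCS": 3}


def umi_consensus(reads, platform):
    """Group (umi, sequence) pairs by UMI and majority-vote each bin.

    Assumes the reads in a bin are already aligned to equal length;
    a real workflow would align and polish reads instead of this toy vote.
    """
    bins = defaultdict(list)
    for umi, seq in reads:
        bins[umi].append(seq)

    consensus = {}
    for umi, seqs in bins.items():
        if len(seqs) < MIN_COVERAGE[platform]:
            continue  # bin is below the coverage threshold, so skip it
        # Take the most common base in each alignment column.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Toy input: three reads share one UMI, a fourth read has its own UMI.
reads = [("AAGT", "ACGT"), ("AAGT", "ACGT"), ("AAGT", "ACTT"), ("CCTA", "GGGA")]
result = umi_consensus(reads, "PacBio_CCS")
print(result)
```

With the 3x threshold, the three-read bin yields a consensus in which the single discordant base is outvoted, while the single-read bin is discarded. Under the 15x or 25x thresholds, neither toy bin would pass.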
Funding Information
  • Villum Fonden (15510)
  • Poul Due Jensen Foundation / Grundfos foundation: Grant reference “Microflora Danica”.
  • Genome British Columbia (SIP011)
  • Natural Sciences and Engineering Research Council of Canada