Insufficiently complex unique-molecular identifiers (UMIs) distort small RNA sequencing
Open Access
- 3 September 2020
- journal article
- research article
- Published by Springer Science and Business Media LLC in Scientific Reports
- Vol. 10 (1), 1-9
- https://doi.org/10.1038/s41598-020-71323-0
Abstract
The attachment of unique molecular identifiers (UMIs) to RNA molecules prior to PCR amplification and sequencing makes it possible to amplify libraries to a level sufficient to identify rare molecules, whilst simultaneously eliminating PCR bias through the identification of duplicated reads. Accurate de-duplication depends on a pool of UMIs complex enough to allow unique labelling. In applications dealing with complex libraries, such as total RNA-seq, only a limited variety of UMIs is required because the variation in molecules to be sequenced is enormous. However, when sequencing a less complex library, such as small RNAs, for which there is a more limited range of possible sequences, we find that increased variation in UMIs is required, even beyond that provided in a commercial kit specifically designed for the preparation of small RNA libraries for sequencing. We show that a pool of UMIs randomly varying across eight nucleotides is not of sufficient depth to uniquely tag the microRNAs to be sequenced. This results in over-de-duplication of reads and marked under-estimation of the expression of the more abundant microRNAs. Whilst still arguing for the utility of UMIs, this work demonstrates the importance of their considered design to avoid errors in the estimation of gene expression in libraries derived from select regions of the transcriptome or small genomes.
Funding Information
- Beat Cancer Principal Research Fellowship
- Worldwide Cancer Research (19-0300)
- Australian Research Council (DP190103333)
- National Health and Medical Research Council (APP1118170, APP1129353)
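The abstract's claim that eight random nucleotides cannot uniquely tag abundant microRNAs can be illustrated with a short occupancy calculation. The sketch below is not the authors' analysis: it assumes every UMI is drawn uniformly at random from all 4^L sequences, ignores sequencing errors, and assumes de-duplication collapses reads that share both the same RNA sequence and the same UMI. The function name `expected_distinct_umis` and the copy numbers in the loop are hypothetical, chosen only for illustration.

```python
import math


def expected_distinct_umis(n_molecules: int, umi_length: int = 8) -> float:
    """Expected number of distinct UMIs among n identical RNA molecules.

    Assumes each molecule receives a UMI drawn uniformly at random from
    all 4**umi_length sequences (an idealisation, not a kit specification).
    After de-duplication, this is the count that would be reported for
    that RNA species.
    """
    pool = 4 ** umi_length
    # pool * (1 - (1 - 1/pool)**n), written with log1p/expm1 for accuracy
    return -pool * math.expm1(n_molecules * math.log1p(-1.0 / pool))


if __name__ == "__main__":
    # Hypothetical copy numbers for a single microRNA species.
    for n in (1_000, 100_000, 1_000_000):
        for length in (8, 12):
            kept = expected_distinct_umis(n, length)
            print(f"molecules={n:>9,}  UMI length={length:>2}  "
                  f"counted after de-duplication={kept:,.0f} "
                  f"({kept / n:.1%} of true)")
```

Under these assumptions the de-duplicated count for any one sequence can never exceed the pool size (4^8 = 65,536 for eight nucleotides), so a species present at far more copies than that is under-counted, whereas a rare species is almost unaffected. Lengthening the UMI enlarges the pool and shrinks the shortfall.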