A multi-center cross-platform single-cell RNA sequencing reference dataset
Open Access
- 2 February 2021
- journal article
- research article
- Published by Springer Science and Business Media LLC in Scientific Data
- Vol. 8 (1), 1-11
- https://doi.org/10.1038/s41597-021-00809-x
Abstract
Single-cell RNA sequencing (scRNA-seq) is developing rapidly, and investigators seeking to use this technology are left with a variety of options for both experimental platforms and bioinformatics methods. There is an urgent need for scRNA-seq reference datasets for benchmarking different scRNA-seq platforms and bioinformatics methods. To be broadly applicable, these should be generated from renewable, well-characterized reference samples and processed in multiple centers across different platforms. Here we present a benchmark scRNA-seq dataset that includes 20 scRNA-seq datasets acquired either as mixtures or as individual samples from two biologically distinct cell lines for which a large amount of multi-platform whole genome sequencing data is also available. These scRNA-seq datasets were generated from multiple popular platforms across four sequencing centers. We believe the datasets we describe here will provide a resource that meets this need by allowing evaluation of various bioinformatics methods for scRNA-seq analyses, including but not limited to data preprocessing, imputation, normalization, clustering, batch correction, and differential analysis.
This publication has 40 references indexed in Scilit.