Low-coverage sequencing: Implications for design of complex trait association studies
- 1 April 2011
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 21 (6), 940-951
- https://doi.org/10.1101/gr.117259.110
Abstract
New sequencing technologies allow genomic variation to be surveyed in much greater detail than previously possible. While detailed analysis of a single individual typically requires deep sequencing, when many individuals are sequenced it is possible to combine shallow sequence data across individuals to generate accurate calls in shared stretches of chromosome. Here, we show that, as progressively larger numbers of individuals are sequenced, increasingly accurate genotype calls can be generated for a given sequence depth. We evaluate the implications of low-coverage sequencing for complex trait association studies. We systematically compare study designs based on genotyping of tagSNPs, sequencing of many individuals at depths ranging between 2× and 30×, and imputation of variants discovered by sequencing a subset of individuals into the remainder of the sample. We show that sequencing many individuals at low depth is an attractive strategy for studies of complex trait genetics. For example, for disease-associated variants with frequency >0.2%, sequencing 3000 individuals at 4× depth provides similar power to deep sequencing of >2000 individuals at 30× depth but requires only ∼20% of the sequencing effort. We also show low-coverage sequencing can be used to build a reference panel that can drive imputation into additional samples to increase power further. We provide guidance for investigators wishing to combine results from sequenced, genotyped, and imputed samples.
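The ∼20% figure in the abstract can be checked with simple arithmetic if sequencing effort is taken to be the number of individuals multiplied by mean depth. The sketch below does that and also illustrates why a single genome generally needs deep coverage. The function names are hypothetical, and the heterozygote calculation rests on simplifying assumptions that are not taken from the paper: Poisson-distributed read counts, each read equally likely to come from either allele, and no sequencing errors. The deep-sequencing design is set to exactly 2000 individuals, whereas the abstract says ">2000", so the ratio is approximate.

```python
import math


def sequencing_effort(n_individuals: int, depth: float) -> float:
    """Total effort in genome-equivalents: individuals times mean depth."""
    return n_individuals * depth


def prob_both_alleles_seen(mean_depth: float) -> float:
    """Chance that each allele at a heterozygous site is read at least once.

    Illustrative assumptions (not from the paper): the number of reads
    covering the site is Poisson with the given mean, each read samples
    either allele with probability 1/2, and reads are error-free. Under
    these assumptions the per-allele read counts are independent Poisson
    variables with mean depth / 2.
    """
    return (1.0 - math.exp(-mean_depth / 2.0)) ** 2


# Designs quoted in the abstract (2000 is an assumed stand-in for ">2000").
low = sequencing_effort(3000, 4)
deep = sequencing_effort(2000, 30)
print(f"low-coverage effort / deep effort = {low / deep:.0%}")

# Single-sample view: how often a heterozygote shows both alleles.
for depth in (2, 4, 30):
    print(f"{depth}x: P(both alleles observed) = {prob_both_alleles_seen(depth):.3f}")
```

Under these assumptions, a single sample at 4× depth misses one allele of a heterozygous site in roughly a quarter of cases. This is consistent with the abstract's point that accurate calls at low depth depend on combining data across many individuals.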