Sequencing of Pooled DNA Samples (Pool-Seq) Uncovers Complex Dynamics of Transposable Element Insertions in Drosophila melanogaster

Abstract
Transposable elements (TEs) are mobile genetic elements that parasitize genomes by semi-autonomously increasing their own copy number within the host genome. While TEs are important for genome evolution, appropriate methods for performing unbiased genome-wide surveys of TE variation in natural populations have been lacking. Here, we describe a novel and cost-effective approach for estimating population frequencies of TE insertions using paired-end Illumina reads from a pooled population sample. Importantly, the method treats insertions present in and absent from the reference genome identically, allowing unbiased TE population frequency estimates. We apply this method to data from a natural Drosophila melanogaster population from Portugal. Consistent with previous reports, we show that low-recombining genomic regions harbor more TE insertions and maintain insertions at higher frequencies than do high-recombining regions. We conservatively estimate that there are almost twice as many “novel” TE insertion sites as sites known from the reference sequence in our population sample (6,824 novel versus 3,639 reference sites, with an average 31-fold coverage per insertion site). Different families of transposable elements show large differences in their insertion densities and population frequencies. Our analyses suggest that the history of TE activity significantly contributes to this pattern, with recently active families segregating at lower frequencies than those active in the more distant past. Finally, using our high-resolution TE abundance measurements, we identified 13 candidate positively selected TE insertions based on their high population frequencies and on low Tajima's D values in their neighborhoods.

Author Summary
Transposable elements (TEs) are parasitic genetic elements that spread by replicating themselves within a host genome. Most organisms are burdened with transposable elements; in fact, up to 80% of some genomes can consist of TE-derived DNA. Here, we use new sequencing technology to examine variation in genomic TE composition within a population at a finer scale and in a more unbiased fashion than has been possible before. We study a Portuguese population of D. melanogaster and find a large number of TE insertions, most of which occur in few individuals. Our analysis confirms that TE insertions are subject to purifying selection that counteracts their spread, and it suggests that the genome records waves of past TE invasions, with recently active elements occurring at low population frequency. We also find indications that TE insertions may sometimes have beneficial effects.
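To make the idea of a population frequency estimate from pooled reads concrete, the following minimal Python sketch shows one plausible way such an estimate could be computed for a single insertion site. It assumes, purely for illustration, that reads at a site can be classified as supporting either the presence or the absence of the insertion, and that the frequency is the fraction of presence-supporting reads. The names InsertionSite, presence_reads, absence_reads, and estimate_frequency are hypothetical and are not taken from the method described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InsertionSite:
    """Hypothetical record of read support at one TE insertion site."""
    site_id: str
    presence_reads: int  # reads supporting the insertion (assumed classification)
    absence_reads: int   # reads supporting the empty site (assumed classification)


def estimate_frequency(site: InsertionSite) -> Optional[float]:
    """Return the fraction of reads supporting the insertion.

    This is an illustrative estimator, not necessarily the one used in the
    study. Returns None when the site has no informative reads.
    """
    total = site.presence_reads + site.absence_reads
    if total == 0:
        return None
    return site.presence_reads / total


if __name__ == "__main__":
    # Illustrative site with 31 informative reads, matching the average
    # coverage quoted in the abstract; the 5/26 split is made up.
    example = InsertionSite("example_site", presence_reads=5, absence_reads=26)
    print(f"{example.site_id}: {estimate_frequency(example):.3f}")
```

Because the same calculation is applied regardless of whether the insertion is annotated in the reference genome, a sketch like this treats reference and novel insertions identically, which is the property the abstract emphasizes.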