The Release 5.1 Annotation of Drosophila melanogaster Heterochromatin

Abstract
The repetitive DNA that constitutes most of the heterochromatic regions of metazoan genomes has hindered the comprehensive analysis of gene content and other functions. We have generated a detailed computational and manual annotation of 24 megabases of heterochromatic sequence in the Release 5 Drosophila melanogaster genome sequence. The heterochromatin contains a minimum of 230 to 254 protein-coding genes, which are conserved in other Drosophilids and more diverged species, as well as 32 pseudogenes and 13 noncoding RNAs. Improved methods revealed that more than 77% of this heterochromatin sequence, including introns and intergenic regions, is composed of fragmented and nested transposable elements and other repeated DNAs. Drosophila heterochromatin contains “islands” of highly conserved genes embedded in these “oceans” of complex repeats, which may require special expression and splicing mechanisms.