Centromere Satellites From Arabidopsis Populations: Maintenance of Conserved and Variable Domains

Abstract
The rapid evolution of centromere sequences between species has led to a debate over whether centromere activity is sequence-dependent. The Arabidopsis thaliana centromere regions contain ∼20,000 copies of a 178-bp satellite repeat. Here, we analyzed satellites from 41 Arabidopsis ecotypes, providing the first broad population survey of satellite variation within a species. We found highly conserved segments and consistent sequence lengths in the Arabidopsis satellites and in the published collection of human α-satellites, supporting models for a functional role. Despite this conservation, polymorphisms are significantly enriched at some sites, yielding variation that could restrict binding proteins to a subset of repeat monomers. Some satellite regions vary considerably; at certain bases, consensus sequences derived from each ecotype diverge significantly from the Arabidopsis consensus, indicating substitutions sweep through a genome in less than 5 million years. Such rapid changes generate more variation within the set of Arabidopsis satellites than in genes from the chromosome arms or from the recombinationally suppressed centromere regions. These studies highlight a balance between the mechanisms that maintain particular satellite domains and the forces that disperse sequence changes throughout the satellite repeats in the genome. [Supplemental material is available online at www.genome.org.]