Sepsid even-skipped Enhancers Are Functionally Conserved in Drosophila Despite Lack of Sequence Conservation

Abstract
The gene expression pattern specified by an animal regulatory sequence is generally viewed as arising from the particular arrangement of transcription factor binding sites it contains. However, we demonstrate here that regulatory sequences whose binding sites have been almost completely rearranged can still produce identical outputs. We sequenced the even-skipped locus from six species of scavenger flies (Sepsidae) that are highly diverged from the model species Drosophila melanogaster, but share its basic patterns of developmental gene expression. Although there is little sequence similarity between the sepsid eve enhancers and their well-characterized D. melanogaster counterparts, the sepsid and Drosophila enhancers drive nearly identical expression patterns in transgenic D. melanogaster embryos. We conclude that the molecular machinery that connects regulatory sequences to the transcription apparatus is more flexible than previously appreciated. In exploring this diverse collection of sequences to identify the shared features that account for their similar functions, we found a small number of short (20–30 bp) sequences nearly perfectly conserved among the species. These highly conserved sequences are strongly enriched for pairs of overlapping or adjacent binding sites. Together, these observations suggest that the local arrangement of binding sites relative to each other is more important than their overall arrangement into larger units of cis-regulatory function.

Author Summary
The transformation of a fertilized egg into a complex, multicellular organism is a carefully choreographed process in which thousands of genes are turned on and off in specific spatial and temporal patterns that confer distinct physical properties and behaviors on emerging cells and tissues. To understand how an organism's genome specifies its form and function, it is therefore necessary to understand how patterns of gene expression are encoded in DNA.
Decades of analysis of the fruit fly Drosophila melanogaster have identified numerous regulatory sequences, but have not fully illuminated how they work. Here we harness the record of natural selection to probe the function of these sequences. We identified regulatory sequences from scavenger fly species that diverged from Drosophila over 100 million years ago. While these regulatory sequences are almost completely different from their Drosophila counterparts, they drive identical expression patterns in Drosophila embryos, demonstrating extreme flexibility in the molecular machines that interpret regulatory DNA. Yet, the identical outputs produced by these sequences mean they must have something in common, and we describe one shared feature of regulatory sequence organization and function that has emerged from these comparisons. Our approach can be generalized to any regulatory system and species, and we believe that a growing collection of regulatory sequences with dissimilar sequences but similar outputs will reveal the molecular logic of gene regulation.