Sequence composition, organization, and evolution of the core Triticeae genome

Abstract
We investigated the composition and the basis of genome expansion in the core Triticeae genome using Aegilops tauschii, the D-genome donor of bread wheat. We sequenced an unfiltered genomic shotgun (trs) library and a methylation–filtration (tmf) library of A. tauschii, and analyzed wheat expressed sequence tags (ESTs) to estimate the expression of genes and transposable elements (TEs). The sampled D-genome sequences consisted of 91.6% repetitive elements, 2.5% known genes, and 5.9% low-copy sequences of unknown function. TEs constituted 68.2% of the D-genome, compared with 50% in maize and 14% in rice. DNA transposons constituted 13% of the D-genome, compared with 2% in maize. TEs were methylated unevenly within and among elements and families, and most were transcribed, which contributed to genome expansion in the core Triticeae genome. The copy number of a majority of repeat families increased gradually following polyploidization. Certain TE families occupied discrete chromosome territories. Nested insertions and illegitimate recombination occurred extensively between the TE families, and a majority of the TEs contained internal deletions. The GC content varied significantly among the three sequence sets examined, ranging from 42% in tmf to 46% in trs and 52% in the ESTs. Based on the enrichment of genic sequences, methylation–filtration offers one option, although not as efficient as in maize, for isolating gene-rich regions from the large genome of wheat.