Long-read-based Genome Assembly of Drosophila gunungcola Reveals Fewer Chemosensory Genes in Flower-breeding Species

Abstract
Drosophila gunungcola carries out its reproductive activities on the fresh flowers of several plant species and is an emerging model for studying the co-option of morphological and behavioral traits in male courtship display. Here, we report a near-chromosome-level genome assembly constructed from long-read PacBio sequencing data (∼66x coverage) and annotated with the assistance of RNA-seq transcriptome data from whole organisms at various developmental stages. The assembly comprises a nuclear genome of 189 Mb with 13,950 protein-coding genes and a mitogenome of 17.5 kb. Synteny comparisons with its sister species, D. elegans, and with D. melanogaster revealed few inter-chromosomal rearrangements, suggesting that the gene composition of each Muller element is evolutionarily conserved. Comparison of orthologous genomic regions across species in the D. melanogaster species group revealed losses of several odorant receptor (OR) and ionotropic receptor (IR) genes in D. gunungcola and D. elegans. This high-quality reference genome will facilitate further comparative studies of traits related to the evolution of sexual behavior and diet specialization.