Sequencing, assembly and annotation of the whole-insect genome of Lymantria dispar dispar, the European gypsy moth
Open Access
- 30 April 2021
- journal article
- research article
- Published by Oxford University Press (OUP) in G3 Genes|Genomes|Genetics
- Vol. 11 (8)
- https://doi.org/10.1093/g3journal/jkab150
Abstract
The European gypsy moth, Lymantria dispar dispar (LDD), is an invasive insect and a threat to urban trees, forests and forest-related industries in North America. For use as a comparator with a previously published genome based on the LD652 pupal ovary-derived cell line, as well as whole-insect genome sequences obtained from the Asian gypsy moth subspecies L. dispar asiatica and L. dispar japonica, the whole-insect LDD genome was sequenced, assembled and annotated. The resulting assembly was 998 Mb in size, with a contig N50 of 662 Kb and GC content of 38.8%. Long interspersed nuclear elements (LINEs) constitute 25.4% of the whole-insect genome, and a total of 11,901 genes predicted by automated gene finding encoded proteins exhibiting homology with reference sequences in the NCBI NR and/or UniProtKB databases at the most stringent similarity cutoff level (i.e., the gold tier). These results will be especially useful in developing a better understanding of the biology and population genetics of L. dispar and the genetic features underlying Lepidoptera in general.
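For readers unfamiliar with the assembly statistics quoted above, the following is a minimal sketch of how a contig N50 and a GC percentage are conventionally computed. It is not the authors' pipeline: the function names `n50` and `gc_content` and the toy contigs are hypothetical, and excluding ambiguous bases (such as N) from the GC denominator is an assumption made here for illustration.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    The N50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half of the assembly.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


def gc_content(sequences):
    """Return the GC percentage over all unambiguous (A/C/G/T) bases.

    Assumption: ambiguous bases such as N are ignored in the denominator.
    """
    gc = 0
    acgt = 0
    for seq in sequences:
        seq = seq.upper()
        g_and_c = seq.count("G") + seq.count("C")
        gc += g_and_c
        acgt += g_and_c + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


if __name__ == "__main__":
    # Toy contigs, purely illustrative; not data from the LDD assembly.
    contigs = ["ATGCGCTA", "GGCCAATT", "ATAT"]
    print("N50:", n50([len(c) for c in contigs]))
    print("GC%:", round(gc_content(contigs), 1))
```

On a real assembly the same functions would be applied to the contig sequences read from the assembly FASTA file.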
Funding Information
- USDA-ARS (8042-22000-315-00-D)
- Genome Canada’s Large-Scale Applied Research Project (# 10106)
- Biosurveillance of Alien Forest Enemies
- Genomics Research and Development Initiative
- Government of Canada