Environmental monitoring using next generation sequencing: rapid identification of macroinvertebrate bioindicator species

Top Cited Papers
Open Access
Abstract
Invertebrate communities are central to many environmental monitoring programs. In freshwater ecosystems, aquatic macroinvertebrates are collected, identified and then used to infer ecosystem condition. Yet the key step of species identification is often not taken, as it requires a high level of taxonomic expertise, which is lacking in most organizations, or species cannot be identified as they are morphologically cryptic or represent little known groups. Identifying species using DNA sequences can overcome many of these issues; with the power of next generation sequencing (NGS), using DNA sequences for routine monitoring becomes feasible. In this study, we test if NGS can be used to identify species from field-collected samples in an important bioindicator group, the Chironomidae. We show that Cytochrome oxidase I (COI) and Cytochrome B (CytB) sequences provide accurate DNA barcodes for chironomid species. We then develop a NGS analysis pipeline to identifying species using megablast searches of high quality sequences generated using 454 pyrosequencing against comprehensive reference libraries of Sanger-sequenced voucher specimens. We find that 454 generated COI sequences successfully identified up to 96% of species in samples, but this increased up to 99% when combined with CytB sequences. Accurate identification depends on having at least five sequences for a species; below this level species not expected in samples were detected. Incorrect incorporation of some multiplex identifiers (MID's) used to tag samples was a likely cause, and most errors could be detected when using MID tags on forward and reverse primers. We also found a strong quantitative relationship between the number of 454 sequences and individuals showing that it may be possible to estimate the abundance of species from 454 pyrosequencing data. Next generation sequencing using two genes was successful for identifying chironomid species. However, when detecting species from 454 pyrosequencing data sets it was critical to include known individuals for quality control and to establish thresholds for detecting species. The NGS approach developed here can lead to routine species-level diagnostic monitoring of aquatic ecosystems.