BSAseq: an interactive and integrated web-based workflow for identification of causal mutations in bulked F2 populations
- 10 August 2020
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 37 (3), 382-387
- https://doi.org/10.1093/bioinformatics/btaa709
Abstract
With the advance of next-generation sequencing (NGS) technologies and reductions in the costs of these techniques, bulked segregant analysis (BSA) has become not only a powerful tool for mapping quantitative trait loci (QTL) but also a useful way to identify causal gene mutations underlying phenotypes of interest. However, due to the presence of background mutations and errors in sequencing, genotyping, and reference assembly, it is often difficult to distinguish true causal mutations from background mutations. In this study, we developed the BSAseq workflow, which includes an automated bioinformatics analysis pipeline with a probabilistic model for estimating the linked region (the region linked to the causal mutation) and an interactive Shiny web application for visualizing the results. We deeply sequenced a sorghum male-sterile parental line (ms8) to capture the majority of background mutations in our bulked F2 data. We applied the workflow to 11 bulked sorghum F2 populations and 1 rice F2 population and identified the true causal mutation in each population. The workflow is intuitive and straightforward, facilitating its adoption by users without bioinformatics analysis skills. We anticipate that the BSAseq workflow will be broadly applicable to the identification of causal mutations for many phenotypes of interest. BSAseq is freely available at https://www.sciapps.org/page/bsa. Supplementary data are available at Bioinformatics online.
Funding Information
- USDA-ARS (8062-21000-041-00D, 3096-21000-021-00D, 3096-21000-022-00D)
- National Science Foundation (DBI-1265383, IOS-1445025)