Automating approximate Bayesian computation by local linear regression
Open Access
- 7 July 2009
- journal article
- Published by Springer Science and Business Media LLC in BMC Genomic Data
- Vol. 10 (1), 35
- https://doi.org/10.1186/1471-2156-10-35
Abstract
In several biological contexts, parameter inference relies on computationally intensive techniques. "Approximate Bayesian Computation", or ABC, methods based on summary statistics have become increasingly popular. A particular flavor of ABC, which uses a linear regression to approximate the posterior distribution of the parameters conditional on the summary statistics, is computationally appealing, yet no standalone tool exists to automate the procedure. Here, I describe a program to implement the method. The software package ABCreg implements the local linear-regression approach to ABC. The advantages are: 1. The code is standalone and fully documented. 2. The program automatically processes multiple data sets and creates unique output files for each (which may be processed immediately in R), facilitating the testing of inference procedures on simulated data or the analysis of multiple data sets. 3. The program implements two different transformation methods for the regression step. 4. Analysis options are controlled on the command line by the user, and the program is designed to output warnings for cases where the regression fails. 5. The program does not depend on any particular simulation machinery (coalescent, forward-time, etc.), and is therefore a general tool for processing the results from any simulation. 6. The code is open source and modular. Examples of applying the software to empirical data from Drosophila melanogaster, and of testing the procedure on simulated data, are shown. In practice, ABCreg simplifies implementing ABC based on local linear regression.
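To make the regression step concrete, the following is a minimal Python sketch of the general local linear-regression idea: keep the simulations whose summary statistics fall closest to the observed ones, weight them by distance, regress the parameters on the statistics, and shift each accepted draw to the observed statistics. It is not ABCreg's code or interface. The function names `abc_local_linear` and `toy_example`, the Epanechnikov-style weights, the scaling by standard deviation, and the toy normal-mean model are all assumptions made for illustration, and the parameter transformations mentioned in the abstract are omitted.

```python
import numpy as np


def abc_local_linear(theta, stats, observed, tolerance=0.05):
    """Regression-adjust prior draws `theta` given simulated `stats`.

    Hypothetical helper for illustration only (not part of ABCreg).
    theta: (n,) or (n, p) parameter draws; stats: (n,) or (n, k) summaries;
    observed: the k observed summaries. Returns (adjusted draws, weights).
    """
    theta = np.asarray(theta, dtype=float)
    stats = np.asarray(stats, dtype=float)
    if theta.ndim == 1:
        theta = theta[:, None]
    if stats.ndim == 1:
        stats = stats[:, None]
    observed = np.atleast_1d(np.asarray(observed, dtype=float))

    # Put every summary statistic on a comparable scale (assumed choice).
    scale = stats.std(axis=0)
    scale[scale == 0] = 1.0
    diffs = (stats - observed) / scale
    dist = np.sqrt((diffs ** 2).sum(axis=1))

    # Rejection step: keep the closest `tolerance` fraction of simulations.
    n_keep = max(2, int(round(tolerance * len(dist))))
    keep = np.argsort(dist)[:n_keep]
    delta = dist[keep].max()
    if delta == 0:
        raise ValueError("all accepted simulations match the data exactly")

    # Epanechnikov-style weights: 1 at distance 0, falling to 0 at delta.
    w = 1.0 - (dist[keep] / delta) ** 2

    # Weighted least squares of theta on (intercept, stats - observed).
    X = np.column_stack([np.ones(n_keep), diffs[keep]])
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(X * sw, theta[keep] * sw, rcond=None)

    # Shift each accepted draw to where it would sit at the observed stats.
    adjusted = theta[keep] - diffs[keep] @ coef[1:]
    return adjusted, w


def toy_example(seed=0, n_sims=20000):
    """Toy model (assumed): infer a normal mean from a sample mean of 20."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 10.0, size=n_sims)          # draws from the prior
    stats = rng.normal(theta, 1.0 / np.sqrt(20.0))       # simulated summaries
    return abc_local_linear(theta, stats, observed=[4.0], tolerance=0.05)


if __name__ == "__main__":
    adjusted, weights = toy_example()
    print("weighted posterior mean:", np.average(adjusted[:, 0], weights=weights))
```

Because the function only consumes arrays of parameter draws and summary statistics, the same sketch could be applied to output from any simulator, which mirrors the simulator-independent design described in the abstract.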