Improving reproducibility by using high-throughput observational studies with empirical calibration
Open Access
- 6 August 2018
- journal article
- research article
- Published by The Royal Society in Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
- Vol. 376 (2128), 20170356
- https://doi.org/10.1098/rsta.2017.0356
Abstract
Concerns over reproducibility in science extend to research using existing healthcare data; many observational studies investigating the same topic produce conflicting results, even when using the same data. To address this problem, we propose a paradigm shift. The current paradigm centres on generating one estimate at a time using a unique study design with unknown reliability and publishing (or not) one estimate at a time. The new paradigm advocates for high-throughput observational studies using consistent and standardized methods, allowing evaluation, calibration and unbiased dissemination to generate a more reliable and complete evidence base. We demonstrate this new paradigm by comparing all depression treatments for a set of outcomes, producing 17 718 hazard ratios, each using methodology on par with current best practice. We furthermore include control hypotheses to evaluate and calibrate our evidence generation process. Results show good transitivity and consistency between databases, and agree with four out of the five findings from clinical trials. The distribution of effect size estimates reported in the literature reveals an absence of small or null effects, with a sharp cut-off at p = 0.05. No such phenomena were observed in our results, suggesting more complete and more reliable evidence. This article is part of a discussion meeting issue ‘The growing ubiquity of algorithms in society: implications, impacts and innovations’.
Funding Information
- National Science Foundation (IIS 1251151 and DMS 1264153)
- National Institutes of Health (R01 LM06910 and U01 HG008680)
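The abstract mentions using control hypotheses to evaluate and calibrate the evidence generation process. As a rough illustration of what calibrating a p-value against controls could look like, the sketch below fits a normal "empirical null" to log hazard ratios from controls assumed to have no true effect, then tests a new estimate against that null rather than against zero. This is a simplified sketch under stated assumptions, not the article's procedure: the function names, the normal-null model and all numbers are hypothetical, and the sketch ignores the sampling error of each control estimate.

```python
import math
import statistics


def fit_empirical_null(control_log_hrs):
    """Estimate mean and SD of the log hazard ratios seen for controls.

    Assumption: controls have no true effect, so any spread or shift
    reflects systematic error. Simplification: the standard error of
    each control estimate is ignored.
    """
    mu = statistics.mean(control_log_hrs)
    sigma = statistics.stdev(control_log_hrs)
    return mu, sigma


def traditional_p_value(log_hr, se):
    """Two-sided p-value against a null of log HR = 0 (random error only)."""
    z = log_hr / se
    return math.erfc(abs(z) / math.sqrt(2))


def calibrated_p_value(log_hr, se, mu, sigma):
    """Two-sided p-value against the empirical null N(mu, sigma^2 + se^2)."""
    z = (log_hr - mu) / math.sqrt(sigma ** 2 + se ** 2)
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical control estimates (log hazard ratios), for illustration only.
control_log_hrs = [0.10, 0.25, -0.05, 0.30, 0.15, 0.20]
mu, sigma = fit_empirical_null(control_log_hrs)

# Hypothetical estimate of interest: hazard ratio 1.5, standard error 0.1.
log_hr, se = math.log(1.5), 0.1
print("traditional p:", traditional_p_value(log_hr, se))
print("calibrated p: ", calibrated_p_value(log_hr, se, mu, sigma))
```

With these made-up inputs the controls are shifted away from zero, so the calibrated p-value comes out larger than the traditional one: part of the apparent effect is attributed to systematic error rather than to the treatment.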