Improving reproducibility by using high-throughput observational studies with empirical calibration

Abstract
Concerns over reproducibility in science extend to research using existing healthcare data; many observational studies investigating the same topic produce conflicting results, even when using the same data. To address this problem, we propose a paradigm shift. The current paradigm centres on generating one estimate at a time using a unique study design with unknown reliability and publishing (or not) one estimate at a time. The new paradigm advocates for high-throughput observational studies using consistent and standardized methods, allowing evaluation, calibration and unbiased dissemination to generate a more reliable and complete evidence base. We demonstrate this new paradigm by comparing all depression treatments for a set of outcomes, producing 17 718 hazard ratios, each estimated using methodology on par with current best practice. We furthermore include control hypotheses to evaluate and calibrate our evidence generation process. Results show good transitivity and consistency across databases, and agree with four out of the five findings from clinical trials. The distribution of effect size estimates reported in the literature reveals an absence of small or null effects, with a sharp cut-off at p = 0.05. No such phenomena were observed in our results, suggesting more complete and more reliable evidence. This article is part of a discussion meeting issue ‘The growing ubiquity of algorithms in society: implications, impacts and innovations’.
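
To illustrate the calibration step referred to in the abstract, the sketch below fits an empirical null distribution to effect estimates from negative controls (control hypotheses where no effect is expected) and uses it to compute a calibrated p-value for a new estimate. This is a simplified, hypothetical illustration, not the authors' implementation: the moment-based fit stands in for the maximum-likelihood fit typically used in empirical calibration, and all function names and numbers are invented for the example.

```python
import numpy as np
from scipy import stats

def fit_null(nc_log_hr, nc_se):
    """Fit a Gaussian systematic-error (null) distribution to negative-control
    estimates. A crude moment-based stand-in for the maximum-likelihood fit:
    the mean of the log estimates, and the variance in excess of what
    sampling error alone would explain."""
    mu = np.mean(nc_log_hr)
    excess_var = max(np.var(nc_log_hr) - np.mean(nc_se ** 2), 0.0)
    return mu, np.sqrt(excess_var)

def calibrated_p(log_hr, se, mu, sigma):
    """Two-sided p-value for one estimate under the empirical null,
    combining systematic error (mu, sigma) with sampling error (se)."""
    z = (log_hr - mu) / np.sqrt(sigma ** 2 + se ** 2)
    return 2 * stats.norm.sf(abs(z))

# Hypothetical example: 50 negative controls whose estimates scatter
# around a small positive bias, then a new estimate of HR = 1.5 (se = 0.15).
rng = np.random.default_rng(0)
nc_se = rng.uniform(0.1, 0.3, size=50)
nc_log_hr = rng.normal(0.1, 0.05, size=50) + rng.normal(0.0, nc_se)

mu, sigma = fit_null(nc_log_hr, nc_se)
print("uncalibrated p:", 2 * stats.norm.sf(np.log(1.5) / 0.15))
print("calibrated p:  ", calibrated_p(np.log(1.5), 0.15, mu, sigma))
```

Running this shows the intended behaviour: the conventionally significant estimate moves toward the null once the systematic error observed in the negative controls (here, a small positive bias) is folded into the p-value, which is the mechanism the paper uses to evaluate and calibrate its evidence generation process.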
Funding Information
  • National Science Foundation (IIS 1251151 and DMS 1264153)
  • National Institutes of Health (R01 LM06910 and U01 HG008680)