Implementation of a deidentified federated data network for population-based cohort discovery

Abstract
Objective: The Cross-Institutional Clinical Translational Research project evaluated a federated query tool and examined how it can facilitate clinical trial cohort discovery by managing access to aggregate patient data held at unaffiliated academic medical centers.

Methods: The project adapted software from the Informatics for Integrating Biology and the Bedside (i2b2) program to connect three Clinical Translational Research Award sites: the University of Washington, Seattle; the University of California, Davis; and the University of California, San Francisco. The project developed an iterative spiral software development model to support the implementation and coordination of this multisite data resource.

Results: By standardizing technical infrastructures, policies, and semantics, the project enabled federated querying of deidentified clinical datasets stored in separate institutional environments. It also identified barriers to engaging users, which limited the measurement of the tool's utility.

Discussion: The authors discuss the iterative development and evaluation phases of the project and highlight the challenges identified and the lessons learned.

Conclusion: The common system architecture and translational processes provide high-level (aggregate) deidentified access to a large patient population (>5 million patients) and represent a novel and extensible resource. Enhancing the network for more focused disease areas will require research-driven partnerships with representation from all partner sites.