An Automated Line-of-Therapy Algorithm for Adults With Metastatic Non–Small Cell Lung Cancer: Validation Study Using Blinded Manual Chart Review (Preprint)

Abstract
BACKGROUND Extraction of line-of-therapy (LOT) information from electronic health record and claims data is essential for determining longitudinal changes in systemic anticancer therapy in real-world clinical settings.
OBJECTIVE The aim of this retrospective cohort analysis is to validate and refine our previously described open-source LOT algorithm by comparing the output of the algorithm with results obtained through blinded manual chart review.
METHODS We used structured electronic health record data and clinical documents to identify 500 adult patients treated for metastatic non–small cell lung cancer with systemic anticancer therapy from 2011 to mid-2018. We randomly assigned patients to training (n=350) and test (n=150) cohorts, preserving in each cohort the overall ratio of simple to complex cases (n=254:246). Simple cases were patients who received one LOT and no maintenance therapy; complex cases were patients who received more than one LOT, maintenance therapy, or both. The algorithm was refined using the training cohort data, after which the refined algorithm was evaluated against the test cohort.
RESULTS For simple cases, 16 instances of discordance between the LOT algorithm and chart review prerefinement were reduced to 8 instances postrefinement; in the test cohort, there was no discordance between algorithm and chart review. For complex cases, algorithm refinement reduced the discordance from 68 to 62 instances, with 37 instances in the test cohort. The percentage agreement between LOT algorithm output and chart review for patients who received one LOT was 89% prerefinement, 93% postrefinement, and 93% for the test cohort. The likelihood of precise matching between algorithm output and chart review decreased as the number of unique regimens increased. Several areas of discordance that arose from differing definitions of LOTs and maintenance therapy could not be objectively resolved because of a lack of precise definitions in the medical literature.
CONCLUSIONS Our findings identify common sources of discordance between the LOT algorithm and clinician documentation, making targeted algorithm refinement possible.
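
The cohort assignment and the agreement metric described above can be illustrated with a minimal Python sketch. The abstract does not specify how the split or the percentage agreement was computed, so everything here is an assumption made for illustration: the function names `stratified_split` and `percent_agreement` are hypothetical, the split is a simple per-stratum shuffle, and "agreement" is taken to mean an exact match between the algorithm's regimen sequence and the chart-review sequence for a patient.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction, seed=0):
    """Randomly split patient IDs into training and test sets, keeping the
    proportion of each stratum (e.g. "simple" vs "complex") about equal.

    labels: dict mapping patient_id -> stratum label (hypothetical input shape).
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for patient_id, label in labels.items():
        by_stratum[label].append(patient_id)

    train, test = [], []
    for label in sorted(by_stratum):
        ids = sorted(by_stratum[label])  # sort first so the shuffle is reproducible
        rng.shuffle(ids)
        n_train = round(len(ids) * train_fraction)
        train.extend(ids[:n_train])
        test.extend(ids[n_train:])
    return train, test


def percent_agreement(algorithm_lots, chart_lots):
    """Percentage of patients whose algorithm output exactly matches chart review.

    Both arguments map patient_id -> list of regimens, one entry per LOT.
    Exact-match agreement is an assumption, not the study's stated definition.
    """
    shared = algorithm_lots.keys() & chart_lots.keys()
    if not shared:
        raise ValueError("no patients in common")
    matches = sum(algorithm_lots[p] == chart_lots[p] for p in shared)
    return 100.0 * matches / len(shared)


# Example with the case counts reported above (254 simple, 246 complex).
labels = {i: "simple" if i < 254 else "complex" for i in range(500)}
train_ids, test_ids = stratified_split(labels, train_fraction=0.7)
```

With a 70% training fraction, this sketch yields 350 training and 150 test patients, matching the cohort sizes reported above; the per-stratum counts it produces are a consequence of the rounding rule used here, not figures from the study.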