An Automated Line-of-Therapy Algorithm for Adults With Metastatic Non–Small Cell Lung Cancer: Validation Study Using Blinded Manual Chart Review

Abstract
Journal of Medical Internet Research preprint. This version has not been peer reviewed and may change before final publication.
Background: Extracting line of therapy (LOT) information from electronic health record (EHR) and claims data is essential for determining longitudinal changes in systemic anticancer therapy (SACT) in real-world clinical settings.
Objective: The aim of this retrospective cohort analysis was to validate and refine our previously described open source LOT algorithm by comparing algorithm output with results obtained through blinded manual chart review.
Methods: We used structured EHR data and clinical documents to identify 500 adult patients who received SACT for metastatic non-small cell lung cancer from 2011 through mid-2018. Patients were randomly assigned to a training cohort (n=350) or a test cohort (n=150), with the assignment proportional to the overall ratio of simple to complex cases (254:246). Simple cases were patients who received one LOT and no maintenance therapy; complex cases received more than one LOT, maintenance therapy, or both. The algorithm was refined using the training cohort data and then evaluated against the test cohort.
Results: For the simple cases, refinement reduced discordance between the LOT algorithm and chart review from 16 instances pre-refinement to 8 instances post-refinement; in the test cohort there was no discordance between algorithm and chart review. For the complex cases, refinement reduced discordance from 68 to 62 instances, and there were 37 instances in the test cohort. Percentage agreement between LOT algorithm output and chart review for patients who received one LOT was 89% pre-refinement, 93% post-refinement, and 93% for the test cohort. The likelihood of an exact match between algorithm output and chart review decreased as the number of unique regimens increased. Several areas of discordance arose from differing definitions of LOTs and maintenance therapy and could not be objectively resolved because the medical literature lacks precise definitions.
Conclusions: Our findings identify common sources of discordance between an LOT algorithm and clinician documentation, which makes targeted algorithm refinement possible.
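The cohort assignment and the agreement metric described above can be illustrated with a minimal Python sketch. This is not the authors' LOT algorithm or their validation code. The `Patient` record, the function names, and the representation of an LOT sequence as a list of regimen labels are all hypothetical, chosen only to show how a split proportional to the simple:complex ratio and a percentage agreement could be computed.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    # Hypothetical record; field names are illustrative assumptions.
    patient_id: str
    n_lots: int            # number of lines of therapy
    had_maintenance: bool  # whether any maintenance therapy was given


def is_simple(p: Patient) -> bool:
    """Simple case: one LOT and no maintenance therapy; otherwise complex."""
    return p.n_lots == 1 and not p.had_maintenance


def stratified_split(patients, n_train, seed=0):
    """Randomly split patients, keeping the simple:complex ratio in each cohort."""
    rng = random.Random(seed)
    simple = [p for p in patients if is_simple(p)]
    complex_ = [p for p in patients if not is_simple(p)]
    rng.shuffle(simple)
    rng.shuffle(complex_)
    # The training share of the simple stratum mirrors the overall training
    # fraction; the complex stratum fills the remaining training slots.
    k_simple = round(len(simple) * n_train / len(patients))
    k_complex = n_train - k_simple
    train = simple[:k_simple] + complex_[:k_complex]
    test = simple[k_simple:] + complex_[k_complex:]
    return train, test


def percent_agreement(algorithm_lots, chart_lots):
    """Percentage of shared patients whose two LOT sequences match exactly."""
    ids = algorithm_lots.keys() & chart_lots.keys()
    if not ids:
        raise ValueError("no patients in common")
    matches = sum(algorithm_lots[i] == chart_lots[i] for i in ids)
    return 100.0 * matches / len(ids)
```

With 254 simple and 246 complex cases and `n_train=350`, this sketch places 178 simple and 172 complex cases in the training cohort and the remaining 76 and 74 in the test cohort. Exact matching of whole sequences is the strictest possible comparison; a real validation might instead compare individual lines or regimens.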