The Value of Extracting Clinician-Recorded Affect for Advancing Clinical Research on Depression: Proof-of-Concept Study Applying Natural Language Processing to Electronic Health Records

Abstract
Background: Affective characteristics are associated with depression severity, course, and prognosis. Patients’ affect, as captured by clinicians during sessions, may provide a rich source of information that aligns more naturally with the depression course and with patient-desired depression outcomes.

Objective: In this paper, we propose an information extraction vocabulary and use it to pilot the feasibility and reliability of identifying clinician-recorded patient affective states in clinical notes from electronic health records.

Methods: Affect and mood were annotated in 147 clinical notes of 109 patients by 2 independent coders across 3 pilots. Intercoder discrepancies were settled by a third coder. The resulting reference annotation set was used to test a proof-of-concept natural language processing (NLP) system based on a named entity recognition approach.

Results: Affect and mood concepts were frequently recorded in both templated format and free text in clinical notes. In the annotated data, affective characteristics were identified in 87.8% (129/147) of the notes, and mood was identified in 97.3% (143/147) of the notes. Intercoder reliability was consistently good across the pilots (interannotator agreement [IAA] >70%). The final NLP system showed good reliability against the final reference annotation set (mood IAA=85.8%; affect IAA=80.9%).

Conclusions: Affect and mood can be reliably identified in clinician reports and are good targets for NLP. We discuss several next steps to expand on this proof of concept and the value of this research for clinical research on depression.
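To make the reported quantities concrete, the following is a minimal sketch of how note-level coverage and a pairwise agreement score between two coders could be computed. The abstract does not state how IAA was calculated, so the exact-match, F1-style formula here is an assumption. The names `Span`, `pairwise_agreement`, and `note_coverage` are hypothetical, and the example spans are toy data, not study data.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """One annotated mention (hypothetical schema, not the study's)."""
    note_id: str
    start: int   # character offset where the mention begins
    end: int     # character offset where the mention ends
    label: str   # e.g. "MOOD" or "AFFECT"


def pairwise_agreement(coder_a: set[Span], coder_b: set[Span]) -> float:
    """Exact-match agreement between two annotation sets.

    Assumed formula: 2 * |A intersect B| / (|A| + |B|), which equals the
    F1 score obtained by treating one coder as the reference.
    """
    if not coder_a and not coder_b:
        return 1.0  # nothing annotated by either coder: trivially in agreement
    matched = len(coder_a & coder_b)
    return 2 * matched / (len(coder_a) + len(coder_b))


def note_coverage(notes_with_concept: int, total_notes: int) -> float:
    """Percentage of notes containing at least one mention, to one decimal."""
    return round(100 * notes_with_concept / total_notes, 1)


# Toy annotations: the coders agree on two spans and disagree on one each.
coder_a = {
    Span("note1", 10, 17, "MOOD"),
    Span("note1", 40, 51, "AFFECT"),
    Span("note2", 5, 12, "AFFECT"),
}
coder_b = {
    Span("note1", 10, 17, "MOOD"),
    Span("note1", 40, 51, "AFFECT"),
    Span("note2", 30, 38, "MOOD"),
}

agreement = pairwise_agreement(coder_a, coder_b)
affect_coverage = note_coverage(129, 147)  # counts reported in the Results
mood_coverage = note_coverage(143, 147)    # counts reported in the Results
```

Exact span matching is the strictest choice. A relaxed variant that counts overlapping spans with the same label as a match would generally give higher agreement, so the matching criterion should be reported alongside any IAA figure.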