The Evaluation of a Temporal Reasoning System in Processing Clinical Discharge Summaries

Abstract
Context: TimeText is a temporal reasoning system designed to represent, extract, and reason about temporal information in clinical text.

Objective: To measure the accuracy of TimeText in processing clinical discharge summaries.

Design: Six physicians with biomedical informatics training served as domain experts. Twenty discharge summaries were randomly selected for the evaluation. For each of the first 14 reports, 5 to 8 clinically important medical events were chosen. The temporal reasoning system generated temporal relations about the endpoints (start or finish) of pairs of medical events. Two experts (subjects) manually generated temporal relations for these medical events. The system-generated and expert-generated results were assessed by four other experts (raters). All twenty discharge summaries were used to assess the system's accuracy in answering time-oriented clinical questions. For each report, five to ten clinically plausible temporal questions about events were generated. Two experts generated answers to the questions to serve as the gold standard. We wrote queries to retrieve answers from the system's output.

Measurements: Correctness of generated temporal relations, recall of clinically important relations, and accuracy in answering temporal questions.

Results: The raters determined that 97% of the subjects' 295 generated temporal relations were correct and that 96.5% of the system's 995 generated temporal relations were correct. The system captured 79% of the 307 temporal relations determined to be clinically important by the subjects and raters. The system answered 84% of the temporal questions correctly.

Conclusion: The system encoded the majority of the information identified by experts and was able to answer simple temporal questions.
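As an illustration only, the notion of temporal relations between event endpoints and the recall measurement can be sketched in Python. This is not TimeText's implementation: the `Endpoint` class, the `transitive_closure` and `recall` helpers, and the example events are all hypothetical, and the only relation modeled is a simple "occurs before" ordering between endpoints.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Endpoint:
    """One endpoint of a medical event (hypothetical representation)."""
    event: str  # e.g. "admission"
    kind: str   # "start" or "finish"


def transitive_closure(before):
    """Add every 'earlier than' pair implied by chaining the given pairs."""
    closure = set(before)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def recall(system_relations, important_relations):
    """Fraction of clinically important relations the system captured."""
    if not important_relations:
        return 0.0
    captured = sum(1 for r in important_relations if r in system_relations)
    return captured / len(important_relations)


# Hypothetical events; none of these come from the study data.
pain_start = Endpoint("chest pain", "start")
admit_start = Endpoint("admission", "start")
cath_start = Endpoint("catheterization", "start")
discharge_start = Endpoint("discharge", "start")

# Relations stated directly, as (earlier endpoint, later endpoint).
stated = {(pain_start, admit_start), (admit_start, cath_start)}
system = transitive_closure(stated)

# Relations a reviewer might mark as clinically important (assumed).
important = {(pain_start, cath_start), (cath_start, discharge_start)}
score = recall(system, important)
```

In this toy case the closure infers that the chest pain started before the catheterization, but nothing links the catheterization to discharge, so one of the two important relations is captured. The 79% figure reported above is the same kind of ratio, computed over the 307 relations judged clinically important.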
