Auto-categorization of medical concepts and contexts

Abstract
In healthcare, information extraction is important for identifying conceptual knowledge, in the form of categories of medical concepts, from large unstructured and semi-structured corpora. A category describes how medical concepts are fundamentally separated from each other to represent their conceptual knowledge in the corpus. In this paper, we focus on identifying the categories of medical concepts and contexts, which describe the subjective and conceptual information of a medical corpus. To recognize medical concepts and assign their categories, we employ our previously developed domain-specific lexicon, WordNet of Medical Event (WME 2.0). The lexicon provides medical concepts together with their affinity, gravity, and polarity scores, similar sentiment words, and sentiment features, which help to develop the category assignment system. The identified categories for the concepts are diseases, drugs, symptoms, human_anatomy, and miscellaneous medical terms (MMT), which together refer to the broadest fundamental classes of medical concepts. The categories assigned to medical concepts are then used to build the category assignment system for medical contexts. The proposed system extracts eleven types of pair-based categories of contexts, such as disease-symptom, disease-drug, and disease-MMT. To validate the categorization systems for medical concepts and contexts, we employ two widely used supervised machine learning classifiers, Naïve Bayes and Logistic Regression, in the presence of the WME 2.0 lexicon. The two classifiers provide F-scores of 0.81 and 0.86 for the concept and context categorization systems, respectively.
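
As a rough illustration of the two steps described above (lexicon-based category assignment for concepts, then pair-based categories for contexts), the following is a minimal sketch. The lexicon entries, the function names, the whitespace tokenization, and the alphabetical ordering of pair labels are all assumptions made for illustration; they are not taken from WME 2.0 or from the authors' system, which additionally relies on sentiment features and supervised classifiers.

```python
from itertools import combinations

# Hypothetical toy lexicon (assumption): real WME 2.0 entries also carry
# affinity, gravity, and polarity scores and sentiment features.
TOY_LEXICON = {
    "malaria": "disease",
    "fever": "symptom",
    "chloroquine": "drug",
    "liver": "human_anatomy",
}


def concept_categories(text, lexicon=TOY_LEXICON):
    """Return (concept, category) pairs for lexicon terms found in the text."""
    # Naive tokenization (assumption): split on whitespace, strip punctuation.
    tokens = [tok.strip(".,;:()").lower() for tok in text.split()]
    return [(tok, lexicon[tok]) for tok in tokens if tok in lexicon]


def context_categories(text, lexicon=TOY_LEXICON):
    """Return pair-based category labels for the distinct categories in a context."""
    categories = sorted({cat for _, cat in concept_categories(text, lexicon)})
    # Each unordered pair of distinct categories gives one label,
    # e.g. "disease-symptom"; alphabetical order is an illustrative choice.
    return ["-".join(pair) for pair in combinations(categories, 2)]


if __name__ == "__main__":
    sentence = "Malaria often causes fever."
    print(concept_categories(sentence))
    print(context_categories(sentence))
```

A context with no recognized concepts, or with concepts from a single category, yields an empty list in this sketch; how the actual system treats such contexts is not stated in the abstract.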
