KoRASA: Pipeline Optimization for Open-Source Korean Natural Language Understanding Framework Based on Deep Learning

Abstract
Since the emergence of deep learning-based chatbots for knowledge services, numerous research and development projects have been conducted in various industries. High demand for chatbots has drastically increased the global market size; however, the limited functional scalability of open-domain chatbots remains an obstacle to their industrial application. Moreover, because most chatbot frameworks target English, chatbots customized for other languages are needed. To address these problems, this paper proposes KoRASA, a pipeline-optimization method that adapts a deep learning-based open-source chatbot framework to Korean language understanding. KoRASA is a closed-domain chatbot that is applicable across a wide range of industries in Korea. KoRASA's operation consists of four stages: tokenization, featurization, intent classification, and entity extraction. The accuracy and F1-score of KoRASA were measured on datasets taken from common tasks carried out in most industrial fields, and the algorithms for intent classification and entity extraction were optimized. The accuracy and F1-score were 98.2% and 98.4% for intent classification and 97.4% and 94.7% for entity extraction, respectively. These results are better than those achieved by existing models. Accordingly, KoRASA can be applied to various industries, including mobile services based on closed-domain chatbots using Korean, robotic process automation (RPA), edge computing, and Internet of Energy (IoE) services.
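To make the four stages named in the abstract concrete, the following is a minimal toy sketch in plain Python. It is not KoRASA's implementation and does not use the underlying framework's API: the whitespace tokenizer, bag-of-words features, token-overlap classifier, and dictionary lookup are simple stand-ins chosen for illustration, and every name and sample utterance (`INTENT_EXAMPLES`, `ENTITY_LEXICON`, `parse`, and so on) is hypothetical.

```python
from collections import Counter

# Hypothetical toy data for illustration only; not taken from the paper.
INTENT_EXAMPLES = {
    "check_bill": ["전기 요금 조회", "이번 달 요금 알려줘"],
    "report_outage": ["정전 신고", "전기가 나갔어요"],
}
ENTITY_LEXICON = {"서울": "location", "부산": "location"}


def tokenize(text):
    # Stage 1: tokenization. A plain whitespace split stands in for a
    # Korean-aware tokenizer such as a morphological analyzer.
    return text.split()


def featurize(tokens):
    # Stage 2: featurization. Bag-of-words counts stand in for learned features.
    return Counter(tokens)


def classify_intent(features):
    # Stage 3: intent classification. Choose the intent whose best-matching
    # example shares the most tokens with the input (a stand-in for a
    # trained deep learning classifier).
    def best_overlap(examples):
        return max(
            sum((features & featurize(tokenize(example))).values())
            for example in examples
        )

    return max(INTENT_EXAMPLES, key=lambda name: best_overlap(INTENT_EXAMPLES[name]))


def extract_entities(tokens):
    # Stage 4: entity extraction. Dictionary lookup stands in for a
    # sequence-labeling model.
    return [(token, ENTITY_LEXICON[token]) for token in tokens if token in ENTITY_LEXICON]


def parse(text):
    # Run the four stages in order and collect the result.
    tokens = tokenize(text)
    return {
        "intent": classify_intent(featurize(tokens)),
        "entities": extract_entities(tokens),
    }
```

Calling `parse("서울 전기 요금 조회")` on this toy data yields the intent `check_bill` and the entity `("서울", "location")`. The point of the sketch is only the ordering of stages: the tokens feed both the featurizer and the entity extractor, and the features feed the intent classifier.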
Funding Information
  • Korea Electric Power Corporation

This publication has 14 references indexed in Scilit: