Survey on evaluation methods for dialogue systems

Abstract
In this paper, we survey the methods and concepts developed for the evaluation of dialogue systems. Evaluation is a crucial part of the development process. Dialogue systems are often evaluated by means of human evaluations and questionnaires. However, this tends to be very cost- and time-intensive. Thus, much work has been put into finding methods that reduce the involvement of human labour. In this survey, we present the main concepts and methods. To this end, we differentiate between the various classes of dialogue systems (task-oriented, conversational, and question-answering dialogue systems). We cover each class by introducing the main technologies developed for its dialogue systems and then presenting the evaluation methods for that class.
Funding Information
  • CHIST-ERA
  • Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (20CH21_174237)
  • Agencia Estatal de Investigación (PCIN-2017-118/AEI)
  • Agencia Estatal de Investigación (PCIN-2017-085/AEI)
  • Agence Nationale de la Recherche (ANR-17-CHR2-0001-03)