A Semantic Layer on Semi-Structured Data Sources for Intuitive Chatbots

Abstract
The main limitations of chatbot technology concern the construction of the chatbot's knowledge representation and its rigid information retrieval and dialogue capabilities, which are usually based on simple "pattern matching rules". Analyzing the distributional properties of words in a text corpus allows the creation of semantic spaces in which natural language elements can be represented and compared. Such a space can be interpreted as a "conceptual" space whose axes represent the latent primitive concepts of the analyzed corpus. The presented work aims to exploit the properties of a data-driven semantic/conceptual space built from semi-structured data sources freely available on the Web, such as Wikipedia. Coding the resources in this space is equivalent to adding a conceptual similarity relationship layer to the Wikipedia graph. The chatbot can exploit this layer to simulate an "intuitive" behavior, attempting to retrieve semantic relations between Wikipedia resources through associative sub-symbolic paths as well.
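
As a rough illustration of the idea of a similarity layer over a resource graph, the following minimal sketch represents each resource as a word-count vector and links two resources when their cosine similarity exceeds a threshold. It is a simplification under stated assumptions, not the method of the presented work: the raw word-count space stands in for the latent conceptual space (no dimensionality reduction is performed), and the article texts, the threshold value, and the names ARTICLES, vectorize, cosine, and similarity_layer are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy "articles"; real input would be text from Wikipedia pages.
ARTICLES = {
    "Dog": "dog animal pet bark mammal",
    "Cat": "cat animal pet meow mammal",
    "Car": "car vehicle engine road wheel",
}


def vectorize(text):
    """Return a bag-of-words count vector for one article."""
    return Counter(text.split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_layer(articles, threshold=0.5):
    """Build similarity edges between articles scoring at or above threshold.

    The threshold is an assumed parameter chosen for this toy example.
    """
    vectors = {title: vectorize(text) for title, text in articles.items()}
    edges = {}
    for a, b in combinations(sorted(vectors), 2):
        score = cosine(vectors[a], vectors[b])
        if score >= threshold:
            edges[(a, b)] = score
    return edges


LAYER = similarity_layer(ARTICLES)
print(LAYER)
```

In this toy corpus only "Cat" and "Dog" share enough vocabulary to be linked, so the layer contains a single edge that a chatbot could follow even if no explicit hyperlink connected the two pages.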
