Inferential language models for information retrieval

Abstract
Language modeling (LM) has been widely used in information retrieval (IR) in recent years. An important operation in LM is smoothing of the document language model. However, current smoothing techniques merely redistribute a portion of the term probability mass according to term frequencies in the whole document collection. No relationships between terms are considered and no inference is involved. In this article, we propose several inferential language models capable of performing inference using term relationships. The inference operation is carried out through semantic smoothing of either the document model or the query model, resulting in document or query expansion. The proposed models implement some of the logical inference capabilities proposed in previous studies on logical models, but with the simplifications necessary to make them tractable, offering a good compromise between inference power and efficiency. The models have been tested on several TREC collections, in both English and Chinese. The results show that integrating term relationships into the language modeling framework consistently improves retrieval effectiveness over traditional language models. This study shows that language modeling is a suitable framework for implementing basic inference operations in IR effectively.
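
As background for the semantic smoothing described above, here is a minimal sketch of the standard formulations this line of work builds on (the exact models and relationship estimates are defined in the article itself). The query-likelihood model with Jelinek-Mercer smoothing scores a document d against a query q = q_1 ... q_m as

P(q | d) = \prod_{i=1}^{m} \left[ (1 - \lambda)\, P_{ml}(q_i \mid d) + \lambda\, P(q_i \mid C) \right],

where P_{ml}(\cdot \mid d) is the maximum-likelihood document model and P(\cdot \mid C) is the collection model; the collection model redistributes probability by collection frequency only, with no use of term relationships. A translation-style semantic smoothing of the document model instead replaces P_{ml}(q_i \mid d) with

\sum_{t} P(q_i \mid t)\, P_{ml}(t \mid d),

where P(q_i \mid t) encodes a relationship between term t and query term q_i, so that a document mentioning a related term can contribute probability to the query term, amounting to a form of document expansion. An analogous substitution on the query model yields query expansion.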