USING PROBABILISTIC MODELS OF DOCUMENT RETRIEVAL WITHOUT RELEVANCE INFORMATION

1 April 1979

journal article
Published by Emerald in Journal of Documentation

Vol. 35 (4), 285-295
https://doi.org/10.1108/eb026683

Abstract

Most probabilistic retrieval models incorporate information about the occurrence of index terms in relevant and non‐relevant documents. In this paper we consider the situation where no relevance information is available, that is, at the start of the search. Based on a probabilistic model, strategies are proposed for the initial search and an intermediate search. Retrieval experiments with the Cranfield collection of 1,400 documents show that this initial search strategy is better than conventional search strategies both in terms of retrieval effectiveness and in terms of the number of queries that retrieve relevant documents. The intermediate search is shown to be a useful substitute for a relevance feedback search. Experiments with queries that do not retrieve relevant documents at high rank positions indicate that a cluster search would be an effective alternative strategy.
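
The abstract does not state the weighting formula behind the initial search, so the following is only a minimal sketch of one way to rank documents without relevance information. It assumes an IDF-style term weight, log((N - n) / n), where N is the collection size and n is the number of documents containing the term. The function names, the toy collection, and the choice of weight are illustrative assumptions, not the paper's stated method.

```python
import math


def idf_style_weight(num_docs, doc_freq):
    """Assumed weight for a term found in doc_freq of num_docs documents."""
    return math.log((num_docs - doc_freq) / doc_freq)


def rank(query_terms, docs):
    """Rank documents by the summed weights of the query terms they contain.

    docs maps a document id to its set of index terms (hypothetical layout).
    Terms that occur in no document or in every document are skipped,
    because the assumed weight is undefined for them.
    """
    num_docs = len(docs)
    unique_terms = set(query_terms)
    doc_freq = {
        term: sum(1 for terms in docs.values() if term in terms)
        for term in unique_terms
    }
    scores = {}
    for doc_id, terms in docs.items():
        scores[doc_id] = sum(
            idf_style_weight(num_docs, doc_freq[term])
            for term in unique_terms
            if term in terms and 0 < doc_freq[term] < num_docs
        )
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy collection, invented purely for illustration.
toy_docs = {
    "d1": {"boundary", "layer", "flow"},
    "d2": {"flow", "heat"},
    "d3": {"flow", "wing"},
    "d4": {"boundary", "wing"},
}
ranking = rank(["boundary", "layer"], toy_docs)
```

In this sketch a rare query term contributes more to a document's score than a common one, and no relevance judgements are needed to compute the ranking.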

Cited by 266 articles