Expected-Case Complexity of Approximate Nearest Neighbor Searching

1 January 2003

journal article
Published by Society for Industrial & Applied Mathematics (SIAM) in SIAM Journal on Computing

Vol. 32 (3), 793-815
https://doi.org/10.1137/s0097539799366340

Abstract

Most research in algorithms for geometric query problems has focused on their worst-case performance. However, when information on the query distribution is available, the alternative paradigm of designing and analyzing algorithms from the perspective of expected-case performance appears more attractive. We study the approximate nearest neighbor problem from this perspective. As a first step in this direction, we assume that the query points are sampled uniformly from a hypercube that encloses all the data points; however, we make no assumption on the distribution of the data points. We show that with a simple partition tree, called the sliding-midpoint tree, it is possible to achieve linear space and logarithmic query time in the expected case; in contrast, the data structures known to achieve linear space and logarithmic query time in the worst case are complex, and algorithms on them run more slowly in practice. Moreover, we prove that the sliding-midpoint tree achieves optimal expected query time in a certain class of algorithms.
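
The abstract names the sliding-midpoint tree without describing how it is built. The sketch below illustrates the splitting rule as it is commonly described: cut each cell at the midpoint of its longest side, and if all points fall on one side, slide the cut toward the points until it touches the nearest one. It is an assumption-laden illustration, not the construction or the query algorithm analyzed in the article. The names Node, build, cell_dist, and approx_nn are hypothetical, and the search is a generic (1 + eps)-approximate tree descent chosen only to make the example complete.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    lo: Point                      # lower corner of the cell
    hi: Point                      # upper corner of the cell
    points: List[Point] = field(default_factory=list)  # filled only at leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: List[Point], lo: Point, hi: Point) -> Node:
    """Build a tree over `points`, all assumed to lie in the box [lo, hi].

    `lo` and `hi` must be tuples. Leaves hold at most one point.
    """
    node = Node(lo, hi)
    if len(points) <= 1:
        node.points = list(points)
        return node
    # Split orthogonally to the longest side of the cell, at its midpoint.
    dim = max(range(len(lo)), key=lambda i: hi[i] - lo[i])
    cut = (lo[dim] + hi[dim]) / 2.0
    coords = [p[dim] for p in points]
    # Sliding step: if every point lies on one side of the midpoint,
    # move the cut to the nearest point coordinate on that side.
    cut = min(max(cut, min(coords)), max(coords))
    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] > cut]
    # Points exactly on the cut may go to either side; hand them out so
    # that neither child is empty (this guarantees termination).
    for p in points:
        if p[dim] == cut:
            (left if len(left) <= len(right) else right).append(p)
    node.left = build(left, lo, hi[:dim] + (cut,) + hi[dim + 1:])
    node.right = build(right, lo[:dim] + (cut,) + lo[dim + 1:], hi)
    return node


def cell_dist(q: Point, node: Node) -> float:
    """Euclidean distance from q to the node's cell (0 if q is inside)."""
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2
                         for x, l, h in zip(q, node.lo, node.hi)))


def approx_nn(root: Node, q: Point, eps: float = 0.0) -> Optional[Point]:
    """Return a point within (1 + eps) times the true nearest distance."""
    best = [math.inf, None]  # [distance, point]

    def visit(node: Node) -> None:
        # Skip cells that cannot improve the answer by more than a
        # factor of (1 + eps).
        if cell_dist(q, node) * (1.0 + eps) >= best[0]:
            return
        if node.left is None:  # leaf
            for p in node.points:
                d = math.dist(p, q)
                if d < best[0]:
                    best[0], best[1] = d, p
            return
        # Visit the child whose cell is closer to the query first.
        for child in sorted((node.left, node.right),
                            key=lambda c: cell_dist(q, c)):
            visit(child)

    visit(root)
    return best[1]
```

For example, build([(0.9, 0.9), (0.95, 0.95)], (0.0, 0.0), (1.0, 1.0)) places the first cut at 0.9 rather than at the midpoint 0.5, because both points lie to the right of the midpoint. Setting eps to 0 makes approx_nn an exact search; larger values prune more cells at the cost of accuracy.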

