Expected-Case Complexity of Approximate Nearest Neighbor Searching
- 1 January 2003
- journal article
- Published by Society for Industrial & Applied Mathematics (SIAM) in SIAM Journal on Computing
- Vol. 32 (3), 793-815
- https://doi.org/10.1137/s0097539799366340
Abstract
Most research in algorithms for geometric query problems has focused on their worst-case performance. However, when information on the query distribution is available, the alternative paradigm of designing and analyzing algorithms from the perspective of expected-case performance appears more attractive. We study the approximate nearest neighbor problem from this perspective. As a first step in this direction, we assume that the query points are sampled uniformly from a hypercube that encloses all the data points; however, we make no assumption on the distribution of the data points. We show that with a simple partition tree, called the sliding-midpoint tree, it is possible to achieve linear space and logarithmic query time in the expected case; in contrast, the data structures known to achieve linear space and logarithmic query time in the worst case are complex, and algorithms on them run more slowly in practice. Moreover, we prove that the sliding-midpoint tree achieves optimal expected query time in a certain class of algorithmsKeywords
This publication has 10 references indexed in Scilit:
- Balanced Aspect Ratio Trees: Combining the Advantages of k-d Trees and OctreesJournal of Algorithms, 2001
- An optimal algorithm for approximate nearest neighbor searching in fixed dimensionsJournal of the ACM, 1998
- Approximate Nearest Neighbor Queries RevisitedDiscrete & Computational Geometry, 1998
- Query by image and video content: the QBIC systemComputer, 1995
- Approximate closest-point queries in high dimensionsInformation Processing Letters, 1993
- Indexing by latent semantic analysisJournal of the American Society for Information Science, 1990
- Rate-distortion performance of DPCM schemes for autoregressive sourcesIEEE Transactions on Information Theory, 1985
- An Algorithm for Finding Best Matches in Logarithmic Expected TimeACM Transactions on Mathematical Software, 1977
- ConvexityPublished by Cambridge University Press (CUP) ,1958
- A Mathematical Theory of CommunicationBell System Technical Journal, 1948