On-Line Approximate String Matching with Bounded Errors

7 June 2008

book chapter
conference paper
Published by Springer Science and Business Media LLC

p. 130-142
https://doi.org/10.1007/978-3-540-69068-9_14

Abstract

We introduce a new dimension to the widely studied on-line approximate string matching problem, by introducing an error threshold parameter ε so that the algorithm is allowed to miss occurrences with probability ε. This is particularly appropriate for this problem, as approximate searching is used to model many cases where exact answers are not mandatory. We show that the relaxed version of the problem allows us breaking the average-case optimal lower bound of the classical problem, achieving average case O(nlog_σ m/m) time with any \(\epsilon = \textrm{poly}(k/m)\), where n is the text size, m the pattern length, k the number of errors for edit distance, and σ the alphabet size. Our experimental results show the practicality of this novel and promising research direction.

Keywords

This publication has 7 references indexed in Scilit:

Average-optimal single and multiple approximate string matching
ACM Journal of Experimental Algorithmics, 2004
Large deviations for sums of partly dependent random variables
Random Structures & Algorithms, 2004
Algorithms column
ACM SIGACT News, 2003
A guided tour to approximate string matching
ACM Computing Surveys, 2001
Sublinear approximate string matching and biological applications
Algorithmica, 1994
Approximate string matching and local similarity
Lecture Notes in Computer Science, 1994
Approximate string-matching with q-grams and maximal matches
Theoretical Computer Science, 1992

Cited by 1 article