Bounds on the Complexity of the Longest Common Subsequence Problem

1 January 1976

journal article
Published by Association for Computing Machinery (ACM) in Journal of the ACM

Vol. 23 (1), 1-12
https://doi.org/10.1145/321921.321922

Abstract

The problem of finding a longest common subsequence of two strings is discussed. This problem arises in data processing applications such as comparing two files and in genetic applications such as studying molecular evolution. The difficulty of computing a longest common subsequence of two strings is examined using the decision tree model of computation, in which vertices represent “equal - unequal” comparisons. It is shown that unless a bound on the total number of distinct symbols is assumed, every solution to the problem can consume an amount of time that is proportional to the product of the lengths of the two strings. A general lower bound as a function of the ratio of alphabet size to string length is derived. The case where comparisons between symbols of the same string are forbidden is also considered and it is shown that this problem is of linear complexity for a two-symbol alphabet and quadratic for an alphabet of three or more symbols.

Keywords

This publication has 8 references indexed in Scilit:

On computing the length of longest increasing subsequences
Discrete Mathematics, 1975
Preserving order in a forest in less than logarithmic time
Published by Institute of Electrical and Electronics Engineers (IEEE) ,1975
A linear space algorithm for computing maximal common subsequences
Communications of the ACM, 1975
A system for typesetting mathematics
Communications of the ACM, 1975
The String-to-String Correction Problem
Journal of the ACM, 1974
On the Optimality of Some Set Algorithms
Journal of the ACM, 1972
Recommendations with Respect to the Behavioral and Social Aspects of Human Genetics
Proceedings of the National Academy of Sciences of the United States of America, 1972
A general method applicable to the search for similarities in the amino acid sequence of two proteins
Journal of Molecular Biology, 1970

Cited by 153 articles