Bounds on the Complexity of the Longest Common Subsequence Problem
- 1 January 1976
- journal article
- Published by Association for Computing Machinery (ACM) in Journal of the ACM
- Vol. 23 (1), 1-12
- https://doi.org/10.1145/321921.321922
Abstract
The problem of finding a longest common subsequence of two strings is discussed. This problem arises in data processing applications such as comparing two files and in genetic applications such as studying molecular evolution. The difficulty of computing a longest common subsequence of two strings is examined using the decision tree model of computation, in which vertices represent “equal - unequal” comparisons. It is shown that unless a bound on the total number of distinct symbols is assumed, every solution to the problem can consume an amount of time that is proportional to the product of the lengths of the two strings. A general lower bound as a function of the ratio of alphabet size to string length is derived. The case where comparisons between symbols of the same string are forbidden is also considered and it is shown that this problem is of linear complexity for a two-symbol alphabet and quadratic for an alphabet of three or more symbols.Keywords
This publication has 8 references indexed in Scilit:
- On computing the length of longest increasing subsequencesDiscrete Mathematics, 1975
- Preserving order in a forest in less than logarithmic timePublished by Institute of Electrical and Electronics Engineers (IEEE) ,1975
- A linear space algorithm for computing maximal common subsequencesCommunications of the ACM, 1975
- A system for typesetting mathematicsCommunications of the ACM, 1975
- The String-to-String Correction ProblemJournal of the ACM, 1974
- On the Optimality of Some Set AlgorithmsJournal of the ACM, 1972
- Recommendations with Respect to the Behavioral and Social Aspects of Human GeneticsProceedings of the National Academy of Sciences of the United States of America, 1972
- A general method applicable to the search for similarities in the amino acid sequence of two proteinsJournal of Molecular Biology, 1970