Meaningful change detection in structured data
- 1 June 1997
- journal article
- Published by Association for Computing Machinery (ACM) in ACM SIGMOD Record
- Vol. 26 (2), 26-37
- https://doi.org/10.1145/253262.253266
Abstract
Detecting changes by comparing data snapshots is an important requirement for difference queries, active databases, and version and configuration management. In this paper we focus on detecting meaningful changes in hierarchically structured data, such as nested-object data. This problem is much more challenging than the corresponding one for relational or flat-file data. In order to describe changes better, we base our work not just on the traditional “atomic” insert, delete, update operations, but also on operations that move an entire sub-tree of nodes, and that copy an entire sub-tree. These operations allows us to describe changes in a semantically more meaningful way. Since this change detection problem is NP -hard, in this paper we present a heuristic change detection algorithm that yields close to “minimal” descriptions of the changes, and that has fewer restrictions than previous algorithms. Our algorithm is based on transforming the change detection problem to a problem of computing a minimum-cost edge cover of a bipartite graph. We study the quality of the solution produced by our algorithm, as well as the running time, both analytically and experimentally.Keywords
This publication has 9 references indexed in Scilit:
- Meaningful change detection in structured dataPublished by Association for Computing Machinery (ACM) ,1997
- Change detection in hierarchically structured informationPublished by Association for Computing Machinery (ACM) ,1996
- Exact and approximate algorithms for unordered tree matchingIEEE Transactions on Systems, Man, and Cybernetics, 1994
- Fast algorithms for the unit cost editing distance between treesJournal of Algorithms, 1990
- An O(NP) sequence comparison algorithmInformation Processing Letters, 1990
- Simple Fast Algorithms for the Editing Distance between Trees and Related ProblemsSIAM Journal on Computing, 1989
- AnO(ND) difference algorithm and its variationsAlgorithmica, 1986
- On the complexity of the Extended String-to-String Correction ProblemPublished by Association for Computing Machinery (ACM) ,1975
- The String-to-String Correction ProblemJournal of the ACM, 1974