Binary-Tree Histograms with Tree Indices
- 20 August 2002
- book chapter
- conference paper
- Published by Springer Science and Business Media LLC in Lecture Notes in Computer Science
- p. 861-870
- https://doi.org/10.1007/3-540-46146-9_85
Abstract
In many application contexts, like statistical databases, transaction recording systems, scientific databases, query optimizers, OLAP, and so on, data are summarized as histograms of aggregate values. When the task of reconstructing range queries on original data from aggregate data is performed, a certain estimation error cannot be avoided, due to the loss of information in compressing data. Error size strongly depends both on how histograms partition data domains and on how estimation inside each bucket is done. We propose a new type of histogram, based on an unbalanced binary-tree partition, suitable for providing quick answers to hierarchical range queries, and we use adaptive tree-indexing for better approximating frequencies inside buckets. As the results from our experiments demonstrate, our histogram behaves considerably better than state-of-the-art histograms, showing smaller errors in all considered data sets at the same storage space.Keywords
This publication has 9 references indexed in Scilit:
- Estimating Range Queries Using Aggregate Data with Integrity Constraints: A Probabilistic ApproachLecture Notes in Computer Science, 2001
- Improving Temporal Joins Using HistogramsLecture Notes in Computer Science, 2000
- Fast approximate query answering using precomputed statisticsPublished by Institute of Electrical and Electronics Engineers (IEEE) ,1999
- Wavelet-based histograms for selectivity estimationPublished by Association for Computing Machinery (ACM) ,1998
- Improved histograms for selectivity estimation of range predicatesPublished by Association for Computing Machinery (ACM) ,1996
- Balancing histogram optimality and practicality for query result size estimationPublished by Association for Computing Machinery (ACM) ,1995
- A universal-scheme approach to statistical databases containing homogeneous summary tablesACM Transactions on Database Systems, 1993
- Implications of certain assumptions in database performance evauationACM Transactions on Database Systems, 1984
- Access path selection in a relational database management systemPublished by Association for Computing Machinery (ACM) ,1979