Algorithms for discovering bucket orders from data

Abstract
Ordering and ranking items of different types are important tasks in various applications, such as query processing and scientific data mining. A total order for the items can be misleading, since there are groups of items that have practically equal ranks.We consider bucket orders, i.e., total orders with ties. They can be used to capture the essential order information without overfitting the data: they form a useful concept class between total orders and arbitrary partial orders. We address the question of finding a bucket order for a set of items, given pairwise precedence information between the items. We also discuss methods for computing the pairwise precedence data.We describe simple and efficient algorithms for finding good bucket orders. Several of the algorithms have a provable approximation guarantee, and they scale well to large datasets. We provide experimental results on artificial and a real data that show the usefulness of bucket orders and demonstrate the accuracy and efficiency of the algorithms.

This publication has 17 references indexed in Scilit: