Efficient Clustering of Uncertain Data

1 December 2006

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE International Conference on Data Mining (ICDM)

No. 15504786, p. 436-445
https://doi.org/10.1109/icdm.2006.63

Abstract

We study the problem of clustering data objects whose locations are uncertain. Each data object is represented by an uncertainty region over which a probability density function (pdf) is defined. One way to cluster such uncertain objects is the UK-means algorithm, which is based on the traditional K-means algorithm. In UK-means, an object is assigned to the cluster whose representative has the smallest expected distance to the object. For an arbitrary pdf, calculating the expected distance between an object and a cluster representative requires computationally expensive integration. We study various pruning methods that avoid such expensive expected-distance calculations.
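
The assignment step described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: each object's pdf is approximated by a list of equally weighted sample points, the uncertainty region is taken to be the axis-aligned bounding box of those samples, and the pruning rule shown is one simple bounding-box bound chosen for illustration, not necessarily any of the methods the authors study. All function names (`expected_distance`, `bbox`, `min_max_dist`, `assign`) are hypothetical.

```python
import math

def expected_distance(samples, rep):
    """Estimate E[dist(X, rep)] from equally weighted samples of the pdf."""
    return sum(math.dist(s, rep) for s in samples) / len(samples)

def bbox(samples):
    """Axis-aligned bounding box (lows, highs) enclosing all samples."""
    dims = range(len(samples[0]))
    lows = [min(s[d] for s in samples) for d in dims]
    highs = [max(s[d] for s in samples) for d in dims]
    return lows, highs

def min_max_dist(box, rep):
    """Smallest and largest possible distance from rep to a point in box."""
    lows, highs = box
    near = far = 0.0
    for lo, hi, r in zip(lows, highs, rep):
        near += max(lo - r, 0.0, r - hi) ** 2
        far += max(abs(r - lo), abs(r - hi)) ** 2
    return math.sqrt(near), math.sqrt(far)

def assign(samples, reps):
    """Return (index of the closest representative by expected distance,
    number of expected distances that actually had to be computed)."""
    box = bbox(samples)
    bounds = [min_max_dist(box, r) for r in reps]
    smallest_max = min(far for _, far in bounds)
    # The expected distance always lies between the min and max bounds, so a
    # representative whose minimum exceeds another's maximum cannot win.
    candidates = [i for i, (near, _) in enumerate(bounds) if near <= smallest_max]
    if len(candidates) == 1:
        return candidates[0], 0  # pruning alone decided the assignment
    dists = {i: expected_distance(samples, reps[i]) for i in candidates}
    return min(dists, key=dists.get), len(candidates)

# Hypothetical object: a 5 x 5 grid of samples covering the unit square.
obj = [(x / 4, y / 4) for x in range(5) for y in range(5)]
reps = [(0.5, 0.5), (10.0, 10.0), (1.5, 0.5)]
print(assign(obj, reps))  # (0, 2): the far representative is pruned
```

In this sketch the cheap bounds are computed for every representative, and the costly expected distance only for those that the bounds cannot rule out. With a real pdf the sample average would be replaced by whatever integration the application requires, which is where the savings from pruning would matter.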


Cited by 112 articles