Symmetry in data mining and analysis: A unifying view based on hierarchy
- 30 July 2009
- journal article
- research article
- Published by Pleiades Publishing Ltd in Proceedings of the Steklov Institute of Mathematics
- Vol. 265 (1), 177-198
- https://doi.org/10.1134/s0081543809020175
Abstract
Data analysis and data mining are concerned with unsupervised pattern finding and structure determination in data sets. The data sets themselves are explicitly linked as a form of representation to an observational, or otherwise empirical, domain of interest. “Structure” has long been understood as symmetry which can take many forms with respect to any transformation, including point, translational, rotational, and many others. Symmetries directly point to invariants that pinpoint intrinsic properties of the data and of the background empirical domain of interest. As our data models change, so too do our perspectives on analyzing data. The structures in data surveyed here are based on hierarchy, represented as p-adic numbers or an ultrametric topology.
This publication has 64 references indexed in Scilit:
- Number theory as the ultimate physical theory. P-Adic Numbers, Ultrametric Analysis, and Applications, 2010
- Gene expression from polynomial dynamics in the 2-adic information space. Chaos, Solitons, and Fractals, 2009
- Wavelets and spectral analysis of ultrametric pseudodifferential operators. Sbornik: Mathematics, 2007
- K‐means clustering: A half‐century synthesis. British Journal of Mathematical and Statistical Psychology, 2006
- Biclustering algorithms for biological data analysis: A survey. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2003
- Wavelet theory as $p$-adic spectral analysis. Izvestiya Rossiiskoi Akademii Nauk. Seriya Matematicheskaya, 2002
- Kolmogorov-Sinai Entropy Rate versus Physical Entropy. Physical Review Letters, 1999
- Hierarchical trees can be perfectly scaled in one dimension. Journal of Classification, 1988
- Counting dendrograms: A survey. Discrete Applied Mathematics, 1984
- Techniques for Structuring Database Records. ACM Computing Surveys, 1983