Stochastic proximity embedding

9 June 2003

journal article
Published by Wiley in Journal of Computational Chemistry

Vol. 24 (10), 1215-1221
https://doi.org/10.1002/jcc.10234

Abstract

We introduce stochastic proximity embedding (SPE), a novel self‐organizing algorithm for producing meaningful underlying dimensions from proximity data. SPE attempts to generate low‐dimensional Euclidean embeddings that best preserve the similarities between a set of related observations. The method starts with an initial configuration, and iteratively refines it by repeatedly selecting pairs of objects at random, and adjusting their coordinates so that their distances on the map match more closely their respective proximities. The magnitude of these adjustments is controlled by a learning rate parameter, which decreases during the course of the simulation to avoid oscillatory behavior. Unlike classical multidimensional scaling (MDS) and nonlinear mapping (NLM), SPE scales linearly with respect to sample size, and can be applied to very large data sets that are intractable by conventional embedding procedures. The method is programmatically simple, robust, and convergent, and can be applied to a wide range of scientific problems involving exploratory data analysis and visualization. © 2003 Wiley Periodicals, Inc. J Comput Chem 24: 1215–1221, 2003
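The procedure outlined in the abstract (random starting configuration, repeated random pair selection, a correction that pulls the map distance toward the target proximity, and a decaying learning rate) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the update formula, so the symmetric correction along the line joining the two points is an assumption, as are the function name `spe_embed`, the linear learning-rate schedule, and all parameter names and defaults.

```python
import math
import random


def spe_embed(proximity, n_points, dim=2, n_cycles=100, steps_per_cycle=None,
              lr_start=1.0, lr_end=0.01, eps=1e-10, seed=0):
    """Sketch of a stochastic pairwise-refinement embedding.

    proximity(i, j) returns the target distance between objects i and j.
    All names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    if steps_per_cycle is None:
        # Assumed choice: pair updates per cycle proportional to sample size.
        steps_per_cycle = 10 * n_points
    # Initial configuration: random coordinates in the unit hypercube.
    coords = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    for cycle in range(n_cycles):
        # Linearly decrease the learning rate from lr_start to lr_end.
        frac = cycle / max(n_cycles - 1, 1)
        lr = lr_start + (lr_end - lr_start) * frac
        for _ in range(steps_per_cycle):
            # Select a pair of distinct objects at random.
            i, j = rng.sample(range(n_points), 2)
            xi, xj = coords[i], coords[j]
            d = math.dist(xi, xj)   # current distance on the map
            r = proximity(i, j)     # target proximity
            # Move both points along their connecting line so that the
            # map distance d shifts toward r (assumed update form).
            scale = lr * 0.5 * (r - d) / (d + eps)
            for k in range(dim):
                delta = scale * (xi[k] - xj[k])
                xi[k] += delta
                xj[k] -= delta
    return coords


if __name__ == "__main__":
    # Hypothetical usage: four objects whose proximities are the
    # pairwise distances between the corners of a unit square.
    square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    embedding = spe_embed(lambda i, j: math.dist(square[i], square[j]),
                          n_points=len(square))
    for point in embedding:
        print(point)
```

Each pair update in this sketch touches only two points and costs time proportional to the embedding dimension, so the total work is set by the number of cycles and steps rather than by the full set of pairwise proximities.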

This publication has 16 references indexed in Scilit:

Combinatorial informatics in the post-genomics era. Nature Reviews Drug Discovery, 2002.
Multidimensional scaling of combinatorial libraries without explicit enumeration. Journal of Computational Chemistry, 2001.
Nonlinear Mapping Networks. Journal of Chemical Information and Computer Sciences, 2000.
Three-dimensional alpha shapes. ACM Transactions on Graphics, 1994.
The Molecular Connectivity Chi Indexes and Kappa Shape Indexes in Structure‐Property Modeling. Reviews in Computational Chemistry, 1991.
Learning representations by back-propagating errors. Nature, 1986.
Improving the efficiency of Sammon's nonlinear mapping by using clustering archetypes. Electronics Letters, 1978.
A Nonlinear Mapping for Data Structure Analysis. IEEE Transactions on Computers, 1969.
Nonmetric multidimensional scaling: A numerical method. Psychometrika, 1964.
A Stochastic Approximation Method. The Annals of Mathematical Statistics, 1951.

Cited by 128 articles