Stochastic proximity embedding

Abstract
We introduce stochastic proximity embedding (SPE), a novel self‐organizing algorithm for producing meaningful underlying dimensions from proximity data. SPE attempts to generate low‐dimensional Euclidean embeddings that best preserve the similarities between a set of related observations. The method starts with an initial configuration, and iteratively refines it by repeatedly selecting pairs of objects at random, and adjusting their coordinates so that their distances on the map match more closely their respective proximities. The magnitude of these adjustments is controlled by a learning rate parameter, which decreases during the course of the simulation to avoid oscillatory behavior. Unlike classical multidimensional scaling (MDS) and nonlinear mapping (NLM), SPE scales linearly with respect to sample size, and can be applied to very large data sets that are intractable by conventional embedding procedures. The method is programmatically simple, robust, and convergent, and can be applied to a wide range of scientific problems involving exploratory data analysis and visualization. © 2003 Wiley Periodicals, Inc. J Comput Chem 24: 1215–1221, 2003
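
The refinement loop described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the update rule, so the sketch assumes that both points of a sampled pair are moved along the line joining them, in proportion to the difference between their target proximity and their current map distance. The function name `spe_embed`, the linear learning-rate schedule, and all parameter names and default values are hypothetical choices made for this example.

```python
import math
import random


def spe_embed(proximity, n_points, dim=2, n_cycles=100, steps_per_cycle=None,
              lr_start=1.0, lr_end=0.01, eps=1e-10, seed=0):
    """Sketch of stochastic pairwise refinement in the spirit of SPE.

    proximity(i, j) must return the target distance between objects i and j.
    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    if steps_per_cycle is None:
        # Assumption: the number of pair updates per cycle grows linearly with n.
        steps_per_cycle = 10 * n_points

    # Initial configuration: random points in the unit hypercube.
    coords = [[rng.random() for _ in range(dim)] for _ in range(n_points)]

    lr = lr_start
    lr_step = (lr_start - lr_end) / max(n_cycles - 1, 1)

    for _ in range(n_cycles):
        for _ in range(steps_per_cycle):
            # Select a pair of distinct objects at random.
            i, j = rng.sample(range(n_points), 2)
            xi, xj = coords[i], coords[j]
            d = math.dist(xi, xj)   # current distance on the map
            r = proximity(i, j)     # target proximity
            # Assumed update rule: each point absorbs half of the scaled
            # discrepancy, moving apart if d < r and together if d > r.
            scale = lr * 0.5 * (r - d) / (d + eps)
            for k in range(dim):
                delta = scale * (xi[k] - xj[k])
                xi[k] += delta
                xj[k] -= delta
        # Decrease the learning rate after each cycle to damp oscillations.
        lr -= lr_step

    return coords
```

Each pair update touches only two points, so the cost of one cycle in this sketch depends on the number of sampled pairs rather than on all n(n-1)/2 pairwise distances. A small usage example would pass a function returning the side lengths of a 3-4-5 triangle as `proximity` and call `spe_embed(proximity, 3)`.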
