Data Visualization With Multidimensional Scaling

1 June 2008

journal article
Published by Taylor & Francis Ltd in Journal of Computational and Graphical Statistics

Vol. 17 (2), 444-472
https://doi.org/10.1198/106186008x318440

Abstract

We discuss methodology for multidimensional scaling (MDS) and its implementation in two software systems, GGvis and XGvis. MDS is a visualization technique for proximity data, that is, data in the form of N × N dissimilarity matrices. MDS constructs maps (“configurations,” “embeddings”) in ℝ^k by interpreting the dissimilarities as distances. Two frequent sources of dissimilarities are high-dimensional data and graphs. When the dissimilarities are distances between high-dimensional objects, MDS acts as an (often nonlinear) dimension-reduction technique. When the dissimilarities are shortest-path distances in a graph, MDS acts as a graph layout technique. MDS has found recent attention in machine learning motivated by image databases (“Isomap”). MDS is also of interest in view of the popularity of “kernelizing” approaches inspired by Support Vector Machines (SVMs; “kernel PCA”). This article discusses the following general topics: (1) the stability and multiplicity of MDS solutions; (2) the analysis of structure within and between subsets of objects with missing value schemes in dissimilarity matrices; (3) gradient descent for optimizing general MDS loss functions (“Strain” and “Stress”); (4) a unification of classical (Strain-based) and distance (Stress-based) MDS. Particular topics include the following: (1) blending of automatic optimization with interactive displacement of configuration points to assist in the search for global optima; (2) forming groups of objects with interactive brushing to create patterned missing values in MDS loss functions; (3) optimizing MDS loss functions for large numbers of objects relative to a small set of anchor points (“external unfolding”); and (4) a non-metric version of classical MDS. We show applications to the mapping of computer usage data, to the dimension reduction of marketing segmentation data, to the layout of mathematical graphs and social networks, and finally to the spatial reconstruction of molecules.
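
As a rough illustration of the two loss families named in the abstract, the following Python sketch embeds an N × N dissimilarity matrix in ℝ^k in two ways: classical (Strain-based) MDS through double centering and an eigendecomposition, and plain gradient descent on a simplified, unweighted raw Stress. This is not the GGvis or XGvis implementation. The function names, the fixed step size, and the unnormalized Stress are assumptions made for this example; the article's loss functions are more general.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distances between all rows of the configuration X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def classical_mds(D, k=2):
    """Classical (Strain-based) MDS via double centering (illustrative helper)."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # inner products implied by D
    w, V = np.linalg.eigh(B)                  # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]             # indices of the k largest
    w_top = np.clip(w[idx], 0.0, None)        # guard against negative eigenvalues
    return V[:, idx] * np.sqrt(w_top)


def raw_stress(X, D):
    """Simplified raw Stress: sum over i < j of (D_ij - d_ij(X))^2."""
    d = pairwise_distances(X)
    iu = np.triu_indices(len(D), k=1)
    return ((D[iu] - d[iu]) ** 2).sum()


def stress_gradient_descent(D, X0, step=0.01, n_iter=200):
    """Fixed-step gradient descent on raw_stress (assumed step size, no line search)."""
    X = np.array(X0, dtype=float)
    for _ in range(n_iter):
        diff = X[:, None, :] - X[None, :, :]
        d = np.sqrt((diff ** 2).sum(axis=-1))
        d_safe = np.where(d > 0, d, 1.0)      # diff is zero wherever d is zero
        coef = (d - D) / d_safe
        grad = 2.0 * (coef[:, :, None] * diff).sum(axis=1)
        X -= step * grad
    return X


if __name__ == "__main__":
    # Hypothetical toy data: five points in the plane.
    pts = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 2.0]])
    D = pairwise_distances(pts)
    X_classical = classical_mds(D, k=2)
    rng = np.random.default_rng(0)
    X_start = pts + 0.3 * rng.standard_normal(pts.shape)
    X_descent = stress_gradient_descent(D, X_start)
    print("Stress of classical solution:", raw_stress(X_classical, D))
    print("Stress before and after descent:",
          raw_stress(X_start, D), raw_stress(X_descent, D))
```

When the dissimilarities are exact Euclidean distances between points in ℝ^k, the classical solution reproduces them up to rotation, reflection, and translation. Gradient descent, by contrast, depends on its starting configuration and can stop in a local minimum, which is the multiplicity of solutions that the abstract's interactive tools are meant to help explore.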
