Novel Dimension Reduction Techniques for High-Dimensional Data Using Information Complexity

Abstract

This tutorial introduces and develops two computationally feasible, intelligent feature extraction techniques for high-dimensional data, where both the statistical and the combinatorial problems can be daunting. The first part uses a three-way hybrid: probabilistic principal component analysis (PPCA) reduces the dimensionality of the dependent variables; multivariate regression (MVR) models that account for misspecification of the distributional assumption determine a predictive operating model for automobile glass composition; and a genetic algorithm (GA) serves as the optimizer, with the misspecification-resistant form of Bozdogan’s information measure of complexity (ICOMP) as the fitness function. The second part is devoted to dimension reduction via a novel adaptive elastic net (AEN) regression model. We use the AEN model, with the Japanese stock index TOPIX as the response, to reduce the dimension of the data and build the best predictive model in a “large p, small n” setting. Both real-life examples of wide data sets show substantial dimension reduction, which demonstrates the versatility and utility of the two proposed statistical data modeling techniques.
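To make the second technique concrete, the following is a minimal sketch of a two-stage adaptive elastic net fitted by coordinate descent. It is an illustration under stated assumptions, not the tutorial's implementation: the data are synthetic (not TOPIX), the penalty values `lam1` and `lam2`, the exponent `gamma`, and all function names are hypothetical, and the rescaling factor used in some formulations of the adaptive elastic net is omitted. A pilot elastic net fit supplies the weights, so predictors with small pilot coefficients are penalized more heavily in the second stage.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used by L1-penalized coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def weighted_elastic_net(X, y, lam1, lam2, weights, n_iter=200):
    """Minimize (1/(2n))||y - Xb||^2 + lam1 * sum_j w_j|b_j| + (lam2/2)||b||^2
    by cyclic coordinate descent. X is a list of rows; no intercept is fitted."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, with beta starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the partial residual (column j removed)
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam1 * weights[j]) / (col_sq[j] + lam2)
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def adaptive_elastic_net(X, y, lam1, lam2, gamma=1.0, eps=1e-8):
    """Stage 1: plain elastic net (unit weights). Stage 2: refit with
    weights 1 / (|pilot_j| + eps)^gamma, an assumed weighting scheme."""
    p = len(X[0])
    pilot = weighted_elastic_net(X, y, lam1, lam2, [1.0] * p)
    weights = [1.0 / (abs(b) + eps) ** gamma for b in pilot]
    return weighted_elastic_net(X, y, lam1, lam2, weights)

# Synthetic "large p, small n" data: 20 observations, 40 predictors,
# of which only the first three influence the response (an assumption).
random.seed(0)
n, p = 20, 40
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [3.0 * row[0] - 2.0 * row[1] + 1.5 * row[2] + random.gauss(0.0, 0.1) for row in X]

lam1, lam2 = 0.1, 0.01  # hypothetical penalty values, not tuned
pilot = weighted_elastic_net(X, y, lam1, lam2, [1.0] * p)
beta_aen = adaptive_elastic_net(X, y, lam1, lam2)
selected = [j for j, b in enumerate(beta_aen) if b != 0.0]
print("predictors kept by the pilot fit:", sum(1 for b in pilot if b != 0.0))
print("predictors kept by the adaptive fit:", len(selected))
```

In practice the penalties would be chosen by a model selection criterion or cross-validation rather than fixed by hand; the fixed values here only keep the sketch short.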
