Novel Dimension Reduction Techniques for High-Dimensional Data Using Information Complexity
- 1 October 2016
- book chapter
- Published by Institute for Operations Research and the Management Sciences (INFORMS)
- p. 140-170
- https://doi.org/10.1287/educ.2016.0154
Abstract
This tutorial introduces and develops two computationally feasible intelligent feature extraction techniques that address potentially daunting statistical and combinatorial problems. The first part of the tutorial employs a three-way hybrid: probabilistic principal component analysis (PPCA) to reduce the dimensionality of the dependent variables; multivariate regression (MVR) models that account for misspecification of the distributional assumption, used to determine a predictive operating model for automobile glass composition; and the genetic algorithm (GA) as the optimizer, with the misspecification-resistant form of Bozdogan’s information measure of complexity (ICOMP) as the fitness function. The second part of the tutorial is devoted to dimension reduction via a novel adaptive elastic net (AEN) regression model. We use the AEN model, with the Japanese stock index TOPIX as the response, to reduce the dimension and build a best predictive model in a “large p, small n” setting. Our results show remarkable dimension reduction in both of these real-life wide data sets, demonstrating the versatility and utility of the two proposed statistical data modeling techniques.
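The abstract names ICOMP as the GA fitness function but does not state its formula. As a minimal sketch, assuming the commonly cited first-order maximal entropic complexity C1(Σ) = (s/2) ln(tr(Σ)/s) − (1/2) ln|Σ| and the basic form ICOMP = −2 log L + 2 C1 of the estimated covariance of the parameter estimates, the score could be computed as below. The function names `c1_complexity` and `icomp` are hypothetical, the eigenvalues are assumed to be supplied by the caller, and the misspecification-resistant covariance used in the chapter is not reconstructed here.

```python
import math


def c1_complexity(eigenvalues):
    """C1 complexity of a covariance matrix, given its positive eigenvalues.

    Uses the identity C1 = (s/2) * (ln(arithmetic mean) - ln(geometric mean)),
    which equals (s/2) ln(tr/s) - (1/2) ln(det) for a full-rank matrix.
    """
    eigs = list(eigenvalues)
    if not eigs or any(ev <= 0 for ev in eigs):
        raise ValueError("eigenvalues must be non-empty and strictly positive")
    s = len(eigs)
    arithmetic_mean = sum(eigs) / s
    mean_log = sum(math.log(ev) for ev in eigs) / s  # ln of the geometric mean
    return 0.5 * s * (math.log(arithmetic_mean) - mean_log)


def icomp(log_likelihood, cov_eigenvalues):
    """Assumed basic ICOMP form: lack of fit plus twice the C1 complexity."""
    return -2.0 * log_likelihood + 2.0 * c1_complexity(cov_eigenvalues)


if __name__ == "__main__":
    # Equal eigenvalues give zero complexity, so only the fit term remains.
    print(icomp(-10.0, [1.0, 1.0]))
    # Unequal eigenvalues add a positive complexity penalty.
    print(icomp(-10.0, [1.0, 4.0]))
```

In a GA-driven subset search, a lower score would mark a fitter candidate model; the penalty grows as the eigenvalues of the covariance become more unequal.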