Maximization of Mutual Information for Supervised Linear Feature Extraction

4 September 2007

journal article
research article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Neural Networks

Vol. 18 (5), 1433-1441
https://doi.org/10.1109/tnn.2007.891630

Abstract

In this paper, we present a novel scheme for linear feature extraction in classification. The method is based on the maximization of the mutual information (MI) between the features extracted and the classes. The sum of the MI corresponding to each of the features is taken as a heuristic that approximates the MI of the whole output vector. Then, a component-by-component gradient-ascent method is proposed for the maximization of the MI, similar to the gradient-based entropy optimization used in independent component analysis (ICA). The simulation results show that not only is the method competitive when compared to existing supervised feature extraction methods in all cases studied, but it also remarkably outperforms them when the data are characterized by strongly nonlinear boundaries between classes.
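
The heuristic objective described in the abstract, the sum over extracted features of the MI between each feature and the class label, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `mi_feature_class` and `sum_of_marginal_mi`, the histogram (plug-in) MI estimator, the bin count, and the synthetic two-class data are not taken from the paper, which may use a different estimator. The gradient-ascent step is not shown; the sketch only evaluates the objective for two candidate projection directions.

```python
import numpy as np

def mi_feature_class(y, labels, bins=16):
    """Plug-in (histogram) estimate of I(y; C) in nats.

    y is a 1-D array of scalar feature values, labels a 1-D array of
    discrete class labels. The estimator is an illustrative assumption.
    """
    edges = np.histogram_bin_edges(y, bins=bins)
    classes = np.unique(labels)
    # Joint counts: one row per class, one column per histogram bin.
    joint = np.array(
        [np.histogram(y[labels == c], bins=edges)[0] for c in classes],
        dtype=float,
    )
    joint /= joint.sum()
    p_c = joint.sum(axis=1, keepdims=True)   # class marginal, shape (k, 1)
    p_y = joint.sum(axis=0, keepdims=True)   # feature marginal, shape (1, bins)
    mask = joint > 0                         # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log(joint[mask] / (p_c @ p_y)[mask])))

def sum_of_marginal_mi(W, X, labels, bins=16):
    """Heuristic objective: sum over rows w_i of W of I(w_i^T x; C)."""
    Y = X @ W.T                              # one extracted feature per column
    return sum(mi_feature_class(Y[:, i], labels, bins) for i in range(W.shape[0]))

# Synthetic data (hypothetical): classes differ only along the first axis.
rng = np.random.default_rng(0)
n = 2000
labels = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 2))
X[:, 0] += 3.0 * labels

mi_informative = mi_feature_class(X @ np.array([1.0, 0.0]), labels)
mi_noise = mi_feature_class(X @ np.array([0.0, 1.0]), labels)
print(f"I(y; C) along axis 0: {mi_informative:.3f} nats")
print(f"I(y; C) along axis 1: {mi_noise:.3f} nats")
```

In this toy setting the projection onto the first axis carries most of the class information, so its estimated MI is much larger than that of the second axis. A gradient-based method of the kind the abstract describes would adjust each row of `W` to increase its own term in the sum.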

Cited by 66 articles