An Asymptotic Statistical Theory of Polynomial Kernel Methods

1 August 2004

journal article
Published by MIT Press in Neural Computation

Vol. 16 (8), 1705-1719
https://doi.org/10.1162/089976604774201659

Abstract

The generalization properties of learning classifiers with a polynomial kernel function are examined. In kernel methods, input vectors are mapped into a high-dimensional feature space where the mapped vectors are linearly separated. It is well known that a linear dichotomy has an average generalization error, or learning curve, that is proportional to the dimension of the input space and inversely proportional to the number of given examples in the asymptotic limit. However, this result does not hold for kernel methods, since the feature vectors lie on a submanifold of the feature space, called the input surface. In this letter, we discuss how the asymptotic average generalization error depends on the relationship between the input surface and the true separating hyperplane in the feature space, where the essential dimension of the true separating polynomial, named the class, is important. We show upper bounds on this error in several cases and confirm them using computer simulations.
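The idea of an input surface can be made concrete with a small sketch. It is not taken from the article. It assumes a two-dimensional input, a degree-2 kernel of the common form (x·y + 1)^2, and one standard choice of explicit feature map; the names `poly_kernel`, `feature_map_deg2`, and `surface_residual` are hypothetical. The sketch checks that the kernel equals an inner product in a six-dimensional feature space, and that every mapped vector satisfies a fixed algebraic constraint, so the mapped inputs occupy only a two-dimensional surface inside that space.

```python
import math
import random


def poly_kernel(x, y, degree=2):
    """Polynomial kernel (x . y + 1)^degree; this form is an assumption."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + 1.0) ** degree


def feature_map_deg2(x):
    """Explicit feature map for the degree-2 kernel on 2-D inputs.

    Returns a 6-D vector z such that z(x) . z(y) == (x . y + 1)^2.
    """
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return [1.0, r2 * x1, r2 * x2, x1 * x1, r2 * x1 * x2, x2 * x2]


def surface_residual(z):
    """One constraint met by every mapped vector: z[4]^2 = 2 * z[3] * z[5].

    A generic 6-D vector violates it, so the image of the input space is a
    low-dimensional surface rather than the whole feature space.
    """
    return z[4] ** 2 - 2.0 * z[3] * z[5]


rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
y = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]

# Kernel evaluated directly, and as an inner product of mapped vectors.
kernel_value = poly_kernel(x, y)
feature_dot = sum(a * b for a, b in zip(feature_map_deg2(x), feature_map_deg2(y)))

print("kernel value:        ", kernel_value)
print("feature-space dot:   ", feature_dot)
print("constraint residual: ", surface_residual(feature_map_deg2(x)))
```

The two printed values agree up to rounding, and the residual is zero up to rounding for any input. This illustrates why counting the dimensions of the feature space alone can misstate the effective number of parameters for kernel classifiers.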
