Rademacher penalties and structural risk minimization
- 1 July 2001
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Information Theory
- Vol. 47 (5), 1902-1914
- https://doi.org/10.1109/18.930926
Abstract
We suggest a penalty function to be used in various problems of structural risk minimization. This penalty is data dependent and is based on the sup-norm of the so-called Rademacher process indexed by the underlying class of functions (sets). The standard complexity penalties, used in learning problems and based on the VC-dimensions of the classes, are conservative upper bounds (in a probabilistic sense, uniformly over the set of all underlying distributions) for the penalty we suggest. Thus, for a particular distribution of training examples, one can expect better performance of learning algorithms with the data-driven Rademacher penalties. We obtain oracle inequalities for the theoretical risk of estimators, obtained by structural minimization of the empirical risk with Rademacher penalties. The inequalities imply some form of optimality of the empirical risk minimizers. We also suggest an iterative approach to structural risk minimization with Rademacher penalties, in which the hierarchy of classes is not given in advance, but is determined in the data-driven iterative process of risk minimization. We prove probabilistic oracle inequalities for the theoretical risk of the estimators based on this approach as well.
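The penalty described in the abstract is the sup-norm of the Rademacher process: for a class of functions evaluated on the training sample, one draws i.i.d. random signs and takes the supremum over the class of the signed empirical average. The following is a minimal Monte Carlo sketch of that quantity for a finite class, not code from the paper; the function name, the matrix-of-losses representation of the class, and the number of sign draws are illustrative assumptions.

```python
import numpy as np

def rademacher_penalty(losses, n_rounds=200, rng=None):
    """Monte Carlo estimate of the empirical Rademacher penalty.

    losses: (m, n) array; row j holds the values of the j-th function
    in a finite class F on the n training examples.
    Returns the average over random sign vectors eps of
        sup_{f in F} (1/n) * sum_i eps_i * f(x_i).
    """
    rng = np.random.default_rng(rng)
    m, n = losses.shape
    total = 0.0
    for _ in range(n_rounds):
        # i.i.d. Rademacher signs, +1 or -1 with probability 1/2 each
        eps = rng.choice([-1.0, 1.0], size=n)
        # supremum over the (finite) class of the signed empirical mean
        total += np.max(losses @ eps) / n
    return total / n_rounds
```

In a structural-risk-minimization loop over a hierarchy of classes, one would select the class (and its empirical risk minimizer) that minimizes empirical risk plus this data-driven penalty, rather than a distribution-free VC-dimension bound.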
This publication has 27 references indexed in Scilit:
- Microchoice bounds and self bounding learning algorithms. Association for Computing Machinery (ACM), 1999
- Risk bounds for model selection via penalization. Probability Theory and Related Fields, 1999
- Self bounding learning algorithms. Association for Computing Machinery (ACM), 1998
- From Model Selection to Adaptive Estimation. Springer Science and Business Media LLC, 1997
- A new look at independence. The Annals of Probability, 1996
- Concept learning using complexity regularization. IEEE Transactions on Information Theory, 1996
- On the method of bounded differences. Cambridge University Press (CUP), 1989
- Martingale Inequalities and NP-Complete Problems. Mathematics of Operations Research, 1987
- Exponential Bounds for Large Deviations. Theory of Probability and Its Applications, 1974
- Weighted sums of certain dependent random variables. Tohoku Mathematical Journal, 1967