Probable networks and plausible predictions — a review of practical Bayesian methods for supervised neural networks
- 1 August 1995
- Journal article (review)
- Published by Informa UK Limited in Network: Computation in Neural Systems
- Vol. 6(3), pp. 469–505
- https://doi.org/10.1088/0954-898x/6/3/011
Abstract
Bayesian probability theory provides a unifying framework for data modelling. In this framework the overall aims are to find models that are well matched to the data, and to use these models to make optimal predictions. Neural network learning is interpreted as an inference of the most probable parameters for the model, given the training data. The search in model space (i.e., the space of architectures, noise models, preprocessings, regularizers and weight decay constants) can then also be treated as an inference problem, in which we infer the relative probability of alternative models, given the data. This review describes practical techniques based on Gaussian approximations for implementation of these powerful methods for controlling, comparing and using adaptive networks.
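The two operations the abstract describes, finding the most probable parameters under a weight-decay prior and scoring a model by its Gaussian-approximated evidence, can be illustrated in a few lines. The sketch below is not code from the paper: it assumes a linear-in-parameters toy model (a hypothetical polynomial basis on synthetic data), for which the Gaussian approximation to the evidence happens to be exact. For a multilayer network, the matrix `A` would instead be the Hessian of the regularized error evaluated at the most probable weights. All names (`Phi`, `alpha`, `beta`, `w_mp`) are illustrative assumptions.

```python
import numpy as np

# Toy data: noisy samples of a smooth function (illustrative, not from the paper)
rng = np.random.default_rng(0)
N = 30
x = np.linspace(-1.0, 1.0, N)
t = np.sin(np.pi * x) + 0.1 * rng.standard_normal(N)

# Linear-in-parameters model: polynomial basis, k weights
k = 6
Phi = np.vander(x, k, increasing=True)          # N x k design matrix

alpha, beta = 1.0, 100.0                        # weight-decay (prior) and noise precisions

# Most probable weights: minimise beta*E_D + alpha*E_W,
# with E_D = (1/2)||t - Phi w||^2 and E_W = (1/2)||w||^2
A = beta * Phi.T @ Phi + alpha * np.eye(k)      # Hessian of the negative log posterior
w_mp = beta * np.linalg.solve(A, Phi.T @ t)

# Gaussian (Laplace) log evidence; exact here because the model is linear in w
E_D = 0.5 * np.sum((t - Phi @ w_mp) ** 2)
E_W = 0.5 * w_mp @ w_mp
log_ev = (-beta * E_D - alpha * E_W
          + 0.5 * k * np.log(alpha) + 0.5 * N * np.log(beta)
          - 0.5 * np.linalg.slogdet(A)[1] - 0.5 * N * np.log(2.0 * np.pi))
print(f"log evidence = {log_ev:.2f}")

# One step of evidence-based re-estimation of the hyperparameters:
# gamma counts the well-determined parameters
gamma = k - alpha * np.trace(np.linalg.inv(A))
alpha_new = gamma / (2.0 * E_W)
beta_new = (N - gamma) / (2.0 * E_D)
print(f"gamma = {gamma:.2f}, alpha -> {alpha_new:.3f}, beta -> {beta_new:.1f}")
```

Iterating the last two updates to convergence is how a weight-decay constant like `alpha` would be set from the data rather than by hand, and comparing `log_ev` across candidate models (different bases, different `k`) is the model-comparison step the abstract refers to.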