Accelerating Learning of Neural Networks with Conjugate Gradients for Nuclear Power Plant Applications

Abstract
The method of conjugate gradients is used to expedite the learning process of feedforward multilayer artificial neural networks and to systematically update both the learning parameter and the momentum parameter at each training cycle. The mechanism for the occurrence of premature saturation of the network nodes observed with the backpropagation algorithm is described, suggestions are made to eliminate this undesirable phenomenon, and the reason this phenomenon is precluded in the method of conjugate gradients is presented. The proposed method is compared with the standard backpropagation algorithm in the training of neural networks to classify transient events in nuclear power plants simulated by the Midland Nuclear Power Plant Unit 2 simulator. The comparison results indicate that the rate of convergence of the proposed method is much greater than that of standard backpropagation, that it reduces both the number of training cycles and the CPU time, and that it is less sensitive to the choice of initial weights. The advantages of the method are more noticeable and important for problems where the network architecture consists of a large number of nodes, the training database is large, and a tight convergence criterion is desired.
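As a rough illustration of the idea summarized above, the following sketch trains a small feedforward network with a conjugate gradient update in which the line-search step plays the role of the learning parameter and the conjugacy coefficient plays the role of the momentum parameter, both recomputed at every training cycle. It is a minimal sketch under stated assumptions, not the formulation used in this work: the Polak-Ribiere coefficient, the backtracking (Armijo) line search, the single sigmoid hidden layer, the sum-of-squares error, and the names `unpack`, `loss_and_grad`, and `train_cg` are all illustrative choices.

```python
import numpy as np

def unpack(w, n_in, n_hid, n_out):
    """Split the flat weight vector into two layer matrices (bias in last row)."""
    k = (n_in + 1) * n_hid
    W1 = w[:k].reshape(n_in + 1, n_hid)
    W2 = w[k:].reshape(n_hid + 1, n_out)
    return W1, W2

def loss_and_grad(w, X, T, shape):
    """Sum-of-squares error and its gradient, computed by backpropagation."""
    W1, W2 = unpack(w, *shape)
    ones = np.ones((len(X), 1))
    Xb = np.hstack([X, ones])                  # inputs plus bias unit
    H = 1.0 / (1.0 + np.exp(-Xb @ W1))         # hidden sigmoid activations
    Hb = np.hstack([H, ones])                  # hidden outputs plus bias unit
    Y = 1.0 / (1.0 + np.exp(-Hb @ W2))         # output sigmoid activations
    E = Y - T
    loss = 0.5 * np.sum(E ** 2)
    dY = E * Y * (1.0 - Y)                     # output-layer deltas
    gW2 = Hb.T @ dY
    dH = (dY @ W2[:-1].T) * H * (1.0 - H)      # hidden-layer deltas
    gW1 = Xb.T @ dH
    return loss, np.concatenate([gW1.ravel(), gW2.ravel()])

def train_cg(X, T, n_hid=4, cycles=200, seed=0):
    """Conjugate gradient training; returns final weights and loss history."""
    shape = (X.shape[1], n_hid, T.shape[1])
    n_w = (shape[0] + 1) * n_hid + (n_hid + 1) * shape[2]
    w = np.random.default_rng(seed).uniform(-0.5, 0.5, n_w)
    loss, g = loss_and_grad(w, X, T, shape)
    d = -g                                     # first direction: steepest descent
    history = [loss]
    for _ in range(cycles):
        if g @ g < 1e-12:                      # gradient has vanished
            break
        slope = g @ d
        if slope >= 0.0:                       # not a descent direction: restart
            d, slope = -g, -(g @ g)
        alpha = 1.0                            # "learning parameter" for this cycle
        for _ in range(30):                    # backtracking (Armijo) line search
            new_loss, new_g = loss_and_grad(w + alpha * d, X, T, shape)
            if new_loss <= loss + 1e-4 * alpha * slope:
                break
            alpha *= 0.5
        else:
            break                              # no acceptable step found
        w = w + alpha * d
        # Polak-Ribiere coefficient, clipped at zero: the "momentum" term
        beta = max(0.0, new_g @ (new_g - g) / (g @ g))
        d = -new_g + beta * d
        loss, g = new_loss, new_g
        history.append(loss)
    return w, history
```

Calling `train_cg(X, T)` with a small input array `X` and target array `T` (for example the four XOR patterns) returns the trained weights and the error after each cycle. Because each step is accepted only if it lowers the error, the recorded history never increases, which is one reason no hand-tuned learning or momentum parameter is needed in this sketch.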