Advanced Techniques in Neu-Net Pattern Recognition
Duy-Ky Nguyen, PhD
1. Introduction
In a problem of neu-net pattern recognition, the weights are adjusted to minimize the error between the target output and the actual output of the network. Let e and W be the error and the weights, respectively; by the Taylor series we have
e(W + dW) \approx e(W) + g^T dW, \qquad g \equiv \partial e / \partial W    (Eq 1)
let
dW = a\, D    (Eq 2)
then our concern is to find dW to minimize the error, i.e., to find the step a and the direction D.
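Substituting (Eq 2) into (Eq 1), the first-order change in the error is

e(W + a D) - e(W) \approx a\, g^T D

so, for a fixed step a > 0, the decrease is greatest when D points opposite the gradient g; the methods below differ only in how they choose a and D.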
In a conventional back-propagation neu-net, the steepest descent method is used
dW = -a\, g, \qquad \text{i.e. } D = -g    (Eq 3)
thus a is a constant learning rate, applied regardless of whether the error decreases or not.
In the rest of this note, the subscript 0 is used to indicate the previous value.
In a neu-net using momentum, we have
dW = \mu\, dW_0 - (1 - \mu)\, a\, g    (Eq 4)
where \mu is the momentum constant; thus the steepest-descent direction is used only if the error decreases, and the accepted values are saved as the references e_0 and dW_0.
In a neu-net using an adaptive learning rate, we have
a = \begin{cases} Inc \cdot a_0 & \text{if } e < e_0 \\ a_0 & \text{if } e_0 \le e \le Rng \cdot e_0 \\ Dec \cdot a_0 & \text{if } e > Rng \cdot e_0 \end{cases}    (Eq 5)
thus the learning rate is increased to speed up the minimization when the error decreases, and reduced when the error grows beyond the ratio Rng.
In a neu-net using the conjugate-gradient method,
D = -g + \beta\, D_0    (Eq 6)
and
a = \arg\min_a\, e(W + a\, D)    (Eq 7)
where
\beta = (g^T g) / (g_0^T g_0)    (Eq 8)
2. Algorithms
2.1. Conventional Back-Propagation Neu-Net (Steepest Descent Method)
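A minimal Python/NumPy sketch of this update; err_fn and grad_fn stand in for the network's forward pass and back-propagated gradient, and all names here are illustrative assumptions rather than the original listing:

    import numpy as np

    def steepest_descent(grad_fn, W, lr=1.0, max_epo=50000):
        # Eq 3: dW = -a*g, with a constant learning rate a (Lr in
        # Section 3), applied whether or not the error decreased.
        for _ in range(max_epo):
            W = W - lr * grad_fn(W)
        return W

    # Toy usage: minimize e(W) = ||W||^2, whose gradient is 2W.
    W_min = steepest_descent(lambda W: 2.0 * W, np.array([1.0, -2.0]),
                             lr=0.1, max_epo=100)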
2.2. Adaptive Learning and Momentum Method
Since these two techniques fit into the same update loop, they can be combined to increase the efficiency of the training process, as sketched below.
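A sketch of the combined rule, assuming the accept/reject behaviour suggested by the Inc, Dec and Rng parameters reported in Section 3 (grow the rate after an improvement; shrink it and restart the momentum after a rejected step); err_fn and grad_fn are the same illustrative stand-ins as above:

    import numpy as np

    def adaptive_momentum(err_fn, grad_fn, W, lr=0.01, mu=0.9,
                          inc=1.05, dec=0.65, rng=1.05, max_epo=50000):
        dW = np.zeros_like(W)
        e0 = err_fn(W)
        for _ in range(max_epo):
            dW_new = mu * dW - (1.0 - mu) * lr * grad_fn(W)  # Eq 4
            W_new = W + dW_new
            e = err_fn(W_new)
            if e > rng * e0:
                lr *= dec                 # Eq 5: step rejected, slow down
                dW = np.zeros_like(W)     # restart the momentum term
            else:
                if e < e0:
                    lr *= inc             # Eq 5: improvement, speed up
                W, dW, e0 = W_new, dW_new, e   # save the references
        return W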
2.3. Conjugate-Gradient Method
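A sketch of Eqs 6-8; the Fletcher-Reeves formula for \beta and the simple backtracking line search for Eq 7 are assumptions, and any one-dimensional minimizer of e(W + aD) would serve:

    import numpy as np

    def conjugate_gradient(err_fn, grad_fn, W, max_epo=400):
        g = grad_fn(W)
        D = -g
        for _ in range(max_epo):
            # Eq 7: choose the step a by approximately minimizing
            # e(W + a*D) with a backtracking search.
            a, e0 = 1.0, err_fn(W)
            while err_fn(W + a * D) >= e0 and a > 1e-12:
                a *= 0.5
            W = W + a * D
            g_new = grad_fn(W)
            if g_new @ g_new < 1e-20:
                break                          # gradient vanished: converged
            beta = (g_new @ g_new) / (g @ g)   # Eq 8 (Fletcher-Reeves)
            D = -g_new + beta * D              # Eq 6
            g = g_new
        return W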
3. Results
3.1. Conventional Back-Propagation (Steepest Descent Method)
Match_Y = 0.68, Match_N = 0.73, Lr = 1, Time = 18,455 secs, Max_Epo = 50,000
W1 = -25.6981 13.1625 -13.5674 -8.9737 -2.6699 -5.9629 4.7318 -18.6604 -22.5422 -19.2870 -9.1153 12.7793 -15.1942 1.8693 -9.3501 4.1438 9.7182 9.7889 26.4317 -10.0165 -24.8998 0.5489 26.4171 23.7182 14.1325 5.6102 -5.4414 -10.0111 7.9857 6.2426 -1.7086 0.1737 -13.3965 -6.6569 4.6359 1.2563
W2 = 10.2649 -10.2672 8.7073 -8.7094 9.7530 -9.7555 -8.0035 8.0054 20.7595 -20.7646 14.4361 -14.4400 -9.5297 9.5319 -17.1067 17.1109 -9.5158 9.5181 -2.4113 2.4113
3.2. Adaptive Learning and Momentum Method
Match_Y = 0.95, Match_N = 0.91, Inc = 1.05, Dec = 0.65, Rng = 1.05, Time = 19,180 secs, Max_Epo = 50,000
W1 = 30.1517 1.4745 38.5317 2.5479 29.6417 20.5364 31.0879 18.8949 0.3245 -13.0759 -9.6609 16.3852 -19.7405 -36.2840 -13.4237 20.3825 6.4666 27.0246 13.5611 27.4320 -32.1468 -18.1448 43.8710 31.9648 -1.0889 -6.9690 16.9912 -12.3050 -7.7939 10.3239 5.0209 -19.4327 -1.3463 6.4137 8.6233 -1.0839
W2 = 14.9808 -14.9792 -35.7706 35.7662 -25.6525 25.6493 21.9180 -21.9153 20.5466 -20.5440 31.6891 -31.6852 11.7592 -11.7578 -16.6265 16.6245 26.2006 -26.1973 9.0423 -9.0414
3.3. Conjugate-Gradient Method
Match_Y = 0.95, Match_N = 0.90, Time = 1202 secs, Max_Epo = 400
W1 = -16.2951 129.0563 -67.7482 5.0991 514.4826 185.5577 -262.4099 -138.5543 147.9772 12.6406 -225.8085 -22.5818 -5.3014 -253.1423 -167.9039 137.2420 115.1892 -151.4484 -13.0009 320.9614 -239.0545 -1.3889 64.4047 91.1890 -167.4286 -47.8050 -10.3508 4.6095 -146.1623 12.7232 0.9706 -134.7563 -7.4632 61.5902 82.9180 -154.1382
W2 = 249.9770 -283.6238 188.2003 -209.7258 -228.0709 264.5136 409.4496 -458.0132 311.6662 -369.5323 -141.4773 157.0332