Advanced Techniques in Neu-Net Pattern Recognition

Duy-Ky Nguyen, PhD

1. Introduction

In a problem of neu-net pattern recognition, the weights are adjusted to minimize the error between the target output and the actual output of the network. Let e and W be the error and the weights, respectively; by the Taylor series we have

e = e_0 + \left(\frac{\partial e}{\partial W}\right)^{T} dW + \cdots    (Eq 1)

let

dW = a\, D    (Eq 2)

then our concern is to find dW to minimize the error, i.e. to find the step a and the direction D.

In a conventional back-propagation neu-net, the steepest descent method is used

D = -\frac{\partial e}{\partial W}, \qquad dW = -a\, \frac{\partial e}{\partial W}    (Eq 3)

thus a is a constant learning rate, applied regardless of whether the error decreases or not.

In the rest of this note, the subscript 0 is used to indicate the previous value.
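
Substituting the direction of (Eq 3) into the first-order term of (Eq 1) gives a quick check that a small positive learning rate can only lower the error to first order:

e \approx e_0 + \left(\frac{\partial e}{\partial W}\right)^{T} dW = e_0 - a \left\| \frac{\partial e}{\partial W} \right\|^{2} \le e_0, \qquad a > 0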

In a neu-net using momentum, we have

dW = m\, dW_0 - (1 - m)\, a\, \frac{\partial e}{\partial W}    (Eq 4)

thus the previous update dW_0 is blended with the steepest-descent step; a step is kept only if the error decreases, and the accepted values are then saved as the references W_0, dW_0 and e_0.
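
Written as an update rule, (Eq 4) blends the previous step with a fresh steepest-descent step; the helper name and the (1 - m) weighting below are assumptions of this note, not the author's code:

  def momentum_step(grad, dW_prev, lr, m=0.9):
      """Eq 4: previous update weighted by m plus a steepest-descent step weighted by (1 - m)."""
      return m * dW_prev - (1.0 - m) * lr * grad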

In a neu-net using an adaptive learning rate, we have

a = \mathrm{Inc}\cdot a_0 \ \text{if}\ e < e_0, \qquad a = \mathrm{Dec}\cdot a_0 \ \text{otherwise}    (Eq 5)

thus the learning rate is increased to speed up the minimization when the error decreases, and reduced when it does not.
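
In code the adaptation is a one-line rule; the factor values below are those reported with the results in Section 3 (Inc = 1.05, Dec = 0.65), while the function and argument names are this note's assumptions:

  def adapt_lr(lr, e, e_prev, inc=1.05, dec=0.65):
      """Eq 5: grow the learning rate after an improving epoch, shrink it otherwise."""
      return lr * inc if e < e_prev else lr * dec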

In a neu-net using the conjugate-gradient method, the search direction combines the negative gradient with the previous direction

D = -g + \beta\, D_0, \qquad g = \frac{\partial e}{\partial W}    (Eq 6)

and the weights are updated along this direction with a step a found by a line search

dW = a\, D    (Eq 7)

where the conjugate coefficient is taken here in the Fletcher-Reeves form

\beta = \frac{g^{T} g}{g_0^{T} g_0}    (Eq 8)
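
Given the current and previous gradients and the previous direction, the new direction can be computed as below; the Fletcher-Reeves form of beta and the helper name are assumptions of this note:

  import numpy as np

  def cg_direction(g, g_prev, D_prev):
      """Eq 6 with the coefficient of Eq 8."""
      beta = np.sum(g * g) / np.sum(g_prev * g_prev)   # Eq 8
      return -g + beta * D_prev                        # Eq 6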

2. Algorithms

2.1. Conventional Back-Propagation Neu-Net (Steepest Descent Method)

  1. Choose the learning rate a and the error tolerance
    Choose W randomly
  2. Compute the error e and the gradient \partial e / \partial W
  3. Update the weights with the steepest-descent step of (Eq 3)
  4. If e is below the tolerance then Terminate
    Else Loop back to (2)
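
The loop above can be sketched in a few lines of Python. This is a minimal illustration under assumptions made here, not the author's original program: a toy one-layer sigmoid net on an OR-like mapping, with helper names (error_and_grad, train_sd) invented for this note; the fixed rate of 1 and the 50,000-epoch cap echo the Lr and Max_Epo values reported in Section 3.1.

  import numpy as np

  def error_and_grad(W, X, T):
      """Sum-of-squares error and its gradient for a toy one-layer sigmoid net."""
      Y = 1.0 / (1.0 + np.exp(-X @ W))          # network output
      E = Y - T                                  # output error
      return 0.5 * np.sum(E ** 2), X.T @ (E * Y * (1.0 - Y))

  def train_sd(X, T, lr=1.0, tol=1e-3, max_epochs=50000, seed=0):
      rng = np.random.default_rng(seed)
      W = rng.standard_normal((X.shape[1], T.shape[1]))   # step 1: random weights
      for epoch in range(max_epochs):
          e, g = error_and_grad(W, X, T)                   # step 2: error and gradient
          if e < tol:                                      # step 4: terminate
              break
          W -= lr * g                                      # step 3, Eq 3: dW = -a de/dW
      return W, e

  # toy data: OR-like mapping, last input column acts as a bias
  X = np.array([[0., 0., 1.], [0., 1., 1.], [1., 0., 1.], [1., 1., 1.]])
  T = np.array([[0.], [1.], [1.], [1.]])
  W, e = train_sd(X, T)
  print("final error:", e)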

2.2. Adaptive Learning and Momentum Method

Since these two techniques fit into the same training loop, they can be combined to increase the efficiency of the training process; a combined sketch in code follows the steps below.

  1. Choose the learning rate a, the momentum m, the factors Inc and Dec, and the error tolerance
    Choose W randomly
  2. Initialize the references W_0, dW_0 and e_0
  3. Update and Compute the step dW from (Eq 4), the weights W = W_0 + dW, and the error e
  4. If e < e_0 then increase the learning rate by (Eq 5) and save the references W_0, dW_0, e_0
    Else decrease the learning rate by (Eq 5) and restore W = W_0
  5. If e is below the tolerance then Terminate
    Else Loop back to (3)
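
A combined sketch of the adaptive-rate and momentum loop, under the same toy setup and invented names as the sketch in 2.1. The (1 - m) weighting in Eq 4, the restoring of W_0 after a rejected step, and the dropping of the momentum term at that point are this note's assumptions, not the author's code; Inc = 1.05, Dec = 0.65 and the 50,000-epoch cap follow the values reported in Section 3.2.

  import numpy as np

  def error_and_grad(W, X, T):
      """Sum-of-squares error and gradient for the same toy one-layer sigmoid net as in 2.1."""
      Y = 1.0 / (1.0 + np.exp(-X @ W))
      E = Y - T
      return 0.5 * np.sum(E ** 2), X.T @ (E * Y * (1.0 - Y))

  def train_adaptive_momentum(X, T, lr=0.5, m=0.9, inc=1.05, dec=0.65,
                              tol=1e-3, max_epochs=50000, seed=0):
      rng = np.random.default_rng(seed)
      W0 = rng.standard_normal((X.shape[1], T.shape[1]))   # step 1: random weights
      e0, g = error_and_grad(W0, X, T)                      # step 2: initialize references
      dW0 = np.zeros_like(W0)
      for epoch in range(max_epochs):
          dW = m * dW0 - (1.0 - m) * lr * g                 # Eq 4
          W = W0 + dW                                       # step 3: trial update
          e, g_trial = error_and_grad(W, X, T)
          if e < e0:                                        # step 4: accept and speed up (Eq 5)
              lr *= inc
              W0, dW0, e0, g = W, dW, e, g_trial
          else:                                             # reject and slow down (Eq 5)
              lr *= dec
              dW0 = np.zeros_like(W0)   # drop the momentum after a rejected step (assumption)
          if e0 < tol:                                      # step 5: terminate
              break
      return W0, e0

  # toy data: OR-like mapping, last input column acts as a bias
  X = np.array([[0., 0., 1.], [0., 1., 1.], [1., 0., 1.], [1., 1., 1.]])
  T = np.array([[0.], [1.], [1.], [1.]])
  W, e = train_adaptive_momentum(X, T)
  print("final error:", e)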

2.3. Conjugate-Gradient Method

  1. Choose the error tolerance
    Choose W randomly
  2. Initialize the previous gradient g_0 and the direction D_0 = -g_0
  3. Update and Compute the gradient g, the direction D from (Eq 6), and the weights from (Eq 7)
  4. Cubic Line Search: fit a cubic to the error along D to pick the step a, extending the step while the error keeps decreasing and shrinking it otherwise
  5. Update Previous Values W_0, g_0 and D_0
  6. If e is below the tolerance then Terminate
    Else Loop back to (3)
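
A sketch of the conjugate-gradient loop under the same toy setup as in 2.1. The step here comes from a simple backtracking search standing in for the cubic line search of step 4, and the Fletcher-Reeves beta of Eq 8 is restarted periodically; both simplifications are assumptions of this note, not the author's implementation. The 400-epoch cap follows the Max_Epo reported in Section 3.3.

  import numpy as np

  def error_and_grad(W, X, T):
      """Sum-of-squares error and gradient for the same toy one-layer sigmoid net as in 2.1."""
      Y = 1.0 / (1.0 + np.exp(-X @ W))
      E = Y - T
      return 0.5 * np.sum(E ** 2), X.T @ (E * Y * (1.0 - Y))

  def line_search(W, D, X, T, e0, a=1.0, shrink=0.5, tries=20):
      """Backtracking stand-in for the cubic line search: shrink a until the error drops."""
      for _ in range(tries):
          e, _ = error_and_grad(W + a * D, X, T)
          if e < e0:
              return a
          a *= shrink
      return 0.0                                            # no improving step found

  def train_cg(X, T, tol=1e-3, max_epochs=400, seed=0):
      rng = np.random.default_rng(seed)
      W = rng.standard_normal((X.shape[1], T.shape[1]))     # step 1: random weights
      e, g0 = error_and_grad(W, X, T)                       # step 2: initial gradient
      D = -g0                                               # first direction: steepest descent
      for epoch in range(max_epochs):
          a = line_search(W, D, X, T, e)                    # step 4 (simplified)
          W = W + a * D                                     # Eq 7: dW = a * D
          e, g = error_and_grad(W, X, T)
          if e < tol:                                       # step 6: terminate
              break
          beta = np.sum(g * g) / np.sum(g0 * g0)            # Eq 8 (Fletcher-Reeves form)
          if a == 0.0 or (epoch + 1) % W.size == 0:
              beta = 0.0                                    # periodic restart (assumption)
          D = -g + beta * D                                 # Eq 6: new conjugate direction
          g0 = g                                            # step 5: save previous values
      return W, e

  X = np.array([[0., 0., 1.], [0., 1., 1.], [1., 0., 1.], [1., 1., 1.]])
  T = np.array([[0.], [1.], [1.], [1.]])
  W, e = train_cg(X, T)
  print("final error:", e)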

3. Results

3.1. Conventional Back-Propagation (Steepest Descent Method)

Match_Y = 0.68, Match_N = 0.73, Lr = 1, Time = 18,455 secs, Max_Epo = 50,000

W1 =
  -25.6981   13.1625  -13.5674   -8.9737   -2.6699   -5.9629    4.7318  -18.6604  -22.5422
  -19.2870   -9.1153   12.7793  -15.1942    1.8693   -9.3501    4.1438    9.7182    9.7889
   26.4317  -10.0165  -24.8998    0.5489   26.4171   23.7182   14.1325    5.6102   -5.4414
  -10.0111    7.9857    6.2426   -1.7086    0.1737  -13.3965   -6.6569    4.6359    1.2563

W2 =
   10.2649  -10.2672
    8.7073   -8.7094
    9.7530   -9.7555
   -8.0035    8.0054
   20.7595  -20.7646
   14.4361  -14.4400
   -9.5297    9.5319
  -17.1067   17.1109
   -9.5158    9.5181
   -2.4113    2.4113

3.2. Adaptive Learning and Momentum Method

Match_Y = 0.95, Match_N = 0.91, Inc=1.05, Dec=0.65, Rng=1.05, Time=19,180 secs, Max_Epo=50,000

W1 =
   30.1517    1.4745   38.5317    2.5479   29.6417   20.5364   31.0879   18.8949    0.3245
  -13.0759   -9.6609   16.3852  -19.7405  -36.2840  -13.4237   20.3825    6.4666   27.0246
   13.5611   27.4320  -32.1468  -18.1448   43.8710   31.9648   -1.0889   -6.9690   16.9912
  -12.3050   -7.7939   10.3239    5.0209  -19.4327   -1.3463    6.4137    8.6233   -1.0839

W2 =
   14.9808  -14.9792
  -35.7706   35.7662
  -25.6525   25.6493
   21.9180  -21.9153
   20.5466  -20.5440
   31.6891  -31.6852
   11.7592  -11.7578
  -16.6265   16.6245
   26.2006  -26.1973
    9.0423   -9.0414

3.3. Conjugate-Gradient Method

Match_Y = 0.95, Match_N = 0.90, Time = 1202 secs, Max_Epo = 400

W1 =
  -16.2951  129.0563  -67.7482    5.0991  514.4826  185.5577 -262.4099 -138.5543  147.9772
   12.6406 -225.8085  -22.5818   -5.3014 -253.1423 -167.9039  137.2420  115.1892 -151.4484
  -13.0009  320.9614 -239.0545   -1.3889   64.4047   91.1890 -167.4286  -47.8050  -10.3508
    4.6095 -146.1623   12.7232    0.9706 -134.7563   -7.4632   61.5902   82.9180 -154.1382

W2 =
  249.9770 -283.6238
  188.2003 -209.7258
 -228.0709  264.5136
  409.4496 -458.0132
  311.6662 -369.5323
 -141.4773  157.0332