A 2-layer neural network has been added to n42. This network is a simple neural network that can be trained with gradient descent optimization. It is the same algorithm as the one used by the denoising autoencoder in n42, so the implementation itself was not difficult. The code is shown below.
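
For reference, the updates that train() performs can be written out as follows, where t is the label matrix, z the network output, y the hidden activation, lr the learning rate, and .* an element-wise product:

    delta_out    = t - z                                // lH2 in the code
    delta_hidden = (delta_out * W2^T) .* y .* (1 - y)   // lH1 in the code
    W1    <- W1    + lr * x^T * delta_hidden
    W2    <- W2    + lr * y^T * delta_out
    vBias <- vBias + lr * mean(delta_out)
    hBias <- hBias + lr * mean(delta_hidden)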

/**
 *   Training weight parameters with supervised learning
 *
 *   @method train
 *   @param  lr {float}  learning rate
 *   @param  input {Matrix} input data (option)
 *   @param  label {Matrix} label data (option)
 */
NN.prototype.train = function(lr, input, label) {
    var self = this;
    self.x     = (input != undefined)? input : self.input;
    self.label = (label != undefined)? label : self.label;
    
    var x = self.x;
	
	// Get hidden layer value
    var y = self.getHiddenValues(x);

	// The output of this network
    var z = self.getOutput(y);

    // The error at the output layer
    var lH2 = self.label.subtract(z);

    // Backpropagate the error to each hidden layer unit (using the sigmoid derivative y * (1 - y))
    var sigma = lH2.x(self.W2.transpose());
    var lH1 = [];
    for (var i = 0; i < sigma.rows(); i++) {
        lH1.push([]);
        for (var j = 0; j < sigma.cols(); j++) {
            lH1[i].push(sigma.e(i+1, j+1) * y.e(i+1, j+1) * (1 - y.e(i+1, j+1)));
        }
    }

    // Convert the nested array into a Sylvester matrix
    lH1 = $M(lH1);

    // lW1 is the gradient for W1, the weight matrix from the input layer to the hidden layer
    var lW1 = x.transpose().x(lH1);

    // lW2 is the gradient for W2, the weight matrix from the hidden layer to the output layer
    var lW2 = y.transpose().x(lH2);

    // Apply the gradients to the weight matrices, scaled by the learning rate
    self.W1 = self.W1.add(lW1.x(lr));
    self.W2 = self.W2.add(lW2.x(lr));

    // vBias serves as the output layer bias here (it is updated with the mean output error)
    self.vBias = self.vBias.add(utils.mean(lH2, 0).x(lr));

	// hBias is the hidden layer bias parameters
    self.hBias = self.hBias.add(utils.mean(lH1, 0).x(lr));
};
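
The train() method relies on two helpers, getHiddenValues and getOutput, that are not shown above. As a rough sketch of what they might look like with sigmoid units (the addBias helper and the exact shapes are my assumptions, not necessarily the actual n42 code):

function sigmoid(v) { return 1.0 / (1.0 + Math.exp(-v)); }

// Hypothetical helper: add a 1 x n bias matrix to every row of m
function addBias(m, bias) {
    var rows = [];
    for (var i = 1; i <= m.rows(); i++) {
        var row = [];
        for (var j = 1; j <= m.cols(); j++) {
            row.push(m.e(i, j) + bias.e(1, j));
        }
        rows.push(row);
    }
    return $M(rows);
}

NN.prototype.getHiddenValues = function(x) {
    // hidden = sigmoid(x * W1 + hBias)
    return addBias(x.x(this.W1), this.hBias).map(sigmoid);
};

NN.prototype.getOutput = function(y) {
    // output = sigmoid(y * W2 + vBias)
    return addBias(y.x(this.W2), this.vBias).map(sigmoid);
};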

Trying it out

var input = $M([
  [1.0, 1.0, 0.0, 0.0],
  [1.0, 1.0, 0.2, 0.0],
  [1.0, 0.9, 0.1, 0.0],
  [1.0, 0.98, 0.02, 0.0],
  [0.98, 1.0, 0.0, 0.0],
  [0.0, 0.0, 1.0, 1.0],
  [0.0, 0.1, 0.8, 1.0],
  [0.0, 0.0, 0.9, 1.0],
  [0.0, 0.0, 1.0, 0.9],
  [0.0, 0.0, 0.98, 1.0]
]);

var label = $M([
  [1.0, 0.0],
  [1.0, 0.0],
  [1.0, 0.0],
  [1.0, 0.0],
  [1.0, 0.0],
  [0.0, 1.0],
  [0.0, 1.0],
  [0.0, 1.0],
  [0.0, 1.0],
  [0.0, 1.0]
]);

var nn = new NN(input, 4, 10, 2, label); // presumably: input, n_input = 4, n_hidden = 10, n_output = 2, label

for (var i = 0; i < 100000; i++) {
  // 0.1 is the learning rate
  nn.train(0.1);
}

var data = $M([
  [1.0, 1.0, 0.0, 0.0],
  [0.0, 0.0, 1.0, 1.0]
]);

console.log(nn.predict(data));
// [0.9999597224429988, 0.000040673558435336644]
// [0.0000455181928397141, 0.9999544455271699]
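
predict() is not shown here either; presumably it is just the forward pass through both layers:

// Assumed shape of predict(): a plain forward pass
NN.prototype.predict = function(x) {
    return this.getOutput(this.getHiddenValues(x));
};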

Activation function

This prediction seems rather good, but note that the activation function used here is the sigmoid function, not the softmax function. For multi-class classification problems, the softmax function is usually used for prediction, but in this case I got a better result with the sigmoid function than with softmax. With the softmax function, the result is below.

// [0.5242635012777253, 0.47573649872227464]
// [0.2690006890629063, 0.7309993109370937]
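
For comparison, a row-wise softmax over a Sylvester matrix can be written like this; it is a sketch of one way to implement the softmax output, not necessarily the exact change made in n42:

// Row-wise softmax over a Sylvester matrix (sketch)
function softmax(m) {
    var rows = [];
    for (var i = 1; i <= m.rows(); i++) {
        // Subtract the row maximum before exponentiating for numerical stability
        var max = -Infinity;
        for (var j = 1; j <= m.cols(); j++) {
            if (m.e(i, j) > max) { max = m.e(i, j); }
        }
        var exps = [];
        var sum = 0;
        for (var j = 1; j <= m.cols(); j++) {
            var v = Math.exp(m.e(i, j) - max);
            exps.push(v);
            sum += v;
        }
        for (var j = 0; j < exps.length; j++) {
            exps[j] = exps[j] / sum;
        }
        rows.push(exps);
    }
    return $M(rows);
}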

Further trying

Umm, this is not the result I want. I can't fully grasp why the result is incorrect, so I want to keep checking whether there are any problems in my program. I also want to try the Kaggle MNIST problem with this network. n42 is now training on the MNIST data, and it takes a lot of time. If I get any good results from this process, I will introduce them on this blog. Feedback is welcome, thank you!!