Classify data by training your own neural network

Note

I originally wrote this article in Spanish, so some references to resources and code are in Spanish.

Introduction

A neural network learns from examples: once it has been trained on enough labeled data, it can perform classification and regression on new inputs. In this article, we will train a neural network on a small dataset and then use it to classify values it has not seen before.

As an example, we will use a square. Suppose we have a square with dimensions 1000 x 1000 where 0,0 is the upper left corner and 1000,1000 is the lower right corner. We will use a JSON file called coordenadas.json, which contains 41 entries, each holding the coordinates of a point in the square and a label describing where that point lies. Our goal is to train the neural network on this information so that we can then pass it values that are not in the file and have it classify whether each coordinate belongs to the upper left, lower left, upper right or lower right of the square.
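The labeling rule described above can be written down directly, which is handy later for checking the network's answers. The following is a minimal sketch and not part of the original project: the function name `quadrantLabel` and the use of the midpoint (500) as the boundary between quadrants are assumptions, while the Spanish label strings are the ones that appear in the results later in this article.

```javascript
// Hypothetical helper: geometric "ground truth" for a point in the 1000 x 1000 square.
// Assumption: the quadrant boundary sits at the midpoint (500) of each axis.
const SIZE = 1000;

function quadrantLabel(x, y) {
  if (x < 0 || x > SIZE || y < 0 || y > SIZE) {
    throw new RangeError(`Point (${x}, ${y}) is outside the square`);
  }
  // y grows downwards: 0 is the top edge, 1000 is the bottom edge.
  const horizontal = x < SIZE / 2 ? 'izquierda' : 'derecha';
  const vertical = y < SIZE / 2 ? 'superior' : 'inferior';
  return `${horizontal}-${vertical}`;
}

console.log(quadrantLabel(300, 350)); // izquierda-superior
console.log(quadrantLabel(800, 150)); // derecha-superior
```

Here izquierda/derecha mean left/right and superior/inferior mean upper/lower.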

Requirements

To follow this tutorial we need:

  • A text editor (VS Code is recommended).
  • Node.js installed, including npm.
  • A web browser such as Firefox, Chrome or Edge Chromium.

File Creation

We create a folder and place the following files:

recursos/

index.html

index.js

Inside the recursos/ folder we will place the coordenadas.json file.

In the index.html file we put the following code, which loads the ml5.js library in the head and our index.js script in the body.

html
<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8" />
    <title>Classify data by training your own neural network</title>
    <script src="https://unpkg.com/ml5@0.4.3/dist/ml5.min.js"></script>
  </head>
  
  <body>
    <script src="index.js"></script>
  </body>
</html>

In the index.js file we put the following (code explained in the comments):

js
window.onload = () => {
  /**
   * Neural network options
   * dataUrl: url of the .json file where our data is located.
   * task: Type of task to execute in the network, for this example we will use 'classification'  
   * inputs: Input data, i.e., the data source used to feed the network.
   * outputs: Output data, i.e., the description of the input data.
   * debug: Defines whether or not to show the visualization of the neural network training in the html.
   */
  const options = {
    dataUrl: 'recursos/coordenadas.json',
    task: 'classification',
    inputs: ['x', 'y'],
    outputs: ['label'],
    debug: true,
  };

  // initialize the neural network
  const nn = ml5.neuralNetwork(options, normalize);

  // normalize the data 
  function normalize() {
    nn.normalizeData();
    train();
  }

  // train the model
  function train() {
    /**
     * epochs: In terms of neural networks, an 'epoch' is one complete pass over the training data.
     * batchSize: Number of samples processed together in each training step.
     *
     * There is no fixed number of epochs that is always right; the best value is found by experimenting and comparing the results.
     */
    const trainingOptions = {
      epochs: 250,
      batchSize: 12,
    };
    nn.train(trainingOptions, classify);
  }

  // once our neural network is trained, we proceed to test how it behaves against unknown data.
  function classify() {
    // in this example we pass the coordinates 300,350, which lie in the upper left quadrant.
    const input = {
      x: 300,
      y: 350,
    };
    nn.classify(input, handleResults);
  }

  function handleResults(error, results) {
    // in case of error, log it to the console and stop
    if (error) {
      console.error(error);
      return;
    }
    // if everything went well, the results of the classification appear in the browser console
    console.log(results);
  }
};
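To give an intuition for why the script normalizes the data before training: the raw coordinates range from 0 to 1000, and networks generally train better on inputs in a small range. The sketch below shows min-max scaling, a common normalization scheme. It illustrates the idea only and is an assumption about the kind of transformation involved, not a copy of what `nn.normalizeData()` does internally; the function name `minMaxNormalize` is hypothetical.

```javascript
// Hypothetical helper: scale a value from the range [min, max] into [0, 1].
function minMaxNormalize(value, min, max) {
  if (max === min) return 0; // degenerate range: avoid dividing by zero
  return (value - min) / (max - min);
}

// The point classified in index.js, scaled using the bounds of the square.
const point = { x: 300, y: 350 };
const normalized = {
  x: minMaxNormalize(point.x, 0, 1000),
  y: minMaxNormalize(point.y, 0, 1000),
};

console.log(normalized); // { x: 0.3, y: 0.35 }
```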

Code Execution

We install a lightweight server called serve. Open the command console and type:

shell
npm install -g serve

Once serve is installed, open a console in the project folder and type:

shell
serve

This starts a simple static server and prints the address it is listening on (http://localhost:5000 by default in the version used here; other versions of serve may use a different port). By navigating to this URL and opening the browser's developer tools, we can watch the data being processed by our neural network. While the exercise runs, the page shows a graph similar to this one:

Figure: NN Training Performance
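As a rough guide to how much work the training options in index.js imply (epochs: 250, batchSize: 12), the number of training steps can be estimated from the dataset size. This is a sketch that assumes all 41 entries of coordenadas.json are used for training and that the last, smaller batch of each epoch is still processed; the function name `trainingSteps` is hypothetical.

```javascript
// Hypothetical helper: estimate how many batches are processed during training.
function trainingSteps(sampleCount, batchSize, epochs) {
  // A final partial batch still counts as one step, hence the rounding up.
  const batchesPerEpoch = Math.ceil(sampleCount / batchSize);
  return { batchesPerEpoch, totalSteps: batchesPerEpoch * epochs };
}

console.log(trainingSteps(41, 12, 250)); // { batchesPerEpoch: 4, totalSteps: 1000 }
```

With a dataset this small, training for 250 epochs stays cheap, which makes it easy to experiment with other values.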

And in the console we get results similar to these:

js
[
  { label: 'izquierda-superior', confidence: 0.8469865322113037 },
  { label: 'izquierda-inferior', confidence: 0.09941432625055313 }, 
  { label: 'derecha-superior', confidence: 0.0454748310148716 },
  { label: 'derecha-inferior', confidence: 0.008124231360852718 },
];

We can see that the neural network correctly classified the coordinates as belonging to the upper left quadrant (izquierda-superior), with a confidence of 0.847, or 84.7%. We can test with different coordinates and see whether the network's estimate, based on the data it was trained on, is still accurate.
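Reading the winning label off the array by eye works for four classes, but it can also be done in code. The sketch below takes the results exactly as printed above and picks the entry with the highest confidence; the helper name `topResult` is hypothetical and not part of ml5.js.

```javascript
// Results copied from the console output above.
const results = [
  { label: 'izquierda-superior', confidence: 0.8469865322113037 },
  { label: 'izquierda-inferior', confidence: 0.09941432625055313 },
  { label: 'derecha-superior', confidence: 0.0454748310148716 },
  { label: 'derecha-inferior', confidence: 0.008124231360852718 },
];

// Hypothetical helper: return the entry with the highest confidence,
// without relying on the order of the array.
function topResult(entries) {
  if (entries.length === 0) throw new Error('No results to inspect');
  return entries.reduce((best, entry) =>
    entry.confidence > best.confidence ? entry : best
  );
}

const best = topResult(results);
const percent = (best.confidence * 100).toFixed(1);
console.log(`${best.label}: ${percent}%`); // izquierda-superior: 84.7%
```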

Testing for { x: 800, y: 150 }:

js
[
  { label: 'derecha-superior', confidence: 0.9378078579902649 },
  { label: 'izquierda-superior', confidence: 0.05480305105447769 },
  { label: 'derecha-inferior', confidence: 0.007157310843467712 },  
  { label: 'izquierda-inferior', confidence: 0.0002319106279173866 },
];

According to the results, the classification was again correct, with 93.8% confidence for the upper right quadrant (derecha-superior).
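Checking points one at a time becomes tedious, so several points can be checked in one go. The following is a sketch under stated assumptions: it expects a callback-style classifier with the same shape as the `nn.classify(input, callback)` call in index.js, the helper names `classifyAsync` and `accuracy` are hypothetical, and `fakeClassify` is a stand-in so the example runs without ml5.js. In the browser one would pass `(input, cb) => nn.classify(input, cb)` instead.

```javascript
// Wrap a callback-style classifier in a Promise so points can be checked in sequence.
function classifyAsync(classify, input) {
  return new Promise((resolve, reject) => {
    classify(input, (error, results) => (error ? reject(error) : resolve(results)));
  });
}

// Hypothetical helper: fraction of test points whose top label matches `expected`.
async function accuracy(classify, testPoints) {
  let correct = 0;
  for (const { x, y, expected } of testPoints) {
    const results = await classifyAsync(classify, { x, y });
    // Pick the highest-confidence entry rather than relying on array order.
    const top = results.reduce((a, b) => (b.confidence > a.confidence ? b : a));
    if (top.label === expected) correct += 1;
  }
  return correct / testPoints.length;
}

// Stand-in classifier (assumption: quadrants split at 500) used only for this demo.
function fakeClassify({ x, y }, callback) {
  const label = `${x < 500 ? 'izquierda' : 'derecha'}-${y < 500 ? 'superior' : 'inferior'}`;
  callback(null, [{ label, confidence: 1 }]);
}

const testPoints = [
  { x: 300, y: 350, expected: 'izquierda-superior' },
  { x: 800, y: 150, expected: 'derecha-superior' },
  { x: 200, y: 900, expected: 'izquierda-inferior' },
];

accuracy(fakeClassify, testPoints).then((score) => console.log(`accuracy: ${score}`));
```

With the stand-in classifier the score is trivially perfect; the interesting number is the one obtained when the trained network is plugged in.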

These examples give a glimpse of the potential of neural networks for classifying unseen information based on the data used to train them.