Overview of Handwriting recognition application

The purpose of this application is to design a Neural Network that can recognize the numbers from 1 to 9 in a hand-written gray-scale image (28 pixels by 28 pixels), as shown in Figure 4.7.

Figure 4.7: Overview of handwriting recognition application.


Although handwriting recognition based on the MNIST database is not well-suited for machine learning experiments, this application is selected to demonstrate that ANNHUB is able to cope with large data-set applications, and that the overall accuracy can reach around 90%.

The MNIST database, which can be obtained from http://yann.lecun.com/exdb/mnist/, consists of 60,000 samples in a training set and 10,000 samples in a test set. It is a subset of a larger data-set available from the National Institute of Standards and Technology (NIST).

Prepare data

Figure 4.8: Handwriting recognition data-set.

The first step is to prepare the MNIST data in a supported format that can be loaded into ANNHUB. Since each image in the MNIST database is a 28x28 gray-scale image, it can be represented as a 2D array (28x28) whose element values lie within the range [0, 255]. Because the input layer of the Neural Network only accepts a 1D array, the 2D array must be flattened into a 1D array of length 28 x 28 = 784. The output of the Neural Network is a number, from 1 to 9, that corresponds to the image input. Figure 4.8 shows the format of the MNIST data-set in a csv file.

The first 784 columns hold an image input, and the last column holds the output (target) value. Each row represents one sample in the MNIST database. The data-set, which includes both the training data-set and the test data-set in csv format, can be found in the ANNHUB installation folder (Examples>Classification Examples>MNIST).
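The flattening and csv layout described above can be sketched in Python. This is an illustrative helper, not part of ANNHUB; the file name is a placeholder.

```python
import csv

def image_to_row(pixels, label):
    """Flatten a 28x28 gray-scale image into one 785-column csv row.

    pixels: 28x28 nested list with values in [0, 255]
    label:  target digit, stored in the last column
    """
    flat = [p for row in pixels for p in row]   # 784 pixel columns
    return flat + [label]                       # plus 1 target column

# Illustrative sample: an all-black image labelled as the digit 1.
sample = image_to_row([[0] * 28 for _ in range(28)], 1)

with open("mnist_train.csv", "w", newline="") as f:
    csv.writer(f).writerow(sample)   # one row per sample, as in Figure 4.8
```

Each image therefore becomes one row of 784 + 1 = 785 values, matching the layout in Figure 4.8.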

Figure 4.9: MNIST dataset files, training and test sets, in csv format.

Load training dataset into ANNHUB

Figure 4.10: Load MNIST training data-set.

After the data-sets are prepared, the training data-set is loaded into ANNHUB in Step 1 of Figure 4.10. In this step, only a fraction of the data-set is loaded; this gives ANNHUB enough information about the data-set format to configure a recommended Neural Network structure.
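Reading only a fraction of the csv file is enough to infer the 784-input/1-output column layout. A minimal sketch (the helper name and file names are illustrative, not ANNHUB's):

```python
import csv
import itertools

def peek_dataset(path, n_rows=100):
    # Read just the first n_rows samples; enough to infer the column layout
    # (784 pixel columns + 1 target column) without loading all 60,000 rows.
    with open(path, newline="") as f:
        return [list(map(float, row))
                for row in itertools.islice(csv.reader(f), n_rows)]

# Illustrative: write a tiny csv in the same layout, then peek at it.
with open("tiny.csv", "w", newline="") as f:
    w = csv.writer(f)
    for label in (1, 2, 3):
        w.writerow([0] * 784 + [label])

head = peek_dataset("tiny.csv", n_rows=2)
print(len(head), len(head[0]))   # 2 rows, 785 columns each
```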

Configure Neural Network

Figure 4.11: Configure the Neural Network structure.

Based on the training data-set, a recommended Neural Network structure is configured as shown in Figure 4.11; however, users can still tweak it to achieve better results. In this example, the Scaled Conjugate Gradient training algorithm is used, with cross entropy as the cost function. The Neural Network is configured with 784 input nodes, 20 hidden nodes, and 1 output node. The activation function for the hidden layer is Tansig, and Softmax is used as the activation function for the output layer. The min-max method is used for both pre-processing and post-processing, and the training data ratio is 75%.
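The building blocks named in this configuration can be sketched as plain Python functions, assuming the MATLAB-style definition of Tansig and the standard definitions of min-max scaling and Softmax (these are illustrative definitions, not ANNHUB's internals):

```python
import math

def minmax_scale(x, lo=0.0, hi=255.0):
    # Min-max pre-processing: map a raw pixel value into [-1, 1].
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def tansig(n):
    # Tansig (hyperbolic tangent sigmoid): 2 / (1 + e^(-2n)) - 1,
    # numerically equivalent to tanh(n).
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0

def softmax(v):
    # Softmax output activation: exponentiate and normalize to sum to 1.
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]
```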

For more information, please refer to Configuring Neural Network structure.

Train Neural Network

Figure 4.12: Train the Neural Network to learn MNIST features.

As shown in Figure 4.12, the Scaled Conjugate Gradient algorithm is used, and the early stopping technique, which uses a validation set to determine the stopping point, is automatically configured and applied during the training procedure. The stopping criteria include 1 max fail, a training goal of 0.0001, a gradient goal of 0.001, and 300 epochs. The training process takes around 21 minutes to complete.
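The early-stopping behaviour described above can be sketched as follows: training stops once the validation loss fails to improve the allowed number of consecutive times, or when the epoch limit is reached. This is a simplified illustration, not ANNHUB's implementation:

```python
def stopping_epoch(val_losses, max_fail=1, max_epochs=300):
    # Walk the per-epoch validation losses and report the epoch at which
    # training would stop: after `max_fail` consecutive non-improvements,
    # or at the epoch limit, whichever comes first.
    best = float("inf")
    fails = 0
    for epoch, loss in enumerate(val_losses[:max_epochs]):
        if loss < best:
            best, fails = loss, 0
        else:
            fails += 1
            if fails >= max_fail:
                return epoch
    return min(len(val_losses), max_epochs) - 1

# Validation loss improves twice, then rises: with max_fail=1 (as in this
# example), training stops at the first non-improving epoch (index 2).
print(stopping_epoch([0.50, 0.40, 0.45, 0.44]))   # prints 2
```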

Better results could be achieved by tweaking the Neural Network structure, the training algorithm, and its parameters.

Evaluate the trained Neural Network

Figure 4.13: Evaluate the trained Neural Network.

After the Neural Network has been trained, the confusion matrix and ROC curve techniques shown in Figure 4.13 are used to evaluate its performance. The training, validation, and test sets are all used in the evaluation. As shown in Figure 4.13, some classes (class 1 corresponds to an output value of 1) achieve better accuracy than others, but the overall accuracy still reaches around 95%.
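A confusion matrix, and the overall accuracy read off its diagonal, can be sketched as follows for the 9 classes in this example (illustrative code, not ANNHUB's evaluation routine):

```python
def confusion_matrix(targets, predictions, n_classes=9):
    # Rows are true classes, columns are predicted classes (digits 1..9).
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(targets, predictions):
        m[t - 1][p - 1] += 1
    return m

def overall_accuracy(m):
    # Diagonal entries count correct predictions.
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return correct / total

cm = confusion_matrix([1, 2, 2, 3], [1, 2, 3, 3])
print(overall_accuracy(cm))   # 3 of 4 correct -> 0.75
```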

For more information, please refer to Training Neural Network.

Test trained Neural Network with new data-set

Figure 4.14: Test the trained Neural Network with new test data-set.

Before being deployed in a real application, the trained Neural Network can be tested with a new data-set to confirm its generalization. The test data-set contains 10,000 samples that have not been used during the design process described above. As can be seen in Figure 4.14, the trained Neural Network can still recognize the correct numbers from samples in the test data-set, with an accuracy rate of around 90% under a very strict threshold of 0.3; that is, if the predicted result falls in the [1.31, 1.69] range, the Neural Network cannot decide whether the image is a 1 or a 2, and the prediction is counted as false.

If the threshold is set to 0.49, then the overall accuracy will be 93.73%.
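The threshold rule described above can be sketched as follows: the network's continuous output is accepted as a digit only if it lies within the threshold of that digit, otherwise the prediction is rejected as false (illustrative code, not ANNHUB's):

```python
def classify(output, threshold=0.3):
    # Round the continuous output to the nearest digit, but accept it only
    # if it lies within `threshold` of that digit; otherwise reject (None).
    nearest = round(output)
    return nearest if abs(output - nearest) <= threshold else None

print(classify(1.25))                   # within 0.3 of 1 -> 1
print(classify(1.50))                   # ambiguous [1.31, 1.69] band -> None
print(classify(1.45, threshold=0.49))   # looser threshold -> 1
```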

The ROC curve result shown in Figure 4.15 also confirms the stability of the trained model. For more information, please refer to Evaluating Neural Network.

Figure 4.15: ROC curve to evaluate the trained Neural Network with new test data-set.

Deploy the trained Neural Network in Handwriting recognition application

Deployment in different programming environments can be done easily thanks to the APIs provided by ANS Center. In this example, the trained Neural Network is deployed in the LabVIEW environment using ANNAPI for LabVIEW. The trained Neural Network model is exported to a file with the ".ann" extension. The LabVIEW code that loads the trained Neural Network model and performs prediction is shown as follows.
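The load-and-predict flow is the same in any environment. The Python sketch below mirrors those two steps with placeholder names; these are not ANNAPI's actual calls, and the toy JSON "model" merely stands in for the exported .ann network:

```python
import json

def load_model(text):
    # Stand-in for loading the exported model file; here the "model" is
    # just a JSON blob of weights rather than a real .ann file.
    return json.loads(text)

def predict(model, x):
    # Stand-in for the trained network's forward pass (a toy linear model).
    return sum(w * xi for w, xi in zip(model["w"], x)) + model["b"]

model = load_model('{"w": [0.5, -0.25], "b": 0.1}')
print(round(predict(model, [1.0, 2.0]), 3))   # 0.5 - 0.5 + 0.1 -> 0.1
```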

Figure 4.17: Standalone handwriting application that uses the trained Neural Network to classify handwriting images.