Neural networks are a key concept in artificial intelligence and machine learning. They consist of interconnected nodes organized in layers, loosely modeled on the neurons of the human brain.
You can create your own simple feed-forward, multi-class classification neural network and train it to classify handwritten digits from the MNIST dataset. You can then use computer vision to classify your own handwritten digits.
What Is Multi-Class Classification?
Multi-class classification is a type of machine learning task that classifies data into more than two categories. Neural networks typically handle this with a softmax output layer, which distributes probability across the possible classes.
You can use multi-class classification to classify handwritten images from the MNIST dataset into 10 categories. These categories will correspond to the digits 0 to 9.
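To see how softmax turns a set of raw scores into a probability distribution, here is a minimal NumPy sketch. The logits below are made-up values for illustration, not outputs from the model you will build:
import numpy as np
# Made-up raw scores (logits) for the 10 digit classes
logits = np.array([1.2, 0.3, 4.5, 0.1, 2.0, 0.0, 1.1, 0.2, 0.7, 0.4])
# Softmax: exponentiate each score, then divide by the total so the values sum to 1
probabilities = np.exp(logits) / np.sum(np.exp(logits))
print(probabilities)             # one probability per digit class
print(probabilities.sum())       # 1.0
print(np.argmax(probabilities))  # index of the most likely class, 2 in this case
The class with the highest probability becomes the prediction, which is exactly how you will read your model's output later with np.argmax().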
Understanding the MNIST Dataset
The MNIST dataset is a popular benchmark dataset for machine learning and computer vision algorithms. It contains 70,000 grayscale images of handwritten digits, each 28 by 28 pixels in size, covering the digits 0 to 9.
Before building any machine learning model, it is important to understand what your dataset contains. Understanding the dataset will enable you to perform better data preprocessing.
Preparing Your Environment
To follow this tutorial, you should be familiar with the basics of Python. You should also have a basic knowledge of machine learning. Finally, you should be comfortable using Jupyter Notebook or Google Colab.
The full source code is available in a GitHub repository.
Create a new Jupyter Notebook or sign in to Google Colab. Run this command to install the required packages:
!pip install numpy matplotlib tensorflow opencv-python
You will use:
- Matplotlib for data visualization.
- NumPy to manipulate arrays.
- TensorFlow to create and train your model.
- OpenCV to feed the model with your own handwritten digits.
Importing the Necessary Modules
Import the packages you installed in your environment so that you can call their functions and classes later in your code.
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import cv2
The second line of code imports the Keras module from the TensorFlow library. You will use Keras to train your deep neural network with TensorFlow as the backend.
Loading and Viewing the Dataset
The MNIST dataset is built into Keras. Load the MNIST dataset and split it into training and test sets. You will use the training set to train your model and the test set to evaluate the accuracy of your model in classifying new unseen images.
(X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()
Check the length of the training and test sets. The MNIST dataset has 60,000 images for training and 10,000 images for testing.
len(X_train)
len(X_test)
Check the shape of the first image in the MNIST dataset, which should be 28 by 28 pixels. Then print its pixel values, visualize it using Matplotlib, and print its label.
X_train[0].shape
X_train[0]
plt.matshow(X_train[0])
y_train[0]
The visualization shows that the first image in the training set contains the digit five, and printing y_train[0] confirms that its label is 5.
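To get a broader feel for the dataset than a single sample, you can also plot a small grid of training images with their labels. This optional sketch reuses the arrays you just loaded:
# Show the first 12 training images with their labels
fig, axes = plt.subplots(3, 4, figsize=(8, 6))
for i, ax in enumerate(axes.flat):
    ax.matshow(X_train[i])
    ax.set_title(str(y_train[i]))
    ax.axis('off')
plt.show()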
Data Preprocessing
Before using the data in the dataset to train and test your model, you need to preprocess it. Preprocessing enhances a model's accuracy by standardizing the data.
Normalizing the Pixel Values
Normalize the pixel values of the images in the dataset by dividing each value by 255. The pixel values of the unnormalized dataset range from 0 to 255 with zero being black and 255 being white. Dividing each pixel value by 255 ensures each pixel is in the range between 0 and 1. This makes it easier for the model to learn the relevant features and patterns in the data.
X_train = X_train / 255
X_test = X_test / 255
Then print the pixel values of the first image.
X_train[0]
Notice they are now in the range between 0 and 1.
Converting the Image Matrices Into a 1D Array
The neural network’s input layer generally expects 1D inputs, so flatten each 28-by-28 image matrix into a 1D array of 784 pixel values. To do so, use the reshape() function with the number of rows set to the number of images in the dataset.
X_train_flattened = X_train.reshape(len(X_train), 28 * 28)
X_test_flattened = X_test.reshape(len(X_test), 28 * 28)
X_train_flattened.shape
X_train_flattened[0]
Your images are now ready for training and testing the model. Note that the model you build in the next section starts with a Flatten layer that performs this same flattening internally, which is why you will pass the original 28-by-28 arrays to model.fit().
Creating the Deep Neural Network Model
Create a sequential model with TensorFlow's Keras module using an input Flatten layer, two hidden layers, and an output layer. Set the input shape to 28 by 28, as this is the shape of the original images in the dataset. Use 128 nodes for each hidden layer. The output layer should have just 10 neurons, as you are classifying the digits 0 to 9.
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])
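Optionally, print a summary of the model to confirm the layer output shapes and parameter counts before training:
# Display each layer's output shape and number of trainable parameters
model.summary()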
Compile the model with the adam optimizer, sparse_categorical_crossentropy as the loss function, and accuracy as the metric for evaluating performance. Then fit the model on the training data, setting the number of epochs to five.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5)
The model will take a few minutes to train. After model training has finished, evaluate its performance on the test set.
model.evaluate(X_test, y_test)
The evaluate function returns the loss and accuracy of the model on the test set; here, the model reaches an accuracy of around 98%.
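Beyond the single accuracy figure, you can check where the model tends to go wrong. The sketch below, reusing the model and test data from above, predicts a label for every test image and builds a confusion matrix in which rows are the true digits and columns are the predicted digits:
# Predict class probabilities for every test image
y_pred_probs = model.predict(X_test)
# Take the most likely class for each image
y_pred = np.argmax(y_pred_probs, axis=1)
# Rows: true digits, columns: predicted digits
print(tf.math.confusion_matrix(y_test, y_pred))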
Using the Model to Classify Your Own Handwritten Digits
To classify your own handwritten digits, you need to prepare your images to match those of the MNIST dataset. Failing to do so will lead to your model performing poorly.
To preprocess the images:
- Load the image containing the digit in grayscale using OpenCV.
- Resize it to 28 by 28 pixels.
- Invert the pixel values so the digit appears light on a dark background, as in MNIST, then normalize them.
- Finally, reshape the image so it matches the input shape the model expects.
Pass the preprocessed image into the model for prediction and print the predicted value on the screen.
# Load the image in grayscale and resize it to 28 by 28 pixels
img = cv2.imread('digits/digit1.png', cv2.IMREAD_GRAYSCALE)
img_resize = cv2.resize(img, (28, 28))
# Invert the colors so the digit is light on a dark background, like MNIST
img_flip = cv2.bitwise_not(img_resize)
# Scale the pixel values to the range 0 to 1
img_normalized = img_flip.astype('float32') / 255.0
# Add a batch dimension so the shape matches the model's input: (1, 28, 28)
input_data = img_normalized.reshape(1, 28, 28)
# Make a prediction using the model
prediction = model.predict(input_data)
print(f'Prediction: {np.argmax(prediction)}')
Here, a preprocessed image containing a handwritten seven was passed to the model, and it classified the digit correctly, printing Prediction: 7.
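If the model misclassifies your digit, a common cause is preprocessing that does not match MNIST. A quick check is to visualize the preprocessed image, reusing img_normalized from above, and confirm it shows a light digit on a dark background, roughly centered in the frame:
# The preprocessed image should resemble an MNIST sample
plt.matshow(img_normalized)
plt.show()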
Neural Networks in Chatbots
The use of neural networks has exploded in the past few years. They are used predominantly in natural language processing, powering tasks such as language translation and generative AI.
More recently, there has been a rise in the number of chatbots that can communicate in a human-like manner. They use a type of neural network known as a transformer neural network. Interact with some of them and experience the power of neural networks.