Part 2: Introducing Sign Language MNIST

Welcome to the second part of our tutorial on using Vitis AI with TensorFlow and Keras. Other parts of the tutorial can be found here:

  1. Introduction
  2. Getting Started (here)
  3. Transforming Kaggle Data and Convolutional Neural Networks (CNNs)
  4. Training the neural network
  5. Optimising our neural network
  6. Converting and Freezing our CNN
  7. Quantising our CNN
  8. Compiling our CNN
  9. Running our code on the DPU
  10. Conclusion Part 1: Improving Convolutional Neural Networks: The weaknesses of the MNIST based datasets and tips for improving poor datasets
  11. Conclusion Part 2: Sign Language Recognition: Hand Object detection using R-CNN and YOLO

Our Sign Language MNIST Github

In this part of the tutorial, we introduce the dataset and the tools, and look at how to run the program. All code is available as open source on our GitHub.

Our chosen dataset is the Sign Language MNIST from Kaggle. It is designed as a replacement for the famous MNIST digits database, as MNIST is often considered too simple. We also believe this is a far more pragmatic dataset for embedded AI and hope it can lay the groundwork for making embedded devices more accessible for everyone. Each image is a 28×28 greyscale image with pixel values between 0 and 255. There are 27,455 training cases and 7,172 test cases. There were originally 1,704 colour images, but to extend the database, new data was synthesised by modifying these images. The exact strategies can be found on the webpage.
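To make the image format concrete, here is a minimal sketch of turning rows of the dataset into 28×28 images. It assumes the CSV layout used on the Kaggle page (a `label` column followed by `pixel1` to `pixel784`) and that labels 0 to 25 map onto A to Z, with J and Z never appearing because those signs involve motion. The function names are our own for illustration and are not taken from the repository.

```python
import csv
import io
import string

IMAGE_SIDE = 28  # each image is 28x28 greyscale


def parse_sign_mnist(csv_text):
    """Parse CSV text into (labels, images).

    Each image is a 28x28 list of floats scaled from 0-255 down to 0.0-1.0.
    Assumed layout: a header row, then label, pixel1, ..., pixel784.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    labels, images = [], []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        label = int(row[0])
        pixels = [int(value) for value in row[1:]]
        if len(pixels) != IMAGE_SIDE * IMAGE_SIDE:
            raise ValueError(f"expected 784 pixels, got {len(pixels)}")
        scaled = [p / 255.0 for p in pixels]
        # Fold the flat row of 784 values into 28 rows of 28 values.
        image = [scaled[r * IMAGE_SIDE:(r + 1) * IMAGE_SIDE]
                 for r in range(IMAGE_SIDE)]
        labels.append(label)
        images.append(image)
    return labels, images


def label_to_letter(label):
    """Map a numeric label (0-25) to its letter; 9 (J) and 25 (Z) are unused."""
    if not 0 <= label <= 25:
        raise ValueError(f"label out of range: {label}")
    return string.ascii_uppercase[label]


# Synthetic example row (not real dataset content): an all-white image labelled 7.
header = "label," + ",".join(f"pixel{i}" for i in range(1, 785))
sample = header + "\n" + "7," + ",".join(["255"] * 784) + "\n"
labels, images = parse_sign_mnist(sample)
print(label_to_letter(labels[0]), len(images[0]), len(images[0][0]))
```

Scaling to the 0.0-1.0 range is a common preprocessing choice rather than a requirement of the dataset.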

The tools we are using:

  1. TensorFlow: Developed by Google, TensorFlow is a very popular open-source platform for machine learning. We assume the user has a basic understanding of the TensorFlow toolset.
  2. Keras: An easy-to-use front-end API for TensorFlow. Its flexibility allows for quick experimentation, making it perfect for getting started with FPGAs. We also assume that the user has some basic knowledge of Keras.
  3. Vitis AI: Part of Xilinx’s Vitis Unified Development Environment, which aims to make FPGAs accessible to software developers. The tool takes in TensorFlow models and converts them to run on the Deep Learning Processing Unit (DPU), the deep learning accelerator placed on the FPGA fabric.

With the introductions out of the way, let’s get installing:

That’s it for installation. We can now focus on running the system, and to get started we are going to use the automated scripts. In later tutorials, we will go through what each part of the script does stage by stage, but for now we will quickly run through the whole process.

  • Launch a terminal in the repo and start the docker:
cd <Cloned-directory>/sign_language_mnist

sudo systemctl enable docker

<Vitis-AI-Installation-Directory>/Vitis-AI/ xilinx/vitis-ai-cpu:latest
  • We then need to install Keras in the Docker container:
sudo su

conda activate vitis-ai-tensorflow

pip install keras==2.2.5

conda install -y pillow


conda activate vitis-ai-tensorflow

Our program verifies its functionality in two ways. First, it takes a sample of the test images and runs them on the FPGA. By default, it will run ten images. Second, a user can supply their own image by placing it inside the test folder. We have already supplied two test images for you to try, but feel free to add more.
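As a rough illustration of these two checks, the sketch below picks ten test images and lists any user-supplied images in a folder. It is not the repository's code: the function names, the random sampling with a fixed seed, and the accepted file extensions are all assumptions made for the example.

```python
import random
import tempfile
from pathlib import Path


def pick_test_sample(image_names, count=10, seed=0):
    """Pick `count` test images (ten by default).

    Random sampling with a fixed seed is an assumption; the real script
    may choose its sample differently.
    """
    names = sorted(image_names)
    rng = random.Random(seed)
    return rng.sample(names, min(count, len(names)))


def find_custom_images(test_dir, extensions=(".png", ".jpg", ".jpeg")):
    """List user-supplied image files in the test folder, sorted by name."""
    folder = Path(test_dir)
    return sorted(
        path for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in extensions
    )


# Demonstration with hypothetical file names in a temporary folder.
all_test_images = [f"testimage_{i}.png" for i in range(50)]
sample = pick_test_sample(all_test_images)

with tempfile.TemporaryDirectory() as tmp:
    for name in ("test_b.png", "test_c.png", "notes.txt"):
        (Path(tmp) / name).write_bytes(b"")
    custom_names = [path.name for path in find_custom_images(tmp)]

print(len(sample), custom_names)
```

Filtering by extension means stray non-image files in the test folder are ignored rather than passed to the model.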

  • Finally, to run the process:

This will step through the entire process of creating the neural network model for our FPGA. At the end of the process, it will place all the files we need in a folder called deploy. To run our neural network, we need to place the DPU on the FPGA. Fortunately, Xilinx provides pre-made images and instructions here.

  • Turn on the FPGA and access it through the UART port
  • Through the UART port, we can configure the network settings of the FPGA so that the board can be accessed via SSH:
ifconfig eth0 netmask
scp <download-directory>/vitis-ai_v1.1_dnndk.tar.gz root@
  • Then, using the terminal on the board:
tar -xzvf vitis-ai_v1.1_dnndk.tar.gz

cd vitis-ai_v1.1_dnndk

  • We then need to transfer over the deploy folder
scp -r <Cloned-directory>/sign_language_mnist/deploy root@
  • Finally, we can run the file:
cd sign_language_mnist/deploy

source ./compile_shared.sh

python3 -t 1 -b 1 -j /home/root/deploy/dpuv2_rundir/
  • We should get the following results:
Throughput: 1045.72 FPS
Custom Image Predictions:
Custom Image:  test_b  Predictions: U
Custom Image:  test_c  Predictions: F
testimage_9.png Correct { Ground Truth:  H Prediction:  H }
testimage_6.png Correct { Ground Truth:  L Prediction:  L }
testimage_5.png Correct { Ground Truth:  W Prediction:  W }
testimage_1.png Correct { Ground Truth:  F Prediction:  F }
testimage_2.png Correct { Ground Truth:  L Prediction:  L }
testimage_7.png Correct { Ground Truth:  P Prediction:  P }
testimage_4.png Correct { Ground Truth:  D Prediction:  D }
testimage_3.png Correct { Ground Truth:  A Prediction:  A }
testimage_0.png Correct { Ground Truth:  G Prediction:  G }
testimage_8.png Correct { Ground Truth:  D Prediction:  D }
Correct: 10 Wrong: 0 Accuracy: 100.00
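The summary line can be reproduced from the per-image lines. The sketch below parses output in the format shown above, compares each ground truth with its prediction, and computes the accuracy as a percentage. The regular expression and function name are our own, written against the output above rather than taken from the repository.

```python
import re

# Matches lines such as:
#   testimage_9.png Correct { Ground Truth:  H Prediction:  H }
RESULT_LINE = re.compile(
    r"^(\S+)\s+\w+\s+\{\s*Ground Truth:\s*(\w+)\s+Prediction:\s*(\w+)\s*\}$"
)


def summarise(log_text):
    """Return (correct, wrong, accuracy_percent) for the per-image result lines."""
    correct = wrong = 0
    for line in log_text.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match is None:
            continue  # skip throughput, custom image and summary lines
        _, truth, prediction = match.groups()
        if truth == prediction:
            correct += 1
        else:
            wrong += 1
    total = correct + wrong
    accuracy = 100.0 * correct / total if total else 0.0
    return correct, wrong, accuracy


log = """Throughput: 1045.72 FPS
Custom Image:  test_b  Predictions: U
testimage_9.png Correct { Ground Truth:  H Prediction:  H }
testimage_6.png Correct { Ground Truth:  L Prediction:  L }
testimage_5.png Correct { Ground Truth:  W Prediction:  W }
"""
correct, wrong, accuracy = summarise(log)
print(f"Correct: {correct} Wrong: {wrong} Accuracy: {accuracy:.2f}")
```

Note that the custom images carry no ground truth in the output, so they do not contribute to the accuracy figure.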


It looks like we correctly predicted all the results from the dataset, but things did not go quite right for our custom images. In the next tutorials, we will take a deeper dive into what these automated scripts actually do and how we can create models for use on FPGAs. The first thing we will do is look at transforming our data for use by TensorFlow and creating a basic Convolutional Neural Network.

2020 Beetlebox Limited