AI on FPGAs explained

Want to understand why there is still excitement around using AI for FPGAs in 2024? Andrew Swirski explains the three key reasons.

Transcript

“Hi, I am Andrew Swirski. I am the founder of a company called Beetlebox and we specialise in DevOps for embedded solutions. One question I see repeatedly on engineering forums is why AI on FPGAs is still considered relevant. After all, the AI industry is dominated by chips that are specialised to perform AI, whether that be a server-side Nvidia GPU or a tiny edge AI processor. When I say AI in this context, I mean popular machine learning algorithms such as neural networks and transformers. It's a high-volume industry dominated by specialised chips, the exact sort of place where you wouldn't expect FPGAs. Yet companies like AMD and Altera (formerly Intel) are still placing big bets on AI on FPGAs. The question I want to answer here is: why?

To do that, I first need to explain: what is a modern FPGA?

An FPGA is a semiconductor chip where the hardware itself is programmable. This programmable hardware means that we can get better performance than if we ran the same application on a CPU (what we call acceleration), but not as good as if we manufactured hardware dedicated to that application.
In the old days, this meant that you had a whole field of logic blocks that could be programmed, and designers could configure these blocks into any sort of digital circuit they wanted without needing to build a new chip every single time.
Nowadays, modern FPGAs are much more sophisticated. They have blocks of specialised hardware that can perform tasks more efficiently than before. In this diagram I show an example where the blocks are placed into columns with specific purposes. In orange we have input/output blocks, in blue we have RAM, in white we have our logic blocks and in green we have arithmetic blocks. When we program the hardware, we just link these blocks in a specific configuration, and because these blocks are hardened we get far better performance than on earlier, simpler FPGAs.
But it can get even more complex than that. Modern FPGAs can now form a smaller part of a wider system known as a System on Chip (SoC). Normally the FPGA sits in the middle, surrounded by a network on chip. As part of the system we see other common components such as CPUs, real-time CPUs, external memory controllers, IO and GPUs, just to name a few examples. Now what we can do is be clever and put a dedicated AI engine on that chip, and that leads me to my first reason.

End-to-end acceleration.

If our SoC has an AI engine, not only can we accelerate the AI itself, but we can also accelerate all the other processing that happens before and after the AI. One example would be taking raw input from a camera feed that might be on a robot. Processing images on FPGA architecture has a lot of advantages, such as guaranteed low latency, so we can process the data from the camera feed to prepare it for the AI engine. The data can be fed to the AI, and then finally we can use the AI's output for control or decision making.
The advantage of this is that we are able to perform all our processing on a single chip, which can help reduce costs as well as avoid the penalties you take when data is transferred between two chips, such as increased latency and power consumption.
But at the same time, this feels a bit like a workaround. After all, we have just shifted the workload to a dedicated AI engine rather than have the FPGA architecture perform the AI processing. Luckily, the next two points might solve this problem. The first one of these is custom network hardware.

Custom AI model hardware

With any GPU or AI chip, the AI engines on those devices are designed to be able to run any machine learning algorithm, regardless of how poorly a particular network maps onto the hardware. So in a way, every GPU and AI chip needs to be a general-purpose AI engine.
When we use FPGAs, we can take advantage of their unique programmable hardware to create custom hardware that is specific to a given AI model. Not only that, but we can optimise that hardware for specific objectives. For instance, you could tell the FPGA tools that you want to minimise latency or that you want to optimise power efficiency.
So let's take this one step further: instead of just optimising the hardware, why don't we optimise the software as well?

Co-operative hardware-software optimisations

Most AI chips will have a software tool that can take models from frameworks such as TensorFlow or PyTorch and then optimise the software of those models to better fit the AI chip. This could involve techniques like quantisation or pruning. But with FPGAs we can take this idea to the next level.
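To make quantisation and pruning concrete, here is a minimal sketch in plain Python. It is not how any particular vendor tool works: the function names, the symmetric 8-bit scheme and the magnitude-based pruning rule are assumptions chosen for illustration, and the weights are made-up numbers.

```python
def quantise_int8(weights):
    """Symmetric per-tensor quantisation: floats -> integers in [-127, 127] plus a scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantise(q, scale):
    """Map the integers back to approximate float values."""
    return [v * scale for v in q]


def prune_by_magnitude(weights, fraction):
    """Zero out the given fraction of weights, starting with the smallest magnitudes."""
    n_prune = int(len(weights) * fraction)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


# Illustrative weights only; a real model would have millions of these.
weights = [0.42, -1.27, 0.03, 0.88, -0.05, 0.61]

q, scale = quantise_int8(weights)
restored = dequantise(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

pruned = prune_by_magnitude(weights, 0.5)

print("quantised:", q)
print("largest rounding error:", max_error)
print("pruned:", pruned)
```

Smaller integers and more zeros both mean less arithmetic and less memory traffic, which is why tools apply these steps before mapping a model onto a chip.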
What we can do is apply a technique called automated machine learning, or AutoML. This is where we define the task we want to achieve and then tell our AutoML tool to find the optimised neural network for that task. We can take this idea one step further and tell our AutoML tool to optimise that neural network specifically for latency, and then it can perform both software and hardware optimisations at the same time. It can even iterate, so it might try one proposed network, get the results and then try another based on those results.
Using FPGAs provides a huge number of new parameters that an AutoML tool can target. Simply put, this sort of technique cannot be performed on a GPU.
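The iterative loop described above can be sketched as a toy search in Python. Everything here is hypothetical: the search space, the parameter names and both cost models are invented for illustration and do not come from any real AutoML tool or FPGA. The point is only to show network parameters (layers, width, precision) and a hardware parameter (number of parallel compute units) being explored together, with each new candidate based on the best result so far.

```python
import random

# Hypothetical joint search space: model parameters and hardware parameters.
SEARCH_SPACE = {
    "layers": [2, 4, 8],
    "width": [32, 64, 128],
    "bits": [4, 8, 16],             # weight precision
    "parallel_units": [1, 4, 16],   # compute units instantiated in the fabric
}


def estimate_latency_ms(cfg):
    """Made-up latency model: more work is slower, more parallel hardware is faster."""
    ops = cfg["layers"] * cfg["width"] ** 2
    return ops * (cfg["bits"] / 8) / (cfg["parallel_units"] * 1000)


def estimate_accuracy(cfg):
    """Made-up accuracy proxy: larger, higher-precision networks score higher."""
    capacity = cfg["layers"] * cfg["width"]
    return 1 - 1 / (1 + 0.01 * capacity) - 0.02 * (16 / cfg["bits"] - 1)


def mutate(cfg, rng):
    """Propose a new candidate by changing one parameter of the current best."""
    key = rng.choice(list(SEARCH_SPACE))
    candidate = dict(cfg)
    candidate[key] = rng.choice([v for v in SEARCH_SPACE[key] if v != cfg[key]])
    return candidate


def search(min_accuracy, steps=200, seed=0):
    """Hill-climb towards lower latency while keeping accuracy above a floor."""
    rng = random.Random(seed)
    # Start from the largest configuration in the space.
    best = {key: max(values) for key, values in SEARCH_SPACE.items()}
    for _ in range(steps):
        candidate = mutate(best, rng)
        accurate_enough = estimate_accuracy(candidate) >= min_accuracy
        faster = estimate_latency_ms(candidate) < estimate_latency_ms(best)
        if accurate_enough and faster:
            best = candidate
    return best


start = {key: max(values) for key, values in SEARCH_SPACE.items()}
best = search(min_accuracy=0.85)
print("chosen configuration:", best)
print("estimated latency (ms):", estimate_latency_ms(best))
print("estimated accuracy:", estimate_accuracy(best))
```

In a real flow the two estimate functions would be replaced by actual training runs and hardware measurements or simulations, which is where most of the cost of this approach lies.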

Summary

What is exciting about these techniques for running AI on FPGAs is that they are all still in their infancy. Many of these ideas are active research projects, especially co-operative hardware-software optimisations and building custom hardware for specific AI models. We are also seeing industry take advantage of end-to-end acceleration, especially for vision applications. On top of that, a new wave of FPGA chips is coming out, so there is a lot of hope that combining these techniques with the latest generation of hardware will generate results that beat current AI chips.

That’s it from me today. I hope you had a great time listening. If you are looking at using AI on FPGAs and want to automate your process, please feel free to get in touch with us through the website in the description.”