Do I need a GPU (and what is it, anyway)?

Probably the two biggest reasons for the exponential growth in deep learning are:

  • The ability to accumulate, store, and process large datasets of all types
  • The ability to use GPUs to train deep learning models

So what exactly are GPUs and why are they so important to deep learning? Probably the best place to start is by actually looking at the CPU and why this is not optimal for training deep learning models. The CPU in a modern PC is one of the pinnacles of human design and engineering. Even the chip in a mobile phone is more powerful now than the entire computer systems of the first space shuttles. However, because they are designed to be good at all tasks, they may not be the best option for niche tasks. One such task is high-end graphics.

If we take a step back to the mid-1990s, most games were 2D, for example, platform games where the character in the game jumps between platforms and/or avoids obstacles. Today, almost all computer games utilize 3D space. Modern consoles and PCs have co-processors that take the load of modelling 3D space onto a 2D screen. These co-processors are known as GPUs.

GPUs are actually far simpler than CPUs. They are built to just do one task: massively parallel matrix operations. CPUs and GPUs both have cores, where the actual computation takes place. A PC with an Intel i7 CPU has four physical cores and eight virtual cores by using Hyper Threading. The NVIDIA TITAN Xp GPU card has 3,840 CUDA? cores. These cores are not directly comparable; a core in a CPU is much more powerful than a core in a GPU. But if the workload requires a large amount of matrix operations that can be done independently, a chip with lots of simple cores is much quicker.

Before deep learning was even a concept, researchers in neural networks realized that doing high-end graphics and training neural networks both involved workloads: large amounts of matrix multiplication that could be done in parallel. They realized that training the models on the GPU rather than the CPU would allow them to create much more complicated models.

Today, all deep learning frameworks run on GPUs as well as CPUs. In fact, if you want to train models from scratch and/or have a large amount of data, you almost certainly need a GPU. The GPU must be an NVIDIA GPU and you also need to install the CUDA? Toolkit, NVIDIA drivers, and cuDNN. These allow you to interface with the GPU and hijack its use from a graphics card to a maths co-processor. Installing these is not always easy, you have to ensure that the versions of CUDA, cuDNN and the deep learning libraries you use are compatible. Some people advise you need to use Unix rather than Windows, but support on Windows has improved greatly. This code on this book was developed on a Windows workstation. Forget about using a macOS, because they don't support NVIDIA cards.

That was the bad news. The good news is that you can learn everything about deep learning if you don't have a suitable GPU. The examples in the early chapters of this book will run perfectly fine on a modern PC. When we need to scale up, the book will explain how to use cloud resources, such as AWS and Google Cloud, to train large deep learning models.