What is deep learning?

Deep learning is a subfield within machine learning, which in turn is a subfield within artificial intelligence. Artificial intelligence is the art of creating machines that perform functions that require intelligence when performed by people. Machine learning uses algorithms that learn without being explicitly programmed. Deep learning is the subset of machine learning that uses artificial neural networks that mimic how the brain works.

The following diagram shows the relationships between them. For example, self-driving cars are an application of artificial intelligence. A critical part of self-driving cars is to recognize other road users, cars, pedestrians, cyclists, and so on. This requires machine learning because it is not possible to explicitly program this. Finally, deep learning may be chosen as the method to implement this machine learning task:

Figure 1.1: The relationship between artificial intelligence, machine learning, and deep learning

Artificial intelligence as a field has existed since the 1940s; the definition used in the previous diagram is from Kurzweil, 1990. It is a broad field that encompasses ideas from many different fields, including philosophy, mathematics, neuroscience, and computer engineering. Machine learning is a subfield within artificial intelligence that is devoted to developing and using algorithms that learn from raw data. When the machine learning task has to predict an outcome, it is known as supervised learning. When the task is to predict from a set of possible outcomes, it is a classification task, and when the task is to predict a numeric value, it is a regression task. Some examples of classification tasks are whether a particular credit card purchase is fraudulent, or whether a given image is of a cat or a dog. An example of a regression task is predicting how much money a customer will spend in the next month. There are other types of machine learning where the learning does not predict values. This is called unsupervised learning and includes clustering (segmenting) the data, or creating a compressed format of the data.

Deep learning is a subfield within machine learning. It is called deep because it uses multiple layers to map the relationship between input and output. A layer is a collection of neurons that perform a mathematical operation on its input. This will be explained in more detail in the next section, Conceptual overview of neural networks. This deep architecture means the model is large enough to handle many variables and that it is sufficiently flexible to approximate the patterns in the data. Deep learning can also generate features as part of the overall learning algorithm, rather than feature-creation being a prerequisite step. Deep learning has proven particularly effective in the fields of image-recognition (including handwriting as well as photo- or object-classification) , speech recognition and natural-language. It has completely transformed how to use image, text, and speech data for prediction in the past few years, replacing previous methods of working with these types of data. It has also opened up these fields to a lot more people because it automates a lot of the feature-generation, which required specialist skills.

Deep learning is not the only technique available in machine learning. There are other types of machine learning algorithms; the most popular include regression, decision trees, random forest, and naive bayes. For many use cases, one of these algorithms could be a better choice. Some examples of use cases where deep learning may not be the best choice include when interpretability is an essential requirement, the dataset size is small, or you have limited resources (time and/or hardware) to develop a model. It is important to realize that despite, the industry hype, most machine learning in industry does not use deep learning. Having said that, this book covers deep learning algorithms, so we will move on. The next sections will discuss neural networks and deep neural networks in more depth.