How deep learning is different from machine learning and artificial intelligence

Artificial intelligence (AI) is a very large research field in which machines show cognitive capabilities such as learning behaviors, proactive interaction with the environment, inference and deduction, computer vision, speech recognition, problem solving, knowledge representation, perception, and many others (for more information, refer to the book Artificial Intelligence: A Modern Approach, by S. Russell and P. Norvig, Prentice Hall, 2003). More colloquially, AI denotes any activity in which machines mimic intelligent behaviors typically shown by humans. Artificial intelligence draws on elements of computer science, mathematics, and statistics.

Machine learning (ML) is a subbranch of AI that focuses on teaching computers how to learn without being explicitly programmed for specific tasks (for more information, refer to Pattern Recognition and Machine Learning, by C. M. Bishop, Springer, 2006). The key idea behind ML is that it is possible to create algorithms that learn from data and make predictions on it. There are three broad categories of ML. In supervised learning, the machine is presented with input data and the desired output, and the goal is to learn from those training examples in such a way that meaningful predictions can be made for fresh, unseen data. In unsupervised learning, the machine is presented with input data only and has to find some meaningful structure by itself, with no external supervision. In reinforcement learning, the machine acts as an agent that interacts with the environment and learns which behaviors generate rewards.
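To make the difference between the first two categories concrete, here is a minimal sketch in plain Python. It is only an illustration under simple assumptions: the data is one-dimensional, and the function names (nearest_neighbor_predict, two_means) and the toy values are hypothetical, chosen for this example. The supervised part receives inputs paired with desired outputs; the unsupervised part receives inputs only and has to find two groups on its own. Reinforcement learning is not covered by this sketch.

```python
# Supervised learning: every training example pairs an input with its desired output.
def nearest_neighbor_predict(train, x):
    """Predict the label of x as the label of the closest training input."""
    _, closest_label = min(train, key=lambda pair: abs(pair[0] - x))
    return closest_label


# Unsupervised learning: inputs only, no labels. The structure must be discovered.
def two_means(points, iterations=10):
    """Split 1-D points into two groups by alternating assignment and mean update."""
    if min(points) == max(points):
        raise ValueError("need at least two distinct values to form two groups")
    c0, c1 = min(points), max(points)  # start from the two extreme points
    for _ in range(iterations):
        # Assign each point to the nearer center, then move each center to its group mean.
        g0 = [p for p in points if abs(p - c0) <= abs(p - c1)]
        g1 = [p for p in points if abs(p - c0) > abs(p - c1)]
        c0 = sum(g0) / len(g0)
        c1 = sum(g1) / len(g1)
    return sorted(g0), sorted(g1)


# Toy data, invented for illustration.
labeled = [(1.0, "small"), (1.5, "small"), (8.0, "large"), (9.0, "large")]
unlabeled = [1.0, 1.5, 2.0, 8.0, 9.0]

prediction = nearest_neighbor_predict(labeled, 2.0)  # uses the provided labels
groups = two_means(unlabeled)                        # finds groups without labels
```

In the supervised case the labels "small" and "large" come from the training data; in the unsupervised case the algorithm only returns two groups of points, and it is up to us to interpret what they mean.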

Deep learning (DL) is a particular subset of ML methodologies that use artificial neural networks (ANNs) loosely inspired by the structure of neurons in the human brain (for more information, refer to the article Learning Deep Architectures for AI, by Y. Bengio, Found. Trends, vol. 2, 2009). Informally, the word deep refers to the presence of many layers in the artificial neural network, but this meaning has changed over time. Four years ago, 10 layers were already enough for a network to be considered deep; today it is more common to consider a network deep when it has hundreds of layers.
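The notion of depth as the number of stacked layers can be sketched in a few lines of plain Python. This is a toy illustration, not how a real deep learning library is implemented: the helper names (relu, dense, forward), the ReLU activation, and the hand-picked weights are all assumptions made for this example. Each layer applies a weighted sum followed by a simple non-linearity, and the output of one layer becomes the input of the next.

```python
def relu(values):
    """Element-wise non-linearity: negative values become zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum of the inputs plus a bias per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the inputs through every layer in order; depth is simply len(layers)."""
    activations = inputs
    for weights, biases in layers:
        activations = relu(dense(activations, weights, biases))
    return activations


# A tiny two-layer network with hand-picked (not learned) weights.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # layer 1: 2 inputs -> 2 units
    ([[1.0, 1.0]], [-1.0]),                   # layer 2: 2 inputs -> 1 unit
]

depth = len(layers)
output = forward([2.0, 1.0], layers)
```

Making the network deeper in this sketch just means appending more (weights, biases) pairs to the list; in practice the weights are learned from data rather than written by hand.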

DL is a real tsunami (for more information, refer to Computational Linguistics and Deep Learning by C. D. Manning, "Computational Linguistics", vol. 41, 2015) for machine learning, in that a relatively small number of clever methodologies have been applied very successfully to many different domains (image, text, video, speech, and vision), significantly improving on state-of-the-art results that had taken decades to achieve. The success of DL is also due to the availability of more training data (such as ImageNet for images) and of relatively low-cost GPUs for very efficient numerical computation. Google, Microsoft, Amazon, Apple, Facebook, and many others use deep learning techniques every day to analyze massive amounts of data. However, this kind of expertise is no longer limited to pure academic research and large industrial companies. It has become an integral part of modern software production and is therefore something that the reader should definitely master.

The book does not require any particular mathematical background. However, it assumes that the reader is already a Python programmer.