Summary

This chapter showed how to get started building and training neural networks to classify data, using image data and physical activity data as examples. We looked at packages that can visualize a neural network, and we created a number of models to classify data with 10 different categories. Although we used only general neural network packages rather than dedicated deep learning packages, our models took a long time to train and we had issues with overfitting.

Some of the basic neural network models in this chapter took a long time to train, even though we did not use all the data available. For the MNIST data, we used approximately 8,000 rows for our binary classification task and only 6,000 rows for our multiclass classification task. Even so, one model took almost an hour to train. Our deep learning models will be much more complicated and should be able to process millions of records. You can now see why specialist hardware is required for training deep learning models.

We also saw a common pitfall in machine learning: more complex models are more likely to overfit the training data, so evaluating performance on the same data used to train the model gives biased, overly optimistic estimates of model performance. This bias can even change which model is chosen as the best. Overfitting is also an issue for deep neural networks. In the next chapter, we will discuss various techniques used to prevent overfitting and obtain more accurate estimates of model performance.
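To make the optimistic bias concrete, here is a minimal sketch in plain Python rather than with the packages used in this chapter. It is not one of the chapter's models: it assumes a synthetic one-dimensional data set with 20% of the labels flipped, and it uses a 1-nearest-neighbor classifier as a stand-in for an overly flexible model. The names `make_data`, `predict_1nn`, and `accuracy` are hypothetical helpers introduced only for this illustration.

```python
import random


def make_data(n, rng, noise=0.2):
    """Return n (x, y) pairs. y is 1 when x > 0.5, with a fraction of labels flipped."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y  # label noise that no model can learn in a generalizable way
        data.append((x, y))
    return data


def predict_1nn(train, x):
    """Predict the label of the closest training point (memorizes the training set)."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]


def accuracy(train, data):
    """Fraction of points in data whose label matches the 1-NN prediction."""
    correct = sum(predict_1nn(train, x) == y for x, y in data)
    return correct / len(data)


rng = random.Random(42)       # fixed seed so the run is repeatable
train = make_data(200, rng)   # data used to fit the model
test = make_data(200, rng)    # held-out data the model never sees

train_acc = accuracy(train, train)  # evaluated on the data used for training
test_acc = accuracy(train, test)    # evaluated on held-out data

print(f"accuracy on training data: {train_acc:.2f}")
print(f"accuracy on held-out data: {test_acc:.2f}")
```

Because each training point is its own nearest neighbor, the model reproduces even the flipped labels and looks perfect on the training data. The held-out score is lower, and it is the more honest estimate of how the model would perform on new data.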

The next chapter also builds a neural network from scratch and shows how the same ideas apply to deep learning.