Fitting the model

In our previous MNIST example, we made only the minimum number of architectural decisions needed to get our code running. This let us cover a deep learning workflow quite quickly, but at the expense of efficiency. You will recall that we simply called the fit() method on our model and passed it our training features and labels, along with two integers: the number of epochs to train for, and the batch size per training iteration. The former defines how many times the full training set passes through the model, while the latter defines how many examples the model sees before updating its weights. These two hyperparameters are essential and must be tuned to the problem at hand. However, the fit() method accepts several other useful arguments.
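
To ground the discussion, here is a minimal sketch of the kind of fit() call described above. The variable names, layer sizes, and hyperparameter values are illustrative assumptions, not the book's exact earlier code:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Load MNIST, flatten each 28x28 image, and scale pixels to [0, 1].
# (Assumed preprocessing; your earlier example may differ slightly.)
(train_images, train_labels), _ = keras.datasets.mnist.load_data()
train_images = train_images.reshape((60000, 28 * 28)).astype("float32") / 255

# A small fully connected classifier.
model = keras.Sequential([
    layers.Dense(512, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.compile(
    optimizer="rmsprop",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# epochs=5: the full training set passes through the model five times.
# batch_size=128: weights are updated after every 128 examples.
model.fit(train_images, train_labels, epochs=5, batch_size=128)
```

With 60,000 training examples and batch_size=128, each epoch performs roughly 469 weight updates; epochs=5 repeats that cycle five times over the whole dataset.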