- Hands-On Neural Networks with Keras
- Niloy Purkait
Size experiments
Now we will perform some short experiments, varying the size of our network and gauging its performance. We will train six simple neural networks in Keras, each progressively larger than the last, and observe how these separate networks learn to classify handwritten digits. We will also present some of the results from these experiments. For the purpose of this experiment, all of the models were trained with a constant batch size (batch_size=100), the Adam optimizer, and sparse_categorical_crossentropy as the loss function.
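As a reference point, the following is a minimal sketch of how one such model might be set up under these settings, assuming the tf.keras API and the MNIST digits dataset; the 64-unit hidden layer, the relu activation, and the 10 epochs are illustrative choices, not the exact configurations used in the experiments:

from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

# Load the handwritten digit data and scale pixel values to [0, 1]
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# One candidate network; the hidden-layer width (64 here) is what the
# size experiments vary from model to model
model = Sequential()
model.add(Flatten(input_shape=(28, 28)))
model.add(Dense(64, activation='relu'))
model.add(Dense(10, activation='softmax'))

# Constant settings used across all six models
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=100,
          epochs=10,
          validation_data=(x_test, y_test))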
The following fitting graph shows how increasing our neural network's complexity (in terms of size) affects our performance on the training and test sets of our data. Note that we are always aiming for a model that minimizes the difference between training and test accuracy/loss, as this indicates the least amount of overfitting. Intuitively, the graph simply shows us how much our network's learning benefits if we allocate it more neurons. By observing the increase in accuracy on the test set, we can see that adding more neurons does help our network to better classify images that it has never encountered before. This holds up to a sweet spot, which is where the training and test values are closest to each other. Eventually, however, further increases in complexity lead to diminishing returns: beyond the sweet spot, the model overfits more and more, and the accuracies of the training and test sets start to diverge:


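One way a fitting graph of this kind could be assembled is sketched below, as an illustration rather than the exact procedure used for the figure above: the list of candidate widths, the build_model helper, the epoch count, and the use of matplotlib are all assumptions.

import matplotlib.pyplot as plt
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def build_model(width):
    # Hypothetical helper: one hidden layer of 'width' neurons
    model = Sequential()
    model.add(Flatten(input_shape=(28, 28)))
    model.add(Dense(width, activation='relu'))
    model.add(Dense(10, activation='softmax'))
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

widths = [16, 32, 64, 128, 256, 512]   # illustrative sizes for six models
train_acc, test_acc = [], []

for width in widths:
    history = build_model(width).fit(x_train, y_train, batch_size=100,
                                     epochs=10, verbose=0,
                                     validation_data=(x_test, y_test))
    train_acc.append(history.history['accuracy'][-1])
    test_acc.append(history.history['val_accuracy'][-1])

# Where the two curves are closest is the sweet spot; beyond it they diverge
plt.plot(widths, train_acc, label='training accuracy')
plt.plot(widths, test_acc, label='test accuracy')
plt.xscale('log')
plt.xlabel('neurons in hidden layer')
plt.ylabel('accuracy')
plt.legend()
plt.show()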
To replicate these results by increasing the size of our network, we can tweak both the breadth of the network (the number of neurons per layer) and its depth (the number of layers). In Keras, depth is added by calling model.add() on an initialized model. The add method takes a layer instance (for example, Dense()) as its argument. The Dense layer, in turn, takes the number of neurons to initialize in that specific layer, along with the activation function to be employed for said layer, as arguments. The following is an example of this:
model.add(Dense(512, activation='softmax'))
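For instance, a deeper and wider variant could be assembled with repeated add() calls along the following lines; the layer widths and relu activations here are illustrative choices, not a configuration prescribed by the text:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

model = Sequential()
model.add(Flatten(input_shape=(28, 28)))     # flatten each 28 x 28 image
model.add(Dense(512, activation='relu'))     # breadth: neurons per layer
model.add(Dense(256, activation='relu'))     # depth: each add() stacks a layer
model.add(Dense(10, activation='softmax'))   # one output per digit class
model.summary()                              # inspect the parameter count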