- Deep Learning with Keras
- Antonio Gulli, Sujit Pal
Increasing the size of batch computation
Gradient descent tries to minimize the cost function over all the examples in the training set and, at the same time, over all the features in the input. Stochastic gradient descent is a much less expensive variant that considers only BATCH_SIZE examples at each update step. So, let's see how the behavior changes as we vary this parameter. As the plot shows, the best accuracy is reached for BATCH_SIZE=128:
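The following is a minimal sketch of the batch-size experiment, assuming the simple MNIST multilayer perceptron used earlier in the chapter. It is written against tf.keras rather than standalone Keras; the hyperparameter names (NB_EPOCH, BATCH_SIZE, and so on) mirror the book's convention, but the particular set of candidate batch sizes and the helper build_model function are illustrative, not the authors' exact script:

```python
import numpy as np
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.utils import to_categorical

NB_EPOCH = 20
NB_CLASSES = 10
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2

# Load and flatten MNIST, scale pixels to [0, 1], one-hot encode labels
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784).astype("float32") / 255
X_test = X_test.reshape(10000, 784).astype("float32") / 255
Y_train = to_categorical(y_train, NB_CLASSES)
Y_test = to_categorical(y_test, NB_CLASSES)

def build_model():
    # Two hidden ReLU layers followed by a softmax classifier,
    # trained with plain SGD as in the chapter's baseline model
    model = Sequential()
    model.add(Dense(N_HIDDEN, input_shape=(784,), activation="relu"))
    model.add(Dense(N_HIDDEN, activation="relu"))
    model.add(Dense(NB_CLASSES, activation="softmax"))
    model.compile(loss="categorical_crossentropy",
                  optimizer=SGD(), metrics=["accuracy"])
    return model

# Retrain the same architecture with different BATCH_SIZE values and
# compare test accuracy: larger batches make each epoch cheaper, but
# each epoch also performs fewer weight updates.
for BATCH_SIZE in (32, 64, 128, 256, 512):
    model = build_model()
    model.fit(X_train, Y_train, batch_size=BATCH_SIZE,
              epochs=NB_EPOCH, validation_split=VALIDATION_SPLIT,
              verbose=0)
    _, acc = model.evaluate(X_test, Y_test, verbose=0)
    print("BATCH_SIZE=%d -> test accuracy: %.4f" % (BATCH_SIZE, acc))
```

Each run trains an identically initialized architecture for the same number of epochs, so the only variable is how many examples contribute to each gradient estimate; under this setup the book reports the best accuracy around BATCH_SIZE=128.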