Compiling the model

The main difference during compilation here has to do with the loss function and metric we choose to implement. We will use the MSE loss function to penalize higher prediction errors, while monitoring our model's training progress with the Mean Absolute Error (MAE) metric:

from keras import optimizers

model.compile(optimizer=optimizers.RMSprop(lr=0.001),
              loss='mse',
              metrics=['mae'])
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 6)                 72006
_________________________________________________________________
dense_2 (Dense)              (None, 6)                 42
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 7
=================================================================
Total params: 72,055
Trainable params: 72,055
Non-trainable params: 0
_________________________________________________________________

As we saw previously, the MSE function measures the average of the squares of our network's prediction errors. Simply put, we are measuring the average squared difference between the estimated and actual house price labels. The squared term emphasizes the spread of our prediction errors by penalizing more heavily the errors that are further away from the mean. This approach is especially helpful with regression tasks, where even small error values can have a significant impact on predictive accuracy.
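As a quick illustration, here is a minimal sketch of the MSE computation in NumPy, using hypothetical label and prediction values (not taken from our dataset):

import numpy as np

# Hypothetical house-price labels and predictions, in thousands of dollars
y_true = np.array([24.0, 21.6, 34.7, 33.4])
y_pred = np.array([22.5, 23.0, 30.1, 35.0])

# Mean squared error: the average of the squared differences
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # ~6.98; the single large error (4.6) dominates once squared

Note how the one prediction that is off by 4.6 contributes far more to the loss than the three smaller errors combined.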

In our case, our housing price labels range between 5 and 50, measured in thousands of dollars. Hence, an absolute error of 1 actually means a difference of $1,000 in prediction. Thus, using an absolute error-based loss function might not give the best feedback mechanism to the network.

On the other hand, the choice of MAE as a metric is ideal for measuring our training progress itself. Visualizing squared errors, as it turns out, is not very intuitive to us humans. It is better to simply see the absolute errors in our model's predictions, as this is visually more informative. Our choice of metric has no actual impact on the training mechanism of the model—it simply provides a feedback statistic to visualize how well or poorly our model is doing during the training session. The MAE metric itself is essentially a measure of the difference between two continuous variables.
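To see why the MAE is easier to interpret, here is a short sketch using the same hypothetical values as before:

import numpy as np

y_true = np.array([24.0, 21.6, 34.7, 33.4])
y_pred = np.array([22.5, 23.0, 30.1, 35.0])

# Mean absolute error: the average absolute difference, expressed in the
# same units as the labels (thousands of dollars)
mae = np.mean(np.abs(y_true - y_pred))
print(mae)  # ~2.275, i.e. predictions are off by about $2,275 on average

Unlike the squared value reported by the loss, the MAE reads directly as an average dollar error, which is exactly the kind of feedback we want to watch during training.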