- Hands-On Natural Language Processing with Python
- Rajesh Arumugam Rajalingappaa Shanmugamani
- 106字
- 2025-04-04 16:33:45
Softmax
Softmax normalizes or squashes a vector of arbitrary values to a probability distribution between 0 and 1. The sum of the softmax output will be equal to 1. Therefore, it is commonly used in the last layer of a neural network to predict probabilities of the possible output classes. The following is the mathematical expression for the softmax function for a vector with j values:

Here zj represents the jth vector value and K represents the number of classes. As we can see the exponential function smoothens the output value while the denominator normalizes the final value between 0 and 1.