Machine learning: Math

Softmax

Softmax takes a vector and returns a vector with the same number of elements.
The returned vector is a probability distribution: each element is strictly between 0 and 1, and the elements sum to 1.
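For reference, the standard definition, with input x = (x_1, ..., x_n):

\[ \operatorname{softmax}(x)_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}, \qquad i = 1, \dots, n \]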
Because the returned vector is a probability distribution, it can be passed as input to the cross-entropy loss function.
The returned vector is invariant to adding a constant to every element of the input vector, that is: softmax(a, b, c) = softmax(a+x, b+x, c+x). This is demonstrated numerically by the following piece of Python code (an algebraic derivation follows it):
import numpy as np

def softmax(x):
    # Subtracting the maximum does not change the result (shift
    # invariance) but prevents overflow in np.exp for large inputs.
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum()

def format_array(a):
    return ', '.join('{:1.5f}'.format(n) for n in a)

# Each input vector is a shifted copy of [0, 3.7, 9.5, 10.8],
# so the loop prints the same line four times:
#    0.00002, 0.00065, 0.21402, 0.78531
for x in [0.1, 3.1, 109.2, 450.5]:
    print(format_array(softmax(np.array([x, x + 3.7, x + 9.5, x + 10.8]))))
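The same invariance follows algebraically from the definition above, because the common factor e^c cancels (here x + c means adding the constant c to every element of x):

\[ \operatorname{softmax}(x + c)_i = \frac{e^{x_i + c}}{\sum_{j} e^{x_j + c}} = \frac{e^{c}\, e^{x_i}}{e^{c} \sum_{j} e^{x_j}} = \operatorname{softmax}(x)_i \]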

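As noted above, the softmax output can serve as input to a cross-entropy loss. A minimal sketch of that combination (the cross_entropy helper and the integer-index convention for the true class are illustrative, not part of the original note):

import numpy as np

def softmax(x):
    # Same numerically stable softmax as above.
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum()

def cross_entropy(probs, target_index):
    # Negative log-probability of the true class; small when the
    # model assigns high probability to that class.
    return -np.log(probs[target_index])

logits = np.array([0.1, 3.8, 9.6, 10.9])   # first vector from the loop above
probs = softmax(logits)                     # 0.00002, 0.00065, 0.21402, 0.78531
print(cross_entropy(probs, target_index=3)) # approx. 0.2417 = -log(0.78531)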