Softmax vs LogSoftmax

In the RNNs hands-on videos, LogSoftmax was used instead of the normal Softmax function for the output layer.

How does taking the log of the softmax probabilities help us?

I suspect it has something to do with the negative log likelihood loss function (nn.NLLLoss), but I’m not sure.
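To illustrate what I mean, here is a quick pure-Python sketch (my own toy `log_softmax` helper, not the PyTorch implementation): computing log-softmax directly via the log-sum-exp trick stays stable even when exponentiating the logits would overflow, whereas taking `softmax` first and then `log` would blow up.

```python
import math

def log_softmax(x):
    # Stable log-softmax: subtract the max before exponentiating
    # (the log-sum-exp trick), so no exp() ever overflows.
    m = max(x)
    lse = m + math.log(sum(math.exp(v - m) for v in x))
    return [v - lse for v in x]

def nll_loss(log_probs, target):
    # Negative log likelihood of the target class,
    # given log-probabilities (what nn.NLLLoss expects).
    return -log_probs[target]

logits = [1000.0, 0.0, -1000.0]
# Naive route would fail: math.exp(1000.0) raises OverflowError,
# so softmax followed by log is not even computable here.
lp = log_softmax(logits)          # stable: roughly [0.0, -1000.0, -2000.0]
loss = nll_loss(lp, target=1)     # roughly 1000.0
print(lp, loss)
```

If this is right, then pairing LogSoftmax with NLLLoss is just a numerically safe way of computing cross-entropy, which would explain why the videos use it for the output layer.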

Duplicate of: Why do we use nn.LogSoftmax for NLL Loss instead of nn.Softmax?