Doubt regarding Cross entropy function!

Cross-entropy function = -sum(yi_true * log(yi_hat))
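For concreteness, a minimal sketch of this computation with a one-hot target (the function name and example numbers are just illustrative):

```python
import math

def cross_entropy(y_true, y_hat):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    # Terms with y_true = 0 contribute nothing, so only the true class matters.
    return -sum(t * math.log(p) for t, p in zip(y_true, y_hat) if t > 0)

# One-hot target for class 1; the model assigns it probability 0.7.
loss = cross_entropy([0, 1, 0], [0.2, 0.7, 0.1])  # equals -log(0.7)
```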

Actually, I have 2 doubts:

  1. When using cross-entropy loss, sum(-yi_true * log(yi_hat)), suppose the model trains such that it predicts yi_hat = 0 for a class with yi_true = 1. Won't this loss function give an error, since log 0 isn't defined?

  2. It is further stated that the loss only accounts for the true classes: for yi_true = 0, the term vanishes. Now suppose the model predicts yi_hat = 0.9999 for a class with yi_true = 0. Technically this is an error, yet the cross-entropy function doesn't seem to account for it. How can we explain this?
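On the first doubt: log 0 is indeed undefined, and in practice implementations clamp the predicted probability away from zero before taking the log. A minimal sketch (the helper name and epsilon value are assumptions, not from any particular library):

```python
import math

def safe_log(p, eps=1e-12):
    # Clamp the probability away from 0 so log() stays finite,
    # mirroring the clipping most frameworks apply internally.
    return math.log(max(p, eps))

# math.log(0.0) raises ValueError; the clipped version returns
# a large negative number instead of crashing.
print(safe_log(0.0))
```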


  1. Yes, it is a problem if the model outputs y_pred = 0 while y_true != 0, since log 0 is undefined.
    With a sigmoid or softmax output, this is almost impossible: those functions never produce exactly 0, only values arbitrarily close to it.
    With other activations there is a real chance of it happening, which is why implementations typically clip the prediction (e.g. log(max(y_pred, eps))) and why we may also apply regularization on top.
  2. The loss does account for this case, just indirectly. With softmax, the predicted probabilities sum to 1, so assigning 0.9999 to a wrong class forces the true class to receive at most 0.0001, and -log(0.0001) is a very large loss. As the model learns to get the true classes right, such confident mistakes become rare; they mostly arise from an under-trained model or from data sampled from a distribution different from the training data's distribution.
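A quick numeric check of the second point: because softmax outputs sum to 1, a confident wrong prediction caps the true-class probability, and the surviving true-class term of the loss is large (the example probabilities are made up for illustration):

```python
import math

y_hat = [0.9999, 0.00005, 0.00005]  # model is very confident in class 0
y_true = [0, 1, 0]                   # but class 1 is the correct one

# Only the true-class term survives in the cross-entropy sum,
# and it is heavily penalized:
loss = -math.log(y_hat[1])
print(loss)  # roughly 9.9, a very large loss
```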