In the **1st screenshot (grad_w_ce & grad_b_ce)** and on the right side of the **2nd screenshot (grad_w & grad_b)**, the gradients are supposed **to be equal**, because both are gradients with respect to the cross-entropy loss. So why are they different in the explanation?

Could you please help? I got stuck there.

It’s a fair doubt, but notice that we’re taking `grad_b = -1 * (1 - y_pred)` only where `y == 1`.

We can simplify it as:

`grad_b = (-1 + y_pred)`

Therefore,

`grad_b = (y_pred - 1)`

which is nothing but

`grad_b = (y_pred - y)`

as y is 1.
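
If it helps, the simplification above can be checked numerically. This is only a minimal sketch, not the code from the screenshots: it assumes `y_pred` is a sigmoid output, that the loss is binary cross-entropy, and that the bias is added directly to the logit. The helper names (`grad_b_positive_case`, `grad_b_general`, `numeric_grad_b`) are hypothetical and exist only for this illustration.

```python
import math


def sigmoid(z):
    # Assumed activation: y_pred = sigmoid(z), where z is the logit.
    return 1.0 / (1.0 + math.exp(-z))


def grad_b_positive_case(y_pred):
    # Form quoted in the reply; only meant for samples where y == 1.
    return -1 * (1 - y_pred)


def grad_b_general(y_pred, y):
    # Simplified form from the reply.
    return y_pred - y


def numeric_grad_b(z, y, eps=1e-6):
    # Central finite difference of binary cross-entropy with respect to the
    # logit. Under the assumption that the bias is added to the logit, this
    # is also the gradient with respect to the bias.
    def loss(logit):
        p = sigmoid(logit)
        return -(y * math.log(p) + (1 - y) * math.log(1 - p))

    return (loss(z + eps) - loss(z - eps)) / (2 * eps)


for z in [-2.0, -0.5, 0.0, 1.5, 3.0]:
    p = sigmoid(z)
    # The two written forms agree where y == 1 ...
    assert math.isclose(grad_b_positive_case(p), grad_b_general(p, 1), abs_tol=1e-12)
    # ... and both match the numerically estimated gradient of the loss.
    assert math.isclose(numeric_grad_b(z, 1), grad_b_general(p, 1), abs_tol=1e-6)
```

If all the assertions pass silently, the two expressions are the same quantity written in two ways for the `y == 1` samples.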