This is the formula for the loss function of the Sigmoid Neuron (SSE), as shown below.

But here Mitesh Sir used MSE while training. I’ve seen SSE and MSE used interchangeably in many places. So is it the case that no single standard loss function is defined for a particular model? Can we switch between SSE and MSE?

The y in the first image is the output, and the function is a sigmoid function. The loss below it is calculated using the MSE (mean squared error) function, with the y and y_hat values fed into it. Also, since the error is squared, (y - y_hat)^2 and (y_hat - y)^2 give the same result. The constant factor (1/N, where N is the number of training examples) simply averages the total error. It does not affect the learning much, because our objective is to minimise the overall loss by optimising our parameters, and scaling the loss by a positive constant does not move its minimum; it only rescales the gradient, which acts like a change in the learning rate. You may try both loss functions (with and without the constant) and compare the results. Hope this helps!!
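For anyone who wants to try the comparison suggested above, here is a minimal sketch in NumPy. It is not the code from the course: the toy data, the function names (`sigmoid`, `sse`, `mse`, `train`), the learning rate, and the single-weight neuron are all assumptions made for illustration. It trains the same sigmoid neuron once with SSE and once with MSE, and scales the MSE learning rate by N to cancel the 1/N factor.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sse(y, y_hat):
    # Sum of squared errors (no 1/N and no 1/2 factor assumed here).
    return np.sum((y_hat - y) ** 2)

def mse(y, y_hat):
    # Mean squared error: SSE divided by the number of examples.
    return np.mean((y_hat - y) ** 2)

def train(x, y, lr, use_mean, epochs=1000):
    """Plain gradient descent on a one-input sigmoid neuron (hypothetical helper)."""
    w, b = 0.0, 0.0
    scale = 1.0 / len(x) if use_mean else 1.0  # the only difference between MSE and SSE
    for _ in range(epochs):
        y_hat = sigmoid(w * x + b)
        # Derivative of (y_hat - y)^2 with respect to the pre-activation w*x + b.
        delta = 2.0 * (y_hat - y) * y_hat * (1.0 - y_hat)
        w -= lr * scale * np.sum(delta * x)
        b -= lr * scale * np.sum(delta)
    return w, b

# Toy data, made up for this example.
x = np.array([0.5, 2.5, -1.0, 3.0])
y = np.array([0.2, 0.9, 0.1, 0.95])
n = len(x)

w_sse, b_sse = train(x, y, lr=0.1, use_mean=False)
w_mse, b_mse = train(x, y, lr=0.1 * n, use_mean=True)  # learning rate scaled by N

print("SSE-trained parameters:", w_sse, b_sse)
print("MSE-trained parameters:", w_mse, b_mse)
```

With the learning rate scaled this way, both runs take the same update at every step, so they should end at the same parameters. If you keep the learning rate the same for both, MSE simply takes steps that are N times smaller, so it moves toward the same minimum more slowly.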

Fundamentally, if we plot these two losses against the parameters for comparison, the graphs have the same shape and the same minimum (to be clear, MSE is just SSE multiplied by a constant, 1/N).
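To spell out the relationship between the two losses, here is a short derivation. It assumes N training examples, parameters w and b, a learning rate written as \eta, and no extra 1/2 factor in either loss (some texts include one, which only changes the constant).

```latex
\begin{align*}
\mathrm{SSE}(w, b) &= \sum_{i=1}^{N} \left(\hat{y}_i - y_i\right)^2 \\
\mathrm{MSE}(w, b) &= \frac{1}{N} \sum_{i=1}^{N} \left(\hat{y}_i - y_i\right)^2
                    = \frac{1}{N}\,\mathrm{SSE}(w, b) \\
\nabla \mathrm{MSE}(w, b) &= \frac{1}{N}\,\nabla \mathrm{SSE}(w, b) \\
\arg\min_{w, b}\ \mathrm{MSE}(w, b) &= \arg\min_{w, b}\ \mathrm{SSE}(w, b)
\end{align*}
```

So a gradient descent step with learning rate \eta on SSE is the same as a step with learning rate N\eta on MSE; the choice between the two changes the scale of the loss values, not where the minimum is.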