In a deep learning course I read about sigmoid neurons. A sigmoid neuron and a logistic regression model seem to have the same structure: real-valued weighted inputs, a summation, and finally a sigmoid function. In logistic regression we train with the maximum likelihood principle, while for the sigmoid neuron we use the sum of squared errors. I also read that MSE shouldn't be used as the loss function in logistic regression, unlike linear regression, because it becomes non-convex there, so gradient descent may not reach the global minimum (reference). But the sum of squared errors used for the sigmoid neuron is essentially the same as MSE, so why are the two trained in such different ways? Why not use the same loss, whether sum of squared errors, MSE, or maximum likelihood, for both?
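To make the non-convexity claim concrete, here is a small numerical sketch of what I mean (the toy data and the 1-D weight slice are my own invention, purely for illustration): it evaluates both losses for a single sigmoid neuron with one weight and checks the sign of the second differences along that slice.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy 1-D data: inputs x with binary labels y.
x = np.array([-2.0, -1.0, 0.5, 1.5, 3.0])
y = np.array([0.0, 0.0, 1.0, 1.0, 1.0])

def mse_loss(w):
    """Mean squared error of sigmoid predictions at weight w."""
    p = sigmoid(w * x)
    return np.mean((p - y) ** 2)

def cross_entropy_loss(w):
    """Negative log-likelihood (cross-entropy), written with
    logaddexp for numerical stability: -log sigmoid(z) = log(1+e^{-z})."""
    z = w * x
    return np.mean(y * np.logaddexp(0.0, -z) + (1 - y) * np.logaddexp(0.0, z))

# Evaluate both losses along a 1-D slice of weight space.
ws = np.linspace(-10.0, 10.0, 201)
mse_vals = np.array([mse_loss(w) for w in ws])
ce_vals = np.array([cross_entropy_loss(w) for w in ws])

def second_diffs(v):
    """Discrete second differences; a convex curve has these >= 0."""
    return v[2:] - 2 * v[1:-1] + v[:-2]

# Small negative tolerance absorbs floating-point noise.
mse_convex = bool(np.all(second_diffs(mse_vals) >= -1e-9))
ce_convex = bool(np.all(second_diffs(ce_vals) >= -1e-9))

print("MSE convex along this slice?", mse_convex)            # → False
print("Cross-entropy convex along this slice?", ce_convex)   # → True
```

On this slice the squared-error curve flattens toward 1 on the side where every prediction is wrong, which is exactly the concave plateau people warn about, while the cross-entropy curve stays convex.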
Thank you in advance.