What happens in Batch Normalization when batch_size=1?

I was going through the theory video on Batch Normalization and Dropout in the DL course. In the Batch Normalization video, the instructor explained that whatever batch size of input data you use, you normalize/standardize based on that batch (at the input layer and hidden layers), which is why it is called Batch Normalization. My question: what if we use stochastic gradient descent, which has a batch size of 1? In that case I can't do the normalization. What should we do here?

Hi @rishabhsaran,
In such a scenario, some frameworks perform Instance Normalization instead, which normalizes each sample over its own features/channels rather than over the batch.
For a better overview of the different types of normalization, you can refer to: An Overview of Normalization Methods
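To see why batch size 1 is a problem, here is a minimal NumPy sketch (the helper names `batch_norm` and `instance_norm` are illustrative, not from any framework): with a single sample, normalizing over the batch axis subtracts each activation from itself, so batch norm degenerates to all zeros, while normalizing over the feature axis (instance-style) still gives a meaningful result.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each feature over the batch dimension (axis 0).
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def instance_norm(x, eps=1e-5):
    # Normalize each sample over its own features (axis 1).
    mean = x.mean(axis=1, keepdims=True)
    var = x.var(axis=1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

x = np.array([[1.0, 2.0, 3.0, 4.0]])  # batch_size = 1, 4 features

print(batch_norm(x))     # degenerate: every output is 0 (x - mean(x) = 0)
print(instance_norm(x))  # standardized within the sample: non-trivial output
```

In practice, frameworks like PyTorch sidestep this at inference time by using running statistics accumulated during training, so batch norm in eval mode works fine even with a single input.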
