Weight Initialization and Learning Rate Scheduling

In the course, we were taught about Xavier and He weight initialization and how effective they can be, but there was no demo for it. In the other hands-on sessions we never actually used either of the two initialization techniques. I've searched the internet a lot but still can't figure out how to do it. Also, do people often initialize weights with Xavier/He in real life? I've barely ever seen explicit weight initialization in anyone's code.
Another topic that was mentioned in the course but not covered was learning rate scheduling. It'd be great if you could clarify when it is used and whether it is better than simply running the optimizer with a fixed learning rate.

Yes, we did not cover weight initialization in the hands-on sessions.
For the basic models (like the Perceptron), we would simply have initialized the model's weights and biases randomly.
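As a minimal sketch of what that random initialization looks like (the dimensions and scale here are hypothetical, not from the course material):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

n_features = 4  # hypothetical input dimension

# Small random weights and a zero bias, a common default for a perceptron
weights = rng.normal(loc=0.0, scale=0.01, size=n_features)
bias = 0.0

def predict(x):
    # Step activation: 1 if the weighted sum exceeds 0, else 0
    return int(np.dot(weights, x) + bias > 0)

print(predict(np.ones(n_features)))
```

The small scale (0.01) just keeps the initial outputs near the decision boundary; for a single-layer model like this, the exact choice matters little.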

This can easily be extended to Xavier and other simple weight initialization schemes, but they may be overkill for such small models. For larger models we generally use PyTorch (in this course), where applying those initializations is very simple via torch.nn.init.
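For example, here is one way it could be done with torch.nn.init on a small hypothetical feed-forward network (the layer sizes are made up for illustration):

```python
import torch
import torch.nn as nn

# A small hypothetical feed-forward network
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

def init_weights(m):
    if isinstance(m, nn.Linear):
        # He (Kaiming) initialization suits ReLU activations;
        # use nn.init.xavier_uniform_ instead for tanh/sigmoid layers
        nn.init.kaiming_normal_(m.weight, nonlinearity="relu")
        nn.init.zeros_(m.bias)

# Module.apply() walks the network and runs init_weights on every submodule
model.apply(init_weights)
```

Note that PyTorch's nn.Linear already applies a sensible default initialization on its own, which is one reason you rarely see explicit initialization code in the wild; you mainly write it when you want a specific scheme.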

But yes, we could have shown a quick demo of the above, and we also skipped concepts like learning rate scheduling and weight decay, since the course was already broad in terms of content. We will take note of it and make sure these are covered in our next courses.
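For completeness, a learning rate scheduler in PyTorch is a small wrapper around the optimizer; a sketch with StepLR (the model, learning rate, and schedule here are arbitrary placeholders):

```python
import torch
import torch.nn as nn

# Hypothetical model and optimizer, just to illustrate scheduling
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Halve the learning rate every 10 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

for epoch in range(30):
    # ... training loop would go here: forward pass, loss, backward() ...
    optimizer.step()   # placeholder parameter update
    scheduler.step()   # advance the schedule once per epoch

# After 30 epochs the LR has been halved three times
print(optimizer.param_groups[0]["lr"])
```

Whether this beats a fixed learning rate depends on the problem: decaying the rate often helps training settle into a better minimum late in training, but it adds hyperparameters (step size, decay factor) that need tuning.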

Thanks a lot for the feedback :slight_smile: