I was trying to build a multi-class image classifier using the VGG architecture as a reference. The model has 6 Conv layers and 3 densely connected layers, around 5.4M parameters in total, and training a single epoch is taking around 4-5 hours on Colab. (PS: I used a GPU runtime, added dropout after the MaxPooling layers, and also used batch gradient descent and various other optimization practices.)
I would be happy if you could give me some suggestions for training the model faster, if possible.
What’s the size of your dataset?
The training set has 19.5k images across 20 classes.
I also resized the input images to 64x64.
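For a sense of scale, here is a rough back-of-the-envelope sketch of how much memory the decoded training set would need at that size. It assumes 3-channel RGB images, which the thread does not state, and the function name `dataset_bytes` is only illustrative.

```python
def dataset_bytes(num_images, height, width, channels, bytes_per_value):
    """Return the size in bytes of the fully decoded dataset."""
    return num_images * height * width * channels * bytes_per_value


# 19.5k images at 64x64; the 3 channels (RGB) are an assumption.
as_uint8 = dataset_bytes(19_500, 64, 64, 3, 1)    # raw 8-bit pixels
as_float32 = dataset_bytes(19_500, 64, 64, 3, 4)  # after casting to float32

print(f"uint8:   {as_uint8 / 1e6:.0f} MB")    # uint8:   240 MB
print(f"float32: {as_float32 / 1e6:.0f} MB")  # float32: 958 MB
```

Under those assumptions the whole training set would fit in RAM on a typical Colab instance, so slow epochs are more likely to come from how the files are read than from the amount of data.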
But to my surprise, only the first epoch took 5:07 hours; after that, every epoch finished in 170-190 seconds.
The initial delay might be due to mounting the drive and accessing data directly from it.
Oh, so you’ve stored your data in your drive and you’re mounting it?
This is definitely the main reason!
A better option you can try is to upload your data to Kaggle and create a private notebook there, which avoids reading from a mounted drive.
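If you would rather stay on Colab, another thing to try is copying the dataset from the mounted drive to the runtime's local disk once, before training, so that every epoch reads from local storage. The following is only a minimal sketch: the helper name `copy_dataset_locally` and both paths in the usage comment are hypothetical, and it assumes the drive is already mounted.

```python
import shutil
import time
from pathlib import Path


def copy_dataset_locally(src_dir, dst_dir):
    """Copy a dataset folder to local disk once and return the local path."""
    src = Path(src_dir)
    dst = Path(dst_dir)
    if dst.exists():
        # Already copied in this session, so skip the slow transfer.
        return dst
    start = time.perf_counter()
    shutil.copytree(src, dst)
    elapsed = time.perf_counter() - start
    print(f"Copied {src} -> {dst} in {elapsed:.1f} s")
    return dst


# Hypothetical usage on Colab (both paths are assumptions, adjust to your setup):
# local_dir = copy_dataset_locally("/content/drive/MyDrive/dataset", "/content/dataset")
# Then point your data loader at local_dir instead of the mounted folder.
```

Copying thousands of small files can itself be slow, so it may be worth storing the dataset as a single archive on the drive and extracting it locally instead.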