I was working on a Kaggle kernel to train a deep learning model, and I wanted to know the difference between GPU usage and GPU memory percentage.
I first send the net to the GPU device, then use a dataloader to get a batch of tensors and send each batch to the GPU to train the network. My GPU usage percentage keeps fluctuating while the GPU memory remains constant. What does this indicate? How do I know that my model is being trained on the GPU? And are there any methods to train models on the GPU efficiently?
Note: My batches take up a lot of memory, so I can't load the entire dataset onto the GPU for training.
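On checking whether training really happens on the GPU: one way is to compare the device of the model's parameters with the device of each batch. The sketch below assumes PyTorch (the question does not name the framework) and uses a tiny `nn.Linear` model and random data as hypothetical stand-ins for the real network and dataset; `param_device` is a helper name made up for illustration. It falls back to the CPU when no GPU is attached.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Fall back to the CPU so the sketch also runs where no GPU is attached.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-ins for the real network and dataset.
net = nn.Linear(8, 2).to(device)
dataset = TensorDataset(torch.randn(32, 8), torch.randint(0, 2, (32,)))
loader = DataLoader(dataset, batch_size=8)

optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()


def param_device(model):
    """Return the device that holds the model's first parameter."""
    return next(model.parameters()).device


for inputs, targets in loader:
    # Only the current batch is moved to the device, not the whole dataset.
    inputs, targets = inputs.to(device), targets.to(device)
    # If these match and the device type is "cuda", the step runs on the GPU.
    assert inputs.device == param_device(net)
    optimizer.zero_grad()
    loss = loss_fn(net(inputs), targets)
    loss.backward()
    optimizer.step()

print("model device:", param_device(net))
if device.type == "cuda":
    # Bytes currently occupied by tensors on the GPU.
    print("allocated bytes:", torch.cuda.memory_allocated(device))
```

If the printed device is `cuda:0`, the parameters live on the GPU and every batch that passes the assertion is processed there. Memory that stays roughly constant is expected in this setup, since the model stays resident and each batch has the same size. Utilization that rises and falls usually means the GPU is idle part of the time, for example while the next batch is being loaded and copied.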
Some suggestions for working with limited GPU memory:
- You can reduce the batch size.
- You can try progressive training: train your network in sub-stages rather than loading the complete network in one go.
- Avoid keeping extra tensors around to store intermediate outputs.
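To get a feel for how much the first suggestion helps, here is a rough back-of-the-envelope sketch of how input memory scales with batch size. The function names, the 3x224x224 sample shape, and the 100 MiB budget are all assumptions chosen for illustration. It counts only the input tensor; activations and gradients also grow with batch size, so actual usage will be higher.

```python
from math import prod


def batch_bytes(batch_size, sample_shape, bytes_per_element=4):
    """Bytes needed for one input batch (4 bytes per element = float32)."""
    return batch_size * prod(sample_shape) * bytes_per_element


def largest_batch(budget_bytes, sample_shape, bytes_per_element=4):
    """Largest batch size whose input tensor fits within budget_bytes."""
    per_sample = prod(sample_shape) * bytes_per_element
    return budget_bytes // per_sample


# Hypothetical sample shape: 3-channel 224x224 float32 images.
shape = (3, 224, 224)

print(batch_bytes(64, shape))  # 38535168 bytes
print(batch_bytes(32, shape))  # 19267584 bytes, half as much

# Largest input batch that fits in an assumed 100 MiB budget.
print(largest_batch(100 * 1024**2, shape))  # 174
```

Halving the batch size halves the input memory, which is why it is usually the first thing to try.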