Help needed with the code shown below (sequence models DL#110)

In the deep learning course, the 10th module (sequence models) shows us in practice how to implement RNNs, LSTMs and GRUs. In the RNN part of the practical videos, we build the full training setup, which will effectively be used for any of the three. I am attaching a snippet of the code.


In the training setup, we loop over the batches (10 in this case). I am unable to understand what the following code in the loop means:
```
for i in range(n_batches):
    loss_arr[i+1] = (loss_arr[i]*i + train(net, opt, criterion, batch_size))/(i + 1)
```

It’d be great if someone could clarify why the loss is being multiplied (loss_arr[i] * i), why the result of train(net, opt, criterion, batch_size) is added to it, and why the whole thing is divided by i + 1 at each step.
Thanks in advance.

Let me expand that code snippet so that it is easier to follow:

for batch_count in range(n_batches):
    current_batch_loss = train(net, opt, criterion, batch_size)
    total_loss_till_prev_batch = loss_arr[batch_count] * batch_count
    total_loss_till_now = total_loss_till_prev_batch + current_batch_loss
    current_avg_loss = total_loss_till_now/(batch_count+1)
    loss_arr[batch_count+1] = current_avg_loss

Things to remember:

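  1. loss_arr[i] contains the loss averaged over the first i batches (batch 0 through batch i-1). loss_arr[0] is only a starting placeholder: it gets multiplied by 0, so its value never affects the result.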
  2. batch_count begins from zero.
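
To see why the multiplication and the division are there, it may help to write the update as a formula. The symbols a_i and \ell_i below are just my own shorthand (they are not in the course code): a_i stands for loss_arr[i], and \ell_i for the loss that train(...) returns on batch i. For i >= 1:

```latex
\begin{align*}
a_i &= \frac{1}{i}\sum_{k=0}^{i-1} \ell_k
  \quad\Longrightarrow\quad i\,a_i = \sum_{k=0}^{i-1} \ell_k, \\
a_{i+1} &= \frac{i\,a_i + \ell_i}{i+1}
  = \frac{1}{i+1}\sum_{k=0}^{i} \ell_k .
\end{align*}
```

So multiplying the old average by i recovers the sum of the first i losses, adding \ell_i extends that sum by one batch, and dividing by i + 1 turns it back into an average. For i = 0 the term i\,a_i is zero, so a_1 is simply \ell_0.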

Essentially, at each step we store the running average of the loss over all batches seen so far in loss_arr.
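
If you want to check this yourself, here is a minimal sketch that runs the same update on made-up numbers. It assumes nothing about the course's actual train function: the fake_losses list and the running_average helper are hypothetical stand-ins that I introduced just for illustration.

```python
def running_average(batch_losses):
    """Apply the same update as the course snippet to a list of per-batch losses.

    Returns loss_arr, where loss_arr[i + 1] is the mean of the first i + 1 losses.
    """
    # loss_arr[0] is only a placeholder; it is multiplied by 0 in the first step.
    loss_arr = [0.0] * (len(batch_losses) + 1)
    for i, batch_loss in enumerate(batch_losses):
        # loss_arr[i] * i recovers the sum of the first i losses.
        loss_arr[i + 1] = (loss_arr[i] * i + batch_loss) / (i + 1)
    return loss_arr


# Hypothetical values standing in for train(net, opt, criterion, batch_size).
fake_losses = [4.0, 2.0, 3.0, 1.0]
loss_arr = running_average(fake_losses)
print(loss_arr)  # [0.0, 4.0, 3.0, 3.0, 2.5]
```

The last entry, 2.5, is the plain mean of the four losses (10.0 / 4), which is what you would expect from a running average.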
Please let us know if the above simplification makes things clear.


Hey Gokul. Thanks for explaining. That helped a lot!