Does adding the aux output's loss to the main output's loss matter in the Inception model?

In the video ‘CNN Architectures using Pytorch: Inception part 2’, the loss is written as
`loss = loss_fn(outputs, labels) + 0.3 * loss_fn(aux_outputs, labels)`, to prevent vanishing gradients in the inception modules.

However, since all weights in the inception modules of the main branch are frozen, and we are not using the aux_outputs to calculate the final accuracy in any way, training the weights in the aux branch should have no effect on the final output. Adding the aux loss should only matter if some of the convolution layers before the point where the aux branch connects to the main branch are unfrozen (requires_grad = True), because only then can the aux gradients reach weights that the main output actually depends on.
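The gradient-flow argument can be checked directly with a toy stand-in (hypothetical `trunk` / `main_head` / `aux_head` names, not the real torchvision model): freeze the shared trunk, backprop the combined loss, and see which parameters receive gradients.

```python
import torch
import torch.nn as nn

# Toy stand-in for Inception: a shared "trunk", a main head, and an
# aux head that branches off the trunk (hypothetical minimal sketch).
trunk = nn.Linear(8, 8)
main_head = nn.Linear(8, 4)
aux_head = nn.Linear(8, 4)

# Freeze the trunk, as in the lesson's transfer-learning setup.
for p in trunk.parameters():
    p.requires_grad = False

x = torch.randn(2, 8)
labels = torch.tensor([0, 1])

feats = trunk(x)
outputs = main_head(feats)
aux_outputs = aux_head(feats)

loss_fn = nn.CrossEntropyLoss()
loss = loss_fn(outputs, labels) + 0.3 * loss_fn(aux_outputs, labels)
loss.backward()

# With the trunk frozen, the aux loss only produces gradients in the
# aux head, so it cannot influence anything the main output depends on.
assert trunk.weight.grad is None        # frozen: no gradient
assert aux_head.weight.grad is not None # aux loss reaches the aux head
assert main_head.weight.grad is not None
```

If some trunk layers were unfrozen instead, the aux gradients would flow into weights shared with the main path, which is exactly when the aux term starts to matter.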

Is this understanding correct ?


Hi @parsar0,
Just to confirm this, did you try experimenting with both and comparing the results between the two?

I did now. I ran two scenarios, each twice.

a) With only the output layers of the aux branch and the main branch unfrozen (same as the lesson):
Epoch: 0/1, Test acc: 47.57, Train acc: 47.23
Accuracy for best model: 47.29, 46.79

Epoch: 0/1, Test acc: 47.14, Train acc: 47.01
Accuracy for best model: 47.23, 46.97

b) In addition, with all layers of the Mixed_6e module unfrozen:
Epoch: 0/1, Test acc: 64.71, Train acc: 64.64
Accuracy for best model: 63.57, 62.51

Epoch: 0/1, Test acc: 65.55, Train acc: 65.71
Accuracy for best model: 64.82, 65.20

So there is a consistent improvement. Seems to work.


Great 🙂
Let me know if there’s anything I can help with in this regard.