In the video ‘CNN Architectures using Pytorch: Inception part 2’, the loss is written as
loss = loss_fn(outputs, labels) + 0.3 * loss_fn(aux_outputs, labels) to prevent vanishing gradients in the inception modules.
However, since all the weights in the inception modules of the main branch are frozen, and the aux_outputs are not used to compute the final accuracy in any way, training the weights in the aux output branch should have no effect on the final output. Adding the aux-branch loss should only matter if some of the convolution layers before the point where the aux branch joins the main branch are unfrozen (requires_grad = True).
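To make the setup concrete, here is a minimal sketch of what I mean, assuming torchvision's inception_v3 (the video may build the model differently); num_classes, the batch size, and the input shape are hypothetical:

```python
import torch
import torch.nn as nn
from torchvision import models

# Assumed setup: torchvision's inception_v3 with its auxiliary head.
model = models.inception_v3(weights=models.Inception_V3_Weights.DEFAULT)

# Freeze the entire network, including every inception module on the main branch.
for param in model.parameters():
    param.requires_grad = False

# Unfreeze only the aux output branch.
for param in model.AuxLogits.parameters():
    param.requires_grad = True

# Replace both classifier heads for the new task (new layers default to requires_grad=True).
num_classes = 10  # hypothetical
model.fc = nn.Linear(model.fc.in_features, num_classes)
model.AuxLogits.fc = nn.Linear(model.AuxLogits.fc.in_features, num_classes)

loss_fn = nn.CrossEntropyLoss()
model.train()
inputs = torch.randn(4, 3, 299, 299)  # inception_v3 expects 299x299 inputs
labels = torch.randint(0, num_classes, (4,))

outputs, aux_outputs = model(inputs)  # aux head is only active in train mode
loss = loss_fn(outputs, labels) + 0.3 * loss_fn(aux_outputs, labels)
loss.backward()

# Since everything upstream of the point where the aux branch connects has
# requires_grad=False, the aux loss can only update the aux branch's own
# parameters; nothing that feeds the main-branch output ever changes.
```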
Is this understanding correct?