I was told that a final 1x1 convolution with D0 filters is applied to the aggregated output of the different convolution operations, reducing its depth from D1 layers to D0 layers.
But the number of layers actually seems to increase after each operation.
How do 1x1 convolutions reduce the depth in the GoogLeNet CNN?
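
To make the depth arithmetic in the question concrete, here is a minimal NumPy sketch. It assumes the outputs of the different convolution operations are stacked along the channel axis (so the depth grows to D1), and that a 1x1 convolution with D0 filters is the same as multiplying the channel vector at every pixel by a D1 x D0 weight matrix. The branch depths (64, 128, 32), the 28x28 spatial size, D0 = 64, and the helper name `conv1x1` are all illustrative assumptions, not values taken from GoogLeNet.

```python
import numpy as np

def conv1x1(x, weights):
    """Apply a 1x1 convolution (no bias, no activation).

    x:       feature map of shape (H, W, D1)
    weights: one column per filter, shape (D1, D0)
    returns: feature map of shape (H, W, D0)
    """
    # Each output channel is a weighted sum of the D1 input channels
    # at the same pixel; the spatial size is untouched.
    return np.einsum("hwc,cd->hwd", x, weights)

rng = np.random.default_rng(0)
H, W = 28, 28

# Hypothetical outputs of three parallel convolution branches.
branches = [rng.standard_normal((H, W, d)) for d in (64, 128, 32)]

# Stacking the branches is what makes the depth grow: D1 = 64 + 128 + 32.
stacked = np.concatenate(branches, axis=-1)
D1 = stacked.shape[-1]

# A 1x1 convolution with D0 filters maps depth D1 down to D0.
D0 = 64
weights = rng.standard_normal((D1, D0))
reduced = conv1x1(stacked, weights)

print(stacked.shape)  # (28, 28, 224)
print(reduced.shape)  # (28, 28, 64)
```

In this sketch the depth increases only at the stacking step. The output depth of the 1x1 convolution is set purely by the number of filters, so choosing D0 smaller than D1 shrinks the depth while leaving height and width unchanged.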
# The intuition behind GoogLeNet