In the attached screenshot, I have two queries:
a. Why are we using mean to calculate accuracy? Accuracy is the number of correct predictions divided by the total number of predictions, but I'm not getting why we're taking the mean here.
return (pred == y).float().mean()
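For context, here is a tiny check I ran on my end (pred and y are made-up example values, not from the lesson) comparing the mean of the 0/1 match tensor against correct/total by hand:

```python
import torch

pred = torch.tensor([0, 1, 1, 0])
y    = torch.tensor([0, 1, 0, 0])

matches = (pred == y).float()      # tensor([1., 1., 0., 1.])

accuracy_via_mean = matches.mean().item()
accuracy_manual   = matches.sum().item() / len(matches)

print(accuracy_via_mean)           # 0.75
print(accuracy_manual)             # 0.75
```

Both come out the same, since the mean of a 0/1 tensor is its sum (the number of correct predictions) divided by its length (the total number of predictions), but I'd still like to confirm that this is the intended reasoning.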
b. Why are we dividing the weights by the square root of 2 here? And what will the requires_grad() function do to the weights?
weights1 = torch.randn(2, 2) / math.sqrt(2)
weights2 = torch.randn(2, 4) / math.sqrt(2)
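Here is a sketch of my current understanding, which I'd like someone to confirm (the layer sizes below are made up for illustration, not from the lesson): torch.randn gives weights with variance 1, and each output of a linear layer sums fan_in of them, so without scaling the outputs have standard deviation around sqrt(fan_in); dividing by sqrt(fan_in) keeps them near 1.

```python
import math
import torch

torch.manual_seed(0)

# Made-up sizes for illustration; not the lesson's actual layer sizes.
fan_in = 512
x = torch.randn(10_000, fan_in)               # inputs with std ~1

w_raw    = torch.randn(fan_in, fan_in)
w_scaled = torch.randn(fan_in, fan_in) / math.sqrt(fan_in)

print((x @ w_raw).std())      # ~sqrt(512) ≈ 22.6: activations blow up
print((x @ w_scaled).std())   # ~1: activations stay well-scaled

# requires_grad_() marks the tensor in place so that autograd records
# operations on it, which lets us call backward() later.
w = torch.randn(2, 2) / math.sqrt(2)
w.requires_grad_()
loss = (w ** 2).sum()
loss.backward()
print(w.grad)                 # dloss/dw = 2*w
```

Is this roughly the right picture, i.e. the division is an initialisation-scaling trick and requires_grad_() just enables gradient tracking?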
In the video, I'm not able to follow the explanation given, so please provide a detailed explanation.