Using Cosine Similarity in Gradient Descent for delta(w) and delta(b)

I was trying to improve my model's performance, and I used cosine similarity to increase or decrease the angle between the predicted and actual y vectors. I would love to hear the community's thoughts. I don't want to bias you by revealing how the changed model performed. I am enclosing the highlighted code for your reference.
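
For readers who want the measure itself spelled out, here is a minimal sketch of cosine similarity between a prediction vector and a target vector. It is not the code from the screenshot; the function name `cosine_similarity`, the `eps` guard, and the sample vectors are assumptions made purely for illustration.

```python
import numpy as np

def cosine_similarity(y_pred, y_true, eps=1e-12):
    """Cosine of the angle between two vectors (hypothetical helper)."""
    # Flatten so that (m, 1) and (1, m) label arrays are treated alike.
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    y_true = np.asarray(y_true, dtype=float).ravel()
    # Product of the two Euclidean norms; eps avoids division by zero.
    denom = np.linalg.norm(y_pred) * np.linalg.norm(y_true)
    return float(np.dot(y_pred, y_true) / max(denom, eps))

# Illustrative values only, not results from any model.
y_true = np.array([1.0, 0.0, 1.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.8, 0.7])
similarity = cosine_similarity(y_pred, y_true)
# Clip before arccos to stay inside its domain despite rounding error.
angle_deg = np.degrees(np.arccos(np.clip(similarity, -1.0, 1.0)))
print(similarity, angle_deg)
```

A value near 1 means the two vectors point in almost the same direction, so a smaller angle corresponds to predictions that are better aligned with the labels. The measure ignores magnitude, so it cannot tell a prediction vector from a scaled copy of it.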

I haven't done this assignment.
Can you elaborate a bit more?
How are you applying cosine similarity? Can you share the code snippet where you actually calculate it?

Please check the screenshot enclosed in the original message. I have highlighted the portion where I applied it.