General ML Question

Hello All,
I have a plot from which I can read off the x, y, and target values, so I can sample a dataset by looking at the graph. The relation between x, y, and the target is given as target = a1 * cos(a2 * sqrt(x^2 + y^2) + a3). I need to find a1, a2, and a3. Can I use this relation as the hypothesis, with MSE as the error, and run gradient descent to find the best coefficients?

However, I am not sure how to initialise the coefficients or how to choose the learning rate. I am trying different values, but I can see the MSE go up. As I understand it, the MSE should decrease, because gradient descent moves the point toward a local or global minimum.
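A minimal sketch of this setup, assuming synthetic samples in place of the values read from the plot. The names `model`, `mse`, `mse_grad`, and `fit`, the "true" coefficients, the starting point, and the learning rate are all illustrative assumptions, not taken from the linked code. With a small enough step the recorded MSE should go down; if the step is too large, an update can overshoot and the MSE can go up.

```python
import numpy as np

def model(params, x, y):
    """Hypothesis: a1 * cos(a2 * sqrt(x^2 + y^2) + a3)."""
    a1, a2, a3 = params
    r = np.sqrt(x**2 + y**2)
    return a1 * np.cos(a2 * r + a3)

def mse(params, x, y, t):
    """Mean squared error between predictions and targets."""
    return np.mean((model(params, x, y) - t) ** 2)

def mse_grad(params, x, y, t):
    """Analytic gradient of the MSE with respect to (a1, a2, a3)."""
    a1, a2, a3 = params
    r = np.sqrt(x**2 + y**2)
    phase = a2 * r + a3
    err = a1 * np.cos(phase) - t
    d_a1 = np.mean(2 * err * np.cos(phase))
    d_a2 = np.mean(2 * err * (-a1 * np.sin(phase) * r))
    d_a3 = np.mean(2 * err * (-a1 * np.sin(phase)))
    return np.array([d_a1, d_a2, d_a3])

def fit(x, y, t, init, lr=0.01, steps=2000):
    """Plain batch gradient descent; returns final params and the MSE history."""
    params = np.array(init, dtype=float)
    history = [mse(params, x, y, t)]
    for _ in range(steps):
        params = params - lr * mse_grad(params, x, y, t)
        history.append(mse(params, x, y, t))
    return params, history

# Synthetic data standing in for points sampled from the plot (assumption).
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 200)
y = rng.uniform(-3, 3, 200)
true_params = np.array([2.0, 1.5, 0.5])  # hypothetical coefficients
t = model(true_params, x, y)

# Start near the assumed solution; the loss is non-convex, so a distant
# start may settle in a different local minimum.
params, history = fit(x, y, t, init=[1.5, 1.4, 0.3], lr=0.01, steps=2000)
print("fitted params:", params)
print("MSE first/last:", history[0], history[-1])
```

Plotting `history` for a few learning rates and starting points is one way to see whether a rising MSE comes from the step size or from an error in the gradient.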

Please let me know your suggestions on this.

Here is the GitHub link for my code. Please let me know if it has any errors. I calculated the partial derivatives with the sympy library:

Sairam
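For reference, a minimal sketch of how the partial derivatives could be obtained with sympy, assuming a squared-error loss on a single sample. The symbol names and `grad_fn` are illustrative assumptions and are not taken from the linked code.

```python
import sympy as sp

# Symbols for one sample (x, y, t) and the three coefficients.
x, y, t, a1, a2, a3 = sp.symbols("x y t a1 a2 a3", real=True)

# Hypothesis and per-sample squared error.
pred = a1 * sp.cos(a2 * sp.sqrt(x**2 + y**2) + a3)
loss = (pred - t) ** 2

# Partial derivatives of the loss with respect to each coefficient.
grads = [sp.diff(loss, p) for p in (a1, a2, a3)]
for p, g in zip((a1, a2, a3), grads):
    print(f"d(loss)/d({p}) =", g)

# Turn the symbolic gradient into a numeric function. With array inputs it
# returns per-sample gradients, whose mean is the gradient of the MSE.
grad_fn = sp.lambdify((x, y, t, a1, a2, a3), grads, "numpy")
```

Comparing the output of `grad_fn` against a finite-difference estimate of the loss is a quick check that the derivatives used in the update step are correct.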