How do I identify the relationship in a given dataset in order to choose an appropriate model and algorithm for a task?

Given a dataset, how do I first find out whether the data has a linear or non-linear relationship, so that I can choose an appropriate model and algorithm for the desired task? As a rule of thumb, is it good practice to start with a linear regression model, check its accuracy, and decide from there?

Once you have decided on your x and y in the dataset (i.e. you have decided which task you are going to perform), you select the model that best captures the relationship between x and y, finding its parameters by optimizing the loss function.
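To make "finding the parameters by optimizing the loss function" concrete, here is a minimal sketch. It assumes a one-feature linear model y ≈ w*x + b, mean squared error as the loss, and plain gradient descent as the optimizer; the function name `fit_linear_gd`, the learning rate, the step count, and the synthetic data are all illustrative choices, not something from the posts above.

```python
import numpy as np

def fit_linear_gd(x, y, lr=0.1, steps=2000):
    """Fit y ≈ w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        err = w * x + b - y               # residuals of the current fit
        w -= lr * 2.0 * np.mean(err * x)  # d(MSE)/dw
        b -= lr * 2.0 * np.mean(err)      # d(MSE)/db
    return w, b

# Synthetic data (assumption): true slope 3.0, true intercept 0.5, small noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = 3.0 * x + 0.5 + rng.normal(scale=0.1, size=200)

w, b = fit_linear_gd(x, y)
print(f"w = {w:.3f}, b = {b:.3f}")
```

On data that really is linear, the recovered w and b should land close to the values used to generate it. For a linear model with squared error you could also solve for the parameters in closed form; gradient descent is shown only because the same loop carries over to models that have no closed-form solution.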

Yes, it is reasonable to start with a linear model and then try non-linear alternatives. Although various authors propose different rules, the simpler the model, the better. This article is worth reading: How to Choose Between Linear and Nonlinear Regression - Statistics By Jim
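One way to act on the "linear first, then something more flexible" advice is to compare both fits on held-out data. The sketch below is only an illustration under stated assumptions: the data is synthetic and deliberately curved, a degree-2 polynomial stands in for the more flexible alternative, and `holdout_r2` is a hypothetical helper name.

```python
import numpy as np

def holdout_r2(x, y, degree, train_frac=0.7, seed=0):
    """Fit a polynomial of the given degree on a training split
    and return R^2 on the remaining held-out points."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    n_train = int(train_frac * len(x))
    train, test = idx[:n_train], idx[n_train:]
    coeffs = np.polyfit(x[train], y[train], degree)
    pred = np.polyval(coeffs, x[test])
    ss_res = np.sum((y[test] - pred) ** 2)
    ss_tot = np.sum((y[test] - np.mean(y[test])) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic curved data (assumption): y depends on x squared.
rng = np.random.default_rng(1)
x = rng.uniform(-2.0, 2.0, size=300)
y = x ** 2 + rng.normal(scale=0.2, size=300)

r2_linear = holdout_r2(x, y, degree=1)     # straight-line baseline
r2_quadratic = holdout_r2(x, y, degree=2)  # more flexible alternative
print(f"linear R^2 = {r2_linear:.3f}, quadratic R^2 = {r2_quadratic:.3f}")
```

If the straight line scores about as well as the flexible model on held-out data, the simpler model is usually the better choice. A large gap, as with this curved example, suggests the relationship is not a straight line. Plotting the residuals of the linear fit against x is a useful complementary check: a visible pattern in the residuals points to the same conclusion.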

