Polynomial Regression vs Linear vs Non-Linear Regression

I am a bit confused now about the differences between linear and non-linear models.

My understanding before reading this article was that in a linear model the independent variables appear only with degree 1, so a linear combination of the parameters and independent variables, plus a constant, gives a linear function.

But the article says that “linear” means linear in the parameters, and that the independent variable can be raised to a higher power in order to fit the data. Isn’t this non-linear? According to the article it is still a linear model.

My question is: if the degree of the polynomial in an independent variable is greater than 1, doesn’t the equation become non-linear? And what does “linear in the parameters” mean for a linear model?

It is common for people to confuse polynomial regression with non-linear regression.

The “linear” in linear regression means that the hypothesis learnt is a linear function of the (learnable) parameters of the model. It does not mean a linear function of the input/independent variables.

For example, both:
y = θ_{0} + θ_{1}x    (1)
y = θ_{0} + θ_{1}x + θ_{2}x^{2}    (2)

…correspond to a linear regression hypothesis, because y is always a linear function of the parameters (θ).
We sometimes call equation (2) polynomial (linear) regression, because the input variable enters the linear model with a polynomial degree greater than 1.
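One way to make “linear in the parameters” concrete is the superposition test: if y is linear in θ, then evaluating the hypothesis at a weighted sum of two parameter vectors gives the same weighted sum of the two outputs. The sketch below applies that test to the quadratic hypothesis; the helper names and the numeric values are made up for illustration.

```python
import math

def hypothesis(theta, x):
    """Quadratic hypothesis y = theta0 + theta1*x + theta2*x^2."""
    t0, t1, t2 = theta
    return t0 + t1 * x + t2 * x ** 2

def combine(a, theta_a, b, theta_b):
    """Weighted sum a*theta_a + b*theta_b of two parameter vectors."""
    return [a * p + b * q for p, q in zip(theta_a, theta_b)]

# Arbitrary illustrative values (not taken from any dataset).
theta_a = [1.0, 2.0, 3.0]
theta_b = [-0.5, 4.0, 0.25]
a, b = 2.0, -3.0
x = 1.5

# Superposition: h(a*theta_a + b*theta_b, x) vs a*h(theta_a, x) + b*h(theta_b, x)
lhs = hypothesis(combine(a, theta_a, b, theta_b), x)
rhs = a * hypothesis(theta_a, x) + b * hypothesis(theta_b, x)

linear_in_parameters = math.isclose(lhs, rhs)
print(linear_in_parameters)
```

The test passes even though x is squared, because x^{2} is just a fixed number multiplying θ_{2} once the input is given.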

If you were to design a hypothesis like this:
y = θ_{0} + θ_{1}x + θ_{1}^{2}x^{2}

As written, this is not strictly linear in θ_{1}, but it still need not be treated as non-linear regression: you can replace θ_{1}^{2} with a separate parameter θ_{2}, which gives back the quadratic hypothesis above and lets θ_{2} be learnt freely (possibly close to θ_{1}^{2}). In other words, polynomial modelling is not the same as non-linear modelling.

Non-linearity is introduced in a model only when the hypothesis applies non-linear functions or transformations to the parameters themselves.

Training a supervised neural network is an example of non-linear regression. Why?
Consider an NN with 1 hidden layer followed by an output layer:

y = f^{out}(f^{h1}(X))

where the hidden layer is a non-linear mapping from X, which could be defined as:
f^{h1}(X) = g(WX+b)
(The non-linearity is introduced by the activation function g(z), so y is no longer a linear function of the parameters W and b.)
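A small sketch of this network in plain Python shows the effect of g. It assumes tanh as the activation and a linear output layer f^{out}(h) = v·h + c; both choices, and all the numbers, are illustrative and not part of the answer above. If y were linear in W and b (with c = 0), doubling W and b would double y.

```python
import math

def network(W, b, v, c, x, g=math.tanh):
    """y = f_out(f_h1(x)) with f_h1(x) = g(Wx + b) and f_out(h) = v.h + c."""
    hidden = [g(sum(w * xj for w, xj in zip(row, x)) + bi)
              for row, bi in zip(W, b)]
    return sum(vi * hi for vi, hi in zip(v, hidden)) + c

def identity(z):
    return z

# Arbitrary illustrative parameters and input.
W = [[0.5, -1.0], [1.5, 0.25]]
b = [0.1, -0.2]
v = [2.0, -1.0]
c = 0.0
x = [1.0, 2.0]

# Double the hidden-layer parameters.
W2 = [[2 * w for w in row] for row in W]
b2 = [2 * bi for bi in b]

# With the tanh activation the output does not double.
y_tanh = network(W, b, v, c, x)
y_tanh_doubled = network(W2, b2, v, c, x)

# With g removed (identity), the same model is linear in W and b.
y_lin = network(W, b, v, c, x, g=identity)
y_lin_doubled = network(W2, b2, v, c, x, g=identity)

print(math.isclose(y_tanh_doubled, 2 * y_tanh))  # non-linear case
print(math.isclose(y_lin_doubled, 2 * y_lin))    # linear case
```

Swapping tanh for the identity is only a device for comparison: it shows that the activation, not the layered structure, is what breaks linearity in the parameters.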

This is just one easy example of non-linear regression.
In Data Science, depending upon the application, people choose specific non-linear hypothesis functions suited to their domain data.


Thank you. Nice explanation.

It’s confusing. A polynomial function is a non-linear function, isn’t it?

Polynomial functions are non-linear with respect to the independent variables (when the polynomial degree is greater than 1), but the same cannot be said with respect to the parameters. This is what I understood. Below are some examples.

Equations 1 and 3 are linear equations because y is a linear function of the parameters. Equation 2 is not linear in m_{1} as written, but it can be treated as linear by replacing m_{1}^{2} with a separate parameter m_{2}, which gives equation 3.
y = m_{0} + m_{1} x_{1} + m_{2} x_{2}^{2}    (equation 1)
y = m_{0} + m_{1} x_{1} + m_{1}^{2} x_{2}^{2}    (equation 2)
y = m_{0} + m_{1} x_{1} + m_{2} x_{2}^{2}    (equation 3; equations 2 and 3 are the same if m_{1}^{2} = m_{2})

Equations 4 and 5 are non-linear equations because y is a non-linear function of the parameters.
y = m_{1} * x_{1}^{m_{2}}    (equation 4)
y = m_{1} + (m_{2} - m_{1})^{(-m_{3} x_{1}^{m_{4}})}    (equation 5)

where m_{0}, m_{1}, m_{2}, m_{3}, m_{4} are the parameters in the equations 1 to 5.
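Another way to check the distinction, sketched here with hypothetical helper functions and made-up numbers: for a model that is linear in its parameters, the derivative of y with respect to each parameter does not depend on the parameter values. The code estimates those derivatives by finite differences at two different parameter settings, for equation 1 and for equation 4.

```python
def equation_1(m, x1, x2):
    # y = m0 + m1*x1 + m2*x2^2, with m = [m0, m1, m2]
    return m[0] + m[1] * x1 + m[2] * x2 ** 2

def equation_4(m, x1, x2):
    # y = m1 * x1^m2, with m = [m1, m2] (x2 is unused)
    return m[0] * x1 ** m[1]

def gradient(model, m, x1, x2, eps=1e-6):
    """Central finite-difference derivative of y w.r.t. each parameter."""
    grads = []
    for i in range(len(m)):
        up, down = list(m), list(m)
        up[i] += eps
        down[i] -= eps
        grads.append((model(up, x1, x2) - model(down, x1, x2)) / (2 * eps))
    return grads

def same_gradient(g1, g2, tol=1e-4):
    return all(abs(p - q) < tol for p, q in zip(g1, g2))

x1, x2 = 2.0, 3.0  # arbitrary illustrative inputs

# Equation 1: the gradient is [1, x1, x2^2] whatever the parameters are.
eq1_is_linear = same_gradient(
    gradient(equation_1, [0.0, 1.0, 1.0], x1, x2),
    gradient(equation_1, [5.0, -2.0, 7.0], x1, x2),
)

# Equation 4: the gradient changes when the parameters change.
eq4_is_linear = same_gradient(
    gradient(equation_4, [1.0, 1.0], x1, x2),
    gradient(equation_4, [3.0, 2.0], x1, x2),
)

print(eq1_is_linear, eq4_is_linear)
```

This is a numerical spot check at two points, not a proof, but it matches the classification above: equation 1 passes and equation 4 does not.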