Regression Models: How do you know you need a polynomial?

A polynomial term, such as a quadratic (squared) or cubic (cubed) term, turns a linear regression model into a curve.  But because it is X that is squared or cubed, not the Beta coefficient, it still qualifies as a linear model.  This makes it a nice, straightforward way to model curves without having to fit complicated nonlinear models.
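To see why a squared X still leaves you with a linear model, here is a minimal sketch in Python with NumPy. The data are simulated and the names (`x`, `y`, `X`, `beta`) are made up for illustration. The squared term is just one more column in the design matrix, and ordinary least squares estimates the coefficients exactly as it would for any other predictor.

```python
import numpy as np

# Simulated data (an assumption for illustration): y = 1 + 2x - 0.5x^2 + noise
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(scale=0.1, size=x.size)

# Design matrix with an intercept column, X, and X squared.
# The model is curved in x but linear in the coefficients.
X = np.column_stack([np.ones_like(x), x, x**2])

# Ordinary least squares, the same solver used for a straight-line fit.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # estimates of the intercept, linear, and quadratic coefficients
```

Nothing nonlinear is being solved here. The curve comes entirely from the extra column.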

But how do you know if you need one, that is, when a linear model isn’t the best model?

Well, first, a quadratic term creates a curve with one “hump”: a U or inverted U shape.  The curve does not need to contain both sides of the U.  It can contain just part of it.

A cubic has two humps, one facing upward and the other down.  The curve goes down, back up, then back down again (or vice versa).
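One way to check these shapes is to find where the slope of the fitted curve is zero. The sketch below uses a hypothetical helper, `stationary_points`, and made-up coefficients and data range. A quadratic has one such point (the bottom or top of the U), and a cubic can have two.

```python
import numpy as np
from numpy.polynomial import Polynomial

def stationary_points(coefs):
    """Return the sorted real x-values where the slope of the curve is zero.

    coefs are in increasing order of power: [b0, b1, b2, ...].
    """
    roots = Polynomial(coefs).deriv().roots()
    return np.sort(roots[np.isreal(roots)].real)

# Made-up coefficients for illustration.
quad_tp = stationary_points([1.0, -4.0, 1.0])        # y = 1 - 4x + x^2
cubic_tp = stationary_points([0.0, -3.0, 0.0, 1.0])  # y = -3x + x^3

# Hypothetical observed range of X. If the quadratic's turning point falls
# outside it, the data show only one side of the U.
x_min, x_max = 0.0, 1.5
vertex_in_range = x_min <= quad_tp[0] <= x_max

print(quad_tp, cubic_tp, vertex_in_range)
```

In this example the quadratic turns at a point beyond the hypothetical data range, so the fitted curve would look like a steadily flattening decline rather than a full U.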

There are three main situations that indicate a linear relationship may not be a good model.

1. Most important is the theoretical one.  There are some relationships that a researcher will hypothesize are curvilinear.  Clearly, if this is the case, include a polynomial term.

2. The second chance is during visual inspection of your variables.  This is one of those reasons for always doing univariate and bivariate inspections of your data before you begin your regression analyses.  (You always do this, right?) A simple scatter plot can reveal a curvilinear relationship.

3. Inspection of residuals.  If you try to fit a linear model to curved data, a scatter plot of residuals (Y axis) on the predictor (X axis) will have patches of many positive residuals in the middle, but patches of negative residuals at either end (or vice versa).  This is a good sign that a linear model is not appropriate, and a polynomial may do better.
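The residual check in the third situation can be sketched in a few lines of Python with NumPy. Everything here is an assumption for illustration: the data are simulated with an inverted U shape, and splitting X into thirds is just one rough way to summarize what you would see in a residual plot.

```python
import numpy as np

# Simulated inverted-U data (an assumption, not real study data).
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 10, 300))
y = 2.0 + 3.0 * x - 0.3 * x**2 + rng.normal(scale=0.5, size=x.size)

# Fit a straight line and compute its residuals.
lin = np.polyfit(x, y, deg=1)
resid_lin = y - np.polyval(lin, x)

# Compare average residuals in the middle third of X with the two outer thirds.
lo, hi = np.quantile(x, [1 / 3, 2 / 3])
in_middle = (x >= lo) & (x <= hi)
middle_mean = resid_lin[in_middle].mean()
ends_mean = resid_lin[~in_middle].mean()

# Fit a quadratic and compare the residual sums of squares.
quad = np.polyfit(x, y, deg=2)
resid_quad = y - np.polyval(quad, x)
sse_lin = np.sum(resid_lin**2)
sse_quad = np.sum(resid_quad**2)

print(middle_mean, ends_mean)  # opposite signs suggest curvature
print(sse_lin, sse_quad)       # the quadratic should leave less unexplained
```

With real data you would still want to look at the actual residual plot. The averages by thirds are only a quick numeric stand-in for the pattern of positive and negative patches.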



Reader Interactions


  1. Adjoumani jean jacques says

    Hi, I’m doing some research in animal science and fish nutrition. I have one question and would appreciate a response if possible. I have five practical diets: a control and four levels of betaine (0%, 0.6%, 1.2% and 1.8%, respectively). I want to know which level is best for growth performance, so I used the four levels of betaine, without the control diet, to fit a second-degree polynomial regression. My question is: is it possible to fit a second-degree polynomial regression with only four levels? Thanks.
    Kind regards!

  2. Habtamu says

    I am studying for an MSc even though I have grown old. There are many things I need to dig up to bring back to mind those undergraduate courses in Maths for Engineers and Statistics. It has been exactly 20 years since I graduated with a BSc in Agri. Engineering.

    So, now it is useful to determine the parameters of some polynomial functions to calculate some missing data from the existing ones.

    Kindly help me find programs that work with such polynomial functions, and software for calculating determinants of matrices larger than 4 x 4.


  3. Gabriel Armah says

    Hi, I am using the CM1, AR1, AR4, etc. datasets from NASA. I am trying to predict using Logistic Regression with Regularization. I wish to do model selection using Linear, Quadratic, up to say X(i) to the power 10 or 22. E.g. X0B0+X1B1+X2B2+… + XnBn…. Linear; (X0B0)squared+ (X1B1)squared+…+(XnBn)squared; (X0B0)cubed+ (X1B1)cubed+…+(XnBn)cubed; up to (X0B0)tenthpower+ (X1B1)tenthpower+…+(XnBn)tenthpower. Can you help me with the MATLAB/Octave code? B is the beta to be found and X is the data given. I am using a dataset size of m=10,000, for example. Is it feasible?
