The intercept (often labeled the constant) is the expected mean value of Y when all X=0.

Start with a regression equation with one predictor, X.

If X sometimes = 0, the intercept is simply the expected mean value of Y at that value.

If X never = 0, then the intercept has no intrinsic meaning. In scientific research, the purpose of a regression model is to understand the relationship between predictors and the response. If so, and if X never = 0, there is no interest in the intercept. It doesn’t tell you anything about the relationship between X and Y.

You do need it to calculate predicted values, though. In market research, there is usually more interest in prediction, so the intercept is more important here.

One reason for centering X is that X never = 0. If you rescale X so that the mean or some other meaningful value = 0 (just subtract a constant from X), the intercept now has a meaning. It’s the mean value of Y at the chosen value of X.
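To see what centering does to the intercept, here is a minimal sketch in plain Python. The data (heights and weights) and the helper `fit_line` are made up for illustration; they are not from any real study or library.

```python
def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: X is adult height in cm (never 0), Y is weight in kg.
height = [150.0, 160.0, 170.0, 180.0, 190.0]
weight = [52.0, 58.0, 69.0, 77.0, 84.0]

# Uncentered X: the intercept is the predicted weight at height = 0 cm,
# a value far outside the data (and negative here).
raw_intercept, raw_slope = fit_line(height, weight)

# Centered X: subtract the mean height so that 0 means "average height".
mean_height = sum(height) / len(height)
centered_height = [h - mean_height for h in height]
centered_intercept, centered_slope = fit_line(centered_height, weight)

mean_weight = sum(weight) / len(weight)
print("uncentered:", raw_intercept, raw_slope)
print("centered:  ", centered_intercept, centered_slope)
print("mean of Y: ", mean_weight)
```

The slope is the same in both fits. Only the intercept moves: after centering at the mean of X, it equals the mean of Y.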

If you have dummy variables in your model, though, the intercept has more meaning. Dummy coded variables have values of 0 for the reference group and 1 for the comparison group. Since the intercept is the expected mean value when X=0, it is the mean value only for the reference group (when all other X=0).
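As a quick check of the dummy variable case, the sketch below fits a one-predictor model where X is coded 0/1. The data and the helper `fit_line` are hypothetical and only meant to illustrate the point.

```python
def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data: group is dummy coded (0 = reference, 1 = comparison).
group = [0, 0, 0, 1, 1, 1]
y = [10.0, 12.0, 14.0, 20.0, 21.0, 25.0]

intercept, slope = fit_line(group, y)

# Plain group means, computed without any regression.
ref_values = [yi for g, yi in zip(group, y) if g == 0]
cmp_values = [yi for g, yi in zip(group, y) if g == 1]
ref_mean = sum(ref_values) / len(ref_values)
cmp_mean = sum(cmp_values) / len(cmp_values)

print("intercept:", intercept, "reference group mean:", ref_mean)
print("intercept + slope:", intercept + slope, "comparison group mean:", cmp_mean)
```

With a single dummy predictor, the intercept matches the reference group mean, and the coefficient is the difference between the two group means.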

This is especially important to consider when the dummy coded predictor is included in an interaction term. Say for example that X1 is a continuous variable centered at its mean. X2 is a dummy coded predictor, and the model contains an interaction term for X1*X2.

The B value for the intercept is the mean value of Y at the mean of X1, only for the reference group. The mean value of Y at the mean of X1 for the comparison group is the intercept plus the coefficient for X2.
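The interaction case can be sketched the same way. The example below is an illustration under stated assumptions: the data are hypothetical and noise-free (so the coefficients are easy to check), and `solve` and `ols` are small hand-written helpers, not functions from any statistics package.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical data: X1 is continuous, X2 is a dummy (0 = reference group).
x1 = [2.0, 4.0, 6.0, 8.0, 3.0, 5.0, 7.0, 9.0]
x2 = [0, 0, 0, 0, 1, 1, 1, 1]

# Center X1 at its overall mean.
mean_x1 = sum(x1) / len(x1)
x1c = [v - mean_x1 for v in x1]

# Noise-free Y: reference group follows 10 + 2*x1c, comparison group 15 + 3*x1c.
y = [10 + 2 * c if g == 0 else 15 + 3 * c for c, g in zip(x1c, x2)]

# Design matrix columns: ones (intercept), X1 centered, X2, X1*X2.
rows = [[1.0, c, float(g), c * g] for c, g in zip(x1c, x2)]
b0, b1, b2, b3 = ols(rows, y)

print("intercept (reference group, Y at mean X1):", b0)
print("intercept + B for X2 (comparison group, Y at mean X1):", b0 + b2)
```

Here the intercept recovers the reference group's value of Y at the mean of X1, and adding the X2 coefficient gives the comparison group's value at that same point.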

{ 12 comments… read them below or add one }

I’d like to know why a column of ones is needed in the model to account for the intercept. I would need a basic answer, since I’m not a mathematician. Thank you.

In the X matrix, each column is the value of the X that is multiplied by that regression coefficient.

Since the intercept isn’t multiplied by any values of X, we put in 1s.

It makes all the matrix algebra work out.
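For a concrete picture, here is a small sketch with made-up numbers. The coefficient values are hypothetical; the point is only that multiplying the intercept by 1 in every row gives the same predictions as writing the equation out by hand.

```python
# Hypothetical predictor values and coefficients (not from a real model).
x = [1.0, 2.0, 3.0, 4.0]
b0, b1 = 5.0, 2.0  # intercept and slope

# The X matrix: first column is all ones (for the intercept), second is X.
X = [[1.0, xi] for xi in x]
coefficients = [b0, b1]

# Each prediction is the row of X multiplied by the coefficients and summed.
predicted = [sum(value * b for value, b in zip(row, coefficients)) for row in X]

# The same predictions written as the usual regression equation.
direct = [b0 + b1 * xi for xi in x]

print(predicted)
print(direct)
```

Both lists come out the same, which is all the column of ones is doing: it lets the intercept sit in the coefficient list like any other coefficient.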

Karen

Question: if wage = -5 + 10*years of education and wage is measured in 1000s, how do you interpret the coefficient, and does the intercept make sense?

This sounds like a homework question, so I’m going to try to answer only by getting you to think through it.

Since the intercept ALWAYS is the mean of Y (in 1000s of dollars or whatever the currency is) when X=0, it will only be meaningful if it’s meaningful that X=0 AND if there are examples in the data set. Is there anyone in the data set with years of education = 0?

Does this mean that if education is equal to zero, i.e. no education, then the expected mean of Y = -5?

Yes.

I would also like to ask: if we decrease the sample by half, will SSE, SSR, and SST increase or decrease? I’m a bit confused.

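All three would be expected to decrease, roughly by half. Sums of Squares are sums over the observations, so they grow and shrink with sample size. What stays about the same, theoretically, are the quantities that average over the sample, such as the Mean Squares and R-squared.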

Hi! What happens if all of my variables that had a significant regression coefficient can be 0? (I have four Xs. 3 of them have a significant coefficient and can be 0, as they are either dummies or are on a scale starting from 0 and there are 0s in the sample, but one of the Xs cannot be 0. It’s also the one with the non-significant coefficient.)

Thanks!

Hi Irena,

If ANY of the Xs can’t be 0, then the intercept doesn’t mean anything. Or rather, it’s just an anchor point, but it’s not directly interpretable.

In a negative binomial regression, what would it mean if the Exp(B) value for the intercept falls below the lower limit of the 95% Confidence Interval?

Hmm, not sure I understand your question. CI for what?
