The Analysis Factor

Interpreting Interactions in Linear Regression: When SPSS and Stata Disagree, Which is Right?

by Jeff Meyer

Sometimes what is most tricky about understanding your regression output is knowing exactly what your software is presenting to you.

Here’s a great example of what look like two completely different sets of model results from SPSS and Stata that, in reality, agree.

The Model

I ran a linear model regressing “physical composite score” on education and “mental composite score”.

The outcome variable, physical composite score, is a measurement of one’s physical well-being.   The predictor “education” is categorical with four categories.  The other predictor, mental composite score, is continuous and measures one’s mental well-being.

I am interested in determining whether the association between physical composite score and mental composite score differs across the four levels of education. To test this, I included an interaction between mental composite score and education.

The SPSS Regression Output

Here is the result of the regression using SPSS:

 

The results show that the mental composite score has a slope of 0.283, which is statistically significant at the 0.01 level.

The interactions with the first two levels of education, some graduate school and some college, are also significant at the 0.01 level. The third interaction, with an education level of high school, is not significant.

The Stata Regression Output

I then ran the exact same model using Stata:

Now my slope for mental composite score is -0.078 and is not statistically significant. The interaction with “some college” is also not significant.

These results don’t match the results of SPSS.  Which software is correct?

It turns out they are both correct. The issue is that they are not reporting the same quantities.

The Importance of the Base or Reference Category

Both SPSS and Stata are doing us a favor by dummy coding our categorical predictor, Education, for us.

This means that, of the four categories, one is treated as the base, or reference, category, and the other three are compared to that base.
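To make the idea of a reference category concrete, here is a minimal sketch of dummy coding in plain Python. The function name `dummy_code` is hypothetical, and the ordering of the levels is an assumption (the article only states that education=1 is “some graduate school”). It is not how SPSS or Stata implement coding internally; it only shows what changes when the reference level changes.

```python
# Illustrative sketch: dummy (indicator) coding of a four-level factor.
# The ordering of LEVELS is an assumption made for this example.
LEVELS = [
    "some graduate school",
    "some college",
    "high school",
    "some grammar school",
]


def dummy_code(value, levels, reference):
    """Return 0/1 indicators for every level except the reference level."""
    if value not in levels or reference not in levels:
        raise ValueError("unknown level")
    return {level: int(value == level) for level in levels if level != reference}


# Last level as the reference (the base the article reports for SPSS).
last_ref = dummy_code("some college", LEVELS, reference=LEVELS[-1])

# First level as the reference (the base the article reports for Stata).
first_ref = dummy_code("some college", LEVELS, reference=LEVELS[0])

print(last_ref)
print(first_ref)
```

A person in the reference category gets a 0 on all three indicators, which is why the coefficients that do not involve an indicator describe that category.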

The base category in SPSS is “some grammar school”. The coefficient for mental composite score of 0.283 found in the SPSS model is measuring the slope for this base category, “some grammar school”.

For a one-unit increase in mental composite score, the physical composite score of people with an education level no higher than grammar school increases on average by 0.283 units. The 95% confidence interval for this increase runs from 0.097 to 0.469 units.

The base category in Stata is “some graduate school”. The coefficient for mental composite score of -0.078 is the slope for “some graduate school”.

For a one-unit increase in mental composite score, the physical composite score of people with some graduate school decreases on average by 0.078 units. This parameter estimate is not significantly different from 0, since its 95% confidence interval runs from a negative value to a positive value and therefore includes 0.

As noted, the coefficients of the interactions are different both in estimate and in significance. What exactly are the coefficients of the interactions measuring?

The coefficients of the interactions are measuring the difference in slope between the base category of education and the category of education stated in the interaction.
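Written out as an equation, the model looks roughly like the following. The symbols are my own notation, not something taken from either program's output: $r$ is the reference category, $D_k$ is the 0/1 indicator for education category $k$, PCS is the physical composite score and MCS is the mental composite score.

```latex
\[
\text{PCS} = \beta_0 + \beta_1\,\text{MCS}
  + \sum_{k \neq r} \gamma_k D_k
  + \sum_{k \neq r} \delta_k \,(D_k \times \text{MCS})
  + \varepsilon
\]
\[
\text{slope for the reference category } r = \beta_1,
\qquad
\text{slope for category } k = \beta_1 + \delta_k
\]
```

Changing the reference category changes which slope $\beta_1$ refers to and which differences the $\delta_k$ measure, but not the slope implied for any individual category.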

In the SPSS model, education=1, “some graduate school”, has an interaction coefficient of -0.361. Its slope is therefore 0.361 lower than that of the base, “some grammar school”, whose slope is 0.283. Doing the math (0.283 - 0.361), we find that “some graduate school” has a slope of -0.078.

The base category in Stata, “some graduate school”, has a slope of -0.078. This is identical to the slope calculated from the SPSS output.
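The arithmetic can be checked in a few lines of Python. The two SPSS coefficients are the ones quoted in the article. The Stata interaction for “some grammar school” is not read from any output: it is what the algebra implies when the reference category is swapped, and since the inputs are rounded, the software's own figure could differ in the third decimal place.

```python
# Coefficients quoted in the article's SPSS output (base: "some grammar school").
spss_base_slope = 0.283         # slope of mental composite score for the base
spss_interaction_grad = -0.361  # slope of "some graduate school" minus base slope

# Slope for "some graduate school" implied by the SPSS parameterization.
grad_slope = spss_base_slope + spss_interaction_grad

# With "some graduate school" as the base (Stata's choice), the main-effect
# coefficient is grad_slope itself, and the interaction for "some grammar
# school" is the same difference with the sign reversed (implied, not reported).
stata_base_slope = grad_slope
stata_interaction_grammar = spss_base_slope - stata_base_slope

print(round(grad_slope, 3))                 # -0.078
print(round(stata_interaction_grammar, 3))  # 0.361
```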

Instead of “doing the math”, we can have our statistical software calculate the slopes of each line for us.
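As a rough sketch of what “calculating the slope of each line” means, the code below fits a separate least-squares line for each education level and then re-expresses the slopes as a main effect plus interaction differences. It relies on the fact that, when the model includes both the education main effects and the full interaction, the point estimates of the group slopes match those from separate per-group fits (the standard errors do not). The numbers are made up for illustration and are not the article's data, and the function names are hypothetical.

```python
from statistics import mean


def ols_slope(x, y):
    """Least-squares slope of y on x for a single group."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical (mental score, physical score) values, NOT the article's data.
data = {
    "some graduate school": ([40, 50, 60], [52, 51, 50]),
    "some grammar school": ([40, 50, 60], [44, 47, 50]),
}

# One slope per education level.
slopes = {level: ols_slope(x, y) for level, (x, y) in data.items()}


def as_model_coefficients(slopes, reference):
    """Express per-group slopes as a base slope plus interaction differences."""
    base = slopes[reference]
    interactions = {
        level: slope - base for level, slope in slopes.items() if level != reference
    }
    return base, interactions


# Same slopes, two different-looking sets of coefficients.
print(as_model_coefficients(slopes, "some grammar school"))
print(as_model_coefficients(slopes, "some graduate school"))
```

Whichever reference is chosen, adding the base slope to a category's interaction difference recovers that category's own slope.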

A graph can help us visualize the results.

Jeff Meyer is a statistical consultant with The Analysis Factor, a stats mentor for Statistically Speaking membership, and a workshop instructor.

Copyright © 2008–2021 The Analysis Factor, LLC. All rights reserved.