Why report estimated marginal means in SPSS GLM?

by Karen

I was recently asked whether to report means from descriptive statistics or from the Estimated Marginal Means in SPSS GLM.

The Estimated Marginal Means in SPSS GLM tell you the mean response for each level of a factor, adjusted for any other variables in the model.  They are found under the Options button.  (These are the same as the LSMeans in SAS GLM.)

If all factors (aka categorical predictors) were manipulated, these factors should be independent.

Or at least they will be if you randomly assigned subjects to conditions well.

In this situation only, the estimated marginal means will be the same as the straight means you got from descriptive statistics.

If, however, you have a covariate in the model that was measured, not manipulated, things are a little different.  The estimated marginal means will now be adjusted for the covariate.

This, of course, is the reason for including the covariate in the model–you want to see if your factor still has an effect, beyond the effect of the covariate.  You are interested in the adjusted effects in both the overall F and in the means.

In SPSS, the Estimated Marginal Means adjust for the covariate by reporting the means of Y for each level of the factor at the mean value of the covariate.  You can change this default using syntax, but not through the menus.
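As a sketch of what this adjustment does in the simplest case (one factor X, one covariate V, and no X-by-V interaction), the estimated marginal mean for group g is that group's raw mean shifted along the common regression slope to the chosen covariate value. The symbols are notation introduced here for illustration, not SPSS output labels: b is the pooled within-group slope of Y on V, and v_0 is the value at which the means are evaluated.

```latex
\[
\mathrm{EMM}_g(v_0) = \bar{Y}_g + b\,(v_0 - \bar{V}_g),
\qquad\text{so at the default } v_0 = \bar{V}:\quad
\mathrm{EMM}_g = \bar{Y}_g - b\,(\bar{V}_g - \bar{V})
\]
```

When b is positive, a group whose covariate mean sits above the overall mean has its Y mean adjusted downward, and a group whose covariate mean equals the overall mean is left unchanged.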

For example, in this syntax, the EMMEANS subcommand will report the marginal means of Y at each level of the categorical variable X at the mean of the covariate V.

UNIANOVA Y BY X WITH V
/METHOD=SSTYPE(3)
/INTERCEPT=INCLUDE
/EMMEANS=TABLES(X) WITH(V=MEAN)
/CRITERIA=ALPHA(.05)
/DESIGN=X V.

If, instead, you wanted to evaluate the effect of X at a specific value of V, say 50, you could just change the EMMEANS subcommand to:

/EMMEANS=TABLES(X) WITH(V=50)

Another good reason to use syntax.

Editor’s Update: If you want to learn more about Estimated Marginal Means, how to implement and interpret them, as well as the other options in SPSS GLM, check out our workshop on Running Regressions and ANCOVAs in SPSS GLM.  It’s now available in a home study version.


{ 36 comments… read them below or add one }

Vandana Menon June 5, 2009 at 12:20 pm

How can I get standard deviations for adjusted estimated marginal means?

admin June 15, 2009 at 10:28 pm

Vandana,

Most statistical software should give you the standard errors along with the EMM. I know both SPSS and SAS do (SAS calls them LSMeans) in GLM.

Karen

Katy May 14, 2010 at 5:56 pm

Through some trial and error today I discovered that SPSS doesn’t seem to give the standard error of the mean in the EMM. They are reporting a standard error, but it seems to be based only on sample size and not on standard deviation. Is there any way to get the SEM or the actual standard deviation for estimated marginal means with a covariate? When I try to calculate the stdev from the standard error provided in EMM, I get the same stdev for each group, which seems doubtful. I’m now worried about the legitimacy of using standard error from SPSS EMM in post-hoc t-tests if it is not really the standard error of the mean–anyone have insight on this?

Karen May 14, 2010 at 8:55 pm

Hi Katy,

That’s a great question. It threw me for a loop when I first discovered it too, but it’s actually not a problem.

The standard errors in the estimated marginal means are all based on the Mean Squared Error (MSE) in the overall ANOVA table. It reports them this way based on the ANOVA assumption that all groups have equal variance.

If that assumption is true, it’s inefficient to report separate estimates of the same population variance. So rather than report the variance separately for each group mean, it uses the average variance of all the groups.
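To make that concrete, here is a sketch for the two simplest cases, in notation introduced here rather than taken from SPSS output: n_g is the number of cases in group g and MSE is the error mean square from the ANOVA table. The first expression is for a single factor with no covariate; the second is for one factor plus one covariate V evaluated at the value v_0, assuming a common slope across groups.

```latex
\[
\mathrm{SE}(\mathrm{EMM}_g) = \sqrt{\frac{\mathrm{MSE}}{n_g}}
\qquad\text{and}\qquad
\mathrm{SE}(\mathrm{EMM}_g) = \sqrt{\mathrm{MSE}\left(\frac{1}{n_g}
  + \frac{(v_0 - \bar{V}_g)^2}{\sum_{h}\sum_{i}(V_{hi} - \bar{V}_h)^2}\right)}
\]
```

In the first case, multiplying the standard error by the square root of n_g returns the square root of MSE for every group, and in the second case it returns nearly that whenever the group's covariate mean is close to v_0. That is why back-calculating a standard deviation from these standard errors gives about the same value for each group.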

Far April 18, 2013 at 7:05 am

Fixed factor D   Mean     Std. Error
Level1           88.742   .751
Level2           88.872   .832
Level3           89.664   .738

Hi there,
My design is a factorial design: Factor C with 4 levels and Factor D with 3 levels.
In my final results table I would like to report just one SEM for each fixed factor. The table above is the estimated marginal means table from a GLM univariate analysis in SPSS. Could you please tell me which of these standard errors is my SEM for Factor D?
Thank you in advance for your help and consideration.
Cheers.
The table should look like this:

                     level1   level2   level3   SE
dependent variable   98.11    98.44    97.65    0.265

Karen April 19, 2013 at 2:39 pm

Hi Far,

Hmm, usually the estimated marginal means give just one std error across a factor, but the descriptives give multiple values.

I’m not sure what’s going on there.

Katy May 16, 2010 at 8:14 pm

Thanks so much Karen!! That helps a lot!

Bill May 26, 2010 at 10:16 am

Hi,

Very useful article, thank you.
I have an additional question.
Why are the ‘estimated marginal means’ adjusted for a measured covariate not the same as the means of the new variable NewY, which is obtained by saving the unstandardized predicted values?

Janelle August 11, 2010 at 9:16 am

I need to somehow obtain an SD from the Marginal Means SE because I have overlapping samples (there are three types of a disease, and people may have more than one type) and I’m testing differences between these three disease types. I have a way to compute the variance of the differences between overlapping samples, but I need an SD rather than an SE. Can anyone help?

Guido October 6, 2010 at 12:19 pm

Hi,

Nice article! I have an additional question. Is it OK if the estimated marginal means have a negative value (on a measure that can’t be negative), or should my alarm bells be ringing? The data, however, do make sense to me (I ran a GLM with a categorical between-subjects factor, one repeated factor, and two continuous predictors). Thanks for any hints!

Karen October 6, 2010 at 12:45 pm

Hi Guido,

It is possible to get a negative EMMean even if the DV can’t be negative if, for example, you asked for the EMMean at a value of a continuous predictor that doesn’t actually exist in the data.

But unless you specifically did something like that, my alarm bells would be ringing. I would just check into it and make sure it’s estimating what you think it is.

Karen

Guido October 6, 2010 at 1:22 pm

Hi Karen,

Thanks very much for your very quick response! This is exactly what I did; I asked for the means at high and low levels of the continuous predictors. And the data make sense (theoretically, and by replicating an earlier finding in which I used a different paradigm). Thanks again!

Best, Guido

orna October 31, 2010 at 1:12 pm

Hi Karen, thank you for the informative post!

I recently ran a repeated measures analysis, and I’m not sure which means I should report. I have 2 independent variables (1 within subject, and one between), and the cells are similar in size.

Should I report the estimated marginal means, or should I report the means and SDs from the descriptive tables? (For some reason, the descriptives do not include overall means and SDs for the between-subject variable.)

Thank you,
orna

Karen November 1, 2010 at 1:03 pm

Hi Orna,

Report the Estimated Marginal Means. If your independent variables are independent of each other, they shouldn’t differ from the descriptives anyway. And if they do, the EMMeans are the ones you’re interested in.

Karen

orna November 1, 2010 at 3:09 pm

Thank you very much! :)

Otto November 17, 2010 at 5:59 pm

It seems that SPSS 18 doesn’t adjust the Estimated Marginal Means for a repeated measures (within-subject) variable.

Karen November 18, 2010 at 10:57 am

Otto, that’s not surprising if you ran it in GLM Repeated Measures. In that approach, the within subject variable is actually made up of multiple variables–one response for each level of the variable (the wide format).

If you ran it in Mixed, it would adjust for the within subject variable, since it is able to account for the within-subject variable as a single variable. It requires setting up the data differently (the long format).
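As a rough sketch of that long-format route, assuming a hypothetical wide file with one row per subject and variables named id, group, and y1 to y3 (none of these names come from the original question), the restructuring and the model could look something like this. The compound symmetry covariance structure is also just an assumption for illustration.

```spss
* Hypothetical wide file: one row per subject, with id, group, and y1 to y3.
VARSTOCASES
  /MAKE y FROM y1 y2 y3
  /INDEX=time(3)
  /KEEP=id group.

* Repeated measures model on the long file, with time as a single factor.
MIXED y BY group time
  /FIXED=group time group*time
  /REPEATED=time | SUBJECT(id) COVTYPE(CS)
  /EMMEANS=TABLES(group)
  /EMMEANS=TABLES(time)
  /EMMEANS=TABLES(group*time).
```

Because time is now one variable with three levels, it can appear in /FIXED and in /EMMEANS like any other factor.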

Sigrid February 21, 2011 at 4:40 am

Hi Karen,
I performed a Gamma GLM and was asked to produce adjusted estimates for my dependent variable because the results should be interpretable for a similar population. I am confused. What am I actually being asked for?
I computed model-based estimates as well as robust ones and they hardly differed. Hence I chose robust estimates, since they would allow for an incorrectly specified covariance structure. Somehow I have the feeling that this does not address the question. Could you please tell me what I actually have to do?
Thanks, Sigrid

/EMMEANS SCALE=ORIGINAL
/EMMEANS TABLES=vdichotom1 SCALE=ORIGINAL COMPARE=vdichotom1 CONTRAST=PAIRWISE
PADJUST=SEQBONFERRONI
/EMMEANS TABLES=vdichotom2 SCALE=ORIGINAL COMPARE=vdichotom2 CONTRAST=PAIRWISE
PADJUST=SEQBONFERRONI

Karen February 25, 2011 at 11:20 am

Hi Sigrid,

I can really only guess what they’re asking for, but it sounds like it isn’t about the standard errors.

The EMMeans adjust for other terms in the model, but that won’t make them interpretable for a similar population.

One thing I just saw in consulting, which I’ve never seen before, is the researcher added a weight command before running her GLM. It seemed strange to me because none of the reasons for weighting applied (missing data, complex sample, nonconstant variance).

It turns out that she weighted so that the results would be adjusted to be representative of the population. She had equal n’s in her three samples (it was an experiment), but these samples come from groups that aren’t equally represented in the population.

This seemed strange to me, since she wasn’t estimating the overall population mean, just the mean for each group, but it might be very important in her field in ways I’m not familiar with. Could it be something like that?

Sigrid March 28, 2011 at 7:13 am

Hi Karen,
thank you for your answer.
I do not think that my situation is comparable to the one you mention. The problem that I suppose they want me to address is that they would like to be able to apply my results to all possible populations and not just mine, which is representative of my country only. So they (and I) are wondering whether there is a way to make general comments on the results of my calculations.
If you have any idea on how to do it, it would be a great help to me.
Thanks, Sigrid

alex July 13, 2011 at 10:13 am

Thanks for the content.

I have a related question: I want to know how to use SPSS to generate a scatter plot of my data corrected for the covariates.

I am interested in the effect of a single predictor variable (X) on a single response variable (Y), but I have several covariates and one factor variable.

Can I plot the effect of X on Y taking into account 4 covariates and 1 factor?

Karen July 15, 2011 at 9:08 am

Hi Alex,

Yes, but you’ll have to do it in two steps.

The first step is to run a regression model regressing Y on the 4 covariates and 1 factor (without X). Save the residuals, which is easy to do in GLM with a /SAVE=RESID subcommand.

Those residuals are literally the distribution of Y after controlling for all those covariates. It’s what’s still not explained by those covariates.

Now plot X vs. Residuals.
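Those two steps might look roughly like the following in syntax. The variable names (Y, Grp, C1 to C4, X) are hypothetical stand-ins, and RES_1 is the name SPSS gives the saved residuals by default the first time they are saved in a dataset, so it may differ in your file.

```spss
* Hypothetical names: Y outcome, Grp factor, C1 to C4 covariates, X predictor of interest.
* Step 1: fit the model without X and save the unstandardized residuals.
UNIANOVA Y BY Grp WITH C1 C2 C3 C4
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /SAVE=RESID
  /DESIGN=Grp C1 C2 C3 C4.

* Step 2: plot X against the saved residuals.
GRAPH
  /SCATTERPLOT(BIVAR)=X WITH RES_1.
```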

Kathy July 25, 2011 at 9:07 am

Hi Karen,
I need to report the standard deviation with my marginal means instead of the standard error. Is there any way to calculate that via SPSS?

Thanks

Karen July 27, 2011 at 9:28 pm

Hi Kathy,

I believe the easiest way is to get the descriptives. They won’t be adjusted means, but the standard deviations will be there too.

Either check the Descriptive statistics box under the Options button or use /PRINT=DESCRIPTIVE in syntax.
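For instance, reusing the hypothetical Y, X, and V from the post, a run that prints both the unadjusted descriptives (means, standard deviations, and Ns) and the estimated marginal means could be sketched as:

```spss
* The PRINT subcommand adds the unadjusted means, SDs, and Ns to the output.
UNIANOVA Y BY X WITH V
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /PRINT=DESCRIPTIVE
  /EMMEANS=TABLES(X) WITH(V=MEAN)
  /CRITERIA=ALPHA(.05)
  /DESIGN=X V.
```

The standard deviations in the descriptives table belong to the raw group means, not to the adjusted means in the EMMEANS table.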

Alessio Toraldo November 9, 2012 at 2:01 pm

Dear all
thank you for the useful posts. I have a related problem.
I have to run a GLM analysis with factors A, B and a covariate C.
I wished to know what the EMM of AxB are when C=0, and you already solved my problem, by suggesting the syntax to obtain such information.
However, I also wish to have the significance values for the main effect of A, the main effect of B, and for the interaction AxB, *all computed at C=0.*
SPSS, by default, gives you the ANOVA output table (with all F, df, p-values, etc) with effects of factors and interactions computed for the *average* values of the covariate. Instead, I would need to have the table referring to a specific covariate value (C=0, see above). Do you know how to do it?
Thank you for any suggestions.
Alessio

Maya February 19, 2013 at 4:02 pm

It’s great to have a plot of marginal means, but how can I add SD or SE to that plot? Can anyone help?
Maybe there is some syntax or something else that can help?

Thanks.

Karen February 20, 2013 at 4:54 pm

Hi Maya,

I don’t know that you can do it within the GLM plot. But you can export the EMMeans table, with standard errors, and plot those.

Karen

Mariska November 7, 2013 at 6:48 am

Hi,

For a meta-analysis, we need a mean and standard deviation (sd) to calculate effect sizes. We have estimated standardized means and standard errors (se) from SPSS, but no standard deviations. Is it correct to apply the formula sd = se * sqrt(n) on our se from our adjusted analysis to calculate the standard deviation? Thank you for your help!

Mariska

Karen November 11, 2013 at 3:28 pm

It depends on exactly which procedure you’re using. Your means are standardized? Hmm.

If you’re using, say the estimated marginal means, realize that those are based on the assumption that all groups have the same variance. So those std errors aren’t unique. I’m not sure if you need unique sd’s for meta-analysis.

Ian December 9, 2013 at 11:34 am

Lots of good advice on this subject, thanks! One issue however: isn’t the rote calculation of EMMs for groups, after adjustment for covariates, equivalent to doing ANCOVA without first testing for heterogeneity of slopes by the significance of the covariate X categorical interaction term?

Karen December 23, 2013 at 1:36 pm

Hi Ian,

Sure, you don’t want to do any rote data analysis. I encourage people to think about what each result is really telling them, not follow rules.

Yes, you should absolutely test for that interaction, but it’s still useful to use the EMMeans if the lines are not parallel. See this: http://www.theanalysisfactor.com/ancova-assumptions-when-slopes-are-unequal/
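As a minimal sketch of that test, using the hypothetical Y, X, and V from the post, the factor-by-covariate term can be added to the /DESIGN subcommand:

```spss
* The X*V term tests whether the slope of V differs across the levels of X.
UNIANOVA Y BY X WITH V
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /EMMEANS=TABLES(X) WITH(V=MEAN)
  /CRITERIA=ALPHA(.05)
  /DESIGN=X V X*V.
```

If the X*V term matters, the differences among the estimated marginal means depend on where V is evaluated, so it can help to request them at several values of V rather than only at the mean.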

Claudia February 13, 2014 at 1:32 pm

I am a little confused:
- First, you write about “FACTORS (aka categorical predictors)” that were manipulated and not measured.
- Then, you write about “a COVARIATE in the model that was measured, not manipulated… The estimated marginal means will now be adjusted for the covariate.”
- But what if I have a measured factor, which I do not treat as a covariate, but as an independent factor?
- Would this sentence also be right? “If however, you have an IV in the model that was measured, not manipulated, things are a little different. The estimated marginal means will now be adjusted for the IV.”

Karen February 14, 2014 at 1:50 pm

Hi Claudia,

Good question. From the model’s mathematical point of view, there is no difference between variables that are manipulated or observed. Observed variables are more likely to be correlated, whereas manipulated ones are more likely to be independent. Beyond that, there is no difference in how SPSS estimates a manipulated or observed variable.

The model only cares if it’s categorical or continuous.

So yes, you would still treat a measured factor as a factor. The only thing that differs is how you will interpret the results. The estimated marginal means will be adjusted for any other predictors, factors or covariates, in the model.

Andreea May 15, 2014 at 2:11 pm

Hello,
I have a related problem: I want to obtain predicted means of the outcome adjusted for various other factors (that is, use the model to predict the outcome at the mean values of the covariates). However, I have both categorical and continuous confounders, so I cannot use the mean for the categorical ones; maybe the mode. Is there an easier way in GLM to do this, taking into account that some of my predictors are categorical? Initially I was planning to do it in a linear regression: create dummies for my categorical variables, then work out the modal value of each categorical predictor and add it to the first free row for the corresponding variable at the bottom of the file. Do the same, with the mean, for any continuous predictors. Then, in the Linear Regression dialogue box, click ‘Save’ and check the ‘Unstandardized predicted values’ and ‘Mean prediction intervals’ boxes. It will save the predicted value plus confidence intervals in that row in the datasheet.
However, I was hoping estimated marginal means would help me work around all those steps, but how do they account for the categorical predictors? Thank you!

Omar May 26, 2014 at 10:54 am

Hello Karen,

At the moment, if I want to know the EMMs evaluated at multiple values of the covariate, I create separate EMM tables. e.g. extending your example:

UNIANOVA Y BY X WITH V
/METHOD=SSTYPE(3)
/INTERCEPT=INCLUDE
/EMMEANS=TABLES(X) WITH(V=50)
/EMMEANS=TABLES(X) WITH(V=100)
/EMMEANS=TABLES(X) WITH(V=150)
/CRITERIA=ALPHA(.05)
/DESIGN=X V.

Is it possible to do this within a single table?

Thanks

Karen May 27, 2014 at 9:33 am

Hi Omar,

Not that I know of. The way you’re doing it is the way I do it.
