The Analysis Factor

Statistical Consulting, Resources, and Statistics Workshops for Researchers

Mixed Models: Can you specify a predictor as both fixed and random?

by Karen Grace-Martin

One of the most confusing things about mixed models arises from the way they’re coded in most statistical software.  Of the packages I’ve used, only HLM sets the model up differently, so this confusion doesn’t apply there.

But for the rest of them (SPSS, SAS, R’s lme and lmer, and Stata), the basic syntax requires the same pieces of information.

1.       The dependent variable

2.       The predictor variables for which to calculate fixed effects and whether those are categorical or continuous.  Each package has its own way of specifying this, but they all need to know it.

3.       The predictor variables for which to calculate random effects, the level at which to calculate those effects, and if there are multiple random effects, the covariance structure of those effects.

The confusion comes in when we specify the same predictor in both the fixed and random parts.  The syntax makes it look like we’re specifying the same predictor as both fixed and random.

But we’re not. It’s not only okay, it’s often the only way to write the model appropriately.

Let’s take a very simple example.  This is the same model I use in my free webinar Random Intercept and Random Slope Models.  If you haven’t seen it and want more detail, you can get the recording here.

The basic idea, though, is we’re comparing the economic growth over 5 decades between Rural and Metropolitan counties.

Economic growth is the outcome, measured in thousands of jobs (JobsK). JobsK is continuous.

County indicates from which county the observations come.  Each county has up to 5 measurements, and this is why we need the mixed model—to account for the inherent correlation among the multiple observations from the same county. County is categorical.

Time indicates the number of decades since 1960, and ranges from 0 to 4. Time is treated as continuous.

And Rural is an indicator (aka dummy) variable for whether the county is rural. Rural is categorical.
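To make the layout of these four variables concrete, here is a small sketch of what "long" format data of this kind might look like. The rows are invented purely for illustration and are not the data from the webinar; the county labels "A" and "B" and all JobsK values are made up.

```python
from collections import defaultdict

# Invented rows for illustration only; these are not the article's data.
# Long format: one row per county per decade.
rows = [
    {"County": "A", "Rural": 1, "Time": 0, "JobsK": 4.1},
    {"County": "A", "Rural": 1, "Time": 1, "JobsK": 4.6},
    {"County": "A", "Rural": 1, "Time": 2, "JobsK": 5.0},
    {"County": "A", "Rural": 1, "Time": 3, "JobsK": 5.3},
    {"County": "A", "Rural": 1, "Time": 4, "JobsK": 5.9},
    {"County": "B", "Rural": 0, "Time": 0, "JobsK": 52.0},
    {"County": "B", "Rural": 0, "Time": 1, "JobsK": 60.5},
    # County B has no row for Time 2: "up to 5 measurements" allows gaps.
    {"County": "B", "Rural": 0, "Time": 3, "JobsK": 78.2},
    {"County": "B", "Rural": 0, "Time": 4, "JobsK": 85.0},
]

# Group the repeated measurements by county.
by_county = defaultdict(list)
for row in rows:
    by_county[row["County"]].append(row)

# Number of measurements available for each county.
measurements_per_county = {c: len(obs) for c, obs in by_county.items()}

# Rural is constant within a county; Time and JobsK change from row to row.
rural_is_constant = all(
    len({r["Rural"] for r in obs}) == 1 for obs in by_county.values()
)

print(measurements_per_county)
print(rural_is_constant)
```

The rows that share a County value are the ones whose correlation the mixed model has to account for.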

SPSS

MIXED JobsK BY Rural WITH Time
/FIXED =Rural Time Rural*Time
/RANDOM Intercept Time|Subject(COUNTY) covtype(UN).

SAS

Proc mixed;
Class rural county;
Model JobsK=rural|time/solution;
Random int time/subject=county type=un;
run;

R’s lme

library(nlme)
model <- lme(JobsK ~ rural*time, random = ~ time | County, data = countylong, na.action = na.omit)

Stata

mixed JobsK c.Time##Rural || County: Time, variance reml cov(un)

You can see here that Time is listed in the fixed portion of the model, which appears in SPSS’s Fixed statement, SAS’s model statement, before the || in Stata, and before the comma in R.

And it’s also listed in the random portion, which appears in SPSS’s and SAS’s Random statement, after the || in Stata, and after the comma in R.

It looks like we’re treating Time as both fixed and random.  If we’re not, then what the heck are we doing?

The fixed portion is doing exactly what a linear model does.  It fits an overall regression line over time.  Since we have both Rural and a Rural*Time interaction, it actually fits two regression lines—one for the rural counties and one for the metropolitan counties.  The coefficient we get for Rural measures the difference in their intercepts and the coefficient for the interaction measures the difference in their slopes.
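Written out, the fixed portion described above looks like the following. This is a sketch in notation assumed here for illustration (the β symbols do not appear under these names in any of the software output), and it assumes Rural is coded 1 for rural counties and 0 for metropolitan ones, with metropolitan as the reference category. Which category a given package treats as the reference depends on its defaults, so the signs of the Rural terms may be flipped in your output.

```latex
\begin{align*}
E[\text{JobsK}] &= \beta_0 + \beta_1\,\text{Rural} + \beta_2\,\text{Time}
                  + \beta_3\,(\text{Rural}\times\text{Time}) \\
\text{Metropolitan } (\text{Rural}=0): \quad
E[\text{JobsK}] &= \beta_0 + \beta_2\,\text{Time} \\
\text{Rural } (\text{Rural}=1): \quad
E[\text{JobsK}] &= (\beta_0+\beta_1) + (\beta_2+\beta_3)\,\text{Time}
\end{align*}
```

Under that coding, β1 is the difference between the two intercepts and β3 is the difference between the two slopes.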

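Just to emphasize: This fixed effect for Time measures the average effect of time across counties, not the effect within any single county.  (With the Rural*Time interaction in the model, there is one such average slope for each group, as described above.)  It’s often called the population average effect, because it’s an estimate of the effect of time for the population of all counties.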

Okay, so what is that random effect of time?  Aren’t we making Time random as well as fixed?

As I said earlier, no.

A key part of the random statement is the identification of the Subject.  In this example, it’s County.  It’s really County that is a random factor in the model and we’re specifying two random effects for those Counties—an intercept and a slope over Time.

The random slope for Time at the County level means that the slope across time varies across Counties.  In other words, the effect of Time on Jobs (the slope) is different for different values of County.
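In equation form, the whole model can be sketched as follows. The notation is an assumption made for illustration: i indexes the measurement occasion, j indexes the county, u_0j and u_1j are county j's random intercept and random slope, and the unstructured 2×2 covariance matrix corresponds to the UN option in the syntax above.

```latex
\begin{align*}
\text{JobsK}_{ij} &= \beta_0 + \beta_1\,\text{Rural}_j + \beta_2\,\text{Time}_{ij}
  + \beta_3\,\text{Rural}_j\,\text{Time}_{ij}
  + u_{0j} + u_{1j}\,\text{Time}_{ij} + \varepsilon_{ij} \\
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
  &\sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{pmatrix} \right),
  \qquad \varepsilon_{ij} \sim N(0, \sigma^2)
\end{align*}
```

Time appears twice: once multiplied by the fixed coefficient β2, and once multiplied by the county-specific deviation u_1j. The model estimates β2 itself, but for the u_1j it estimates only their variance, τ11.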

If you are thinking that it sounds like we’re really fitting an interaction between Time and County, then you would be correct. We are.

Because this slope is a random effect, we don’t measure this interaction through a regression coefficient as we would if it were fixed.

Instead, we measure how much each County’s slope differs from the population average slope, then find the variance of these difference measures.  That’s the variance estimate for the random slope.

If that variance comes out to 0, it indicates that the slope of Time on Jobs is actually the same for all counties—they don’t vary from each other.

Now of course, we’re not doing these steps directly.  But that is basically what the model is doing, through a lot of complicated statistical algorithms.
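The intuition can still be sketched by hand. The Python below is a rough two-stage illustration on simulated data, not what any of the packages above actually compute: it fits a separate least-squares line to each county, then takes the variance of those slopes. All parameter values (number of counties, mean slope, standard deviations, seed) are hypothetical.

```python
import random
import statistics


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mx = statistics.fmean(xs)
    my = statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def simulate_county_slopes(n_counties=200, mean_slope=2.0, slope_sd=0.5,
                           noise_sd=0.3, seed=1):
    """Simulate 5 decades per county and return each county's fitted slope."""
    rng = random.Random(seed)
    times = [0, 1, 2, 3, 4]
    slopes = []
    for _ in range(n_counties):
        intercept = rng.gauss(10.0, 2.0)          # county-specific intercept
        slope = rng.gauss(mean_slope, slope_sd)   # county-specific slope
        jobs = [intercept + slope * t + rng.gauss(0.0, noise_sd) for t in times]
        slopes.append(ols_slope(times, jobs))
    return slopes


# Counties whose true slopes differ from one another.
slopes = simulate_county_slopes()
average_slope = statistics.fmean(slopes)           # analogue of the fixed effect
deviations = [s - average_slope for s in slopes]   # each county minus the average
slope_variance = statistics.variance(slopes)       # analogue of the random-slope variance

# Counties that all share the same true slope.
same_slopes = simulate_county_slopes(slope_sd=0.0)
same_slope_variance = statistics.variance(same_slopes)

print(average_slope, slope_variance, same_slope_variance)
```

Note that in this hand calculation the variance does not come out to exactly 0 even when every county has the same true slope, because each fitted slope also carries measurement noise. Separating the true slope variance from that noise is part of what the mixed model's estimation does for you.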

So, to reiterate the central point: Time in the fixed statement measures the overall effect of time on jobs across all counties.  Time in the random statement measures the variance in the effects of time on jobs across counties.  It looks the same in the syntax, but it’s actually a very different concept.

 


Tagged With: fixed effect, linear mixed model, random effect, Random Factor, Repeated Measures

Comments

  1. Leo says

    August 30, 2020 at 9:44 am

    Karen, thanks for this explanation. So, are there situations where the same variable need not appear in both FE and RE? What if I have a level-2 predictor that does not vary at level 1 at all (e.g., state-level policy attributes that apply to everybody in the same state)? Thank you.

  2. shalini says

    July 17, 2020 at 11:20 pm

    This is great content! Nicely and clearly written. Thank you so much for sharing these freely!

  3. Dasha says

    April 20, 2020 at 10:21 am

    Hi Karen, I am wondering whether it is possible that, for instance, the fixed time effect does not turn out to be significant but the random effect for time does? In such a case, would it be OK to state the model as follows? Thank you!

    model<-lme(JobsK~rural, random=~time|County,data=countylong)

    • Karen Grace-Martin says

      April 20, 2020 at 1:05 pm

      Hi Dasha,

      Totally possible.

      The fixed effect is testing whether the average effect for all counties (for example for Time) = 0. The random effect is testing whether the effect of time is the same for all counties (variance among the counties=0).

  4. Ganesh Sharma says

    November 15, 2019 at 10:31 am

    Hi Karen,
    That’s very helpful. Glad to find this content on the web.

    Thank you very much.

  5. Salah Lotfi says

    February 14, 2019 at 5:56 pm

    Hi Karen,
    This was very intuitive and helpful. Thanks for taking time and making difficult topics easy to understand.

  6. Alexander says

    November 14, 2018 at 12:18 pm

    Hi Karen, I have a question concerning this topic. When I build my model, would I introduce the random slopes last or first, compared to the fixed effects? In my example the final model itself works, but if I introduce factor1 as a random slope first, adding it as a fixed factor after that does not significantly increase model fit. If I add it as a fixed factor first, it significantly increases model fit (p<.000).
    I would be interested in the justification of using a model like that. Which way would you argue?

  7. Patrick says

    May 10, 2018 at 3:08 pm

    Hey Karen,

    you are producing high quality content. I’m really glad I found this website. Same goes for the webinars.

    Do you think you could write a few lines about covariance structures in repeated measures mixed models some day? That’d be really great 🙂

    Patrick

    • Karen Grace-Martin says

      May 17, 2018 at 10:47 am

      Hi Patrick,

      That is actually a difficult topic because there are two different places in mixed models where you specify covariance structures–the G matrix and the R matrix. Unfortunately, the structures that make sense in one don’t make sense in the other. And also unfortunately, I have found that understanding the difference between them is the biggest stumbling block for understanding mixed models in repeated measures. We spend hours on it in my Analyzing Repeated Measures workshop.

      People have a lot of lightbulb moments in that workshop because we’re so deliberate about how we teach it. But it really does take hours to explain.

      All that said, I have written a few things on the topic here:

      https://www.theanalysisfactor.com/covariance-matrices/
      https://www.theanalysisfactor.com/mixed-models-repeated-measures-g-side-r-side/
      https://www.theanalysisfactor.com/unstructured-covariance-matrix-when-it-does-and-doesn%e2%80%99t-work/

  8. Meredith says

    March 27, 2018 at 1:42 pm

    This is very helpful! Thank you!

  9. skan says

    April 25, 2017 at 5:23 am

    Hello.

    Is it possible to have a model such as (in lme4 notation)
    Y ~ x + ID + (1|ID)
    Where a variable appears both as the fixed effect and as the subject of the random effect?

    • Karen Grace-Martin says

      May 17, 2018 at 10:50 am

      Hi Skan,
      No. That is specifying the variable as both fixed and random. What you can do is include a level-1 variable in both the fixed portion and as a random slope across subject.

  10. SChang says

    July 12, 2016 at 10:58 pm

    Thanks for a very nice answer to the question!!

  11. PRAVATA KUMAR DASH says

    April 17, 2016 at 10:40 pm

    Hi Karen,

    Thank you very much for the explanation. However, one question always crops up in my mind: I have a response variable which is sales at sku level. Suppose in a random effect model we are trying to get random effects for a media variable on different skus (10 skus) using SAS. But the covariance parameter is not significant and hence there is no random effect. Then we try interaction effect sku*Media_TV in the model statement in SAS. If it comes out to be significant, we have interaction effect and hence different slope coefficients for different skus. How do we interpret this and what is the difference between interaction effect and random effect here, from a business point of view?

    • Karen Grace-Martin says

      May 17, 2018 at 10:52 am

      Pravata,

      I’m not following your design well enough to actually give you advice.

      I will say, though, not to use the p-values for covariance parameter estimates. They’re considered unstable unless you have an enormous data set.


Copyright © 2008–2021 The Analysis Factor, LLC. All rights reserved.