Likert Scale Items as Predictor Variables in Regression

by Karen Grace-Martin

I was recently asked whether it’s okay to treat a Likert scale as continuous when it is used as a predictor in a regression model. Here’s my reply. The researcher asked about logistic regression, but the same answer applies to all regression models.

1. There is a difference between a Likert scale item (a single 1-7 scale, e.g.) and a full Likert scale, which is composed of multiple items. If it is a full Likert scale, with a combination of multiple items, go ahead and treat it as numerical.

2. If it is a single item, it is probably fine to treat it as numerical. There is more justification for this if it has 7 or more values, but even with 5 you may be okay.

3. There are NO assumptions about the distribution of the predictor (independent) variables in any regression. However, parameter estimates generally are only interpretable for nominal categories or numerical quantities.

The coefficient is interpreted as the difference in the mean of Y, the outcome, for each one-unit difference in X, the predictor. If the predictor is categorical and dummy coded, a one-unit difference simply refers to switching from one category to the other. If the predictor is numerical, a one-unit difference should be meaningful.

Ordinal predictor variables have to be treated as either nominal unordered categories or numerical. In the former case, you are throwing away information about the ordering. In the latter, you’re making assumptions about the distances between the scale points. If those distances can be reasonably considered equal and meaningful, then it is reasonable to treat the predictor as numerical (i.e., if a one-unit change from 1 to 2 is roughly equivalent to a one-unit change from 3 to 4).
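To make the two options concrete, here is a minimal sketch in plain Python. The data are made up for illustration, and the function names (numeric_slope, category_means) are hypothetical. It assumes a linear model with a single Likert item as the only predictor. Treated as numerical, the item gets one slope. Treated as dummy-coded categories, each coefficient is a category mean minus the reference category mean, so the gaps between adjacent category means give a rough check on the equal-distance assumption.

```python
from collections import defaultdict

# Hypothetical data: x is a single 1-5 Likert item, y is a numeric outcome.
x = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
y = [2.0, 2.4, 3.1, 2.9, 4.2, 3.8, 4.9, 5.3, 6.1, 5.9]


def numeric_slope(x, y):
    """Least-squares slope when x is treated as numerical (one coefficient)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def category_means(x, y):
    """Mean of y within each category of x."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    return {level: sum(vals) / len(vals) for level, vals in sorted(groups.items())}


slope = numeric_slope(x, y)
means = category_means(x, y)

# With dummy coding and category 1 as the reference, each coefficient is
# that category's mean of y minus the reference category's mean.
dummy_coefs = {level: m - means[1] for level, m in means.items() if level != 1}

# Gaps between adjacent category means. If they are all close to the single
# numeric slope, treating the item as numerical loses little.
steps = [means[level + 1] - means[level] for level in range(1, 5)]

print("numeric slope:", round(slope, 3))
print("dummy coefficients:", {k: round(v, 3) for k, v in dummy_coefs.items()})
print("adjacent steps:", [round(s, 3) for s in steps])
```

If the adjacent steps differ a lot from one another, that is a sign the equal-spacing assumption is doing real work, and the categorical coding may be the safer choice.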

For more information and some nice references on using Likert scales, see my post “Can Likert Scales Ever be Considered Continuous?”
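For point 1 above, one common way to turn several items into a full scale score is to average each respondent's item responses. The sketch below assumes three 1-7 items that measure the same construct and are all worded in the same direction. The data and the function name scale_score are hypothetical, and averaging is only one option (a sum or a factor score are others).

```python
# Hypothetical responses: each row holds one respondent's answers to three
# 1-7 Likert items assumed to measure the same construct.
responses = [
    [6, 7, 5],
    [2, 3, 2],
    [4, 4, 5],
    [7, 6, 6],
]


def scale_score(items):
    """Average the item responses into a single scale score."""
    return sum(items) / len(items)


scores = [scale_score(row) for row in responses]
print(scores)
```

The resulting score takes many more distinct values than any single item, which is part of why treating a full scale as numerical is easier to justify.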

Comments

Debru Getachew says

July 25, 2021 at 11:24 am

Hi, I am Debru Getachew. How can I do a regression analysis using multiple independent variables of Likert scale data with one dependent variable, a financial ratio (ROA), from five years of time series data?

Reply
Umar isah says

January 9, 2021 at 12:15 pm

Compliments of the season, sir. I am working on acceptability, quality, and affordability (all independent variables) and success in delivery (dependent variable). All questions are on a four-point Likert scale. I want to know if it is possible to use logistic regression to predict the dependent variable from the various independent variables.

Reply
- Karen Grace-Martin says
  
  December 6, 2021 at 1:08 pm
  
  Yes, you could use logistic regression. You might find this helpful: https://www.theanalysisfactor.com/decide-between-multinomial-and-ordinal-logistic-regression-models/
  
  Reply
Amal George says

June 17, 2020 at 9:34 am

I’m working on:
Dependent Variable – Likert Scale
Independent Variable – Demographic Factors(age, income,gender,marital status,district and education)

Which test should I run in SPSS to find the relation between them?

Reply
- Novel says
  
  November 11, 2020 at 7:15 am
  
  Was it ordinal logistic regression?
  
  Reply
Iyasu Demeke says

February 26, 2020 at 5:55 am

I used a 7-point Likert scale outcome (yield) variable for the impact of technology adoption on household income. How can I use this Likert rating in a multinomial logistic regression and a multinomial endogenous switching regression model? Are the regression results sensible and interpretable?

Reply
Hira Haseeb says

January 6, 2020 at 4:09 pm

Hi, I have 16 predictor variables, of which only 1 is a categorical variable on a Likert scale. I am expected to run a logit model.

Reply
- Karen Grace-Martin says
  
  January 24, 2020 at 10:11 am
  
  Hira, that should work. You just have to either dummy-code that variable yourself or, depending on which software procedures you’re using, tell your software that it’s categorical.
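  As a minimal sketch of what dummy-coding the variable yourself looks like, the plain-Python helper below builds one 0/1 indicator column per non-reference category. The function name dummy_code, the column names, and the data are hypothetical. Most statistical software does this for you once the variable is declared categorical.

```python
def dummy_code(values, reference):
    """Return one 0/1 indicator column per non-reference category."""
    levels = sorted(set(values) - {reference})
    return {
        f"item_{level}": [1 if v == level else 0 for v in values]
        for level in levels
    }


# Hypothetical responses to a single 1-5 Likert item.
likert_item = [1, 3, 5, 2, 3, 4, 1]

# Category 1 is the reference, so it gets no column of its own.
dummies = dummy_code(likert_item, reference=1)
for name, column in dummies.items():
    print(name, column)
```

  A 5-category item yields 4 indicator columns, and each coefficient is then read as a comparison with the reference category.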
  
  Reply
William says

April 22, 2018 at 8:03 am

Hi,

I am writing my dissertation and I am struggling with which type of regression analysis I should use. I created a questionnaire in order to understand the relationship of some factors with the number of ideas implemented, so I have the following scenario:

Dependent variable:
Qty of ideas implemented: the number of ideas that each employee has implemented (e.g., one employee can have 20 ideas implemented and another can have 2).

Independent variable:

Motivation (first factor)
Q1 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)
Q2 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)
Q3 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)

Creativity (second factor)
Q1 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)
Q2 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)
Q3 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)
Q4 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)
.
.
.
And so on…

Can someone help me? :'(

Reply
- William says
  
  April 22, 2018 at 8:07 am
  
  I forgot to mention that I am using SPSS to analyze my data 🙂 and my intention is to combine the different questions under each factor. Should I calculate the mean? Or do you have a better way to approach this analysis?
  
  Thanks in advance! 🙂
  
  Reply
- Karen Grace-Martin says
  
  May 15, 2018 at 11:50 am
  
  Hi William,
  
  It looks like you have a lot going on here. We can help you, but I’d need a lot of clarification and it might take a bit to explain. I think you will need a count model because of your DV, but you also seem to need some sort of PCA or FA for your IV.
  
  I would strongly suggest joining our membership program, Statistically Speaking. We have a lot of resources there–webinars on count models, EFA, and PCA, as well as weekly Q&A sessions. For students it’s just $29/month for everything. Here’s the info: https://www.theanalysisfactor.com/membership-program/
  
  Reply
  - Adriane says
    
    January 8, 2025 at 12:57 pm
    
    Hello, I have the same question. How did you do it?
    
    Reply
    - Karen Grace-Martin says
      
      January 22, 2025 at 11:19 am
      
      Adriane, I’m not sure how William did it, but really the first step is probably an exploratory factor analysis.
      
      Reply
Matthew says

September 12, 2017 at 5:33 am

I disagree. The more numbers the more problematic in the interpretation of the coefficient. In fact, I would say that coefficients are rather meaningless if there are a large number of categories. The most meaningful category to me would be a simple dummy =1 if agree or =0 if disagree. If we have a large number of categories then I would seriously question if the latent utility model driving the observed decision can be broken down farther than what can be observed through actions. In other words, if the person is given two choices, take-it or leave-it then it would be impossible to tell to what degree that the person felt about the choice simply by asking them.

Reply
Danai Korre says

May 10, 2017 at 12:11 pm

Hello,
Do you have any academic references to support that?

Best,

Reply
Sarah says

January 14, 2017 at 1:50 pm

Say I have a concept like personal enrichment. Enrichment is a combination of other variables, for example how happy I feel, whether I consider myself wealthy, whether I am sociable, etc., say 5 variables in total. Each of these variables is measured using a 6-point Likert-like satisfaction scale. I would like to combine these variables into a single variable called enrichment. However, to use the variable enrichment in a linear regression it needs to be a scale variable. My question is this: how do I do it?

Reply
sh says

June 19, 2016 at 5:02 am

Hi
I have age, experience, and retirement as independent variables and professionalism measured on a 1-5 Likert scale. I want to know which independent variable weighs more on professionalism. Please let me know how to run a multiple regression. I have tried it with SPSS several times but failed. The problem is that the items (dependent variable) are not a single variable. How can SPSS handle that?

Reply
learner says

February 16, 2016 at 5:45 am

Hi, my data includes students’ grades (X, XII and college) and their responses for different factors like study habits(3 questions), personality traits (5 questions) etc on 5-point scale (5 – Always, 4 – Often, 3 – Sometimes, 2 – Rarely, 1 – Never)
My objective is to study “factors affecting academic performance of students”

1. Should I take the median for each question and then compare it between the male and female groups using a chi-square test? Or use a t-test on the means of each question?
2. To measure the effect of these factors on students’ grades, how do I use regression? The DV is grade, but how do I compute the IVs?
Please help. I am stuck.

Reply
- Conde says
  
  December 5, 2016 at 7:19 am
  
  Hello, I’m running into the same trouble in the exact same research. Did you figure it out? Can you help me?
  
  Reply
imran says

January 26, 2016 at 1:28 am

I am conducting research, asking students whether they want to continue their studies after an MBA on a Likert scale (likely, highly likely, unlikely, highly unlikely, etc.). This is my dependent variable. My independent variables are 20 questions, also measured on a Likert scale (strongly agree, agree, neutral, disagree, strongly disagree).
I want to check which question, i.e., which variable, most strongly affects a student’s decision to continue studies. What test should I apply?
Ordinal regression, factor analysis, Spearman correlation, or something else?
Please give your expert advice.

Reply
boitumelo says

September 9, 2015 at 9:01 am

Hi, I don’t know if this is correct, but I am using log-linear models to look at patterns of response on particular Likert scale questions. I want to know if there are trends in responses across all my demographics. Am I using the correct procedure for this? I could use ANOVA, but I want to do away with the assumptions that come with it, hence my choice of the current test. I figured that if I ran ANOVA and chi-square side by side in order to pick the best results based on my judgment of the skewness of the distribution, I would be wasting time. Can I confidently say that a log-linear model can be treated as a nonparametric version of ANOVA, since Kruskal-Wallis can only take one factor at a time?
I am a little confused here.

Reply
Wais Qarani says

April 14, 2015 at 3:25 am

Hello Colleagues;

My participants will self report their managerial competencies using the below scale (50 managerial competencies in five main categories):

1=Very poor
2=Poor
3=Good
4=Very good
5=Excellent

How I can plan the analysis?

Reply
Fer says

August 11, 2014 at 11:54 pm

Hello Karen,

If my DV and IVs are Likert scales (1 = strongly disagree to 5 = strongly agree) with multiple items, can I run a PCA for each construct, compute scores, test Cronbach’s alpha, and after that just run a multiple regression?

Reply
Jun says

March 19, 2014 at 1:36 pm

Hello Karen,

Can I run multiple linear or logistic regression if one or more of my IVs are ordinal in nature?

Reply
- Karen says
  
  April 4, 2014 at 9:48 am
  
  Yes, either. You just can’t treat the IV as ordinal. It has to enter the model as either categorical or numerical.
  
  Reply
Cynthia George says

December 17, 2013 at 10:19 am

Hello, I want to run a regression between job satisfaction (DV) and work-related stress (IV).
My DV is on a 5-point Likert scale, and my IV is on a 5-point Likert scale too. Is it possible to run it?

Reply
- Karen says
  
  December 23, 2013 at 1:25 pm
  
  Hi Cynthia,
  
  The simplest approach would be to do a Spearman correlation, if you don’t have any other covariates to control for. Technically, both of those likert items are ordinal.
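  For illustration, a Spearman correlation is just a Pearson correlation computed on ranks, with tied values sharing their average rank. The plain-Python sketch below shows that idea. The function names and the data are hypothetical, and it returns only the coefficient (no p-value), so in practice you would use your statistics package's built-in procedure.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length numeric lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(a), average_ranks(b))


# Hypothetical 5-point Likert responses from eight people.
stress = [1, 2, 2, 3, 4, 4, 5, 5]
satisfaction = [5, 4, 5, 3, 3, 2, 1, 2]
print(round(spearman(stress, satisfaction), 3))
```

  Because Likert items produce many ties, the average-rank handling matters here.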
  
  Reply
  - Christina Sebastian says
    
    January 3, 2022 at 3:19 am
    
    I’m running similar analysis but my IV is 1 to 6pt likert while my DV is 0 to 5. Will these value differences cause an issue? Do I need to recode my DV to match the IV?
    
    Additionally, I have multiple categories within the DV and IV to analyze. My IV is an average of 3 Likert scale categories (1 to 6), while my DV is an average of 4 Likert scale categories (0 to 5). There are another 6 nominal and 4 ordinal IVs to assess.
    
    Thus far I have had the best results using Spearman. Multiple regression and MANOVA each produce errors because 2 of the IVs perfectly predict the 3rd, and so forth. Any suggestions? Is Spearman sufficient?
    
    Reply
Muhammad says

September 18, 2013 at 3:18 pm

Tell me more about the use of a 1-5 Likert scale, which regression model should be applied, and how to run it in SPSS.

Reply
- Karen says
  
  September 25, 2013 at 10:25 am
  
  Hi Muhammad, which regression is applied depends on the dependent variable. Here is more info: https://www.theanalysisfactor.com/when-dependent-variables-are-not-fit-for-glm-now-what/
  
  Reply
Michelle says

May 23, 2012 at 9:19 pm

Thanks a lot!!!

Reply
Wynand says

May 12, 2011 at 7:35 am

I know this article is 2 years old but the references provided were invaluable in justifying my statistical methodology! Thank you so much for taking the effort to write this.

Reply
