# Likert Scale Items as Predictor Variables in Regression

I was recently asked whether it’s okay to treat a Likert scale as continuous when it is used as a predictor in a regression model. Here’s my reply. The researcher asked about logistic regression, but the same answer applies to all regression models.

1. There is a difference between a Likert scale item (e.g., a single 1-7 scale) and a full Likert scale, which is composed of multiple items. If it is a full Likert scale that combines multiple items, go ahead and treat it as numerical.

2. If it is a single item, it is probably fine to treat it as numerical. There is more justification for this if it has 7 or more values, but even with 5 you may be okay.

3. There are NO assumptions about the distribution of the predictor (independent) variables in any regression.  However, parameter estimates generally are only interpretable for nominal categories or numerical quantities.

The coefficient is interpreted as the difference in the mean of Y, the outcome, for each one-unit difference in X, the predictor.  If the predictor is categorical and dummy coded, a one-unit difference simply refers to switching from one category to the other.  If the predictor is numerical, a one-unit difference should be meaningful.
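To make this concrete, here is a minimal sketch using NumPy and made-up numbers. The arrays `group`, `y`, `item`, and `y_item` and the helper `ols` are purely illustrative and do not come from any real study. It uses ordinary least squares for simplicity; in a logistic model the same logic applies on the log-odds scale rather than to the mean of Y.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y ~ X (X already includes an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Hypothetical two-category predictor, dummy coded 0/1.
group = np.array([0, 0, 0, 1, 1, 1])
y = np.array([2.0, 3.0, 4.0, 5.0, 6.0, 7.0])

X = np.column_stack([np.ones(len(y)), group])
intercept, slope = ols(X, y)

# For a dummy-coded predictor, the coefficient is the difference in group means.
mean_diff = y[group == 1].mean() - y[group == 0].mean()
print("dummy coefficient:", slope, "difference in means:", mean_diff)

# Hypothetical single 5-point item treated as numerical.
item = np.array([1, 2, 3, 4, 5])
y_item = np.array([2.1, 3.9, 6.2, 7.8, 10.0])

X_item = np.column_stack([np.ones(len(y_item)), item])
item_intercept, item_slope = ols(X_item, y_item)

# One slope describes every one-point step (1 to 2, 2 to 3, and so on),
# which is only meaningful if those steps are roughly equal in size.
print("estimated difference in mean Y per scale point:", item_slope)
```

In the first model the coefficient and the difference in group means coincide. In the second, the single slope is the estimated difference in mean Y for every one-point step on the item.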

Ordinal predictor variables have to be treated as either nominal unordered categories or numerical. In the former case, you are throwing away information about the ordering. In the latter, you’re making assumptions about the distances between the scale points. If those distances can reasonably be considered equal and meaningful (i.e., if a one-unit change from 1 to 2 is roughly equivalent to a one-unit change from 3 to 4), then it is reasonable to treat the predictor as numerical.
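One rough way to check whether the equal-spacing assumption is plausible is to fit the predictor both ways and compare. The sketch below uses NumPy and invented data for a hypothetical 5-point item. The names `item`, `y`, and `fit` are illustrative assumptions, and ordinary least squares stands in for whatever regression model you are actually using.

```python
import numpy as np

def fit(X, y):
    """Return least-squares coefficients and the residual sum of squares."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ coef) ** 2))
    return coef, rss

# Invented responses to a hypothetical 5-point item, plus an outcome.
item = np.array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5])
y = np.array([1.0, 1.4, 2.1, 2.3, 2.9, 3.3, 4.0, 4.2, 4.8, 5.4])
ones = np.ones(len(y))

# Numerical treatment: one slope shared by every one-point step.
X_num = np.column_stack([ones, item])
coef_num, rss_num = fit(X_num, y)

# Nominal treatment: dummy code categories 2-5, with category 1 as the reference.
levels = np.unique(item)
dummies = (item[:, None] == levels[1:]).astype(float)
X_nom = np.column_stack([ones, dummies])
coef_nom, rss_nom = fit(X_nom, y)

# Estimated change in mean Y between adjacent categories under nominal coding.
steps = np.diff(np.concatenate([[0.0], coef_nom[1:]]))
print("slope (numerical):", coef_num[1])
print("adjacent-category steps (nominal):", steps)

# F statistic for the extra parameters the nominal model spends.
df_extra = X_nom.shape[1] - X_num.shape[1]
df_resid = len(y) - X_nom.shape[1]
f_stat = ((rss_num - rss_nom) / df_extra) / (rss_nom / df_resid)
print("F statistic for departure from equal spacing:", f_stat)
```

If the adjacent-category steps are all close to the single slope and the F statistic is small, treating the item as numerical loses little. Steps that differ widely suggest the distances between scale points are not equal, and the dummy-coded version may be the safer choice.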

For more information and some nice references on using Likert scales, see my post “Can Likert Scales Ever be Considered Continuous?”



1. Debru Getachew says

Hi, I am Debru Getachew. How can I do a regression analysis using multiple independent variables from Likert scale data with one dependent variable, a financial ratio (ROA), from five years of time series data?

2. Umar isah says

Compliments of the season, sir. I am working on acceptability, quality, and affordability (all independent variables) and success in delivery (dependent variable). All questions are on a four-point Likert scale. I want to know if it is possible to use logistic regression to predict the dependent variable from the various independent variables.

3. Amal George says

I’m working on:
Dependent Variable – Likert Scale
Independent Variable – Demographic Factors (age, income, gender, marital status, district and education)

Which test should I run in SPSS to find the relation between them?

4. Iyasu Demeke says

I used a 7-point Likert scale outcome (yield) variable for the impact of technology adoption on household income. How can I use this Likert rating in a multinomial logistic regression and a multinomial endogenous switching regression model? Are the regression results sensible and interpretable?

5. Hira Haseeb says

Hi, I have 16 predictor variables, among which only 1 categorical variable is on a Likert scale. I am expected to run a logit model.

• Hira, that should work. You just have to either dummy-code that variable yourself or, depending on which software procedures you’re using, tell your software that it’s categorical.

6. William says

Hi,

I am writing my dissertation and I am struggling with which type of regression analysis I should use. I created a questionnaire in order to understand the relationship of some factors with the number of ideas implemented, so I have the following scenario:

Dependent variable:
Qty of ideas implemented: the number of ideas that the employees have implemented (e.g., one employee can have 20 ideas implemented and another can have 2).

Independent variable:

Motivation (first factor)
Q1 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)
Q2 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)
Q3 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)

Creativity (second factor)
Q1 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)
Q2 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)
Q3 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)
Q4 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)
.
.
.
And so on…

Can someone help me? :'(

• William says

I forgot to mention that I am using SPSS to analyze my data 🙂 and my intention is to combine the different questions under each factor. Should I calculate the mean? Or do you have a better way to approach this analysis?

• Hi William,

It looks like you have a lot going on here. We can help you, but I’d need a lot of clarification and it might take a bit to explain. I think you will need a count model because of your DV, but you also seem to need some sort of PCA or FA for your IV.

I would strongly suggest joining our membership program, Statistically Speaking. We have a lot of resources there–webinars on count models, EFA, and PCA, as well as weekly Q&A sessions. For students it’s just \$29/month for everything. Here’s the info: https://www.theanalysisfactor.com/membership-program/

7. Matthew says

I disagree. The more numbers there are, the more problematic the interpretation of the coefficient becomes. In fact, I would say that coefficients are rather meaningless if there are a large number of categories. The most meaningful coding to me would be a simple dummy: =1 if agree or =0 if disagree. If we have a large number of categories, then I would seriously question whether the latent utility model driving the observed decision can be broken down further than what can be observed through actions. In other words, if the person is given two choices, take it or leave it, then it would be impossible to tell how strongly the person felt about the choice simply by asking them.

8. Sarah says

Say I have a concept like personal enrichment. Enrichment is a combination of other variables, for example how happy I feel, whether I consider myself wealthy, whether I am sociable, etc.; say 5 variables in total. Each of these variables is measured using a 6-point Likert-like satisfaction scale. I would like to combine these variables into a single variable called enrichment. However, to use the variable enrichment in a linear regression, it needs to be a scale variable. My question is this: how do I do it?

9. sh says

Hi
I have age, experience, and retirement as independent variables, and professionalism measured on a Likert scale of 1-5. I want to know which independent variable weighs more on professionalism. Please let me know how to run a multiple regression. I have tried it with SPSS several times but failed. The problem is that the items (dependent variable) are not a single variable. How can SPSS handle that?

10. learner says

Hi, my data includes students’ grades (X, XII and college) and their responses for different factors like study habits (3 questions), personality traits (5 questions), etc. on a 5-point scale (5 – Always, 4 – Often, 3 – Sometimes, 2 – Rarely, 1 – Never).
My objective is to study “factors affecting academic performance of students”

1. Should I take the median for each question and then compare it between the male and female groups using a chi-square test? Or use a t-test on the means of each question?
2. To measure the effect of these factors on students’ grades, how do I use regression? The DV is grade, but how do I compute the IVs?

11. imran says

I am conducting research,
asking students whether they want to continue their studies after an MBA on a Likert scale (likely, highly likely, unlikely, highly unlikely, etc.); this is my dependent variable. My independent variables are 20 questions, again measured on a Likert scale (strongly agree, agree, neutral, disagree, strongly disagree).
I want to check which question, i.e., which variable, most strongly affects students’ decision to continue their studies. What test should I apply?
Ordinal regression, factor analysis, Spearman correlation, or any other?

12. boitumelo says

Hi, I don’t know if this is correct, but I am using log-linear models to look at patterns of response on particular Likert scale questions. I want to know if there are trends in responses across all my demographics. Am I using the correct procedure for this? I could go for ANOVA, but I want to do away with the assumptions that come with it, hence my choice of the current test. I figured that if I ran ANOVA and chi-square simultaneously in order to get the best results based on my judgment of the skewness of the distribution, I would be wasting time. Can I confidently say that a log-linear model can be treated as a nonparametric version of ANOVA, since Kruskal-Wallis can only take one factor at a time?
I am a little confused here.

13. Wais Qarani says

Hello Colleagues;

My participants will self-report their managerial competencies using the scale below (50 managerial competencies in five main categories):

1=Very poor
2=Poor
3=Good
4=Very good
5=Excellent

How can I plan the analysis?

14. Fer says

Hello Karen,

If my DV and IVs are Likert scales (1 = strongly disagree, 5 = strongly agree) with multiple items, can I run a PCA for each construct, compute scores, test Cronbach’s alpha, and after that just run a multiple regression?

15. Jun says

Hello Karen,

Can I run multiple linear or logistic regression if one or more of my IVs are ordinal in nature?

16. Cynthia George says

Hello, I want to run a regression of job satisfaction (DV) on work-related stress (IV).
My DV is on a 5-point Likert scale, and my IV is on a 5-point Likert scale too. Is it possible to run it?

• Karen says

Hi Cynthia,

The simplest approach would be to do a Spearman correlation, if you don’t have any other covariates to control for. Technically, both of those Likert items are ordinal.

• Christina Sebastian says

I’m running a similar analysis, but my IV is on a 1-to-6-point Likert scale while my DV is 0 to 5. Will these value differences cause an issue? Do I need to recode my DV to match the IV?

Additionally, I have multiple categories within the DV and IV to analyze. My IV is an average of 3 Likert scale categories (1 to 6) while my DV is an average of 4 Likert scale categories (0 to 5). There are another 6 nominal and 4 ordinal IVs to assess.

Thus far I have had the best results using Spearman. Multiple regression and MANOVA each produce errors because 2 of the IVs perfectly predict the 3rd, and so forth. Any suggestions? Is Spearman sufficient?

17. Wynand says

I know this article is 2 years old but the references provided were invaluable in justifying my statistical methodology! Thank you so much for taking the effort to write this.

Please note that, due to the large number of comments submitted, any questions on problems related to a personal study/project will not be answered. We suggest joining Statistically Speaking, where you have access to a private forum and more resources 24/7.