I was recently asked whether it’s okay to treat a Likert scale as a continuous predictor in a regression model. Here’s my reply. The researcher asked about logistic regression, but the same answer applies to all regression models.

1. There is a difference between a Likert scale item (a single 1-7 scale, e.g.) and a full Likert scale, which is composed of multiple items. If it is a full Likert scale that combines multiple items, go ahead and treat it as numerical.

2. If it is a single item, it is probably fine to treat it as numerical. There is more justification for this if it has 7 or more values, but even with 5 you may be okay.

3. There are NO assumptions about the distribution of the predictor (independent) variables in any regression. However, parameter estimates generally are only interpretable for nominal categories or numerical quantities.

In a linear model, the coefficient is interpreted as the difference in the mean of Y, the outcome, for each one-unit difference in X, the predictor (in a logistic model, it is the difference in the log-odds of Y). If the predictor is categorical and dummy coded, a one-unit difference simply refers to switching from the reference category to the other category. If the predictor is numerical, a one-unit difference should be meaningful.

Ordinal predictor variables have to be treated as either nominal unordered categories or numerical. In the former case, you are throwing away information about the ordering. In the latter, you’re making assumptions about the distances between the scale points. If those distances can be reasonably considered equal and meaningful (i.e., if a one-unit change from 1 to 2 is roughly equivalent to a one-unit change from 3 to 4), then it is reasonable to treat the predictor as numerical.
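Here is a minimal sketch of those two options, using NumPy and simulated data. Everything in it (the variable names, the sample size, the true slope of 0.5) is an assumption made for illustration, not something from the original question. It fits the same 5-point item once as a numerical predictor and once as dummy-coded categories, and shows that each dummy coefficient is just the difference in mean Y between that category and the reference category.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a single 5-point Likert item x and a continuous outcome y.
# The outcome is simulated with equal steps between scale points (an assumption).
n = 500
x = rng.integers(1, 6, size=n)                 # values 1..5
y = 2.0 + 0.5 * x + rng.normal(0, 1, size=n)

# Option A: treat x as numerical. One slope, which assumes equal spacing.
X_num = np.column_stack([np.ones(n), x])
beta_num, *_ = np.linalg.lstsq(X_num, y, rcond=None)

# Option B: treat x as nominal. Dummy code it with category 1 as the reference.
levels = np.arange(2, 6)
X_cat = np.column_stack([np.ones(n)] + [(x == k).astype(float) for k in levels])
beta_cat, *_ = np.linalg.lstsq(X_cat, y, rcond=None)

# Each dummy coefficient equals (mean of y in category k) - (mean of y in category 1).
group_means = np.array([y[x == k].mean() for k in range(1, 6)])
mean_diffs = group_means[1:] - group_means[0]

print("numeric slope:", round(beta_num[1], 3))
print("dummy coefficients:", np.round(beta_cat[1:], 3))
print("differences in group means:", np.round(mean_diffs, 3))
```

If the dummy coefficients climb in roughly equal steps, the single numeric slope is a reasonable summary. If they don't, the equal-distance assumption is doing more work than it should.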

For more information and some nice references on using Likert scales, see my post on “Can Likert Scales Ever be Considered Continuous?”


Hi,

I am writing my dissertation paper and I am struggling with which type of regression analysis I should use. I created a questionnaire in order to understand the relationship of some factors with the number of ideas implemented, so I have the following scenario:

Dependent variable:

Qty of ideas implemented: the number of ideas that each employee has implemented, e.g., one employee can have 20 ideas implemented and another can have 2.

Independent variable:

Motivation (first factor)

Q1 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)

Q2 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)

Q3 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)

Creativity (second factor)

Q1 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)

Q2 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)

Q3 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)

Q4 (strongly disagree=1, disagree=2, neutral=3, agree=4, strongly agree= 5)

.

.

.

And so on…

Can someone help me? :'(

I forgot to mention that I am using SPSS to analyze my data 🙂 and my intention is to combine the different questions under each factor. Should I calculate the mean? Or do you have a better way to approach this analysis?

Thanks in advance! 🙂

Hi William,

It looks like you have a lot going on here. We can help you, but I’d need a lot of clarification and it might take a bit to explain. I think you will need a count model because of your DV, but you also seem to need some sort of PCA or FA for your IV.

I would strongly suggest joining our membership program, Statistically Speaking. We have a lot of resources there–webinars on count models, EFA, and PCA, as well as weekly Q&A sessions. For students it’s just $29/month for everything. Here’s the info: https://www.theanalysisfactor.com/membership-program/

I disagree. The more numbers there are, the more problematic the interpretation of the coefficient. In fact, I would say that coefficients are rather meaningless if there are a large number of categories. The most meaningful coding to me would be a simple dummy: =1 if agree or =0 if disagree. If we have a large number of categories, then I would seriously question whether the latent utility model driving the observed decision can be broken down further than what can be observed through actions. In other words, if the person is given two choices, take it or leave it, then it would be impossible to tell how strongly the person felt about the choice simply by asking them.

Hello,

Do you have any academic references to support that?

Best,

Say I have a concept like personal enrichment. Enrichment is a combination of other variables, for example how happy I feel, whether I consider myself wealthy, whether I am sociable, etc.; say 5 variables in total. Each of these variables is measured using a 6-point Likert-like satisfaction scale. I would like to combine these variables into a single variable called enrichment. However, to use the variable enrichment in a linear regression it needs to be a scale variable. My question is this: how do I do it?
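One common way to build a composite like this is to average the items for each respondent and check their internal consistency with Cronbach's alpha. Below is a rough sketch under assumptions, not a definitive answer. The data are simulated, and the names `items`, `enrichment`, and `cronbach_alpha` are hypothetical ones chosen for the example.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) array of item scores."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)        # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)    # variance of the summed score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

rng = np.random.default_rng(1)

# Hypothetical responses: 200 people, 5 items on a 1-6 scale, all driven by
# one simulated underlying trait plus item-specific noise (an assumption).
n, k = 200, 5
trait = rng.normal(0, 1, size=(n, 1))
raw = 3.5 + 1.2 * trait + rng.normal(0, 1, size=(n, k))
items = np.clip(np.rint(raw), 1, 6)

# Composite scale score: the mean of the 5 items for each respondent.
enrichment = items.mean(axis=1)

alpha = cronbach_alpha(items)
print("Cronbach's alpha:", round(alpha, 2))
print("first five scale scores:", enrichment[:5])
```

The resulting `enrichment` column is the kind of multi-item scale score that point 1 of the post says can be treated as numerical. Averaging only makes sense if the items are scored in the same direction and hang together reasonably well, which is what alpha is meant to check.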

Hi

I have age, experience, and retirement as independent variables, and professionalism measured on a Likert scale of 1-5. I want to know which independent variable weighs more on professionalism. Please let me know how to run multiple regression. I have tried it with SPSS several times but failed. The problem is that the items (dependent variable) are not a single variable. How can SPSS handle that?

Hi, my data includes students’ grades (X, XII and college) and their responses for different factors like study habits(3 questions), personality traits (5 questions) etc on 5-point scale (5 – Always, 4 – Often, 3 – Sometimes, 2 – Rarely, 1 – Never)

My objective is to study “factors affecting academic performance of students”

1. Should I take the median for each question and then compare it between the male and female groups using a chi-square test? Or use a t-test on the means of each question?

2. To measure the effect of these factors on students’ grades, how do I use regression? The DV is grade, but how do I compute the IV?

Please help. I am stuck ..

Hello, I’m running into the same trouble in the exact same research. Did you figure it out? Can you help me?

I am conducting research.

I am asking students whether they want to continue their studies after their MBA, on a Likert scale (likely, highly likely, unlikely, highly unlikely, etc.); this is my dependent variable. My independent variables are 20 questions, also measured on a Likert scale (strongly agree, agree, neutral, disagree, strongly disagree).

I want to check which question, i.e., which variable, most strongly impacts students’ decision to continue their studies. Which test should I apply?

Ordinal regression, factor analysis, Spearman correlation, or something else?

Please give your expert advice.

Hi, I don’t know if this is correct, but I am using log-linear models to look at patterns of response on particular Likert scale questions. I want to know if there are trends in responses across all my demographics. Am I using the correct procedure for this? I could use ANOVA, but I want to do away with the assumptions that come with it, hence my choice of the current test. I figured that if I ran ANOVA and chi-square simultaneously in order to elicit the best results, based on my judgment of the skewness of the distribution, I would be wasting time. Can I confidently say a log-linear model can be treated as a nonparametric version of ANOVA, since Kruskal-Wallis can only take one factor at a time?

I am a little confused here

Hello Colleagues;

My participants will self report their managerial competencies using the below scale (50 managerial competencies in five main categories):

1=Very poor

2=Poor

3=Good

4=Very good

5=Excellent

How can I plan the analysis?

Hello Karen,

If my DV and IVs are Likert scales (1 = strongly disagree to 5 = strongly agree) with multiple items, can I run a PCA for each construct, create scores, test Cronbach’s alpha, and after that just run a multiple regression?

Hello Karen,

Can I run multiple linear or logistic regression if one or more of my IVs are ordinal in nature?

Yes, either. You just can’t treat the IV as ordinal; it has to go into the model as either categorical or numerical.

Hello, I want to run a regression between job satisfaction (DV) and work-related stress (IV).

My DV is on a 5-point Likert scale, and my IV is on a 5-point Likert scale too. Is it possible to run it?

Hi Cynthia,

The simplest approach would be to do a Spearman correlation, if you don’t have any other covariates to control for. Technically, both of those Likert items are ordinal.
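As a rough illustration of that Spearman approach, here is a small NumPy sketch. The responses are made up, and `average_ranks` and `spearman_rho` are hypothetical helper names for this example. Spearman's correlation is the Pearson correlation computed on ranks, with tied responses (very common on a 5-point item) sharing the average of their ranks.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    values = np.asarray(values).ravel()
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    ends = np.cumsum(counts)       # last rank position taken by each distinct value
    starts = ends - counts + 1     # first rank position taken by each distinct value
    return ((starts + ends) / 2.0)[inverse]

def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the tie-averaged ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])

# Hypothetical responses from 10 people on two single 5-point Likert items.
stress = [5, 4, 4, 3, 5, 2, 1, 3, 2, 4]
satisfaction = [1, 2, 3, 3, 2, 4, 5, 4, 4, 2]

rho = spearman_rho(stress, satisfaction)
print("Spearman rho:", round(rho, 3))
```

This only gives the coefficient. For a p-value, a statistics package is the easier route; for example, `scipy.stats.spearmanr` returns both the correlation and a p-value.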

Tell me more about the use of a 1-5 Likert scale, which regression model should be applied, and how to do it in SPSS.

Hi Muhammad, which regression is applied depends on the dependent variable. Here is more info: https://www.theanalysisfactor.com/when-dependent-variables-are-not-fit-for-glm-now-what/

Thanks a lot!!!

I know this article is 2 years old but the references provided were invaluable in justifying my statistical methodology! Thank you so much for taking the effort to write this.