A very common question is whether it is legitimate to use Likert scale data in parametric statistical procedures that require interval data, such as Linear Regression, ANOVA, and Factor Analysis. A typical Likert scale item has 5 to 11 points that indicate the degree of agreement with a statement, such as 1=Strongly Agree to 5=Strongly Disagree. It can be a 1 to 5 scale, 0 to 10, etc.

The issue is that despite being made up of numbers, a Likert scale item is in fact a set of ordered categories.

One camp maintains that because the items are ordered categories, the intervals between the scale values cannot be assumed to be equal. Any mean, correlation, or other numerical operation applied to them is invalid. Only nonparametric statistics should be used on Likert scale data (e.g. Jamieson, 2004).

The other group maintains that while technically the Likert scale item is ordered, using it in parametric tests IS valid in some situations. For example, Lubke & Muthen (2004) found that it is possible to find true parameter values in factor analysis with Likert scale data, if assumptions about skewness, number of categories, etc., were met. Likewise, Glass et al. (1972) found that F tests in ANOVA could return accurate p-values on Likert items under certain conditions.

Meanwhile, the debate rages on.

What is a researcher with integrity supposed to do? In the absence of a definitive answer, these are my recommendations:

- Understand the difference between a Likert type item and a Likert Scale. A true Likert scale, as Likert defined it, is made up of many items, which all measure the same attitude. But many people use the term Likert Scale to refer to a single item. Confusion about what a Likert Scale is, no doubt, has contributed to the debate.
- Proceed with caution. Research the consequences of using *your* procedure on Likert scale data from *your* study design. The fact that everyone uses it is not sufficient justification. There are some circumstances and procedures for which it is more egregious than others.
- At the very least, insist that the item have at least 5 points (7 is better), that the underlying concept be continuous, and that there be some indication that the intervals between points are approximately equal. Make sure the other assumptions (normality and equal variance of residuals, etc.) are met.
- When you can, run the nonparametric equivalent to your test. If you get the same results, you can be confident about your conclusions.
- If you do choose to use Likert data in a parametric procedure, make sure you have strong results before making claims. Use a more stringent alpha level, like .01 or even .005, instead of .05. If you have p-values of .001 or .45, it’s pretty clear what the result is, even if parameter estimates are slightly biased. It’s when p-values are close to .05 that the effect of bending assumptions is unclear.
- Consider the consequences of reporting inaccurate results. Will anyone ever read your paper? Will your research be published? Will it be used to shape public policy or affect practices? The answers to these questions can inform the seriousness of potential problems.
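
To make the "run the nonparametric equivalent" and "use a stricter alpha" advice a little more concrete, here is a minimal sketch in plain Python. The two groups of 1-5 responses are made up for illustration, the function names are my own, and both p-values rely on large-sample normal approximations, so treat it as a rough check rather than a replacement for a proper statistics package.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_z_test(a, b):
    """Difference in means: Welch statistic with a normal approximation.

    Assumption: samples are large enough that the normal curve is a
    reasonable stand-in for the t distribution.
    """
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / se
    return z, 2 * (1 - NormalDist().cdf(abs(z)))


def midranks(values):
    """Ranks 1..n, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def mann_whitney(a, b):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected)."""
    n1, n2 = len(a), len(b)
    n = n1 + n2
    combined = list(a) + list(b)
    ranks = midranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Likert items produce many ties, so the variance needs the tie correction.
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    sigma = sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    z = (u1 - n1 * n2 / 2) / sigma
    return u1, 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical responses to one 5-point item in two groups of 30.
group_a = [1] * 2 + [2] * 4 + [3] * 8 + [4] * 10 + [5] * 6
group_b = [1] * 8 + [2] * 10 + [3] * 7 + [4] * 3 + [5] * 2

alpha = 0.01  # the stricter threshold suggested above
z, p_mean = welch_z_test(group_a, group_b)
u, p_rank = mann_whitney(group_a, group_b)
agree = (p_mean < alpha) == (p_rank < alpha)

print(f"mean-based p = {p_mean:.4f}, rank-based p = {p_rank:.4f}")
print("Both approaches agree at alpha =", alpha, ":", agree)
```

If the two approaches disagree, or both p-values sit close to the threshold, that is the situation where bending the assumptions deserves a closer look.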

**References:**

Carifio, J. & Perla, R. (2007). Ten Common Misunderstandings, Misconceptions, Persistent Myths and Urban Legends about Likert Scales and Likert Response Formats and their Antidotes. *Journal of Social Sciences, 3*, 106-116. http://www.scipub.org/fulltext/jss/jss33106-116.pdf

Glass, G. V., Peckham, P. D., & Sanders, J. R. (1972). Consequences of failure to meet assumptions underlying the analyses of variance and covariance. *Review of Educational Research, 42*, 237-288.

Jamieson, S. (2004). Likert scales: how to (ab)use them. *Medical Education, 38*, 1217-1218.

Lubke, G. H., & Muthen, B. O. (2004). Applying Multigroup Confirmatory Factor Models for Continuous Outcomes to Likert Scale Data Complicates Meaningful Group Comparisons. *Structural Equation Modeling, 11*, 514-534.

{ 33 comments… read them below or add one }

Hi Karen,

I’m just a lowly practitioner in the government sector, but my work sometimes informs decisions that affect real people, so I have some aspirations to getting things right. My problem is that while I trust the statistical tools applied to them, I have more fundamental issues with Likert scales, in that I find it very hard to be convinced that they validly meet a basic definition of ‘meaningful evidence’ of anything. Happy to explain why, but won’t bore you with that unless you are interested. My real issue is simply that, ultimately valid or not, it is easy to demonstrate that Likert scale data points are of very low data quality compared to many other forms of ‘qualitative’ data. This type of quasi-ordinal data construct has not actually been around for very long in the bigger scheme of things, but many social scientists appear to be obsessed with them to the point of failing to look for something better (and indeed, getting very cross when anyone even suggests that we may be able to construct qualitative measures with far better data quality). I can understand this from a ‘convenience’ perspective, but from a ‘science’ perspective it is like climbing out to the end of a branch and then getting stuck. Whether he was right or wrong to ‘invent’ Likert scales is not that relevant; new ideas are always needed and researchers are right to trial them. What I don’t think the late Prof. Likert would be happy to hear is that obsession with the convenience of ‘deriving data’ from Likert scales is impeding new paradigm shifts and improvements in qualitative research. I have been doing my job for quite a while, and have had extensive personal experience with the ‘passion’ with which the two camps you mention in your article defend their respective positions. What if they are both missing the point?
I guess the point of my posting is just to look for your reaction to my position, and I am hoping it is not the normal one I get when I suggest that there may be better tools for qualitative research than Likert Scales, which is excommunication as a heretic.

Cheers,

Colin

I have Likert scale responses (1-5 rating). If I need to check the normality of the distribution of responses, how do I do that? K-S test?

Please, can you help me understand how to determine the independent and dependent variables when using Likert data to test a hypothesis? If possible, send me a lecture video to help me out.

Thank you

Hello! Thank you so much for your post, it was very helpful. I wonder if you can help me with my question. I conducted a study, and I need to use a MANOVA test since I have 4 dependent variables to compare across two groups. 3 out of 4 dependent variables consist of 4 different questions measured on a 5-point Likert scale. So I used the median command to combine them and obtain my dependent variables. As a result I obtained different scales for each dependent variable, such as: 2.50/ 3/ 3.50/ 4/ 4.50/ 5; 3/ 3.50/ 4/ 4.50/ 5; 1/ 2/ 3/ 4/ 5. I do understand why I have such results. However, my question is whether I can use such scales directly for my MANOVA, or should I recode them in some way? MANOVA requires dependent variables on continuous scales, but how can I show that mine qualify if they are measured on a 5-point Likert scale? Is it possible? I would really appreciate it if you could help me. Thank you!

Here is another article you could cite in support of using parametric tests with Likert scales, and even items.

Norman, G. (2010). Likert scales, levels of measurement and the “laws” of statistics. Advances in health sciences education, 15(5), 625-632. http://link.springer.com/article/10.1007/s10459-010-9222-y

Thank you for the link, Mr. Bruce 🙂

Hi – great article, thank you! I’ve got a 20-item scale, each question in the form of a Likert (ranging from 0-3) – that means there are 80 possible scores. I’d like to use this variable as a dependent var in some sort of regression with several predictor variables. Technically this score is “ordinal” since it came from a sum of Likert scales, but an ordinal regression with 80 possible “categories” is a bit much. The score is also distributed very much like a binomial distribution and not at all normal (could I use negative binomial regression perhaps?). Any advice on how to regress this would be appreciated!

Thanks a lot, Karen! 🙂

Dear Karen,

thanks for your really helpful article. In point 3 you say that in order to (more or less) safely consider an ordinal predictor continuous it should have at least 5 to 7 points. Can you name a reference for that statement?

Thanks and best regards, Max

Hi Max,

The Lubke & Muthen article listed above discusses it. It’s been a while since I read it, but I believe they did it in the context of factor analysis, not predictor variables in regression.

My dependent variable is measured on a 5-point Likert scale and my independent variable is also measured on a 5-point Likert scale. Is it appropriate to run a linear regression analysis on such data? What about multiple regression?

Can anyone suggest an academic study which supports the claim that Likert scales with different numbers of points (5, 7, 8, 11) can be used in the same study?

Hi,

I ran PCA on Likert scale variables (answers ranged 0-6). I found skewness and kurtosis OK on all variables. When I write up my paper, do I need to justify using Likert scale items in PCA or is it just so common no one justifies it anymore? Would I cite the Lubke article above to show how I could use Likert data as continuous?

Thanks

Can I use regression analysis for my 5-point Likert scale data? And can I link 5-point Likert scale data with continuous variables?

I have an R value of 1. How can I reduce this R value? Please help.

Hi, loved your article, really helpful. I have a question. I have some confusion over what exactly is summed. If you have a questionnaire with, say, 7 Likert items (questions), is the summed amount for the population the interval data? For example, if it was a job satisfaction survey and I wanted to compare satisfaction between males and females, could I sum the overall score for females, then males, and compare means as interval data? I think I am confused about the point at which the data is summed to become interval data: is it at the individual level or the population level? The other thing: as Likert is a closed-question survey, the research method is quantitative, but at a statistical level it is qualitative, is this right? Thanks, this is for a research assignment due in soon, I have to explain my data analysis method, so I desperately need advice!

Hi Kerry,

Well, first of all, you don’t have population data, just samples. But you would sum scores for individuals. Now, there is more to it as to whether it’s truly interval. To be sure I would suggest reading more in the psychometric literature. As for the quantitative/qualitative, if you need that for an assignment, I think you need to figure that out on your own. 🙂
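
As a small illustration of "sum scores for individuals", here is a sketch in Python. The respondents, group labels, and the `group_mean` helper are all hypothetical; the only point is the order of operations: one scale score per person first, group summaries second.

```python
from statistics import mean

# Hypothetical data: each row is one respondent's answers to 7 items (1-5).
responses = [
    {"group": "female", "items": [4, 5, 4, 3, 4, 5, 4]},
    {"group": "female", "items": [3, 4, 4, 4, 3, 4, 3]},
    {"group": "female", "items": [5, 5, 4, 5, 4, 5, 5]},
    {"group": "male", "items": [2, 3, 3, 2, 3, 3, 2]},
    {"group": "male", "items": [4, 4, 3, 4, 4, 3, 4]},
    {"group": "male", "items": [3, 3, 2, 3, 3, 4, 3]},
]

# Step 1: sum within each individual to get one scale score per person.
for r in responses:
    r["score"] = sum(r["items"])


# Step 2: only then summarize the per-person scores by group.
def group_mean(group):
    return mean(r["score"] for r in responses if r["group"] == group)


for g in ("female", "male"):
    print(g, "mean scale score:", round(group_mean(g), 2))
```

Whether those summed scores can then be treated as interval data is a separate question that this sketch does not settle.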

Hi Kerry, I’m a bit late adding to this question (which should also have given you ample time to research the literature yourself, so you might even know more about it than me at the moment :).

So I just wanted to say a bit about the tip of the iceberg that I’m aware of. I’m a ‘mere’ researcher in that sense, but with some background in teaching statistics to bachelor students during my psychology major, which already brought me into contact with some of the more controversial techniques out there, including the question of whether treating a Likert scale (ordinal, at the very least in terms of how the Likert item is coded) as an interval scale is at all justified. Since then statistics has become more of a background thing, and I do admit we treat it as an interval variable in many cases in cognitive psychology, but always with that nagging voice at the back of my head, stemming from lecturing students that it’s wrong even if often done, telling me that it’s not a right way of treating the data at all (in most cases).

Now using Likert (!)items(!) as continuous predictors is very hard to justify in my opinion. The following might all be old news by now or even just plain wrong but it makes a lot of intuitive sense to me. Because I haven’t been keeping myself up to date on many statistical debates in academia, these are primarily my own thoughts and conclusions which I think are hard to deny (not impossible certainly but still quite hard to convince me otherwise I think. But feel free to provide counterarguments..).

1. People differ in the way they interpret the labels attached to a response option (and consequently the ‘distance’ it would translate to on the underlying true thing or trait that is being measured by these questionnaires), e.g. Extroversion or Optimism, which I personally think are very good examples, as in my experience Extroversion and Optimism can most certainly not be reduced to 2 categories (or however many categories you would like to create). The exception perhaps being your outliers, but using outliers as a group seems very risky, especially because you usually don’t know for sure what causes those outliers and you simply won’t have enough of them in your sample :b.

2. Added to that, you can’t even assume that the difference between the Likert item points is consistent, neither within nor across participants. The difference in ‘agreement’ one person thinks there might be (or experiences subjectively) between strongly agreeing and simply agreeing completely might be smaller than the step at which the opinion changes from “slightly (dis)agreeing” to the “neutral” labeled point or, if there is no neutral, the step between slightly agreeing and slightly disagreeing. The reason is that the change in attitude is no longer just about ‘strength’ of agreement or disagreement, but also a change in what you claim to hold true. The leap from slight disagreement to slight agreement, for instance, is for this very reason, I believe, somewhat if not much larger than the leap from agreeing strongly to agreeing completely. Similarly, the change from slightly (dis)agreeing to neutral is also a qualitatively different kind of leap in opinion, namely between having one, however slight it might be, and having no real opinion on the matter. Now it is not even about the strength of agreement and disagreement but a sudden absence of opinion on a statement, which would therefore be thought to measure something else. At least to me this makes a lot of sense intuitively. I have no way to prove any of this, but to me, assuming that those Likert points are at pretty much the same distance from each other on an equivalent continuous measurement seems a little ridiculous, to be fair (yet we all still keep making these assumptions for the sake of convenience, at least at the moment; I think this will change pretty soon though).

3. Some more pragmatic (:P) advice. Once you reach 80 categories so you can do it the ‘ANOVA-way’, I think it’s best to deal with it by just learning simple and multiple regression :). They are not hard to learn and interpret, even if some older researchers seem to think they would be hard to learn, cost valuable time and so on. Not regression, not at all. Nor are the non-parametric versions very hard to understand. It’s more about getting used to the way the results are reported and how you interpret the non-parametric statistics. I really see no good excuse for not keeping up to date with current ways of analyzing data, especially since they have many apparent advantages and seem easy to learn. I bet you can learn how to do regression in less than a day, for example – heck, first year bachelor students learn the basics nowadays in one or two 2-hour meetings. Small price to pay for more reliable results.

4. A bit off topic, but still worth mentioning is the multilevel analysis approach to analyzing your data. It might also solve some of the above issues, as the source of the problem seems to simply come down to grouping people together into categories, and therefore getting rid of the individual variability. On the other hand, it is the very fact that there are continuous variables in the analysis that allows you to capture individual differences, so it wouldn’t really solve the median split and Likert scaling issues after all, I guess. Then again, once you are capable of doing multilevel analysis of your data, it is, from my point of view and from what I learned about it, quite a superior way of analyzing data compared with any kind of ANOVA or regression. I’d highly recommend looking into it, and consider it a possible solution to the question you had 3 years ago ;), but also for future research decisions.

Also, when I started my major (in psychology, later cog-psychology/neuroscience), we did learn (and the students still do learn) about the theory of regression (also different versions, such as logistic regression), and, importantly, which technique is most suited to the data they want to analyze, for the very reason that you would lose valuable ‘data’ (in this case variance, or sample size, depending on how you approach it) when reducing a continuous variable to a categorical variable in your model.

I think the newer generations of researchers will for that reason alone be much more skilled in handling such problems, due to this extra knowledge and possible courses in multilevel analysis (which is becoming increasingly popular over here).

And as old habits die hard for those who know nothing but ANOVA, it seems like the best approach to solving ‘the problem’ is to just let time take care of it, and to always be critical of results that used a median split or Likert scales treated as interval variables, and draw your own more ‘hesitant’ or careful conclusions. Trying to change their opinions and way of doing research seems like a waste of time and energy.

Perhaps it’s different in the US if regression has still not become part of the standard statistics course 😉 I can hardly imagine that to be true though. But if it is, then, well step up guys. Those are essential skills to have in analyzing data..!

So on the individual level, learn some more techniques that can handle other types of data 😉

On the university level have students learn them as well as part of their education if it isn’t part of it already.

And in a global sense, time will take care of the stubborn ones finding it hard to let go of the old ways.

Best,

Dominique

I have a question, please reply. I am doing statistical analysis. My independent variables are 5-point Likert scale items, and the dependent variable is binary. Should I use binary logistic regression? What options should I select?

Hi Faisal,

I would need a lot more information to actually suggest an analysis. If your outcome is binary, then indeed logistic regression is one possibility. But it depends on a lot of other questions, including “what is it you want to test?”

Some papers have it that one can combine Likert-type questions into a Likert scale by summing the responses under each construct to form scores, which converts the data from an ordinal scale to an interval scale on which parametric tests like ANOVA, regression, etc. can be conducted.

What about that?

Thanks

How can I use data collected using a Likert scale for doing correlation?

Depends what you mean by Likert scale. If you mean something like a 1-5 scale item, your best bet would be a Spearman rank correlation. No assumptions of normality there.
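
For anyone wondering what a Spearman rank correlation actually does with 1-5 responses, here is a rough sketch in plain Python: replace each value by its rank (tied values share an average rank), then compute an ordinary Pearson correlation on the ranks. The two item vectors are made up and the function names are my own; in practice a statistics package would do this for you.

```python
from math import sqrt


def midranks(values):
    """Ranks 1..n, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Ordinary product-moment correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the midranks."""
    return pearson(midranks(x), midranks(y))


# Hypothetical responses from 8 people to two 5-point items.
item_a = [1, 2, 2, 3, 4, 4, 5, 5]
item_b = [2, 1, 3, 3, 3, 5, 4, 5]

print("Spearman rho:", round(spearman(item_a, item_b), 3))
```

Because only the ordering of the responses is used, the result does not depend on the intervals between scale points being equal.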

How can we convert the data into an independent variable so that I can use factor analysis? As I am new to this software, can somebody help me with this? I have collected data on customer satisfaction on a 5-point Likert scale. Please help.

Rohail, I’m not entirely sure what you’re asking, but generally people do use likert data for factor analysis.

Hi

I have done factor analysis on the data collected (Likert items). Now I am lost as to how I should proceed further. SPSS has given me 9 factors out of 50 ordinal variables.

Can I apply regression or ANOVA on such data?

Also, can I subdivide a factor into two or more factors by doing factor analysis again on the items which constitute that factor (as originally computed)? For example, brand image could be subdivided into quality, product attributes, and so forth. Can I do so?

Hi Harleen,

There’s a lot to using the results from factor analysis in other analyses. More than I could ever answer here (it’s a book, really).

I would strongly suggest getting this book, even if you don’t use SAS. A Step-by-Step Approach to Using the SAS System for Factor Analysis and Structural Equation Modeling by Larry Hatcher. He really explains everything step-by-step.

I recently suggested it to a client who needed to use Factor Analysis, and she said it cleared up all her confusion.

Karen

Can someone tell me why Firm’s age is used as a proxy for information asymmetry. You can post your response here or email to me @ zikoseni@yahoo.com.

Thanks

My question is: if our data were parametric, can we use Likert Scale data in Factor Analysis directly? Otherwise, to identify the important variables in my study with my Likert Scale data, what should I do?

Thank you

Alireza,

So are you asking if you can use Factor Analysis for Likert Scale data?

Theoretically, Likert items do not meet the assumptions for a Factor Analysis. That Lubke and Muthen paper referenced above, however, found that in some situations, the results are quite valid. I would suggest reading that paper and seeing if your data fit the situations where it works well.

Karen

I am a student. Can someone help me to locate a statistical software (free) to run data I gathered using Likert Scale. I am working on asymmetric information in the capital market. I can be reached via zikoseni@yahoo.com. Thank you

Hi Zik,

Just to follow up, if you need free, you have two choices:

PSPP is an open-source version of SPSS base. Easy to use, but limited.

R requires more programming, but can do much, much more.

That Carifio and Perla paper looks handy – ta for sharing!
