The Kappa Statistic, or Cohen’s Kappa, is a statistical measure of inter-rater reliability for categorical variables. In fact, it’s almost synonymous with inter-rater reliability.
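To make the statistic concrete, here is a minimal sketch of the standard two-rater calculation: kappa compares the observed agreement with the agreement expected by chance, as (p_o - p_e) / (1 - p_e). The function name `cohen_kappa` and the yes/no ratings are made up for illustration; they are not from any particular study or library.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each category, multiply the two raters'
    # marginal proportions, then sum over categories.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2

    if p_e == 1:
        raise ValueError("kappa is undefined when both raters use one category")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings: did the condition occur for each of 10 cases?
rater_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "no"]
rater_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "no"]

kappa = cohen_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

In this made-up data the raters agree on 8 of 10 cases, but because each says "yes" half the time, half of that agreement would be expected by chance alone, so kappa comes out well below the raw 80% agreement.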
Kappa is used when two raters both apply a criterion based on a tool to assess whether or not some condition occurs. Examples include:
How do you know your variables are measuring what you think they are? And how do you know they’re doing it well?
A key part of answering these questions is establishing reliability and validity of the measurements that you use in your research study. But the process of establishing reliability and validity is confusing. There are a dizzying number of choices available to you.
Some variables are straightforward to measure without error – blood pressure, number of arrests, whether someone knew a word in a second language.
But many – perhaps most – are not. Whenever a measurement has a potential for error, a key criterion for the soundness of that measurement is reliability.
Think of reliability as consistency or repeatability in measurements.
Inter-rater reliability is one of those statistics I seem to need just seldom enough that I forget all the details and have to look it up every time.
Luckily, a few excellent websites by experts explain it (and related concepts) clearly, in language that is accessible to non-statisticians.
So rather than reinvent the wheel and write about it, I’m going to refer you to these sites:
If you know of any others, please share in the comments. I’ll be happy to add to the list.