The Kappa Statistic, or Cohen’s Kappa, is a statistical measure of inter-rater reliability for categorical variables. In fact, it’s almost synonymous with inter-rater reliability.
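To make the definition concrete, here is a minimal sketch of how Kappa can be computed for two raters. It uses the standard formula kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from each rater's category frequencies. The function name `cohens_kappa` and the yes/no ratings are made up for illustration; they are not from any particular package or data set.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who rated the same items (illustrative helper)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must rate the same non-empty set of items.")

    n = len(rater_a)

    # p_o: proportion of items on which the two raters gave the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # p_e: agreement expected by chance, from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        raise ValueError("Kappa is undefined when chance agreement equals 1.")

    return (observed - expected) / (1 - expected)


# Hypothetical data: two raters judging whether a condition occurs in 10 cases.
rater_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
rater_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]

kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))  # 0.583
```

In this made-up example the raters agree on 8 of 10 cases (p_o = 0.80), but chance alone would produce agreement of 0.52, so Kappa comes out well below the raw agreement rate.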
Kappa is used when two raters both apply a criterion based on a tool to assess whether or not some condition occurs. Examples include:
Some variables are straightforward to measure without error – blood pressure, number of arrests, whether someone knew a word in a second language.
But many – perhaps most – are not. Whenever a measurement has a potential for error, a key criterion for the soundness of that measurement is reliability.
Think of reliability as consistency or repeatability in measurements.
Inter-rater reliability is one of those statistics I seem to need just seldom enough that I forget all the details and have to look them up every time.
Luckily, there are a few great websites by experts that explain it (and related concepts) clearly, in language that is accessible to non-statisticians.
So rather than reinvent the wheel and write about it, I’m going to refer you to these sites:
If you know of any others, please share in the comments. I’ll be happy to add to the list.