- radiologists rating disease presence or absence on an X-ray
- researchers rating the amount of bullying in an observed classroom
- coders sorting qualitative responses into response categories
It’s well established in research that multiple raters need to rate the same stimuli so that the reliability of those ratings can be assessed. There are a number of ways to measure agreement among raters, and the appropriate measure depends on a host of details, including: the number of raters; whether ratings are nominal, ordinal, or numerical; and whether one rater’s ratings can be considered a “gold standard.”
In this webinar, we will discuss these and other issues in measures of inter-rater and intra-rater reliability, the many variations of the kappa statistic, and intraclass correlations.
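As a quick taste of one such measure, here is a minimal sketch of Cohen’s kappa for the simplest case the webinar covers: two raters assigning nominal ratings to the same stimuli. The ratings below are hypothetical, invented purely for illustration.

```python
import numpy as np

# Hypothetical nominal ratings from two raters on the same 10 stimuli
# (e.g., disease present = 1, absent = 0 on an X-ray).
rater_a = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 1])
rater_b = np.array([1, 0, 0, 1, 0, 1, 0, 1, 1, 1])

categories = np.union1d(rater_a, rater_b)

# Observed agreement: proportion of stimuli the raters rate identically.
p_o = np.mean(rater_a == rater_b)

# Chance agreement: for each category, the product of the two raters'
# marginal proportions, summed over categories.
p_e = sum(np.mean(rater_a == c) * np.mean(rater_b == c) for c in categories)

# Cohen's kappa corrects observed agreement for agreement expected by chance.
kappa = (p_o - p_e) / (1 - p_e)
print(f"observed = {p_o:.2f}, chance = {p_e:.2f}, kappa = {kappa:.2f}")
```

For real analyses, library implementations such as `sklearn.metrics.cohen_kappa_score` (for kappa and its weighted variants) or `pingouin.intraclass_corr` (for intraclass correlations) handle the variations and edge cases discussed in the training.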
Note: This training is an exclusive benefit to members of the Statistically Speaking Membership Program and part of the Stat’s Amore Trainings Series. Each Stat’s Amore Training is approximately 90 minutes long.
About the Instructor
Audrey first realized her love for research, and data analysis in particular, during a career move from clinical psychology to dementia research. As the field of genetic epidemiology and statistical genetics blossomed, Audrey moved into this emerging field and analyzed data on a wide variety of common diseases believed to have a strong genetic component, including hypertension, diabetes, and psychiatric disorders. She helped develop software to analyze genetic data and taught classes in the US and Europe.
Audrey has worked for Case Western Reserve University, Cedars-Sinai, University of California at San Francisco and Johns Hopkins. Audrey has a Master’s Degree in Clinical Psychology and a Ph.D. in Epidemiology and Biostatistics.
You'll get access to this training webinar and 100+ other stats trainings, a pathway to work through the trainings you need, plus the expert guidance to build statistical skill through live Q&A sessions and an ask-a-mentor forum.