ftyers at prompsit.com
Wed Oct 1 12:03:46 CEST 2008
On Wed, 01-10-2008 at 11:35 +0200, Steen, G.J. wrote:
> Dear all,
> one way to measure inter-coder reliability is Cohen's kappa. But this
> can only be applied to pairs of raters, at least in the standard use.
> One solution to the problem of having more than two coders is to
> average Cohen's kappas across all possible pairs of raters, but I am
> not sure how this is looked upon in the testing community.
> Another solution to this problem appears to be Fleiss' kappa, which
> can accommodate more raters in one reliability analysis. What sort of
> experience do you have with this statistic? And are there any software
> packages that include it (since SPSS does not seem to have it)?
> Any advice will be greatly appreciated.
There are Java and Python implementations of Fleiss' kappa on Wikibooks:
There are some other statistics outlined on Wikipedia (which I suppose
you've already seen):
The choice of statistic largely depends on the experiment.
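
In case it helps to see what the statistic actually computes, here is a
minimal sketch of Fleiss' kappa in plain Python. It is not the Wikibooks
code; the function name fleiss_kappa and the toy ratings are made up for
illustration, and it assumes every subject was rated by the same number
of raters.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a subjects-by-categories table of counts.

    counts[i][j] is the number of raters who assigned subject i to
    category j. Assumption: every row sums to the same number of raters.
    """
    if not counts:
        raise ValueError("need at least one subject")
    n_subjects = len(counts)
    n_categories = len(counts[0])
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two raters per subject")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject must have the same number of ratings")

    # Share of all assignments that went to each category.
    total = n_subjects * n_raters
    p_j = [sum(row[j] for row in counts) / total for j in range(n_categories)]

    # Per-subject agreement: fraction of rater pairs that agree.
    pairs = n_raters * (n_raters - 1)
    p_i = [(sum(c * c for c in row) - n_raters) / pairs for row in counts]

    p_bar = sum(p_i) / n_subjects      # mean observed agreement
    p_e = sum(p * p for p in p_j)      # agreement expected by chance
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Made-up toy data: 4 subjects, 3 raters, 2 categories.
    toy = [[3, 0], [2, 1], [0, 3], [1, 2]]
    print(fleiss_kappa(toy))
```

The input is a table of counts rather than one column per rater, because
Fleiss' kappa does not need to know which rater gave which label.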