One way to measure inter-coder reliability is Cohen's kappa, but in its standard form this statistic applies only to pairs of raters.
One solution to the problem of having more than two coders is to average Cohen's kappa across all possible pairs of raters, but I am not sure how this practice is regarded in the testing community.
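To make the pairwise-averaging idea concrete, here is a minimal sketch in pure Python (the function names and data layout are my own, not from any particular package): Cohen's kappa for each pair of raters, averaged over all pairs.

```python
from collections import Counter
from itertools import combinations

def cohen_kappa(r1, r2):
    """Cohen's kappa for two raters' parallel label lists."""
    n = len(r1)
    # Observed agreement: proportion of items labelled identically.
    p_o = sum(a == b for a, b in zip(r1, r2)) / n
    # Expected chance agreement from each rater's marginal distribution.
    c1, c2 = Counter(r1), Counter(r2)
    p_e = sum((c1[k] / n) * (c2[k] / n) for k in set(r1) | set(r2))
    return (p_o - p_e) / (1 - p_e)

def mean_pairwise_kappa(ratings):
    """Average Cohen's kappa over all pairs of raters.

    `ratings` is a list with one label list per rater."""
    kappas = [cohen_kappa(a, b) for a, b in combinations(ratings, 2)]
    return sum(kappas) / len(kappas)
```

For example, with three coders labelling the same four items, `mean_pairwise_kappa([r1, r2, r3])` averages the three pairwise kappas. (Note that this sketch does not guard against the degenerate case where expected agreement is 1.)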
Another solution to this problem appears to be Fleiss' kappa, which can accommodate more than two raters in a single reliability analysis. What sort of experience do you have with this statistic? And are there any software packages that include it (since SPSS does not seem to have it)?
Any advice will be greatly appreciated.
Professor of Language Use and Cognition
Director, Language, Cognition, Communication program
Faculty of Arts, 11A-35
Department of Language and Communication
VU University Amsterdam
De Boelelaan 1105
1081 HV Amsterdam
T: ++31-20-5986433 F: ++31-20-5986500
http://www.let.vu.nl/staf/gj.steen/