Let me confess to a moment of embarrassment that I've been anxious about for years: following SENSEVAL-1 I did a (tiny) experiment to establish inter-annotator agreement, and came up with the 95% figure cited by John.
With the experience I have gained since, I think the findings were not sound. It is most unusual to get a figure that high, and I regret having published it (and, worse, having put it in the title of a short paper from EACL-99).
For either automatic WSD, or even for the gold standard, I agree entirely with John:
> Miss Elliott, my high-school English teacher, wouldn't give
> anyone a gold star [for work like that]
On 16 July 2013 01:59, John F Sowa <sowa at bestweb.net> wrote:
> On 7/15/2013 6:15 PM, Kilian Evang wrote:
>> Off the top of my head, here's two relevant studies on inter-rater
>> reliability for WSD, one for the case of expert annotators and one for
>> the case of non-experts:
> From the abstract at the pointy end of this pointer:
>> The exercise identifies the state-of-the-art for fine-grained word sense
>> disambiguation, where training data is available, as 74–78% correct, with
>> a number of algorithms approaching this level of performance. For systems
>> that did not assume the availability of training data, performance was
>> markedly lower and also more variable. Human inter-tagger agreement was
>> high, with the gold standard taggings being around 95% replicable.
> Implication: For a 300-word page of text, a state-of-the-art program
> would have about 75 errors. That would be an average of two errors
> for 8-word sentences, or five errors for 20-word sentences.
> For the "gold" standard, there would still be 15 errors in a 300-word
> page. Miss Elliott, my high-school English teacher, wouldn't give
> anyone a gold star for 15 errors per page.
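The arithmetic in the quoted message can be reproduced with a short sketch. It rests on assumptions that are not spelled out in the thread: every running word counts as one sense-tagging decision, the 74–78% range is rounded to 75% correct, and the 95% replicability figure is treated as if it were per-word accuracy. The function name `expected_errors` is purely illustrative.

```python
def expected_errors(num_words: int, accuracy: float) -> float:
    """Expected number of wrongly tagged words in a text.

    Assumes (hypothetically) one tagging decision per word and a uniform,
    independent error rate of (1 - accuracy) for each decision.
    """
    return num_words * (1.0 - accuracy)


# Assumed round figures taken from the quoted abstract.
SYSTEM_ACCURACY = 0.75  # 74-78% correct, rounded
GOLD_AGREEMENT = 0.95   # "around 95% replicable", read as accuracy

for label, words, accuracy in [
    ("system, 300-word page", 300, SYSTEM_ACCURACY),
    ("system, 8-word sentence", 8, SYSTEM_ACCURACY),
    ("system, 20-word sentence", 20, SYSTEM_ACCURACY),
    ("gold standard, 300-word page", 300, GOLD_AGREEMENT),
]:
    print(f"{label}: about {expected_errors(words, accuracy):.0f} errors")
```

Under those assumptions the sketch gives the same round numbers as the quoted estimate (75, 2, 5 and 15). In practice only content words, or only a lexical sample, are sense-tagged, so the per-page counts would be lower than this upper-bound reading suggests.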
--
========================================
Adam Kilgarriff <http://www.kilgarriff.co.uk/>
adam at lexmasterclass.com
Director, Lexical Computing Ltd <http://www.sketchengine.co.uk/>
Visiting Research Fellow, University of Leeds <http://leeds.ac.uk>
*Corpora for all* with the Sketch Engine <http://www.sketchengine.co.uk>
*DANTE: a lexical database for English* <http://www.webdante.com>
========================================