[Corpora-List] WSD / # WordNet senses / Mechanical Turk

Adam Kilgarriff adam at lexmasterclass.com
Tue Jul 16 15:40:50 CEST 2013

Re: the 0.994 accuracy result reported by Snow et al.: precisely one word was used for this task, 'president', with a 3-way ambiguity between

1) executive officer of a firm, corporation, or university
2) head of a country (other than the U.S.)
3) head of the U.S., President of the United States

Open a dictionary at random and you'll see that most polysemy isn't like that. The result, based on one word, provides no insight into the difficulty of the WSD task.
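
To make the 0.994 figure concrete: with 177 items, a single mismatch against the gold standard gives 176/177, which rounds to 0.994. The sketch below shows majority voting with random tie-breaking and the resulting accuracy computation. It is only an illustration, not the code Snow et al. used; the function names and the toy votes are hypothetical, and senses are encoded as the integers 1 to 3 following the list above.

```python
import random
from collections import Counter


def majority_vote(labels, rng):
    """Return the most frequent label; break ties uniformly at random."""
    counts = Counter(labels)
    top = max(counts.values())
    # Sort the tied labels so the result depends only on the rng state.
    tied = sorted(label for label, c in counts.items() if c == top)
    return rng.choice(tied)


def accuracy(votes_per_item, gold, rng=None):
    """Fraction of items whose majority-vote label equals the gold label."""
    rng = rng or random.Random(0)  # fixed seed: an assumption, for repeatability
    predicted = [majority_vote(votes, rng) for votes in votes_per_item]
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical toy data: three items, three annotators each.
toy_votes = [[1, 1, 2], [3, 3, 3], [2, 2, 1]]
toy_gold = [1, 3, 1]
print(accuracy(toy_votes, toy_gold))

# One disagreement with the gold standard out of 177 items:
print(round(176 / 177, 3))
```

Note that a high score here says how well the voters reproduce the gold labels for this one word; it does not by itself say anything about words with finer sense distinctions.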


On 16 July 2013 13:32, Benjamin Van Durme <vandurme at cs.jhu.edu> wrote:

> Rion Snow, Brendan O'Connor, Daniel Jurafsky and Andrew Y. Ng. Cheap
> and Fast - But is it Good? Evaluating Non-Expert Annotations for
> Natural Language Tasks. EMNLP 2008.
> http://ai.stanford.edu/~rion/papers/amt_emnlp08.pdf
> "We collect 10 annotations for each of 177 examples of the noun
> “president” for the three senses given in SemEval. [...]
> performing simple majority voting (with random tie-breaking) over
> annotators results in a rapid accuracy plateau at a very high rate of
> 0.994 accuracy. In fact, further analysis reveals that there was only
> a single disagreement between the averaged non-expert vote and the
> gold standard; on inspection it was observed that the annotators voted
> strongly against the original gold label (9-to-1 against), and that
> it was in fact found to be an error in the original gold standard
> annotation. After correcting this error, the non-expert accuracy rate
> is 100% on the 177 examples in this task. This is a specific example
> where non-expert annotations can be used to correct expert
> annotations."
> Xuchen Yao, Benjamin Van Durme and Chris Callison-Burch. Expectations
> of Word Sense in Parallel Corpora. NAACL Short. 2012.
> http://cs.jhu.edu/~vandurme/papers/YaoVanDurmeCallison-BurchNAACL12.pdf
> "2 Turker Reliability
> While Amazon’s Mechanical Turk (MTurk) has been considered in the
> past for constructing lexical semantic resources (e.g., (Snow et al.,
> 2008; Akkaya et al., 2010; Parent and Eskenazi, 2010; Rumshisky,
> 2011)), word sense annotation is sensitive to subjectivity and
> usually achieves low agreement rate even among experts. Thus we
> first asked Turkers to re-annotate a sample of existing gold-standard
> data. With an eye towards costs saving, we also considered how many
> Turkers would be needed per item to produce results of sufficient
> quality.
> Turkers were presented sentences from the test portion of the word
> sense induction task of SemEval-2007 (Agirre and Soroa, 2007),
> covering 2,559 instances of 35 nouns, expert-annotated with OntoNotes
> (Hovy et al., 2006) senses. [...]
> We measure inter-coder agreement using Krippendorff’s Alpha
> (Krippendorff, 2004; Artstein and Poesio, 2008), [...]"
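
For readers unfamiliar with the agreement measure named in the quoted passage, here is a minimal sketch of Krippendorff's Alpha for nominal labels, computed from a coincidence matrix as alpha = 1 - D_o/D_e (observed over expected disagreement). It is an illustration under stated assumptions, not the implementation used by Yao et al.: the function name and the toy data are hypothetical, and items with fewer than two labels are skipped as unpairable.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    units: one list of labels per annotated item (any number of coders).
    """
    # Coincidence matrix: each ordered pair of labels given to the same
    # item by different coders contributes 1 / (m - 1), where m is the
    # number of labels that item received.
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # a single label cannot be paired with anything
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per label and overall number of pairable labels.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b]
                   for a in totals for b in totals if a != b)
    if expected == 0:
        return float("nan")  # only one label in use: alpha is undefined
    return 1.0 - (n - 1) * observed / expected


# Hypothetical toy data: three items, two coders each.
toy_units = [["a", "a"], ["a", "b"], ["b", "b"]]
print(krippendorff_alpha_nominal(toy_units))
```

For the toy data the two coders disagree on one item out of three, and alpha works out to 4/9; full agreement over at least two labels gives 1.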

--
Adam Kilgarriff <http://www.kilgarriff.co.uk/>
adam at lexmasterclass.com
Director, Lexical Computing Ltd <http://www.sketchengine.co.uk/>
Visiting Research Fellow, University of Leeds <http://leeds.ac.uk>
*Corpora for all* with the Sketch Engine <http://www.sketchengine.co.uk>
*DANTE: a lexical database for English* <http://www.webdante.com>
