[Corpora-List] WSD / # WordNet senses / Mechanical Turk

Katrin Erk katrin.erk at mail.utexas.edu
Tue Jul 16 17:03:34 CEST 2013

Some work suggests that you may need to phrase the word sense assignment task differently for Turkers than for expert annotators:

Biemann, C. (2012). Turk Bootstrap Word Sense Inventory 2.0: A Large-Scale Resource for Lexical Substitution. Proceedings of LREC 2012, Istanbul, Turkey http://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2012/Biemann_TWSI_LREC2012.pdf

Jurgens, D. (2013). Embracing Ambiguity: A Comparison of Annotation Methodologies for Crowdsourcing Word Sense Labels. Proceedings of NAACL 2013. http://www.aclweb.org/anthology/N/N13/N13-1062.pdf

Jurgens writes: "Our findings show that given the appropriate annotation task, untrained workers can obtain at least as high agreement as annotators in a controlled setting, and in aggregate generate equally as good of a sense labeling."

Of course, as John and Adam pointed out, the agreement of annotators in controlled settings may not be all that gold in the first place, but still I find this quite interesting.

Cheers, Katrin

On Tue, Jul 16, 2013 at 8:40 AM, Adam Kilgarriff <adam at lexmasterclass.com> wrote:
> Re: the 0.994 accuracy result reported by Snow et al: there was precisely
> one word used for this task, 'president',
> with the 3-way ambiguity between
> 1) executive officer of a firm, corporation, or university
> 2) head of a country (other than the U.S.)
> 3) head of the U.S., President of the United States
> Open a dictionary at random and you'll see that most polysemy isn't like
> that. The result, based on one word, provides no insight into the
> difficulty of the WSD task.
> Adam
> On 16 July 2013 13:32, Benjamin Van Durme <vandurme at cs.jhu.edu> wrote:
>> Rion Snow, Brendan O'Connor, Daniel Jurafsky and Andrew Y. Ng. Cheap
>> and Fast - But is it Good? Evaluating Non-Expert Annotations for
>> Natural Language Tasks. EMNLP 2008.
>> http://ai.stanford.edu/~rion/papers/amt_emnlp08.pdf
>> "We collect 10 annotations for each of 177 examples of the noun
>> “president” for the three senses given in SemEval. [...]
>> performing simple majority voting (with random tie-breaking) over
>> annotators results in a rapid accuracy plateau at a very high rate of
>> 0.994 accuracy. In fact, further analysis reveals that there was only
>> a single disagreement between the averaged non-expert vote and the
>> gold standard; on inspection it was observed that the annotators voted
>> strongly against the original gold label (9-to-1 against), and that
>> it was in fact found to be an error in the original gold standard
>> annotation. After correcting this error, the non-expert accuracy rate
>> is 100% on the 177 examples in this task. This is a specific example
>> where non-expert annotations can be used to correct expert
>> annotations."
>> Xuchen Yao, Benjamin Van Durme and Chris Callison-Burch. Expectations
>> of Word Sense in Parallel Corpora. NAACL Short. 2012.
>> http://cs.jhu.edu/~vandurme/papers/YaoVanDurmeCallison-BurchNAACL12.pdf
>> "2 Turker Reliability
>> While Amazon’s Mechanical Turk (MTurk) has been considered in the
>> past for constructing lexical semantic resources (e.g., (Snow et al.,
>> 2008; Akkaya et al., 2010; Parent and Eskenazi, 2010; Rumshisky,
>> 2011)), word sense annotation is sensitive to subjectivity and
>> usually achieves low agreement rate even among experts. Thus we
>> first asked Turkers to re-annotate a sample of existing gold-standard
>> data. With an eye towards costs saving, we also considered how many
>> Turkers would be needed per item to produce results of sufficient
>> quality.
>> Turkers were presented sentences from the test portion of the word
>> sense induction task of SemEval-2007 (Agirre and Soroa, 2007),
>> covering 2,559 instances of 35 nouns, expert-annotated with OntoNotes
>> (Hovy et al., 2006) senses. [...]
>> We measure inter-coder agreement using Krippendorff’s Alpha
>> (Krippendorff, 2004; Artstein and Poesio, 2008), [...]"
>> _______________________________________________
>> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>> Corpora mailing list
>> Corpora at uib.no
>> http://mailman.uib.no/listinfo/corpora
> --
> ========================================
> Adam Kilgarriff adam at lexmasterclass.com
> Director Lexical Computing Ltd
> Visiting Research Fellow University of Leeds
> Corpora for all with the Sketch Engine
> DANTE: a lexical database for English
> ========================================
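
For anyone who wants to check the arithmetic behind the Snow et al. figure quoted above, here is a minimal sketch of simple majority voting with random tie-breaking. It is not their code, and the votes are made up: it only assumes 177 items with 10 votes each, where the aggregated vote matches the gold label on all but one item (that one split 9-to-1 against gold). The names majority_vote and accuracy are my own.

```python
import random
from collections import Counter


def majority_vote(labels, rng):
    """Return the most frequent label; break ties uniformly at random."""
    counts = Counter(labels)
    best = max(counts.values())
    tied = sorted(label for label, count in counts.items() if count == best)
    return rng.choice(tied)


def accuracy(votes_per_item, gold, rng):
    """Fraction of items whose aggregated vote equals the gold label."""
    predicted = [majority_vote(votes, rng) for votes in votes_per_item]
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Synthetic stand-in data, NOT the real annotations: sense ids 1-3,
# 10 votes per item, 177 items, one item voted 9-to-1 against gold.
rng = random.Random(0)
gold = [1] * 177
votes = [[1] * 10 for _ in range(176)]
votes.append([2] * 9 + [1])

acc = accuracy(votes, gold, rng)
print(f"{acc:.3f}")  # 176/177, prints 0.994
```

So 0.994 is just 176 out of 177. Flipping the single disputed gold label gives 177 out of 177, which is the 100% mentioned in the quote.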

-- Katrin Erk, Department of Linguistics The University of Texas at Austin http://www.katrinerk.com/
