[Corpora-List] Call for Participation: Multilingual Word Sense Disambiguation (SemEval 2013 - Task 12)

David Jurgens david.jurgens at gmail.com
Mon Feb 4 23:53:47 CET 2013


Call For Participation

Multilingual Word Sense Disambiguation SemEval 2013 - Task #12


The aim of this task is to evaluate Word Sense Disambiguation systems in an all-words multilingual setting.


Task 12 provides a traditional setup for evaluating Word Sense Disambiguation (WSD) systems in an all-words, multilingual setting by marking occurrences of potentially polysemous words in five different languages (English, French, German, Italian, Spanish) with sense labels provided by a multilingual sense inventory. To enable multilinguality we make use of the BabelNet sense inventory [1], a wide-coverage semantic network built by merging WordNet with Wikipedia to provide an “encyclopedic dictionary.” BabelNet concepts are lexicalized in many languages using Wikipedia’s inter-language links and the output of a state-of-the-art machine translation system. Task 12 will use a validated version of BabelNet 1.1 (http://babelnet.org) in which the Wikipedia-WordNet mappings of all senses of lemmas in the test data have been manually verified.


Participants are free to work on the full BabelNet sense inventory or to work on either of its inventory subsets, i.e. WordNet 3.0 or Wikipedia page titles. They are also free to participate using a single language of their choice or all five languages.


Following the traditional WSD “all-words” experimental setting [2], systems will be expected to link all occurrences of noun phrases within arbitrary texts in different languages to the most suitable senses in the sense inventory of their choice. For instance, given the sentence:

1. The dramatic force of Miller's play derives in part from expressionistic techniques he used to portray Loman's psychological anguish and guilt-ridden fantasy life.

a disambiguation system should link “Miller” to one of (1) the BabelNet synset for Arthur Miller<http://lcl.uniroma1.it/babelnet/search.jsp?word=Arthur+Miller&lang=EN>, (2) the Wikipedia sense corresponding to the page http://en.wikipedia.org/wiki/Arthur_Miller, or (3) Miller#n#3 (i.e. the third WordNet sense for Miller), depending on the participant’s choice of sense inventory. Note that the BabelNet synset will contain, where applicable, both the Wikipedia page and the WordNet synset in its representation.
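The three label choices for the “Miller” example can be pictured with a small Python sketch. It is only an illustration: the SynsetSketch class, the sense_label function, and the placeholder BabelNet identifier are hypothetical and are not part of BabelNet or of the task's submission format; only "Arthur_Miller" and "Miller#n#3" come from the example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SynsetSketch:
    """Hypothetical stand-in for a BabelNet synset (not the real BabelNet API)."""
    babelnet_id: str               # placeholder identifier, invented for illustration
    wikipedia_page: Optional[str]  # Wikipedia page title, when the synset has one
    wordnet_sense: Optional[str]   # WordNet sense label, when the synset has one


def sense_label(synset: SynsetSketch, inventory: str) -> Optional[str]:
    """Return the label a system would output under the chosen sense inventory."""
    if inventory == "babelnet":
        return synset.babelnet_id
    if inventory == "wikipedia":
        return synset.wikipedia_page
    if inventory == "wordnet":
        return synset.wordnet_sense
    raise ValueError(f"unknown inventory: {inventory}")


# The synset for Arthur Miller carries both a Wikipedia page and a WordNet sense.
arthur_miller = SynsetSketch(
    babelnet_id="<babelnet-synset-for-Arthur-Miller>",  # placeholder, not a real ID
    wikipedia_page="Arthur_Miller",
    wordnet_sense="Miller#n#3",
)

for inventory in ("babelnet", "wikipedia", "wordnet"):
    print(inventory, "->", sense_label(arthur_miller, inventory))
```

A synset that exists only in Wikipedia would simply have wordnet_sense set to None, which matches the "where applicable" wording above.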

Participants will be evaluated in groups based on their choice of sense inventory and target language. All the information about the submitted systems (such as training data, resources, etc. used by the system) will be reported in the task paper.
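As a rough picture of what evaluation in groups could look like, the Python sketch below buckets runs by (sense inventory, language) and scores one run against a gold standard. This call does not name the scoring measure; precision, recall and F1 are used here purely as an assumption, being the conventional measures in all-words WSD. All system names, instance identifiers and sense labels are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Run = Tuple[str, str, str]  # (system name, sense inventory, language)


def group_runs(runs: List[Run]) -> Dict[Tuple[str, str], List[str]]:
    """Bucket submitted runs so that only comparable systems are ranked together."""
    groups = defaultdict(list)
    for system, inventory, language in runs:
        groups[(inventory, language)].append(system)
    return dict(groups)


def score(gold: Dict[str, str], predicted: Dict[str, str]) -> Tuple[float, float, float]:
    """Precision, recall and F1 over instance-id -> sense-label mappings (assumed measures)."""
    correct = sum(1 for inst, label in predicted.items() if gold.get(inst) == label)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


runs = [
    ("sysA", "babelnet", "EN"),
    ("sysB", "wordnet", "EN"),
    ("sysC", "babelnet", "EN"),
    ("sysA", "babelnet", "IT"),
]
groups = group_runs(runs)
print(groups[("babelnet", "EN")])

# Four gold instances; the system answers three of them and gets two right.
gold = {"t1": "s1", "t2": "s2", "t3": "s3", "t4": "s4"}
predicted = {"t1": "s1", "t2": "s2", "t3": "wrong"}
print(score(gold, predicted))
```

Keeping precision and recall separate lets a system that leaves some instances unanswered be distinguished from one that answers everything.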


No training data will be provided as a part of this task; however, participants are allowed to use any freely available training data for building their system.

For annotating the test set, by mid-February we will provide a gold standard version of BabelNet 1.1 where all synsets used in the test data have been manually verified for correctness.


February 15, 2013 - Registration Deadline
March 1, 2013 onwards - Start of evaluation period
March 15, 2013 - End of evaluation period
April 9, 2013 - Paper submission deadline [TBC]
April 23, 2013 - Reviews Due [TBC]
May 4, 2013 - Camera ready Due [TBC]


The Semeval-2013 Task #12 website, for signup and details, is:


If interested in the task please join our mailing list for updates:


ORGANIZERS
Roberto Navigli (lastname at di.uniroma1.it), Sapienza University of Rome, Italy
David Jurgens (lastname at di.uniroma1.it), Sapienza University of Rome, Italy

REFERENCES
1. Roberto Navigli & Simone Paolo Ponzetto. BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network. Artificial Intelligence, 193, 2012, pp. 217-250.
2. Roberto Navigli. Word Sense Disambiguation: A survey. ACM Computing Surveys, 41(2), ACM Press, 2009, pp. 1-69.
