Semantic networks and ontologies are key resources in Natural Language Processing. Of these resources, WordNet has remained in widespread use over the past two decades, in part due to its relatively broad coverage -- yet WordNet still omits many lemmas and senses, such as those from domain-specific lexicons (e.g., DNA replication, regular expression, and long shot), creative slang usages (e.g., homewrecker), or those for technology or entities that came into existence recently (e.g., selfie, mp3). Therefore, a variety of techniques have been proposed for extending the current ontology structure with new terminology and senses. However, these approaches have often been tested on relatively small datasets or without precisely measuring integration accuracy at the sense level. The aim of Task 14 is to provide a robust data set and an evaluation framework for measuring the accuracy of such ontology expansion techniques.


Task 14 aims to enrich the WordNet taxonomy with new words and word senses. For a word sense which is not already defined in the WordNet sense inventory, a system in this task has to identify either:

* the WordNet synset that is a generalization of the new word sense (i.e., its hypernym), or

* the WordNet synset whose word senses are synonyms of the new word sense.

Systems are provided with a specific word sense, i.e., a word together with its definition. A system's task is to identify the WordNet synset into which the new word sense should be merged (i.e., the term is synonymous with those in the synset) or to which it should be attached as a hyponym (i.e., the new word sense is a specialization of an existing word sense). For example, given the instance:

* chug (verb): to drink a large amount (especially of beer) in a single action; to chugalug.

A system would be expected to merge the lemma with the synset: gulp, quaff, swig -- “to swallow hurriedly or greedily or in one draught.”
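
As a rough illustration of the input and output described above, the chug example could be represented as follows. This is only a sketch: the class names NewSense and Prediction and their fields are hypothetical and are not the official data format, which is described on the task's Data page.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical containers for illustration only; not the official task format.

@dataclass(frozen=True)
class NewSense:
    lemma: str       # the word to be added
    pos: str         # part of speech, e.g. "verb"
    definition: str  # the gloss supplied with the word


@dataclass(frozen=True)
class Prediction:
    target_synset: Tuple[str, ...]  # lemmas of the chosen WordNet synset
    operation: str                  # "merge" (synonym) or "attach" (hyponym)

    def __post_init__(self) -> None:
        # Only the two operations named in the task description are allowed.
        if self.operation not in ("merge", "attach"):
            raise ValueError(f"unknown operation: {self.operation!r}")


item = NewSense(
    lemma="chug",
    pos="verb",
    definition="to drink a large amount (especially of beer) "
               "in a single action; to chugalug",
)

# The expected answer from the example: merge with the synset gulp, quaff, swig.
answer = Prediction(target_synset=("gulp", "quaff", "swig"), operation="merge")
```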

Systems will be measured according to two criteria: (1) their ability to correctly identify the attachment/merge point in the WordNet hierarchy for a new sense, scored with the graph-based similarity measure of Wu and Palmer (WuP; 1994), and (2) the percentage of items in a subtask that the system attaches or merges (Recall). The final ranking of all teams’ systems will be computed from the F1 score of WuP and Recall. See the task’s Evaluation<http://alt.qcri.org/semeval2016/task14/index.php?id=task-description> page for further details.
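
The two criteria above can be sketched on a toy taxonomy. Everything here is an assumption made for illustration: the PARENT map is invented and is not real WordNet structure, the root is given depth 1, WuP is averaged over the items a system answers, and F1 is taken as the harmonic mean of that average and Recall. The official scorer may differ in these details; the Evaluation page is authoritative.

```python
from typing import Dict, List, Optional, Tuple

# Toy hypernym map (child -> parent). Invented for illustration; NOT WordNet.
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "act": "entity",
    "consume": "act",
    "drink": "consume",
    "eat": "consume",
    "gulp": "drink",
    "sip": "drink",
}


def path_to_root(node: str) -> List[str]:
    """Return the chain node, parent, ..., root."""
    path = [node]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def depth(node: str) -> int:
    """Depth counted in nodes, so the root has depth 1 (an assumption)."""
    return len(path_to_root(node))


def wup(a: str, b: str) -> float:
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # Walking up from b, the first shared node is the deepest common subsumer.
    lcs = next(n for n in path_to_root(b) if n in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def score(gold: Dict[str, str],
          predicted: Dict[str, str]) -> Tuple[float, float, float]:
    """Return (mean WuP over answered items, Recall, F1 of the two)."""
    answered = [k for k in gold if k in predicted]
    recall = len(answered) / len(gold)
    mean_wup = (sum(wup(gold[k], predicted[k]) for k in answered) / len(answered)
                if answered else 0.0)
    total = mean_wup + recall
    f1 = 2 * mean_wup * recall / total if total else 0.0
    return mean_wup, recall, f1


# Two gold items; the hypothetical system answers only one, slightly off.
gold = {"chug": "gulp", "nibble": "eat"}
predicted = {"chug": "sip"}
mean_wup, recall, f1 = score(gold, predicted)
```

Under these assumptions a system is rewarded both for placing senses close to the gold synset and for answering as many items as possible, since skipping an item lowers Recall and therefore F1.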


The Task 14 training data set is now available and contains 400 new word senses that have been manually attached to the WordNet 3.0 hierarchy. Please see the task’s Data<http://alt.qcri.org/semeval2016/task14/index.php?id=data-and-tools> page for further details.


* Training data ready: September 4, 2015
* Test data ready: December 15, 2015
* Evaluation start: January 10, 2016
* Evaluation end: January 31, 2016
* Paper submission due: February 28, 2016 [TBC]
* Paper reviews due: March 31, 2016
* Camera ready due: April 30, 2016
* SemEval workshop: Summer 2016


* David Jurgens (jurgens at stanford.edu<mailto:jurgens at stanford.edu>), Stanford University, USA
* Mohammad Taher Pilehvar (pilehvar at di.uniroma1.it<mailto:pilehvar at di.uniroma1.it>), Sapienza University of Rome, Italy

