We are excited to invite participants to the Shared Task at the 2020 Conference on Computational Natural Language Learning (CoNLL):
Cross-Framework Meaning Representation Parsing (MRP 2020)
For background on the nature of the task and its schedule, please see:
A sample of sentences annotated with MRP graphs in five frameworks:
Potentially interested parties are invited to sign up for future updates:
The goal of the task is to advance data-driven parsing into graph-structured representations of sentence meaning. All things semantic have received heightened attention in recent years. Despite remarkable advances in vector-based (continuous and distributed) encodings of meaning, ‘classic’ (discrete and hierarchically structured) semantic representations will continue to play an important role in ‘making sense’ of natural language. While parsing has long been dominated by tree-structured target representations, there is now growing interest in general graphs as more expressive and arguably more adequate target structures.
For the first time, this task combines formally and linguistically different approaches to meaning representation in graph form in a uniform training and evaluation setup. Participants are invited to develop parsing systems that support five distinct semantic graph frameworks—which all encode core predicate–argument structure, among other things—in the same implementation. Training and evaluation data will be provided for all five frameworks. Participants are asked to design and train a system that predicts sentence-level meaning representations in all frameworks in parallel. Architectures that utilize complementary knowledge sources (e.g. via parameter sharing and multi-task learning) are encouraged (though not required). Learning from multiple flavors of meaning representation in tandem has hardly been explored.
The task seeks to reduce framework-specific ‘balkanization’ in the field of meaning representation parsing. Expected outcomes include (a) a unifying formal model over different semantic graph banks, (b) uniform representations and framework-agnostic scoring, (c) systematic contrastive evaluation across frameworks, and (d) increased cross-fertilization via transfer and multi-task learning. We hope to engage the combined community of parser developers for graph-structured output representations, including from six prior framework-specific tasks at the Semantic Evaluation exercises between 2014 and 2019. Owing to scarcity of semantic annotations across frameworks, the shared task is organized into two tracks: (a) cross-framework MRP, regrettably limited to English for the time being, and (b) cross-lingual MRP, with one additional language for each framework.
The task combines five frameworks for graph-based meaning representation, each with its specific formal and linguistic assumptions.
+ Prague Tectogrammatical Graphs (Hajič et al., 2012)
+ Elementary Dependency Structures (Oepen & Lønning, 2006)
+ Universal Conceptual Cognitive Annotation (Abend & Rappoport, 2013)
+ Abstract Meaning Representation (Banarescu et al., 2013)
+ Discourse Representation Graphs (Bos et al., 2017)
For the shared task, we have repackaged different graph banks into a uniform and normalized abstract representation with a common serialization format (in JSON). Training data comprising semantic graphs over a total of some 3.5 million tokens in running English text is now available to participants; additional, cross-lingual data will be released in the second half of May. High-quality tokenization, PoS tagging, lemmatization, and Universal Dependency parse trees will be provided as an optional ‘companion’ resource. For all frameworks, both in- and out-of-domain evaluation data will be provided in the same unified format.
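As a rough illustration of what working with a JSON-serialized graph might look like, the sketch below parses a single record into node labels and labeled edges. The record and its field names (`nodes`, `edges`, `source`, `target`, `label`) are assumptions made for this example only; the authoritative format description is the one on the task web site.

```python
import json

# Hypothetical record for illustration only; the field names are assumptions
# and not a statement of the official serialization.
line = (
    '{"id": "example-1", "input": "The dog barks.", '
    '"nodes": [{"id": 0, "label": "dog"}, {"id": 1, "label": "bark"}], '
    '"edges": [{"source": 1, "target": 0, "label": "ARG1"}]}'
)


def load_graph(text):
    """Parse one JSON-serialized graph into node labels and labeled edges."""
    record = json.loads(text)
    # Map node identifiers to their (optional) labels.
    labels = {node["id"]: node.get("label") for node in record.get("nodes", [])}
    # Represent each edge as a (source, label, target) triple.
    edges = [
        (edge["source"], edge.get("label"), edge["target"])
        for edge in record.get("edges", [])
    ]
    return labels, edges


labels, edges = load_graph(line)
```

Keeping edges as plain triples makes it straightforward to compare system output against gold-standard graphs later on.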
+ March 30, 2020: Availability of Starting Data Package
+ April 27, 2020: Initial Release of 2020 Training Data
+ May 25, 2020: Data Updates; Additional Languages
+ June 8, 2020: Closing Date for Extra Data Nominations
+ July 20–August 3, 2020: Evaluation Period (Held-Out Data)
+ September 7, 2020: Submission of System Descriptions
+ November 19–20, 2020: Presentation of Results at CoNLL
For each of the individual frameworks, there are common ways of evaluating the quality of parser outputs in terms of graph similarity to gold-standard target representations. There is broad similarity between the framework-specific evaluation metrics used to date, although there are some subtle differences too. In a nutshell, meaning representation parsing is commonly evaluated in terms of a graph similarity F1 score at the level of individual node–edge–node triples, i.e. ‘atomic’ dependencies.
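To make the idea of a triple-level F1 score concrete, here is a minimal sketch that compares two sets of node–edge–node triples. It assumes that gold and system nodes have already been put into correspondence (here simply by using the same node names), which a real scorer has to work out itself; the function name and the toy triples are hypothetical.

```python
def triple_f1(gold, system):
    """Return (precision, recall, F1) over sets of (node, edge, node) triples.

    Assumes gold and system nodes are already aligned, so that identical
    triples can be matched by simple set intersection.
    """
    gold, system = set(gold), set(system)
    correct = len(gold & system)
    precision = correct / len(system) if system else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example: one of two system triples matches one of two gold triples.
gold = {("bark", "ARG1", "dog"), ("bark", "TIME", "now")}
system = {("bark", "ARG1", "dog"), ("bark", "ARG2", "dog")}
scores = triple_f1(gold, system)  # (0.5, 0.5, 0.5)
```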
For the shared task, we provide a (relatively straightforward) generalization of existing, framework-specific metrics that (a) is applicable across different flavors of semantic graphs, (b) distinguishes separate ‘types’ of information, (c) does not require matching node anchoring in the underlying string, but (d) takes advantage of node ordering when available. Labeled per-dependency scores, macro-averaged across all frameworks, will be the official metric for the task; we will also provide additional cross-framework evaluation perspectives, as well as scoring in established framework-specific metrics.
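Macro-averaging gives each framework equal weight regardless of how much evaluation data it contributes. The sketch below shows the arithmetic under that reading; the short framework keys and the per-framework scores are made up for illustration and are not task results.

```python
def macro_average(scores):
    """Unweighted mean of per-framework scores (each framework counts once)."""
    if not scores:
        raise ValueError("at least one framework score is required")
    return sum(scores.values()) / len(scores)


# Invented per-framework F1 values, purely for illustration.
per_framework_f1 = {
    "ptg": 0.5,
    "eds": 0.75,
    "ucca": 0.25,
    "amr": 1.0,
    "drg": 0.5,
}
overall = macro_average(per_framework_f1)
```

Under this scheme, a system that does well on four frameworks but poorly on the fifth is penalized as much as if the weak framework had the largest test set.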
We invite all possibly interested parties to self-subscribe to the mailing list for this task; the subscription link and access information for the training data are available from the task web site:
Please do not hesitate to contact the task organizers for questions or clarifications, using the joint email address provided on the task web pages. And stay safe and healthy!
Omri Abend, Lasha Abzianidze, Johan Bos, Jan Hajič, Daniel Hershcovich, Bin Li, Stephan Oepen, Tim O'Gorman, Nianwen Xue, and Dan Zeman