[Corpora-List] First Call for Shared Task Participation: Event Causality Identification with Causal News Corpus at CASE @ EMNLP 2022

ali hürriyetoglu ali.hurriyetoglu at gmail.com
Thu Apr 28 09:22:55 CEST 2022


Dear all,

We invite you to participate in the CASE-2022 Shared Task 3: Event Causality Identification with Causal News Corpus.

The task is being held as part of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2022). All participating teams will be able to publish their system description paper in the workshop proceedings published by ACL.

Workshop Website: https://emw.ku.edu.tr/case-2022/

Motivation

================================================

Causality is a core cognitive concept and appears in many natural language processing (NLP) works that aim to tackle inference and understanding. We are interested in studying event causality in news and therefore introduce the Causal News Corpus.

The Causal News Corpus consists of 3,559 event sentences, extracted from protest event news, that have been annotated with sequence labels indicating whether they contain causal relations. Causal sentences are additionally annotated with Cause, Effect and Signal spans. Our subtasks are built on the Causal News Corpus, and we hope they will lead to accurate, automated solutions for the detection and extraction of causal events in news.

Task Overview

================================================

We focus on two subtasks relevant to Event Causality Identification:

- Subtask 1: Causal Event Classification – Does an event sentence contain any cause-effect meaning?

- Subtask 2: Cause-Effect-Signal Span Detection – Which consecutive spans correspond to cause, effect or signal per causal sentence?

   - Subtask 2.1: Cause-Effect Span Detection – This subtask identifies the spans corresponding to cause and effect per sentence.

   - Subtask 2.2: Signal Span Detection – This subtask identifies the spans corresponding to the signal, or causal connective, per cause and effect relation.

Participants may design solutions that work on a single subtask, on multiple subtasks, or on all subtasks concurrently. Participants are also allowed to combine Subtask 1 and 2 annotations for either task. However, the target labels of the development and test sets must not be introduced during training in any way (e.g. not even for data augmentation).

Data Content

================================================

Our work extends a prior socio-political news corpus by annotating whether event-containing sentences have causal relations. Our data sizes and splits are as follows:

- Subtask 1: Causal Event Classification – 869 news documents and 3559 English sentences were annotated with labels on whether they contain causal relations or not. The data splits are: 2925 train, 323 development, and 311 test.

- Subtask 2: Cause-Effect-Signal Span Detection – Positive causal sentences from Subtask 1 were retained and annotated with Cause-Effect-Signal spans. Of the 1957 examples available, we have annotated only 180 sentences, but intend to complete all of them in the future. We have annotated 130 train+dev examples so far. There can be multiple relations per sentence. The data splits are: 130 train and 13 development. We will release more training examples closer to the test period. The test set will include at least 50 examples.

Task Repository: https://github.com/tanfiona/CausalNewsCorpus

Codalab Site: https://codalab.lisn.upsaclay.fr/competitions/2299

Subtask 1 Paper description (to appear at LREC 2022): http://arxiv.org/abs/2204.11714

Important Dates

================================================

Training data available: Apr 15, 2022

Validation data available: Apr 15, 2022

Validation labels available: Aug 01, 2022

Test data available: Aug 01, 2022

Test start: Aug 01, 2022

Test end: Aug 15, 2022

System Description Paper submissions due: Sep 07, 2022

Notification to authors after review: Oct 09, 2022

Camera ready: Oct 16, 2022

Workshop period @ EMNLP: Dec 7-8, 2022

Organization

================================================

- Fiona Anting Tan, Institute of Data Science/National University of Singapore, Singapore

- Ali Hürriyetoğlu, KNAW Humanities Cluster, the Netherlands

- Tommaso Caselli, Rijksuniversiteit Groningen, Netherlands

- Nelleke Oostdijk, Radboud University

- Tadashi Nomoto, National Institute of Japanese Literature, Japan

- Onur Uca, Mersin University

- Iqra Ameer, Centro de Investigación en Computación/Instituto Politécnico Nacional, Mexico

- Hansi Hettiarachchi, Birmingham City University, United Kingdom

- Farhana Ferdousi Liza, University of East Anglia, United Kingdom

- Tiancheng Hu, ETH Zürich, Switzerland

Please contact Fiona Anting Tan at tan.f at u.nus.edu, with your title starting with “CNC ST”, or post questions on the Forum page in Codalab.


