[Corpora-List] Call for participation - TASS 2020 "Semantic Analysis in Spanish" - IberLEF - SEPLN 2020

E. Martínez Cámara emcamara at decsai.ugr.es
Mon Feb 3 12:34:24 CET 2020

Apologies for cross-postings


*TASS 2020: Semantic Analysis in Spanish*

Held as part of the evaluation forum IberLEF in the XXXVI edition of the International Conference of the Spanish Society for Natural Language Processing (SEPLN 2020 <http://sepln2020.sepln.org/>)

September 22-25, 2020. Málaga, Spain

*Webpage:* http://tass.sepln.org/2020/


*About TASS*

The workshop and shared task “Sentiment Analysis at SEPLN (TASS)” has been held since 2012 under the umbrella of the International Conference of the Spanish Society for Natural Language Processing (SEPLN). TASS was the first shared task on sentiment analysis of Spanish-language tweets. Spanish is the second most used language on Facebook and Twitter [1], which calls for the development and availability of language-specific methods and resources for sentiment analysis. The initial aim of TASS was to further research on sentiment analysis in Spanish. Since 2019 it has been part of IberLEF.


After several years exploring general polarity analysis with different collections and Spanish variants, and with strong participation, TASS 2020 moves towards emotion classification with a new collection. In addition, we maintain the main task of general polarity analysis, with some novelties, which are described below.

*Task 1: General polarity at three levels*

The aim of this original TASS task is to evaluate polarity classification systems for tweets written in Spanish and its different variants. We propose two subtasks:

*Subtask-1: Monolingual.*

Training and testing on each InterTASS dataset (ES-Spain, PE-Peru, CR-Costa Rica, CH-Chile, UR-Uruguay, MX-Mexico). Any corpus or linguistic resource may be used.

*Subtask-2: Multilingual.*

A new test dataset will be delivered, with tweets extracted from the different subsets of Spain, Peru, Costa Rica, Chile, Uruguay, and Mexico. Again, it will be possible to use any corpora or linguistic resources.

*Challenges:*

* Lack of context: Tweets are short (up to 240 characters).

* Informal language: Misspellings, emojis, onomatopoeias are common.

* Multilinguality: The training, development, and test corpora contain tweets written in the varieties of Spanish spoken in Spain, Peru, Costa Rica, Chile, Uruguay, and Mexico.

*Task 2: Emotion detection*

Understanding the emotions expressed by users on social media is hard because voice modulation and facial expressions are absent. The shared task “Emotion detection” has been designed to encourage research in this area. The task consists of classifying the emotion expressed in a tweet as ‘neutral or no emotion’ or as one of Ekman’s six basic emotions:

* anger (also includes annoyance and rage) can be inferred

* disgust (also includes disinterest, dislike, and loathing) can be inferred

* fear (also includes apprehension, anxiety, concern, and terror) can be inferred

* joy (also includes serenity and ecstasy) can be inferred

* sadness (also includes pensiveness and grief) can be inferred

* surprise (also includes distraction and amazement) can be inferred

The dataset is based on events that took place in April 2019 in different domains: entertainment, catastrophe, politics, global commemoration, and global strike. Since these events are polarized, we have replaced the hashtags in the dataset with the keyword “HASHTAG” to prevent automatic classifiers from relying on hashtags to categorize the emotion associated with a tweet. We have also replaced user mentions with @USER.
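Participants who want to apply the same masking to external corpora could start from something like the sketch below. It is only an illustration: the regular expressions and the function name mask_tweet are assumptions, not the exact rules the organizers used to build the dataset.

```python
import re

# Assumed patterns (not the organizers' actual rules):
# a hashtag is '#' followed by word characters, a mention is '@' followed by word characters.
# In Python 3, \w matches Unicode letters, so accented Spanish hashtags are covered.
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")


def mask_tweet(text: str) -> str:
    """Replace every hashtag with 'HASHTAG' and every user mention with '@USER'."""
    text = HASHTAG_RE.sub("HASHTAG", text)
    text = MENTION_RE.sub("@USER", text)
    return text


if __name__ == "__main__":
    print(mask_tweet("@maria Qué tristeza lo de #NotreDame"))
```

Note that these simple patterns would also rewrite the domain part of an e-mail address that appears in a tweet, so a stricter pattern may be preferable for noisy data.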

*Challenges:*

* Lack of context: Tweets are short (up to 240 characters).

* Informal language: Misspellings, emojis, onomatopoeias are common.

* Multiclass classification: The dataset is labeled with seven different classes.


*Important dates*

* February 1, 2020: Registration open

* February 1, 2020: Release of training and development corpora

* May 1, 2020: Release of test corpora

* May 12, 2020: Deadline for evaluation

* May 25, 2020: Paper submission

* June 15, 2020: Review notification

* July 5, 2020: Camera-ready submission

* September 2020: Publication

* September 2020: Workshop in Málaga (CEDI 2020)


Format details will be communicated shortly, according to the specifications of IberLEF organizers.

*Organizing committee*

* Manuel García Vega (University of Jaén, Spain)

* Manuel Carlos Díaz Galiano (University of Jaén, Spain)

* Miguel Ángel García Cumbreras (University of Jaén, Spain)

* Flor Miriam Plaza del Arco (University of Jaén, Spain)

* Arturo Montejo Ráez (University of Jaén, Spain)

* Salud María Jiménez Zafra (University of Jaén, Spain)

* Eugenio Martínez Cámara (University of Granada, Spain)

* César Antonio Aguilar (Universidad Católica de Chile, Chile)

* Edgar Casasola Murillo (Universidad de Costa Rica, Costa Rica)

* Marco Antonio Sobrevilla Cabezudo (Universidade de São Paulo, Brazil)

* Luis Chiruzzo (Universidad de la República, Uruguay)

* Daniela A. Moctezuma (CentroGeo Aguascalientes, México)

*Program Committee*

* Erik Cambria (Nanyang Technological University)

* Edgar Casasola Murillo (University of Costa Rica, Costa Rica)

* Fermín Cruz Mata (University of Sevilla, Spain)

* Luis Espinosa Anke (Cardiff University, United Kingdom)

* Yoan Gutiérrez Vázquez (University of Alicante, Spain)

* Lluís F. Hurtado (Polytechnic University of Valencia, Spain)

* Salud María Jiménez Zafra (University of Jaén, Spain)

* María Victoria Luzón García (University of Granada, Spain)

* Mª. Teresa Martín Valdivia (University of Jaén, Spain)

* Manuel Montes Gómez (National Institute of Astrophysics, Optics and Electronics, Mexico)

* Antonio Moreno Ortíz (University of Málaga, Spain)

* José Manuel Perea Ortega (University of Extremadura, Spain)

* Ferrán Pla (Universidad Politécnica de Valencia, Spain)

* Sara Rosenthal (IBM Research, U.S.A.)

* Maite Taboada (Simon Fraser University, Canada)

* L. Alfonso Ureña López (University of Jaén, Spain)

--
Eugenio Martínez Cámara
Postdoctoral Researcher in Natural Language Processing
Research group SCI2S
Computer Science and Artificial Intelligence Department
Universidad de Granada
