[Corpora-List] [PoliticEs at IberLEF2022] PoliticEs: Spanish Author Profiling for Political Ideology - Training set released!

Salud María Jiménez Zafra sjzafra at ujaen.es
Mon Mar 14 12:30:51 CET 2022


*******************************************************************************************************************

Training set released!

SECOND CALL FOR PARTICIPATION

IberLEF 2022 Task - PoliticEs. Spanish Author Profiling for Political Ideology

Held as part of the evaluation forum IberLEF <https://sites.google.com/view/iberlef2022> in the XXXVIII edition of the International Conference of the Spanish Society for Natural Language Processing (SEPLN 2022 <https://sepln2022.grupolys.org/>)

September 20, 2022. A Coruña, Spain

Codalab link: https://codalab.lisn.upsaclay.fr/competitions/1948

Dear All,

We invite researchers and students to participate in the shared task PoliticEs: Spanish Author Profiling for Political Ideology, held as part of the evaluation forum IberLEF and co-located with the SEPLN 2022 conference.

The goal of this task is to extract the political ideology of a user from a given set of tweets. It covers two demographic traits, gender and profession, and one psychographic trait, the political spectrum, which is addressed from both a binary and a multiclass perspective. To the best of our knowledge, this is the first Spanish shared task focused on extracting political ideology from a text collection.

Participants will be provided with development, development_test, training and test datasets in Spanish, drawn from an extension of the PoliCorpus 2020 (García-Díaz et al., 2022). The dataset was collected during 2020 and 2021 from the Twitter accounts of politicians and political journalists in Spain using the UMUCorpusClassifier (García-Díaz et al., 2020). It comprises around 400 different users, each with at least 120 tweets. Each author is annotated with his or her gender (male, female) and profession (journalist, politician), and with the political spectrum on two axes: binary (left, right) and multiclass (left, moderate_left, moderate_right, right).
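As an illustration of the annotation scheme described above, the following minimal Python sketch shows one way tweet-level rows could be grouped into one record per author while checking the labels against the four label sets. The label values are taken from this announcement; the field names (author_id, tweet, gender, profession, ideology_binary, ideology_multiclass) are assumptions made for the example and may not match the released files.

```python
from dataclasses import dataclass, field

# Label sets as listed in the task description.
GENDERS = {"male", "female"}
PROFESSIONS = {"journalist", "politician"}
IDEOLOGY_BINARY = {"left", "right"}
IDEOLOGY_MULTICLASS = {"left", "moderate_left", "moderate_right", "right"}


@dataclass
class Author:
    """One annotated user; field names are hypothetical."""
    author_id: str
    gender: str
    profession: str
    ideology_binary: str
    ideology_multiclass: str
    tweets: list = field(default_factory=list)


def validate(author):
    """Raise ValueError if any label falls outside the documented label sets."""
    checks = [
        (author.gender, GENDERS),
        (author.profession, PROFESSIONS),
        (author.ideology_binary, IDEOLOGY_BINARY),
        (author.ideology_multiclass, IDEOLOGY_MULTICLASS),
    ]
    for value, allowed in checks:
        if value not in allowed:
            raise ValueError(f"unexpected label {value!r} for author {author.author_id}")


def group_by_author(rows):
    """Collect tweet-level rows (dicts with assumed keys) into one Author per user."""
    authors = {}
    for row in rows:
        author = authors.get(row["author_id"])
        if author is None:
            author = Author(
                row["author_id"],
                row["gender"],
                row["profession"],
                row["ideology_binary"],
                row["ideology_multiclass"],
            )
            validate(author)
            authors[row["author_id"]] = author
        author.tweets.append(row["tweet"])
    return authors
```

Grouping at the author level reflects that the labels apply to users, not to individual tweets.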

Moreover, to facilitate participation in the competition, a notebook with a bag-of-words (BoW) baseline will be provided. To download the data and the notebook, and to participate, go to https://codalab.lisn.upsaclay.fr/competitions/1948.
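For readers unfamiliar with bag-of-words baselines, here is a minimal sketch of the general idea using only the Python standard library. It is not the organizers' notebook: the tokenizer, the nearest-centroid classifier with cosine similarity, and all function names are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter, defaultdict

TOKEN_RE = re.compile(r"\w+")


def bag_of_words(tweets):
    """Count lowercase word tokens over all tweets of one author."""
    counts = Counter()
    for tweet in tweets:
        counts.update(TOKEN_RE.findall(tweet.lower()))
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(value * b.get(token, 0) for token, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def fit_centroids(labeled_authors):
    """Sum the BoW vectors of all authors sharing a label.

    labeled_authors: iterable of (list_of_tweets, label) pairs.
    Cosine similarity ignores vector length, so summed counts
    behave like a mean vector here.
    """
    centroids = defaultdict(Counter)
    for tweets, label in labeled_authors:
        centroids[label].update(bag_of_words(tweets))
    return dict(centroids)


def predict(centroids, tweets):
    """Return the label whose centroid is most similar to the author's BoW."""
    bow = bag_of_words(tweets)
    # Iterating over sorted labels makes tie-breaking deterministic.
    return max(sorted(centroids), key=lambda label: cosine(bow, centroids[label]))
```

One such classifier would be fitted per trait (gender, profession, binary and multiclass political spectrum), each on the same author-level tweet collections but with a different label column.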

Today we have released the training dataset, which can be found in the "Files" subsection of the "Participate" tab. Note that this dataset includes all the instances that were released during the Practice stage, so there is no need to combine the two datasets.

Finally, remember that the CodaLab competition is open for submitting your results on the development dataset provided. This dataset is available in the same section as the training dataset.

Best regards,

The organizing committee

References

- García-Díaz, J. A., Almela, Á., Alcaraz-Mármol, G., & Valencia-García, R. (2020). UMUCorpusClassifier: Compilation and evaluation of linguistic corpus for Natural Language Processing tasks. Procesamiento del Lenguaje Natural, 65, 139-142.

- García-Díaz, J. A., Colomo-Palacios, R., & Valencia-García, R. (2022). Psychographic traits identification based on political ideology: An author analysis study on Spanish politicians’ tweets posted in 2020. Future Generation Computer Systems, 130(1), 59-74.

Important dates

- Release of development corpora: Feb 14, 2022
- Release of training corpora: Mar 14, 2022
- Release of test corpora and start of evaluation campaign: Apr 18, 2022
- End of evaluation campaign (deadline for runs submission): May 4, 2022
- Publication of official results: May 6, 2022
- Paper submission: May 29, 2022
- Review notification: Jun 17, 2022
- Camera ready submission: Jun 28, 2022
- IberLEF Workshop (SEPLN 2022): Sep 20, 2022

Organizing committee

- José Antonio García-Díaz (TECNOMOD, Universidad de Murcia)
- Salud María Jiménez-Zafra (SINAI, Universidad de Jaén)
- María-Teresa Martín Valdivia (SINAI, Universidad de Jaén)
- Francisco García-Sánchez (TECNOMOD, Universidad de Murcia)
- L. Alfonso Ureña-López (SINAI, Universidad de Jaén)
- Rafael Valencia-García (TECNOMOD, Universidad de Murcia)

*******************************************************************************************************************

Salud María Jiménez Zafra
sjzafra at ujaen.es
Universidad de Jaén <http://www.uja.es/>
Grupo de Investigación SINAI <http://sinai.ujaen.es/> | Departamento de Informática
EPS Jaén, Edificio A3, Despacho 219
Campus Las Lagunillas s/n, 23071 - Jaén | +34 953212992



