HIPE Shared Task - Second Call for Participation and Training Set Release v1.0 (apologies for cross-postings)

====
HIPE: Identifying Historical People, Places and other Entities
Website: https://impresso.github.io/CLEF-HIPE-2020/
Tasks: NERC and Entity Linking on historical newspapers in French, German and English
Registration: http://clef2020-labs-registration.dei.unipd.it/ (until 26 April 2020)
Evaluation period: 27-30 April 2020
Workshop venue: during the CLEF conference<https://clef2020.clef-initiative.eu/>, 22-25 September 2020, Thessaloniki, Greece
Twitter: @ImpressoProject<https://twitter.com/ImpressoProject/> / #HIPE / @clef_initiative<https://twitter.com/clef_initiative> / #clef2020
====

Call for Participation
We invite participation in the HIPE shared task on named entity (NE) processing, part of the CLEF 2020 Evaluation Labs. In the context of the massive digitization of historical documents, this shared task aims to assess and advance the development of robust named entity processing systems that can deal with challenging, multilingual, diachronic historical material, thereby supporting information extraction and text understanding of cultural heritage data.

The results of participating teams will appear in the working notes proceedings, published by CEUR Workshop<http://ceur-ws.org/> Proceedings, and will be presented at the CLEF conference in September 2020.

Tasks
1. Named Entity Recognition and Classification (coarse and fine-grained)
2. Entity Linking

Data
The data consists of historical newspaper articles in French, German and American English, originating from Swiss, Luxembourgish and American digitized newspaper archives and selected on a diachronic basis. The corpus as a whole spans the period from 1798 to 2018.

We are happy to announce the data release v1.0<https://github.com/impresso/CLEF-HIPE-2020/tree/master/data>, with training and dev sets for French and German and a dev set for English (ca. 24k mentions and linked entities). For statistics about the data, please visit the data page<https://impresso.github.io/CLEF-HIPE-2020//datasets.html> of the HIPE website.

Scorer
The HIPE Scorer is available here: https://github.com/impresso/CLEF-HIPE-2020-scorer

Information about evaluation metrics is available on the website and in the Participation Guidelines<https://zenodo.org/record/3677171> v1.1.

Complementary data
We will soon release monolingual word embeddings computed on the historical newspaper material from which the HIPE data was sampled.

Important dates (updated)
27 April 2020, 09:00 CET: test data release for bundles 1 to 4
30 April 2020, 23:59 CET: system responses due for bundles 1 to 4
06 May 2020, 21:00 CET: test data release for bundle 5 (= with mention boundaries)
08 May 2020, 12:00 CET: publication of results for bundles 1 to 4
09 May 2020, 23:59 CET: system responses due for bundle 5
12 May 2020, 12:00 CET: publication of results for bundle 5
24 May 2020: submission of participant papers
14 June 2020: notification of acceptance of participant papers
28 June 2020: camera-ready copy of participant papers
17-22 July 2020: CEUR-WS participant paper preview for checking by authors and lab organizers
22-25 Sept 2020: CLEF 2020 Conference in Thessaloniki, Greece

Covid-19
The CLEF conference and the CLEF Labs calendar are maintained for the moment but, as with many other events, the situation is evolving rapidly and is being closely monitored by the organisers. We will relay all updates affecting the HIPE calendar on the website and via the Google Group mailing list.

For further information please visit the HIPE website<https://impresso.github.io/CLEF-HIPE-2020/> and check the participation guidelines.

With best regards, HIPE Shared Task Organizers.

(HIPE is supported by the impresso - Media Monitoring of the Past<https://impresso-project.ch/> project.)

