[Corpora-List] First CFP: Workshop on Language Technology for Digital Humanities in Central and (South-)Eastern Europe (LT4DH-CEE)

Petya Osenova petyaosenova at hotmail.com
Thu Jun 1 14:12:07 CEST 2017

Workshop on Language Technology for Digital Humanities

in Central and (South-)Eastern Europe (LT4DH-CEE)

to be held on 8 September 2017,

in conjunction with the 11th biennial Recent Advances in Natural Language Processing conference (RANLP 2017), which will take place on September 4-8, 2017, in Varna, Bulgaria.


Over the last decades, Digital Humanities has evolved dramatically, from simple database applications to complex systems involving the most recent state of the art in Computer Science. Language Technology in particular plays a major role, both in processing the metadata of recorded objects and in analyzing and interpreting content.

Applying language technology methods to objects from the humanities is a challenge for NLP research: data is heterogeneous (image/text), often incomplete (e.g. OCR errors), multilingual within one document (historical documents with Latin and/or classical Greek paragraphs) and difficult to structure (paragraphs, titles and pages are somewhat different in historical texts).

Corpus-based methods, nowadays standard in NLP research, often cannot be applied because the necessary large training data is missing.

Moreover, the requirements for digital humanities tools, especially those dedicated to cultural heritage objects, differ from the requirements for tools applied to modern texts.

Thus, research in Digital Humanities also involves adapting existing NLP tools to historical variants of languages, developing tools for new languages, making tools robust to syntactic deviation and adapting semantic resources.

Central and Eastern Europe has always been characterized by a high concentration of languages and cultures. Unfortunately, many historical documents in this region are in bad condition; many languages or dialects have become extinct over time, and written evidence of them is rare.

Digital Humanities seems the perfect means for preserving and investigating this rich cultural heritage. However, dedicated activities have so far been missing, probably in part due to the lack of adequate NLP resources and tools. It is therefore imperative to evaluate existing technology, monitor current activities and network the research teams in this area. These are the aims of the proposed workshop.


We are looking for original unpublished work related to (but not limited to) the following topics:

- Corpora for diachronic variants and the dialects of languages in Central and Eastern Europe (CEE);

- NLP tools for documents with historical, political, philosophical or archaeological content in CEE;

- Digital Humanities applications related to CEE;

- Evaluation of current frameworks (CLARIN, DARIAH) on DH-objects related to CEE;

- DH objects as Linked Open Data sets in CEE;

- DH types of resources in CEE (texts, images, artefacts, multimodal objects, etc.);

- Problematic issues related to tracking, digitizing, processing, annotating and preserving the DH objects in CEE;

- Good practices for handling under-resourced DH objects.


All submissions will be handled in the START system (the details will be given in the 2nd CFP). The reviewing process will be anonymous. Double submission is allowed, but authors will be asked to declare it at the time of submission.

Long papers should be 8 pages long plus 2 extra pages for references.

Short papers should be 6 pages long plus 2 extra pages for references. Accepted short papers will be presented either as short oral presentations or as posters.

All submissions should be formatted using the ACL-based stylesheets provided for RANLP (http://lml.bas.bg/ranlp2017/submissions.php#styles).

Accepted papers will be published in the workshop proceedings and uploaded to the ACL Anthology.

Important Dates:

Paper submission deadline: July 10, 2017

Notification of acceptance: August 6, 2017

Camera-ready papers due: August 21, 2017

LT4DH-CEE Workshop: September 8, 2017

Organizing Committee

Anca Dinu, University of Bucharest, Romania

Petya Osenova, Bulgarian Academy of Sciences, Bulgaria

Cristina Vertan, University of Hamburg, Germany
