[Corpora-List] 1st CfP: NoDaLiDa 2017 Workshop on Processing Historical Text

Gerlof Bouma gerlof.bouma at gu.se
Fri Jan 13 17:00:42 CET 2017

** Call for Papers

NoDaLiDa 2017 Workshop on Processing Historical Language

In conjunction with NoDaLiDa 2017 in Gothenburg, we are organizing a half-day workshop on Processing Historical Language.

* Topic

Many efforts to handle older historical language materials will run into the combined problems of having a low-resource language with high amounts of variation. By "low resource", we mean that there are no or few electronic resources like annotated corpora, computational morphologies, parsers, etc., that we may use as processing tools or to develop such tools. "High variation" refers to differences between materials in terms of orthography, punctuation conventions, vocabulary, morphological distinctions, word order, etc. These differences may, for instance, be due to a low level of standardization, but they may also be due to materials lying far apart in terms of time and/or space, even though they are nominally from the same language. Either of these problems on its own will present a challenge for standard data-driven techniques, but together they make the use of such techniques really problematic.

Even though a historical language may be considered low-resource from a natural language processing point of view, it may still be an actively studied language in fields like philology and historical linguistics, and be well-described in grammars and dictionaries. This suggests that processing historical material should not just be a matter of trying to overcome our field's technical/methodological challenges, but also of crossing into these other fields, so that we may collaborate with their experts and take advantage of their expertise and resources. The latter, however, typically belong to a much more knowledge-driven tradition than the data-driven models that are dominant in present-day natural language processing.

In this half-day workshop, we aim to bring together researchers working on processing historical materials, with a particular focus on work that investigates the combination of data-driven and knowledge-driven modelling. By "processing", we mean a wide range of text processing tasks from different angles and at different levels, be it creating transcriptions and editions of manuscripts, constructing lexicons, tagging, parsing, or content-oriented processing such as semantic parsing, information extraction, etc.

* Information for authors

Authors are invited to submit short papers describing original, unpublished work, be it completed or in progress. Papers should have at most 4 pages of main content, with additional pages allowed for references and appendices. All accepted papers will be presented as talks.

Submissions must use the NoDaLiDa stylesheets, to be found at http://stp.lingfil.uu.se/~bea/nodalida17/

Reviewing will be blind. Authors should take the usual precautions to avoid revealing their identity in the review version, for instance by not giving any author names in the title section, not referring to previous work in the first person, not referring to project and web site names, etc.

Papers should be submitted electronically as PDFs at https://easychair.org/conferences/?conf=prochistlang2017

* Important Dates

Paper submission deadline: 20 March 2017
Notification of acceptance: 6 April 2017
Early-bird NoDaLiDa registration: 11 April 2017
Camera-ready papers due: 24 April 2017
Workshop: 22 May 2017
NoDaLiDa main conference: 23–24 May 2017

* More information

Please see the workshop website for more information, at http://spraakbanken.gu.se/eng/processing-historical-language

The workshop organizers can be contacted at prochistlang at svenska.gu.se (not for submissions).

Information about the host conference NoDaLiDa 2017 can be found at http://nodalida2017.se
