[Corpora-List] deadline extension: LRE special issue on how to standardize historical corpora

Eszter Simon simon.eszterke at gmail.com
Fri Aug 28 13:57:29 CEST 2015

***** Apologies for cross-posting *****

We are inviting submissions for a Special Issue of the Language Resources and Evaluation Journal, entitled “Converging Corpora: How to standardize historical corpora of typologically and genetically different languages”.


The availability of annotated language resources is becoming an increasingly important factor in more and more domains of linguistic research, since high-quality linguistic databases can provide a fertile ground for theoretical investigations. Historical corpora represent a rich source of data, but only if the relevant information is specified in a computationally retrievable and interpretable way.

Several databases of historical texts enriched with some kind of linguistic information and metadata have recently been created for various Indo-European languages, such as the Penn Corpora of Historical English, the Tycho Brahe Parsed Corpus of Historical Portuguese, and the Welsh Prose corpus, as well as for non-Indo-European languages, cf. the Old Hungarian Corpus.

With the recent increase in the number of annotated historical corpora, it seems advisable to move towards a harmonized common framework and methodology. An important goal of the special issue is to highlight the problems we encounter when annotating languages with rich morphology.

Questions we would like to be addressed include:

- To what extent should the existing annotation schemes be extended for the incorporation of highly inflected languages?
- How can existing schemes be extended to accomplish this?
- How can the linguistic annotation of historical corpora be standardized to serve an easy-to-use data access for linguists?

We invite submissions of articles describing annotation schemes of historical corpora, attempts at standardization, and harmonized annotation frameworks.

To foster collaboration, we organized a special workshop at the 16th Diachronic Generative Syntax conference (DiGS) on "Converging Corpora: How to standardize historical corpora of typologically and genetically different languages". Extended versions of papers presented at the workshop are natural candidates for this call. However, contributions are not limited to DiGS-related work: other papers presenting efforts to standardize annotation schemes of historical corpora are also welcome.

Finally, papers describing concrete historical corpora or tools adapted to old language varieties are also welcome, provided they highlight important properties of the problem of standardization and present relevant solutions.


Submissions due: *14 September 2015*
Author notification of acceptance: 30 November 2015
Final manuscripts submitted: 31 March 2016


To prepare the papers, please follow the style guidelines provided by the LRE journal.

To submit papers:
- Go to http://www.editorialmanager.com/lrev/
- Register and login as an author.
- Select "S.I. : Converging Corpora" as article type.
- Follow the instructions and submit your paper.


- Tamás Váradi – Research Institute for Linguistics, Hungarian Academy of Sciences (varadi.tamas at nytud.mta.hu)

- Eszter Simon – Research Institute for Linguistics, Hungarian Academy of Sciences (simon.eszter at nytud.mta.hu)

--
DR. ESZTER SIMON
Research Fellow
Research Institute for Linguistics
Hungarian Academy of Sciences
H-1068 Budapest, Benczúr u. 33.
Tel./Fax. +36 1 321 4830/ 129
simon.eszter at nytud.mta.hu
