- Website: https://circse.github.io/LT4HALA/2022
- START submission site: https://www.softconf.com/lrec2022/LT4HALA/
- Place: co-located with LREC 2022, Marseille, France
- Date: 25 June 2022 (post-conference workshop)
##DESCRIPTION##
LT4HALA 2022 is a one-day workshop that seeks to bring together scholars who are developing and/or using Language Technologies (LTs) for historically attested languages, so as to foster cross-fertilization between the Computational Linguistics community and the areas of the Humanities dealing with historical linguistic data, e.g. historians, philologists, linguists, archaeologists and literary scholars. LT4HALA 2022 follows LT4HALA 2020, which was organized in the context of LREC 2020 (proceedings: https://aclanthology.org/volumes/2020.lt4hala-1/).

Despite the current availability of large collections of digitized texts written in historical languages, such interdisciplinary collaboration is still hampered by the limited availability of annotated linguistic resources for most historical languages. Creating such resources is a challenge and an obligation for LTs, both to support historical linguistic research with the most up-to-date technologies and to preserve the precious linguistic data that have survived from past times.

Relevant topics for the workshop include, but are not limited to:
- handling spelling variation;
- detection and correction of OCR errors;
- creation and annotation of digital resources;
- deciphering;
- morphological/syntactic/semantic analysis of textual data;
- adaptation of tools to address diachronic/diatopic/diastratic variation in texts;
- teaching ancient languages with NLP tools;
- NLP-driven theoretical studies in historical linguistics;
- evaluation of NLP tools.
##SHARED TASKS##
LT4HALA 2022 will host two shared tasks:
- the second edition of EvaLatin, an evaluation campaign entirely devoted to NLP tools for Latin. The second edition of EvaLatin will focus on three tasks (i.e. Lemmatization, PoS tagging, and Morphological Feature Identification), each featuring three sub-tasks (i.e. Classical, Cross-Genre, Cross-Time).
- the first edition of EvaHan, the first campaign devoted to the evaluation of NLP tools for Ancient Chinese. The first edition of EvaHan has one task (i.e. a joint task of Word Segmentation and POS Tagging).
Training data for both shared tasks are available on the conference website:
- EvaLatin 2022 training data: https://circse.github.io/LT4HALA/2022/EvaLatin#training-data
- EvaHan 2022 training data: https://circse.github.io/LT4HALA/2022/EvaHan#training-data

Test data will be available on the web pages of the two shared tasks at the following URLs:
- EvaLatin 2022 test data (available after March 17th): https://circse.github.io/LT4HALA/2022/EvaLatin#test-data
- EvaHan 2022 test data (available after March 31st): https://circse.github.io/LT4HALA/2022/EvaHan#test-data
##SUBMISSIONS##
For the workshop, we invite papers of different types, such as experimental papers, reproduction papers, resource papers, position papers, and survey papers. Both long and short papers describing original and unpublished work are welcome. Long papers should deal with substantial completed research and/or report on the development of new methodologies. They may consist of up to 8 pages of content plus 2 pages of references. Short papers are instead appropriate for reporting on work in progress or for describing a single tool or project. They may consist of up to 4 pages of content plus 2 pages of references.

We encourage the authors of papers reporting experimental results to make their results reproducible and the entire process of analysis replicable, by making the data and the tools they used available. Presentations may be oral or poster; the proceedings make no distinction between the two among accepted papers. Submissions are NOT anonymous. Papers must follow the official LREC format. Each paper will be reviewed by three independent reviewers.

As for EvaLatin and EvaHan, participants will be required to submit a technical report for each task (with all the related sub-tasks) they took part in. Technical reports will be included in the proceedings as short papers: the maximum length is 4 pages (excluding references) and they should follow the official LREC format. Reports will receive a light review (we will check the correctness of the format, the exactness of results and ranking, and the overall exposition). All participants will have the opportunity to present their results at the workshop: we will allocate an oral session and a poster session fully devoted to the shared tasks in the afternoon.
##IMPORTANT DATES##
Workshop
- 8 April 2022: submission due
- 29 April 2022: reviews due
- 3 May 2022: notifications to authors
- 24 May 2022: camera-ready (PDF) due
Shared Tasks - PLEASE NOTE THAT NO EXTENSION IS PLANNED FOR THE SHARED TASKS

EvaLatin
- 20 December 2021: training data available
- Evaluation Window I - Task: Lemmatization
  - 17 March 2022: test data available
  - 23 March 2022: system results due to organizers
- Evaluation Window II - Task: PoS tagging
  - 24 March 2022: test data available
  - 30 March 2022: system results due to organizers
- Evaluation Window III - Task: Features tagging
  - 31 March 2022: test data available
  - 6 April 2022: system results due to organizers
- 26 April 2022: reports due to organizers
- 10 May 2022: short report review deadline
- 24 May 2022: camera-ready version of reports due to organizers

EvaHan
- 20 December 2021: training data available
- Evaluation Window
  - 31 March 2022: test data available
  - 6 April 2022: system results due to organizers
- 26 April 2022: reports due to organizers
- 10 May 2022: short report review deadline
- 24 May 2022: camera-ready version of reports due to organizers
##Identify, Describe and Share your LRs!##
- Describing your LRs in the LRE Map is now a normal practice in the submission procedure of LREC (introduced in 2010 and adopted by other conferences). To continue the efforts initiated at LREC 2014 about “Sharing LRs” (data, tools, web-services, etc.), authors will have the possibility, when submitting a paper, to upload LRs in a special LREC repository. This effort of sharing LRs, linked to the LRE Map for their description, may become a new “regular” feature for conferences in our field, thus contributing to creating a common repository where everyone can deposit and share data.
- As scientific work requires accurate citations of referenced work so as to allow the community to understand the whole context and also replicate the experiments conducted by other researchers, LREC 2022 endorses the need to uniquely identify LRs through the use of the International Standard Language Resource Number (ISLRN, www.islrn.org), a Persistent Unique Identifier to be assigned to each Language Resource. The assignment of ISLRNs to LRs cited in LREC papers will be offered at submission time.
##Workshop Organizers##
Marco Passarotti, Università Cattolica del Sacro Cuore, Milan, Italy
Rachele Sprugnoli, @RSprugnoli, Università Cattolica del Sacro Cuore, Milan, Italy
##Programme Committee##
Marcel Bollmann, University of Copenhagen, Denmark
Gerlof Bouma, University of Gothenburg, Sweden
Flavio M. Cecchini, Università Cattolica del Sacro Cuore, Italy
Harry Diakoff, Alpheios Project, USA
Stefanie Dipper, Ruhr-Universität Bochum, Germany
Hanne Eckhoff, Oxford University, UK
Margherita Fantoli, University of Leuven, Belgium
Hannes A. Fellner, Universität Wien, Austria
Heidi Jauhiainen, University of Helsinki, Finland
Neven Jovanovic, University of Zagreb, Croatia
Timo Korkiakangas, University of Helsinki, Finland
Bin Li, Nanjing Normal University, P.R. China
Eleonora Litta, Università Cattolica del Sacro Cuore, Italy
Chao-Lin Liu, National Chengchi University, Taiwan
Barbara McGillivray, Turing Institute, UK
Beáta Megyesi, Uppsala University, Sweden
Saskia Peels, University of Groningen, The Netherlands
Eva Pettersson, Uppsala University, Sweden
Sophie Prévost, Laboratoire Lattice, France
Philippe Roelli, University of Zurich, Switzerland
Matteo Romanello, Université de Lausanne, Switzerland
Halim Sayoud, USTHB University, Algeria
Dongbo Wang, Nanjing Agricultural University, P.R. China
##EvaLatin 2022 Organizers##
Rachele Sprugnoli, Università Cattolica del Sacro Cuore, Milan, Italy
Margherita Fantoli, KU Leuven, Belgium
Flavio M. Cecchini, Università Cattolica del Sacro Cuore, Milan, Italy
Marco Passarotti, Università Cattolica del Sacro Cuore, Milan, Italy
##EvaHan 2022 Organizers##
Bin Li, Nanjing Normal University, P.R. China
Yiguo Yuan, Nanjing Normal University, P.R. China
Minxuan Feng, Nanjing Normal University, P.R. China
Chao Xu, Nanjing Normal University, P.R. China
Dongbo Wang, Nanjing Agricultural University, P.R. China
##Contact##
rachele.sprugnoli[AT]unipr.it
Please write “LT4HALA” or “EvaLatin” in the subject of your e-mail.
For more information on EvaHan, please write to libin.njnu[AT]gmail.com with “EvaHan” in the subject of the e-mail.
Follow @ERC_LiLa and the hashtag #LT4HALA2022 on Twitter for updates.
Prof. Marco C. Passarotti
Computational Linguistics
Index Thomisticus Treebank https://itreebank.marginalia.it/
ERC Grantee, P.I. LiLa https://lila-erc.eu/ (Grant Agreement No. 769994)
CIRCSE Research Centre https://centridiricerca.unicatt.it/circse_index.html
Università Cattolica del Sacro Cuore
Largo Gemelli, 1
20123 Milan, Italy
marco.passarotti at unicatt.it
tel. +39-02-72342380