We are excited to announce the second edition of the shared task on Large-Scale Machine Translation Evaluation, which this year focuses on African languages.
Machine translation research has traditionally placed an outsized focus on a limited number of languages – mostly belonging to the Indo-European family. Progress for many languages, some with millions of speakers, has been held back by data scarcity. An inspiring recent trend has been the increased attention paid to low-resource languages. However, these modelling efforts have been hindered by the lack of high-quality, standardised evaluation benchmarks.
For the second edition of the Large-Scale MT shared task, we aim to bring the community together around machine translation for a set of 24 African languages, to and from English and French. We do so by introducing a high-quality benchmark, paired with a fair and rigorous evaluation procedure.
The shared task will consist of three tracks:
* a DATA TRACK, focused on contributions of novel datasets (monolingual, bilingual or multilingual) relevant to the training of MT models for this year’s set of languages;
* a CONSTRAINED TRANSLATION TRACK, evaluating the performance of translation models trained exclusively on data provided by the organisers and data accepted into the data track;
* an UNCONSTRAINED TRANSLATION TRACK, with no restrictions on the use of data or pre-trained models.
Important dates:
* Data track submission deadline, May 10
* Training data is released, May 17
* Evaluation period, June TBD - July TBD
* Paper submission deadline, Aug TBD
Further information regarding this task is available on https://www.statmt.org/wmt22/large-scale-multilingual-translation-task.html