[Corpora-List] CfP: WMT 2020 Translation Task: Khmer and Pashto

Philipp Koehn phi at jhu.edu
Thu Apr 2 03:32:51 CEST 2020

EMNLP 2020 FIFTH CONFERENCE ON MACHINE TRANSLATION (WMT20)
Call for Participation: News Translation Task: Khmer and Pashto

Test data released: June 8, 2020
Translation submission deadline: June 15, 2020

The WMT 2020 News Translation Task includes two additional languages: Pashto and Khmer.

Both are low-resource languages that use a non-Latin writing system. A particular challenge for these languages is the shortage of traditional training data in the form of clean parallel text, and none of the clean parallel data that does exist is in the same domain as the test set. Good translation quality therefore requires exploiting monolingual data and noisy parallel data.

The submission deadlines match those for the other languages in the news translation task. For more information, see the web page describing the shared task: https://www.statmt.org/wmt20/translation-task.html

Note that the same data resources for Pashto and Khmer are also used for the parallel corpus filtering shared task: https://www.statmt.org/wmt20/parallel-corpus-filtering.html
