Arab-Acquis is a large dataset for evaluating machine translation between 22 European languages and Arabic. It consists of over 12,000 sentences from the JRC-Acquis (Acquis Communautaire) corpus, each translated twice by professional translators, once from English and once from French, for a total of over 600,000 words.
This resource was developed at the Computational Approaches to Modeling Language (CAMeL <http://www.camel-lab.com/>) Lab at New York University Abu Dhabi <http://nyuad.nyu.edu/>.
The paper describing the effort is published here:
- Nizar Habash, Nasser Zalmout, Dima Taji, Hieu Hoang and Maverick
Alzate. 2017. A Parallel Corpus for Evaluating Machine Translation between
Arabic and European Languages. In Proceedings of the Conference of the
European Chapter of the Association for Computational Linguistics (EACL),
Valencia, Spain. [PDF <http://aclweb.org/anthology/E17-2038>] [BIB
Nizar Habash
Associate Professor of Computer Science
New York University Abu Dhabi