[Corpora-List] WMT Quality Estimation task - training data available

Lucia Specia lspecia at gmail.com
Sat Feb 14 21:14:16 CET 2015

Dear all,

The data for the shared task on quality estimation is now available for download at:


This year, we have three tasks:

1) Sentence-level English-Spanish, with a significantly larger training set (12.2K) for the estimation of HTER scores.

2) Word-level English-Spanish, using the same data as the sentence-level task, but aiming to estimate a binary good/bad label for each word in the sentence.

3) Paragraph-level English-German and German-English: a new track aiming to estimate METEOR scores for paragraphs from multiple MT systems.

Looking forward to your submissions!

Lucia Specia (University of Sheffield)
Carolina Scarton (University of Sheffield)
Chris Hokamp (Dublin City University)

-- www.dcs.shef.ac.uk/~lucia/