[Corpora-List] Celtic language machine translation

Francis Tyers ftyers
Fri Apr 17 14:10:07 CEST 2009


(Yes it has been a week already)

On Thu, 09-04-2009 at 09:13 +0200, Francis Tyers wrote:
> Dear corpora-list members,
>
> I would be interested in receiving information/references about any
> articles/reports/papers/etc. on the subject of machine translation for
> Celtic languages. The ones I am aware of are:
>
> * cy-en (2001) John D. Phillips: "The Bible as a Basis for Machine
> Translation". PACL.
> * kw-en (2003) Paul R. Bowden "Building a Lexicon for a Kernewek MT
> System" TALN
> * cy-en (2004) Harold Somers: "Machine Translation and Welsh: The Way
> Forward; Cyfieithu Peirianyddol a'r Gymraeg: Y Ffordd Ymlaen". Report
> for the Welsh Language Board.
> * ga-gd (2006) Kevin P. Scannell: "Machine translation for closely
> related language pairs". LREC-2006. pp.103-107.
> * cy-en (2006) Dafydd Jones & Andreas Eisele: "Phrase-based statistical
> machine translation between English and Welsh". LREC-2006. pp.75-77.
> * cy-en (2009) Francis M. Tyers & Kevin Donnelly: "apertium-cy: a
> collaboratively-developed RBMT system for Welsh to English". PBML 91,
> 2009; pp.57-66.

Aside from the above papers, I received one further reference on machine translation and Celtic languages:

* David Talbot and Miles Osborne. Modelling Lexical Redundancy for Machine Translation. ACL, Sydney, Australia 2006. http://www.iccs.inf.ed.ac.uk/~osborne/papers/acl06.pdf

NB. There is an erratum. The Welsh corpus they use is _not_ in the public domain. It can be downloaded from [1].

The paper describes a technique that learns to collapse lexical distinctions that do not matter for translation into the target language. The authors apply it to several languages, including Welsh.

Many thanks to those who replied,

Francis Tyers

1. http://xixona.dlsi.ua.es/corpora/UAGT-PNAW/


