You may also look at the following papers/resources on leveraging comparable data for SMT:
(a) Matthew Snover, Bonnie J. Dorr, and Richard Schwartz. 2008. Language and Translation Model Adaptation using Comparable Corpora. EMNLP 2008.
(b) Dragos Stefan Munteanu and Daniel Marcu. 2005. Improving machine translation performance by exploiting non-parallel corpora. Computational Linguistics, 31(4):477–504.
(c) The proceedings of the workshop on Building and Using Comparable Corpora (http://comparable2009.ust.hk/). There have been two such workshops so far, I believe.
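
As a rough illustration of the language-model idea in Dom's message quoted below (a large English-only corpus helping to pick the most "reasonable English" output), here is a toy sketch in Python. Everything in it is an assumption made for illustration: the three-sentence corpus, the candidate translations, and the function names train_bigram_lm, log_prob and pick_most_fluent are all hypothetical, and a real SMT system would combine the language model score with translation model scores rather than use it alone.

```python
import math
from collections import Counter


def train_bigram_lm(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        # Boundary markers let the model score sentence starts and ends.
        tokens = ["<s>"] + sent.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a sentence."""
    vocab_size = len(unigrams) + 1  # +1 reserves some mass for unseen words
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
    return total


def pick_most_fluent(candidates, unigrams, bigrams):
    """Return the candidate the language model scores highest."""
    return max(candidates, key=lambda c: log_prob(c, unigrams, bigrams))


# Toy stand-in for a large monolingual English corpus (made-up data).
monolingual = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat slept on the mat",
]

# Made-up candidate outputs from a hypothetical translation step.
candidates = ["the cat sat on the mat", "cat the on sat mat the"]

unigrams, bigrams = train_bigram_lm(monolingual)
best = pick_most_fluent(candidates, unigrams, bigrams)
print(best)
```

The two candidates contain the same words, so only word order separates them; the bigram counts from the monolingual data are what favour the fluent ordering.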
On Mon, Nov 16, 2009 at 9:41 AM, Dom Widdows <widdows at google.com> wrote:
> Dear Xiaotian,
> One paper on finding translations without parallel corpora is:
> Learning Bilingual Lexicons from Monolingual Corpora
> Aria Haghighi, Percy Liang, Taylor Berg-Kirkpatrick and Dan Klein, ACL 2008
> In general I think there has been a lot of good work that uses
> language models for the target language built from large monolingual
> corpora. E.g., you can use a smaller parallel French-English corpus to
> translate into English, and a large English-only corpus to help "clean
> up" your translation to make sure your English translation is
> "reasonable English". At least, that's my cartoon view of the
> general idea; I'm sure there are many experts out there who can enrich
> or correct this summary.
> Best wishes,
> On Sun, Nov 15, 2009 at 5:11 PM, Xiaotian Guo <garlickfred at gmail.com> wrote:
>> Dear Corpora Colleagues
>> The use of bilingual parallel corpora in computer-aided translation (CAT)
>> is now widely acknowledged and applied. I just wonder whether there
>> has been substantial progress or achievement in using comparable corpora in
>> CAT. I am aware of Belinda Maia's article "Some Languages are more Equal
>> than Others: training translators in terminology and information retrieval
>> using comparable and parallel corpora" in Corpora in Translator Education,
>> 2003. Is there any other literature on this topic?
>> If you have any ideas on how comparable corpora can be used in CAT (not
>> necessarily mature ones), please share them with me.
>> All the best
>> Xiaotian Guo
>> SOAS & New Vision Language Centre
>> Corpora mailing list
>> Corpora at uib.no
-- Got Blog? http://greenideas.blogspot.com