[Corpora-List] Corpus of translated material

Ralf Steinberger ralf.steinberger at jrc.it
Fri Mar 2 09:35:00 CET 2007

Dear Noemie,

For the JRC-Acquis multilingual parallel corpus with alignments (sentence or
paragraph) in all language combinations for the languages Czech, Danish,
German, Greek, English, Spanish, Estonian, Finnish, French, Hungarian,
Italian, Lithuanian, Latvian, Maltese, Dutch, Polish, Portuguese, Romanian,
Slovak, Slovene and Swedish (190 language pair combinations), see
http://langtech.jrc.it/JRC-Acquis.html. It is freely available for research
purposes. Unfortunately, we do not know the source language of the
translations, but we are told that most of the time it is English or French.

I hope this is useful for your work.

With kind regards,


Ralf Steinberger (Ralf.Steinberger at jrc.it)

European Commission - Joint Research Centre (JRC)
IPSC - SeS - Language Technology (http://langtech.jrc.it,
http://press.jrc.it/NewsExplorer/)

-----Original Message-----
From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On
Behalf Of Nomi Guthmann
Sent: 01 March 2007 17:29
Subject: [Corpora-List] Corpus of translated material

Dear corpora list members,

We are doing a project concerned with corpus-based translation studies.
For this purpose, we are trying to collect a corpus of translated
material in the target language. The main requirement is to know
exactly what the source language was. Otherwise, we are happy with
data in any language and of any domain. For example, parallel corpora
(not necessarily aligned) would be an excellent resource, provided
that we know what the source language is.

We would highly appreciate any suggestions and references you may
have. I will post a summary of the replies.


Noemie Guthmann
Translation and Interpreting Studies Department
Bar Ilan University
