[Corpora-List] First Free English-Persian Parallel Corpus

Francis Tyers ftyers at prompsit.com
Thu Apr 15 01:02:16 CEST 2010


On Wed, 14 Apr 2010 at 11:40 +0430, Taher Pilevar wrote:
> Please send this message to the list for the researchers who are
> looking for English-Persian corpora:
>
> First Free English-Persian Parallel Corpus
>
> By Mohammad Taher Pilevar, NLP Lab, University of Tehran, Iran.
>
> 4 million tokens on each side
> Sentence Aligned
> Extracted from movie subtitles
> Text domain: informal/conversational
> Total aligned movie subtitles: 1600
>
> http://ece.ut.ac.ir/NLP/resources.htm

What is the copyright status of the corpus? Are the subtitles all from public-domain films?

Fran


