[Corpora-List] Russian text/speech corpora (coreference; dialogue)

Anne Schumann anne.schumann at Tilde.lv
Fri Jan 27 13:56:56 CET 2012

Hi Katja,

You probably already know the Russian corpus collection, but in case you don't, here's the link: http://corpus.leeds.ac.uk/ruscorpora.html

There is a large Russian internet corpus that should contain a share of informal dialogue, and there also seems to be a corpus of crawls from Russian forums. The internet corpus was lemmatized with TreeTagger; I don't have any information about the forums corpus. For offline use you would need to contact the owner, Serge Sharoff.

I also know that at the ICLTT at the Austrian Academy of Sciences, work has been carried out on aligning Dostoevsky's "Idiot" with one (?) of its German translations. This text has a lot of informal dialogue, and since you mentioned novels, it may be an interesting source under certain circumstances. I am, however, not familiar with the conditions for using the corpus. Here's a link with contact information: http://www.aac.ac.at/institution.html

Good luck!

Anne-Kathrin Schumann
PhD student
University of Vienna