[Corpora-List] comparable corpora in multiple languages

Roi Reichart roiri at ie.technion.ac.il
Sat Feb 7 19:05:20 CET 2015


I am looking for comparable corpora in as many languages as possible, but most importantly in English, Italian, German, and Russian. The corpora should be large enough for vector space modeling, including neural network (NN) training (i.e., billions of words). We have already experimented with Wikipedia, so we are looking for additional corpora.

Thanks,
Roi
