[Corpora-List] comparable corpora in multiple languages

Roi Reichart roireichart at gmail.com
Sat Feb 7 19:07:45 CET 2015


I am looking for comparable corpora in as many languages as possible, but most importantly in English, Italian, German, and Russian. The corpora should be large enough for vector space modeling, including neural network training (i.e., billions of words each). We have already experimented with Wikipedia, so we are looking for additional corpora.

Thanks,
Roi
