Maria Ammari
Esp Lecturer
Duth University
Greece

Quoting "Horsmann, Tobias" <tobias.horsmann at uni-due.de>:
> Hi everyone,
> I am looking for part-of-speech annotated corpora in any language,
> preferably hand-annotated or at least human-verified.
> I would prefer corpora that are available for direct
> download without additional "sign a licence agreement" barriers.
> Of course, I am only interested in material that is usable free of charge
> for research purposes, so no "Data Consortium" or other resellers.
> So far I have found these:
> Norwegian (http://www.nb.no/sprakbanken/show?serial=sbr-10)
> Brazilian Portuguese Newswire (http://www.nltk.org/nltk_data/)
> Dutch Alpino (https://www.let.rug.nl/vannoord/trees/)
> Spanish (https://www.iula.upf.edu/recurs01_tbk_uk.htm)
> Polish National Corpus (http://nkjp.pl/index.php?page=14&lang=1)
> Icelandic-Historical Corpus
> Icelandic (http://www.malfong.is/index.php?lang=en&pg=mim)
> Slovene-English Parallel Corpus (http://nl.ijs.si/elan/)
> Finnish Treebank
> German Tiger
> Is anyone aware of additional corpora that can be
> directly downloaded? (I need an annotated file, not a web interface.)
> I would appreciate suggestions to extend my current list,
> and I will post my final list once I am done collecting.