[Corpora-List] POS annotated corpora

Μαρία Αμάρι mammari at pme.duth.gr
Fri Jul 22 08:17:08 CEST 2016


Hello Tobias,

Unfortunately, I can't help you with annotated corpora, but I do have a different suggestion: you might want to change the subject line of your mail so that your request is immediately visible.

Good luck.

Maria Ammari
Esp Lecturer
Duth University
Greece

Quoting "Horsmann, Tobias" <tobias.horsmann at uni-due.de>:


> Hi everyone,
>
> I am looking for part-of-speech annotated corpora in any
> language.
> Preferably hand-annotated or at least human-verified.
> I would prefer corpora that are available for direct
> download without additional "sign a licence agreement" barriers.
> Of course, I am only after material that is usable free of charge
> for research purposes, so no "Data Consortium" or other resellers.
>
> So far I have found these:
> Norwegian (http://www.nb.no/sprakbanken/show?serial=sbr-10)
> Brazilian Portuguese Newswire (http://www.nltk.org/nltk_data/)
> Dutch Alpino (https://www.let.rug.nl/vannoord/trees/)
> Spanish (https://www.iula.upf.edu/recurs01_tbk_uk.htm)
> Italian-TurinTree/Parallel
> (http://www.di.unito.it/~tutreeb/treebanks.html)
> Polish National Corpus (http://nkjp.pl/index.php?page=14&lang=1)
> Icelandic-Historical Corpus
> (http://linguist.is/icelandic_treebank/Icelandic_Parsed_Historical_Corpus_(IcePaHC))
> Icelandic (http://www.malfong.is/index.php?lang=en&pg=mim)
> Slovene-English Parallel Corpus (http://nl.ijs.si/elan/)
> Finnish Treebank
> (http://www.ling.helsinki.fi/kieliteknologia/tutkimus/treebank/)
> German Tiger
> (http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/tiger.html)
>
> Is anyone aware of additional corpora that can be
> directly downloaded? (I need an annotated file, not a web interface.)
> I would appreciate suggestions to extend my current list
> and will post my final list once I am done collecting.
>
> Best,
> Tobias


