[Corpora-List] free manually POS-tagged corpus

Thomas Proisl thomas.proisl at fau.de
Thu Dec 6 08:03:58 CET 2012


Hi Alisa,

you might want to take a look at the Manually Annotated Sub-Corpus (MASC) of the American National Corpus (http://www.americannationalcorpus.org/MASC/Download.html). MASC I is already available and – to quote from the website – consists of “80K words of data with validated annotations for token, part of speech, sentence boundary, noun chunks, verb chunks, named entities, and Penn Treebank syntax; and full-text FrameNet annotation for seventeen texts.”

Best regards,
Thomas

--
FAU Erlangen-Nürnberg
Department Germanistik und Komparatistik
Professur für Korpuslinguistik
Bismarckstr. 6, 91054 Erlangen

Fon: +49 9131 85-25908; Fax: +49 9131 85-29251
http://www.linguistik.uni-erlangen.de/~tsproisl/


