[Corpora-List] State of the Art: historical POS-Tagging EN?

Koos Wilt kooswilt at gmail.com
Thu Jun 1 18:30:42 CEST 2017

I used POS tagging in Python about two months ago for a study (of something else, but POS tagging was part of a 'linguistics techniques' ensemble). I have already forgotten exactly what I did, and whether it was part of NLTK proper or whether I got creative. I can dig it up and send it to you if you think it would be useful.

Best regards,


2017-06-01 16:33 GMT+02:00 Herrmann, Berenike <jb.herrmann at phil.uni-goettingen.de>:

> Dear all,
> We are preparing a project on lexico-semantic analyses of 18th/19th
> Century __English-written__ texts from different written genres: __essays,
> literary texts, also letters and diaries__. It's (mainly) British English.
> I'd like to know the state of the art:
> - What out-of-the-box taggers (TreeTagger, Perceptron, TnT, Stanford,
> CLAWS, etc.) perform best on this type of data?
> - What tagger types are possibly best suited? (HMM, maximum entropy, CRF,
> etc.)
> - Are there any historical/genre-specific language models available?
> - How about tokenizers/orthographic normalization: Is either an issue for
> British English of that period?
> Any kind of pointer and/or assessment is welcome.
> Many great thanks!!!
> Very best,
> Berenike
> https://jberenike.github.io/
